• Title/Summary/Keyword: Instruction set design

Search Result 121, Processing Time 0.036 seconds

Benchmarking Korean Block Ciphers on 32-Bit RISC-V Processor (32-bit RISC-V 프로세서에서 국산 블록 암호 성능 밴치마킹)

  • Kwak, YuJin;Kim, YoungBeom;Seo, Seog Chung
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.31 no.3
    • /
    • pp.331-340
    • /
    • 2021
  • As the communication industry develops, the development of SoC (System on Chip) is increasing. Accordingly, the paradigm of technology design of industries and companies is changing. In the existing process, companies purchased micro-architecture, but now they purchase ISA (Instruction Set Architecture), and companies design the architecture themselves. RISC-V is an open instruction set based on a reduced instruction set computer. RISC-V is equipped with ISA, which can be expanded through modularization, and an expanded version of ISA is currently being developed through the support of global companies. In this paper, we present benchmarking frameworks ARIA, LEA, and PIPO of Korean block ciphers in RISC-V. We propose implementation methods and discuss performance by utilizing the basic instruction set and features of RISC-V.

Design of an Automatic Generation System for Cycle-accurate Instruction-set Simulators for DSP Processors (DSP 프로세서용 인스트럭션 셋 시뮬레이터 자동생성기의 설계에 관한 연구)

  • Hong, Sung-Min;Park, Chang-Soo;Hwang, Sun-Young
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.32 no.9A
    • /
    • pp.931-939
    • /
    • 2007
  • This paper describes the system which automatically generates instruction-set simulators cores using the SMDL. SMDL describes structure and instruction-set information of a target DSP machine. Analyzing behavioral information of each pipeline stage of all instructions on a target ASIPS, the proposed system automatically generates a cycle-accurate instruction set simulator in C++ for a target processor. The proposed system has been tested by generating instruction-set simulators for ARM9E-S, ADSP-TS20x, and TMS320C2x architectures. Experiments were performed by checking the functions of the $4{\times}4$ matrix multiplication, 16-bit IIR filter, 32-bit multiplication, and the FFT using the generated simulators. Experimental results show the functional accuracy of the generated simulators.

Automatic Generation of Instruction Set Simulators for Microprocessors (마이크로프로세서를 위한 명령어 집합 시뮬레이터의 자동 생성)

  • Lee, Seong-Uk;Hong, Man-Pyo
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.38 no.3
    • /
    • pp.220-228
    • /
    • 2001
  • Simulation of an instruction set is essential to design and optimize new microprocessors, and to develop application programs. Though many simulation tools are widely used, their low-level description and simulation make users construct simulators difficult and spend a lot of time for simulation. We developed an automatic generator of instruction set simulators that perform register-transfer-level simulation. This automatic generator might be adaptable so as to be suitable for new modification or different conditions in designing microprocessors. In this paper, we describe a structure of automatic generation system and an implementation details.

  • PDF

Design of Vector Register Architecture in DSP Processor for Efficient Multimedia Processing

  • Wu, Chou-Pin;Wu, Jen-Ming
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.7 no.4
    • /
    • pp.229-234
    • /
    • 2007
  • In this paper, we present an efficient instruction set architecture using vector register file hardware to accelerate operation of general matrix-vector operations in DSP microprocessor. The technique enables in-situ row-access as well as column access to the register files. It can reduce the number of memory access significantly. The technique is especially useful for block-based video signal processing kernels such as FFT/IFFT, DCT/IDCT, and two-dimensional filtering. We have applied the new instruction set architecture to in-loop deblocking filter processing in H.264 decoder. Performance comparisons show that the required load/store operations for the in-loop deblocking filter can be reduced about 42%. The architecture would improve the processing speed, and code density in DSP microprocessor especially for video signal processing substantially.

The Compressed Instruction Set Architecture for the OpenRISC Processor (OpenRISC 프로세서를 위한 압축 명령어 집합 구조)

  • Kim, Dae-Hwan
    • Journal of the Korea Society of Computer and Information
    • /
    • v.17 no.10
    • /
    • pp.11-23
    • /
    • 2012
  • To achieve efficient code size reduction, this paper proposes a new compressed instruction set architecture for the OpenRISC architecture. The new instructions and their corresponding formats are designed by the profiling information of the existing instruction usage. New 16-bit instructions and 32-bit instructions are proposed to compressed the existing 32-bit instructions and instruction sequences, respectively. The proposed instructions can be classified into three types. The first is the new 16-bit instructions for the frequent normal 32-bit instructions such as add, load, store, branch, and jump instructions. The second type is the new 32-bit instructions for the consecutive two load instructions, two store instructions, and 32-bit data mov instructions. Finally, two new 32-bit instructions are proposed to compress function prolog and epilog code, respectively. OpenRISC hardware decoder is extended to support the new instructions. Experiments show that the efficiency of code size reduction improves by an average of 30.4% when compared to the OR1200 instruction set architecture without loss of execution performance.

Resuable Design of 32-Bit RISC Processor for System On-A Chip (SOC 설계를 위한 저전력 32-비트 RISC 프로세서의 재사용 가능한 설계)

  • 이세환;곽승호;양훈모;이문기
    • Proceedings of the IEEK Conference
    • /
    • 2001.06b
    • /
    • pp.105-108
    • /
    • 2001
  • 4 32-bit RISC core is designed for embedded application and DSP. This processor offers low power consumption by fully static operation and compact code size by efficient instruction set. Processor performance is improved by wing conditional instruction execution, block data transfer instruction, multiplication instruction, bunked register file structure. To support compact code size of embedded application, It is capable cf executing both 16-bit instructions and 32-bit instruction through mixed mode instruction conversion Furthermore, for fast MAC operation for DSP applications, the processor has a dedicated hardware multiplier, which can complete a 32-bit by 32-bit integer multiplication within seven clock cycles. These result in high instruction throughput and real-time interrupt response. This chip is implemented with 0.35${\mu}{\textrm}{m}$, 4- metal CMOS technology and consists of about 50K gate equivalents.

  • PDF

The Design of A Program Counter Unit for RISC Processors (RISC 프로세서의 프로그램 카운터 부(PCU)의 설계)

  • 홍인식;임인칠
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.7
    • /
    • pp.1015-1024
    • /
    • 1990
  • This paper proposes a program counter unit(PCU) on the pipelined architecture of RISC (Reduced Instruction Set Computer) type high performance processors, PCU is used for supplying instruction addresses to memory units(Instruction Cache) efficiently. A RISC processor's PCU has to compute the instruction address within required intervals continnously. So, using the method of self-generated incrementor, is more efficient than the conventional one's using ALU or private adder. The proposed PCU is designed to have the fast +4(Byte Address) operation incrementor that has no carry propagation delay. Design specifications are taken by analyzing the whole data path operation of target processor's default and exceptional mode instructions. CMOS and wired logic circuit technologic are used in PCU for the fast operation which has small layout area and power dissipation. The schematic capture and logic, timing simulation of proposed PCU are performed on Apollo W/S using Mentor Graphics CAD tooks.

  • PDF

A Design of an Embedded Microprocessor with Variable Length Instruction Mode (가변길이 명령어 모드를 갖는 Embedded Microprocessor의 설계)

  • 박기현;오민석;이광엽;한진호;김영수;배영환;조한진
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.4
    • /
    • pp.83-90
    • /
    • 2004
  • In this paper, we proposed a new instruction set(X32Y ISA) with 3 different types of instruction mode. The proposed instruction set organizes 32-bit, 24-bit, 16-bit instruction in order to solves a problem of memory size limitation in an embedded microprocessor. We designed a 32-bit 5 stage pipeline RISC microprocessor based on the X32V ISA. To verify the proposed the X32V ISA and a microprocessor, we estimated a program code size of multimedia application programs using a X32V simulator. In result, we verified that the Light mode and the Ultra Light mode obtains 8%, 27% reduction of a program code size through comparison with the Default mode. The proposed microprocessor was verified all X32V instructions execution at Xilinx FPGA with 33MHz operating frequency,

Design of A On-Chip Caches for RISC Processors (RISC 프로세서 On-Chip Cache의 설계)

  • 홍인식;임인칠
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.27 no.8
    • /
    • pp.1201-1210
    • /
    • 1990
  • This paper proposes on-chip instruction and data cache memories on RISC reduced instruction set computer) architecture which supports fast instruction fetch and data read/write, and enables RISC processor under research to obtain high performance. In the execution of HLL(high level language) programs, heavily used local scalar variables are stored in large register file, but arrays, structures, and global scalar variables are difficult for compiler to allocate registers. These problems can be solved by on-chip Instruction/Data cache. And each cycle of instruction fetch, pad delay causes the lowering of the processors's performance. Cache memories are designed in CMOS technology and SRAM(static-RAM), that saves layout area and power dissipation, is used for instruction and data storage. To speed up and support RISC processor's piplined architecture efficiently, hardwired logic technology is used overall circuits i cache blocks. The schematic capture and timing simulation of proposed cache memorises are performed on Apollo DN4000 workstation using Mentor Graphics CAD tools.

  • PDF

Efficient Loop Accelerator for Motion Estimation Specific Instruction-set Processor (움직임 추정 전용 프로세서를 위한 효율적인 루프 가속기)

  • Ha, Jae Myung;Jung, Ho Sun;Sunwoo, Myung Hoon
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.7
    • /
    • pp.159-166
    • /
    • 2013
  • This paper proposes an efficient loop accelerator for a motion estimation specific instruction-set processor. ME algorithms in nature contain complex and multiple loop operations. To support efficient hardware (HW) loop operations, this paper introduces four loop instructions and their specific HW architecture. The simulation results show that the proposed loop accelerator can reduce about 29% average instruction cycles for ME early-termination schemes compared with typical implementation having a combination of compare and conditional jump instructions. The proposed loop accelerator of the motion estimation specific instruction-set processor can significantly reduce the number of program memory accesses and greatly save power consumption. Hence, it can be quite suitable for low power and flexible ME implementation.