• Title/Summary/Keyword: datapath

Search Result 63, Processing Time 0.022 seconds

A SoC Design Synthesis System for High Performance Vehicles (고성능 차량용 SoC 설계 합성 시스템)

  • Chang, Jeong-Uk;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.3
    • /
    • pp.181-187
    • /
    • 2020
  • In this paper, we proposed a register allocation algorithm and resource allocation algorithm in the high level synthesis process for the SoC design synthesis system of high performance vehicles We have analyzed to the operator characteristics and structure of datapath in the most important high-level synthesis. We also introduced the concept of virtual operator for the scheduling of multi-cycle operations. Thus, we demonstrated the complexity to implement a multi-cycle operation of the operator, regardless of the type of operation that can be applied for commonly use in the resources allocation algorithm. The algorithm assigns the functional operators so that the number of connecting signal lines which are repeatedly used between the operators would be minimum. This algorithm provides regional graphs with priority depending on connected structure when the registers are allocated. The registers with connecting structure are allocated to the maximum cluster which is generated by the minimum cluster partition algorithm. Also, it minimize the connecting structure by removing the duplicate inputs for the multiplexor in connecting structure and arranging the inputs for the multiplexor which is connected to the operators. In order to evaluate the scheduling performance of the described algorithm, we demonstrate the utility of the proposed algorithm by executing scheduling on the fifth digital wave filter, a standard bench mark model.

Color Media Instructions for Embedded Parallel Processors (임베디드 병렬 프로세서를 위한 칼라미디어 명령어 구현)

  • Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.35 no.7
    • /
    • pp.305-317
    • /
    • 2008
  • As a mobile computing environment is rapidly changing, increasing user demand for multimedia-over-wireless capabilities on embedded processors places constraints on performance, power, and sire. In this regard, this paper proposes color media instructions (CMI) for single instruction, multiple data (SIMD) parallel processors to meet the computational requirements and cost goals. While existing multimedia extensions store and process 48-bit pixels in a 32-bit register, CMI, which considers that color components are perceptually less significant, supports parallel operations on two-packed compressed 16-bit YCbCr (6 bit Y and 5 bits Cb, Cr) data in a 32-bit datapath processor. This provides greater concurrency and efficiency for YCbCr data processing. Moreover, the ability to reduce data format size reduces system cost. The reduction in data bandwidth also simplifies system design. Experimental results on a representative SIMD parallel processor architecture show that CMI achieves an average speedup of 6.3x over the baseline SIMD parallel processor performance. This is in contrast to MMX (a representative Intel's multimedia extensions), which achieves an average speedup of only 3.7x over the same baseline SIMD architecture. CMI also outperforms MMX in both area efficiency (a 52% increase versus a 13% increase) and energy efficiency (a 50% increase versus an 11% increase). CMI improves the performance and efficiency with a mere 3% increase in the system area and a 5% increase in the system power, while MMX requires a 14% increase in the system area and a 16% increase in the system power.

Scalable RSA public-key cryptography processor based on CIOS Montgomery modular multiplication Algorithm (CIOS 몽고메리 모듈러 곱셈 알고리즘 기반 Scalable RSA 공개키 암호 프로세서)

  • Cho, Wook-Lae;Shin, Kyung-Wook
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.22 no.1
    • /
    • pp.100-108
    • /
    • 2018
  • This paper describes a design of scalable RSA public-key cryptography processor supporting four key lengths of 512/1,024/2,048/3,072 bits. The modular multiplier that is a core arithmetic block for RSA crypto-system was designed with 32-bit datapath, which is based on the CIOS (Coarsely Integrated Operand Scanning) Montgomery modular multiplication algorithm. The modular exponentiation was implemented by using L-R binary exponentiation algorithm. The scalable RSA crypto-processor was verified by FPGA implementation using Virtex-5 device, and it takes 456,051/3,496347/26,011,947/88,112,770 clock cycles for RSA computation for the key lengths of 512/1,024/2,048/3,072 bits. The RSA crypto-processor synthesized with a $0.18{\mu}m$ CMOS cell library occupies 10,672 gate equivalent (GE) and a memory bank of $6{\times}3,072$ bits. The estimated maximum clock frequency is 147 MHz, and the RSA decryption takes 3.1/23.8/177/599.4 msec for key lengths of 512/1,024/2,048/3,072 bits.