• 제목/요약/키워드: 32bit

Search Result 861, Processing Time 0.032 seconds

A Study on the 32 bit RISC/DSP Microprocessor Appropriate for Embedded Systems (내장형 시스템에 적합한 32 비트 RISC/DSP 마이크로프로세서에 관한 연구)

  • 유동열;문병인;홍종욱;이태영;이용석
    • Proceedings of the IEEK Conference
    • /
    • 1999.06a
    • /
    • pp.257-260
    • /
    • 1999
  • We have designed a 32-bit RISC microprocessor with 16/32-bit fixed-point DSP functionality. This processor, called YRD-5, combines both general-purpose microprocessor and digital signal processor (DSP) functionality using the reduced instruction set computer (RISC) design principles. It has functional units for arithmetic operation, digital signal processing (DSP) and memory access. They operate in parallel in order to remove stall cycles after DSP and load/store instructions with one or more issue latency cycles. High performance was achieved with these parallel functional units while adopting a sophisticated 5-stage pipeline structure and an improved DSP unit.

  • PDF

Design of a high-speed 32-bit adder using a new dynamic CMOS logic (새로운 동적 CMOS 논리 설계방식을 이용한 고성능 32비트 가산기 설계)

  • 김강철;한석붕
    • Journal of the Korean Institute of Telematics and Electronics A
    • /
    • v.33A no.3
    • /
    • pp.187-195
    • /
    • 1996
  • This paper proposes two new dynamic CMOS logic styles, called ZMODL (zipper-MODL) and EZMODL (enhanced-ZMODL), which can reduce more area dnd propagation delya than conventional MODL (multiple output domino logic). The 32-bit CLAs(carry look-ahead adder) are designed by ZMODL, EZMODL circuits, and their operations are verified by SPICE 3 with 2$\mu$ double metal CMOS parameters. The results shwo that the CLA designed by EZMODL circuit has achived 32-bit additin time of less than 4.8NS with VDD=5.0V and 8% of transistors cn be redcued, compared to the CLA designed by MODL. The EZMODL logic style can improve the performance in the high-speed computing circuits depending on the degree of recurrence.

  • PDF

Optimized Implementation of PIPO Lightweight Block Cipher on 32-bit RISC-V Processor (32-bit RISC-V상에서의 PIPO 경량 블록암호 최적화 구현)

  • Eum, Si Woo;Jang, Kyung Bae;Song, Gyeong Ju;Lee, Min Woo;Seo, Hwa Jeong
    • KIPS Transactions on Computer and Communication Systems
    • /
    • v.11 no.6
    • /
    • pp.167-174
    • /
    • 2022
  • PIPO lightweight block ciphers were announced in ICISC'20. In this paper, a single-block optimization implementation and parallel optimization implementation of PIPO lightweight block cipher ECB, CBC, and CTR operation modes are performed on a 32-bit RISC-V processor. A single block implementation proposes an efficient 8-bit unit of Rlayer function implementation on a 32-bit register. In a parallel implementation, internal alignment of registers for parallel implementation is performed, and a method for four different blocks to perform Rlayer function operations on one register is described. In addition, since it is difficult to apply the parallel implementation technique to the encryption process in the parallel implementation of the CBC operation mode, it is proposed to apply the parallel implementation technique in the decryption process. In parallel implementation of the CTR operation mode, an extended initialization vector is used to propose a register internal alignment omission technique. This paper shows that the parallel implementation technique is applicable to several block cipher operation modes. As a result, it is confirmed that the performance improvement is 1.7 times in a single-block implementation and 1.89 times in a parallel implementation compared to the performance of the existing research implementation that includes the key schedule process in the ECB operation mode.

SIMD Multiply-accumulate Unit Design for Multimedia Data Processing (멀티미디어 처리에 적합한 SIMD 곱셈누적 연산기의 설계)

  • 홍인표;정재원;정우경;이용석
    • Proceedings of the IEEK Conference
    • /
    • 2000.11b
    • /
    • pp.349-352
    • /
    • 2000
  • In this paper, a SIMD 64bit MAC (Multiply -Accumulate) unit is designed. It is composed of two 32bit MAC unit which supports SIMD 16bit operations. As a result, It can process two 32bit MAC operations or four 16bit operations in one cycle. Proposed MAC unit is described in Verilog HDL. After functional verification is performed, MAC unit is synthesized and optimized with 0.35$\mu\textrm{m}$ standard cell library. The synthesis result shows that this MAC unit can operate at 80㎒ of clock frequency in 85$^{\circ}C$, 3.0V, worst case process and 125㎒ of clock frequency at 25$^{\circ}C$, 3.3V, typical case process. It achieves 320Mops of performance, and is suitable for embedded DSP processors.

  • PDF

A Study on Extendable Instruction Set Computer 32 bit Microprocessor (확장 명령어 32비트 마이크로 프로세서에 관한 연구)

  • 조건영
    • Journal of the Korean Institute of Telematics and Electronics D
    • /
    • v.36D no.5
    • /
    • pp.11-20
    • /
    • 1999
  • The data transfer width between the mocroprocessor and the memory comes to a critical part that limits system performance since the data transfer width has been as it was while the performance of a microprocessor is getting higher due to its continuous development in speed. And it is important that the memory should be in small size for the reduction of embedded microprocessor's price which is integrated on a single chip with the memory and IO circuit. In this paper, a mocroprocessor tentatively named as Extendable Instruction Set Computer(EISC) is proposed as the high code density 32 bit mocroprocessor architecture. The 32 bit EISC has 16 general purpose registers and 16 bit fixed length instruction which has the short length offset and small immediate operand. By using and extend register and extend flag, the offset and immediate operand could be extended. The proposed 32 bit EISC is implemented with an FPGA and all of its functions have been tested and verified at 1.8432MHz. And the cross assembler, the cross C/C++ compiler and the instruction simulator of the 32 bit EISC shows 140-220% and 120-140% higher code density than RISC and CISC respectively, which is much higher than any other traditional architectures. As a consequence, the EISC is suitable for the next generation computer architecture since it requires less data transfer width compared to any other ones. And its lower memory requirement will embedded microprocessor more useful.

  • PDF

Replica Technique regarding research for Bit-Line tracking (비트라인 트래킹을 위한 replica 기술에 관한 연구)

  • Oh, Se-Hyeok;Jung, Han-wool;Jung, Seong-Ook
    • Journal of IKEEE
    • /
    • v.20 no.2
    • /
    • pp.167-170
    • /
    • 2016
  • Replica bit-line technique is used for making enable signal of sense amplifier which accurately tracks bit-line of SRAM. However, threshold voltage variation in the replica bit-line circuit changes the cell current, which results in variation of the sense amplifier enable time, $T_{SAE}$. The variation of $T_{SAE}$ makes the sensing operation unstable. In this paper, in addition to conventional replica bit-line delay ($RBL_{conv}$), dual replica bit-line delay (DRBD) and multi-stage dual replica bit-line delay (MDRBD) which are used for reducing $T_{SAE}$ variation are briefly introduced, and the maximum possible number of on-cell which can satisfy $6{\sigma}$ sensing yield is determined through simulation at a supply voltage of 0.6V with 14nm FinFET technology. As a result, it is observed that performance of DRBD and MDRBD is improved 24.4% and 48.3% than $RBL_{conv}$ and energy consumption is reduced which 8% and 32.4% than $RBL_{conv}$.

Area Efficient Implementation of 32-bit Architecture of ARIA Block Cipher Using Light Weight Diffusion Layer (경량화된 확산계층을 이용한 32-비트 구조의 소형 ARIA 연산기 구현)

  • Ryu, Gwon-Ho;Koo, Bon-Seok;Yang, Sang-Woon;Chang, Tae-Joo
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.16 no.6
    • /
    • pp.15-24
    • /
    • 2006
  • Recently, the importance of the area efficient implementation of cryptographic algorithm for the portable device is increasing. Previous ARIA(Academy, Research Institute, Agency) implementation styles that usually concentrate upon speed, we not suitable for mobile devices in area and power aspects. Thus in this paper, we present an area efficient AR processor which use 32-bit architecture. Using new implementation technique of diffusion layer, the proposed processor has 11301 gates chip area. For 128-bit master key, the ARIA processor needs 87 clock cycles to generate initial round keys, n8 clock cycles to encrypt, and 256 clock cycles to decrypt a 128-bit block of data. Also the processor supports 192-bit and 256-bit master keys. These performances are 7% in area and 13% in speed improved results from previous cases.

Design of Programmable and Configurable Elliptic Curve Cryptosystem Coprocessor (재구성 가능한 타원 곡선 암호화 프로세서 설계)

  • Lee Jee-Myong;Lee Chanho;Kwon Woo-Suk
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.6 s.336
    • /
    • pp.67-74
    • /
    • 2005
  • Crypto-systems have difficulties in designing hardware due to the various standards. We propose a programmable and configurable architecture for cryptography coprocessors to accommodate various crypto-systems. The proposed architecture has a 32 bit I/O interface and internal bus width, and consists of a programmable finite field arithmetic unit, an input/output unit, a register file, and a control unit. The crypto-system is determined by the micro-codes in memory of the control unit, and is configured by programming the micro-codes. The coprocessor has a modular structure so that the arithmetic unit can be replaced if a substitute has an appropriate 32 bit I/O interface. It can be used in many crypto-systems by re-programming the micro-codes for corresponding crypto-system or by replacing operation units. We implement an elliptic curve crypto-processor using the proposed architecture and compare it with other crypto-processors

Sigma Delta Decimation Filter Design for High Resolution Audio Based on Low Power Techniques (저전력 기법을 사용한 고해상도 오디오용 Sigma Delta Decimation Filter 설계)

  • Au, Huynh Hai;Kim, SoYoung
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.11
    • /
    • pp.141-148
    • /
    • 2012
  • A design of a 32-bit fourth-stage decimation filter decimation filter used in sigma-delta analog-to-digital (A/D) converter is proposed in this work. A four-stage decimation filter with down-sampling factor of 512 and 32-bit output is developed. A multi-stage cascaded integrator-comb (CIC) filter, which reduces the sampling rate by 64, is used in the first stage. Three half-band FIR filters are used after the CIC filter, each of which reduces the sampling rate by two. The pipeline structure is applied in the CIC filter to reduce the power consumption of the CIC. The Canonic Signed Digit (CSD) arithmetic is used to optimize the multiplier structure of the FIR filter. This filter is implemented based on a semi-custom design flow and a 130nm CMOS standard cell library. This decimation filter operates at 98.304 MHz and provides 32-bit output data at an audio frequency of 192 kHz with power consumption of $697{\mu}W$. In comparison to the previous work, this design shows a higher performance in resolution, operation frequency and decimation factor with lower power consumption and small logic utilization.

Design of 32 bit Parallel Processor Core for High Energy Efficiency using Instruction-Levels Dynamic Voltage Scaling Technique

  • Yang, Yil-Suk;Roh, Tae-Moon;Yeo, Soon-Il;Kwon, Woo-H.;Kim, Jong-Dae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.9 no.1
    • /
    • pp.1-7
    • /
    • 2009
  • This paper describes design of high energy efficiency 32 bit parallel processor core using instruction-levels data gating and dynamic voltage scaling (DVS) techniques. We present instruction-levels data gating technique. We can control activation and switching activity of the function units in the proposed data technique. We present instruction-levels DVS technique without using DC-DC converter and voltage scheduler controlled by the operation system. We can control powers of the function units in the proposed DVS technique. The proposed instruction-levels DVS technique has the simple architecture than complicated DVS which is DC-DC converter and voltage scheduler controlled by the operation system and a hardware implementation is very easy. But, the energy efficiency of the proposed instruction-levels DVS technique having dual-power supply is similar to the complicated DVS which is DC-DC converter and voltage scheduler controlled by the operation system. We simulate the circuit simulation for running test program using Spectra. We selected reduced power supply to 0.667 times of the supplied power supply. The energy efficiency of the proposed 32 bit parallel processor core using instruction-levels data gating and DVS techniques can improve about 88.4% than that of the 32 bit parallel processor core without using those. The designed high energy efficiency 32 bit parallel processor core can utilize as the coprocessor processing massive data at high speed.