• Title/Summary/Keyword: Fully Pipelined

Search Result 21, Processing Time 0.026 seconds

A Design of Giga-bit security module Using Fully pipelined CTR-AES (Full-pipelined CTR-AES를 이용한 Giga-bit 보안모듈 설계)

  • Vinh, T.Q.;Park, Ju-Hyun;Kim, Young-Chul
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.05a
    • /
    • pp.225-228
    • /
    • 2008
  • In this paper, we presented our implementation of a counter mode AES based on Virtex4 FPGA. Our design exploits three advanced features: composite field arithmetic SubByte, efficient MixColumn transformation, and On-the-Fly Key-Scheduling for fully pipelined architecture. By pipelining the composite field implementation of the S-box, the area cost is reduced to average 17 percent. By designing the On-the-Fly key scheduling, we implemented an efficient key-expander module which is specialized for a pipelined architecture.

  • PDF

VHDL Design for Out-of-Order Superscalar Processor of A Fully Pipelined Scheme (완전한 파이프라인 방식의 비순차실행 수퍼스칼라 프로세서의 VHDL 설계)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.21 no.1
    • /
    • pp.99-105
    • /
    • 2021
  • Today, a superscalar processor is the basic unit or an essential component of a multi-core processor, SoCs, and GPUs. Hence, a high-performance out-of-order superscalar processor must be adopted for these systems to maximize its performance. The superscalar processor fetches, issues, executes, and writes back multiple instructions per cycle by utilizing reorder buffers and reservation stations to dynamically schedule instructions in a pipelined scheme. In this paper, a fully pipelined out-of-order superscalar processor with speculative execution is designed with VHDL and verified with GHDL. As a result of the simulation, the program composed of ARM instructions is successfully performed.

Design and Simulation for Out-of-Order Execution Processor of a Fully Pipelined Scheme (완전한 파이프라인 방식의 비순차실행 프로세서의 설계 및 모의실행)

  • Lee, Jongbok
    • The Journal of the Institute of Internet, Broadcasting and Communication
    • /
    • v.20 no.5
    • /
    • pp.143-149
    • /
    • 2020
  • Currently, a multi-core processor is mainly used as a central processing unit of a computer system, and a high-performance out-of-order processor is adopted as each core to maximize system performance. The early out-of-order execution processor with Tomasulo algorithm aimed at floating-point instructions, and it took several cycles to execute by the use of complex structures such as reorder buffer and reservation station. However, in order for the processor to properly utilize out-of-order execution and increase the throughput of instructions, it must operate in a fully pipelined manner. In this paper, a fully pipelined out-of-order processor with speculative execution is designed with VHDL and verified with GHDL. As a result of the simulation, a program composed of ARM instructions is successfully performed.

Design of an Area-Efficient Reed-Solomon Decoder using Pipelined Recursive Technique (파이프라인 재귀적인 기술을 이용한 면적 효율적인 Reed-Solomon 복호기의 설계)

  • Lee, Han-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.7 s.337
    • /
    • pp.27-36
    • /
    • 2005
  • This paper presents an area-efficient architecture to implement the high-speed Reed-Solomon(RS) decoder, which is used in a variety of communication systems such as wireless and very high-speed optical communications. We present the new pipelined-recursive Modified Euclidean(PrME) architecture to achieve high-throughput rate and reducing hardware-complexity using folding technique. The proposed pipelined recursive architecture can reduce the hardware complexity about 80$\%$ compared to the conventional systolic-array and fully-parallel architecture. The proposed RS decoder has been designed and implemented with the 0.13um CMOS technology in a supply voltage of 1.2 V. The result show that total number of gate is 393 K and it has a data processing rate of S Gbits/s at clock frequency of 625 MHz. The proposed area-efficient architecture can be readily applied to the next generation FEC devices for high-speed optical communications as well as wireless communications.

A 9-Bit 80-MS/s CMOS Pipelined Folding A/D Converter with an Offset Canceling Technique

  • Lee, Seung-Chul;Jeon, Young-Deuk;Kwon, Jong-Kee
    • ETRI Journal
    • /
    • v.29 no.3
    • /
    • pp.408-410
    • /
    • 2007
  • A 9-bit 80-MS/s CMOS pipelined folding analog-to-digital converter employing offset-canceled preamplifiers and a subranging scheme is proposed to extend the resolution of a folding architecture. A fully differential dc-decoupled structure achieves high linearity in circuit design. The measured differential nonlinearity and integral nonlinearity of the prototype are ${\pm}0.6$ LSB and ${\pm}1.6$ LSB, respectively.

  • PDF

A Design of 3D Graphics Lighting Processor for Mobile Applications (휴대 단말기용 3D Graphics Lighting Processor 설계)

  • Yang, Joon-Seok;Kim, Ki-Chul
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.837-840
    • /
    • 2005
  • This paper presents 3D graphics lighting processor based on vector processing using pipeline chaining. The lighting process of 3D graphics rendering contains many arithmetic operations and its complexity is very high. For high throughput, proposed processor uses pipelined functional units. To implement fully pipelined architecture, we have to use many functional units. Hence, the number of functional units is restricted. However, with the restricted number of pipelined functional units, the utilization of the units is reduced and a resource reservation problem is caused. To resolve these problems, the proposed architecture uses vector processing using pipeline chaining. Due to its pipeline chaining based architecture, it can perform 4.09M vertices per 1 second with 100MHz frequency. The proposed 3D graphics lighting processor is compatible with OpenGL ES API and the design is implemented and verified on FPGA.

  • PDF

A Pipelined Hardware Architecture of an H.264 Deblocking Filter with an Efficient Data Distribution

  • Lee, Sang-Heon;Lee, Hyuk-Jae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.6 no.4
    • /
    • pp.227-233
    • /
    • 2006
  • In order to reduce blocking artifacts and improve compression efficiency, H.264/AVC standard employs an adaptive in-loop deblocking filter. This paper proposes a new hardware architecture of the deblocking filter that employs a four-stage pipelined structure with an efficient data distribution. The proposed architecture allows a simultaneous supply of eight data samples to fully utilize the pipelined filter in both horizontal and vertical filterings. This paper also presents a new filtering order and data reuse scheme between consecutive macroblock filterings to reduce the communication for external memory access. The number of required cycles for filtering one macroblock (MB) is 357 cycles when the proposed filter uses dual port SRAMs. This execution speed is only 41.3% of that of the fastest previous work.

A VLSI Design for Digital Pre-distortion with Pipelined CORDIC Processors

  • Park, Jong Kang;Moon, Jun Young;Kim, Kyunghoon;Yang, Youngoo;Kim, Jong Tae
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.14 no.6
    • /
    • pp.718-727
    • /
    • 2014
  • In a wireless communications system, a predistorter is often used to compensate for the nonlinear distortions that result from operating a power amplifier near the saturation region, thereby improving system performance and increasing the spectral efficiency for the communication channels. This paper presents a new VLSI design for the polynomial digital predistorter (DPD). The proposed DPD uses a Coordinate Rotation Digital Computing (CORDIC) processor and a PD process with a fully-pipelined architecture. Due to its simple and regular structure, it can be a competitive design when compared to existing polynomial-type and approximated DPDs. Implementing a fifth-order distorter with the proposed design requires only 43,000 logic gates in a $0.35{\mu}m$ CMOS standard cell library.

Pipelined Parallel Processing System for Image Processing (영상처리를 위한 Pipelined 병렬처리 시스템)

  • Lee, Hyung;Kim, Jong-Bae;Choi, Sung-Hyk;Park, Jong-Won
    • Journal of IKEEE
    • /
    • v.4 no.2 s.7
    • /
    • pp.212-224
    • /
    • 2000
  • In this paper, a parallel processing system is proposed for improving the processing speed of image related applications. The proposed parallel processing system is fully synchronous SIMD computer with pipelined architecture and consists of processing elements and a multi-access memory system. The multi-access memory system is made up of memory modules and a memory controller, which consists of memory module selection module, data routing module, and address calculating and routing module, to perform parallel memory accesses with the variety of types: block, horizontal, and vertical access way. Morphological filter had been applied to verify the parallel processing system and resulted in faithful processing speed.

  • PDF

A design of Giga-bit security module using Fully pipe-lined CTR-AES (Full-pipelined CTR-AES를 이용한 Giga-bit 보안모듈 설계)

  • Vinh, T.Q.;Park, Ju-Hyun;Kim, Young-Chul;Kim, Kwang-Ok
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.6
    • /
    • pp.1026-1031
    • /
    • 2008
  • Nowdays, homes and small businesses rely more and more PON(Passive Optical Networks) for financial transactions, private communications and even telemedicine. Thus, encryption for these data transactions is very essential due to the multicast nature of the PON In this parer, we presented our implementation of a counter mode AES based on Virtex4 FPGA. Our design exploits three advanced features; 1) Composite field arithmetic SubByte, 2) efficient MixColumn transformation 3) and on-the-fly key-scheduling for fully pipelined architecture. By pipeling the composite field implementation of the S-box, the area cost is reduced to average 17 percent. By designing the on-the-fly key-scheduling, we implemented an efficient key-expander module which is specialized for a pipelined architecture.