• Title/Summary/Keyword: pipelined DSP

Search Result 13, Processing Time 0.018 seconds

A Design and Implementation of 32-bit RISC-V RV32IM Pipelined Processor in Embedded Systems (임베디드 환경에서의 32-bit RISC-V RV32IM 파이프라인 프로세서 설계 및 구현)

  • Subin Park;Yongwoo Kim
    • Journal of the Semiconductor & Display Technology
    • /
    • v.22 no.4
    • /
    • pp.81-86
    • /
    • 2023
  • Recently, demand for embedded systems requiring low power and high specifications has been increasing, and RISC-V processors are being widely applied. RISC-V, a RISC-based open instruction set architecture (ISA), has been developed and researched by UC Berkeley and other researchers since 2010. RV32I ISA is sufficient to support integer operations such as addition and subtraction instructions, but M-extension should be defined for multiplication and division instructions. This paper proposes an RV32I, RV32IM processor, and indicates benchmark performance scores compared to an existing processor. Additionally, A non-stalling method was proposed to support a 2-stage pipelined DSP multiplier to the 5-stage pipelined RV32IM processor. Proposed RV32I and RV32IM processors satisfied a maximum operating frequency of 50 MHz on Artix-7 FPGA. The performance of the proposed processors was verified using benchmark programs from Dhrystone and Coremark. As a result, the Coremark benchmark results of the proposed processor showed that it outperformed the existing RV32IM processor by 23.91%.

  • PDF

A VLSI DESIGN OF CD SIGNAL PROCESSOR for High-Speed CD-ROM

  • Kim, Jae-Won;Kim, Jae-Seok;Lee, Jaeshin
    • Proceedings of the IEEK Conference
    • /
    • 2002.07b
    • /
    • pp.1296-1299
    • /
    • 2002
  • We implemented a CD signal processor operated on a CAV 48-speed CD-ROM drive into a VLSI. The CD signal processor is a mixed mode monolithic IC including servo-processor, data recovery, data-processor, and I-bit DAC. For servo signal processing, we included a DSP core, while, for CAV mode playback, we adopted a PLL with a wide recovery range. Data processor (DP) was designed to meet the yellow book specification.[2]So, the DP block consists of EFM demodulator, C1/C2 ECC block, audio processor and a block transferring data to an ATAPI chip. A modified Euclid's algorithm was used as a key equation solver for the ECC block To achieve the high-speed decoding, the RS decoder is operated by a pipelined method. Audio playability is increased by playing a CD-DA disc at the speed of 12X or 16X. For this, subcode sync and data are processed in the same way as main data processing. The overall performance of IC is verified by measuring a transfer rate from the innermost area of disc to the outermost area. At 48-speed, the operating frequency is 210 ㎒, and this chip is fabricated by 0.35 um STD90 cell library of Samsung Electronics.

  • PDF

A New Hardware Design for Generating Digital Holographic Video based on Natural Scene (실사기반 디지털 홀로그래픽 비디오의 실시간 생성을 위한 하드웨어의 설계)

  • Lee, Yoon-Hyuk;Seo, Young-Ho;Kim, Dong-Wook
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.49 no.11
    • /
    • pp.86-94
    • /
    • 2012
  • In this paper we propose a hardware architecture of high-speed CGH (computer generated hologram) generation processor, which particularly reduces the number of memory access times to avoid the bottle-neck in the memory access operation. For this, we use three main schemes. The first is pixel-by-pixel calculation rather than light source-by-source calculation. The second is parallel calculation scheme extracted by modifying the previous recursive calculation scheme. The last one is a fully pipelined calculation scheme and exactly structured timing scheduling by adjusting the hardware. The proposed hardware is structured to calculate a row of a CGH in parallel and each hologram pixel in a row is calculated independently. It consists of input interface, initial parameter calculator, hologram pixel calculators, line buffer, and memory controller. The implemented hardware to calculate a row of a $1,920{\times}1,080$ CGH in parallel uses 168,960 LUTs, 153,944 registers, and 19,212 DSP blocks in an Altera FPGA environment. It can stably operate at 198MHz. Because of the three schemes, the time to access the external memory is reduced to about 1/20,000 of the previous ones at the same calculation speed.