• Title/Summary/Keyword: FPGA Hardware

Search Result 800, Processing Time 0.027 seconds

An Efficient Motion Estimation Method which Supports Variable Block Sizes and Multi-frames for H.264 Video Compression (H.264 동영상 압축에서의 가변 블록과 다중 프레임을 지원하는 효율적인 움직임 추정 방법)

  • Yoon, Mi-Sun;Chang, Seung-Ho;Moon, Dong-Sun;Shin, Hyun-Chul
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.5
    • /
    • pp.58-65
    • /
    • 2007
  • As multimedia portable devices become popular, the amount of computation for processing data including video compression has significantly increased. Various researches for low power consumption of the mobile devices and real time processing have been reported. Motion Estimation is responsible for 67% of H.264 encoder complexity. In this research, a new circuit is designed for motion estimation. The new circuit uses motion prediction based on approximate SAD, Alternative Row Scan (ARS), DAU, and FDVS algorithms. Our new method can reduce the amount of computation by 75% when compared to multi-frame motion estimation suggested in JM8.2. Furthermore, optimal number and size of reference frame blocks are determined to reduce computation without affecting the PSNR. The proposed Motion Estimation method has been verified by using the hardware and software Co-Simulation with iPROVE. It can process 30 CIF frames/sec at 50MHz.

A Real time Image Resizer with Enhanced Scaling Precision and Self Parameter Calculation (강화된 스케일링 정밀도와 자체 파라미터 계산 기능을 가진 실시간 이미지 크기 조절기)

  • Kim, Kihyun;Ryoo, Kwangki
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2012.10a
    • /
    • pp.99-102
    • /
    • 2012
  • An image scaler is a IP used in a image processing block of display devices to adjust image size. Proposed image scaler adopts line memories instead of a conventional method using a frame memory. This method reduced hardware resources and enhanced data precision by using shift operations that number is multiplied by $2^m$ and divided again at final stage for scaling. Also image scaler increased efficiency of IP by using serial divider to calculate parameters by itself. Parameters used in image scaling is automatically produced by it. Suggested methods are designed by Verilog HDL and implemented with Xilinx Vertex-4 XC4LX80 and ASIC using TSMC 0.18um process.

  • PDF

A Design of All-Digital QPSK Demodulator for High-Speed Wireless Transmission Systems (고속 무선 전송시스템을 위한 All-Digital QPSK 복조기의 설계)

  • 고성찬;정지원
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.8 no.1
    • /
    • pp.83-91
    • /
    • 2003
  • High-speed QPSK demodulator has been in important design objective of any wireless communication systems, especially those offering broadband multimedia service. This paper describes all-digital QPSK demodulator for high-speed wireless communications, and its hardware structures are discussed. All-digital QPSK demodulator is mainly composed of symbol time circuit and carrier recovery circuit to estimate timing and phase-offsets. There are various schemes. Among them, we use Gardner algorithm and Decision-Directed carrier recovery algorithm which is most efficient scheme to warrant the fast acquisition and tacking to fabricate FPGA chip. The testing results of the implemented onto CPLD-EPF10K100GC 503-4 chip show demodulation speed is reached up to 2.6[Mbps]. If it is implemented a CPLD chip with speed grade 1, the demodulation speed can be faster by about 5 times. Actually in case of designing by ASIC, its speed my be faster than CPLD by 5 times. Therefore, it is possible to fabricate the all-digital QPSK demodulator chipset with speed of 50[Mbps].

  • PDF

The Design of Multi-channel Synchronous and Asynchronous Communication IC for the Smart Grid (스마트그리드를 위한 다채널 동기 및 비동기 통신용 IC 설계)

  • Ock, Seung-Kyu;Yang, Oh
    • Journal of the Semiconductor & Display Technology
    • /
    • v.10 no.4
    • /
    • pp.7-13
    • /
    • 2011
  • In this paper, the IC(Integrated Circuit) for multi-channel synchronous communication was designed by using FPGA and VHDL language. The existing chips for synchronous communication that has been used commercially are composed for one to two channels. Therefore, when communication system with three channels or more is made, the cost becomes high and it becomes complicated for communication system to be realized and also has very little buffer, load that is placed into Microprocessor increases heavily in case of high speed communication or transmission of high-capacity data. The designed IC was improved the function and performance of communication system and reduced costs by designing 8 synchronous communication channels with only one IC, and it has the size of transmitter/receiver buffer with 1024 bytes respectively and consequently high speed communication became possible. It was designed with a communication signal of a form various encoding. To detect errors of communications, the CRC-ITU-T logic and channel MUX logic was designed with hardware logics so that the malfunction can be prevented and errors can be detected more easily and input/output port regarding each communication channel can be used flexibly and consequently the reliability of system was improved. In order to show the performance of designed IC, the test was conducted successfully in Quartus simulation and experiment and the excellence was compared with the 85C3016VSC of ZILOG company that are used widely as chips for synchronous communication.

A Novel Arithmetic Unit Over GF(2$^{m}$) for Reconfigurable Hardware Implementation of the Elliptic Curve Cryptographic Processor (타원곡선 암호프로세서의 재구성형 하드웨어 구현을 위한 GF(2$^{m}$)상의 새로운 연산기)

  • 김창훈;권순학;홍춘표;유기영
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.31 no.8
    • /
    • pp.453-464
    • /
    • 2004
  • In order to solve the well-known drawback of reduced flexibility that is associate with ASIC implementations, this paper proposes a novel arithmetic unit over GF(2$^{m}$ ) for field programmable gate arrays (FPGAs) implementations of elliptic curve cryptographic processor. The proposed arithmetic unit is based on the binary extended GCD algorithm and the MSB-first multiplication scheme, and designed as systolic architecture to remove global signals broadcasting. The proposed architecture can perform both division and multiplication in GF(2$^{m}$ ). In other word, when input data come in continuously, it produces division results at a rate of one per m clock cycles after an initial delay of 5m-2 in division mode and multiplication results at a rate of one per m clock cycles after an initial delay of 3m in multiplication mode respectively. Analysis shows that while previously proposed dividers have area complexity of Ο(m$^2$) or Ο(mㆍ(log$_2$$^{m}$ )), the Proposed architecture has area complexity of Ο(m), In addition, the proposed architecture has significantly less computational delay time compared with the divider which has area complexity of Ο(mㆍ(log$_2$$^{m}$ )). FPGA implementation results of the proposed arithmetic unit, in which Altera's EP2A70F1508C-7 was used as the target device, show that it ran at maximum 121MHz and utilized 52% of the chip area in GF(2$^{571}$ ). Therefore, when elliptic curve cryptographic processor is implemented on FPGAs, the proposed arithmetic unit is well suited for both division and multiplication circuit.

Effective Decoding Algorithm of Three dimensional Product Code Decoding Scheme with Single Parity Check Code (Single Parity Check 부호를 적용한 3차원 Turbo Product 부호의 효율적인 복호 알고리즘)

  • Ha, Sang-chul;Ahn, Byung-kyu;Oh, Ji-myung;Kim, Do-kyoung;Heo, Jun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.41 no.9
    • /
    • pp.1095-1102
    • /
    • 2016
  • In this paper, we propose a decoding scheme that can apply to a three dimensional turbo product code(TPC) with a single parity check code(SPC). In general, SPC is used an axis with shortest code length in order to maximize a code rate of the TPC. However, SPC does not have any error correcting capability, therefore, the error correcting capability of the three-dimensional TPC results in little improvement in comparison with the two-dimensional TPC. We propose two schemes to improve performance of three dimensional TPC decoder. One is $min^*$-sum algorithm that has advantages for low complexity implementation compared to Chase-Pyndiah algorithm. The other is a modified serial iterative decoding scheme for high performance. In addition, the simulation results for the proposed scheme are shown and compared with the conventional scheme. Finally, we introduce some practical considerations for hardware implementation.

A Design and Implementation of a Timing Analysis Simulator for a Design Space Exploration on a Hybrid Embedded System (Hybrid 내장형 시스템의 설계공간탐색을 위한 시간분석 시뮬레이터의 설계 및 구현)

  • Ahn, Seong-Yong;Shim, Jea-Hong;Lee, Jeong-A
    • The KIPS Transactions:PartA
    • /
    • v.9A no.4
    • /
    • pp.459-466
    • /
    • 2002
  • Modern embedded system employs a hybrid architecture which contains a general micro processor and reconfigurable devices such as FPGAS to retain flexibility and to meet timing constraints. It is a hard and important problem for embedded system designers to explore and find a right system configuration, which is known as design space exploration (DSE). With DES, it is possible to predict a final system configuration during the design phase before physical implementation. In this paper, we implement a timing analysis simulator for a DSE on a hybrid embedded system. The simulator, integrating exiting timing analysis tools for hardware and software, is designed by extending Y-chart approach, which allows quantitative performance analysis by varying design parameters. This timing analysis simulator is expected to reduce design time and costs and be used as a core module of a DSE for a hybrid embedded system.

A Study on Architecture of Motion Compensator for H.264/AVC Encoder (H.264/AVC부호화기용 움직임 보상기의 아키텍처 연구)

  • Kim, Won-Sam;Sonh, Seung-Il;Kang, Min-Goo
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.12 no.3
    • /
    • pp.527-533
    • /
    • 2008
  • Motion compensation always produces the principal bottleneck in the real-time high quality video applications. Therefore, a fast dedicated hardware is needed to perform motion compensation in the real-time video applications. In many video encoding methods, the frames are partitioned into blocks of Pixels. In general, motion compensation predicts present block by estimating the motion from previous frame. In motion compensation, the higher pixel accuracy shows the better performance but the computing complexity is increased. In this paper, we studied an architecture of motion compensator suitable for H.264/AVC encoder that supports quarter-pixel accuracy. The designed motion compensator increases the throughput using transpose array and 3 6-tap Luma filters and efficiently reduces the memory access. The motion compensator is described in VHDL and synthesized in Xilinx ISE and verified using Modelsim_6.1i. Our motion compensator uses 36-tap filters only and performs in 640 clock-cycle per macro block. The motion compensator proposed in this paper is suitable to the areas that require the real-time video processing.

A Study on Design and Implementation of Scalable Angle Estimator Based on ESPRIT Algorithm (ESPRIT 알고리즘 기반 재구성 가능한 각도 추정기 설계에 관한 연구)

  • Dohyun Lee;Byunghyun Kim;Jongwha Chong;Sungjin Lee;Kyeongyuk Min
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.624-629
    • /
    • 2023
  • Estimation of signal parameters via rotational invariance techniques (ESPRIT) is an algorithm that estimates the angle of a signal arriving at an array antenna using the shift invariance property of an array antenna. ESPRIT offers the good trade-off between performance and complexity. However, the ESPRIT algorithm still requires high-complexity operations such as covariance matrix and eigenvalue decomposition, so implementation with a hardware processor is essential to estimate the angle of arrival in real time. In addition, ESPRIT processors should have high performance. The performance is related to the number of antennas, and the number of antennas required for each application are different. Therefore, we proposed an ESPRIT processor that provides 2 to 8 variable antenna configurations to meet the performance and complexity requirements according to the applied field. The proposed ESPRIT processor was designed using the Verilog-HDL and implemented on a field programmable gate array (FPGA).

A Hardware Implementation of the Underlying Field Arithmetic Processor based on Optimized Unit Operation Components for Elliptic Curve Cryptosystems (타원곡선을 암호시스템에 사용되는 최적단위 연산항을 기반으로 한 기저체 연산기의 하드웨어 구현)

  • Jo, Seong-Je;Kwon, Yong-Jin
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.8 no.1
    • /
    • pp.88-95
    • /
    • 2002
  • In recent years, the security of hardware and software systems is one of the most essential factor of our safe network community. As elliptic Curve Cryptosystems proposed by N. Koblitz and V. Miller independently in 1985, require fewer bits for the same security as the existing cryptosystems, for example RSA, there is a net reduction in cost size, and time. In this thesis, we propose an efficient hardware architecture of underlying field arithmetic processor for Elliptic Curve Cryptosystems, and a very useful method for implementing the architecture, especially multiplicative inverse operator over GF$GF (2^m)$ onto FPGA and futhermore VLSI, where the method is based on optimized unit operation components. We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed and inversion speed has been improved 150 times, 480 times respectively compared with the thesis presented by Sarwono Sutikno et al. [7]. The designed underlying arithmetic processor can be also applied for implementing other crypto-processor and various finite field applications.