• Title/Summary/Keyword: FPGA Hardware

SoC Implementation of Deblocking Filter for Block-based Compressed Images and Videos (블록 기반 압축 이미지 및 비디오를 위한 디블로킹 필터의 SoC 구현)

  • Seo, Gwang-Seok;Lee, Joo-Heung
    • Journal of IKEEE
    • /
    • v.23 no.3
    • /
    • pp.925-933
    • /
    • 2019
  • In this paper, we implement a ZYNQ SoC-based post-processing system that uses partial reconfiguration to remove the blocking artifacts generated by compression algorithms. A hardware implementation of the deblocking filter in a field-programmable gate array (FPGA) provides high computational capability and can be partially reconfigured to process 1080p images in real time. Partially reconfigurable areas in the FPGA allow hardware to be used more efficiently in highly resource-constrained embedded systems. Experimental results show that the proposed system improves visual quality both objectively and subjectively, with a 0.6 dB higher PSNR after deblocking filtering. The measured run-time power consumption of the deblocking filter is 68.33 mW.
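The objective quality figure quoted above is a PSNR difference. As a minimal sketch of how PSNR is commonly computed for 8-bit images (the function name `psnr` and the list-of-rows image representation are assumptions for illustration, not the authors' measurement code):

```python
import math

def psnr(reference, test, peak=255):
    """PSNR in dB between two equally sized images given as lists of pixel rows.

    `peak` is the maximum pixel value (255 for 8-bit samples).
    """
    squared_error = 0
    count = 0
    for ref_row, test_row in zip(reference, test):
        for r, t in zip(ref_row, test_row):
            squared_error += (r - t) ** 2
            count += 1
    if squared_error == 0:
        return math.inf  # identical images: PSNR is unbounded
    mse = squared_error / count
    return 10 * math.log10(peak ** 2 / mse)
```

A deblocking gain would then be the PSNR of the filtered frame minus the PSNR of the unfiltered decoded frame, both measured against the uncompressed original.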

Design and Implementation of Multi-mode Sensor Signal Processor on FPGA Device (다중모드 센서 신호 처리 프로세서의 FPGA 기반 설계 및 구현)

  • Soongyu Kang;Yunho Jung
    • Journal of Sensor Science and Technology
    • /
    • v.32 no.4
    • /
    • pp.246-251
    • /
    • 2023
  • Internet of Things (IoT) systems process signals from various sensors using signal processing algorithms suited to the signal characteristics. To analyze complex signals, these systems usually use frequency-domain signal processing algorithms, such as the fast Fourier transform (FFT), filtering, and the short-time Fourier transform (STFT). In this study, we propose a multi-mode sensor signal processor (SSP) accelerator with an FFT-based hardware design. The FFT processor in the proposed SSP uses a radix-2 single-path delay feedback (R2SDF) pipeline architecture for high-speed operation. Based on this FFT processor, the proposed SSP can also perform filtering and STFT operations. Sharing the FFT processor among the algorithms significantly reduces the required hardware resources. The proposed SSP is implemented and verified on Xilinx's Zynq UltraScale+ MPSoC ZCU104 with 53,591 look-up tables (LUTs), 71,451 flip-flops (FFs), and 44 digital signal processors (DSPs). The FFT, filtering, and STFT implementations on the proposed SSP achieve an average acceleration of 185x.
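The idea of reusing one FFT for filtering can be illustrated in software. The following is only a sketch under stated assumptions: it uses a recursive radix-2 decimation-in-time FFT rather than the pipelined R2SDF hardware structure, and the names `fft`, `ifft`, and `fft_filter` are hypothetical.

```python
import cmath

def fft(x):
    """Recursive radix-2 decimation-in-time FFT; len(x) must be a power of two."""
    n = len(x)
    if n == 0 or n & (n - 1):
        raise ValueError("length must be a power of two")
    if n == 1:
        return list(x)
    even = fft(x[0::2])
    odd = fft(x[1::2])
    out = [0j] * n
    for k in range(n // 2):
        # Radix-2 butterfly: one twiddle multiplication, one add, one subtract.
        tw = cmath.exp(-2j * cmath.pi * k / n) * odd[k]
        out[k] = even[k] + tw
        out[k + n // 2] = even[k] - tw
    return out

def ifft(spectrum):
    """Inverse FFT computed with the same forward FFT (conjugation trick)."""
    n = len(spectrum)
    conj = fft([v.conjugate() for v in spectrum])
    return [v.conjugate() / n for v in conj]

def fft_filter(signal, kernel):
    """Circular convolution of `signal` with `kernel` via pointwise spectral product."""
    n = len(signal)
    padded = list(kernel) + [0] * (n - len(kernel))
    s_spec, k_spec = fft(signal), fft(padded)
    return [v.real for v in ifft([a * b for a, b in zip(s_spec, k_spec)])]
```

Here `ifft` and `fft_filter` both call the single `fft` routine, which mirrors in software how one FFT block can serve several modes. An STFT would apply the same `fft` to successive windowed segments of the input.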

Hardware Implementation of Integer Transform and Quantization for H.264 (하드웨어 기반의 H.264 정수 변환 및 양자화 구현)

  • 임영훈;정용진
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.28 no.12C
    • /
    • pp.1182-1191
    • /
    • 2003
  • In this paper, we propose a new hardware architecture for the integer transform, quantizer, inverse quantizer, and inverse integer transform of the new video coding standard H.264/JVT. We describe the algorithm and derive a hardware architecture that emphasizes small area for low cost and low power consumption. The proposed architecture has been verified on a PCI-interfaced emulation board using an Altera APEX-II FPGA and also by ASIC synthesis using the Samsung 0.18 µm CMOS cell library. The ASIC synthesis result shows that the proposed hardware can operate at 100 MHz, processing more than 1,300 QCIF video frames per second. The hardware will be used as a core module in a complete H.264 video encoder/decoder ASIC for real-time multimedia applications.
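For readers unfamiliar with the H.264 integer transform, the 4x4 forward core transform can be written with additions, subtractions, and doublings only, which is what makes it attractive for small-area hardware. The sketch below follows the standard core matrix with rows (1, 1, 1, 1), (2, 1, -1, -2), (1, -1, -1, 1), (1, -2, 2, -1). The function names are hypothetical, and the scaling that H.264 folds into quantization is omitted; it is not the architecture proposed in the paper.

```python
def butterfly_1d(x0, x1, x2, x3):
    """One 4-point forward core transform using only add, subtract, and doubling."""
    s0, s1 = x0 + x3, x1 + x2
    d0, d1 = x0 - x3, x1 - x2
    return [s0 + s1, 2 * d0 + d1, s0 - s1, d0 - 2 * d1]

def forward_transform_4x4(block):
    """Apply the 1-D transform to every row, then to every column (Y = C X C^T)."""
    rows = [butterfly_1d(*row) for row in block]
    cols = [butterfly_1d(*col) for col in zip(*rows)]
    # cols[j] holds output column j, so transpose back to row-major order.
    return [list(r) for r in zip(*cols)]
```

A flat 4x4 block of ones yields a single DC coefficient of 16 with all other coefficients zero, which is a quick sanity check for any implementation.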

Design of Open Vector Graphics Accelerator for Mobile Vector Graphics (모바일 벡터 그래픽을 위한 OpenVG 가속기 설계)

  • Kim, Young-Ouk;Roh, Young-Sup
    • Journal of Korea Multimedia Society
    • /
    • v.11 no.10
    • /
    • pp.1460-1470
    • /
    • 2008
  • As the performance of recent mobile systems increases, vector graphics have been adopted to represent various types of dynamic menus, mails, and two-dimensional maps. This paper proposes a hardware accelerator for open vector graphics (OpenVG), which is widely used for two-dimensional vector graphics. We analyze the OpenVG specification and divide OpenVG into several functions suitable for hardware implementation. The proposed hardware accelerator is implemented on a field programmable gate array (FPGA) board using a hardware description language (HDL) and is about four times faster than an Alex processor.

A Design of an AES-based Security Chip for IoT Applications using Verilog HDL (IoT 애플리케이션을 위한 AES 기반 보안 칩 설계)

  • Park, Hyeon-Keun;Lee, Kwangjae
    • The Transactions of the Korean Institute of Electrical Engineers P
    • /
    • v.67 no.1
    • /
    • pp.9-14
    • /
    • 2018
  • In this paper, we introduce an AES-based security chip for Internet of Things (IoT) embedded systems. We used Verilog HDL to implement the AES algorithm in an FPGA. The designed AES module encrypts 128-bit plaintext into 128-bit ciphertext and vice versa. RTL simulations are performed to verify the AES function, and the results are compared with theory. An FPGA emulation was also performed with 40 types of test sequences using two Altera DE0-Nano-SoC boards. To evaluate performance, we compared the hardware design with AES implemented in software. The hardware implementation needs 3.9 to 7.7 times fewer processing cycles per data unit than the software implementation. However, the processing speed may become slower because of the characteristics of the hardware design. This can be addressed by using a pipelined scheme that divides the propagation delay or by using an ASIC design method. In addition to the AES algorithm designed in this paper, various algorithms such as IPSec can be implemented in hardware. If hardware IP is designed in advance, future IoT applications will be able to improve security strength without schedule difficulties.

Hardware Accelerated Design on Bag of Words Classification Algorithm

  • Lee, Chang-yong;Lee, Ji-yong;Lee, Yong-hwan
    • Journal of Platform Technology
    • /
    • v.6 no.4
    • /
    • pp.26-33
    • /
    • 2018
  • In this paper, we propose an image retrieval algorithm for real-time processing and design it in hardware. The proposed method is based on the Bag of Words (BoWs) classification algorithm and performs image search using bit streams. K-fold cross validation is used to verify the algorithm. The data are divided into seven classes of seven images each, for a total of 49 test images. Both accuracy and speed are measured. Image classification accuracy was 86.2% for the BoWs algorithm and 83.7% for the proposed hardware-accelerated algorithm, so the BoWs algorithm was 2.5 percentage points higher. Image retrieval takes 7.89 s with BoWs and 1.55 s with our algorithm, which makes our algorithm 5.09 times faster. The algorithm is divided into software and hardware parts. The software part is written in C. The Scale Invariant Feature Transform algorithm extracts feature points that are invariant to size and rotation from the image, and bit streams are generated from the extracted feature points. In the hardware part, the proposed image retrieval algorithm is written in Verilog HDL and designed and verified using an FPGA and Design Compiler. The generated bit streams are stored, the clustering step is performed, and search image databases or input image databases are generated and matched. With the proposed algorithm, searches that use database matching, in which each object is represented in a database, become faster, which can improve user convenience and satisfaction.

High Performance Integer Multiplier on FPGA with Radix-4 Number Theoretic Transform

  • Chang, Boon-Chiao;Lee, Wai-Kong;Goi, Bok-Min;Hwang, Seong Oun
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.16 no.8
    • /
    • pp.2816-2830
    • /
    • 2022
  • The Number Theoretic Transform (NTT) is a method for designing efficient multipliers for large integer multiplication, which is widely used in cryptography and scientific computation. It has also received wide attention from the research community for designing efficient hardware architectures for large RSA, fully homomorphic encryption, and lattice-based cryptography. Existing NTT hardware architectures reported in the literature are mainly based on the radix-2 NTT because of its small area consumption. However, an NTT with a larger radix (e.g., radix-4) may achieve higher speed at the expense of more hardware resources. In this paper, we present a performance evaluation of NTT architectures in terms of hardware resource consumption and latency, based on the proposed radix-2 and radix-4 techniques. Our experimental results show that the 16-point radix-4 architecture is 2× faster than the radix-2 architecture at the expense of approximately 4× additional hardware. The proposed architecture can be extended to support large integer multiplication in cryptography applications (e.g., RSA). The experimental results show that the proposed 3072-bit multiplier outperformed the best 3k-multiplier from Chen et al. [16] by 3.06%, but it also costs about 40% more LUTs and 77.8% more DSP resources.
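To make the NTT-based multiplication concrete, here is a minimal software sketch. Everything specific in it is an assumption chosen for illustration and not taken from the paper: the prime modulus 998244353 with primitive root 3, the 8-bit digit size, the iterative radix-2 structure, and the names `ntt` and `multiply`. It handles non-negative integers only.

```python
MOD = 998244353   # prime 119 * 2**23 + 1 (illustrative choice)
ROOT = 3          # primitive root modulo MOD

def ntt(values, invert=False):
    """Iterative radix-2 NTT over Z_MOD; len(values) must be a power of two."""
    a = list(values)
    n = len(a)
    j = 0
    for i in range(1, n):              # bit-reversal permutation
        bit = n >> 1
        while j & bit:
            j ^= bit
            bit >>= 1
        j ^= bit
        if i < j:
            a[i], a[j] = a[j], a[i]
    length = 2
    while length <= n:
        w_len = pow(ROOT, (MOD - 1) // length, MOD)
        if invert:
            w_len = pow(w_len, MOD - 2, MOD)   # modular inverse of the root
        half = length // 2
        for start in range(0, n, length):
            w = 1
            for k in range(half):
                u = a[start + k]
                v = a[start + k + half] * w % MOD
                a[start + k] = (u + v) % MOD
                a[start + k + half] = (u - v) % MOD
                w = w * w_len % MOD
        length <<= 1
    if invert:
        n_inv = pow(n, MOD - 2, MOD)
        a = [x * n_inv % MOD for x in a]
    return a

def multiply(x, y, digit_bits=8):
    """Multiply two non-negative integers via digit-wise NTT convolution."""
    mask = (1 << digit_bits) - 1

    def digits(v):
        out = []
        while v:
            out.append(v & mask)
            v >>= digit_bits
        return out or [0]

    dx, dy = digits(x), digits(y)
    # Each convolution coefficient must stay below MOD to be recovered exactly.
    if min(len(dx), len(dy)) * mask * mask >= MOD:
        raise ValueError("operands too large for this modulus and digit size")
    n = 1
    while n < len(dx) + len(dy):
        n <<= 1
    fx = ntt(dx + [0] * (n - len(dx)))
    fy = ntt(dy + [0] * (n - len(dy)))
    prod = ntt([p * q % MOD for p, q in zip(fx, fy)], invert=True)
    # Shifting and summing the coefficients performs the carry propagation.
    return sum(c << (digit_bits * i) for i, c in enumerate(prod))
```

With 8-bit digits a 3072-bit operand becomes 384 digits, well inside the range this modulus can represent exactly. A radix-4 variant would process four points per butterfly and halve the number of stages, which is the speed and area trade-off the abstract describes.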

Field Programmable Gate Array Reliability Analysis Using the Dynamic Flowgraph Methodology

  • McNelles, Phillip;Lu, Lixuan
    • Nuclear Engineering and Technology
    • /
    • v.48 no.5
    • /
    • pp.1192-1205
    • /
    • 2016
  • Field programmable gate array (FPGA)-based systems are thought to be a practical option to replace certain obsolete instrumentation and control systems in nuclear power plants. An FPGA is a type of integrated circuit, which is programmed after being manufactured. FPGAs have some advantages over other electronic technologies, such as analog circuits, microprocessors, and Programmable Logic Controllers (PLCs), for nuclear instrumentation and control, and safety system applications. However, safety-related issues for FPGA-based systems remain to be verified. Owing to this, modeling FPGA-based systems for safety assessment has now become an important point of research. One potential methodology is the dynamic flowgraph methodology (DFM). It has been used for modeling software/hardware interactions in modern control systems. In this paper, FPGA logic was analyzed using DFM. Four aspects of FPGAs are investigated: the "IEEE 1164 standard," registers (D flip-flops), configurable logic blocks, and an FPGA-based signal compensator. The ModelSim simulations confirmed that DFM was able to accurately model those four FPGA properties, proving that DFM has the potential to be used in the modeling of FPGA-based systems. Furthermore, advantages of DFM over traditional reliability analysis methods and FPGA simulators are presented, along with a discussion of potential issues with using DFM for FPGA-based system modeling.

Hardware architecture of a wavelet based multiple line addressing driving system for passive matrix displays

  • Lam, San;Smet, Herbert De
    • Korean Information Display Society: Conference Proceedings (한국정보디스플레이학회:학술대회논문집)
    • /
    • 2007.08a
    • /
    • pp.802-805
    • /
    • 2007
  • A hardware architecture is presented for a wavelet-based multiple line addressing driving scheme for passive matrix displays using an FPGA (Field Programmable Gate Array), which will be integrated in the scalable video coding architecture [1]. The incoming compressed video data stream is then transformed directly into the required column voltages by the hardware architecture, without the need for video decompression.

Implementation of back propagation algorithm for wearable devices using FPGA (FPGA를 이용한 웨어러블 디바이스를 위한 역전파 알고리즘 구현)

  • Choi, Hyun-Sik
    • The Journal of Korean Institute of Next Generation Computing
    • /
    • v.15 no.2
    • /
    • pp.7-16
    • /
    • 2019
  • Neural networks can be implemented in a variety of ways, and specialized chips are being developed to improve the hardware. To apply such neural networks to wearable devices, compactness and low-power operation are essential. From this point of view, a suitable implementation method is digital circuit design using a field programmable gate array (FPGA). For better performance, the learning algorithm, which accounts for a large part of a neural network, must be implemented within the FPGA. In this paper, the back propagation algorithm, one of various learning algorithms, is implemented on an FPGA, and the resulting neural network is verified by learning the OR gate operation. In addition, it is confirmed that this neural network can be used to analyze the bio-signal measurement results of various users by means of the learning algorithm.
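The OR-gate verification described above can be reproduced in a few lines of software. This is only a sketch under assumptions: the abstract does not state the network topology, so a 2-2-1 sigmoid network with squared-error loss, fixed initial weights, and a learning rate of 0.5 is assumed here, and the names `forward` and `train_or_gate` are hypothetical.

```python
import math

OR_DATA = [((0, 0), 0), ((0, 1), 1), ((1, 0), 1), ((1, 1), 1)]

def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))

def forward(params, x):
    """Return (hidden activations, output) of the assumed 2-2-1 network."""
    w_h, b_h, w_o, b_o = params
    hidden = [sigmoid(w_h[j][0] * x[0] + w_h[j][1] * x[1] + b_h[j]) for j in range(2)]
    out = sigmoid(w_o[0] * hidden[0] + w_o[1] * hidden[1] + b_o)
    return hidden, out

def train_or_gate(epochs=5000, lr=0.5):
    """Train on the OR truth table with per-sample back propagation."""
    # Fixed, arbitrary starting weights keep the run deterministic.
    w_h = [[0.5, -0.4], [0.3, 0.8]]
    b_h = [0.0, 0.0]
    w_o = [0.6, -0.2]
    b_o = 0.0
    for _ in range(epochs):
        for x, target in OR_DATA:
            hidden, out = forward((w_h, b_h, w_o, b_o), x)
            # Output-layer error term for squared-error loss and sigmoid output.
            delta_o = (out - target) * out * (1 - out)
            # Propagate the error back through the output weights.
            delta_h = [delta_o * w_o[j] * hidden[j] * (1 - hidden[j]) for j in range(2)]
            for j in range(2):
                w_o[j] -= lr * delta_o * hidden[j]
                b_h[j] -= lr * delta_h[j]
                for i in range(2):
                    w_h[j][i] -= lr * delta_h[j] * x[i]
            b_o -= lr * delta_o
    return w_h, b_h, w_o, b_o
```

After training, thresholding the output of `forward` at 0.5 for each of the four input pairs gives the network's learned truth table. A hardware version would typically replace the floating-point sigmoid with a fixed-point approximation or look-up table.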