• Title/Summary/Keyword: FPGA design

Search Result 1,019, Processing Time 0.022 seconds

Design of Image Extraction Hardware for Hand Gesture Vision Recognition

  • Lee, Chang-Yong;Kwon, So-Young;Kim, Young-Hyung;Lee, Yong-Hwan
    • Journal of Advanced Information Technology and Convergence
    • /
    • v.10 no.1
    • /
    • pp.71-83
    • /
    • 2020
  • In this paper, we propose a system that can detect the shape of a hand at high speed using an FPGA. The hand-shape detection system is designed using Verilog HDL, a hardware language that can process in parallel instead of sequentially running C++ because real-time processing is important. There are several methods for hand gesture recognition, but the image processing method is used. Since the human eye is sensitive to brightness, the YCbCr color model was selected among various color expression methods to obtain a result that is less affected by lighting. For the CbCr elements, only the components corresponding to the skin color are filtered out from the input image by utilizing the restriction conditions. In order to increase the speed of object recognition, a median filter that removes noise present in the input image is used, and this filter is designed to allow comparison of values and extraction of intermediate values at the same time to reduce the amount of computation. For parallel processing, it is designed to locate the centerline of the hand during scanning and sorting the stored data. The line with the highest count is selected as the center line of the hand, and the size of the hand is determined based on the count, and the hand and arm parts are separated. The designed hardware circuit satisfied the target operating frequency and the number of gates.

A Study on Design and Implementation of Scalable Angle Estimator Based on ESPRIT Algorithm (ESPRIT 알고리즘 기반 재구성 가능한 각도 추정기 설계에 관한 연구)

  • Dohyun Lee;Byunghyun Kim;Jongwha Chong;Sungjin Lee;Kyeongyuk Min
    • Journal of IKEEE
    • /
    • v.27 no.4
    • /
    • pp.624-629
    • /
    • 2023
  • Estimation of signal parameters via rotational invariance techniques (ESPRIT) is an algorithm that estimates the angle of a signal arriving at an array antenna using the shift invariance property of an array antenna. ESPRIT offers the good trade-off between performance and complexity. However, the ESPRIT algorithm still requires high-complexity operations such as covariance matrix and eigenvalue decomposition, so implementation with a hardware processor is essential to estimate the angle of arrival in real time. In addition, ESPRIT processors should have high performance. The performance is related to the number of antennas, and the number of antennas required for each application are different. Therefore, we proposed an ESPRIT processor that provides 2 to 8 variable antenna configurations to meet the performance and complexity requirements according to the applied field. The proposed ESPRIT processor was designed using the Verilog-HDL and implemented on a field programmable gate array (FPGA).

Improvement of Residual Delay Compensation Algorithm of KJJVC (한일상관기의 잔차 지연 보정 알고리즘의 개선)

  • Oh, Se-Jin;Yeom, Jae-Hwan;Roh, Duk-Gyoo;Oh, Chung-Sik;Jung, Jin-Seung;Chung, Dong-Kyu;Oyama, Tomoaki;Kawaguchi, Noriyuki;Kobayashi, Hideyuki;Kawakami, Kazuyuki;Ozeki, Kensuke;Onuki, Hirohumi
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.14 no.2
    • /
    • pp.136-146
    • /
    • 2013
  • In this paper, the residual delay compensation algorithm is proposed for FX-type KJJVC. In case of initial version as that design algorithm of KJJVC, the integer calculation and the cos/sin table for the phase compensation coefficient were introduced in order to speed up of calculation. The mismatch between data timing and residual delay phase and also between bit-jump and residual delay phase were found and fixed. In final design of KJJVC residual delay compensation algorithm, the initialization problem on the rotation memory of residual delay compensation was found when the residual delay compensated value was applied to FFT-segment, and this problem is also fixed by modifying the FPGA code. Using the proposed residual delay compensation algorithm, the band shape of cross power spectrum becomes flat, which means there is no significant loss over the whole bandwidth. To verify the effectiveness of proposed residual delay compensation algorithm, we conducted the correlation experiments for real observation data using the simulator and KJJVC. We confirmed that the designed residual delay compensation algorithm is well applied in KJJVC, and the signal to noise ratio increases by about 8%.

Design of Video Encoder activating with variable clocks of CCDs for CCTV applications (CCTV용 CCD를 위한 가변 clock으로 동작되는 비디오 인코더의 설계)

  • Kim, Joo-Hyun;Ha, Joo-Young;Kang, Bong-Soon
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.10 no.1
    • /
    • pp.80-87
    • /
    • 2006
  • SONY corporation preoccupies $80\%$ of a market of the CCD used in a CCTV system. The CCD of SONY have high duality which can not follow the progress of capability. But there are some problems which differ the clock frequency used in CCD from the frequency used in common video encoder. To get the result by using common video encoder, the system needs a scaler that could adjust image size and PLL that synchronizes CCD's with encoder's clock So, this paper proposes the video encoder that is activated at equal clock used in CCD without scaler and PLL. The encoder converts ITU-R BT.601 4:2:2 or ITU-R BT.656 inputs from various video sources into NTSC or PAL signals in CVBS. Due to variable clock, property of filters used in the encoder is automatically changed by clock and filters adopt multiplier-free structures to reduce hardware complexity. The hardware bit width of programmable digital filters for luminance and chrominance signals, along with other operating blocks, are carefully determined to produce hish-quality digital video signals of ${\pm}1$ LSB error or less. The proposed encoder is experimentally demonstrated by using the Altera Stratix EP1S80B953C6ES device.

Design and Implementation of a Hardware-based Transmission/Reception Accelerator for a Hybrid TCP/IP Offload Engine (하이브리드 TCP/IP Offload Engine을 위한 하드웨어 기반 송수신 가속기의 설계 및 구현)

  • Jang, Han-Kook;Chung, Sang-Hwa;Yoo, Dae-Hyun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.34 no.9
    • /
    • pp.459-466
    • /
    • 2007
  • TCP/IP processing imposes a heavy load on the host CPU when it is processed by the host CPU on a very high-speed network. Recently the TCP/IP Offload Engine (TOE), which processes TCP/IP on a network adapter instead of the host CPU, has become an attractive solution to reduce the load in the host CPU. There have been two approaches to implement TOE. One is the software TOE in which TCP/IP is processed by an embedded processor and the other is the hardware TOE in which TCP/IP is processed by a dedicated ASIC. The software TOE has poor performance and the hardware TOE is neither flexible nor expandable enough to add new features. In this paper we designed and implemented a hybrid TOE architecture, in which TCP/IP is processed by cooperation of hardware and software, based on an FPGA that has two embedded processor cores. The hybrid TOE can have high performance by processing time-critical operations such as making and processing data packets in hardware. The software based on the embedded Linux performs operations that are not time-critical such as connection establishment, flow control and congestions, thus the hybrid TOE can have enough flexibility and expandability. To improve the performance of the hybrid TOE, we developed a hardware-based transmission/reception accelerator that processes important operations such as creating data packets. In the experiments the hybrid TOE shows the minimum latency of about $19{\mu}s$. The CPU utilization of the hybrid TOE is below 6 % and the maximum bandwidth of the hybrid TOE is about 675 Mbps.

Implementation of High-Throughput SHA-1 Hash Algorithm using Multiple Unfolding Technique (다중 언폴딩 기법을 이용한 SHA-1 해쉬 알고리즘 고속 구현)

  • Lee, Eun-Hee;Lee, Je-Hoon;Jang, Young-Jo;Cho, Kyoung-Rok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.4
    • /
    • pp.41-49
    • /
    • 2010
  • This paper proposes a new high speed SHA-1 architecture using multiple unfolding and pre-computation techniques. We unfolds iterative hash operations to 2 continuos hash stage and reschedules computation timing. Then, the part of critical path is computed at the previous hash operation round and the rest is performed in the present round. These techniques reduce 3 additions to 2 additions on the critical path. It makes the maximum clock frequency of 118 MHz which provides throughput rate of 5.9 Gbps. The proposed architecture shows 26% higher throughput with a 32% smaller hardware size compared to other counterparts. This paper also introduces a analytical model of multiple SHA-1 architecture at the system level that maps a large input data on SHA-1 block in parallel. The model gives us the required number of SHA-1 blocks for a large multimedia data processing that it helps to make decision hardware configuration. The hs fospeed SHA-1 is useful to generate a condensed message and may strengthen the security of mobile communication and internet service.

An Improvement of Implementation Method for Multi-Layer AHB BusMatrix (ML-AHB 버스 매트릭스 구현 방법의 개선)

  • Hwang Soo-Yun;Jhang Kyoung-Sun
    • Journal of KIISE:Computer Systems and Theory
    • /
    • v.32 no.11_12
    • /
    • pp.629-638
    • /
    • 2005
  • In the System on a Chip design, the on chip bus is one of the critical factors that decides the overall system performance. Especially, in the case or reusing the IPs such as processors, DSPs and multimedia IPs that requires higher bandwidth, the bandwidth problems of on chip bus are getting more serious. Recently ARM proposes the Multi-Layer AHB BusMatrix that is a highly efficient on chip bus to solve the bandwidth problems. The Multi-Layer AHB BusMatrix allows parallel access paths between multiple masters and slaves in a system. This is achieved by using a more complex interconnection matrix and gives the benefit of increased overall bus bandwidth, and a more flexible system architecture. However, there is one clock cycle delay for each master in existing Multi-Layer AHB BusMatrix whenever the master starts new transactions or changes the slave layers because of the Input Stage and arbitration logic realized with Moore type. In this paper, we improved the existing Multi-Layer AHB BusMatrix architecture to solve the one clock cycle delay problems and to reduce the area overhead of the Input Stage. With the elimination of the Input Stage and some restrictions on the arbitration scheme, we tan take away the one clock cycle delay and reduce the area overhead. Experimental results show that the end time of total bus transaction and the average latency time of improved Multi-Layer AHB BusMatrix are improved by $20\%\;and\;24\%$ respectively. in ease of executing a number of transactions by 4-beat incrementing burst type. Besides the total area and the clock period are reduced by $22\%\;and\;29\%$ respectively, compared with existing Multi-layer AHB BusMatrix.

A STUDY ON THE RE-QUANTIZATION METHOD FOR PREVENTING DISTORTION OF CORRELATION RESULT (상관결과의 왜곡 방지를 위한 재양자화 방법에 관한 연구)

  • Yeom, Jae-Hwan;Oh, Se-Jin;Roh, Duk-Gyoo;Oh, Chung-Sik;Jung, Jin-Seung;Chung, Dong-Kyu;Oyama, Tomoaki;Kawaguchi, Noriyuki;Kobayashi, Hideyuki;Kawakami, Kazuyuki;Onuki, Hirofumi;Ozeki, Kensuke
    • Publications of The Korean Astronomical Society
    • /
    • v.27 no.5
    • /
    • pp.419-429
    • /
    • 2012
  • In this paper, we propose a new re-quantization method after FFT processing to prevent the distortion of correlation result of VCS (VLBI Correlation Subsystem). The re-quantization is used to rearrange the data bit so as to reduce the data rate processed as 16-bit of FFT result of VCS. Having done this procedure, we found that the distorted spectrum of correlation result occurred in the delay tracking experiments by the re-quantization method introduced for initial design of VCS. In order to solve this, two kinds of re-quantization method, that is, the comparison and selection-type, are proposed. The first is to re-quantize the FFT result as a valid-bit by comparing with the input data after determining the adequate threshold. The second is manually to select the valid-bit of FFT result after finding the valid-field of data according to the bit-distribution of input data. We confirmed that the second is more effective compared with the first through the experimental result, and it will be implemented without so much modification of applied method in the condition of the limited resource of FPGA. The re-quantization is, however, carried out with 4-bit in the proposed second method for FFT result, and then the distortion of correlation result is also appeared. To fix this problem, the bit for re-quantization is extended to 8-bit. The proposed 8-bit selection-type is effectively verified so that the distortion of correlation result disappeared by applying to VCS in consequence of the simulation and correlation experiments.

A Hardware Implementation of the Underlying Field Arithmetic Processor based on Optimized Unit Operation Components for Elliptic Curve Cryptosystems (타원곡선을 암호시스템에 사용되는 최적단위 연산항을 기반으로 한 기저체 연산기의 하드웨어 구현)

  • Jo, Seong-Je;Kwon, Yong-Jin
    • Journal of KIISE:Computing Practices and Letters
    • /
    • v.8 no.1
    • /
    • pp.88-95
    • /
    • 2002
  • In recent years, the security of hardware and software systems is one of the most essential factor of our safe network community. As elliptic Curve Cryptosystems proposed by N. Koblitz and V. Miller independently in 1985, require fewer bits for the same security as the existing cryptosystems, for example RSA, there is a net reduction in cost size, and time. In this thesis, we propose an efficient hardware architecture of underlying field arithmetic processor for Elliptic Curve Cryptosystems, and a very useful method for implementing the architecture, especially multiplicative inverse operator over GF$GF (2^m)$ onto FPGA and futhermore VLSI, where the method is based on optimized unit operation components. We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed We optimize the arithmetic processor for speed so that it has a resonable number of gates to implement. The proposed architecture could be applied to any finite field $F_{2m}$. According to the simulation result, though the number of gates are increased by a factor of 8.8, the multiplication speed and inversion speed has been improved 150 times, 480 times respectively compared with the thesis presented by Sarwono Sutikno et al. [7]. The designed underlying arithmetic processor can be also applied for implementing other crypto-processor and various finite field applications.