• Title/Summary/Keyword: Hardware Architecture

High-Performance Givens Rotation-based QR Decomposition Architecture Applicable for MIMO Receiver (MIMO 수신기에 적용 가능한 고성능 기븐스 회전 기반의 QR 분해 하드웨어 구조)

  • Yoon, Ji-Hwan;Lee, Min-Woo;Park, Jong-Sun
    • Journal of the Institute of Electronics Engineers of Korea SC / v.49 no.3 / pp.31-37 / 2012
  • This paper presents an efficient hardware architecture for high-speed Givens rotation-based QR decomposition. The proposed architecture achieves a highly parallel Givens rotation process by maximizing the number of pivots selected for parallel zero-insertions. Sign-select lookahead (SSL)-CORDIC is also used to speed up the Givens rotations. Performance increases considerably compared with the conventional triangular systolic array (TSA) architecture. Moreover, the circuit area is reduced by decreasing the number of flip-flops that hold pre-computed results during the decomposition. The proposed QR decomposition hardware was implemented using TSMC $0.25\,\mu\mathrm{m}$ technology. Experimental results show that the proposed architecture achieves up to 70% speed-up over the TACR/TSA-based architecture for $8 \times 8$ matrix decomposition.
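
For readers unfamiliar with the underlying operation, the following is a minimal floating-point sketch of QR decomposition by Givens rotations. It is not the paper's architecture: it uses plain sequential arithmetic rather than CORDIC, and the function name `givens_qr` and the bottom-up, adjacent-row elimination order are assumptions chosen for illustration. Rotations that act on disjoint row pairs are independent of each other, which is the property a parallel pivot selection can exploit.

```python
import math

def givens_qr(a):
    """Return (q, r) with a = q * r, built from Givens rotations.

    Illustrative only: `a` is a list of rows; no pivot parallelism or
    CORDIC is modelled here.
    """
    n_rows, n_cols = len(a), len(a[0])
    r = [list(map(float, row)) for row in a]
    # q_t accumulates the product of all rotations, i.e. Q transposed.
    q_t = [[1.0 if i == j else 0.0 for j in range(n_rows)]
           for i in range(n_rows)]
    for col in range(n_cols):
        # Zero the entries below the diagonal, bottom-up, pairing adjacent rows.
        for row in range(n_rows - 1, col, -1):
            x, y = r[row - 1][col], r[row][col]
            norm = math.hypot(x, y)
            if norm == 0.0:
                continue  # both entries already zero, nothing to rotate
            c, s = x / norm, y / norm
            # Apply the same 2x2 rotation [[c, s], [-s, c]] to R and to Q^T.
            for m in (r, q_t):
                upper, lower = m[row - 1], m[row]
                m[row - 1] = [c * u + s * l for u, l in zip(upper, lower)]
                m[row] = [-s * u + c * l for u, l in zip(upper, lower)]
    q = [list(column) for column in zip(*q_t)]
    return q, r
```

Calling `givens_qr` on a square matrix yields an orthogonal `q` and an upper-triangular `r` (up to rounding error) whose product reproduces the input.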

An FPGA-based Parallel Hardware Architecture for Real-time Eye Detection

  • Kim, Dong-Kyun;Jung, Jun-Hee;Nguyen, Thuy Tuong;Kim, Dai-Jin;Kim, Mun-Sang;Kwon, Key-Ho;Jeon, Jae-Wook
    • JSTS:Journal of Semiconductor Technology and Science / v.12 no.2 / pp.150-161 / 2012
  • Eye detection is widely used in applications such as face recognition, driver behavior analysis, and human-computer interaction. However, it is difficult to achieve real-time performance with software-based eye detection in an embedded environment. In this paper, we propose a parallel hardware architecture for real-time eye detection. We use the AdaBoost algorithm with the modified census transform (MCT) to detect eyes in a face image. We parallelize part of the algorithm to speed up processing. Several downscaled pyramid images of the eye candidate region are generated in parallel from the input face image. Using these downscaled images, the left and right eyes can be detected simultaneously. The sequential data processing bottleneck caused by repetitive operations is removed by employing a pipelined parallel architecture. The proposed architecture is designed in Verilog HDL and implemented on a Virtex-5 FPGA for prototyping and evaluation. The proposed system can detect eyes within 0.15 ms in a VGA image.
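
As background on the feature the abstract mentions, here is a small sketch of a 3x3 modified census transform, assuming the common definition in which each of the nine pixels in a window is compared against the window mean to form a 9-bit code. The function name `mct_3x3` and the bit ordering (top-left pixel as the most significant bit) are assumptions; the paper's hardware pipeline and its AdaBoost classifier are not modelled.

```python
def mct_3x3(image):
    """Return the 9-bit MCT code for every interior pixel of `image`.

    `image` is a list of rows of integer intensities. The result has
    two fewer rows and columns than the input (no border handling).
    """
    height, width = len(image), len(image[0])
    codes = [[0] * (width - 2) for _ in range(height - 2)]
    for y in range(1, height - 1):
        for x in range(1, width - 1):
            window = [image[y + dy][x + dx]
                      for dy in (-1, 0, 1) for dx in (-1, 0, 1)]
            total = sum(window)
            code = 0
            for pixel in window:
                # pixel > mean  <=>  9 * pixel > total (avoids a division)
                code = (code << 1) | (1 if 9 * pixel > total else 0)
            codes[y - 1][x - 1] = code
    return codes
```

Because every pixel is compared with the local mean, adding a constant brightness offset to the whole image leaves the codes unchanged.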

A Novel Spiral-Type Motion Estimation Architecture for H.264/AVC

  • Hirai, Naoyuki;Song, Tian;Liu, Yizhong;Shimamoto, Takashi
    • JSTS:Journal of Semiconductor Technology and Science / v.10 no.1 / pp.37-44 / 2010
  • H.264/AVC introduces new motion compensation features such as variable block sizes and multiple reference frames. However, these features significantly increase implementation complexity. In this paper, an efficient architecture for spiral-type motion estimation (ME) is proposed. First, we propose a hardware-friendly spiral search order. Then, an efficient processing element (PE) architecture for ME is proposed to realize this search order. The improved PE has four input/output ports that let the reference pixel data move one pixel up, down, right, or left. Moreover, a parallel calculation architecture that derives the SADs (sums of absolute differences) of all block sizes from the 4x4 SADs is introduced. The hardware implementation costs about 145k gates. The maximum clock frequency is 134 MHz on an FPGA (Xilinx Virtex-5).
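
The idea of deriving every block-size SAD from 4x4 SADs can be sketched as follows for one 16x16 macroblock at a single search position. This is an assumption-laden illustration, not the paper's PE array: the function names `sad_4x4_grid` and `merge_sads` are hypothetical, and block sizes are written as width x height.

```python
def sad_4x4_grid(cur, ref):
    """Return a 4x4 grid holding the SAD of each 4x4 sub-block.

    `cur` and `ref` are 16x16 lists of pixel rows.
    """
    return [[sum(abs(cur[4 * by + y][4 * bx + x] - ref[4 * by + y][4 * bx + x])
                 for y in range(4) for x in range(4))
             for bx in range(4)]
            for by in range(4)]

def merge_sads(s4):
    """Build the SADs of all seven H.264 partition sizes by adding 4x4 SADs."""
    def region(by0, bx0, bh, bw):
        # Sum the 4x4 SADs covering a region given in units of 4x4 blocks.
        return sum(s4[by][bx]
                   for by in range(by0, by0 + bh)
                   for bx in range(bx0, bx0 + bw))

    # name -> (height, width) in units of 4x4 blocks
    sizes = {"4x4": (1, 1), "8x4": (1, 2), "4x8": (2, 1), "8x8": (2, 2),
             "16x8": (2, 4), "8x16": (4, 2), "16x16": (4, 4)}
    return {name: [region(by, bx, bh, bw)
                   for by in range(0, 4, bh) for bx in range(0, 4, bw)]
            for name, (bh, bw) in sizes.items()}
```

Only the sixteen 4x4 SADs require pixel-level arithmetic; the 41 partition SADs of the macroblock then follow from additions alone.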

VLSI Design of Demodulating Fingers with Low Hardware Complexity for MC-CDMA Mobile System (MC-CDMA 이동국의 하드웨어 복잡도를 줄이기 위한 다중경로 복조기의 설계)

  • 황상윤;이성주;김재석
    • Proceedings of the IEEK Conference / 1998.10a / pp.1113-1116 / 1998
  • This paper presents an efficient hardware architecture of demodulating fingers for demodulating multipath signals in an MC-CDMA mobile system. We design a new architecture in which the demodulating fingers share a single arithmetic unit to reduce hardware complexity. This arithmetic unit performs the MAC (multiplication and accumulation) operations of all demodulating fingers. The proposed architecture is suitable for IS-95-based CDMA PCS systems. Three demodulating fingers for MC-CDMA, which demodulate 7 channels, contain about 42K logic gates. The proposed system is shown to be very useful for Multi-Code CDMA systems in which several channels are demodulated simultaneously.

COSIM(HARDWARE-SOFTWARE COSIMULATOR): JAVABEANS-BASED TOOL FOR WEB APPLICATIONS

  • Lee, Kangsun;Jaeho Jung;Youngsuk Hwang
    • Proceedings of the Korea Society for Simulation Conference / 2001.10a / pp.354-358 / 2001
  • Cosim (Hardware and Software Co-Simulator) is a JavaBeans-based simulation tool for validating system architectures and estimating the performance of web applications. Cosim has four components: Modeler, Translator, Engine, and Scenario. Users start with Modeler to describe the system architecture in a UML (Unified Modeling Language) deployment diagram, and then specify hardware and software performance parameters such as execution delay, network topology, and frame size. All information specified in Modeler is sent to Translator and automatically converted to Java programs. Scenario is responsible for running the Java programs and producing results as text reports and graphs. Developers can reduce development time and cost by validating the system architecture of a web application before actual deployment.

Design of FM sound synthesizer IC for multimedia with phase bit optimized (위상 데이터 비트수를 최적화한 멀티미디어용 FM 음원합성 IC의 설계)

  • 홍현석;김이섭
    • The Journal of Korean Institute of Communications and Information Sciences / v.21 no.11 / pp.2978-2990 / 1996
  • With the advent of the multimedia era, interest in computer music and sound synthesis is ever increasing. FM sound synthesis makes it possible to synthesize the sounds of various musical instruments with a relatively simple hardware architecture. In this paper, we therefore designed a hardware architecture for a real-time sound synthesizer and its logic circuits. We designed a basic sound generator for real-time implementation, analyzed the characteristics of sounds synthesized with this architecture, and extracted the FM parameters of musical instrument sounds using the Csound software. The major blocks of the hardware are a phase generator, a sine-function generator, an envelope generator, and a multiplier part. Finally, the logic circuits were designed and verified in VHDL and synthesized into logic gates with a 1.0 µm standard cell library, so they can easily be implemented as an ASIC.
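
To make the block structure concrete, the sketch below models a single two-operator FM voice in software using the four kinds of block the abstract lists: a phase accumulator, a sine lookup table, an envelope, and a multiplier. All numeric choices (`PHASE_BITS`, `TABLE_BITS`, the exponential `decay` envelope) and all function names are assumptions for illustration; the phase width the paper actually selects is not given in the abstract.

```python
import math

PHASE_BITS = 16   # assumed phase accumulator width
TABLE_BITS = 10   # assumed sine table address width
PHASE_MASK = (1 << PHASE_BITS) - 1
SINE_TABLE = [math.sin(2 * math.pi * i / (1 << TABLE_BITS))
              for i in range(1 << TABLE_BITS)]

def phase_increment(freq_hz, sample_rate):
    """Phase generator: per-sample step for a given frequency."""
    return round(freq_hz * (1 << PHASE_BITS) / sample_rate)

def sine_lookup(phase):
    """Sine generator: index the table with the top bits of the phase."""
    return SINE_TABLE[(phase & PHASE_MASK) >> (PHASE_BITS - TABLE_BITS)]

def fm_voice(carrier_hz, modulator_hz, index, n_samples,
             sample_rate=44100, decay=0.9995):
    """Return samples of envelope * sin(carrier phase + index * sin(modulator phase))."""
    car_inc = phase_increment(carrier_hz, sample_rate)
    mod_inc = phase_increment(modulator_hz, sample_rate)
    car_phase = mod_phase = 0
    envelope = 1.0
    samples = []
    for _ in range(n_samples):
        # Convert the modulation term from radians to phase-accumulator units.
        offset = int(index * sine_lookup(mod_phase)
                     * (1 << PHASE_BITS) / (2 * math.pi))
        # Multiplier: scale the carrier sine by the envelope.
        samples.append(envelope * sine_lookup(car_phase + offset))
        car_phase = (car_phase + car_inc) & PHASE_MASK
        mod_phase = (mod_phase + mod_inc) & PHASE_MASK
        envelope *= decay  # simple exponential decay envelope
    return samples
```

Reducing `PHASE_BITS` or `TABLE_BITS` shrinks the accumulator and the table at the cost of frequency resolution and added distortion, which is the kind of trade-off a phase-bit optimization has to weigh.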

A PC-Based Open Robot Control System : PC-ORC (PC에 기반을 둔 개방형 로봇제어시스템 : PC-ORC)

  • 김점구;최경현;홍금식
    • Journal of Institute of Control, Robotics and Systems / v.6 no.5 / pp.415-425 / 2000
  • An open architecture manufacturing strategy aims to integrate manufacturing components on a single platform so that a particular component can be easily added or replaced. Therefore, a control scheme based on the open architecture concept is hardware-independent. In this paper, a modular, object-oriented approach to a PC-based open robot control system is investigated. A standard reference model for robot systems, consisting of three modules (a hardware module, an operating system module, and an application software module), is first proposed. Then, a PC-based Open Robot Controller (PC-ORC), which can reconfigure robot control systems in various production environments, is developed. The PC-ORC is built with an object-oriented method and allows easy implementation and modification of its modules. It consists of basic software, application objects, and additional hardware devices on the PC platform. The application objects include a sequencer, computation unit, servo control, ancillary equipment, and external sensor control. To demonstrate its applicability, the proposed PC-ORC configuration is applied to an industrial SCARA robot system.

Object Recognition using On-Chip Multiprocessing Microprocessor (다중처리 마이크로프로세서를 이용한 객체 인식)

  • Chung, Yong-Wha;Park, Kyoung;Hahn, Woo-Jong
    • Proceedings of the Korean Information Science Society Conference / 1999.10c / pp.762-767 / 1999
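  • Object recognition is an interesting application area that requires high-performance computing. Most current high-performance computers adopt general-purpose superscalar microprocessors, but as semiconductor integration density increases, new microprocessor architectures are being proposed to replace the superscalar architecture. In this paper, we analyze whether the on-chip multiprocessing microprocessor architecture, which has recently emerged as a prominent new microprocessor architecture, is suitable for object recognition applications. To examine its performance characteristics, we first developed a program-driven microprocessor simulator and a programming environment. Simulations based on these showed that the multiprocessing microprocessor exploits thread-level parallelism appropriately with small overhead, confirming that the architecture is well suited to object recognition applications.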

High Throughput Radix-4 SISO Decoding Architecture with Reduced Memory Requirement

  • Byun, Wooseok;Kim, Hyeji;Kim, Ji-Hoon
    • JSTS:Journal of Semiconductor Technology and Science / v.14 no.4 / pp.407-418 / 2014
  • As the throughput requirements of next-generation communication systems increase, it becomes essential to implement a high-throughput SISO (Soft-Input Soft-Output) decoder with minimal hardware resources. In this paper, we compare cascaded radix-4 ACS (Add-Compare-Select) and LUT (Look-Up Table)-based radix-4 ACS in terms of delay, area, and power consumption. The hardware overhead incurred by the retiming technique used for high-speed radix-4 ACS operation is also analyzed. Based on these analyses, a high-throughput radix-4 SISO decoding architecture built on a simple path metric recovery circuit is proposed to minimize hardware resources. The proposed architecture is implemented in a 65 nm CMOS process, and it reduces memory requirement and power consumption by up to 78% and 32%, respectively, while meeting the high-throughput requirement.
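
As a rough illustration of what a radix-4 ACS step computes, the sketch below uses a hypothetical 4-state shift-register trellis and the max-log approximation (a plain maximum in the compare-select stage). It shows that one radix-4 update, which compares four candidates per state, gives the same path metrics as two consecutive radix-2 updates. The trellis, the function names, and the branch metric layout `gamma[prev][next]` are assumptions; the paper's cascaded and LUT-based circuits are not modelled.

```python
N_STATES = 4

def predecessors(state):
    """States that can reach `state` in one step of a 2-bit shift-register trellis."""
    return (state >> 1, (state >> 1) | 2)

def radix2_step(metrics, gamma):
    """One trellis step: add branch metrics, compare, select the maximum."""
    return [max(metrics[p] + gamma[p][t] for p in predecessors(t))
            for t in range(N_STATES)]

def radix4_step(metrics, gamma1, gamma2):
    """Two trellis steps merged into one four-way add-compare-select."""
    updated = []
    for t in range(N_STATES):
        candidates = []
        for m in predecessors(t):          # intermediate state
            for p in predecessors(m):      # state two steps back
                candidates.append(metrics[p] + gamma1[p][m] + gamma2[m][t])
        updated.append(max(candidates))
    return updated
```

Merging two steps halves the number of sequential ACS iterations per block at the cost of a wider comparison, which is where the delay, area, and power trade-offs studied in the abstract arise.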

Hardware Design of High Performance HEVC Deblocking Filter for UHD Videos (UHD 영상을 위한 고성능 HEVC 디블록킹 필터 설계)

  • Park, Jaeha;Ryoo, Kwangki
    • Journal of the Korea Institute of Information and Communication Engineering / v.19 no.1 / pp.178-184 / 2015
  • This paper proposes a hardware architecture for a high-performance deblocking filter (DBF) in High Efficiency Video Coding (HEVC) for UHD (Ultra High Definition) videos. To reduce processing time, the proposed hardware uses a 4-stage pipelined architecture with two filters and a parallel boundary strength module. The proposed filter can also be used in low-voltage designs by applying clock gating to the 4-stage pipeline. A segmented memory architecture resolves the hazards that arise when the single-port SRAM is accessed. The proposed filtering order shortens the delay incurred when storing data into the single-port SRAM at the pre-processing stage. The proposed DBF hardware was designed in Verilog HDL and synthesized with the TSMC 0.18 µm CMOS standard cell library, resulting in 22k logic gates. Furthermore, it can process UHD 8K ($7680 \times 4320$) samples at 60 fps at an operating frequency of 150 MHz, and its maximum operating frequency is 285 MHz. Analysis shows that the number of operation cycles needed to process one coding unit is improved by 32% over the previous design.