• Title/Summary/Keyword: Hardware Efficient

Search Result 1,130, Processing Time 0.03 seconds

A hardware implementation of neural network with modified HANNIBAL architecture (수정된 하니발 구조를 이용한 신경회로망의 하드웨어 구현)

  • 이범엽;정덕진
    • The Transactions of the Korean Institute of Electrical Engineers
    • /
    • v.45 no.3
    • /
    • pp.444-450
    • /
    • 1996
  • A digital hardware architecture for artificial neural network with learning capability is described in this paper. It is a modified hardware architecture known as HANNIBAL(Hardware Architecture for Neural Networks Implementing Back propagation Algorithm Learning). For implementing an efficient neural network hardware, we analyzed various type of multiplier which is major function block of neuro-processor cell. With this result, we design a efficient digital neural network hardware using serial/parallel multiplier, and test the operation. We also analyze the hardware efficiency with logic level simulation. (author). refs., figs., tabs.

  • PDF

A bitwidth optimization algorithm for efficient hardware sharing (효율적인 하드웨어 공유를 위한 단어길이 최적화 알고리듬)

  • 최정일;전홍신;이정주;김문수;황선영
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.22 no.3
    • /
    • pp.454-468
    • /
    • 1997
  • This paper presents a bitwidth optimization algorithm for efficient hardware sharing in digital signal processing system. The proposed algorithm determines the fixed-point representation for each signal through bitwidth optimization to generate the hardware requiring less area. To reduce the operator area, the algorithm partitions the abstract operations in the design description into several groups, such that the operations in the same group can share an operator. The partitioning result are fed to a high-level synthesis system to generate the pipelined fixed-point datapaths. The proposed algorithm has been implemented in SODAS-DSP an automatic synthesis system for fixed-point DSP hardware. Accepting the models of DSP algorithms in schematics, the system automatically generates the fixed-point datapath and controller satisfying the design constraints in area, speed, and SNR(Signal-to-Noise Ratio). Experimental results show that the efficiency of the proposed algorithm by generates the area-efficient DSP hardwares satisfying performance constraints.

  • PDF

An Efficient Hardware Architecture of Coordinate Transformation for Panorama Unrolling of Catadioptric Omnidirectional Images

  • Lee, Seung-Ho
    • Journal of IKEEE
    • /
    • v.15 no.1
    • /
    • pp.10-14
    • /
    • 2011
  • In this paper, we present an efficient hardware architecture of unrolling image mapper of catadioptric omnidirectional imaging systems. The catadioptric omnidirectional imaging systems generate images of 360 degrees of view and need to be transformed into panorama images in rectangular coordinate. In most application, it has to perform the panorama unrolling in real-time and at low-cost, especially for high-resolution images. The proposed hardware architecture adopts a software/hardware cooperative structure and employs several optimization schemes using look-up-table(LUT) of coordinate conversion. To avoid the on-line division operation caused by the coordinate transformation algorithm, the proposed architecture has the LUT which has pre-computed division factors. And then, the amount of memory used by the LUT is reduced to 1/4 by using symmetrical characteristic compared with the conventional architecture. Experimental results show that the proposed hardware architecture achieves an effective real-time performance and lower implementation cost, and it can be applied to other kinds of catadioptric omnidirectional imaging systems.

Hardware based set-associative IP address lookup scheme (하드웨어 기란 집합연관 IP 주소 검색 방식)

  • Yun Sang-Kyun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.8B
    • /
    • pp.541-548
    • /
    • 2005
  • IP lookup and forwarding process becomes the bottleneck of packet transmission as IP traffic increases. Previous hardware-based IP address lookup schemes using an index-based table are not memory-efficient due to sparse distribution of the routing prefixes. In this paper, we propose memory-efficient hardware based IP lookup scheme called set-associative IP address lookup scheme, which provides the same IP lookup speed with much smaller memory requirement. In the proposed scheme, an NHA entry stores the prefix and next hop together. The IP lookup procedure compares a destination IP address with eight entries in a corresponding set simultaneously and finds the longest matched prefix. The memory requirement of the proposed scheme is about $42\%$ of that of Lin's scheme. Thus, the set-associative IP address lookup scheme is a memory-efficient hardware based IP address lookup scheme.

Design of 1-D DCT processor using a new efficient computation sharing multiplier (새로운 연산 공유 승산기를 이용한 1차원 DCT 프로세서의 설계)

  • Lee, Tae-Wook;Cho, Sang-Bock
    • The KIPS Transactions:PartA
    • /
    • v.10A no.4
    • /
    • pp.347-356
    • /
    • 2003
  • The OCT algorithm needs efficient hardware architecture to compute inner product. The conventional methods have large hardware complexity. Because of this reason. a computation sharing multiplier was proposed for implementing inner product. However, the existing multiplier has inefficient hardware architecture in precomputer and select units. Therefore it degrades the performance of the multiplier. In this paper, we proposed a new efficient computation sharing multiplier and applied it to implementation of 1-D DCT processor. The comparison results show that the new multiplier is more efficient than an old one when hardware architectures and logic synthesis results were compared. The designed 1-D DCT processor by using the proposed multiplier is more high performance than typical design methods.

EFFICIENT PARALLEL GAUSSIAN NORMAL BASES MULTIPLIERS OVER FINITE FIELDS

  • Kim, Young-Tae
    • Honam Mathematical Journal
    • /
    • v.29 no.3
    • /
    • pp.415-425
    • /
    • 2007
  • The normal basis has the advantage that the result of squaring an element is simply the right cyclic shift of its coordinates in hardware implementation over finite fields. In particular, the optimal normal basis is the most efficient to hardware implementation over finite fields. In this paper, we propose an efficient parallel architecture which transforms the Gaussian normal basis multiplication in GF($2^m$) into the type-I optimal normal basis multiplication in GF($2^{mk}$), which is based on the palindromic representation of polynomials.

The clone of Moore machine using Hardware genetic algorithm (하드웨어 유전자 알고리즘을 이용한 무어 머신의 복제)

  • 권혁수;박세현;이정환;노석호;서기성
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2002.05a
    • /
    • pp.466-468
    • /
    • 2002
  • This paper proposes a new type of evolvable hardware for implementing the clone of Moore State machine. The proposed Evolvable Hardware is employed efficient pipeline parallelization, handshaking mechanism and fitness function in FPGA Genetic Algorithm(GA) has known as a method of solving NP problem in various applications. Since a major drawback of the GA is that it needs a long computation time, the hardware implementation of Genetic Algorithm is focused on in recent studies. Conventional hardware GA uses the fired length of chromosome but the proposed Evolvable Hardware uses the variable length of chromosome by the efficient 16 bit Pipeline Unit. Experimental results show that the proposed evolvable hardware is applicable to the implementation of the clone for Moore State machine

  • PDF

The clone of Moore machine using hardware genetic algorithm (하드웨어 유전자 알고리즘을 이용한 무어 머신의 복제)

  • 서기성;박세현;권혁수;이정환;노석호
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.6 no.5
    • /
    • pp.718-723
    • /
    • 2002
  • This paper proposes a new type of evolvable hardware for implementing the clone of Moore State machine. The proposed Evolvable Hardware is employed efficient pipeline parallelization, handshaking mechanism and fitness function in FPGA. Genetic Algorithm(GA) has known as a method of solving NP problem in various applications. Since a major drawback of the GA is that it needs a long computation time, the hardware implementation of Genetic Algorithm is focused on in recent studies. Conventional hardware GA uses the fixed length of chromosome but the proposed Evolvable Hardware uses the variable length of chromosome by the efficient 16 bit Pipeline Unit. Experimental results show that the proposed evolvable hardware is applicable to the implementation of the clone for Moore State machine.

An Improvement MPEG-2 Video Encoder Through Efficient Frame Memory Interface (효율적인 프레임 메모리 인터페이스를 통한 MPEG-2 비디오 인코더의 개선)

  • 김견수;고종석;서기범;정정화
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.24 no.6B
    • /
    • pp.1183-1190
    • /
    • 1999
  • This paper presents an efficient hardware architecture to improve the frame memory interface occupying the largest hardware area together with motion estimator in implementing MPEG-2 video encoder as an ASIC chip. In this architecture, the memory size for internal data buffering and hardware area for frame memory interface control logic are reduced through the efficient memory map organization of the external SDRAM having dual bank and memory access timing optimization between the video encoder and external SDRAM. In this design, 0.5 m, CMOS, TLM (Triple Layer Metal) standard cells are used as design libraries and VHDL simulator and logic synthesis tools are used for hardware design add verification. The hardware emulator modeled by C-language is exploited for various test vector generation and functional verification. The architecture of the improved frame memory interface occupies about 58% less hardware area than the existing architecture[2-3], and it results in the total hardware area reduction up to 24.3%. Thus, the (act that the frame memory interface influences on the whole area of the video encoder severely is presented as a result.

  • PDF

A Study on the Efficient Occlusion Culling Using Z-Buffer and Simplified Model (Z-Buffer와 간략화된 모델을 이용한 효율적인 가려지는 물체 제거 기법(Occlusion Culling)에 관한 연구)

  • 정성준;이규열;최항순;성우제;조두연
    • Korean Journal of Computational Design and Engineering
    • /
    • v.8 no.2
    • /
    • pp.65-74
    • /
    • 2003
  • For virtual reality, virtual manufacturing system, or simulation based design, we need to visualize very large and complex 3D models which are comprising of very large number of polygons. To overcome the limited hardware performance and to attain smooth realtime visualization, there have been many researches about algorithms which reduce the number of polygons to be processed by graphics hardware. One of these algorithms, occlusion culling is a method of rejecting the objects which are not visible because they are occluded by other objects, and then passing only the visible objects to graphics hardware. Existing occlusion culling algorithms have some shortcomings such as the required long preprocessing time, the limitation of occluder shape, or the need for special hardware implementation. In this study, an efficient occlusion culling algorithm is proposed. The proposed algorithm reads and analyzes Z-buffer of graphics hardware using Microsoft DirectX, and then determines each object's visibility. This proposed algorithm can speed up visualization by reading Z-buffer using DirectX which can access hardware directly compared to OpenGL, by reading only the region to which each object is projected instead of reading the whole Z-Buffer, and the proposed algorithm can perform more exact visibility test by using simplified model instead of using bounding box. For evaluation, the proposed algorithm was applied to very large polygonal models. And smooth realtime visualization was attained.