• Title/Summary/Keyword: Systolic Structure

Search Result 57, Processing Time 0.028 seconds

Implementation of systolic array for 2-D IIR digital filters (2-D IIR digital filter에 대한 systolic array구현)

  • 김수현
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1992.06a
    • /
    • pp.29-32
    • /
    • 1992
  • In this paper, a systolic array structure is derived from the realization of 2-D IIR digital filters directed from the SFG(signal flow graph). After realized the 1-D formed partial systolic array, we implemented the complete systolic array to be cascaded 1-D form. The cascading of partial systolic arrays reduce the storage element which sued to delay input signal. 1-D systolic array is derived from that DG is designed through local communication approach and then it mapping to SFG. The derived structure is very simple and has high throughput because during new imput sample is supplied, new output is obtained every sampling period. And broadcast input signal is eliminated. Since the systolic array has property of regularity, modularity, local interconnection and highly synchronized multiprocessing, thus is very suitable for VLSI implementation.

  • PDF

A low-power systolic structure for MP3 IMDCT Using addition and shift operation (덧셈과 쉬프트 연산을 사용한 MP3 IMDCT의 저전력 Systolic 구조)

  • Jang Young Beom;Lee Won Sang
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.29 no.10C
    • /
    • pp.1451-1459
    • /
    • 2004
  • In this paper, a low-power 32-point IMDCT structure is proposed for MP3. Through re-odering of IMDCT matrices, we propose the systolic structure operating with 16, 8, 4, 2, and 1 cycle, respectively. To reduce power consumption, multiplication of each sub blocks are implemented by add and shift operation with CSD(Canrmic sigled digit) form coefficients. To reduce, furthermore, the number of adders, we utilize the common sub-expression sharing techniques. With these techniques, the relative power consumption of the proposed structure is reduced by 58.4% comparison to the conventional structure using only 2's complement form coefficient. Validity of the proposed structure is proved through Verilog-HDL coding.

Characteristic Analysis of Modular Multiplier for GF($2^m$) (유한 필드 GF($2^m$)상의 모듈러 곱셈기 특성 분석)

  • 한상덕;김창훈;홍춘표
    • Proceedings of the IEEK Conference
    • /
    • 2002.06b
    • /
    • pp.277-280
    • /
    • 2002
  • This paper analyze the characteristics of three multipliers in finite fields GF(2m) from the point of view of processing time and area complexity. First, we analyze structure of three multipliers; 1) LSB-first systolic array, 2) LFSR structure, and 3) CA structure. To make performance analysis, each multiplier was modeled in VHDL and was synthesized for FPGA implementation. The simulation results show that LFSR structure is best from the point of view of area complexity, and LSB systolic array is best from the point of view of processing time per clock.

  • PDF

Optimization of a Systolic Array BCH encoder with Tree-Type Structure

  • Lim, Duk-Gyu;Shakya, Sharad;Lee, Je-Hoon
    • International Journal of Contents
    • /
    • v.9 no.1
    • /
    • pp.33-37
    • /
    • 2013
  • BCH code is one of the most widely used error correcting code for the detection and correction of random errors in the modern digital communication systems. The conventional BCH encoder that is operated in bit-serial manner cannot adequate with the recent high speed appliances. Therefore, parallel encoding algorithms are always a necessity. In this paper, we introduced a new systolic array type BCH parallel encoder. To study the area and speed, several parallel factors of the systolic array encoder is compared. Furthermore, to prove the efficiency of the proposed algorithm using tree-type structure, the throughput and the area overhead was compared with its counterparts also. The proposed BCH encoder has a great flexibility in parallelization and the speed was increased by 40% than the original one. The results were implemented on synthesis and simulation on FPGA using VHDL.

Design of a motion estimator with systolic array structure (Systolic array 구조를 갖는 움직임 추정기 설계)

  • 정대호;최석준;김환영
    • Journal of the Korean Institute of Telematics and Electronics C
    • /
    • v.34C no.10
    • /
    • pp.36-42
    • /
    • 1997
  • In the whole world, the research about the VLSI implementation of motion estimation algorithm is progressed to actively full (brute force) search algorithm research with the development of systolic array possible to parallel and pipeline processing. But, because of processing time's limit in a field to handle a huge data quantily such as a high definition television, many problems are happened to full search algorithm. In the paper, as a fast processing to using parallel scheme for the serial input image data, motion estimator of systolic array structure verifying that processing time is improved in contrast to the conventional full search algorithm.

  • PDF

Performance Analysis of Uplink Beamforming using Systolic Array Structure in W-CDMA Systems (W-CDMA용 Systolic 어레이 구조를 갖는 상향링크 빔형성기법 성능 분석)

  • 이호중;서상우;이원철
    • Proceedings of the IEEK Conference
    • /
    • 2002.06a
    • /
    • pp.25-28
    • /
    • 2002
  • 본 논문에서는 W-CDMA(Wide-Code Division Mul-tiple Access)용 Systolic 어레이 구조를 잣는 상향링크 빔형성기법에 대한 성능 분석을 하였다. 적응 어레이 안테나와 Systolic 구조의 MVDR(Minimum Variance Distortionless Response) 알고리즘을 사용하여 구해진 가중치 벡터를 이용하여 원하는 사용자의 방향으로 빔을 형성하고 원하지 않는 사용자의 방향으로는 null을 형성하는 공간필터를 적용하여 W-CDMA 상향링크에서 다중 경로 페이딩과 다중 접속 간섭의 증가에 따른 수신 성능을 분석하였다. 그리고, 안테나 시스템에서 사용되는 가중벡터를 갱신하기 위해 Systolic 구조의 MVDR과 역방향 파일럿 채널을 이용하는 QR-RLS(QR-Recursive Least Squares) 알고리즘을 적용하였다. 본 논문에서는 빔 형성기에 사용하기 위한 역행렬의 계산과 정에 Systolic 어레이 구조를 적용하여 병렬적인 고속처리가 가능한 방법과 효율적인 계산과정을 위해 MVDR 과 QR-RLS 알고리즘을 적용한 공간 필터링의 성능을 소개한다.

  • PDF

Design and Implementation of Hi-speed/Low-power Extended QRD-RLS Equalizer using Systolic Array and CORDIC (시스톨릭 어레이 구조와 CORDIC을 사용한 고속/저전력 Extended QRD-RLS 등화기 설계 및 구현)

  • Moon, Dae-Won;Jang, Young-Beom;Cho, Yong-Hoon
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.47 no.6
    • /
    • pp.1-9
    • /
    • 2010
  • In this paper, we propose a hi-speed/low-power Extended QRD-RLS(QR-Decomposition Recursive Least Squares) equalizer with systolic array structure. In the conventional systolic array structure, vector mode CORDIC on the boundary cell calculates angle of input vector, and the rotation mode CORDIC on the internal cell rotates vector. But, in the proposed structure, it is shown that implementation complexity can be reduced using the rotation direction of vector mode CORDIC and rotation mode CORDIC. Furthermore, calculation time can be reduced by 1/2 since vector mode and rotation mode CORDIC operate at the same time. Through HDL coding and chip implementation, it is shown that implementation area is reduced by 23.8% compared with one of conventional structure.

A systolic Array to Effectively Solve Large Sparce Matrix Linear System of Equations (대형 스파스 메트릭스 선형방정식을 효율적으로 해석하는 씨스톨릭 어레이)

  • 이병홍;채수환;김정선
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.17 no.7
    • /
    • pp.739-748
    • /
    • 1992
  • A CGM iterative systolic algorithm to solve large sparse linear systems of equations is presented. For implementation of the algorithm, a systolic array using the stripe structure is proposed. The matrix A is decomposed into a strictly lower triangular matrix, a diagonal matrix, and a strictly up-per triangular matrix, and the two formers and the tatter· are concurrently computed by different linear arrays. Hence, the execution time of this approach Is reduced to half of the execution time of the that a linear array is used. computation of the Irregularly distributed sparse matrix can be executed effectively by using the stripe structure.

  • PDF

Design of a Bit-Level Super-Systolic Array (비트 수준 슈퍼 시스톨릭 어레이의 설계)

  • Lee Jae-Jin;Song Gi-Yong
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.42 no.12
    • /
    • pp.45-52
    • /
    • 2005
  • A systolic array formed by interconnecting a set of identical data-processing cells in a uniform manner is a combination of an algorithm and a circuit that implements it, and is closely related conceptually to arithmetic pipeline. High-performance computation on a large array of cells has been an important feature of systolic array. To achieve even higher degree of concurrency, it is desirable to make cells of systolic array themselves systolic array as well. The structure of systolic array with its cells consisting of another systolic array is to be called super-systolic array. This paper proposes a scalable bit-level super-systolic amy which can be adopted in the VLSI design including regular interconnection and functional primitives that are typical for a systolic architecture. This architecture is focused on highly regular computational structures that avoids the need for a large number of global interconnection required in general VLSI implementation. A bit-level super-systolic FIR filter is selected as an example of bit-level super-systolic array. The derived bit-level super-systolic FIR filter has been modeled and simulated in RT level using VHDL, then synthesized using Synopsys Design Compiler based on Hynix $0.35{\mu}m$ cell library. Compared conventional word-level systolic array, the newly proposed bit-level super-systolic arrays are efficient when it comes to area and throughput.

Efficient Architecture of an n-bit Radix-4 Modular Multiplier in Systolic Array Structure (시스톨릭 어레이 구조를 갖는 효율적인 n-비트 Radix-4 모듈러 곱셈기 구조)

  • Park, Tae-geun;Cho, Kwang-won
    • The KIPS Transactions:PartA
    • /
    • v.10A no.4
    • /
    • pp.279-284
    • /
    • 2003
  • In this paper, we propose an efficient architecture for radix-4 modular multiplication in systolic array structure based on the Montgomery's algorithm. We propose a radix-4 modular multiplication algorithm to reduce the number of iterations, so that it takes (3/2)n+2 clock cycles to complete an n-bit modular multiplication. Since we can interleave two consecutive modular multiplications for 100% hardware utilization and can start the next multiplication at the earliest possible moment, it takes about only n/2 clock cycles to complete one modular multiplication in the average. The proposed architecture is quite regular and scalable due to the systolic array structure so that it fits in a VLSI implementation. Compared to conventional approaches, the proposed architecture shows shorter period to complete a modular multiplication while requiring relatively less hardware resources.