• Title/Summary/Keyword: Distributed Arithmetic

Search Result 72, Processing Time 0.025 seconds

High-Performance VLSI Architecture Using Distributed Arithmetic for Higher-Order FIR Filters with Complex Coefficients

  • Tsunekawa, Yoshitaka;Nozaki, Takeshi;Tayama, Norio
    • Proceedings of the IEEK Conference
    • /
    • 2002.07b
    • /
    • pp.856-859
    • /
    • 2002
  • This paper proposes a high-performance VLSl architecture using distributed arithmetic for higher-order FIR filters with complex coefficients. For the purpose of realizing high sampling rate with small latency in high-order filters, we apply distributed arithmetic[1]. Moreover, in order to decrease drastically the power dissipation, the structure applying not ROM's but optimum function circuits which we have previously proposed, is utilized[2][3]. However, this structure increases in the number of adders as compared to the conventional structure applying ROM's. In order to realize a more effective method for further higher-order filter, we propose newly an implementation applying two methods which have large effects on the unit using the adders. First , we propose an implementation applying SFAs(Serial Full Adders) and SFSs(Serial Full Subtractors). Second, we propose a structure applying proposed 4-2 adders. Finally, it is shown that the proposed architecture is an effective way to realize low power dissipation and small latency while the sampling rate is kept constant for further higher-order filters with complex coefficients.

  • PDF

Low-power Horizontal DA Filter Structure Using Radix-16 Modified Booth Algorithm (Radix-16 Modified Booth 알고리즘을 이용한 저전력 Horizontal DA 필터 구조)

  • Shin, Ji-Hye;Jang, Young-Beom
    • Journal of the Institute of Electronics Engineers of Korea TC
    • /
    • v.47 no.12
    • /
    • pp.31-38
    • /
    • 2010
  • In tins paper, a new DA(Distributed Arithmetic) tilter implementation technique has been proposed. Contrary to vertical directional calculation of input sample bit format in the conventional DA implementation technique, proposed implementation technique utilizes horizontal directional calculation of input sample bit format. Since proposed technique calculates in horizontal direction, it does not need ROM and utilizes the Modified Booth algorithm. Furthermore proposed technique can be applied to implement the variable coefficients filters in addition to the fixed coefficients filters. Using conventional and proposed techniques, a 20 tap filter is implemented by Verilog-HDL coding. Through Synopsis synthesis tool, it has been shown that 41.6% area reduction can be achieved.

Hardwired Distributed Arithmetic for Multiple Constant Multiplications and Its Applications for Transformation (다중 상수 곱셈을 위한 하드 와이어드 분산 연산)

  • Kim, Dae-Won;Choi, Jun-Rim
    • Proceedings of the IEEK Conference
    • /
    • 2005.11a
    • /
    • pp.949-952
    • /
    • 2005
  • We propose the hardwired distributed arithmetic which is applied to multiple constant multiplications and the fixed data path in the inner product of fixed coefficient as a result of variable radix-2 multi-bit coding. Variable radix-2 multi-bit coding is to reduce the partial product in constant multiplication and minimize the number of addition and shifts. At results, this procedure reduces the number of partial products that the required multiplication timing is shortened, whereas the area reduced relative to the DA architecture. Also, this architecture shows the best performance for DCT/IDCT and DWT architecture in the point of area reduction up to 20% from reducing the partial products up to 40% maximally.

  • PDF

VLSI Architecture of Fast Jacket Transform (Fast Jacket Transform의 VLSI 아키텍쳐)

  • 유경주;홍선영;이문호;정진균
    • Proceedings of the IEEK Conference
    • /
    • 2001.09a
    • /
    • pp.769-772
    • /
    • 2001
  • Waish-Hadamard Transform은 압축, 필터링, 코드 디자인 등 다양한 이미지처리 분야에 응용되어왔다. 이러한 Hadamard Transform을 기본으로 확장한 Jacket Transform은 행렬의 원소에 가중치를 부여함으로써 Weighted Hadamard Matrix라고 한다. Jacket Matrix의 cocyclic한 특성은 암호화, 정보이론, TCM 등 더욱 다양한 응용분야를 가질 수 있고, Space Time Code에서 대역효율, 전력면에서도 효율적인 특성을 나타낸다 [6],[7]. 본 논문에서는 Distributed Arithmetic(DA) 구조를 이용하여 Fast Jacket Transform(FJT)을 구현한다. Distributed Arithmetic은 ROM과 어큐뮬레이터를 이용하고, Jacket Watrix의 행렬을 분할하고 간략화하여 구현함으로써 하드웨어의 복잡도를 줄이고 기존의 시스톨릭한 구조보다 면적의 이득을 얻을 수 있다. 이 방법은 수학적으로 간단할 뿐 만 아니라 행렬의 곱의 형태를 단지 덧셈과 뺄셈의 형태로 나타냄으로써 하드웨어로 쉽게 구현할 수 있다. 이 구조는 입력데이타의 워드길이가 n일 때, O(2n)의 계산 복잡도를 가지므로 기존의 시스톨릭한 구조와 비교하여 더 적은 면적을 필요로 하고 FPGA로의 구현에도 적절하다.

  • PDF

DCT/IDCT Processor Design using Adder-based Distributed Arithmetic (가산기-기반 분산 연산을 이용한 DCT/IDCT 프로세서 설계)

  • 임국찬;장영진;이현수
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.04a
    • /
    • pp.30-32
    • /
    • 2000
  • 내적을 계산하는데 있어서 Distributed Arithmetic(DA)을 사용하면 곱셈기를 사용하는 것보다 소비전력 및 크기를 효율적으로 줄일 수 있고, 고속동작이 가능한 회로구현이 쉽기 때문에 신호처리 시스템 설계에 많이 사용하고 있다. DA에는 롬-기반 DA와 가산기-기반 DA를 이용한 방법이 있는데, 가산기-기반 DA는 Sharing property와 계수의 Spare non-zero bit property를 최대한 이용하여 설계가 가능하기 때문에 크기 및 동작속도 측면에서 효율적인 구현이 가능하다. 본 논문에서는 가산기-기반 DA의 이러한 특성을 최대한 이용하여 멀티미디어 신호처리에 적합한 DCT/IDCT 프로세서를 설계하였고 다른 구조 및 롬-기반 DA와 비교 평가해본 결과 크기 및 속도 측면에서 효율적인 결과를 얻었다.

  • PDF

A Technical Trend on Automatic Vacuum Capacitor Switch with Modified Digital Filter Design (디지털 필터 설계를 이용한 자동 진공 콘덴서 스위치의 기술 동향)

  • Oh, Gi-Soo;Chang, Young-Ho;Yun, Ju-Ho;Hwang, Jong-Sun;Choi, Yong-Sung;Lee, Kyung-Sup
    • Proceedings of the KIEE Conference
    • /
    • 2007.07a
    • /
    • pp.1978-1979
    • /
    • 2007
  • In this paper, the authors introduce a high-speed microprocessor based on automatic vacuum capacitor switch with a modified digital filter design using distributed arithmetic. The automation trends particularly the automatic vacuum capacitor switch has helped ameliorate the power factor essentials and automatically triggered to close when the line current exceeds rated value. Microprocessor relays use digital filters to extract only the fundamental and attenuate harmonics. To provide optimum speed characteristics a distributed arithmetic based filter design in the microprocessor controller which not only enhances filtering speed but additionally enables lower power consumption at the cost of area has been introduced. The result is a unified description that describes a digital filter structure down to bit level.

  • PDF

Efficient SAD Processor for Motion Estimation of H.264 (H.264 움직임 추정을 위한 효율적인 SAD 프로세서)

  • Jang, Young-Beom;Oh, Se-Man;Kim, Bee-Chul;Yoo, Hyeon-Joong
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.2 s.314
    • /
    • pp.74-81
    • /
    • 2007
  • In this paper, an efficient SAD(Sum of Absolute Differences) processor structure for motion estimation of H.264 is proposed. SAD processors are commonly used both in full search methods for motion estimation and in fast search methods for motion estimation. Proposed structure consists of SAD calculator block, combinator block, and minimum value calculator block. Especially, proposed structure is simplified by using Distributed Arithmetic for addition operation. The Verilog-HDL(Hard Description Language) coding and FPGA(Field Programmable Gate Array) implementation results for the proposed structure show 39% and 32% gate count reduction in comparison with those of the conventional structure, respectively. Due to its efficient processing scheme, the proposed SAD processor structure can be widely used in size dominant H.264 chip.

The Optimal Extraction Method of Adder Sharing Component for Inner Product and its Application to DCT Design (내적연산을 위한 가산기 공유항의 최적 추출기법 제안 및 이를 이용한 DCT 설계)

  • Im, Guk-Chan;Jang, Yeong-Jin;Lee, Hyeon-Su
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.38 no.7
    • /
    • pp.503-512
    • /
    • 2001
  • The general DSP algorithm, like orthogonal transform or filter processing, needs efficient hardware architecture to compute inner product. The typical MAC architecture has high cost of silicon. Because of this reason, the distributed arithmetic without multiplier is widely used for implementing inner product. This paper presents the optimization to reduce required hardware in distributed arithmetic by using extraction method of adder sharing component. The optimization process uses Boltzmann-machine which is one of the neural network. This proposed method can solve problem that is increasing complexity depending on depth of inner product and compose optimal summation-network with the minimum FA and FF in a few time. The designed DCT by using Proposed method is more efficient than a ROM-based distributed arithmetic.

  • PDF

High-speed Radix-8 FFT Structure for OFDM (OFDM용 고속 Radix-8 FFT 구조)

  • Jang, Young-Beom;Hur, Eun-Sung;Park, Jin-Su;Hong, Dae-Ki
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.44 no.5
    • /
    • pp.84-93
    • /
    • 2007
  • In this paper, a Radix-8 structure for high-speed FFT is propose. Main block of the proposed FFT structure is Radix-8 DIF(Decimation In Frequency) butterfly. Even throughput of the Radix-8 FFT is twice than that of the Radix-4 FFT, implementation area of the Radix-8 is larger than that of Radix-4 FFT. But, implementation area of the proposed Radix-8 FFT was reduced by using DA(Distributed Arithmetic) for multiplication. For comparison, the 64-point FFT was implemented using conventional Radix-4 butterfly and proposed Radix-8 butterfly, respectively. The Verilog-HDL coding results for the proposed FFT structure show 49.2% cell area increment comparison with those of the conventional Radix-4 FFT structure. Namely, to speed up twice, 49.2% of area cost is required. In case of same throughput, power consumption of the proposed structure is reduced by 25.4%. Due to its efficient processing scheme, the proposed FFT structure can be used in large size of FFT like OFDM Modem.

Design of a Synchronous Control Unit for a Datapath with Variable Delay Arithmetic Units (가변지연시간 연산기를 가진 데이터 경로에 대한 동기식 제어기의 설계)

  • 김의석;이정근;이동익
    • Proceedings of the IEEK Conference
    • /
    • 2002.06b
    • /
    • pp.321-324
    • /
    • 2002
  • Nowadays variable delay arithmetic units have been used for implementing a datapath of\ulcorner target system in pursuit of performance improvement. However. adoption of variable delay arithmetic units requires modification of a typical synchronous control units design methodology. There is a representative approach, which is called a monolithic approach. Although its results are good, its proposed methodology may cause critical problems in the aspects of area and performance with the size increase of initial system specifications. In order to solve this problems, a distributed approach is suggested. Experimental results show that the Proposed method can guarantee original performance of an initial system specification with minimized additional area increase.

  • PDF