• Title/Summary/Keyword: multiplier transform

Search Result 49, Processing Time 0.035 seconds

Modified CSD Group Multiplier Design for Predetermined Coefficient Groups (그룹 곱셈 계수를 위한 Modified CSD 그룹 곱셈기 디자인)

  • Kim, Yong-Eun;Xu, Yi-Nan;Chung, Jin-Gyun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.44 no.9
    • /
    • pp.48-53
    • /
    • 2007
  • Some digital signal processing applications, such as FFT, request multiplications with a group(or, groups) of a few predetermined coefficients. In this paper, based on the modified CSD algorithm, an efficient multiplier design method for predetermined coefficient groups is proposed. In the multiplier design for sine-cosine generator used in direct digital frequency synthesizer(DDFS), and in the multiplier design used in 128 point $radix-2^4$ FFT, it is shown that the area, power and delay time can be reduced up to 34%.

A Efficient Architecture of MBA-based Parallel MAC for High-Speed Digital Signal Processing (고속 디지털 신호처리를 위한 MBA기반 병렬 MAC의 효율적인 구조)

  • 서영호;김동욱
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.41 no.7
    • /
    • pp.53-61
    • /
    • 2004
  • In this paper, we proposed a new architecture of MAC(Multiplier-Accumulator) to operate high-speed multiplication-accumulation. We used the MBA(Modified radix-4 Booth Algorithm) which is based on the 1's complement number system, and CSA(Carry Save Adder) for addition of the partial products. During the addition of the partial product, the signed numbers with the 1's complement type after Booth encoding are converted in the 2's complement signed number in the CSA tree. Since 2-bit CLA(Carry Look-ahead Adder) was used in adding the lower bits of the partial product, the input bit width of the final adder and whole delay of the critical path were reduced. The proposed MAC was applied into the DWT(Discrete Wavelet Transform) filtering operation for JPEG2000, and it showed the possibility for the practical application. Finally we identified the improved performance according to the comparison with the previous architecture in the aspect of hardware resource and delay.

2-D Large Inverse Transform (16×16, 32×32) for HEVC (High Efficiency Video Coding)

  • Park, Jong-Sik;Nam, Woo-Jin;Han, Seung-Mok;Lee, Seong-Soo
    • JSTS:Journal of Semiconductor Technology and Science
    • /
    • v.12 no.2
    • /
    • pp.203-211
    • /
    • 2012
  • This paper proposes a $16{\times}16$ and $32{\times}32$ inverse transform architecture for HEVC (High Efficiency Video Coding). HEVC large transform of $16{\times}16$ and $32{\times}32$ suffers from huge computational complexity. To resolve this problem, we proposed a new large inverse transform architecture based on hardware reuse. The processing element is optimized by exploiting fully recursive and regular butterfly structure. To achieve low area, the processing element is implemented by shifters and adders without multiplier. Implementation of the proposed 2-D inverse transform architecture in 0.18 ${\mu}m$ technology shows about 300 MHz frequency and 287 Kgates area, which can process 4K ($3840{\times}2160$)@ 30 fps image.

Low Space Complexity Bit-Parallel Shifted Polynomial Basis Multipliers using Irreducible Trinomials (삼항 기약다항식 기반의 저면적 Shifted Polynomial Basis 비트-병렬 곱셈기)

  • Chang, Nam-Su;Kim, Chang-Han
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.20 no.5
    • /
    • pp.11-22
    • /
    • 2010
  • Recently, Fan and Dai introduced a Shifted Polynomial Basis and construct a non-pipeline bit-parallel multiplier for $F_{2^n}$. As the name implies, the SPB is obtained by multiplying the polynomial basis 1, ${\alpha}$, ${\cdots}$, ${\alpha}^{n-1}$ by ${\alpha}^{-\upsilon}$. Therefore, it is easy to transform the elements PB and SPB representations. After, based on the Modified Shifted Polynomial Basis(MSPB), SPB bit-parallel Mastrovito type I and type II multipliers for all irreducible trinomials are presented. In this paper, we present a bit-parallel architecture to multiply in SPB. This multiplier have a space complexity efficient than all previously presented architecture when n ${\neq}$ 2k. The proposed multiplier has more efficient space complexity than the best-result when 1 ${\leq}$ k ${\leq}$ (n+1)/3. Also, when (n+2)/3 ${\leq}$ k < n/2 the proposed multiplier has more efficient space complexity than the best-result except for some cases.

The FPGA Implementation of Wavelet Transform Chip using Daubechies′4 Tap Filter for DSP Application

  • Jeong, Chang-Soo;Kim, Nam-Young
    • Proceedings of the IEEK Conference
    • /
    • 1999.11a
    • /
    • pp.376-379
    • /
    • 1999
  • The wavelet transform chip is implemented with Daubechies' 4 tap filter. It works at 20MHz in Field Programmable Gate array (FPGA) implementation of Quadrature Mirror Filter(QMF) Lattice Structure. In this paper, the structure contains taro-channel quadrature mirror filter, data format converter(DFC), delay control unit(DCU), and three 20$\times$8 bits real multiplier. The structures for the DFC and DCU need to he regular and scalable, require minimum number of regular, and thereby lead to an efficient and scalable architecture for the Discrete Wavelet Transform(DWT). These results present the possibility that it can be used in Digital Signal Processing(DSP) application faster than Fourier transform at small area with lour cost.

  • PDF

A Simple Discrete Cosine Transform Systolic Array Based on DFT for Video Codec (DFT에 의한 비데오 코덱용 DCT의 단순한 시스톨릭 어레이)

  • 박종오;이광재;양근호;박주용;이문호
    • Journal of the Korean Institute of Telematics and Electronics
    • /
    • v.26 no.11
    • /
    • pp.1880-1885
    • /
    • 1989
  • In this paper, a new approach for systolic array realizing the discrete cosine transform (DCT) based on discrete Fourier transform (DFT) of an input sequence is presented. The proposed array is based on a simple modified DFT(MDFT) version of the Goertzel algorithm combined with Kung's approach and is proved perfectly. This array requires N cells, one multiplier and takes N clock cycles to produce a complete N-point DCT and also is able to process a continuous stream of data sequences. We have analyzed the output signal-to-noise ratio(SNR) and designed the circuit level layout of one-PE chip. The array coefficients are static adn thus stored-product ROM's can be used in place of multipliers to limit cost as eliminate errors due to coefficients quantization.

  • PDF

A Study on the Design of DCT Module using Distributed Arithmetic Method

  • Yang Dong Hyun;Ku Dae Sung;Kim Phil Jung;Yon Jung Hyun;Kim Sang Duk;Hwang Jung Yeun;Jeong Rae Sung;Kim Jong Bin
    • Proceedings of the IEEK Conference
    • /
    • 2004.08c
    • /
    • pp.636-639
    • /
    • 2004
  • In present, there are many methods such as DCT, Wavelet Transform, or Quantization -to the image compression field, but the basic image compression method have based on DCT. The representative thing of the efficient techniques for information compression is DCT method. It is more superior than other information conversion method. It is widely applied in digital signal processing field and MPEG and JPEG which are selected as basis algorithm for an image compression by the international standardization group. It is general that DCT is consisted of using multiplier with main arithmetic blocks having many arithmetic amounts. But, the use of multiplier requires many areas when hardware is embodied, and there is fault that the processing speed is low. In this paper, we designed the hardware module that could run high-speed operation using row-column separation calculation method and Chen algorithm by distributed arithmetic method using ROM table instead of multiplier for design DCT module of high speed.

  • PDF

Two-Channel Multiwavelet Transform and Pre/Post-Filtering for Image Compression (영상 데이터 압축을 위한 2-채널 멀티웨이브렛 변환과 전후처리 필터의 적용)

  • Heo, Ung;Choi, Jae-Ho
    • Journal of the Korea Computer Industry Society
    • /
    • v.5 no.5
    • /
    • pp.737-746
    • /
    • 2004
  • Two-channel multiwavelet system is investigated for image compression application in this paper. Generally, multiwavelets are known for their superb capability of compressing non-stationary signals like voice. However, multivavelet system have a critical problem in processing and compressing image data due to mesh-grid visual artifacts. In our two-channel multiwavelet system we have investigated incorporation of pre and post filtering to the multiwavelet transform and compression system for alleviating those ingerent visual artifacts due to multiwavelet effect. In addition, to quantify the image data compression performance of proposed multiwavelet system, computer simulations have been performed using various image data. For bit allocation and quantization, the Lagrange multiplier technique considering data rate vs. distortion rate along with a nonlinear companding method are applied equallly to all systems considered, here. The simulation results have yielded 1 ~ 2 dB compression enhancement over the scalar savelet systems. If the more advanced compression methods like SPIHT and run-length channel coding were adopted for the proposed multiwavelet system, a much higher compression gain could be obtained.

  • PDF