• Title/Summary/Keyword: Signed-DIgit

Search Result 69, Processing Time 0.028 seconds

Optimization of Pipelined Discrete Wavelet Packet Transform Based on an Efficient Transpose Form and an Advanced Functional Sharing Technique

  • Nguyen, Hung-Ngoc;Kim, Cheol-Hong;Kim, Jong-Myon
    • Journal of Information Processing Systems
    • /
    • v.15 no.2
    • /
    • pp.374-385
    • /
    • 2019
  • This paper presents an optimal implementation of a Daubechies-based pipelined discrete wavelet packet transform (DWPT) processor using finite impulse response (FIR) filter banks. The feed-forward pipelined (FFP) architecture is exploited for implementation of the DWPT on the field-programmable gate array (FPGA). The proposed DWPT is based on an efficient transpose form structure, thereby reducing its computational complexity by half of the system. Moreover, the efficiency of the design is further improved by using a canonical-signed digit-based binary expression (CSDBE) and advanced functional sharing (AFS) methods. In this work, the AFS technique is proposed to optimize the convolution of FIR filter banks for DWPT decomposition, which reduces the hardware resource utilization by not requiring any embedded digital signal processing (DSP) blocks. The proposed AFS and CSDBE-based DWPT system is embedded on the Virtex-7 FPGA board for testing. The proposed design is implemented as an intellectual property (IP) logic core that can easily be integrated into DSP systems for sub-band analysis. The achieved results conclude that the proposed method is very efficient in improving hardware resource utilization while maintaining accuracy of the result of DWPT.

An Efficient Array Algorithm for VLSI Implementation of Vector-radix 2-D Fast Discrete Cosine Transform (Vector-radix 2차원 고속 DCT의 VLSI 구현을 위한 효율적인 어레이 알고리듬)

  • 신경욱;전흥우;강용섬
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.18 no.12
    • /
    • pp.1970-1982
    • /
    • 1993
  • This paper describes an efficient array algorithm for parallel computation of vector-radix two-dimensional (2-D) fast discrete cosine transform (VR-FCT), and its VLSI implementation. By mapping the 2-D VR-FCT onto a 2-D array of processing elements (PEs), the butterfly structure of the VR-FCT can be efficiently importanted with high concurrency and local communication geometry. The proposed array algorithm features architectural modularity, regularity and locality, so that it is very suitable for VLSI realization. Also, no transposition memory is required, which is invitable in the conventional row-column decomposition approach. It has the time complexity of O(N+Nnzp-log2N) for (N*N) 2-D DCT, where Nnzd is the number of non-zero digits in canonic-signed digit(CSD) code, By adopting the CSD arithmetic in circuit desine, the number of addition is reduced by about 30%, as compared to the 2`s complement arithmetic. The computational accuracy analysis for finite wordlength processing is presented. From simulation result, it is estimated that (8*8) 2-D DCT (with Nnzp=4) can be computed in about 0.88 sec at 50 MHz clock frequency, resulting in the throughput rate of about 72 Mega pixels per second.

  • PDF

Design of Low Error Fixed-Width Group CSD Multiplier (저오차 고정길이 그룹 CSD 곱셈기 설계)

  • Kim, Yong-Eun;Cho, Kyung-Ju;Chung, Jin-Gyun
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.9
    • /
    • pp.33-38
    • /
    • 2009
  • The group CSD (GCSD) multiplier was recently proposed based on the variation of canonic signed digit (CSD) encoding and partial product sharing. This multiplier provides an efficient design when the multiplications are performed only with a few predetermined coefficients (e.g., FFT). In many DSP applications such as FFT, the (2W-1)-bit product obtained from W-bit multiplicand and W-bit multiplier is quantized to W-bits by eliminating the (W-1) least-significant bits. This paper presents an error compensation method for a fixed-width GCSD multiplier that receives a W-bit input and produces a W-bit product. To efficiently compensate for the quantization error, the encoded signals from the GCSD multiplier are used for the generation of error compensation bias. By Synopsys simulations, it is shown that the proposed method leads to up to 84% reduction in power consumption and up to 79% reduction in area compared with the fixed-width modified Booth multiplier.

Montgomery Multiplier Base on Modified RBA and Hardware Architecture (변형된 RBA를 이용한 몽고메리 곱셈기와 하드웨어 구조)

  • Ji Sung-Yeon;Lim Dae-Sung;Jang Nam-Su;Kim Chang-Han;Lee Sang-Jin
    • Proceedings of the Korea Institutes of Information Security and Cryptology Conference
    • /
    • 2006.06a
    • /
    • pp.351-355
    • /
    • 2006
  • RSA 암호 시스템은 IC카드, 모바일 및 WPKI, 전자화폐, SET, SSL 시스템 등에 많이 사용된다. RSA는 모듈러 지수승 연산을 통하여 수행되며, Montgomery 곱셈기를 사용하는 것이 효율적이라고 알려져 있다. Montgomery 곱셈기에서 임계 경로 지연 시간(Critical Path Delay)은 세 피연산자의 덧셈에 의존하고 캐리 전파를 효율적으로 처리하는 문제는 Montgomery 곱셈기의 효율성에 큰 영향을 미친다. 최근 캐리 전파를 제거하는 방법으로 캐리 저장 덧셈기(Carry Save Adder, CSA)를 사용하는 연구가 계속 되고 있다. McIvor외 세 명은 지수승 연산에 최적인 CSA 3단계로 구성된 Montgomery 곱셈기와 CSA 2단계로 구성된 Montgomery 곱셈기를 제안했다. 시간 복잡도 측면에서 후자는 전자에 비해 효율적이다. 본 논문에서는 후자보다 빠른 연산을 수행하기 위해 캐리 전파 제거 특성을 가진 이진 부호 자리(Signed-Digit, SD) 수 체계를 사용한다. 두 이진 SD 수의 덧셈을 수행하는 잉여 이진 덧셈기(Redundant Binary Adder, RBA)를 새로 제안하고 Montgomery 곱셈기에 적용한다. 기존의 RBA에서 사용하는 이진 SD 덧셈 규칙 대신 새로운 덧셈 규칙을 제안하고 삼성 STD130 $0.18{\mu}m$ 1.8V 표준 셀 라이브러리에서 지원하는 게이트들을 사용하여 설계하고 시뮬레이션 하였다. 그 결과 McIvor의 2 방법과 기존의 RBA보다 최소 12.46%의 속도 향상을 보였다.

  • PDF

Design of Redundant Binary Adder based on Memristor-CMOS (멤리스터-CMOS 기반의 잉여 이진 가산기 설계)

  • Ahn, Yeongyu;Lee, Sang-Jin;Kim, Seokman;Eshraghian, Kamran;Cho, Kyoungrok
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.51 no.9
    • /
    • pp.67-74
    • /
    • 2014
  • This paper presents a memristor-CMOS based RBSD adder. Conventional RBSD adders suffer bigger hardware due to the extra logic handling larger number of bits. The purpose of this paper is to improve the silicon surface area and the computation delay of conventional RBSD adders. The proposed method employs memristor-CMOS based circuit. The implementation results shows that the proposed memristor-CMOS based RBSD adder saves the cell area by 45%, and reduces time delay 24% compared to conventional RBSD adders. The proposed RBSD adder design can bring further area saving for large scale designs.

Fast Fourier Transform Processor based on Low-power and Area-efficient Algorithm (저 전력 및 면적 효율적인 알고리즘 기반 고속 퓨리어 변환 프로세서)

  • Oh Jung-yeol;Lim Myoung-seob
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.42 no.2 s.302
    • /
    • pp.143-150
    • /
    • 2005
  • This paper proposes a new $radix-2^4$ FFT algorithm and an efficient pipeline architecture based on this new algorithm for OFDM systems. The pipeline architecture based on the new algorithm has the same number of multipliers as that of the $radix-2^2$ algorithm. However, the multiplier complexity could be reduced by more than $30\%$ by replacing one half of the programmable complex multipliers by the newly proposed CSD constant complex multipliers. From synthesis simulations of a standard 0.35um CMOS Samsung process, a proposed CSD constant complex multiplier achieved more than $60\%$ area efficiency when compared with the conventional programmable complex multiplier. This promoted efficiency can be used for the design of a long length FFT processor in wireless OFDM applications which needs more power and area efficiency.

Study on the Amplitude Modification Audio Watermarking Technique for Mixed Music with High Inaudibility (높은 비가청성을 갖는 믹스 음악의 크기 변조 오디오 워터마킹 기술에 관한 연구)

  • Kang, Se-Koo;Lee, Young-Seok
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.9 no.1
    • /
    • pp.67-74
    • /
    • 2016
  • In this paper, we propose a watermarking technology for a mixed music. The mixed music means recreated music that contained a number of musics in one audio clip. Royalty associated with the audio content is typically imposed by the full audio content. However, the calculation of royalties gives rise to conflict between copyright holders and users in the mixed music because it uses not full audio content but a fraction of that. To solve the conflict related with the mixed music, we propose a audio watermarking technique that inserts different watermarks for each audio in the audio that make up the mixed music. The proposed watermarking scheme might have poor SNR (signal to noise ratio) to embed to each audio clip. To overcome poor SNR problem, we used inaudible pseudo random sequence which modifies typical pseudo random sequence to canonical signed digit (CSD) form. The proposed method verifies the performance by each watermark extraction and the time internal estimation valies from the mixed music.

Bit-serial Discrete Wavelet Transform Filter Design (비트 시리얼 이산 웨이블렛 변환 필터 설계)

  • Park Tae geun;Kim Ju young;Noh Jun rye
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.30 no.4A
    • /
    • pp.336-344
    • /
    • 2005
  • Discrete Wavelet Transform(DWT) is the oncoming generation of compression technique that has been selected for MPEG4 and JEPG2000, because it has no blocking effects and efficiently determines frequency property of temporary time. In this paper, we propose an efficient bit-serial architecture for the low-power and low-complexity DWT filter, employing two-channel QMF(Qudracture Mirror Filter) PR(Perfect Reconstruction) lattice filter. The filter consists of four lattices(filter length=8) and we determine the quantization bit for the coefficients by the fixed-length PSNR(peak-signal-to-noise ratio) analysis and propose the architecture of the bit-serial multiplier with the fixed coefficient. The CSD encoding for the coefficients is adopted to minimize the number of non-zero bits, thus reduces the hardware complexity. The proposed folded 1D DWT architecture processes the other resolution levels during idle periods by decimations and its efficient scheduling is proposed. The proposed architecture requires only flip-flops and full-adders. The proposed architecture has been designed and verified by VerilogHDL and synthesized by Synopsys Design Compiler with a Hynix 0.35$\mu$m STD cell library. The maximum operating frequency is 200MHz and the throughput is 175Mbps with 16 clock latencies.

SPA-Resistant Unsigned Left-to-Right Receding Method (SPA에 안전한 Unsigned Left-to-Right 리코딩 방법)

  • Kim, Sung-Kyoung;Kim, Ho-Won;Chung, Kyo-Il;Lim, Jong-In;Han, Dong-Guk
    • Journal of the Korea Institute of Information Security & Cryptology
    • /
    • v.17 no.1
    • /
    • pp.21-32
    • /
    • 2007
  • Vuillaume-Okeya presented unsigned receding methods for protecting modular exponentiations against side channel attacks, which are suitable for tamper-resistant implementations of RSA or DSA which does not benefit from cheap inversions. The proposed method was using a signed representation with digits set ${1,2,{\cdots},2^{\omega}-1}$, where 0 is absent. This receding method was designed to be computed only from the right-to-left, i.e., it is necessary to finish the receding and to store the receded string before starting the left-to-right evaluation stage. This paper describes new receding methods for producing SPA-resistant unsigned representations which are scanned from left to right contrary to the previous ones. Our contributions are as follows; (1) SPA-resistant unsigned left-to-right receding with general width-${\omega}$, (2) special case when ${\omega}=1$, i.e., unsigned binary representation using the digit set {1,2}, (3) SPA-resistant unsigned left-to-right Comb receding, (4) extension to unsigned radix-${\gamma}$ left-to-right receding secure against SPA. Hence, these left-to-right methods are suitable for implementing on memory limited devices such as smartcards and sensor nodes