• Title/Summary/Keyword: low-complexity hardware architecture


An Architecture for IEEE 802.11n LDPC Decoder Supporting Multi Block Lengths (다중 블록길이를 지원하는 IEEE 802.11n LDPC 복호기 구조)

  • Na, Young-Heon;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2010.05a / pp.798-801 / 2010
  • This paper describes an efficient architecture for an LDPC (low-density parity-check) decoder that supports the three block lengths (648, 1,296, and 1,944) of the IEEE 802.11n standard. To minimize hardware complexity, the min-sum algorithm and a block-serial layered structure are adopted in the DFU (decoding function unit), the main functional block of the LDPC decoder. The H-ROM structure, optimized for multiple block lengths, reduces the ROM size by 42% compared with the conventional method. In addition, a pipelined memory read/write scheme for inter-layer DFU operations is proposed to optimize decoder operation.
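
The min-sum rule named in this abstract can be illustrated with a minimal sketch of one check-node update. This is the plain textbook min-sum rule, not the paper's DFU design; the function name and the floating-point message format are assumptions made for illustration.

```python
def min_sum_check_node(llrs):
    """Return check-to-variable messages for one parity check (plain min-sum).

    llrs: incoming variable-to-check log-likelihood ratios (at least two).
    Each output combines the signs and the minimum magnitude of all
    *other* inputs. No scaling or offset correction is applied.
    """
    mags = [abs(x) for x in llrs]
    # Only the two smallest magnitudes are needed to serve every output.
    min1_idx = min(range(len(mags)), key=mags.__getitem__)
    min1 = mags[min1_idx]
    min2 = min(m for i, m in enumerate(mags) if i != min1_idx)

    # Sign parity of all inputs: -1 if an odd number are negative.
    total_sign = 1
    for x in llrs:
        if x < 0:
            total_sign = -total_sign

    out = []
    for i, x in enumerate(llrs):
        sign_i = -1 if x < 0 else 1
        # Multiplying by sign_i removes input i from the sign product.
        mag = min2 if i == min1_idx else min1
        out.append(total_sign * sign_i * mag)
    return out
```

Keeping only the two smallest magnitudes and one sign parity per check is what makes min-sum attractive for low-complexity hardware: no look-up tables or nonlinear functions are needed.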


An HEVC intra encoder sharing DCT with RDO for a low complex hardware (하드웨어 복잡도를 줄이기 위한 RDO내 DCT 공유구조의 HEVC 화면내 예측부호화기)

  • Lee, Sukho;Jang, Juneyoung;Byun, Kyungjun;Eum, Nakwoong
    • Smart Media Journal / v.3 no.4 / pp.16-21 / 2014
  • HEVC is the latest joint video coding standard from ITU-T SG16 WP and ISO/IEC JTC1/SC29/WG11. Its coding efficiency is about twice that of H.264 High Profile. Intra prediction has 35 modes, including DC and planar. However, an accurate mode decision over so many modes using SSE is too costly to implement in hardware. The key idea of this paper is a DCT-sharing architecture that reduces the complexity of the HEVC intra encoder: the same DCT block is used both for quantization and for calculating SSE in RDO. The proposed intra encoder uses a two-step mode decision with simplified RDO blocks to lighten complexity, and shares the transform resources. Its BD-rate increase is negligible at 20% on hardware aspect, and the operating clock frequency is 300 MHz at 60 fps for FHD ($1920{\times}1080$) images.

Low Space Complexity Bit Parallel Multiplier For Irreducible Trinomial over GF($2^n$) (삼항 기약다항식을 이용한 GF($2^n$)의 효율적인 저면적 비트-병렬 곱셈기)

  • Cho, Young-In;Chang, Nam-Su;Kim, Chang-Han;Hong, Seok-Hie
    • Journal of the Institute of Electronics Engineers of Korea SD / v.45 no.12 / pp.29-40 / 2008
  • The efficient hardware design of finite field multiplication is a very important research topic for the efficient implementation of cryptosystems based on arithmetic in the finite field GF($2^n$). We use a special generating trinomial $f(x)=x^n+x^k+1$ to construct a bit-parallel multiplier over the finite field with low space complexity. To reduce processing time, the hardware architecture of the proposed multiplier is similar to that of the existing Mastrovito multiplier. The complexity of the proposed multiplier depends on the degree of the intermediate term $x^k$, and the space complexity of the new multiplier is $2k^2-2k+1$ lower than that of the existing multiplier. The time complexity of the proposed multiplier is equal to that of the existing multiplier or increased by $1T_X$ (10% to 12.5%), but the space complexity is reduced by up to 25%.
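
As a reference point for what such a multiplier computes, here is a minimal software sketch of multiplication in GF($2^n$) modulo a trinomial $x^n+x^k+1$. It is a bit-serial model for checking results, not the bit-parallel circuit proposed in the paper; the function name and the integer encoding (bit $i$ is the coefficient of $x^i$) are assumptions.

```python
def gf2n_mul_trinomial(a, b, n, k):
    """Multiply a*b in GF(2^n) modulo f(x) = x^n + x^k + 1.

    a and b are ints below 2**n; bit i holds the coefficient of x^i.
    The caller is assumed to pass an irreducible trinomial (0 < k < n).
    """
    # Step 1: carry-less polynomial product (degree up to 2n-2).
    prod = 0
    for i in range(n):
        if (b >> i) & 1:
            prod ^= a << i

    # Step 2: reduce from the top, using x^i = x^(i-n+k) + x^(i-n).
    for i in range(2 * n - 2, n - 1, -1):
        if (prod >> i) & 1:
            prod ^= (1 << i) | (1 << (i - n + k)) | (1 << (i - n))
    return prod
```

For example, with $f(x)=x^3+x+1$ the call `gf2n_mul_trinomial(2, 4, 3, 1)` multiplies $x$ by $x^2$ and reduces $x^3$ to $x+1$. Each high-order bit folds into only two lower positions, which is why trinomials keep the reduction network small.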

Packet Detection and Frequency Offset Estimation/Correction Architecture Design and Analysis for OFDM-based WPAN Systems (OFDM-기반 WPAN 시스템을 위한 패킷 검출 및 반송파 주파수 옵셋 추정/보정 구조 설계 및 분석)

  • Back, Seung-Ho;Lee, Han-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD / v.49 no.7 / pp.30-38 / 2012
  • This paper presents a packet detection and frequency offset estimation architecture, with performance analysis, for OFDM-based wireless personal area network (WPAN) systems. The packet detection structure finds the exact start point of a packet by detecting when the correlation value passes a constant threshold. The autocorrelation structure it uses is simpler to implement than conventional packet detection algorithms. Unlike the conventional structure, the proposed frequency offset estimation architecture uses a phase rotation process structure, internal bit reduction to reduce hardware size, and a frequency offset adjustment block to reduce the look-up table size. Compensating the received signal with the estimated frequency offset in the correction block reduces the impact of the frequency offset. The performance results show that the proposed structure has lower hardware complexity and gate count than the conventional structure. The proposed structure can therefore be applied to the initial synchronization process of OFDM-based WPAN systems and to high-speed, low-power WPAN chips.
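
The general idea of autocorrelation-based detection with a constant threshold, followed by a phase-based frequency offset estimate, can be sketched as below. This is the generic delay-and-correlate technique for a periodic preamble, not the architecture from the paper; the function names, the preamble period, and the threshold value are all assumptions.

```python
import cmath
import math

def delay_correlate(rx, start, period, window):
    """Sum of rx[n + period] * conj(rx[n]) over `window` samples."""
    return sum(rx[n + period] * rx[n].conjugate()
               for n in range(start, start + window))

def detect_packet(rx, period, window, threshold):
    """Return the first index whose correlation magnitude exceeds the
    constant threshold, or None if no packet is found."""
    for start in range(len(rx) - period - window + 1):
        if abs(delay_correlate(rx, start, period, window)) > threshold:
            return start
    return None

def estimate_cfo(rx, start, period, window):
    """Estimate the frequency offset in cycles per sample.

    A constant offset rotates samples `period` apart by a fixed phase,
    so the correlation phase divided by 2*pi*period gives the offset.
    Unambiguous only while |offset| < 1 / (2 * period).
    """
    corr = delay_correlate(rx, start, period, window)
    return cmath.phase(corr) / (2 * math.pi * period)
```

Correction then amounts to multiplying sample `n` by `cmath.exp(-2j * math.pi * cfo * n)`. In hardware this rotation is typically what drives look-up table size.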

A Design of Parameterized Viterbi Decoder using Hardware Sharing (하드웨어 공유를 이용한 파라미터화된 비터비 복호기 설계)

  • Park, Sang-Deok;Jeon, Heung-Woo;Shin, Kyung-Wook
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2008.05a / pp.93-96 / 2008
  • This paper describes an efficient design of a multi-standard Viterbi decoder that supports multiple constraint lengths and code rates. The Viterbi decoder is parameterized for code rates 1/2 and 1/3 and constraint lengths 7 and 9, giving four operation modes. To achieve low hardware complexity and low power, an efficient architecture based on hardware sharing techniques is devised. In addition, optimizing the ACCS (Accumulate-Subtract) circuit for the one-point trace-back algorithm reduces its area by about 35% compared with the fully parallel ACCS circuit. The parameterized Viterbi decoder core has 79,818 gates and 25,600 bits of memory, and the estimated throughput is about 105 Mbps at a 70 MHz clock frequency.


Logic gate implementation of constant amplitude coded CS/CDMA transmitter (정포락선 부호화된 CS-CDMA 송신기의 논리 게이트를 이용한 구현)

  • 김성필;류형직;김명진;오종갑
    • Proceedings of the IEEK Conference / 2003.11c / pp.281-284 / 2003
  • Multi-code CDMA is an appropriate scheme for transmitting high-rate data. However, the dynamic range of the signal is large, so a power amplifier with good linearity is required. Code select CDMA (CS/CDMA) is a variation of the multi-code CDMA scheme that ensures constant amplitude transmission. In CS/CDMA, the input data selects multiple orthogonal codes, and the sum of the selected codes is MPSK modulated to convert the multi-level symbol into different carrier phases. A CS/CDMA system employs level clipping to limit the number of levels in the output symbol and so avoid a high-density signal constellation. In our previous work we showed that, by encoding the input data of CS/CDMA, the amplitude of the output symbol can be made constant. With this coding scheme, level clipping is not necessary and the output signal can be BPSK modulated for transmission. In this paper we show that the constant amplitude coded (CA-) CS/CDMA transmitter can be implemented using only logic gates, and that its hardware complexity is very low. In the proposed transmitter architecture there is no apparent redundant encoder block, which plays a major role in constant amplitude coded CS/CDMA.


Low System Complexity Bit-Parallel Architecture for Computing $AB^2+C$ in a Class of Finite Fields $GF(2^m)$ (시스템 복잡도를 개선한 $GF(2^m)$ 상의 병렬 $AB^2+C$ 연산기 설계)

  • 변기령;김흥수
    • Journal of the Institute of Electronics Engineers of Korea SC / v.40 no.6 / pp.24-30 / 2003
  • This study focuses on the arithmetic methodology and hardware implementation of a low system-complexity $AB^2+C$ operator over $GF(2^m)$ using the irreducible all-one polynomial (AOP) of degree $m$. The proposed parallel-in parallel-out operator is composed of CS, PP, and MS modules, each of which can be built as an array of AND and XOR gates. The proposed multiplier is composed of $(m+1)^2$ 2-input AND gates and $(m+1)(m+2)$ 2-input XOR gates, and its minimum propagation delay is $T_A+(1+\lceil \log_2 m \rceil)T_X$. A table compares the related $AB^2+C$ operators over $GF(2^m)$; it shows that our operator has lower circuit complexity and shorter propagation delay than the others. Moreover, the interconnections of our operator are simple and regular, and therefore well suited to VLSI implementation.
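
A small software model shows why an AOP field makes $AB^2+C$ regular. Since $(x+1)(x^m+\dots+x+1)=x^{m+1}+1$, the arithmetic can be done modulo $x^{m+1}+1$, where squaring is a bit permutation and multiplication is a cyclic convolution. The sketch below uses that standard property; it is not the CS/PP/MS module structure of the paper, and the function name and integer encoding are assumptions.

```python
def ab2_plus_c_aop(a, b, c, m):
    """Compute A*B^2 + C in GF(2^m) defined by the degree-m all-one polynomial.

    a, b, c are ints below 2**m; bit i holds the coefficient of x^i.
    Assumes the AOP of degree m is irreducible (e.g. m = 2, 4, 10, 12).
    """
    size = m + 1                      # work modulo x^(m+1) + 1
    mask = (1 << size) - 1

    # Squaring: x^i -> x^(2i mod (m+1)), a pure rewiring in hardware.
    b2 = 0
    for i in range(size):
        if (b >> i) & 1:
            b2 ^= 1 << ((2 * i) % size)

    # Cyclic carry-less product: each partial product is a rotation of A.
    acc = 0
    for i in range(size):
        if (b2 >> i) & 1:
            acc ^= ((a << i) | (a >> (size - i))) & mask

    acc ^= c                          # addition in GF(2^m) is XOR

    # Fold the extra coefficient back: x^m = x^(m-1) + ... + x + 1.
    if (acc >> m) & 1:
        acc ^= mask
    return acc
```

Because every partial product is a rotation of $A$, the AND/XOR array has the same wiring pattern in every row, which matches the regularity the abstract highlights.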

Design of Non-coherent Demodulator for LR-WPAN Systems (LR-WPAN 시스템을 위한 비동기 복조 알고리즘 및 하드웨어 구조설계)

  • Lee, Dong-Chan;Jang, Soo-Hyun;Jung, Yun-Ho
    • Journal of Advanced Navigation Technology / v.17 no.6 / pp.705-711 / 2013
  • In this paper, we present a low-complexity non-coherent demodulation algorithm and hardware architecture for LR-WPAN systems that support variable data rates for various applications. The need for LR-WPAN systems that support variable data rates is increasing with the emergence of various sensor applications. Since the existing symbol based double correlation (SBDC) algorithm requires additional complexity to support variable data rates, we propose the sample based double correlation (SPDC) algorithm, which can be implemented without increasing complexity. The proposed non-coherent demodulator was designed in Verilog HDL and implemented on an FPGA prototype board.

High-Speed Reed-Solomon Decoder Using New Degree Computationless Modified Euclid's Algorithm (새로운 DCME 알고리즘을 사용한 고속 Reed-Solomon 복호기)

  • 백재현;선우명훈
    • Journal of the Institute of Electronics Engineers of Korea SD / v.40 no.6 / pp.459-468 / 2003
  • This paper proposes a novel low-cost, high-speed Reed-Solomon (RS) decoder based on a new degree computationless modified Euclid's (DCME) algorithm. This architecture has much lower hardware complexity than conventional modified Euclid's (ME) architectures, since it completely removes the degree computation and comparison circuits. The architecture, which employs a systolic array, requires a latency of only 2t clock cycles to solve the key equation, without initial latency. In addition, the DCME architecture, which uses 3t+2 basic cells, is regular and scalable since it uses only one type of processing element. The RS decoder has been synthesized using the 0.25 $\mu$m Faraday CMOS standard cell library; it operates at 200 MHz and supports data rates of up to 1.6 Gbps. For the (255, 239, 8) RS code, the gate counts of the DCME architecture and of the whole RS decoder excluding FIFO memory are only 21,760 and 42,213, respectively. The proposed RS decoder reduces the total gate count by at least 23% and the total latency by at least 10% compared with conventional ME architectures.
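
The figures quoted in this abstract can be cross-checked with a few lines of arithmetic. The sketch below assumes that t is the error-correcting capability (n-k)/2 and that the decoder consumes one 8-bit symbol per clock cycle; the second assumption is not stated in the abstract but is consistent with 200 MHz yielding 1.6 Gbps. The function name is hypothetical.

```python
def rs_dcme_figures(n, k, symbol_bits, clock_hz):
    """Derive the headline numbers for an (n, k) RS code.

    Assumes t = (n - k) // 2 and one symbol processed per clock cycle
    (an assumption, chosen because it reproduces the quoted data rate).
    """
    t = (n - k) // 2
    return {
        "t": t,
        "key_equation_latency_cycles": 2 * t,   # "2t clock cycles"
        "basic_cells": 3 * t + 2,               # "3t+2 basic cells"
        "data_rate_bps": symbol_bits * clock_hz,
    }

figures = rs_dcme_figures(n=255, k=239, symbol_bits=8, clock_hz=200_000_000)
```

For the (255, 239, 8) code this gives t = 8, a key-equation latency of 16 cycles, 26 basic cells, and 1.6 Gbps.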

An Efficient List Successive Cancellation Decoder for Polar Codes

  • Piao, Zheyan;Kim, Chan-Mi;Chung, Jin-Gyun
    • JSTS: Journal of Semiconductor Technology and Science / v.16 no.5 / pp.550-556 / 2016
  • Polar codes are one of the most favorable capacity-achieving codes due to their simple structure and low decoding complexity. However, because of the disappointing decoding performance realized using conventional successive cancellation (SC) decoders, polar codes cannot be used directly in practical applications. In contrast to conventional SC decoders, list SC (SCL) decoders with large list sizes (e.g. 32) achieve performances very close to those of maximum-likelihood (ML) decoders. In SCL decoders with large list sizes, however, hardware increase is a severe problem because an SCL decoder with list size L consists of L copies of an SC decoder. In this paper, we present a low-area SCL decoder architecture that applies the proposed merged processing element-sharing (MPES) algorithm. A merged processing element (MPE) is the basic processing unit in SC decoders, and the required number of MPEs is L(N-1) in conventional SCL decoders. Using the proposed algorithm reduces the number of MPEs by about 70% compared with conventional SCL decoders when the list size is larger than 32.
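
To give a sense of scale for the L(N-1) figure, the following sketch evaluates it for a sample configuration and applies the roughly 70% reduction quoted above. The block length N = 1024 and list size L = 64 are illustrative assumptions, not values taken from the paper, and the function names are hypothetical.

```python
def conventional_mpe_count(list_size, block_length):
    """MPEs in a conventional SCL decoder: L copies of an SC decoder,
    each needing N - 1 merged processing elements."""
    return list_size * (block_length - 1)

def reduced_mpe_count(list_size, block_length, reduction=0.70):
    """Approximate MPE count after the quoted ~70% reduction."""
    return round(conventional_mpe_count(list_size, block_length) * (1 - reduction))

baseline = conventional_mpe_count(64, 1024)   # 65472 MPEs
shared = reduced_mpe_count(64, 1024)          # about 19642 MPEs
```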