• Title/Summary/Keyword: parallel decoding

Search Result 152, Processing Time 0.031 seconds

VLSI design of efficient VLC/VLD utilizing the characteristics of MPEG DCT coefficients (MPEG DCT 계수의 특징을 이용한 효율적인 VLC/VLD의 VLSI 설계)

  • Kong, Jong-Pil;Kim, Young-Min
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.1
    • /
    • pp.79-86
    • /
    • 1996
  • In this paper we propose an architecture for VLC(Variable Length Coder) and VLD(Variable Length Decoder) which is simple with respect to implementation point and efficient in memory. We implemented encoding and decoding circuit where we need only 7-bit address memory space for 114 MPEG1 DCT coefficients and employed minimal number of flip-flops and logics for an architecture to integrate a shift register for serial-to-parallel or parallel-to-serial conversion of the data in code mapping ROM. We obtained 50Mbps operating speed in both encoding and decoding process as the result of simulation using 0.80.8${\mu}m$ CMOS standard cells.

  • PDF

A 18-Mbp/s, 8-State, High-Speed Turbo Decoder

  • Jung Ji-Won;Kim Min-Hyuk;Jeong Jin-Hee
    • Journal of electromagnetic engineering and science
    • /
    • v.6 no.3
    • /
    • pp.147-154
    • /
    • 2006
  • In this paper, we propose and present implementation results of a high-speed turbo decoding algorithm. The latency caused by (de) interleaving and iterative decoding in a conventional maximum a posteriori(MAP) turbo decoder can be dramatically reduced with the proposed design. The source of the latency reduction is come from the combination of the radix-4, dual-path processing, parallel decoding, and rearly-stop algorithms. This reduced latency enables the use of the turbo decoder as a forward error correction scheme in real-time wireless communication services. The proposed scheme results in a slight degradation in bit-error rate(BER) performance for large block sizes because the effective interleaver size in a radix-4 implementation is reduced to half, relative to the conventional method. Fixed on the parameters of N=212, iteration=3, 8-states, 3 iterations, and QPSK modulation scheme, we designed the adaptive high-speed turbo decoder using the Xilinx chip (VIRTEX2P (XC2VP30-5FG676)) with the speed of 17.78 Mb/s. From the results, we confirmed that the decoding speed of the proposed decoder is faster than conventional algorithms by 8 times.

Design of R=1/2, K=7 Type High Speed Viterbi Decoder with Circularly Connected 2-D Analog Parallel Processing Cell Array (아날로그 2차원 셀의 순환형 배열을 이용한 R=l/2. K=7형 고속 비터비 디코더 설계)

  • 손홍락;김형석
    • The Transactions of the Korean Institute of Electrical Engineers D
    • /
    • v.52 no.11
    • /
    • pp.650-656
    • /
    • 2003
  • A high speed Viterbi decoder with a circularly connected 2-dimensional analog processing ceil array Is proposed. The proposed Viterbi .decoder has a 2-dimensional parallel processing structure in which an analog processing cell is placed at each node of a trellis diagram, the output column of the analog processing cells is connected to the decoding column, and thus, the output(last) column becomes a column right before the decoding(first) column. The reference input signal given at a decoding column is propagated to the whole network while Its magnitude is reduced by the amount of a error metric on each branch. The circuit-based decoding is done by adding a trigger signals of same magnitudes to disconnect the path corresponding to logic 0 (or 1) and by observing its effect at an output column (the former column of the decoding column). The proposed Viterbi decoder has advantages in that it is operated with better performance of error correction, has a shorter latency and requires no path memories. The performance of error correction with the proposed Viterbi decoder is tested via the software simulation.

Recent Successive Cancellation Decoding Methods for Polar Codes

  • Choi, Soyeon;Lee, Youngjoo;Yoo, Hoyoung
    • Journal of Semiconductor Engineering
    • /
    • v.1 no.2
    • /
    • pp.74-80
    • /
    • 2020
  • Due to its superior error correcting performance with affordable hardware complexity, the Polar code becomes one of the most important error correction codes (ECCs) and now intensively examined to check its applicability in various fields. However, Successive Cancellation (SC) decoding that brings the advanced Successive Cancellation List (SCL) decoding suffers from the long latency due to the nature of serial processing limiting the practical implementation. To mitigate this problem, many decoding architectures, mainly divided into pruning and parallel decoding, are presented in previous manuscripts. In this paper, we compare the recent SC decoding architectures and analyze them using a tree structure.

Parallel Descrambling of Transponder Telegram for High-Speed Train (고속철도용 트랜스폰더 텔레그램의 병렬 디스크램블링 기법)

  • Kwon, Soon-Hee;Park, Sungsoo;Shin, Dong-Joon;Lee, Jae-Ho;Ko, Kyeongjun
    • The Journal of Korean Institute of Communications and Information Sciences
    • /
    • v.41 no.2
    • /
    • pp.163-171
    • /
    • 2016
  • In order to detect the exact position of high-speed train, it is necessary to obtain location information from the transponder tag installed along the track. In this paper, we proposed parallel descrambling scheme for high-speed railway transponder system, which aims for reducing the processing time required to decode telegram. Since a telegram is stored in a tag after information bits are scrambled by an encoder, decoding procedure includes descrambling of received telegram to recover the original information bits. By analyzing the structure of the descrambling shift register circuit, we proposed a parallel descrambling scheme for fast decoding of telegram. By comparing the required number of clocks, it is shown that the proposed scheme significantly outperforms the original one.

A Parallelization Technique with Integrated Multi-Threading for Video Decoding on Multi-core Systems

  • Hong, Jung-Hyun;Kim, Won-Jin;Chung, Ki-Seok
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.7 no.10
    • /
    • pp.2479-2496
    • /
    • 2013
  • Increasing demand for Full High-Definition (FHD) video and Ultra High-Definition (UHD) video services has led to active research on high speed video processing. Widespread deployment of multi-core systems has accelerated studies on high resolution video processing based on parallelization of multimedia software. Even if parallelization of a specific decoding step may improve decoding performance partially, such partial parallelization may not result in sufficient performance improvement. Particularly, entropy decoding has often been considered separately from other decoding steps since the entropy decoding step could not be parallelized easily. In this paper, we propose a parallelization technique called Integrated Multi-Threaded Parallelization (IMTP) which takes parallelization of the entropy decoding step, with other decoding steps, into consideration in an integrated fashion. We used the Simultaneous Multi-Threading (SMT) technique with appropriate thread scheduling techniques to achieve the best performance for the entire decoding step. The speedup of the proposed IMTP method is up to 3.35 times faster with respect to the entire decoding time over a conventional decoding technique for H.264/AVC videos.

Design of a High Throughput Parallel Turbo Decoder (고 처리율 병렬 터보 복호기 설계)

  • Lee, Won-Ho;Park, Heemin;Rim, Chong S.
    • Journal of the Institute of Electronics and Information Engineers
    • /
    • v.50 no.11
    • /
    • pp.50-57
    • /
    • 2013
  • This paper provides a design of high-throughput parallel turbo decoder that is able to decode several packets of various length simultaneously. For high-speed communications, designing of Turbo decoder as parallel structures reduces the long decoding time caused by iterative turbo decode way. Also, by employing the double buffer structure for input and output packets improves the decoder throughput by enabling continuous decoding. Because parallel turbo decoder is designed to be able to decode the packet of the longest length, there exist idle PE's(Processing Element) in the case of decoding packets of short length. The main idea of this paper is to increase the utilization of PE's in parallel Turbo decoder and to improve the decoder throughput by using the idle PE's immediately for the subsequent packets decoding. For this, the control is necessary to enable the concurrent decoding of several short packets and we propose the method of this control. Applying the proposed method, we implemented Turbo Decoder with 32 PE's that can decode packets of 6144 bits maximum. Compared to the conventional Turbo decoder, although the area was increased about 16%, the decoder throughput was improved 28 times for short packets.

A Study on High Speed LDPC Decoder Based on HSS (HSS기반의 고속 LDPC 복호기 연구)

  • Jung, Ji Won
    • The Journal of Korea Institute of Information, Electronics, and Communication Technology
    • /
    • v.5 no.3
    • /
    • pp.164-168
    • /
    • 2012
  • LDPC decoder architectures are generally classified into serial, parallel and partially parallel architectures. Conventional method of LDPC decoding in general give rise to a large number of computation operations, mass power consumption, and decoding delay. It is necessary to reduce the iteration numbers and computation operations without performance degradation. This paper studies Horizontal Shuffle Scheduling (HSS) algorithm. In the result, number of iteration is half than conventional algorithm without performance degradation. Finally, this paper present design methodology of high-speed LDPC decoder and confirmed its throughput is up to about 600Mbps.

Accelerating Soft-Decision Reed-Muller Decoding Using a Graphics Processing Unit

  • Uddin, Md. Sharif;Kim, Cheol Hong;Kim, Jong-Myon
    • Asia-pacific Journal of Multimedia Services Convergent with Art, Humanities, and Sociology
    • /
    • v.4 no.2
    • /
    • pp.369-378
    • /
    • 2014
  • The Reed-Muller code is one of the efficient algorithms for multiple bit error correction, however, its high-computation requirement inherent in the decoding process prohibits its use in practical applications. To solve this problem, this paper proposes a graphics processing unit (GPU)-based parallel error control approach using Reed-Muller R(r, m) coding for real-time wireless communication systems. GPU offers a high-throughput parallel computing platform that can achieve the desired high-performance decoding by exploiting massive parallelism inherent in the algorithm. In addition, we compare the performance of the GPU-based approach with the equivalent sequential approach that runs on the traditional CPU. The experimental results indicate that the proposed GPU-based approach exceedingly outperforms the sequential approach in terms of execution time, yielding over 70× speedup.