• Title/Summary/Keyword: encoder-decoder


Block-based Adaptive Bit Allocation for Reference Memory Reduction (효율적인 참조 메모리 사용을 위한 블록기반 적응적 비트할당 알고리즘)

  • Park, Sea-Nae;Nam, Jung-Hak;Sim, Dong-Gyu;Joo, Young-Hun;Kim, Yong-Serk;Kim, Hyun-Mun
    • Journal of the Institute of Electronics Engineers of Korea SP / v.46 no.3 / pp.68-74 / 2009
  • In this paper, we propose an effective memory reduction algorithm that reduces the size of the reference frame buffer and the memory bandwidth in a video encoder and decoder. In general video codecs, previously decoded frames must be stored and referenced to reduce temporal redundancy. Recently, reference frames have been recompressed to improve memory efficiency and to reduce the bandwidth between the main processor and external memory; however, such recompression can hurt coding efficiency. Several algorithms have been proposed to reduce the amount of reference memory with minimal quality degradation, but they still suffer from quality loss because of fixed-bit allocation. In this paper, we propose an adaptive block-based min-max quantization that considers the local characteristics of the image. In the proposed algorithm, the basic processing unit is an 8×8 block for memory alignment, and adaptive quantization is applied to each 4×4 block to minimize quality degradation. We found that the proposed algorithm obtains around a 1.7% BD-bitrate gain and a 0.03 dB BD-PSNR gain over the conventional fixed-bit min-max algorithm, while saving 37.5% of the reference memory.
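
A minimal sketch of block-based min-max quantization as described above, assuming a fixed 4-bit budget per 4×4 sub-block for illustration (the paper's adaptive bit-allocation policy is not reproduced here):

```python
import numpy as np

def minmax_quantize_block(block, bits=4):
    """Quantize a small pixel block to `bits` bits between its local min and max.
    Returns the quantized indices plus the (min, max) needed for reconstruction."""
    lo, hi = int(block.min()), int(block.max())
    levels = (1 << bits) - 1
    if hi == lo:                               # flat block: nothing to quantize
        return np.zeros_like(block), (lo, hi)
    step = (hi - lo) / levels
    idx = np.round((block.astype(np.float64) - lo) / step).astype(np.uint8)
    return idx, (lo, hi)

def minmax_dequantize_block(idx, lo, hi, bits=4):
    levels = (1 << bits) - 1
    if hi == lo:
        return np.full_like(idx, lo)
    return np.round(lo + idx.astype(np.float64) * (hi - lo) / levels).astype(np.uint8)

# Process an 8x8 memory-aligned unit as four 4x4 sub-blocks.
unit = np.random.randint(0, 256, (8, 8), dtype=np.uint8)
for y in range(0, 8, 4):
    for x in range(0, 8, 4):
        sub = unit[y:y+4, x:x+4]
        idx, (lo, hi) = minmax_quantize_block(sub, bits=4)
        rec = minmax_dequantize_block(idx, lo, hi, bits=4)
```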

FPGA-based One-Chip Architecture and Design of Real-time Video CODEC with Embedded Blind Watermarking (블라인드 워터마킹을 내장한 실시간 비디오 코덱의 FPGA기반 단일 칩 구조 및 설계)

  • 서영호;김대경;유지상;김동욱
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.8C / pp.1113-1124 / 2004
  • In this paper, we propose a hardware (H/W) structure that can compress and reconstruct the input image in real time, and we implement it on an FPGA platform using VHDL (VHSIC Hardware Description Language). All the image processing elements needed to perform both compression and reconstruction in an FPGA were considered, and each of them was mapped to hardware with a structure efficient for the FPGA. We used the DWT (discrete wavelet transform), which transforms the data from the spatial domain to the frequency domain, because we considered Motion JPEG2000 as the application. The implemented H/W is separated into a data path part and a control part. The data path part consists of the image processing blocks and the data processing blocks. The image processing blocks consist of the DWT kernel for the DWT filtering, the quantizer/Huffman encoder, the inverse adder/buffer for adding the low-frequency coefficients to the high-frequency ones in the inverse DWT operation, and the Huffman decoder. There are also interface blocks for communicating with the external application environment and timing blocks for buffering between the internal blocks. The global operations of the designed H/W are image compression and reconstruction, and it operates in units of a field, synchronized with the A/D converter. The implemented H/W uses 69% (16,980) of the LABs (Logic Array Blocks) and 9% (28,352) of the ESBs (Embedded System Blocks) in ALTERA's APEX20KC EP20K600CB652-7 FPGA chip and operates stably at a 70 MHz clock frequency, which verifies real-time operation at 60 fields/sec (30 frames/sec).
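
For reference, a minimal software sketch of the forward/inverse DWT stage described above, using a simple Haar kernel as a stand-in for whatever wavelet filter the hardware implements (the quantizer and Huffman stages are omitted):

```python
import numpy as np

def haar_dwt_2d(img):
    """One 2-D DWT level (Haar as a stand-in kernel): rows first, then columns."""
    img = img.astype(np.float64)
    lo_r = (img[:, 0::2] + img[:, 1::2]) / 2      # row low-pass
    hi_r = (img[:, 0::2] - img[:, 1::2]) / 2      # row high-pass
    ll = (lo_r[0::2, :] + lo_r[1::2, :]) / 2
    lh = (lo_r[0::2, :] - lo_r[1::2, :]) / 2
    hl = (hi_r[0::2, :] + hi_r[1::2, :]) / 2
    hh = (hi_r[0::2, :] - hi_r[1::2, :]) / 2
    return ll, lh, hl, hh

def haar_idwt_2d(ll, lh, hl, hh):
    """Inverse DWT: the 'adder/buffer' step recombines low- and high-frequency parts."""
    lo_r = np.zeros((ll.shape[0] * 2, ll.shape[1]))
    hi_r = np.zeros_like(lo_r)
    lo_r[0::2, :], lo_r[1::2, :] = ll + lh, ll - lh
    hi_r[0::2, :], hi_r[1::2, :] = hl + hh, hl - hh
    img = np.zeros((lo_r.shape[0], lo_r.shape[1] * 2))
    img[:, 0::2], img[:, 1::2] = lo_r + hi_r, lo_r - hi_r
    return img
```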

DCT-domain MPEG-2/H.264 Video Transcoder System Architecture for DMB Services (DMB 서비스를 위한 DCT 기반 MPEG-2/H.264 비디오 트랜스코더 시스템 구조)

  • Lee Joo-Kyong;Kwon Soon-Young;Park Seong-Ho;Kim Young-Ju;Chung Ki-Dong
    • The KIPS Transactions:PartB / v.12B no.6 s.102 / pp.637-646 / 2005
  • Most of the multimedia content for DMB services is provided as MPEG-2 bit streams. However, it has to be transcoded into H.264 bit streams for practical services, because the standard video codec for DMB is H.264. The existing transcoder architecture is the Cascaded Pixel-Domain Transcoding (CPDT) architecture, which consists of an MPEG-2 decoding phase followed by an H.264 encoding phase. This architecture can be implemented easily from an MPEG-2 decoder and an H.264 encoder without modifying their sources; however, it has disadvantages in transcoding time and suffers from the DCT-mismatch problem. In this paper, we propose two kinds of transcoder architectures, DCT-OPEN and DCT-CLOSED, to complement the CPDT architecture. Although DCT-OPEN has lower PSNR than CPDT due to the drift problem, it is efficient for real-time transcoding. In contrast, the DCT-CLOSED architecture achieves higher PSNR than CPDT at the cost of transcoding time.
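
A rough sketch of the contrast described above, with placeholder decode/encode callables; the requantization function illustrates the general DCT-domain idea, not the paper's specific DCT-OPEN/DCT-CLOSED designs:

```python
import numpy as np

def cpdt_transcode(mpeg2_bits, mpeg2_decode, h264_encode):
    """Cascaded pixel-domain transcoding: fully decode to pixels, then fully re-encode.
    `mpeg2_decode` and `h264_encode` are placeholders for complete codec pipelines."""
    frames = mpeg2_decode(mpeg2_bits)      # inverse DCT back to the pixel domain
    return h264_encode(frames)             # forward transform and entropy coding again

def dct_domain_requantize(coeffs, q_in, q_out):
    """Core of a DCT-domain path: requantize coefficients directly,
    skipping the IDCT/DCT round trip of the cascaded architecture."""
    dequant = coeffs * q_in                # approximate inverse quantization
    return np.round(dequant / q_out).astype(np.int32)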

Korean Abbreviation Generation using Sequence to Sequence Learning (Sequence-to-sequence 학습을 이용한 한국어 약어 생성)

  • Choi, Su Jeong;Park, Seong-Bae;Kim, Kweon-Yang
    • KIISE Transactions on Computing Practices / v.23 no.3 / pp.183-187 / 2017
  • Smartphone users prefer fast reading and texting, so they frequently use abbreviated sequences of words and phrases. Nowadays, abbreviations are widely used, from chat terms to technical terms. Therefore, gathering abbreviations would be helpful to many services, including information retrieval and recommendation systems. However, gathering abbreviations manually requires much effort and cost, because new abbreviations are continuously coined whenever new material, such as a TV program or a social phenomenon, appears. Thus, abbreviations need to be generated automatically. To generate Korean abbreviations, existing methods use a rule-based approach. The rule-based approach has limitations, in that it cannot generate irregular abbreviations, and it must also decide the correct abbreviation among the candidates generated by the rules. To address these limitations, we propose in this paper a method for generating Korean abbreviations automatically using sequence-to-sequence learning. Sequence-to-sequence learning can generate irregular abbreviations and avoids the problem of deciding the correct abbreviation among candidates, so it is well suited to generating Korean abbreviations. To evaluate the proposed method, we use two types of datasets. The experimental results show that our method is effective for irregular abbreviations.
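
A minimal character-level encoder-decoder sketch of the sequence-to-sequence setup described above, written in PyTorch; the layer sizes, tokenization, and training details are assumptions, not the authors' exact model:

```python
import torch
import torch.nn as nn

class Seq2Seq(nn.Module):
    def __init__(self, vocab_size, emb=64, hidden=128):
        super().__init__()
        self.embed = nn.Embedding(vocab_size, emb)
        self.encoder = nn.GRU(emb, hidden, batch_first=True)
        self.decoder = nn.GRU(emb, hidden, batch_first=True)
        self.out = nn.Linear(hidden, vocab_size)

    def forward(self, src_ids, tgt_ids):
        _, state = self.encoder(self.embed(src_ids))            # encode the full phrase
        dec_out, _ = self.decoder(self.embed(tgt_ids), state)   # teacher forcing
        return self.out(dec_out)                                # per-step character logits

model = Seq2Seq(vocab_size=2000)
src = torch.randint(0, 2000, (8, 12))   # batch of full phrases (character ids)
tgt = torch.randint(0, 2000, (8, 4))    # batch of abbreviations (character ids)
logits = model(src, tgt)                # in practice tgt would be shifted with a start token
loss = nn.CrossEntropyLoss()(logits.reshape(-1, 2000), tgt.reshape(-1))
```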

Real-Time Implementation of the G.729.1 Using ARM926EJ-S Processor Core (ARM926EJ-S 프로세서 코어를 이용한 G.729.1의 실시간 구현)

  • So, Woon-Seob;Kim, Dae-Young
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.8C / pp.575-582 / 2008
  • In this paper, we describe the process and results of a real-time implementation of the G.729.1 wideband speech codec, which was standardized in SG15 of ITU-T. To run the codec on an ARM926EJ-S(R) processor core in real time, we translated parts of the codec C program, including the basic operations and arithmetic functions, into assembly language. G.729.1 is an ITU-T standard wideband speech codec with variable bit rates of 8~32 kbps; its input is a quantized 16-bit PCM signal sampled at 8 kHz or 16 kHz. The codec is interoperable with G.729 and G.729A, and it is a bandwidth-extended wideband (50~7,000 Hz) version of the existing narrowband (300~3,400 Hz) codec, enhancing voice quality. The implemented G.729.1 codec has a complexity of 31.2 MCPS for the encoder and 22.8 MCPS for the decoder, and its execution time on the target is 11.5 ms in total (6.75 ms for the encoder and 4.76 ms for the decoder). The codec was tested bit-exactly against the full set of test vectors provided by ITU-T and passed all of them. It also operated well in real time on an Internet phone.
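
The basic operations mentioned above are defined by ITU-T as C reference routines; the following Python sketch only illustrates the kind of saturating 16-bit fixed-point arithmetic those routines implement (names loosely mirror the reference operators and are simplified):

```python
MAX_16, MIN_16 = 0x7FFF, -0x8000

def saturate(x):
    """Clamp to the signed 16-bit range, as the ITU-T basic operators do."""
    return max(MIN_16, min(MAX_16, x))

def add(a, b):
    """16-bit saturating add (akin to the reference 'add' basic operation)."""
    return saturate(a + b)

def l_mac(acc, a, b):
    """Simplified 32-bit multiply-accumulate with saturation (akin to 'L_mac')."""
    prod = a * b * 2                       # Q15 x Q15 -> Q31 via left shift
    return max(-0x80000000, min(0x7FFFFFFF, acc + prod))
```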

The Design of Object-based 3D Audio Broadcasting System (객체기반 3차원 오디오 방송 시스템 설계)

  • 강경옥;장대영;서정일;정대권
    • The Journal of the Acoustical Society of Korea / v.22 no.7 / pp.592-602 / 2003
  • This paper describes the basic structure of a novel object-based 3D audio broadcasting system. To overcome the limits of current uni-directional audio broadcasting services, the object-based 3D audio broadcasting system is designed to provide the ability to interact with important audio objects as well as realistic 3D effects, based on the MPEG-4 standard. The system is composed of six sub-modules. The audio input module collects the background sound object, which is recorded by a 3D microphone, and the audio objects, which are recorded by a monaural microphone or extracted through a source separation method. The sound scene authoring module edits the 3D information of the audio objects, such as acoustical characteristics, location, and directivity. It also defines the final sound scene, with a 3D background sound, that the producer intends to deliver to a receiving terminal. The encoder module encodes the scene descriptors and audio objects for effective transmission. The decoder module extracts the scene descriptors and audio objects by decoding the received bitstreams. The sound scene composition module reconstructs the 3D sound scene from the scene descriptors and audio objects. The 3D sound renderer module maximizes the 3D sound effects by adapting the final sound to the listener's acoustical environment. It also receives the user's controls on the audio objects and sends them to the scene composition module to change the sound scene.
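
A small sketch of how an audio object and a sound scene might be represented; the field names here are illustrative assumptions and do not follow the MPEG-4 AudioBIFS node syntax:

```python
from dataclasses import dataclass, field

@dataclass
class AudioObject:
    object_id: int
    pcm: bytes                           # mono source signal
    position: tuple = (0.0, 0.0, 0.0)    # 3D location in the scene
    directivity: float = 1.0             # simple directivity factor
    gain: float = 1.0

@dataclass
class SoundScene:
    background: bytes                    # 3D background sound (e.g. from a 3D microphone)
    objects: list = field(default_factory=list)

    def move_object(self, object_id, new_position):
        """A user interaction handled by the scene composition module."""
        for obj in self.objects:
            if obj.object_id == object_id:
                obj.position = new_position
```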

An I/O Interface Circuit Using CTR Code to Reduce Number of I/O Pins (CTR 코드를 사용한 I/O 핀 수를 감소 시킬 수 있는 인터페이스 회로)

  • Kim, Jun-Bae;Kwon, Oh-Kyong
    • Journal of the Korean Institute of Telematics and Electronics D / v.36D no.1 / pp.47-56 / 1999
  • As the density of logic gates in VLSI chips has rapidly increased, more I/O pins have been required, which results in a larger package size and a higher package cost. For high-I/O-count VLSI chips, the package cost is higher than the cost of the bare chip. As the density of logic gates increases, a method of reducing the number of I/O pins for a given logic complexity is therefore required. In this paper, we propose a novel I/O interface circuit using a CTR (Constant-Transition-Rate) code that reduces the number of I/O pins by 50%. The rising and falling edges of a CTR-code symbol pulse each carry 2 bits of digital data. Since each symbol of the proposed CTR code contains 4 bits of digital data, the symbol rate can be reduced by a factor of 2 compared with a conventional I/O interface circuit. The simultaneous switching noise (SSN) is also reduced, because the transition rate is constant and the transition points of the symbols are widely distributed. The channel encoder is implemented with logic circuits only, and the channel decoder is designed using an over-sampling method. Proper operation of the designed I/O interface circuit was verified by HSPICE simulation with 0.6 μm CMOS SPICE parameters. The simulation results indicate that the data transmission rate of the proposed circuit in 0.6 μm CMOS technology is more than 200 Mbps/pin. We also implemented the proposed circuit in Altera's FPGA and confirmed operation at a data transfer rate of 22.5 Mbps/pin.
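
A conceptual sketch of the CTR coding idea: every symbol has exactly one rising and one falling edge, and each edge position carries 2 bits. The slot counts and over-sampling grid below are assumptions, not the exact circuit in the paper:

```python
SLOTS = 4          # 2 bits -> 4 possible edge positions per half symbol
HALF = 8           # samples per half-symbol (over-sampling grid)

def encode_symbol(nibble):
    """Return a sampled one-symbol waveform (list of 0/1) for a 4-bit value."""
    rise = (nibble >> 2) & 0b11                 # upper 2 bits -> rising-edge slot
    fall = nibble & 0b11                        # lower 2 bits -> falling-edge slot
    wave = []
    for t in range(2 * HALF):
        if t < HALF:                            # first half: signal rises at its slot
            wave.append(1 if t >= rise * (HALF // SLOTS) else 0)
        else:                                   # second half: signal falls at its slot
            wave.append(0 if (t - HALF) >= fall * (HALF // SLOTS) else 1)
    return wave

def decode_symbol(wave):
    """Recover the 4-bit value by locating the rising and falling edges (over-sampling)."""
    rise = wave.index(1) // (HALF // SLOTS)
    fall = (wave.index(0, HALF) - HALF) // (HALF // SLOTS)
    return (rise << 2) | fall

assert all(decode_symbol(encode_symbol(n)) == n for n in range(16))
```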


VLSI Design of DWT-based Image Processor for Real-Time Image Compression and Reconstruction System (실시간 영상압축과 복원시스템을 위한 DWT기반의 영상처리 프로세서의 VLSI 설계)

  • Seo, Young-Ho;Kim, Dong-Wook
    • The Journal of Korean Institute of Communications and Information Sciences / v.29 no.1C / pp.102-110 / 2004
  • In this paper, we propose a VLSI architecture for a real-time image compression and reconstruction processor based on the 2-D discrete wavelet transform and implement it in hardware that uses minimal resources with an ASIC library. In the implemented hardware, the data path part consists of the DWT kernel for the forward and inverse wavelet transforms, the quantizer/dequantizer, the Huffman encoder/decoder, the adder/buffer for the inverse wavelet transform, and the interface modules for input/output. The control part consists of the programming register, the controller that decodes the instructions and generates the control signals, and the status register that exposes the internal state to the outside of the circuit. Depending on how it is programmed, the circuit can output wavelet coefficients, quantization coefficients or indices, or Huffman codes in image compression mode, and the Huffman decoding result, reconstructed quantization coefficients, or reconstructed wavelet coefficients in image reconstruction mode. The programming register has 16 stages, and one instruction performs one horizontal (or vertical) filtering pass at a given level; since the registers execute automatically in the right order, a 4-level discrete wavelet transform can be carried out with a single program. We synthesized the circuit with a Hynix 0.35 μm CMOS library using the Synopsys synthesis tool and extracted the gate-level netlist. Timing information was extracted from the netlist using the Vela tool, timing simulation was run on the extracted netlist and timing information with NC-Verilog, and place-and-route and layout were done with the Apollo tool. The implemented hardware has about 50,000 gates and operates stably at an 80 MHz clock frequency.
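
A sketch of the "one instruction = one horizontal or vertical filtering pass" idea: a 4-level forward DWT programmed as 8 passes over the shrinking LL band. The instruction encoding and the averaging/differencing kernel are assumptions, not the register format or filter of the actual chip:

```python
import numpy as np

def lowpass_highpass(even, odd):
    """Simple averaging/differencing standing in for the real filter kernel."""
    return (even + odd) / 2, (even - odd) / 2

def filter_pass(region, direction):
    """Apply one horizontal or vertical filtering pass to the current LL region in place."""
    if direction == "H":
        lo, hi = lowpass_highpass(region[:, 0::2], region[:, 1::2])
        region[:, :region.shape[1] // 2], region[:, region.shape[1] // 2:] = lo, hi
    else:
        lo, hi = lowpass_highpass(region[0::2, :], region[1::2, :])
        region[:region.shape[0] // 2, :], region[region.shape[0] // 2:, :] = lo, hi

program = [(level, d) for level in range(4) for d in ("H", "V")]   # 8 of the 16 stages

image = np.random.rand(256, 256)
for level, direction in program:
    size = 256 >> level                    # each level works on the current LL band
    filter_pass(image[:size, :size], direction)
```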

Adaptive Intra Prediction Method using Modified Cubic-function and DCT-IF (변형된 3차 함수와 DCT-IF를 이용한 적응적 화면내 예측 방법)

  • Lee, Han-Sik;Lee, Ju-Ock;Moon, Joo-Hee
    • Journal of Broadcast Engineering / v.17 no.5 / pp.756-764 / 2012
  • In the current HEVC design, prediction pixels are calculated by linear interpolation between two reference pixels, so good performance can hardly be expected when the difference between the two reference pixels is large. This paper derives more accurate prediction pixel values than the current linear-function approach of HEVC. While the existing prediction process uses only two reference pixels, the proposed method uses DCT-IF, which analyzes the frequency characteristics of more than two reference pixels in the frequency domain. The proposed method then calculates the prediction value adaptively, choosing among the linear function, DCT-IF, and a cubic function to obtain a more accurate interpolation value than the linear function alone. The cubic function has a steeper slope than the linear function, so it is applied to edges in the prediction unit. The complexity of the encoder and decoder in HM6.0 increases by 3% and 1%, respectively, while the BD-rate decreases by 0.4% for the luma signal (Y) and by 0.3% for each of the chroma signals (U and V) on average. These experiments show that the proposed adaptive intra prediction method using DCT-IF and the cubic function outperforms HM6.0.
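
A small sketch contrasting the linear blend with a cubic one between two reference pixels; the smoothstep-style polynomial below is only a stand-in for the paper's modified cubic function:

```python
import numpy as np

def linear_interp(p0, p1, t):
    """HEVC-style linear blend of the two reference pixels (t in [0, 1])."""
    return (1 - t) * p0 + t * p1

def cubic_interp(p0, p1, t):
    """A smoothstep-style cubic blend (stand-in for the paper's modified cubic
    function); it changes faster around the middle, which better fits edges."""
    w = 3 * t**2 - 2 * t**3
    return (1 - w) * p0 + w * p1

p0, p1 = 40, 200                      # two reference pixels with a large difference
t = np.linspace(0, 1, 8)              # positions of the predicted pixels
print(np.round(linear_interp(p0, p1, t)))
print(np.round(cubic_interp(p0, p1, t)))
```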

Adaptive In-loop Filter Method for High-efficiency Video Coding (고효율 비디오 부호화를 위한 적응적 인-루프 필터 방법)

  • Jung, Kwang-Su;Nam, Jung-Hak;Lim, Woong;Jo, Hyun-Ho;Sim, Dong-Gyu;Choi, Byeong-Doo;Cho, Dae-Sung
    • Journal of Broadcast Engineering / v.16 no.1 / pp.1-13 / 2011
  • In this paper, we propose an adaptive in-loop filter to improve coding efficiency. Recent video coding standards include the post-filter hint SEI and block-based adaptive filter control (BAFC) methods, both based on the Wiener filter, which minimizes the mean squared error between the input image and the decoded image. However, since the post-filter hint SEI is applied only to the output image, it cannot reduce the prediction errors of subsequent frames. Because BAFC is performed independently of the deblocking filter, it also suffers from high computational complexity on both the encoder and decoder sides. In this paper, we propose a low-complexity adaptive in-loop filter (LCALF) that lowers computational complexity by using the H.264/AVC deblocking filter adaptively, while showing better performance than the conventional method. In the experimental results, the computational complexity of the proposed method is about 22% lower than that of the conventional method, and its coding efficiency is about 1% better than BAFC.
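
A minimal sketch of the Wiener-filter idea behind these methods: find the FIR coefficients that minimize the MSE between the original and the decoded signal. A 1-D 5-tap filter is used for brevity; the in-loop filters above operate on 2-D blocks and signal their coefficients in the bitstream:

```python
import numpy as np

def neighborhoods(signal, taps):
    """Stack the decoded neighborhood around each sample into one row per sample."""
    pad = taps // 2
    padded = np.pad(signal, pad, mode="edge")
    return np.stack([padded[i:i + taps] for i in range(len(signal))])

def wiener_filter(decoded, original, taps=5):
    A = neighborhoods(decoded, taps)
    coeffs, *_ = np.linalg.lstsq(A, original, rcond=None)   # MSE-optimal taps
    return A @ coeffs, coeffs

original = np.sin(np.linspace(0, 6, 200))
decoded = original + 0.05 * np.random.randn(200)            # stand-in for coding noise
filtered, coeffs = wiener_filter(decoded, original)
print(np.mean((decoded - original) ** 2), np.mean((filtered - original) ** 2))
```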