• Title/Summary/Keyword: SD/HD Video


Multi-Threaded Parallel H.264/AVC Decoder for Multi-Core Systems (멀티코어 시스템을 위한 멀티스레드 H.264/AVC 병렬 디코더)

  • Kim, Won-Jin;Cho, Keol;Chung, Ki-Seok
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.47 no.11
    • /
    • pp.43-53
    • /
    • 2010
  • The wide deployment of high-resolution video services has spurred active research on high-speed video processing. In particular, the prevalence of multi-core systems is accelerating research on high-resolution video processing through parallelization of multimedia software. In this paper, we propose a novel parallel H.264/AVC decoding scheme for a multi-core platform. Parallel H.264/AVC decoding is challenging not only because parallelization may incur significant synchronization overhead but also because the software may have complicated dependencies. To overcome these issues, we propose a novel approach called Multi-Threaded Parallelization (MTP). In MTP, a separate thread is allocated to each stage of the pipeline to reduce synchronization overhead. In addition, an efficient memory reuse technique is used to reduce the memory requirement. To verify the effectiveness of the proposed approach, we parallelized the FFmpeg H.264/AVC decoder with the proposed technique using OpenMP and carried out experiments on an Intel quad-core platform. The proposed design performs 53% better than the FFmpeg H.264/AVC decoder before parallelization. We also reduced memory usage by 65% and 81% for high-definition (HD) and full high-definition (FHD) video, respectively, compared with a popular existing method called 2Dwave.
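
The "one thread per pipeline stage" idea in the abstract above can be sketched as follows. This is only an illustration of the structure, not the authors' implementation (which used OpenMP inside FFmpeg); the stage functions, `run_pipeline`, and `stage_worker` are hypothetical names, and Python threads will not reproduce the reported speedup.

```python
import queue
import threading

_SENTINEL = object()  # end-of-stream marker passed between stages


def stage_worker(func, inbox, outbox):
    """Run one pipeline stage in its own thread until the stream ends."""
    while True:
        item = inbox.get()
        if item is _SENTINEL:
            outbox.put(_SENTINEL)  # propagate shutdown to the next stage
            return
        outbox.put(func(item))


def run_pipeline(stages, items):
    """Push items through the stages, one dedicated thread per stage.

    Each stage reads from its own FIFO queue, so the only synchronization
    is the hand-off between neighbouring stages, and order is preserved.
    """
    queues = [queue.Queue() for _ in range(len(stages) + 1)]
    threads = [
        threading.Thread(target=stage_worker, args=(func, queues[i], queues[i + 1]))
        for i, func in enumerate(stages)
    ]
    for t in threads:
        t.start()
    for item in items:
        queues[0].put(item)
    queues[0].put(_SENTINEL)

    results = []
    while True:
        out = queues[-1].get()
        if out is _SENTINEL:
            break
        results.append(out)
    for t in threads:
        t.join()
    return results


# Hypothetical stand-ins for decoder stages (names are assumptions).
def parse(frame):
    return frame + ":parsed"


def reconstruct(frame):
    return frame + ":reconstructed"


def deblock(frame):
    return frame + ":filtered"


decoded = run_pipeline([parse, reconstruct, deblock], ["f0", "f1", "f2"])
```

While stage 2 works on frame `f0`, stage 1 can already start on `f1`, which is where the overlap in a stage-per-thread design comes from.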

Design of Scalable Intra-prediction Architecture for H.264 Decoders (H.264 복호기를 위한 스케일러블 인트라 예측기 구조 설계)

  • Lee, Chan-Ho
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.45 no.11
    • /
    • pp.77-82
    • /
    • 2008
  • H.264 is a video coding standard of ITU-T and ISO/IEC, and its applications are spreading widely thanks to its high image quality and a compression ratio more than twice that of MPEG-2. Its architecture varies with demand, since it is applied to images ranging from small QVGA to large HD. In this paper, we propose a scalable architecture for intra-prediction in H.264 decoders. The proposed architecture can accommodate up to 4 processing elements depending on performance demands, and it reduces the number of memory accesses through efficient memory management so as to be energy-efficient. We design the intra-prediction unit in Verilog-HDL and verify it by prototyping on an FPGA. The performance is analyzed using the results of the design.

A Parallel Hardware Architecture for H.264/AVC Deblocking Filter (H.264/AVC를 위한 블록현상 제거필터의 병렬 하드웨어 구조)

  • Jeong, Yong-Jin;Kim, Hyun-Jip
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.43 no.10 s.352
    • /
    • pp.45-53
    • /
    • 2006
  • In this paper, we propose a parallel hardware architecture for the deblocking filter in H.264/AVC. The deblocking filter is highly effective in H.264/AVC, but it also has high computational complexity. For real-time video processing, we chose an architecture with two parallel 1-D filters and reduced memory accesses using dual-port SRAM. The proposed architecture was described in Verilog-HDL and synthesized with the Hynix 0.25um CMOS cell library using Synopsys Design Compiler. The hardware size was about 27.3K logic gates (without on-chip memory), and the maximum operating frequency was 100 MHz. It takes 258 clock cycles to process one macroblock, which means it can process 47.8 HD 1080p (1920 x 1080 pixels) frames per second. The architecture therefore appears suitable for real-time H.264/AVC encoding and decoding in various multimedia applications.
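
The frame-rate figure in the abstract above follows from the clock frequency and the cycles spent per macroblock. A minimal sketch of that arithmetic, assuming 16 x 16 macroblocks and a frame of exactly 1920 x 1080 pixels (the function name is hypothetical):

```python
def frames_per_second(clock_hz, cycles_per_macroblock, width, height, mb_size=16):
    """Throughput of a block-serial unit that spends a fixed cycle count per macroblock."""
    macroblocks_per_frame = (width * height) / (mb_size * mb_size)
    return clock_hz / (cycles_per_macroblock * macroblocks_per_frame)


# 100 MHz clock, 258 cycles per macroblock, 1920 x 1080 frame (8100 macroblocks)
fps = frames_per_second(100e6, 258, 1920, 1080)
print(f"{fps:.2f} frames per second")
```

This gives roughly 47.85 frames per second, consistent with the 47.8 quoted in the abstract. If the frame height is padded to 1088 lines (8160 macroblocks), the same formula gives about 47.5.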

Fast block error detection method in video using a corner information and Adaboost recognition technology (코너 정보와 Adaboost 인식 기술을 이용한 비디오 내의 블록 오류 고속 검출 방법)

  • Ha, Myunghwan;Lee, Moonsik;Park, Sungchoon;Ahn, Kiok;Kim, Min-Gi
    • Proceedings of the Korean Society of Broadcast Engineers Conference
    • /
    • 2011.11a
    • /
    • pp.58-61
    • /
    • 2011
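  • Broadcast content production relies on equipment such as cameras, VCRs, NLEs, and encoders, and unexpected video and audio errors can occur in the content for various reasons, such as faulty VCR heads, tape aging or poor storage, NLE editing errors, and faulty encoder equipment. To solve this problem, an automatic inspection system is needed that can check for the various video and audio errors contained in content. In this paper, among the methods for automatically inspecting such errors, we describe a fast detection method that targets the block errors often found in video. The proposed method consists of two stages. The first stage counts the corners in every frame of the video and examines how the corner count changes over time to find candidate frames expected to contain block errors. The second stage applies a classifier trained with Adaboost recognition technology to the candidate frames to detect the frames that actually contain block errors. Experiments with the implemented system confirmed that block errors in video can be detected accurately and quickly on a per-frame basis. Inspection runs at about 2.3 times real-time speed for SD video and at about 0.8 times real-time speed for HD video.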
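
The two-stage structure described in the abstract above can be sketched as below. This is a simplified illustration, not the authors' method: the use of an absolute frame-to-frame difference, the `threshold` parameter, and the `classifier` callable (a stand-in for the trained Adaboost classifier) are all assumptions.

```python
def candidate_frames(corner_counts, threshold):
    """Stage 1: indices of frames whose corner count changes sharply
    relative to the previous frame (absolute difference above threshold)."""
    return [
        i
        for i in range(1, len(corner_counts))
        if abs(corner_counts[i] - corner_counts[i - 1]) > threshold
    ]


def detect_block_errors(corner_counts, frames, classifier, threshold):
    """Stage 2: run the (expensive) classifier only on stage-1 candidates."""
    return [
        i
        for i in candidate_frames(corner_counts, threshold)
        if classifier(frames[i])
    ]


# Toy data: frame 3 has a burst of corners, as a block error might produce.
counts = [100, 102, 101, 180, 99, 100]
frames = ["ok", "ok", "ok", "blocky", "ok", "ok"]


def toy_classifier(frame):
    # Hypothetical stand-in for a trained classifier.
    return frame == "blocky"


candidates = candidate_frames(counts, threshold=50)
errors = detect_block_errors(counts, frames, toy_classifier, threshold=50)
```

Here stage 1 flags frames 3 and 4 (the jump up and the drop back down), and the classifier keeps only frame 3. Filtering cheaply first is what lets the classifier run on a small fraction of the frames.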


Design of High-Performance Motion Estimation Circuit for H.264/AVC Video CODEC (H.264/AVC 동영상 코덱용 고성능 움직임 추정 회로 설계)

  • Lee, Seon-Young;Cho, Kyeong-Soon
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.7
    • /
    • pp.53-60
    • /
    • 2009
  • Motion estimation for the H.264/AVC video CODEC is very complex and requires a huge amount of computation because it uses multiple reference frames and variable block sizes. We propose the architecture of a high-performance integer-pixel motion estimation circuit based on fast algorithms for multiple reference frame selection, block matching, block mode decision, and motion vector estimation. We also propose the architecture of a high-performance interpolation circuit for sub-pixel motion estimation. We described the RTL circuit in Verilog HDL and synthesized the gate-level circuit using a 130nm standard cell library. The integer-pixel motion estimation circuit consists of 77,600 logic gates and four $32\times8\times32$-bit dual-port SRAMs. It has a maximum operating frequency of 161MHz and can process up to 51 D1 (720$\times$480) color image frames per second. The fractional motion estimation circuit consists of 22,478 logic gates. It has a maximum operating frequency of 200MHz and can process up to 69 1080HD (1,920$\times$1,088) color image frames per second.

4-way Search Window for Improving The Memory Bandwidth of High-performance 2D PE Architecture in H.264 Motion Estimation (H.264 움직임추정에서 고속 2D PE 아키텍처의 메모리대역폭 개선을 위한 4-방향 검색윈도우)

  • Ko, Byung-Soo;Kong, Jin-Hyeung
    • Journal of the Institute of Electronics Engineers of Korea SD
    • /
    • v.46 no.6
    • /
    • pp.6-15
    • /
    • 2009
  • In this paper, a new 4-way search window is designed for the high-performance 2D PE architecture in H.264 Motion Estimation (ME) to improve the memory bandwidth. While existing 2D PE architectures reuse the overlapped data of adjacent search windows scanned in 1 or 3 ways, the new window uses the overlapped data of adjacent search windows as well as of adjacent multiple scanning (window) paths to increase the reuse of retrieved search window data. To scan adjacent windows along multiple paths, rather than using single raster or zigzag scanning of adjacent windows, bidirectional row and column window scanning is used, which yields the 4-way (up, down, left, right) search window. The proposed 4-way search window improves the reuse of overlapped window data and reduces the redundancy access factor to 3.1, whereas the 1/3-way search window requires $7.7{\sim}11$ times redundant data retrieval. Thus, the new 4-way search window scheme improves the memory bandwidth by $58{\sim}70\%$ compared with the 1/3-way search window. The 2D PE architecture in H.264 ME for the 4-way search window consists of a $16{\times}16$ PE array, which computes the absolute difference between the current and reference frames, and a $5{\times}16$ reuse array, which stores the overlapped data of adjacent search windows and multiple scanning paths. The reference data can be loaded upward or downward into the new 2D PE depending on the scanning direction, and the reuse array is combined with the PE array, rotating left as well as right, to use the overlapped data of adjacent multiple scan paths. In experiments, the new implementation of the 4-way search window on Magnachip 0.18um can process HD ($1280{\times}720$) video with 1 reference frame, a $48{\times}48$ search area, and $16{\times}16$ macroblocks at 30fps at 149.25MHz.
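
The bandwidth figures in the abstract above can be roughly cross-checked from the redundancy access factors. The sketch below assumes that 3.1 is the factor achieved by the 4-way window and that 7.7 and 11 are the two ends of the 1/3-way range; this reading, and the function name, are assumptions rather than details confirmed by the abstract.

```python
def access_reduction(baseline_factor, new_factor):
    """Fraction of redundant memory accesses saved relative to a baseline factor."""
    return 1.0 - new_factor / baseline_factor


low = access_reduction(7.7, 3.1)    # against the low end of the 1/3-way range
high = access_reduction(11.0, 3.1)  # against the high end of the 1/3-way range
print(f"{low:.1%} to {high:.1%}")
```

Under these assumptions the saving comes out at about 60% to 72%, close to, though not exactly matching, the range quoted in the abstract; the paper presumably uses unrounded factors or a slightly different definition.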