• Title/Summary/Keyword: encoder-decoder

A Study on the Full-HD HEVC Encoder IP Design (고해상도 비디오 인코더 IP 설계에 대한 연구)

  • Lee, Sukho;Cho, Seunghyun;Kim, Hyunmi;Lee, Jehyun
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.12 / pp.167-173 / 2015
  • This paper presents the design of a Full-HD HEVC (High Efficiency Video Coding) encoder IP (Intellectual Property) core. The IP targets the HEVC Main profile, level 4.1, and encodes full high-definition video at 60 fps. Before the hardware and software design, an overall reference model was developed in C, and a parallel processing architecture was proposed to reduce power consumption. Firmware and driver programs for the IP were also written. A verification platform was developed, and the IP was implemented on an FPGA board to verify its function and performance on various pictures under several encoding conditions. Compared with HM-13.0, about a 35% decrease in bit rate at the same PSNR was achieved, and about a 25% decrease in power consumption was achieved in low-power mode.

Combining multi-task autoencoder with Wasserstein generative adversarial networks for improving speech recognition performance (음성인식 성능 개선을 위한 다중작업 오토인코더와 와설스타인식 생성적 적대 신경망의 결합)

  • Kao, Chao Yuan;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea / v.38 no.6 / pp.670-677 / 2019
  • Because background noise in an acoustic signal degrades the performance of speech and acoustic event recognition, extracting noise-robust acoustic features from noisy signals remains challenging. In this paper, we propose a deep learning architecture that combines a Wasserstein Generative Adversarial Network (WGAN) with a MultiTask AutoEncoder (MTAE), integrating the strengths of both so that it estimates not only the noise but also the speech features from a noisy acoustic source. The proposed MTAE-WGAN structure estimates the speech signal and the residual noise by employing a gradient penalty and a weight initialization method for the Leaky Rectified Linear Unit (LReLU) and Parametric ReLU (PReLU). With the adopted gradient penalty loss function, the MTAE-WGAN structure enhances the speech features and achieves substantial Phoneme Error Rate (PER) improvements over the stand-alone Deep Denoising Autoencoder (DDAE), MTAE, Redundant Convolutional Encoder-Decoder (R-CED), and Recurrent MTAE (RMTAE) models for robust speech recognition.

Atrous Residual U-Net for Semantic Segmentation in Street Scenes based on Deep Learning (딥러닝 기반 거리 영상의 Semantic Segmentation을 위한 Atrous Residual U-Net)

  • Shin, SeokYong;Lee, SangHun;Han, HyunHo
    • Journal of Convergence for Information Technology / v.11 no.10 / pp.45-52 / 2021
  • In this paper, we propose an Atrous Residual U-Net (AR-UNet) to improve the accuracy of U-Net-based semantic segmentation. U-Net is mainly used in fields such as medical image analysis, autonomous vehicles, and remote sensing. The conventional U-Net extracts too few features because its encoder has a small number of convolution layers. These features are essential for classifying object categories, and when they are insufficient, segmentation accuracy drops. To address this problem, the AR-UNet uses residual learning and ASPP (atrous spatial pyramid pooling) in the encoder. Residual learning improves feature extraction and helps prevent the feature loss and vanishing gradients caused by successive convolutions. In addition, ASPP enables additional feature extraction without reducing the resolution of the feature map. Experiments on the Cityscapes dataset verified the effectiveness of the AR-UNet, which produced better segmentation results than the conventional U-Net. The AR-UNet can therefore contribute to applications where accuracy is important.
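The statement that atrous (dilated) convolution extracts additional features without reducing feature-map resolution can be illustrated with a minimal 1-D sketch in plain Python. The function name `atrous_conv1d`, the kernel, and the dilation rates are illustrative assumptions and are not taken from the AR-UNet implementation.

```python
def atrous_conv1d(signal, kernel, rate):
    """1-D atrous (dilated) convolution with zero 'same' padding.

    Kernel taps are spaced `rate` samples apart, so the receptive field
    grows with `rate` while the output keeps the input's length.
    As in deep learning frameworks, the kernel is not flipped.
    """
    span = (len(kernel) - 1) * rate            # distance covered by the dilated kernel
    pad_left = span // 2
    padded = [0.0] * pad_left + list(signal) + [0.0] * (span - pad_left)
    out = []
    for i in range(len(signal)):
        taps = padded[i : i + span + 1 : rate]  # every `rate`-th sample
        out.append(sum(t * w for t, w in zip(taps, kernel)))
    return out


x = list(range(16))
kernel = [1, 2, 1]

# ASPP-style idea (assumed rates): apply several dilation rates in parallel
# to the same input and keep all outputs as separate feature channels.
features = [atrous_conv1d(x, kernel, rate) for rate in (1, 2, 4)]
```

Every channel in `features` has the same length as `x`, while the taps for rate 4 reach samples four positions away on each side.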

Selective B Slice Skip Decoding for Complexity Scalable H.264/AVC Video Decoder (H.264/AVC 복호화기의 복잡도 감소를 위한 선택적 B 슬라이스 복호화 스킵 방법)

  • Lee, Ho-Young;Kim, Jae-Hwan;Jeon, Byeung-Woo
    • Journal of the Institute of Electronics Engineers of Korea SP / v.48 no.3 / pp.79-89 / 2011
  • Recent advances in embedded processors make it possible to play back video content in real time on portable devices. Because of their limited battery capacity and low computational performance, however, portable devices still have significant difficulty decoding high-quality or high-resolution compressed video in real time. Although previous approaches achieve a complexity-scalable decoder by controlling the computational complexity of individual decoding elements, they cause significant objective quality loss due to the mismatch between encoder and decoder. In this paper, we propose a selective B slice skip-decoding method for implementing a low-complexity video decoder. The proposed method skips the decoding of B slices that satisfy the proposed conditions. The skipped slices are reconstructed by a simple method that uses adjacent reconstructed pictures. Experimental results show that the proposed method not only reduces computational complexity but also maintains subjective visual quality.
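The abstract does not specify how skipped B slices are rebuilt from adjacent reconstructed pictures. As a rough sketch only, the code below assumes a rounded per-pixel average of the previous and next reconstructed pictures; the function name and the averaging rule are hypothetical stand-ins, not the paper's method.

```python
def reconstruct_skipped(prev_pic, next_pic):
    """Stand-in reconstruction for a skipped B slice.

    Assumption: each pixel is the rounded average of the co-located pixels
    in the two neighbouring reconstructed pictures. Pictures are lists of
    rows of integer luma samples with identical dimensions.
    """
    return [
        [(a + b + 1) // 2 for a, b in zip(row_prev, row_next)]
        for row_prev, row_next in zip(prev_pic, next_pic)
    ]


prev_pic = [[100, 102], [98, 96]]
next_pic = [[104, 102], [99, 100]]
skipped = reconstruct_skipped(prev_pic, next_pic)
# skipped == [[102, 102], [99, 98]]
```

Because no residual or motion data is decoded for the skipped slice, the cost is one addition and one shift per pixel, which is the kind of saving a complexity-scalable decoder is after.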

A Complexity Reduction Method of MPEG-4 Audio Lossless Coding Encoder by Using the Joint Coding Based on Cross Correlation of Residual (여기신호의 상관관계 기반 joint coding을 이용한 MPEG-4 audio lossless coding 인코더 복잡도 감소 방법)

  • Cho, Choong-Sang;Kim, Je-Woo;Choi, Byeong-Ho
    • Journal of the Institute of Electronics Engineers of Korea SP / v.47 no.3 / pp.87-95 / 2010
  • Portable multimedia products that deliver the highest audio quality by using lossless audio codecs have been released, and the international lossless codecs MPEG-4 audio lossless coding (ALS) and MPEG-4 scalable lossless coding (SLS) were standardized by MPEG in 2006. The simple profile of MPEG-4 ALS, which supports up to stereo, was defined by MPEG in 2009. To be widely used in portable multimedia products, a lossless audio codec should have low complexity in stereo. Previous research on MPEG-4 ALS, however, has focused on improving the compression ratio, reducing complexity in multi-channel coding, and selecting the order of the linear prediction coefficients (LPCs). In this paper, the complexity and compression ratio of the MPEG-4 ALS encoder are analyzed in the simple profile, and a method to reduce the encoder's complexity is proposed. Based on this analysis, the complexity of the encoder's short-term prediction filter is reduced by using the low-complexity filter proposed in previous research to reduce the complexity of the MPEG-4 ALS decoder. We also propose a joint coding decision method that reduces the complexity while keeping the compression ratio of the MPEG-4 ALS encoder. In the proposed method, whether to perform joint coding is decided based on the relation between the cross-correlation of the residual and the compression ratio of joint coding. The performance of an MPEG-4 ALS encoder with the proposed method and the low-complexity filter is evaluated using the MPEG-4 ALS conformance test files and ordinary music files. The complexity of the encoder is reduced by about 24% compared with the MPEG-4 ALS reference encoder, while the compression ratio is comparable to that of the reference encoder.
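A joint coding decision driven by the cross-correlation of the residuals might look like the following sketch. The zero-lag normalized correlation, the threshold of 0.5, and the function names are assumptions made for illustration; the abstract does not give the actual decision rule or threshold.

```python
import math


def normalized_cross_correlation(left, right):
    """Zero-lag normalized cross-correlation of two residual blocks."""
    num = sum(l * r for l, r in zip(left, right))
    den = math.sqrt(sum(l * l for l in left) * sum(r * r for r in right))
    return num / den if den else 0.0


def use_joint_coding(left_res, right_res, threshold=0.5):
    """Decide up front whether joint coding is worth attempting.

    Assumption: highly correlated residuals suggest the inter-channel
    difference is cheaper to code, so the encoder only evaluates joint
    coding when the correlation reaches `threshold` (hypothetical value).
    """
    return normalized_cross_correlation(left_res, right_res) >= threshold


def joint_encode(left_res, right_res):
    """Simplified joint coding: keep the left residual and the difference."""
    return left_res, [r - l for l, r in zip(left_res, right_res)]


left = [4, -3, 5, -2, 6, -4]
right_similar = [5, -3, 4, -2, 7, -4]
right_unrelated = [1, 2, -1, 3, -2, -1]

decide_similar = use_joint_coding(left, right_similar)
decide_unrelated = use_joint_coding(left, right_unrelated)
```

The complexity saving in such a scheme comes from skipping the trial encode of the joint mode for blocks where the correlation test already predicts no gain.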

Wavelet Transform-Based Video Compression Techniques for Very Low Bit Rate Transmission (초저속 전송을 위한 wavelet 변환기반의 동화상 압축기술)

  • 김성환;이홍규
    • Information and Communications Magazine / v.11 no.8 / pp.60-77 / 1994
  • This paper presents a survey of video coding schemes that use the wavelet transform for videophones on very low bit rate communication channels (e.g., the 10 Kbps Public Switched Telephone Network). First, we introduce the standardization efforts toward a low bit rate videophone architecture and typical applications of low bit rate video coding. Second, we summarize the requirements on a videophone: delay, encoder/decoder complexity, low bit rate, and progressive transmission capability. Third, we review the basic theory of the wavelet transform without much mathematics and compare the wavelet transform with the short-time Fourier transform and subband filters. Fourth, we summarize the video coding schemes proposed so far and evaluate them against the requirements. Lastly, we conclude with future research directions.

Architecture Design of Turbo Codec using on-the-fly interleaving (On-the-fly 인터리빙 방식의 터보코덱의 아키텍쳐 설계)

  • Lee, Sung-Gyu;Song, Na-Gun;Kay, Yong-Chul
    • The KIPS Transactions:PartC / v.10C no.2 / pp.233-240 / 2003
  • In this paper, an improved turbo codec architecture for IMT-2000 is proposed. The encoder consists of an interleaver that uses an on-the-fly address generator and a modified shift register instead of an external RAM, and the decoder uses fewer RAMs. The proposed architecture is simulated in C and VHDL, and the BER (bit error rate) performance obtained by varying the number of iterations, interleaver block size, and code rate generally agrees with previously reported data.
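The idea of generating interleaver addresses on the fly, rather than reading a stored permutation table from RAM, can be sketched as below. The permutation used here is a simple modular-step rule chosen for illustration; it is an assumption and is not the IMT-2000 turbo interleaver or the paper's address generator.

```python
from math import gcd


def on_the_fly_addresses(block_size, step, offset=0):
    """Yield interleaver read addresses one at a time, with no stored table.

    Assumed rule: addr(i) = (offset + i * step) mod block_size, computed
    incrementally with one addition per address. `step` must be coprime
    with `block_size` so that every address is visited exactly once.
    """
    if gcd(step, block_size) != 1:
        raise ValueError("step must be coprime with block_size")
    addr = offset % block_size
    for _ in range(block_size):
        yield addr
        addr = (addr + step) % block_size


def interleave(bits, step, offset=0):
    """Read the input block in the generated address order."""
    return [bits[a] for a in on_the_fly_addresses(len(bits), step, offset)]


def deinterleave(bits, step, offset=0):
    """Write each received symbol back to the address it was read from."""
    out = [None] * len(bits)
    for pos, a in enumerate(on_the_fly_addresses(len(bits), step, offset)):
        out[a] = bits[pos]
    return out


data = list(range(8))
shuffled = interleave(data, step=3)
restored = deinterleave(shuffled, step=3)
```

Only the current address and the step need to be held in registers, which is the property that lets an address generator replace a permutation memory.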

H.263 Encoding Speed up Research (H.263 인코딩 속도향상연구)

  • 유환종;강의선;강석찬;김영환;김진구;임영환
    • Proceedings of the Korean Information Science Society Conference / 1999.10b / pp.392-394 / 1999
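  • The H.263 standard was published for transmitting video over the PSTN (Public Switched Telephone Network). Implementing video conferencing or video telephony over the low-rate PSTN with earlier coding schemes posed many data transmission problems, and H.263 was developed to address them. H.263 is based on H.261 and provides the same picture quality as H.261 with about half the amount of data. A video compression encoder generally takes much more time to process video than a decoder. For applications such as VOD that do not require real-time compression, a slow encoder is not a serious problem, but video conferencing and video telephony must perform encoding and decoding in real time, which requires expensive hardware. For this reason, the goal of this paper is to speed up H.263 encoding using software alone.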


Channel Coding Based Physical Layer Security for Wireless Networks (채널 부호화를 통한 물리계층 무선네트워크 보안기술)

  • Asaduzzaman, Asaduzzaman;Kong, Hyung Yun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.8 no.3 / pp.57-70 / 2008
  • This paper introduces a new paradigm of physical layer security through channel coding for wireless networks. The well-known spread-spectrum-based physical layer security is applicable when code division multiple access (CDMA) is used as the wireless air link interface. In our proposal, we incorporate the security protocol within channel coding, since channel coding is an essential part of all kinds of wireless communication. Channel coding has built-in security in the sense of its encoding and decoding algorithms: decoding a particular codeword is possible only when the encoding procedure is exactly known. This point is the key to our proposed security protocol. The common parameter required by both encoder and decoder is generally a generator matrix. We propose a random selection of generators according to a security key to ensure the secrecy of the network against unauthorized access. The conventional channel coding technique is therefore used as a security controller of the network in addition to its error-correcting purpose.
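Key-dependent selection of a generator matrix can be sketched as follows. Everything concrete here is an assumption for illustration: the two small (7,4) generator matrices, the SHA-256-based selection rule, and the function names are hypothetical and do not come from the paper, and a toy like this offers no real cryptographic strength.

```python
import hashlib

# Two candidate systematic (7,4) generator matrices (assumed toy examples).
GENERATORS = [
    [
        [1, 0, 0, 0, 1, 1, 0],
        [0, 1, 0, 0, 1, 0, 1],
        [0, 0, 1, 0, 0, 1, 1],
        [0, 0, 0, 1, 1, 1, 1],
    ],
    [
        [1, 0, 0, 0, 0, 1, 1],
        [0, 1, 0, 0, 1, 1, 0],
        [0, 0, 1, 0, 1, 0, 1],
        [0, 0, 0, 1, 1, 1, 1],
    ],
]


def select_generator(key: bytes, block_index: int):
    """Pick a generator pseudo-randomly from the shared key and block index.

    Transmitter and receiver holding the same key derive the same choice;
    the hash-based rule is an assumption, not the paper's protocol.
    """
    digest = hashlib.sha256(key + block_index.to_bytes(4, "big")).digest()
    return GENERATORS[digest[0] % len(GENERATORS)]


def encode(message_bits, generator):
    """Compute the codeword c = m * G over GF(2)."""
    n = len(generator[0])
    return [
        sum(m & row[j] for m, row in zip(message_bits, generator)) % 2
        for j in range(n)
    ]


key = b"shared-secret"            # hypothetical pre-shared key
message = [1, 0, 1, 1]
codewords = [encode(message, select_generator(key, i)) for i in range(4)]
```

A receiver without the key does not know which generator produced a given block, which is the property the abstract relies on; the error-correcting role of the code is unchanged.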


A Variable Rate LDPC Coded V-BLAST System (가변 부호화 율을 가지는 LDPC 부호화된 V-BLAST 시스템)

  • Noh, Min-Seok;Kim, Nam-Sik;Park, Hyun-Cheol
    • Proceedings of the IEEK Conference / 2004.06a / pp.55-58 / 2004
  • In this paper, we propose a vertical Bell Laboratories layered space-time (V-BLAST) system based on variable-rate low-density parity-check (LDPC) codes to improve receiver performance when QR decomposition interference suppression combined with interference cancellation is used over an independent Rayleigh fading channel. LDPC codes of different rates can be obtained by puncturing some rows of a given parity check matrix, which allows a single encoder and decoder to be implemented for the different-rate codes. Performance can be improved by assigning stronger LDPC codes to the lower layers than to the upper layers, because the poor SNR of the first detected data streams causes error propagation. With the same overall code rate, the V-BLAST system with different-rate LDPC codes achieves better performance (in terms of bit error rate) than the system with a constant-rate LDPC code in a fast fading channel.
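How removing rows of a parity check matrix changes the code rate can be shown with a small sketch. The 4x8 matrix below is a toy example (not low-density and not from the paper), and the helper names are hypothetical; the design rate computed here equals the true rate only when the remaining rows are linearly independent.

```python
from fractions import Fraction

# Toy parity check matrix H with m = 4 checks and block length n = 8.
H = [
    [1, 1, 0, 1, 1, 0, 0, 0],
    [0, 1, 1, 0, 0, 1, 0, 0],
    [1, 0, 1, 0, 0, 0, 1, 0],
    [0, 0, 0, 1, 0, 0, 0, 1],
]


def design_rate(parity_check):
    """Design rate (n - m) / n of an m x n parity check matrix."""
    m = len(parity_check)
    n = len(parity_check[0])
    return Fraction(n - m, n)


def puncture_rows(parity_check, rows_to_drop):
    """Return a copy of the matrix with the given check rows removed."""
    drop = set(rows_to_drop)
    return [row for i, row in enumerate(parity_check) if i not in drop]


base_rate = design_rate(H)                           # 1/2
higher_rate = design_rate(puncture_rows(H, [3]))     # 5/8
highest_rate = design_rate(puncture_rows(H, [2, 3])) # 3/4
```

Each removed check raises the rate at the same block length, so a stronger (lower-rate) code and a weaker (higher-rate) code can share one mother matrix, which is consistent with the single encoder and decoder mentioned in the abstract.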
