• Title/Summary/Keyword: Encoder-decoder

Latent Shifting and Compensation for Learned Video Compression (신경망 기반 비디오 압축을 위한 레이턴트 정보의 방향 이동 및 보상)

  • Kim, Yeongwoong;Kim, Donghyun;Jeong, Se Yoon;Choi, Jin Soo;Kim, Hui Yong
    • Journal of Broadcast Engineering / v.27 no.1 / pp.31-43 / 2022
  • Traditional video compression has been built on hybrid coding methods that combine motion prediction, residual coding, and quantization. With the rapid progress of artificial neural networks in recent years, research on neural-network-based image and video compression is also advancing quickly and has become competitive with traditional video codecs. This paper presents a new method for improving the performance of such neural-network-based video compression models. We adopt the rate-distortion optimization with an auto-encoder and entropy model used by existing learned video compression models, shift some components of the latent representation that are difficult for the entropy model to estimate when the compressed latent representation is transmitted from the encoder to the decoder, and finally compensate for the distortion caused by the lost information. Applied to MFVC (Motion Free Video Compression), an existing neural-network-based video compression framework, the method nearly doubles the bit savings: the BDBR (Bjøntegaard Delta-Rate) measured against H.264 improves from -14% for MFVC to -27%. The proposed method is widely applicable to neural-network-based image and video compression techniques, not only MFVC but any model that uses latent information and an entropy model.

Design and Implementation of Hybrid Network Associated 3D Video Broadcasting System (이종망 연동형 3D 비디오 방송시스템 설계 및 구현)

  • Yun, Kugjin;Cheong, Won-Sik;Lee, Jinyoung;Kim, Kyuheon
    • Journal of Broadcast Engineering / v.19 no.5 / pp.687-698 / 2014
  • ATSC is currently working on standardizing a hybrid 3DTV broadcasting service for heterogeneous network environments, having completed the service-compatible 3DTV broadcasting service standard based on the broadcasting channel. This paper proposes a converged 3D video broadcasting method over broadcasting and IP networks that guarantees Full-HD 3D quality without degrading the image quality of legacy DTV. Specifically, it describes transmission of the additional 3D video using ISO/IEC 23009-1 DASH, a robust synchronization method for heterogeneous network environments, and a system target decoder model for hybrid 3DTV receivers. Experimental results confirm that the proposed technologies can serve as core technology for hybrid 3DTV standardization and as a reference model for developing hybrid 3DTV encoders and receivers.

A Study On Development of Fast Image Detector System (고속 영상 검지기 시스템 개발에 관한 연구)

  • Kim Byung Chul;Ha Dong Mun;Kim Yong Deak
    • Journal of the Institute of Electronics Engineers of Korea SC / v.41 no.1 / pp.25-32 / 2004
  • Image processing is now very useful in several traffic applications. One reason is that such systems can be built at low cost; another is that improved hardware processing power allows the data to be processed faster. Image-based systems are of particular interest in the traffic field because they are inexpensive to install and do not obstruct traffic during installation. In this study, we propose a traffic monitoring system implemented in an embedded system environment. The system consists of two main parts: a host controller board and an image processing board. The host controller board is responsible for overall system control, the interface to the external environment, and the OSD (on-screen display). The image processing board is responsible for image input and output using a video encoder and decoder, image classification and memory control using an FPGA, and mouse signal control. Finally, for stable operation of the host controller board, the uC/OS-II operating system is ported to the board.

Fine-scalable SPIHT Hardware Design for Frame Memory Compression in Video Codec

  • Kim, Sunwoong;Jang, Ji Hun;Lee, Hyuk-Jae;Rhee, Chae Eun
    • JSTS:Journal of Semiconductor Technology and Science / v.17 no.3 / pp.446-457 / 2017
  • To reduce the size of frame memory or the bus bandwidth, frame memory compression (FMC) recompresses the reconstructed or reference frames of video codecs. This paper proposes a novel FMC design based on discrete wavelet transform (DWT) - set partitioning in hierarchical trees (SPIHT), which supports fine-scalable throughput and is area-efficient. In the proposed design, multiple cores with small block sizes are used in parallel instead of a single core with a large block size. In addition, an appropriate pipelining schedule is proposed. Compared to the previous design, the proposed design achieves a processing speed closer to the target system speed and is therefore more efficient in hardware utilization. In addition, a scheme is proposed in which two passes of SPIHT are merged into one pass, called the merged refinement pass (MRP). As the number of shifters decreases and the bit-width of the remaining shifters is reduced, the size of the SPIHT hardware decreases significantly. The proposed FMC encoder and decoder designs achieve throughputs of 4,448 and 4,000 Mpixels/s, respectively, and their gate counts are 76.5K and 107.8K. When the proposed design is applied to High Efficiency Video Coding (HEVC), it achieves 1.96% lower average BDBR and 0.05 dB higher average BDPSNR than the previous FMC design.

Distributed Matching Algorithms for Spectrum Access: A Comparative Study and Further Enhancements

  • Ali, Bakhtiar;Zamir, Nida;Ng, Soon Xin;Butt, Muhammad Fasih Uddin
    • KSII Transactions on Internet and Information Systems (TIIS) / v.12 no.4 / pp.1594-1617 / 2018
  • In this paper, we consider a spectrum access scenario that consists of two groups of users, namely Primary Users (PUs) and Secondary Users (SUs), in Cooperative Cognitive Radio Networks (CCRNs). SUs cooperatively relay PUs' messages based on Amplify-and-Forward (AF) and Decode-and-Forward (DF) cooperative techniques, in exchange for access to some of the spectrum for their secondary communications. From the literature, we found that the Conventional Distributed Algorithm (CDA) and the Pragmatic Distributed Algorithm (PDA) aim to maximize the PU sum-rate, resulting in a lower sum-rate for the SUs. In this contribution, we investigate a suite of distributed matching algorithms. More specifically, we investigate SU-based CDA (CDA-SU) and SU-based PDA (PDA-SU), which maximize the SU sum-rate. We also propose the All User-based PDA (PDA-ALL) for maximizing the sum-rates of both the PU and SU groups. A comparative study of CDA, PDA, CDA-SU, PDA-SU and PDA-ALL is conducted, and the strength of each scheme is highlighted. Different schemes may be suitable for different applications. All schemes are investigated under an idealistic scenario involving perfect coding and perfect modulation, as well as under a practical scenario involving actual coding and actual modulation. Explicitly, our practical scenario considers adaptive coded modulation based DF schemes for transmission flexibility and efficiency. More specifically, we consider the Self-Concatenated Convolutional Code (SECCC), which exhibits low complexity since it invokes only a single encoder and a single decoder. Furthermore, puncturing is employed to enhance the bandwidth efficiency of SECCC. As another enhancement, physical layer security is applied to our system by introducing a unique Advanced Encryption Standard (AES) based puncturing to our SECCC scheme.
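
The way puncturing raises bandwidth efficiency can be sketched in a few lines of Python. This is a generic illustration under stated assumptions, not the SECCC or AES-based scheme of the paper: the pattern, the function names `puncture` and `depuncture`, and the use of `None` as an erasure marker are all hypothetical. Coded bits at positions where the repeating pattern holds 0 are simply not transmitted, and the receiver reinserts erasures at those positions before decoding.

```python
def puncture(coded_bits, pattern):
    """Keep only the coded bits whose position maps to a 1 in the repeating pattern."""
    return [bit for i, bit in enumerate(coded_bits) if pattern[i % len(pattern)]]


def depuncture(received, pattern, total_len, erasure=None):
    """Rebuild a length-total_len sequence, marking punctured positions as erasures."""
    it = iter(received)
    return [next(it) if pattern[i % len(pattern)] else erasure
            for i in range(total_len)]


# Hypothetical example: 8 coded bits from a rate-1/2 code (4 information bits).
coded = [1, 0, 1, 1, 0, 0, 1, 0]
pattern = [1, 1, 1, 0]          # drop every fourth coded bit (illustrative pattern)

sent = puncture(coded, pattern)
restored = depuncture(sent, pattern, len(coded))
```

With this pattern, 6 of the 8 coded bits are sent for the same 4 information bits, so the effective code rate rises from 1/2 to 2/3 at the cost of weaker error protection.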

Hardware Architecture and its Design of Real-Time Video Compression Processor for Motion JPEG2000 (Motion JPEG2000을 위한 실시간 비디오 압축 프로세서의 하드웨어 구조 및 설계)

  • 서영호;김동욱
    • The Transactions of the Korean Institute of Electrical Engineers D / v.53 no.1 / pp.1-9 / 2004
  • In this paper, we propose a hardware (H/W) structure that can compress and reconstruct an input image in real time, and implement it on an FPGA platform using VHDL (VHSIC Hardware Description Language). All the image processing elements needed for both compression and reconstruction were considered within a single FPGA, and each was mapped to H/W with a structure efficient for the FPGA. Because Motion JPEG2000 is the target application, we use the DWT (discrete wavelet transform), which transforms data from the spatial domain to the frequency domain. The implemented H/W is divided into a data path part and a control part. The data path part consists of image processing blocks and data processing blocks. The image processing blocks consist of the DWT Kernel for DWT filtering, the Quantizer/Huffman Encoder, the Inverse Adder/Buffer for adding the low-frequency coefficients to the high-frequency ones in the inverse DWT operation, and the Huffman Decoder. There are also interface blocks for communicating with the external application environment and timing blocks for buffering between internal blocks. The global operations of the designed H/W are image compression and reconstruction, performed in units of a field synchronized with the A/D converter. The implemented H/W uses 54% (12943) of the LABs (Logic Array Blocks) and 9% (28352) of the ESBs (Embedded System Blocks) in the ALTERA APEX20KC EP20K600CB652-7 FPGA chip and operates stably at a 70 MHz clock frequency, which verifies real-time operation, that is, processing of 60 fields/sec (30 frames/sec).
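
As a rough illustration of how a DWT stage splits samples into low- and high-frequency coefficients, and how the inverse stage recombines them, the following is a minimal Python sketch. It uses a one-level integer Haar transform purely for simplicity. That choice is an assumption for illustration, not the filter or the hardware structure used in the paper, and the function names are hypothetical.

```python
def haar_forward(samples):
    """One-level integer Haar transform of an even-length list of ints.

    Returns (low, high): floor averages and differences of adjacent pairs.
    """
    if len(samples) % 2 != 0:
        raise ValueError("expected an even number of samples")
    low, high = [], []
    for i in range(0, len(samples), 2):
        a, b = samples[i], samples[i + 1]
        low.append((a + b) >> 1)   # low-frequency part: floor of the pair average
        high.append(a - b)         # high-frequency part: pair difference
    return low, high


def haar_inverse(low, high):
    """Recombine low- and high-frequency coefficients into the original samples."""
    out = []
    for s, d in zip(low, high):
        a = s + ((d + 1) >> 1)     # undo the floor average using the difference
        b = a - d
        out.extend([a, b])
    return out


row = [5, 2, 7, 7, 0, 9]           # hypothetical pixel values from one image row
low, high = haar_forward(row)
restored = haar_inverse(low, high)
```

Because the transform works on integers and the inverse exactly undoes the rounding, `restored` equals `row`; in a lossy codec the quantizer applied between the two stages is what discards information.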

A Study on the Session Description Protocol Stack for VoIP (VoIP를 위한 Session Description Protocol 스택에 관한 연구)

  • Jung, Sung-Ok;Ko, Kwang-Man
    • Journal of the Institute of Electronics Engineers of Korea TC / v.38 no.3 / pp.19-27 / 2001
  • It is very important not only to develop the protocol stack but also to pursue international standardization of the standard protocols for VoIP. Compared with advanced countries that have already had some success in commercialization, Korea has been much less involved in this technology and these efforts. In this regard, this paper focuses on developing a protocol stack consisting of an encoder/decoder, a header file generator, a syntax analyzer, and so on, based on the grammar of the Session Description Protocol specified in IETF RFC2327. To this end, we first describe the SDP BNF grammar based on the Augmented BNF of IETF RFC2327. We then produce the Abstract Syntax Tree and the header file generator for encoding/decoding by applying a syntax-directed method to the SDP grammar.
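
To make the encoder/decoder role concrete, here is a minimal Python sketch of decoding and encoding an SDP description. It relies only on the basic RFC 2327 line shape, `<type>=<value>` with a one-character type. It is not the paper's generated stack and performs no grammar validation or syntax tree construction; the function names and the sample session values are hypothetical.

```python
def parse_sdp(text):
    """Decode an SDP description into an ordered list of (type, value) pairs."""
    fields = []
    for line in text.splitlines():
        if not line:
            continue                      # tolerate blank lines
        key, sep, value = line.partition("=")
        if sep != "=" or len(key) != 1:
            raise ValueError(f"not a <type>=<value> line: {line!r}")
        fields.append((key, value))
    return fields


def encode_sdp(fields):
    """Encode (type, value) pairs back into CRLF-terminated SDP lines."""
    return "".join(f"{key}={value}\r\n" for key, value in fields)


# Hypothetical session description for illustration only.
sample = (
    "v=0\r\n"
    "o=alice 2890844526 2890842807 IN IP4 192.0.2.10\r\n"
    "s=Example call\r\n"
    "c=IN IP4 192.0.2.10\r\n"
    "t=0 0\r\n"
    "m=audio 49170 RTP/AVP 0\r\n"
)

fields = parse_sdp(sample)
round_trip = encode_sdp(fields)
```

Keeping the fields as an ordered list rather than a dictionary matters here, since SDP allows repeated types (for example several `m=` lines) and the order of lines is significant.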

Dynamic Full-Scalability-Conversion in SVC (스케일러블 비디오 코딩에서의 실시간 스케일러빌리티 변환)

  • Lee, Dong-Su;Bae, Tae-Meon;Ro, Yong-Man
    • Journal of the Institute of Electronics Engineers of Korea CI / v.43 no.6 s.312 / pp.60-70 / 2006
  • Currently, Scalable Video Coding (SVC) is being standardized. Using the scalability of SVC, a QoS-managed video streaming service is possible in heterogeneous networks with only one original bitstream. However, current SVC does not sufficiently support dynamic conversion of scalability, which limits the adaptation of bitrate to fluctuating network conditions. In this paper, we propose a dynamic full-scalability conversion method for QoS-adaptive video streaming in H.264/AVC SVC. To accomplish dynamic full-scalability conversion, we propose corresponding bitstream extraction, encoding, and decoding schemes. In the encoder, we insert an additional IDR NAL to solve the problems of spatial scalability conversion. In the extractor, we analyze the SVC bitstream to obtain the information that enables dynamic extraction; using this information, real-time extraction is achieved. Finally, we develop the decoder so that it can handle a changing bitrate to support real-time full scalability. The experimental results verify dynamic full-scalability conversion and show that it is necessary under time-varying network conditions.

Edge-Directional Joint Disparity-Motion Estimation of Stereoscopic Sequences (경계 방향성을 고려한 스테레오 동영상의 움직임-변이 동시추정 기법)

  • 김용태;서형갑;박창섭;이재호;손광훈
    • Journal of Broadcast Engineering / v.9 no.3 / pp.196-206 / 2004
  • This paper presents an efficient joint disparity-motion estimation algorithm for a stereo sequence CODEC. Disparity vectors are estimated for every frame from the left and right motion vectors and the previous disparity vectors. To obtain more accurate disparity vectors, we include a spatial prediction process after the joint estimation. With joint estimation and spatial prediction, we obtain accurate disparity vectors and thereby increase coding efficiency. Finally, we propose a backward quadtree decomposition, which helps the encoder build a more detailed disparity vector map without transmitting additional coding bits for quadtree information. Computer simulation confirms the superior performance of the proposed method.

2D Game Image Color Synthesis System Using Convolutional Neural Network (컨볼루션 인공신경망을 이용한 2차원 게임 이미지 색상 합성 시스템)

  • Hong, Seung Jin;Kang, Shin Jin;Cho, Sung Hyun
    • Journal of Korea Game Society / v.18 no.2 / pp.89-98 / 2018
  • Recent neural network techniques have shown good performance in content generation, such as image generation, in addition to conventional classification and clustering problems. In this study, we propose an image generation method using an artificial neural network as a next-generation content creation technique. The proposed model receives two images and combines them into a new image by taking color from one image and shape from the other. The model is a convolutional neural network with two encoders, which extract color and shape from the images, and a decoder, which takes the outputs of both encoders and generates the combined image. The results of this work can be applied at low cost to various 2D image generation and modification tasks in the game development process.