• Title/Summary/Keyword: ENCODER

Transform-domain Wyner-Ziv Residual Coding using Temporal Correlation (시간적 상관도를 활용한 변환 영역 잔차 신호 Wyner-Ziv 부호화)

  • Cho, Hyon-Myong;Eun, Hyun;Shim, Hiuk-Jae;Jeon, Byeung-Woo
    • Journal of Broadcast Engineering / v.17 no.1 / pp.140-151 / 2012
  • In Wyner-Ziv coding, key pictures are encoded with conventional H.264/AVC intra coding, which has low complexity. Inter coding is more efficient than intra coding, but its motion estimation makes it far more complex. Because the main feature of Wyner-Ziv coding is a low-complexity encoder, conventional inter coding is not suitable for key pictures. Inter coding with a zero motion vector, however, can be used for Wyner-Ziv key pictures instead of intra coding. Moreover, current transform-domain Wyner-Ziv residual coding exploits temporal correlation only for WZ pictures; if zero-motion coding is also used for the key pictures, R-D performance improves significantly. Experimental results show that Wyner-Ziv coding with the proposed zero-motion key picture coding is about 9% more complex than conventional Wyner-Ziv coding with intra key picture coding, but achieves BDBR gains of up to 54%. Additionally, when the proposed zero-motion key picture coding is implemented on top of transform-domain Wyner-Ziv residual coding, the BDBR rate gains reach up to 70% compared with conventional Wyner-Ziv coding with intra key picture coding.
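
As a rough illustration of why zero-motion inter coding stays cheap, the sketch below forms the residual of a block against the co-located block of the previous key picture, with no motion search at all. The function name and the toy 2x2 blocks are assumptions made for illustration; they are not taken from the paper.

```python
def zero_motion_residual(current, reference):
    """Residual against the co-located reference block, i.e. motion vector (0, 0)."""
    return [
        [cur - ref for cur, ref in zip(cur_row, ref_row)]
        for cur_row, ref_row in zip(current, reference)
    ]


# Hypothetical pixel values of co-located blocks in two consecutive key pictures.
prev_key_block = [[100, 102], [98, 101]]
cur_key_block = [[101, 102], [97, 103]]

residual = zero_motion_residual(cur_key_block, prev_key_block)
print(residual)  # [[1, 0], [-1, 2]]
```

Only subtraction is needed per pixel, which is why this variant avoids the cost of motion estimation.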

Selective Inter-layer Residual Prediction Coding and Fast Mode Decision for Spatial Enhancement Layers in Scalable Video Coding (스케일러블 비디오 부호화에서 선택적 계층간 차분 신호 부호화 및 공간적 향상 계층에서의 모드 결정)

  • Lee, Bum-Shik;Hahm, Sang-Jin;Park, Chang-Seob;Park, Keun-Soo;Kim, Mun-Churl
    • Journal of Broadcast Engineering / v.12 no.6 / pp.596-610 / 2007
  • To reduce the complexity of SVC encoding, we introduce a fast mode decision method for the enhancement layers of spatial scalability that selectively performs the inter-layer residual prediction of SVC. Inter-layer residual prediction coding in Scalable Video Coding greatly enhances coding efficiency because it exploits the correlation between the residuals of a lower spatial layer and its next higher spatial layer. However, it entails a dramatic increase in the complexity of SVC encoders. The proposed method analyzes the characteristics of the integer transform coefficients of the difference signal between the residuals of the lower and upper spatial layers. It then performs inter-layer residual prediction coding and rate-distortion optimization in the upper spatial enhancement layer only if the SAD values of the residuals exceed adaptive threshold values. By classifying the residuals according to the properties of their integer-transform coefficients using only the SAD of the residuals between the two layers, the SVC encoder can perform inter-layer residual coding selectively, which significantly reduces the total required encoding time. The proposed method reduces the total encoding time by 51.5% on average while maintaining the RD performance with negligible quality degradation.
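
The SAD-based decision described in this abstract can be sketched as follows. This is a minimal sketch under stated assumptions: the base-layer residual is assumed to be already upsampled to the enhancement-layer block size, the threshold is passed in as a plain number (the paper derives it adaptively, and the abstract does not say how), and both function names are hypothetical.

```python
def sad(block_a, block_b):
    """Sum of absolute differences between two equally sized blocks."""
    return sum(
        abs(a - b)
        for row_a, row_b in zip(block_a, block_b)
        for a, b in zip(row_a, row_b)
    )


def should_try_residual_prediction(base_residual, enh_residual, threshold):
    """Run inter-layer residual prediction only when the SAD exceeds the threshold."""
    return sad(base_residual, enh_residual) > threshold


# Hypothetical 2x2 residual blocks from the lower and upper spatial layers.
base = [[2, -1], [0, 3]]
enh = [[1, -1], [4, 0]]

print(sad(base, enh))                                  # 8
print(should_try_residual_prediction(base, enh, 5))    # True
print(should_try_residual_prediction(base, enh, 10))   # False
```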

Subjective Video Quality Evaluation of H.265/HEVC Encoded Low Resolution Videos for Ultra-Low Band Transmission System (초협대역 전송 시스템상에서 H.265/HEVC 부호화 저해상도 비디오에 대한 주관적 화질 평가)

  • Uddina, A.F.M. Shahab;Monira, Mst. Sirazam;Chung, TaeChoong;Kim, Donghyun;Choi, Jeung Won;Jun, Ki Nam;Bae, Sung-Ho
    • Journal of Broadcast Engineering / v.24 no.6 / pp.1085-1095 / 2019
  • In this paper, we perform a subjective quality assessment of low-resolution surveillance videos encoded at very low target bit-rates for use in an ultra-low band transmission system, and investigate the effects of encoding on the perceived video quality. The test videos are selected based on their spatial and temporal characteristics, which affect the perceived quality. An H.265/HEVC encoder is used to prepare the impaired sequences at three target bit-rates (20, 45, and 65 kbps), and the subjective quality assessment is conducted at a viewing distance of 3H. The experimental results show that the quality of the encoded videos can satisfy users even at a target bit-rate of 45 kbps. We also compare objective image/video quality assessment methods on the proposed dataset to measure their correlation with the subjective scores. The experimental results show that the existing methods perform poorly, which indicates the need for a better quality assessment method.
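
Measuring how well an objective metric tracks subjective scores typically comes down to a correlation coefficient. The sketch below computes the Pearson correlation in plain Python; the choice of Pearson is an assumption (the abstract does not state which correlation measure is used), and the score lists are made-up placeholders, not data from the paper.

```python
import math


def pearson(xs, ys):
    """Pearson linear correlation coefficient between two equal-length lists."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    cov = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    var_x = sum((x - mean_x) ** 2 for x in xs)
    var_y = sum((y - mean_y) ** 2 for y in ys)
    return cov / math.sqrt(var_x * var_y)


# Placeholder values: objective metric outputs and mean opinion scores.
objective_scores = [28.0, 31.0, 34.0, 30.0]
subjective_scores = [2.0, 3.0, 4.0, 3.5]

print(round(pearson(objective_scores, subjective_scores), 3))
```

A value near 1 would mean the metric ranks videos much as viewers do; the abstract reports that existing metrics fall short of this on the dataset.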

Study of Scene change Detection and Adaptive Rate Control Schemes for MPEG Video Encoder (MPEG 비디오 인코더를 위한 장면전환 검출 및 적응적 율 제어 방식 연구)

  • Nam, Jae-Yeol;Gang, Byeong-Ho;Son, Yu-Ik
    • The Transactions of the Korea Information Processing Society / v.6 no.2 / pp.534-542 / 1999
  • A well-designed rate control strategy can improve overall picture quality for video transmission over a constant bit rate channel. Because the rate control method is not a normative part of the MPEG video standard, the performance of an MPEG video codec can differ considerably depending on how the rate control scheme is implemented. The rate control scheme proposed in MPEG shows good results when no scene change occurs, but it does not properly handle scene-changed pictures. As a result, picture quality deteriorates after a scene change and the possibility of overflow becomes high. In this paper, we propose a new method for detecting scene changes using local variance, and a new scheme for determining the adaptive quantization parameter, mqunt, which takes the local characteristics of an image into account by reusing the local variance already computed in the scene change detection stage. In addition, an adaptive rate control scheme that handles scene-changed pictures very efficiently is proposed. Computer simulations are performed to verify the performance of the proposed algorithm. The suggested detection algorithm detected scene changes precisely, and the proposed rate control scheme shows better rate control performance than the conventional MPEG scheme.
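
One way a local-variance scene change detector could work is sketched below. The abstract only says that local variance is used, so the decision rule here (mean absolute change in per-block variance compared against a fixed threshold), the threshold value, and the function names are all assumptions for illustration.

```python
def block_variance(block):
    """Population variance of the pixel values in one block."""
    pixels = [p for row in block for p in row]
    mean = sum(pixels) / len(pixels)
    return sum((p - mean) ** 2 for p in pixels) / len(pixels)


def is_scene_change(prev_blocks, cur_blocks, threshold):
    """Flag a scene change when block variances shift strongly between pictures."""
    diffs = [
        abs(block_variance(cur) - block_variance(prev))
        for prev, cur in zip(prev_blocks, cur_blocks)
    ]
    return sum(diffs) / len(diffs) > threshold


# Hypothetical 2x2 blocks: a flat region and a highly textured one.
flat = [[10, 10], [10, 10]]
textured = [[0, 100], [100, 0]]

print(is_scene_change([flat], [flat], 1000))      # False
print(is_scene_change([flat], [textured], 1000))  # True
```

Because the per-block variances are computed anyway, they can be reused when choosing the quantization parameter, which is the reuse the abstract describes.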

A H.264 based Selective Fine Granular Scalable Coding Scheme (H.264 기반 선택적인 미세입자 스케일러블 코딩 방법)

  • 박광훈;유원혁;김규헌
    • Journal of KIISE:Computing Practices and Letters / v.10 no.4 / pp.309-318 / 2004
  • This paper proposes an H.264-based selective fine granular scalable (FGS) coding scheme that selectively uses temporal prediction data in the enhancement layer. The base layer of the proposed scheme is coded by the H.264 (MPEG-4 Part 10 AVC) visual coding scheme, which is the state of the art in coding efficiency. The enhancement layer is coded by the same bitplane-based algorithm as the MPEG-4 (Part 2) fine granular scalable coding scheme. In this paper, we introduce a new algorithm that uses a temporal prediction mechanism inside the enhancement layer, together with an effective selection mechanism that decides whether the temporally predicted data are sent to the decoder. Applying temporal prediction inside the enhancement layer can effectively reduce temporal redundancy, but a severe drift problem occurs, especially at low transmission bitrates, due to the mismatch between the encoder's and decoder's reference frames. The proposed algorithm uses the temporal prediction data inside the enhancement layer only when those data can significantly reduce temporal redundancy, in order to minimize the drift error and thus improve overall coding efficiency. Simulation results on several test image sequences show that the proposed scheme has 1 to 3 dB higher coding efficiency than the H.264-based FGS coding scheme, and 3 to 5 dB higher coding efficiency than the MPEG-4 FGS international standard.

Signaling Method of Multiple Motion Vector Resolutions Using Contradiction Testing (모순 검증을 통한 다중 움직임 벡터 해상도 시그널링 방법)

  • Won, Kwanghyun;Park, Younghyeon;Jeon, Byeungwoo
    • Journal of the Institute of Electronics and Information Engineers / v.52 no.7 / pp.107-118 / 2015
  • Although most current video coding standards use a fixed motion vector resolution such as quarter-pel accuracy, a scheme supporting multiple motion vector resolutions can improve coding efficiency, since it uses only the motion vector accuracy that the video content requires while also generating a more accurate motion predictor. However, signaling the resolution selected for each motion vector is an overhead. This paper proposes a contradiction testing-based signaling scheme for the motion vector resolution. The proposed method selects the best resolution for each motion vector among multiple candidates so as to minimize the number of coded bits for the motion vector. The signaling overhead is reduced by contradiction testing, which operates under a predefined criterion at both the encoder and the decoder to prune irrelevant candidate motion vector resolutions from signaling. Experimental results verify that the proposed scheme effectively reduces coded motion information, achieving a Bjøntegaard delta bit rate (BDBR) gain of about 4.01% on average (and up to 15.17%) compared to the conventional scheme with a fixed motion vector resolution.

Automatic Word Spacing of the Korean Sentences by Using End-to-End Deep Neural Network (종단 간 심층 신경망을 이용한 한국어 문장 자동 띄어쓰기)

  • Lee, Hyun Young;Kang, Seung Shik
    • KIPS Transactions on Software and Data Engineering / v.8 no.11 / pp.441-448 / 2019
  • Previous research on automatic spacing of Korean sentences has corrected spacing errors by using n-gram based statistical techniques or a morpheme analyzer to insert blanks at word boundaries. In this paper, we propose end-to-end automatic word spacing using a deep neural network. The automatic word spacing problem can be defined as a tag classification problem at the syllable level rather than the word level. For contextual representation between syllables, a Bi-LSTM encodes the dependency relationship between syllables into a fixed-length vector in a continuous vector space using forward and backward LSTM cells. To perform automatic word spacing of Korean sentences, the fixed-length contextual vector produced by the Bi-LSTM is classified into an auto-spacing tag (B or I), and a blank is then inserted in front of each B tag. For tag classification, we compose three types of classification networks: a feedforward neural network, a neural network language model, and a linear-chain CRF. To compare our models, we measure automatic word spacing performance for each of the three classification networks. Among them, the linear-chain CRF shows better performance than the other models. We used the KCC150 corpus as training and testing data.
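
The final decoding step, inserting a blank in front of each B tag, can be sketched as below. The tags are supplied by hand here rather than predicted by a network, the function name is hypothetical, and skipping the blank before a sentence-initial B tag is an assumption of this sketch.

```python
def apply_spacing(syllables, tags):
    """Join syllables, inserting a blank before every B-tagged syllable."""
    pieces = []
    for index, (syllable, tag) in enumerate(zip(syllables, tags)):
        # Assumption: no leading blank before the first syllable of the sentence.
        if tag == "B" and index > 0:
            pieces.append(" ")
        pieces.append(syllable)
    return "".join(pieces)


# Unspaced input split into syllables, with hand-written B/I tags.
syllables = ["나", "는", "학", "교", "에", "간", "다"]
tags = ["B", "I", "B", "I", "I", "B", "I"]

print(apply_spacing(syllables, tags))  # 나는 학교에 간다
```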

Automatic Text Summarization based on Selective Copy mechanism against for Addressing OOV (미등록 어휘에 대한 선택적 복사를 적용한 문서 자동요약)

  • Lee, Tae-Seok;Seon, Choong-Nyoung;Jung, Youngim;Kang, Seung-Shik
    • Smart Media Journal / v.8 no.2 / pp.58-65 / 2019
  • Automatic text summarization is the process of shortening a text document by either extraction or abstraction. Recent work applies the abstraction approach, inspired by deep learning methods that scale to large document collections. Abstractive text summarization relies on pre-generated word embedding information. Low-frequency but salient words such as technical terms are seldom included in dictionaries, which gives rise to the so-called out-of-vocabulary (OOV) problem. OOV words degrade the performance of neural Encoder-Decoder models. To address OOV words in abstractive text summarization, we propose a copy mechanism that facilitates copying new words from the target document while generating summary sentences. Unlike previous studies, the proposed approach combines accurate pointing information with a selective copy mechanism based on a bidirectional RNN and a bidirectional LSTM. In addition, a neural network gate model that estimates the generation probability and a loss function that optimizes the entire abstraction model have been applied. The dataset was constructed from a collection of abstracts and titles of journal articles. Experimental results demonstrate that both ROUGE-1 (based on word recall) and ROUGE-L (which uses the longest common subsequence) of the proposed Encoding-Decoding model improved to 47.01 and 29.55, respectively.
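
To show how a generation-probability gate lets a model emit OOV words, the sketch below mixes a vocabulary distribution with attention-based copy weights in the style of the widely used pointer-generator formulation. This is an assumption about the general technique, not the paper's exact model; the function name, probabilities, and tokens are illustrative.

```python
def final_distribution(p_gen, vocab_probs, attention, source_tokens):
    """Mix generation and copy probabilities: p_gen * P_vocab + (1 - p_gen) * attention."""
    mixed = {word: p_gen * prob for word, prob in vocab_probs.items()}
    for token, weight in zip(source_tokens, attention):
        # Source tokens missing from the vocabulary (OOV) get probability via copying only.
        mixed[token] = mixed.get(token, 0.0) + (1.0 - p_gen) * weight
    return mixed


# Hypothetical decoder step: "BDBR" appears in the source but not in the vocabulary.
vocab_probs = {"the": 0.7, "model": 0.3}
source_tokens = ["the", "BDBR"]
attention = [0.25, 0.75]

dist = final_distribution(0.6, vocab_probs, attention, source_tokens)
print(dist)
```

The OOV token ends up with nonzero probability even though the vocabulary distribution never mentions it, which is the point of the copy path.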

Detection of Zebra-crossing Areas Based on Deep Learning with Combination of SegNet and ResNet (SegNet과 ResNet을 조합한 딥러닝에 기반한 횡단보도 영역 검출)

  • Liang, Han;Seo, Suyoung
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.39 no.3 / pp.141-148 / 2021
  • This paper presents a method to detect zebra crossings using deep learning that combines SegNet and ResNet. For blind people, a safe crossing system needs to know exactly where the zebra crossings are. Zebra-crossing detection by deep learning can be a good solution to this problem, and robotic vision-based assistive technologies have sprung up over the past few years, focusing on specific scene objects using monocular detectors. These traditional methods have achieved significant results, though with relatively long processing times, and have enhanced zebra-crossing perception to a large extent. However, running all detectors jointly incurs a long latency and becomes computationally prohibitive on wearable embedded systems. In this paper, we propose a model for fast and stable segmentation of zebra crossings from captured images. The model builds on a combination of SegNet and ResNet and consists of three steps. First, the input image is subsampled to extract image features, with the convolutional neural network of ResNet modified to serve as the new encoder. Second, the original SegNet up-sampling network restores the abstract features to the original image size. Finally, the method classifies all pixels and calculates the accuracy of each pixel. The experimental results demonstrate the efficiency of the modified semantic segmentation algorithm, which runs at a relatively high computing speed.

Driving Control System applying Position Recognition Method of Ball Robot using Image Processing (영상처리를 이용하는 볼 로봇의 위치 인식 방법을 적용한 주행 제어 시스템)

  • Heo, Nam-Gyu;Lee, Kwang-Min;Park, Seong-Hyun;Kim, Min-Ji;Park, Sung-Gu;Chung, Myung-Jin
    • Journal of IKEEE / v.25 no.1 / pp.148-155 / 2021
  • As robot technology advances, research on the driving systems of mobile robots is actively being conducted. A mobile robot driving system based on two or four wheels has an advantage in unidirectional driving such as moving in a straight line, but has disadvantages when changing direction and rotating in place. A ball robot, which uses a ball as its wheel, has an advantage in omnidirectional movement, but because it is structurally unstable it requires balancing control to maintain attitude and driving control for movement. Conventional ball robots estimate position from an encoder attached to the motor, which has the limitation that errors accumulate during driving control. In this study, a driving control system is proposed that estimates the position coordinates of a ball robot through image processing and uses them for driving control. A driving control system comprising an image processing unit, a communication unit, a display unit, and a control unit for estimating the position of the ball robot was designed and manufactured. A driving control experiment with this system confirmed that the ball robot was controlled within an error range of ±50.3 mm in the x-axis direction and ±53.9 mm in the y-axis direction without accumulating errors.
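
The error accumulation that motivates the image-based approach can be seen in a toy dead-reckoning model: a small constant bias in each encoder increment grows linearly with the number of steps. The step length, bias, and step count below are made-up values, and the function name is hypothetical.

```python
def dead_reckoning(increments):
    """Integrate per-step displacement readings into a position track."""
    position = 0.0
    track = []
    for delta in increments:
        position += delta
        track.append(position)
    return track


# Hypothetical values: 10 mm true travel per step, 0.25 mm encoder bias per step.
true_step_mm = 10.0
bias_mm = 0.25
steps = 100

estimated = dead_reckoning([true_step_mm + bias_mm] * steps)
drift_mm = estimated[-1] - true_step_mm * steps
print(drift_mm)  # 25.0
```

An absolute measurement such as a camera-derived position does not integrate past readings, so its error stays bounded instead of growing with distance travelled.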