Search | Korea Science

A New Video Coding Algorithm using 3D-Subband Coding and Lattice Vector Quantization

Park, Joong-Han;Lee, Keun-Young
- Journal of Electrical Engineering and information Science, v.2 no.6, pp.131-137, 1997
In this paper, we propose an efficient motion-adaptive 3-dimensional (3D) video coding algorithm using 3D subband coding (3D-SBC) and lattice vector quantization (LVQ) for low bit rates. Instead of splitting input video sequences into a fixed number of subbands along the temporal axis, we decompose them into temporal subbands of variable size according to the motion in the frames. Each of the 7 spatio-temporally split subbands is partitioned by a quadtree technique and coded with LVQ. The simulation results show a gain of 0.1 to 4.3 dB over H.261 in peak signal-to-noise ratio (PSNR) at a low bit rate (64 kbps).
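As a reminder of how a PSNR figure like the one quoted above is usually computed, here is a minimal sketch. It assumes 8-bit samples (peak value 255) and flat lists of pixel values; the function name `psnr` and its parameters are illustrative and are not taken from the paper.

```python
import math

def psnr(reference, decoded, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equal-length pixel lists.

    Assumption: samples are 8-bit, so the peak value defaults to 255.
    """
    if len(reference) != len(decoded) or not reference:
        raise ValueError("inputs must be non-empty and of equal length")
    # Mean squared error between the reference and the decoded samples.
    mse = sum((r - d) ** 2 for r, d in zip(reference, decoded)) / len(reference)
    if mse == 0:
        return math.inf  # identical signals
    return 10.0 * math.log10(peak * peak / mse)
```

A gain of one coder over another at the same bit rate is then simply the difference between their two PSNR values for the same reference frames.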

Auto Setup Method of Best Expression Transfer Path at the Space of Facial Expressions (얼굴 표정공간에서 최적의 표정전이경로 자동 설정 방법)

Kim, Sung-Ho
- The KIPS Transactions:PartA, v.14A no.2, pp.85-90, 2007
This paper presents a facial animation and expression control method that lets the animator select arbitrary facial frames from a facial expression space, after which the system sets up the expression transfer paths automatically. Our system creates the facial expression space from approximately 2500 captured facial frames. To create the space, we compute the distances between pairs of feature points on the face and visualize the space of expressions in 2D using multidimensional scaling (MDS). To set up the most suitable expression transfer paths, we divide the facial expression space into four fields on the basis of an arbitrary facial expression state. The system then determines the expression state at the shortest distance in each field and transfers from the current expression state to the nearest of those states. To complete the setup, the system continues the transfer by finding the second, third, or fourth nearest expression state until it finishes. When the animator selects key frames from the facial expression space, the system sets up the expression transfer paths automatically. We let animators use the system to create example animations or to control facial expressions, and evaluated the system based on the results.
https://doi.org/10.3745/KIPSTA.2007.14-A.2.085

Estimation of 2D Position and Flatness Errors for a Planar XY Stage Based on Measured Guideway Profiles

Hwang, Joo-Ho;Park, Chun-Hong;Kim, Seung-Woo
- International Journal of Precision Engineering and Manufacturing, v.8 no.2, pp.64-69, 2007
Aerostatic planar XY stages are frequently used as the main frames of precision positioning systems. Machining and assembling the rails and bed of the stage is one of the first processes performed when the system is built. When the system is complete, the 2D position, motion, and stage flatness errors are measured in tests. If the stage errors exceed the application requirements, the stage must be remachined and the assembly process must be repeated. This is difficult and time-consuming work. In this paper, a method for estimating the errors of a planar XY stage is proposed that can be applied when the rails and bed of the stage are evaluated. Profile measurements, estimates of the motion error, and 2D position estimation models were considered. A comparison of experimental results and our estimates indicated that the estimated errors were within $1\,\mu\mathrm{m}$ of their true values. Thus, the proposed estimation method for 2D position and flatness errors of an aerostatic planar XY stage is expected to be a useful tool during the assembly process of guideways.

A Single Channel Voice Activity Detection for Noisy Environments Using Wavelet Packet Decomposition and Teager Energy (웨이블렛 패킷 변환과 Teager 에너지를 이용한 잡음 환경에서의 단일 채널 음성 판별)

Koo, Boneung
- The Journal of the Acoustical Society of Korea, v.33 no.2, pp.139-145, 2014
In this paper, a feature parameter is obtained by applying the Teager energy to the WPD (Wavelet Packet Decomposition) coefficients. The threshold value is obtained from the means and standard deviations of nonspeech frames. Experimental results using the TIMIT speech and NOISEX-92 noise databases show that the proposed algorithm is superior to the typical VAD algorithm. ROC (Receiver Operating Characteristic) curves are used to compare the performance of the VADs for SNR values ranging from 10 to -10 dB.
https://doi.org/10.7776/ASK.2014.33.2.139
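The two ingredients named in the abstract above, the Teager energy operator and a threshold built from nonspeech statistics, can be illustrated with a rough sketch. This is not the authors' implementation: it skips the wavelet packet decomposition and applies the operator directly to time-domain samples, and the rule `mean + k * std` with `k = 3.0` is an assumed form of the threshold. All function names are hypothetical.

```python
import math
import statistics

def teager_energy(x):
    """Discrete Teager energy operator: psi[n] = x[n]^2 - x[n-1] * x[n+1]."""
    if len(x) < 3:
        raise ValueError("need at least 3 samples")
    return [x[n] * x[n] - x[n - 1] * x[n + 1] for n in range(1, len(x) - 1)]

def frame_feature(frame):
    """Mean absolute Teager energy of one frame (assumed feature definition)."""
    te = teager_energy(frame)
    return sum(abs(v) for v in te) / len(te)

def vad_threshold(nonspeech_frames, k=3.0):
    """Threshold from the mean and standard deviation of nonspeech features.

    Assumption: threshold = mean + k * std; the paper's exact rule is not
    given in the abstract.
    """
    feats = [frame_feature(f) for f in nonspeech_frames]
    return statistics.mean(feats) + k * statistics.pstdev(feats)

def is_speech(frame, threshold):
    """Label a frame as speech when its feature exceeds the threshold."""
    return frame_feature(frame) > threshold
```

In use, `vad_threshold` would be fitted on frames known to contain only noise, and `is_speech` would then be applied to each incoming frame.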

Parametric 3D elastic solutions of beams involved in frame structures

Bordeu, Felipe;Ghnatios, Chady;Boulze, Daniel;Carles, Beatrice;Sireude, Damien;Leygue, Adrien;Chinesta, Francisco
- Advances in aircraft and spacecraft science, v.2 no.3, pp.233-248, 2015
Frame structures have traditionally been represented as an assembly of components, each described within the beam theory framework. For frames involving complex components in which classical beam theory could fail, 3D descriptions seem the only valid route to sufficiently accurate analyses. In this work we propose a framework for frame structure analyses that proceeds by assembling the condensed parametric rigidity matrices associated with the elementary beams composing the beams involved in the frame structure. This approach allows a macroscopic analysis in which only the condensed degrees of freedom at the elementary beam interfaces are considered, while fine 3D parametric descriptions are retained for local analyses.
https://doi.org/10.12989/aas.2015.2.3.233

A Study on the Spoken Korean Citynames Using Multi-Layered Perceptron of Back-Propagation Algorithm (오차 역전파 알고리즘을 갖는 MLP를 이용한 한국 지명 인식에 대한 연구)

Song, Do-Sun;Lee, Jae-Gheon;Kim, Seok-Dong;Lee, Haing-Sei
- The Journal of the Acoustical Society of Korea, v.13 no.6, pp.5-14, 1994
This paper describes an experiment on speaker-independent automatic recognition of spoken Korean words using a multi-layered perceptron and the error back-propagation algorithm. The target words are the 50 city names of the D.D.D local numbers; 43 of them have 2 syllables and the remaining 7 have 3 syllables. The words were not segmented into syllables or phonemes; instead, feature components extracted from each word at equal intervals were applied to the neural network. This made the result independent of speech duration. The PARCOR coefficients calculated from the frames using linear predictive analysis were employed as the feature components. We sought the optimum conditions through 4 different experiments: a comparison between total and pre-classified training, the dependency of the recognition rate on the number of frames and the PARCOR order, the change in recognition due to the number of neurons in the hidden layer, and a comparison of the output pattern composition methods of the output neurons. As a result, a recognition rate of 89.6% was obtained.

Efficient Browsing Method based on Metadata of Video Contents (동영상 컨텐츠의 메타데이타에 기반한 효율적인 브라우징 기법)

Chun, Soo-Duck;Shin, Jung-Hoon;Lee, Sang-Jun
- Journal of KIISE:Computing Practices and Letters, v.16 no.5, pp.513-518, 2010
Advances in information technology, along with the proliferation of communication and multimedia, have increased the demand for digital content. Video data in digital content services such as VOD, NOD, digital libraries, IPTV, and UCC are spreading into various application fields. Video data are sequential and carry spatial and temporal information in a 3D format, which makes searching and browsing inefficient because of the long turnaround time. In this paper, we propose ATVC (Authoring Tool for Video Contents) to solve this issue. ATVC is a video editing tool that detects key frames using visual rhythm and inserts metadata such as keywords into the key frames via XML tagging. Visual rhythm maps 3D spatial and temporal information to 2D information. It is fast because it obtains pixel information without the IDCT, and it can classify edit effects such as cuts, wipes, and dissolves. Because the XML data store key frame information and keyword information in XML tags, ATVC enables efficient browsing.
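The idea of mapping a 3D video volume to a 2D visual rhythm image can be sketched as follows. This is only an illustration under stated assumptions: each frame is a decoded grayscale array (the paper instead reads pixel information from the compressed stream without the IDCT), the sampled line is the main diagonal, and a cut is flagged with a fixed mean-absolute-difference threshold. The names `visual_rhythm` and `detect_cuts` and the threshold value are hypothetical.

```python
def visual_rhythm(frames):
    """Build a visual rhythm: one column of diagonal samples per frame.

    frames: list of 2D lists of gray values (rows of pixels).
    Assumption: the diagonal is used as the sampling line.
    """
    rhythm = []
    for frame in frames:
        h, w = len(frame), len(frame[0])
        n = min(h, w)
        # Sample n pixels along the diagonal from top-left to bottom-right.
        rhythm.append([frame[i * h // n][i * w // n] for i in range(n)])
    return rhythm

def detect_cuts(rhythm, threshold=50.0):
    """Return frame indices where adjacent columns differ sharply.

    Assumption: a hard cut shows up as a large mean absolute difference
    between consecutive columns; wipes and dissolves need other tests.
    """
    cuts = []
    for t in range(1, len(rhythm)):
        prev, cur = rhythm[t - 1], rhythm[t]
        diff = sum(abs(a - b) for a, b in zip(prev, cur)) / len(cur)
        if diff > threshold:
            cuts.append(t)
    return cuts
```

The frame indices returned by `detect_cuts` could then serve as candidate key frames to which keyword metadata is attached.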

Design and Verification of LAN Emulation Function for Hybrid Two-Stage AWG based WDM-PON (혼합형 2단 AWG 기반의 WDM-PON을 위한 LAN 에뮬레이션 기능 설계 및 검증)

Han, Kyeong-Eun;Yang, Won-Hyuk;Kim, Young-Chon
- The Journal of Korean Institute of Communications and Information Sciences, v.33 no.3B, pp.91-99, 2008
In this paper, we design the ULSLE (Upper Layer Shared LAN Emulation) function to provide both efficient LAN service and compatibility with 802.1D bridges in a hybrid two-stage AWG-based WDM-PON. The ULSLE layer lies above the MAC control layer in order to provide a means of interfacing the WDM-PON with an 802.1D bridge. It also performs LAN emulation based on a PON-Tag, which is used only to decide the transmission mode and the destination of frames transmitted from ONUs. That is, downstream frames do not use the PON-Tag and rely on the destination address field of the original frame instead. This decreases the processing overhead and complexity caused by the PON-Tag at the OLT and ONUs. The designed ULSLE is verified with OPNET under specific scenarios based on transmission mode and destination.

An Application of the Kalman Filter for Attenuation of Colored Noise Superimposed on Speech Signal (칼만필터를 이용한 음성신호에 중첩된 유색잡음의 감쇠)

Gu, Bon-Eung
- The Journal of the Acoustical Society of Korea, v.13 no.2, pp.76-85, 1994
A speech enhancement algorithm that attenuates nonstationary colored noise is presented in this paper. The algorithm consists of a stationary Kalman filter and a simple speech/nonspeech detector. While conventional enhancement systems focus on stationary and/or white background noise, this study focuses on more realistic nonstationary and nonwhite noise. An AR model-based vector Kalman filter is used as the noise suppression system, and short-time energy threshold logic is used as the speech/nonspeech classifier. For Kalman filtering, noise coefficients are estimated in the nonspeech frames, and speech coefficients are estimated by applying the EM iteration algorithm. Simulation results using car noise are presented based on the signal-to-noise ratio and informal listening tests. According to the experimental results, background noise in the nonspeech frames is eliminated almost completely, while some distortion is noticeable in the speech frames. The distortion becomes more severe as the SNR is reduced to 0 dB and -5 dB. Intelligibility, however, is not degraded significantly.

A bit-rate control of MPEG-2 video coding using quantization ratio coefficient and the mean MQUANT (양자화 비례 계수와 평균 MQUANT를 이용한 MPEG-2 비디오 부호화 비트율 제어)

이근영;임용순;김주도;한승욱
- The Journal of Korean Institute of Communications and Information Sciences, v.23 no.8, pp.2025-2031, 1998
In the MPEG-2 moving picture coding standard, the bit rate control system plays a key role in determining the compression ratio and picture quality. We propose a bit rate control scheme that assigns more bits to I and P frames and uses the average MQUANT of previous macroblocks. The proposed scheme shows an improvement of about 0.9 dB in image quality compared with the bit rate control method of MPEG-2 Test Model 5.
