• Title/Summary/Keyword: Image synthesis

Search results: 446

A Simulation Study of the Vocal Tract in Tracheoesophageal Speaker

  • Kim, Cheol-Soo;Wang, Soo-Geun;Roh, Hwan-Jung;Goh, Eui-Kyung;Chon, Kyong-Myong;Lee, Byung-Joo;Kwon, Soon-Bok;Lee, Suck-Hong;Kim, Hak-Jin;Yang, Byung-Gon
    • Speech Sciences / v.7 no.3 / pp.197-218 / 2000
  • The vocal tract shapes of tracheoesophageal speakers were measured during sustained phonation of the five Korean vowels /u/, /o/, /a/, /e/, /i/ using magnetic resonance imaging (MRI). The subjects' original vowel utterances were assessed for speech intelligibility, and the vowels synthesized from the MR images were analyzed. The results were as follows: (1) The vowels /a/, /e/, and /i/ were perceived as the same sounds as the subject's actual speech, but the vowels /o/ and /u/ were perceived as /ə/ and a strained /u/, respectively. (2) The vowels /a/ and /e/ synthesized from the MR images were perceived as the same sounds, but the vowels /u/, /o/, and /i/ were perceived as different sounds. (3) The vowel synthesized with the pharyngeal segment expanded threefold in the vowel /o/ was perceived as more natural than that expanded twofold. Pharyngeal areas of varied sizes should be tested to secure better speech production, because correct vocal tract shapes lead to distinct vowel production.

Stereoscopic Video Display System Based on H.264/AVC (H.264/AVC 기반의 스테레오 영상 디스플레이 시스템)

  • Kim, Tae-June;Kim, Jee-Hong;Yun, Jung-Hwan;Bae, Byung-Kyu;Kim, Dong-Wook;Yoo, Ji-Sang
    • The Journal of Korean Institute of Communications and Information Sciences / v.33 no.6C / pp.450-458 / 2008
  • In this paper, we propose a real-time stereoscopic display system based on H.264/AVC. We first acquire stereo-view images from a stereo web-cam using the OpenCV library. As a preprocessing step, the captured images are converted to the YUV 4:2:0 format. The inputs are encoded at more than 30 fps by a stereo encoder with the proposed estimation structure. The encoded bitstream is decoded by a stereo decoder that reconstructs the left and right images. The reconstructed stereo images are post-processed with a stereoscopic image-synthesis technique to offer users more realistic images with a 3D effect. Experimental results show that the proposed system has better encoding efficiency than a conventional stereo CODEC (coder and decoder) and operates in real time with low complexity, making it suitable for mobile applications.
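
The preprocessing step above (converting captured BGR frames to planar YUV 4:2:0) can be sketched in a few lines. This is a minimal illustration, not the paper's implementation; the BT.601 coefficients and frame size below are assumptions.

```python
import numpy as np

def bgr_to_yuv420(frame):
    """BT.601 BGR -> planar YUV 4:2:0: full-resolution luma (Y) plus
    2x2-subsampled chroma (U, V), as commonly used before H.264 encoding."""
    b = frame[..., 0].astype(np.float32)
    g = frame[..., 1].astype(np.float32)
    r = frame[..., 2].astype(np.float32)
    y = 0.299 * r + 0.587 * g + 0.114 * b
    u = -0.169 * r - 0.331 * g + 0.5 * b + 128.0
    v = 0.5 * r - 0.419 * g - 0.081 * b + 128.0
    # 4:2:0 chroma subsampling: average each 2x2 block of U and V.
    u_sub = u.reshape(u.shape[0] // 2, 2, u.shape[1] // 2, 2).mean(axis=(1, 3))
    v_sub = v.reshape(v.shape[0] // 2, 2, v.shape[1] // 2, 2).mean(axis=(1, 3))
    clip = lambda p: np.clip(p, 0, 255).astype(np.uint8)
    return clip(y), clip(u_sub), clip(v_sub)

# Stand-in for one frame of the stereo web-cam capture (dimensions must be even).
left = np.full((240, 320, 3), 128, dtype=np.uint8)
y, u, v = bgr_to_yuv420(left)
```

The same conversion is applied to both views before they enter the stereo encoder.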

Colloidal synthesis of IR-luminescent HgTe quantum dots (콜로이드 합성법에 의한 HgTe 양자점의 제조와 특성 분석)

  • Song, Hyun-Woo;Cho, Kyoung-Ah;Kim, Hyun-Suk;Kim, Sang-Sig
    • Proceedings of the Korean Institute of Electrical and Electronic Material Engineers Conference / 2002.11a / pp.31-34 / 2002
  • HgTe quantum dots were synthesized in aqueous solution at room temperature by a colloidal method. The synthesized materials were identified as zincblende cubic-structured HgTe quantum dots by X-ray diffraction, and transmission electron microscopy images revealed that these quantum dots are agglomerates of individual particles. The colloidally prepared HgTe quantum dots have a sphere-like shape with a diameter of approximately 4 nm. The optical properties of the HgTe quantum dots were investigated by photoluminescence (PL). The PL appears in the near-infrared region, which represents a dramatic shift from bulk HgTe behavior. The analytic results revealed that the HgTe quantum dots have a broad size distribution, as the PL emission spectrum covers the spectral region from 900 to 1400 nm. In this study, the factors affecting the PL and particle size distribution of HgTe quantum dots are described.

Pattern-based Depth Map Generation for Low-complexity 2D-to-3D Video Conversion (저복잡도 2D-to-3D 비디오 변환을 위한 패턴기반의 깊이 생성 알고리즘)

  • Han, Chan-Hee;Kang, Hyun-Soo;Lee, Si-Woong
    • The Journal of the Korea Contents Association / v.15 no.2 / pp.31-39 / 2015
  • 2D-to-3D video conversion imparts 3D effects to a 2D video by generating stereoscopic views using depth cues inherent in the 2D video. This technology would be a good solution to the problem of 3D content shortage during the transition to a fully mature 3D video era. In this paper, a low-complexity depth generation method for 2D-to-3D video conversion is presented. For temporal consistency of the global depth, a pattern-based depth generation method is newly introduced. A low-complexity refinement algorithm for local depth is also provided to improve 3D perception in object regions. Experimental results show that the proposed method outperforms conventional methods in terms of complexity and subjective quality.
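
The idea of a global depth pattern plus view synthesis can be illustrated with a minimal sketch. The bottom-near vertical gradient pattern and the naive horizontal pixel shifting below are illustrative assumptions; the paper's actual pattern set and local depth refinement are not reproduced here.

```python
import numpy as np

def global_depth_pattern(h, w):
    """One common global pattern: depth (nearness) increases toward the
    bottom of the frame, constant along each row."""
    return np.tile(np.linspace(0.0, 1.0, h)[:, None], (1, w))

def render_right_view(left, depth, max_disp=8):
    """Naive depth-image-based rendering: shift each pixel horizontally
    by a disparity proportional to its depth; unfilled positions keep
    the original pixel as a trivial hole fill."""
    h, w = left.shape[:2]
    right = left.copy()
    disp = (depth * max_disp).astype(int)
    for yy in range(h):
        for xx in range(w):
            nx = xx - disp[yy, xx]
            if 0 <= nx < w:
                right[yy, nx] = left[yy, xx]
    return right

left = np.random.randint(0, 256, (48, 64), dtype=np.uint8)  # stand-in 2D frame
depth = global_depth_pattern(48, 64)
right = render_right_view(left, depth)
```

Pairing the original frame with the rendered view yields the stereoscopic output; temporal consistency follows from the pattern depending only on image geometry, not content.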

Listener Auditory Perception Enhancement using Virtual Sound Source Design for 3D Auditory System

  • Kang, Cheol Yong;Mariappan, Vinayagam;Cho, Juphil;Lee, Seon Hee
    • International journal of advanced smart convergence / v.5 no.4 / pp.15-20 / 2016
  • When a virtual sound source for a 3D auditory system is reproduced by a linear loudspeaker array, listeners can perceive not only the direction of the source but also its distance. Control over perceived distance has often been implemented via the adjustment of various acoustic parameters, such as loudness, spectral change, and the direct-to-reverberant energy ratio; however, there is a neglected yet powerful cue to the distance of a nearby virtual sound source that can be manipulated for sources positioned away from the listener's median plane. This paper addresses the problem of generating binaural signals for moving sources in closed or open environments. The proposed perceptual enhancement algorithm is composed of three main parts: propagation, reverberation, and the effects of the head, torso, and pinna. For propagation, attenuation due to distance and molecular air absorption is considered. Reverberation accounts for the interaction of sound with the environment, especially in closed environments. The effects of the head, torso, and pinna on the signals arriving at the listener are also considered. A set of HRTFs is used to simulate the virtual sound-source environment for the 3D auditory system. Special attention has been given to the modelling and interpolation of HRTFs for generating new transfer functions, and to the definition of trajectories and of the closed environment, so that realistic binaural renderings can be achieved. The evaluation is implemented in MATLAB.
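
The propagation part (distance attenuation plus interaural delay) can be sketched crudely as follows. The sample rate, ear distances, and simple 1/r spreading law are assumptions for illustration; HRTF filtering, air absorption, and reverberation are deliberately omitted.

```python
import numpy as np

FS = 44100   # sample rate in Hz (assumed)
C = 343.0    # speed of sound in m/s

def propagate_binaural(sig, dist_left, dist_right):
    """Apply per-ear propagation cues to a mono signal: 1/r distance
    attenuation and an interaural time difference derived from the
    path-length gap between the two ears."""
    out = np.zeros((2, len(sig) + FS // 10))
    for ch, d in enumerate((dist_left, dist_right)):
        delay = int(round(FS * d / C))                       # delay in samples
        out[ch, delay:delay + len(sig)] = sig / max(d, 0.1)  # 1/r attenuation
    return out

tone = np.sin(2 * np.pi * 440 * np.arange(FS // 2) / FS)  # 0.5 s test tone
lr = propagate_binaural(tone, dist_left=1.0, dist_right=1.2)
```

A source nearer the left ear arrives both louder and earlier in the left channel, which is the level/time cue pair the full algorithm then shapes with HRTFs.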

Mobile Camera Processor Design with Multi-lane Serial Interface (멀티레인을 지원하는 모바일 카메라용 직렬 인터페이스 프로세서 설계)

  • Hyun, Eu-Gin;Kwon, Soon;Lee, Jong-Hun;Jung, Woo-Young
    • Journal of the Institute of Electronics Engineers of Korea SD / v.44 no.7 s.361 / pp.62-70 / 2007
  • In this paper, we design a mobile camera processor that supports the MIPI CSI-2 and D-PHY specifications. The lane management sub-layer of CSI-2 handles the multi-lane configuration, so conceptually the transmitter and receiver each have an independent buffer per lane. In the proposed architecture, these independent buffers are merged into a single common buffer. The single-buffer architecture can flexibly manage data on multiple lanes even when the number of supported lanes differs between the camera processor transmitter and the host processor. To address the key issue of data synchronization, synchronization start codes are added to mark the beginning of the image data. We design synchronization logic that synchronizes to the received clock and generates the byte clock. We present verification results obtained with the proposed test bench, and show simulation waveforms and logic synthesis results for the designed processor.
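
The lane management idea (one payload distributed byte-wise over several lanes, then merged back into a single common buffer at the receiver) can be sketched as a conceptual model. This is an illustration of CSI-2-style round-robin lane distribution, not the paper's RTL design.

```python
def distribute(payload: bytes, num_lanes: int):
    """Transmitter side: byte i of the payload goes to lane i % num_lanes,
    preserving order within each lane (round-robin distribution)."""
    lanes = [bytearray() for _ in range(num_lanes)]
    for i, b in enumerate(payload):
        lanes[i % num_lanes].append(b)
    return lanes

def merge(lanes):
    """Receiver side: interleave the lane buffers back into one stream --
    the role played by the single common buffer in the proposed architecture."""
    out = bytearray()
    depth = max(len(lane) for lane in lanes)
    for i in range(depth):
        for lane in lanes:
            if i < len(lane):
                out.append(lane[i])
    return bytes(out)

data = bytes(range(10))
assert merge(distribute(data, 4)) == data  # round-trips for any lane count
```

Because the merge step only depends on the number of lanes actually carrying data, a single-buffer receiver can accept a transmitter configured for fewer lanes than it supports.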

Synthesis and Characterization of Molybdenum (V)-1, 6-Diaminohexane-N, N, N', N'-tetraacetic Acid Derivatives Complexes (몰리브덴 (V) 와 1, 6-Diaminohexane-N, N, N', N'-tetraacetic Acid 계 착물합성과 그 성질)

  • Sang Oh Oh;Sig Young Choi
    • Journal of the Korean Chemical Society / v.33 no.1 / pp.90-96 / 1989
  • A new series of dioxo-di-$\mu$-oxo-dimolybdate(V) complexes (*image) has been prepared by the reaction of pyridinium oxoisothiocyanato-molybdate(V) with 1,6-diaminohexane-N,N,N',N'-tetraacetic acid derivatives containing amine and carboxyl groups. The properties and possible molecular structures of these complexes were discussed on the basis of elemental analysis, spectroscopic studies, and magnetic susceptibility measurements. The infrared spectra of these complexes show two strong Mo=$O_t$ stretching modes in the $900{\sim}965\;cm^{-1}$ region, MoO$_2$Mo bands at around $450{\sim}500$ and $740{\sim}765\;cm^{-1}$ assigned to symmetrical and asymmetrical O-bridge stretching, and a coordinated $COO^-$ asymmetrical band in the $1600{\sim}1635\;cm^{-1}$ region. The synthesized complexes were yellow or orange and diamagnetic.

A 3D Face Reconstruction and Tracking Method using the Estimated Depth Information (얼굴 깊이 추정을 이용한 3차원 얼굴 생성 및 추적 방법)

  • Ju, Myung-Ho;Kang, Hang-Bong
    • The KIPS Transactions:PartB / v.18B no.1 / pp.21-28 / 2011
  • A 3D face shape derived from 2D images may be useful in many applications, such as face recognition, face synthesis, and human-computer interaction. To this end, we develop a fast 3D Active Appearance Model (3D-AAM) method using depth estimation. The training images include specific 3D face poses that are extremely different from one another. The depth information of the landmarks is estimated from the training image sequence using an approximated Jacobian matrix, and it is added at the test phase to deal with the 3D pose variations of the input face. Our experimental results show that the proposed method fits face shapes with facial expression and 3D pose variations more efficiently than the typical AAM, and estimates accurate 3D face shapes from images.

Optimum Subband Quantization Filter Design for Image Compression (영상압축을 위한 최적의 서브밴드 양자화 필터 설계)

  • Park, Kyu-Sik;Park, Jae-Hyun
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.379-386 / 2005
  • This paper provides a rigorous theory for the analysis of quantization effects and optimum filter bank design in quantized multidimensional subband filter banks. Even though subband filter design has been a hot topic for decades, few results have been reported on subband filters with quantizers. Each pdf-optimized quantizer is modeled as a nonlinear gain plus additive uncorrelated noise and embedded into the subband structure. Using a polyphase decomposition of the analysis/synthesis filter banks, we derive an exact expression for the output mean square quantization error. Based on the minimization of the output mean square error, an optimal filter design methodology is developed. Numerical design examples for optimum nonseparable paraunitary and biorthogonal filter banks are presented with a quincunx subsampling lattice. Through simulation, decreases in MSE of 10∼20% have been observed compared with subband filters designed without quantizers, especially at low bit rates.
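
The analysis-quantize-synthesize loop whose output MSE the paper analyzes can be illustrated with a toy 1-D example. The orthonormal Haar two-channel bank and uniform quantizer below are simple stand-ins for the paper's pdf-optimized quantizers and multidimensional (quincunx) filter banks.

```python
import numpy as np

def haar_analysis(x):
    """Orthonormal Haar analysis bank: lowpass and highpass subbands."""
    lo = (x[0::2] + x[1::2]) / np.sqrt(2)
    hi = (x[0::2] - x[1::2]) / np.sqrt(2)
    return lo, hi

def haar_synthesis(lo, hi):
    """Matching synthesis bank; perfect reconstruction without quantizers."""
    y = np.empty(2 * len(lo))
    y[0::2] = (lo + hi) / np.sqrt(2)
    y[1::2] = (lo - hi) / np.sqrt(2)
    return y

def quantize(band, step):
    """Uniform mid-tread quantizer embedded in each subband."""
    return np.round(band / step) * step

rng = np.random.default_rng(0)
x = rng.standard_normal(1024)
lo, hi = haar_analysis(x)
y = haar_synthesis(quantize(lo, 0.5), quantize(hi, 0.5))
mse = np.mean((y - x) ** 2)  # output mean square quantization error
```

For an orthonormal bank the per-band quantization noise passes to the output unamplified (here roughly step squared over 12); the paper's contribution is optimizing the filters so this output error is minimized in the general, non-orthonormal multidimensional case.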

3-D Facial Animation on the PDA via Automatic Facial Expression Recognition (얼굴 표정의 자동 인식을 통한 PDA 상에서의 3차원 얼굴 애니메이션)

  • Lee Don-Soo;Choi Soo-Mi;Kim Hae-Hwang;Kim Yong-Guk
    • The KIPS Transactions:PartB / v.12B no.7 s.103 / pp.795-802 / 2005
  • In this paper, we present a facial expression recognition-synthesis system that automatically recognizes seven basic emotions and renders a face in a non-photorealistic style on a PDA. For the recognition of facial expressions, we first detect the face area within the image acquired from the camera; a normalization procedure is then applied for geometrical and illumination corrections. To classify a facial expression, we found that the best results come from combining Gabor wavelets with the enhanced Fisher model. In our case, the output is a set of seven emotional weightings. This weighting information, transmitted to the PDA via a mobile network, is used for non-photorealistic facial expression animation. To render a 3D avatar with a unique facial character, we adopted a cartoon-like shading method. We found that facial expression animation using emotional curves is more effective at expressing the timing of an expression than the linear interpolation method.
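
The Gabor-wavelet feature stage can be sketched as a small filter bank. The kernel size, scales, and orientations below are illustrative assumptions; the paper's exact bank parameters and the enhanced Fisher model classifier are not reproduced here.

```python
import numpy as np

def gabor_kernel(ksize, sigma, theta, lambd, gamma=0.5, psi=0.0):
    """Real part of a 2-D Gabor filter: a Gaussian envelope modulating a
    sinusoid at orientation theta and wavelength lambd."""
    half = ksize // 2
    y, x = np.mgrid[-half:half + 1, -half:half + 1]
    xr = x * np.cos(theta) + y * np.sin(theta)
    yr = -x * np.sin(theta) + y * np.cos(theta)
    return (np.exp(-(xr**2 + gamma**2 * yr**2) / (2 * sigma**2))
            * np.cos(2 * np.pi * xr / lambd + psi))

# A small bank: 2 scales x 4 orientations, a typical shape for
# Gabor-based expression features (parameters assumed, not from the paper).
bank = [gabor_kernel(21, sigma=s, theta=t, lambd=2 * s)
        for s in (3.0, 5.0)
        for t in np.linspace(0, np.pi, 4, endpoint=False)]
```

Convolving the normalized face image with each kernel and concatenating the responses yields the feature vector that the Fisher-model classifier would then map to the seven emotional weightings.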