Search | Korea Science

Visual Voice Activity Detection and Adaptive Threshold Estimation for Speech Recognition (음성인식기 성능 향상을 위한 영상기반 음성구간 검출 및 적응적 문턱값 추정)

Song, Taeyup;Lee, Kyungsun;Kim, Sung Soo;Lee, Jae-Won;Ko, Hanseok
- The Journal of the Acoustical Society of Korea / v.34 no.4 / pp.321-327 / 2015
In this paper, we propose an algorithm for robust Visual Voice Activity Detection (VVAD) to enhance speech recognition. Conventional VVAD algorithms detect visual speech frames by measuring the motion of the lip region with optical flow or chaos-inspired measures. Optical flow-based VVAD is difficult to adopt in driving scenarios because of its computational complexity. Chaos theory-based VVAD, while invariant to illumination changes, is sensitive to translations caused by the driver's head movements. The proposed Local Variance Histogram (LVH) is robust to pixel intensity changes arising from both illumination change and translation. In addition, to improve performance under environmental changes, we adopt a novel threshold estimation method based on the total variance change. Experimental results show that the proposed VVAD algorithm is robust in various driving situations.
https://doi.org/10.7776/ASK.2015.34.4.321
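
The abstract above names a Local Variance Histogram but does not define it. The following is only a minimal sketch of one plausible reading: split the lip region into small blocks, compute the intensity variance of each block, and histogram those variances. The function names, block size, bin count, and the L1 change measure are all assumptions for illustration, not the authors' method.

```python
def local_variance_histogram(gray, block=2, bins=4, max_var=16256.25):
    """Histogram of per-block intensity variances (illustrative, not the paper's definition).

    gray: list of rows of 8-bit intensities (hypothetical input format).
    max_var: largest possible variance of 8-bit values, (255 / 2) ** 2.
    """
    hist = [0] * bins
    for r in range(0, len(gray) - block + 1, block):
        for c in range(0, len(gray[0]) - block + 1, block):
            pixels = [gray[r + i][c + j] for i in range(block) for j in range(block)]
            mean = sum(pixels) / len(pixels)
            var = sum((p - mean) ** 2 for p in pixels) / len(pixels)
            # Map the variance to a bin; clamp the top edge into the last bin.
            hist[min(int(var / max_var * bins), bins - 1)] += 1
    return hist


def histogram_change(h1, h2):
    """L1 distance between two histograms (assumed frame-to-frame motion cue)."""
    return sum(abs(a - b) for a, b in zip(h1, h2))


frame = [
    [0, 200, 10, 10],
    [200, 0, 10, 10],
    [20, 20, 0, 0],
    [20, 20, 100, 100],
]
brighter = [[p + 50 for p in row] for row in frame]  # uniform illumination offset
```

Because variance is unchanged by adding a constant to every pixel, `frame` and `brighter` produce the same histogram in this sketch. The histogram also discards where each block sits in the region, which is one way such a descriptor could tolerate small translations.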

3D Facial Animation with Head Motion Estimation and Facial Expression Cloning (얼굴 모션 추정과 표정 복제에 의한 3차원 얼굴 애니메이션)

Kwon, Oh-Ryun;Chun, Jun-Chul
- The KIPS Transactions:PartB / v.14B no.4 / pp.311-320 / 2007
This paper presents a vision-based 3D facial expression animation technique and system that provide robust 3D head pose estimation and real-time facial expression control. Much research on 3D face animation has focused on facial expression control itself rather than on 3D head motion tracking. However, head motion tracking is one of the critical issues to be solved in developing realistic facial animation. In this research, we developed an integrated animation system that performs 3D head motion tracking and facial expression control at the same time. The proposed system consists of three major phases: face detection, 3D head motion tracking, and facial expression control. For face detection, we use the non-parametric HT skin color model and template matching to detect the facial region efficiently in each video frame. For 3D head motion tracking, we exploit a cylindrical head model that is projected onto the initial head motion template. Given an initial reference template of the face image and the corresponding head motion, the cylindrical head model is created and the full head motion is traced using the optical flow method. For facial expression cloning, we use a feature-based method. The major facial feature points are detected from the geometric information of the face with template matching and traced by optical flow. Since the locations of the varying feature points combine head motion and facial expression information, the animation parameters that describe the variation of the facial features are acquired from the geometrically transformed frontal head pose image. Finally, facial expression cloning is done in two fitting processes: the control points of the 3D model are moved by applying the animation parameters to the face model, and the non-feature points around the control points are changed using a Radial Basis Function (RBF). The experiments show that the developed vision-based animation system can create realistic facial animation with robust head pose estimation and facial variation from input video.
https://doi.org/10.3745/KIPSTB.2007.14-B.4.311
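
The abstract above says that non-feature points around the control points are moved with a Radial Basis Function, without giving the kernel or solver. As a rough sketch under those gaps, the code below fits Gaussian RBF weights to known control-point displacements and evaluates the interpolant at any other vertex. The Gaussian kernel, `sigma`, and all function names are assumptions chosen for illustration.

```python
import math


def solve(a, b):
    """Solve a x = b by Gaussian elimination with partial pivoting."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for r in range(n - 1, -1, -1):
        s = sum(m[r][c] * x[c] for c in range(r + 1, n))
        x[r] = (m[r][n] - s) / m[r][r]
    return x


def gaussian(r, sigma=1.0):
    """Gaussian radial kernel (an assumed choice of basis function)."""
    return math.exp(-(r * r) / (2.0 * sigma * sigma))


def fit_rbf(points, values, sigma=1.0):
    """Weights w such that sum_j w_j * phi(|p_i - p_j|) equals values[i]."""
    a = [[gaussian(math.dist(p, q), sigma) for q in points] for p in points]
    return solve(a, values)


def eval_rbf(points, weights, x, sigma=1.0):
    """Interpolated displacement at an arbitrary vertex x."""
    return sum(w * gaussian(math.dist(x, p), sigma) for p, w in zip(points, weights))


# Hypothetical control points on a face mesh and their displacement along one axis.
control_points = [(0.0, 0.0, 0.0), (1.0, 0.0, 0.0), (0.0, 1.0, 0.0)]
displacements = [0.5, 0.0, -0.2]
weights = fit_rbf(control_points, displacements)
```

In use, one set of weights would be fitted per coordinate axis, and `eval_rbf` would be called for each non-feature vertex. The interpolant reproduces the given displacement exactly at every control point and falls off smoothly with distance from them.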

Optical Flow Based Vehicle Counting and Speed Estimation in CCTV Videos (Optical Flow 기반 CCTV 영상에서의 차량 통행량 및 통행 속도 추정에 관한 연구)

Kim, Jihae;Shin, Dokyung;Kim, Jaekyung;Kwon, Cheolhee;Byun, Hyeran
- Journal of Broadcast Engineering / v.22 no.4 / pp.448-461 / 2017
This paper proposes a vehicle counting and speed estimation method for traffic situation analysis in road CCTV videos. The proposed method removes perspective distortion from the images using Inverse Perspective Mapping and uses a lane detection algorithm to obtain the specific region for vehicle counting and speed estimation. Vehicle counts and speed estimates are then obtained by applying optical flow within that region. The proposed method achieves a stable accuracy of 88.94% on CCTV images from several regional groups; in total, it was applied to 106,993 frames, about 3 hours of video.
https://doi.org/10.5909/JBE.2017.22.4.448
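
The abstract above does not say how flow vectors are converted to speed. One simple possibility, assuming the inverse perspective mapping yields a top-down view with a uniform ground scale, is to multiply the mean per-frame displacement by a metres-per-pixel factor and the frame rate. The function names and the parameters `metres_per_pixel` and `fps` are hypothetical.

```python
import math


def mean_flow_magnitude(flow_vectors):
    """Average displacement in pixels per frame over (dx, dy) flow vectors."""
    return sum(math.hypot(dx, dy) for dx, dy in flow_vectors) / len(flow_vectors)


def speed_kmh(flow_vectors, metres_per_pixel, fps):
    """Convert per-frame pixel displacement to km/h (assumes a uniform ground scale)."""
    metres_per_second = mean_flow_magnitude(flow_vectors) * metres_per_pixel * fps
    return metres_per_second * 3.6


# Two flow vectors of 2 px/frame, 0.25 m per pixel, 30 frames per second.
estimate = speed_kmh([(2.0, 0.0), (0.0, 2.0)], metres_per_pixel=0.25, fps=30)
```

With these example numbers the displacement is 0.5 m per frame, or 15 m/s, which is about 54 km/h.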

Motion Map Generation for Maintaining the Temporal Coherence of Brush Strokes in the Painterly Animation (회화적 애니메이션에서 브러시 스트로크의 시간적 일관성을 유지하기 위한 모션 맵 생성)

Park Youngs-Up;Yoon Kyung-Hyun
- Journal of KIISE:Computer Systems and Theory / v.33 no.8 / pp.536-546 / 2006
Painterly animation is a method that renders painterly images with a hand-painted appearance from a video, and its most crucial element is the temporal coherence of brush strokes between frames. This paper proposes a motion map as a solution to the problem of maintaining the temporal coherence of brush strokes between frames. A motion map is the region in which frame-to-frame motion has occurred. Namely, it is the region across which edges move between frames according to the motion information, starting from the edges where motion occurred. We employ an optical flow method and a block-based method to estimate the motion information. Among the motion information (directions and magnitudes) acquired by the various motion estimation methods, that which yields the highest PSNR is chosen as the final motion information to form the motion map. The resulting motion map determines the part of the frame that should be repainted. To express painterly images with a hand-painted appearance and maintain the temporal coherence of brush strokes, the motion information is applied only to the strong edges that determine the directions of the brush strokes. This paper also seeks to reduce flickering between frames by using the multiple exposure method and a difference map created from the difference between the source and canvas images. Coherence in the direction of the brush strokes is also maintained by local gradient interpolation, which preserves structural coherence.
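
The PSNR-based selection described above can be sketched as follows, assuming each motion estimation method yields a motion-compensated prediction of the next frame that is compared against the actual frame. The function names, the dictionary layout, and the candidate labels are hypothetical. PSNR is computed as $10 \log_{10}(\text{peak}^2 / \text{MSE})$.

```python
import math


def psnr(reference, predicted, peak=255.0):
    """Peak signal-to-noise ratio in dB between two equally sized grayscale frames."""
    n = sum(len(row) for row in reference)
    mse = sum(
        (a - b) ** 2
        for row_ref, row_pred in zip(reference, predicted)
        for a, b in zip(row_ref, row_pred)
    ) / n
    return math.inf if mse == 0 else 10.0 * math.log10(peak * peak / mse)


def select_motion_estimate(reference, candidates):
    """Return the name of the candidate prediction with the highest PSNR."""
    return max(candidates, key=lambda name: psnr(reference, candidates[name]))


actual_frame = [[10, 20], [30, 40]]
candidates = {
    "optical_flow": [[10, 20], [30, 44]],  # MSE = 4
    "block_based": [[0, 20], [30, 40]],    # MSE = 25
}
best = select_motion_estimate(actual_frame, candidates)
```

Here the optical flow prediction has the lower mean squared error and therefore the higher PSNR, so it is the one selected.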
