Search | Korea Science

Voice Activity Detection using Motion and Variation of Intensity in The Mouth Region (입술 영역의 움직임과 밝기 변화를 이용한 음성구간 검출 알고리즘 개발)

Kim, Gi-Bak;Ryu, Je-Woong;Cho, Nam-Ik
- Journal of Broadcast Engineering / v.17 no.3 / pp.519-528 / 2012
Voice activity detection (VAD) is generally conducted by extracting features from the acoustic signal and applying a decision rule. The performance of such VAD algorithms, which are driven by the input acoustic signal, depends heavily on the acoustic noise. When video signals are also available, VAD performance can be enhanced by using visual information, which is not affected by acoustic noise. Previous visual VAD algorithms usually use a single visual feature to detect lip activity, such as active appearance models, optical flow, or intensity variation. Based on an analysis of the weaknesses of each feature, we propose to combine an intensity change measure with the optical flow in the mouth region, so that the two compensate for each other's weaknesses. To minimize computational complexity, we develop simple measures that avoid statistical estimation or modeling. Specifically, the optical flow is the averaged motion vector of some grid regions, and the intensity variation is detected by simple thresholding. To extract the mouth region, we propose a simple algorithm that first detects the two eyes and then uses the intensity profile to locate the center of the mouth. Experiments show that the proposed combination of two simple measures achieves higher detection rates at a given false positive rate than methods that use a single feature.
https://doi.org/10.5909/JBE.2012.17.3.519
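
The two measures named in this abstract can be illustrated with a minimal sketch. It is not the authors' implementation: the function names, the threshold values, and the OR rule used to fuse the two measures are all assumptions made for illustration, and the grid motion vectors are assumed to come from some external optical-flow step.

```python
from math import hypot


def intensity_change_ratio(prev, curr, pixel_thresh=15):
    """Fraction of mouth-region pixels whose gray level changed by more than pixel_thresh."""
    changed = 0
    total = 0
    for row_prev, row_curr in zip(prev, curr):
        for p, c in zip(row_prev, row_curr):
            total += 1
            if abs(c - p) > pixel_thresh:
                changed += 1
    return changed / total if total else 0.0


def mean_motion_magnitude(grid_vectors):
    """Magnitude of the motion vector averaged over the grid regions."""
    if not grid_vectors:
        return 0.0
    n = len(grid_vectors)
    mean_dx = sum(dx for dx, _ in grid_vectors) / n
    mean_dy = sum(dy for _, dy in grid_vectors) / n
    return hypot(mean_dx, mean_dy)


def lip_active(prev, curr, grid_vectors, ratio_thresh=0.1, motion_thresh=0.5):
    """Assumed fusion rule: the lips count as active if either measure exceeds its threshold."""
    return (intensity_change_ratio(prev, curr) > ratio_thresh
            or mean_motion_magnitude(grid_vectors) > motion_thresh)
```

Here `prev` and `curr` are consecutive grayscale crops of the mouth region given as nested lists, and `grid_vectors` is a list of `(dx, dy)` pairs, one per grid region.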

Autonomous Mobile Robot System Using Adaptive Spatial Coordinates Detection Scheme based on Stereo Camera (스테레오 카메라 기반의 적응적인 공간좌표 검출 기법을 이용한 자율 이동로봇 시스템)

Ko Jung-Hwan;Kim Sung-Il;Kim Eun-Soo
- The Journal of Korean Institute of Communications and Information Sciences / v.31 no.1C / pp.26-35 / 2006
In this paper, an autonomous mobile robot system for intelligent path planning is proposed, which uses a spatial coordinate detection scheme based on a stereo camera. In the proposed system, the face area of a moving person is detected in the left image of each stereo image pair by using the YCbCr color model, and its center coordinates are computed by the centroid method. Using these data, the stereo camera mounted on the mobile robot can be controlled to track the moving target in real time. Moreover, depth information can be obtained from the disparity map computed from the left and right images captured by the tracking-controlled stereo camera system and from the perspective transformation between the 3-D scene and the image plane. Finally, based on an analysis of these calculated coordinates, the mobile robot system performs intelligent path planning and estimation. In experiments on robot driving with 240 frames of stereo images, the error ratios between the calculated and measured values of the distance between the mobile robot and the objects, and of the relative distance between the other objects, were found to be very low: $2.19\%$ and $1.52\%$ on average, respectively.
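
As a rough illustration of two steps mentioned in this abstract, the sketch below computes the centroid of a binary face mask and converts a disparity value to depth with the standard pinhole stereo relation Z = f * B / d for a rectified camera pair. The function names and the example focal length and baseline are assumptions; the paper's actual camera parameters are not given in the abstract.

```python
def centroid(mask):
    """Center (x, y) of the nonzero pixels in a binary mask, or None if the mask is empty."""
    sum_x = sum_y = count = 0
    for y, row in enumerate(mask):
        for x, value in enumerate(row):
            if value:
                sum_x += x
                sum_y += y
                count += 1
    if count == 0:
        return None
    return sum_x / count, sum_y / count


def depth_from_disparity(disparity_px, focal_px, baseline_m):
    """Depth in meters for a rectified stereo pair: Z = f * B / d."""
    if disparity_px <= 0:
        raise ValueError("disparity must be positive")
    return focal_px * baseline_m / disparity_px
```

For example, with an assumed focal length of 800 pixels and a 0.1 m baseline, a disparity of 20 pixels corresponds to a depth of about 4 m.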

Vehicle Detection using Feature Points with Directional Features (방향성 특징을 가지는 특징 점에 의한 차량 검출)

Choi Dong-Hyuk;Kim Byoung-Soo
- Journal of the Institute of Electronics Engineers of Korea SC / v.42 no.2 s.302 / pp.11-18 / 2005
To detect vehicles in an image, the image is first transformed with a steerable pyramid, which has independent directions and levels. Feature vectors are collections of filter responses at different scales of the steerable image pyramid. Vehicles are detected using the feature vectors at feature points of the vehicle image. Three ways of selecting feature points are considered: first, evenly spaced grid points in the vehicle image; second, corner points selected by a human; and last, corner points selected from among the grid points. Next, the feature vectors of the model vehicle image are compared with those of a patch of the test image, and if the distance between the model and the patch is lower than a predefined threshold, the input patch is classified as a vehicle. In the experiment, a total of 11,191 vehicle images were captured during the day (10,576) and at night (624) on two local roads. Detection rates of $92.0\%$ by day and $87.3\%$ at night were achieved.
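
The decision rule in this abstract, comparing model and patch feature vectors against a threshold, might look like the sketch below. The abstract does not name the distance metric or how per-point distances are pooled, so the Euclidean distance and the mean over feature points used here are assumptions, as are the function names.

```python
from math import sqrt


def euclidean_distance(u, v):
    """Euclidean distance between two feature vectors of equal length."""
    if len(u) != len(v):
        raise ValueError("feature vectors must have the same length")
    return sqrt(sum((a - b) ** 2 for a, b in zip(u, v)))


def is_vehicle(model_features, patch_features, threshold):
    """Classify a patch as a vehicle if its mean per-point distance to the model is below threshold.

    model_features and patch_features hold one feature vector per feature point,
    in the same order (an assumed data layout).
    """
    distances = [euclidean_distance(m, p)
                 for m, p in zip(model_features, patch_features)]
    return sum(distances) / len(distances) < threshold
```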

Model-Based Plane Detection in Disparity Space Using Surface Partitioning (표면분할을 이용한 시차공간상에서의 모델 기반 평면검출)

Ha, Hong-joon;Lee, Chang-hun
- KIPS Transactions on Software and Data Engineering / v.4 no.10 / pp.465-472 / 2015
We propose a novel plane detection method in disparity space and evaluate its performance. Our method simplifies scenes in disparity space and makes them easier to handle by approximating various surfaces as planes. Moreover, the approximated planes can be represented at the same size as in the real world, and can be employed for obstacle detection and camera pose estimation. Using a stereo matching technique, our method first creates a disparity image, which consists of binocular disparity values at xy-coordinates in the image. Slants of disparity values are estimated with a line simplification algorithm, which allows our method to reflect global changes along the x or y axis. We label the disparity image according to pairs of x and y slants. 4-connected disparities with the same label are grouped, and a least-squares model estimates the plane parameters of each group. The N plane models with the largest groups of disparity values that satisfy their plane parameters are chosen. We evaluate our plane detection quantitatively and qualitatively. The results show quality of 97.9% and 86.6% on cones and cylinders, respectively. The proposed method extracts planes well from the Middlebury and KITTI datasets, which are typically used to evaluate stereo matching algorithms.
https://doi.org/10.3745/KTSDE.2015.4.10.465
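
The least-squares step in this abstract can be sketched as fitting a plane d = a*x + b*y + c to one group of disparity samples. This is only an illustration under the assumption that the plane is parameterized this way; the function names are hypothetical, and the 3x3 normal equations are solved with Cramer's rule to keep the example dependency-free.

```python
def det3(m):
    """Determinant of a 3x3 matrix given as nested lists."""
    return (m[0][0] * (m[1][1] * m[2][2] - m[1][2] * m[2][1])
            - m[0][1] * (m[1][0] * m[2][2] - m[1][2] * m[2][0])
            + m[0][2] * (m[1][0] * m[2][1] - m[1][1] * m[2][0]))


def fit_disparity_plane(points):
    """Least-squares fit of d = a*x + b*y + c to (x, y, d) samples; returns (a, b, c)."""
    sxx = sxy = sx = syy = sy = sxd = syd = sd = 0.0
    n = 0
    for x, y, d in points:
        sxx += x * x
        sxy += x * y
        sx += x
        syy += y * y
        sy += y
        sxd += x * d
        syd += y * d
        sd += d
        n += 1
    # Normal equations M * [a, b, c]^T = v
    m = [[sxx, sxy, sx], [sxy, syy, sy], [sx, sy, float(n)]]
    v = [sxd, syd, sd]
    det = det3(m)
    if abs(det) < 1e-12:
        raise ValueError("points are degenerate (collinear or too few)")
    solution = []
    for col in range(3):
        replaced = [row[:] for row in m]
        for r in range(3):
            replaced[r][col] = v[r]  # Cramer's rule: swap column `col` for v
        solution.append(det3(replaced) / det)
    return tuple(solution)
```

Calling `fit_disparity_plane` on samples taken from an exact plane recovers its coefficients up to floating-point error; noisy groups yield the best fit in the squared-error sense.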

Real-time Hand Region Detection based on Cascade using Depth Information (깊이정보를 이용한 케스케이드 방식의 실시간 손 영역 검출)

Joo, Sung Il;Weon, Sun Hee;Choi, Hyung Il
- KIPS Transactions on Software and Data Engineering / v.2 no.10 / pp.713-722 / 2013
This paper proposes a cascade-based method that uses depth information to detect the hand region in real time. To ensure stable and fast detection of the hand region even when lighting changes in the test environment, this study uses only features based on depth information, and proposes a method of detecting the hand region by means of a classifier that uses boosting and cascading. First, to extract features using only depth information, we calculate the difference between the depth value at the center of the input image and the average depth value within each segmented block. To ensure that hand regions of all sizes are detected, we use the central depth value and a second-order linear model to predict the size of the hand region. The cascade method is applied to training and recognition using features extracted from the hand region. The classifier proposed in this paper maintains accuracy and improves speed by composing each stage of a single weak classifier and, to perform over-fitting training, choosing the threshold value that satisfies the detection rate while exhibiting the lowest error rate. The trained classifier is used to classify hand regions, and the final hand region is determined in a final merging stage. Lastly, to verify performance, we carry out quantitative and qualitative comparisons with various conventional AdaBoost algorithms to confirm the efficiency of the proposed hand region detection algorithm.
https://doi.org/10.3745/KTSDE.2013.2.10.713
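
A minimal sketch of the depth-only feature and the size prediction described in this abstract follows. The block layout, the function names, and the quadratic coefficients `a`, `b`, `c` are hypothetical; the abstract states only that the feature is a center-minus-block-average depth difference and that the hand size is predicted from the central depth with a second-order model.

```python
def block_mean(depth, top, left, height, width):
    """Average depth inside a rectangular block of the depth image."""
    values = [depth[y][x]
              for y in range(top, top + height)
              for x in range(left, left + width)]
    return sum(values) / len(values)


def depth_difference_feature(depth, top, left, height, width):
    """Depth at the image center minus the average depth of the given block."""
    center = depth[len(depth) // 2][len(depth[0]) // 2]
    return center - block_mean(depth, top, left, height, width)


def predicted_hand_size(center_depth, a, b, c):
    """Second-order model of hand size versus depth; a, b, c would be fitted to data."""
    return a * center_depth ** 2 + b * center_depth + c
```

A negative feature value means the center pixel is closer to the sensor than the block, which is what one would expect when a hand sits in front of the background.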

Salient Region Detection Algorithm for Music Video Browsing (뮤직비디오 브라우징을 위한 중요 구간 검출 알고리즘)

Kim, Hyoung-Gook;Shin, Dong
- The Journal of the Acoustical Society of Korea / v.28 no.2 / pp.112-118 / 2009
This paper proposes a rapid salient-region detection algorithm for a music video browsing system, which can be applied to mobile devices and digital video recorders (DVRs). The input music video is decomposed into music and video tracks. For the music track, the music highlight, including the chorus, is detected by structure analysis using energy-based peak position detection. Using emotional models generated by an SVM-AdaBoost learning algorithm, the music signal of each music video is automatically classified into one of several predefined emotional classes. For the video track, face scenes showing the singer or actor/actress are detected with a boosted cascade of simple features. Finally, the salient region is generated by aligning the boundaries of the music highlight and the visual face scene. Users first select their favorite music videos on the mobile device or DVR, guided by each video's emotion information, and can then quickly browse the 30-second salient region produced by the proposed algorithm. A mean opinion score (MOS) test with a database of 200 music videos is conducted to compare the detected salient region with a predefined manual part. The MOS test results show that the salient region detected by the proposed method performed much better than the predefined manual part chosen without audiovisual processing.
https://doi.org/10.7776/ASK.2009.28.2.112

Statistical Model for Emotional Video Shot Characterization (비디오 셧의 감정 관련 특징에 대한 통계적 모델링)

박현재;강행봉
- The Journal of Korean Institute of Communications and Information Sciences / v.28 no.12C / pp.1200-1208 / 2003
Affective computing plays an important role in intelligent human-computer interaction (HCI). To detect emotional events, it is desirable to construct a computational model for extracting emotion-related features from video. In this paper, we propose a statistical model based on the probability distribution of low-level features in video shots. The proposed method extracts low-level features from video shots and then forms a Gaussian mixture model (GMM) of them to detect emotional shots. As low-level features, we use color, camera motion, and the sequence of shot lengths. The features are modeled as a GMM by using the expectation-maximization (EM) algorithm, and the relations between time and emotions are estimated by maximum likelihood estimation (MLE). Finally, the two statistical models are combined in a Bayesian framework to detect emotional events in video.
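
To make the GMM-by-EM step in this abstract concrete, here is a small one-dimensional sketch. It is a generic textbook EM loop, not the paper's model: the scalar feature, the evenly spread initial means, the unit initial variances, and the variance floor are all assumptions chosen to keep the example short.

```python
from math import exp, pi, sqrt


def gaussian_pdf(x, mean, var):
    """Density of a 1-D Gaussian with the given mean and variance."""
    return exp(-(x - mean) ** 2 / (2.0 * var)) / sqrt(2.0 * pi * var)


def fit_gmm_1d(data, k=2, iterations=50, var_floor=1e-3):
    """Fit a k-component 1-D GMM with EM; returns (weights, means, variances)."""
    lo, hi = min(data), max(data)
    if k > 1:
        # Assumed initialisation: spread the means evenly over the data range.
        means = [lo + (hi - lo) * i / (k - 1) for i in range(k)]
    else:
        means = [sum(data) / len(data)]
    variances = [1.0] * k
    weights = [1.0 / k] * k
    for _ in range(iterations):
        # E-step: responsibility of each component for each sample.
        resp = []
        for x in data:
            likes = [w * gaussian_pdf(x, m, v)
                     for w, m, v in zip(weights, means, variances)]
            total = sum(likes)
            if total > 0:
                resp.append([l / total for l in likes])
            else:
                resp.append([1.0 / k] * k)  # all densities underflowed
        # M-step: re-estimate weights, means, and variances.
        for j in range(k):
            nj = sum(r[j] for r in resp)
            if nj == 0:
                continue
            weights[j] = nj / len(data)
            means[j] = sum(r[j] * x for r, x in zip(resp, data)) / nj
            var = sum(r[j] * (x - means[j]) ** 2
                      for r, x in zip(resp, data)) / nj
            variances[j] = max(var, var_floor)  # avoid collapsing components
    return weights, means, variances
```

In a shot-classification setting, one such mixture could be fitted per emotion class and a new shot assigned to the class whose mixture gives it the highest likelihood; that usage is a common pattern rather than something the abstract specifies.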

Design of A Speech Recognition System using Hidden Markov Models (은닉 마코프 모델을 이용한 음성 인식 시스템 설계)

Lee, Chul-Won;Lim, In-Chil
- Journal of the Korean Institute of Telematics and Electronics B / v.33B no.1 / pp.108-115 / 1996
This paper proposes an algorithm and a model topology for connected speech recognition using discrete hidden Markov models. The proposed model uses diphone and triphone models, chosen with regard to recognition rate and recognizable vocabulary. For more exact inter-phoneme segmentation and faster execution, the diphone model requires 4 states, in which the first and last states represent steady states and the other two represent transient states. The triphone model requires 7 states, specified as 3 steady states and 4 transition states. In addition, the proposed speech recognition algorithm is designed to detect the inter-phoneme segmentation during recognition.

Subdivision Ensemble Model for Highlight Detection (하이라이트 검출을 위한 구간 분할 앙상블 모델)

Lee, Hansol;Lee, Gyemin
- Journal of Broadcast Engineering / v.25 no.4 / pp.620-628 / 2020
Automatically predicting video highlights is an important task for the media industry and streaming platform providers, as it saves the time and cost of manual video editing. We propose a new ensemble model that combines multiple highlight detectors, each focusing on a different part of a highlight event. As a result, our model can capture more information-rich sections of events. Furthermore, the proposed model can extract improved features for highlight detection, particularly when the training video set is small. We evaluate our model on e-sports and baseball videos.
https://doi.org/10.5909/JBE.2020.25.4.620

Band Fault Modelling Based on specification for the Time Domain Test of RFIC (RF 집적회로의 시간영역 테스팅을 위한 사양기반 구간고장모델링)

Kim, Kang-Chul;Han, Seok-Bung
- Journal of the Korea Institute of Information and Communication Engineering / v.12 no.2 / pp.299-308 / 2008
This paper proposes a new band fault modelling technique, based on design specifications, that can test those specifications in the time domain. The band fault model is defined, and its conditions are derived by defining the normal operation regions. These conditions are then applied to a 5.25 GHz low noise amplifier, yielding 9 band fault models that can detect hard and parametric faults of active and passive devices.
https://doi.org/10.6109/jkiice.2008.12.2.299

