Search | Korea Science

A Weighted Feature Voting Approach for Robust and Real-Time Voice Activity Detection

Moattar, Mohammad Hossein;Homayounpour, Mohammad Mehdi
- ETRI Journal / v.33 no.1 / pp.99-109 / 2011
This paper presents a robust, real-time voice activity detection (VAD) approach that is easy to understand and implement. The approach combines several short-term speech/nonspeech discriminating features in a voting paradigm to achieve reliable performance in different environments. It focuses on improving a recently proposed approach that uses spectral peak valley difference (SPVD) as a feature for silence detection, applying a set of additional features alongside SPVD to improve VAD robustness. A weighted voting scheme takes the discriminative power of each employed feature into account. Experiments show that the proposed approach is more robust than the baseline in several respects, including channel distortion and threshold selection. The approach is also compared with other VAD techniques to confirm its results. With weighted voting, the average VAD performance rises to 89.29% across 5 noise types and 8 SNR levels, which is 13.79% higher than the approach based only on SPVD and 2.25% higher than the unweighted voting scheme.
https://doi.org/10.4218/etrij.11.1510.0158
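
The weighted voting idea in the abstract above can be illustrated with a minimal sketch. It is not the authors' implementation: the abstract names only SPVD, so the three features, their thresholds, the weights, and the `decision_fraction` parameter below are all hypothetical values chosen for illustration.

```python
from typing import Sequence


def weighted_vote(feature_values: Sequence[float],
                  thresholds: Sequence[float],
                  weights: Sequence[float],
                  decision_fraction: float = 0.5) -> bool:
    """Return True (speech) when the weighted share of 'speech' votes
    exceeds decision_fraction. All parameters are illustrative assumptions."""
    if not (len(feature_values) == len(thresholds) == len(weights)):
        raise ValueError("feature_values, thresholds and weights must have equal length")
    total = sum(weights)
    if total <= 0:
        raise ValueError("weights must sum to a positive value")
    # Each feature casts a binary vote: it votes 'speech' if it exceeds its threshold.
    score = sum(w for v, t, w in zip(feature_values, thresholds, weights) if v > t)
    return score / total > decision_fraction


# Hypothetical frame with three features (made-up numbers, not from the paper).
frame = [0.8, 0.2, 0.6]
thresholds = [0.5, 0.5, 0.5]
unweighted = weighted_vote(frame, thresholds, [1.0, 1.0, 1.0])  # two of three vote speech
weighted = weighted_vote(frame, thresholds, [1.0, 3.0, 1.0])    # the heavy feature votes nonspeech
```

With equal weights the frame is labeled speech, while giving the second feature a larger weight flips the decision. That is the sense in which a weight can encode how much a feature's vote is trusted.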

Robust Features and Accurate Inliers Detection Framework: Application to Stereo Ego-motion Estimation

MIN, Haigen;ZHAO, Xiangmo;XU, Zhigang;ZHANG, Licheng
- KSII Transactions on Internet and Information Systems (TIIS) / v.11 no.1 / pp.302-320 / 2017
In this paper, a robust feature detection and matching strategy for visual odometry based on stereo image sequences is proposed. First, AKAZE, a sparse multiscale 2D local invariant feature detection and description algorithm, is adopted to extract the interest points. A robust feature matching strategy is introduced to match AKAZE descriptors. To remove outliers, which are either mismatched features or features on dynamic objects, an improved random sample consensus outlier rejection scheme is presented, so the proposed method can be applied to dynamic environments. Then, geometric constraints are incorporated into the motion estimation without time-consuming 3-dimensional scene reconstruction. Last, an iterated sigma point Kalman filter is adopted to refine the motion results. The presented ego-motion scheme is applied to benchmark datasets and compared with state-of-the-art approaches on data captured on campus in a considerably cluttered environment, where it is shown to perform better.
https://doi.org/10.3837/tiis.2017.01.016
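
For readers unfamiliar with random sample consensus, the following is a minimal sketch of the plain (not the paper's improved) scheme. As a simplifying assumption it fits a 2D translation between matched points rather than a stereo ego-motion model, and the function name, threshold, and sample data are hypothetical.

```python
import random
from typing import List, Tuple

Point = Tuple[float, float]
Match = Tuple[Point, Point]


def ransac_translation(matches: List[Match], threshold: float = 1.0,
                       iterations: int = 100, seed: int = 0
                       ) -> Tuple[Point, List[int]]:
    """Estimate a 2D translation from matches and return it with the inlier indices."""
    if not matches:
        raise ValueError("matches must not be empty")
    rng = random.Random(seed)
    best_inliers: List[int] = []
    for _ in range(iterations):
        # Minimal sample: one match fully determines a translation hypothesis.
        (x0, y0), (x1, y1) = rng.choice(matches)
        dx, dy = x1 - x0, y1 - y0
        # Collect matches whose displacement agrees with the hypothesis.
        inliers = [i for i, ((ax, ay), (bx, by)) in enumerate(matches)
                   if ((bx - ax - dx) ** 2 + (by - ay - dy) ** 2) ** 0.5 <= threshold]
        if len(inliers) > len(best_inliers):
            best_inliers = inliers
    # Refit the translation as the mean displacement over the best inlier set.
    n = len(best_inliers)
    dx = sum(matches[i][1][0] - matches[i][0][0] for i in best_inliers) / n
    dy = sum(matches[i][1][1] - matches[i][0][1] for i in best_inliers) / n
    return (dx, dy), best_inliers


# Eight consistent matches shifted by (2, -1), followed by two made-up mismatches.
matches: List[Match] = [((float(i), 2.0 * i), (i + 2.0, 2.0 * i - 1.0)) for i in range(8)]
matches += [((0.0, 0.0), (10.0, 7.0)), ((1.0, 1.0), (-4.0, 3.0))]
translation, inliers = ransac_translation(matches)
```

The two mismatched pairs are excluded from the inlier set, so they do not bias the refitted translation.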

Robust Terrain Classification Against Environmental Variation for Autonomous Off-road Navigation (야지 자율주행을 위한 환경에 강인한 지형분류 기법)

Sung, Gi-Yeul;Lyou, Joon
- Journal of the Korea Institute of Military Science and Technology / v.13 no.5 / pp.894-902 / 2010
This paper presents a vision-based off-road terrain classification method that is robust against environmental variation. As a supervised classification algorithm, we applied a neural network classifier using features extracted from the wavelet transform of an image. To overcome the effect of overall image feature variation, we adopted environment sensors and gathered a database of training parameters according to environmental conditions. The robust terrain classification algorithm was implemented by choosing an optimal parameter using environmental information. The proposed algorithm was embedded on a processor board under the VxWorks real-time operating system. The processor board contains four 1GHz 7448 PowerPC CPUs. To implement a software architecture on which distributed parallel processing is possible, we measured and analyzed the data delivery time between the CPUs. The performance of the algorithm was verified by comparing classification results for the applied classifiers and features, using real off-road images acquired under various environmental conditions. Experiments show that the classification results are robust across the tested environmental conditions.

Robust Visual Tracking for 3-D Moving Object using Kalman Filter (칼만필터를 이용한 3-D 이동물체의 강건한 시각추적)

조지승;정병묵
- Proceedings of the Korean Society of Precision Engineering Conference / 2003.06a / pp.1055-1058 / 2003
The robustness and reliability of vision algorithms is a key issue in robotics research and industrial applications. In this paper, robust real-time visual tracking in complex scenes is considered. A common approach to increasing the robustness of a tracking system is to use different models (CAD models, etc.) known a priori. Fusion of multiple features also facilitates robust detection and tracking of objects in scenes of realistic complexity. Voting-based fusion of cues is adopted. In voting, a very simple model or no model is used for fusion. The algorithm is tested on a 3D Cartesian robot that tracks a toy vehicle moving along a 3D rail, and the Kalman filter is used to estimate the motion parameters, namely the system state vector of a moving object with unknown dynamics. Experimental results show that fusion of cues and motion estimation gives a tracking system robust performance.
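
As a minimal sketch of how a Kalman filter estimates motion parameters from position observations, the example below tracks one axis with a constant-velocity model. The abstract does not give the state vector or noise settings, so the model, the class name, and the values of `q` and `r` are assumptions made for illustration only.

```python
class ConstantVelocityKalman1D:
    """Kalman filter for one axis with state [position, velocity] (assumed model)."""

    def __init__(self, q: float = 1e-4, r: float = 0.25, p0: float = 10.0) -> None:
        self.pos = 0.0
        self.vel = 0.0
        # State covariance P as a 2x2 matrix, starting with large uncertainty.
        self.p = [[p0, 0.0], [0.0, p0]]
        self.q = q  # process noise added to the diagonal (assumption)
        self.r = r  # measurement noise variance (assumption)

    def predict(self, dt: float = 1.0) -> None:
        """Propagate state and covariance with F = [[1, dt], [0, 1]]."""
        (p00, p01), (p10, p11) = self.p
        self.pos += self.vel * dt
        self.p = [[p00 + dt * (p10 + p01) + dt * dt * p11 + self.q, p01 + dt * p11],
                  [p10 + dt * p11, p11 + self.q]]

    def update(self, z: float) -> None:
        """Correct the state with a position measurement z (H = [1, 0])."""
        (p00, p01), (p10, p11) = self.p
        s = p00 + self.r             # innovation variance
        k0, k1 = p00 / s, p10 / s    # Kalman gain
        residual = z - self.pos
        self.pos += k0 * residual
        self.vel += k1 * residual
        # P = (I - K H) P
        self.p = [[(1 - k0) * p00, (1 - k0) * p01],
                  [p10 - k1 * p00, p11 - k1 * p01]]


# Track a target moving at 2 units per step, observed without noise for simplicity.
kf = ConstantVelocityKalman1D()
for t in range(1, 31):
    kf.predict(dt=1.0)
    kf.update(2.0 * t)
```

The filter is never told the velocity, yet it recovers an estimate close to 2 from position measurements alone. A 3-D tracker could run one such filter per axis or use a single joint state vector.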

CAD-Based 3-D Object Recognition Using the Robust Stereo Vision and Hough Transform (강건 스테레오 비전과 허프 변환을 이용한 캐드 기반 삼차원 물체인식)

송인호;정성종
- Proceedings of the Korean Society of Precision Engineering Conference / 1997.10a / pp.500-503 / 1997
In this paper, a method for recognizing 3-D objects using the 3-D Hough transform and robust stereo vision is studied. A 3-D object is recognized in two steps: a modeling step and a matching step. In the modeling step, features of the object are extracted by analyzing the IGES file. In the matching step, the values of the sensed image are compared with those of the IGES file for an assumed location and orientation in the 3-D Hough transform domain. Since we use the 3-D Hough transform domain of the input image directly, the sensitivity to noise and the high computational complexity can be significantly alleviated. Cost efficiency is also improved by using robust stereo vision to obtain the depth map image needed for the 3-D Hough transform. To verify the proposed method, a real telephone model is recognized. The results for the location and orientation of the model are presented.

A Robust Behavior Planning technique for Mobile Robots (이동 로봇의 강인 행동 계획 방법)

Lee, Sang-Hyoung;Lee, Sang-Hoon;Suh, Il-Hong
- The Journal of Korea Robotics Society / v.1 no.2 / pp.107-116 / 2006
We propose a planning algorithm that automatically generates a robust behavior plan (RBP) with which mobile robots can achieve their task goal from any initial state in dynamically changing environments. For this, a task description space (TDS) is formulated, using a redundant task configuration space and a simulation model of the physical space. Successful task episodes are collected using the $A^*$ algorithm. Interesting TDS state vectors are extracted based on occurrence frequency. Clusters of TDS state vectors are found by using state transition tuples and features of state transition tuples. Through these operations, the characteristics of tasks successfully performed by a simulator are abstracted and generalized. Then, a robust behavior plan is constructed as an ordered tree structure, where each node of the tree is represented by the attentive TDS state vector of a cluster. The validity of our method is tested in real-robot experiments on a box-pushing-into-a-goal task.

Noise Robust Automatic Speech Recognition Scheme with Histogram of Oriented Gradient Features

Park, Taejin;Beack, SeungKwan;Lee, Taejin
- IEIE Transactions on Smart Processing and Computing / v.3 no.5 / pp.259-266 / 2014
In this paper, we propose a novel technique for noise robust automatic speech recognition (ASR). The development of ASR techniques has made it possible to recognize isolated words with a near perfect word recognition rate. However, in a highly noisy environment, a distinct mismatch between the trained speech and the test data results in a significantly degraded word recognition rate (WRA). Unlike conventional ASR systems employing Mel-frequency cepstral coefficients (MFCCs) and a hidden Markov model (HMM), this study employs histogram of oriented gradient (HOG) features and a support vector machine (SVM) for ASR tasks to overcome this problem. Our proposed ASR system is less vulnerable to external interference noise and achieves a higher WRA than a conventional ASR system equipped with MFCCs and an HMM. The performance of our proposed ASR system was evaluated using a phonetically balanced word (PBW) set mixed with artificially added noise.
https://doi.org/10.5573/IEIESPC.2014.3.5.259

Robust Multithreaded Object Tracker through Occlusions for Spatial Augmented Reality

Lee, Ahyun;Jang, Insung
- ETRI Journal / v.40 no.2 / pp.246-256 / 2018
A spatial augmented reality (SAR) system enables a virtual image to be projected onto the surface of a real-world object and the user to intuitively control the image using a tangible interface. However, occlusions frequently occur, for example through a sudden change in the lighting environment or the appearance of obstacles. We propose a robust object tracker based on a multithreaded system, which can track an object robustly through occlusions. Our multithreaded tracker is divided into two threads: the detection thread detects distinctive features in a frame-to-frame manner, and the tracking thread tracks features periodically using an optical-flow-based tracking method. Consequently, although the detection thread is considerably slow, we achieve real-time performance owing to the multithreaded configuration. Moreover, the proposed outlier filtering automatically updates the random sample consensus distance threshold used to eliminate outliers according to environmental changes. Experimental results show that our approach tracks an object robustly in real time in an SAR environment where occlusions frequently arise from augmented projection images.
https://doi.org/10.4218/etrij.2017-0047

A Comparison of Front-Ends for Robust Speech Recognition

Kim, Doh-Suk;Jeong, Jae-Hoon;Lee, Soo-Young;Kil, Rhee M.
- The Journal of the Acoustical Society of Korea / v.17 no.3E / pp.3-11 / 1998
The zero-crossings with peak amplitudes (ZCPA) model, motivated by the human auditory periphery, was proposed to extract reliable features from speech signals even in noisy environments for robust speech recognition. In this paper, the performance of the ZCPA model is further improved by incorporating conventional speech processing techniques into the model output. Spectral and cepstral representations of the ZCPA model output are compared, and the incorporation of dynamic features with several different lengths of time-derivative window is evaluated. Comparative evaluations with other front-ends in real-world noisy environments are also performed, and they show the superiority of the ZCPA model.

Performance Improvement of Robust Speaker Verification According to Various Standard Deviations of a Reference Distribution in Histogram Transformation (히스토그램 변환에서 기준분포의 표준편차 변경에 따른 강인한 화자인증 성능 개선)

Kwon, Chul-Hong
- Phonetics and Speech Sciences / v.2 no.3 / pp.127-134 / 2010
Additive noise and channel mismatch strongly degrade the performance of speaker verification systems, as they distort the features of speech. In this paper, a histogram transformation technique is presented to improve the robustness of text-independent speaker verification systems. The technique transforms the features extracted from speech so that their histogram conforms to a reference distribution. The effect of different standard deviations for the reference distribution is investigated. Experimental results indicate that, in channel mismatched environments, the proposed technique offers significant improvements over existing techniques. We also verify the performance improvement of the proposed method using statistics.
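
One common way to make a feature histogram conform to a reference distribution is rank-based quantile mapping, sketched below. The abstract does not state which reference distribution or mapping the paper uses, so the Gaussian reference, the function name, and the `ref_std` parameter are assumptions.

```python
from statistics import NormalDist
from typing import List


def histogram_transform(values: List[float], ref_std: float = 1.0,
                        ref_mean: float = 0.0) -> List[float]:
    """Map values so their empirical distribution follows N(ref_mean, ref_std^2).

    Assumed approach: each value is replaced by the reference quantile at its
    empirical CDF position. Ties are ordered arbitrarily by the stable sort.
    """
    n = len(values)
    if n == 0:
        return []
    reference = NormalDist(mu=ref_mean, sigma=ref_std)
    order = sorted(range(n), key=lambda i: values[i])
    out = [0.0] * n
    for rank, idx in enumerate(order):
        # (rank + 0.5) / n keeps the probability strictly inside (0, 1).
        out[idx] = reference.inv_cdf((rank + 0.5) / n)
    return out


# One hypothetical feature dimension, transformed with two reference deviations.
feature = [5.0, 1.0, 3.0]
narrow = histogram_transform(feature, ref_std=1.0)
wide = histogram_transform(feature, ref_std=2.0)
```

Changing `ref_std` rescales the transformed features without changing their ordering, which is the parameter the abstract says is varied.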

