• Title/Summary/Keyword: Hidden markov model

A Robust Speech Recognition Method Combining the Model Compensation Method with the Speech Enhancement Algorithm (음질향상 기법과 모델보상 방식을 결합한 강인한 음성인식 방식)

  • Kim, Hee-Keun;Chung, Yong-Joo;Bae, Keun-Seung
    • Speech Sciences / v.14 no.2 / pp.115-126 / 2007
  • There have been many research efforts to improve the performance of speech recognizers in noisy conditions. Among them, the model compensation method and the speech enhancement approach have been widely used. In this paper, we propose combining the two approaches to further improve recognition rates in noisy speech recognition. For speech enhancement, the minimum mean square error short-time spectral amplitude (MMSE-STSA) estimator has been adopted, and parallel model combination (PMC) and Jacobian adaptation (JA) have been used as the model compensation approaches. The experimental results show that the hybrid approach, which applies the model compensation methods to the enhanced speech, produces better results than using only one of the two approaches.

A New Distance Measure for a Variable-Sized Acoustic Model Based on MDL Technique

  • Cho, Hoon-Young;Kim, Sang-Hun
    • ETRI Journal / v.32 no.5 / pp.795-800 / 2010
  • Embedding a large vocabulary speech recognition system in mobile devices requires a reduced acoustic model obtained by eliminating redundant model parameters. In conventional optimization methods based on the minimum description length (MDL) criterion, a binary Gaussian tree is built at each state of a hidden Markov model by iteratively finding and merging similar mixture components. An optimal subset of the tree nodes is then selected to generate a downsized acoustic model. To obtain a better binary Gaussian tree by improving the process of finding the most similar Gaussian components, this paper proposes a new distance measure that exploits the difference in likelihood values for cases before and after two components are combined. The mixture weight of Gaussian components is also introduced in the component merging step. Experimental results show that the proposed method outperforms MDL-based optimization using either a Kullback-Leibler (KL) divergence or weighted KL divergence measure. The proposed method could also reduce the acoustic model size by 50% with less than a 1.5% increase in error rate compared to a baseline system.
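The KL divergence baseline and the weighted merging of two mixture components mentioned in this abstract can be sketched as follows. This is a minimal illustration that assumes diagonal-covariance Gaussians and a standard moment-matched merge; it is not the paper's proposed likelihood-difference measure, and the function names `kl_diag_gauss` and `merge_components` are hypothetical.

```python
import math


def kl_diag_gauss(mean_p, var_p, mean_q, var_q):
    """KL(p || q) for two diagonal-covariance Gaussians (assumed form)."""
    total = 0.0
    for mp, vp, mq, vq in zip(mean_p, var_p, mean_q, var_q):
        total += 0.5 * (math.log(vq / vp) + (vp + (mp - mq) ** 2) / vq - 1.0)
    return total


def merge_components(w1, mean1, var1, w2, mean2, var2):
    """Merge two weighted diagonal Gaussians by matching the first two moments."""
    w = w1 + w2
    a, b = w1 / w, w2 / w  # relative mixture weights of the two components
    mean = [a * m1 + b * m2 for m1, m2 in zip(mean1, mean2)]
    var = [
        a * (v1 + (m1 - m) ** 2) + b * (v2 + (m2 - m) ** 2)
        for m1, v1, m2, v2, m in zip(mean1, var1, mean2, var2, mean)
    ]
    return w, mean, var
```

In a tree-building loop of the kind the abstract describes, the pair of components with the smallest distance would be merged first; the mixture weights enter through `a` and `b`, so a heavier component pulls the merged mean toward itself.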

Trade-off between Model Complexity and Performance in Intra-frame Predictive Vector Quantization of Wideband Speech (광대역 음성에 대한 프레임내 잔차 벡터 양자화에 있어서 모델 복잡도와 성능 사이의 교환관계)

  • Song, Geun-Bae;Hahn, Hern-Soo
    • The Journal of Korea Robotics Society / v.5 no.1 / pp.70-76 / 2010
  • This paper addresses the design issue of the trade-off between model complexity and performance when bandwidth extension (BWE) methods are applied to the intra-frame predictive vector quantization problem of wideband speech. It discusses model-based linear and non-linear prediction methods and presents a comparative study of them in terms of prediction gain. Experiments show a general trend of performance saturating as model complexity increases. In particular, no significant difference is observed between HMM-based and GMM-based BWE functions.

A Study on the Noisy Speech Recognition Based on Multi-Model Structure Using an Improved Jacobian Adaptation (향상된 JA 방식을 이용한 다 모델 기반의 잡음음성인식에 대한 연구)

  • Chung, Yong-Joo
    • Speech Sciences / v.13 no.2 / pp.75-84 / 2006
  • Various methods have been proposed to overcome the problem of speech recognition in noisy conditions. Among them, model compensation methods such as parallel model combination (PMC) and Jacobian adaptation (JA) have been found to perform efficiently. JA is quite effective when hidden Markov models (HMMs) have already been trained in a condition similar to the target environment. In a previous work, we proposed an improved JA method that is more robust against changing environments during recognition. In this paper, we further improve its performance by compensating the delta-mean vectors and covariance matrices of the HMM, and we investigate its feasibility in a multi-model structure for noisy speech recognition. The experimental results show that the proposed method improved the robustness of JA and that the multi-model approach could be a viable solution for noisy speech recognition.

Echo Noise Robust HMM Learning Model using Average Estimator LMS Algorithm (평균 예측 LMS 알고리즘을 이용한 반향 잡음에 강인한 HMM 학습 모델)

  • Ahn, Chan-Shik;Oh, Sang-Yeob
    • Journal of Digital Convergence / v.10 no.10 / pp.277-282 / 2012
  • Speech recognition systems cannot adapt quickly to varied environmental noise factors, which degrades recognition performance. In this paper, an echo-noise-robust HMM learning model using the average estimator LMS algorithm is proposed. The HMM learning model is constructed so that it can adapt to changing echo noise, and its recognition performance is evaluated. As a result, the SNR of speech obtained by removing the changing environmental noise improved by 3.1 dB on average, and the recognition rate improved by 3.9%.

Performance Comparison between the PMC and VTS Method for the Isolated Speech Recognition in Car Noise Environments (자동차 잡음환경 고립단어 음성인식에서의 VTS와 PMC의 성능비교)

  • Chung, Yong-Joo;Lee, Seung-Wook
    • Speech Sciences / v.10 no.3 / pp.251-261 / 2003
  • There have been many research efforts to overcome the problems of speech recognition in noisy conditions. Among noise-robust speech recognition methods, model-based adaptation approaches have been shown to be quite effective. In particular, the PMC (parallel model combination) method is very popular and has been shown to give considerably better recognition results than conventional methods. In this paper, we experimented with the VTS (vector Taylor series) algorithm, which is also based on model parameter transformation but has not attracted much interest from researchers in this area. To verify its effectiveness, we applied the algorithm to the continuous density HMM (hidden Markov model). We compared the performance of the VTS algorithm with that of the PMC method and found that it gave better results than the PMC method.

Sign Language Spotting Based on Semi-Markov Conditional Random Field (세미-마르코프 조건 랜덤 필드 기반의 수화 적출)

  • Cho, Seong-Sik;Lee, Seong-Whan
    • Journal of KIISE:Software and Applications / v.36 no.12 / pp.1034-1037 / 2009
  • Sign language spotting is the task of detecting the start and end points of signs in continuous data and recognizing the detected signs within a predefined vocabulary. The difficulty with sign language spotting is that instances of signs vary in both motion and shape. Moreover, signs have variable motion in terms of both trajectory and length. In particular, variable sign lengths cause problems when spotting signs in a video sequence, because short signs involve less information and fewer changes than long signs. In this paper, we propose a method for spotting variable-length signs based on the semi-CRF (semi-Markov conditional random field). We performed experiments on ASL (American Sign Language) and KSL (Korean Sign Language) datasets of continuous sign sentences to demonstrate the efficiency of the proposed method. Experimental results show that the proposed method outperforms both HMM and CRF.

Risk-Incorporated Trajectory Prediction to Prevent Contact Collisions on Construction Sites

  • Rashid, Khandakar M.;Datta, Songjukta;Behzadan, Amir H.;Hasan, Raiful
    • Journal of Construction Engineering and Project Management / v.8 no.1 / pp.10-21 / 2018
  • Many construction projects involve a plethora of safety-related problems that can cause loss of productivity, diminished revenue, time overruns, and legal challenges. Incorporating data collection and analytics methods can help overcome the root causes of many such problems. However, in a dynamic construction workplace collecting data from a large number of resources is not a trivial task and can be costly, while many contractors lack the motivation to incorporate technology in their activities. In this research, an Android-based mobile application, Preemptive Construction Site Safety (PCS2) is developed and tested for real-time location tracking, trajectory prediction, and prevention of potential collisions between workers and site hazards. PCS2 uses ubiquitous mobile technology (smartphones) for positional data collection, and a robust trajectory prediction technique that couples hidden Markov model (HMM) with risk-taking behavior modeling. The effectiveness of PCS2 is evaluated in field experiments where impending collisions are predicted and safety alerts are generated with enough lead time for the user. With further improvement in interface design and underlying mathematical models, PCS2 will have practical benefits in large scale multi-agent construction worksites by significantly reducing the likelihood of proximity-related accidents between workers and equipment.
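As a rough illustration of how an HMM can support position prediction of the kind described in this abstract, the sketch below filters a belief over discrete states from observations and then predicts the next-state distribution. It is a generic textbook forward filter, not the PCS2 method: the risk-taking behavior model is not included, and the state layout, the matrices, and the names `forward_filter` and `predict_next` are assumptions made for illustration.

```python
def forward_filter(prior, transition, emission, observations):
    """Return P(state_t | observations so far) for a discrete HMM.

    prior[i]         : initial probability of state i
    transition[i][j] : probability of moving from state i to state j
    emission[j][o]   : probability of observing symbol o in state j
    Assumes every observation has nonzero probability under the model.
    """
    n = len(prior)
    belief = list(prior)
    for t, obs in enumerate(observations):
        if t > 0:
            # Time update: propagate the belief through the transition model.
            belief = [sum(belief[i] * transition[i][j] for i in range(n))
                      for j in range(n)]
        # Measurement update: weight by the likelihood of the observation.
        belief = [belief[j] * emission[j][obs] for j in range(n)]
        total = sum(belief)
        belief = [b / total for b in belief]
    return belief


def predict_next(belief, transition):
    """One-step-ahead state distribution given the current belief."""
    n = len(belief)
    return [sum(belief[i] * transition[i][j] for i in range(n))
            for j in range(n)]
```

If the states stood for grid cells on a site, a hazard alert could be raised whenever the predicted probability mass on hazardous cells exceeds a chosen threshold; that thresholding step is likewise an assumption, not something the abstract specifies.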

An Innovative Approach of Bangla Text Summarization by Introducing Pronoun Replacement and Improved Sentence Ranking

  • Haque, Md. Majharul;Pervin, Suraiya;Begum, Zerina
    • Journal of Information Processing Systems / v.13 no.4 / pp.752-777 / 2017
  • This paper proposes an automatic method to summarize Bangla news documents. In the proposed approach, pronoun replacement is performed for the first time to minimize dangling pronouns in the summary. After pronouns are replaced, sentences are ranked using term frequency, sentence frequency, numerical figures, and title words. If two sentences have at least 60% cosine similarity, the frequency of the longer sentence is increased and the shorter sentence is removed to eliminate redundancy. Moreover, the first sentence is always included in the summary if it contains any title word. In Bangla text, numerical figures can be presented in both words and digits, in a variety of forms. All these forms are identified to assess the importance of sentences. We have used a rule-based system in this approach with a hidden Markov model and a Markov chain model. To derive the rules, we analyzed 3,000 Bangla news documents and studied several Bangla grammar books. A series of experiments was performed on 200 Bangla news documents and 600 summaries (3 summaries for each document). The evaluation results demonstrate the effectiveness of the proposed technique over the four latest methods.
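The redundancy rule in this abstract (drop one of two sentences whose cosine similarity is at least 60%) can be sketched as below. This is a simplified illustration that assumes whitespace tokenization and raw term-frequency vectors; it keeps the longer sentence but does not reproduce the paper's frequency boost or its ranking features, and the names `cosine_similarity` and `drop_redundant` are hypothetical.

```python
import math
from collections import Counter


def cosine_similarity(a, b):
    """Cosine similarity of two sentences using raw term-frequency vectors."""
    ta, tb = Counter(a.lower().split()), Counter(b.lower().split())
    dot = sum(ta[t] * tb[t] for t in ta)
    norm = (math.sqrt(sum(c * c for c in ta.values()))
            * math.sqrt(sum(c * c for c in tb.values())))
    return dot / norm if norm else 0.0


def drop_redundant(sentences, threshold=0.6):
    """Keep only the longer sentence of any pair at or above the threshold."""
    kept = []
    for s in sentences:
        for i, k in enumerate(kept):
            if cosine_similarity(s, k) >= threshold:
                # Redundant pair: retain whichever sentence has more tokens.
                if len(s.split()) > len(k.split()):
                    kept[i] = s
                break
        else:
            kept.append(s)
    return kept
```

For Bangla text a proper tokenizer and stemmer would be needed in place of `str.split`; the threshold of 0.6 is the only value taken from the abstract.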

Fast Text Line Segmentation Model Based on DCT for Color Image (컬러 영상 위에서 DCT 기반의 빠른 문자 열 구간 분리 모델)

  • Shin, Hyun-Kyung
    • The KIPS Transactions:PartD / v.17D no.6 / pp.463-470 / 2010
  • We present a very fast and robust method of text line segmentation based on the DCT blocks of a color image, without decompression or binarization. Using the DC coefficient and three primary AC coefficients from each block DCT, we created a gray-scale image reduced in size by a factor of 8x8. To detect and locate the white strips between text lines, we analyzed the horizontal and vertical projection profiles of the image and applied a direct Markov model to recover missing white strips by estimating the hidden periodicity. The performance results showed that our method was 40 to 100 times faster than the traditional method.
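The projection-profile step mentioned in this abstract can be illustrated with a small sketch. It assumes a binary image given as a list of rows (1 for ink, 0 for background) rather than the DCT-derived gray-scale image the paper uses, and it omits the Markov-model recovery of missing strips; `horizontal_profile` and `find_white_strips` are hypothetical names.

```python
def horizontal_profile(image):
    """Sum of ink pixels in each row (1 = ink, 0 = background)."""
    return [sum(row) for row in image]


def find_white_strips(image, max_ink=0):
    """Return (first_row, last_row) ranges whose ink count is <= max_ink."""
    strips = []
    start = None
    for y, ink in enumerate(horizontal_profile(image)):
        if ink <= max_ink:
            if start is None:
                start = y  # a new white strip begins here
        elif start is not None:
            strips.append((start, y - 1))
            start = None
    if start is not None:
        strips.append((start, len(image) - 1))  # strip runs to the last row
    return strips
```

Text lines are then the row ranges between consecutive strips. Raising `max_ink` tolerates a little noise in the gaps, at the cost of possibly splitting lines with sparse strokes.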