Search | Korea Science

A Multi-Model Based Noisy Speech Recognition Using the Model Compensation Method (다 모델 방식과 모델보상을 통한 잡음환경 음성인식)

Chung, Young-Joo;Kwak, Seung-Woo
- MALSORI / no.62 / pp.97-112 / 2007
Speech recognizers generally operate in noisy acoustic environments, and much research has been done to cope with the resulting acoustic variation. Among these methods, the multiple-HMM approach appears quite effective compared with conventional methods. In this paper, we consider a multiple-model approach combined with model compensation and investigate, through noisy speech recognition experiments, how many HMM sets are needed. Using data-driven Jacobian adaptation for model compensation, the multiple-model approach achieved results comparable to the re-training method with only a few model sets for each noise type.

Rule-based Named Entity (NE) Recognition from Speech (음성 자료에 대한 규칙 기반 Named Entity 인식)

Kim Ji-Hwan
- MALSORI / no.58 / pp.45-66 / 2006
In this paper, a rule-based (transformation-based) NE recognition system is proposed. The system uses Brill's rule inference approach. Its performance is compared with that of IdentiFinder, one of the most successful stochastic systems. In the baseline case (no punctuation and no capitalisation), both systems show almost equal performance. They also perform similarly when additional information such as punctuation, capitalisation, and name lists is available. The performance of both systems degrades linearly with the number of speech recognition errors, and their rates of degradation are almost equal. These results show that automatic rule inference is a viable alternative to the HMM-based approach to NE recognition while retaining the advantages of a rule-based approach.

A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database (훈련음성 데이터에 적응시킨 필터뱅크 기반의 MFCC 특징파라미터를 이용한 전화음성 연속숫자음의 인식성능 향상에 관한 연구)

Jung Sung Yun;Kim Min Sung;Son Jong Mok;Bae Keun Sung;Kang Jeom Ja
- Proceedings of the KSPS conference / 2003.05a / pp.119-122 / 2003
In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses filter shapes obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach of [1] on a large amount of telephone speech data, namely the Korean connected-digit telephone speech database released by SITEC. The experimental results are discussed along with our findings.

An Overview and Market Review of Speaker Recognition Technology (화자인식 기술 및 국내외시장 동향)

Yu, Ha-Jin
- Proceedings of the KSPS conference / 2004.05a / pp.91-97 / 2004
We provide a brief overview of speaker recognition, describing the underlying techniques and the current state of the market. The technical description focuses on the Gaussian mixture model (GMM), the most prevalent and effective approach. Following the technical overview, we outline trends in the domestic and international markets.

Speech Recognition in Noise Environment by Independent Component Analysis and Spectral Enhancement (독립 성분 분석과 스펙트럼 향상에 의한 잡음 환경에서의 음성인식)

Choi Seung-Ho
- MALSORI / no.48 / pp.81-91 / 2003
In this paper, we propose a speech recognition method based on independent component analysis (ICA) and spectral enhancement techniques. While ICA tries to separate the speech signal from noisy speech using multiple channels, some noise remains because of its algorithmic limitations. Spectral enhancement techniques can compensate for this shortfall in ICA's separation ability. Speech recognition experiments in instantaneous and convolved mixing environments show that the proposed approach gives much higher recognition accuracy than conventional methods.

Convolutive source separation in noisy environments (잡음 환경하에서의 음성 분리)

Jang Inseon;Choi Seungjin
- Proceedings of the KSPS conference / 2003.10a / pp.97-100 / 2003
This paper addresses a method of convolutive source separation based on SEONS (Second Order Nonstationary Source Separation) [1], which was originally developed for blind separation of instantaneous mixtures using nonstationarity. To tackle this problem, we transform the convolutive BSS problem into multiple short-term instantaneous problems in the frequency domain and separate the instantaneous mixtures in every frequency bin. We also employ an H-infinity filtering technique to reduce the effect of sensor noise. Numerical experiments are provided to demonstrate the effectiveness of the proposed approach and to compare its performance with existing methods.

Coda Neutralization in Korean: OT Approach

Hong, Soonhyun
- Proceedings of the KSPS conference / 1996.10a / pp.123-128 / 1996
So far we have proposed the following constraint ranking for the (over-)application of coda neutralization: (22) License family ≫ UE family ≫ IDENT-IO family ≫ Base-ID. This analysis shows that the surface level alone is enough to analyze the opaque behavior of coda neutralization. The Uniform Exponence constraint is worth further study, since it can handle Consonant Cluster Simplification and the underapplication of /t/-palatalization in Korean compounds, in which morphemes before a stem are uniformly realized as one surface form, i.e., the output base form (S. Hong, in preparation). (equation omitted)

A STUDY ON THE SATISFIED DEGREE OF ORAL FUNCTION IN GERIATRIC PATIENTS WITH THE SHORTENED DENTAL ARCH (단치궁 노인의 구강 기능 만족도에 관한 연구)

Choi Jae-Sung;Kang Woo-Jin;Chung Moon-Kyu
- The Journal of Korean Academy of Prosthodontics / v.30 no.2 / pp.191-202 / 1992
The purpose of this study is to assess satisfaction with each oral function in geriatric patients with a shortened dental arch and to provide reference data for planning their prosthetic treatment. A total of 521 subjects were grouped by the number of remaining teeth, and their masticatory function, phonetic function, facial change, and TMJ disorders were assessed by questionnaire. The questionnaires were also used to examine, in terms of satisfaction with oral function, the relationship between the shortened dental arch and dentition, and the relationship between wearing a free-end RPD on the shortened dental arch and oral function. The findings are as follows:
1. For satisfaction with oral function, there was a significant difference among the group with only anterior teeth, the group with teeth in the premolar region, and the group with teeth in the molar region as well; however, no significant difference appeared between the group with first molars and the group with second molars.
2. For satisfaction with phonetic function, no significant difference appeared among the group with only anterior teeth, the group with teeth in the premolar region as well, and the group with teeth in the molar region as well. For satisfaction with facial change, no significant difference appeared between the group with teeth in the premolar region and the group with teeth in the molar region as well.
3. For satisfaction with masticatory function, phonetic function, TMJ disorders, and facial change, no significant difference appeared between the group with both anterior teeth and premolars and the group fitted with a free-end RPD under the same dental conditions.

A multi-dimensional approach to English for Global Communication: Pragmatics of International Intelligibility

Nihalani, Paroo
- Proceedings of the KSPS conference / 2000.07a / pp.353-363 / 2000
The consonant system of English is relatively uniform throughout the English-speaking countries. Accents of English are mainly known to differ in their vowel systems and in the phonetic realisations of vowel phonemes. The results of an acoustic study of the vowel phonology of Japanese English, Singapore English, and Indian English are presented, and an attempt is then made to compare the vowel phonology of these non-native varieties with that of Scottish English and RP. Various native varieties of English are thus shown to differ from each other in major ways, as much, perhaps, as the non-native varieties differ from the native varieties. Nevertheless, native speakers of English appear to be mutually intelligible to a degree that does not extend to non-native varieties. Evidently there are features that various native accents have in common which facilitate their mutual intelligibility, and these features are not shared by non-native accents. It is proposed that the foreign learner adopt certain core features of English in his pronunciation if he is to use English effectively as an international language. The common core that is significant in the communication process will be discussed. In conclusion, some pragmatic implications for English language education in the new millennium will be articulated.

A Query-by-Speech Scheme for Photo Albuming (음성 질의 기반 디지털 사진 검색 기법)

Kim Tae-Sung;Suh Young-Joo;Lee Yong-Ju;Kim Hoi-Rin
- MALSORI / no.57 / pp.99-112 / 2006
In this paper, we introduce two methods for retrieving photos annotated with speech documents. We compare the pattern of a speech query with the patterns of speech documents recorded in digital cameras, measure their similarities, and retrieve the photos whose speech documents have high similarity scores. In the first approach, a phoneme recognition scheme is used as the pre-processor for pattern matching; in the second, vector quantization (VQ) and dynamic time warping (DTW) are applied to match the speech query with the documents directly in the signal domain. Experimental results show that the performance of the first approach is highly dependent on that of phoneme recognition, while its processing time is short. The second method provides a great improvement in performance. Although its processing time is longer than that of the first method because of DTW, it can be reduced by using approximate methods.
