Search | Korea Science

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments (가산잡음환경에서 강인음성인식을 위한 은닉 마르코프 모델 기반 손실 특징 복원)

Cho, Ji-Won;Park, Hyung-Min
Phonetics and Speech Sciences, v.6 no.4, pp.127-132, 2014
This paper describes a robust speech recognition technique that reconstructs spectral components mismatched with the training environment. The cluster-based reconstruction method compensates for unreliable components using reliable components in the same spectral vector, assuming that training spectral vectors follow an independent, identically distributed Gaussian-mixture process. The presented method instead exploits the temporal dependency of speech by introducing a hidden-Markov-model prior that incorporates internal state transitions plausible for the observed spectral vector sequence. Experimental results indicate that the described method provides temporally consistent reconstruction and, on average, further improves recognition performance compared to the conventional method.
https://doi.org/10.13064/KSSS.2014.6.4.127

Sub-Nyquist Nonuniform Sampling and Perfect Reconstruction of Speech Signals (음성신호의 Sub-Nyquist 비균일 표준화 및 완전 복구에 관한 연구)

Lee, He-Young
Speech Sciences, v.12 no.2, pp.153-170, 2005
The sub-Nyquist nonuniform sampling (SNNS) and the perfect reconstruction (PR) formula are proposed as a systematic method for obtaining a minimal representation of a speech signal. In the proposed method, the instantaneous sampling frequency (ISF) varies with the least upper bound of the spectral support of the speech signal in the time-frequency domain (TFD). The instantaneous bandwidth (IB), which determines the ISF and is used to generate the set of samples that perfectly represents the continuous-time signal, is defined. The spectral characteristics of the sampled data generated by the sub-Nyquist nonuniform sampling method are also analyzed. The proposed method does not generate the redundant samples that arise from the time-varying instantaneous bandwidth of a speech signal.

Feature Compensation Combining SNR-Dependent Feature Reconstruction and Class Histogram Equalization

Suh, Young-Joo;Kim, Hoi-Rin
ETRI Journal, v.30 no.5, pp.753-755, 2008
In this letter, we propose a new histogram equalization technique for feature compensation in speech recognition under noisy environments. The proposed approach combines a signal-to-noise-ratio-dependent feature reconstruction method and the class histogram equalization technique to effectively reduce the acoustic mismatch present in noisy speech features. Experimental results from the Aurora 2 task confirm the superiority of the proposed approach for acoustic feature compensation.
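As a rough illustration of plain histogram equalization only (not the SNR-dependent reconstruction or the class-based variant proposed in the letter), the sketch below maps each noisy feature value onto a reference distribution by matching empirical CDF ranks. The function name, the interpolation scheme, and the synthetic data are assumptions made for illustration.

```python
import random

def histogram_equalize(noisy, reference):
    """Map each noisy value to the reference value at the same empirical CDF rank."""
    ref = sorted(reference)
    order = sorted(range(len(noisy)), key=lambda i: noisy[i])
    out = [0.0] * len(noisy)
    for rank, i in enumerate(order):
        cdf = (rank + 0.5) / len(noisy)        # empirical CDF of this noisy value
        pos = cdf * len(ref) - 0.5             # fractional index into sorted reference
        lo = min(max(int(pos), 0), len(ref) - 1)
        hi = min(lo + 1, len(ref) - 1)
        frac = min(max(pos - lo, 0.0), 1.0)
        out[i] = ref[lo] + frac * (ref[hi] - ref[lo])  # linear interpolation
    return out

# Hypothetical data: "clean" stands in for training-feature statistics,
# "noisy" is a shifted and compressed version of part of it.
random.seed(0)
clean = [random.gauss(0.0, 1.0) for _ in range(2000)]
noisy = [0.5 * x + 2.0 for x in clean[:500]]
equalized = histogram_equalize(noisy, clean)
```

The mapping is monotonic, so it preserves the ordering of the feature values while moving their overall distribution toward that of the reference data.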

Low-band Extension of CELP Speech Coder by Recovery of Harmonics (고조파 복원에 의한 CELP 음성 부호화기의 저대역 확장)

Park Jin Soo;Choi Mu Yeol;Kim Hyung Soon
MALSORI, no.49, pp.63-75, 2004
Most telephone speech transmitted over current public networks is band-limited to 0.3-3.4 kHz. Compared with wideband speech (0-8 kHz), narrowband speech lacks the low-band (0-0.3 kHz) and high-band (3.4-8 kHz) components. As a result, it is characterized by reduced intelligibility, a muffled quality, and degraded speaker identification. Bandwidth extension is a technique for providing wideband speech quality by reconstructing the low-band and high-band components without any additional transmitted information. Our approach uses a harmonic synthesis method to reconstruct low-band speech from CELP-coded speech. A spectral distortion measure and a listening test are used to assess the proposed method, and the improvement in synthesized speech quality was verified.

A Study on the Reconstruction of a Frame Based Speech Signal through Dictionary Learning and Adaptive Compressed Sensing (Adaptive Compressed Sensing과 Dictionary Learning을 이용한 프레임 기반 음성신호의 복원에 대한 연구)

Jeong, Seongmoon;Lim, Dongmin
The Journal of Korean Institute of Communications and Information Sciences, v.37A no.12, pp.1122-1132, 2012
Compressed sensing has been applied to many fields such as images, speech signals, radars, etc. It has been mainly applied to stationary signals, and reconstruction error could grow as compression ratios are increased by decreasing measurements. To resolve the problem, speech signals are divided into frames and processed in parallel. The frames are made sparse by dictionary learning, and adaptive compressed sensing is applied which designs the compressed sensing reconstruction matrix adaptively by using the difference between the sparse coefficient vector and its reconstruction. Through the proposed method, we could see that fast and accurate reconstruction of non-stationary signals is possible with compressed sensing.
https://doi.org/10.7840/kics.2012.37A.12.1122

Scalable High-quality Speech Reconstruction in Distributed Speech Recognition Environments (분산음성인식 환경에서 서버에서의 스케일러블 고품질 음성복원)

Yoon, Jae-Sam;Kim, Hong-Kook;Kang, Byung-Ok
Proceedings of the IEEK Conference, 2007.07a, pp.423-424, 2007
In this paper, we propose a scalable high-quality speech reconstruction method for distributed speech recognition (DSR). It is difficult to reconstruct high-quality speech from MFCCs at the DSR server. Depending on the bit-rate available to the DSR system, additional information associated with speech coding can be sent to the DSR server, where the bit-rate varies from 4.8 kbit/s to 11.4 kbit/s. The experimental results show that the speech quality reproduced by the proposed method at 11.4 kbit/s is comparable with that of ITU-T G.729 under both ideal channel and frame error channel conditions, while DSR performance is maintained at the level of wireline speech recognition.

A Study of Speech Coding for the Transmission on Network by the Wavelet Packets (Wavelet Packet을 이용한 Network 상의 음성 코드에 관한 연구)

Baek, Han-Wook;Chung, Chin-Hyun
Proceedings of the KIEE Conference, 2000.07d, pp.3028-3030, 2000
In general, speech coding is dedicated to compression performance or speech quality. The speech coding in this paper, however, focuses on transmission that adapts flexibly to the network speed. This requires subband coding, for which the wavelet packet concept from signal analysis is used. Extracting each frequency band is difficult with general signal analysis methods, and reconstructing the bands after coding each one is also a difficult problem. With the wavelet packet concept (perfect reconstruction) and its fast computation algorithm, however, both the extraction of each band and the reconstruction become more natural. This paper also describes a direct solution for voice transmission over a network and implements the algorithm in a TCP/IP network environment on a PC.
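The perfect-reconstruction property mentioned in this abstract can be sketched with a small wavelet packet tree. The example below assumes the Haar filter pair, chosen only because it is the simplest; the abstract does not say which wavelet the authors used, and all function names are hypothetical.

```python
import math

def haar_split(x):
    """One analysis stage: split an even-length signal into low and high half-bands."""
    s = math.sqrt(2.0)
    low = [(x[i] + x[i + 1]) / s for i in range(0, len(x), 2)]
    high = [(x[i] - x[i + 1]) / s for i in range(0, len(x), 2)]
    return low, high

def haar_merge(low, high):
    """One synthesis stage: exactly inverts haar_split."""
    s = math.sqrt(2.0)
    x = []
    for a, d in zip(low, high):
        x.extend([(a + d) / s, (a - d) / s])
    return x

def packet_decompose(x, levels):
    """Wavelet packet tree: split every band (not only the low band) at each level."""
    bands = [x]
    for _ in range(levels):
        bands = [half for b in bands for half in haar_split(b)]
    return bands

def packet_reconstruct(bands):
    """Merge sibling bands pairwise until a single signal remains."""
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1]) for i in range(0, len(bands), 2)]
    return bands[0]

signal = [1.0, 2.0, 3.0, 4.0, 5.0, 6.0, 7.0, 8.0]
bands = packet_decompose(signal, 2)     # four subbands of two coefficients each
restored = packet_reconstruct(bands)    # equals signal up to rounding error
```

Each subband could be coded and sent separately, which is the sense in which such a decomposition suits transmission at varying network speeds. The bands here come out in tree order, not in order of increasing frequency.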

Wideband Speech Reconstruction Using Modular Neural Networks (모듈화한 신경 회로망을 이용한 광대역 음성 복원)

Woo Dong Hun;Ko Charm Han;Kang Hyun Min;Jeong Jin Hee;Kim Yoo Shin;Kim Hyung Soon
MALSORI, no.48, pp.93-105, 2003
Since the telephone channel has band-limited frequency characteristics, speech signals transmitted over it show degraded quality. In this paper, we propose an algorithm that uses neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it is difficult to train on vast and complicated data. To alleviate this problem, we modularize the neural networks based on appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. In our simulation, the proposed algorithm showed improved performance over the single neural network and the conventional codebook mapping method in both objective and subjective evaluations.

Reconstruction of a Total Soft Palatal Defect Using a Folded Radial Forearm Free Flap and Palmaris Longus Tendon Sling

Lee, Myung-Chul;Lee, Dong-Won;Rah, Dong-Kyun;Lee, Won-Jai
Archives of Plastic Surgery, v.39 no.1, pp.25-30, 2012
Background: The soft palate functions as a valve and helps generate the oral pressure required for normal speech resonance. Speech problems and nasal regurgitation can result from a soft palatal defect. Reduction of the size of the velopharyngeal orifice is required to compensate for the lack of mobility in a reconstructed soft palate. We suggest a large volume folded free flap for reduction of the caliber and a palmaris longus tendon sling for suspension of the reconstructed palate. Methods: Six patients had total soft palate resection for tonsillar cancer and reconstruction with a large volume folded radial forearm free flap combined with a palmaris longus sling. A single surgeon and speech therapist examined the patients using three standardized speech assessment tools: a nasometer test, a consonant articulation test, and a speech acuity test. Results: The mean nasalance score was 76.20% for sentences with nasal sounds and 43.60% for sentences with oral sounds. Hypernasality was seen for oral sound sentences. The mean score of the picture consonant articulation test was 84% (range, 63% to 100%). The mean score of the speech acuity test was 5.84 (range, 5 to 6). These mean ratings represent a satisfactory level of speech function. Conclusions: The large volume folded free flap with a palmaris longus tendon sling for total soft palate reconstruction resulted in a satisfactory prognosis for speech despite moderate hypernasality.
https://doi.org/10.5999/aps.2012.39.1.25

Palatoplasty with Reconstruction of Levator Sling (Preliminary Report) (근륜(Levator Sling)재건술식을 이용한 구개성형술 (일차보고))

Choi, See-Ho
Journal of Yeungnam Medical Science, v.7 no.2, pp.49-54, 1990
Ten cleft palate patients underwent reconstruction of the levator sling without pushback, in order to avoid creating a raw surface in the anterior portion of the hard palate and thereby prevent maxillary retrognathia. Speech was evaluated using a speech assessment list. Maxillary growth was not evaluated because most patients were still of growing age; it will be reported at a later date. Levator sling palatoplasty appears to be clinically significant, as it improved speech without any complications.
