• Title/Summary/Keyword: Speech reconstruction

HMM-based missing feature reconstruction for robust speech recognition in additive noise environments

  • 조지원;박형민
    • 말소리와 음성과학
    • /
    • Vol. 6, No. 4
    • /
    • pp.127-132
    • /
    • 2014
  • This paper describes a robust speech recognition technique that reconstructs spectral components mismatched with the training environment. The cluster-based reconstruction method can compensate for unreliable components using the reliable components in the same spectral vector, under the assumption that training spectral vectors follow an independent, identically distributed Gaussian-mixture process. The presented method instead exploits the temporal dependency of speech by introducing a hidden-Markov-model prior, which incorporates internal state transitions plausible for the observed spectral vector sequence. The experimental results indicate that the described method provides temporally consistent reconstruction and, on average, further improves recognition performance compared to the conventional method.

Sub-Nyquist Nonuniform Sampling and Perfect Reconstruction of Speech Signals

  • 이희영
    • 음성과학
    • /
    • Vol. 12, No. 2
    • /
    • pp.153-170
    • /
    • 2005
  • Sub-Nyquist nonuniform sampling (SNNS) and a perfect reconstruction (PR) formula are proposed as a systematic method for obtaining a minimal representation of a speech signal. In the proposed method, the instantaneous sampling frequency (ISF) varies with the least upper bound of the spectral support of the speech signal in the time-frequency domain (TFD). The paper defines the instantaneous bandwidth (IB), which determines the ISF and is used to generate the set of samples that represents the continuous-time signal perfectly. The spectral characteristics of the sampled data generated by the sub-Nyquist nonuniform sampling method are also analyzed. Because it tracks the time-varying instantaneous bandwidth of the speech signal, the proposed method does not generate redundant samples.

Feature Compensation Combining SNR-Dependent Feature Reconstruction and Class Histogram Equalization

  • Suh, Young-Joo;Kim, Hoi-Rin
    • ETRI Journal
    • /
    • Vol. 30, No. 5
    • /
    • pp.753-755
    • /
    • 2008
  • In this letter, we propose a new histogram equalization technique for feature compensation in speech recognition under noisy environments. The proposed approach combines a signal-to-noise-ratio-dependent feature reconstruction method and the class histogram equalization technique to effectively reduce the acoustic mismatch present in noisy speech features. Experimental results from the Aurora 2 task confirm the superiority of the proposed approach for acoustic feature compensation.
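The histogram equalization idea named in this abstract can be illustrated with a minimal sketch. It shows only plain per-dimension histogram equalization onto a standard normal reference; it is not the letter's class-based variant and does not include the SNR-dependent reconstruction step. The function name `histogram_equalize` and the choice of reference distribution are assumptions made for illustration.

```python
from statistics import NormalDist

import numpy as np


def histogram_equalize(features, reference=NormalDist(0.0, 1.0)):
    """Map each feature dimension onto the reference distribution.

    features: array of shape (n_frames, n_dims).
    reference: target distribution (an assumed standard normal here).
    """
    features = np.asarray(features, dtype=float)
    n_frames, n_dims = features.shape
    equalized = np.empty_like(features)
    for d in range(n_dims):
        # Rank of each frame within this dimension (0 = smallest value).
        ranks = np.argsort(np.argsort(features[:, d]))
        # Empirical CDF at rank midpoints, strictly inside (0, 1).
        cdf = (ranks + 0.5) / n_frames
        # Push the empirical CDF through the reference inverse CDF.
        equalized[:, d] = [reference.inv_cdf(float(p)) for p in cdf]
    return equalized
```

Because the mapping depends only on ranks, any monotonically increasing distortion of a feature dimension (for example a gain and offset) yields the same equalized output, which is the sense in which the technique reduces mismatch between clean and noisy features.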

Low-band Extension of CELP Speech Coder by Recovery of Harmonics

  • 박진수;최무열;김형순
    • 대한음성학회지:말소리
    • /
    • No. 49
    • /
    • pp.63-75
    • /
    • 2004
  • Most telephone speech transmitted over current public networks is band-limited to 0.3-3.4 kHz. Compared with wideband speech (0-8 kHz), narrowband speech lacks the low-band (0-0.3 kHz) and high-band (3.4-8 kHz) components of sound. As a result, the speech has reduced intelligibility, a muffled quality, and degraded speaker identification. Bandwidth extension is a technique for providing wideband speech quality by reconstructing the low-band and high-band components without any additional transmitted information. Our approach uses a harmonic synthesis method to reconstruct low-band speech from CELP-coded speech. A spectral distortion measurement and a listening test were used to assess the proposed method, and they verified an improvement in synthesized speech quality.

A Study on the Reconstruction of a Frame Based Speech Signal through Dictionary Learning and Adaptive Compressed Sensing

  • 정성문;임동민
    • 한국통신학회논문지
    • /
    • Vol. 37A, No. 12
    • /
    • pp.1122-1132
    • /
    • 2012
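  • Compressed sensing has been applied in many fields, including images, speech signals, and radar. It is mainly applied to signals whose statistical properties are time-invariant, and the reconstruction error grows as the number of measurements is reduced to raise the compression ratio. To address these problems, the speech signal was divided into frames and processed in parallel, dictionary learning was used to make the frames sparse, and adaptive compressed sensing was applied, in which the compressed sensing reconstruction matrix is built adaptively from the difference between the sparse coefficient vector and its reconstructed value. The results confirm that compressed sensing can reconstruct signals quickly and accurately even when their statistical properties are time-varying.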
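As a rough illustration of the compressed sensing step mentioned in this abstract, the sketch below recovers one sparse frame from fewer random measurements using orthogonal matching pursuit (OMP). OMP is a generic solver swapped in here for illustration; the paper's adaptive reconstruction matrix and dictionary learning are not reproduced. The frame is assumed to be sparse already, and the sizes `n`, `m`, `k` and the Gaussian measurement matrix `A` are illustrative assumptions.

```python
import numpy as np


def omp(A, y, sparsity):
    """Orthogonal matching pursuit: greedily find a sparse x with y ≈ A @ x."""
    residual = y.copy()
    support = []
    coeffs = np.zeros(0)
    for _ in range(sparsity):
        if np.linalg.norm(residual) < 1e-12:
            break  # measurements already explained; stop early
        # Pick the column most correlated with the current residual.
        support.append(int(np.argmax(np.abs(A.T @ residual))))
        # Least-squares fit on the chosen columns, then update the residual.
        coeffs, *_ = np.linalg.lstsq(A[:, support], y, rcond=None)
        residual = y - A[:, support] @ coeffs
    x_hat = np.zeros(A.shape[1])
    x_hat[support] = coeffs
    return x_hat


# Hypothetical single frame: length n, m measurements, k nonzero coefficients.
rng = np.random.default_rng(0)
n, m, k = 64, 32, 3
x = np.zeros(n)
x[rng.choice(n, size=k, replace=False)] = [1.0, -2.0, 1.5]
A = rng.normal(size=(m, n)) / np.sqrt(m)  # random measurement matrix
y = A @ x                                 # compressed measurements
x_hat = omp(A, y, k)
print("reconstruction error:", np.linalg.norm(x_hat - x))
```

In a frame-based setting such as the one described, a routine like this would be run independently on each frame, which is what makes parallel processing possible.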

Scalable High-quality Speech Reconstruction in Distributed Speech Recognition Environments

  • 윤재삼;김홍국;강병옥
    • 대한전자공학회:학술대회논문집
    • /
    • Proceedings of the 2007 Summer Conference of the Institute of Electronics Engineers of Korea (대한전자공학회)
    • /
    • pp.423-424
    • /
    • 2007
  • In this paper, we propose a scalable high-quality speech reconstruction method for distributed speech recognition (DSR). It is difficult to reconstruct high-quality speech from MFCCs at the DSR server. Depending on the bit-rate available to the DSR system, we can send additional information associated with speech coding to the DSR server, where the bit-rate varies from 4.8 kbit/s to 11.4 kbit/s. The experimental results show that, at a bit-rate of 11.4 kbit/s, the speech quality reproduced by the proposed method is comparable with that of ITU-T G.729 under both ideal channel and frame error channel conditions, while DSR performance is maintained at the level of wireline speech recognition.

A Study of Speech Coding for the Transmission on Network by the Wavelet Packets

  • 백한욱;정진현
    • 대한전기학회:학술대회논문집
    • /
    • Proceedings of the 2000 Summer Conference of the Korean Institute of Electrical Engineers (대한전기학회), Vol. D
    • /
    • pp.3028-3030
    • /
    • 2000
  • In general, speech coding is dedicated to compression performance or speech quality. The speech coding in this paper, however, focuses on transmission that adapts flexibly to the network speed. This requires subband coding, which here uses the wavelet packet concept from signal analysis. Extracting each frequency band is difficult with general signal analysis methods, and reconstructing the bands after coding each one is also a difficult problem. With the wavelet packet concept (perfect reconstruction) and its fast computation algorithm, however, both the extraction of each band and the reconstruction become more natural. This paper also describes a direct solution for voice transmission over a network and implements the algorithm in a TCP/IP network environment on a PC.
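The perfect reconstruction property of wavelet packets mentioned in this abstract can be demonstrated with a minimal sketch. It uses the Haar filter pair, which is an assumption chosen for brevity; the abstract does not say which wavelet the authors used, and no coding or network transmission is modeled. The function names are hypothetical, and the signal length is assumed to be divisible by 2 raised to the number of levels.

```python
import numpy as np


def haar_split(x):
    """One analysis step: split a signal into low-pass and high-pass halves."""
    even, odd = x[0::2], x[1::2]
    return (even + odd) / np.sqrt(2), (even - odd) / np.sqrt(2)


def haar_merge(low, high):
    """One synthesis step: exactly invert haar_split."""
    x = np.empty(2 * len(low))
    x[0::2] = (low + high) / np.sqrt(2)
    x[1::2] = (low - high) / np.sqrt(2)
    return x


def wavelet_packet_analysis(x, levels):
    """Full wavelet packet tree: split every band again at every level."""
    bands = [np.asarray(x, dtype=float)]
    for _ in range(levels):
        bands = [half for band in bands for half in haar_split(band)]
    return bands


def wavelet_packet_synthesis(bands):
    """Merge sibling bands pairwise until one full-length signal remains."""
    bands = list(bands)
    while len(bands) > 1:
        bands = [haar_merge(bands[i], bands[i + 1])
                 for i in range(0, len(bands), 2)]
    return bands[0]
```

With 3 levels, a 64-sample frame becomes 8 subbands of 8 samples each, and synthesizing all of them returns the original frame up to floating-point error. A rate-adaptive sender could, in principle, transmit only some of the subbands when the network is slow, at the cost of an approximate reconstruction.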

Wideband Speech Reconstruction Using Modular Neural Networks

  • 우동헌;고참한;강현민;정진희;김유신;김형순
    • 대한음성학회지:말소리
    • /
    • No. 48
    • /
    • pp.93-105
    • /
    • 2003
  • Because the telephone channel has band-limited frequency characteristics, a speech signal sent over it has degraded quality. In this paper, we propose an algorithm that uses neural networks to reconstruct wideband speech from its narrowband version. Although a single neural network is a good tool for direct mapping, it is difficult to train on vast and complicated data. To alleviate this problem, we modularize the neural networks based on an appropriate clustering of the acoustic space. We also introduce fuzzy computing to compensate for probable misclassification at the cluster boundaries. In our simulations, the proposed algorithm outperformed both the single neural network and the conventional codebook mapping method in objective and subjective evaluations.

Reconstruction of a Total Soft Palatal Defect Using a Folded Radial Forearm Free Flap and Palmaris Longus Tendon Sling

  • Lee, Myung-Chul;Lee, Dong-Won;Rah, Dong-Kyun;Lee, Won-Jai
    • Archives of Plastic Surgery
    • /
    • Vol. 39, No. 1
    • /
    • pp.25-30
    • /
    • 2012
  • Background: The soft palate functions as a valve and helps generate the oral pressure required for normal speech resonance. Speech problems and nasal regurgitation can result from a soft palatal defect. Reduction of the size of the velopharyngeal orifice is required to compensate for the lack of mobility in a reconstructed soft palate. We suggest a large volume folded free flap for reduction of the caliber and a palmaris longus tendon sling for suspension of the reconstructed palate. Methods: Six patients had total soft palate resection for tonsillar cancer and reconstruction with a large volume folded radial forearm free flap combined with a palmaris longus sling. A single surgeon and speech therapist evaluated the patients' speech using three standardized assessment tools: the nasometer test, the consonant articulation test, and the speech acuity test. Results: Mean nasalance score was 76.20% for sentences with nasal sounds and 43.60% for sentences with oral sounds. Hypernasality was seen for oral sound sentences. The mean score of the picture consonant articulation test was 84% (range, 63% to 100%). The mean score of the speech acuity test was 5.84 (range, 5 to 6). These mean ratings represent a satisfactory level of speech function. Conclusions: The large volume folded free flap with a palmaris longus tendon sling for total soft palate reconstruction resulted in a satisfactory prognosis for speech despite moderate hypernasality.

Palatoplasty with Reconstruction of Levator Sling (Preliminary Report)

  • 최시호
    • Journal of Yeungnam Medical Science
    • /
    • Vol. 7, No. 2
    • /
    • pp.49-54
    • /
    • 1990
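  • Palatoplasty with reconstruction of the levator sling was performed on 10 patients with cleft palate, and speech evaluation using an articulation assessment chart showed marked improvement afterward (mean score increase of 3.5). The 7 patients who underwent surgery between 12 and 18 months of age are under follow-up, with maxillary growth to be evaluated at age 4 and speech to be evaluated at age 6. The primary significance of this work lies in the clinical application of a new palatoplasty technique and in the attempt at accurate speech evaluation through the creation of an articulation assessment chart.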
