• Title/Summary/Keyword: Non-speech

Search Result 470, Processing Time 0.024 seconds

Acoustic-Phonetic Phenotypes in Pediatric Speech Disorders;An Interdisciplinary Approach

  • Bunnell, H. Timothy
    • Proceedings of the KSPS conference
    • /
    • 2006.11a
    • /
    • pp.31-36
    • /
    • 2006
  • Research in the Center for Pediatric Auditory and Speech Sciences (CPASS) is attempting to characterize or phenotype children with speech delays based on acoustic-phonetic evidence and relate those phenotypes to chromosome loci believed to be related to language and speech. To achieve this goal we have adopted a highly interdisciplinary approach that merges fields as diverse as automatic speech recognition, human genetics, neuroscience, epidemiology, and speech-language pathology. In this presentation I will trace the background of this project, and the rationale for our approach. Analyses based on a large amount of speech recorded from 18 children with speech delays will be presented to illustrate the approach we will be taking to characterize the acoustic phonetic properties of disordered speech in young children. The ultimate goal of our work is to develop non-invasive and objective measures of speech development that can be used to better identify which children with apparent speech delays are most in need of, or would receive the most benefit from the delivery of therapeutic services.

  • PDF

Robust Speech Hash Function

  • Chen, Ning;Wan, Wanggen
    • ETRI Journal
    • /
    • v.32 no.2
    • /
    • pp.345-347
    • /
    • 2010
  • In this letter, we present a new speech hash function based on the non-negative matrix factorization (NMF) of linear prediction coefficients (LPCs). First, linear prediction analysis is applied to the speech to obtain its LPCs, which represent the frequency shaping attributes of the vocal tract. Then, the NMF is performed on the LPCs to capture the speech's local feature, which is then used for hash vector generation. Experimental results demonstrate the effectiveness of the proposed hash function in terms of discrimination and robustness against various types of content preserving signal processing manipulations.

Speech Rhythm Metrics for Automatic Scoring of English Speech by Korean EFL Learners

  • Jang, Tae-Yeoub
    • MALSORI
    • /
    • no.66
    • /
    • pp.41-59
    • /
    • 2008
  • Knowledge in linguistic rhythm of the target language plays a major role in foreign language proficiency. This study attempts to discover valid rhythm features that can be utilized in automatic assessment of non-native English pronunciation. Eight previously proposed and two novel rhythm metrics are investigated with 360 English read speech tokens obtained from 27 Korean learners and 9 native speakers. It is found that some of the speech-rate normalized interval measures and above-word level metrics are effective enough to be further applied for automatic scoring as they are significantly correlated with speakers' proficiency levels. It is also shown that metrics need to be dynamically selected depending upon the structure of target sentences. Results from a preliminary auto-scoring experiment through a Multi Regression analysis suggest that appropriate control of unexpected input utterances is also desirable for better performance.

  • PDF

Perceptual Speech Assessment after Maxillary Advancement Osteotomy in Patients with a Repaired Cleft Lip and Palate

  • Kim, Seok-Kwun;Kim, Ju-Chan;Moon, Ju-Bong;Lee, Keun-Cheol
    • Archives of Plastic Surgery
    • /
    • v.39 no.3
    • /
    • pp.198-202
    • /
    • 2012
  • Background : Maxillary hypoplasia refers to a deficiency in the growth of the maxilla commonly seen in patients with a repaired cleft palate. Those who develop maxillary hypoplasia can be offered a repositioning of the maxilla to a functional and esthetic position. Velopharyngeal dysfunction is one of the important problems affecting speech after maxillary advancement surgery. The aim of this study was to investigate the impact of maxillary advancement on repaired cleft palate patients without preoperative deterioration in speech compared with non-cleft palate patients. Methods : Eighteen patients underwent Le Fort I osteotomy between 2005 and 2011. One patient was excluded due to preoperative deterioration in speech. Eight repaired cleft palate patients belonged to group A, and 9 non-cleft palate patients belonged to group B. Speech assessments were performed preoperatively and postoperatively by using a speech screening protocol that consisted of a list of single words designed by Ok-Ran Jung. Wilcoxon signed rank test was used to determine if there were significant differences between the preoperative and postoperative outcomes in each group A and B. And Mann-Whitney U test was used to determine if there were significant differences in the change of score between groups A and B. Results : No patients had any noticeable change in speech production on perceptual assessment after maxillary advancement in our study. Furthermore, there were no significant differences between groups A and B. Conclusions : Repaired cleft palate patients without preoperative velopharyngeal dysfunction would not have greater risk of deterioration of velopharyngeal function after maxillary advancement compared to non-cleft palate patients.

Qualitative Classification of Voice Quality of Normal Speech and Derivation of its Correlation with Speech Features (정상 음성의 목소리 특성의 정성적 분류와 음성 특징과의 상관관계 도출)

  • Kim, Jungin;Kwon, Chulhong
    • Phonetics and Speech Sciences
    • /
    • v.6 no.1
    • /
    • pp.71-76
    • /
    • 2014
  • In this paper voice quality of normal speech is qualitatively classified by five components of breathy, creaky, rough, nasal, and thin/thick voice. To determine whether a correlation exists between a subjective measure of voice and an objective measure of voice, each voice is perceptually evaluated using the 1/2/3 scale by speech processing specialists and acoustically analyzed using speech analysis tools such as the Praat, MDVP, and VoiceSauce. The speech parameters include features related to speech source and vocal tract filter. Statistical analysis uses a two-independent-samples non-parametric test. Experimental results show that statistical analysis identified a significant correlation between the speech feature parameters and the components of voice quality.

A Variable Data Rate Speech Coding Technique Based on the Inflection Point Detection of Speech (음성의 변곡점 추출 및 전송에 기반한 가변 데이터율 음성 부호화 기법)

  • Iem, Byeong-Gwan
    • The Transactions of The Korean Institute of Electrical Engineers
    • /
    • v.62 no.4
    • /
    • pp.562-565
    • /
    • 2013
  • A new variable rate speech coding technique is proposed. The method is based on the observation that the speech signal approximately looks linear for a very short period of time. The information transmitted is the location and data value of inflection points. If the distance between the inflection points is large, the mid point location and its data value are also delivered. Thus, the encoder transmits both the location and the data value for the inflection samples, but the location only for the non-inflection points. The location information is expressed using one bit for each sample, 0 for non-inflection and 1 for inflection point. At the receiver, using the interpolation, the decoder estimates the untransmitted sample values for non-inflection locations from the received sample values for the inflection samples. With 50 % of computational cost of the existing CVSD delta modulation, the proposed method is expected to achieve the data rate of 36 to 38 kbps and the SNR of 10 to 13 dB.

Speech Enhancement for Voice commander in Car environment (차량환경에서 음성명령어기 사용을 위한 음성개선방법)

  • 백승권;한민수;남승현;이봉호;함영권
    • Journal of Broadcast Engineering
    • /
    • v.9 no.1
    • /
    • pp.9-16
    • /
    • 2004
  • In this paper, we present a speech enhancement method as a pre-processor for voice commander under car environment. For the friendly and safe use of voice commander in a running car, non-stationary audio signals such as music and non-candidate speech should be reduced. Ow technique is a two microphone-based one. It consists of two parts Blind Source Separation (BSS) and Kalman filtering. Firstly, BSS is operated as a spatial filter to deal with non-stationary signals and then car noise is reduced by kalman filtering as a temporal filter. Algorithm Performance is tested for speech recognition. And the results show that our two microphone-based technique can be a good candidate to a voice commander.

Single-Channel Non-Causal Speech Enhancement to Suppress Reverberation and Background Noise

  • Song, Myung-Suk;Kang, Hong-Goo
    • The Journal of the Acoustical Society of Korea
    • /
    • v.31 no.8
    • /
    • pp.487-506
    • /
    • 2012
  • This paper proposes a speech enhancement algorithm to improve the speech intelligibility by suppressing both reverberation and background noise. The algorithm adopts a non-causal single-channel minimum variance distortionless response (MVDR) filter to exploit an additional information that is included in the noisy-reverberant signals in subsequent frames. The noisy-reverberant signals are decomposed into the parts of the desired signal and the interference that is not correlated to the desired signal. Then, the filter equation is derived based on the MVDR criterion to minimize the residual interference without bringing speech distortion. The estimation of the correlation parameter, which plays an important role to determine the overall performance of the system, is mathematically derived based on the general statistical reverberation model. Furthermore, the practical implementation methods to estimate sub-parameters required to estimate the correlation parameter are developed. The efficiency of the proposed enhancement algorithm is verified by performance evaluation. From the results, the proposed algorithm achieves significant performance improvement in all studied conditions and shows the superiority especially for the severely noisy and strongly reverberant environment.

Parts-based Feature Extraction of Speech Spectrum Using Non-Negative Matrix Factorization (Non-Negative Matrix Factorization을 이용한 음성 스펙트럼의 부분 특징 추출)

  • 박정원;김창근;허강인
    • Proceedings of the IEEK Conference
    • /
    • 2003.11a
    • /
    • pp.49-52
    • /
    • 2003
  • In this paper, we propose new speech feature parameter using NMf(Non-Negative Matrix Factorization). NMF can represent multi-dimensional data based on effective dimensional reduction through matrix factorization under the non-negativity constraint, and reduced data present parts-based features of input data. In this paper, we verify about usefulness of NMF algorithm for speech feature extraction applying feature parameter that is got using NMF in Mel-scaled filter bank output. According to recognition experiment result, we could confirm that proposal feature parameter is superior in recognition performance than MFCC(mel frequency cepstral coefficient) that is used generally.

  • PDF

Intra-and Inter-frame Features for Automatic Speech Recognition

  • Lee, Sung Joo;Kang, Byung Ok;Chung, Hoon;Lee, Yunkeun
    • ETRI Journal
    • /
    • v.36 no.3
    • /
    • pp.514-517
    • /
    • 2014
  • In this paper, alternative dynamic features for speech recognition are proposed. The goal of this work is to improve speech recognition accuracy by deriving the representation of distinctive dynamic characteristics from a speech spectrum. This work was inspired by two temporal dynamics of a speech signal. One is the highly non-stationary nature of speech, and the other is the inter-frame change of a speech spectrum. We adopt the use of a sub-frame spectrum analyzer to capture very rapid spectral changes within a speech analysis frame. In addition, we attempt to measure spectral fluctuations of a more complex manner as opposed to traditional dynamic features such as delta or double-delta. To evaluate the proposed features, speech recognition tests over smartphone environments were conducted. The experimental results show that the feature streams simply combined with the proposed features are effective for an improvement in the recognition accuracy of a hidden Markov model-based speech recognizer.