• Title/Abstract/Keywords: speech analysis

Search results: 1,587 items

상악 총의치 장착 환자 언어의 음향학적 특성 연구 (Acoustic Characteristics of Patients with Maxillary Complete Dentures)

  • 고석민;황병남
    • 음성과학 / Vol. 8, No. 4 / pp. 139-156 / 2001
  • Speech intelligibility in patients with complete dentures is an important clinical problem that depends on the prosthetic material used. The objective of this study was to investigate the speech of two edentulous subjects fitted with complete maxillary prostheses made of two different palatal materials: chrome-cobalt alloy and acrylic resin. Three patients with complete dentures in the experimental group and ten people in the control group participated in the experiment. CSL and Visi-Pitch were used to measure speech characteristics. The test words consisted of the simple vowel /e/, meaningless three-syllable words containing fricative, affricate, and stop sounds, and the sustained fricatives /s/ and /ʃ/. The analyzed speech parameters were vowel and lateral formants, VOT, sound durations, sound pressure level, and fricative frequency. Data analysis was conducted with a series of paired t-tests. The findings were as follows: (1) The first vowel formant of patients with complete dentures was higher than that of the control group (p<0.05), while the third lateral formant of patients with complete dentures was lower than that of the control group (p<0.01). (2) Patients with complete dentures produced lower speech intelligibility with a lower fricative frequency (/ʃ/) than the control group (p<0.0). The speech intelligibility of patients with a metal prosthesis was higher than that of those with a resin prosthesis (p<0.05). (3) Fricative, lateral, and stop sound durations of patients with complete dentures were longer than those of the control group (p<0.01 and p<0.05, respectively). Total sound durations of patients with a metal prosthesis were similar to those of the control group (p<0.05), while those with a resin prosthesis had shorter durations (p<0.01). This implies that patients with a metal prosthesis had higher speech intelligibility than those with a resin prosthesis. (4) Patients with complete dentures had higher sound pressure levels for /t/ and /c/ than the control group (p<0.01). However, the sound pressure level for /c/ of patients with a metal or resin prosthesis was similar to that of the control group (p<0.05). (5) Patients with complete dentures had a higher fundamental frequency than the control group (p<0.01).


구개인두부전증 환자의 한국어 음성 코퍼스 구축 방안 연구 (Research on Construction of the Korean Speech Corpus in Patient with Velopharyngeal Insufficiency)

  • 이지은;김욱은;김광현;성명훈;권택균
    • Korean Journal of Otorhinolaryngology-Head and Neck Surgery / Vol. 55, No. 8 / pp. 498-507 / 2012
  • Background and Objectives: We aimed to develop a Korean version of the velopharyngeal insufficiency (VPI) speech corpus system. Subjects and Method: After developing a 3-channel simultaneous recording device capable of recording nasal, oral, and normal compound speech separately, voice data were collected from VPI patients aged 10 years or older, with or without a history of operation or prior speech therapy. These were compared with a control group in which VPI was simulated by inserting a French-3 Nelaton tube via both nostrils through the nasopharynx and pulling the soft palate anteriorly to varying degrees. Three transcribers took part: a speech therapist transcribed the voice files into text, a second transcriber graded speech intelligibility and severity, and a third tagged the types and onset times of misarticulations. The database was composed of three main tables covering (1) the speaker's demographics, (2) the condition of the recording system, and (3) the transcripts. All of these were interfaced with the Praat voice analysis program, which enables the user to extract exactly the transcribed phrases for analysis. Results: In the simulated VPI group, the higher the severity of VPI, the higher the nasalance score obtained. In addition, we could verify the vocal energy that characterizes hypernasality and compensation in the nasal, oral, and compound sounds spoken by VPI patients, as opposed to that which characterizes the normal control group. Conclusion: With the Korean version of the VPI speech corpus system, patients' common difficulties and speech tendencies in articulation can be objectively evaluated. By comparing these data with those of normal voices, the mispronunciations and disordered articulation of patients with VPI can be corrected.
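
The nasalance score reported in the Results can be sketched as the ratio of nasal-channel acoustic energy to total (nasal plus oral) energy, expressed as a percentage. The function below is a minimal illustration under that energy-based definition; the array inputs and the silence handling are assumptions for the sketch, not the paper's exact computation.

```python
import numpy as np

def nasalance_score(nasal: np.ndarray, oral: np.ndarray) -> float:
    """Percent nasalance: nasal energy over total (nasal + oral) energy.

    `nasal` and `oral` are time-aligned sample arrays from the separate
    nasal and oral channels of a multi-channel recording.
    """
    nasal_energy = float(np.sum(nasal.astype(np.float64) ** 2))
    oral_energy = float(np.sum(oral.astype(np.float64) ** 2))
    total = nasal_energy + oral_energy
    if total == 0.0:
        return 0.0  # pure silence: nasalance is undefined, report 0
    return 100.0 * nasal_energy / total

# Toy example: a strongly nasal segment scores high
nasal = np.array([0.5, -0.5, 0.4, -0.4])
oral = np.array([0.1, -0.1, 0.1, -0.1])
print(round(nasalance_score(nasal, oral), 1))  # → 95.3
```

Higher simulated VPI severity would drive this ratio up, matching the trend the authors report.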

경직형 뇌성마비 아동의 하위그룹별 말속도와 쉼의 특성 및 말명료도와의 관계 (Characteristics of speech rate and pause in children with spastic cerebral palsy and their relationships with speech intelligibility)

  • 정필연;심현섭
    • 말소리와 음성과학 / Vol. 12, No. 3 / pp. 95-103 / 2020
  • The purpose of this study was to examine whether subgroups of children with spastic cerebral palsy differ in speech rate and pause characteristics, and to investigate their relationships with speech intelligibility. Twenty-six children with spastic cerebral palsy participated: 4 in the NSMI-LCT group (no speech or language problems), 6 in the NSMI-LCI group (no speech problems but with language problems), 6 in the SMI-LCT group (speech problems without language problems), and 10 in the SMI-LCI group (both speech and language problems). The task was sentence repetition, and speech rate, articulation rate, pause-time ratio, mean number of pauses, and mean pause duration were measured with Praat. The results were as follows. First, speech rate and articulation rate differed significantly between the NSMI and SMI groups regardless of the presence of language problems. Second, compared with the NSMI groups, the SMI groups showed a higher pause-time ratio, more frequent pauses, and longer pause durations. Third, speech rate and articulation rate correlated significantly with speech intelligibility. These results suggest that slow speech rate is a key characteristic of speech production in the SMI groups and that articulation rate and speech rate play an important role in speech intelligibility.
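
The rate and pause measures used in such studies (speech rate, articulation rate, pause-time ratio, pause count, mean pause duration) can be computed from a syllable count, an utterance duration, and a list of pause durations. The helper below is a hypothetical sketch of those definitions, not the authors' Praat script.

```python
def rate_and_pause_metrics(total_syllables, utterance_dur_s, pauses_s):
    """Compute common rate/pause measures.

    total_syllables -- syllables produced in the utterance
    utterance_dur_s -- total utterance duration in seconds (incl. pauses)
    pauses_s        -- list of individual pause durations in seconds
    """
    pause_time = sum(pauses_s)
    return {
        # speech rate counts pauses in the denominator ...
        "speech_rate": total_syllables / utterance_dur_s,
        # ... articulation rate excludes them
        "articulation_rate": total_syllables / (utterance_dur_s - pause_time),
        "pause_ratio": pause_time / utterance_dur_s,
        "num_pauses": len(pauses_s),
        "mean_pause": pause_time / len(pauses_s) if pauses_s else 0.0,
    }

# 20 syllables over 10 s with three pauses totalling 2 s
m = rate_and_pause_metrics(20, 10.0, [0.5, 1.0, 0.5])
print(m["speech_rate"], m["articulation_rate"])  # → 2.0 2.5
```

A slower speech rate with an unchanged articulation rate would indicate that pausing, rather than slow articulation, drives the difference.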

A Study on Image Recommendation System based on Speech Emotion Information

  • Kim, Tae Yeun;Bae, Sang Hyun
    • 통합자연과학논문집 / Vol. 11, No. 3 / pp. 131-138 / 2018
  • In this paper, we implemented an image matching and recommendation system that utilizes the emotional information in the user's speech. To classify the emotional information of speech, emotional features are extracted from the user's speech and classified using the PLP algorithm, after which an emotional speech DB is constructed. To classify the emotional information of images, emotional colors and emotional vocabulary obtained through factor analysis are matched into one space. A standardized image recommendation system then matches each keyword using the BM-GA algorithm on the speech-emotion and image-emotion data, according to the emotional information of the user's speech. In the performance evaluation, the recognition rate of standardized vocabulary across four stages of speech was 80.48% on average, and user satisfaction with the system was 82.4%. Therefore, classifying images according to the user's speech information is expected to be helpful for the study of emotional exchange between users and computers.

강인한 음성인식을 위한 이중모드 센서의 결합방식에 관한 연구 (A Study on Combining Bimodal Sensors for Robust Speech Recognition)

  • 이철우;계영철;고인선
    • 한국음향학회지 / Vol. 20, No. 6 / pp. 51-56 / 2001
  • Recently, methods that use lip movements together with the speech signal have been actively studied to make speech recognition reliable in noisy environments. With the same aim, this paper proposes a method that combines the outputs of a visual speech recognizer and an acoustic speech recognizer with respective weights. In particular, we propose a method for determining the weights automatically according to the noise level of the input speech, using the correlation between input samples and the residual error of LPC analysis. Simulation results show that the combined recognizer achieves about 83% recognition accuracy even in severely noisy environments.
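
The weighted combination described here can be sketched as a per-class convex combination of the two recognizers' scores, with the audio weight lowered as noise increases. The snippet below is a minimal late-fusion sketch; the score lists and the weight clamping are illustrative assumptions, and the paper's correlation/LPC-residual weight estimation is not reproduced.

```python
def combine_scores(audio_scores, visual_scores, audio_weight):
    """Late fusion: per-class weighted sum of audio and visual scores.

    audio_weight in [0, 1]; it would be reduced automatically as the
    estimated noise level of the input speech rises.
    """
    w = min(max(audio_weight, 0.0), 1.0)  # clamp to [0, 1]
    return [w * a + (1.0 - w) * v for a, v in zip(audio_scores, visual_scores)]

def best_class(audio_scores, visual_scores, audio_weight):
    """Index of the class with the highest fused score."""
    fused = combine_scores(audio_scores, visual_scores, audio_weight)
    return max(range(len(fused)), key=fused.__getitem__)

# Audio favors class 0, vision favors class 1: under heavy noise
# (low audio weight) the visual evidence dominates.
print(best_class([0.9, 0.1], [0.2, 0.8], 0.2))  # → 1
print(best_class([0.9, 0.1], [0.2, 0.8], 0.9))  # → 0
```

The interesting part of the paper is precisely how `audio_weight` is estimated from the signal itself rather than fixed by hand.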


목소리 특성의 청취 평가에 기초한 사상체질과 음성 특징의 상관관계 분석 (Analysis of the Relationship Between Sasang Constitutional Groups and Speech Features Based on a Listening Evaluation of Voice Characteristics)

  • 권철홍;김종열;김근호;장준수
    • 말소리와 음성과학 / Vol. 4, No. 4 / pp. 71-77 / 2012
  • Sasang constitution experts utilize voice characteristics as an auxiliary measure for deciding a person's constitutional group. This study aims at establishing a relationship between speech features and the constitutional groups through a subjective listening evaluation of voice characteristics. A speech database of 841 speakers whose constitutional groups had already been diagnosed by Sasang constitution experts was constructed. Speech features related to the speech source and the vocal tract filter were extracted from five vowels and one sentence. Speech features statistically significant for classifying the groups were analyzed using SPSS. The features that contributed to constitution classification were speaking rate, Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 20s; F0_mean, CPP, SPI, HNR, Shimmer, Energy, A1, A2, A3, H1, H2, and H4 for females in their 20s; Energy, A1, A2, A3, H1, H2, H4, and CPP for males in their 60s; and Jitter, HNR, CPP, and SPI for females in their 60s. Experimental results show that speech technology is useful in classifying constitutional groups.

배경잡음을 고려한 가변임계값 Dual Rate ADPCM 음성 CODEC 구현 (Implementation of Variable Threshold Dual Rate ADPCM Speech CODEC Considering the Background Noise)

  • 양재석;한경호
    • 대한전기학회:학술대회논문집 / 대한전기학회 2000년도 하계학술대회 논문집 D / pp. 3166-3168 / 2000
  • This paper proposes a variable-threshold dual-rate ADPCM coding method, modified from the standard ITU G.726 ADPCM, for speech quality improvement. The speech quality of variable-threshold dual-rate ADPCM is better than that of single-rate ADPCM in noisy environments, without an increase in complexity, through the use of the zero crossing rate (ZCR). The ZCR is used to divide the input signal samples into two categories: noise and speech. Samples with a higher ZCR are categorized as the noise region, and samples with a lower ZCR as the speech region. The noise region uses the higher threshold value and is compressed at 16 kbps for a reduced bit rate, while the speech region uses the lower threshold value and is compressed at 40 kbps for improved speech quality. Compared with conventional ADPCM, which adopts a fixed coding rate, the proposed variable-threshold dual-rate ADPCM coding method improves noise characteristics without increasing the bit rate. For real-time applications, the ZCR calculation was considered as a simple way to obtain background-noise information for the preprocessing of speech analysis such as FFT, and the experiment showed that this simple ZCR calculation can be used without an increase in complexity. Dual-rate ADPCM can efficiently decrease the amount of transferred data without increasing complexity or reducing speech quality. Therefore, the results of this paper can be applied to real-time speech applications such as Internet phones or VoIP.
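
The ZCR-based rate switch can be sketched as follows: count sign changes in a frame, and code high-ZCR (noise-like) frames at the low rate and low-ZCR (speech-like) frames at the high rate. The frame representation and the 0.3 threshold below are illustrative assumptions, not values from the paper.

```python
def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum(
        1 for a, b in zip(frame, frame[1:]) if (a >= 0) != (b >= 0)
    )
    return crossings / (len(frame) - 1)

def select_rate(frame, zcr_threshold=0.3):
    """Pick the ADPCM bit rate (kbps) for a frame.

    High-ZCR frames are treated as noise-like and coded at 16 kbps to
    save bits; low-ZCR frames as speech and coded at 40 kbps for quality.
    """
    return 16 if zero_crossing_rate(frame) > zcr_threshold else 40

print(select_rate([1, -1, 1, -1, 1]))  # rapidly alternating → 16
print(select_rate([1, 2, 3, 2, 1]))    # slowly varying     → 40
```

In a real codec the decision would be smoothed over frames to avoid rapid rate toggling, but the per-frame logic is this simple, which is why the method adds essentially no complexity.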


정상 성인의 음도, 비성도, 음질 간의 상관 연구 (A Correlation Study among Pitch, Nasalance, and Voice Quality)

  • 박성종;유재연
    • 말소리와 음성과학 / Vol. 1, No. 4 / pp. 159-163 / 2009
  • The purpose of this study was to conduct a correlational analysis among pitch, nasalance, and acoustic voice quality parameters estimated by two speech analysis software packages, NasalView (version 1.31) and Dr. Speech 4.5 (Tiger Electronics). Thirty females and 25 males with normal voices participated in the study. The Pearson correlation coefficient was determined through statistical analysis. The results were as follows. First, there was a correlation between F0 and the voice quality parameters, but no correlation between F0 and nasalance. Second, nasalance showed a correlation with the voice quality parameters.
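
The Pearson coefficient used in the analysis is the standard product-moment formula, r = cov(x, y) / (σx σy). A self-contained sketch, with made-up toy inputs:

```python
import math

def pearson_r(x, y):
    """Pearson product-moment correlation coefficient of two sequences."""
    n = len(x)
    mx = sum(x) / n
    my = sum(y) / n
    cov = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sy = math.sqrt(sum((b - my) ** 2 for b in y))
    return cov / (sx * sy)

# Perfectly linear relationships give r = ±1
print(pearson_r([1, 2, 3], [2, 4, 6]))  # → 1.0
print(pearson_r([1, 2, 3], [6, 4, 2]))  # → -1.0
```

Applied per speaker to, say, F0 versus nasalance, an r near zero would correspond to the study's first finding.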


DHMM과 어휘해석을 이용한 Voice dialing 시스템 (The Voice Dialing System Using Dynamic Hidden Markov Models and Lexical Analysis)

  • 최성호;이강성;김순협
    • 전자공학회논문지B / Vol. 28B, No. 7 / pp. 548-556 / 1991
  • In this paper, Korean spoken continuous digits are recognized using DHMM (Dynamic Hidden Markov Models) and lexical analysis to provide a basis for developing a voice dialing system. Speech is segmented by phoneme unit and then recognized. The system consists of a segmentation section, a standard-speech design section, a recognition section, and a lexical analysis section. In the segmentation section, the speech is segmented using the ZCR, the 0th-order LPC cepstrum, and the Ai parameter for voiced-speech detection, which varies over time. In the standard-speech design section, 19 phonemes or syllables are trained by DHMM and designed as the standard speech. In the recognition section, phoneme streams are recognized by the Viterbi algorithm. In the lexical decoder section, the finally recognized continuous digits are output. The experiment showed a recognition rate of 85.1% on data consisting of 21 classes of 7-continuous-digit strings combining all occurrences, spoken 7 times each by 10 men.
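
The recognition section's decoding step can be illustrated with a generic discrete-HMM Viterbi sketch. The toy two-state model below is purely an assumption for illustration; the paper's DHMMs over 19 phoneme/syllable units and its lexical decoder are not reproduced.

```python
def viterbi(obs, states, start_p, trans_p, emit_p):
    """Most likely state sequence for `obs` under a discrete HMM."""
    # V[t][s] = (best probability of reaching s at time t, predecessor)
    V = [{s: (start_p[s] * emit_p[s][obs[0]], None) for s in states}]
    for t in range(1, len(obs)):
        V.append({})
        for s in states:
            prob, prev = max(
                (V[t - 1][p][0] * trans_p[p][s] * emit_p[s][obs[t]], p)
                for p in states
            )
            V[t][s] = (prob, prev)
    # Backtrack from the best final state
    last = max(states, key=lambda s: V[-1][s][0])
    path = [last]
    for t in range(len(obs) - 1, 0, -1):
        path.append(V[t][path[-1]][1])
    return list(reversed(path))

# Toy model: state "A" tends to emit "a", state "B" tends to emit "b"
states = ["A", "B"]
start = {"A": 0.5, "B": 0.5}
trans = {"A": {"A": 0.5, "B": 0.5}, "B": {"A": 0.5, "B": 0.5}}
emit = {"A": {"a": 0.9, "b": 0.1}, "B": {"a": 0.1, "b": 0.9}}
print(viterbi(["a", "b", "a"], states, start, trans, emit))  # → ['A', 'B', 'A']
```

In the paper's pipeline the decoded phoneme stream would then be passed to the lexical decoder, which maps it to digit strings.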


3세에서 8세 아동의 용언 발달 연구 (A Study on the Development of Inflected Words of Korean based on the analysis of 3 to 8 year-old Children's speech)

  • 최은아;신지영;김수진
    • 대한음성학회:학술대회논문집 / 대한음성학회 2003년도 10월 학술대회지 / pp. 89-93 / 2003
  • The aim of this paper is to investigate the development of inflected words in Korean based on an analysis of the spontaneous speech of 3- to 8-year-old children. For this purpose, the authors transcribed the spontaneous speech of 10 Korean children per age group and classified the inflected words. The results of the analysis are as follows: (1) Among the verbs, simple words account for 62%, derived words for 18%, and complex words for 20%; among the adjectives, simple words account for 82%, derived words for 7%, and complex words for 11%. (2) As the children get older, derived and complex words increase while simple words decrease. (3) Children begin to develop word-formation ability at age 4, and by age 8 they have almost completely acquired it.
