• Title/Abstract/Keywords: speech factors

Search results: 352 (processing time: 0.026 s)

The Interlanguage Speech Intelligibility Benefit for Listeners (ISIB-L): The Case of English Liquids

  • Lee, Joo-Kyeong;Xue, Xiaojiao
    • 말소리와 음성과학 / Vol. 3, No. 1 / pp.51-65 / 2011
  • This study investigates the interlanguage speech intelligibility benefit for listeners (ISIB-L), examining Chinese talkers' production of English liquids and its perception by native listeners and by non-native Chinese and Korean listeners. An Accent Judgment Task was conducted to measure the non-native talkers' and listeners' phonological proficiency, and two proficiency groups (high and low) participated in the experiment. The English liquids /l/ and /r/ produced by the Chinese talkers were examined by syllable position (initial and final), context (segment, word, and sentence), and lexical density (minimal vs. non-minimal pair) to see whether these factors play a role in ISIB-L. Results showed that both matched and mismatched ISIB-L occurred for all targets except initial /l/. Non-native Chinese and Korean listeners, though only those with high proficiency, were more accurate at identifying initial /r/, final /l/, and final /r/, whereas initial /l/ was significantly more intelligible to native listeners than to non-native listeners. There was evidence of context and lexical density effects on ISIB-L. No ISIB-L was demonstrated in the sentence context, but both matched and mismatched ISIB-L were observed in the word context; this finding held true only for high-proficiency listeners. Listeners recognized the targets better in the non-minimal pair (sparse neighborhood) environment than in the minimal pair (dense neighborhood) environment. These findings suggest that ISIB-L for English liquids is influenced by talkers' and listeners' proficiency, syllable position in association with the L1 and L2 phonological structures, context, and word neighborhood density.

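Intelligibility in ISIB-L comparisons like the one above typically reduces to the proportion of correctly identified targets within each talker/listener/condition cell. A minimal tabulation sketch in Python is shown below; the data frame and its column names (listener_group, position, target, correct) are hypothetical and only illustrate the kind of breakdown the study reports.

```python
# Hypothetical tabulation of identification responses for an ISIB-L analysis.
# The data frame and its column names are illustrative, not the study's files.
import pandas as pd

responses = pd.DataFrame({
    "listener_group": ["native", "chinese_high", "korean_high", "chinese_high"],
    "position":       ["initial", "initial", "final", "final"],
    "target":         ["l", "r", "l", "r"],
    "correct":        [1, 0, 1, 1],   # 1 = liquid identified correctly
})

# Intelligibility = proportion of correct identifications per cell.
intelligibility = (
    responses
    .groupby(["listener_group", "position", "target"])["correct"]
    .mean()
    .rename("prop_correct")
)
print(intelligibility)
```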

The Role of Cognitive Control in Tinnitus and Its Relation to Speech-in-Noise Performance

  • Tai, Yihsin;Husain, Fatima T.
    • Journal of Audiology & Otology / Vol. 23, No. 1 / pp.1-7 / 2019
  • Self-reported difficulties in speech-in-noise (SiN) recognition are common among tinnitus patients. Although the hearing impairment that usually co-occurs with tinnitus can explain such difficulties, recent studies suggest that tinnitus patients with normal hearing sensitivity still show decreased SiN understanding, indicating that SiN difficulties cannot be attributed solely to changes in hearing sensitivity. In fact, cognitive control, which refers to a variety of top-down processes that human beings use to complete their daily tasks, has been shown to be critical for SiN recognition, as well as key to understanding the cognitive inefficiencies caused by tinnitus. In this article, we review studies investigating the association between tinnitus and cognitive control using behavioral and brain-imaging assessments, as well as those examining the effect of tinnitus on SiN recognition. In addition, three factors that can affect cognitive control in tinnitus patients, namely hearing sensitivity, age, and severity of tinnitus, are discussed to elucidate the association among tinnitus, cognitive control, and SiN recognition. Although a central or cognitive involvement has long been postulated in the SiN impairments observed in tinnitus patients, there is as yet no direct evidence to underpin this assumption, as few studies have addressed both SiN performance and cognitive control in a single tinnitus cohort. Future studies should aim at combining SiN tests with various subjective and objective measures of cognitive performance to better understand the relationship between SiN difficulties and cognitive control in tinnitus patients.

영어와 한국어 자연발화 음성 코퍼스에서의 무성 파열음 연구 (A study on the voiceless plosives from the English and Korean spontaneous speech corpus)

  • 윤규철
    • 말소리와 음성과학 / Vol. 11, No. 4 / pp.45-53 / 2019
  • The purpose of this paper is to examine, using spontaneous speech corpora, the factors that influence the determination of the place of articulation of the English voiceless plosives [p, t, k] and the Korean aspirated plosives [ph, th, kh]. The factors were extracted automatically with a Praat script, and the prediction accuracy for the voiceless plosives was computed through discriminant analysis while gradually increasing the number of factors. The factors used in the analysis were the spectral moments and spectral tilt of the release burst, of the post-release aspiration, and of the vowel onset, the closure duration and VOT, the position within the word and the utterance, and finally the identity of the immediately following vowel. Prediction accuracy peaked when the number of factors reached five, at 74.6% for English and 66.4% for Korean. However, four factors were sufficient to reach what was effectively the maximum: the spectral moments and spectral tilt of the release burst and of the immediately following vowel, the closure duration, and VOT. This suggests that the place of articulation of a voiceless plosive is determined jointly by factors internal to the plosive itself and by the immediately following vowel.
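
The procedure the abstract describes, adding acoustic factors one at a time and measuring how well the place of articulation can be predicted, corresponds to a discriminant analysis over growing feature sets. The sketch below illustrates this with scikit-learn's LinearDiscriminantAnalysis and cross-validation; the file name and feature column names are assumptions, and the statistical setup is not claimed to match the authors' exact analysis.

```python
# Sketch: predict place of articulation (p/t/k) from a growing set of
# acoustic factors. The CSV layout and feature names are assumptions.
import pandas as pd
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.model_selection import cross_val_score

data = pd.read_csv("plosive_measurements.csv")       # hypothetical file
factors = ["burst_moments", "burst_tilt", "vowel_moments", "vowel_tilt",
           "closure_dur", "vot", "word_position"]     # hypothetical columns
y = data["place"]                                     # p / t / k labels

# Add one factor at a time and track cross-validated prediction accuracy,
# mirroring the paper's "increase the number of factors" design.
for k in range(1, len(factors) + 1):
    X = data[factors[:k]]
    acc = cross_val_score(LinearDiscriminantAnalysis(), X, y, cv=5).mean()
    print(f"{k} factors: {acc:.1%}")
```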

Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • 대한음성학회지:말소리 / No. 41 / pp.83-96 / 2001
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this does not hold for all stretches of spontaneous speech that are marked as sentences in transcription. The present paper examines the intonation patterns appearing at the end of declarative sentences in oral presentations and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure are found to account for the non-occurrence of falling intonation at sentence boundaries.

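A simple way to operationalize "falling vs. non-falling intonation at a sentence boundary" is to look at the F0 contour over the final portion of the sentence and test whether it drops by more than some threshold. The sketch below does this with parselmouth (a Python interface to Praat); the file name, boundary time, 300 ms window, and 1-semitone threshold are illustrative placeholders, not values taken from the paper.

```python
# Rough test for "falling vs. non-falling" intonation at a sentence boundary.
# Window length and semitone threshold are illustrative, not from the paper.
import numpy as np
import parselmouth  # Python interface to Praat

def final_contour_is_falling(wav_path, sent_end, window=0.3, drop_st=1.0):
    snd = parselmouth.Sound(wav_path)
    pitch = snd.to_pitch()
    t = pitch.xs()
    f0 = pitch.selected_array["frequency"]
    # Keep voiced F0 samples in the last `window` seconds before the boundary.
    mask = (t >= sent_end - window) & (t <= sent_end) & (f0 > 0)
    if mask.sum() < 2:
        return None                                  # too little voiced material
    st = 12 * np.log2(f0[mask] / f0[mask][0])        # contour in semitones
    return bool(st[-1] - st[0] < -drop_st)           # True if F0 falls enough

print(final_contour_is_falling("presentation.wav", sent_end=12.84))
```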

Prosodic Features at "Sentence Boundaries" in Oral Presentations

  • Umesaki, Atsuko-Furuta
    • 대한음성학회:학술대회논문집 / 대한음성학회 2000년도 7월 학술대회지 / pp.149-164 / 2000
  • It is generally said that falling intonation is used at the end of a declarative sentence. However, this does not hold for all stretches of spontaneous speech that are marked as sentences in transcription. The present paper examines the intonation patterns appearing at the end of declarative sentences in oral presentations and discusses instances where falling intonation does not appear. The texts used for analysis are eight oral presentations collected at international conferences in the field of physics. Quantitative and qualitative analyses are carried out. Three major factors related to discourse structure are found to account for the non-occurrence of falling intonation at sentence boundaries.


퍼지 로직을 이용한 감정인식 모델설계 (Design of an Emotion Recognition Model Using Fuzzy Logic)

  • 김이곤;배영철
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 2000년도 춘계학술대회 학술발표 논문집 / pp.268-282 / 2000
  • Speech is one of the most efficient communication media, and it carries several kinds of information about the speaker, the context, emotion, and so on. Human emotion is expressed in speech, gestures, and physiological phenomena (breathing, pulse rate, etc.). In this paper, a method for recognizing emotion from a speaker's voice signal is presented and simulated using a neuro-fuzzy model.

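To make the fuzzy-logic part of such a design concrete, the toy sketch below maps two prosodic features (mean pitch and RMS energy) through triangular membership functions and a couple of hand-written rules to an emotion label. The features, membership ranges, and rules are invented for illustration and are not the model proposed in the paper.

```python
# Toy fuzzy-rule emotion classifier over two prosodic features.
# Membership ranges and rules are invented for illustration only.

def tri(x, a, b, c):
    """Triangular membership function with support [a, c] and peak at b."""
    return max(min((x - a) / (b - a), (c - x) / (c - b)), 0.0)

def classify(mean_f0_hz, rms_energy):
    high_pitch = tri(mean_f0_hz, 180, 280, 400)
    low_pitch  = tri(mean_f0_hz, 60, 120, 200)
    loud       = tri(rms_energy, 0.05, 0.15, 0.30)
    quiet      = tri(rms_energy, 0.00, 0.02, 0.08)
    # Fuzzy AND = min; each rule gives a degree of support for its label.
    scores = {
        "angry/excited": min(high_pitch, loud),
        "sad/calm":      min(low_pitch, quiet),
        "neutral":       1.0 - max(min(high_pitch, loud), min(low_pitch, quiet)),
    }
    return max(scores, key=scores.get), scores

print(classify(mean_f0_hz=250, rms_energy=0.2))
```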

음성신호를 이용한 감정인식 모델설계 (Design of an Emotion Recognition Model Using Speech Signals)

  • 김이곤;김서영;하종필
    • 한국정보통신학회:학술대회논문집 / 한국해양정보통신학회 2001년도 추계종합학술대회 / pp.265-270 / 2001
  • Voice is one of the most efficient communication media, and it carries several kinds of information about the speaker, the context, emotion, and so on. Human emotion is expressed in speech, gestures, and physiological phenomena (breathing, pulse rate, etc.). In this paper, a method for recognizing emotion from a speaker's voice signal is presented and simulated using a neuro-fuzzy model.


Long-Term Follow-Up Study of Young Adults Treated for Unilateral Complete Cleft Lip, Alveolus, and Palate by a Treatment Protocol Including Two-Stage Palatoplasty: Speech Outcomes

  • Kappen, Isabelle Francisca Petronella Maria;Bittermann, Dirk;Janssen, Laura;Bittermann, Gerhard Koendert Pieter;Boonacker, Chantal;Haverkamp, Sarah;de Wilde, Hester;Van Der Heul, Marise;Specken, Tom FJMC;Koole, Ron;Kon, Moshe;Breugem, Corstiaan Cornelis;van der Molen, Aebele Barber Mink
    • Archives of Plastic Surgery / Vol. 44, No. 3 / pp.202-209 / 2017
  • Background: No consensus exists on the optimal treatment protocol for orofacial clefts or the optimal timing of cleft palate closure. This study investigated factors influencing speech outcomes after two-stage palate repair in adults with a non-syndromal complete unilateral cleft lip and palate (UCLP). Methods: This was a retrospective analysis of adult patients with a UCLP who underwent two-stage palate closure and were treated at our tertiary cleft centre. Patients ≥17 years of age were invited for a final speech assessment. Their medical history was obtained from their medical files, and speech outcomes were assessed by a speech pathologist during the follow-up consultation. Results: Forty-eight patients were included in the analysis, with a mean age of 21 years (standard deviation, 3.4 years). The mean age at hard and soft palate closure was 3 years and 8.0 months, respectively. A pharyngoplasty was performed in 40% of the patients. On a 5-point intelligibility scale, 84.4% received a score of 1 or 2, meaning that their speech was intelligible. We observed a significant correlation between intelligibility scores and the incidence of articulation errors (P<0.001). In total, 36% showed mild to moderate hypernasality during the speech assessment, and 11%-17% of the patients exhibited increased nasalance scores as assessed by nasometry. Conclusions: The present study describes long-term speech outcomes after two-stage palatoplasty with hard palate closure at a mean age of 3 years. We observed moderate long-term intelligibility scores, a relatively high incidence of persistent hypernasality, and a high incidence of pharyngoplasty.

프레임 신뢰도 가중에 의한 강인한 음성인식 (Frame Reliability Weighting for Robust Speech Recognition)

  • 조훈영;김락용;오영환
    • 한국음향학회지 / Vol. 21, No. 3 / pp.323-329 / 2002
  • This paper proposes a frame reliability weighting method to compensate for time-selective noise, which occurs at arbitrary moments and severely corrupts part of the speech signal. Speech frames have different degrees of reliability, and a frame's reliability is proportional to its signal-to-noise ratio (SNR). When the noise is stationary, the frame SNR can easily be estimated from noise information obtained in silence intervals, but for time-selective noise such estimation is difficult. Therefore, this study uses a statistical model of clean speech to estimate frame reliability. The proposed model-based frame reliability (MFR) method approximates the frame SNR using the filterbank energies obtained by inverse transformation of the reference model's mean vector sequence and of the input MFCC (mel-frequency cepstral coefficient) feature vector sequence. Recognition experiments on various burst noises showed that the proposed method represents frame reliability effectively, and that applying this reliability as a weight in the likelihood computation improves recognition performance.
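
The core of the MFR idea as summarized above is: invert the MFCCs back to (approximate) filterbank energies with an inverse DCT, compare the input frame with the energies reconstructed from the clean reference model's mean vector, turn the resulting SNR estimate into a per-frame weight, and apply that weight in the likelihood computation. The sketch below is a simplified illustration of that chain under those assumptions, not the authors' implementation.

```python
# Simplified sketch of model-based frame reliability (MFR) weighting.
# This illustrates the idea only; it is not the authors' implementation.
import numpy as np
from scipy.fftpack import idct

def mfcc_to_fbank(mfcc, n_filters=24):
    """Invert MFCCs to approximate log mel filterbank energies (inverse DCT)."""
    return idct(np.asarray(mfcc), type=2, n=n_filters, axis=-1, norm="ortho")

def frame_reliability(input_mfcc, model_mean_mfcc):
    """Estimate a frame's SNR from observed vs. clean-model filterbank energies
    and squash it into a [0, 1] reliability weight."""
    fb_in = np.exp(mfcc_to_fbank(input_mfcc))          # observed energies
    fb_clean = np.exp(mfcc_to_fbank(model_mean_mfcc))  # clean reference energies
    noise = np.maximum(fb_in - fb_clean, 1e-10)        # crude noise estimate
    snr_db = 10 * np.log10(fb_clean.sum() / noise.sum())
    return 1.0 / (1.0 + np.exp(-snr_db / 10.0))        # sigmoid -> weight

def weighted_log_likelihood(frame_loglikes, reliabilities):
    """Apply per-frame reliability weights in the likelihood computation."""
    return float(np.dot(np.asarray(reliabilities), np.asarray(frame_loglikes)))

# Illustrative call with random 13-dimensional MFCC vectors.
rng = np.random.default_rng(0)
print(frame_reliability(rng.normal(size=13), rng.normal(size=13)))
```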

G.723.1 음성부호화기와 EVRC 음성부호화기의 상호 부호화 알고리듬 (An Efficient Transcoding Algorithm For G.723.1 and EVRC Speech Coders)

  • 김경태;정성교;윤성완;박영철;윤대희;최용수;강태익
    • 한국통신학회논문지 / Vol. 28, No. 5C / pp.548-554 / 2003
  • When wireline and wireless networks that employ different speech coders are interconnected, an efficient conversion between their speech packets is required. In the past, this packet conversion was handled by tandem coding (decoding and re-encoding). However, when two speech coders are interconnected through tandem coding, problems such as degraded speech quality, an increased computational load, and additional transmission delay arise. This paper proposes a transcoding algorithm for efficient interworking between the ITU-T G.723.1 [1] and TIA IS-127 EVRC (Enhanced Variable Rate Codec) [2] speech coders, which are widely used in wireline and wireless communication systems. The proposed transcoding algorithm consists of four main parts: LSP (line spectrum pair) conversion, open-loop pitch conversion, fast adaptive codebook search, and fast fixed codebook search. An implementation on a TMS320C62x DSP confirmed that the proposed algorithm reduces the computational load by about 30%-35% compared with the conventional tandem approach while providing equivalent subjective and objective speech quality with lower delay.
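
Of the four stages listed, LSP conversion is the easiest to illustrate: G.723.1 and EVRC use different frame lengths (30 ms vs. 20 ms), so in the simplest view the source coder's per-frame LSP vectors have to be re-sampled onto the target coder's frame grid. The sketch below shows that idea as plain linear interpolation; it ignores quantization and both codecs' actual interpolation rules, so it is only a conceptual illustration, not the paper's algorithm.

```python
# Highly simplified view of LSP conversion between coders with different
# frame lengths (G.723.1: 30 ms, EVRC: 20 ms). Quantization and the codecs'
# own interpolation rules are ignored; this is a conceptual illustration.
import numpy as np

def convert_lsp_track(src_lsps, src_frame_ms, dst_frame_ms):
    """Linearly interpolate per-frame LSP vectors onto the target frame grid."""
    src_lsps = np.asarray(src_lsps)                   # (n_frames, lsp_order)
    n_src, order = src_lsps.shape
    src_t = (np.arange(n_src) + 0.5) * src_frame_ms   # source frame centres (ms)
    n_dst = int(round(n_src * src_frame_ms / dst_frame_ms))
    dst_t = (np.arange(n_dst) + 0.5) * dst_frame_ms   # target frame centres (ms)
    # Interpolate each LSP dimension independently onto the new frame centres.
    return np.stack([np.interp(dst_t, src_t, src_lsps[:, k])
                     for k in range(order)], axis=1)

# Example: three 30 ms frames of 10th-order LSPs mapped to the 20 ms grid.
g7231_lsps = np.sort(np.random.rand(3, 10), axis=1)   # ordered dummy LSPs
evrc_lsps = convert_lsp_track(g7231_lsps, src_frame_ms=30, dst_frame_ms=20)
print(evrc_lsps.shape)                                # (target frames, LSP order)
```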