• Title/Summary/Keyword: Speech Tone

Speech signal processing in the auditory system (청각 계통에서의 음성신호처리)

  • 이재혁;심재성;백승화;박상희
    • Institute of Control, Robotics and Systems: Conference Proceedings (제어로봇시스템학회:학술대회논문집) / 1987.10b / pp.680-683 / 1987
  • Speech signal processing in the auditory system can be analyzed using two representations: the average discharge rate and the temporal discharge pattern. However, the average discharge rate representation has a narrow dynamic range because of rate saturation and two-tone suppression, and the temporal discharge pattern representation requires sophisticated frequency analysis and a synchrony measure. This paper proposes a simple representation: using a model that accounts for the interaction between cochlear fluid and basilar membrane (BM) movement, together with a hair cell model, the features of speech signals (formant frequencies and pitch of vowels) are easily estimated from the Average Synchronized Rate.

Prosodic Properties in the Speech of Adults with Cerebral Palsy (뇌성마비 성인 발화의 운율특성)

  • Lee, Sook-Hyang;Ko, Hyun-Ju;Kim, Soo-Jin
    • MALSORI / no.64 / pp.39-51 / 2007
  • The purpose of this study is to investigate prosodic characteristics in the speech of adults with cerebral palsy through a comparison with the speech of normal speakers. Ten speakers with cerebral palsy (6 males, 4 females) and 6 normal speakers (3 males, 3 females) served as subjects. The results revealed that, compared to normal speakers, speakers with cerebral palsy showed a slower speech rate, a larger number of intonational phrases (IPs) and pauses, a larger number of accentual phrases (APs) per IP, a longer duration of pauses, and more gradual slopes of [L+H] in APs. However, the two groups showed similar tone patterns in their APs. The results also showed mild to moderate correlations between speech intelligibility and the prosodic properties that differed significantly between the two groups, suggesting that these properties could be important prosodic predictors of speech intelligibility in adults with cerebral palsy.

An aerodynamic and acoustic characteristics of Clear Speech in patients with Parkinson's disease (파킨슨 환자의 클리어 스피치 전후 음향학적 공기역학적 특성)

  • Shin, Hee Baek;Ko, Do-Heung
    • Phonetics and Speech Sciences / v.9 no.3 / pp.67-74 / 2017
  • An increase in speech intelligibility has been found in Clear Speech compared to conversational speech. Clear Speech is defined by decreased articulation rates and increased frequency and length of pauses. The objective of the present study was to investigate improvement in immediate speech intelligibility in 10 patients with Parkinson's disease (age range: 46 to 75 years) using Clear Speech. The experiment was performed using the Phonatory Aerodynamic System 6600 after the participants read the first sentence of a Sanchaek passage and the "List for Adults 1" in the Sentence Recognition Test (SRT) using casual speech and Clear Speech. Acoustic and aerodynamic parameters that affect speech intelligibility were measured, including mean F0, F0 range, intensity, speaking rate, mean airflow rate, and respiratory rate. In the Sanchaek passage, use of Clear Speech resulted in significant differences in mean F0, F0 range, speaking rate, and respiratory rate, compared with the use of casual speech. In the SRT list, significant differences were seen in mean F0, F0 range, and speaking rate. Based on these findings, it is claimed that speech intelligibility can be affected by adjusting breathing and tone in Clear Speech. Future studies should identify the benefits of Clear Speech through auditory-perceptual studies and evaluate programs that use Clear Speech to increase intelligibility.

Modality-Based Sentence-Final Intonation Prediction for Korean Conversational-Style Text-to-Speech Systems

  • Oh, Seung-Shin;Kim, Sang-Hun
    • ETRI Journal / v.28 no.6 / pp.807-810 / 2006
  • This letter presents a prediction model for sentence-final intonations for Korean conversational-style text-to-speech systems in which we introduce the linguistic feature of 'modality' as a new parameter. Based on their function and meaning, we classify tonal forms in speech data into tone types meaningful for speech synthesis and use the result of this classification to build our prediction model using a tree-structured classification algorithm. To show that modality is more effective for the prediction model than features such as sentence type or speech act, an experiment is performed on a test set of 970 utterances with a training set of 3,883 utterances. The results show that modality makes a higher contribution to the determination of sentence-final intonation than sentence type or speech act, and that prediction accuracy improves by up to 25% when the feature of modality is introduced.

In Search of Models in Speech Communication Research

  • Fujisaki, Hiroya
    • Phonetics and Speech Sciences / v.1 no.1 / pp.9-22 / 2009
  • This paper first presents the author's personal view on the importance of modeling in scientific research in general, and then describes two of his works toward modeling certain aspects of human speech communication. The first work is concerned with the physiological and physical mechanisms of controlling the voice fundamental frequency of speech, which is an important parameter for expressing information on tone, accent, and intonation. The second work is concerned with the cognitive processes involved in a discrimination test of speech stimuli, which gives rise to the phenomenon of so-called categorical perception. They are meant to illustrate the power of models based on deep understanding and precise formulation of the functions of the mechanisms/processes that underlie observed phenomena. Finally, it also presents the author's view on some models that are yet to be developed.

Boundary Tones of Intonational Phrase-Final Morphemes in Dialogues (대화체 억양구말 형태소의 경계성조 연구)

  • Han, Sun-Hee
    • Speech Sciences / v.7 no.4 / pp.219-234 / 2000
  • The study of boundary tones in connected speech or dialogues is one of the most underdeveloped areas of Korean prosody. This paper concerns the boundary tones of intonational phrase-final morphemes found in a speech corpus of dialogues. Results of phonetic analysis show that different kinds of boundary tones are realized, depending on the positions of the intonational phrase-final morphemes in the sentences. The study also shows that boundary tone patterning is somewhat related to sentence structure and, for better speech recognition and speech synthesis, presents a simple model of boundary tones based on the fundamental frequency contour. The results of this study will contribute to our understanding of the prosodic pattern of Korean connected speech or dialogues.

A Study on Intonation of the Topic in English Information Structure (영어 정보구조에서의 화제에 대한 억양 연구)

  • Lee, Yong-Jae;Kim, Hwa-Young
    • Speech Sciences / v.13 no.2 / pp.87-105 / 2006
  • Many researchers have studied the relationship between information structure and intonation. The arguments made so far about this relationship can be summarized as follows: the intonation of topic and focus in English information structure is represented as i) a pitch accent, ii) a tune (a pitch accent + an edge tone), or iii) a boundary tone. The purpose of this paper is to study various informational patterns of the topic in English information structure, using real TV discussion data. In this paper, topics are classified as contrastive or non-contrastive, based on contrastiveness. The results show that the intonation of the topic in English information structure is implemented as a pitch accent, rather than as a tune or a boundary tone. Among non-contrastive topics, anaphoric determinative NP topics (Lnc, Lncd) are mainly represented as an H* pitch accent, while the pronoun topic (Lp) does not have a pitch accent. Among contrastive topics, the semantically focused topic (Lci) is mainly represented as an H* pitch accent, while the contrastively focused topic (Lcc) is represented as both H* and L+H* pitch accents. This shows that a topic or focus with contrastive meaning is not always represented as an L+H* pitch accent, contrary to what previous research has argued.

Emotion Recognition Using Tone and Tempo Based on Voice for IoT (IoT를 위한 음성신호 기반의 톤, 템포 특징벡터를 이용한 감정인식)

  • Byun, Sung-Woo;Lee, Seok-Pil
    • The Transactions of The Korean Institute of Electrical Engineers / v.65 no.1 / pp.116-121 / 2016
  • In the Internet of Things (IoT) area, research on recognizing human emotion has been increasing recently. Generally, multi-modal features such as facial images, bio-signals, and voice signals are used for emotion recognition. Among these multi-modal features, voice signals are the most convenient to acquire. This paper proposes a voice-based emotion recognition method using tone and tempo. For this purpose, we build voice databases from broadcast media content. Emotion recognition tests are carried out using tone and tempo features extracted from the voice databases. The results show a noticeable improvement in accuracy compared with conventional methods that use only pitch.

Neutralization of Short Tones in Taiwanese

  • Jane Tsay
    • Proceedings of the KSPS conference / 1996.10a / pp.136-141 / 1996
  • This paper is an acoustic study of neutralization of short tones in Taiwanese. The results show that the two short tones were completely neutralized in juncture position. Since long tones in Taiwanese show complete neutralization in context position, the bidirectionality of tone alternation in Taiwanese Tone Sandhi poses a problem for rule-based approaches, while it is consistent with the hypothesis that both juncture and context tones are listed in the lexicon, instead of one being derived from the other. Moreover, in order to account for the difference between Taiwanese Tone Sandhi and Mandarin Tone Sandhi (which has been proven acoustically to be incomplete neutralization), the Naturalness Hypothesis is proposed, which claims that if the neutralization is phonetically unnatural, then the neutralization is more likely to be lexicalized and show complete neutralization.

Development of Ambulatory Audiometric System by Sound Pressure Level Calibration (음압 보정을 통한 이동형 청력 검사 시스템 구현)

  • Shin, Seung-Won;Kim, Kyeong-Seop;Yoon, Tae-Ho;Lee, Sang-Min;Song, Chul-Gyu
    • The Transactions of The Korean Institute of Electrical Engineers / v.56 no.6 / pp.1157-1164 / 2007
  • In this paper, we implement a PDA (Personal Digital Assistant)-based audiometric system that estimates hearing thresholds using both pure-tone and speech audiometric tests. To estimate a subject's hearing threshold in an ambulatory audiometric test environment, we propose an efficient sound calibration scheme between a PDA and a headphone device that applies polynomial fitting algorithms in eight frequency bands.