• Title/Summary/Keyword: phonetic variation (음성적 변이)

Search Result 248, Processing Time 0.031 seconds

Measurement of the vocal tract area of vowels by MRI and their synthesis by area variation (MRI에 의한 모음의 성도 단면적 측정 및 면적 변이에 따른 합성 연구)

  • Yang, Byung-Gon
    • Speech Sciences
    • /
    • v.4 no.1
    • /
    • pp.19-34
    • /
    • 1998
  • The author collected and compared midsagittal, coronal, coronal oblique, and transversal images of Korean monophthongs /a, i, e, o, u, i, v/ produced by a healthy male speaker using 1.5 T MR, VISION. Area was measured by computer software after tracing the cross-section at different points along the tract. Results showed that the widths of the oral and pharyngeal cavities varied in a compensatory manner along the midsagittal dimension. Formant frequency values estimated from the area functions of the seven vowels showed a strong correlation (r=0.978) with those analyzed from the spoken vowels. Moreover, almost all of the 35 students who listened to the vowels synthesized from area data perceived them as equivalent to the spoken ones. Moving the constriction points of the vowel /u/ with a wider lip opening made it sound like /i/ and led to slight changes in vowel quality. Jaw and tongue movement led to major volume variation within an anatomical limit. Each corner vowel varied systematically from a somewhat constant volume of the average area. Thus, the author proposed that any simulation study related to vocal tract area variation should reflect this constant volume. The results may help verify exact measurement of the vocal tract area through vowel synthesis and a simulation study before any operation on the vocal tract.
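
As a rough illustration of how formant frequencies follow from a vocal tract area function, the sketch below covers only the simplest case: a uniform tube closed at the glottis and open at the lips, whose resonances are F_n = (2n - 1)c / 4L. This is not the estimation procedure used in the study. The function name is hypothetical, and the tube length (17.5 cm) and speed of sound (350 m/s) are assumed illustrative values, not measurements from the paper.

```python
def uniform_tube_formants(length_m, speed_of_sound=350.0, count=3):
    """Resonances (Hz) of a uniform tube closed at one end and open at the other."""
    # Quarter-wavelength resonator: F_n = (2n - 1) * c / (4 * L)
    return [
        (2 * n - 1) * speed_of_sound / (4.0 * length_m)
        for n in range(1, count + 1)
    ]


# Assumed values: a 17.5 cm tract and c = 350 m/s (illustrative only).
formants = uniform_tube_formants(0.175)
print([round(f) for f in formants])
```

With these assumed values the first three resonances fall near 500, 1500, and 2500 Hz. A measured, non-uniform area function would shift them away from this pattern.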

Speech perception difficulties and their associated cognitive functions in older adults (노년층의 말소리 지각 능력 및 관련 인지적 변인)

  • Lee, Soo Jung;Kim, HyangHee
    • Phonetics and Speech Sciences
    • /
    • v.8 no.1
    • /
    • pp.63-69
    • /
    • 2016
  • The aims of the present study are two-fold: 1) to explore differences in speech perception between younger and older adults across noise conditions; and 2) to investigate which cognitive domains are correlated with speech perception. Data were acquired from 15 younger adults and 15 older adults. A sentence recognition test was conducted in four noise conditions (i.e., in-quiet, +5 dB SNR, 0 dB SNR, -5 dB SNR). All participants completed auditory and cognitive assessments. After controlling for hearing thresholds, the older group performed significantly worse than the younger group only under the high noise condition at -5 dB SNR. For the older group, performance on the Seoul Verbal Learning Test (immediate recall) was significantly correlated with speech perception performance, after controlling for hearing thresholds. In older adults, working memory and verbal short-term memory are the best predictors of speech-in-noise perception. The current study suggests that cognitive function should be considered when assessing speech perception in older adults, because of its adverse effect on speech perception under background noise.
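
The noise conditions above (+5, 0, and -5 dB SNR) can be produced by scaling a noise signal relative to the speech power. The following is a minimal sketch under stated assumptions, not the procedure used in the study: the function names are hypothetical, a sine tone stands in for a recorded sentence, and Gaussian noise stands in for the actual masker.

```python
import math
import random


def mean_power(signal):
    """Average power of a sequence of samples."""
    return sum(x * x for x in signal) / len(signal)


def scale_noise_to_snr(speech, noise, snr_db):
    """Return a copy of noise scaled so that speech/noise power matches snr_db."""
    # SNR(dB) = 10 * log10(P_speech / P_noise), solved for the noise power.
    target_noise_power = mean_power(speech) / (10 ** (snr_db / 10))
    gain = math.sqrt(target_noise_power / mean_power(noise))
    return [gain * n for n in noise]


def measured_snr_db(speech, noise):
    """SNR in dB between a speech signal and a noise signal."""
    return 10 * math.log10(mean_power(speech) / mean_power(noise))


# Stand-in signals (assumptions): one second of a 220 Hz tone and white noise.
rng = random.Random(0)
speech = [math.sin(2 * math.pi * 220 * t / 16000) for t in range(16000)]
noise = [rng.gauss(0.0, 1.0) for _ in range(16000)]

for snr in (5, 0, -5):
    scaled = scale_noise_to_snr(speech, noise, snr)
    mixture = [s + n for s, n in zip(speech, scaled)]  # stimulus to present
    print(snr, measured_snr_db(speech, scaled))
```

The in-quiet condition corresponds to presenting `speech` with no noise added.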

Effects of Helium Gas on the Articulators (헬륨(He)가스가 조음기관에 미치는 영향)

  • Lim, Soon-Yong;Lim, Sung-Su;Youn, Yong-Heum;Min, Ji-Sun;Song, Han-Sol;Kim, Bong-Hyun;Ka, Min-Kyoung;Cho, Dong-Uk
    • Annual Conference of KIPS
    • /
    • 2011.04a
    • /
    • pp.1082-1085
    • /
    • 2011
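  • Because the nitrogen gas previously used by divers can cause air embolism, which is fatal to the human body, helium-oxygen mixtures are now used as an alternative breathing gas. Helium, however, produces a squeaky voice with low intelligibility, which makes divers' abnormal speech difficult to interpret. Helium is also commonly encountered in everyday life, and various TV programs use the change in voice after inhaling helium for comic effect. This paper therefore measures the effect of helium gas on the articulators by applying speech-analysis parameters.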

A Study on Speech Recognition Using the HM-Net Topology Design Algorithm Based on Decision Tree State-clustering (결정트리 상태 클러스터링에 의한 HM-Net 구조결정 알고리즘을 이용한 음성인식에 관한 연구)

  • 정현열;정호열;오세진;황철준;김범국
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.2
    • /
    • pp.199-210
    • /
    • 2002
  • In this paper, we study speech recognition using the HM-Net topology design algorithm based on decision tree state-clustering to improve the performance of acoustic models. Korean has many allophonic and grammatical rules compared to other languages, so we investigate the allophonic variations defined in Korean phonetics and construct a phoneme question set for the phonetic decision tree. The basic idea of the HM-Net topology design algorithm is that it keeps the basic structure of the SSS (Successive State Splitting) algorithm and splits again the states of pre-constructed context-dependent acoustic models. That is, it generates the phonetic decision tree using the phoneme question sets for each state of the models, and iteratively trains the state sequence of the context-dependent acoustic models using the PDT-SSS (Phonetic Decision Tree-based SSS) algorithm. To verify the effectiveness of the algorithm, we carried out speech recognition experiments on 452 words from the Center for Korean Language Engineering (KLE452) and 200 sentences from an air flight reservation task (YNU200). Experimental results show that recognition accuracy improved progressively with the number of states after state splitting in the phoneme, word, and continuous speech recognition experiments. We obtained average phoneme and word recognition accuracies of 71.5% and 99.2%, respectively, with 2,000 states, and an average continuous speech recognition accuracy of 91.6% with 800 states. We also carried out word recognition experiments using HTK (HMM Toolkit), which performs state tying, to compare against the parameter sharing of the HM-Net topology design algorithm. In the word recognition experiments, the HM-Net topology design algorithm achieved on average 4.0% higher recognition accuracy than the context-dependent acoustic models generated by HTK, indicating its effectiveness.

A Design of Artificial Emotion Model (인공 감정 모델의 설계)

  • Lee, In-Geun;Seo, Seok-Tae;Jeong, Hye-Cheon;Gwon, Sun-Hak
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2007.04a
    • /
    • pp.58-62
    • /
    • 2007
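  • Alongside research on recognizing human emotional states from human-generated speech, facial expression images, and sentences, research is being conducted on artificial emotion, which imitates human emotion and generates emotions from various external stimuli. However, existing artificial emotion research changes the emotional state linearly or exponentially in response to external emotional stimuli, so the emotional state changes abruptly. This paper proposes an emotion generation model that reflects not only the intensity and frequency of external emotional stimuli but also their repetition period in the emotional state, and that expresses the change in emotion over time as a sigmoid curve. It also proposes an artificial emotion system that can generate emotions even without external emotional stimuli, through recollection of past emotional stimuli.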
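
To illustrate what a sigmoid-shaped change of emotion over time looks like, here is a minimal sketch. It is not the model proposed in the paper: the function name and the parameters `intensity`, `midpoint`, and `steepness` are assumptions made for illustration, and the paper's handling of stimulus frequency, repetition period, and recollection is not represented.

```python
import math


def sigmoid_emotion(t, intensity=1.0, midpoint=5.0, steepness=1.0):
    """Emotion level at time t, rising smoothly from 0 toward intensity."""
    # Logistic curve: slow onset, fastest change at the midpoint, then saturation.
    return intensity / (1.0 + math.exp(-steepness * (t - midpoint)))


# Sample the curve at a few time steps (arbitrary units).
for t in range(0, 11, 2):
    print(t, round(sigmoid_emotion(t), 3))
```

The curve stays between 0 and `intensity` and changes most quickly around `midpoint`, so the level neither jumps at stimulus onset nor grows without bound.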

Automatic Generation of Korean Pronunciation Variants (TTS 시스템을 위한 한국어 발음열 자동 생성)

  • 차선화
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1998.08a
    • /
    • pp.413-418
    • /
    • 1998
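  • We implemented a system that automatically converts Korean character strings into phoneme strings, as one module of a speech synthesis system. Converting character strings into phoneme strings requires a systematic analysis of Korean phonological phenomena. In Korean, different phonological rules apply inside a single morpheme, at the morpheme boundaries where several morphemes combine into one eojeol (word phrase), and at eojeol boundaries. Therefore, to convert input such as phrases or sentences into phoneme strings, morphological analysis and tagging must be performed to derive the correct pronunciation sequence. The proposed system first performs morphological analysis to reflect Korean morphophonemic phenomena, then selectively applies phoneme alternation rules and allophone rules, defined through an analysis of frequently occurring Korean phonological changes, to generate pronunciation sequences for inputs of various forms such as morphemes, eojeols, phrases, or sentences. The morphological tagger and the conversion system, which were separate in previous work, are integrated to improve usability, and because a text-based morphological analyzer is used, a processing routine for morphemes whose base forms are restored was added to reduce errors.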
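
As an illustration of a rule-based phoneme alternation, the sketch below applies one well-known Korean rule, obstruent nasalization (a final ㄱ, ㄷ, or ㅂ becomes ㅇ, ㄴ, or ㅁ before an initial ㄴ or ㅁ), using Unicode Hangul syllable arithmetic. It is not the paper's system: the function names are hypothetical, only this single rule is covered, and the morphological analysis and tagging that the abstract calls essential are skipped.

```python
HANGUL_BASE = 0xAC00  # code point of the first precomposed Hangul syllable
NUM_LEADS, NUM_VOWELS, NUM_TAILS = 19, 21, 28

# Final-consonant indices: 1=ㄱ, 7=ㄷ, 17=ㅂ map to 21=ㅇ, 4=ㄴ, 16=ㅁ.
NASALIZED_TAIL = {1: 21, 7: 4, 17: 16}
# Initial-consonant indices that trigger the rule: 2=ㄴ, 6=ㅁ.
NASAL_LEADS = {2, 6}


def decompose(ch):
    """Split a Hangul syllable into (lead, vowel, tail) indices, else None."""
    offset = ord(ch) - HANGUL_BASE
    if not 0 <= offset < NUM_LEADS * NUM_VOWELS * NUM_TAILS:
        return None
    lead, rest = divmod(offset, NUM_VOWELS * NUM_TAILS)
    vowel, tail = divmod(rest, NUM_TAILS)
    return lead, vowel, tail


def compose(lead, vowel, tail):
    """Build a Hangul syllable from (lead, vowel, tail) indices."""
    return chr(HANGUL_BASE + (lead * NUM_VOWELS + vowel) * NUM_TAILS + tail)


def nasalize(text):
    """Apply obstruent nasalization between adjacent Hangul syllables."""
    chars = list(text)
    for i in range(len(chars) - 1):
        current, following = decompose(chars[i]), decompose(chars[i + 1])
        if current is None or following is None:
            continue  # rule only applies between two Hangul syllables
        lead, vowel, tail = current
        if tail in NASALIZED_TAIL and following[0] in NASAL_LEADS:
            chars[i] = compose(lead, vowel, NASALIZED_TAIL[tail])
    return "".join(chars)


print(nasalize("국물"), nasalize("입니다"))
```

A fuller converter would choose among such rules according to whether the boundary falls inside a morpheme, between morphemes, or between eojeols.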

A Study on Analysis of Variant Factors of Recognition Performance for Lip-reading at Dynamic Environment (동적 환경에서의 립리딩 인식성능저하 요인분석에 대한 연구)

  • 신도성;김진영;이주헌
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.5
    • /
    • pp.471-477
    • /
    • 2002
  • Recently, lip-reading has been studied actively as an auxiliary method for automatic speech recognition (ASR) in noisy environments. However, almost all research results were obtained from databases constructed under indoor conditions, so we do not know how robust the developed lip-reading algorithms are to dynamic variation of the image. We have developed a lip-reading system based on an image-transform-based algorithm. The system recognizes 22 words and achieves a word recognition rate of up to 53.54%. In this paper, we present how stable the lip-reading system is under environmental variance and identify the main factors behind the drop in word recognition performance. To study lip-reading robustness, we consider spatial variance (translation, rotation, scaling) and illumination variance. Two kinds of test data are used: one is a simulated lip image database, and the other is a real dynamic database captured in a car environment. Our experiments show that spatial variance is one of the factors that degrade lip-reading performance, but it is not the most important one. Illumination variance reduces recognition rates severely, by as much as 70%. In conclusion, lip-reading algorithms that are robust to illumination variance should be developed before lip reading can serve as a complementary method for ASR.

An Objective Estimation for Simulating of Asymmetrical Auditory Filter of the Hearing Impaired According to Hearing Loss Degree (난청인의 난청 정도에 따른 비대칭 청각 필터 구현의 객관적 평가)

  • Joo, S.I.;Jeon, Y.Y.;Song, Y.R.;Lee, S.M.
    • Journal of rehabilitation welfare engineering & assistive technology
    • /
    • v.3 no.1
    • /
    • pp.27-34
    • /
    • 2009
  • Hearing loss takes a different shape in each hearing-impaired person, so the existing symmetrical auditory filter based on frequency bands did not properly simulate the variety of hearing loss shapes. The shape of the auditory filter is asymmetrical and differs with each center frequency and input level. The auditory filter of a hearing-impaired person changes differently from that of a normal-hearing person, and it yields a different speech quality for speech passed through the filter. In this study, the asymmetrical auditory filter was simulated, and tests were performed to evaluate the filter's performance objectively. The evaluation used perceptual evaluation of speech quality (PESQ) and log likelihood ratio (LLR) on speech passed through the simulated auditory filter. In the tests, the objective quality and distortion of the processed speech were evaluated using PESQ and LLR values. When hearing loss was simulated, the PESQ and LLR values differed greatly between the symmetrical and asymmetrical auditory filters. This means that differences in auditory filter shape may affect speech quality. In particular, when hearing loss was present, an auditory filter whose asymmetrical shape changes with each center frequency affected the speech quality perceived by the hearing impaired.

Modified AWSSDR method for frequency-dependent reverberation time estimation (주파수 대역별 잔향시간 추정을 위한 변형된 AWSSDR 방식)

  • Min Sik Kim;Hyung Soon Kim
    • Phonetics and Speech Sciences
    • /
    • v.15 no.4
    • /
    • pp.91-100
    • /
    • 2023
  • Reverberation time (T60) is a typical acoustic parameter that provides information about reverberation. Since the impacts of reverberation vary depending on the frequency bands even in the same space, frequency-dependent (FD) T60, which offers detailed insights into the acoustic environments, can be useful. However, most conventional blind T60 estimation methods, which estimate the T60 from speech signals, focus on fullband T60 estimation, and a few blind FDT60 estimation methods commonly show poor performance in the low-frequency bands. This paper introduces a modified approach based on Attentive pooling based Weighted Sum of Spectral Decay Rates (AWSSDR), previously proposed for blind T60 estimation, by extending its target from fullband T60 to FDT60. The experimental results show that the proposed method outperforms conventional blind FDT60 estimation methods on the acoustic characterization of environments (ACE) challenge evaluation dataset. Notably, it consistently exhibits excellent estimation performance in all frequency bands. This demonstrates that the mechanism of the AWSSDR method is valuable for blind FDT60 estimation because it reflects the FD variations in the impact of reverberation, aggregating information about FDT60 from the speech signal by processing the spectral decay rates associated with the physical properties of reverberation in each frequency band.
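
The link between a decay rate and T60 can be shown with a small worked example: T60 is the time for the energy to fall by 60 dB, so a decay slope of s dB per second gives T60 = -60 / s. The sketch below fits a line to a synthetic energy decay curve and recovers T60 from the slope. It is not the AWSSDR method, which works blindly from speech with attentive pooling over spectral decay rates. The function names and the synthetic 0.5 s decay are assumptions for illustration.

```python
def fit_slope(xs, ys):
    """Least-squares slope of ys against xs."""
    n = len(xs)
    mean_x = sum(xs) / n
    mean_y = sum(ys) / n
    numerator = sum((x - mean_x) * (y - mean_y) for x, y in zip(xs, ys))
    denominator = sum((x - mean_x) ** 2 for x in xs)
    return numerator / denominator


def t60_from_decay(times_s, energy_db):
    """Estimate T60 (seconds) from an energy decay curve given in dB."""
    slope_db_per_s = fit_slope(times_s, energy_db)  # negative for a decay
    return -60.0 / slope_db_per_s


# Synthetic decay (assumption): energy falls 60 dB over 0.5 s in one band.
true_t60 = 0.5
times = [i * 0.01 for i in range(50)]
energy_db = [-60.0 * t / true_t60 for t in times]

estimate = t60_from_decay(times, energy_db)
print(round(estimate, 3))
```

Repeating the fit on the decay curve of each frequency band would give a frequency-dependent T60, one value per band.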

Various Diagnostic Methods for Helicobacter pylori Infection (헬리코박터 파일로리 감염의 다양한 진단법)

  • Han Jo Jeon;Hyuk Soon Choi
    • The Korean Journal of Medicine
    • /
    • v.99 no.2
    • /
    • pp.104-110
    • /
    • 2024
  • Helicobacter pylori (H. pylori) is a bacterium that colonizes the human stomach, leading to various gastrointestinal diseases including gastritis, peptic ulcers, and gastric cancer. There is no gold standard test that relies entirely on one method in H. pylori diagnosis. We must be aware of the pros and cons of various testing methods to perform an appropriate test according to the situation. Accurate diagnosis and eradication therapy are essential for disease management. Diagnostic methods include invasive techniques like tissue biopsy and rapid urease test, as well as non-invasive tests such as urea breath test, serology test, and stool antigen test. Each method has its advantages and limitations, requiring careful consideration in clinical practice. Understanding these diagnostic tools is crucial for effective H. pylori management and prevention of associated complications.