• Title/Summary/Keyword: VOCAL SIGNAL


A Study on the Diagnosis of Laryngeal Diseases by Acoustic Signal Analysis (음향신호의 분석에 의한 후두질환의 진단에 관한 연구)

  • Jo, Cheol-Woo;Yang, Byong-Gon;Wang, Soo-Geon
    • Speech Sciences / v.5 no.1 / pp.151-165 / 1999
  • This paper describes a series of studies on diagnosing vocal diseases using statistical methods and acoustic signal analysis. Speech materials were collected at a hospital. Using this pathological database, the basic parameters for diagnosis were obtained. Based on the statistical characteristics of the parameters, valid parameters were chosen and used to diagnose pathological speech signals. The cepstrum is used to extract parameters that represent the characteristics of pathological speech. A three-layer neural network is trained to classify speech into normal, benign, and malignant cases.
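
The cepstral step mentioned in the abstract above can be sketched as follows. This is a minimal illustration, not the authors' implementation: it computes a real cepstrum (the inverse DFT of the log magnitude spectrum) with a naive DFT and reads a pitch period off the largest cepstral peak. The function names, the `eps` floor, and the impulse-train test frame are all assumptions made for the example.

```python
import cmath
import math


def dft(values, inverse=False):
    """Naive O(N^2) discrete Fourier transform (adequate for short frames)."""
    n = len(values)
    sign = 1.0 if inverse else -1.0
    out = []
    for k in range(n):
        acc = 0j
        for t, v in enumerate(values):
            acc += v * cmath.exp(sign * 2j * math.pi * k * t / n)
        out.append(acc / n if inverse else acc)
    return out


def real_cepstrum(frame, eps=1e-6):
    """Real cepstrum: inverse DFT of the log magnitude spectrum.

    eps is an assumed floor that keeps log() finite at empty bins.
    """
    log_mag = [math.log(abs(x) + eps) for x in dft(frame)]
    return [c.real for c in dft(log_mag, inverse=True)]


def pitch_period_from_cepstrum(frame, min_lag, max_lag):
    """Return the quefrency (in samples) of the largest cepstral peak."""
    cep = real_cepstrum(frame)
    return max(range(min_lag, max_lag + 1), key=lambda q: cep[q])


# Toy "voiced" frame (assumption): an impulse train with a 32-sample period.
frame = [1.0 if n % 32 == 0 else 0.0 for n in range(256)]
period = pitch_period_from_cepstrum(frame, min_lag=20, max_lag=60)
```

For real recordings a windowed frame and an FFT routine would normally replace the naive transform; the low-quefrency cepstral coefficients would then serve as candidate input features for a classifier.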


Vocal separation method using weighted β-order minimum mean square error estimation based on kernel back-fitting (커널 백피팅 알고리즘 기반의 가중 β-지수승 최소평균제곱오차 추정방식을 적용한 보컬음 분리 기법)

  • Cho, Hye-Seung;Kim, Hyoung-Gook
    • The Journal of the Acoustical Society of Korea / v.35 no.1 / pp.49-54 / 2016
  • In this paper, we propose a vocal separation method using weighted $\beta$-order minimum mean square error estimation (WbE) based on a kernel back-fitting algorithm. In speech enhancement, it is well known that the WbE outperforms existing Bayesian estimators, such as the minimum mean square error (MMSE) estimator of the short-time spectral amplitude (STSA) and the MMSE estimator of the logarithm of the STSA (LSA), in terms of both objective and subjective measures. In the proposed method, WbE is applied to a basic iterative kernel back-fitting algorithm to improve vocal separation from monaural music signals. The experimental results show that the proposed method achieves better separation performance than other existing methods.

Usefullness of the Vibration Pick-Up in Detection of Pitch for Synchronization of Laryngeal Stroboscopy (후두 스트로보스코프 검사의 신호 동기화를 위한 진동 검출기의 유용성)

  • Lee, Jin-Choon;Lee, Byung-Joo;Wang, Soo-Geun;Roh, Jung-Hoon;Kwon, Sun-Bok;Jo, Cheol-Woo
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.18 no.1 / pp.26-32 / 2007
  • Objective and Background: The laryngeal stroboscope is a useful instrument for evaluating vocal cord vibration and for the early detection of mucosal lesions, including invasive cancer of the vocal cord. Recently, Lee et al. (2006) developed a portable stroboscope that uses the voice as the synchronization signal. However, it frequently failed to synchronize the flashes, even in normal female subjects. The authors investigated various methods, including a vibration pick-up, microphone, laryngeal microphone, and contact microphone, in order to develop a simple method comparable in accuracy to the electroglottograph (EGG) signal. The purpose of this study was to determine whether the vibration pick-up is usable and consistent with the EGG signal. Subjects and Methods: The authors compared the EGG signal with signals from a noncontact method (voice) and from contact methods, including a vibration pick-up, laryngeal microphone, and contact microphone, in twenty normal adults (10 male and 10 female). The number of peaks in one cycle was compared with the number of peaks in the EGG, and the percent phase difference of the peak was compared with the EGG. The authors also investigated which vibration pick-up site was most effective for synchronizing the strobe flashes. Three sites were analyzed: the anterior neck below the cricoid cartilage, the thyroid ala, and the suprahyoid region. Results: Among the methods for synchronizing strobe flashes, the vibration pick-up was the most effective in peak detection, and the anterior neck below the cricoid cartilage was the most suitable site for the vibration pick-up. Conclusion: The authors suggest that the vibration pick-up is the most suitable and effective method for synchronizing strobe flashes.


The Effect of An Increase of Closed Quotient on Improvement of Voice Quality after Type I Thyroplasty in Patients with Unilateral Vocal Cord Paralysis (일측 성대마비 환자에서 성대내전술 후 성대접촉율의 증가가 음질 개선에 미치는 영향)

  • Kim, Han-Su;Choi, Seung-Hee;Lim, Jae-Yol;Choi, Hong-Shik
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.15 no.1 / pp.16-20 / 2004
  • Purpose: To assess perceptual, acoustic, and aerodynamic measures of voice quality in patients with unilateral vocal cord paralysis before and after type I thyroplasty. Methods: The clinical records of patients who underwent type I thyroplasty in the Department of Otorhinolaryngology, Yongdong Severance Hospital, from November 2001 to November 2003 were reviewed. All patients underwent a vocal function evaluation, including perceptual, acoustic, and aerodynamic measures of voice, preoperatively and on the 60th postoperative day. The perceptual and acoustic measures were obtained from recordings of patients reading the 'Sanchak' passage. The perceptual evaluation was performed by two speech pathologists using a 4-point rating scale. Acoustic parameters (voice range profile low (RAL), voice range profile high (RAH), average fundamental frequency (AFx), closed quotient (AQx), harmonic-to-noise ratio, jitter, and shimmer) were measured with Lx Speech Studio. Mean flow rate (MFR), subglottic pressure (Psub), and intensity were measured using the Phonatory Function Analyzer. The maximum phonation time (MPT) was also measured. The data were statistically analyzed: a paired t-test (p<0.1) was used to compare preoperative and postoperative results, and multiple regression was used to find which parameter was most correlated with improvement of postoperative voice quality. Results: Among the aerodynamic parameters, Psub (88.11 mmH$_2$O $\rightarrow$ 58.7 mmH$_2$O), MPT (7.87 sec $\rightarrow$ 12.53 sec), and MFR (359.8 ml/sec $\rightarrow$ 161.06 ml/sec) improved significantly. AFx (205.5 Hz $\rightarrow$ 163.27 Hz), AQx (23.9% $\rightarrow$ 48.3%), RAL, RAH, jitter, and shimmer also improved. In multiple regression, AFx and AQx were the two parameters most correlated with improvement of postoperative breathiness, but the general grade of voice quality was more correlated with Psub and shimmer. Conclusion: Vocal fold medialization procedures effectively reduce the glottic gap. Increasing the contact area of the two vocal folds improved the aerodynamic parameters and stabilized vocal fold vibration. This effect results in improvement in acoustic parameters (shimmer, jitter, signal-to-noise ratio, voice range profile) and voice quality.


Change of Extracellular Matrix of Human Vocal Fold Fibroblasts by Vibratory Stimulation (진동이 성대세포주의 세포외기질 변화에 대한 연구)

  • Kim, Ji Min;Shin, Sung-Chan;Kwon, Hyun-Keun;Cheon, Yong-Il;Ro, Jung Hoon;Lee, Byung-Joo
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / v.32 no.1 / pp.15-23 / 2021
  • Background and Objectives: During speech, the vocal folds oscillate at frequencies of 100-200 Hz with amplitudes of a few millimeters. Mechanical stimulation is an essential factor affecting the metabolism of human vocal folds. The effect of mechanical vibration on the cellular response of human vocal fold fibroblasts (hVFFs) was evaluated. Materials and Method: We created a cell culture device capable of generating vibratory stimulation at human phonation frequencies. To establish the optimal cell culture condition, cellular proliferation and viability assays were performed. Quantitative real-time polymerase chain reaction was used to assess the expression of extracellular matrix (ECM)-related genes and growth factors in response to changes in vibratory frequency and amplitude. Western blot was used to investigate the activation of ECM and inflammation-related transcription factors and the related cellular signal transduction pathway. Results: Cell viability was stable under vibratory stimulation for up to 24 h. A statistically significant increase in ECM genes (collagen type I alpha 1 and collagen type I alpha 2) and growth factors [transforming growth factor β1 (TGF-β1) and fibroblast growth factor 1 (FGF-1)] was observed under the experimental conditions. Vibratory stimulation induced transcriptional activation of NF-κB through phosphorylation of the p65 subunit, via activation of mitogen-activated protein kinases (MAPKs), namely phosphorylation of extracellular signal-regulated kinase and p38 MAPK, in hVFFs. Conclusion: This study confirmed that vibratory stimulation enhanced the synthesis of collagen, TGF-β1, and FGF in hVFFs. This mechanism is thought to be due to the activation of NF-κB and MAPKs. Taken together, these results demonstrate that a vibratory bioreactor may be a suitable alternative for studying the cellular response of hVFFs to vibratory vocalization.

Postnatal Development of Echolocation Vocalizations in the Serotine Bat, Eptesicus serotinus (Chiroptera: Vespertilionidae) (문둥이박쥐(Eptesicus serotinus)의 생후 반향정위 발성 발달에 관한 연구)

  • Chung, Chul-Un;Han, Sang-Hoon;Kim, Sung-Chul;Lim, Chun-Woo;Cha, Jin-Yeol
    • Korean Journal of Environment and Ecology / v.29 no.6 / pp.858-864 / 2015
  • Developmental changes in the vocal signals of serotine bats (Eptesicus serotinus) during their infancy were examined in this study. The analysis was conducted on 4 infant serotine bats from 1 to 40 days after birth. Pulse duration (PD), pulse interval (PI), peak frequency (PF), maximum frequency ($F_{MAX}$), minimum frequency ($F_{MIN}$), and bandwidth (BW) were measured. As the bats grew, their vocalizations became increasingly consistent and similar to those of adults. PD and PI decreased as the infants grew older, whereas PF, $F_{MAX}$, $F_{MIN}$, and BW increased. The greatest change in vocalizations was observed between the $10^{th}$ and $20^{th}$ days after birth. In particular, PF, $F_{MAX}$, $F_{MIN}$, and BW, which describe sound frequency, increased dramatically between the $10^{th}$ and $20^{th}$ days. In contrast, the greatest change in PD occurred between the $30^{th}$ and $40^{th}$ days after birth. These results suggest that frequency increased as the contraction ability of the muscles developed by around 20 days of age. Muscle relaxation ability, which is related to PD, was found to develop significantly at 30 to 40 days of age. Although 40-day-old infant bats are not yet able to fly, their vocal signals were similar to those of adults. This indicates that vocal development and flying ability develop separately in young bats.
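
The time and frequency parameters listed in the abstract above can be derived from per-pulse measurements roughly as sketched below. This is an illustration under stated assumptions: pulse interval is taken as onset-to-onset time and bandwidth as $F_{MAX} - F_{MIN}$, which may differ from the definitions used in the study. The `Pulse` record and all numeric values are hypothetical and are not data from the paper.

```python
from dataclasses import dataclass


@dataclass
class Pulse:
    """One echolocation pulse (hypothetical record layout)."""
    onset_ms: float    # time at which the pulse starts
    offset_ms: float   # time at which the pulse ends
    f_max_khz: float   # highest frequency in the pulse
    f_min_khz: float   # lowest frequency in the pulse


def pulse_duration(pulse):
    """PD: time from pulse onset to pulse offset."""
    return pulse.offset_ms - pulse.onset_ms


def bandwidth(pulse):
    """BW: difference between maximum and minimum frequency."""
    return pulse.f_max_khz - pulse.f_min_khz


def pulse_intervals(pulses):
    """PI: onset-to-onset time between consecutive pulses (assumed definition)."""
    return [b.onset_ms - a.onset_ms for a, b in zip(pulses, pulses[1:])]


# Made-up values for illustration only; not measurements from the study.
pulses = [
    Pulse(0.0, 8.0, 40.0, 20.0),
    Pulse(100.0, 107.0, 42.0, 21.0),
    Pulse(210.0, 216.0, 45.0, 22.0),
]
durations = [pulse_duration(p) for p in pulses]
bandwidths = [bandwidth(p) for p in pulses]
intervals = pulse_intervals(pulses)
```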

A Study on Correcting Korean Pronunciation Error of Foreign Learners by Using Supporting Vector Machine Algorithm

  • Jang, Kyungnam;You, Kwang-Bock;Park, Hyungwoo
    • International Journal of Advanced Culture Technology / v.8 no.3 / pp.316-324 / 2020
  • People learning a foreign language have experienced how difficult it is to pronounce a new language that differs from their native language. The goal of many foreigners who want to learn Korean is to speak Korean as well as their native language so that they can communicate smoothly. However, the vocal habits of each native language also appear in Korean pronunciation, which hinders accurate transmission of information. In this paper, the pronunciation of Chinese learners was compared with that of native Korean speakers. For the comparison, the fundamental frequency of the speech signal and its variation were examined, and the spectrogram was analyzed. The formant frequencies, known as the resonant frequencies of the vocal tract, were calculated. Based on these characteristic parameters, a Support Vector Machine classifier was built to distinguish the pronunciation of Koreans from that of Chinese learners. In particular, the linguistic observation that Chinese speakers have difficulty pronouncing Korean /ㄹ/ was examined experimentally using their Korean pronunciation of that sound.
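
The abstract above does not say how the fundamental frequency was measured. One common approach, shown here only as a minimal sketch, is to pick the lag that maximizes the autocorrelation of a voiced frame. The function name, the 50-400 Hz search range, and the synthetic 200 Hz test tone are assumptions for the example.

```python
import math


def estimate_f0(frame, sample_rate, f0_min=50.0, f0_max=400.0):
    """Estimate F0 (Hz) of one voiced frame from its autocorrelation peak."""
    min_lag = int(sample_rate / f0_max)   # shortest period considered
    max_lag = int(sample_rate / f0_min)   # longest period considered
    n = len(frame)

    def autocorr(lag):
        # Unnormalized autocorrelation at the given lag.
        return sum(frame[i] * frame[i + lag] for i in range(n - lag))

    best_lag = max(range(min_lag, max_lag + 1), key=autocorr)
    return sample_rate / best_lag


# Synthetic test signal (assumption): a pure 200 Hz tone sampled at 8 kHz.
sample_rate = 8000
tone = [math.sin(2 * math.pi * 200.0 * i / sample_rate) for i in range(400)]
f0 = estimate_f0(tone, sample_rate)
```

Running the estimator frame by frame gives an F0 contour whose mean and variation could be used as classifier features alongside formant frequencies. Real speech usually also needs a voicing decision and some protection against octave errors.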

Voice Conversion Using Linear Multivariate Regression Model and LP-PSOLA Synthesis Method (선형다변회귀모델과 LP-PSOLA 합성방식을 이용한 음성변환)

  • Kwon, Hong-Seok;Bae, Keun-Sung (권홍석;배건성)
    • The Journal of the Acoustical Society of Korea / v.20 no.3 / pp.15-23 / 2001
  • This paper presents a voice conversion technique that modifies the utterance of a source speaker so that it sounds as if it were spoken by a target speaker. Feature parameter conversion methods that transform the vocal tract and prosodic characteristics between the source and target speakers are described. The vocal tract characteristics are transformed by modifying the LPC cepstral coefficients using Linear Multivariate Regression (LMR). Prosodic transformation is done by changing the average pitch period between speakers, and it is applied to the residual signal using the LP-PSOLA scheme. Experimental results show that speech transformed by the LMR and LP-PSOLA synthesis method retains many characteristics of the target speaker.
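
The regression step in the abstract above can be pictured as fitting an affine map from source feature vectors to target feature vectors by ordinary least squares. The sketch below is an assumption about the general idea, not the paper's procedure, since the abstract gives no details such as how frames are aligned or grouped. The two-dimensional toy vectors stand in for LPC cepstral vectors, and all function names are hypothetical.

```python
def solve_linear_system(a, b):
    """Solve a @ x = b by Gauss-Jordan elimination with partial pivoting.

    a is an n x n matrix; b is an n x m matrix (lists of lists).
    """
    n = len(a)
    aug = [list(a[i]) + list(b[i]) for i in range(n)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(aug[r][col]))
        if abs(aug[pivot][col]) < 1e-12:
            raise ValueError("matrix is singular")
        aug[col], aug[pivot] = aug[pivot], aug[col]
        pivot_value = aug[col][col]
        aug[col] = [v / pivot_value for v in aug[col]]
        for r in range(n):
            if r != col:
                factor = aug[r][col]
                aug[r] = [vr - factor * vc for vr, vc in zip(aug[r], aug[col])]
    return [row[n:] for row in aug]


def fit_lmr(source, target):
    """Least-squares affine map W such that [source, 1] @ W approximates target."""
    x = [list(s) + [1.0] for s in source]          # append a bias term
    d, m = len(x[0]), len(target[0])
    xtx = [[sum(r[i] * r[j] for r in x) for j in range(d)] for i in range(d)]
    xty = [[sum(r[i] * t[k] for r, t in zip(x, target)) for k in range(m)]
           for i in range(d)]
    return solve_linear_system(xtx, xty)           # normal equations


def convert(weights, vector):
    """Apply the fitted map to one source feature vector."""
    x = list(vector) + [1.0]
    return [sum(x[i] * weights[i][k] for i in range(len(x)))
            for k in range(len(weights[0]))]


# Toy aligned training pairs (assumption): target = A @ source + b.
source = [[0.0, 0.0], [1.0, 0.0], [0.0, 1.0], [1.0, 1.0], [2.0, 1.0]]
target = [[2 * s0 - s1 + 1.0, 0.5 * s0 + 3 * s1 - 2.0] for s0, s1 in source]
weights = fit_lmr(source, target)
converted = convert(weights, [3.0, 2.0])
```

In a voice conversion setting the training pairs would be time-aligned cepstral frames from the two speakers, and the converted vectors would drive the synthesis filter.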


Emotion Recognition Based on Frequency Analysis of Speech Signal

  • Sim, Kwee-Bo;Park, Chang-Hyun;Lee, Dong-Wook;Joo, Young-Hoon
    • International Journal of Fuzzy Logic and Intelligent Systems / v.2 no.2 / pp.122-126 / 2002
  • In this study, we identify features of three emotions (happiness, anger, and surprise) as fundamental research for emotion recognition. An emotional speech signal has several elements, such as voice quality, pitch, formants, and speech rate. Until now, most researchers have used the change of pitch, the short-time average power envelope, or mel-based speech power coefficients. Pitch is a very efficient and informative feature, so we use it in this study. Because pitch is very sensitive to subtle emotion, it changes easily whenever a speaker is in a different emotional state. We can therefore observe whether the pitch changes steeply, changes with a gentle slope, or does not change. This paper also extracts formant features from emotional speech signals. For each vowel, each formant occupies a similar position without large differences. Based on this fact, in the happiness case we extract features of laughter and use them to separate laughter, which simplifies the task. We also find such features for anger and surprise.

Pitch Detection by the Analysis of Speech and EGG Signals (2-채널 (음성 및 EGG) 신호 분석에 의한 피치검출)

  • Shin, Mu-Yong;Kim, Jeong-Cheol;Bae, Keun-Sung
    • The Journal of the Acoustical Society of Korea / v.15 no.5 / pp.5-12 / 1996
  • We propose a two-channel (speech and EGG) pitch detection algorithm. The EGG signal tracks the vibratory motion of the vocal folds very well. Therefore, by using the EGG signal together with the speech signal, we obtain a reliable and robust pitch detection algorithm that minimizes the problems occurring in pitch detection from speech alone. The proposed algorithm gives precise pitch markers that are synchronized to the speech in the time domain. Experimental results demonstrate the superiority of the two-channel pitch detection algorithm over the conventional method, and it can be used to obtain reference pitch for the evaluation of other pitch detection algorithms.
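
To illustrate why an EGG channel helps, the sketch below places pitch marks where the first difference of the EGG signal jumps above a threshold, which is one simple way to approximate glottal closure instants. It is not the algorithm proposed in the paper. The threshold ratio, the polarity (a rising EGG is assumed to mean increasing vocal fold contact), and the synthetic waveform are all assumptions.

```python
def pitch_marks_from_egg(egg, threshold_ratio=0.5):
    """Return sample indices where the EGG first difference crosses upward
    through a threshold set relative to its maximum."""
    diff = [0.0] + [b - a for a, b in zip(egg, egg[1:])]
    threshold = threshold_ratio * max(diff)
    return [
        n for n in range(1, len(diff))
        if diff[n] > threshold and diff[n - 1] <= threshold
    ]


def f0_track(marks, sample_rate):
    """Convert the spacing of successive pitch marks into F0 values in Hz."""
    return [sample_rate / (b - a) for a, b in zip(marks, marks[1:])]


def synthetic_egg_sample(n, period=80, rise=4):
    """Toy EGG cycle: a fast rise (closing) followed by a slow fall (opening)."""
    phase = n % period
    if phase <= rise:
        return phase / rise
    return 1.0 - (phase - rise) / (period - rise)


# Synthetic signal (assumption): 80-sample cycles at 8 kHz, i.e. 100 Hz.
sample_rate = 8000
egg = [synthetic_egg_sample(n) for n in range(800)]
marks = pitch_marks_from_egg(egg)
f0_values = f0_track(marks, sample_rate)
```

Because the marks are sample indices on the shared time axis, they can be transferred directly to a simultaneously recorded speech channel, apart from a small acoustic propagation delay that this sketch ignores.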
