• Title/Abstract/Keywords: speech analysis

Search results: 1,592 items (processing time: 0.025 s)

Acoustic Characteristics of 'Short Rushes of Speech' using Alternate Motion Rates in Patients with Parkinson's Disease

  • 김선우;윤지혜;이승진
    • 말소리와 음성과학 / Vol. 7, No. 2 / pp. 55-62 / 2015
  • It is widely accepted that Parkinson's disease (PD) is the most common cause of hypokinetic dysarthria, and its characteristic 'short rushes of speech' become more evident as the motor disorder grows more severe. Speech alternate motion rates (AMRs) are particularly useful for observing not only rate abnormalities but also deviant speech. However, apart from the perceptual characteristics, relatively little is known about 'short rushes of speech' in the AMRs of patients with PD. The purpose of this study was to examine which acoustic features of 'short rushes of speech' in AMRs are robust indicators of Parkinsonian speech. Numbers of syllabic repetitions (/pə/, /tə/, /kə/) in AMR tasks from 9 patients with PD were analyzed acoustically by inspecting spectrograms from the Computerized Speech Lab. Acoustically, we found three characteristics of 'short rushes of speech': 1) vocalized consonants without closure duration (VC), 76.3%; 2) no consonant segmentation (NC), 18.6%; 3) no vowel formant frequency (NV), 5.1%. Based on these results, 'short rushes of speech' may be associated with a failure to reach and maintain the phonatory targets. To best achieve therapeutic goals and make treatment as efficacious as possible, it is important to incorporate training methods based on both phonation and articulation.

Design and Implementation of Korean Text-to-Speech System

  • 정준구
    • 한국음향학회: Conference Proceedings / 한국음향학회, Proceedings of the 11th Workshop on Speech Communication and Signal Processing, 1994 (SCAS Vol. 11, No. 1) / pp. 91-94 / 1994
  • This paper describes the design and implementation of a Korean text-to-speech system. Parametric synthesis is chosen as the speech synthesis method, and PARCOR coefficients, one of the parameter sets obtained from LPC analysis, are used as the acoustic parameters. The diphone is used as the synthesis unit because it preserves the basic naturalness of human speech. The diphone database consists of 1228 PCM files. LPC synthesis has the drawback of reducing the clarity of the synthesized speech when unvoiced sounds are synthesized. In this paper, we improve the clarity of the synthesized speech by using the residual signal as the excitation signal for unvoiced sounds. In addition, to improve naturalness, we control the prosody of the synthesized speech by controlling the energy and pitch patterns. The synthesis system is implemented on a PC/486 and uses a 70 Hz-4.5 kHz band-pass filter for speech input/output, an amplifier, and a TMS320C30 DSP board.
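
As a rough illustration of the PARCOR parameters mentioned in this abstract, the sketch below runs the Levinson-Durbin recursion on an autocorrelation sequence and returns the reflection (PARCOR) coefficients together with the LPC coefficients. It is not the authors' implementation: the function names `autocorrelation` and `levinson_durbin` and the sign convention (prediction-error filter A(z) = 1 + a_1 z^-1 + ... + a_p z^-p) are assumptions made here, and some texts define PARCOR coefficients with the opposite sign.

```python
def autocorrelation(x, max_lag):
    """Biased autocorrelation r[0..max_lag] of a frame x (hypothetical helper)."""
    n = len(x)
    return [sum(x[i] * x[i + lag] for i in range(n - lag)) for lag in range(max_lag + 1)]


def levinson_durbin(r, order):
    """Solve the LPC normal equations for the given autocorrelation r.

    Returns (a, k, err):
      a   - LPC coefficients with a[0] == 1 (A(z) = 1 + sum a[j] z^-j)
      k   - reflection (PARCOR) coefficients, one per recursion step
      err - final prediction-error energy
    """
    a = [1.0] + [0.0] * order
    k = []
    err = r[0]
    for i in range(1, order + 1):
        # Correlation between the current predictor's error and lag i.
        acc = r[i] + sum(a[j] * r[i - j] for j in range(1, i))
        ki = -acc / err
        prev = a[:]
        # Order update of the predictor coefficients.
        for j in range(1, i):
            a[j] = prev[j] + ki * prev[i - j]
        a[i] = ki
        k.append(ki)
        # Each step shrinks the error energy by (1 - ki^2).
        err *= 1.0 - ki * ki
    return a, k, err
```

For a stable model every reflection coefficient has magnitude below 1, which is one reason this parameter set is convenient to quantize and interpolate in parametric synthesis.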


Speech Intelligibility Analysis on the Vibration Sound of the Window Glass of a Conference Room

  • 김윤호;김희동;김석현
    • 한국소음진동공학회: Conference Proceedings / 한국소음진동공학회, Proceedings of the 2006 Autumn Conference / pp. 150-155 / 2006
  • Speech intelligibility is investigated for a coupled conference room and window glass system. Using an MLS (Maximum Length Sequence) signal as the sound source, the acceleration and velocity responses of the window glass are measured with an accelerometer and a laser Doppler vibrometer. The MTF (Modulation Transfer Function) is used to identify the speech transmission characteristics of the room and window system. The STI (Speech Transmission Index) is calculated from the MTF, and the speech intelligibility of the room and the window glass is estimated. The intelligibility obtained from the acceleration signal is compared with that obtained from the velocity signal, and the possibility of wiretapping is investigated. Finally, the intelligibility of the conversation sound is examined in a subjective test.


Speech Intelligibility Analysis on the Vibration Sound of the Glass Window of a Conference Room

  • 김희동;김윤호;김석현
    • 한국소음진동공학회논문집 / Vol. 17, No. 4 / pp. 363-369 / 2007
  • The purpose of this study is to obtain acoustical information for preventing eavesdropping through a glass window. Speech intelligibility was investigated for the vibration sound detected from the glass window of a conference room. An objective test using the speech transmission index (STI) was performed to estimate the speech intelligibility quantitatively. The STI was determined from the modulation transfer function (MTF) of the room and glass window system. Using a Maximum Length Sequence (MLS) signal as the sound source, the impulse responses of the glass window and the MTF were determined from the signals of accelerometers and a laser Doppler vibrometer. Finally, the speech intelligibility of the interior sound and of the window vibration was compared under different sound pressure levels and amplifier gains to confirm the effect of the measurement conditions on speech intelligibility.
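
The chain described in this abstract (impulse response, then MTF, then STI) can be sketched as follows. This is a simplified illustration, not the authors' procedure: it computes the MTF of a single broadband impulse response with Schroeder's formula and averages the transmission indices without weights, whereas the full STI of IEC 60268-16 works per octave band with band weights and further corrections. The function names, the 14 modulation frequencies, and the test signals are assumptions made here.

```python
import cmath
import math

# Modulation frequencies (Hz) commonly used for STI: 0.63 to 12.5 Hz in 1/3-octave steps.
MOD_FREQS = [0.63, 0.8, 1.0, 1.25, 1.6, 2.0, 2.5, 3.15, 4.0, 5.0, 6.3, 8.0, 10.0, 12.5]


def mtf_from_impulse_response(h, fs, mod_freqs=MOD_FREQS):
    """Schroeder's formula: m(F) = |sum h^2[n] exp(-j 2 pi F n / fs)| / sum h^2[n]."""
    energy = [x * x for x in h]
    total = sum(energy)
    mtf = []
    for f_mod in mod_freqs:
        acc = sum(e * cmath.exp(-2j * math.pi * f_mod * n / fs) for n, e in enumerate(energy))
        mtf.append(abs(acc) / total)
    return mtf


def sti_from_mtf(m_values):
    """Unweighted, single-band STI estimate from modulation transfer values."""
    indices = []
    for m in m_values:
        m = min(max(m, 1e-12), 1.0 - 1e-12)      # keep the ratio below finite
        snr = 10.0 * math.log10(m / (1.0 - m))   # apparent signal-to-noise ratio in dB
        snr = max(-15.0, min(15.0, snr))         # clip to the +/-15 dB range
        indices.append((snr + 15.0) / 30.0)      # transmission index in [0, 1]
    return sum(indices) / len(indices)
```

A single ideal impulse gives an STI of 1, and a longer exponential decay (more reverberation) lowers the value, which is the behavior one would expect when comparing a direct microphone signal with a signal picked up from a vibrating window.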

Speech training aids for deafs

  • 김동준;윤태성;박상희
    • 제어로봇시스템학회: Conference Proceedings / 제어로봇시스템학회, Proceedings of the 1991 Korean Automatic Control Conference (Domestic Session); KOEX, Seoul; 22-24 Oct. 1991 / pp. 746-751 / 1991
  • Deaf people train articulation by observing the mouth of a tutor, sensing the motions of the vocal organs by touch, or using speech training aids. Present speech training aids for the deaf can measure only a single speech parameter, or display only frequency spectra as a histogram or in pseudo-color. This study aims to develop a speech training aid that displays the subject's articulation as a cross section of the vocal organs, together with other speech parameters, in a single system, so that the subject can see what to correct. To this end, the speech production mechanism is first assumed to be an AR model in order to estimate the articulatory motions of the vocal tract from the speech signal. Next, a vocal tract profile model based on LPC analysis is constructed. Using this model, the articulatory motions for Korean vowels are estimated and displayed in the vocal tract profile graphics.


Distorted Speech Rejection For Automatic Speech Recognition under CDMA Wireless Communication

  • 김남수;장준혁
    • 한국음향학회지 / Vol. 23, No. 8 / pp. 597-601 / 2004
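  • This paper introduces a pre-rejection method for distorted speech signals, intended for speech recognition in a CDMA mobile communication environment. We first analyze speech signals distorted in the CDMA mobile communication channel and, based on the distortion mechanism identified, propose an algorithm that rejects channel-distorted speech signals on the basis of the quasi-periodicity of speech. Experiments show that the proposed pre-rejection method can be applied to speech recognition with a small computational load and can effectively reject channel-distorted speech signals in a CDMA environment.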

A Comparative Study on the Voices of Clergymen: Ministers vs. Priests

  • 이은선;박상희;조성미;정옥란;석동일
    • 음성과학 / Vol. 10, No. 3 / pp. 79-86 / 2003
  • This study compared the voices of ministers and priests. There has been a common notion that ministers are more passionate than priests in delivering their speech. It can therefore be assumed that ministers abuse or misuse their voices more than priests do. This study performed an acoustic analysis of the voices of 6 ministers and 5 priests before and after their speech. We measured F0, jitter, shimmer, NNE and HNR using Dr. Speech (Version 4.0, Tiger DRS). A t-test was performed to determine any objective differences between their voices. The results showed no significant differences between the voices of ministers and priests before and after their speech. However, there seemed to be an interesting opposite tendency between ministers and priests, although it did not reach statistical significance. That is, F0 tended to increase after the speech in ministers, whereas it tended to decrease in priests. In addition, HNR tended to decrease after the speech in priests, while it tended to increase in ministers.
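
For readers unfamiliar with two of the measures named in this abstract, the sketch below computes jitter and shimmer in their common "local" form: the mean absolute difference between consecutive cycles divided by the mean value. This is an assumption made for illustration. The abstract does not say which definitions Dr. Speech uses, and the function names and sample values here are hypothetical.

```python
def local_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


def local_jitter(periods):
    """Jitter as a ratio, from consecutive pitch periods (any time unit)."""
    return local_perturbation(periods)


def local_shimmer(peak_amplitudes):
    """Shimmer as a ratio, from the peak amplitude of each pitch cycle."""
    return local_perturbation(peak_amplitudes)
```

A perfectly regular voice gives 0 for both measures, and multiplying the ratio by 100 gives the percentage form in which these values are usually reported.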


Detection of Glottal Closure Instant for Voiced Speech Using Wavelet Transform

  • 배건성
    • 음성과학 / Vol. 7, No. 3 / pp. 153-165 / 2000
  • During the phonation of voiced sounds, the glottis opens and closes repeatedly because of the periodic vibration of the vocal cords. The instant at which the glottis closes is called the glottal closure instant (GCI) or epoch. The correct detection of the GCI is one of the important problems in speech processing, for pitch detection, pitch synchronous analysis, and so on. Recently, it has been shown that the local maxima of the wavelet transformed speech signal correspond to the GCIs of the speech signal. In this paper, we investigate the accuracy of GCIs estimated from the wavelet transformed speech signal. For this purpose we compare them with the negative peak points of the differentiated EGG signal, which represent the actual GCIs of the speech signal.


CASA-based Front-end Using Two-channel Speech for the Performance Improvement of Speech Recognition in Noisy Environments

  • 박지훈;윤재삼;김홍국
    • 대한전자공학회: Conference Proceedings / 대한전자공학회, Proceedings of the 2007 Summer Conference / pp. 289-290 / 2007
  • To improve the performance of a speech recognition system in the presence of noise, we propose a noise-robust front-end that uses two-channel speech signals and separates speech from noise based on computational auditory scene analysis (CASA). The main cues for the separation are the interaural time difference (ITD) and the interaural level difference (ILD) between the two channel signals. As a result, 39 cepstral coefficients are extracted from the separated speech components. Speech recognition experiments show that the proposed front-end outperforms the ETSI front-end with single-channel speech.
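
The two separation cues named in this abstract can be illustrated with a minimal sketch: ITD as the lag that maximizes the cross-correlation between the channels, and ILD as the energy ratio in decibels. This is a broadband, whole-signal simplification under assumptions made here. CASA systems typically estimate these cues per time-frequency unit, and the abstract does not describe the authors' estimator. The function names and sign conventions are hypothetical.

```python
import math


def ild_db(left, right):
    """Interaural level difference in dB; positive when the left channel is stronger."""
    e_left = sum(x * x for x in left)
    e_right = sum(x * x for x in right)
    return 10.0 * math.log10(e_left / e_right)


def itd_samples(left, right, max_lag):
    """Lag (in samples) maximizing sum_i left[i] * right[i + lag].

    A positive result means the right channel lags the left channel.
    Divide by the sample rate to obtain the ITD in seconds.
    """
    n = min(len(left), len(right))
    best_lag, best_corr = 0, float("-inf")
    for lag in range(-max_lag, max_lag + 1):
        corr = 0.0
        for i in range(n):
            j = i + lag
            if 0 <= j < n:
                corr += left[i] * right[j]
        if corr > best_corr:
            best_corr, best_lag = corr, lag
    return best_lag
```

A source closer to the left microphone would show up as a positive lag and a positive level difference under these conventions.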


A study on the recognition performance of connected digit telephone speech for MFCC feature parameters obtained from the filter bank adapted to training speech database

  • 정성윤;김민성;손종목;배건성;강점자
    • 대한음성학회: Conference Proceedings / 대한음성학회, Proceedings of the May 2003 Conference / pp. 119-122 / 2003
  • In general, triangular filters are used in the filter bank when MFCCs are computed from the spectrum of a speech signal. In [1], a new feature extraction approach is proposed that uses filter shapes obtained from the spectrum of the training speech data. In this approach, principal component analysis is applied to the spectrum of the training data to obtain the filter coefficients. In this paper, we carry out speech recognition experiments using the approach given in [1] on a large amount of telephone speech data, namely the telephone speech database of Korean connected digits released by SITEC. The experimental results are discussed together with our findings.
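
The conventional baseline that this abstract starts from, a bank of triangular filters spaced evenly on the mel scale, can be sketched as below. The data-adapted filter shapes of [1], which come from principal component analysis of training spectra, are not shown. The mel formula 2595 log10(1 + f/700) is one common variant, and the function names and example parameters (8 kHz sampling, 256-point FFT, 23 filters) are assumptions chosen for illustration, not values from the paper.

```python
import math


def hz_to_mel(f_hz):
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def triangular_filter_bank(n_filters, n_fft, sample_rate):
    """Return n_filters rows of weights over the n_fft // 2 + 1 FFT bins."""
    n_bins = n_fft // 2 + 1
    mel_max = hz_to_mel(sample_rate / 2.0)
    # n_filters + 2 edge frequencies, equally spaced on the mel scale.
    edges = [mel_to_hz(mel_max * i / (n_filters + 1)) for i in range(n_filters + 2)]
    bin_hz = [k * sample_rate / n_fft for k in range(n_bins)]
    bank = []
    for m in range(1, n_filters + 1):
        lo, mid, hi = edges[m - 1], edges[m], edges[m + 1]
        row = []
        for f in bin_hz:
            if lo < f <= mid:
                row.append((f - lo) / (mid - lo))   # rising slope
            elif mid < f < hi:
                row.append((hi - f) / (hi - mid))   # falling slope
            else:
                row.append(0.0)
        bank.append(row)
    return bank
```

Each row would be multiplied with a frame's power spectrum and summed to give one filter-bank energy. Log compression and a discrete cosine transform then yield the MFCCs.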
