Integrated Search | Korea Science

VOICE SOURCE ESTIMATION USING SEQUENTIAL SVD AND EXTRACTION OF COMPOSITE SOURCE PARAMETERS USING EM ALGORITHM

Hong, Sung-Hoon;Choi, Hong-Sub;Ann, Sou-Guil
- Acoustical Society of Korea: Conference Proceedings
- /
- Acoustical Society of Korea, 1994, FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA
- /
- pp.893-898
- /
- 1994
This paper examines the influence of voice source estimation and modeling on speech synthesis and coding, then proposes new estimation and modeling techniques and verifies them by computer simulation. Existing speech synthesizers are known to produce speech that sounds dull and lifeless. These problems arise because existing estimation and modeling techniques cannot provide sufficiently accurate voice parameters. We therefore propose a new voice source estimation algorithm and modeling techniques that can represent a variety of source characteristics. First, we divide the speech samples in one pitch region into four parts having different characteristics. Second, the vocal-tract parameters and voice source waveforms are estimated differently in each region using sequential SVD. Third, we propose a composite source model, a new voice source model represented by a weighted sum of pre-defined basis functions. Finally, the weights and time-shift parameters of the proposed composite source model are estimated using the EM (expectation-maximization) algorithm. Experimental results indicate that the proposed estimation and modeling methods estimate voice source waveforms more accurately and represent various source characteristics.

Estimation of Articulatory Characteristics of Vowels Using 'ArtSim'

김대현;조철우
- Journal of the Korean Society of Phonetic Sciences: Malsori
- /
- No. 35-36
- /
- pp.121-129
- /
- 1998
In this paper, the articulatory simulator 'Artsim' is used as an experimental tool to examine the articulatory characteristics of six different vowels. Each vowel is defined by a set of articulatory points derived from its vocal tract area function and tongue shape. Each point is varied systematically to synthesize vowels, and the synthesized sounds are evaluated by human listeners. Finally, the distribution of each vowel within the vowel space is obtained. The experimental results verify that our articulatory simulator can be used effectively to investigate the articulatory characteristics of speech.

Time-varying Estimation of Vocal Track Parameters During the Speech Transition Regions

최홍섭
- The Journal of the Acoustical Society of Korea
- /
- Vol. 16, No. 2
- /
- pp.101-106
- /
- 1997
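To find feature parameters in speech transition regions, this paper proposes the SSRLS (Sample Selective RLS) method. It uses an AR model to adaptively locate the glottal closure interval and then estimates the vocal tract parameters over the remaining samples, which removes the pitch bias effect of the voice source. To compare performance, formant estimation experiments were carried out on synthetic and real speech, and the results show that the proposed method outperforms WRLS.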
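The sample-selective idea can be illustrated with a minimal sketch: a standard exponentially weighted RLS update for an AR model that simply skips the samples flagged for exclusion. The function name `sample_selective_rls`, the boolean mask `use_sample`, and the default values of `lam` and `delta` are assumptions made for illustration. The abstract does not say how the excluded interval is detected, so the mask is taken as given here.

```python
import numpy as np

def sample_selective_rls(x, order, use_sample, lam=0.99, delta=100.0):
    """Estimate AR coefficients a so that x[n] ~ sum_k a[k] * x[n-1-k].

    use_sample is a hypothetical boolean mask: samples where it is False
    are left out of the update (for example, samples driven by the source
    excitation), which is the "sample selective" part of this sketch.
    """
    x = np.asarray(x, dtype=float)
    a = np.zeros(order)            # current coefficient estimate
    P = delta * np.eye(order)      # inverse correlation matrix estimate
    for n in range(order, len(x)):
        if not use_sample[n]:
            continue               # skip excluded samples entirely
        phi = x[n - order:n][::-1]             # x[n-1], x[n-2], ...
        err = x[n] - a @ phi                   # a priori prediction error
        k = P @ phi / (lam + phi @ P @ phi)    # gain vector
        a = a + k * err
        P = (P - np.outer(k, phi @ P)) / lam
    return a
```

On a synthetic AR signal driven by a periodic pulse train, masking out the pulse positions lets the update see only samples that follow the AR recursion exactly, so the pulses no longer bias the estimate.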

A Study on Effects of the vocal psychotherapy upon Self-Consciousness

이현주
- Journal of Music and Human Behavior
- /
- Vol. 4, No. 2
- /
- pp.66-83
- /
- 2007
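The purpose of this study is to examine how vocal psychotherapy activities affect changes in the level of self-consciousness, and how the content of self-reports differs according to differences in self-consciousness level. The study was conducted with three female students enrolled at E University in Seoul, over a total of 10 sessions of about 60 minutes each. The participants had volunteered in response to a call for music therapy program applicants, and the vocal psychotherapy program used in the study consisted of four stages. To assess changes in the participants' level of self-consciousness, a self-consciousness test was administered before and after the program, and after every session the participants wrote a self-report, whose content was analyzed for change. First, the study examined the effect of vocal psychotherapy activities on changes in self-consciousness test scores. When the test was administered before and after the program, the total score and subscale scores changed differently for each individual. To look at changes not visible in the pre/post total and subscale scores, item-level changes were examined: of the 28 items, 13 items for participant A, 16 for participant B, and 19 for participant C changed by 1 to 2 points. This result does not appear in the rate of score change. Taken together, these findings suggest that vocal psychotherapy activities influence increases and decreases in the subscale scores of the self-consciousness test. Second, by analyzing the self-reports written after each session, the study examined how the reported content differed according to self-consciousness level over the course of the activities. The qualitative analysis showed that, depending on individual differences in self-consciousness, perceptions of expressing oneself with one's voice, perceptions of one's own voice, and the degree and content of awareness of others varied across sessions and individuals. In detail, participant A, who had relatively high private and public self-consciousness scores, was highly aware of her own inner thoughts and feelings, and her self-reports frequently showed a tendency to focus attention on herself as a social object. Positive emotional experiences such as emotional richness and artistic curiosity appeared often, as did negative emotions such as trait and state anxiety and neurotic tendencies. Participant B, who like A had relatively high private and public self-consciousness scores, resembled A in focusing attention on herself as a social object, but frequently reported social-anxiety tendencies such as feeling anxious and embarrassed in front of others. Finally, participant C, whose self-consciousness scores were low relative to the other participants, reported relatively less awareness of others than A and B, and also fewer negative emotions such as trait and state anxiety. The study is meaningful in that it addresses people who experience difficulty because of the self-evaluation process activated by self-consciousness, in that it mainly uses the voice, which is closely tied to the self, and in that vocal skills training was provided to relieve technical difficulties. It also suggests that vocal psychotherapy activities can bring about positive changes in mental health and personality development.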

Emotion recognition in speech using hidden Markov model

김성일;정현열
- Journal of the Institute of Convergence Signal Processing
- /
- Vol. 3, No. 3
- /
- pp.21-26
- /
- 2002
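This paper describes a new approach to recognizing human emotional states such as anger, happiness, calm, sadness, and surprise. The approach uses continuous hidden Markov models (HMMs) with discrete duration. To this end, emotional feature parameters are first defined from the input speech signal. This study uses prosodic parameters, namely the pitch signal, energy, and their respective derivatives, and trains HMMs on them. In addition, emotion models based on maximum a posteriori (MAP) estimation are used for speaker adaptation. The experimental results show that the emotion recognition rate in speech increases gradually as the number of adaptation samples increases.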
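As an illustration of the prosodic feature set described above (pitch, energy, and their derivatives), the sketch below stacks per-frame pitch and energy contours with their time derivatives into four-dimensional feature vectors. The function name `prosodic_features` is hypothetical, the contours are assumed to be already extracted, and the central-difference derivative is an assumption; the paper may use a different estimator.

```python
import numpy as np

def prosodic_features(pitch, energy):
    """Stack per-frame pitch, energy, and their time derivatives.

    Returns an array of shape (num_frames, 4) with columns
    [pitch, energy, d_pitch, d_energy]. Derivatives use np.gradient
    (central differences inside, one-sided differences at the ends).
    """
    pitch = np.asarray(pitch, dtype=float)
    energy = np.asarray(energy, dtype=float)
    d_pitch = np.gradient(pitch)
    d_energy = np.gradient(energy)
    return np.column_stack([pitch, energy, d_pitch, d_energy])
```

Each row of the result could then serve as one observation vector for HMM training.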

Voice Source Estimation Using Robust Sequential SVD

홍성훈
- Acoustical Society of Korea: Conference Proceedings
- /
- Acoustical Society of Korea, Proceedings of the 1993 Conference, Vol. 12, No. 1
- /
- pp.75-79
- /
- 1993
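This paper proposes a new sequential processing algorithm for estimating rapidly varying voice source waveforms. We 1) examine the problems of RLS (recursive least squares), a representative existing sequential analysis method; 2) to address them, reconstruct the observation matrix with a reduced-rank SVD (singular value decomposition) of optimal order; and 3) apply a robustness concept to find the optimal vocal tract parameters, then apply inverse filtering to separate out the voice source effectively. With the proposed method, rapidly varying voice source waveforms can be estimated well, and the vocal tract parameters, with the source characteristics separated out, can also be estimated effectively. This work is essential for improving naturalness and realizing speaker individuality in speech synthesis, and it can be used to represent various types of speech. It can also be used in speech coding, speaker recognition, and speech recognition.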
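Steps 2) and 3) can be sketched in batch form: solve for AR (vocal tract) coefficients through a rank-truncated SVD of the observation matrix, then inverse-filter the speech to obtain a source estimate. This is only an assumed simplification. The function names `reduced_rank_ar` and `inverse_filter` are hypothetical, and the sequential update and the robustness step of the paper are not reproduced here.

```python
import numpy as np

def reduced_rank_ar(x, order, rank):
    """Least-squares AR fit through a rank-truncated SVD.

    Model: x[n] ~ sum_k a[k] * x[n-1-k]. Only the `rank` largest
    singular values of the observation matrix are kept.
    """
    x = np.asarray(x, dtype=float)
    # Row for time n holds the `order` preceding samples, newest first.
    A = np.array([x[n - order:n][::-1] for n in range(order, len(x))])
    b = x[order:]
    U, s, Vt = np.linalg.svd(A, full_matrices=False)
    r = min(rank, len(s))
    # Truncated pseudo-inverse applied to b.
    return Vt[:r].T @ ((U[:, :r].T @ b) / s[:r])

def inverse_filter(x, a):
    """Residual e[n] = x[n] - sum_k a[k] * x[n-1-k] (source estimate)."""
    x = np.asarray(x, dtype=float)
    e = x.copy()
    for n in range(len(x)):
        for k in range(len(a)):
            if n - 1 - k >= 0:
                e[n] -= a[k] * x[n - 1 - k]
    return e
```

Choosing `rank` below `order` discards the weakest singular directions, which is the usual motivation for a reduced-rank solution when the observation matrix is ill-conditioned.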

Recognition of Emotion and Emotional Speech Based on Prosodic Processing

Kim, Sung-Ill
- The Journal of the Acoustical Society of Korea
- /
- Vol. 23, No. 3E
- /
- pp.85-90
- /
- 2004
This paper presents two new approaches. The first concerns recognition of emotional speech, such as speech uttered in anger, happiness, a normal state, sadness, or surprise. The second concerns emotion recognition in speech. For the proposed speech recognition system, which handles human speech with emotional states, a total of nine kinds of prosodic features were first extracted and then passed to a prosodic identifier. In evaluation, the recognition rates on emotional speech with the proposed method improved over those of the existing speech recognizer. For emotion recognition, on the other hand, four kinds of prosodic parameters, namely pitch, energy, and their derivatives, were proposed and then trained with discrete duration continuous hidden Markov models (DDCHMM). In this approach, the emotional models were adapted to a specific speaker's speech using maximum a posteriori (MAP) estimation. In evaluation, the recognition rates for vocal emotions gradually increased as the number of adaptation samples increased.
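The link between adaptation sample count and the adapted model can be illustrated with the textbook MAP update for a single Gaussian mean. This is a sketch under stated assumptions, not the paper's procedure: the function name `map_adapt_mean` and the prior weight `tau` are hypothetical, and only one mean vector is adapted rather than a full HMM.

```python
import numpy as np

def map_adapt_mean(prior_mean, samples, tau=10.0):
    """MAP estimate of a Gaussian mean with a conjugate prior.

    mean_map = (tau * prior_mean + sum(samples)) / (tau + n)
    tau is an assumed prior weight: with few samples the result stays
    near prior_mean, with many it approaches the sample mean.
    """
    prior_mean = np.asarray(prior_mean, dtype=float)
    # Shape (n, d); an empty input becomes shape (0, d).
    samples = np.asarray(samples, dtype=float).reshape(-1, len(prior_mean))
    n = samples.shape[0]
    return (tau * prior_mean + samples.sum(axis=0)) / (tau + n)
```

As more speaker-specific samples arrive, the adapted mean moves from the speaker-independent prior toward the speaker's own statistics, which is consistent with recognition rates rising as the number of adaptation samples grows.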

Remote Articulation Training System for the Deafs

이재혁;유선국;박상희
- Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics
- /
- Vol. 7, No. 1
- /
- pp.43-49
- /
- 1996
This study introduces a remote articulation training system that connects a hearing-impaired trainee and a speech therapist via B-ISDN. A hearing-impaired person has no auditory feedback of his own pronunciation, so the chance to watch the movement trajectory of his speech organs enables self-training of articulation. The system therefore serves two purposes: self-training of articulation and on-line checking by a trainer at a remote site. We estimate the articulatory movements of the vocal tract from the speech signal using inverse modelling and display the movement trajectory graphically on a side view of the human face. The trajectories of the trainee's articulation are displayed along with reference trajectories, so the trainee can adjust his articulation to make the two trajectories overlap. For on-line communication and for checking training records, the system supports video conferencing and transfer of articulatory data.

Listener's Age Estimation by Prosody Manipulation

김지연;성철재
- Phonetics and Speech Sciences
- /
- Vol. 6, No. 2
- /
- pp.81-88
- /
- 2014
The normal aging process affects speech production, and these changes are perceived by listeners. This study examined whether age perception by normal listeners changed under various conditions of prosodic manipulation, comparing the prosodic changes according to age and sex in adulthood. The older and younger voices were resynthesized by manipulating the speaking rate and pitch to shift the perceived age of the groups toward each other. Two-way repeated-measures ANOVAs were conducted to determine whether the type of resynthesized prosodic cue resulted in a significant shift in the perceived age of young and old voices. Manipulating the speaking rate resulted in a significant shift in perceived age for both the older and younger groups. No significant shift in age estimates was observed for the younger male group when pitch was manipulated. There were significant gender-by-age group interactions for prosodic manipulation type. Age-related changes in the prosodic properties of speech may ultimately influence speech perception.
https://doi.org/10.13064/KSSS.2014.6.2.081

Efficient Tracking of Speech Formant Using Closed Phase WRLS-VFF-VT Algorithm

Lee, Kyo-Sik;Park, Kyu-Sik
- The Journal of the Acoustical Society of Korea
- /
- Vol. 19, No. 2E
- /
- pp.8-13
- /
- 2000
In this paper, we present an adaptive formant tracking algorithm for speech using the closed phase WRLS-VFF-VT method. The pitch-synchronous closed phase method is known to give more accurate estimates of the vocal tract parameters than the pitch-asynchronous method. However, the use of pitch-synchronous closed phase analysis has been limited by the difficulty of accurately isolating the closed phase region in successive periods of speech. We have therefore implemented the pitch-synchronous closed phase WRLS-VFF-VT algorithm for speech analysis, especially for formant tracking. The proposed algorithm with the variable threshold (VT) can provide superior performance at phone boundaries and at voiced/unvoiced boundaries. The proposed method is compared experimentally with another method, the two-channel CPC method, using synthetic waveforms and real speech data. The experimental results show that block data processing techniques such as the two-channel CPC give reasonable estimates of the formants and antiformants. However, the data windows used by these methods include the effects of the periodic excitation pulses, which affect the accuracy of the estimated formants. The proposed WRLS-VFF-VT method, on the other hand, eliminates the influence of the pulse excitation by using input estimation as part of the algorithm, and gives very accurate formant and bandwidth estimates and good spectral matching.
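Whatever algorithm produces the vocal tract (AR) coefficients, formant and bandwidth estimates are commonly read off the poles of the AR polynomial. The sketch below shows that standard conversion. The function name `formants_from_ar` and the sign convention for the coefficients are assumptions for illustration, and the WRLS-VFF-VT update itself is not reproduced.

```python
import numpy as np

def formants_from_ar(a, fs):
    """Convert AR coefficients into formant frequencies and bandwidths (Hz).

    Assumed convention: x[n] ~ sum_k a[k] * x[n-1-k], so the poles are
    the roots of z^p - a[0] z^(p-1) - ... - a[p-1].
    For a pole r * exp(j*theta): F = theta*fs/(2*pi), B = -ln(r)*fs/pi.
    """
    poly = np.concatenate(([1.0], -np.asarray(a, dtype=float)))
    roots = np.roots(poly)
    roots = roots[np.imag(roots) > 0]      # keep one pole per conjugate pair
    freqs = np.angle(roots) * fs / (2 * np.pi)
    bws = -np.log(np.abs(roots)) * fs / np.pi
    idx = np.argsort(freqs)
    return freqs[idx], bws[idx]
```

Applying this to the coefficient vector at each time step of an adaptive estimator gives a formant track over time.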

