• 제목/요약/키워드: synthetic voice

검색결과 29건 처리시간 0.027초

감성 평가를 이용한 듣기 좋은 음성 합성음에 대한 연구 (Evaluation of Synthetic Voice which is Agreeable to the Ear Using Sensibility Ergonomics Method)

  • 박용국;김재국;전용웅;조암
    • 대한인간공학회지
    • /
    • 제21권1호
    • /
    • pp.51-65
    • /
    • 2002
  • As the method of providing information is getting multimedia, the synthetic voice is used in not only CTI(Computer Telephony Integration), information service for the blind, but also applications on internet. But properties of synthetic voice, such as speech rate, pitch, timbre and so on, are not adjusted to customers' preference but providers' preference. In order to consider customers' preference, this study proposed four subjective factors of voice through the evaluation of voice using the method of sensibility ergonomics. And the relation synthetic voice to be agreeable to the ear with emotional images was formulated as a fuzzy model. Consequently, this study proposed the speech rate and pitch of synthetic voice which is agreeable to the ear.

HMM 기반의 한국어 음성합성에서 음색변환에 관한 연구 (A Study on the Voice Conversion with HMM-based Korean Speech Synthesis)

  • 김일환;배건성
    • 대한음성학회지:말소리
    • /
    • 제68권
    • /
    • pp.65-74
    • /
    • 2008
  • A statistical parametric speech synthesis system based on the hidden Markov models (HMMs) has grown in popularity over the last few years, because it needs less memory and low computation complexity and is suitable for the embedded system in comparison with a corpus-based unit concatenation text-to-speech (TTS) system. It also has the advantage that voice characteristics of the synthetic speech can be modified easily by transforming HMM parameters appropriately. In this paper, we present experimental results of voice characteristics conversion using the HMM-based Korean speech synthesis system. The results have shown that conversion of voice characteristics could be achieved using a few sentences uttered by a target speaker. Synthetic speech generated from adapted models with only ten sentences was very close to that from the speaker dependent models trained using 646 sentences.

  • PDF

표본화율 변환을 이용한 합성음의 운율제어 (Prosody Control of the Synthetic Speech using Sampling Rate Conversion)

  • 이현구;홍광석
    • 대한전자공학회:학술대회논문집
    • /
    • 대한전자공학회 1999년도 추계종합학술대회 논문집
    • /
    • pp.676-679
    • /
    • 1999
  • In this paper, we presents a method to control prosody of the synthetic speech using sampling rate conversion technique. In prosody control, the conventional methods perform overlap and add. So the synthetic speech has a distortion and the voice quality is not satisfied. Using sampling rate conversion technique, we can get high Qualify of the synthetic speech. Also we can control various talking speeds according to speaker's patterns.

  • PDF

음성합성시스템을 위한 음색제어규칙 연구 (A Study on Voice Color Control Rules for Speech Synthesis System)

  • 김진영;엄기완
    • 음성과학
    • /
    • 제2권
    • /
    • pp.25-44
    • /
    • 1997
  • When listening the various speech synthesis systems developed and being used in our country, we find that though the quality of these systems has improved, they lack naturalness. Moreover, since the voice color of these systems are limited to only one recorded speech DB, it is necessary to record another speech DB to create different voice colors. 'Voice Color' is an abstract concept that characterizes voice personality. So speech synthesis systems need a voice color control function to create various voices. The aim of this study is to examine several factors of voice color control rules for the text-to-speech system which makes natural and various voice types for the sounding of synthetic speech. In order to find such rules from natural speech, glottal source parameters and frequency characteristics of the vocal tract for several voice colors have been studied. In this paper voice colors were catalogued as: deep, sonorous, thick, soft, harsh, high tone, shrill, and weak. For the voice source model, the LF-model was used and for the frequency characteristics of vocal tract, the formant frequencies, bandwidths, and amplitudes were used. These acoustic parameters were tested through multiple regression analysis to achieve the general relation between these parameters and voice colors.

  • PDF

Jitter 합성에 의한 음질변환에 관한 연구 (Voice quality transform using jitter synthesis)

  • 조철우
    • 말소리와 음성과학
    • /
    • 제10권4호
    • /
    • pp.121-125
    • /
    • 2018
  • This paper describes procedures of changing and measuring voice quality in terms of jitter. Jitter synthesis method was applied to the TD-PSOLA analysis system of the Praat software. The jitter component is synthesized based on a Gaussian random noise model. The TD-PSOLA re-synthesize process is used to synthesize the modified voice with artificial jitter. Various vocal jitter parameters are used to measure the change in quality caused by artificial systematic jitter change. Synthetic vowels, natural vowels and short sentences are used to check the change in voice quality through the synthesizer model. The results shows that the suggested method is useful for voice quality control in a limited way and can be used to alter the jitter component of voice.

음원 모델에 기초한 합성음의 피치 조절 (Pitch Modification based on a Voice Source Model)

  • 최용진;여수진;김진영;성굉모
    • 음성과학
    • /
    • 제3권
    • /
    • pp.132-147
    • /
    • 1998
  • Previously developed methods for pitch modification have not been based on the voice source model. Therefore, the synthesized speech often sounds unnatural although it may be highly intelligible. The purpose of this paper is to analyze the alteration of a voice source signal with pitch period and to establish the pitch-modification rule based on the result of this analysis. We examine the alteration of the interval of closing phase, closed phase and open phase using the excitation waveform as the pitch increases. In comparison to the previous methods which performed directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study might benefit the application to the speaker identification and the voice color conversion. Therefore the proposed method will provide high quality synthetic speech.

  • PDF

머리 전달 함수를 이용한 합성 스테레오 음향 반향 제거기 (Acoustic Echo Canceller for Synthetic Stereo Using HRTF)

  • 박장식;백주순;손경식
    • 한국멀티미디어학회:학술대회논문집
    • /
    • 한국멀티미디어학회 2002년도 춘계학술발표논문집(상)
    • /
    • pp.149-153
    • /
    • 2002
  • In this brief, Acoustic echo cancellation scheme is proposed to enhance the presence of multiple participants of hands-free voice and video conference. Synthetic stereo using head related transfer function and the stereo echo cancellation scheme are proposed. It is shown that the proposed synthetic stereo echo cancellation scheme is well performed by computer simulation.

  • PDF

APPLICATION OF KOREAN TEXT-TO-SPEECH FOR X.400 MHS SYSTEM

  • Kim, Hee-Dong;Koo, Jun-Mo;Choi, Ho-Joon;Kim, Sang-Taek
    • 한국음향학회:학술대회논문집
    • /
    • 한국음향학회 1994년도 FIFTH WESTERN PACIFIC REGIONAL ACOUSTICS CONFERENCE SEOUL KOREA
    • /
    • pp.885-892
    • /
    • 1994
  • This paper presents the Korean text-to-speech (TTS) algorithm with speed and intonation control capability, and describes the development of the Voice message delivery system employing this TTS algorithm. This system allows the Interpersonal Messaging (IPM) Service users of Message Handling System (MHS) to send his/her text messages to user via telephone line using synthetic voice. In the X.400 MHS recommendation, the protocols and service elements are not specified for the voice message delivery system. Thus, we defined access protocol and service elements for Voice Access Unit based on the application program interface for message transfers between X.400 Message Transfer Agent and Voice Access Unit. The system architecture and operations will be provided.

  • PDF

기본주파수와 성도길이의 상관관계를 이용한 HTS 음성합성기에서의 목소리 변환 (Voice transformation for HTS using correlation between fundamental frequency and vocal tract length)

  • 유효근;김영관;서영주;김회린
    • 말소리와 음성과학
    • /
    • 제9권1호
    • /
    • pp.41-47
    • /
    • 2017
  • The main advantage of the statistical parametric speech synthesis is its flexibility in changing voice characteristics. A personalized text-to-speech(TTS) system can be implemented by combining a speech synthesis system and a voice transformation system, and it is widely used in many application areas. It is known that the fundamental frequency and the spectral envelope of speech signal can be independently modified to convert the voice characteristics. Also it is important to maintain naturalness of the transformed speech. In this paper, a speech synthesis system based on Hidden Markov Model(HMM-based speech synthesis, HTS) using the STRAIGHT vocoder is constructed and voice transformation is conducted by modifying the fundamental frequency and spectral envelope. The fundamental frequency is transformed in a scaling method, and the spectral envelope is transformed through frequency warping method to control the speaker's vocal tract length. In particular, this study proposes a voice transformation method using the correlation between fundamental frequency and vocal tract length. Subjective evaluations were conducted to assess preference and mean opinion scores(MOS) for naturalness of synthetic speech. Experimental results showed that the proposed voice transformation method achieved higher preference than baseline systems while maintaining the naturalness of the speech quality.

Face-to-face Communication in Cyberspace using Analysis and Synthesis of Facial Expression

  • Shigeo Morishima
    • 한국방송∙미디어공학회:학술대회논문집
    • /
    • 한국방송공학회 1999년도 KOBA 방송기술 워크샵 KOBA Broadcasting Technology Workshop
    • /
    • pp.111-118
    • /
    • 1999
  • Recently computer can make cyberspace to walk through by an interactive virtual reality technique. An a avatar in cyberspace can bring us a virtual face-to-face communication environment. In this paper, an avatar is realized which has a real face in cyberspace and a multiuser communication system is constructed by voice transmitted through network. Voice from microphone is transmitted and analyzed, then mouth shape and facial expression of avatar are synchronously estimated and synthesized on real time. And also an entertainment application of a real-time voice driven synthetic face is introduced and this is an example of interactive movie. Finally, face motion capture system using physics based face model is introduced.