• Title/Summary/Keyword: synthetic voice

Evaluation of Synthetic Voice which is Agreeable to the Ear Using Sensibility Ergonomics Method (감성 평가를 이용한 듣기 좋은 음성 합성음에 대한 연구)

  • Park, Yong-Kuk;Kim, Jae-Kuk;Jeon, Yong-Woong;Cho, Am
    • Journal of the Ergonomics Society of Korea / v.21 no.1 / pp.51-65 / 2002
  • As information delivery becomes increasingly multimedia-based, synthetic voice is used not only in CTI (Computer Telephony Integration) and information services for the blind, but also in Internet applications. However, properties of synthetic voice such as speech rate, pitch, and timbre are set according to providers' preferences rather than customers'. To take customers' preferences into account, this study proposed four subjective factors of voice, obtained by evaluating voices with the sensibility ergonomics method. The relation between synthetic voice that is agreeable to the ear and emotional images was then formulated as a fuzzy model. As a result, this study proposed the speech rate and pitch of synthetic voice that is agreeable to the ear.

A Study on the Voice Conversion with HMM-based Korean Speech Synthesis (HMM 기반의 한국어 음성합성에서 음색변환에 관한 연구)

  • Kim, Il-Hwan;Bae, Keun-Sung
    • MALSORI / v.68 / pp.65-74 / 2008
  • Statistical parametric speech synthesis based on hidden Markov models (HMMs) has grown in popularity over the last few years, because it needs less memory and lower computational complexity than a corpus-based unit-concatenation text-to-speech (TTS) system and is therefore suitable for embedded systems. It also has the advantage that the voice characteristics of the synthetic speech can be modified easily by transforming the HMM parameters appropriately. In this paper, we present experimental results of voice characteristics conversion using an HMM-based Korean speech synthesis system. The results show that voice characteristics can be converted using only a few sentences uttered by a target speaker. Synthetic speech generated from models adapted with only ten sentences was very close to that from speaker-dependent models trained on 646 sentences.

Prosody Control of the Synthetic Speech using Sampling Rate Conversion (표본화율 변환을 이용한 합성음의 운율제어)

  • 이현구;홍광석
    • Proceedings of the IEEK Conference / 1999.11a / pp.676-679 / 1999
  • In this paper, we present a method to control the prosody of synthetic speech using a sampling rate conversion technique. Conventional prosody control methods perform overlap-and-add, so the synthetic speech is distorted and the voice quality is unsatisfactory. Using the sampling rate conversion technique, we can obtain high-quality synthetic speech. We can also control the speaking rate according to the speaker's patterns.
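
As a rough illustration of the sampling rate conversion idea in the abstract above, the following Python sketch resamples a signal by linear interpolation. It is not the authors' method: the function name `resample_linear` and the `ratio` parameter are assumptions made for this example, and a practical converter would use band-limited (anti-aliasing) filtering rather than plain linear interpolation.

```python
def resample_linear(samples, ratio):
    """Resample a signal to about len(samples) * ratio points.

    Hypothetical helper for illustration only. Linear interpolation is a
    simplification; it does not suppress aliasing when ratio < 1.
    """
    if ratio <= 0 or len(samples) < 2:
        raise ValueError("need ratio > 0 and at least two samples")
    n_out = max(2, round(len(samples) * ratio))
    # Spacing of output points measured in input-sample units.
    step = (len(samples) - 1) / (n_out - 1)
    out = []
    for i in range(n_out):
        pos = i * step
        k = min(int(pos), len(samples) - 2)  # left neighbour index
        frac = pos - k                       # distance from left neighbour
        out.append((1.0 - frac) * samples[k] + frac * samples[k + 1])
    return out


if __name__ == "__main__":
    segment = [0.0, 1.0, 2.0, 3.0, 4.0]
    slower = resample_linear(segment, 2.0)   # more samples: longer segment
    faster = resample_linear(segment, 0.6)   # fewer samples: shorter segment
    print(len(segment), len(slower), len(faster))
```

Played back at the original sampling rate, the longer output lasts longer and sounds lower in pitch, and the shorter output does the opposite, which is why rate conversion affects both duration and pitch.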

A Study on Voice Color Control Rules for Speech Synthesis System (음성합성시스템을 위한 음색제어규칙 연구)

  • Kim, Jin-Young;Eom, Ki-Wan
    • Speech Sciences / v.2 / pp.25-44 / 1997
  • When listening to the various speech synthesis systems developed and used in our country, we find that although their quality has improved, they lack naturalness. Moreover, since the voice color of each system is limited to that of a single recorded speech DB, another speech DB must be recorded to create a different voice color. 'Voice color' is an abstract concept that characterizes voice personality, so speech synthesis systems need a voice color control function to create various voices. The aim of this study is to examine several factors of voice color control rules for a text-to-speech system that produces natural and varied voice types in synthetic speech. To derive such rules from natural speech, glottal source parameters and frequency characteristics of the vocal tract were studied for several voice colors. In this paper, voice colors were catalogued as deep, sonorous, thick, soft, harsh, high tone, shrill, and weak. The LF-model was used as the voice source model, and formant frequencies, bandwidths, and amplitudes were used as the frequency characteristics of the vocal tract. These acoustic parameters were tested through multiple regression analysis to obtain the general relation between the parameters and voice colors.

Voice quality transform using jitter synthesis (Jitter 합성에 의한 음질변환에 관한 연구)

  • Jo, Cheolwoo
    • Phonetics and Speech Sciences / v.10 no.4 / pp.121-125 / 2018
  • This paper describes procedures for changing and measuring voice quality in terms of jitter. A jitter synthesis method was applied to the TD-PSOLA analysis system of the Praat software. The jitter component is synthesized based on a Gaussian random noise model. The TD-PSOLA resynthesis process is used to synthesize the modified voice with artificial jitter. Various vocal jitter parameters are used to measure the change in quality caused by the artificial, systematic change in jitter. Synthetic vowels, natural vowels, and short sentences are used to check the change in voice quality through the synthesizer model. The results show that the suggested method is useful for voice quality control in a limited way and can be used to alter the jitter component of voice.
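
The Gaussian jitter model mentioned in the abstract above can be sketched in a few lines of Python. This is a minimal illustration under stated assumptions, not the paper's procedure: the names `add_jitter` and `local_jitter`, the `rel_std` parameter, and the choice of multiplicative noise on each pitch period are all assumptions. The measure computed is the common "local jitter" definition, the mean absolute difference between consecutive periods divided by the mean period.

```python
import random


def add_jitter(periods, rel_std, seed=0):
    """Perturb each pitch period with zero-mean Gaussian noise.

    rel_std is the noise standard deviation relative to the period
    (an assumed parameterisation for this sketch).
    """
    rng = random.Random(seed)
    return [max(1e-6, p * (1.0 + rng.gauss(0.0, rel_std))) for p in periods]


def local_jitter(periods):
    """Mean absolute consecutive-period difference over the mean period."""
    diffs = [abs(b - a) for a, b in zip(periods[:-1], periods[1:])]
    return (sum(diffs) / len(diffs)) / (sum(periods) / len(periods))


if __name__ == "__main__":
    steady = [0.008] * 200                 # 8 ms periods, i.e. 125 Hz
    mild = add_jitter(steady, rel_std=0.005)
    strong = add_jitter(steady, rel_std=0.02)
    for name, track in (("steady", steady), ("mild", mild), ("strong", strong)):
        print(name, f"{100 * local_jitter(track):.2f}%")
```

In a PSOLA-style resynthesis, the perturbed periods would set the spacing of the pitch marks at which windowed segments are overlapped and added; that step is omitted here.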

Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

  • Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
    • Speech Sciences / v.3 / pp.132-147 / 1998
  • Previously developed methods for pitch modification have not been based on a voice source model. As a result, the synthesized speech often sounds unnatural even though it may be highly intelligible. The purpose of this paper is to analyze how the voice source signal changes with the pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. In comparison with previous methods, which operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may benefit applications such as speaker identification and voice color conversion. The proposed method will therefore provide high-quality synthetic speech.

Acoustic Echo Canceller for Synthetic Stereo Using HRTF (머리 전달 함수를 이용한 합성 스테레오 음향 반향 제거기)

  • 박장식;백주순;손경식
    • Proceedings of the Korea Multimedia Society Conference / 2002.05c / pp.149-153 / 2002
  • In this brief, an acoustic echo cancellation scheme is proposed to enhance the sense of presence of multiple participants in hands-free voice and video conferencing. Synthetic stereo using head-related transfer functions (HRTF) and a stereo echo cancellation scheme are proposed. Computer simulation shows that the proposed synthetic stereo echo cancellation scheme performs well.

APPLICATION OF KOREAN TEXT-TO-SPEECH FOR X.400 MHS SYSTEM

  • Kim, Hee-Dong;Koo, Jun-Mo;Choi, Ho-Joon;Kim, Sang-Taek
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.885-892 / 1994
  • This paper presents a Korean text-to-speech (TTS) algorithm with speed and intonation control capability, and describes the development of a voice message delivery system employing this TTS algorithm. The system allows users of the Interpersonal Messaging (IPM) Service of the Message Handling System (MHS) to send their text messages to a user via a telephone line using synthetic voice. The X.400 MHS recommendation does not specify protocols and service elements for a voice message delivery system. We therefore defined an access protocol and service elements for the Voice Access Unit, based on the application program interface for message transfers between the X.400 Message Transfer Agent and the Voice Access Unit. The system architecture and operations are also described.

Voice transformation for HTS using correlation between fundamental frequency and vocal tract length (기본주파수와 성도길이의 상관관계를 이용한 HTS 음성합성기에서의 목소리 변환)

  • Yoo, Hyogeun;Kim, Younggwan;Suh, Youngjoo;Kim, Hoirin
    • Phonetics and Speech Sciences / v.9 no.1 / pp.41-47 / 2017
  • The main advantage of statistical parametric speech synthesis is its flexibility in changing voice characteristics. A personalized text-to-speech (TTS) system can be implemented by combining a speech synthesis system with a voice transformation system, and such systems are widely used in many application areas. It is known that the fundamental frequency and the spectral envelope of a speech signal can be modified independently to convert voice characteristics. It is also important to maintain the naturalness of the transformed speech. In this paper, a hidden Markov model (HMM)-based speech synthesis system (HTS) using the STRAIGHT vocoder is constructed, and voice transformation is conducted by modifying the fundamental frequency and the spectral envelope. The fundamental frequency is transformed by scaling, and the spectral envelope is transformed through frequency warping to control the speaker's vocal tract length. In particular, this study proposes a voice transformation method that uses the correlation between fundamental frequency and vocal tract length. Subjective evaluations were conducted to assess preference and mean opinion scores (MOS) for the naturalness of the synthetic speech. Experimental results showed that the proposed voice transformation method achieved higher preference than the baseline systems while maintaining the naturalness of the speech.
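
The two transformations named in the abstract above, fundamental frequency scaling and frequency warping of the spectral envelope, can be sketched as follows. This is only an illustration under assumptions: the abstract does not state which warping function the authors used, so the first-order all-pass (bilinear) warping below is a commonly used stand-in, and the names `scale_f0`, `warp_frequency`, `factor`, and `alpha` are hypothetical.

```python
import math


def scale_f0(f0_track, factor):
    """Scale voiced F0 values in Hz; frames with F0 <= 0 stay unvoiced."""
    return [f * factor if f > 0 else 0.0 for f in f0_track]


def warp_frequency(omega, alpha):
    """First-order all-pass warping of a normalised frequency in [0, pi].

    alpha in (-1, 1) controls the amount of warping; alpha = 0 is the
    identity. The endpoints 0 and pi map to themselves.
    """
    return omega + 2.0 * math.atan2(alpha * math.sin(omega),
                                    1.0 - alpha * math.cos(omega))


if __name__ == "__main__":
    f0 = [120.0, 0.0, 125.0, 130.0]        # 0.0 marks an unvoiced frame
    print(scale_f0(f0, 1.5))
    for alpha in (-0.1, 0.0, 0.1):
        print(alpha, round(warp_frequency(math.pi / 2, alpha), 4))
```

With this function, a positive `alpha` moves mid-band frequencies upward and a negative one moves them downward. Whether that corresponds to a shorter or a longer vocal tract depends on whether the warping is applied to the source or the target frequency axis, so the sign convention should be checked against the vocoder in use.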

Face-to-face Communication in Cyberspace using Analysis and Synthesis of Facial Expression

  • Shigeo Morishima
    • Proceedings of the Korean Society of Broadcast Engineers Conference / 1999.06a / pp.111-118 / 1999
  • Recently, computers have become able to create a cyberspace that users can walk through by means of interactive virtual reality techniques. An avatar in cyberspace can provide a virtual face-to-face communication environment. In this paper, an avatar with a real face is realized in cyberspace, and a multiuser communication system is constructed in which voice is transmitted over a network. Voice from a microphone is transmitted and analyzed, and the mouth shape and facial expression of the avatar are then synchronously estimated and synthesized in real time. An entertainment application of a real-time voice-driven synthetic face is also introduced as an example of an interactive movie. Finally, a face motion capture system using a physics-based face model is introduced.