• Title/Summary/Keyword: Voice Synthesis

Quantifying the Urgency Perception of Voice Alarm Generated by Concatenative Synthesizer (연결형 합성음성을 이용한 경보음의 주관적 위급도 정량화)

  • Jang, Pil-Sik;Lee, Gyeong-Tae
    • Journal of the Ergonomics Society of Korea / v.25 no.2 / pp.63-70 / 2006
  • This paper presents an experimental study of the factors that modulate the perceived urgency of voice alarms generated by concatenative synthesizers. Four experiments were conducted using a psychophysical approach in which 105 participants made magnitude estimations of the perceived urgency of various voice alarm stimuli. Experiment 1 identified six acoustic and non-acoustic factors that modulate the perceived urgency of synthesized voice alarms. Experiments 2, 3, and 4 quantified the relationship between objective changes in each quantifiable parameter and subjective changes in perceived urgency. This research has implications for the design and implementation of synthesized voice alarm systems in which urgency mapping is required.

Universal Personal Telecommunications using Specialized Resource Functions in the Intelligent Peripheral (Intelligent Peripheral의 특수 음성 자원을 이용한 Universal Personal Telecommunications 서비스)

  • Kim, Gi-Ryeong;Kim, Tae-Il;Choe, Go-Bong
    • The Transactions of the Korea Information Processing Society / v.3 no.6 / pp.1506-1514 / 1996
  • This paper proposes two enhanced features for Universal Personal Telecommunications (UPT), voice authentication and voice synthesis, using the specialized resource functions in the Intelligent Peripheral (IP). The proposed voice authentication provides a simple, user-friendly security mechanism and prevents unauthorized users from fraudulently using a UPT number. In addition, whereas the traditional UPT service delivers only fixed messages to the UPT user, the proposed service supports flexible message transfer through voice synthesis.

VOICE SOURCE ESTIMATION USING SEQUENTIAL SVD AND EXTRACTION OF COMPOSITE SOURCE PARAMETERS USING EM ALGORITHM

  • Hong, Sung-Hoon;Choi, Hong-Sub;Ann, Sou-Guil
    • Proceedings of the Acoustical Society of Korea Conference / 1994.06a / pp.893-898 / 1994
  • In this paper, the influence of voice source estimation and modeling on speech synthesis and coding is examined, and new estimation and modeling techniques are proposed and verified by computer simulation. Existing speech synthesizers are known to produce speech that is dull and lifeless. These problems arise because existing estimation and modeling techniques cannot provide sufficiently accurate voice parameters. Therefore, in this paper we propose a new voice source estimation algorithm and modeling techniques that can represent a variety of source characteristics. First, we divide the speech samples in one pitch region into four parts with different characteristics. Second, the vocal-tract parameters and voice source waveforms are estimated differently in each region using sequential SVD. Third, we propose a composite source model, a new voice source model represented by a weighted sum of pre-defined basis functions. Finally, the weights and time-shift parameters of the proposed composite source model are estimated using the EM (estimate-maximize) algorithm. Experimental results indicate that the proposed estimation and modeling methods estimate voice source waveforms more accurately and represent various source characteristics.
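
The idea of representing a source waveform as a weighted sum of pre-defined basis functions can be illustrated with a minimal sketch. Everything here is an assumption made for illustration: the Gaussian-shaped basis, the fixed centers, the synthetic target waveform, and the use of ordinary least squares in place of the EM procedure the abstract describes. It only shows how weights are recovered once the basis and time shifts are fixed.

```python
import numpy as np

def gaussian_basis(n_samples, centers, width):
    """Hypothetical basis: one Gaussian bump per center, stored as matrix columns."""
    t = np.arange(n_samples)[:, None]
    c = np.asarray(centers, dtype=float)[None, :]
    return np.exp(-0.5 * ((t - c) / width) ** 2)

def fit_weights(source, basis):
    """Least-squares weights so that basis @ weights approximates the source.

    This is plain least squares, not the EM estimation used in the paper.
    """
    weights, *_ = np.linalg.lstsq(basis, source, rcond=None)
    return weights

# Synthetic "voice source" over one pitch region (toy data, not measured speech).
n = 80
basis = gaussian_basis(n, centers=[20, 40, 60], width=6.0)
true_weights = np.array([0.5, -1.0, 0.3])
source = basis @ true_weights

estimated = fit_weights(source, basis)
reconstruction = basis @ estimated
print(estimated)
```

In this toy setting the time shifts (the centers) are known in advance; estimating them jointly with the weights is the harder problem that the abstract assigns to the EM algorithm.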

GMM based Nonlinear Transformation Methods for Voice Conversion

  • Vu, Hoang-Gia;Bae, Jae-Hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference / 2005.11a / pp.67-70 / 2005
  • Voice conversion (VC) is a technique for modifying the speech signal of a source speaker so that it sounds as if it were spoken by a target speaker. Most previous VC approaches used a linear transformation function based on a GMM to convert the source spectral envelope to the target spectral envelope. In this paper, we propose several nonlinear GMM-based transformation functions in an attempt to deal with the over-smoothing effect of the linear transformation. To obtain high-quality modifications of speech signals, our VC system is implemented using the Harmonic plus Noise Model (HNM) analysis/synthesis framework. Experimental results are reported on the English corpus MOCHA-TIMIT.
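
For context, the linear GMM-based transformation that the abstract treats as the baseline is commonly written as a posterior-weighted sum of per-component linear regressions. The sketch below assumes that common form; the abstract does not spell out the exact function, and the nonlinear variants the paper proposes are not shown. All parameter values are made-up toy numbers, and the names `convert`, `mu_x`, `mu_y`, `cov_xx`, and `cov_yx` are illustrative.

```python
import numpy as np

def gaussian_pdf(x, mean, cov):
    """Density of a multivariate Gaussian evaluated at vector x."""
    d = x.shape[0]
    diff = x - mean
    norm = np.sqrt(((2.0 * np.pi) ** d) * np.linalg.det(cov))
    return np.exp(-0.5 * diff @ np.linalg.solve(cov, diff)) / norm

def convert(x, weights, mu_x, mu_y, cov_xx, cov_yx):
    """Posterior-weighted linear mapping of a source feature vector x.

    For each mixture component i, the local regression is
    mu_y[i] + cov_yx[i] @ inv(cov_xx[i]) @ (x - mu_x[i]).
    """
    lik = np.array([w * gaussian_pdf(x, m, c)
                    for w, m, c in zip(weights, mu_x, cov_xx)])
    post = lik / lik.sum()  # p(component | x)
    y = np.zeros(mu_y[0].shape)
    for p, mx, my, cxx, cyx in zip(post, mu_x, mu_y, cov_xx, cov_yx):
        y += p * (my + cyx @ np.linalg.solve(cxx, x - mx))
    return y

# Toy two-component model over 2-D features (hypothetical values).
weights = np.array([0.5, 0.5])
mu_x = np.array([[0.0, 0.0], [10.0, 10.0]])
mu_y = np.array([[1.0, 1.0], [5.0, 5.0]])
cov_xx = np.array([np.eye(2), np.eye(2)])
cov_yx = np.array([0.5 * np.eye(2), 0.5 * np.eye(2)])

converted = convert(np.array([0.4, -0.2]), weights, mu_x, mu_y, cov_xx, cov_yx)
print(converted)
```

Because every output is an average of component-wise regressions, converted spectra tend to lose fine detail, which is the over-smoothing effect the abstract refers to.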

A Development of Automatic Safety Navigation Support Service Providing System for Medium and Small Ships based on Speech Synthesis (중소형 선박을 위한 음성합성 기반 자동 안전항해 지원 서비스 제공 시스템 개발)

  • Hwang, Hun-Gyu;Kim, Bae-Sung;Woo, Yum-Tae
    • Journal of the Korea Institute of Information and Communication Engineering / v.25 no.4 / pp.595-602 / 2021
  • Marine accidents are mostly caused by medium and small ships, and their number is continuously increasing. In this paper, we propose an architecture for a speech-synthesis-based automatic safety navigation support service system for small ships, which are less well equipped with onboard systems than larger vessels. The main purpose of the system is to prevent marine accidents by providing synthesized voice safety messages to nearby ships. The service operates by combining GPS and AIS data to synthesize voice safety messages, which are automatically broadcast over VHF. To this end, we developed the components of the system: a data processing module, a staged risk analysis module, a voice synthesis safety message generation module, and a VHF broadcasting equipment control module. In addition, we conducted laboratory-level and sea-trial demonstration tests using the developed system, which verified the usefulness of the proposed service.

A Study on Approximation-Synthesis of Transition Segment in Speech Signal (음성신호에서 천이구간의 근사합성에 관한 연구)

  • Lee See-Woo
    • The Journal of the Korea Contents Association / v.5 no.3 / pp.167-173 / 2005
  • In a speech coding system that uses voiced and unvoiced excitation sources, speech quality is distorted when voiced and unvoiced consonants coexist in a frame. I therefore propose a TSIUVC (Transition Segment Including Unvoiced Consonant) extraction method that uses pitch pulses and the zero-crossing rate so that voiced and unvoiced consonants do not coexist in a frame. This paper also presents a TSIUVC approximation-synthesis method based on frequency band division. As a result, the method obtains a high-quality approximation-synthesis waveform within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. The TSIUVC extraction rate was 91% for female voices and 96.2% for male voices. The method can be applied to a new Voiced/Silence/TSIUVC speech coding scheme, speech analysis, and speech synthesis.
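
The zero-crossing rate mentioned above is a standard frame-level feature: the fraction of adjacent sample pairs whose signs differ. The sketch below shows only that feature on synthetic frames. It is not the TSIUVC extraction method itself, and the sampling rate, frame length, and test signals are assumptions chosen for illustration.

```python
import numpy as np

def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs in the frame whose signs differ."""
    signs = np.signbit(np.asarray(frame, dtype=float))
    return np.count_nonzero(signs[1:] != signs[:-1]) / (len(frame) - 1)

# Synthetic 30 ms frames at an assumed 8 kHz sampling rate.
fs = 8000
t = np.arange(240) / fs
voiced_like = np.sin(2 * np.pi * 100 * t + 0.3)      # low-frequency periodic tone
rng = np.random.default_rng(0)
unvoiced_like = rng.standard_normal(240)             # noise-like frame

zcr_voiced = zero_crossing_rate(voiced_like)
zcr_unvoiced = zero_crossing_rate(unvoiced_like)
print(zcr_voiced, zcr_unvoiced)
```

Noise-like unvoiced segments typically give a much higher rate than periodic voiced segments, which is why the feature is useful for locating transitions that contain unvoiced consonants.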

A Study on Multi-Pulse Speech Coding Method by using Selected Information in a Frequency Domain (주파수 영역의 선택정보를 이용한 멀티펄스 음성부호화 방식에 관한 연구)

  • Lee See-Woo
    • Journal of Internet Computing and Services / v.7 no.4 / pp.57-66 / 2006
  • In this paper, I propose a new multi-pulse speech coding method (FBD-MPC: Frequency Band Division MPC) that uses TSIUVC (Transition Segment Including UnVoiced Consonant) searching, extraction, and approximation-synthesis in the frequency domain. The extraction rates of TSIUVC are 84.8% (plosive), 94.9% (fricative), and 92.3% (affricate) for female voices, and 88% (plosive), 94.9% (fricative), and 92.3% (affricate) for male voices. I also obtain high-quality approximation-synthesis waveforms within the TSIUVC by using frequency information below 0.547 kHz and above 2.813 kHz. I evaluate MPC using voiced/unvoiced switching information and FBD-MPC using Voiced/Silence/TSIUVC switching information. The results show that speech synthesized by FBD-MPC has better quality than speech synthesized by MPC.

Development of a Voice User Interface for Web Browser using VoiceXML (VoiceXML을 이용한 VUI 지원 웹브라우저 개발)

  • Yea SangHoo;Jang MinSeok
    • Journal of KIISE: Computing Practices and Letters / v.11 no.2 / pp.101-111 / 2005
  • Information on the web today is mainly described in HTML, which users access through input devices such as a mouse and keyboard. The existing GUI environment therefore does not support voice, the most natural human means of acquiring information. To address this problem, several vendors are developing voice user interfaces. However, these products are deficient in man-machine interactivity and in accommodating the existing web environment. This paper presents a web browser supporting a VUI (Voice User Interface) that uses increasingly mature speech recognition technology and VoiceXML, a markup language derived from XML. It provides users with both interfaces, VUI as well as GUI. In addition, XML Island technology is applied to the browser so that VoiceXML fragments are nested in HTML documents to accommodate the existing web environment. For better interactivity, dialogue scenarios for a menu, a bulletin, and a search engine are also suggested.

Design and Implementation of Voice-based Interactive Service KIOSK (음성기반 대화형 서비스 키오스크 설계 및 구현)

  • Kim, Sang-woo;Choi, Dae-june;Song, Yun-Mi;Moon, Il-Young
    • Journal of Practical Engineering Education / v.14 no.1 / pp.99-108 / 2022
  • As the demand for kiosks increases, more users complain of discomfort. Accordingly, we built a kiosk that enables easy menu selection and ordering through a voice-based interactive service, delivered as a web application. It implements voice functions using the Annyang API and the SpeechSynthesis API, and understands the user's intent through Dialogflow. We also discuss how to implement this process using a REST API. In addition, a recommendation system based on collaborative filtering is applied to improve the low consumer accessibility of existing kiosks, and, to prevent droplet-borne infection during use of the voice recognition service, the kiosk checks whether the user is wearing a mask before the service is used.

Voice Source Modeling Using Weighted Sum-of-Basis-Functions Model (기저함수의 가중합을 이용한 음원의 모델링)

  • 강상기
    • Proceedings of the Acoustical Society of Korea Conference / 1998.06c / pp.171-174 / 1998
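  • This paper examines the problem of voice source modeling in speech synthesis and coding systems. To overcome the various problems of existing voice source modeling systems, we propose a new technique that models the voice source as a weighted sum of basis functions. In the proposed method, the voice source waveform is represented as a weighted sum of basis functions built on a filter bank. To obtain voice source parameters that effectively represent diverse source characteristics, we investigate a structure based on the EM (estimate-maximize) algorithm. Experiments were carried out on a variety of voiced sounds using the proposed method. The results show that the proposed estimation and modeling methods estimate the voice source waveform more accurately than existing methods and can represent diverse source characteristics. The method is also expected to improve voice quality in speech synthesis and coding.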