• Title/Summary/Keyword: Speech speed

Search Result 238, Processing Time 0.022 seconds

Creation and Assessment of Korean Speech and Noise DB in Car Environments (자동차 환경에서의 노이즈 DB 및 한국어 음성 DB 구축)

  • Lee Kwang-Hyun;Kim Bong-Wan;Lee Yong-Ju
    • MALSORI
    • /
    • no.48
    • /
    • pp.141-153
    • /
    • 2003
  • Researches into robust recognition in noise environments, especially in car environments, are being carried out actively in speech community. In this paper we will report on three types of corpora that SiTEC (Speech Information TEchnology & industry promotion Center) has created for research into speech recognition in car noise environments. The first is the recordings of 900 Korean native speakers, distributed according to gender, age, and region, who uttered application words in car environments. The second is the collections of mixed noise in 3 car types by model while setting up various noise patterns which can be obtained with the car engine on or off, at different driving speed, and in different road conditions with windows open or closed. The third is the recordings of simulated speech by HATS (Head and Torso Simulator) in car environments with the internal and external noise factors added. These three types of recordings were all made through synchronized 8 channel microphones that are fixed in a car. The creation and applications of these corpora will be reported on in detail.

  • PDF

A Study on Real-time Implementing of Time-Scale Modification (음성 신호 시간축 변환의 실시간 구현에 관한 연구)

  • Han, Dong-Chul;Lee, Ki-Seung;Cha, Il-Hawan;Youn, Dae-Hee
    • The Journal of the Acoustical Society of Korea
    • /
    • v.14 no.2
    • /
    • pp.50-61
    • /
    • 1995
  • A time scale modification method yielding rate-modified speech while conserving the characteristic of speech was implemented in real-time using a goneral purpose digital signal processor. Time scale modification changed pronunciation speed only, producing a time difference between the input signal and the modified signal, making it impossible to implement it in real-time. In this thesis, a system was implemented to remove the time difference between the input and modified signals. Speech signals slowed down or speeded up by a physical time scale modification method, such as adjusting the motor speed of the cassett tape recorder, was used as the input signal. Physical modification that controled only the inter speed of the cassette tape player distorted the pitch period of the original speech. In this study, a real-time system was implemented so that the pitch-distorted speech was reconstructed back to the original by fractional sampling pitch shifting using an FIR filter, and this signal was time scale modified to match the cassette tape recorder motor speed using SOLA time-scale medification. In experiments using speech signals medifiedby the proposed method, results obtained using a 16-bit resolution ADSP2101 processor and using computer simulations employing floating point operations showed about the same average frame signal-to-noise ratio of about 20 dB.

  • PDF

Improvement of convergence speed in FDICA algorithm with weighted inner product constraint of unmixing matrix (분리행렬의 가중 내적 제한조건을 이용한 FDICA 알고리즘의 수렴속도 향상)

  • Quan, Xingri;Bae, Keunsung
    • Phonetics and Speech Sciences
    • /
    • v.7 no.4
    • /
    • pp.17-25
    • /
    • 2015
  • For blind source separation of convolutive mixtures, FDICA(Frequency Domain Independent Component Analysis) algorithms are generally used. Since FDICA algorithm such as Sawada FDICA, IVA(Independent Vector Analysis) works on the frequency bin basis with a natural gradient descent method, it takes much time to converge. In this paper, we propose a new method to improve convergence speed in FDICA algorithm. The proposed method reduces the number of iteration drastically in the process of natural gradient descent method by applying a weighted inner product constraint of unmixing matrix. Experimental results have shown that the proposed method achieved large improvement of convergence speed without degrading the separation performance of the baseline algorithms.

A study on the interactive speech recognition mobile robot (대화형 음성인식 이동로봇에 관한 연구)

  • 이재영;윤석현;홍광석
    • Journal of the Korean Institute of Telematics and Electronics B
    • /
    • v.33B no.11
    • /
    • pp.97-105
    • /
    • 1996
  • This paper is a study on the implementation of speech recognition mobile robot to which the interactive speech recognition techniques is applied. The speech command uttered the sentential connected word and is asserted through the wireless mic system. This speech signal transferred LPC-cepstrum and shorttime energy which are computed from the received signal on the DSP board to notebook PC. In notebook PC, DP matching technique is used for recognizer and the recognition results are transferred to the motor control unit which output pulse signals corresponding to the recognized command and drive the stepping motor. Grammar network applied to reduce the recognition speed of the recogniger, so that real time recognition is realized. The misrecognized command is revised by interface revision through the conversation with mobile robot. Therefore, user can move the mobile robot to the direction which user wants.

  • PDF

A Study on a Searching, Extraction and Approximation-Synthesis of Transition Segment in Continuous Speech (연속음성에서 천이구간의 탐색, 추출, 근사합성에 관한 연구)

  • Lee, Si-U
    • The Transactions of the Korea Information Processing Society
    • /
    • v.7 no.4
    • /
    • pp.1299-1304
    • /
    • 2000
  • In a speed coding system using excitation source of voiced and unvoiced, it would be involved a distortion of speech quality in case coexist with a voiced and an unvoiced consonants in a frame. So, I propose TSIUVC(Transition Segment Including UnVoiced Consonant) searching, extraction ad approximation-synthesis method in order to uncoexistent with a voiced and unvoiced consonants in a frame. This method based on a zerocrossing rate and pitch detector using FIR-STREAK Digital Filter. As a result, the extraction rates of TSIUVC are 84.8% (plosive), 94.9%(fricative), 92.3%(affricative) in female voice, and 88%(plosive), 94.9%(fricative), 92.3%(affricative) in male voice respectively, Also, I obain a high quality approximation-synthesis waveforms within TSIUVC by using frequency information of 0.547kHz below and 2.813kHz above. This method has the capability of being applied to speech coding of low bit rate, speech analysis and speech synthesis.

  • PDF

Performance Improvement of the Network Echo Canceller (네트웍 반향제거기의 성능 향상)

  • Yoo, Jae-Ha
    • Speech Sciences
    • /
    • v.11 no.4
    • /
    • pp.89-97
    • /
    • 2004
  • In this paper, an improved network echo canceller is proposed. The proposed echo canceller is based on the LTJ(lattice transversal joint) adaptive filter which uses informations from the speech decoder. In the proposed implementation method of the network echo canceller, the filer coefficients of the transversal filter part in the LTJ adaptive filter is updated every other sample instead of every sample. So its complexity can be lower than that of the transversal filter. And the echo cancellation rate can be improved by residual echo cancellation using the lattice predictor whose order is less than 10. Computational complexity of the proposed echo canceller is lower than that of the transversal filter but the convergence speed is faster than that of the transversal filter. The performance improvement of the proposed echo canceller was verified by the experiments using the real speech signal and speech coder.

  • PDF

Acoustic, Intraoral Air Pressure and EMG Studies of Vowel Devoicing in Korean

  • Kim, Hyun-Gi;Niimi, Sei-Ji
    • Speech Sciences
    • /
    • v.10 no.1
    • /
    • pp.3-13
    • /
    • 2003
  • The devoicing vowel is a phonological process whose contrast in sonority is lost or reduces in a particular phonetic environment. Phonetically, the vocal fold vibration originates from the abduction/adduction of the glottis in relation to supraglottal articulatory movements. The purpose of this study is to investigate Korean vowel devoicing by means of experimental instruments. The interrelated laryngeal adjustments and aerodynamic effects for this voicing can clarify the redundant articulatory gestures relevant to the distinctive feature of sonority. Five test words were selected, being composed of the high vowel /i/, between the fricative and strong aspirated or lenis affricated consonants. The subjects uttered the test words successively at a normal or at a faster speed. The EMG, the sensing tube Gaeltec S7b and the High-Speech Analysis system and MSL II were used in these studies. Acoustically, three different types of speech waveforms and spectrograms were classified, based on the voicing variation. The intraoral air pressure curves showed differences, depending on the voicing variations. The activity patterns of the PCA and the CT for devoicing vowels appeared differently from those showing the partially devoicing vowels and the voicing vowels.

  • PDF

Computerization and Application of the Korean Standard Pronunciation Rules (한국어 표준발음법의 전산화 및 응용)

  • 이계영;임재걸
    • Language and Information
    • /
    • v.7 no.2
    • /
    • pp.81-101
    • /
    • 2003
  • This paper introduces a computerized version of the Korean Standard Pronunciation Rules that can be used in speech engineering systems such as Korean speech synthesis and recognition systems. For this purpose, we build Petri net models for each item of the Standard Pronunciation Rules, and then integrate them into the sound conversion table. The reversion of the Korean Standard Pronunciation Rules regulates the way of matching sounds into grammatically correct written characters. This paper presents not only the sound conversion table but also the character conversion table obtained by reversely converting the sound conversion table. Malting use of these tables, we have implemented a Korean character into a sound system and a Korean sound into the character conversion system, and tested them with various data sets reflecting all the items of the Standard Pronunciation Rules to verify the soundness and completeness of our tables. The test results show that the tables improve the process speed in addition to the soundness and completeness.

  • PDF