• Title/Summary/Keyword: voice signal

Search Result 438, Processing Time 0.023 seconds

A Study on Realization of Speech Recognition System based on VoiceXML for Railroad Reservation Service (철도예약서비스를 위한 VoiceXML 기반의 음성인식 구현에 관한 연구)

  • Kim, Beom-Seung;Kim, Soon-Hyob
    • Journal of the Korean Society for Railway
    • /
    • v.14 no.2
    • /
    • pp.130-136
    • /
    • 2011
  • This paper suggests realization method for real-time speech recognition using VoiceXML in telephony environment based on SIP for Railroad Reservation Service. In this method, voice signal incoming through PSTN or Internet is treated as dialog using VoiceXML and the transferred voice signal is processed by Speech Recognition System, and the output is returned to dialog of VoiceXML which is transferred to users. VASR system is constituted of dialog server which processes dialog, APP server for processing voice signal, and Speech Recognition System to process speech recognition. This realizes transfer method to Speech Recognition System in which voice signal is recorded using Record Tag function of VoiceXML to process voice signal in telephony environment and it is played in real time.

A Study on the design of voice cryptograph system (음성암호시스템 설계에 관한 연구)

  • Choi, Tae-Sup;Ahn, In-Soo
    • Journal of the Institute of Electronics Engineers of Korea TE
    • /
    • v.39 no.2
    • /
    • pp.51-59
    • /
    • 2002
  • In this paper, we studied the voice cryptograph system designed by the SEED algorithm for the safe transmission and receipt on the voice communication. Voice band signal converts to digital signal by the CODEC and DSP that applied the improved SEED algorithm encrypt the digital signal. The CODEC convert Encryption signal into analog voice signal. This voice signal is transmitted safely because of encryption signal even if someone wiretap. Receiver can hear the source voice, because the encryption signal decrypted using the SEED algorithm. In this paper, We designed the 32 round key instead of 16 round key in the SEED algorithm so that we improve the truncated differential probability from $2^{-143.1}$ to $2^{-286.6}$

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing (음성 신호 분류에 따른 장애 음성의 변동률 분석, 비선형 동적 분석, 캡스트럼 분석의 유용성)

  • Choi, Seong Hee;Choi, Chul-Hee
    • Phonetics and Speech Sciences
    • /
    • v.6 no.3
    • /
    • pp.63-72
    • /
    • 2014
  • The current study assessed the utility of acoustic analyses the most commonly used in routine clinical voice assessment including perturbation, nonlinear dynamic analysis, and Spectral/Cepstrum analysis based on signal typing of dysphonic voices and investigated their applicability of clinical acoustic analysis methods. A total of 70 dysphonic voice samples were classified with signal typing using narrowband spectrogram. Traditional parameters of %jitter, %shimmer, and signal-to-noise ratio were calculated for the signals using TF32 and correlation dimension(D2) of nonlinear dynamic parameter and spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd were also calculated with ADSV(Analysis of Dysphonia in Speech and VoiceTM). Auditory perceptual analysis was performed by two blinded speech-language pathologists with GRBAS. The results showed that nearly periodic Type 1 signals were all functional dysphonia and Type 4 signals were comprised of neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant signal typing-related differences were found in all acoustic and auditory-perceptual measures. SNR, CPP, L/H ratio values for Type 4 were significantly lower than those of other voice signals and significant higher %jitter, %shimmer were observed in Type 4 voice signals(p<.001). Additionally, with increase of signal type, D2 values significantly increased and more complex and nonlinear patterns were represented. Nevertheless, voice signals with highly noise component associated with breathiness were not able to obtain D2. In particular, CPP, was highly sensitive with voice quality 'G', 'R', 'B' than any other acoustic measures. Thus, Spectral and cepstral analyses may be applied for more severe dysphonic voices such as Type 4 signals and CPP can be more accurate and predictive acoustic marker in measuring voice quality and severity in dysphonia.

An Analysis of Correlation between Voice vowels and Human body (음성모음과 신체의 상관관계 분석)

  • Choi, In-Ho;Jeon, Jong-Weon
    • Journal of Advanced Navigation Technology
    • /
    • v.14 no.3
    • /
    • pp.375-383
    • /
    • 2010
  • In this paper, the correlation between voice vowels and human body is analysed for the voice therapy and diagnosis. Using vowels('a', 'e', 'i', 'o', 'u'), the vibration signals in head, chest and belly is measured with the voice signal. As the result, it is shown that body characteristics can be checked from some vowels, and the correlation coefficient of body vibration signal and BMI(body mass index) is computed. From the result, using voice signal and body vibrations, the body diagnosis model is proposed.

Implementation of the automatic switching device for the voice communications between heterogeneous devices (이종 기기 간 음성통신을 위한 자동전환장치의 구현)

  • Lew, Chang-Guk;Lee, Bae-Ho
    • The Journal of the Korea institute of electronic communication sciences
    • /
    • v.10 no.12
    • /
    • pp.1321-1328
    • /
    • 2015
  • A radio is a half-duplex voice communication method using the PTT(: Push To Talk), occupy a single line calls during transmission. As an interface between the telephone and the radio, UHF and VHF, for voice communication between the different heterogeneous devices, A device automatically switches between the two devices is required. Therefore, in accordance with the performance of the voice switching apparatus for detecting a voice to be transmitted from an input signal, loss of the audio signal to be transmitted is subjected to Significant influence. Conventional method has the problem responding to noise by setting the level through simple means of amplitude of input signal, in other words, the energy level of the input signal. This paper, by using the audio signal processing techniques, this discriminated what the voice is among the input signal and substantiated a device for the automatic voice transmission between heterogeneous devices. With this proposal, I was confirmed of improvement of performance in the automatic voice switching device, could perform loss-less transmission of voice between heterogeneous devices.

Voice Source Modeling Using Harmonic Compensated LF Model (LF 모델에 고조파 성분을 보상한 음원 모델링)

  • 이건웅;김태우홍재근
    • Proceedings of the IEEK Conference
    • /
    • 1998.10a
    • /
    • pp.1247-1250
    • /
    • 1998
  • In speech synthesis, LF model is widely used for excitation signal for voice source coding system. But LF model does not represent the harmonic frequencies of excitation signal. We propose an effective method which use sinusoidal functions for representing the harmonics of voice source signal. The proposed method could achieve more exact voice source waveform and better synthesized speech quality than LF model.

  • PDF

An analysis of a statistical difference of acoustic Parameters' distribution between normal voice and pathological voice (병적 음성과 정상 음성의 음향학적 파라미터 분포에 대한 통계적 분석)

  • 김용주;권순복;김기련;신민철;조철우;왕수건
    • Proceedings of the IEEK Conference
    • /
    • 2001.06d
    • /
    • pp.249-252
    • /
    • 2001
  • The most basic means of communication among humans is a voice. Without speaking of voice technologies, we found it is important and convenient to use a voice in everyday life. But. in consideration to speech recognition systems, we can't always desire a normal voice input as input signal to the system. Generally speaking. a pathological voice as against a normal which is a voice with a problem in the larynx. could be also special case of input voice. Of course, but the distortion of a speech signal by environmental effects i.e., noise or transmission channel was a raised problem. we will take up a pathological voices with laryngeal disease which is essential distortion factor in voice. Also, we are to find out the difference of acoustic parameters distribution between normal and pathological voice by a statistical method in our research.

  • PDF

Pitch Modification based on a Voice Source Model (음원 모델에 기초한 합성음의 피치 조절)

  • Choi, Yong-Jin;Yeo, Su-Jin;Kim, Jin-Young;Sung, Koeng-Mo
    • Speech Sciences
    • /
    • v.3
    • /
    • pp.132-147
    • /
    • 1998
  • Previously developed methods for pitch modification have not been based on the voice source model. Therefore, the synthesized speech often sounds unnatural although it may be highly intelligible. The purpose of this paper is to analyze the alteration of a voice source signal with pitch period and to establish the pitch-modification rule based on the result of this analysis. We examine the alteration of the interval of closing phase, closed phase and open phase using the excitation waveform as the pitch increases. In comparison to the previous methods which performed directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study might benefit the application to the speaker identification and the voice color conversion. Therefore the proposed method will provide high quality synthetic speech.

  • PDF

Characteristics of Cow´s Voices in Time and Frequency domains for Recognition

  • Ikeda, Yoshio;Ishii, Y.
    • Agricultural and Biosystems Engineering
    • /
    • v.2 no.1
    • /
    • pp.15-23
    • /
    • 2001
  • On the assumption that the voices of the cows are produced by the linear prediction filter, we characterized the cows’voices. The order of this filter was determined by examining the voice characteristics both in time and frequency domains. The proposed order of the linear prediction filter is 15 for modeling voice production of the cow. The characteristics of the amplitude envelope of the voice signal was investigated by analyzing the sequence of the short time variance both in time and frequency domains, and the new parameters were defined. One of the coefficients o the linear prediction filter generating the voice signal, the fundamental frequency, the slope of the straight line regressed from the log-log spectra of the short time variance and the coefficients of the linear prediction filter generating the sequence of the short time variance of the voice signal can differentiate the two cows.

  • PDF

A Noise Reduction Method with Linear Prediction Using Periodicity of Voiced Speech

  • Sasaoka, Naoto;Kawamura, Arata;Fujii, Kensaku;Itoh, Yoshio;Fukui, Yutaka
    • Proceedings of the IEEK Conference
    • /
    • 2002.07a
    • /
    • pp.102-105
    • /
    • 2002
  • A noise reduction technique to reduce background noise in corrupted voice is proposed. The proposed method is based on linear prediction and takes advantages of periodicity of voiced speech. A voiced sound is regarded as a periodic stationary signal in short time interval. Therefore, the current voice signal is correlated with the voice signal delayed by a pitch period. A linear predictor can estimate only the current signal correlated with the delayed signal. Therefore, the enhanced voice can be obtained as output of the linear predictor. Simulation results show that the proposed method is able to reduce the background noise.

  • PDF