• Title/Abstract/Keywords: Voice signal

431 search results, processing time 0.023 s

A Study on Realization of Speech Recognition System based on VoiceXML for Railroad Reservation Service

  • 김범승;김순협
    • 한국철도학회논문집 / Vol. 14, No. 2 / pp.130-136 / 2011
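  • This paper proposes a method for implementing real-time speech recognition with VoiceXML in a SIP-based telephony environment for a railroad reservation service. In the proposed method, a voice signal arriving over the PSTN or the Internet is handled by VoiceXML dialog processing, the transmitted voice signal is processed by the speech recognition system, and the resulting output is returned to the VoiceXML dialog and delivered to the user. The VASR system consists of a Dialog server that processes dialogs, an APP server that processes voice signals, and a speech recognition system that performs the recognition. To process voice signals in the telephony environment, the paper uses the VoiceXML Record tag to record the voice signal and plays it back in real time to transmit it to the speech recognition system.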

A Study on the design of voice cryptograph system

  • 최태섭;안인수
    • 대한전자공학회논문지TE / Vol. 39, No. 2 / pp.51-59 / 2002
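  • This paper designs a voice encryption system based on the SEED algorithm for secure transmission and reception in voice calls. The voice-band signal is converted to a digital signal by a CODEC, and a DSP running a modified SEED algorithm encrypts it. The CODEC then converts the encrypted signal into an analog voice signal. Because this signal is encrypted, it can be transmitted securely even if it is intercepted or wiretapped along the way. The receiver decrypts the received voice signal with the SEED decryption algorithm and hears the sender's original voice. The paper extends the 16-round SEED algorithm to 32 rounds, improving the truncated differential probability from $2^{-143.1}$ to $2^{-286.6}$ or better.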

The Utility of Perturbation, Non-linear dynamic, and Cepstrum measures of dysphonia according to Signal Typing

  • 최성희;최철희
    • 말소리와 음성과학 / Vol. 6, No. 3 / pp.63-72 / 2014
  • The current study assessed the utility of the acoustic analyses most commonly used in routine clinical voice assessment, including perturbation, nonlinear dynamic, and spectral/cepstral analysis, based on signal typing of dysphonic voices, and investigated the clinical applicability of these methods. A total of 70 dysphonic voice samples were classified by signal type using narrowband spectrograms. The traditional parameters %jitter, %shimmer, and signal-to-noise ratio (SNR) were calculated for the signals using TF32. Correlation dimension (D2), a nonlinear dynamic parameter, and spectral/cepstral measures including mean CPP, CPP_sd, CPPf0, CPPf0_sd, L/H ratio, and L/H ratio_sd were also calculated with ADSV (Analysis of Dysphonia in Speech and Voice™). Auditory-perceptual analysis was performed by two blinded speech-language pathologists using GRBAS. The results showed that the nearly periodic Type 1 signals were all from functional dysphonia, while Type 4 signals comprised neurogenic and organic voice disorders. Only Type 1 voice signals were reliable for perturbation analysis in this study. Significant signal-type-related differences were found in all acoustic and auditory-perceptual measures. SNR, CPP, and L/H ratio values for Type 4 were significantly lower than those of the other voice signals, and significantly higher %jitter and %shimmer were observed in Type 4 voice signals (p<.001). Additionally, as signal type increased, D2 values increased significantly, reflecting more complex and nonlinear patterns. However, D2 could not be obtained for voice signals with a strong noise component associated with breathiness. In particular, CPP was more sensitive to the voice qualities 'G', 'R', and 'B' than any other acoustic measure. Thus, spectral and cepstral analyses may be applied to more severely dysphonic voices such as Type 4 signals, and CPP can be a more accurate and predictive acoustic marker for measuring voice quality and severity in dysphonia.
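
For readers unfamiliar with the perturbation measures named in this abstract, the following is a minimal sketch of how %jitter and %shimmer are commonly defined: the mean absolute cycle-to-cycle difference divided by the mean, in percent. It assumes that pitch periods and per-cycle peak amplitudes have already been extracted. The function names are illustrative, and TF32 may implement a different variant of these measures.

```python
from statistics import mean


def percent_perturbation(values):
    """Mean absolute difference between consecutive cycles,
    relative to the mean value, expressed in percent."""
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return 100.0 * mean(diffs) / mean(values)


def percent_jitter(periods):
    """Cycle-to-cycle variation of the pitch period (any time unit)."""
    return percent_perturbation(periods)


def percent_shimmer(amplitudes):
    """Cycle-to-cycle variation of the peak amplitude (any linear unit)."""
    return percent_perturbation(amplitudes)
```

Measures of this kind presuppose that cycle boundaries can be located, which is consistent with the abstract's finding that only the nearly periodic Type 1 signals were reliable for perturbation analysis.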

An Analysis of Correlation between Voice vowels and Human body

  • 최인호;전종원
    • 한국항행학회논문지 / Vol. 14, No. 3 / pp.375-383 / 2010
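  • This paper analyzes the correlation between voice and the human body as a study toward voice-based diagnosis and voice therapy. Along with the voice signal, voice-induced vibration waveforms were measured at the head, chest, and abdomen, using the vowels '아' (a), '에' (e), '이' (i), '오' (o), and '우' (u). The results identified, for each vowel, the components that best represent the characteristics of the body, and correlation coefficients with body mass index (BMI) were measured to suggest how voice could be used to diagnose physical condition.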
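
The abstract does not say which correlation coefficient was used. Assuming the usual Pearson coefficient, the relation between a per-subject vibration feature and BMI could be computed as in the sketch below; `pearson_r` and its argument names are hypothetical and not taken from the paper.

```python
import math


def pearson_r(feature, bmi):
    """Pearson correlation between two equal-length sequences,
    e.g. one vibration feature per subject and that subject's BMI."""
    n = len(feature)
    if n != len(bmi) or n < 2:
        raise ValueError("need two sequences of equal length >= 2")
    mean_f = sum(feature) / n
    mean_b = sum(bmi) / n
    cov = sum((f - mean_f) * (b - mean_b) for f, b in zip(feature, bmi))
    var_f = sum((f - mean_f) ** 2 for f in feature)
    var_b = sum((b - mean_b) ** 2 for b in bmi)
    if var_f == 0 or var_b == 0:
        raise ValueError("correlation is undefined for a constant sequence")
    return cov / math.sqrt(var_f * var_b)
```

A value near +1 or -1 would indicate that the feature tracks BMI closely, while a value near 0 would indicate little linear relation.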

Implementation of the automatic switching device for the voice communications between heterogeneous devices

  • 류창국;이배호
    • 한국전자통신학회논문지 / Vol. 10, No. 12 / pp.1321-1328 / 2015
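  • Voice communication over two-way radios uses PTT (Push To Talk) in half-duplex mode, so a transmission occupies the single speech channel. Voice communication between heterogeneous devices, such as an interface between a telephone and a radio or between UHF and VHF, requires an automatic switching device between the two devices, and whether transmitted voice is lost depends heavily on the performance of the voice switching unit that detects the voice to be transmitted from the input signal. Existing approaches set their criterion simply on the magnitude of the input signal, i.e., its energy level, and therefore also respond to noise. This paper implements a device that automatically relays voice between heterogeneous devices by using speech signal processing techniques to determine whether the input signal is speech. The results confirmed an improvement in the performance of the automatic voice switching device and showed that voice can be transmitted between heterogeneous devices without loss.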
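
The abstract does not name the speech signal processing technique it uses. As an illustration of why an energy-only criterion also reacts to noise, the sketch below compares it with a detector that additionally checks the zero-crossing rate, which is one common, simple cue for voiced speech. The zero-crossing check, the function names, and the threshold values are all assumptions made for this example, not the paper's method.

```python
def frame_energy(frame):
    """Mean squared amplitude of one frame of samples."""
    return sum(s * s for s in frame) / len(frame)


def zero_crossing_rate(frame):
    """Fraction of adjacent sample pairs whose signs differ."""
    crossings = sum((a >= 0) != (b >= 0) for a, b in zip(frame, frame[1:]))
    return crossings / (len(frame) - 1)


def energy_only_detector(frame, energy_threshold=0.01):
    """Conventional criterion: any sufficiently loud frame triggers,
    so loud noise is passed on as if it were speech."""
    return frame_energy(frame) > energy_threshold


def speech_like_detector(frame, energy_threshold=0.01, zcr_threshold=0.25):
    """Loud AND low zero-crossing rate, as is typical of voiced speech.
    Broadband noise changes sign far more often and is rejected."""
    return (frame_energy(frame) > energy_threshold
            and zero_crossing_rate(frame) < zcr_threshold)
```

A detector this simple would also reject unvoiced consonants, which have a high zero-crossing rate, so a practical switch would need additional features or hangover logic.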

Voice Source Modeling Using Harmonic Compensated LF Model

  • 이건웅;김태우;홍재근
    • 대한전자공학회: Conference Proceedings / Proceedings of the 대한전자공학회 1998 Fall Conference / pp.1247-1250 / 1998
  • In speech synthesis, the LF model is widely used as the excitation signal in voice source coding systems. However, the LF model does not represent the harmonic frequencies of the excitation signal. We propose an effective method that uses sinusoidal functions to represent the harmonics of the voice source signal. The proposed method achieves a more exact voice source waveform and better synthesized speech quality than the LF model.

An analysis of a statistical difference of acoustic Parameters' distribution between normal voice and pathological voice

  • 김용주;권순복;김기련;신민철;조철우;왕수건
    • 대한전자공학회: Conference Proceedings / Proceedings of the 대한전자공학회 2001 Summer Conference (4) / pp.249-252 / 2001
  • The most basic means of communication among humans is the voice. Even setting voice technologies aside, using the voice is important and convenient in everyday life. In speech recognition systems, however, we cannot always expect a normal voice as the input signal. A pathological voice, that is, a voice with a problem in the larynx as opposed to a normal voice, can also be a special case of input voice. Distortion of the speech signal by environmental effects such as noise or the transmission channel has already been raised as a problem; here we take up pathological voices with laryngeal disease, which is an intrinsic distortion factor in the voice. In this research, we aim to find the difference in the distribution of acoustic parameters between normal and pathological voices by a statistical method.

Pitch Modification based on a Voice Source Model

  • 최용진;여수진;김진영;성굉모
    • 음성과학 / Vol. 3 / pp.132-147 / 1998
  • Previously developed methods for pitch modification have not been based on a voice source model. As a result, the synthesized speech often sounds unnatural even though it may be highly intelligible. The purpose of this paper is to analyze how a voice source signal changes with pitch period and to establish a pitch-modification rule based on the result of this analysis. Using the excitation waveform, we examine how the intervals of the closing phase, closed phase, and open phase change as the pitch increases. Compared with previous methods that operate directly on the speech signal, the pitch modification method based on a voice source model shows high intelligibility and naturalness. This study may benefit applications such as speaker identification and voice color conversion, and the proposed method is expected to provide high-quality synthetic speech.

Characteristics of Cow's Voices in Time and Frequency domains for Recognition

  • Ikeda, Yoshio;Ishii, Y.
    • Agricultural and Biosystems Engineering / Vol. 2, No. 1 / pp.15-23 / 2001
  • On the assumption that cows' voices are produced by a linear prediction filter, we characterized the cows' voices. The order of this filter was determined by examining the voice characteristics in both the time and frequency domains. The proposed order of the linear prediction filter for modeling the voice production of the cow is 15. The characteristics of the amplitude envelope of the voice signal were investigated by analyzing the sequence of the short-time variance in both the time and frequency domains, and new parameters were defined. Two cows can be differentiated by the following parameters: one of the coefficients of the linear prediction filter generating the voice signal, the fundamental frequency, the slope of the straight line regressed from the log-log spectra of the short-time variance, and the coefficients of the linear prediction filter generating the sequence of the short-time variance of the voice signal.
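
The abstract does not say how the linear prediction coefficients were estimated. One standard way is the autocorrelation method solved with the Levinson-Durbin recursion, sketched below under that assumption. The function names are illustrative; an order of 15 would correspond to the filter order the abstract proposes.

```python
def autocorrelation(signal, max_lag):
    """Autocorrelation r[0..max_lag] of a finite signal."""
    n = len(signal)
    return [sum(signal[i] * signal[i + lag] for i in range(n - lag))
            for lag in range(max_lag + 1)]


def lpc(signal, order):
    """Return (coeffs, error) such that the prediction of x[n] is
    sum(coeffs[k] * x[n - 1 - k] for k in range(order)).
    Solved with the Levinson-Durbin recursion."""
    r = autocorrelation(signal, order)
    if r[0] == 0:
        raise ValueError("signal has zero energy")
    a = [0.0] * (order + 1)   # a[1..order] hold the predictor coefficients
    error = r[0]              # prediction error energy, updated per order
    for i in range(1, order + 1):
        acc = r[i] - sum(a[j] * r[i - j] for j in range(1, i))
        k = acc / error       # reflection coefficient for this order
        updated = a[:]
        updated[i] = k
        for j in range(1, i):
            updated[j] = a[j] - k * a[i - j]
        a = updated
        error *= (1.0 - k * k)
    return a[1:], error
```

Calling `lpc(frame, 15)` on a windowed frame of the recorded voice would give one set of coefficients per frame; the same routine could in principle be applied to the short-time variance sequence that the abstract also models with a linear prediction filter.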

A Noise Reduction Method with Linear Prediction Using Periodicity of Voiced Speech

  • Sasaoka, Naoto;Kawamura, Arata;Fujii, Kensaku;Itoh, Yoshio;Fukui, Yutaka
    • 대한전자공학회: Conference Proceedings / 대한전자공학회 2002 ITC-CSCC -1 / pp.102-105 / 2002
  • A noise reduction technique for reducing background noise in corrupted voice is proposed. The proposed method is based on linear prediction and takes advantage of the periodicity of voiced speech. A voiced sound is regarded as a periodic stationary signal over a short time interval, so the current voice signal is correlated with the voice signal delayed by a pitch period. A linear predictor can estimate only the component of the current signal that is correlated with the delayed signal, so the enhanced voice can be obtained as the output of the linear predictor. Simulation results show that the proposed method is able to reduce the background noise.
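
The idea in this abstract can be illustrated with a predictor whose inputs are samples delayed by one pitch period: the periodic voiced component is correlated across that delay, while white background noise is not, so the predictor output retains mostly the voiced component. The sketch below adapts the predictor with normalized LMS, which is an assumption made for this example, since the abstract does not state how the predictor is obtained. The function name, tap count, and step size are likewise illustrative, and the pitch period is assumed to be known.

```python
def pitch_delay_predictor(noisy, pitch_period, taps=8, mu=0.05, eps=1e-8):
    """Predict each sample from `taps` samples located one pitch period
    (and a little more) in the past, adapting the weights with NLMS.
    Returns the predictor output, taken as the enhanced signal."""
    weights = [0.0] * taps
    enhanced = [0.0] * len(noisy)
    start = pitch_period + taps - 1          # first index with a full reference
    for n in range(start, len(noisy)):
        # Reference vector: samples delayed by at least one pitch period.
        ref = [noisy[n - pitch_period - j] for j in range(taps)]
        estimate = sum(w * r for w, r in zip(weights, ref))
        error = noisy[n] - estimate
        norm = sum(r * r for r in ref) + eps  # NLMS normalization term
        weights = [w + mu * error * r / norm for w, r in zip(weights, ref)]
        enhanced[n] = estimate
    return enhanced
```

With real speech the pitch period varies over time, so a practical version would need to re-estimate `pitch_period` frame by frame and handle unvoiced segments separately.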
