• Title/Abstract/Keywords: VOCAL SIGNAL

85 search results; processing time 0.019 s

Vocal Enhancement for Improving the Performance of Vocal Pitch Detection

  • 이세원;송재종;이석필;박호종
    • The Journal of the Acoustical Society of Korea / Vol. 30, No. 6 / pp. 353-359 / 2011
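  • This paper proposes a preprocessing technique that enhances the vocal signal in a music signal in order to improve vocal pitch detection in polyphonic music. The proposed vocal enhancement technique predicts the accompaniment signal from the input polyphonic music signal and processes the predicted accompaniment to match the level of the input vocal signal, producing an accompaniment replica signal. Finally, the accompaniment replica is removed from the original polyphonic music signal in the frequency domain to produce a vocal-enhanced output signal. The same vocal pitch detection method was applied to the original music signal and to the signal enhanced by the proposed method, and the pitch detection accuracy was measured for each; the proposed technique improved pitch detection accuracy by 7.1 percentage points on average.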
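The final frequency-domain removal step in this abstract can be sketched as a per-bin magnitude subtraction. This is a minimal illustration and not the authors' implementation: the abstract does not say how the accompaniment is predicted or how its level is matched, so the function name `enhance_vocal`, the `gain` and `floor` parameters, and the toy magnitude spectra are all assumptions.

```python
def enhance_vocal(mix_mag, accomp_mag, gain=1.0, floor=0.0):
    """Subtract a scaled accompaniment magnitude from the mixture, bin by bin.

    mix_mag    -- magnitude spectrum of the polyphonic mixture (one frame)
    accomp_mag -- magnitude spectrum of the predicted accompaniment (assumed given)
    gain       -- hypothetical level-matching factor for the accompaniment replica
    floor      -- lower bound so that no bin becomes negative
    """
    if len(mix_mag) != len(accomp_mag):
        raise ValueError("spectra must have the same number of bins")
    return [max(m - gain * a, floor) for m, a in zip(mix_mag, accomp_mag)]


# Toy one-frame example with four frequency bins (made-up values).
mix = [1.0, 0.8, 0.5, 0.9]
accomp = [0.6, 0.8, 0.7, 0.1]
enhanced = enhance_vocal(mix, accomp)
```

Bins where the accompaniment estimate meets or exceeds the mixture are clamped to `floor`, a common safeguard in subtraction-based enhancement; whether the paper uses such a floor is not stated.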

Recognition System Using Vocal-Cord Signal

  • 조관현;한문성;박준석;정영규
    • Korean Institute of Electrical Engineers (KIEE): Conference Proceedings / Proceedings of the KIEE 2005 Conference, Information and Control Section / pp. 216-218 / 2005
  • This paper presents a new approach to a noise-robust recognizer for the WPS interface. In noisy environments, speech recognition performance degrades rapidly. To solve this problem, we propose a recognition system that uses the vocal-cord signal instead of speech. The vocal-cord signal has low quality, but it is more robust to environmental noise than the speech signal. As a result, we obtained 75.21% accuracy using MFCC with CMS and 83.72% accuracy using ZCPA with RASTA.

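Of the front-end steps named in this abstract, CMS (cepstral mean subtraction) is simple enough to sketch: the mean of each cepstral coefficient over the utterance is subtracted from every frame, which removes a constant channel offset. The function name and the toy two-frame input below are assumptions, and MFCC extraction itself is not shown.

```python
def cepstral_mean_subtraction(frames):
    """Subtract the per-coefficient mean over all frames from each frame.

    frames -- list of equal-length cepstral vectors, one per analysis frame
    """
    if not frames:
        raise ValueError("need at least one frame")
    n = len(frames)
    dim = len(frames[0])
    means = [sum(frame[i] for frame in frames) / n for i in range(dim)]
    return [[frame[i] - means[i] for i in range(dim)] for frame in frames]


# Toy input: two frames with two coefficients each (made-up values).
normalized = cepstral_mean_subtraction([[1.0, 2.0], [3.0, 6.0]])
```

After normalization, every coefficient has zero mean across the utterance.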

Acoustic Measures from Normal and Vocal Polyp Patients

  • 최홍식;장미숙;이정준
    • Journal of the Korean Society of Laryngology, Phoniatrics and Logopedics / Vol. 5, No. 1 / pp. 38-43 / 1994
  • Normal vocal cords vibrate regularly, whereas pathologic vocal cords show irregularity between peaks. Jitter is the fluctuation in the time interval between peaks, and shimmer is the cycle-to-cycle variation in the amplitude of the peaks. We objectively investigated the vocal vibration of normal Korean subjects. The fundamental frequency, jitter, shimmer, and SNR (signal-to-noise ratio) of normal subjects were compared with those of vocal polyp patients using the CSpeech program, to assess whether pathologic vocal vibration can be distinguished from normal vibration. The results were as follows: the fundamental frequency of vocal polyp patients differed greatly from that of normal subjects only in female cases, whereas jitter and shimmer were significantly greater in vocal polyp patients than in normal subjects in both male and female cases. SNR was lower in vocal polyp patients than in normal subjects. In conclusion, fundamental frequency, jitter, shimmer, and SNR may be meaningful parameters for distinguishing pathologic vibration from normal vibration.

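The jitter and shimmer definitions in this abstract can be illustrated with one common "relative" formulation: the mean absolute difference between consecutive cycles divided by the mean value. This is an assumption for illustration; the abstract does not give the exact formulas used by the CSpeech program, and the function name and sample values are hypothetical.

```python
def relative_perturbation(values):
    """Mean absolute cycle-to-cycle difference divided by the mean value.

    Applied to pitch periods this gives a relative jitter measure;
    applied to peak amplitudes it gives a relative shimmer measure.
    """
    if len(values) < 2:
        raise ValueError("need at least two cycles")
    diffs = [abs(b - a) for a, b in zip(values, values[1:])]
    return (sum(diffs) / len(diffs)) / (sum(values) / len(values))


# Made-up cycle measurements: periods in ms, peak amplitudes in arbitrary units.
periods_ms = [8.0, 12.0, 8.0, 12.0]
amplitudes = [1.0, 1.0, 1.0, 1.0]
jitter = relative_perturbation(periods_ms)
shimmer = relative_perturbation(amplitudes)
```

A perfectly regular voice gives zero for both measures, and larger values indicate more irregular vibration.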

Quality Improvement of Karaoke Mode in SAOC using Cross Prediction based Vocal Estimation Method

  • 이동금;박영철;윤대희
    • The Journal of the Acoustical Society of Korea / Vol. 32, No. 3 / pp. 227-236 / 2013
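  • This paper proposes an algorithm that improves sound quality by estimating and suppressing the residual vocal component present in the output signal of the SAOC Karaoke mode. The residual vocal component can be estimated by cross-predicting the signal synthesized in Karaoke mode and a signal newly synthesized in Solo mode. However, because both signals are synthesized from the same downmix signal, the high correlation between them causes the music component to be removed along with the residual vocal component in the karaoke signal. To address this degradation, a prediction-disturbing signal that reflects psychoacoustic characteristics is applied in the cross-prediction process, and its level is set adaptively according to the masking characteristics of a psychoacoustic model so that degradation of musical quality is minimized. Objective and subjective quality evaluations were carried out on music signals containing a vocal object, and an overall performance improvement was confirmed.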

A Study on the Pitch Detection of Speech Harmonics by the Peak-Fitting

  • 김종국;조왕래;배명진
    • Speech Sciences / Vol. 10, No. 2 / pp. 85-95 / 2003
  • In speech signal processing, accurate pitch detection is very important for speech recognition, synthesis, and analysis. If the pitch is detected exactly, it can be used in analysis to obtain the vocal tract parameters properly, in synthesis to change or maintain naturalness and intelligibility easily, and in recognition to remove speaker-specific characteristics for speaker independence. In this paper, we propose a new pitch detection algorithm. First, positive center clipping is applied in the time domain, using the slope of the speech signal, to emphasize the pitch period of the glottal component with the vocal tract characteristic removed. Next, a rough formant envelope is computed by peak-fitting the spectrum of the original speech signal in the frequency domain. A smoothed formant envelope is then obtained from the rough envelope by linear interpolation. The flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope, and an inverse fast Fourier transform (IFFT) is applied to these flattened harmonics. Finally, we obtain a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. With this algorithm, pitch detection accuracy improved and the gross error rate was reduced in voiced speech regions and in transition regions where the phoneme changes.

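The first step of this abstract, positive center clipping, can be sketched together with a plain autocorrelation peak search. Only the clipping step corresponds to the abstract; the autocorrelation search stands in for the peak-fitting and spectrum-flattening stages, which are not reproduced here. The clipping ratio, the function names, and the synthetic test signal are assumptions.

```python
import math


def positive_center_clip(x, ratio=0.3):
    """Keep only the part of each sample above a threshold; zero the rest.

    The threshold is a hypothetical fraction (ratio) of the frame maximum.
    """
    threshold = ratio * max(x)
    return [s - threshold if s > threshold else 0.0 for s in x]


def autocorr_period(x, min_lag, max_lag):
    """Return the lag in [min_lag, max_lag] with the largest autocorrelation."""
    best_lag, best_val = min_lag, float("-inf")
    for lag in range(min_lag, max_lag + 1):
        val = sum(x[n] * x[n + lag] for n in range(len(x) - lag))
        if val > best_val:
            best_lag, best_val = lag, val
    return best_lag


# Synthetic voiced-like frame: 100 Hz fundamental plus second harmonic at 8 kHz.
fs = 8000
f0 = 100.0
signal = [
    math.sin(2 * math.pi * f0 * n / fs) + 0.5 * math.sin(2 * math.pi * 2 * f0 * n / fs)
    for n in range(800)
]
clipped = positive_center_clip(signal)
period = autocorr_period(clipped, min_lag=20, max_lag=120)
estimated_f0 = fs / period
```

Clipping leaves one positive pulse per cycle, so the autocorrelation peaks sharply at the pitch period and is less influenced by formant structure.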

Development of Speech Training Aids Using Vocal Tract Profile

  • 박상희;김동준;이재혁;윤태성
    • The Transactions of the Korean Institute of Electrical Engineers / Vol. 41, No. 2 / pp. 209-216 / 1992
  • Deaf people train articulation by observing a tutor's mouth, by tactually sensing the motions of the vocal organs, or by using speech training aids. Present speech training aids for the deaf can measure only a single speech parameter or display only frequency spectra as a pseudo-color histogram. This study aims to develop a speech training aid that displays the subject's articulation as a cross section of the vocal organs together with other speech parameters in a single system, so that the subject knows where to correct. To this end, the speech production mechanism is first assumed to be an AR model in order to estimate the articulatory motions of the vocal organs from the speech signal. Next, a vocal tract profile model is constructed using LP analysis. Using this model, articulatory motions for Korean vowels are estimated and displayed in the vocal tract profile graphics.

Detection of Glottal Closure Instant for Voiced Speech Using Wavelet Transform

  • 배건성
    • Speech Sciences / Vol. 7, No. 3 / pp. 153-165 / 2000
  • During the phonation of voiced sounds, the glottis opens and closes because of the periodic vibration of the vocal cords. The instant at which it closes is called the glottal closure instant (GCI) or epoch. Correct detection of the GCI is an important problem in speech processing, for pitch detection, pitch-synchronous analysis, and so on. Recently, it has been shown that the local maxima of the wavelet-transformed speech signal correspond to the GCIs of the speech signal. In this paper, we investigate the accuracy of the GCIs estimated from the wavelet-transformed speech signal. For this purpose, we compare them with the negative peaks of the differentiated EGG signal, which represent the actual GCIs of the speech signal.

Flattening Techniques for Pitch Detection

  • 김종국;조왕래;배명진
    • The Institute of Electronics Engineers of Korea (IEEK): Conference Proceedings / Proceedings of the IEEK 2002 Summer Conference (4) / pp. 381-384 / 2002
  • In speech signal processing, accurate pitch detection is very important for speech recognition, synthesis, and analysis. However, detecting pitch from a speech signal is very difficult because of formant and transition amplitude effects. In this paper, we therefore propose a pitch detection method based on spectrum flattening techniques, which eliminate the formant and transition amplitude effects. In the time domain, positive center clipping is applied to emphasize the pitch period of the glottal component with the vocal tract characteristic removed. A rough formant envelope is then computed by peak-fitting the spectrum of the original speech signal in the frequency domain. As a result, the flattened harmonics waveform is obtained as the algebraic difference between the spectrum of the original speech signal and the smoothed formant envelope. Finally, we obtain a residual signal from which the vocal tract component has been removed. The performance was compared with the LPC, cepstrum, and ACF methods. With this algorithm, pitch detection accuracy improved and the gross error rate was reduced in voiced speech regions and in transition regions where the phoneme changes.

A Study on Speech Recognition using Vocal Tract Area Function

  • 송제혁;김동준
    • Korean Society of Medical and Biological Engineering: Journal of Biomedical Engineering Research / Vol. 16, No. 3 / pp. 345-352 / 1995
  • The LPC cepstrum coefficients, which are acoustic features of the speech signal, have been widely used as feature parameters in various speech recognition systems and have shown good performance. The vocal tract area function is an articulatory feature related to the physiological mechanism of speech production. This paper proposes the vocal tract area function as an alternative feature parameter for speech recognition. Linear predictive analysis using the Burg algorithm and vector quantization are performed. Recognition experiments on 5 Korean vowels and 10 digits are then carried out using the conventional LPC cepstrum coefficients and the vocal tract area function. Recognition using the area function showed slightly better results than recognition using the conventional LPC cepstrum coefficients.

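The link between linear prediction and the vocal tract area function can be sketched with the lossless-tube relation between adjacent section areas and reflection coefficients, which the Burg algorithm produces directly. This is an illustration under stated assumptions and not the paper's procedure: the sign convention k = (A_next - A_prev) / (A_next + A_prev), the normalization of the first section to 1.0, and the function name are all assumptions, and other texts use the opposite sign or start from the other end of the tract.

```python
def area_function(reflection_coeffs, first_area=1.0):
    """Convert reflection coefficients to relative tube-section areas.

    Assumes k = (A_next - A_prev) / (A_next + A_prev), which rearranges to
    A_next = A_prev * (1 + k) / (1 - k). Areas are relative to first_area.
    """
    areas = [first_area]
    for k in reflection_coeffs:
        if not -1.0 < k < 1.0:
            raise ValueError("reflection coefficients must lie in (-1, 1)")
        areas.append(areas[-1] * (1.0 + k) / (1.0 - k))
    return areas


# Made-up coefficients: no change at the first junction, widening at the second.
areas = area_function([0.0, 0.5])
```

Because only area ratios are determined, the result is a relative profile; an absolute scale needs an extra reference that the abstract does not describe.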

Speech training aids for deafs

  • 김동준;윤태성;박상희
    • Institute of Control, Robotics and Systems (ICROS): Conference Proceedings / Proceedings of the 1991 Korean Automatic Control Conference (Domestic Session); KOEX, Seoul; 22-24 Oct. 1991 / pp. 746-751 / 1991
  • Deaf people train articulation by observing a tutor's mouth, by tactually sensing the motions of the vocal organs, or by using speech training aids. Present speech training aids for the deaf can measure only a single speech parameter or display only frequency spectra in histogram or pseudo-color form. This study aims to develop a speech training aid that displays the subject's articulation as a cross section of the vocal organs together with other speech parameters in a single system, so that the subject knows where to correct. To this end, the speech production mechanism is first assumed to be an AR model in order to estimate the articulatory motions of the vocal tract from the speech signal. Next, a vocal tract profile model is constructed using LPC analysis. Using this model, articulatory motions for Korean vowels are estimated and displayed in the vocal tract profile graphics.