• Title, Summary, Keyword: 성별 음성인식 (gender-specific speech recognition)

Search Result 25, Processing Time 0.043 seconds

Comparison of Male/Female Speech Features and Improvement of Recognition Performance by Gender-Specific Speech Recognition (남성과 여성의 음성 특징 비교 및 성별 음성인식에 의한 인식 성능의 향상)

  • Lee, Chang-Young
    • The Journal of the Korea Institute of Electronic Communication Sciences
    • /
    • v.5 no.6
    • /
    • pp.568-574
    • /
    • 2010
  • To improve the speech recognition rate, we compared the performance of speaker-independent and gender-specific speech recognition. For this purpose, 20 male and 20 female speakers each pronounced 300 isolated Korean words, and the utterances were divided into 4 groups: female, male, and two mixed-gender groups. To examine the validity of gender-specific speech recognition, we examined Fourier spectra and MFCC feature vectors averaged separately over the male and female speakers. The results showed a distinction between the two genders, which supports the motivation for gender-specific speech recognition. In the recognition experiments, the error rate for the gender-specific case was less than 50% of that of the speaker-independent case. These results suggest that hierarchical recognition, in which gender is recognized first and speech second, might perform better than the current method of speech recognition.

Speech Identification of Male and Female Speakers in Noisy Speech for Improving Performance of Speech Recognition System (음성인식 시스템의 성능 향상을 위한 잡음음성의 남성 및 여성화자의 음성식별)

  • Choi, Jae-seung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • /
    • pp.619-620
    • /
    • 2017
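  • This paper proposes a gender recognition algorithm that uses a neural network to identify male and female speakers in noisy environments, since the speaker's gender provides very important information to speech recognition algorithms. The proposed neural network uses MFCC coefficients to recognize male and female speakers in each speech section. In the experiments, good gender recognition results were obtained for male and female speakers in an environment with superimposed white noise by using the MFCC feature vectors of the speech signal.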

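Gender Recognition Algorithm for Male and Female Speakers Using a Neural Network under Background Noise (배경잡음 하에서의 신경회로망에 의한 남성화자 및 여성화자의 성별인식 알고리즘)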

  • Choe, Jae-Seung
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • /
    • pp.515-517
    • /
    • 2013
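  • This paper proposes a neural-network-based, speaker-dependent speech recognition algorithm that can recognize male and female gender in noisy environments. The proposed algorithm is trained by a neural network using LPC cepstrum coefficients to recognize male and female speakers. The experiments report the recognition results of the neural networks for white noise and car noise. The recognition experiments yielded a maximum recognition rate of 96% or more for white noise and 88% or more for car noise.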

Improved Gender Recognition System for Male and Female Speakers Using MFCC (MFCC를 사용한 개선된 남성 및 여성화자의 성별인식시스템)

  • Choi, Jae-Seung
    • The Journal of Korean Institute of Information Technology
    • /
    • v.15 no.9
    • /
    • pp.23-28
    • /
    • 2017
  • In the field of speech recognition, recognizing the speaker's gender provides important information to the speech recognition system. In particular, voice-based gender recognition distinguishes speech signals uttered by male and female speakers. This paper proposes a gender recognition system that improves speech recognition rates in background noise by using a multi-layer perceptron neural network (MLPNN) to recognize the speaker's gender. The proposed MLPNN recognizes the speaker's gender in each speech section using mel-frequency cepstral coefficient (MFCC) feature parameters. In the experiments, using the MFCC feature vector on speech contaminated with white noise, the proposed algorithm achieved average recognition rates of 94.92% for male speakers and 99.39% for female speakers. The experiments therefore confirm that the proposed algorithm is relatively effective for white noise.
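MFCC features are computed on the mel frequency scale. The following is a minimal sketch of the Hz-to-mel mapping and of filter center frequencies spaced evenly in mel. It assumes the widely used `2595 * log10(1 + f / 700)` form of the mel scale; the abstract does not say which variant, filter count, or frequency range the authors used, so the function names and parameters here are illustrative only.

```python
import math


def hz_to_mel(f_hz):
    """Convert a frequency in Hz to mel (assumed 2595*log10(1 + f/700) variant)."""
    return 2595.0 * math.log10(1.0 + f_hz / 700.0)


def mel_to_hz(mel):
    """Inverse of hz_to_mel."""
    return 700.0 * (10.0 ** (mel / 2595.0) - 1.0)


def mel_center_frequencies(n_filters, f_min_hz, f_max_hz):
    """Center frequencies (Hz) of n_filters filters spaced evenly on the mel scale.

    n_filters, f_min_hz and f_max_hz are hypothetical parameters chosen by the
    caller; they are not taken from the paper.
    """
    lo = hz_to_mel(f_min_hz)
    hi = hz_to_mel(f_max_hz)
    step = (hi - lo) / (n_filters + 1)
    return [mel_to_hz(lo + step * (i + 1)) for i in range(n_filters)]
```

Because the spacing is uniform in mel, the centers are packed densely at low frequencies and spread out at high frequencies, which is the property that makes mel-based features sensitive to the low-frequency region of speech.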

A Study on The Improvement of Emotion Recognition by Gender Discrimination (성별 구분을 통한 음성 감성인식 성능 향상에 대한 연구)

  • Cho, Youn-Ho;Park, Kyu-Sik
    • Journal of the Institute of Electronics Engineers of Korea SP
    • /
    • v.45 no.4
    • /
    • pp.107-114
    • /
    • 2008
  • In this paper, we construct a speech emotion recognition system that classifies four emotions (neutral, happy, sad, and angry) from speech based on male/female gender discrimination. The proposed system first determines whether a queried speech sample is male or female, and then improves emotion classification performance by using separately optimized feature vectors for each gender. As the emotion feature vector, this paper adopts ZCPA (Zero Crossings with Peak Amplitudes), which is well known in the speech recognition area for its noise robustness, and the features are optimized using the SFS method. For emotion pattern classification, k-NN and SVM classifiers are compared experimentally. In computer simulations, the proposed system achieved a classification rate of about 85.3% over the four emotion states. This suggests that the proposed system could be used in applications such as call centers, humanoid robots, and ubiquitous computing.
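The SFS step mentioned in this abstract is commonly read as sequential forward selection: start from an empty feature set and greedily add whichever feature most improves a criterion. The sketch below assumes that reading. The `score` callable is hypothetical; in a setup like the one described it would plausibly be the validation accuracy of a k-NN or SVM classifier on the chosen features, but the abstract does not specify the criterion.

```python
def sequential_forward_selection(n_features, k, score):
    """Greedy sequential forward selection (SFS).

    n_features: total number of candidate features (indexed 0..n_features-1).
    k:          number of features to select.
    score:      hypothetical callable mapping a list of feature indices to a
                number, where higher is better.
    Returns the selected feature indices in the order they were added.
    """
    selected = []
    remaining = list(range(n_features))
    while remaining and len(selected) < k:
        # Try adding each remaining feature and keep the best-scoring one.
        best = max(remaining, key=lambda f: score(selected + [f]))
        selected.append(best)
        remaining.remove(best)
    return selected
```

SFS evaluates on the order of `n_features * k` subsets instead of all combinations, at the cost of possibly missing the globally best subset, since a feature is never removed once added.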

Speaker-dependent Speech Recognition Algorithm for Male and Female Classification (남녀성별 분류를 위한 화자종속 음성인식 알고리즘)

  • Choi, Jae-Seung
    • Journal of the Korea Institute of Information and Communication Engineering
    • /
    • v.17 no.4
    • /
    • pp.775-780
    • /
    • 2013
  • This paper proposes a speaker-dependent speech recognition algorithm that uses a neural network to classify male and female speakers in white noise and car noise. The proposed algorithm is trained by the neural network to recognize the speaker's gender using LPC (Linear Predictive Coding) cepstrum coefficients. In the experimental results, after training a total of six neural networks, the maximal improvement of the total speech recognition rate is 96% for white noise and 88% for car noise. Finally, the proposed algorithm is compared with a conventional speech recognition algorithm in a noisy background environment.
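LPC cepstrum coefficients can be obtained from LPC coefficients with a standard recursion. The following is a minimal sketch under an assumed sign convention, the all-pole model H(z) = 1 / (1 - sum_k a_k z^-k); the abstract does not state the convention, the LPC order, or the number of cepstral coefficients, and the gain term c_0 is omitted here.

```python
def lpc_to_cepstrum(a, n_ceps):
    """Convert LPC coefficients to cepstral coefficients c_1..c_n_ceps.

    a[k-1] holds a_k for the assumed model H(z) = 1 / (1 - sum_k a_k z^-k).
    Recursion: c_n = a_n + sum_{k=1}^{n-1} (k/n) * c_k * a_{n-k},
    with a_n taken as 0 for n greater than the LPC order.
    """
    p = len(a)
    c = []
    for n in range(1, n_ceps + 1):
        acc = a[n - 1] if n <= p else 0.0
        for k in range(1, n):
            if n - k <= p:
                acc += (k / n) * c[k - 1] * a[n - k - 1]
        c.append(acc)
    return c
```

As a sanity check, a single-pole model with a_1 = a has the closed-form cepstrum c_n = a^n / n, which the recursion reproduces.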

Gender Recognition Algorithm Based on LPC Cepstrum and FFT Spectrum (LPC 켑스트럼 및 FFT 스펙트럼에 의한 성별 인식 알고리즘)

  • Choe, Jae-Seung;Jeong, Byeong-Gu
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference
    • /
    • /
    • pp.63-65
    • /
    • 2012
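  • This paper proposes a gender recognition algorithm that takes the FFT spectrum and LPC cepstrum as input to determine whether the input speech comes from a male or a female speaker. In particular, the feature vectors of male and female speakers are compared and analyzed, and the differences between these acoustic feature vectors are used in gender recognition experiments with a neural network. Good gender recognition rates were obtained for male and female speakers, especially when feature vectors consisting of the 12th-order LPC cepstrum and the 8th-order low-band FFT spectrum were used.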

Voice-Based Gender Identification Employing Support Vector Machines (음성신호 기반의 성별인식을 위한 Support Vector Machines의 적용)

  • Lee, Kye-Hwan;Kang, Sang-Ick;Kim, Deok-Hwan;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.2
    • /
    • pp.75-79
    • /
    • 2007
  • We propose an effective voice-based gender identification method using a support vector machine (SVM). The SVM is a binary classification algorithm that separates two groups by finding an arbitrary nonlinear boundary in a feature space and is known to yield high classification performance. In the present work, we compare the identification performance of the SVM with that of a Gaussian mixture model (GMM) using mel-frequency cepstral coefficients (MFCC). A novel feature fusion scheme based on a combination of MFCC and pitch is proposed with the aim of improving the performance of SVM-based gender identification. Experimental results indicate that the gender identification performance of the SVM is significantly better than that of the GMM. Moreover, the performance is substantially improved when the proposed feature fusion technique is applied.

GMM-Based Gender Identification Employing Group Delay (Group Delay를 이용한 GMM기반의 성별 인식 알고리즘)

  • Lee, Kye-Hwan;Lim, Woo-Hyung;Kim, Nam-Soo;Chang, Joon-Hyuk
    • The Journal of the Acoustical Society of Korea
    • /
    • v.26 no.6
    • /
    • pp.243-249
    • /
    • 2007
  • We propose an effective voice-based gender identification method using group delay (GD). Generally, features for speech recognition are composed of magnitude information rather than phase information. In our approach, we exploit the difference between male and female speech in GD, which is a derivative of the Fourier transform phase. We also propose a novel feature fusion scheme based on a combination of GD and magnitude information such as mel-frequency cepstral coefficients (MFCC), linear predictive coding (LPC) coefficients, reflection coefficients, and formants. The experimental results indicate that GD is effective in discriminating gender and that the performance is significantly improved when the proposed feature fusion technique is applied.
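Group delay is conventionally defined as the negative derivative of the Fourier phase with respect to frequency. A minimal sketch of one standard way to compute it without explicit phase unwrapping follows: with X the DFT of x[n] and Y the DFT of n·x[n], the group delay at each bin is Re(Y/X). This is an illustration of the quantity itself, not the authors' feature extraction pipeline; the function names, the `eps` guard, and the plain O(N²) DFT are assumptions made to keep the example self-contained.

```python
import cmath
import math


def dft(x, n_fft):
    """Plain O(N^2) DFT, kept dependency-free for illustration."""
    return [
        sum(x[n] * cmath.exp(-2j * math.pi * k * n / n_fft) for n in range(len(x)))
        for k in range(n_fft)
    ]


def group_delay(x, n_fft=None, eps=1e-12):
    """Group delay in samples at each DFT bin.

    Uses tau[k] = Re(Y[k] / X[k]) = (Xr*Yr + Xi*Yi) / |X|^2, where Y is the DFT
    of n * x[n]. Bins with near-zero power are set to 0.0 (an arbitrary choice).
    """
    n_fft = n_fft or len(x)
    X = dft(x, n_fft)
    Y = dft([n * v for n, v in enumerate(x)], n_fft)
    gd = []
    for Xk, Yk in zip(X, Y):
        power = abs(Xk) ** 2
        if power > eps:
            gd.append((Xk.real * Yk.real + Xk.imag * Yk.imag) / power)
        else:
            gd.append(0.0)
    return gd
```

For a unit impulse delayed by three samples, every bin should report a group delay close to 3, which is a quick way to check the sign convention.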

A Design and Implementation of Speech Recognition Preprocessing System using Formant Frequency (포만트 주파수를 이용한 음성인식 전처리 시스템의 설계 및 구현)

  • 김태욱;한승진;김민성;이정현
    • Proceedings of the Korean Information Science Society Conference
    • /
    • /
    • pp.198-200
    • /
    • 1999
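  • Human speech carries not only information about meaning but also characteristics specific to the speaker's gender. That is, speech can be classified into female speech, which has strong high-frequency content, and male speech. However, existing HMM-based speech recognition systems do not take these characteristics into account and are built with a single HMM. Experiments with the algorithm presented in this paper showed that male and female formant frequencies differ by 100~30 Hz, and we propose a method that uses this characteristic to distinguish male from female speech. In addition, a GMM is trained separately on male and female speech, and during recognition the input is recognized with the male HMM if its formant characteristics indicate male speech and with the female HMM if they indicate female speech. This improved results over the existing recognition method by 5.2% for male speech and 4.4% for female speech.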
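The two-stage structure described here (decide the gender first, then recognize with the model for that gender) can be sketched as a simple dispatch. Everything in the block is hypothetical scaffolding: `classify_gender` stands in for the formant-based decision and the entries of `recognizers` stand in for the gender-specific HMMs, neither of which is specified in the abstract.

```python
def recognize_hierarchically(features, classify_gender, recognizers):
    """Route an utterance to a gender-specific recognizer.

    features:        whatever representation the two stages share (hypothetical).
    classify_gender: hypothetical callable returning a label such as "male"
                     or "female", e.g. from formant characteristics.
    recognizers:     dict mapping each gender label to a callable that returns
                     the recognition result for the given features.
    Returns (gender_label, recognition_result).
    """
    gender = classify_gender(features)
    if gender not in recognizers:
        raise KeyError(f"no recognizer registered for gender label {gender!r}")
    return gender, recognizers[gender](features)
```

One consequence of this design is that a gender misclassification sends the utterance to the wrong model, so the first stage needs to be accurate for the second stage's gains to carry through.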
