• Title/Summary/Keyword: Text-independent speaker verification

Search Result 19, Processing Time 0.019 seconds

Text Independent Speaker Verficiation Using Dominant State Information of HMM-UBM (HMM-UBM의 주 상태 정보를 이용한 음성 기반 문맥 독립 화자 검증)

  • Shon, Suwon;Rho, Jinsang;Kim, Sung Soo;Lee, Jae-Won;Ko, Hanseok
    • The Journal of the Acoustical Society of Korea
    • /
    • v.34 no.2
    • /
    • pp.171-176
    • /
    • 2015
  • We present a speaker verification method by extracting i-vectors based on dominant state information of Hidden Markov Model (HMM) - Universal Background Model (UBM). Ergodic HMM is used for estimating UBM so that various characteristic of individual speaker can be effectively classified. Unlike Gaussian Mixture Model(GMM)-UBM based speaker verification system, the proposed system obtains i-vectors corresponding to each HMM state. Among them, the i-vector for feature is selected by extracting it from the specific state containing dominant state information. Relevant experiments are conducted for validating the proposed system performance using the National Institute of Standards and Technology (NIST) 2008 Speaker Recognition Evaluation (SRE) database. As a result, 12 % improvement is attained in terms of equal error rate.

Noise Rabust Speaker Verification Using Sub-Band Weighting (서브밴드 가중치를 이용한 잡음에 강인한 화자검증)

  • Kim, Sung-Tak;Ji, Mi-Kyong;Kim, Hoi-Rin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.28 no.3
    • /
    • pp.279-284
    • /
    • 2009
  • Speaker verification determines whether the claimed speaker is accepted based on the score of the test utterance. In recent years, methods based on Gaussian mixture models and universal background model have been the dominant approaches for text-independent speaker verification. These speaker verification systems based on these methods provide very good performance under laboratory conditions. However, in real situations, the performance of speaker verification system is degraded dramatically. For overcoming this performance degradation, the feature recombination method was proposed, but this method had a drawback that whole sub-band feature vectors are used to compute the likelihood scores. To deal with this drawback, a modified feature recombination method which can use each sub-band likelihood score independently was proposed in our previous research. In this paper, we propose a sub-band weighting method based on sub-band signal-to-noise ratio which is combined with previously proposed modified feature recombination. This proposed method reduces errors by 28% compared with the conventional feature recombination method.

Speaker verification system combining attention-long short term memory based speaker embedding and I-vector in far-field and noisy environments (Attention-long short term memory 기반의 화자 임베딩과 I-vector를 결합한 원거리 및 잡음 환경에서의 화자 검증 알고리즘)

  • Bae, Ara;Kim, Wooil
    • The Journal of the Acoustical Society of Korea
    • /
    • v.39 no.2
    • /
    • pp.137-142
    • /
    • 2020
  • Many studies based on I-vector have been conducted in a variety of environments, from text-dependent short-utterance to text-independent long-utterance. In this paper, we propose a speaker verification system employing a combination of I-vector with Probabilistic Linear Discriminant Analysis (PLDA) and speaker embedding of Long Short Term Memory (LSTM) with attention mechanism in far-field and noisy environments. The LSTM model's Equal Error Rate (EER) is 15.52 % and the Attention-LSTM model is 8.46 %, improving by 7.06 %. We show that the proposed method solves the problem of the existing extraction process which defines embedding as a heuristic. The EER of the I-vector/PLDA without combining is 6.18 % that shows the best performance. And combined with attention-LSTM based embedding is 2.57 % that is 3.61 % less than the baseline system, and which improves performance by 58.41 %.

Frame Selection, Hybrid, Modified Weighting Model Rank Method for Robust Text-independent Speaker Identification (강건한 문맥독립 화자식별을 위한 프레임 선택방법, 복합방법, 수정된 가중모델순위 방법)

  • 김민정;오세진;정호열;정현열
    • The Journal of the Acoustical Society of Korea
    • /
    • v.21 no.8
    • /
    • pp.735-743
    • /
    • 2002
  • In this paper, we propose three new text-independent speaker identification methods. At first, to exclude the frames not having enough features of speaker's vocal from calculation of the maximum likelihood, we propose the FS(Frame Selection) method. This approach selects the important frames by evaluating the difference between the biggest likelihood and the second in each frame, and uses only the frames in calculating the score of likelihood. Our secondly proposed, called the Hybrid, is a combined version of the FS and WMR(Weighting Model Rank). This method determines the claimed speaker using exponential function weights, instead of likelihood itself, only on the selected frames obtained from the FS method. The last proposed, called MWMR (Modified WMR), considers both original likelihood itself and its relative position, when the claimed speaker is determined. It is different from the WMR that take into account only the relative position of likelihood. Through the experiments of the speaker identification, we show that the all the proposed have higher identification rates than the ML. In addition, the Hybrid and MWMR have higher identification rate about 2% and about 3% than WMR, respectively.

A Study on the Fast Enrollment of Text-Independent Speaker Verification for Vehicle Security (차량 보안을 위한 어구독립 화자증명의 등록시간 단축에 관한 연구)

  • Lee, Tae-Seung;Choi, Ho-Jin
    • Journal of Advanced Navigation Technology
    • /
    • v.5 no.1
    • /
    • pp.1-10
    • /
    • 2001
  • Speech has a good characteristics of which car drivers busy to concern with miscellaneous operation can make use in convenient handling and manipulating of devices. By utilizing this, this works proposes a speaker verification method for protecting cars from being stolen and identifying a person trying to access critical on-line services. In this, continuant phonemes recognition which uses language information of speech and MLP(mult-layer perceptron) which has some advantages against previous stochastic methods are adopted. The recognition method, though, involves huge computation amount for learning, so it is somewhat difficult to adopt this in speaker verification application in which speakers should enroll themselves at real time. To relieve this problem, this works presents a solution that introduces speaker cohort models from speaker verification score normalization technique established before, dividing background speakers into small cohorts in advance. As a result, this enables computation burden to be reduced through classifying the enrolling speaker into one of those cohorts and going through enrollment for only that cohort.

  • PDF

Text-Independent Speaker Verification Based on MLP Cohort Model (MLP 군집 모델에 기반한 어구독립 화자증명)

  • 이태승;최호진
    • Proceedings of the Korean Information Science Society Conference
    • /
    • 2000.10b
    • /
    • pp.434-436
    • /
    • 2000
  • 본 논문에서는 기존의 확률적 화자군집 모델을 MLP(multi-layer perceptron)로 구현하는 방법과 원형 화자군집 모델이 갖는 문제를 해결할 수정 모델을 제시한다. 화자군집 모델은 화자등록 시간에 민감한 실용 환경에서 중요한 의미를 지닌다. 본 연구에서 사용한 인식단위는 여러 음소계열에서 지속적인 부분을 추출한 지속음이므로 화자등록과 증명 단계에서 특정한 어구에 한정되지 않는 어구독립 방식을 채택한다.

  • PDF

Realization a Text Independent Speaker Identification System with Frame Level Likelihood Normalization (프레임레벨유사도정규화를 적용한 문맥독립화자식별시스템의 구현)

  • 김민정;석수영;김광수;정현열
    • Journal of the Institute of Convergence Signal Processing
    • /
    • v.3 no.1
    • /
    • pp.8-14
    • /
    • 2002
  • In this paper, we realized a real-time text-independent speaker recognition system using gaussian mixture model, and applied frame level likelihood normalization method which shows its effects in verification system. The system has three parts as front-end, training, recognition. In front-end part, cepstral mean normalization and silence removal method were applied to consider speaker's speaking variations. In training, gaussian mixture model was used for speaker's acoustic feature modeling, and maximum likelihood estimation was used for GMM parameter optimization. In recognition, likelihood score was calculated with speaker models and test data at frame level. As test sentences, we used text-independent sentences. ETRI 445 and KLE 452 database were used for training and test, and cepstrum coefficient and regressive coefficient were used as feature parameters. The experiment results show that the frame-level likelihood method's recognition result is higher than conventional method's, independently the number of registered speakers.

  • PDF

An Implementation of Security System Using Speaker Recognition Algorithm (화자인식 알고리즘을 이용한 보안 시스템 구축)

  • Shin, You-Shik;Park, Kee-Young;Kim, Chong-Kyo
    • Journal of the Korean Institute of Telematics and Electronics T
    • /
    • v.36T no.4
    • /
    • pp.17-23
    • /
    • 1999
  • This paper described a security system using text-independent speaker recognition algorithm. Security system is based on PIC16F84 and sound card. Speaker recognition algorithm applied a k-means based model and weighted cepstrum for speech features. As the experimental results, recognition rate of the training data is 100%, non-training data is 99%. Also false rejection rate is 1%, false acceptance rate is 0% and verification mean error rate is 0.5% for registered 5 persons.

  • PDF

Performance Improvement in GMM-based Text-Independent Speaker Verification System (GMM 기반의 문맥독립 화자 검증 시스템의 성능 향상)

  • Hahm Seong-Jun;Shen Guang-Hu;Kim Min-Jung;Kim Joo-Gon;Jung Ho-Youl;Chung Hyun-Yeol
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • autumn
    • /
    • pp.131-134
    • /
    • 2004
  • 본 논문에서는 GMM(Gaussian Mixture Model)을 이용한 문맥독립 화자 검증 시스템을 구현한 후, arctan 함수를 이용한 정규화 방법을 사용하여 화자검증실험을 수행하였다. 특징파라미터로서는 선형예측방법을 이용한 켑스트럼 계수와 회귀계수를 사용하고 화자의 발성 변이를 고려하여 CMN(Cepstral Mean Normalization)을 적용하였다. 화자모델 생성을 위한 학습단에서는 화자발성의 음향학적 특징을 잘 표현할 수 있는 GMM(Gaussian Mixture Model)을 이용하였고 화자 검증단에서는 ML(Maximum Likelihood)을 이용하여 유사도를 계산하고 기존의 정규화 방법과 arctan 함수를 이용한 방법에 의해 정규화된 점수(score)와 미리 정해진 문턱값과 비교하여 검증하였다. 화자 검증 실험결과, arctan 함수를 부가한 방법이 기존의 방법보다 항상 향상된 EER을 나타냄을 확인할 수 있었다.

  • PDF