• Title/Summary/Keyword: speaker


A Study on Background Speaker Selection Method in Speaker Verification System (화자인증 시스템에서 선정 방법에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences / v.9 no.2 / pp.135-146 / 2002
  • Generally, a speaker verification system improves its recognition rate by normalizing the log-likelihood ratio, using the model of the speaker to be verified and its background speaker model. The speaker-based cohort method is one of the widely used methods for selecting background speaker models. Recently, a Gaussian-based cohort model has been suggested as a virtually synthesized cohort model. Unlike a speaker-based model, this method chooses, from several neighboring speakers' probability distributions, only those close to the target speaker's probability distributions and synthesizes a new virtual speaker model from them. It gives better results than the existing speaker-based method. This study compared the existing speaker-based background speaker models with virtual speaker models and then constructed new virtual background speaker model groups that combine them in a certain ratio. For this purpose, the study built a speaker verification system that uses a GMM (Gaussian Mixture Model) and found that the suggested method of selecting virtual background speaker models improves performance.
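
The score normalization this abstract refers to can be sketched as follows. This is a minimal illustration and not the paper's implementation: it assumes 1-D features, GMMs stored as lists of (weight, mean, variance) tuples, and a cohort score taken as the log of the mean cohort likelihood. The names `gmm_loglik` and `normalized_score` are hypothetical.

```python
import math

def gmm_loglik(frames, gmm):
    """Average per-frame log-likelihood of 1-D frames under a GMM.

    gmm is a list of (weight, mean, variance) tuples (assumed layout).
    """
    total = 0.0
    for x in frames:
        p = sum(w * math.exp(-(x - m) ** 2 / (2 * v)) / math.sqrt(2 * math.pi * v)
                for w, m, v in gmm)
        total += math.log(p)
    return total / len(frames)

def normalized_score(frames, target, cohort):
    """Log-likelihood ratio of the claimed speaker against a cohort of background models."""
    target_ll = gmm_loglik(frames, target)
    cohort_lls = [gmm_loglik(frames, g) for g in cohort]
    # Log of the mean cohort likelihood, computed stably (log-mean-exp).
    top = max(cohort_lls)
    cohort_ll = top + math.log(sum(math.exp(l - top) for l in cohort_lls) / len(cohort_lls))
    return target_ll - cohort_ll

# Toy models: the claimed speaker is centred at 0, the cohort speakers at +3 and -3.
target = [(1.0, 0.0, 1.0)]
cohort = [[(1.0, 3.0, 1.0)], [(1.0, -3.0, 1.0)]]
genuine = normalized_score([0.1, -0.2, 0.05], target, cohort)
impostor = normalized_score([2.9, 3.1, 3.0], target, cohort)
```

A claim would be accepted when the score exceeds a tuned threshold. In this toy setup the genuine score is positive and the impostor score is negative.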


On Speaker Adaptations with Sparse Training Data for Improved Speaker Verification

  • Ahn, Sung-Joo;Kang, Sun-Mee;Ko, Han-Seok
    • Speech Sciences / v.7 no.1 / pp.31-37 / 2000
  • This paper concerns effective speaker adaptation methods to solve the over-training problem in speaker verification, which frequently occurs when modeling a speaker with sparse training data. While various speaker adaptation methods have already been applied to speech recognition, they have not yet been formally considered in speaker verification. This paper proposes speaker adaptation methods that combine MAP and MLLR adaptation, both used successfully in speech recognition, and applies them to speaker verification. Experimental results show that the speaker verification system using weighted MAP and MLLR adaptation outperforms conventional speaker models without adaptation by a factor of up to 5. These results show that the speaker adaptation method achieves significantly better performance even when only a small amount of training data is available for speaker verification.


A Study on Background Speaker Model Design for Portable Speaker Verification Systems (휴대용 화자확인시스템을 위한 배경화자모델 설계에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences / v.10 no.2 / pp.35-43 / 2003
  • General speaker verification systems improve their recognition performance by normalizing the log-likelihood ratio, using the model of the speaker to be verified and its background speaker model. These systems therefore rely heavily on the availability of large speaker-independent databases for background speaker model design. This constraint, however, may be a burden in practical and portable devices such as palm-top computers or wireless handsets, which place a premium on computation and memory. This paper presents a new approach to GMM-based background model design for portable speaker verification systems when the enrollment data is available. The approach modifies three parameters of the GMM speaker model, namely the mixture weights, means and covariances, along with a reduced mixture order. In an experiment on a 20-speaker population from the YOHO database, we found that this method shows promise for effective use in a portable speaker verification system.


Speaker Verification Using Hidden LMS Adaptive Filtering Algorithm and Competitive Learning Neural Network (Hidden LMS 적응 필터링 알고리즘을 이용한 경쟁학습 화자검증)

  • Cho, Seong-Won;Kim, Jae-Min
    • The Transactions of the Korean Institute of Electrical Engineers D / v.51 no.2 / pp.69-77 / 2002
  • Speaker verification can be classified into two categories: text-dependent speaker verification and text-independent speaker verification. In this paper, we discuss text-dependent speaker verification. A text-dependent speaker verification system determines whether the sound characteristics of the speaker match those of a specific person. We obtain the speaker data using a sound card in various noisy conditions, apply a new Hidden LMS (Least Mean Square) adaptive algorithm to it, and extract LPC (Linear Predictive Coding)-cepstrum coefficients as feature vectors. Finally, we use a competitive learning neural network for speaker verification. The proposed hidden LMS adaptive filter using a neural network reduces noise and enhances features in various noisy conditions. We construct a separate neural network for each speaker, so the whole network does not need to be retrained when a new speaker is added, which makes the system easy to expand. We show experimentally that the proposed method improves speaker verification performance.
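
For orientation, the textbook LMS adaptive filter that this work builds on can be sketched as below. This is the standard algorithm, not the paper's Hidden LMS variant, whose details the abstract does not give. The function name `lms_filter`, the tap count and the step size are assumptions chosen for illustration.

```python
import random

def lms_filter(reference, desired, num_taps=4, step=0.05):
    """Textbook LMS: adapt FIR weights so the filtered reference tracks `desired`.

    Returns (error_signal, final_weights).
    """
    weights = [0.0] * num_taps
    errors = []
    for n in range(len(desired)):
        # Most recent `num_taps` reference samples, newest first, zero-padded at the start.
        window = [reference[n - k] if n - k >= 0 else 0.0 for k in range(num_taps)]
        estimate = sum(w * x for w, x in zip(weights, window))
        error = desired[n] - estimate
        # Stochastic gradient step on the squared error.
        weights = [w + 2 * step * error * x for w, x in zip(weights, window)]
        errors.append(error)
    return errors, weights

# Toy check: identify an unknown 2-tap system from its input and output.
rng = random.Random(0)
ref = [rng.uniform(-1, 1) for _ in range(2000)]
des = [0.5 * ref[n] - 0.3 * (ref[n - 1] if n > 0 else 0.0) for n in range(2000)]
errors, weights = lms_filter(ref, des)
```

In this noise-free toy setup the weights settle close to the unknown system's coefficients (0.5 and -0.3) and the error signal decays toward zero.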

Wireless Multi-Channel Speaker Using Wireless Lan (무선랜을 이용한 다채널 Speaker 구현방법에 관한 연구)

  • Hong, Sug-Hoon
    • Proceedings of the KIEE Conference / 2008.10b / pp.258-259 / 2008
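  • Conventional wired home theater systems have come to require complex speaker wiring as sound formats evolve toward more channels, which inconveniences users. As a result, users either use only some of the speakers or place all of them at the front, and so fail to benefit from multi-channel speakers. Demand for wireless multi-channel speakers is therefore growing, but transmission delay arises when data is sent to each wireless speaker unit, and this problem has kept wireless speakers from spreading quickly. This paper proposes a method that uses a compensation algorithm to reduce the transmission delay that is a problem when implementing a wireless home theater system over wireless LAN.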


Emotional Speaker Recognition using Emotional Adaptation (감정 적응을 이용한 감정 화자 인식)

  • Kim, Weon-Goo
    • The Transactions of The Korean Institute of Electrical Engineers / v.66 no.7 / pp.1105-1110 / 2017
  • Speech with various emotions degrades the performance of a speaker recognition system. This paper proposes a speaker recognition method using emotional adaptation to improve the performance of speaker recognition on affective speech. For emotional adaptation, an emotional speaker model is generated from an emotion-free speaker model using a small amount of affective training speech and a speaker adaptation method. Since it is not easy to obtain sufficient affective speech from a speaker for training, using only a small amount of affective speech is very practical in real situations. The proposed method was evaluated using a Korean database containing four emotions. Experimental results show that the proposed method performs better than conventional methods in speaker verification and speaker recognition.

A Korean Multi-speaker Text-to-Speech System Using d-vector (d-vector를 이용한 한국어 다화자 TTS 시스템)

  • Kim, Kwang Hyeon;Kwon, Chul Hong
    • The Journal of the Convergence on Culture Technology / v.8 no.3 / pp.469-475 / 2022
  • Training the model of a deep learning-based single-speaker TTS system requires a speech DB of tens of hours and a lot of training time. This is inefficient in terms of time and cost for training multi-speaker or personalized TTS models. The voice cloning method uses a speaker encoder model to make the TTS model of a new speaker. Through the trained speaker encoder model, a speaker embedding vector representing the timbre of the new speaker is created from a small amount of the new speaker's speech data that was not used for training. In this paper, we propose a multi-speaker TTS system to which voice cloning is applied. The proposed TTS system consists of a speaker encoder, a synthesizer and a vocoder. The speaker encoder applies the d-vector technique used in the speaker recognition field. The timbre of the new speaker is expressed by adding the d-vector derived from the trained speaker encoder as an input to the synthesizer. Experimental results from MOS and timbre similarity listening tests show that the proposed TTS system performs well.
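
The d-vector idea mentioned above can be illustrated with a short sketch: frame-level encoder outputs are averaged into one utterance-level vector, and two such vectors are compared by cosine similarity. This assumes the common formulation from speaker recognition and is not the paper's code. The frame embeddings below are made-up numbers standing in for the output of a trained speaker encoder, and all function names are hypothetical.

```python
import math

def l2_normalize(vec):
    """Scale a vector to unit Euclidean length."""
    norm = math.sqrt(sum(x * x for x in vec))
    return [x / norm for x in vec]

def d_vector(frame_embeddings):
    """Average L2-normalized frame-level embeddings into one utterance-level vector."""
    normalized = [l2_normalize(f) for f in frame_embeddings]
    dim = len(normalized[0])
    mean = [sum(f[i] for f in normalized) / len(normalized) for i in range(dim)]
    return l2_normalize(mean)

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    return dot / (math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b)))

# Hypothetical encoder outputs: two utterances from speaker A, one from speaker B.
spk_a_utt1 = d_vector([[1.0, 0.1], [0.9, 0.2]])
spk_a_utt2 = d_vector([[1.0, 0.15]])
spk_b_utt1 = d_vector([[0.1, 1.0]])
same = cosine_similarity(spk_a_utt1, spk_a_utt2)
different = cosine_similarity(spk_a_utt1, spk_b_utt1)
```

With these toy values, the two utterances from the same speaker score higher than the cross-speaker pair. In a voice cloning system, the resulting vector would be passed to the synthesizer as the conditioning input.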

Development of Voice Activated Universal Remote Control System using the Speaker Adaptation (화자적응을 이용한 음성인식 제어시스템 개발)

  • Kim Yong-Pyo;Yoon Dong-Han;Choi Un-Ha
    • Journal of the Korea Institute of Information and Communication Engineering / v.10 no.4 / pp.739-743 / 2006
  • This paper describes the development of a voice-activated universal remote control using neural networks. A speaker-dependent system is developed to operate for a single speaker. Such systems are usually easier to develop, cheaper to buy and more accurate, but not as flexible as speaker-adaptive or speaker-independent systems. A speaker-independent system is developed to operate for any speaker of a particular type (e.g. American English). These systems are the most difficult to develop and the most expensive, and their accuracy is lower than that of speaker-dependent systems. However, they are more flexible. A speaker-adaptive system is developed to adapt its operation to the characteristics of new speakers. Its difficulty lies somewhere between that of speaker-independent and speaker-dependent systems. This paper develops speaker adaptation using neural networks.

On Effective Speaker Verification Based on Subword Model

  • Ahn, Sung-Joo;Kang, Sun-Mee;Ko, Han-Seok
    • Speech Sciences / v.9 no.1 / pp.49-59 / 2002
  • This paper concerns an effective text-dependent speaker verification method to increase the performance of speaker verification. While various speaker verification methods have already been developed, their effectiveness has not yet been formally proven in terms of achieving acceptable performance levels. This paper proposes a weighted likelihood procedure along with a confidence measure for subword-based text-dependent speaker verification. Our aim is to remedy the low performance problem in speaker verification by strengthening the verification likelihood via subword-based hypothesis criteria and a weighted likelihood method. Experimental results show that the proposed speaker verification method outperforms the scheme without the proposed decision procedure by a factor of up to 1.6. These results show that the proposed speaker verification method is very effective and achieves reliable performance.


A Speaker Pruning Method for Real-Time Speaker Identification System

  • Kim, Min-Joung;Suk, Soo-Young;Jeong, Jong-Hyeog
    • IEMEK Journal of Embedded Systems and Applications / v.10 no.2 / pp.65-71 / 2015
  • GMM (Gaussian Mixture Model) based speaker identification systems using ML (Maximum Likelihood) and WMR (Weighting Model Rank) are known to achieve very high performance. However, such systems are not very effective in practical environments in terms of real-time processing, because of their high computational cost. In this paper, we propose a new speaker-pruning algorithm that effectively reduces the computational cost. The algorithm selects the 20% of speaker models that have the highest likelihood on a part of the input speech and applies MWMR (Modified Weighted Model Rank) to these selected speaker models to identify the speaker. To verify the effectiveness of the proposed algorithm, we performed speaker identification experiments using the TIMIT database. The proposed method reduces processing time by more than 60% compared with the conventional GMM-based system with no pruning, while maintaining recognition accuracy.
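
The pruning step described in this abstract can be sketched as follows. This is a simplified illustration under stated assumptions, not the paper's algorithm: each speaker is a single 1-D Gaussian rather than a GMM, the final decision uses plain maximum likelihood instead of MWMR (which the abstract does not define), and the names `identify_with_pruning`, `keep_ratio` and `prefix_ratio` are hypothetical.

```python
import math

def log_likelihood(frames, model):
    """Total log-likelihood of 1-D frames under a single Gaussian (mean, variance)."""
    mean, var = model
    return sum(-0.5 * math.log(2 * math.pi * var) - (x - mean) ** 2 / (2 * var)
               for x in frames)

def identify_with_pruning(frames, models, keep_ratio=0.2, prefix_ratio=0.25):
    """Score every model on a short prefix, keep the top fraction, rescore on all frames.

    models maps speaker id -> (mean, variance). Returns the best speaker id.
    """
    prefix = frames[:max(1, int(len(frames) * prefix_ratio))]
    ranked = sorted(models, key=lambda s: log_likelihood(prefix, models[s]), reverse=True)
    survivors = ranked[:max(1, math.ceil(len(models) * keep_ratio))]
    # Only the surviving models pay the cost of scoring the full utterance.
    return max(survivors, key=lambda s: log_likelihood(frames, models[s]))

# Ten toy speakers with means 0..9; the test utterance is centred near 3.
models = {f"spk{i}": (float(i), 1.0) for i in range(10)}
utterance = [3.1, 2.9, 3.0, 3.2, 2.8, 3.05, 2.95, 3.0]
best = identify_with_pruning(utterance, models)
```

Here only two of the ten models are scored on the full utterance, and `best` is `"spk3"`. The saving grows with the number of enrolled speakers, at the risk of pruning the true speaker when the prefix is too short.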