• Title/Summary/Keyword: speaker

Search Result 1,676, Processing Time 0.027 seconds

An Implementation of Real-Time Speaker Verification System on Telephone Voices Using DSP Board (DSP보드를 이용한 전화음성용 실시간 화자인증 시스템의 구현에 관한 연구)

  • Lee Hyeon Seung;Choi Hong Sub
    • MALSORI
    • /
    • no.49
    • /
    • pp.145-158
    • /
    • 2004
  • This paper is aiming at implementation of real-time speaker verification system using DSP board. Dialog/4, which is based on microprocessor and DSP processor, is selected to easily control telephone signals and to process audio/voice signals. Speaker verification system performs signal processing and feature extraction after receiving voice and its ID. Then through computing the likelihood ratio of claimed speaker model to the background model, it makes real-time decision on acceptance or rejection. For the verification experiments, total 15 speaker models and 6 background models are adopted. The experimental results show that verification accuracy rates are 99.5% for using telephone speech-based speaker models.

  • PDF

Context-Independent Speaker Recognition in URC Environment (지능형 서비스 로봇을 위한 문맥독립 화자인식 시스템)

  • Ji, Mi-Kyong;Kim, Sung-Tak;Kim, Hoi-Rin
    • The Journal of Korea Robotics Society
    • /
    • v.1 no.2
    • /
    • pp.158-162
    • /
    • 2006
  • This paper presents a speaker recognition system intended for use in human-robot interaction. The proposed speaker recognition system can achieve significantly high performance in the Ubiquitous Robot Companion (URC) environment. The URC concept is a scenario in which a robot is connected to a server through a broadband connection allowing functions to be performed on the server side, thereby minimizing the stand-alone function significantly and reducing the robot client cost. Instead of giving a robot (client) on-board cognitive capabilities, the sensing and processing work are outsourced to a central computer (server) connected to the high-speed Internet, with only the moving capability provided by the robot. Our aim is to enhance human-robot interaction by increasing the performance of speaker recognition with multiple microphones on the robot side in adverse distant-talking environments. Our speaker recognizer provides the URC project with a basic interface for human-robot interaction.

  • PDF

Speaker Identification in Small Training Data Environment using MLLR Adaptation Method (MLLR 화자적응 기법을 이용한 적은 학습자료 환경의 화자식별)

  • Kim, Se-hyun;Oh, Yung-Hwan
    • Proceedings of the KSPS conference
    • /
    • 2005.11a
    • /
    • pp.159-162
    • /
    • 2005
  • Identification is the process automatically identify who is speaking on the basis of information obtained from speech waves. In training phase, each speaker models are trained using each speaker's speech data. GMMs (Gaussian Mixture Models), which have been successfully applied to speaker modeling in text-independent speaker identification, are not efficient in insufficient training data environment. This paper proposes speaker modeling method using MLLR (Maximum Likelihood Linear Regression) method which is used for speaker adaptation in speech recognition. We make SD-like model using MLLR adaptation method instead of speaker dependent model (SD). Proposed system outperforms the GMMs in small training data environment.

  • PDF

A Study on the Perlormance Variations of the Mobile Phone Speaker Verification System According to the Various Background Speaker Properties (휴대폰음성을 이용한 화자인증시스템에서 배경화자에 따른 성능변화에 관한 연구)

  • Choi, Hong-Sub
    • Speech Sciences
    • /
    • v.12 no.3
    • /
    • pp.105-114
    • /
    • 2005
  • It was verified that a speaker verification system improved its performances of EER by regularizing log likelihood ratio, using background speaker models. Recently the wireless mobile phones are becoming more dominant communication terminals than wired phones. So the need for building a speaker verification system on mobile phone is increasing abruptly. Therefore in this paper, we had some experiments to examine the performance of speaker verification based on mobile phone's voices. Especially we are focused on the performance variations in EER(Equal Error Rate) according to several background speaker's characteristics, such as selecting methods(MSC, MIX), number of background speakers, aging factor of speech database. For this, we constructed a speaker verification system that uses GMM(Gaussin Mixture Model) and found that the MIX method is generally superior to another method by about 1.0% EER. In aspect of number of background speakers, EER is decreasing in proportion to the background speakers populations. As the number is increasing as 6, 10 and 16, the EERs are recorded as 13.0%, 12.2%, and 11.6%. An unexpected results are happened in aging effects of the speech database on the performance. EERs are measured as 4%, 12% and 19% for each seasonally recorded databases from session 1 to session 3, respectively, where duration gap between sessions is set by 3 months. Although seasons speech database has 10 speakers and 10 sentences per each, which gives less statistical confidence to results, we confirmed that enrolled speaker models in speaker verification system should be regularly updated using the ongoing claimant's utterances.

  • PDF

Speaker Identification Using Dynamic Time Warping Algorithm (동적 시간 신축 알고리즘을 이용한 화자 식별)

  • Jeong, Seung-Do
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.12 no.5
    • /
    • pp.2402-2409
    • /
    • 2011
  • The voice has distinguishable acoustic properties of speaker as well as transmitting information. The speaker recognition is the method to figures out who speaks the words through acoustic differences between speakers. The speaker recognition is roughly divided two kinds of categories: speaker verification and identification. The speaker verification is the method which verifies speaker himself based on only one's voice. Otherwise, the speaker identification is the method to find speaker by searching most similar model in the database previously consisted of multiple subordinate sentences. This paper composes feature vector from extracting MFCC coefficients and uses the dynamic time warping algorithm to compare the similarity between features. In order to describe common characteristic based on phonological features of spoken words, two subordinate sentences for each speaker are used as the training data. Thus, it is possible to identify the speaker who didn't say the same word which is previously stored in the database.

Group-based speaker embeddings for text-independent speaker verification (문장 독립 화자 검증을 위한 그룹기반 화자 임베딩)

  • Jung, Youngmoon;Eom, Youngsik;Lee, Yeonghyeon;Kim, Hoirin
    • The Journal of the Acoustical Society of Korea
    • /
    • v.40 no.5
    • /
    • pp.496-502
    • /
    • 2021
  • Recently, deep speaker embedding approach has been widely used in text-independent speaker verification, which shows better performance than the traditional i-vector approach. In this work, to improve the deep speaker embedding approach, we propose a novel method called group-based speaker embedding which incorporates group information. We cluster all speakers of the training data into a predefined number of groups in an unsupervised manner, so that a fixed-length group embedding represents the corresponding group. A Group Decision Network (GDN) produces a group weight, and an aggregated group embedding is generated from the weighted sum of the group embeddings and the group weights. Finally, we generate a group-based embedding by adding the aggregated group embedding to the deep speaker embedding. In this way, a speaker embedding can reduce the search space of the speaker identity by incorporating group information, and thereby can flexibly represent a significant number of speakers. We conducted experiments using the VoxCeleb1 database to show that our proposed approach can improve the previous approaches.

The Research On the improvements of Speaker's Frequency Characteristic using DSP Audio Processor (DSP 오디오 프로세서를 이용한 스피커 주파수 특성 개선에 관한 연구)

  • Lee, Soon-Reyo;Choi, Hong-Sub
    • Journal of Digital Contents Society
    • /
    • v.8 no.3
    • /
    • pp.341-346
    • /
    • 2007
  • The purpose of this paper is to propose the design of VADSM(Value-Added Digital Speaker Module) which tunes up the speaker unit by measuring the speaker's frequency responses and controlling EQ band. This module can reduce audible distortions at particular frequency band and improve some flatness in the speaker's frequency response. VADSM is composed of DSP AMP and speaker unit. When a speaker transforms electrical signal to sound, the magnitude response at some frequencies are more or less than normal level. So, DSP AMP can be used to adjust those magnitudes up or down by controlling its EQ bands.

  • PDF

Speaker Verification System Based on HMM Robust to Noise Environments (잡음환경에 강인한 HMM기반 화자 확인 시스템에 관한 연구)

  • 위진우;강철호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.20 no.7
    • /
    • pp.69-75
    • /
    • 2001
  • Intra-speaker variation, noise environments, and mismatch between training and test conditions are the major reasons for the speaker verification system unable to use it practically. In this study, we propose robust end-point detection algorithm, noise cancelling with the microphone property compensation technique, and inter-speaker discriminate technique by weighting cepstrum for robust speaker verification system. Simulation results show that the average speaker verification rate is improved in the rate of 17.65% with proposed end-point detection algorithm using LPC residue and is improved in the rate of 36.93% with proposed noise cancelling and microphone property compensation algorithm. The proposed weighting function for discriminating inter-speaker variations also improves the average speaker verification rate in the rate of 6.515%.

  • PDF

Speaker Identification Based on Incremental Learning Neural Network

  • Heo, Kwang-Seung;Sim, Kwee-Bo
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.5 no.1
    • /
    • pp.76-82
    • /
    • 2005
  • Speech signal has various features of speakers. This feature is extracted from speech signal processing. The speaker is identified by the speaker identification system. In this paper, we propose the speaker identification system that uses the incremental learning based on neural network. Recorded speech signal through the microphone is blocked to the frame of 1024 speech samples. Energy is divided speech signal to voiced signal and unvoiced signal. The extracted 12 orders LPC cpestrum coefficients are used with input data for neural network. The speakers are identified with the speaker identification system using the neural network. The neural network has the structure of MLP which consists of 12 input nodes, 8 hidden nodes, and 4 output nodes. The number of output node means the identified speakers. The first output node is excited to the first speaker. Incremental learning begins when the new speaker is identified. Incremental learning is the learning algorithm that already learned weights are remembered and only the new weights that are created as adding new speaker are trained. It is learning algorithm that overcomes the fault of neural network. The neural network repeats the learning when the new speaker is entered to it. The architecture of neural network is extended with the number of speakers. Therefore, this system can learn without the restricted number of speakers.

A Study on the Design of Web-based Speaker Verification System (웹 기반의 화자확인시스템 설계에 관한 연구)

  • 이재희;강철호
    • The Journal of the Acoustical Society of Korea
    • /
    • v.19 no.4
    • /
    • pp.23-30
    • /
    • 2000
  • In this paper, the web-based speaker verification system is designed. To decide the speaker recognition algorithm applied to the web-based speaker verification system, the recognition performance and special features of the text-dependent speaker recognition algorithms(DTW, DHMM, SCHMM) are compared through the computer simulation. Using the results of computer simulation, select DHMM as speaker recognition algorithm at web-based speaker verification system because DHMM has the proper recognition performance and initial training utterance number. And by the three-tier method using the ActiveX, DCOM techniques web-based speaker verification system is designed to be operated in the distributed processing environment.

  • PDF