• Title/Summary/Keyword: speaker

Performance Improvement of Speaker Recognition System Using Genetic Algorithm (유전자 알고리즘을 이용한 화자인식 시스템 성능 향상)

  • 문인섭;김종교
    • The Journal of the Acoustical Society of Korea / v.19 no.8 / pp.63-67 / 2000
  • This paper deals with text-prompted speaker recognition based on dynamic time warping (DTW). A genetic algorithm was applied to create reference patterns that properly reflect speaker characteristics, one of the most important determinants in speaker recognition. The text-prompted type was proposed to overcome the weaknesses of text-dependent and text-independent speaker recognition. Speaker identification and verification were performed on closed and open sets, respectively, and the genetic-algorithm-based reference patterns proved to perform better than conventional reference patterns in both recognition rate and speed.

  • PDF
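As a rough illustration of the DTW matching mentioned in the abstract above, the following minimal sketch computes a DTW alignment cost and picks the closest reference pattern. It is not the authors' implementation: the function names `dtw_distance` and `identify`, the Euclidean local distance, and the unconstrained step pattern are all assumptions, and the genetic-algorithm construction of reference patterns is not shown.

```python
import math


def dtw_distance(ref, test):
    """Return the cumulative DTW alignment cost between two sequences
    of feature vectors (assumed: Euclidean local distance)."""
    n, m = len(ref), len(test)
    # cost[i][j] = best cumulative cost aligning ref[:i] with test[:j]
    cost = [[math.inf] * (m + 1) for _ in range(n + 1)]
    cost[0][0] = 0.0
    for i in range(1, n + 1):
        for j in range(1, m + 1):
            local = math.dist(ref[i - 1], test[j - 1])
            cost[i][j] = local + min(
                cost[i - 1][j],      # ref advances, test frame repeated
                cost[i][j - 1],      # test advances, ref frame repeated
                cost[i - 1][j - 1],  # both advance
            )
    return cost[n][m]


def identify(references, utterance):
    """Closed-set identification: return the speaker whose reference
    pattern has the lowest DTW cost to the utterance."""
    return min(references, key=lambda name: dtw_distance(references[name], utterance))
```

For verification rather than identification, the same cost would be compared against a speaker-specific threshold instead of being ranked across speakers.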

A study on the speaker adaptation in CDHMM using a variable number of mixtures in each state (CDHMM의 상태당 가지 수를 가변시키는 화자적응에 관한 연구)

  • 김광태;서정일;홍재근
    • Journal of the Korean Institute of Telematics and Electronics S / v.35S no.3 / pp.166-175 / 1998
  • When we build a speaker-adapted model using MAPE (maximum a posteriori estimation), the adapted model has one mixture in each state. This is because we cannot estimate a number of a priori distributions from a speaker-independent model in each state. If the model is represented by one mixture in each state, it is not well adapted to a specific speaker, because it is difficult to represent the varied speech information of the speaker with one mixture. In this paper, we suggest a method that uses several mixtures in each state to better represent the varied speech information of the speaker. However, because speaker-specific training data are not sufficient, this method cannot be used in every state. We therefore make the number of mixtures in each state variable, in proportion to the number of frames and to the determinant of the variance matrix in the state. Using the proposed method, we reduced the error rate compared with methods using one mixture in each state.

  • PDF
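The allocation rule in the abstract above (more mixtures for states with more frames and a larger variance determinant) could be sketched as follows. This is only an illustration under stated assumptions: the abstract does not give the exact formula, so the product `frames * determinant`, the scaling to the largest state, the diagonal covariance, and the names `mixtures_per_state` and `max_mixtures` are all hypothetical.

```python
import math


def mixtures_per_state(frame_counts, variances, max_mixtures=4):
    """Assign each HMM state a number of mixtures between 1 and max_mixtures.

    frame_counts: adaptation frames aligned to each state
    variances:    per-state diagonal variances (assumed diagonal covariance,
                  so the determinant is the product of the variances)
    """
    # Hypothetical score: frame count times covariance determinant.
    scores = [n * math.prod(var) for n, var in zip(frame_counts, variances)]
    top = max(scores)
    if top <= 0:
        return [1] * len(scores)
    # Scale relative to the highest-scoring state; every state keeps at least one.
    return [max(1, math.ceil(max_mixtures * s / top)) for s in scores]
```

States with few frames fall back to a single mixture, which matches the abstract's point that sparse adaptation data cannot support several mixtures everywhere.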

Statistical Extraction of Speech Features Using Independent Component Analysis and Its Application to Speaker Identification

  • 장길진;오영환
    • The Journal of the Acoustical Society of Korea / v.21 no.4 / pp.156-156 / 2002
  • We apply independent component analysis (ICA), which extracts an optimal basis, to the problem of finding efficient features for representing the speech signals of a given speaker. The speech segments are assumed to be generated by a linear combination of the basis functions, so the distribution of a speaker's speech segments is modeled by adapting the basis functions such that each source component is statistically independent. The learned basis functions are oriented and localized in both space and frequency, bearing a resemblance to Gabor wavelets. These features are speaker-dependent characteristics; to assess their efficiency, we performed speaker identification experiments and compared our results with the conventional Fourier basis. Our results show that the proposed method is more efficient than conventional Fourier-based features in that it obtains a higher speaker identification rate.

Unsupervised Speaker Adaptation Based on Sufficient HMM Statistics (SUFFICIENT HMM 통계치에 기반한 UNSUPERVISED 화자 적응)

  • Ko Bong-Ok;Kim Chong-Kyo
    • Proceedings of the KSPS conference / 2003.05a / pp.127-130 / 2003
  • This paper describes an efficient method for unsupervised speaker adaptation. The method selects a subset of speakers who are acoustically close to a test speaker and calculates adapted model parameters from the previously stored sufficient HMM statistics of the selected speakers' data. Only a small amount of unsupervised data from the test speaker is required for the adaptation, and using the stored sufficient HMM statistics makes the adaptation quick. Compared with a pre-clustering method, the proposed method can obtain a more suitable speaker cluster because the clustering result is determined online from the test speaker's data. Experimental results show that the proposed method improves on the speaker-independent model more than MLLR does. Moreover, the proposed method uses only one unsupervised sentence utterance, while MLLR usually uses more than ten supervised sentence utterances.

  • PDF

A Study On Text Independent Speaker Recognition Using Eigenspace (고유영역을 이용한 문자독립형 화자인식에 관한 연구)

  • 함철배;이동규;이두수
    • Proceedings of the IEEK Conference / 1999.06a / pp.671-674 / 1999
  • We report a new method for speaker recognition. Until now, many researchers have used HMMs (hidden Markov models) with cepstral coefficients, or neural networks, for speaker recognition. Here, we introduce a method of speaker recognition using eigenspace. This method can reduce the training and recognition time of a speaker recognition system. The proposed method uses a low-rank model of the speech eigenspace. In experiments, we obtained good recognition results.

  • PDF

Text-dependent Speaker Verification System Over Telephone Lines (전화망을 위한 어구 종속 화자 확인 시스템)

  • 김유진;정재호
    • Proceedings of the IEEK Conference / 1999.11a / pp.663-667 / 1999
  • In this paper, we review the conventional speaker verification algorithm and present a text-dependent speaker verification system for use over telephone lines, together with experimental results. To train the speaker model effectively with limited enrollment data, we apply a blind-segmentation algorithm, which segments speech into sub-word units without linguistic information. A world model created from the PBW DB is used for score normalization. Experiments with the implemented system, using a database constructed to simulate a field test, show an EER of 3.3%.

  • PDF
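The EER (equal error rate) quoted in the abstract above is the operating point where the false acceptance rate equals the false rejection rate. The sketch below estimates it from two lists of verification scores by scanning candidate thresholds. It is an illustration, not the authors' evaluation code: the function name `equal_error_rate`, the "accept if score >= threshold" convention, and the averaging at the closest crossing are assumptions.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from genuine-speaker and impostor scores.

    Assumes higher scores mean "more likely the claimed speaker" and that a
    trial is accepted when its score is >= the threshold.
    """
    best_gap, best_eer = None, None
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: genuine trials scoring below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: impostor trials scoring at or above the threshold.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if best_gap is None or gap < best_gap:
            best_gap, best_eer = gap, (far + frr) / 2
    return best_eer
```

With score normalization against a world model, the scores passed in would typically be the speaker-model score minus the world-model score in the log domain; that step is not shown here.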

Speaker Change Detection Based on a Graph-Partitioning Criterion

  • Seo, Jin-Soo
    • The Journal of the Acoustical Society of Korea / v.30 no.2 / pp.80-85 / 2011
  • Speaker change detection involves identifying the time indices of an audio stream at which the identity of the speaker changes. In this paper, we propose novel measures for speaker change detection based on a graph-partitioning criterion over the pairwise distance matrix of the feature-vector stream. Experiments on both synthetic and real-world data showed that the proposed approach yields promising results compared with conventional statistical measures.
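To make the idea of a partitioning criterion over a pairwise distance matrix concrete, here is a minimal sketch that scores each candidate boundary in a window of feature vectors. The abstract does not state the paper's actual measures, so the score used here (mean cross-segment distance minus the average within-segment distance) is a stand-in chosen for illustration, and the names `partition_score`, `detect_change`, and `margin` are hypothetical.

```python
import math


def _mean(values):
    values = list(values)
    return sum(values) / len(values) if values else 0.0


def partition_score(dist, k):
    """Score splitting frames [0, k) from [k, n): large when the two sides
    are far from each other but internally compact (assumed criterion)."""
    n = len(dist)
    cross = _mean(dist[i][j] for i in range(k) for j in range(k, n))
    left = _mean(dist[i][j] for i in range(k) for j in range(i + 1, k))
    right = _mean(dist[i][j] for i in range(k, n) for j in range(i + 1, n))
    return cross - 0.5 * (left + right)


def detect_change(features, margin=2):
    """Return the boundary index with the highest partition score,
    keeping at least `margin` frames on each side."""
    n = len(features)
    # Pairwise Euclidean distance matrix of the feature-vector stream.
    dist = [[math.dist(a, b) for b in features] for a in features]
    return max(range(margin, n - margin + 1), key=lambda k: partition_score(dist, k))
```

In practice the window would slide over the stream and a boundary would be reported only when the best score exceeds a tuned threshold.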

Text-Independent Speaker Identification System Based On Vowel And Incremental Learning Neural Networks

  • Heo, Kwang-Seung;Lee, Dong-Wook;Sim, Kwee-Bo
    • 제어로봇시스템학회:학술대회논문집 (Institute of Control, Robotics and Systems: Conference Proceedings) / 2003.10a / pp.1042-1045 / 2003
  • In this paper, we propose a speaker identification system that uses vowels, which carry the speaker's characteristics. The system is divided into a speech feature extraction part and a speaker identification part. The speech feature extraction part extracts the speaker's features. Voiced speech has characteristics that distinguish speakers. For vowel extraction, formants obtained from voiced speech through frequency analysis are used. The vowel "a", which has distinct formants, is extracted from the text. Pitch, formants, intensity, log area ratio, LP coefficients, and cepstral coefficients are candidate features. The cepstral coefficients, which show the best performance in speaker identification among these, are used. The speaker identification part distinguishes speakers using a neural network, with 12th-order cepstral coefficients as the learning input data. The neural network is an MLP trained with BP (backpropagation), and hidden nodes and output nodes are added incrementally. The nodes in the incremental learning neural network are interconnected via weighted links, and each node in a layer is generally connected to each node in the succeeding layer, with the output nodes providing the output of the network. Through vowel extraction and incremental learning, the proposed system needs little learning data, reduces learning time, and improves the identification rate.

  • PDF

Sound Quality Enhancement by using the Single Core Exciter in OLED Panel

  • Lee, Sungtae;Park, Kwanho;Park, Hyungwoo
    • KSII Transactions on Internet and Information Systems (TIIS) / v.14 no.2 / pp.871-888 / 2020
  • With the development of display engineering and technology, the screen and sound quality of information devices such as TVs are improving. Screens have moved from early CRTs through PDPs and LCDs to LEDs and large flat panels with super-high resolution. Sound is improved by directly vibrating a thin and simple panel, such as an OLED. In our previous study, an exciter speaker was attached to the rear of the OLED panel so that the panel served as the speaker diaphragm, and the sound quality was as good as that of a TV using an existing dynamic speaker. This method supplied the viewer with direct sound coming from the panel, delivering clear sound, and the sound and image came from the same location, giving the viewer high immersion and maximizing the effect of information transfer. OLED exciter speakers, however, have a special directivity that tends to slightly attenuate the tone at the very center of the screen. This study improves the sound quality by improving the structure of the exciter speaker and the radiated sound of the flat panel display. A 2-in-1 exciter is made into a single core to improve the speaker's radiation pattern.

Speaker Recognition Using Optimal Path and Weighted Orthogonal Parameters (최적경로와 가중직교인자를 이용한 화자인식)

  • Park, Seung-Kyu;Bai, Chul-Soo
    • The Journal of the Acoustical Society of Korea / v.11 no.2 / pp.68-72 / 1992
  • Recently, many researchers have studied speaker recognition through statistical processing methods using the Karhunen-Loeve transform. However, the content of the speaker's identity and the vocalization speed lower the speaker recognition rate. This paper studies a speaker recognition method that uses weighted orthogonal parameters, which are weighted with the eigenvalues of speech to emphasize the speaker's identity, and an optimal path, which is made by DWP to normalize the dynamic time features of speech. To confirm this method, we compare its speaker recognition rate with that of the conventional statistical processing method. The results show that the proposed method achieves a higher speaker recognition rate than the conventional method.

  • PDF