• Title/Summary/Keyword: 화자군집 (speaker clustering)

Search Result 15, Processing Time 0.017 seconds

Adaptation and Clustering Method for Speaker Identification with Small Training Data (화자적응과 군집화를 이용한 화자식별 시스템의 성능 및 속도 향상)

  • Kim Se-Hyun;Oh Yung-Hwan
    • MALSORI / no.58 / pp.83-99 / 2006
  • One key factor hindering the widespread deployment of speaker identification technology is the need for long enrollment utterances to guarantee a low error rate during identification. To gain user acceptance, adaptation algorithms that can enroll speakers from short utterances are essential. To this end, this paper applies MLLR speaker adaptation to speaker enrollment and compares its performance against two other speaker modeling techniques: GMMs and HMMs. To speed up identification, we also apply a speaker clustering method that uses principal component analysis (PCA) and a weighted Euclidean distance as the distance measure. Experimental results show that MLLR-adapted modeling is the most effective for short enrollment utterances and that GMMs perform better when long utterances are available.
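
The clustering step in this abstract can be illustrated with a minimal sketch. It assumes each speaker is summarized by one fixed-length vector, projects those vectors with PCA, and compares them with a Euclidean distance weighted by the inverse eigenvalue of each component. The abstract does not say how the weights are chosen or how the clusters are formed, so the weighting, the synthetic data, and the names `fit_pca`, `weighted_euclidean`, and `nearest_cluster` are all assumptions made for illustration.

```python
import numpy as np

def fit_pca(X, n_components):
    """Fit PCA on row-wise data; return mean, components, eigenvalues."""
    mean = X.mean(axis=0)
    cov = np.cov(X - mean, rowvar=False)
    eigvals, eigvecs = np.linalg.eigh(cov)            # eigenvalues in ascending order
    order = np.argsort(eigvals)[::-1][:n_components]  # keep the largest ones
    return mean, eigvecs[:, order], eigvals[order]

def weighted_euclidean(a, b, weights):
    """Euclidean distance with one non-negative weight per dimension."""
    return float(np.sqrt(np.sum(weights * (a - b) ** 2)))

def nearest_cluster(z, centroids, weights):
    """Index of the centroid closest to z under the weighted distance."""
    distances = [weighted_euclidean(z, c, weights) for c in centroids]
    return int(np.argmin(distances))

# Synthetic "speaker vectors": two well-separated groups of 20 speakers each.
rng = np.random.default_rng(0)
speakers = np.vstack([rng.normal(0.0, 1.0, (20, 6)),
                      rng.normal(5.0, 1.0, (20, 6))])

mean, components, eigvals = fit_pca(speakers, n_components=2)
weights = 1.0 / eigvals  # assumption: down-weight high-variance components

projected = (speakers - mean) @ components
# For brevity the group membership is taken as known here, not learned.
centroids = [projected[:20].mean(axis=0), projected[20:].mean(axis=0)]

# A test vector is projected, then routed to one cluster.
query = (np.full(6, 5.0) - mean) @ components
cluster_id = nearest_cluster(query, centroids, weights)
```

In a system like the one described, only the speaker models inside the selected cluster would then be scored, which is where the speed-up would come from.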


Non-Keyword Model for the Improvement of Vocabulary Independent Keyword Spotting System (가변어휘 핵심어 검출 성능 향상을 위한 비핵심어 모델)

  • Kim, Min-Je;Lee, Jung-Chul
    • The Journal of the Acoustical Society of Korea / v.25 no.7 / pp.319-324 / 2006
  • We propose two new methods for non-keyword modeling to improve the performance of a speaker- and vocabulary-independent keyword spotting system. The first is decision tree clustering of monophones at the state level, in place of monophone clustering based on the K-means algorithm. The second is multi-state, multiple-mixture modeling at the syllable level, in place of a single-state, multiple-mixture model for non-keywords. To evaluate the methods, we used the ETRI speech DB for training and for a keyword spotting test (closed test). We also conducted an open test to spot 100 keywords in 400 sentences uttered by 4 speakers in an office environment. The experimental results show that decision tree-based state clustering improves keyword spotting performance by 28%/29% (closed/open test) over K-means-based monophone clustering, and that multi-state non-keyword modeling at the syllable level improves it by 22%/2% (closed/open test) over the single-state non-keyword model. These results show that both proposed methods improve keyword spotting performance.

Speech Synthesis using Diphone Clustering and Improved Spectral Smoothing (다이폰 군집화와 개선된 스펙트럼 완만화에 의한 음성합성)

  • Jang, Hyo-Jong;Kim, Kwan-Jung;Kim, Gye-Young;Choi, Hyung-Il
    • The KIPS Transactions:PartB / v.10B no.6 / pp.665-672 / 2003
  • This paper describes a speech synthesis technique based on concatenating unit phonemes. A major problem with this approach is discontinuity at the junctions between unit phonemes, especially between unit phonemes recorded by different speakers. To solve this problem, the paper uses clustered diphones and proposes a spectral smoothing technique that uses not only the formant trajectory and the distribution characteristics of the spectrum but also reflects human acoustic characteristics. Specifically, the proposed technique clusters unit phonemes using the distribution characteristics of the spectrum at the junctions between unit phonemes, decides the amount and extent of smoothing by considering human acoustic characteristics at those junctions, and then performs spectral smoothing using weights calculated along the time axis at the border of two diphones. The proposed technique removes the discontinuity and minimizes the distortion that spectral smoothing can cause. For performance evaluation, we test on five hundred diphones extracted from twenty sentences recorded by five speakers and present the experimental results.

Fast Speaker Adaptation in Noisy Environment using Environment Clustering (잡음 환경하에서 환경 군집화를 이용한 고속화자 적응)

  • Kim, Young-Kuk;Song, Hwa-Jeon;Kim, Hyung-Soon
    • Proceedings of the KSPS conference / 2007.05a / pp.33-36 / 2007
  • In this paper, we investigate a fast speaker adaptation method based on eigenvoices in several noisy environments. To overcome its weakness against noise, we propose a noisy environment clustering method that divides the noisy adaptation utterances into groups with similar environments by vector quantization-based clustering, using the cepstral mean as the feature vector. Each utterance group is then used for adaptation to build an environment-dependent model. In our experiments, we obtained a 19-37% relative improvement in error rate compared with the simultaneous speaker adaptation and environmental compensation method.
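
The environment clustering idea in this abstract can be sketched as follows. The sketch assumes each utterance is a frames-by-coefficients array of cepstral features, takes the frame average as its cepstral mean, and groups the means with a plain k-means loop standing in for the vector quantizer. The abstract does not specify the codebook training procedure, so the initialization, the iteration count, and the names `cepstral_mean` and `vq_cluster` are assumptions.

```python
import numpy as np

def cepstral_mean(utterance):
    """Average the cepstral frames of one utterance (frames x coefficients)."""
    return np.asarray(utterance, dtype=float).mean(axis=0)

def assign(vectors, codebook):
    """Label each vector with the index of its nearest codeword."""
    dists = np.linalg.norm(vectors[:, None, :] - codebook[None, :, :], axis=2)
    return dists.argmin(axis=1)

def vq_cluster(vectors, k, n_iter=20):
    """Train a k-entry codebook with Lloyd (k-means) iterations."""
    vectors = np.asarray(vectors, dtype=float)
    # Deterministic start (assumption): k vectors spread evenly through the data.
    idx = np.linspace(0, len(vectors) - 1, k).astype(int)
    codebook = vectors[idx].copy()
    for _ in range(n_iter):
        labels = assign(vectors, codebook)
        for j in range(k):
            members = vectors[labels == j]
            if len(members) > 0:          # leave empty codewords unchanged
                codebook[j] = members.mean(axis=0)
    return codebook, assign(vectors, codebook)

# Synthetic utterances: three from one "environment", three from another,
# modeled here as different constant offsets in the cepstral domain.
rng = np.random.default_rng(1)
utterances = [rng.normal(0.0, 0.1, (50, 4)) for _ in range(3)]
utterances += [rng.normal(4.0, 0.1, (50, 4)) for _ in range(3)]

means = np.array([cepstral_mean(u) for u in utterances])
codebook, labels = vq_cluster(means, k=2)
```

Each resulting group of utterances would then be used to adapt its own environment-dependent model, as the abstract describes.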


Comparison of Plant Community Structures in Cut and Uncut Areas at Burned Area of Mt. Gumo-san (금오산(金烏山)의 산화지(山火地)에서 벌목지(伐木地)와 비벌목지(非伐木地)의 식물(植物) 군집구조(群集構造) 비교(比較))

  • Che, Sang-Hoon;Kim, Woen
    • Journal of Korean Society of Forest Science / v.86 no.4 / pp.509-520 / 1997
  • This is a report on the early vegetation, plant community structure, and secondary succession of cut and uncut sites in burned areas of Mt. Gumo-san. The forest fire occurred in April 1994 and burned down the pine forest and its floor vegetation. The investigation was carried out from April 1995 to October 1996. The results are summarized as follows. The flora of the cut site, the uncut site, and the unburned area comprised 32, 36, and 34 kinds of vascular plants, respectively. The biological spectra showed the $H(G)-D_1-R_5-e$, $H(M)-D_1-R_5-e$, and $M(N)-D_1-R_5-e$ types in the cut, uncut, and unburned sites, respectively. The dominant species based on $SDR_3$ in the cut site were Miscanthus sinensis var. purpurascens (100.00), Carex humilis (52.27), Quercus serrata (51.19), and Lysimachia clethroides (39.40). In the uncut site, the dominant species were Quercus acutissima (56.91) and Pinus densiflora (26.83) in the tree layer; Quercus serrata (50.43), Lindera glauca (40.51), and Lespedeza bicolor (37.85) in the shrub layer; and Miscanthus sinensis var. purpurascens (72.27), Pteridium aquilinum var. latiusculum (60.92), and Carex humilis (63.63) in the herb layer. Pinus densiflora (99.88), Miscanthus sinensis var. purpurascens (82.74), Quercus serrata (77.47), and Carex humilis (74.02) were dominant in the unburned site. Species diversity (H) and the evenness index (e) were 1.05 and 0.70 in the cut site, 1.32 and 0.85 in the uncut site, and 0.22 and 0.63 in the unburned site. The dominance index (C) was 0.15, 0.06, and 0.96 in the cut, uncut, and unburned sites, respectively. The degree of succession (DS) was 345.19, 747.47, and 674.34 in the cut, uncut, and unburned sites, respectively. The index of similarity (CCs) was 0.66 between the cut and uncut sites, 0.50 between the unburned and cut sites, and 0.61 between the unburned and uncut sites. In the cut site after the fire, exchangeable sodium, calcium, and magnesium and soil pH increased, whereas organic matter, available phosphorus, total nitrogen, total carbon, and exchangeable potassium decreased.
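
The abstract reports H, e, C, and CCs without giving their formulas. The sketch below assumes the common textbook definitions: Shannon diversity for H, Pielou evenness for e, Simpson dominance for C, and the Sørensen coefficient for CCs. Base-10 logarithms are used because the reported e values for the cut and uncut sites (0.70 and 0.85) are close to H / log10(S) with S = 32 and 36, but that is an inference, not something the abstract states. The function names and the sample inputs are hypothetical.

```python
import math

def shannon_diversity(abundances, base=10):
    """H = -sum(p_i * log(p_i)) over species with non-zero abundance."""
    total = sum(abundances)
    return -sum((n / total) * math.log(n / total, base)
                for n in abundances if n > 0)

def evenness(abundances, base=10):
    """Pielou evenness: H divided by its maximum, log(S)."""
    species = sum(1 for n in abundances if n > 0)
    if species < 2:
        return 0.0
    return shannon_diversity(abundances, base) / math.log(species, base)

def simpson_dominance(abundances):
    """C = sum(p_i ** 2); approaches 1 when one species dominates."""
    total = sum(abundances)
    return sum((n / total) ** 2 for n in abundances)

def sorensen_similarity(species_a, species_b):
    """CCs = 2c / (a + b), with c the number of shared species."""
    a, b = set(species_a), set(species_b)
    return 2 * len(a & b) / (len(a) + len(b))

# Hypothetical example: four equally abundant species at one site,
# and two sites that share two of their three species.
counts = [10, 10, 10, 10]
h = shannon_diversity(counts)
e = evenness(counts)
c = simpson_dominance(counts)
ccs = sorensen_similarity({"A", "B", "C"}, {"B", "C", "D"})
```

With equal abundances the evenness reaches its maximum of 1 and the dominance index its minimum of 1/S, which is the opposite pattern to the unburned site reported above (low H, C near 1).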
