Search | Korea Science

Heo, Jungwoo;Shim, Hye-jin;Kim, Ju-ho;Yu, Ha-Jin
- The Journal of the Acoustical Society of Korea
- /
- v.40 no.2
- /
- pp.148-154
- /
- 2021
Text-Independent speaker verification needs to extract text-independent speaker embedding to improve generalization performance. However, deep neural networks that depend on training data have the potential to overfit text information instead of learning the speaker information when repeatedly learning from the identical time series. In this paper, to prevent the overfitting, we propose a segment unit shuffling layer that divides and rearranges the input layer or a hidden layer along the time axis, thus mixes the time series information. Since the segment unit shuffling layer can be applied not only to the input layer but also to the hidden layers, it can be used as generalization technique in the hidden layer, which is known to be effective compared to the generalization technique in the input layer, and can be applied simultaneously with data augmentation. In addition, the degree of distortion can be adjusted by adjusting the unit size of the segment. We observe that the performance of text-independent speaker verification is improved compared to the baseline when the proposed segment unit shuffling layer is applied.
https://doi.org/10.7776/ASK.2021.40.2.148 인용 PDF KSCI

Kim, Hyoung-Gook;Shin, Dong
- The Journal of the Acoustical Society of Korea
- /
- v.29 no.3
- /
- pp.216-221
- /
- 2010
In this paper, we propose speaker verification method using Support Vector Machine (SVM) kernel with Gaussian Mixture Model (GMM)-supervector based on the Mahalanobis distance. The proposed GMM-supervector SVM kernel method is combined GMM with SVM. The GMM-supervectors are generated by GMM parameters of speaker and other speaker utterances. A speaker verification threshold of GMM-supervectors is decided by SVM kernel based on Mahalanobis distance to improve speaker verification accuracy. The experimental results for text-independent speaker verification using 20 speakers demonstrates the performance of the proposed method compared to GMM, SVM, GMM-supervector SVM kernel based on Kullback-Leibler (KL) divergence, and GMM-supervector SVM kernel based on Bhattacharyya distance.
https://doi.org/10.7776/ASK.2010.29.3.216 인용 PDF KSCI

Kim, Hyoung-Gook;Shin, Dong
- The Journal of The Korea Institute of Intelligent Transport Systems
- /
- v.10 no.1
- /
- pp.84-90
- /
- 2011
In this paper, a VoIP-based voice secure telecommunication technology using the text-independent speaker authentication in the telematics environments is proposed. For the secure telecommunication, the sender's voice packets are encrypted by the public-key generated from the speaker's voice information and submitted to the receiver. It is constructed to resist against the man-in-the middle attack. At the receiver side, voice features extracted from the received voice packets are compared with the reference voice-key received from the sender side for the speaker authentication. To improve the accuracy of text-independent speaker authentication, Gaussian Mixture Model(GMM)-supervectors are applied to Support Vector Machine (SVM) kernel using Bayesian information criterion (BIC) and Mahalanobis distance (MD).
PDF KSCI