Search | Korea Science

Kim, Kyung-Wha;So, Byung-Min;Yu, Ha-Jin
- Phonetics and Speech Sciences / v.4 no.3 / pp.95-101 / 2012
In this paper, we introduce the automatic speaker identification system 'SPO (Supreme Prosecutors Office) Verifier'. SPO Verifier is an automatic speaker recognition system based on a GMM (Gaussian mixture model)-UBM (universal background model) and was developed using Korean speakers' utterances. The system uses a channel compensation algorithm to compensate for recording-device characteristics. It also lets users manage reference models built from utterances recorded in various environments, in order to obtain more accurate recognition results. To evaluate the performance of SPO Verifier on Korean speakers, we compared it with one of the most widely used commercial systems in the forensic field. The results showed that SPO Verifier achieves a lower EER (equal error rate) than the commercial system.
https://doi.org/10.13064/KSSS.2012.4.3.095
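
The abstract above compares systems by EER, the operating point at which the false acceptance rate equals the false rejection rate. The following is a minimal sketch of how an EER could be estimated from verification scores; it is not the evaluation code used in the paper. The function name `equal_error_rate` and the toy scores are assumptions made for illustration only.

```python
def equal_error_rate(genuine, impostor):
    """Estimate the EER from two lists of verification scores.

    genuine:  scores of trials where the speaker matches the model
    impostor: scores of trials where the speaker does not match
    Higher scores are assumed to mean "more likely the same speaker".
    """
    if not genuine or not impostor:
        raise ValueError("both score lists must be non-empty")

    best_gap, eer = float("inf"), 1.0
    # Sweep every observed score as a candidate decision threshold.
    for t in sorted(set(genuine) | set(impostor)):
        # False rejection: a genuine trial scored below the threshold.
        frr = sum(s < t for s in genuine) / len(genuine)
        # False acceptance: an impostor trial scored at or above it.
        far = sum(s >= t for s in impostor) / len(impostor)
        gap = abs(far - frr)
        if gap < best_gap:
            # Keep the threshold where the two error rates are closest.
            best_gap, eer = gap, (far + frr) / 2
    return eer


# Toy scores (made up for illustration, not taken from the paper).
genuine_scores = [0.9, 0.8, 0.4, 0.7]
impostor_scores = [0.1, 0.5, 0.3, 0.2]
eer = equal_error_rate(genuine_scores, impostor_scores)
```

With finite score lists the two error rates may never be exactly equal, so this sketch reports the average of the two at the closest threshold; interpolating between thresholds is a common alternative.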

Song Min-Chang;Shin Jiyoung;Kang SunMee
- MALSORI / no.46 / pp.25-35 / 2003
This study deals with disguised voices (voice disguise) in the field of forensic phonetics. In particular, we studied how the method of disguise affects aural decisions. Within the category of non-electronic, deliberate voice disguise, the methods include lowered pitch, pinched nostrils, falsetto, and whisper. Ten Seoul speakers (five male, five female) recorded 16 sentences. In the aural test, 30 subjects listened to normal and disguised voices and were asked to decide whether or not they could identify the speaker. The results are as follows: speaker verification was more difficult for falsetto and whisper than for lowered pitch and pinched nostrils.

Kim, Min-Seok;Kim, Kyung-Wha;Yang, IL-Ho;Yu, Ha-Jin
- Phonetics and Speech Sciences / v.2 no.3 / pp.135-139 / 2010
Forensic speaker identification requires high accuracy and reliability. However, current speaker identification technology does not meet this demand. Therefore, evaluating the confidence of results is one of the issues in forensic speaker identification. In this paper, we propose a new confidence measure for forensic speaker identification systems. It is based on pitch differences between the registered utterances of the identified speaker and the test utterance. In the experiments, we evaluate this confidence measure on speaker identification tasks in various environments. The results show that the proposed measure is a good indicator of whether an identification result is reliable.
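
The abstract above does not give the formula of its pitch-based confidence measure, so the following is only a hypothetical sketch of the general idea: the further the test utterance's pitch lies from the identified speaker's registered pitch, the lower the confidence. The function name `pitch_confidence`, the `scale_hz` parameter, and the mapping `1 / (1 + diff / scale)` are all assumptions, not the authors' method.

```python
from statistics import mean


def pitch_confidence(registered_pitches_hz, test_pitch_hz, scale_hz=20.0):
    """Map a pitch difference to a confidence value in (0, 1].

    registered_pitches_hz: mean pitch of each registered utterance of the
                           identified speaker, in Hz
    test_pitch_hz:         mean pitch of the test utterance, in Hz
    scale_hz:              assumed difference at which confidence drops to 0.5
    """
    if not registered_pitches_hz:
        raise ValueError("at least one registered utterance is required")
    if scale_hz <= 0:
        raise ValueError("scale_hz must be positive")

    # Absolute difference between the speaker's average registered pitch
    # and the pitch of the test utterance.
    diff = abs(mean(registered_pitches_hz) - test_pitch_hz)
    # Hypothetical mapping: 1.0 for identical pitch, decreasing with distance.
    return 1.0 / (1.0 + diff / scale_hz)


# Toy values (made up for illustration).
high = pitch_confidence([120.0, 130.0], 125.0)  # same mean pitch
low = pitch_confidence([120.0, 130.0], 145.0)   # 20 Hz away
```

An identification result whose confidence falls below a chosen cutoff could then be flagged for manual review instead of being accepted automatically.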

Park Hansang
- Proceedings of the KSPS conference / 2002.11a / pp.77-80 / 2002
This study proposes the phonation type index k as a descriptor of the overall spectral tilt that is free from the effects of fundamental frequency and vowel quality. The newly proposed index provides a simple, single measure of the overall spectral tilt. Phonation type index k can be applied to speech technology. It can also be used in diagnosing patients' voice qualities in speech pathology. The distribution of phonation type index k, which is speaker-dependent, may be useful in forensic phonetics and voice recognition as an indicator of speaker identity.

Yang, Il-Ho;Kim, Kyung-Wha;Kim, Myung-Jae;Baek, Rock-Seon;Heo, Hee-Soo;Yu, Ha-Jin
- Phonetics and Speech Sciences / v.6 no.2 / pp.21-28 / 2014
We propose a novel scheme for digital audio authentication of audio files that have been edited by inserting small audio segments from different environmental sources. The purpose of this research is to detect the inserted sections in a given audio file. We expect that the proposed method will assist human investigators by flagging suspect audio sections that appear to have been recorded or transmitted in a different environment. GMM-UBM and GSV-SVM are applied to model the dominant environment of a given audio file. Four kinds of likelihood-ratio-based scores and an SVM score are used to measure the likelihood under the dominant environment model. We also use an ensemble score that combines these five scores. In the experiments, the proposed method shows the lowest average equal error rate when the ensemble score is used. Even when the dominant environments were unknown, the proposed method gives similar accuracy.
https://doi.org/10.13064/KSSS.2014.6.2.021
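
The abstract above states that five scores are combined into an ensemble score but does not say how. The sketch below shows one common fusion approach, z-normalizing each scorer's output and averaging, purely as an assumption; the paper may combine the scores differently. The function names and the scorer names `llr_a` and `svm` are hypothetical.

```python
from statistics import mean, pstdev


def z_normalize(scores):
    """Rescale scores to zero mean and unit (population) standard deviation."""
    mu, sigma = mean(scores), pstdev(scores)
    if sigma == 0:
        # All scores identical: no segment stands out for this scorer.
        return [0.0 for _ in scores]
    return [(s - mu) / sigma for s in scores]


def ensemble_scores(score_table):
    """Fuse per-segment scores from several scorers into one score per segment.

    score_table: dict mapping a scorer name to a list of per-segment scores,
                 where a higher score means "fits the dominant environment".
    """
    lengths = {len(v) for v in score_table.values()}
    if len(lengths) != 1:
        raise ValueError("every scorer must score the same segments")
    normalized = [z_normalize(v) for v in score_table.values()]
    # Average the normalized scores segment by segment.
    return [mean(column) for column in zip(*normalized)]


def most_suspect_segment(score_table):
    """Index of the segment that fits the dominant environment least."""
    fused = ensemble_scores(score_table)
    return min(range(len(fused)), key=fused.__getitem__)


# Toy scores for four segments (made up for illustration).
table = {
    "llr_a": [1.0, 1.2, -3.0, 0.9],
    "svm": [0.5, 0.4, -0.8, 0.6],
}
suspect = most_suspect_segment(table)
```

Normalizing before averaging keeps a scorer with a large numeric range, such as a log-likelihood ratio, from dominating one with a small range, such as an SVM margin.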

Ahn, Seo-Yeong;Ryu, Se-Hui;Kim, Kyung-Wha;Hong, Ki-Hyung
- Phonetics and Speech Sciences / v.14 no.3 / pp.103-112 / 2022
Due to the popularization of smartphones, most recorded speech files submitted as evidence in recent crimes are produced by smartphones, and the integrity (possible forgery) of these smartphone-recorded files is emerging as a major issue in investigations and trials. Samsung smartphones, which have the highest domestic market share, ship with built-in speech recording applications that can record calls and voice and can also edit recorded speech. Unlike editing with third-party speech (audio) applications, editing with the built-in application produces files whose metadata structures and attributes are highly similar to those of the original file, so more precise analysis techniques are needed to prove integrity. In this study, we constructed a speech file metadata database covering speech files (original files) recorded by 34 Samsung smartphones and speech files edited with their built-in speech recording applications. We compared the metadata structures and attributes of the original files with those of the edited files. As a result, we found significant metadata differences between the original speech files and the edited ones.
https://doi.org/10.13064/KSSS.2022.14.3.103
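
The comparison described in the abstract above can be illustrated with a minimal sketch that reports every metadata field whose value differs between an original file and its edited counterpart. It assumes the metadata has already been parsed into flat dictionaries. The field names (`major_brand`, `encoder`, `duration_ms`, `edit_tag`) and their values are hypothetical and are not taken from the paper's database.

```python
def diff_metadata(original, edited):
    """Return {field: (original_value, edited_value)} for differing fields.

    A field that is absent from one file is reported with None on that side,
    so this sketch cannot distinguish "absent" from an explicit None value.
    """
    diffs = {}
    for key in sorted(set(original) | set(edited)):
        before, after = original.get(key), edited.get(key)
        if before != after:
            diffs[key] = (before, after)
    return diffs


# Hypothetical parsed metadata (made up for illustration).
original_meta = {"major_brand": "M4A ", "encoder": "x", "duration_ms": 1000}
edited_meta = {
    "major_brand": "M4A ",
    "encoder": "y",
    "duration_ms": 800,
    "edit_tag": "1",
}
changes = diff_metadata(original_meta, edited_meta)
```

Fields that differ consistently across many original and edited pairs, rather than in a single pair, would be the more convincing indicators of editing.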