Developing a Korean Standard Speech DB

Shin, Jiyoung; Jang, Hyejin; Kang, Younmin; Kim, Kyung-Wha

doi:10.13064/KSSS.2015.7.1.139

Phonetics and Speech Sciences (말소리와 음성과학)

Volume 7, Issue 1, Pages 139-150, 2015
pISSN 2005-8063, eISSN 2586-5854

Korean Society of Speech Sciences (한국음성학회)


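Korean title: 한국인 표준 음성 DB 구축

Shin, Jiyoung (신지영), Korea University;
Jang, Hyejin (장혜진), Korea University;
Kang, Younmin (강연민), Korea University;
Kim, Kyung-Wha (김경화), Supreme Prosecutors' Office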

Received : 2015.02.07
Accepted : 2015.03.11
Published : 2015.03.31


Abstract

The purpose of this study is to develop a speech corpus for standard Korean speech. For the samples to viably represent the state of spoken Korean, demographic factors were considered to ensure a balanced spread of age, gender, and dialect. Nine regional dialects were distinguished, and five age groups were established, covering speakers from their 20s to their 60s. A speech-sample collection protocol was developed for this study, in which each speaker performs five tasks: two reading tasks, two semi-spontaneous speech tasks, and one spontaneous speech task. This configuration makes it possible to gather rich, well-balanced samples across speech types and is expected to improve the utility of the corpus. Samples from 639 individuals were collected using the protocol. Speech samples were also collected from other sources, for a combined total of samples from 1,012 individuals. The data accumulated in this database will be used to develop a speaker identification system. It may also be applied in fields including, but not limited to, phonetics, sociolinguistics, and language pathology. We plan to supplement the large-scale speech corpus next year, in terms of research methodology and content, to better answer the needs of diverse fields.
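The sampling design described in the abstract can be sketched as a small grid of demographic cells plus a task list. The following is only an illustration under stated assumptions, not the authors' tooling: the abstract does not name the nine dialects, so the dialect labels are placeholders, and the two-way gender coding and the "one recording per task" count are assumptions as well. Only the counts (9 dialects, 5 age groups, 5 tasks, 639 and 1,012 speakers) come from the abstract.

```python
from itertools import product

# Counts taken from the abstract.
N_DIALECTS = 9
AGE_GROUPS = ["20s", "30s", "40s", "50s", "60s"]
TASKS = [
    "reading_1",
    "reading_2",
    "semi_spontaneous_1",
    "semi_spontaneous_2",
    "spontaneous_1",
]
PROTOCOL_SPEAKERS = 639
TOTAL_SPEAKERS = 1012

# Assumptions (not in the source): placeholder dialect labels and a
# two-way gender coding, used only to enumerate the design grid.
DIALECTS = [f"dialect_{i}" for i in range(1, N_DIALECTS + 1)]
GENDERS = ["female", "male"]


def demographic_cells():
    """Return every (dialect, age group, gender) combination in the design."""
    return list(product(DIALECTS, AGE_GROUPS, GENDERS))


cells = demographic_cells()

# Assumes every protocol speaker completed all five tasks and that each
# task yields exactly one recording; the abstract does not state this.
protocol_recordings = PROTOCOL_SPEAKERS * len(TASKS)

# Speakers whose samples came from sources other than the protocol.
other_source_speakers = TOTAL_SPEAKERS - PROTOCOL_SPEAKERS

print(f"demographic cells: {len(cells)}")
print(f"protocol recordings (assumed): {protocol_recordings}")
print(f"speakers from other sources: {other_source_speakers}")
```

Enumerating the cells this way makes it easy to check how evenly speakers fall across dialect, age, and gender once per-speaker metadata is available; the abstract itself does not report per-cell counts.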


