Developing a Korean Standard Speech DB

한국인 표준 음성 DB 구축

  • Received : 2015.02.07
  • Accepted : 2015.03.11
  • Published : 2015.03.31

Abstract

The purpose of this study is to develop a speech corpus of standard Korean speech. For the samples to represent the current state of spoken Korean, demographic factors were taken into account to ensure a balanced distribution of age, gender, and dialect. Nine regional dialects were distinguished, and five age groups were established, covering speakers in their 20s through 60s. A speech-sample collection protocol was developed for this study in which each speaker performs five tasks: two reading tasks, two semi-spontaneous speech tasks, and one spontaneous speech task. This configuration allows rich, well-balanced samples to be gathered across speech types and is expected to improve the utility of the corpus. Samples from 639 individuals were collected using the protocol. Speech samples were also collected from other sources, for a combined total of 1,012 individuals. The data accumulated in this database will be used to develop a speaker identification system. It may also be applied in other fields, including phonetic studies, sociolinguistics, and language pathology. We plan to supplement this large-scale speech corpus next year, in both research methodology and content, to better meet the needs of diverse fields.
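The sampling design described in the abstract can be illustrated with a minimal sketch. It enumerates the demographic cells (dialect, age group, gender) and counts the recordings implied by the five-task protocol. The dialect labels and the two-way gender split are placeholders assumed here, since the abstract gives only the number of categories, and the function names are hypothetical. The per-cell average assumes an even spread of speakers, which the abstract does not claim.

```python
from itertools import product

# Placeholder labels: the abstract states only that there are nine dialects.
DIALECTS = [f"dialect_{i}" for i in range(1, 10)]
# Five age groups, from speakers in their 20s to their 60s.
AGE_GROUPS = ["20s", "30s", "40s", "50s", "60s"]
# Assumption: a two-way gender split; the abstract does not list categories.
GENDERS = ["female", "male"]

# The five tasks each speaker performs, by type, as listed in the abstract.
TASKS = [
    "reading",
    "reading",
    "semi-spontaneous",
    "semi-spontaneous",
    "spontaneous",
]


def design_cells():
    """Return every (dialect, age group, gender) combination."""
    return list(product(DIALECTS, AGE_GROUPS, GENDERS))


def recordings_for(n_speakers):
    """Recordings implied if every speaker completes all five tasks."""
    return n_speakers * len(TASKS)


cells = design_cells()
print(len(cells))            # number of demographic cells
print(recordings_for(639))   # task recordings for the 639 protocol speakers
print(639 / len(cells))      # average speakers per cell under an even spread
```

Under these assumptions the grid has 90 cells, so the 639 protocol speakers would average about 7 per cell and yield 3,195 task recordings if every speaker completed all five tasks.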
