The f0 distribution of Korean speakers in a spontaneous speech corpus

  • Yang, Byunggon (Department of English Education, Pusan National University)
  • Received : 2021.07.27
  • Accepted : 2021.09.07
  • Published : 2021.09.30

Abstract

The fundamental frequency, or f0, is an important acoustic measure in the prosody of human speech. The current study examined the f0 distribution of a corpus of spontaneous speech in order to provide normative data for Korean speakers. The corpus consists of 40 speakers talking freely about their daily activities and their personal views. Praat scripts were created to collect f0 values, and most of the obvious errors were corrected manually by viewing the f0 contour on a narrow-band spectrogram and listening to the speech. Statistical analyses of the f0 distribution were conducted using R. The results showed that the distribution of f0 values across all the Korean speakers was right-skewed and sharply peaked. Excluding statistical outliers, the speakers produced spontaneous speech within a frequency range of 274 Hz (from 65 Hz to 339 Hz). The mode of the total f0 data was 102 Hz. The female f0 range, with a bimodal distribution, appeared wider than that of the male group. Regression analyses based on age and f0 values yielded negligible R-squared values. As the mode of an individual speaker could be predicted from the median, either the median or the mode could serve as a good reference for the individual f0 range. Finally, an analysis of the continuous f0 points of intonational phrases revealed that the initial and final segments of the phrases yielded several f0 measurement errors. From these results, we conclude that an examination of a spontaneous speech corpus can provide linguists with useful measures to generalize the acoustic properties of f0 variability in a language for individuals or groups. Further studies on the use of statistical measures to secure reliable f0 values of individual speakers would be desirable.
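The summary statistics named in the abstract (outlier-trimmed range, median, mode, and skewness) can be illustrated with a minimal Python sketch. It is not the author's Praat or R procedure. The function name `f0_summary`, the 1.5 × IQR outlier fences, and the 1 Hz rounding used to find the mode are assumptions made for illustration only, since the abstract does not state which outlier rule or bin width was applied.

```python
import statistics
from collections import Counter


def f0_summary(f0_hz):
    """Summarize voiced-frame f0 values (Hz) for one speaker or group.

    Assumptions (not taken from the paper): outliers are values outside
    the 1.5 * IQR fences, and the mode is found after rounding to 1 Hz.
    """
    values = sorted(f0_hz)
    q1, _, q3 = statistics.quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - 1.5 * iqr, q3 + 1.5 * iqr

    # Keep only values inside the fences before describing the range.
    kept = [v for v in values if low <= v <= high]

    # Mode on a 1 Hz grid; ties resolve to the lowest rounded value seen first.
    mode = Counter(round(v) for v in kept).most_common(1)[0][0]

    mean = statistics.fmean(kept)
    sd = statistics.pstdev(kept)
    # Population skewness: positive values indicate a right-skewed shape.
    if sd == 0:
        skewness = 0.0
    else:
        skewness = sum((v - mean) ** 3 for v in kept) / (len(kept) * sd ** 3)

    return {
        "min": kept[0],
        "max": kept[-1],
        "range": kept[-1] - kept[0],
        "median": statistics.median(kept),
        "mode": mode,
        "skewness": skewness,
    }


if __name__ == "__main__":
    # Toy values, not data from the corpus.
    sample = [100] * 5 + [110] * 3 + [120] * 2 + [150, 400]
    print(f0_summary(sample))
```

In this toy input the 400 Hz value falls outside the upper fence and is dropped before the range is computed, which mirrors the idea of reporting a range that excludes statistical outliers such as pitch-doubling errors.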

Acknowledgement

This work was supported by a 2-Year Research Grant of Pusan National University.
