A Fixed Rate Speech Coder Based on the Filter Bank Method and the Inflection Point Detection

  • Iem, Byeong-Gwan (Department of Electronic Engineering, Gangneung-Wonju National University)
  • Received : 2016.12.08
  • Accepted : 2016.12.13
  • Published : 2016.12.12

Abstract

A fixed-rate speech coder based on a filter bank and a non-uniform sampling technique is proposed. The non-uniform sampling is achieved by detecting inflection points (IPs). Each speech block is band-pass filtered by the filter bank, the resulting subband signals are processed by the IP detector, and the detected IP patterns are compared with the entries of an IP database. For each subband signal, the address of the closest database entry and the energy of the IP pattern are transmitted over the channel. At the receiver, the decoder recovers the subband signals from the received addresses and energy information, and reconstructs the speech by filter bank summation. As a result, the coder has a fixed data rate, unlike existing speech coders based on non-uniform sampling. Computer simulations confirm the usefulness of the proposed technique: at data rates below 20 kbps, its signal-to-noise ratio (SNR) is comparable to that of uniformly sampled pulse code modulation (PCM).
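The per-subband encoding step described above can be illustrated with a minimal sketch. It is not the author's implementation, because the abstract does not specify the IP detector, the pattern representation, the distance measure, or the bit allocation. The sketch assumes that an IP is a sign change of the discrete second difference, that an IP pattern is a binary vector over the block, and that the closest database entry is the one at the smallest Hamming distance. All function names and parameters are hypothetical.

```python
def detect_inflection_points(x):
    """Return sample indices where the second difference of x changes sign.

    Assumption: an inflection point is approximated by a sign change of
    x[k] - 2*x[k+1] + x[k+2]; zero values are skipped. The reported index
    is the first sample that carries the new curvature sign.
    """
    ips = []
    prev_sign = 0
    for k in range(len(x) - 2):
        d2 = x[k] - 2 * x[k + 1] + x[k + 2]  # curvature estimate at sample k + 1
        sign = (d2 > 0) - (d2 < 0)
        if sign != 0:
            if prev_sign != 0 and sign != prev_sign:
                ips.append(k + 1)
            prev_sign = sign
    return ips


def ip_pattern(x):
    """Binary vector with a 1 at each detected inflection point (assumed format)."""
    pattern = [0] * len(x)
    for i in detect_inflection_points(x):
        pattern[i] = 1
    return pattern


def encode_subband(x, database):
    """Return (address, energy) for one subband block.

    Assumptions: the database is a list of binary patterns of the block
    length, closeness is measured by Hamming distance, and the energy is
    the sum of squared samples of the block.
    """
    pattern = ip_pattern(x)
    energy = sum(v * v for v in x)

    def hamming(entry):
        return sum(p != q for p, q in zip(pattern, entry))

    address = min(range(len(database)), key=lambda a: hamming(database[a]))
    return address, energy


def bits_per_block(num_subbands, database_size, energy_bits):
    """Bits sent per speech block: one address and one energy per subband."""
    address_bits = (database_size - 1).bit_length()
    return num_subbands * (address_bits + energy_bits)
```

Under these assumptions, every block costs the same number of bits regardless of how many IPs it contains, which is the reason the data rate is fixed. For example, with 4 subbands, a 256-entry database, and 6 bits per energy value (illustrative figures, not taken from the paper), `bits_per_block(4, 256, 6)` gives 56 bits per block.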
