Acknowledgement
This research was supported by the Global Infrastructure Program through the National Research Foundation of Korea (NRF), funded by the Ministry of Science and ICT (NRF-2018K1A3A1A20026485).