References
- B. V. Dasarathy, "Sensor fusion potential exploitation: innovative architectures and illustrative applications," Proceedings of the IEEE, vol. 85, no. 1, pp. 24-38, Jan. 1997. http://dx.doi.org/10.1109/5.554206
- D. W. Massaro, Speech Perception by Ear and Eye: A Paradigm for Psychological Inquiry, Hillsdale, NJ: Lawrence Erlbaum, 1987.
- J. S. Lee and C. H. Park, "Training hidden Markov models by hybrid simulated annealing for visual speech recognition," in Proceedings of 2006 IEEE International Conference on Systems, Man and Cybernetics, Taipei, Oct. 2006, pp. 198-202.
- K. Kumatani, S. Nakamura, and K. Shikano, "An adaptive integration based on product HMM for audio-visual speech recognition," in Proceedings of 2001 IEEE International Conference on Multimedia and Expo, Tokyo, 2001, pp. 813-816. http://dx.doi.org/10.1109/ICME.2001.1237846
- J. S. Lee and C. H. Park, "Robust audio-visual speech recognition based on late integration," IEEE Transactions on Multimedia, vol. 10, no. 5, pp. 767-779, Aug. 2008. http://dx.doi.org/10.1109/TMM.2008.922789
- S. Dupont and J. Luettin, "Audio-visual speech modeling for continuous speech recognition," IEEE Transactions on Multimedia, vol. 2, no. 3, pp. 141-151, Sep. 2000. http://dx.doi.org/10.1109/6046.865479
- S. Davis and P. Mermelstein, "Comparison of parametric representations for monosyllabic word recognition in continuously spoken sentences," IEEE Transactions on Acoustics, Speech and Signal Processing, vol. 28, no. 4, pp. 357-366, Aug. 1980. http://dx.doi.org/10.1109/TASSP.1980.1163420
- D. A. Reynolds, T. F. Quatieri, and R. B. Dunn, "Speaker verification using adapted Gaussian mixture models," Digital Signal Processing, vol. 10, no. 1, pp. 19-41, Jan. 2000. https://doi.org/10.1006/dspr.1999.0361
- I. Peer, B. Rafaely, and Y. Zigel, "Reverberation matching for speaker recognition," in Proceedings of 2008 IEEE International Conference on Acoustics, Speech and Signal Processing, Las Vegas, NV, 2008, pp. 4829-4832. http://dx.doi.org/10.1109/ICASSP.2008.4518738
- F. Bimbot, J. F. Bonastre, C. Fredouille, G. Gravier, I. Magrin-Chagnolleau, S. Meignier, T. Merlin, J. Ortega-Garcia, D. Petrovska-Delacretaz, and D. A. Reynolds, "A tutorial on text-independent speaker verification," EURASIP Journal on Advances in Signal Processing, vol. 2004, no. 4, pp. 430-451, Apr. 2004. http://dx.doi.org/10.1155/S1110865704310024
- C. Neti, G. Potamianos, J. Luettin, I. Matthews, H. Glotin, D. Vergyri, J. Sison, A. Mashari, and J. Zhou, "Audio visual speech recognition," in Final Workshop 2000 Report, Center for Language and Speech Processing, Baltimore, 2000.
- L. R. Rabiner and B. H. Juang, Fundamentals of Speech Recognition, Englewood Cliffs, NJ: PTR Prentice Hall, 1993.
- J. Luettin, G. Potamianos, and C. Neti, "Asynchronous stream modeling for large vocabulary audio-visual speech recognition," in Proceedings of 2001 IEEE International Conference on Acoustics, Speech, and Signal Processing, Salt Lake City, UT, 2001, pp. 169-172. http://dx.doi.org/10.1109/ICASSP.2001.940794
- H. Zhao, C. Tang, and T. Yu, "Fast thresholding segmentation for image with high noise," in Proceedings of 2008 International Conference on Information and Automation, Changsha, 2008, pp. 290-295. http://dx.doi.org/10.1109/ICINFA.2008.4608013
- L. Xie and D. Jiang, "Audio-visual synthesis and synchronous asynchronous experimental research for bimodal speech recognition," Journal of Northwestern Polytechnical University, vol. 22, no. 2, pp. 171-175, 2004.
- H. Zhao, Y. Gu, and C. Tang, "Research of relationship between weight coefficient of product HMM and instantaneous SNR in bimodal speech recognition," Journal of Computer Application, vol. 29, pp. 279-285, 2009.
- A. Adjoudani and C. Benoît, "On the integration of auditory and visual parameters in an HMM-based ASR," in Proceedings NATO ASI Conference on Speechreading by Man and Machine: Models, Systems and Applications, 1995, pp. 461-471.
- L. Rabiner, "A tutorial on hidden Markov models and selected applications in speech recognition," Proceedings of the IEEE, vol. 77, no. 2, pp. 257-286, Feb. 1989. http://dx.doi.org/10.1109/5.18626
- C. Bregler and S. M. Omohundro, "Nonlinear manifold learning for visual speech recognition," in Proceedings of 1995 5th International Conference on Computer Vision, Cambridge, MA, 1995, pp. 494-499. http://dx.doi.org/10.1109/ICCV.1995.466899