References
- L. A. Ross, D. Saint-Amour, V. M. Leavitt, D. C. Javitt, and J. J. Foxe, 'Do you see what I am saying? Exploring visual enhancement of speech comprehension in noisy environments,' Cerebral Cortex, vol. 17, no. 5, pp. 1147-1153, 2007, https://doi.org/10.1093/cercor/bhl024
- C. C. Chibelushi, F. Deravi, and J. S. D. Mason, 'A review of speech-based bimodal recognition,' IEEE Trans. Multimedia, vol. 4, no. 1, pp. 23-37, Mar. 2002, https://doi.org/10.1109/6046.985551
- X.-D. Huang, A. Acero, and H.-W. Hon, Spoken Language Processing: A Guide to Theory, Algorithm, and System Development, Prentice Hall, 2001
- P. Scanlon and R. Reilly, 'Feature analysis for automatic speechreading,' in Proc. Int. Conf. Multimedia and Expo, pp. 625-630, 2001
- C. Benoit, 'The intrinsic bimodality of speech communication and the synthesis of talking faces,' in The Structure of Multimodal Dialogue II, M. M. Taylor, F. Nel, and D. Bouwhuis, Eds. Amsterdam, The Netherlands: John Benjamins, pp. 485-502, 2000
- J.-S. Lee and C. H. Park, 'Training hidden Markov models by hybrid simulated annealing for visual speech recognition,' in Proc. Int. Conf. Systems, Man, Cybernetics, pp. 198-202, Oct. 2006
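- J.-S. Lee, S.-H. Shim, S.-Y. Kim, and C. H. Park, 'Bimodal speech recognition using robust feature extraction of lip movement under uncontrolled illumination conditions,' Telecommunications Review, vol. 14, no. 1, pp. 123-134, Feb. 2004 (in Korean)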
- R. C. Gonzalez and R. E. Woods, Digital Image Processing, Addison-Wesley Publishing Company, 2001
- T. J. Hazen, 'Visual model structures and synchrony constraints for audio-visual speech recognition,' IEEE Trans. Audio, Speech, Language Processing, vol. 14, no. 3, pp. 1082-1089, May 2006, https://doi.org/10.1109/TSA.2005.857572
- A. Verma, T. Faruquie, C. Neti, and S. Basu, 'Late integration in audio-visual continuous speech recognition,' in Proc. Workshop on Automatic Speech Recognition and Understanding, pp. 71-74, Dec. 1999
- G. F. Meyer, J. B. Mulligan, and S. M. Wuerger, 'Continuous audio-visual digit recognition using N-best decision fusion,' Information Fusion, vol. 5, no. 2, pp. 91-101, June 2004, https://doi.org/10.1016/j.inffus.2003.07.001
- S. Tamura, K. Iwano, and S. Furui, 'A stream-weight optimization method for multi-stream HMMs based on likelihood value normalization,' in Proc. Int. Conf. Acoustics, Speech and Signal Processing, vol. 1, pp. 469-472, Mar. 2005
- T. W. Lewis and D. M. W. Powers, 'Sensor fusion weighting measures in audio-visual speech recognition,' in Proc. Conf. Australasian Computer Science, pp. 305-314, 2004
- C. Bishop, Neural Networks for Pattern Recognition, Oxford University Press, UK, 1995
- A. Varga and H. J. M. Steeneken, 'Assessment for automatic speech recognition: II. NOISEX-92: A database and an experiment to study the effect of additive noise on speech recognition systems,' Speech Communication, vol. 12, no. 3, pp. 247-251, 1993, https://doi.org/10.1016/0167-6393(93)90095-3