Video Summarization Using Eye Tracking and Electroencephalogram (EEG) Data

  • Received : 2022.01.17
  • Accepted : 2022.02.22
  • Published : 2022.02.28

Abstract

This study developed and evaluated audio-visual (AV) semantics-based video summarization methods using eye tracking and electroencephalography (EEG) data. Twenty-seven university students participated in the eye tracking and EEG experiments. The evaluation results showed that the average recall (0.73) of video summaries constructed from both EEG and pupil-diameter data was higher than that of summaries constructed from EEG data alone (0.50) or pupil-diameter data alone (0.68). In addition, this study analyzed the reasons why the average recall (0.57) of the AV semantics-based personalized video summaries was lower than that (0.69) of the AV semantics-based generic video summaries. Finally, the differences and characteristics between the AV semantics-based and the text semantics-based video summarization methods were compared and analyzed.
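The evaluation above compares summarization methods by average recall, i.e., the fraction of reference-summary shots that each machine-generated summary recovers. A minimal sketch of such a shot-level recall computation (the shot IDs and values below are hypothetical illustrations, not the paper's actual segmentation or ground truth):

```python
def summary_recall(reference_shots, selected_shots):
    """Fraction of reference (ground-truth) shots captured by a summary."""
    reference = set(reference_shots)
    if not reference:
        raise ValueError("reference summary must not be empty")
    return len(reference & set(selected_shots)) / len(reference)

# Hypothetical example: an 8-shot reference summary, of which a
# machine-generated summary recovers 6 shots.
reference = [1, 3, 4, 7, 9, 12, 15, 18]
eeg_pupil = [1, 3, 4, 7, 9, 12, 20, 22]
print(summary_recall(reference, eeg_pupil))  # → 0.75
```

A summary built from combined EEG and pupil-diameter signals would, on this metric, outscore a single-signal summary whenever it recovers more of the reference shots.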

Acknowledgement

This research was supported by the Mid-Career Researcher Support Program in the Humanities and Social Sciences of the Ministry of Education of the Republic of Korea and the National Research Foundation of Korea in 2020 (NRF-2020S1A5A2A01040945).
