Visual Search Model based on Saliency and Scene-Context in Real-World Images

  • Choi, Yoonhyung (Department of Industrial Management Engineering, Korea University) ;
  • Oh, Hyungseok (Department of Industrial Management Engineering, Korea University) ;
  • Myung, Rohae (Department of Industrial Management Engineering, Korea University)
  • Received : 2015.02.09
  • Accepted : 2015.05.11
  • Published : 2015.08.15

Abstract

According to a substantial body of research in cognitive science, scene context can influence human visual search in real-world images as strongly as saliency does. This study therefore proposes a method for modeling visual search in real-world images with the Adaptive Control of Thought-Rational (ACT-R) cognitive architecture, based on both saliency and scene context. The method uses the utility system of ACT-R to describe the influences of saliency and scene context in real-world images. The model was validated by comparing its output with eye-tracking data from experiments with a simple task in which subjects searched for targets in images of indoor bedrooms. The results show that the model data fit the eye-tracking data quite well. In conclusion, the modeling method proposed in this study should be used when an accurate model of human performance in visual search tasks in real-world images is needed.
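The abstract does not give the utility formula, so the following is only a minimal sketch of how saliency and scene context could both feed an ACT-R-style utility. The weighted sum in `region_utilities`, its weights, and the toy values are assumptions made for illustration, not the authors' method. The selection rule is the standard Boltzmann form associated with ACT-R utility noise, where `noise_s` stands for the utility noise parameter s.

```python
import math


def region_utilities(saliency, context_prior, w_saliency=0.5, w_context=0.5):
    """Hypothetical utility per image region: a weighted sum of bottom-up
    saliency and a scene-context prior. The paper's actual formula is not
    stated in the abstract; this combination is an assumption."""
    return [w_saliency * s + w_context * c
            for s, c in zip(saliency, context_prior)]


def selection_probabilities(utilities, noise_s=0.5):
    """Boltzmann choice rule: P(i) = exp(U_i / (sqrt(2) * s)) / sum_j exp(U_j / (sqrt(2) * s))."""
    temperature = math.sqrt(2.0) * noise_s
    max_u = max(utilities)  # subtract the maximum for numerical stability
    weights = [math.exp((u - max_u) / temperature) for u in utilities]
    total = sum(weights)
    return [w / total for w in weights]


# Toy values for three candidate regions of a bedroom image (assumed data).
saliency = [0.9, 0.4, 0.2]        # e.g. a bright lamp, a bed, a wall
context_prior = [0.1, 0.8, 0.3]   # how likely the target is in each region
utilities = region_utilities(saliency, context_prior)
probs = selection_probabilities(utilities)
```

With these toy values the second region has the highest combined utility even though the first is the most salient, which is the kind of context effect the abstract describes. Setting `w_context` to zero reduces the sketch to a saliency-only model for comparison.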
