A MVC Framework for Visualizing Text Data

Korean title: 텍스트 데이터 시각화를 위한 MVC 프레임워크 ("An MVC Framework for Visualizing Text Data")

  • Received : 2014.03.27
  • Accepted : 2014.05.12
  • Published : 2014.06.30

Abstract

As big data and related technologies grow in importance across industry, visualizing the results of big data processing and analysis has drawn increasing attention. Visualization conveys analysis results clearly and effectively. It also serves as the graphical user interface (GUI) through which people communicate with analysis systems. To ease development and maintenance, the GUI should be loosely coupled from the data processing and analysis components. Implementing such a loosely coupled architecture calls for design patterns such as MVC (Model-View-Controller), which is designed to minimize coupling between the UI and the data processing logic. Big data can be classified into structured and unstructured data, and structured data is relatively easy to visualize compared with unstructured data. Nevertheless, as the use and analysis of unstructured data has spread, visualization systems are usually developed separately for each project to overcome the limitations of traditional visualization systems built for structured data. Text data, which accounts for a large share of unstructured data, is even harder to visualize. This stems from the complexity of the technologies used to analyze text, such as linguistic analysis, text mining, and social network analysis, and from the fact that these technologies are not standardized. As a result, a visualization system built for one project is difficult to reuse in another. We attribute this to the lack of a common design that anticipates extending a visualization system to other systems. In this research, we suggest a common information model for visualizing text data and propose TexVizu, a comprehensive and reusable framework for visualizing text data. First, we survey representative research in the text visualization area.
We also identify the common elements of text visualization and the common patterns across its various cases. We then review and analyze these elements and patterns from three viewpoints (structural, interactive, and semantic) and design an integrated model of text data that represents the elements needed for visualization. The structural viewpoint identifies structural elements of text documents, such as title, author, and body. The interactive viewpoint identifies the types of relations and interactions between text documents, such as post, comment, and reply. The semantic viewpoint identifies semantic elements that are extracted by linguistic analysis of text data and represented as tags classifying entity types, such as person, place or location, time, and event. Next, we extract and select common requirements for visualizing text data. The requirements fall into four categories: structure information, content information, relation information, and trend information. Each category comprises the required visualization techniques, the data, and the goal (what to know). These are the key common requirements for designing a framework that keeps a visualization system loosely coupled from the data processing and analysis system. Finally, we design TexVizu, a common text visualization framework that is reusable and extensible across visualization projects: it collaborates with various Text Data Loaders and Analytical Text Data Visualizers through common interfaces such as ITextDataLoader and IATDProvider. TexVizu consists of the Analytical Text Data Model, the Analytical Text Data Storage, and the Analytical Text Data Controller. In this framework, external components are the specifications of the interfaces required to collaborate with it.
As an experiment, we also apply this framework to two text visualization systems: a social opinion mining system and an online news analysis system.
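
The three viewpoints of the integrated text data model can be illustrated with a small data structure. The following is only a sketch in Python and not the schema used by TexVizu: the class names, field names, and helper functions (TextDocument, SemanticTag, entities_of, responses_to) are hypothetical, and only the example values (title, author, body; post, comment, reply; person, place, time, event) come from the abstract.

```python
from dataclasses import dataclass, field
from enum import Enum
from typing import List, Optional


class InteractionType(Enum):
    """Interactive viewpoint: relation types listed in the abstract."""
    POST = "post"
    COMMENT = "comment"
    REPLY = "reply"


class EntityType(Enum):
    """Semantic viewpoint: entity tag types listed in the abstract."""
    PERSON = "person"
    PLACE = "place"
    TIME = "time"
    EVENT = "event"


@dataclass
class SemanticTag:
    # One tagged entity produced by linguistic analysis (hypothetical shape).
    entity_type: EntityType
    text: str


@dataclass
class TextDocument:
    # Structural viewpoint: elements found in most text documents.
    doc_id: str
    title: str
    author: str
    body: str
    # Interactive viewpoint: how this document relates to another one.
    interaction: InteractionType = InteractionType.POST
    parent_id: Optional[str] = None
    # Semantic viewpoint: tags extracted from the body.
    tags: List[SemanticTag] = field(default_factory=list)


def entities_of(doc: TextDocument, entity_type: EntityType) -> List[str]:
    """Return the tagged entity strings of one type for a document."""
    return [t.text for t in doc.tags if t.entity_type is entity_type]


def responses_to(docs: List[TextDocument], doc_id: str) -> List[TextDocument]:
    """Return the documents whose parent is the given document."""
    return [d for d in docs if d.parent_id == doc_id]
```

A visualizer for relation information could, under these assumptions, call responses_to to build a reply graph, while one for content information could call entities_of to collect the people or places mentioned in a document.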

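As awareness of the importance of big data spreads and related technologies advance, how to visualize the results of big data processing and analysis has recently emerged as a topic of great interest. This is because data visualization is a highly effective way to convey analysis results more clearly. Visualization serves as a graphical user interface (GUI) through which users and the analysis system communicate. In general, the more independent this GUI part is from the results of data processing and analysis, the easier the system is to develop and maintain, so it is important to minimize the coupling between the GUI and the data processing and management parts by applying design patterns such as MVC (Model-View-Controller). Big data can be broadly divided into structured and unstructured data; structured data is relatively easy to visualize, whereas visualization of unstructured data is complex and varied to implement. Nevertheless, as the analysis and use of unstructured data continue to spread, visualization systems are in practice being built in their own way for the purpose of each system, in order to get past the limitations of traditional visualization tools for structured data. Moreover, for text data, which currently makes up most of what unstructured data analysis targets, the applicable technologies (linguistic analysis, text mining, social network analysis, and others) are so diverse that a visualization technique applied in one system is not easily applied to another. This is because the information models for text analysis results are often not designed to be applicable across different systems. To address this problem, this study identifies the common components of various text data analysis and visualization cases, presents a text data visualization model as a standardized information model, and on that basis proposes TexVizu, a visualization framework that serves as a system model that can be connected to the GUI part of visualization.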
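
The loose coupling that the framework aims for can be sketched as follows. This is a minimal illustration in Python, not the TexVizu implementation. The names ITextDataLoader, IATDProvider, Analytical Text Data Storage, and Analytical Text Data Controller come from the abstract; every method signature (load, query, save, all, ingest), the dictionary record shape, and the InMemoryLoader and render_titles helpers are assumptions made for this example.

```python
from abc import ABC, abstractmethod
from typing import Dict, List

Record = Dict[str, str]  # assumed stand-in for the Analytical Text Data Model


class ITextDataLoader(ABC):
    """External component that feeds analyzed text data into the framework."""

    @abstractmethod
    def load(self) -> List[Record]:  # method name is an assumption
        ...


class IATDProvider(ABC):
    """The only surface a visualizer (the view) depends on."""

    @abstractmethod
    def query(self, key: str, value: str) -> List[Record]:  # assumption
        ...


class AnalyticalTextDataStorage:
    """Keeps loaded records; here just an in-memory list."""

    def __init__(self) -> None:
        self._records: List[Record] = []

    def save(self, records: List[Record]) -> None:
        self._records.extend(records)

    def all(self) -> List[Record]:
        return list(self._records)


class AnalyticalTextDataController(IATDProvider):
    """Mediates between loaders, storage, and visualizers."""

    def __init__(self, storage: AnalyticalTextDataStorage) -> None:
        self._storage = storage

    def ingest(self, loader: ITextDataLoader) -> int:
        # Pull records from any loader and store them; return how many.
        records = loader.load()
        self._storage.save(records)
        return len(records)

    def query(self, key: str, value: str) -> List[Record]:
        # Return the stored records whose field `key` equals `value`.
        return [r for r in self._storage.all() if r.get(key) == value]


class InMemoryLoader(ITextDataLoader):
    """Hypothetical loader used only for this example."""

    def __init__(self, records: List[Record]) -> None:
        self._records = records

    def load(self) -> List[Record]:
        return list(self._records)


def render_titles(provider: IATDProvider, author: str) -> List[str]:
    """Stand-in view: it sees IATDProvider and nothing else."""
    return [r["title"] for r in provider.query("author", author)]
```

Because render_titles depends only on IATDProvider and the controller depends only on ITextDataLoader, either side could be swapped (for example, a news crawler loader or a chart-based visualizer) without touching the other, which is the property the abstract attributes to MVC.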
