A Study on the Link Server Development Using B-Tree Structure in the Big Data Environment

  • Park, Sungbum (Big Data Strategy Center, National Information Society Agency)
  • Hwang, Jong Sung (Big Data Strategy Center, National Information Society Agency)
  • Lee, Sangwon (Division of Information and Electronic Commerce, Wonkwang University)
  • Received : 2014.11.17
  • Accepted : 2015.02.24
  • Published : 2015.02.28

Abstract

Major corporations and portals have implemented link servers that connect a Content Management System (CMS) to the physical addresses of content in a database (DB) to support efficient content use in web-based environments. In particular, a link server automatically connects the physical address of content in a DB to the content URL shown in the web browser, and reconnects the URL and the physical address when either is modified. With the advent of the Big Data environment, the volume of digital content and the number of users accessing it over the web have increased significantly, which has also increased the number of link validity checks that a CMS and a link server must perform. If link validity checks in petabyte or even exabyte environments are performed with the existing URL-based sequential method, the slowdown of the check lowers the identification rate of dead links; moreover, frequent link checks place a heavy load on the DB. Hence, this study aims to provide a link server that recognizes deleted and added URL links by analyzing the per-interval count of B-tree-based information identifiers over a large number of URLs, thereby resolving these problems. With this approach, dead link checks can be performed faster and with lower load than with the existing method.
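The abstract does not spell out the algorithm, so the following is only a minimal sketch of the general idea of comparing per-interval identifier counts, not the authors' implementation. It assumes that each URL maps to a numeric information identifier, that a sorted Python list can stand in for the ordered keys of a B-tree, and that the interval boundaries are fixed in advance. The names `interval_counts` and `detect_changes` are hypothetical.

```python
from bisect import bisect_left


def interval_counts(sorted_ids, boundaries):
    """Count identifiers in each half-open interval [boundaries[i], boundaries[i+1])."""
    return [
        bisect_left(sorted_ids, hi) - bisect_left(sorted_ids, lo)
        for lo, hi in zip(boundaries, boundaries[1:])
    ]


def detect_changes(known_ids, current_ids, boundaries):
    """Return (added, deleted) identifiers.

    known_ids:   identifiers the link server recorded at the last check (assumed).
    current_ids: identifiers present in the content store now (assumed).
    Only intervals whose counts differ are compared element by element.
    """
    known = sorted(known_ids)      # stand-in for B-tree key order
    current = sorted(current_ids)
    known_counts = interval_counts(known, boundaries)
    current_counts = interval_counts(current, boundaries)

    added, deleted = [], []
    intervals = zip(boundaries, boundaries[1:])
    for (lo, hi), k, c in zip(intervals, known_counts, current_counts):
        if k == c:
            continue  # equal counts: treat the interval as unchanged (heuristic)
        known_slice = set(known[bisect_left(known, lo):bisect_left(known, hi)])
        current_slice = set(current[bisect_left(current, lo):bisect_left(current, hi)])
        added.extend(sorted(current_slice - known_slice))
        deleted.extend(sorted(known_slice - current_slice))
    return added, deleted


known = [101, 105, 230, 310, 399]
current = [101, 105, 310, 399, 450]
boundaries = [100, 200, 300, 400, 500]
print(detect_changes(known, current, boundaries))  # ([450], [230])
```

In this sketch only two of the four intervals are scanned, which is where the saving over a full sequential check would come from. Note the limitation of a count-only comparison: one addition and one deletion inside the same interval leave the count unchanged and would go unnoticed unless a checksum or similar per-interval summary is also compared.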
