A Hybrid Method of Storing XML Data Using RDBMS

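  • 전찬훈 (Department of Computer Science and Engineering, Graduate School, Chung-Ang University)
  • 강현철 (School of Computer Science and Engineering, Chung-Ang University)
  • Published: 2009.02.28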

Abstract

As Web-based e-business spreads, the volume of XML data, the standard format for data exchange on the Web, is growing rapidly. Much research has been done on decomposing XML data for storage in an RDB, currently the most popular store for XML, and on processing XML queries through their SQL counterparts, but little attention has been paid to alleviating the space burden of storing massive volumes of XML data. In this paper, we propose a hybrid method of storing XML data in an RDB, in which the unit of storage can be an XML subtree as well as an XML node. A portion of XML data whose nodes were stored separately can be converted to subtree-unit storage when it is rarely queried or has become less valuable for reference over time. Based on this method, we designed and implemented a hybrid XML storage and query processing system and compared it experimentally with a conventional system in which the XML node is the only unit of storage. The comparison of storage efficiency and query processing performance validates the effectiveness of the proposed method.
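
The idea of mixing node-unit and subtree-unit storage can be illustrated with a minimal sketch. Everything below is an assumption made for illustration and is not the system described in the paper: the single `node` table, its column names, the helper functions, and the use of SQLite and Python's `xml.etree.ElementTree` are all hypothetical, and the sketch ignores attributes and tail text. It stores one row per element, then collapses a chosen element's descendants into a single serialized subtree kept in that element's row.

```python
import sqlite3
import xml.etree.ElementTree as ET

# Hypothetical sample document (compact, no attributes, no tail text).
SAMPLE = ("<orders><order><id>1</id><item>pen</item></order>"
          "<order><id>2</id><item>ink</item></order></orders>")

def create_schema(conn):
    # Assumed schema: subtree_xml is NULL for node-unit rows and holds
    # the serialized element for subtree-unit rows.
    conn.execute(
        "CREATE TABLE node (id INTEGER PRIMARY KEY, parent_id INTEGER, "
        "tag TEXT NOT NULL, text TEXT, subtree_xml TEXT)")

def store_nodes(conn, elem, parent_id=None):
    """Node-unit storage: insert one row per element, in document order."""
    cur = conn.execute(
        "INSERT INTO node (parent_id, tag, text) VALUES (?, ?, ?)",
        (parent_id, elem.tag, (elem.text or "").strip()))
    node_id = cur.lastrowid
    for child in elem:
        store_nodes(conn, child, node_id)
    return node_id

def load_element(conn, node_id):
    """Rebuild an element from either storage unit."""
    tag, text, subtree_xml = conn.execute(
        "SELECT tag, text, subtree_xml FROM node WHERE id = ?",
        (node_id,)).fetchone()
    if subtree_xml is not None:
        return ET.fromstring(subtree_xml)      # subtree-unit row
    elem = ET.Element(tag)                     # node-unit row
    elem.text = text or None
    child_ids = [r[0] for r in conn.execute(
        "SELECT id FROM node WHERE parent_id = ? ORDER BY id", (node_id,))]
    for child_id in child_ids:
        elem.append(load_element(conn, child_id))
    return elem

def delete_descendants(conn, node_id):
    """Remove every row below node_id, keeping the row for node_id itself."""
    child_ids = [r[0] for r in conn.execute(
        "SELECT id FROM node WHERE parent_id = ?", (node_id,)).fetchall()]
    for child_id in child_ids:
        delete_descendants(conn, child_id)
        conn.execute("DELETE FROM node WHERE id = ?", (child_id,))

def collapse_to_subtree(conn, node_id):
    """Convert the node-unit rows under node_id into one subtree-unit row."""
    xml_text = ET.tostring(load_element(conn, node_id), encoding="unicode")
    delete_descendants(conn, node_id)
    conn.execute("UPDATE node SET subtree_xml = ? WHERE id = ?",
                 (xml_text, node_id))

conn = sqlite3.connect(":memory:")
create_schema(conn)
root_id = store_nodes(conn, ET.fromstring(SAMPLE))
rows_before = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]

# Treat the first <order> as rarely queried and collapse it.
first_order_id = conn.execute(
    "SELECT id FROM node WHERE tag = 'order' ORDER BY id").fetchone()[0]
collapse_to_subtree(conn, first_order_id)
rows_after = conn.execute("SELECT COUNT(*) FROM node").fetchone()[0]

rebuilt = ET.tostring(load_element(conn, root_id), encoding="unicode")
```

In this sketch the collapsed part occupies fewer rows, while queries that reach into it would have to parse the stored subtree, which mirrors the trade-off between storage efficiency and query processing performance that the abstract says the experiments measure. When to collapse a subtree (for example, by query frequency or data age) is left open here because the abstract does not specify the criteria.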
