• Title/Abstract/Keywords: XML Documents


Design and Implementation of DTD Authoring Tools for XML Documents (XML 문서를 위한 DTD 저작 도구의 설계 및 구현)

  • 김현주
    • Journal of the Korea Computer Industry Society / Vol. 3, No. 8 / pp.1093-1104 / 2002
  • XML is a markup language that has been accepted in various fields such as digital libraries, electronic commerce, and web applications. Research on the creation, storage, management, and retrieval of XML documents is essential for developing XML application systems. This paper presents the design and implementation details of powerful and convenient DTD authoring tools for XML documents. The design principles are authoring convenience, semi-automatic creation of valid and reliable document DTDs through systematic guidance that reduces the possibility of syntax errors, and visualization of document structures.


Retrieval Performance of XML Documents Using Object-Relational Databases (객체-관계형 데이터베이스에 의한 XML문헌의 검색성능 평가)

  • Kim, Hee-Sop
    • Journal of the Korean Society for Information Management / Vol. 21, No. 2 / pp.189-210 / 2004
  • The purpose of this study is to evaluate the performance of XML retrieval based on an ORDBMS (Object-Relational Database Management System) approach. This paper describes indexing and retrieval methods for XML documents and the experimental methodology of INEX (Initiative for the Evaluation of XML retrieval). As in any traditional information retrieval experiment, the test collection consisted of documents, topics/queries, tasks, relevance assessments, and evaluation. EXIMA™ Supply, a kind of native XML DB based on ORDBMS technologies, was used for this experiment. Although this approach has many benefits, for example no delay in storing and searching XML documents, it showed relatively disappointing retrieval performance at INEX 2002. This result may have arisen because the given topics had to be decomposed and modified to be processed by the XPath processor; during this modification the original meaning of the topics can inevitably change, and some important information may be lost.

A Unification Algorithm for DTDs of XML Documents having a Similar Structure (유사 구조를 가지는 XML 문서들의 DTD 통합 알고리즘)

  • 유춘식;우선미;김용성
    • Journal of KIISE:Software and Applications / Vol. 31, No. 10 / pp.1400-1411 / 2004
  • In many cases, XML documents have different DTDs even though they have a similar structure and are logically the same kind of document. As a result, these XML documents have different database schemas and are stored in different databases. In this paper, we propose an algorithm that unifies the DTDs of such XML documents using finite automata and tree structures. Finite automata are suitable for representing the repetition operators and connectors of a DTD, and they offer a simple representation of the DTD. By using finite automata, we are able to reduce the complexity of the algorithm. We apply the proposed algorithm to unify the DTDs of science journals.

A Labeling Methods for Keyword Search over Large XML Documents (대용량 XML 문서의 키워드 검색을 위한 레이블링 기법)

  • Sun, Dong-Han;Hwang, Soo-Chan
    • Journal of KIISE / Vol. 41, No. 9 / pp.699-706 / 2014
  • As XML documents become larger and more complex, a keyword-based search method that does not require structural information is needed to search them. To use such a method, every keyword expressed as a node in the XML document must be labeled for indexing, and the structural information must also be well represented. However, existing labeling methods either keep only very simple information about XML documents in the index or represent structural information in a way that copes poorly with growth in document size. As XML documents grow, this causes either poor keyword search performance or an exponential increase in space usage. In this paper, we present the Repetitive Prime Labeling Scheme (RPLS) to address these problems of existing labeling methods for keyword-based search of large XML documents. The method is based on the existing prime number labeling method and allows a parent's prime number to be reused at lower levels, so that fewer prime numbers need to be generated. We then present experimental results comparing our method with existing methods.
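
The abstract above builds on prime number labeling. As a rough illustration only, the following sketch shows the base idea as it is commonly described: each node receives a fresh prime, its label is that prime multiplied by its parent's label, and an ancestor test reduces to a divisibility check. This is not the RPLS scheme itself, whose details the abstract does not give. The nested `(name, [children])` tree shape and the function names `primes`, `label_tree`, and `is_ancestor` are assumptions made for this example.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small examples)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


def label_tree(tree):
    """Label every node with (parent's label) * (a fresh prime).

    `tree` is a hypothetical nested tuple (name, [children]).
    Returns a dict mapping each node's path to its label; the sketch
    assumes sibling names are distinct so that paths are unique.
    """
    gen = primes()
    labels = {}

    def visit(node, parent_label, parent_path):
        name, children = node
        path = parent_path + "/" + name
        label = parent_label * next(gen)
        labels[path] = label
        for child in children:
            visit(child, label, path)

    visit(tree, 1, "")
    return labels


def is_ancestor(a, b):
    """True if the node labeled `a` is a proper ancestor of the node labeled `b`."""
    return a != b and b % a == 0


doc = ("book", [("title", []), ("chapter", [("section", [])])])
labels = label_tree(doc)
print(labels)
print(is_ancestor(labels["/book/chapter"], labels["/book/chapter/section"]))
```

Because every node consumes a new prime, labels grow quickly with document size, which is the cost the abstract says RPLS aims to reduce by reusing a parent's prime at lower levels.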

Safe XML Documents Protection Policy Method from Attacker (침입자로부터 안전한 XML문서 보호정책 방안)

  • Koh, Chul-Ho;Lee, Ouk-Seh
    • Proceedings of the Korean Society of Computer Information Conference / Korean Society of Computer Information, Proceedings of the 47th Winter Conference, 2013, Vol. 21, No. 1 / pp.241-242 / 2013
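  • Recently, XML documents have been used to generate and use information in a variety of fields, and security issues concerning XML documents are accordingly being actively studied. In this paper, we propose a policy for protecting XML documents from anonymous intruders. The technique assigns a Count according to the importance of each XML document; when the configured Count is exceeded, the file is replicated and transferred to a backup server and then deleted, so that important XML documents can be protected from anonymous users.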


XML Document Clustering Technique by K-means algorithm through PCA (주성분 분석의 K 평균 알고리즘을 통한 XML 문서 군집화 기법)

  • Kim, Woo-Saeng
    • The KIPS Transactions:PartD / Vol. 18D, No. 5 / pp.339-342 / 2011
  • Recently, much research has addressed efficient techniques for accessing, querying, and storing XML documents, which are frequently used on the Internet. In this paper, we propose a new method to cluster XML documents efficiently. We use a K-means algorithm with Principal Component Analysis (PCA) to cluster XML documents after representing them as vectors in a feature vector space built from the names and levels of the elements of the corresponding trees. The experiment shows that the proposed method gives good results.
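
To make the representation step in the abstract above concrete, here is a minimal sketch that turns XML documents into vectors based on element names and levels. The abstract does not specify the exact encoding, so counting occurrences of each `(element name, level)` pair is an assumption, as are the function names `name_level_counts` and `to_vectors`. The PCA and K-means steps that would then run on these vectors are not shown.

```python
import xml.etree.ElementTree as ET
from collections import Counter


def name_level_counts(xml_text):
    """Count how often each (element name, depth) pair occurs in one document."""
    root = ET.fromstring(xml_text)
    counts = Counter()
    stack = [(root, 0)]  # (element, level); the root is at level 0
    while stack:
        elem, level = stack.pop()
        counts[(elem.tag, level)] += 1
        stack.extend((child, level + 1) for child in elem)
    return counts


def to_vectors(docs):
    """Map documents onto a shared feature space of (name, level) pairs."""
    counted = [name_level_counts(d) for d in docs]
    features = sorted(set().union(*counted))
    vectors = [[c[f] for f in features] for c in counted]
    return features, vectors


docs = ["<a><b/><b/></a>", "<a><c><b/></c></a>"]
features, vectors = to_vectors(docs)
print(features)
print(vectors)
```

Under this assumed encoding, two documents that use the same element name at different depths end up with different feature dimensions, so structural differences remain visible to whatever clustering step follows.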

Dynamic Generation of SMIL based Multimedia Documents on the Web (웹에서 SMIL 기반 멀티미디어 문서의 동적 생성)

  • 김경덕
    • Journal of Korea Multimedia Society / Vol. 4, No. 5 / pp.439-445 / 2001
  • In this paper, we suggest a method for dynamically generating SMIL documents from user profiles on the web. The generated multimedia documents are based on SMIL (Synchronized Multimedia Integration Language), which is recommended by the W3C. The method automatically generates XSLT documents according to user profiles. SMIL documents are produced in real time by integrating the XSLT documents with existing XML documents. Most conventional web-based documents are based on HTML, which makes it difficult to support the reusability of documents and the relations among multimedia objects. The suggested method, however, is based on XML, so it supports document reusability and efficiently produces a variety of SMIL-based multimedia documents. Applications of the suggested method include electronic commerce, tele-lectures, and web-based document editing.


Common XML Structure Extracting Algorithm for Applying Data Mining Techniques (데이터마이닝 기법 적용을 위한 공용 XML 구조 추출 알고리즘)

  • Jang, Min-Seok;Bang, Hyun-Jin
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / Korea Institute of Maritime Information and Communication Sciences, 2005 Spring Conference / pp.1072-1076 / 2005
  • The importance of XML as a target of data mining is growing because XML is widely used as a standard markup language for describing structured data. In particular, research has been done on extracting desired information by applying association rules to XML documents. However, little work has addressed how to obtain information efficiently from similar kinds of XML documents. To solve this problem, this paper suggests a method by which a common XML structure is extracted from XML documents of the same kind that have various XML schemas. The resulting schema structure is expected to be important as a preliminary step, because unifying the structures of various kinds of documents helps us acquire useful information from them.


An Efficient Dynamic Indexing Model for Various Structure Retrievals of XML Documents (XML 문서의 다양한 구조 검색을 위한 효율적인 동적 색인 모델)

  • 신승호;손충범;강형일;유재수
    • Journal of KIISE:Databases / Vol. 31, No. 1 / pp.48-60 / 2004
  • XML documents consist of elements that are basic units of information. When the structure of XML documents is changed dynamically, we need to update structure information efficiently without changing the information of the index structure for fast retrieval. In this paper, we propose a dynamic indexing model scheme that updates the index structure in real time as the structure of XML documents is changed by insertion and deletion of elements. Our dynamic indexing model consists of a structure information representation method and a dynamic index structure. The structure information representation method supports various types of structure retrievals. Our dynamic index structure processes various structural queries efficiently. We show through various experiments that our method outperforms existing ones in processing various types of queries such as content based queries, structural queries and hybrid queries.

Similarity Measure based on XML Document's Structure and Contents (XML 문서의 구조와 내용을 고려한 유사도 측정)

  • Kim, Woo-Saeng
    • Journal of Korea Multimedia Society / Vol. 11, No. 8 / pp.1043-1050 / 2008
  • XML has become a standard for data representation and exchange on the Internet. With a large number of XML documents on the Web, there is an increasing need to automatically process these structurally rich documents for information retrieval, document management, and data mining applications. In this paper, we propose a new method to measure the similarity between XML documents by considering their structures and contents. The similarity of the documents' structures is found by a simple string matching technique, and that of their contents is found by weights that take into account the names and positions of elements. The overall algorithm runs in time linear in the combined size of the two documents being compared.
