• Title/Summary/Keyword: document storing

A Streaming XML Parser Supporting Adaptive Parallel Search (적응적 병렬 검색을 지원하는 스트리밍 XML 파서)

  • Lee, Kyu-Hee;Han, Sang-Soo
    • Journal of the Korea Institute of Information and Communication Engineering, v.17 no.8, pp.1851-1856, 2013
  • XML is widely used for web services such as SOAP (Simple Object Access Protocol) and REST (Representational State Transfer), and is also a de facto standard for representing data. Because an XML parser using DOM (Document Object Model) requires a preprocessing step that builds a DOM tree and stores it in memory, embedded systems with limited resources typically employ a streaming XML parser that needs no preprocessing. In this paper, we propose a new architecture for a streaming XML parser using APSearch (Adaptive Parallel Search) on an FPGA (Field Programmable Gate Array). Compared to other approaches, the proposed APSearch parser dramatically reduces overhead on the software side and achieves about 2.55 and 2.96 times improvement in XML parsing time. Therefore, our APSearch parser is suitable for systems that need to speed up XML parsing.

A Circle Labeling Scheme without Re-labeling for Dynamically Updatable XML Data (동적으로 갱신가능한 XML 데이터에서 레이블 재작성하지 않는 원형 레이블링 방법)

  • Kim, Jin-Young;Park, Seog
    • Journal of KIISE:Databases, v.36 no.2, pp.150-167, 2009
  • XML has become the new standard for storing, exchanging, and publishing data over both the Internet and the ubiquitous data stream environment. As demand for efficient handling of XML documents grows, labeling schemes have become an important topic in data storage. Recently proposed labeling schemes reflect the dynamic XML environment, which itself motivates the search for an efficient labeling scheme. However, previously proposed labeling schemes have several problems, including: 1) inserting a new node into the XML document triggers re-labeling of pre-existing nodes; 2) they need more memory space to store the full labels. In this paper, we introduce a new labeling scheme called the Circle Labeling Scheme (CLS). In CLS, XML documents are represented in a circular form, and efficient storage of labels is supported by the concepts of Rotation Number and Parent Circle/Child Circle. The concept of Radius is applied to support the insertion of new nodes at arbitrary positions in the tree. This eliminates the need to re-label existing nodes or to increase label length, and mitigates conflicts with existing labels. A detailed experimental study demonstrates the efficiency of CLS.

Design and Performance Evaluation of Data Models for the XML Document Management (XML 문서관리를 위한 데이터 모델 설계 및 성능평가)

  • 유재수;손충범;조혜영
    • Journal of Internet Computing and Services, v.2 no.5, pp.59-70, 2001
  • Recently, XML has become a standard for information exchange on the Internet in various fields. Therefore, research on data modeling for storing and fetching XML documents has been actively conducted. However, existing approaches have the weakness that they support neither versioning nor fast fetching of documents when documents change in dynamic environments. In this paper, we propose four kinds of hybrid data modeling schemes that combine the fragmentation model and the nonfragmentation model. Our data modeling schemes are suitable for dynamic environments. We also present guidelines for applying our hybrid data modeling schemes to various applications. We show through various experiments that our hybrid schemes partially outperform the existing modeling schemes in terms of insertion time, storage space, and retrieval time.

An Efficient BitmapInvert Index based on Relative Position Coordinate for Retrieval of XML documents (효율적인 XML검색을 위한 상대 위치 좌표 기반의 BitmapInvert Index 기법)

  • Kim, Tack-Gon;Kim, Woo-Saeng
    • Journal of the Institute of Electronics Engineers of Korea CI, v.43 no.1 s.307, pp.35-44, 2006
  • Many index techniques for storing and querying XML documents have been studied, and many of them use coordinate-based methods. However, update operations and query processing that express structural relations among elements, attributes, and texts impose a large burden. In this paper, we propose an efficient BitmapInvert index technique based on Relative Position Coordinate (RPC). RPC performs well even under frequent update operations because it represents the relationship among the parent node and the left and right sibling nodes. The BitmapInvert index supports tort query with bitwise operations and, using the PostUpdate algorithm, does not cause serious performance degradation on update operations. Overall, performance can be improved by reducing the number of node traversals.

An Index Method for Storing and Extracting XML Documents (XML 문서의 저장과 추출을 위한 색인 기법)

  • Kim Woosaeng;Song Jungsuk
    • Journal of Korea Multimedia Society, v.8 no.2, pp.154-163, 2005
  • Because most previous research on XML documents has used an absolute coordinate system in its index techniques, update operations impose a large burden. To express the structural relations between elements, attributes, and text, the structure of the coordinates must be reconstructed. Because the reconstruction process cascades through the entire XML document rather than being limited to the node being changed, frequent update operations can cause a serious performance problem. In this paper, we propose an index technique based on an extensible index that does not cause serious performance degradation. It limits the number of nodes that participate in the reconstruction process and improves overall performance. The extensible index also evaluates containment relationship queries with simple expressions in SQL statements.

An XML Proxy Cache System for XML Documents with Update Locality in Shipbuilding Information Management System (조선정보관리시스템에서의 갱신의 지역편중성을 갖는 XML문서를 위한 XML 프록시 캐쉬 시스템)

  • Kim Nak Hyun;Lee Dong-Ho;Choi Il-Hwan;Kim Hyoung-Joo
    • Journal of KIISE:Computing Practices and Letters, v.11 no.5, pp.393-400, 2005
  • XML makes it possible to query information created and managed by different applications, which is impossible if the information is expressed in another structure or language. In a shipbuilding information management system, storing and querying a large XML document in XDBox is inefficient, and an XML proxy cache system is suggested to improve this. In this paper, we propose an XML proxy cache system that takes into account the update locality found in the use of a shipbuilding information management system.

Design of XML Document Management System based on Schema (스키마 기반의 XML문서 관리 시스템 설계)

  • 조윤기;김영란
    • Journal of the Korea Society of Computer and Information, v.6 no.4, pp.85-93, 2001
  • With the rapid progress toward the information society and the great increase in the amount of information, much research has been done on using XML to store and retrieve information effectively. However, many existing methods cannot support various structured retrieval methods for specific parent, child, and sibling nodes. In this paper, we propose (1) an effective method for representing structured information and an indexing mechanism using OETID (Ordered Element Type ID) for effective management and structured retrieval of XML documents. We also propose (2) a document integration mechanism for retrieval results and a storage technique for the structural information of XML documents. With our methods, we can effectively represent the structural information of XML documents, directly access specific elements, and process various queries with simple operations.

A Study on the Intelligent Document Processing Platform for Document Data Informatization (문서 데이터 정보화를 위한 지능형 문서처리 플랫폼에 관한 연구)

  • Hee-Do Heo;Dong-Koo Kang;Young-Soo Kim;Sam-Hyun Chun
    • The Journal of the Institute of Internet, Broadcasting and Communication, v.24 no.1, pp.89-95, 2024
  • Nowadays, the competitiveness of a company depends on the ability of all organizational members to share and utilize the knowledge accumulated by the organization. As if to prove this, the world is now focusing on the ChatGPT service, which uses generative AI technology based on an LLM (Large Language Model). However, it is still difficult to apply the ChatGPT service to work because of its many hallucination problems. To solve this problem, sLLM (Lightweight Large Language Model) technology is being proposed as an alternative. Corporate data is essential for constructing an sLLM. Corporate data consists of the organization's ERP data and the office document knowledge data it has preserved. ERP data can be used by connecting it directly to the sLLM, but office documents are stored in file format and must be converted to data format before they can be connected to the sLLM. In addition, there are too many technical limitations to utilize office documents stored in file format as organizational knowledge. This study proposes a method of storing office documents in DB format rather than file format, which allows companies to utilize their already accumulated office documents as an organizational knowledge system and provides office documents in data form to the company's sLLM. We aim to contribute to improving corporate competitiveness by combining AI technology.

Extracting Maximal Similar Paths between Two XML Documents using Sequential Pattern Mining (순차 패턴 마이닝을 사용한 두 XML 문서간 최대 유사 경로 추출)

  • 이정원;박승수
    • Journal of KIISE:Databases, v.31 no.5, pp.553-566, 2004
  • Some of the current main research areas involving XML-related techniques are storing XML documents, optimizing queries, and indexing. As such, we may focus on sets of documents that have various structures but do not share a common structure such as the same DTD or XML Schema. In that case, it is essential to analyze the structural similarities and differences among many documents. For example, when documents from the Web or an EDMS (Electronic Document Management System) need to be merged or classified, it is very important to find their common structure for the document handling process. In this paper, we transformed sequential pattern mining algorithms to extract maximal similar paths between two XML documents. Experiments with XML documents show that our transformed sequential pattern mining algorithms can exactly find common structures and maximal similar paths between them. The analysis of the experimental results shows that similarity metrics based on maximal similar paths can exactly classify the types of XML documents.

An Apache-based WebDAV Server Supporting Reliable Resource Management (아파치 기반의 신뢰성 있는 자원관리를 지원하는 웹데브 서버)

  • Jung, Hye-Young;Ahn, Geon-Tae;Park, Yang-Soo;Lee, Myung-Joon
    • The KIPS Transactions:PartC, v.11C no.4, pp.545-554, 2004
  • WebDAV is a protocol that supports collaboration over the Internet among workers in geographically distant locations. WebDAV extends the web communication protocol HTTP/1.1 to provide a standard infrastructure for asynchronous collaboration on various contents across the Internet. To provide WebDAV functionality in legacy applications such as web-based collaborative systems or document management systems, those systems need additional implementation to handle the WebDAV methods and header information. In this paper, we developed an Apache-based WebDAV server named DAVinci (WebDAV Is New Collaborative web-authoring Innovation), which supports the WebDAV specification. DAVinci was implemented as a service provider on the mod_dav Apache module. mod_dav is an open source Apache module that provides WebDAV capabilities in an Apache web server. We used a file system for storing resources and the PostgreSQL database for their properties. In addition, the system provides a consistency manager to guarantee that resources and their properties are maintained without inconsistency between them.