• Title/Summary/Keyword: Document Order

Study on the Environment Information Providing Method based on Spatial Information Document

  • Choi, Byoung Gil;Na, Young Woo;Kim, Sung Pyo
    • Journal of the Korean Society of Surveying, Geodesy, Photogrammetry and Cartography / v.34 no.2 / pp.185-194 / 2016
  • The purpose of this study is to present a method for providing environmental information based on spatial information documents. A great deal of spatial information, including environmental information, is currently being produced, but users need separate software or systems to access it. Environmental information in particular is produced in various types, such as ecology, vegetation, and measurement network data. It is therefore necessary to define the form and production method of a spatial information document that allows environmental information to be used as spatial information without separate software or systems. To that end, the types and forms of environmental information, the data formats, and the delivery methods used by the government, in particular the Ministry of Environment and local governments, are analyzed. The information is classified into 12 fields, and the data are produced as GIS databases, measurement network data, text data, and so on. As paper maps decline, a spatial information document that offers display by layer, coordinate data, attribute data, distance and area measurement, location search by coordinates, GPS location linkage, and location display on the map is presented to increase the use of geo-environmental information maps. Finally, a standard document specification based on the spatial information document is presented, with usability and readability in mind, in order to provide a variety of environmental information without separate software or systems.

Generic Document Summarization using Coherence of Sentence Cluster and Semantic Feature (문장군집의 응집도와 의미특징을 이용한 포괄적 문서요약)

  • Park, Sun;Lee, Yeonwoo;Shim, Chun Sik;Lee, Seong Ro
    • Journal of the Korea Institute of Information and Communication Engineering / v.16 no.12 / pp.2607-2613 / 2012
  • The results of generic summarization based on inherent knowledge are influenced by the composition of sentences in the document set. To resolve this problem, this paper proposes a new generic document summarization method that uses clustering of the semantic features of a document and the coherence of sentence clusters. The proposed method clusters sentences using semantic features derived from NMF (non-negative matrix factorization); because the sentence clusters represent the inherent structure of the document well, the method can identify the document's topic groups. In addition, the method can improve summarization quality because important sentences are extracted using the coherence of the sentence clusters and the clusters are refined by re-clustering. The experimental results demonstrate that applying the proposed method to generic summarization achieves better performance than other generic document summarization methods.

A Clustering Technique using Common Structures of XML Documents (XML 문서의 공통 구조를 이용한 클러스터링 기법)

  • Hwang, Jeong-Hee;Ryu, Keun-Ho
    • Journal of KIISE:Databases / v.32 no.6 / pp.650-661 / 2005
  • As the Internet grows, the use of XML, a standard for semi-structured documents, is increasing, and work on the integration and retrieval of XML documents is ongoing. The basis of efficient integration and retrieval is clustering XML documents with similar structure. Conventional XML clustering approaches use a hierarchical clustering algorithm that produces the required number of clusters through repeated merging, but they have two problems: it is difficult to compute the similarity between XML documents, and comparing similarity repeatedly takes a long time. To address these problems, we use a clustering algorithm for transactional data that scales to large data sets. In this paper we use common structures from XML documents that have no DTD or schema. To use the common structures of XML documents, we extract representative structures by decomposing the structure of a tree model that expresses each XML document, and we perform clustering with the extracted structures. We also show the efficiency of the proposed method by comparing and analyzing it against the previous method.
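
The idea of decomposing an XML tree into structural items can be illustrated with a minimal sketch. It assumes that a "structure" is simply a root-to-element tag path and that Jaccard similarity over path sets stands in for the comparison step; the function names `structure_paths` and `jaccard` are hypothetical, and this is not the authors' algorithm.

```python
import xml.etree.ElementTree as ET


def structure_paths(xml_text):
    """Return the set of root-to-element tag paths found in an XML document."""
    root = ET.fromstring(xml_text)
    paths = set()

    def walk(node, prefix):
        # Each element contributes one path, e.g. "/book/author/name".
        path = prefix + "/" + node.tag
        paths.add(path)
        for child in node:
            walk(child, path)

    walk(root, "")
    return paths


def jaccard(a, b):
    """Jaccard similarity of two path sets (assumed similarity measure)."""
    if not a and not b:
        return 1.0
    return len(a & b) / len(a | b)


# Three small schema-less documents; the first two share most structure.
doc_a = "<book><title/><author><name/></author></book>"
doc_b = "<book><title/><author><name/></author><year/></book>"
doc_c = "<order><item><price/></item></order>"

paths_a = structure_paths(doc_a)
paths_b = structure_paths(doc_b)
paths_c = structure_paths(doc_c)

print(sorted(paths_a))
print(jaccard(paths_a, paths_b))
print(jaccard(paths_a, paths_c))
```

Treating each document as a set of path items is what makes a transactional clustering algorithm applicable: each document becomes a "transaction" and each path an "item".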

A Study on Edit Order of Text Cells on the MS Excel Files (MS 엑셀 파일의 텍스트 셀 입력 순서에 관한 연구)

  • Lee, Yoonmi;Chung, Hyunji;Lee, Sangjin
    • Journal of the Korea Institute of Information Security & Cryptology / v.24 no.2 / pp.319-325 / 2014
  • Since smartphones and tablet PCs have become widely used, users can create and edit documents anywhere in real time. If the input and edit flow of a document can be traced, it can be used as evidence in a digital forensic investigation. The typical document application is MS (Microsoft) Office. MS Office applications use two file formats: the Compound Document File Format, used from version 97 to 2003, and the OOXML (Office Open XML) File Format, used from version 2007 onward. Studies of MS Office files so far have focused on deciding whether a file has been tampered with, through detection of concealed items or analysis of document properties. This paper analyzes the input order of text cells in MS Excel files and shows how to determine which cell was edited last, from a digital forensic perspective.

Extension of legacy gear design systems using XML and XSLT (XML과 XSLT를 이용한 레거시 기어 설계 시스템의 확장에 관한 연구)

  • 정태형;박승현
    • Proceedings of the Korean Society of Machine Tool Engineers Conference / 2001.10a / pp.257-262 / 2001
  • As computer-related technologies have developed, legacy design systems have become ill-suited to new computing environments. Most of them therefore need to be either modified or redeveloped, which requires considerable cost and time. This paper presents a method of extending a legacy design system without modification, using XML and XSLT. To apply the method, a representative legacy design system, the AGMA gear rating system, has been extended to suit a distributed computing environment. An XML document for the AGMA gear rating process is defined. An XSLT processor transforms it into the input document format of the AGMA gear rating system according to the transformation rules defined in the AGMA gear rating XSLT document. The AGMA gear rating system then reads this input document and generates an output document on the server. These operations are executed automatically by an external legacy system controller without user interaction. Using these operations, an AGMA gear rating web service based on SOAP and WSDL has been developed to provide the functions of the legacy AGMA gear rating system over a distributed network. Any system or user can carry out the AGMA gear rating process independently of platform type, without implementing it themselves, simply by calling the AGMA gear rating web service over the Internet.

Document Summarization using Topic Phrase Extraction and Query-based Summarization (주제어구 추출과 질의어 기반 요약을 이용한 문서 요약)

  • 한광록;오삼권;임기욱
    • Journal of KIISE:Software and Applications / v.31 no.4 / pp.488-497 / 2004
  • This paper describes a hybrid document summarization method that combines indicative summarization and query-based summarization. Learning models are built from training documents in order to extract topic phrases. We use Naive Bayesian, Decision Tree, and Support Vector Machine as the machine learning algorithms. The system automatically extracts topic phrases from a new document based on these models and outputs a summary of the document using query-based summarization, which treats the extracted topic phrases as queries and calculates the locality-based similarity of each topic phrase. We examine how the topic phrases affect the summarization and how many phrases are appropriate for it. We then evaluate the extracted summary by comparing it with a manual summary, and we also compare our summarization system with the summarization method of MS-Word.

A Study on Development of SGML Repository System Based on DTD-dependent Schema (DTD 의존 스키마에 기반한 SGML 문서 저장 시스템 개발에 관한 연구)

  • Kim, Hyeon-Gi;No, Dae-Sik;Gang, Hyeon-Gyu
    • The Transactions of the Korea Information Processing Society / v.6 no.5 / pp.1153-1165 / 1999
  • In various fields of information technology, there is a growing need for dynamic content management systems that store and manage SGML (Standard Generalized Markup Language) documents in a database system. In this paper, we consider the issue of storing SGML documents with complex hierarchical structure in a database system, and propose a data model based on the ODMG (Object Database Management Group) object model in order to store SGML documents without loss of information. Because the proposed data model reflects both the physical element structure and the logical entity structure of SGML documents, it can store an SGML document in a database system at element-level granularity without any information loss. The proposed data model can also be adapted across ODMG-compliant object database management systems. Finally, we discuss the implementation details of an SGML repository system that supports automatic database schema creation for any DTD (Document Type Definition), storage of SGML documents, dynamic assembly of SGML documents from stored database objects, and indexing and searching of database objects.

An Indexing Scheme for Efficient Retrieval and Update of Structured Documents Based on GDIT (GDIT를 기반으로 한 구조적 문서의 효율적 검색과 갱신을 위한 인덱스 설계)

  • Kim, Young-Ja;Bae, Jong-Min
    • The Transactions of the Korea Information Processing Society / v.7 no.2 / pp.411-425 / 2000
  • Information retrieval systems for structured documents written in SGML or XML support partial retrieval of documents. To process queries based on document structure efficiently, such systems require low memory overhead for indexing, quick query response times, support for powerful types of user queries, and minimal updates to the index structure when documents are updated. This paper introduces the Global Document Instance Tree (GDIT) and proposes an effective indexing scheme and query processing algorithms based on it. The indexing scheme maintains indexing and retrieval efficiency and also guarantees minimal updates to the index structure when document structures are updated.

Detection of Malicious PDF based on Document Structure Features and Stream Objects

  • Kang, Ah Reum;Jeong, Young-Seob;Kim, Se Lyeong;Kim, Jonghyun;Woo, Jiyoung;Choi, Sunoh
    • Journal of the Korea Society of Computer and Information / v.23 no.11 / pp.85-93 / 2018
  • In recent years, there has been an increasing number of ways to distribute document-based malicious code by exploiting vulnerabilities in document files. Because document-type malware is not itself an executable file, it can easily bypass existing security programs, so research on a detection model is necessary. In this study, we extract the main features from the document structure and from the JavaScript contained in the stream objects. In addition, when JavaScript is inserted, keywords that occur frequently in malicious code, such as function names, reserved words, and readable strings in the script, are extracted. We then build a machine learning model that can distinguish between normal and malicious documents. To make the model difficult to bypass, we aim for good performance with a black-box type algorithm. For the experiment, a larger number of documents than in previous studies is analyzed. Experimental results show a 98.9% detection rate with three different types of algorithms. SVM, a black-box type algorithm that makes obfuscation difficult, shows much higher performance than in previous studies.

An Efficient Updates Processing Using Labeling Scheme In Dynamic Ordered XML Trees (동적 순서 XML 트리에서 레이블링 기법을 이용한 효율적인 수정처리)

  • Lee, Kang-Woo
    • Journal of the Korea Institute of Information and Communication Engineering / v.12 no.12 / pp.2219-2225 / 2008
  • Labeling schemes that do not account for frequent updates in dynamic XML documents require a relabeling process to reflect the changed label information whenever the XML document tree is updated. This incurs considerable cost in dynamic XML documents, where updates can occur frequently. To solve this problem, the prime number labeling scheme, which does not need a relabeling process, has been suggested. However, the prime number labeling scheme does not consider that the sibling order of nodes in the XML tree also needs to be updated. This update is costly because most of the XML tree has to be relabeled and recalculated. In this paper, we propose a prime number labeling scheme with a sibling order value that can maintain sibling order without relabeling or recalculating the XML tree.
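
The basic prime number labeling idea that this abstract builds on can be sketched as follows. The sketch assumes the common formulation in which every node receives a unique prime as its self-label and its full label is that prime multiplied by its parent's label, so an ancestor test reduces to a divisibility check. The `Node` class and helper names are hypothetical, and the paper's sibling order value is not reproduced here.

```python
from itertools import count


def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for small trees)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n


class Node:
    """Tree node whose label is the product of the primes on its root path."""

    def __init__(self, name, self_label, parent=None):
        self.name = name
        self.self_label = self_label
        self.parent = parent
        self.label = self_label * (parent.label if parent else 1)


def is_ancestor(a, d):
    """True if a is a proper ancestor of d: a's label divides d's label."""
    return a is not d and d.label % a.label == 0


gen = primes()
root = Node("doc", next(gen))        # self-label 2, label 2
a = Node("a", next(gen), root)       # self-label 3, label 6
b = Node("b", next(gen), root)       # self-label 5, label 10
a1 = Node("a1", next(gen), a)        # self-label 7, label 42

# Inserting a node later only consumes a fresh prime;
# no existing label has to change.
a0 = Node("a0", next(gen), a)        # self-label 11, label 66

print(is_ancestor(a, a1), is_ancestor(b, a1))
```

Note that the insertion leaves every existing label untouched but records nothing about where `a0` sits among its siblings, which is the gap the abstract's sibling order value is meant to close.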