• Title/Summary/Keyword: Document Order

An Archival Study on the Arrangement and Description of Old Document(Diploma) (고문서 정리(整理)에 대한 기록학적 연구 - 새로운 고문서 정리 방법의 모색을 위하여 -)

  • Cho, Kyung-Koo
    • The Korean Journal of Archival Studies / no.7 / pp.37-74 / 2003
  • An old document (diploma) is a unique historical record, so it must be collected, arranged, and preserved for research as soon as possible. In particular, to use old documents effectively, the material needs to be arranged and described systematically on the basis of modern archival theory. The Kyujanggak Archives at Seoul National University has published 23 volumes of old document material in its Old Document(Diploma) series. These volumes seem to inconvenience readers, however, because the materials are classified and gathered only by genre, the titles and the order of the materials are not standardized, and there is no description of the content of each document. The Jangseo-gak Library at The Academy of Korean Studies has also published a series of old document material, the Old Document(Diploma) Collection. The situation there is no different, since materials classified and gathered by genre, family, academy, or local school are all mixed together, and a large part of the materials have neither titles nor content descriptions. For the arrangement and description of records, European and American archival science has established 1) the principle of provenance, 2) the principle of original order, 3) levels of control, and 4) collective description. These theories are valuable for the effective use of old documents. Under the principle of provenance, old document materials should be classified not by subject and genre but by family and person. Once the materials have been collected by family or person, they should be arranged in their original order, both for more detailed arrangement and for finding the relationships among them. This is the so-called principle of original order.
The hierarchical management of old document materials, for example classification by record group, sub-group, series, and item, is the concept of levels of control, and a comprehensive description at each level of the hierarchy is the concept of collective description. The paper applies these archival theories to the 34 pieces of Chung, Man-Seok's material in the Old Document(Diploma) series. First, the materials, which were scattered across series classified by genre, are collected into a Chung, Man-Seok collection (the principle of provenance). Second, they are rearranged chronologically (the principle of original order), which yields comprehensive information about Chung, Man-Seok. For the hierarchical management of the materials, a few concepts should be established, running from the general, large group down to the specific, small item. They can be organized as follows: 1) record group (the Chung, Man-Seok record group) - 2) sub-group (personnel documents, property documents, family documents, social activity documents, political activity documents, etc.) - 3) series (the gyoji series, gyoseo series, yuji series, etc. within the personnel documents) - 4) folder (a document with additions) - 5) item (one document). According to the theory of collective description, the record group level should carry a collective description consisting of Chung, Man-Seok's biography or a summary of the record group. Similarly, the sub-group level should carry a summary of the sub-group, and the series level a summary of the series.
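
The regrouping described in this abstract can be sketched in a few lines of Python. This is only an illustration under stated assumptions, not the paper's procedure: the `Item` fields, the `arrange` function, and every sample title, year, and creator name below are hypothetical, and the sub-group and folder levels are left out for brevity.

```python
from dataclasses import dataclass

@dataclass
class Item:
    title: str     # item level: one document
    year: int      # date used to restore chronological order
    series: str    # e.g. "gyoji", "gyoseo", "yuji"
    creator: str   # provenance: the family or person the document belongs to

def arrange(items, creator):
    """Collect one creator's items (provenance) and sort each series by year (original order)."""
    series = {}
    for item in items:
        if item.creator == creator:
            series.setdefault(item.series, []).append(item)
    for docs in series.values():
        docs.sort(key=lambda d: d.year)
    return {
        "record_group": creator,
        # Collective description: a short summary attached to the record group level.
        "description": f"{sum(len(d) for d in series.values())} items in {len(series)} series",
        "series": series,
    }

# Hypothetical items as they might appear in a genre-ordered volume.
volume = [
    Item("appointment B", 1790, "gyoji", "Person A"),
    Item("royal letter", 1801, "gyoseo", "Person A"),
    Item("appointment X", 1785, "gyoji", "Person B"),
    Item("appointment A", 1783, "gyoji", "Person A"),
]
group = arrange(volume, "Person A")
print(group["description"])
for name, docs in group["series"].items():
    print(name, [d.title for d in docs])
```

Grouping by `creator` before sorting mirrors the order of the two principles: provenance decides which documents belong together, and only then is the order inside each series restored.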

A Study on Word Sense Disambiguation Using Bidirectional Recurrent Neural Network for Korean Language

  • Min, Jihong;Jeon, Joon-Woo;Song, Kwang-Ho;Kim, Yoo-Sung
    • Journal of the Korea Society of Computer and Information / v.22 no.4 / pp.41-49 / 2017
  • Word sense disambiguation (WSD), which determines the exact meaning of a homonym that can carry different meanings in a single form, is very important for understanding the semantics of a text document. Many recent WSD studies have used a neural network language model (NNLM), in which a neural network represents a document as vectors and analyzes its semantics. Among previous WSD studies using NNLMs, the recurrent neural network (RNN) model performs better than other models because it can reflect the order in which words occur in addition to word appearance information. However, since the RNN model uses only the forward order of word occurrences in a document, it cannot reflect the characteristic of natural language that later words can affect the meanings of preceding words. In this paper, we propose a WSD scheme using a bidirectional RNN that can reflect not only the forward order but also the backward order of word occurrences in a document. In the experiments, the accuracy of the proposed model is higher than that of the previous RNN-based method. Hence, it is confirmed that bidirectional order information about word occurrences is useful for WSD in Korean.

A Study on Intelligent Document Processing Management using Unstructured Data (비정형 데이터를 활용한 지능형 문서 처리 관리에 관한 연구)

  • Kyoung Hoon Park;Kwang-Kyu Seo
    • Journal of the Semiconductor & Display Technology / v.23 no.2 / pp.71-75 / 2024
  • This research focuses on efficiently processing unstructured data that contains various formulas, applying text mining techniques to the processing and management of the terms and rules of domestic insurance documents. Through parsing and compilation technology, document context, content, constants, and variables are automatically separated, and errors are verified in the order of the document and its logic to improve document accuracy. Through document debugging technology, errors in the document are identified in real time. Furthermore, it is necessary to anticipate the changes that intelligent document processing will bring to document management work, in particular the impact on documents and utilization tasks that are managed in duplicate because of various formulas, and to prepare the capabilities that will be needed in the future.


The Current State and Challenges of Electronic Document Delivery Services (전자원문제공서비스의 현황과 과제)

  • 이경호
    • Journal of Korean Library and Information Science Society / v.29 / pp.171-212 / 1998
  • This study examines the concept, development, and present state of electronic document delivery services, projects, and systems. It also considers the implications of electronic document delivery services for libraries and the future of these services. The conclusions and suggestions derived from the study are as follows: (1) Electronic document delivery, one of the most innovative methods for delivering needed materials to researchers, is becoming an important part of today's information industry. (2) Technological developments have made it possible to deliver nearly all document formats electronically, with turnaround times as short as 30 minutes. The technology has also made it possible to develop user-friendly document delivery services by providing various methods of requesting, delivering, and charging for materials. (3) Different types of institutions have researched, tested, developed, and implemented electronic document delivery techniques with different features. (4) The issues of copyright and standards involved in electronic document delivery remain to be solved. (5) The growth of patron-initiated document delivery services has affected, and will continue to affect, library services, since materials can be delivered directly to end users without a librarian's intermediation. (6) Libraries could treat outside electronic document delivery services as an opportunity. Accordingly, to incorporate these services into interlibrary loan, collection development, and other library services, libraries should establish appropriate policies, guidelines, and management strategies for their operation.
(7) To maximize the use of electronic document delivery services, libraries should provide appropriate education so that librarians and users gain knowledge of and skills in the changing techniques of electronic document delivery, as well as in the various features and changing mechanisms of each system and service.


Improving the Performance of Document Clustering with Distributional Similarities (분포유사도를 이용한 문헌클러스터링의 성능향상에 대한 연구)

  • Lee, Jae-Yun
    • Journal of the Korean Society for Information Management / v.24 no.4 / pp.267-283 / 2007
  • In this study, measures of distributional similarity such as KL divergence are applied to document clustering in place of the traditional cosine measure, the most prevalent vector similarity measure for this task. Three variations of KL divergence are investigated: Jensen-Shannon divergence, symmetric skew divergence, and minimum skew divergence. To verify the contribution of distributional similarities to document clustering, two experiments are designed and carried out on three test collections. In the first experiment, the clustering performance of the three divergence measures is compared with that of the cosine measure. The results show that minimum skew divergence outperforms the other divergence measures as well as the cosine measure. In the second experiment, second-order distributional similarities are calculated with the Pearson correlation coefficient from the first-order similarity matrices. The second-order distributional similarities are found to improve the overall performance of document clustering. These results suggest that minimum skew divergence should be selected as the document vector similarity measure when both time and accuracy matter, and that second-order similarity is a good choice when only clustering accuracy matters.
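
For readers unfamiliar with the measures named in this abstract, the sketch below computes them for two term distributions. KL divergence, Jensen-Shannon divergence, and the skew divergence KL(p || αq + (1 − α)p) follow their usual definitions. The symmetric variant (sum of both directions) and the minimum variant (smaller of both directions) are assumptions made for illustration, as is the default α = 0.99; the abstract does not give the paper's exact formulas.

```python
import math

def kl(p, q):
    """KL divergence KL(p || q); requires q[i] > 0 wherever p[i] > 0."""
    return sum(pi * math.log(pi / qi) for pi, qi in zip(p, q) if pi > 0)

def jensen_shannon(p, q):
    """Average KL divergence of p and q from their midpoint distribution."""
    m = [(pi + qi) / 2 for pi, qi in zip(p, q)]
    return 0.5 * kl(p, m) + 0.5 * kl(q, m)

def skew(p, q, alpha=0.99):
    """Skew divergence: KL from p to q smoothed with a little of p, so it stays finite."""
    mix = [alpha * qi + (1 - alpha) * pi for pi, qi in zip(p, q)]
    return kl(p, mix)

def symmetric_skew(p, q, alpha=0.99):
    # Assumed definition: sum of the skew divergence in both directions.
    return skew(p, q, alpha) + skew(q, p, alpha)

def minimum_skew(p, q, alpha=0.99):
    # Assumed definition: the smaller of the two directed skew divergences.
    return min(skew(p, q, alpha), skew(q, p, alpha))

# Hypothetical term distributions for two documents over a 3-term vocabulary.
doc_a = [0.5, 0.5, 0.0]
doc_b = [0.1, 0.6, 0.3]
print(jensen_shannon(doc_a, doc_b))
print(symmetric_skew(doc_a, doc_b), minimum_skew(doc_a, doc_b))
```

Plain `kl(doc_b, doc_a)` would fail here because `doc_a` assigns zero probability to the third term; the smoothing in `skew` is what makes the measure usable on sparse document vectors.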

XML Document Keyword Weight Analysis based Paragraph Extraction Model (XML 문서 키워드 가중치 분석 기반 문단 추출 모델)

  • Lee, Jongwon;Kang, Inshik;Jung, Hoekyung
    • Journal of the Korea Institute of Information and Communication Engineering / v.21 no.11 / pp.2133-2138 / 2017
  • Existing analysis of XML documents and other documents has centered on words. Such analysis can be implemented with a morpheme analyzer, which can classify the many words in a document but cannot grasp the document's core content. For a user to understand a document efficiently, the paragraphs containing a main word must be extracted and presented. The proposed system searches for a keyword in a normalized XML document, then extracts the paragraphs containing the keyword that the user entered and displays them. In addition, the user is informed of the frequency and weight of the keyword used in the search, and the extracted paragraphs are ordered and redundant ones eliminated so that the user can understand the document. The proposed system can minimize the time and effort required to understand a document, because the user does not have to read the whole document.
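
A minimal sketch of this kind of keyword-based paragraph extraction follows. It rests on several assumptions not taken from the paper: paragraphs are stored in `<p>` elements, a paragraph's weight is its share of all keyword occurrences, and paragraphs are ranked by keyword count. The function name `extract_paragraphs` and the sample document are hypothetical.

```python
import xml.etree.ElementTree as ET

def extract_paragraphs(xml_text, keyword, tag="p"):
    """Return (paragraph, keyword count, weight) tuples, most keyword-dense first."""
    root = ET.fromstring(xml_text)
    keyword = keyword.lower()
    seen = set()
    hits = []
    for elem in root.iter(tag):
        # Flatten nested markup and normalize whitespace.
        text = " ".join("".join(elem.itertext()).split())
        count = text.lower().count(keyword)
        if count == 0 or text in seen:
            continue  # skip paragraphs without the keyword and exact duplicates
        seen.add(text)
        hits.append((count, text))
    total = sum(count for count, _ in hits)
    ranked = sorted(hits, key=lambda h: h[0], reverse=True)
    return [(text, count, count / total) for count, text in ranked]

# Hypothetical normalized XML document.
doc = """<doc>
  <p>XML labels identify nodes. XML trees change often.</p>
  <p>Morpheme analysis splits text into words.</p>
  <p>Each XML node has a parent.</p>
  <p>Each XML node has a parent.</p>
</doc>"""
results = extract_paragraphs(doc, "xml")
for text, count, weight in results:
    print(f"{count} ({weight:.2f}) {text}")
```

The duplicate fourth paragraph is dropped, and the remaining matches are shown with their frequency and weight so a reader can judge relevance without opening the full document.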

History Document Image Background Noise and Removal Methods

  • Ganchimeg, Ganbold
    • International Journal of Knowledge Content Development & Technology / v.5 no.2 / pp.11-24 / 2015
  • Archive libraries commonly provide public access to collections of historical and ancient document images, and such images often require specialized processing to remove background noise and become more legible. Document images may be contaminated with noise during transmission, scanning, or conversion to digital form. Noise can be categorized by identifying its features, and similar patterns can be searched for in a document image to choose an appropriate removal method. In this paper, we propose a hybrid binarization approach that improves the quality of old documents by combining global and local thresholding. The article also reviews the kinds of noise that may appear in scanned document images and discusses some noise removal methods.
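
One way to combine global and local thresholding is sketched below. It is not the paper's algorithm, whose details the abstract does not give. The assumptions are: the global threshold is the image mean, pixels far from it (beyond a hypothetical `margin`) are decided globally, and the remaining ambiguous pixels are compared with the mean of a small neighbourhood.

```python
def global_threshold(img):
    """Global threshold: the mean grey level of the whole image (an assumed choice)."""
    pixels = [v for row in img for v in row]
    return sum(pixels) / len(pixels)

def local_mean(img, r, c, radius):
    """Mean grey level of the window centred on (r, c), clipped at the image border."""
    rows, cols = len(img), len(img[0])
    window = [img[i][j]
              for i in range(max(0, r - radius), min(rows, r + radius + 1))
              for j in range(max(0, c - radius), min(cols, c + radius + 1))]
    return sum(window) / len(window)

def hybrid_binarize(img, margin=40, radius=1):
    """Binarize a greyscale image (0 = ink, 255 = background)."""
    t = global_threshold(img)
    out = []
    for r, row in enumerate(img):
        out_row = []
        for c, v in enumerate(row):
            if v < t - margin:
                out_row.append(0)        # clearly dark: decided globally
            elif v > t + margin:
                out_row.append(255)      # clearly light: decided globally
            else:
                # Ambiguous pixel: fall back to the local neighbourhood.
                out_row.append(0 if v < local_mean(img, r, c, radius) else 255)
        out.append(out_row)
    return out

# Hypothetical 3x3 scan: one dark stroke pixel on a light, slightly grey page.
page = [[200, 200, 200],
        [200,  50, 200],
        [200, 200, 200]]
binary = hybrid_binarize(page)
print(binary)
```

The global pass is cheap and handles high-contrast regions, while the local pass is reserved for pixels where uneven background would otherwise mislead a single threshold.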

Development of Common Document Structure based on XML for Representing Mechanical Part and Assembly Information (기계 조립품 정보의 표현을 위한 XML기반 공용문서 구조)

  • 정태형;박승현;윤성원
    • Journal of the Korean Society for Precision Engineering / v.20 no.9 / pp.180-187 / 2003
  • In an engineering design environment it is hard to link design data and systems because their types are disparate, so the importance of metadata has increased. Some studies have developed metadata, but the results cannot interact with other metadata and are difficult to extend. The purpose of this paper is to develop a common document structure that represents the general information of mechanical parts and assemblies using XML, and to use it as a base document for integrating design data and systems. The structure is composed of part, assembly, and user documents. The part document represents the information of a part independently of part type. The assembly document represents the location of its constituent part documents. The user document represents the user's information. Common documents can act as a broker between design data and systems and can improve the interpretability and reusability of documents. We applied the developed common document structure to a 2-stage spur gear drive.

A Prime Numbering Scheme with Sibling-Order Value for Efficient Labeling in Dynamic XML Documents (동적 XML 문서에서 효과적인 레이블링을 위해 형제순서 값을 갖는 프라임 넘버링 기법)

  • Lee, Kang-Woo;Lee, Joon-Dong
    • Journal of the Korea Society of Computer and Information / v.12 no.5 / pp.65-72 / 2007
  • Labeling schemes that do not account for frequent updates in dynamic XML documents need a relabeling process to reflect changed label information whenever the XML document tree is updated. This is costly for dynamic XML documents, where updates can occur frequently. The prime number labeling scheme, which needs no relabeling process, has been suggested to solve this problem. However, it does not consider the need to update the sibling order of nodes in the XML document tree, and this update is expensive because most of the tree has to be searched and rewritten again. In this paper, we propose a prime number labeling scheme with a sibling-order value that can maintain the sibling order without re-searching or rewriting the XML document tree.
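
The idea behind prime number labeling can be sketched as follows: each node receives a fresh prime, its label is that prime times the parent's label, and ancestry is tested by divisibility. The sibling-order value here is an assumption made purely for illustration (a number placed midway between the neighbouring siblings' values); the abstract does not describe how the paper actually encodes it, and the `Tree` and `Node` classes are hypothetical.

```python
from itertools import count

def primes():
    """Yield 2, 3, 5, ... by trial division (adequate for a small demo)."""
    found = []
    for n in count(2):
        if all(n % p for p in found):
            found.append(n)
            yield n

class Node:
    def __init__(self, name, label, order):
        self.name, self.label, self.order = name, label, order
        self.children = []

class Tree:
    def __init__(self, root_name):
        self._primes = primes()
        self.root = Node(root_name, 1, 0.0)

    def insert(self, parent, name, position=None):
        """Insert a child at `position` without changing any existing label or order value."""
        kids = parent.children
        if position is None:
            position = len(kids)
        before = kids[position - 1].order if position > 0 else 0.0
        after = kids[position].order if position < len(kids) else before + 2.0
        # Label = parent's label times a prime no other node has used.
        node = Node(name, parent.label * next(self._primes), (before + after) / 2)
        kids.insert(position, node)
        return node

def is_ancestor(a, d):
    """a is an ancestor of d exactly when a's label divides d's label."""
    return a is not d and d.label % a.label == 0

tree = Tree("book")
ch1 = tree.insert(tree.root, "chapter1")
ch3 = tree.insert(tree.root, "chapter3")
ch2 = tree.insert(tree.root, "chapter2", position=1)  # lands between the two
sec = tree.insert(ch2, "section")
print([(n.name, n.label, n.order) for n in tree.root.children])
print(is_ancestor(ch2, sec), is_ancestor(ch1, sec))
```

Inserting `chapter2` between its siblings leaves the labels of `chapter1` and `chapter3` untouched. A midpoint order value eventually runs out of floating-point precision after many insertions at the same spot, which is one reason a real scheme would need a more careful encoding.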


The Design and Implementation of SGML Document Editing System Using Document Structure Information (문서 구조정보를 이용한 SGML 문서 편집 시스템의 설계 및 구현)

  • Kim, Chang-Su;Jo, In-June;Jung, Hoe-Kyung
    • The Journal of Engineering Research / v.3 no.1 / pp.21-27 / 1998
  • This paper describes the design and implementation of a system for editing SGML document instances using the document structure information of an SGML DTD. The system uses a structure window that presents the logical structure of the document, so that users can edit SGML documents without editing mistakes and update them easily. It provides tools that support the editing of elements, attributes, and entities, and it validates the resulting document with an SGML parser. It also supports Korean and English text using KS 5601. The proposed SGML document editing system uses the common controls of Windows 95 for its window user interface.
