• Title/Summary/Keyword: PDF Document

A Study on the online of PDF Electronic Documents System (인터넷 원거리출판의 응용과 PDF의 인쇄활용에 관한 연구)

  • 유영수;강영립;김병현;이광수
    • Proceedings of the Korean Printing Society Conference / 2001.06a / pp.63-77 / 2001
  • PDF (Portable Document Format) is a file format that Adobe developed by advancing its PostScript technology; it is used for managing document information and for electronic publishing (Internet, CD-ROM, DVD). PDF was devised so that a document can be read and printed anywhere, independent of operating system, printer type, resolution, and kind of computer. Because the format includes compression, documents can be transferred as small files over the Internet or an intranet. The format also has various advantages for sharing information and transferring documents in both online and offline environments. In this paper, we developed an electronic document system using the PDF format. The system consists of a filter, automatic indexing, a special search system, and a web server. The information used in this paper is a database built with Zwon's DocuCom. The filter recognizes various kinds of document structure and produces ASCII output according to the properties of the document. In addition to processing various document formats, the filter can extract keywords from MS Word, Excel, PowerPoint, PDF, CAD, and other documents. The filter uses the structure of a Windows printer driver and can extract text, page, font type, and font size information from the document. The automatic indexing recognizes the formatting tags in the ASCII text produced by the filter and extracts keywords appropriate to the structure and properties of the document. The PDF electronic document system proposed in this paper can be used over the Internet and PC communication services. Users can choose and read electronic documents in two ways. First, they can choose and read books through the PDF electronic document homepage. Second, they can use the PDF integrated search system, entering a keyword and choosing the reference field and type of data. At present, however, Adobe's PDF products do not support Korean characters. If this problem is resolved, we think that PDF application systems will come into active use. Although Zwon DocuCom, as used in this study, offers limited functionality, we think that it poses no great difficulty for producing electronic documents and building a digital database.

PDF Publication Solution based on Web (웹을 기반으로 한 PDF 출판 솔류션에 관한 연구)

  • Lee Jae-Deuk
    • Journal of Korean Society of Industrial and Systems Engineering / v.28 no.2 / pp.109-116 / 2005
  • In previous client/server (C/S) publishing systems, an editor or contributor can arbitrarily modify the document created by the author, in which case it is difficult to identify the changes made to the document. Another shortcoming is that when a document needs tracking or editing, the client must have the respective editing system installed. To solve this problem, the gist of the document must be preserved along with the document itself, and the process of handling the document must be standardized. Publishing on the web ensures a more stable and accurate result in processing documents. The significance of web publishing becomes clear when we consider the importance of information per se and the growing demand for immediate publication today. There is a prominent need for a simple and straightforward Apache-based PDF publishing system that supports HTML and CSS and whose converting engine supports PDF standard security. Such a system provides a library with which one can create a PDF directly on Windows, Linux, or Unix without relying on a client, allowing high-speed PDF creation. The development of a web-accessed PDF converting engine forms the basis for e-transactions, online brochures, electronic B/L, and many other industrial sectors.

PMCN: Combining PDF-modified Similarity and Complex Network in Multi-document Summarization

  • Tu, Yi-Ning;Hsu, Wei-Tse
    • International Journal of Knowledge Content Development & Technology / v.9 no.3 / pp.23-41 / 2019
  • This study combines the concept of degree centrality in complex network with the Term Frequency $^*$ Proportional Document Frequency ($TF^*PDF$) algorithm; the combined method, called PMCN (PDF-Modified similarity and Complex Network), constructs relationship networks among sentences for writing news summaries. The PMCN method is a multi-document summarization extension of the ideas of Bun and Ishizuka (2002), who first published the $TF^*PDF$ algorithm for detecting hot topics. In their $TF^*PDF$ algorithm, Bun and Ishizuka defined the publisher of a news item as its channel. If the PDF weight of a term is higher than the weights of other terms, then the term is hotter than the other terms. However, this study attempts to develop summaries for news items. Because the $TF^*PDF$ algorithm summarizes daily news, PMCN replaces the concept of "channel" with "the date of the news event", and uses the resulting chronicle ordering for a multi-document summarization algorithm, of which the F-measure scores were 0.042 and 0.051 higher than LexRank for the famous d30001t and d30003t tasks, respectively.
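
The $TF^*PDF$ weighting mentioned in the abstract above can be sketched in a few lines. This is a minimal illustration, assuming the commonly cited form of the weight, $W_j = \sum_c |F_{jc}| \exp(n_{jc}/N_c)$, where $|F_{jc}|$ is the frequency of term $j$ in channel $c$ normalized by the channel's term-frequency vector length, $n_{jc}$ is the number of documents in channel $c$ containing the term, and $N_c$ is the number of documents in the channel. The function name `tf_pdf` and the input layout are hypothetical, and this is not the PMCN implementation; following the abstract, the channel keys here are dates rather than publishers.

```python
import math
from collections import Counter


def tf_pdf(channels):
    """Return {term: TF*PDF weight}.

    channels: dict mapping a channel id (here, a date string) to a list
    of documents, each document being a list of tokens. The layout is an
    assumption made for this sketch.
    """
    weights = Counter()
    for docs in channels.values():
        if not docs:
            continue
        # F_jc: how often each term occurs in this channel
        term_freq = Counter(tok for doc in docs for tok in doc)
        # n_jc: number of documents in this channel containing the term
        doc_freq = Counter(tok for doc in docs for tok in set(doc))
        # Length of the channel's term-frequency vector, for normalization
        norm = math.sqrt(sum(f * f for f in term_freq.values()))
        for term, f in term_freq.items():
            weights[term] += (f / norm) * math.exp(doc_freq[term] / len(docs))
    return dict(weights)


news = {
    "2019-01-01": [["pdf", "summary"], ["pdf", "news"]],
    "2019-01-02": [["pdf", "network"]],
}
weights = tf_pdf(news)
hottest = max(weights, key=weights.get)
```

A term that appears in a large share of the documents of many channels accumulates the highest weight, which is what makes it "hotter" in the sense described above.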

Design and Implementation of the Document HTML System for Preserving Content Integrity

  • Hyun Cheon Hwang;Ji Su Park;Jin Gon Shon
    • Journal of Information Processing Systems / v.19 no.3 / pp.334-346 / 2023
  • PDF-based electronic documents have been widely used in communication between an enterprise and its customers to deliver personalized content. However, PDF-based electronic documents with paper-style layouts are not suitable for mobile environments because of their low readability and lack of interactivity. Even though HTML is an essential language in a mobile environment, PDF-based electronic documents are still used because PDF offers content integrity verification with a digital signature. This means that users sacrifice user experience in a mobile environment for content integrity and keep using paper-layout electronic documents. In this research, we design the Document HTML specification by setting the Document HTML conformance, adding extended meta tags, and signing the message digest with a digital signature based on public key infrastructure (PKI). Furthermore, we implemented the Document HTML system, which has REST API services to generate and verify Document HTML, and verified the approach experimentally. As a result, we confirmed that Document HTML provides both content integrity and a good user experience on mobile. Document HTML is therefore expected to be an alternative document format for delivering personalized content from an enterprise to a customer in a mobile environment, in place of paper-layout electronic documents such as PDF.
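
The idea of carrying a message digest in an extended meta tag can be illustrated with a small sketch. Everything specific here is an assumption made for illustration: the tag name `doc-digest`, the choice of SHA-256, and the functions `embed_digest` and `verify_digest` do not come from the Document HTML specification. The PKI signing step is deliberately left out because it needs a cryptography library; without that signature, the digest alone detects accidental changes but not a deliberate forgery, since anyone could recompute it.

```python
import hashlib
import re

# Hypothetical meta tag name; the real specification defines its own tags.
META_TEMPLATE = '<meta name="doc-digest" content="{}">'


def embed_digest(body_html):
    """Wrap body_html in a page whose head carries the body's SHA-256 digest."""
    digest = hashlib.sha256(body_html.encode("utf-8")).hexdigest()
    head = META_TEMPLATE.format(digest)
    return f"<html><head>{head}</head><body>{body_html}</body></html>"


def verify_digest(document):
    """Recompute the body digest and compare it with the embedded one."""
    meta = re.search(r'<meta name="doc-digest" content="([0-9a-f]{64})">', document)
    body = re.search(r"<body>(.*)</body>", document, re.S)
    if not meta or not body:
        return False
    actual = hashlib.sha256(body.group(1).encode("utf-8")).hexdigest()
    return actual == meta.group(1)


page = embed_digest("<p>Your bill: $10</p>")
untouched_ok = verify_digest(page)
tampered_ok = verify_digest(page.replace("$10", "$99"))
```

In a full system the embedded digest would itself be signed with the sender's private key, and the verifier would check that signature against a certificate before trusting the digest.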

Automatic Generation of Interactive 3D PDF Document in a 3D Viewer Environment (CAD 뷰어 기반 대화형 3D PDF 문서 생성 자동화)

  • Park, Kyeong-Ho;Choi, Young;Yang, Sang-Wook;Song, In-Ho
    • Journal of the Korean Society for Precision Engineering / v.25 no.4 / pp.77-85 / 2008
  • PDF is widely accepted as a standard document format, and it now supports 3D content as well. In engineering applications, this new 3D feature may be used to support the sharing of 3D documents and thus collaboration between engineering departments, suppliers, and partners. In this paper, we describe a system that automatically generates formatted engineering documents including 3D data converted from 3D applications such as a commercial 3D CAD viewer. The system consists of two major modules: a U3D conversion module and a PDF conversion module. The U3D conversion module extracts geometry, view data, and assembly and disassembly information from the 3D viewer and converts it to U3D format, currently as an IDTF text file. The PDF conversion module generates a PDF file and inserts the U3D data, various annotation information, and scripts for custom operations such as assembly and disassembly into the PDF document.

Study on Methods of Digitalization of Older Books Using PDF (PDF를 활용한 고문헌의 원문디지털화 방안에 대한 고찰)

  • Lee, Sang-Yong
    • Journal of the Korean Society for Library and Information Science / v.34 no.1 / pp.133-153 / 2000
  • This article is a study on methods of digitalization for older books using PDF (Portable Document Format) as supported by Acrobat 4.0, which was introduced in April 1999. Acrobat 3.0 caused many problems in supporting the Korean language, or Hangul. However, the revised 4.0 version of this software made the conversion of Korean, Japanese, and Chinese text possible through its support for multi-language fonts. Therefore, it is possible to convert and edit the text files of older books written in Hangul. The Acrobat Reader, the viewer for PDF, can be downloaded for free from its website. The text of older books digitalized with PDF still has some problems, but users can easily retrieve the text of older books from the Internet.

Improvement of the PDF Standard to Apply Long Term Electronic Signatures (전자서명 장기검증 기능 적용을 위한 PDF 표준 개선방안)

  • Park, Sunwoo;Jung, Jaewook;Won, Dongho
    • Proceedings of the Korean Society of Computer Information Conference / 2012.07a / pp.381-384 / 2012
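  • The PDF standard, designated as international standard ISO 32000-1 in 2008, sought to secure the trustworthiness of PDF documents by also providing a standard for electronic signatures. However, the electronic signature provisions included in ISO 32000-1 are not suitable for guaranteeing the trustworthiness of documents preserved over the long term, because the validity of an electronic signature cannot be verified once the certificate used for signing has expired. This paper therefore analyzes the electronic signature provisions of ISO 32000-1, the international PDF standard, and presents a way to apply long-term validation of electronic signatures. With the proposals made in this paper, it should be possible to provide long-term signature validation that is interoperable across various PDF software, and thereby to improve the trustworthiness of PDF documents.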

A Study on Tools to Develop Electronic Documents (전자문헌 개발도구에 관한 고찰 - SGML, HTML과 PDF를 중심으로 -)

  • Kim, Yong;NamKoong, Hwang
    • Journal of Information Management / v.29 no.1 / pp.1-19 / 1998
  • With developments in computing and networking technologies, national support for and attention to building digital libraries, which aim to overcome the limits of time and location in using information resources, are increasing. To accomplish the main goal of a digital library, which is to share and transfer information freely over a network, standardization in developing electronic documents is becoming more important. Several tools for developing the electronic documents to be used in digital libraries have been created for documents on the WWW, but none of them has an absolute advantage over the others; each has comparative advantages and disadvantages for making electronic documents. This study reviews the features of SGML, HTML, and PDF, which will be used to develop electronic documents in digital libraries, and analyzes their comparative advantages and disadvantages. On that basis, it proposes the electronic document format suited to each type of information resource.

A Study on the Selection of Preservation Format for Long-Term Preservation of Electronic Records (전자기록물의 장기보존을 위한 보존포맷 선정 방안에 관한 연구)

  • Han, Hui-Jeong;Oh, Hyo-Jung;Yang, Dongmin
    • Journal of Korean Society of Archives and Records Management / v.20 no.1 / pp.69-87 / 2020
  • For the long-term preservation of document-type electronic records, the National Archives of Korea has chosen PDF/A-1 as the preservation format named as the document file format, and established it as a public standard. The only option of selecting PDF/A-1 restricts the use of various electronic file formats that can or must be applied to actual works as IT advances and tasks change. Moreover, it is difficult to apply PDF/A-1 to other types of electronic records (administrative information datasets, audiovisual records, web records, etc.). Therefore, it is necessary to diversify the preservation formats of electronic records. We suggest a framework for selecting various preservation formats. Furthermore, we propose common criteria and evaluation methods frequently applied to all electronic records when selecting a preservation format, and introduce a methodology for deriving intrinsic criteria applied to each type of electronic records.

PDFindexer: Distributed PDF Indexing system using MapReduce

  • Murtazaev, JAziz;Kihm, Jang-Su;Oh, Sangyoon
    • International Journal of Internet, Broadcasting and Communication / v.4 no.1 / pp.13-17 / 2012
  • Indexing converts a raw document collection into an easily searchable representation. Web search by Google or Yahoo provides subsecond response times, which is made possible by efficient indexing of web pages over the entire Web. The indexing process becomes challenging as the scale grows. Parallel techniques, such as the MapReduce framework, can assist in efficient large-scale indexing. In this paper we propose PDFindexer, a system for indexing scientific papers in PDF using the MapReduce programming model. Unlike Web search engines, our target domain is scientific papers, which have a pre-defined structure such as title, abstract, sections, and references. Our proposed system parses scientific papers in PDF, recreates their structure, and performs efficient distributed indexing with the MapReduce framework on a cluster of nodes. We provide an overview of the system, its components, and the interactions among them. We discuss some issues related to the design of the system and the use of MapReduce in parsing and indexing a large document collection.
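
The map and reduce steps of structure-aware indexing described above can be sketched in a single process. This is only an illustration of the programming model, not the PDFindexer code: the function names, the input layout (papers already parsed into named sections), and the posting format `(doc_id, section)` are assumptions, and a real deployment would run the two phases on a cluster framework rather than in a local loop.

```python
from collections import defaultdict


def map_phase(doc_id, sections):
    """Emit (word, (doc_id, section)) pairs for one parsed paper."""
    for section, text in sections.items():
        for word in text.lower().split():
            yield word, (doc_id, section)


def reduce_phase(word, postings):
    """Merge all postings for one word into a sorted, duplicate-free list."""
    return word, sorted(set(postings))


def build_index(papers):
    """Run map, shuffle (group by word), and reduce locally."""
    groups = defaultdict(list)
    for doc_id, sections in papers.items():
        for word, posting in map_phase(doc_id, sections):
            groups[word].append(posting)  # shuffle: group values by key
    return dict(reduce_phase(word, postings) for word, postings in groups.items())


papers = {
    "p1": {"title": "Distributed PDF Indexing", "abstract": "indexing with MapReduce"},
    "p2": {"title": "PDF summarization"},
}
index = build_index(papers)
```

Keeping the section name in each posting is what lets a query be restricted to, say, titles only, which is the benefit of recreating the paper's structure before indexing.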