• Title/Summary/Keyword: Document Based Database

Style Control of Structured Documents using DSSSL

  • Lee, Kyong-Ho;Lee, Jin-Ho;Choy, Yoon-Chul
    • Proceedings of the Korea Database Society Conference
    • /
    • 1997.10a
    • /
    • pp.455-462
    • /
    • 1997
  • SGML (Standard Generalized Markup Language) is the ISO standard for describing the logical structure of documents and has also been adopted as the CALS standard for document description. Since its adoption, interest in SGML applications has grown in a variety of fields. However, because SGML does not provide a standard method for describing processing information such as formatting and transformation, most applications have relied on system-dependent methods. Recently, ISO defined DSSSL (Document Style Semantics and Specification Language) as a standard mechanism for specifying the formatting, transformation, and retrieval of structured documents. In this paper, we present a DSSSL processing system for style control of structured documents such as SGML documents. The system processes a DSSSL style sheet that describes the layout of documents and displays the result of applying it to an SGML document. We have successfully tested the system on many SGML documents and DSSSL style sheets. We are now developing an SGML document management system that supports the creation, editing, storage, and retrieval of SGML documents, based on the DSSSL processor and the SGML parser that we have developed.

Automatic Preference Rating using User Profile in Content-based Collaborative Filtering System (내용 기반 협력적 여과 시스템에서 사용자 프로파일을 이용한 자동 선호도 평가)

  • 고수정;최성용;임기욱;이정현
    • Journal of KIISE:Software and Applications
    • /
    • v.31 no.8
    • /
    • pp.1062-1072
    • /
    • 2004
  • Collaborative filtering systems based on a {user-document} matrix are effective in recommending web documents to users. However, their recommendation accuracy suffers from the first-rater problem and from sparsity. This paper proposes an automatic preference rating method that generates a user profile to address these shortcomings. The profile used in this paper is a content-based collaborative user profile, generated by combining a content-based user profile with a collaborative user profile using the mutual information method. The collaborative user profile is based on the {user-document} matrix of the collaborative filtering system, whereas the content-based user profile is generated by relevance feedback in the content-based filtering system. After normalizing the combined content-based collaborative user profiles, the method automatically rates user preferences by reflecting the normalized profiles in the {user-document} matrix of the collaborative filtering system. We evaluated our method on a large database of user ratings for web documents and confirmed that it was more efficient than existing methods.
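
The two shortcomings named in this abstract can be made concrete with a small sketch. The matrix layout (rows are users, columns are documents, `None` marks a missing rating) and the function names are assumptions made for illustration; this is not the authors' rating method, only a way to measure the problems it targets.

```python
def sparsity(matrix):
    """Fraction of {user-document} cells that hold no rating (None)."""
    cells = [rating for row in matrix for rating in row]
    return sum(rating is None for rating in cells) / len(cells)


def unrated_documents(matrix):
    """Column indices of documents that no user has rated yet.

    Such documents cannot be recommended by pure collaborative
    filtering, which is the first-rater problem.
    """
    n_docs = len(matrix[0])
    return [j for j in range(n_docs) if all(row[j] is None for row in matrix)]


# Hypothetical ratings: 3 users x 4 documents.
ratings = [
    [5, None, None, None],
    [None, 3, None, None],
    [4, None, None, 1],
]

if __name__ == "__main__":
    print(sparsity(ratings))
    print(unrated_documents(ratings))
```

An automatic rating method of the kind described would aim to fill the `None` cells, lowering the sparsity and giving unrated documents a first rating.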

A Study of the Influence of Choice of Record Fields on Retrieval Performance in the Bibliographic Database (서지 데이터베이스에서의 레코드 필드 선택이 검색 성능에 미치는 영향에 관한 연구)

  • Heesop Kim
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.35 no.4
    • /
    • pp.97-122
    • /
    • 2001
  • This empirical study investigated how the choice of record field(s) to search affects retrieval performance for a large operational bibliographic database. The query terms used in the study were identified algorithmically from each target set in four different ways: (1) controlled terms derived from index term frequency weights, (2) uncontrolled terms derived from index term frequency weights, (3) controlled terms derived from inverse document frequency weights, and (4) uncontrolled terms based on inverse document frequency weights. Six possible choices of record field were recognised. Using INSPEC terminology, these were the fields: (1) Abstract, (2) 'Anywhere' (i.e., all fields), (3) Descriptors, (4) Identifiers, (5) 'Subject' (i.e., 'Descriptors' plus 'Identifiers'), and (6) Title. The study was undertaken in an operational web-based IR environment using the INSPEC bibliographic database. Retrieval performance was evaluated using the D measure (bivariate in Recall and Precision). The main findings were that: (1) there exist significant differences in search performance arising from the choice of field, using 'mean performance measure' as the criterion statistic; (2) the rankings of field choices for each of these performance measures are sensitive to the choice of query; and (3) the optimal choice of field for the D measure is Title.
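
The two weighting schemes mentioned in this abstract, term frequency and inverse document frequency, can be sketched as follows. This uses the common textbook definitions (raw counts and `log(N / df)`); the exact weights used in the study are not given in the abstract, and the function names and sample records are hypothetical.

```python
import math
from collections import Counter


def term_frequency(records):
    """Total number of occurrences of each term across the records."""
    counts = Counter()
    for terms in records:
        counts.update(terms)
    return counts


def inverse_document_frequency(records):
    """IDF weight log(N / df) for each term, where df is the number of
    records that contain the term at least once."""
    n = len(records)
    df = Counter()
    for terms in records:
        df.update(set(terms))
    return {term: math.log(n / count) for term, count in df.items()}


# Hypothetical index terms for three bibliographic records.
records = [
    ["xml", "database"],
    ["xml", "retrieval"],
    ["sgml"],
]

if __name__ == "__main__":
    print(term_frequency(records).most_common(2))
    print(inverse_document_frequency(records))
```

Ranking candidate query terms by either weight gives two different term selections, which is the kind of contrast the four query-generation methods rely on.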

Mapping from XML DTD to RDB Schema based on Object Model (객체모델을 기반으로 한 XML DTD의 RDB 스키마로의 변환 방법)

  • 이상태;이정수;주경수
    • Proceedings of the IEEK Conference
    • /
    • 2001.06c
    • /
    • pp.113-116
    • /
    • 2001
  • XML (Extensible Markup Language) is a flexible way to create common information formats and to share both the format and the data on the World Wide Web, intranets, and elsewhere. A document type definition (DTD) is a specific definition that follows the rules of the Standard Generalized Markup Language (SGML). A relational database management system (RDBMS) is a program that lets you create, update, and administer a relational database. An RDBMS takes Structured Query Language (SQL) statements entered by a user or contained in an application program and creates, updates, or provides access to the database. This paper studies a method of mapping from XML DTDs to RDB schemas based on an object model.
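
As a rough illustration of what a DTD-to-relational mapping can look like, the sketch below turns a simplified, hand-written description of DTD element declarations into `CREATE TABLE` statements. The dictionary format, the one-table-per-element rule, and the function name are assumptions for illustration; the abstract does not describe the paper's object-model-based mapping in enough detail to reproduce it.

```python
def dtd_to_tables(elements):
    """Map a simplified DTD description to CREATE TABLE statements.

    `elements` maps an element name to its attribute names ("attrs")
    and child element names ("children"). Each element becomes a table,
    and each child table gets a foreign key to its parent table.
    """
    parent_of = {}
    for name, spec in elements.items():
        for child in spec["children"]:
            parent_of[child] = name

    statements = []
    for name, spec in elements.items():
        columns = [f"{name}_id INTEGER PRIMARY KEY"]
        columns += [f"{attr} TEXT" for attr in spec["attrs"]]
        if name in parent_of:
            parent = parent_of[name]
            columns.append(
                f"{parent}_id INTEGER REFERENCES {parent}({parent}_id)"
            )
        statements.append(f"CREATE TABLE {name} ({', '.join(columns)});")
    return statements


# Hypothetical DTD: <!ELEMENT book (chapter*)> with attribute isbn,
# and <!ELEMENT chapter (#PCDATA)> with attribute title.
book_dtd = {
    "book": {"attrs": ["isbn"], "children": ["chapter"]},
    "chapter": {"attrs": ["title"], "children": []},
}

if __name__ == "__main__":
    for statement in dtd_to_tables(book_dtd):
        print(statement)
```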

A Handwritten Document Digitalization Framework based Defect Management System in Educational Facilities (수기문서 전자화 프레임워크 기반의 교육시설 하자관리 시스템)

  • Son, Bong-Ki
    • The Journal of Sustainable Design and Educational Environment Research
    • /
    • v.9 no.3
    • /
    • pp.1-11
    • /
    • 2010
  • In the construction industry, IT-based information systems have been applied in diverse ways to increase productivity. Although IT devices such as PDAs, RFID, barcodes, wireless networks, and web cameras have been introduced to gather information on construction sites, their effect is limited because they create additional work for engineers. In this paper, we propose a defect management system based on a handwritten document digitalization framework in order to demonstrate the applicability of a new IT device, the digital pen. With the proposed system, defect information can be gathered and entered into the defect management system effectively by using a digital pen and paper in the conventional way. Applying the digital pen as a data-gathering device for defect management can increase productivity by improving the work process and by building and utilizing a high-quality defect information database.

Integration of Gear Design Data using XML in the Web-based Environment (웹 기반 환경에서 XML을 이용한 기어 설계 데이터의 통합)

  • 정태형;박승현
    • Proceedings of the Korean Society of Precision Engineering Conference
    • /
    • 2001.04a
    • /
    • pp.627-630
    • /
    • 2001
  • XML is suitable for integrating various forms of engineering design data since it possesses the characteristics of both documents and data. In this research, a web-based design system has been developed that integrates various gear design data in the form of XML. The system generates XML documents containing gear design data and automatically transforms gear design data in the relational database into XML document form. The XML documents are transmitted to a gear modeler agent through SOAP; the agent is then automatically executed and generates CAD model files and VRML files. The designer can check the generated VRML model of the gear immediately in the web service.
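
The step of transforming relational data into XML document form can be sketched with the Python standard library. The table name, column names, and tag names below are hypothetical, and the sketch covers only the row-to-element conversion, not the SOAP transport or the CAD/VRML generation described in the abstract.

```python
import sqlite3
import xml.etree.ElementTree as ET


def rows_to_xml(conn, table, root_tag, row_tag):
    """Serialize every row of `table` as a child element of `root_tag`,
    with one sub-element per column.

    `table` is interpolated into the SQL text, so it must come from
    trusted code, never from user input.
    """
    cursor = conn.execute(f"SELECT * FROM {table}")
    columns = [description[0] for description in cursor.description]
    root = ET.Element(root_tag)
    for row in cursor:
        item = ET.SubElement(root, row_tag)
        for column, value in zip(columns, row):
            ET.SubElement(item, column).text = str(value)
    return ET.tostring(root, encoding="unicode")


def demo():
    """Build a hypothetical in-memory gear table and convert it."""
    conn = sqlite3.connect(":memory:")
    conn.execute("CREATE TABLE gear (module REAL, teeth INTEGER)")
    conn.execute("INSERT INTO gear VALUES (2.5, 20)")
    return rows_to_xml(conn, "gear", "gears", "gear")


if __name__ == "__main__":
    print(demo())
```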

A Web-based Unified Design Methodology using XML Applications (XML을 이용한 웹기반 정보 관리 통합설계 방법론)

  • 김경수;신현철;장희선
    • Journal of the Korea Society of Computer and Information
    • /
    • v.7 no.4
    • /
    • pp.157-162
    • /
    • 2002
  • In this paper, we implement XML and data modeling using a UML tool, in which the class diagram is constructed from the sequence diagram after the use case diagram has been made. For the XML modeling, guidelines are presented for transforming a UML class into an XML document, and an example of deriving the XML DTD from the UML class is shown. Furthermore, through the proposed data modeling, integrated design methods are proposed for transforming the UML class into a relational database schema, an object-relational database schema, and an object-oriented database schema. Finally, we present the schema for each database system.

Development and Operation of Integrated Technical Information System(ITIS) for an Aircraft Development (항공기 통합기술정보시스템(ITIS) 개발 및 운용)

  • Chung, Joon-Young;Lee, Joon-Woo;Kim, Cheon-Young
    • The Journal of the Korea Contents Association
    • /
    • v.6 no.2
    • /
    • pp.75-83
    • /
    • 2006
  • The Aircraft Development Department operated a Technical Document Management System, built on a database management system, to manage the technical information created during the research and development phase of an aircraft; the technical information was managed separately for each project across many aircraft development programs. Managing users and technical information per project caused problems with the Work Memo workflow, with searching technical information, and so on. As a result, we developed the web-based Integrated Technical Information System (ITIS), which can manage the program-specific Technical Document Management Systems as a whole. With this system built and in operation, users can access each program according to their user account and privileges, and research and development productivity has increased dramatically because the Work Memo workflow and technical information searches are performed through the integrated ITIS screen.

Web Document Classification Based on Hangeul Morpheme and Keyword Analyses (한글 형태소 및 키워드 분석에 기반한 웹 문서 분류)

  • Park, Dan-Ho;Choi, Won-Sik;Kim, Hong-Jo;Lee, Seok-Lyong
    • The KIPS Transactions:PartD
    • /
    • v.19D no.4
    • /
    • pp.263-270
    • /
    • 2012
  • With the current development of high-speed Internet and massive database technology, the number of web documents is increasing rapidly, and thus classifying those documents automatically is becoming important. In this study, we propose an effective method to extract document features based on Hangeul morpheme and keyword analyses, and to classify unstructured documents automatically by predicting their subjects. To extract document features, we first select terms using a morpheme analyzer, then form the keyword set based on term frequency and subject-discriminating power, and finally score each keyword using the discriminating power. We then generate the classification model using commercial software that implements decision tree, neural network, and SVM (support vector machine) classifiers. Experimental results show that the proposed feature extraction method achieves considerable performance in classifying web documents by subject, i.e., an average precision of 0.90 and a recall of 0.84 for the decision tree.
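
The precision and recall figures reported in this abstract follow the standard definitions, which can be computed per subject as in the sketch below. The function name and the sample labels are hypothetical, and the abstract does not say how the per-subject values were averaged.

```python
def precision_recall(true_labels, predicted_labels, subject):
    """Precision and recall for one subject.

    precision = TP / (TP + FP), recall = TP / (TP + FN),
    where a positive is a document assigned to `subject`.
    """
    pairs = list(zip(true_labels, predicted_labels))
    tp = sum(1 for t, p in pairs if t == subject and p == subject)
    fp = sum(1 for t, p in pairs if t != subject and p == subject)
    fn = sum(1 for t, p in pairs if t == subject and p != subject)
    precision = tp / (tp + fp) if tp + fp else 0.0
    recall = tp / (tp + fn) if tp + fn else 0.0
    return precision, recall


# Hypothetical true and predicted subjects for four web documents.
true_subjects = ["sports", "sports", "news", "news"]
predicted_subjects = ["sports", "news", "sports", "sports"]

if __name__ == "__main__":
    print(precision_recall(true_subjects, predicted_subjects, "sports"))
```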

Semantic Similarity Measures Between Words within a Document using WordNet (워드넷을 이용한 문서내에서 단어 사이의 의미적 유사도 측정)

  • Kang, SeokHoon;Park, JongMin
    • Journal of the Korea Academia-Industrial cooperation Society
    • /
    • v.16 no.11
    • /
    • pp.7718-7728
    • /
    • 2015
  • Semantic similarity between words can be applied in many fields, including computational linguistics, artificial intelligence, and information retrieval. In this paper, we present a weighted method for measuring the semantic similarity between words in a document. The method uses the edge distance and depth in WordNet and calculates the semantic similarity between words on the basis of document information, namely word term frequencies (TF) and word concept frequencies (CF). Each word's weight is calculated from its TF and CF in the document. The method thus combines the edge distance between words, the depth of the subsumer, and the word weight in the document. We compared our scheme with other methods experimentally, and the proposed method outperformed the other similarity measures. Other methods, which are based simply on shortest distance or depth, have difficulty representing or merging this information. By considering shortest distance, depth, and the information of words in the document together, the proposed method improved performance.
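
Edge distance and subsumer depth, the two WordNet quantities this abstract builds on, can be illustrated on a toy taxonomy. The sketch below uses the well-known Wu-Palmer measure as a stand-in; it is not the paper's weighted method, which additionally uses TF- and CF-based word weights whose formula the abstract does not give. The taxonomy and all names are hypothetical.

```python
# Toy is-a taxonomy: each concept maps to its parent (None for the root).
PARENT = {
    "entity": None,
    "animal": "entity",
    "dog": "animal",
    "cat": "animal",
    "vehicle": "entity",
    "car": "vehicle",
}


def ancestors(node):
    """Path from `node` up to the root, starting with `node` itself."""
    path = []
    while node is not None:
        path.append(node)
        node = PARENT[node]
    return path


def depth(node):
    """Number of nodes on the path to the root (the root has depth 1)."""
    return len(ancestors(node))


def subsumer(a, b):
    """Deepest concept that is an ancestor of both `a` and `b`."""
    ancestors_of_a = set(ancestors(a))
    for node in ancestors(b):
        if node in ancestors_of_a:
            return node
    raise ValueError("concepts share no ancestor")


def edge_distance(a, b):
    """Number of edges on the shortest path through the subsumer."""
    d = depth(subsumer(a, b))
    return (depth(a) - d) + (depth(b) - d)


def wu_palmer(a, b):
    """Wu-Palmer similarity: 2 * depth(subsumer) / (depth(a) + depth(b))."""
    return 2 * depth(subsumer(a, b)) / (depth(a) + depth(b))


if __name__ == "__main__":
    print(edge_distance("dog", "cat"), wu_palmer("dog", "cat"))
    print(edge_distance("dog", "car"), wu_palmer("dog", "car"))
```

A document-aware method of the kind described would scale such a score by per-word weights derived from the document.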