• Title/Summary/Keyword: Text Databases

An Automatic Tagging System and Environments for Construction of Korean Text Database

  • Lee, Woon-Jae;Choi, Key-Sun;Lim, Yun-Ja;Lee, Yong-Ju;Kwon, Oh-Woog;Kim, Hiong-Geun;Park, Young-Chan
    • Proceedings of the Acoustical Society of Korea Conference
    • /
    • 1994.06a
    • /
    • pp.1082-1087
    • /
    • 1994
  • A set of text databases is indispensable to probabilistic models for speech recognition, linguistic modeling, and machine translation. We introduce an environment for constructing text databases: an automatic tagging system and a set of tools for lexical knowledge acquisition, which together provide automatic part-of-speech recognition and guessing.

  • PDF

The Historical Development of Information Retrieval Systems (정보검색 발전사)

  • 사공철;서경주
    • Journal of the Korean Society for Information Management
    • /
    • v.13 no.2
    • /
    • pp.19-37
    • /
    • 1996
  • The development of information retrieval from the 1950s to the 1990s is described chronologically. For each decade, the following information retrieval systems are examined: post-coordinate and KWIC indexing methods for the 1950s; off-line and experimental on-line systems for the 1960s; on-line and full-text retrieval systems for the 1970s; full-text databases, on-line interfaces, and overseas and domestic on-line databases for the 1980s; and finally, for the 1990s, CD-ROM, multimedia, hypertext, and the Internet. Prospects for the future are also discussed.

  • PDF

Commercial Databases : The Keypoints and Practical Use(3) - Journal Articles and Books - (상용(商用) 데이터베이스 : 요점(要點)과 활용(活用)(3) - 잡지(雜誌).도서(圖書) -)

  • Cho, Jae-Ho
    • Journal of Information Management
    • /
    • v.24 no.4
    • /
    • pp.58-77
    • /
    • 1993
  • Databases of journals and books are categorized into bibliographic databases and clearinghouse-type databases, which tell users where the original materials are located. The latter type is unlikely to be commercialized, even though the users' ultimate purpose is to obtain the original materials. On-line photocopying services and full-text databases are currently available, but these alone cannot provide all the information users need. This paper describes the major database services and how to use them by document type (journal or book). The author also discusses what future information centers that utilize databases should be.

  • PDF

User Access and Preferences to Full-text Databases When Searching Individual and Integrated Databases (데이터베이스통합이 유용성과 이용자선호도에 미치는 영향)

  • 박소연
    • Proceedings of the Korean Society for Information Management Conference
    • /
    • 1999.08a
    • /
    • pp.157-162
    • /
    • 1999
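  • This study compared usability, user preference, and user satisfaction when users in a distributed environment searched multiple databases individually and when they searched them through an integrated interface. Twenty-eight graduate students enrolled in the School of Communication, Information, and Library Studies at Rutgers University participated in the study. User preference and satisfaction differed to a statistically significant degree between the two systems. That is, many participants preferred the separate interface to the integrated interface and were more satisfied with the search results of the separate interface. Despite the convenience and efficiency of the integrated interface, one of the main reasons participants preferred the separate interface was that it let them select and control the databases themselves.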

  • PDF

A Study on Building Society Research Information System (학회 학술정보시스템 구축에 관한 연구)

  • 조현양;최선희
    • Journal of Korean Library and Information Science Society
    • /
    • v.30 no.3
    • /
    • pp.405-426
    • /
    • 1999
  • Academic societies in science and technology are major producers of domestic research information, which is an important resource for researchers, students, and others. KORDIC built an integrated information system that facilitates database construction and promotes users' easy access to the databases. To build an efficient society research information system, we investigated earlier cases and analyzed the requirements of each society. We identified the principal information sources and built an integrated information service system using an Internet homepage and an information retrieval system (KRISTAL-II). In the future, we will expand the number of participating societies and focus on text-based information.

  • PDF

Implementation of text to speech terminal system by distributed database (데이터베이스 분산을 통한 소용량 문자-음성 합성 단말기 구현)

  • 김영길;박창현;양윤기
    • Proceedings of the IEEK Conference
    • /
    • 2003.07e
    • /
    • pp.2431-2434
    • /
    • 2003
  • In this research, our goal is to realize a Korean distributed TTS system with server/client functions over a wireless network. The speech databases and some routines of the TTS system reside on the server, which has greater processing capability. We built Korean speech databases and studied a database design suitable for distributed TTS. We designed a terminal with the minimum configuration needed to run this TTS, and we designed a suitable protocol so that the operation of the distributed TTS can be verified.

  • PDF

Collection Fusion Algorithm in Distributed Multimedia Databases (분산 멀티미디어 데이터베이스에 대한 수집 융합 알고리즘)

  • Kim, Deok-Hwan;Lee, Ju-Hong;Lee, Seok-Lyong;Chung, Chin-Wan
    • Journal of KIISE:Databases
    • /
    • v.28 no.3
    • /
    • pp.406-417
    • /
    • 2001
  • With the advances in multimedia databases on the World Wide Web, it is becoming more important to provide users with the capability to search distributed multimedia data. While there have been many studies of database selection and collection fusion for text databases, multimedia databases on the Web are autonomous and heterogeneous, and they mainly use content-based retrieval. The collection fusion problem for multimedia databases concerns merging the results retrieved by content-based retrieval from heterogeneous multimedia databases on the Web. This problem is crucial for search in distributed multimedia databases; however, it has not been studied yet. This paper provides novel algorithms for processing the collection fusion of heterogeneous multimedia databases on the Web. We propose two heuristic algorithms for estimating the number of objects to be retrieved from local databases and an algorithm using linear regression. Extensive experiments show the effectiveness and efficiency of these algorithms. These algorithms can provide the basis for distributed content-based retrieval algorithms for multimedia databases on the Web.

  • PDF
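
The collection fusion step described in the abstract above can be illustrated with a generic sketch. This is not the paper's heuristic or regression algorithm, which the abstract does not specify. It assumes each local database has already been given a weight (a hypothetical estimate of how useful it is for the query), splits a global budget of k results in proportion to those weights, and merges the local rankings after min-max normalizing their scores. The function names `allocate_quota` and `fuse` are illustrative only.

```python
import heapq


def allocate_quota(db_weights, k):
    """Split a global budget of k results across databases in proportion
    to their weights, using largest-remainder rounding."""
    total = sum(db_weights.values())
    if total <= 0:
        raise ValueError("at least one database needs a positive weight")
    raw = {db: k * w / total for db, w in db_weights.items()}
    quota = {db: int(share) for db, share in raw.items()}
    leftover = k - sum(quota.values())
    # Hand the remaining slots to the databases with the largest fractional parts.
    by_fraction = sorted(raw, key=lambda db: raw[db] - quota[db], reverse=True)
    for db in by_fraction[:leftover]:
        quota[db] += 1
    return quota


def fuse(local_results, quota, k):
    """Merge quota-limited local rankings into one global top-k list.

    local_results maps a database name to a list of (object_id, score)
    pairs. Scores from different databases are not comparable, so each
    list is min-max normalized to [0, 1] before merging (an assumption
    of this sketch).
    """
    merged = []
    for db, hits in local_results.items():
        top = sorted(hits, key=lambda h: h[1], reverse=True)[: quota.get(db, 0)]
        if not top:
            continue
        hi, lo = top[0][1], top[-1][1]
        for obj, score in top:
            norm = 1.0 if hi == lo else (score - lo) / (hi - lo)
            merged.append((norm, db, obj))
    return [(db, obj) for _, db, obj in heapq.nlargest(k, merged)]
```

Min-max normalization is only one way to make heterogeneous similarity scores comparable; it is chosen here because it needs no knowledge of how each local database computes its scores.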

A Study of the Behaviours in Searching Full-Text Databases - Subject Specialists vs. Professional Searchers - (전문데이터베이스의 탐색특성에 관한 연구 - 주제전문가와 탐색전문가 -)

  • Lee Eung-Bong
    • Journal of the Korean Society for Library and Information Science
    • /
    • v.30 no.2
    • /
    • pp.51-86
    • /
    • 1996
  • The primary purpose of this study is to verify the differences in behavioural characteristics between subject specialists and professional searchers when searching full-text databases. The major findings and conclusions are summarized as follows. The two groups differ significantly in three areas: search questions (the degree of understanding of the search questions, the degree of difficulty in selecting terms, and the degree of expectation of search results); search processes (the number of search terms used, the number of Boolean operators and qualifiers used, the number of documents browsed, and the search time, comprising the connect time, the time spent per output document, and the time spent per relevant output document); and search results (the searching efficiency, comprising the number of relevant documents, the recall ratio, and the precision ratio; the search cost, comprising the total search cost, the search cost per output document, and the search cost per relevant output document; and the degree of satisfaction with the search results).

  • PDF
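
The efficiency and cost measures named in the abstract above follow standard information retrieval definitions, sketched below. The abstract does not say how the study computed them, so the function names and the handling of empty sets are assumptions of this sketch.

```python
def recall_precision(retrieved, relevant):
    """Return (recall ratio, precision ratio) for one search.

    recall    = relevant documents retrieved / all relevant documents
    precision = relevant documents retrieved / all documents retrieved
    """
    retrieved, relevant = set(retrieved), set(relevant)
    hits = len(retrieved & relevant)
    recall = hits / len(relevant) if relevant else 0.0
    precision = hits / len(retrieved) if retrieved else 0.0
    return recall, precision


def cost_per_document(total_cost, n_documents):
    """Return the search cost per document, or None when there are no documents."""
    return total_cost / n_documents if n_documents else None
```

For a search that outputs four documents, two of them relevant, out of three relevant documents in the database, `recall_precision` gives a recall ratio of 2/3 and a precision ratio of 1/2. Passing the number of relevant output documents to `cost_per_document` gives the cost per relevant output document.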

Multi-Dimensional Keyword Search and Analysis of Hotel Review Data Using Multi-Dimensional Text Cubes (다차원 텍스트 큐브를 이용한 호텔 리뷰 데이터의 다차원 키워드 검색 및 분석)

  • Kim, Namsoo;Lee, Suan;Jo, Sunhwa;Kim, Jinho
    • Journal of Information Technology and Architecture
    • /
    • v.11 no.1
    • /
    • pp.63-73
    • /
    • 2014
  • With the advance of the WWW, unstructured data, including text, are attracting more and more user interest. Such data, created by WWW users, represent the users' subjective opinions, so appropriate analysis can yield very useful information such as users' personal tastes or perspectives. In this paper, we provide various analyses of unstructured text documents efficiently by taking advantage of OLAP (On-Line Analytical Processing) multidimensional cube technology. OLAP cubes have been widely used for multidimensional analysis of structured data such as simple alphabetic and numeric data, but they have not been used for unstructured data consisting of long texts. To provide multidimensional analysis of unstructured text data, the Text Cube model has been proposed recently. It incorporates term frequency and an inverted index, which play key roles in information retrieval, as measures for searching and analyzing text databases. The primary goal of this paper is to apply the text cube model to a real data set from an Internet site that shares hotel information and to provide multidimensional analysis of users' written hotel reviews. To achieve this goal, we first build text cubes for the hotel review data. Using the text cubes, we design and implement a system that provides multidimensional keyword search features for searching and analyzing review texts along various dimensions. This system can help users easily obtain valuable summary information on guests' subjective opinions. Furthermore, this paper evaluates the proposed system through various experiments, which reveal its effectiveness.
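
The idea of a text cube whose cells carry term frequency and an inverted index as measures can be sketched as follows. This is a toy in-memory illustration and not the system described in the abstract. The dimension names (`city`, `rating`), the whitespace tokenizer, and the functions `build_text_cube` and `search` are all assumptions made for the example.

```python
from collections import Counter, defaultdict
from itertools import product

ALL = "*"  # marks a dimension that has been aggregated away


def build_text_cube(docs, dims):
    """Build every cell of a small text cube.

    Each cell is keyed by one value (or ALL) per dimension and stores
    two measures: a term-frequency Counter and an inverted index that
    maps each term to the set of document ids containing it.
    """
    cube = defaultdict(lambda: {"tf": Counter(), "index": defaultdict(set)})
    for doc_id, doc in enumerate(docs):
        terms = doc["text"].lower().split()
        # A document contributes to every cell that generalizes its dimension values.
        for cell in product(*[(doc[d], ALL) for d in dims]):
            cube[cell]["tf"].update(terms)
            for term in set(terms):
                cube[cell]["index"][term].add(doc_id)
    return cube


def search(cube, cell, keyword):
    """Return (term frequency, sorted matching document ids) for a keyword in one cell."""
    entry = cube.get(cell)
    if entry is None:
        return 0, []
    kw = keyword.lower()
    return entry["tf"][kw], sorted(entry["index"].get(kw, ()))


reviews = [
    {"city": "Seoul", "rating": 5, "text": "clean room great view"},
    {"city": "Seoul", "rating": 2, "text": "noisy room"},
    {"city": "Busan", "rating": 5, "text": "great beach great staff"},
]
cube = build_text_cube(reviews, ["city", "rating"])
print(search(cube, ("Seoul", ALL), "room"))  # all Seoul reviews, any rating
print(search(cube, (ALL, 5), "great"))       # all 5-star reviews, any city
```

Materializing every cell keeps lookups simple but grows exponentially with the number of dimensions, so a real system would likely precompute only some cells and derive the rest on demand.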

Currents in Integrative Biochip Informatics

  • Kim, Ju-Han
    • Proceedings of the Korean Society for Bioinformatics Conference
    • /
    • 2001.10a
    • /
    • pp.1-9
    • /
    • 2001
  • scale genomic and postgenomic data means that many of the challenges in biomedical research are now challenges in computational sciences and information technology. The informatics revolutions in both clinical informatics and bioinformatics will change the current paradigm of biomedical sciences and the practice of clinical medicine, including diagnostics, therapeutics, and prognostics. Postgenome informatics, powered by high-throughput technologies and genomic-scale databases, is likely to transform our biomedical understanding forever, much the same way that biochemistry did a generation ago. In this talk, I will describe how these technologies will impact biomedical research and clinical care, emphasizing recent advances in biochip-based functional genomics. Basic data preprocessing with normalization and filtering, primary pattern analysis, and machine learning algorithms will be presented. Issues of integrated biochip informatics technologies, including multivariate data projection, gene-metabolic pathway mapping, automated biomolecular annotation, text mining of factual and literature databases, and integrated management of biomolecular databases, will be discussed. Each step will be illustrated with real examples from ongoing research activities in the context of clinical relevance. Issues of linking molecular genotype and clinical phenotype information will be discussed.

  • PDF