• Title/Summary/Keyword: text information

A Study on Research Trends of Graph-Based Text Representations for Text Mining (텍스트 마이닝을 위한 그래프 기반 텍스트 표현 모델의 연구 동향)

  • Chang, Jae-Young
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.13 no.5 / pp.37-47 / 2013
  • Text mining is a research area concerned with retrieving high-quality hidden information, such as patterns, trends, or distributions, by analyzing unformatted text. Because text mining assumes unstructured text, the text must first be represented with a simple model before it can be analyzed. So far, the most frequently used model is the VSM (Vector Space Model), in which a text is represented as a bag of words. Recently, however, much research has tried to apply graph-based text models to represent semantic relationships between words. In this paper, we survey research trends in graph-based text representation models for text mining. We also discuss future models for graph-based text mining.
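
The contrast this abstract draws between a bag-of-words model and a graph-based model can be sketched as follows. This is a minimal illustration, not a model taken from the surveyed papers: the function names, the co-occurrence window, and the sample sentence are all assumptions made for the example.

```python
from collections import Counter, defaultdict

def bag_of_words(tokens):
    """VSM-style view: term counts only, word order is discarded."""
    return Counter(tokens)

def cooccurrence_graph(tokens, window=2):
    """Graph view: an undirected edge links two distinct words that occur
    within `window` positions of each other; the weight counts how often."""
    graph = defaultdict(Counter)
    for i, word in enumerate(tokens):
        for other in tokens[i + 1 : i + window]:
            if other != word:
                graph[word][other] += 1
                graph[other][word] += 1
    return graph

# Hypothetical sample text.
tokens = "text mining finds hidden patterns in text".split()
bow = bag_of_words(tokens)
graph = cooccurrence_graph(tokens)

print(bow["text"])            # how often "text" occurs
print(sorted(graph["text"]))  # words adjacent to "text"
```

The bag of words records only that "text" occurs twice, while the graph also keeps which words it appeared next to, which is the kind of relationship between words that graph-based representations aim to preserve.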

Robust Recognition of a Player Name in Golf Videos (골프 동영상에서의 강건한 선수명 인식)

  • Jung, Cheol-Kon;Kim, Joong-Kyu
    • HCI Society of Korea: Conference Proceedings (한국HCI학회:학술대회논문집) / 2008.02a / pp.659-662 / 2008
  • In sports videos, text provides valuable information about the game, such as scores and information about the players. This paper proposes a robust method for recognizing player names in golf videos. In golf, most users want to search for scenes that contain shots by their favorite players. We use the text information in golf videos for robust extraction of player information: we obtain the text using OCR and then recognize the player by looking it up in a player name DB. This player information can then be used to search for scenes of favorite players. Experiments on several golf videos demonstrate that our method achieves strong performance with respect to robustness.
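
The lookup step described here, matching OCR output against a player name DB, might be sketched as below. The abstract does not say how the matching is done, so approximate string matching with Python's standard `difflib` is an assumption, as are the placeholder names, the function name, and the cut-off value.

```python
import difflib

# Hypothetical player name DB; the entries are placeholders.
PLAYER_DB = ["Kim Min-su", "Lee Ji-ho", "Park Sung-woo"]

def recognize_player(ocr_text, player_db=PLAYER_DB, cutoff=0.7):
    """Return the DB name most similar to the OCR text,
    or None if no name is similar enough."""
    by_lowercase = {name.lower(): name for name in player_db}
    matches = difflib.get_close_matches(
        ocr_text.strip().lower(), list(by_lowercase), n=1, cutoff=cutoff
    )
    return by_lowercase[matches[0]] if matches else None

print(recognize_player("K1m Min-su"))  # OCR confused "i" with "1"
print(recognize_player("SCORE -3"))    # caption text that is not a name
```

Tolerating small character errors is what makes the lookup robust to OCR noise; the cut-off trades missed names against false matches on other on-screen text.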

Consideration of a Robust Search Methodology that could be used in Full-Text Information Retrieval Systems (퍼지 논리를 이용한 사용자 중심적인 Full-Text 검색방법에 관한 연구)

  • Lee, Won-Bu
    • Asia pacific journal of information systems / v.1 no.1 / pp.87-101 / 1991
  • The primary purpose of this study was to investigate a robust search methodology that could be used in full-text information retrieval systems. A robust search methodology is one that can be used easily by a variety of users (particularly naive users) and that gives them comparable search performance regardless of their expertise or interests. To develop a possibly robust search methodology, a fully functional prototype of a fuzzy knowledge-based information retrieval system was developed. An experiment using this prototype was also designed to investigate the performance of the search methodology over a small exploratory sample of user queries. To probe the relationships between the possibly robust search performance and the query organization using fuzzy inference logic, the search performance of a shallow query structure was analyzed. The following noteworthy findings were obtained: 1) a hierarchical (tree-type) query structure might be a better query organization than a linear query structure; 2) compared with a complex tree query structure, a simple tree query structure with at most three levels might provide better search performance; 3) a fuzzy search methodology that employs a proper cut-off value might provide more efficient search performance than a Boolean search methodology. Although the findings could not be statistically verified because the experiments were done using a single replication, they provide valuable information for developing a possibly robust search methodology in full-text information retrieval.
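
A tree-type query evaluated with fuzzy logic and a cut-off value can be sketched as follows. The abstract does not specify the prototype's inference rules, so this sketch assumes the standard fuzzy-set operators (AND as minimum, OR as maximum); the index contents, function names, and cut-off values are hypothetical.

```python
def fuzzy_score(query, weights):
    """Score one document against a tree-shaped query.

    A query is either a term (str) or a tuple (op, subquery, ...) with
    op in {"and", "or"}. `weights` maps terms to degrees in [0, 1].
    Assumed operators: AND is the minimum, OR is the maximum.
    """
    if isinstance(query, str):
        return weights.get(query, 0.0)
    op, *children = query
    scores = [fuzzy_score(child, weights) for child in children]
    if op == "and":
        return min(scores)
    if op == "or":
        return max(scores)
    raise ValueError(f"unknown operator: {op}")

def search(query, index, cutoff):
    """Ids of documents whose score reaches the cut-off, best first."""
    scored = {doc: fuzzy_score(query, w) for doc, w in index.items()}
    hits = [doc for doc, score in scored.items() if score >= cutoff]
    return sorted(hits, key=lambda doc: scored[doc], reverse=True)

# Hypothetical index: document id -> degree to which each term describes it.
index = {
    "d1": {"fuzzy": 0.9, "retrieval": 0.8, "search": 0.2},
    "d2": {"fuzzy": 0.4, "retrieval": 0.0, "search": 0.7},
    "d3": {"fuzzy": 0.0, "retrieval": 0.9, "search": 0.9},
}
# Two-level tree query: fuzzy AND (retrieval OR search).
query = ("and", "fuzzy", ("or", "retrieval", "search"))

print(search(query, index, cutoff=0.5))
print(search(query, index, cutoff=0.3))
```

Lowering the cut-off admits partially matching documents that a strict Boolean match would reject, which is the trade-off behind finding 3.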

Cross-Lingual Text Retrieval Based on a Knowledge Base (지식베이스에 기반한 다언어 문서 검색)

  • Choi, Myeong-Bok;Jo, Jun
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.10 no.1 / pp.21-32 / 2010
  • The way a user formulates a query strongly affects the effectiveness of information retrieval when documents are retrieved from a general domain such as the web. This paper proposes an intelligent information retrieval method based on a cross-lingual knowledge base to perform cross-lingual text retrieval from the web effectively. Knowledge inferred from the cross-lingual knowledge base supports the user's word association, making it easier to compose accurate queries for cross-lingual text information retrieval. This paper develops a user query reformulation algorithm and evaluates it on Korean and English web documents. Experimental results show that the algorithm based on the proposed knowledge base is much more effective for cross-lingual text retrieval than retrieval without a knowledge base.

Finding Naval Ship Maintenance Expertise Through Text Mining and SNA

  • Kim, Jin-Gwang;Yoon, Soung-woong;Lee, Sang-Hoon
    • Journal of the Korea Society of Computer and Information / v.24 no.7 / pp.125-133 / 2019
  • Because special-purpose military weapon systems are small and complex, they are not easy to maintain. It is therefore very important to preserve combat strength through quick maintenance in the event of a breakdown. Naval ships in particular are complex weapon systems carrying a variety of equipment, so when one piece of equipment fails, other equipment must also be considered during maintenance; skilled maintenance personnel therefore have a great influence on rapid maintenance. In this paper, we analyzed maintenance data from the defense equipment maintenance information system, a system for managing military equipment efficiently, using text mining and social network analysis (SNA) to identify naval ship maintenance expertise. The data (2,538 cases) of some naval ship maintenance teams were analyzed. Specifically, we examined the main maintenance content and the maintenance personnel through text mining (word cloud, word network). Next, social network analysis (collaboration analysis, centrality analysis) was used to confirm the collaboration relationships between maintenance personnel and their maintenance expertise. Finally, we compared the results of text mining and SNA to determine appropriate methods for finding naval ship maintenance expertise.
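
The centrality analysis mentioned in this abstract can be illustrated with degree centrality on a collaboration network. The abstract does not state which centrality measures were used, so the choice of degree centrality is an assumption, and the maintenance records and personnel labels below are hypothetical.

```python
from collections import defaultdict
from itertools import combinations

def degree_centrality(records):
    """Build a collaboration network from maintenance records (each record
    lists the people who worked on one job) and return, for every person,
    the share of the other people they have worked with."""
    neighbors = defaultdict(set)
    for team in records:
        for person in team:
            neighbors.setdefault(person, set())  # keep solo workers in the network
        for a, b in combinations(sorted(set(team)), 2):
            neighbors[a].add(b)
            neighbors[b].add(a)
    n = len(neighbors)
    return {p: (len(adj) / (n - 1) if n > 1 else 0.0)
            for p, adj in neighbors.items()}

# Hypothetical records: personnel assigned to each maintenance job.
records = [["A", "B"], ["A", "C"], ["A", "B", "D"], ["C"]]
centrality = degree_centrality(records)

for person, score in sorted(centrality.items()):
    print(person, round(score, 2))
```

A person with high degree centrality has collaborated with many colleagues, which is one possible signal of broad maintenance expertise.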

A Study on the Method for Extracting the Purpose-Specific Customized Information from Online Product Reviews based on Text Mining (텍스트 마이닝 기반의 온라인 상품 리뷰 추출을 통한 목적별 맞춤화 정보 도출 방법론 연구)

  • Kim, Joo Young;Kim, Dong soo
    • The Journal of Society for e-Business Studies / v.21 no.2 / pp.151-161 / 2016
  • In the era of Web 2.0, characterized by openness, sharing, and participation, it is easy for internet users to produce and share data. The amount of unstructured data, which accounts for most of the digital world's data, has increased exponentially. Personal online product reviews, one kind of unstructured data, are necessary both for the companies that produce the products and for potential customers who are interested in them. Extracting useful information from large volumes of scattered review data requires a process of collecting, storing, preprocessing, and analyzing the data and drawing conclusions. We therefore introduce a text-mining methodology that applies natural language processing technology to text data such as product reviews in order to extract structured data using R programming. We also introduce data mining to derive purpose-specific customized information from the structured review information produced by the text mining.

Text Document Classification Scheme using TF-IDF and Naïve Bayes Classifier (TF-IDF와 Naïve Bayes 분류기를 활용한 문서 분류 기법)

  • Yoo, Jong-Yeol;Hyun, Sang-Hyun;Yang, Dong-Min
    • Proceedings of the Korean Institute of Information and Communication Sciences Conference / 2015.10a / pp.242-245 / 2015
  • With large-scale data spreading through the digital economy, the era of big data is arriving. In this setting, the leakage of unstructured text data, such as technical documents, confidential documents, and documents containing false information, is causing serious problems. To prevent this, the need for techniques to classify and process documents consisting of unstructured text data has increased. In this paper, we propose a novel text classification scheme that learns from data sets and correctly classifies unstructured text data into two categories, True and False. For the performance evaluation, we implement the proposed scheme using the Naïve Bayes document classifier and TF-IDF modules of a Python library and compare it with an existing document classifier.
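
The combination of TF-IDF weighting and a Naïve Bayes classifier can be sketched in plain Python as below. The abstract does not name the Python library or its settings, so this is a from-scratch illustration rather than the authors' implementation: the smoothed IDF formula, the Laplace smoothing constant, the function names, and the toy training set are all assumptions.

```python
import math
from collections import Counter, defaultdict

def fit_idf(docs):
    """Smoothed inverse document frequency for each training term."""
    n = len(docs)
    df = Counter(term for doc in docs for term in set(doc))
    return {t: math.log((1 + n) / (1 + d)) + 1.0 for t, d in df.items()}

def tfidf(doc, idf):
    """TF-IDF vector {term: weight}; terms unseen in training are dropped."""
    return {t: c * idf[t] for t, c in Counter(doc).items() if t in idf}

def train(vectors, labels, alpha=1.0):
    """Multinomial Naive Bayes over TF-IDF weights with additive smoothing."""
    vocab = {t for vec in vectors for t in vec}
    sums = defaultdict(lambda: defaultdict(float))
    for vec, label in zip(vectors, labels):
        for term, weight in vec.items():
            sums[label][term] += weight
    model = {}
    for label, weights in sums.items():
        total = sum(weights.values()) + alpha * len(vocab)
        log_prior = math.log(labels.count(label) / len(labels))
        log_like = {t: math.log((weights.get(t, 0.0) + alpha) / total)
                    for t in vocab}
        model[label] = (log_prior, log_like)
    return model

def predict(model, vec):
    """Label with the highest log posterior for a TF-IDF vector."""
    def log_posterior(label):
        log_prior, log_like = model[label]
        return log_prior + sum(w * log_like[t]
                               for t, w in vec.items() if t in log_like)
    return max(model, key=log_posterior)

# Hypothetical toy training set using the two categories from the abstract.
train_docs = [
    "quarterly report attached for review".split(),
    "meeting agenda and report".split(),
    "win free prize now".split(),
    "free prize claim now".split(),
]
train_labels = ["True", "True", "False", "False"]

idf = fit_idf(train_docs)
model = train([tfidf(d, idf) for d in train_docs], train_labels)

def classify(text):
    """Classify raw text with the toy model trained above."""
    return predict(model, tfidf(text.lower().split(), idf))

print(classify("claim your free prize"))
print(classify("report for the meeting"))
```

In practice a library implementation would normally be preferred; the sketch only makes the two stages, weighting and classification, explicit.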

Table based Matching Algorithm for Soft Categorization of News Articles in Reuter 21578

  • Jo, Tae-Ho
    • Journal of Korea Multimedia Society / v.11 no.6 / pp.875-882 / 2008
  • This research proposes an alternative to machine-learning-based approaches for text categorization. To apply machine-learning-based approaches to any text mining task, documents must be encoded into numerical vectors, which causes two problems: huge dimensionality and sparse distribution. Although text mining includes various tasks, such as text categorization, text clustering, and text summarization, the scope of this research is restricted to text categorization. The idea is to avoid the two problems by encoding a document or documents into a table instead of numerical vectors. The goal of this research is therefore to improve the performance of text categorization by proposing approaches that are free from the two problems.
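
The idea of encoding a document as a table rather than a fixed-length numerical vector might look like the sketch below. The abstract does not describe the matching algorithm itself, so everything here is an assumption made for illustration: the table layout (word to frequency), the matching score, the threshold, and the sample texts and category names.

```python
from collections import Counter

def to_table(text, size=5):
    """Encode text as a small table of word -> frequency rows,
    keeping only the `size` most frequent words."""
    return dict(Counter(text.lower().split()).most_common(size))

def match(doc_table, category_table):
    """Share of the document table's total weight carried by words
    that also appear in the category table (assumed score)."""
    total = sum(doc_table.values())
    shared = sum(w for word, w in doc_table.items() if word in category_table)
    return shared / total if total else 0.0

def soft_categorize(doc_table, category_tables, threshold=0.3):
    """Soft categorization: return every category whose matching score
    reaches the threshold, so a document may receive several labels."""
    return sorted(name for name, table in category_tables.items()
                  if match(doc_table, table) >= threshold)

# Hypothetical category tables built from sample text.
category_tables = {
    "trade": to_table("trade export import tariff trade"),
    "grain": to_table("wheat corn grain export harvest"),
}
doc_table = to_table("wheat export tariff wheat")

print(soft_categorize(doc_table, category_tables))
print(soft_categorize(doc_table, category_tables, threshold=0.6))
```

Because each table holds only the words that occur, its size does not depend on the vocabulary of the whole corpus, which is how a table encoding sidesteps the huge dimensionality and sparsity of vector encodings.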

Design and Development of a Multimodal Biomedical Information Retrieval System

  • Demner-Fushman, Dina;Antani, Sameer;Simpson, Matthew;Thoma, George R.
    • Journal of Computing Science and Engineering / v.6 no.2 / pp.168-177 / 2012
  • The search for relevant and actionable information is a key to achieving clinical and research goals in biomedicine. Biomedical information exists in different forms: as text and illustrations in journal articles and other documents, in images stored in databases, and as patients' cases in electronic health records. This paper presents ways to move beyond conventional text-based searching of these resources, by combining text and visual features in search queries and document representation. A combination of techniques and tools from the fields of natural language processing, information retrieval, and content-based image retrieval allows the development of building blocks for advanced information services. Such services enable searching by textual as well as visual queries, and retrieving documents enriched by relevant images, charts, and other illustrations from the journal literature, patient records and image databases.