• Title/Summary/Keyword: Information Mining

Biomedical Ontologies and Text Mining for Biomedicine and Healthcare: A Survey

  • Yoo, Ill-Hoi;Song, Min
    • Journal of Computing Science and Engineering
    • /
    • v.2 no.2
    • /
    • pp.109-136
    • /
    • 2008
  • In this survey paper, we discuss biomedical ontologies and major text mining techniques applied to biomedicine and healthcare. Biomedical ontologies such as UMLS are currently being adopted in text mining approaches because they provide domain knowledge. In addition, biomedical ontologies help resolve many linguistic problems that arise when text mining approaches handle biomedical literature. As the first example of text mining, document clustering is surveyed. Because a document set normally covers multiple topics, text mining approaches use document clustering as a preprocessing step to group similar documents. Additionally, document clustering can inform the biomedical literature searches required for the practice of evidence-based medicine. We introduce Swanson's UnDiscovered Public Knowledge (UDPK) model, which generates biomedical hypotheses from biomedical literature such as MEDLINE by discovering novel connections among logically related biomedical concepts. Another important area of text mining is document classification, a valuable tool for biomedical tasks that involve large amounts of text. We survey well-known classification techniques in biomedicine. As the last example of text mining in biomedicine and healthcare, we survey information extraction. Information extraction is the process of scanning text for information relevant to some interest, including extracting entities, relations, and events. We also address techniques and issues in evaluating text mining applications in biomedicine and healthcare.

A Process Mining using Association Rule and Sequence Pattern (연관규칙과 순차패턴을 이용한 프로세스 마이닝)

  • Chung, So-Young;Kwon, Soo-Tae
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.31 no.2
    • /
    • pp.104-111
    • /
    • 2008
  • Process mining supports the discovery of business processes for unstructured process models. In this paper, a process mining algorithm that uses association rules and sequence patterns from data mining is developed to extract information about processes from event logs and to discover processes with alternative, concurrent, and hidden activities. Some numerical examples are presented to show the effectiveness and efficiency of the algorithm.
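
As a rough illustration of the idea in the abstract above (not the authors' algorithm), ordering relations between activities can be scored with association-rule style support and confidence computed over an event log. The event log, activity names, function name, and thresholds below are all hypothetical.

```python
from collections import Counter
from itertools import combinations

def ordering_rules(traces, min_support=0.5, min_confidence=0.8):
    """Score 'a is eventually followed by b' rules over a list of traces."""
    n = len(traces)
    activity_count = Counter()  # number of traces containing each activity
    pair_count = Counter()      # number of traces where a occurs before b
    for trace in traces:
        activity_count.update(set(trace))
        seen_pairs = set()
        for i, j in combinations(range(len(trace)), 2):
            if trace[i] != trace[j]:
                seen_pairs.add((trace[i], trace[j]))
        pair_count.update(seen_pairs)
    rules = {}
    for (a, b), count in pair_count.items():
        support = count / n                    # share of traces with a before b
        confidence = count / activity_count[a] # share of a-traces with b after a
        if support >= min_support and confidence >= min_confidence:
            rules[(a, b)] = (support, confidence)
    return rules

# Hypothetical event log: one list of activities per process instance.
event_log = [
    ["receive", "check", "approve", "ship"],
    ["receive", "approve", "check", "ship"],
    ["receive", "check", "reject"],
]
rules = ordering_rules(event_log)
for (a, b), (support, confidence) in sorted(rules.items()):
    print(f"{a} -> {b}: support={support:.2f}, confidence={confidence:.2f}")
```

In this toy log, "check" and "approve" occur in both orders across traces, so neither ordering reaches the confidence threshold. A pair like that is one simple hint that two activities may be concurrent rather than sequential.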

Rating and Comments Mining Using TF-IDF and SO-PMI for Improved Priority Ratings

  • Kim, Jinah;Moon, Nammee
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.13 no.11
    • /
    • pp.5321-5334
    • /
    • 2019
  • Data mining technology is frequently used in identifying the intention of users over a variety of information contexts. Since relevant terms are mainly hidden in text data, it is necessary to extract them. Quantification is required in order to interpret user preference in association with other structured data. This paper proposes rating and comments mining to identify user priority and obtain improved ratings. Structured data (location and rating) and unstructured data (comments) are collected and priority is derived by analyzing statistics and employing TF-IDF. In addition, the improved ratings are generated by applying priority categories based on materialized ratings through Sentiment-Oriented Point-wise Mutual Information (SO-PMI)-based emotion analysis. In this paper, an experiment was carried out by collecting ratings and comments on "place" and by applying them. We confirmed that the proposed mining method is 1.2 times better than the conventional methods that do not reflect priorities and that the performance is improved to almost 2 times when the number to be predicted is small.
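
The two text measures named in the abstract above can be sketched as follows. This is a minimal illustration under stated assumptions, not the paper's pipeline: the comments, seed words, smoothing constant `eps`, and function names are hypothetical, and the SO-PMI score follows the common formulation that compares a word's co-occurrence with positive seed words against its co-occurrence with negative ones.

```python
import math
from collections import Counter

def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document."""
    n = len(docs)
    # Document frequency: in how many documents each term appears.
    df = Counter(term for doc in docs for term in set(doc))
    weights = []
    for doc in docs:
        tf = Counter(doc)
        weights.append(
            {t: (c / len(doc)) * math.log(n / df[t]) for t, c in tf.items()}
        )
    return weights

def so_pmi(word, docs, pos_seeds, neg_seeds, eps=0.01):
    """Positive result: word co-occurs more with positive seeds than negative."""
    def hits(seeds):
        return sum(1 for d in docs if any(s in d for s in seeds))

    def co_hits(seeds):
        return sum(1 for d in docs if word in d and any(s in d for s in seeds))

    # eps is an assumed smoothing constant that avoids division by zero.
    numerator = (co_hits(pos_seeds) + eps) * (hits(neg_seeds) + eps)
    denominator = (co_hits(neg_seeds) + eps) * (hits(pos_seeds) + eps)
    return math.log2(numerator / denominator)

# Hypothetical tokenized comments about a place.
comments = [
    ["clean", "room", "great", "view"],
    ["dirty", "room", "bad", "service"],
    ["great", "location", "clean", "lobby"],
]
weights = tf_idf(comments)
clean_score = so_pmi("clean", comments, {"great"}, {"bad"})
dirty_score = so_pmi("dirty", comments, {"great"}, {"bad"})
```

Here "view" receives a higher TF-IDF weight in the first comment than "room", because "room" also appears in another comment. "clean" scores positive and "dirty" negative, because of the seed words they co-occur with.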

Personalized Book Curation System based on Integrated Mining of Book Details and Body Texts (도서 정보 및 본문 텍스트 통합 마이닝 기반 사용자 맞춤형 도서 큐레이션 시스템)

  • Ahn, Hee-Jeong;Kim, Kee-Won;Kim, Seung-Hoon
    • Journal of Information Technology Applications and Management
    • /
    • v.24 no.1
    • /
    • pp.33-43
    • /
    • 2017
  • Content curation services based on big data analysis are receiving great attention in various content fields, such as films, games, music, and books. Such a service recommends personalized content to each user based on that user's preferences. Existing book curation systems recommend books to users by using bibliographic citations, user profiles, or user log data. However, these systems have difficulty recommending books related to character names or spatio-temporal information in the text. Therefore, in this paper, we suggest a personalized book curation system based on integrated mining of a book. The proposed system consists of a mining system, a recommendation system, and a visualization system. The mining system analyzes book text, user information or profiles, and SNS data. The recommendation system recommends personalized books for users based on the data analyzed in the mining system. It can recommend related books based on book keywords even when there is no user information, as with a new customer. The visualization system visualizes book bibliographic information, mining data such as keywords, characters, and character relations, and book recommendation results. In addition, this paper includes the design and implementation of the proposed mining and recommendation modules. The proposed system is expected to broaden users' selection of books and encourage balanced consumption of book content.

PubMiner: Machine Learning-based Text Mining for Biomedical Information Analysis

  • Eom, Jae-Hong;Zhang, Byoung-Tak
    • Genomics & Informatics
    • /
    • v.2 no.2
    • /
    • pp.99-106
    • /
    • 2004
  • In this paper we introduce PubMiner, an intelligent machine learning-based text mining system for mining biological information from the literature. PubMiner employs natural language processing techniques and machine learning-based data mining techniques to mine useful biological information, such as protein-protein interactions, from the massive literature. The system recognizes biological terms such as genes, proteins, and enzymes and extracts their interactions described in the document through natural language processing. The extracted interactions are further analyzed with a set of features of each entity, collected from related public databases, to infer more interactions from the original interactions. Both the inferred interactions and the original interactions are provided to the user with links to the literature sources. The performance of entity and interaction extraction was tested with selected MEDLINE abstracts. The inference was evaluated using the protein interaction data of S. cerevisiae (baker's yeast) from MIPS and SGD.

Mining Spatio-Temporal Patterns in Trajectory Data

  • Kang, Ju-Young;Yong, Hwan-Seung
    • Journal of Information Processing Systems
    • /
    • v.6 no.4
    • /
    • pp.521-536
    • /
    • 2010
  • Spatio-temporal patterns extracted from historical trajectories of moving objects reveal important knowledge about movement behavior for high-quality location-based services (LBS). Existing approaches transform trajectories into sequences of location symbols and derive frequent subsequences by applying conventional sequential pattern mining algorithms. However, spatio-temporal correlations may be lost due to inappropriate approximations of spatial and temporal properties. In this paper, we address the problem of mining spatio-temporal patterns from trajectory data. An inefficient description of temporal information decreases both the mining efficiency and the interpretability of the patterns. We provide a formal statement of an efficient representation of spatio-temporal movements and propose a new approach to discover spatio-temporal patterns in trajectory data. The proposed method first finds meaningful spatio-temporal regions and then extracts frequent spatio-temporal patterns from the sequences of these regions using a prefix-projection approach. Experiments show that the proposed method improves mining performance and derives more intuitive patterns.

A Methodology for Searching Frequent Pattern Using Graph-Mining Technique (그래프마이닝을 활용한 빈발 패턴 탐색에 관한 연구)

  • Hong, June Seok
    • Journal of Information Technology Applications and Management
    • /
    • v.26 no.1
    • /
    • pp.65-75
    • /
    • 2019
  • As the use of the XML-based semantic web increases in the field of data management, many studies have attempted to extract useful information from data stored in ontologies using association rule mining. Ontology data has the advantage that data can be expressed freely, because it has a flexible and scalable structure, unlike a conventional database with a predefined structure. On the other hand, it is difficult to find frequent patterns with a uniform analysis method. The goal of this study is to provide a basis for extracting useful knowledge from ontologies by searching for frequently occurring subgraph patterns, applying transaction-based graph mining techniques to the ontology schema graph data and instance graph data that constitute an ontology. To overcome the structural limitations of existing ontology mining, the methodology in this study adapts the frequent pattern search used in graph mining on graph data structures to ontologies by applying an iterative node chunking method. Our suggested methodology is expected to play an important role in knowledge extraction.

A Study of Data Mining Application in Information Management Field (정보관리분야의 데이터 마이닝 기법 적용에 대한 연구)

  • Choi, Hee-Yoon
    • Journal of Information Management
    • /
    • v.31 no.3
    • /
    • pp.1-20
    • /
    • 2000
  • A variety of attempts are being made to select necessary and valuable information from a rapidly increasing volume of data, and data mining methods are attracting interest as one of them. This methodology is increasingly applied to the information management field, which consists of efficiently processing and systematizing a growing number of digital documents for user services. This article analyzes the theoretical background and empirical case studies of data mining, and predicts the possibility of its application to the information management area.

Transmit Precoder Design for Two-User Broadcast Channel with Statistical and Delayed CSIT

  • Sun, Yanjing;Zhou, Shu;Cao, Qi;Wang, Yanfen;Liu, Wen;Zhang, Xiaoguang
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.12 no.5
    • /
    • pp.2124-2141
    • /
    • 2018
  • Recent studies have revealed the efficacy of incorporating delayed channel state information at the transmit side (CSIT) in transmission scheme design. This paper focuses on transmit precoder design to maximize the ergodic sum-rate in a two-user Multiple-Input Single-Output (MISO) system with delayed and statistical CSIT. A new transmit strategy that precodes signals in all transmit slots is proposed, denoted as all time-slots precoding Alternative MAT (AAMAT). Conventional delayed-CSIT based schemes share a common procedure, which is retransmitting the overheard interference. Since the retransmitted signal is intended for both users, all previous schemes tend to use only one antenna. We find, however, that an improvement in spectral efficiency can be realized if all antennas are utilized. In this paper, we detail the design of a precoder that enables all antennas, and we also compute a lower bound of the ergodic sum-rate under an ideal condition. In addition, simulation results demonstrate the superiority of our proposed scheme.

Web Mining for successful e-Business based on Artificial Intelligence Techniques (성공적인 e-Business를 위한 인공지능 기법 기반 웹 마이닝)

  • 이장희;유성진;박상찬
    • Journal of Intelligence and Information Systems
    • /
    • v.8 no.2
    • /
    • pp.159-175
    • /
    • 2002
  • Web mining is an emerging science of applying modern data mining technologies to the problem of extracting valid, comprehensible, and actionable information from large web databases in an e-Business environment and of using it to make crucial e-Business decisions. In this paper, we present a novel framework for a data visualization system based on web mining for analyzing the characteristics of online customers in e-Business. We also propose a framework for a forecasting system that provides sales and purchase forecasts through web mining based on artificial intelligence techniques such as back-propagation networks, memory-based reasoning, and self-organizing maps.
