• Title/Abstract/Keywords: knowledge discovery system

129 search results (processing time: 0.026 seconds)

A Study on Visualization of Digital Preservation Knowledge Domain Using CiteSpace

  • 김희정
    • 한국문헌정보학회지
    • /
    • Vol. 39, No. 4
    • /
    • pp.89-104
    • /
    • 2005
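  • This study attempted knowledge domain visualization for the subject area of digital preservation. For the analysis, a total of 74 documents covering the period from 1990 to 2005 were extracted, mainly from the Web of Science database. The tool used for knowledge domain visualization was CiteSpace, a Java application that provides visual data mining results based on bibliographic databases. The analysis showed that the core knowledge areas of the digital preservation field are three: digital preservation strategies centered on the latest information technologies, information networks and preservation systems, and e-government and knowledge management.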

LitCovid-AGAC: cellular and molecular level annotation data set based on COVID-19

  • Ouyang, Sizhuo;Wang, Yuxing;Zhou, Kaiyin;Xia, Jingbo
    • Genomics & Informatics
    • /
    • Vol. 19, No. 3
    • /
    • pp.23.1-23.7
    • /
    • 2021
  • The literature on coronavirus disease 2019 (COVID-19) has been growing dramatically, and the increased volume of text makes it possible to perform large-scale text mining and knowledge discovery. Curation of these texts has therefore become a crucial issue for the biomedical natural language processing (BioNLP) community, in order to retrieve important information about the mechanism of COVID-19. PubAnnotation is an aligned annotation system that provides an efficient platform for biological curators to upload their annotations or merge other external annotations. Inspired by the integration among multiple useful COVID-19 annotations, we merged three annotation resources into the LitCovid data set and constructed a cross-annotated corpus, LitCovid-AGAC. The corpus consists of 12 labels: Mutation, Species, Gene, and Disease from PubTator; GO and CHEBI from OGER; and Var, MPA, CPA, NegReg, PosReg, and Reg from AGAC, over 50,018 COVID-19 abstracts in LitCovid. It contains sufficiently abundant information to make it possible to unveil hidden knowledge about the pathological mechanism of COVID-19.

A Framework for Intelligent Remote Learning System

  • 유영동
    • 한국정보시스템학회지:정보시스템연구
    • /
    • Vol. 2
    • /
    • pp.194-206
    • /
    • 1993
  • An intelligent remote learning system (IRLS) is a system that incorporates communication technology together with other components: a database engine and an intelligent tutorial system. Learners can study by themselves through the intelligent tutorial system. The combination of communication, database, and artificial intelligence components enhances the capability of the IRLS. According to Parsaye, an intelligent database should have the following features: 1) knowledge discovery; 2) data integrity and quality control; 3) hypermedia management; 4) data presentation and display; 5) decision support and scenario analysis; 6) data format management; 7) intelligent system design tools. I hope that this framework for IRLS paves the way for future research. As mentioned above, future work will include an intelligent database and a self-learning mechanism using a neural network.

Data-Mining in Business Performance Database Using Explanation-Based Genetic Algorithms

  • 조성훈;정민용
    • 경영과학
    • /
    • Vol. 18, No. 1
    • /
    • pp.135-145
    • /
    • 2001
  • In today's dynamic management environment, there is growing recognition that information and knowledge management systems are essential for efficient and effective decision making by CEOs. To cope with this situation, we propose a data-mining scheme as a key component of an integrated information and knowledge management system. The proposed system measures business performance by considering both VA (Value-Added), which represents the stakeholders' point of view, and EVA (Economic Value-Added), which represents the shareholders' point of view. To mine new information and knowledge, we applied improved genetic algorithms (GAs) that consider predictability, understandability (lucidity), and reasonability simultaneously, using a linear combination model as the GA learning structure. Although this model is less predictive than a non-linear model, it increases the understandability of the knowledge, that is, the meaning of the induced values. Moreover, we introduce a random variable scheme based on the normal distribution for the initial chromosomes of the GAs, which is expected to increase the reasonability of the knowledge, that is, the degree to which experts accept it. This scheme uses statistical correlation and determination coefficients calculated from the training data. To demonstrate the performance of the system, we conducted a case study using financial data for the Korean automobile industry over 16 years, from 1981 to 1996, taken from the KISFAS (Korea Investors Services Financial Analysis System) database.

PubMine: An Ontology-Based Text Mining System for Deducing Relationships among Biological Entities

  • Kim, Tae-Kyung;Oh, Jeong-Su;Ko, Gun-Hwan;Cho, Wan-Sup;Hou, Bo-Kyeng;Lee, Sang-Hyuk
    • Interdisciplinary Bio Central
    • /
    • Vol. 3, No. 2
    • /
    • pp.7.1-7.6
    • /
    • 2011
  • Background: Published manuscripts are the main source of biological knowledge. Since manual examination is almost impossible due to the huge volume of literature data (approximately 19 million abstracts in PubMed), intelligent text mining systems are of great utility for knowledge discovery. However, most current text mining tools have limited applicability because of i) abstract-based rather than sentence-based search, ii) improper use or lack of ontology terms, iii) a design restricted to specific subjects, or iv) slow response times that hamper web services and real-time applications. Results: We introduce an advanced text mining system called PubMine that supports intelligent knowledge discovery based on diverse bio-ontologies. PubMine improves query accuracy and flexibility with advanced search capabilities: fuzzy search, wildcard search, proximity search, range search, and Boolean combinations. Furthermore, PubMine allows users to extract multi-dimensional relationships between genes, diseases, and chemical compounds by using OLAP (On-Line Analytical Processing) techniques. The HUGO gene symbols and the MeSH ontology for diseases, chemical compounds, and anatomy are included in the current version of PubMine, which is freely available at http://pubmine.kobic.re.kr. Conclusions: PubMine is a unique bio-text mining system that provides flexible searches and analysis of biological entity relationships. We believe that PubMine can serve as a key bioinformatics utility because its rapid response enables web services for the community and its flexibility accommodates general ontologies.

Development of Mining model through reproducibility assessment in Adverse drug event surveillance system

  • 이영호;윤영미;이병문;황희정;강운구
    • 한국컴퓨터정보학회논문지
    • /
    • Vol. 14, No. 3
    • /
    • pp.183-192
    • /
    • 2009
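  • An adverse drug event surveillance system identifies whether a drug causes adverse effects by using adverse drug event signals. It can be classified as more efficient than conventional spontaneous reporting or chart review. In this paper, we built a clinical data mart (CDM) to implement an adverse drug event surveillance system. In particular, we applied unsupervised learning, one of the knowledge discovery techniques, to the CDM built with data quality management techniques, and evaluated model reproducibility to derive the optimal number of adverse drug event clusters (n=4). Using this number of clusters (n=4), we applied and analyzed the K-means, Kohonen, and two-step clustering model algorithms for adverse drug event discrimination, and confirmed that the K-means algorithm showed the best clustering performance.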
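
A reproducibility check for choosing the number of clusters might look like the sketch below. The abstract does not say how reproducibility was measured, so this is an assumption: k-means is run several times with different random seeds, and the agreement between runs is scored with the Rand index. The function names `kmeans`, `rand_index`, and `reproducibility` are hypothetical and are not taken from the paper.

```python
import random
from itertools import combinations


def kmeans(points, k, seed, iters=50):
    """Plain Lloyd's k-means; returns one cluster label per point."""
    rng = random.Random(seed)
    centers = rng.sample(points, k)  # random initial centers (assumption)
    labels = [0] * len(points)
    for _ in range(iters):
        # Assignment step: nearest center by squared Euclidean distance.
        labels = [
            min(range(k),
                key=lambda c: sum((a - b) ** 2 for a, b in zip(p, centers[c])))
            for p in points
        ]
        # Update step: move each center to the mean of its members.
        for c in range(k):
            members = [p for p, lab in zip(points, labels) if lab == c]
            if members:
                centers[c] = tuple(sum(x) / len(members) for x in zip(*members))
    return labels


def rand_index(a, b):
    """Fraction of point pairs on which two labelings agree."""
    pairs = list(combinations(range(len(a)), 2))
    agree = sum((a[i] == a[j]) == (b[i] == b[j]) for i, j in pairs)
    return agree / len(pairs)


def reproducibility(points, k, seeds=(0, 1, 2, 3, 4)):
    """Mean pairwise Rand index over repeated k-means runs."""
    runs = [kmeans(points, k, s) for s in seeds]
    scores = [rand_index(x, y) for x, y in combinations(runs, 2)]
    return sum(scores) / len(scores)
```

Under this assumption, one would compute `reproducibility(points, k)` for several candidate values of `k` and prefer a value where repeated runs agree most closely; `points` is expected to be a list of equal-length numeric tuples.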

Fuzzy Inductive Learning System for Learning Preference of the User's Behavior Pattern

  • 이형욱;김용휘;박광현;김용수;정진우;조준면;김민경;변증남
    • 한국지능시스템학회논문지
    • /
    • Vol. 15, No. 7
    • /
    • pp.805-812
    • /
    • 2005
  • This paper proposes a new method for learning user behavior pattern preferences, in order to reduce the user's cognitive load from using complex interfaces and to autonomously provide personalized services in ubiquitous environments, such as smart homes, where diverse sensors and control networks are densely deployed. To this end, we propose a fuzzy inductive learning methodology from the viewpoint of life-long learning for knowledge discovery; it obtains an efficient fuzzy partition of the input space from numerical data and derives consistent fuzzy association rules.

An Algorithm for Sequential Sampling Method in Data Mining

  • 홍지명;김낙현;김성집
    • 산업경영시스템학회지
    • /
    • Vol. 21, No. 45
    • /
    • pp.101-112
    • /
    • 1998
  • Data mining, also referred to as knowledge discovery in databases, is the nontrivial extraction of implicit, previously unknown, and potentially useful information (such as knowledge rules, constraints, and regularities) from data in databases. The discovered knowledge can be applied to information management, decision making, and many other applications. In this paper, we propose a new data mining problem, discovering sequential patterns, in which all sequential patterns are found using a sampling method. Given that database sizes are growing exponentially and transaction databases are frequently updated, the sampling method is a fast algorithm that reduces time and cost while extracting trends in customer behavior. The method analyzes only a fraction of the database but can in general yield highly accurate results. The relaxation factor, as well as the sample size, can be adjusted to improve the accuracy of the result while minimizing the corresponding execution time. The superiority of the proposed algorithm is shown by analyzing its accuracy and efficiency in comparison with the AprioriAll algorithm.
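
The idea of estimating sequential patterns from a sample can be sketched as follows. This is a minimal illustration, not the algorithm from the paper: the abstract does not define the relaxation factor, so the sketch assumes it simply lowers the support threshold applied to the sample, and the names `contains`, `support`, and `frequent_in_sample` are hypothetical.

```python
import random


def contains(sequence, pattern):
    """True if pattern occurs in sequence as an order-preserving subsequence."""
    it = iter(sequence)
    # Each `item in it` consumes the iterator up to the first match,
    # so later pattern items must appear later in the sequence.
    return all(item in it for item in pattern)


def support(db, pattern):
    """Fraction of customer sequences that contain the pattern."""
    return sum(contains(s, pattern) for s in db) / len(db)


def frequent_in_sample(db, candidates, min_support, sample_size,
                       relaxation=0.9, seed=0):
    """Estimate frequent patterns from a random sample of the database."""
    sample = random.Random(seed).sample(db, sample_size)
    # Assumed form of the relaxation: lower the threshold on the sample
    # to reduce the chance of missing truly frequent patterns.
    threshold = min_support * relaxation
    return [p for p in candidates if support(sample, p) >= threshold]
```

A smaller `sample_size` shortens the counting pass at the cost of noisier support estimates, which is the trade-off the abstract attributes to the sample size and relaxation factor.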

Temperature Inference System by Rough-Neuro-Fuzzy Network

  • Il Hun jung;Park, Hae jin;Kang, Yun-Seok;Kim, Jae-In;Lee, Hong-Won;Jeon, Hong-Tae
    • 한국지능시스템학회:학술대회논문집
    • /
    • 한국퍼지및지능시스템학회, 1998, The Third Asian Fuzzy Systems Symposium
    • /
    • pp.296-301
    • /
    • 1998
  • Rough set theory, proposed by Pawlak in 1982, has been useful in AI, machine learning, knowledge acquisition, knowledge discovery from databases, expert systems, inductive reasoning, etc. The main advantages of rough sets are that they need no preliminary or additional information about the data and that they reduce superfluous information. However, a significant disadvantage in real applications is that the inference result is not an actual control value but a set of disjoint interval attributes. To overcome this difficulty, we propose an approach that combines rough set theory with neuro-fuzzy fusion to obtain an optimal rule base from large amounts of input/output data. The results are applied to constructing rules for inferring the temperatures at specified points of a refrigerator.
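
The rough set notions underlying this abstract can be illustrated with the standard lower and upper approximations of a target set. This is a minimal sketch of Pawlak's definitions only, not of the paper's rough-neuro-fuzzy network; the function name `approximations` and the attribute names in the usage note are hypothetical.

```python
from collections import defaultdict


def approximations(objects, attributes, target):
    """Return (lower, upper) approximations of `target`.

    objects:    dict mapping object name -> dict of attribute values
    attributes: attributes that define the indiscernibility relation
    target:     set of object names to approximate
    """
    # Group objects that are indiscernible on the chosen attributes.
    classes = defaultdict(set)
    for name, values in objects.items():
        key = tuple(values[a] for a in attributes)
        classes[key].add(name)

    lower, upper = set(), set()
    for block in classes.values():
        if block <= target:   # block lies entirely inside the target
            lower |= block
        if block & target:    # block overlaps the target
            upper |= block
    return lower, upper
```

For example, with objects `x1` and `x2` sharing the values `door='open', load='high'` and `x3` having `door='closed', load='low'`, the target `{'x1', 'x3'}` has lower approximation `{'x3'}` and upper approximation `{'x1', 'x2', 'x3'}`, because `x1` cannot be distinguished from `x2` on these attributes.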

A Cause Analysis of the Construction Incident Using Causal Loop Diagram: Safety Culture Perspective

  • 최윤길;조근태
    • 한국안전학회지
    • /
    • Vol. 35, No. 2
    • /
    • pp.34-46
    • /
    • 2020
  • Unlike research that focuses on existing technologies and individual errors to analyze the causes of incidents, this study approaches them from an organizational and cultural perspective. It is not a one-way study but a cyclical one that tracks causes down using the causal loop diagram methodology. From four diagnostic criteria for a negative state of safety culture (secretive, blame, failure to learn, and incremental learning), 41 variables were derived by combining a literature study with expert opinion. Connecting these variables yields four causal loop diagrams and a total causal loop diagram. The main variables are case accumulation for secretive, accident reports for blame, knowledge accumulation for failure to learn, and near-miss discovery for incremental learning. In the total loop, safety incident is the objective variable and the loop is classified into four stages; the leading track with the greatest effect is case accumulation, and Step 4 shows that accident reports and near-miss discovery are the result of tracking down the cause. This study can be used as a basis for improving management priorities and systems for incident prevention.