• Title/Summary/Keyword: tree mining (트리 마이닝)

Search Result 129, Processing Time 0.026 seconds

Anomaly Intrusion Detection based on Association Rule Mining in a Database System (데이터베이스 시스템에서 연관 규칙 탐사 기법을 이용한 비정상 행위 탐지)

  • Park, Jeong-Ho;Oh, Sang-Hyun;Lee, Won-Suk
    • The KIPS Transactions:PartC / v.9C no.6 / pp.831-840 / 2002
  • With advances in computer and communication technology, vast amounts of information are conveniently available to users, but intrusions and computer crimes have also increased rapidly. In particular, to secure a database that stores important information, such as customers' private information or a company's confidential information, the basic security mechanisms of the database management system itself or conventional misuse detection methods have been used. However, abuse of authority by an internal user, such as leaking confidential information, is a more serious problem than the breakdown of a system by an external intruder. Therefore, to maintain the security of a database effectively, an anomaly detection technique is necessary. This paper proposes a method that generates a user's normal behavior profile from the user's database log using an association rule mining method. For this purpose, the information in the database log is organized into a semantically structured pattern tree. An online transaction of the user is then compared with the user's profile, so that anomalies can be detected effectively.
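
The profile-and-compare idea in this abstract can be sketched in a few lines. The sketch below is only an illustration under stated assumptions: it mines frequent itemsets (the first stage of association rule mining) of size 1 and 2 from a toy log and scores a new transaction by how many of its itemsets are missing from the profile. It does not reproduce the paper's pattern tree, and the function names, the `operation:table` item encoding, and the log contents are all hypothetical.

```python
from collections import Counter
from itertools import combinations


def build_profile(transactions, min_support=0.5):
    """Return the itemsets (size 1 and 2) whose support meets min_support."""
    counts = Counter()
    for items in transactions:
        unique = sorted(set(items))
        for size in (1, 2):
            counts.update(combinations(unique, size))
    threshold = min_support * len(transactions)
    return {itemset for itemset, n in counts.items() if n >= threshold}


def anomaly_score(transaction, profile):
    """Fraction of the transaction's itemsets that are absent from the profile."""
    unique = sorted(set(transaction))
    itemsets = [c for size in (1, 2) for c in combinations(unique, size)]
    if not itemsets:
        return 0.0
    missing = sum(1 for itemset in itemsets if itemset not in profile)
    return missing / len(itemsets)


# Hypothetical log: each transaction lists "operation:table" strings it issued.
log = [
    ["SELECT:orders", "SELECT:customers"],
    ["SELECT:orders", "SELECT:customers", "UPDATE:orders"],
    ["SELECT:orders", "SELECT:customers"],
    ["SELECT:orders", "UPDATE:orders"],
]
profile = build_profile(log, min_support=0.5)
normal = anomaly_score(["SELECT:orders", "SELECT:customers"], profile)
suspicious = anomaly_score(["SELECT:salaries", "DELETE:audit_log"], profile)
```

A transaction made only of itemsets seen often in the log scores 0, while one that touches tables the user never accessed scores close to 1. Where to set the alarm threshold is a design choice this sketch leaves open.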

Podiatric Clinical Diagnosis using Decision Tree Data Mining (결정트리 데이터마이닝을 이용한 족부 임상 진단)

  • Kim, Jin-Ho;Park, In-Sik;Kim, Bong-Ok;Yang, Yoon-Seok;Won, Yong-Gwan;Kim, Jung-Ja
    • Journal of the Institute of Electronics Engineers of Korea CI / v.48 no.2 / pp.28-37 / 2011
  • With growing interest in healthy living, podiatry, which covers the diagnosis, treatment, and prevention of foot and leg disorders, has attracted wide attention, but research in Korea is not active. Moreover, because most previous data analysis studies took quantitative approaches, a reasonable level of reliability for clinical application could not be guaranteed. Clinical data mining applies various data mining methods to clinical data and provides decision support for experts' diagnosis and treatment of patients. Because a decision tree explains and describes the analysis procedure well and its results are easy to interpret, it is simple to apply to clinical problems. This study applied decision trees to derive diagnostic rules by disease type from data collected on 2620 feet of 1310 patients (males: 633, females: 677) diagnosed at the shoe clinic of the Department of Rehabilitation Medicine, Chungnam National University Hospital. We classified 15 foot diseases according to 22 foot disease factors and derived 64 diagnostic rules. We also analyzed and compared the correlations between disease characteristics and factors by type, using decision trees built for 5 classes (infants, children, adolescents, adults, total). The results can serve as qualitative, useful knowledge for clinical experts and as a tool for effective and accurate diagnosis.

Genomictree Microarray Total Solution (지노믹트리 Microarray 토탈솔루션)

  • O Tae-Jeong
    • Proceedings of the Korean Society for Bioinformatics Conference / 2006.02a / pp.46-55 / 2006
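  • Genomictree, Inc. is a molecular diagnostics company built on DNA microarray technology, and it concentrates on three businesses. The first is research to develop biomarkers for diagnosing various cancers, based on its original, specialized biomarker discovery technology (MAGIC system). The second is disease diagnosis systems and amplification systems that use the company's core technology, a multiplex simultaneous detection system. The third is a total solution for genomics-era research using microarray technology, including gene expression analysis, Array CGH, DNA methylation analysis, and miRNA detection. The company is working to supply domestic microarray researchers with the microarray know-how it has accumulated over the past five years of in-house research using microarray-based technology. In particular, to provide gene expression analysis solutions, the company's genomic services division supplies microarrays that it manufactures itself: human cDNA (17K/25K) and rat cDNA (5.0K) microarrays, human (22K) and mouse (10K) oligonucleotide microarrays, and, for microbial research, E. coli (6K) and pneumococcus (2.2K) oligonucleotide microarrays, along with gene expression analysis services that use them. The in-house microarrays pass through strict quality control from manufacturing to production under the ISO9001 quality certification system introduced in 2001, so the analysis services use high-quality microarrays. For customer-specific services, the company also offers analysis services using whole genome-based microarray products from leading overseas microarray companies (Agilent, Microarray Inc, TIGR, Eurogentec, and others), and it develops and supplies in-house the reagents essential for microarray experiments (labeling kit) and hardware for microarray hybridization (hybridization chamber, hnay centrifuge). For Array CGH analysis, which measures DNA copy number, the company offers analysis services using its own human cDNA (17K/25K) and rat (5.0K) microarrays, as well as services using the whole genome-based Agilent oligonucleotide CGH array (44K, 35Kb resolution). For researchers conducting epigenetic studies, it offers a methylation microarray analysis service. Instead of the conventional bisulfite treatment-based analysis, the service uses PCR amplification after enzyme digestion, which minimizes the DNA loss caused by bisulfite treatment. Analysis services are currently offered for 50 methylation genes that are well documented in the literature, and the number of targets will be increased continuously. A DNA microarray service for detecting micro RNA, which has recently drawn the interest of many researchers, is also planned (launch in March 2006). This service uses a compact DNA microarray carrying all of the approximately 320 miRNAs known to date, so a single microarray experiment allows comparative analysis of all known miRNAs. Software for data analysis matters as much as the microarray experiment itself. For this purpose, Genomictree supplies GeneSpring GX (gene expression analysis), Signet (microarray database), and GeneSpring GT (SNP analysis), all developed by Agilent. These programs are simple yet comprehensive, suit general users without a statistical background, and are already widely recognized internationally.
Genomictree provides a microarray total solution that takes into account the financial and time constraints of many researchers in Korea and abroad, spanning experimental analysis, data mining, and microarray experiment design.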

Top-down Hierarchical Clustering using Multidimensional Indexes (다차원 색인을 이용한 하향식 계층 클러스터링)

  • Hwang, Jae-Jun;Mun, Yang-Se;Hwang, Gyu-Yeong
    • Journal of KIISE:Databases / v.29 no.5 / pp.367-380 / 2002
  • Due to the recent increase in applications that require huge amounts of data, such as spatial data analysis and image analysis, clustering on large databases has been actively studied. In a hierarchical clustering method, a tree representing a hierarchical decomposition of the database is first created and then used for efficient clustering. Existing hierarchical clustering methods have mainly adopted the bottom-up approach, which creates the tree from the bottom to the topmost level of the hierarchy. These bottom-up methods require at least one scan over the entire database to build the tree and need to search most nodes of the tree, since the clustering algorithm starts from the leaf level. In this paper, we propose a novel top-down hierarchical clustering method that uses multidimensional indexes, which are already maintained in most database applications. Generally, multidimensional indexes have the clustering property of storing similar objects in the same (or adjacent) data pages. Using this property, we can find adjacent objects without calculating the distances among them. We first formally define a cluster based on the density of objects. For the definition, we propose the concept of the region contrast partition, based on the density of the region. To speed up the clustering algorithm, we use a branch-and-bound algorithm. We propose the bounds and formally prove their correctness. Experimental results show that the proposed method is at least as effective in clustering quality as BIRCH, a bottom-up hierarchical clustering method, while reducing the number of page accesses by a factor of 26 to 187, depending on the size of the database. As a result, we believe that the proposed method significantly improves clustering performance in large databases and is practically usable in various database applications.

Automatic Construction of Class Hierarchies and Named Entity Dictionaries using Korean Wikipedia (한국어 위키피디아를 이용한 분류체계 생성과 개체명 사전 자동 구축)

  • Bae, Sang-Joon;Ko, Young-Joong
    • Journal of KIISE:Computing Practices and Letters / v.16 no.4 / pp.492-496 / 2010
  • Wikipedia, an open encyclopedia, contains immense human knowledge written by thousands of volunteer editors, and its reliability is high. In this paper, we propose to automatically construct a Korean named entity dictionary using several features of Wikipedia. First, we generate class hierarchies using the class information in each Wikipedia article. Second, the title of each article is mapped to our class hierarchies, and we then calculate the entropy value of the root node of each class hierarchy. Finally, we construct a high-performance named entity dictionary by removing the class hierarchies whose entropy value is higher than a threshold. Our experiments achieved an overall F1-measure of 81.12% (precision: 83.94%, recall: 78.48%).
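
The entropy-based pruning step and the reported F1-measure can be illustrated with a short sketch. The abstract does not say exactly how the entropy of a root node is computed, so the version below assumes Shannon entropy over the entity types of the titles mapped under each root. The data, the threshold of 1.0, and the function names are hypothetical.

```python
import math
from collections import Counter


def entropy(labels):
    """Shannon entropy (in bits) of a list of category labels."""
    total = len(labels)
    counts = Counter(labels)
    return -sum((n / total) * math.log2(n / total) for n in counts.values())


def filter_hierarchies(hierarchies, threshold):
    """Keep hierarchies whose root-node entropy is at or below the threshold."""
    return {name: labels for name, labels in hierarchies.items()
            if entropy(labels) <= threshold}


def f1(precision, recall):
    """Harmonic mean of precision and recall."""
    return 2 * precision * recall / (precision + recall)


# Hypothetical data: root class -> entity types of the titles mapped under it.
hierarchies = {
    "Korean singers": ["PERSON", "PERSON", "PERSON", "PERSON"],
    "Seoul": ["LOCATION", "PERSON", "ORGANIZATION", "LOCATION"],
}
kept = filter_hierarchies(hierarchies, threshold=1.0)
```

A root whose titles all share one entity type has entropy 0 and is kept, while a mixed root such as the toy "Seoul" hierarchy (entropy 1.5 bits) is removed. The same `f1` helper applied to the reported precision and recall, `f1(0.8394, 0.7848)`, gives about 0.8112, which agrees with the stated 81.12%.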

OLAP and Decision Tree Analysis of Productivity Affected by Construction Duration Impact Factors (공사기간 영향요인에 따른 생산성의 OLAP 분석과 의사결정트리 분석)

  • Ryu, Han-Guk
    • Journal of the Korea Institute of Building Construction / v.11 no.2 / pp.100-107 / 2011
  • As construction duration significantly influences the performance and success of construction projects, it is necessary to manage the impact factors that affect it appropriately. Interest in the construction industry has been rising due to recent changes in the construction legal system and to competition among construction companies on construction time. However, the impact factors are extremely diverse. The existing productivity data on impact factors is not sufficient to properly identify the impact factors and measure productivity from various perspectives, such as subcontractor, time, crew, and work. In this respect, multidimensional analysis with a data warehouse is very helpful for viewing how productivity is affected by impact factors from various perspectives. Therefore, this research proposes a method that effectively takes the diverse productivity data of impact factors and generates a multidimensional analysis. Decision tree analysis, a data mining technique, is also applied in order to supply construction managers with appropriate productivity data on impact factors during the construction management process.

An Incremental Web Document Clustering Based on the Transitive Closure Tree (이행적 폐쇄트리를 기반으로 한 점증적 웹 문서 클러스터링)

  • Youn Sung-Dae;Ko Suc-Bum
    • Journal of Korea Multimedia Society / v.9 no.1 / pp.1-10 / 2006
  • In document clustering, the k-means algorithm and Hierarchical Agglomerative Clustering (HAC) are often used. The k-means algorithm has the advantage of fast processing, and HAC has the advantage of precise classification. Each method, however, has the other's strength as its weakness: k-means yields low classification quality, and HAC is slow. Both methods also share a serious problem: document similarity must be recomputed whenever a new document is inserted into a cluster. A key property of web resources is that information accumulates as new documents are added frequently. Therefore, we propose a new transitive closure tree method, based on HAC, that improves the processing time of document clustering, together with a superior incremental clustering method that handles the insertion of a new document and the deletion of a document contained in a cluster. The proposed method is compared with the existing algorithms in terms of precision, recall, F-measure, and processing time, and we present the experimental results.

Automatic Construction of Reduced Dimensional Cluster-based Keyword Association Networks using LSI (LSI를 이용한 차원 축소 클러스터 기반 키워드 연관망 자동 구축 기법)

  • Yoo, Han-mook;Kim, Han-joon;Chang, Jae-young
    • Journal of KIISE / v.44 no.11 / pp.1236-1243 / 2017
  • In this paper, we propose a novel way of producing keyword networks, named LSI-based ClusterTextRank, which extracts significant key words from a set of clusters with a mutual information metric, and constructs an association network using latent semantic indexing (LSI). The proposed method reduces the dimension of documents through LSI, decomposes documents into multiple clusters through k-means clustering, and expresses the words within each cluster as a maximal spanning tree graph. The significant key words are identified by evaluating their mutual information within clusters. Then, the method calculates the similarities between the extracted key words using the term-concept matrix, and the results are represented as a keyword association network. To evaluate the performance of the proposed method, we used travel-related blog data and showed that the proposed method outperforms the existing TextRank algorithm by about 14% in terms of accuracy.
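
One step of this pipeline, expressing the words of a cluster as a maximal spanning tree, can be sketched with Kruskal's algorithm run over edges sorted from heaviest to lightest. This is a generic illustration and not the authors' implementation: the abstract does not say which spanning tree algorithm or edge weights are used, and the words and association weights below are hypothetical.

```python
def maximum_spanning_tree(nodes, weighted_edges):
    """Kruskal's algorithm, taking the heaviest edges first.

    weighted_edges: iterable of (weight, u, v). Returns the chosen edges.
    """
    parent = {node: node for node in nodes}

    def find(node):
        # Follow parent links to the component root, shortening the path.
        while parent[node] != node:
            parent[node] = parent[parent[node]]
            node = parent[node]
        return node

    tree = []
    for weight, u, v in sorted(weighted_edges, reverse=True):
        root_u, root_v = find(u), find(v)
        if root_u != root_v:  # the edge joins two components, so no cycle
            parent[root_u] = root_v
            tree.append((weight, u, v))
    return tree


# Hypothetical association weights between the words of one cluster.
words = ["hotel", "beach", "flight", "resort"]
edges = [
    (0.9, "hotel", "resort"),
    (0.8, "beach", "resort"),
    (0.7, "hotel", "beach"),
    (0.4, "flight", "hotel"),
    (0.2, "flight", "beach"),
]
tree = maximum_spanning_tree(words, edges)
```

With these toy weights the tree keeps three edges (hotel-resort, beach-resort, flight-hotel) and drops hotel-beach, because that edge would close a cycle among words that are already connected through stronger associations.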

Analysis for Changes of Mode Choice Behavior from Providing Real-time Schedule for Public Transportation by Smartphone Application (스마트폰 애플리케이션을 이용한 대중교통 운행정보 제공에 따른 통행자 수단선택 행태변화 분석)

  • Choi, Sung-Taek;Rho, Jeong-Hyun
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.11 no.6 / pp.60-69 / 2012
  • Public transport information services that use smartphone apps have received attention as a way to reduce transport problems. Smartphones can offer real-time information through an LBS (Location Based Service) system. This study examines which factors affect the mode choice ratio of public transport, with a focus on smartphone apps. In a paired comparison, rising oil prices, traffic congestion, public information service with smartphone apps, and BIS (Bus Information System) scored 0.39, 0.27, 0.18, and 0.16, respectively. Younger respondents and students prefer the smartphone public information service. The decision tree shows that the most important decision factor is the smartphone information service.

A Study on the Effective Selection of Tunnel Reinforcement Methods using Decision Tree Technique (의사결정트리 기법을 이용한 터널 보조공법 선정방안 연구)

  • Kim, Jong-Gyu;Sagong, Myung;Lee, Jun S.;Lee, Yong-Joo
    • KSCE Journal of Civil and Environmental Engineering Research / v.26 no.4C / pp.255-264 / 2006
  • The auxiliary reinforcement method is normally applied to prevent a possible collapse of the tunnel face where the ground condition is unfavorable or geologic information is insufficient. Recently, several engineering approaches have been made to choose effective reinforcement methods using expert systems based on, among others, neural networks and fuzzy theory. Although expert systems have offered many decision aid tools for selecting the reinforcement method properly, the quantitative assessment items are not easy to estimate. For this reason, this study introduces data mining, a technique widely used in social science, medicine, banking, and agriculture. Using a decision tree together with PDA, decision aids for selecting the reinforcement method are built from field construction data to derive field rules. Future work will concentrate on applying the proposed methods to a variety of underground development cases.