• Title/Summary/Keyword: data mining (데이터 마이닝)

Design and Implementation of a Spatial Data Mining System (공간 데이터 마이닝 시스템의 설계 및 구현)

  • Bae, Duck-Ho;Baek, Ji-Haeng;Oh, Hyun-Kyo;Song, Ju-Won;Kim, Sang-Wook;Choi, Myoung-Hoi;Jo, Hyeon-Ju
    • Journal of Korea Spatial Information System Society / v.11 no.2 / pp.119-132 / 2009
  • Advances in GIS technology have led to the accumulation of a vast volume of spatial data, creating a need for spatial data mining techniques. In this paper, we propose a new spatial data mining system named SD-Miner. SD-Miner consists of three parts: a graphical user interface for inputs and outputs, a data mining module that processes spatial mining functionalities, and a data storage model that stores and manages spatial as well as non-spatial data by using a DBMS. In particular, the data mining module provides major data mining functionalities such as spatial clustering, spatial classification, spatial characterization, and spatio-temporal association rule mining. SD-Miner has the following characteristics: (1) it lets users perform non-spatial as well as spatial data mining functionalities intuitively and effectively; (2) it provides spatial data mining functions in the form of libraries, so that applications can use them conveniently; (3) it takes mining parameters in the form of database tables to increase flexibility. To verify the practicality of SD-Miner, we present meaningful results obtained by performing spatial data mining on real-world spatial data.

Buying Customer Classification in Automotive Corporation with Decision Tree (의사결정트리를 통한 자동차산업의 구매패턴분류)

  • Lee, Byoung-Yup;Park, Yong-Hoon;Yoo, Jae-Soo
    • The Journal of the Korea Contents Association / v.10 no.2 / pp.372-380 / 2010
  • Generally, data mining is the process of analyzing data from different perspectives and summarizing it into useful information that can be used to increase revenue, cut costs, or both. It allows users to analyze data from many different dimensions or angles, categorize it, and summarize the relationships identified. Technically, data mining is the process of finding correlations or patterns among dozens of fields in large relational databases. Data mining is one of the fastest growing fields in the computer industry. As computer technology has improved, massive amounts of customer data have been stored in databases. Using this data, decision makers can extract useful information with data mining and make valuable plans. Data mining offers service providers great opportunities to get closer to customers. Data mining doesn't always require the latest technology, but it does require a magic eye that looks beyond the obvious to find and use the hidden knowledge to drive marketing strategies. The automotive market faces an explosion of customer data, but the rate of customer growth is declining. Therefore, we need to determine which customers are profitable clients worth retaining. This paper builds a customer loyalty detection model and analyzes customer buying patterns in the automotive market with data mining, using Quinlan's C4.5 decision tree and basic statistical methods.
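
    As a rough illustration of the split criterion used by C4.5-style decision trees, the sketch below computes the gain ratio of a categorical attribute on a toy customer table. The function names, the column names (`region`, `repeat_buyer`), and the data are hypothetical and are not taken from the paper.

```python
from collections import Counter
from math import log2


def entropy(labels):
    """Shannon entropy (in bits) of a list of class labels."""
    total = len(labels)
    return -sum((n / total) * log2(n / total) for n in Counter(labels).values())


def gain_ratio(rows, attribute, target):
    """Gain ratio of splitting `rows` (a list of dicts) on `attribute`."""
    total = len(rows)
    # Group the target labels by the value of the candidate attribute.
    partitions = {}
    for row in rows:
        partitions.setdefault(row[attribute], []).append(row[target])
    # Information gain = entropy before the split - weighted entropy after it.
    remainder = sum(len(p) / total * entropy(p) for p in partitions.values())
    gain = entropy([row[target] for row in rows]) - remainder
    # Split information penalizes attributes with many distinct values.
    split_info = entropy([row[attribute] for row in rows])
    return gain / split_info if split_info > 0 else 0.0


# Hypothetical toy data (an assumption for this sketch, not the paper's data).
customers = [
    {"region": "north", "repeat_buyer": "yes"},
    {"region": "north", "repeat_buyer": "yes"},
    {"region": "south", "repeat_buyer": "no"},
    {"region": "south", "repeat_buyer": "no"},
]
print(gain_ratio(customers, "region", "repeat_buyer"))
```

    A tree builder in this style would evaluate every candidate attribute this way and split on the one with the highest gain ratio.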

Grid-based Biological Data Mining using Dynamic Load Balancing (동적 로드 밸런싱을 이용한 그리드 기반의 생물학 데이터 마이닝)

  • Ma, Yong-Beom;Kim, Tae-Young;Lee, Jong-Sik
    • Journal of the Korea Society for Simulation / v.19 no.2 / pp.81-89 / 2010
  • Biological data mining has drawn attention as the volume of biological data grows rapidly. Grid technology makes it possible to share and utilize computing data and resources. In this paper, we propose a hybrid system that combines biological data mining with grid technology. In particular, we propose a decision range adjustment algorithm to improve the processing efficiency of biological data mining. With this algorithm, a reliable data mining recognition rate is obtained automatically and rapidly. Communication load and resource allocation are key issues in a grid environment because the resources are geographically distributed and interact with one another. Therefore, we propose a dynamic load balancing algorithm and apply it to the grid-based biological data mining method. For performance evaluation, we measure average processing time, average communication time, and average resource utilization. Experimental results show that this method offers advantages in processing time and cost.

Finding Frequent Itemsets based on Open Data Mining in Data Streams (데이터 스트림에서 개방 데이터 마이닝 기반의 빈발항목 탐색)

  • Chang, Joong-Hyuk;Lee, Won-Suk
    • The KIPS Transactions:PartD / v.10D no.3 / pp.447-458 / 2003
  • The basic assumption of conventional data mining methodology is that the data set of a knowledge discovery process should be fixed and available before the process can proceed. Consequently, this assumption is valid only when the static knowledge embedded in a specific data set is the target of data mining. In addition, a conventional data mining method requires considerable computing time to produce a mining result from a large data set. For these reasons, it is almost impossible to apply such a method to a real-time analysis task in a data stream, where new transactions are continuously generated and the up-to-date mining result, including the newly generated transactions, is needed as quickly as possible. In this paper, a new mining concept, open data mining in a data stream, is proposed for this purpose. In open data mining, whenever a transaction is newly generated, the updated mining result over all transactions, including the new one, is obtained instantly. To implement this mechanism efficiently, it is necessary to incorporate the delayed insertion of newly identified information in recent transactions as well as the pruning of insignificant information in the mining result of past transactions. The proposed algorithm is analyzed through a series of experiments to identify its various characteristics.

Data Mining Technology for Efficient Information Application (교육에서의 효율적인 정보 활용을 위한 데이터 마이닝 기법)

  • Lee, Chul-Hwan;Han, Sun-Gwan
    • Journal of The Korean Association of Information Education / v.3 no.1 / pp.75-85 / 1999
  • The purpose of this paper is to apply data mining methods to database systems so that educational data in elementary and secondary education can be used more efficiently. First, this study surveys data mining as a whole and its relation to machine learning. The main database systems in education hold general student-life records, health records, and score reports. We suggest data mining and machine learning methods for searching for useful information in a particular representational form, or a set of such representations, in data. We also present the problems that arise when using data mining techniques in education, together with their solutions.

Using Genetic Rule-Based Classifier System for Data Mining (유전자 알고리즘을 이용한 데이터 마이닝의 분류 시스템에 관한 연구)

  • Han, Myung-Mook
    • Journal of Internet Computing and Services / v.1 no.1 / pp.63-72 / 2000
  • Data mining is the process of nontrivial extraction of hidden knowledge or potentially useful information from data in large databases. Data mining is a multidisciplinary field of research; machine learning, statistics, and computer science all contribute to it. Different classification schemes can be used to categorize data mining methods based on the kinds of tasks to be implemented and the kinds of application classes to be utilized, and classification has been identified as an important task in the emerging field of data mining. Since classification is a basic element of human thinking, it is a well-studied problem in a wide variety of applications. In this paper, we propose a robust classifier system based on a genetic algorithm, and the proposed system is evaluated by applying it to the nDmC problem, which is related to the classification task in data mining.

Privacy-Preserving k-Bits Inner Product Protocol (프라이버시 보장 k-비트 내적연산 기법)

  • Lee, Sang Hoon;Kim, Kee Sung;Jeong, Ik Rae
    • Journal of the Korea Institute of Information Security & Cryptology / v.23 no.1 / pp.33-43 / 2013
  • Research on data mining, which can manage large amounts of information efficiently, has grown with the drastic increase in information. Privacy-preserving data mining can protect the privacy of data owners. There are several privacy-preserving association rule, clustering, and classification protocols. A privacy-preserving association rule protocol is used to find association rules among data and is often used for marketing. In this paper, we propose a privacy-preserving k-bits inner product protocol based on Shamir's secret sharing.
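
    The abstract does not detail the protocol, so the sketch below only illustrates the building block it names: Shamir's (t, n) secret sharing over a prime field. The prime, the function names, and the parameters are assumptions made for illustration. This is not the paper's inner product protocol, and the code is not hardened for real use.

```python
import secrets

# Field modulus: the Mersenne prime 2^61 - 1 (an assumption for this sketch).
PRIME = 2**61 - 1


def split_secret(secret, threshold, num_shares):
    """Split `secret` into `num_shares` points; any `threshold` of them recover it."""
    # Random polynomial of degree threshold-1 whose constant term is the secret.
    coeffs = [secret % PRIME] + [secrets.randbelow(PRIME) for _ in range(threshold - 1)]
    shares = []
    for x in range(1, num_shares + 1):
        y = 0
        for c in reversed(coeffs):  # Horner evaluation of the polynomial at x
            y = (y * x + c) % PRIME
        shares.append((x, y))
    return shares


def recover_secret(shares):
    """Recover the secret by Lagrange interpolation at x = 0 over the prime field."""
    secret = 0
    for i, (xi, yi) in enumerate(shares):
        num, den = 1, 1
        for j, (xj, _) in enumerate(shares):
            if i != j:
                num = (num * -xj) % PRIME
                den = (den * (xi - xj)) % PRIME
        # pow(den, -1, PRIME) computes the modular inverse (Python 3.8+).
        secret = (secret + yi * num * pow(den, -1, PRIME)) % PRIME
    return secret


# Hypothetical usage: 5 shares, any 3 of which reconstruct the secret.
shares = split_secret(1234, threshold=3, num_shares=5)
print(recover_secret(shares[:3]))
```

    Because each share is a point on a polynomial, adding two parties' shares pointwise yields valid shares of the sum of their secrets, which is one reason the scheme is a common basis for secure computation.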

Data Mining Time Series Data With Virtual Transaction (가상 트랜잭션을 이용한 시계열 데이터의 데이터 마이닝)

  • Kim, Min-Su;Kim, Cheol-Hwan;Kim, Eung-Mo
    • The KIPS Transactions:PartD / v.9D no.2 / pp.251-258 / 2002
  • There has been much research on data mining techniques for more advanced applications. However, most of those techniques have focused on transaction data rather than time series data. In this paper, we introduce an approach that converts time series data into virtual transaction data for more useful data mining applications. A virtual transaction is defined as a collection of events that occur relatively close to each other. A virtual transaction generator uses time window or event window methods. Because it produces transaction data, our approach allows time series data to be used with most conventional transaction algorithms without further modification.
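
    One way to picture the time-window conversion described above is sketched below: an event that follows the previous event within a fixed gap joins the same virtual transaction. The gap-based rule, the `max_gap` parameter, and the sample events are assumptions made for illustration; the paper's own generator may define its windows differently.

```python
def virtual_transactions(events, max_gap):
    """Group (timestamp, item) events into virtual transactions.

    A new transaction starts whenever the gap to the previous event
    exceeds `max_gap` (same time unit as the timestamps).
    """
    transactions = []
    current = []
    last_time = None
    for timestamp, item in sorted(events):
        if last_time is not None and timestamp - last_time > max_gap:
            transactions.append(current)
            current = []
        current.append(item)
        last_time = timestamp
    if current:
        transactions.append(current)
    return transactions


# Hypothetical event stream: (seconds, event type); not data from the paper.
events = [(0, "A"), (2, "B"), (3, "C"), (20, "A"), (21, "D"), (50, "B")]
print(virtual_transactions(events, max_gap=5))
# [['A', 'B', 'C'], ['A', 'D'], ['B']]
```

    The resulting item lists have the same shape as market-basket transactions, so they can be passed to a conventional frequent-itemset or association rule algorithm.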

Alert Data Pattern Analysis for Intrusion Detection Systems Using Sequence Pattern Mining (시퀀스 패턴 마이닝 기법을 적용한 침입탐지 시스템의 경보데이터 패턴분석)

  • Shin, Moon-Sun
    • Proceedings of the KAIS Fall Conference / 2010.05a / pp.451-454 / 2010
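  • Intrusion detection is the process of identifying and responding to harmful intrusive actions against computer and network resources. As intrusions into systems become increasingly complex and sophisticated, systems that can respond quickly and accurately are required. Data mining techniques, which analyze large volumes of data to extract meaningful information, can therefore be applied to intelligent, automated detection and to alert data pattern analysis. In this paper, we build an alert data mining engine that applies sequence pattern techniques to analyze alert data patterns. The implemented alert data mining system extends PrefixSpan, an existing sequence pattern algorithm, and can be used for frequent alert sequence analysis and frequent attack sequence analysis of alert data.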

The Analysis Telecommunication Service Market with Data Mining (통신시장에서 마이터 마이닝 분석)

  • 장일동;위승민
    • Proceedings of the Korean Information Science Society Conference / 2001.10a / pp.1-3 / 2001
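  • This paper gives a general introduction to knowledge discovery and data mining and addresses customer churn. Data mining is a modeling technique that finds the patterns inherent in data through an iterative learning process over previously collected data. Applying data mining to the telecommunication service market, we built a customer churn prevention model using an artificial neural network. As competition in the telecommunication service market intensifies, the customer churn rate is one of the difficulties that service providers commonly face. Accordingly, we discuss finding more valuable information in the database and the hit rate of classifying churning customers.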
