• Title/Abstract/Keywords: Data Mining Technique

637 results (processing time: 0.031 s)

Factor Analysis on Injured People Using Data Mining Technique

  • 임영문;황영섭;최요한
    • 대한안전경영과학회지 / Vol. 7, No. 4 / pp. 61-71 / 2005
  • Much research has focused on analyzing industrial accidents in order to reduce them. In a similar vein, this paper presents a propensity analysis of injured workers across various industries using classification and regression trees (CART), a data mining algorithm. The sample comprises 25,157 records from various industries, collected over one year (February 2003 to January 2004) in Kangwon-Do, Korea. Eight independent variables describing the injured workers are used: injury date, injury time, injury month, type of injured person, continuous service period, sex, company size, and age. Of the eight factors expected to be significant, five are found to have salient effects. The timing of an accident (season, hour, day of the week, or month) shows no significant effect. The paper thus identifies common features of injured workers. The results should serve as a starting point for root cause analysis and reduction of industrial accidents, and for developing safety management guidelines.
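
The abstract names CART but does not describe the tree itself. As a rough illustration only, the sketch below shows the Gini-impurity split search that CART repeats at every node. The records, the `age` feature, and the `minor`/`severe` labels are invented for the example and are not taken from the paper.

```python
from collections import Counter

def gini(labels):
    """Gini impurity of a list of class labels (0.0 means a pure node)."""
    n = len(labels)
    if n == 0:
        return 0.0
    return 1.0 - sum((c / n) ** 2 for c in Counter(labels).values())

def best_split(rows, labels, feature):
    """Return (threshold, weighted_gini) of the best binary split on one numeric feature."""
    best = (None, gini(labels))
    values = sorted({r[feature] for r in rows})
    # Candidate thresholds are midpoints between consecutive distinct values.
    for lo, hi in zip(values, values[1:]):
        t = (lo + hi) / 2
        left = [y for r, y in zip(rows, labels) if r[feature] <= t]
        right = [y for r, y in zip(rows, labels) if r[feature] > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / len(labels)
        if score < best[1]:
            best = (t, score)
    return best

# Hypothetical toy records (not from the paper).
rows = [{"age": 22}, {"age": 25}, {"age": 41}, {"age": 47}, {"age": 53}, {"age": 58}]
labels = ["minor", "minor", "minor", "severe", "severe", "severe"]
threshold, impurity = best_split(rows, labels, "age")
```

A full CART implementation would apply this search to every feature, recurse on both child nodes, and prune the resulting tree.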

Temporal Pattern Mining of Moving Objects for Location based Services

  • 이준욱;백옥현;류근호
    • 한국정보과학회논문지:데이타베이스 / Vol. 29, No. 5 / pp. 335-346 / 2002
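  • Location-based services provide location-related information to users on the move. Delivering useful, personalized information to users with minimal resources is an essential capability of a location-based service, and it can be realized through data mining. However, existing data mining research does not consider temporal and spatial attributes together, so it is ill-suited to the targets of location-based services, whose spatial location changes over time. This paper proposes a new data mining technique for discovering useful temporal patterns in the location data of moving objects, which have both temporal and spatial attributes. Spatial operations such as contains are used to generalize the location of a moving object, which is expressed as coordinates on a plane. In addition, time constraints are applied between object locations so that only valid sequences are built during movement pattern mining. Temporal patterns are then generated by finding frequent movement sequences among the resulting sequences of object locations. By taking a spatiotemporal approach that differs from earlier work, the proposed technique can provide a new type of knowledge suited to location-based services, where temporal and spatial semantics matter.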
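
The three steps named in the abstract (generalizing coordinates with a contains operation, applying a time constraint, counting frequent movement sequences) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' algorithm: the rectangular regions, the `max_gap` parameter, and the restriction to length-2 moves are all hypothetical choices.

```python
from collections import Counter

# Hypothetical named regions as axis-aligned rectangles: (xmin, ymin, xmax, ymax).
REGIONS = {"station": (0, 0, 10, 10), "mall": (20, 0, 30, 10), "office": (0, 20, 10, 30)}

def contains(rect, point):
    """Spatial 'contains' test for a rectangle and a point."""
    xmin, ymin, xmax, ymax = rect
    x, y = point
    return xmin <= x <= xmax and ymin <= y <= ymax

def generalize(point):
    """Map raw coordinates to the name of the region containing them, or None."""
    for name, rect in REGIONS.items():
        if contains(rect, point):
            return name
    return None

def region_sequences(track, max_gap):
    """Turn one object's (t, x, y) samples into region sequences.
    A new sequence starts whenever two consecutive samples are more than max_gap apart."""
    sequences, current, last_t = [], [], None
    for t, x, y in sorted(track):
        region = generalize((x, y))
        if region is None:
            continue  # sample lies outside every known region
        if last_t is not None and t - last_t > max_gap:
            sequences.append(current)
            current = []
        if not current or current[-1] != region:
            current.append(region)
        last_t = t
    if current:
        sequences.append(current)
    return sequences

def frequent_moves(sequences, min_support):
    """Count region-to-region moves once per sequence and keep the frequent ones."""
    counts = Counter()
    for seq in sequences:
        counts.update(set(zip(seq, seq[1:])))
    return {move: n for move, n in counts.items() if n >= min_support}

# Hypothetical tracks: object id -> list of (time, x, y).
tracks = {
    "a": [(0, 5, 5), (10, 25, 5), (20, 5, 25)],
    "b": [(0, 2, 3), (8, 22, 4), (100, 5, 22)],  # the long gap breaks the sequence
    "c": [(0, 25, 5), (5, 3, 28)],
}
sequences = [s for track in tracks.values() for s in region_sequences(track, max_gap=30)]
patterns = frequent_moves(sequences, min_support=2)
```

Object `b` reaches the office only after a gap of 92 time units, so its last location forms a separate sequence and does not count toward the mall-to-office move.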

The Development of Temporal Mining Technique Considering the Event Change of State in U-Health

  • 김재인;김대인;황부현
    • 정보처리학회논문지D / Vol. 18D, No. 4 / pp. 215-224 / 2011
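  • U-Health collects patient information through various kinds of sensors, and the resulting stream data can be summarized as interval events that have a start time and an end time. However, most temporal data mining techniques consider only the time at which an event occurs and overlook state changes in the stream data. This paper proposes a temporal mining technique for U-Health that takes event state changes into account. The proposed method overcomes the constraints of the environment by transmitting only the events of interest from the sensors to the server, and it defines four event states for stream data in order to perform temporal mining that considers state changes. Finally, the method removes ambiguity from the discovered rules by describing the causal relationships among events as temporal relation sequences.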
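
To make the idea of summarizing a sensor stream as interval events concrete, here is a small sketch. It covers only the summarization and filtering steps. The abstract does not define the paper's four event states, so the three states and the heart-rate thresholds below are hypothetical stand-ins.

```python
def to_interval_events(readings, classify):
    """Collapse time-ordered (timestamp, value) readings into (state, start, end) intervals."""
    events = []
    for t, value in readings:
        state = classify(value)
        if events and events[-1][0] == state:
            # Same state as the previous reading: extend the open interval.
            events[-1] = (state, events[-1][1], t)
        else:
            events.append((state, t, t))
    return events

def heart_rate_state(bpm):
    """Hypothetical thresholds, for illustration only."""
    if bpm < 60:
        return "low"
    if bpm > 100:
        return "high"
    return "normal"

readings = [(0, 72), (1, 75), (2, 104), (3, 110), (4, 108), (5, 80), (6, 55)]
events = to_interval_events(readings, heart_rate_state)

# Only events of interest would be sent from the sensor to the server.
of_interest = [e for e in events if e[0] != "normal"]
```

Filtering on the sensor side keeps the transmitted data small, which is the constraint the abstract refers to.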

A Study on a Statistical Matching Method Using Clustering for Data Enrichment

  • Kim Soon Y.;Lee Ki H.;Chung Sung S.
    • Communications for Statistical Applications and Methods / Vol. 12, No. 2 / pp. 509-520 / 2005
  • Data fusion is defined as the process of combining data and information from different sources so that useful information can be used effectively. In this paper, we propose a data fusion algorithm that uses k-means clustering for data enrichment, with the aim of improving data quality in the knowledge discovery in databases (KDD) process. An empirical study comparing the proposed technique with existing data fusion techniques shows that the proposed clustering-based technique has a low MSE for continuous fusion variables.
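
One plausible reading of clustering-based statistical matching is sketched below: cluster the donor file on the common variables, then give each recipient record the mean fusion value of its nearest cluster. The abstract does not spell out the procedure, so the `fuse` function, the cluster-mean imputation rule, and the toy data are assumptions.

```python
def nearest(centroids, p):
    """Index of the centroid closest to point p (squared Euclidean distance)."""
    return min(range(len(centroids)),
               key=lambda i: sum((a - b) ** 2 for a, b in zip(centroids[i], p)))

def kmeans(points, k, iterations=20):
    """Plain Lloyd's algorithm with a deterministic start from the first k points."""
    centroids = [list(p) for p in points[:k]]
    for _ in range(iterations):
        clusters = [[] for _ in range(k)]
        for p in points:
            clusters[nearest(centroids, p)].append(p)
        for i, members in enumerate(clusters):
            if members:  # leave an empty cluster's centroid where it was
                centroids[i] = [sum(c) / len(members) for c in zip(*members)]
    return centroids

def fuse(donors, recipients, k):
    """donors: list of (x, y), where x is a tuple of common variables and y the fusion variable.
    recipients: list of x tuples lacking y. Returns one imputed y per recipient."""
    centroids = kmeans([x for x, _ in donors], k)
    sums, counts = [0.0] * k, [0] * k
    for x, y in donors:
        c = nearest(centroids, x)
        sums[c] += y
        counts[c] += 1
    means = [s / n if n else None for s, n in zip(sums, counts)]
    return [means[nearest(centroids, x)] for x in recipients]

# Hypothetical toy data.
donors = [((1.0, 1.0), 10.0), ((9.0, 9.0), 50.0), ((1.0, 2.0), 12.0), ((8.0, 9.0), 54.0)]
recipients = [(2.0, 1.0), (9.0, 8.0)]
imputed = fuse(donors, recipients, k=2)
```

With true values available for the recipients, the MSE mentioned in the abstract would be the mean squared difference between `imputed` and those values.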

Proposing a New Approach for Detecting Malware Based on the Event Analysis Technique

  • Vu Ngoc Son
    • International Journal of Computer Science & Network Security / Vol. 23, No. 12 / pp. 107-114 / 2023
  • Attacks carried out by distributing malware are dangerous and difficult to detect and prevent. Current malware detection studies and proposals are usually based on two main methods: using signature sets, and analyzing abnormal behaviors with machine learning or deep learning techniques. This paper proposes a method for detecting malware on endpoints based on Event IDs using deep learning. Event IDs represent malware behaviors that are tracked and collected in the operating system kernel of an endpoint. Malware detection based on Event IDs is a new research approach that has received little study so far. To this end, the paper combines different data mining methods with deep learning algorithms. The data mining process is presented in detail in Section 2 of the paper.

A Hybrid Data Mining Technique Using Error Pattern Modeling

  • 허준;김종우
    • 한국경영과학회지 / Vol. 30, No. 4 / pp. 27-43 / 2005
  • This paper presents a new hybrid data mining technique that uses error pattern modeling to improve classification accuracy when the target variable is binary. The proposed method increases prediction accuracy by combining two different supervised learning methods. Specifically, the algorithm extracts the subset of training cases on which the two methods give inconsistent predictions and models error patterns from those cases. Based on the error pattern model, the predictions of the two methods are merged to generate the final prediction. The proposed method was tested on 10 real-world data sets. The results show that it outperforms existing methods such as artificial neural networks and decision tree induction.
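
The merging scheme described in the abstract can be illustrated with a deliberately small sketch. The two base learners here are hand-written rules rather than trained models, and the error pattern model is a nearest-neighbour lookup over the disagreement cases. Both are hypothetical stand-ins, since the abstract does not say which learners or which error model the authors used.

```python
def model_a(x):
    """Hypothetical base learner A: looks only at the first feature."""
    return 1 if x[0] > 5 else 0

def model_b(x):
    """Hypothetical base learner B: looks only at the second feature."""
    return 1 if x[1] > 5 else 0

def fit_error_patterns(X, y):
    """Keep the training cases where A and B disagree, tagged with the model that was right.
    With a binary target, exactly one of the two is right on every disagreement."""
    patterns = []
    for x, target in zip(X, y):
        pa, pb = model_a(x), model_b(x)
        if pa != pb:
            patterns.append((x, "a" if pa == target else "b"))
    return patterns

def predict(x, patterns):
    """Use the shared prediction when A and B agree; otherwise trust the model
    that was right on the most similar disagreement case."""
    pa, pb = model_a(x), model_b(x)
    if pa == pb or not patterns:
        return pa
    _, trusted = min(patterns,
                     key=lambda p: sum((u - v) ** 2 for u, v in zip(p[0], x)))
    return pa if trusted == "a" else pb

# Hypothetical training data in which the true label is 1 if either feature is large.
X = [(8, 2), (7, 1), (2, 8), (1, 9), (9, 9), (1, 1)]
y = [1, 1, 1, 1, 1, 0]
patterns = fit_error_patterns(X, y)
predictions = [predict(x, patterns) for x in [(9, 3), (3, 7), (2, 2)]]
```

Neither base rule is correct everywhere in this toy data, but the combined prediction is right on all three query points because the error patterns record where each rule fails.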

Compromising Multiple Objectives in Production Scheduling: A Data Mining Approach

  • Hwang, Wook-Yeon;Lee, Jong-Seok
    • Management Science and Financial Engineering / Vol. 20, No. 1 / pp. 1-9 / 2014
  • In multi-objective scheduling problems, the objectives are usually in conflict. To obtain a satisfactory compromise and cope with NP-hardness, most existing works have suggested meta-heuristic methods such as genetic algorithms. In this research, we propose a novel data-driven approach for generating a single solution that compromises among multiple rules pursuing different objectives. The proposed method uses a data mining technique, namely random forests, to extract the logic of several historical schedules and aggregate it. Because the method involves learning predictive models, future schedules with the same objectives can be obtained easily and quickly by applying new production data to the models. The approach is illustrated with a simulation study, in which it appears to produce a new solution with balanced scheduling performance.

Development of a Knowledge Discovery System using Hierarchical Self-Organizing Map and Fuzzy Rule Generation

  • Koo, Taehoon;Rhee, Jongtae
    • 한국지능정보시스템학회:학술대회논문집 / 한국지능정보시스템학회 2001: The Pacific Asian Conference on Intelligent Systems 2001 / pp. 431-434 / 2001
  • Knowledge discovery in databases (KDD) is the process of extracting valid, novel, potentially useful, and understandable knowledge from real data. There are many academic and industrial activities involving new technologies and application areas. In particular, data mining is the core step of the KDD process and comprises many algorithms that perform clustering, pattern recognition, and rule induction. The main goals of these algorithms are prediction and description. Prediction means estimating unknown variables. Description is concerned with presenting understandable results in a format suited to human users. We introduce an efficient data mining algorithm that addresses both predictive and descriptive capability. Reasonable patterns are derived from real-world data by a revised neural network model, and a proposed fuzzy rule extraction technique is applied to obtain understandable knowledge. The proposed neural network model is a hierarchical self-organizing system. The rule base is compatible with decision makers' perception because the generated fuzzy rule set reflects human information processing. Results from a real-world application are analyzed to evaluate the system's performance.

Identification Process Variables and Process Improvement Using Data Mining

  • 정영수;강창욱;변성규
    • 산업경영시스템학회지 / Vol. 28, No. 3 / pp. 166-171 / 2005
  • As databases have developed, the volume of data on process variables and manufacturing processes has grown too large for traditional statistical process control methods to identify the process variables related to assignable causes. Data mining is useful in this situation and provides a variety of approaches for improving the process. In this paper, we apply control charts to monitor the process and, when assignable causes are detected, apply the SVM technique and sequence pattern analysis to find the suspect process variables. These techniques make it possible to predict the behavior of process variables. We illustrate the proposed methods with real manufacturing process data.
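
The monitoring step can be illustrated with a basic Shewhart-style chart. This sketch covers only the control chart, not the SVM or sequence pattern analysis that follows it in the paper. It estimates sigma with the sample standard deviation of an in-control baseline, which is a simplification, and the measurements are invented.

```python
from statistics import mean, stdev

def control_limits(baseline, k=3.0):
    """Return (lower limit, center line, upper limit) from in-control baseline data."""
    center = mean(baseline)
    sigma = stdev(baseline)  # simplification: sample standard deviation of the baseline
    return center - k * sigma, center, center + k * sigma

def out_of_control(samples, lcl, ucl):
    """Indices of samples that fall outside the control limits."""
    return [i for i, x in enumerate(samples) if x < lcl or x > ucl]

# Hypothetical measurements of one process variable.
baseline = [10.1, 9.9, 10.0, 10.2, 9.8, 10.0, 10.1, 9.9]
lcl, center, ucl = control_limits(baseline)
signals = out_of_control([10.0, 10.3, 11.2, 9.9, 8.7], lcl, ucl)
```

In the workflow the abstract describes, a non-empty `signals` list is what would trigger the search for suspect process variables.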

Study Factors for Student Performance Applying Data Mining Regression Model Approach

  • Khan, Shakir
    • International Journal of Computer Science & Network Security / Vol. 21, No. 2 / pp. 188-192 / 2021
  • In this paper, we apply data mining techniques and machine learning algorithms for prediction using R software. We apply a regression model to test several factors in the dataset that we assumed affect student performance. The model was built on an existing dataset that contains many factors and the final grades. The factors tested are the attention to higher education, absences, study time, parents' education level, parents' jobs, and the number of past failures. The results show that only study time and absences affect students' performance. Predicting student academic performance helps instructors understand how well or how poorly the students in their classes will perform, so that they can take proactive measures to improve student learning. The paper also examines how the prediction algorithm can be used to identify the most important attributes in a student's data.