• Title/Abstract/Keywords: Data Mining Models

Search results: 414 items (processing time 0.027 s)

A Study on Intermittent Demand Forecasting of Patriot Spare Parts Using Data Mining

  • 박천규;마정목
    • 한국산학기술학회논문지 / Vol. 22, No. 3 / pp. 234-241 / 2021
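  • The military recognizes the importance of demand forecasting, and much research has been conducted to improve forecast accuracy for spare parts. Spare parts demand forecasting is a very important factor in budget management and equipment availability. However, the time series models currently applied in the military have limited ability to forecast intermittent demand, where both the demand quantity and the interval between occurrences are irregular. This study therefore proposes a method to improve forecast accuracy for the intermittent demand of Air Force Patriot spare parts. To this end, demand types were classified based on the consumption quantities of 701 spare parts from 2013 to 2019, and intermittent demand data for spare parts were collected. In addition, temperature and equipment operating time were identified as external factors that can affect equipment failure and were selected as input variables. Forecasts were then made from the consumption quantities and external factors using both the time series models applied in the military and the proposed data mining models, and the forecast accuracy of each model was assessed. The data mining models achieved higher forecast accuracy than the existing time series models, and among them the multilayer perceptron model performed best.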
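The abstract does not say how demand types were classified. As a rough illustration only, the sketch below assumes one widely used scheme: the average inter-demand interval (ADI) and the squared coefficient of variation (CV²) of non-zero demand sizes, with the commonly cited cutoffs of 1.32 and 0.49. The function name, the cutoffs as defaults, and the sample series are all assumptions, not details from the paper.

```python
import statistics

def classify_demand(series, adi_cut=1.32, cv2_cut=0.49):
    """Label a demand series as smooth, erratic, intermittent, or lumpy.

    Assumed scheme (not taken from the paper): ADI is approximated as
    periods / non-zero periods, and CV^2 is computed on non-zero demand sizes.
    """
    nonzero = [d for d in series if d > 0]
    if not nonzero:
        return "no demand"
    adi = len(series) / len(nonzero)
    mean_size = sum(nonzero) / len(nonzero)
    cv2 = (statistics.pstdev(nonzero) / mean_size) ** 2
    if adi < adi_cut:
        return "smooth" if cv2 < cv2_cut else "erratic"
    return "intermittent" if cv2 < cv2_cut else "lumpy"

# Hypothetical monthly consumption of one spare part.
monthly = [0, 2, 0, 0, 3, 0, 0, 0, 2, 0, 0, 3]
label = classify_demand(monthly)
print(label)
```

Series labeled intermittent or lumpy under such a scheme are the ones where standard time series models tend to struggle, which is the case the study targets.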

An Analytical Study to Derive Subscription Factors for IP-Based Wired Internet Telephony: The Effect of Bundled Telecommunication Services

  • 하성호;양정원
    • 한국데이타베이스학회:학술대회논문집 / 한국데이타베이스학회 2010년도 춘계국제학술대회 / pp. 187-199 / 2010
  • Recently, Internet Telephony has become increasingly popular in the telecommunication industry. However, previous research on Internet Telephony has focused on analyzing specific Internet Telephony solutions and on identifying the Internet Telephony movement itself. Research on prediction models for Internet Telephony adoption has been minimal. The main purpose of this study is to develop models for predicting the intention to switch from traditional telephones to Internet Telephony. To do so, this study uses data mining methods to analyze demand in the IT communications market and to provide management strategies for Internet Telephony providers. In particular, this study uses discriminant analysis, logistic regression, classification trees, and neural nets to develop prediction models for Internet Telephony adoption. The models are compared with each other and a superior model is chosen.


Input Variable Importance in Supervised Learning Models

  • Huh, Myung-Hoe;Lee, Yong Goo
    • Communications for Statistical Applications and Methods / Vol. 10, No. 1 / pp. 239-246 / 2003
  • Statisticians, or data miners, are often asked to assess the importance of input variables in a given supervised learning model. For this purpose, one may rely on separate ad hoc measures that depend on the model type, such as linear regression, neural networks, or trees. Consequently, input variable importance measures lack conceptual consistency, so they cannot be used directly to compare different types of models, which is often done in data mining processes. In this short communication, we propose a unified approach to measuring the importance of input variables. Our method uses sensitivity analysis, which begins by perturbing the values of input variables and monitors the change in output. The research scope is limited to models for continuous output, although it is not difficult to extend the method to supervised learning models for categorical outcomes.
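The perturb-and-monitor idea can be sketched without reference to any particular model type. The abstract does not give the authors' exact importance measure, so the code below is only one plausible reading: add noise to one input column at a time and record the mean absolute change in a continuous output. The names `sensitivity_importance` and `toy_model`, the Gaussian noise, and the `noise_scale` parameter are assumptions made for illustration.

```python
import random
import statistics

def sensitivity_importance(model, rows, noise_scale=0.1, seed=0):
    """Score each input by the mean absolute output change under perturbation."""
    rng = random.Random(seed)
    n_features = len(rows[0])
    baseline = [model(row) for row in rows]
    scores = []
    for j in range(n_features):
        # Scale the noise by the spread of column j so inputs are comparable.
        spread = statistics.pstdev([row[j] for row in rows])
        changes = []
        for row, base in zip(rows, baseline):
            perturbed = list(row)
            perturbed[j] += rng.gauss(0.0, noise_scale * spread)
            changes.append(abs(model(perturbed) - base))
        scores.append(sum(changes) / len(changes))
    return scores

def toy_model(x):
    # Hypothetical fitted model: strong effect of x[0], weak x[1], none for x[2].
    return 3.0 * x[0] + 0.5 * x[1] + 0.0 * x[2]

data_rng = random.Random(42)
rows = [[data_rng.uniform(0, 1) for _ in range(3)] for _ in range(200)]
scores = sensitivity_importance(toy_model, rows)
print(scores)
```

Because the function only calls `model(row)`, the same scoring could be applied to a regression, a neural network, or a tree, which is the kind of cross-model comparability the abstract is after.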

Hybrid Internet Business Model using Evolutionary Support Vector Regression and Web Response Survey

  • Jun, Sung-Hae
    • 한국지능시스템학회:학술대회논문집 / 한국퍼지및지능시스템학회 2006년도 추계학술대회 학술발표 논문집 Vol. 16, No. 2 / pp. 408-411 / 2006
  • Currently, the nano economy threatens the mass economy, a shift that is based on internet business models. Internet-based nano business models require diversely personalized services, and personalization on the web has been studied extensively. Web usage mining with click stream data is one tool for building a personalization model. In this paper, we propose an internet business model that uses an evolutionary support vector machine and a web response survey for web usage mining. After analyzing click stream data, we construct a personalized service model. We also use the web response survey to improve customer satisfaction. In experiments, we verify the performance of the proposed model using two data sets, one from KDD Cup 2000 and one from our web server.


A Real-Time Data Mining for Stream Data Sets

  • 김진화;민진영
    • 한국경영과학회지 / Vol. 29, No. 4 / pp. 41-60 / 2004
  • Stream data is a data set that accumulates continuously in data storage from a data source over time. The size of this data set, in many cases, becomes increasingly large over time. Mining information from this massive data requires considerable resources such as storage, memory, and time. These characteristics make it difficult and expensive to use large data accumulated over time. On the other hand, if we use only recent data or part of the whole data to mine information or patterns, potentially useful information can be lost. To avoid this problem, we suggest a method that efficiently accumulates information, in the form of rule sets, over time. It requires much less storage than traditional mining methods. These accumulated rule sets are used as prediction models in the future. According to theories of ensemble approaches, a combination of many prediction models, here in the form of systematically merged rule sets, performs better than a single prediction model. This study uses a customer data set to predict the buying power of customers based on their information. It tests the performance of the suggested method on this data set along with general prediction methods and compares their performance.

Analysis of English abstracts in Journal of the Korean Data & Information Science Society using topic models and social network analysis

  • 김규하;박철용
    • Journal of the Korean Data and Information Science Society / Vol. 26, No. 1 / pp. 151-159 / 2015
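  • In this paper, we analyzed the English abstracts of papers published in the Journal of the Korean Data & Information Science Society using text mining techniques. First, term-document matrices were generated by various methods and visualized through social network analysis. In addition, LDA (latent Dirichlet allocation) and CTM (correlated topic model) were used to extract topics. The performance of the topic extraction models was compared using entropy, across different numbers of topics and different methods of generating the term-document matrix.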
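To make the two building blocks named in this abstract concrete, here is a minimal sketch of a raw-count term-document matrix and of Shannon entropy for a probability distribution. The abstract mentions several matrix generation methods and does not specify how entropy was applied to the topic models, so whitespace tokenization, raw counts, the natural logarithm, and the sample documents are all assumptions.

```python
import math
from collections import Counter

def term_document_matrix(docs):
    """Return (vocab, matrix) where matrix[i][j] counts term i in document j."""
    tokenized = [doc.lower().split() for doc in docs]
    vocab = sorted({term for tokens in tokenized for term in tokens})
    counts = [Counter(tokens) for tokens in tokenized]
    matrix = [[c[term] for c in counts] for term in vocab]
    return vocab, matrix

def shannon_entropy(probs):
    """Shannon entropy in nats; zero-probability entries contribute nothing."""
    return -sum(p * math.log(p) for p in probs if p > 0)

# Hypothetical mini-corpus standing in for the journal abstracts.
docs = ["data mining models", "topic models for text", "text mining"]
vocab, matrix = term_document_matrix(docs)
for term, row in zip(vocab, matrix):
    print(term, row)
```

Terms that share documents (rows with overlapping non-zero columns) are the ones a co-occurrence network would link, which is how such a matrix feeds into a social network style visualization.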

A Study on Determinants of Stockpile Ammunition using Data Mining

  • 노유찬;조남욱;이동녁
    • 품질경영학회지 / Vol. 48, No. 2 / pp. 297-307 / 2020
  • Purpose: The purpose of this study is to analyze the factors that affect ammunition performance by applying data mining techniques to the Ammunition Stockpile Reliability Program (ASRP) data of the 155mm propelling charge. Methods: ASRP data from 1999 to 2017 were used. Logistic regression and decision tree analysis were used to investigate the factors that affect ammunition performance. The performance of each model was evaluated through comparison with an artificial neural network (ANN) model. Results: Logistic regression and decision tree analysis showed that the major defect rate in visual inspection is the most significant factor. Muzzle velocity by base charge and muzzle velocity by increment charge are also among the significant factors affecting the performance of the 155mm propelling charge. To validate the logistic regression and decision tree models, their classification accuracies were compared with the results of an ANN model. The results indicate that the logistic regression and decision tree models show sufficient performance, which confirms the validity of the models. Conclusion: The main contribution of this paper is that, to the best of our knowledge, it is the first attempt to identify the significant factors in ASRP data using data mining techniques. The approaches suggested in the paper could also be extended to other types of ammunition data.

Applications of artificial intelligence and data mining techniques in soil modeling

  • Javadi, A.A.;Rezania, M.
    • Geomechanics and Engineering / Vol. 1, No. 1 / pp. 53-74 / 2009
  • In recent years, several computer-aided pattern recognition and data mining techniques have been developed for modeling soil behavior. The main idea behind a pattern recognition system is that it learns adaptively from experience and is able to provide predictions for new cases. Artificial neural networks are the most widely used pattern recognition methods that have been utilized to model soil behavior. Recently, the authors have pioneered the application of genetic programming (GP) and evolutionary polynomial regression (EPR) techniques for modeling soils and a number of other geotechnical applications. The paper reviews applications of pattern recognition and data mining systems in geotechnical engineering, with particular reference to constitutive modeling of soils. It covers applications of artificial neural network, genetic programming and evolutionary programming approaches for soil modeling. It is suggested that these systems could be developed as efficient tools for modeling soils and analyzing geotechnical engineering problems, especially in cases where the behavior is too complex for conventional models to describe effectively. It is also recognized that these techniques are complementary to conventional soil models rather than a substitute for them.

The Transfer Technique among Decision Tree Models for Distributed Data Mining

  • 김충곤;우정근;백성욱
    • 디지털콘텐츠학회 논문지 / Vol. 8, No. 3 / pp. 309-314 / 2007
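  • For distributed data mining, the decision tree algorithm must be adapted to suit a distributed collaborative environment. The distributed data mining system presented in this paper consists of agents, each of which performs data mining on the partial data at its own site, and a mediator that arbitrates communication among the agents so that a final decision tree model can be completed through their collaboration. One advantage of distributed data mining is that, because large volumes of data spread across multiple sites are processed in a distributed manner, the time required for data mining can be reduced significantly. However, if communication among the agents at the sites carries an excessive load, the system becomes less useful as an efficient system. This paper focuses on a methodology that minimizes the amount of decision tree model data transferred among agents.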


Using Data Mining Techniques to Predict Win-Loss in Korean Professional Baseball Games

  • 오윤학;김한;윤재섭;이종석
    • 대한산업공학회지 / Vol. 40, No. 1 / pp. 8-17 / 2014
  • In this research, we employed various data mining techniques to build predictive models for win-loss prediction in Korean professional baseball games. The historical data on players and teams were obtained from the official materials provided on the KBO website. From the collected raw data, we prepared two additional types of dataset, in ratio and binary format respectively. The ratio dataset was generated by dividing the away team's records by the corresponding home team's records, while the binary dataset was obtained by comparing the record values. We applied seven classification techniques to the three (raw, ratio, and binary) datasets: decision tree, random forest, logistic regression, neural network, support vector machine, linear discriminant analysis, and quadratic discriminant analysis. Among the 21 (= 3 datasets $\times$ 7 techniques) prediction scenarios, the most accurate model was obtained from the random forest technique on the binary dataset, with a prediction accuracy of 84.14%. We also observed that the ratio and binary datasets helped build better prediction models than the raw data. Using the variable selection capability of decision tree, random forest, and stepwise logistic regression, we found that annual salary, earned runs, strikeouts, pitcher's winning percentage, and walks (bases on balls) are important winning factors in a game. This research is distinct from existing studies in that we used three different types of data and various data mining techniques for win-loss prediction in Korean professional baseball games.
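The ratio and binary dataset construction described in this abstract can be sketched as follows. The abstract does not state the direction of the binary comparison or how ties and zero denominators were handled, so the sketch assumes that the binary value is 1 when the away-team record exceeds the home-team record and that all home-team values are non-zero. The statistic names and numbers are invented for illustration.

```python
def make_ratio_row(away, home):
    # Ratio format: away-team record divided by the home-team record.
    # Assumes every home-team value is non-zero.
    return {key: away[key] / home[key] for key in away}

def make_binary_row(away, home):
    # Binary format (assumed direction): 1 if away exceeds home, else 0.
    return {key: int(away[key] > home[key]) for key in away}

# Hypothetical season records for one game's two teams.
away = {"batting_avg": 0.280, "era": 4.50, "strikeouts": 900}
home = {"batting_avg": 0.250, "era": 3.60, "strikeouts": 1000}

ratio_row = make_ratio_row(away, home)
binary_row = make_binary_row(away, home)
print(ratio_row)
print(binary_row)
```

Both transforms turn two teams' absolute records into a single relative feature per statistic, which may be why the abstract reports better models from the ratio and binary datasets than from the raw data.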