• Title/Summary/Keyword: clustering의사결정나무

Search Result 15, Processing Time 0.019 seconds

A study on decision tree creation using intervening variable (매개 변수를 이용한 의사결정나무 생성에 관한 연구)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.22 no.4
    • /
    • pp.671-678
    • /
    • 2011
  • Data mining searches for interesting relationships among items in a given database. The methods of data mining are decision tree, association rules, clustering, neural network and so on. The decision tree approach is most useful in classification problems and to divide the search space into rectangular regions. Decision tree algorithms are used extensively for data mining in many domains such as retail target marketing, customer classification, etc. When create decision tree model, complicated model by standard of model creation and number of input variable is produced. Specially, there is difficulty in model creation and analysis in case of there are a lot of numbers of input variable. In this study, we study on decision tree using intervening variable. We apply to actuality data to suggest method that remove unnecessary input variable for created model and search the efficiency.

Study on Development of Classification Model and Implementation for Diagnosis System of Sasang Constitution (사상체질 분류모형 개발 및 진단시스템의 구현에 관한 연구)

  • Beum, Soo-Gyun;Jeon, Mi-Ran;Oh, Am-Suk
    • Proceedings of the Korean Institute of Information and Commucation Sciences Conference
    • /
    • 2008.08a
    • /
    • pp.155-159
    • /
    • 2008
  • In this thesis, in order to develop a new classification model of Sasang Constitutional medical types, which is helpful for improving the accuracy of diagnosis of medical types. various data-mining classification models such as discriminant analysis. decision trees analysis, neural networks analysis, logistics regression analysis, clustering analysis which are main classification methods were applied to the questionnaires of medical type classification. In this manner, a model which scientifically classifies constitutional medical types in the field of Sasang Constitutional Medicine, one of a traditional Korean medicine, has been developed. Also, the above-mentioned analysis models were systematically compared and analyzed. In this study, a classification of Sasang constitutional medical types was developed based on the discriminate analysis model and decision trees analysis model of which accuracy is relatively high, of which analysis procedure is easy to understand and to explain and which are easy to implement. Also, a diagnosis system of Sasang constitution was implemented applying the two analysis models.

  • PDF

Predicting Power Generation Patterns Using the Wind Power Data (풍력 데이터를 이용한 발전 패턴 예측)

  • Suh, Dong-Hyok;Kim, Kyu-Ik;Kim, Kwang-Deuk;Ryu, Keun-Ho
    • Journal of the Korea Society of Computer and Information
    • /
    • v.16 no.11
    • /
    • pp.245-253
    • /
    • 2011
  • Due to the imprudent spending of the fossil fuels, the environment was contaminated seriously and the exhaustion problems of the fossil fuels loomed large. Therefore people become taking a great interest in alternative energy resources which can solve problems of fossil fuels. The wind power energy is one of the most interested energy in the new and renewable energy. However, the plants of wind power energy and the traditional power plants should be balanced between the power generation and the power consumption. Therefore, we need analysis and prediction to generate power efficiently using wind energy. In this paper, we have performed a research to predict power generation patterns using the wind power data. Prediction approaches of datamining area can be used for building a prediction model. The research steps are as follows: 1) we performed preprocessing to handle the missing values and anomalous data. And we extracted the characteristic vector data. 2) The representative patterns were found by the MIA(Mean Index Adequacy) measure and the SOM(Self-Organizing Feature Map) clustering approach using the normalized dataset. We assigned the class labels to each data. 3) We built a new predicting model about the wind power generation with classification approach. In this experiment, we built a forecasting model to predict wind power generation patterns using the decision tree.

A study on 3-step complex data mining in society indicator survey (사회지표조사에서의 3단계 복합 데이터마이닝의 적용 방안)

  • Cho, Kwang-Hyun;Park, Hee-Chang
    • Journal of the Korean Data and Information Science Society
    • /
    • v.23 no.5
    • /
    • pp.983-992
    • /
    • 2012
  • Social indicator survey can identify the state of society as a whole. When we create a policy, social indicator survey can reflect the public opinion of the region. Social indicator survey is an important measure of social change. Social indicator survey has been conducted in many municipalities (Seoul, Incheon, Busan, Ulsan, Gyeongsangnamdo, etc.). But, the result of social indicator survey analysis is mainly the basic statistical analysis. In this study, we propose a new data mining methodology for effective analysis. We propose a 3-step complex data mining in society indicator survey. 3-step complex data mining uses three data mining method (intervening association rule, clustering, decision tree).

Discretization of continuous-valued attributes considering data distribution (데이터 분포를 고려한 연속 값 속성의 이산화)

  • 이상훈;박정은;오경환
    • Proceedings of the Korean Institute of Intelligent Systems Conference
    • /
    • 2003.05a
    • /
    • pp.217-220
    • /
    • 2003
  • 본 논문에서는 특정 매개변수의 입력 없이 속성(attribute)에 따른 목적속성(class)값의 분포를 고려하여 연속형(conti-nuous) 값을 범주형(categorical)의 형태로 변환시키는 새로운 방법을 제안하였다. 각각의 속성에 대해 목적속성의 분포를 1차원 공간에 사상(mapping)하고, 각 목적속성의 밀도, 다른 목적속성과의 중복 정도 등의 기준에 따라 구간을 군집화 한다. 이렇게 생성된 군집들은 각각 목적속성을 예측할 수 있는 확률적 수치에 기반한 것으로, 각 속성이 제공하는 정보의 손실을 최소화하는 이산화 경계선을 갖고 있다. 제안된 데이터 이산화 방법의 향상된 성능은 C4.5 알고리즘과 UCI Machine Learning Data Repository 데이터를 사용하여 확인할 수 있다.

  • PDF

Sales Forecasting Model for Apparel Products Using Machine Learning Technique - A Case Study on Forecasting Outerwear Items - (머신 러닝을 활용한 의류제품의 판매량 예측 모델 - 아우터웨어 품목을 중심으로 -)

  • Chae, Jin Mie;Kim, Eun Hie
    • Fashion & Textile Research Journal
    • /
    • v.23 no.4
    • /
    • pp.480-490
    • /
    • 2021
  • Sales forecasting is crucial for many retail operations. For apparel retailers, accurate sales forecast for the next season is critical to properly manage inventory and plan their supply chains. The challenge in this increases because apparel products are always new for the next season, have numerous variations, short life cycles, long lead times, and seasonal trends. In this study, a sales forecasting model is proposed for apparel products using machine learning techniques. The sales data pertaining to outerwear items for four years were collected from a Korean sports brand and filtered with outliers. Subsequently, the data were standardized by removing the effects of exogenous variables. The sales patterns of outerwear items were clustered by applying K-means clustering, and outerwear attributes associated with the specific sales-pattern type were determined by using a decision tree classifier. Six types of sales pattern clusters were derived and classified using a hybrid model of clustering and decision tree algorithm, and finally, the relationship between outerwear attributes and sales patterns was revealed. Each sales pattern can be used to predict stock-keeping-unit-level sales based on item attributes.

Clustering and classification to characterize daily electricity demand (시간단위 전력사용량 시계열 패턴의 군집 및 분류분석)

  • Park, Dain;Yoon, Sanghoo
    • Journal of the Korean Data and Information Science Society
    • /
    • v.28 no.2
    • /
    • pp.395-406
    • /
    • 2017
  • The purpose of this study is to identify the pattern of daily electricity demand through clustering and classification. The hourly data was collected by KPS (Korea Power Exchange) between 2008 and 2012. The time trend was eliminated for conducting the pattern of daily electricity demand because electricity demand data is times series data. We have considered k-means clustering, Gaussian mixture model clustering, and functional clustering in order to find the optimal clustering method. The classification analysis was conducted to understand the relationship between external factors, day of the week, holiday, and weather. Data was divided into training data and test data. Training data consisted of external factors and clustered number between 2008 and 2011. Test data was daily data of external factors in 2012. Decision tree, random forest, Support vector machine, and Naive Bayes were used. As a result, Gaussian model based clustering and random forest showed the best prediction performance when the number of cluster was 8.

Sales Pattern and Related Product Attributes of T-shirts (티셔츠 상품의 판매패턴과 연관된 상품속성)

  • Chae, Jin Mie;Kim, Eun Hie
    • Journal of the Korean Society of Clothing and Textiles
    • /
    • v.44 no.6
    • /
    • pp.1053-1069
    • /
    • 2020
  • This study examined the sales pattern relationship with respect to product attributes to propose sales forecasting for fashion products. We analyzed 537 SKU sales data of T-shirts in the domestic sports brand using SAS program. The sales pattern of fashion products fluctuated and were influenced by exogenous factors; therefore, we removed the influence of exogenous factors found to be price discounts and holiday effects as a result of regression analysis. In addition, it was difficult to predict sales using the sales patterns of the same product since fashion products were released as new products every year. Therefore, the forecasting model was proposed using sales patterns of related product attributes when attributes were considered descriptive variables. We classified sales patterns using K-means clustering in order to explain the relationship between sales patterns and product attributes along with creating a decision tree classifier using attributes as input and sales patterns as output. As a result, the sales patterns of T-shirts were clustered into six types that featured the characteristic shape of peak and slope. It was also associated with the combination of product attributes and their values in regards to the proposed sales pattern prediction model.

A Study on the Database Marketing using Data Mining in the Traditional Medicine (데이터마이닝을 활용한 한방분야에서의 데이터베이스 마케팅에 대한 연구)

  • Lee Sang-Young;Lee Yun-Seok
    • Journal of the Korea Society of Computer and Information
    • /
    • v.10 no.5 s.37
    • /
    • pp.271-280
    • /
    • 2005
  • This study is to elicit the factors affected on the medical examination in the tra야tional medicine using the technical method of the decision tree and characterize the Patient subject by clustering analysis technique. And to draw results from the association analysis between the form of diseases in the re-hospitalized Patient group. The obtained results were analyzed for their effect on the hospital Profits. Thus. through application of the database marketing to the data mining technique in the tradition리 medicine, the characteristics of patient clients for the objective induction of factors affected on the hospital Fronts can be identified. Practical application of the database marketing as presented in this study will bring about a fundamental efficiency of hospital management and vitalization.

  • PDF

Data-driven Co-Design Process for New Product Development: A Case Study on Smart Heating Jacket (신제품 개발을 위한 데이터 기반 공동 디자인 프로세스: 스마트 난방복 사례 연구)

  • Leem, Sooyeon;Lee, Sang Won
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.1
    • /
    • pp.133-141
    • /
    • 2021
  • This research suggests a design process that effectively complements the human-centered design through an objective data-driven approach. The subjective human-centered design process can often lack objectivity and can be supplemented by the data-driven approaches to effectively discover hidden user needs. This research combines the data mining analysis with co-design process and verifies its applicability through the case study on the smart heating jacket. In the data mining process, the clustering can group the users which is the basis for selecting the target groups and the decision tree analysis primarily identifies the important user perception attributes and values. The broad point of view based on the data analysis is modified through the co-design process which is the deeper human-centered design process by using the developed workbook. In the co-design process, the journey maps, needs and pain points, ideas, values for the target user groups are identified and finalized. They can become the basis for starting new product development.