• Title/Summary/Keyword: decision tree model (의사결정 나무모형)


Analysis of Automatic Meter Reading Systems (IBM, Oracle, and Itron) (국외 상수도 원격검침 시스템(IBM, Oracle, Itron) 분석)

  • Joo, Jin Chul;Kim, Juhwan;Lee, Doojin;Choi, Taeho;Kim, Jong Kyu
    • Proceedings of the Korea Water Resources Association Conference / 2017.05a / pp.264-264 / 2017
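  • Data transmission methods in overseas automatic water meter reading systems have been chosen by jointly considering city size, meter density, availability of power supply, and whether a communication network is already installed. Most smart water meter manufacturers have developed and sold meter reading terminals, which transmit the readings (data) supplied by the meter's encoder, linked to a neighborhood area network, and they use radio frequency (RF) communication with proprietary protocols. In wide area networks, the arrangement of the nodes (end meters and sensors) and the links connecting them takes star, mesh, bus, or tree forms, with star and mesh topologies found to be the most widely used. System integration and operation vendors such as IBM, Oracle, and Itron have developed and deployed integrated water management systems, including water infrastructure management and integrated network solutions. Through automatic meter reading systems they provide customers with current consumption, cumulative past consumption, leak detection services, and real-time billing notices via web portals and apps. In addition, some manufacturers offer water suppliers decision support services for optimizing water supply: city water supply/consumption managers monitor residents' water use to identify daily average use and usage trends, detect leaks for repair, present a water use sustainability index, and monitor residents' usage data in real time. Recently, technology has been under development that uses artificial intelligence to pattern household water use curves by end use (laundry, toilet, shower, dishwashing, etc.) with a profiling technique. When a smart water meter detects aggregated water use, a re-profiling technique breaks it down by end use, identifies points of overconsumption in the household, and encourages savings. Furthermore, to forecast future water use, forecasting functions have been built using linear dependence models for analyzing various time series (autoregressive, autoregressive moving average, autoregressive integrated moving average models, etc.) and nonlinear dependence models (fuzzy logic, neural networks, genetic algorithms, etc.), and these are compared against one another to provide the best water use forecasting tool.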
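The abstract names autoregressive models among the linear forecasting tools but gives no implementation. As a minimal sketch only, the following fits a first-order autoregressive model x[t] = c + phi * x[t-1] by ordinary least squares and iterates it forward. The function names and the synthetic `usage` series are assumptions made for illustration; they are not data or code from the surveyed systems.

```python
from statistics import mean


def fit_ar1(series):
    """Fit x[t] = c + phi * x[t-1] by ordinary least squares."""
    prev, curr = series[:-1], series[1:]
    mean_prev, mean_curr = mean(prev), mean(curr)
    cov = sum((p - mean_prev) * (q - mean_curr) for p, q in zip(prev, curr))
    var = sum((p - mean_prev) ** 2 for p in prev)
    phi = cov / var
    c = mean_curr - phi * mean_prev
    return c, phi


def forecast(series, c, phi, steps):
    """Iterate the fitted recursion forward from the last observation."""
    out, last = [], series[-1]
    for _ in range(steps):
        last = c + phi * last
        out.append(last)
    return out


# Hypothetical daily water use, generated by x[t] = 20 + 0.8 * x[t-1]
usage = [60.0]
for _ in range(9):
    usage.append(20 + 0.8 * usage[-1])

c, phi = fit_ar1(usage)
next_three_days = forecast(usage, c, phi, steps=3)
```

Because the synthetic series follows the recursion exactly, the fit recovers its coefficients up to rounding. Real meter data would need differencing, seasonality handling, and out-of-sample comparison against the nonlinear models the abstract mentions.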


Detection of Phantom Transaction using Data Mining: The Case of Agricultural Product Wholesale Market (데이터마이닝을 이용한 허위거래 예측 모형: 농산물 도매시장 사례)

  • Lee, Seon Ah;Chang, Namsik
    • Journal of Intelligence and Information Systems / v.21 no.1 / pp.161-177 / 2015
  • With the rapid evolution of technology, the size, number, and variety of databases have grown, and data mining now faces many challenging applications. One such application is the discovery of fraud patterns in agricultural product wholesale transactions. The agricultural product wholesale market in Korea is huge, and vast numbers of transactions are made every day. Demand for agricultural products continues to grow, and electronic auction systems have raised the operational efficiency of the wholesale market. The number of unusual transactions is assumed to increase in proportion to trading volume, and an unusual transaction is often the first sign of fraud. However, these transactions and the corresponding fraud are very difficult to identify because the types of fraud are more sophisticated than ever before. Fraud can be detected by manually verifying all transaction records, but this requires a significant amount of human resources and is ultimately not practical. Fraud can also be revealed by a victim's report or complaint, but agricultural product wholesale fraud usually has no victims because it is committed through collusion between an auction company and an intermediary wholesaler. Nevertheless, transaction records must be monitored continuously and efforts made to prevent fraud, because fraud not only disturbs fair trade in the market but also rapidly erodes the market's credibility. Applying data mining in such an environment is very useful because it can discover unknown fraud patterns or features in a large volume of transaction data.
The objective of this research is to empirically investigate the factors needed to detect fraudulent transactions in an agricultural product wholesale market by developing a data mining based fraud detection model. One major type of fraud is the phantom transaction, in which the seller (auction company or forwarder) and the buyer (intermediary wholesaler) collude. They pretend to complete a transaction by recording false data in the online transaction processing system without actually selling products, and the seller receives money from the buyer. This leads to overstated sales performance and illegal money transfers, which reduce the credibility of the market. This paper reviews the wholesale market environment, including the types of transactions, the roles of market participants, and the various types and characteristics of fraud, and introduces the whole process of developing the phantom transaction detection model. The process consists of four modules: (1) data cleaning and standardization, (2) statistical data analysis such as distribution and correlation analysis, (3) construction of a classification model using a decision-tree induction approach, and (4) verification of the model in terms of hit ratio. We collected real data from 6 associations of agricultural producers in metropolitan markets. The final decision-tree model revealed that the monthly average trading price of items offered by forwarders is a key variable in detecting phantom transactions. The verification procedure also confirmed the suitability of the results. However, even though the performance achieved in this research is satisfactory, sensitive issues remain in improving classification accuracy and the conciseness of rules. One such issue is the robustness of the data mining model. Data mining is highly data-oriented, so data mining models tend to be very sensitive to changes in data or circumstances.
Thus, this lack of robustness means that a data mining model requires continuous remodeling as data or circumstances change. We hope that this paper offers valuable guidelines to organizations and companies that consider introducing or constructing a fraud detection model in the future.
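Steps (3) and (4) of the process described in this abstract can be illustrated with a one-level decision tree. This is a minimal sketch, assuming a Gini-based split on a single numeric feature and a hit ratio defined as the share of correctly classified records. The abstract does not state the induction algorithm or the exact hit-ratio definition, and the prices and labels below are invented for illustration.

```python
def gini(labels):
    """Gini impurity of a list of 0/1 labels."""
    if not labels:
        return 0.0
    p = sum(labels) / len(labels)
    return 2 * p * (1 - p)


def majority(labels):
    """Most frequent 0/1 label (ties go to 1)."""
    return 1 if 2 * sum(labels) >= len(labels) else 0


def best_split(values, labels):
    """Threshold on one numeric feature that minimizes weighted Gini impurity."""
    pairs = sorted(zip(values, labels))
    n = len(pairs)
    best_t, best_score = None, float("inf")
    for i in range(1, n):
        if pairs[i][0] == pairs[i - 1][0]:
            continue  # no threshold fits between equal values
        t = (pairs[i][0] + pairs[i - 1][0]) / 2
        left = [y for x, y in pairs if x <= t]
        right = [y for x, y in pairs if x > t]
        score = (len(left) * gini(left) + len(right) * gini(right)) / n
        if score < best_score:
            best_t, best_score = t, score
    return best_t, best_score


def hit_ratio(predicted, actual):
    """Share of records whose predicted label equals the actual label."""
    return sum(p == a for p, a in zip(predicted, actual)) / len(actual)


# Hypothetical monthly average trading prices; label 1 = phantom transaction
prices = [1200, 1350, 1100, 5200, 4800, 1250, 5100, 1300]
labels = [0, 0, 0, 1, 1, 0, 1, 0]

threshold, impurity = best_split(prices, labels)
left_label = majority([y for x, y in zip(prices, labels) if x <= threshold])
right_label = majority([y for x, y in zip(prices, labels) if x > threshold])
predicted = [left_label if x <= threshold else right_label for x in prices]
in_sample_hit_ratio = hit_ratio(predicted, labels)
```

The hit ratio here is computed on the same records used to choose the split, so it overstates real performance. A verification step like the one in the abstract would use held-out transactions.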

Exploring On-line Consumption Tendency of Sports 4.0 Market Consumer: Focused on Sports Goods Consumption by Generation of Working Age Population (스포츠 4.0 시장 소비자의 온라인 소비성향 탐색: 생산 가능인구의 세대별 스포츠 용품 소비를 중심으로)

  • Jin-Ho Shin
    • Journal of the Korean Applied Science and Technology / v.40 no.1 / pp.24-34 / 2023
  • This study sought to explore the online consumption propensity for sports goods by generation of the working-age population and to provide basic data for predicting the future consumer market by segmenting online consumers in the sports 4.0 market. A survey was conducted among sports goods consumers in the generational groups (Generation Y and above, Generation Z) of the working-age population, and data from a total of 478 people were used in the final analysis. Data were processed with SPSS Statistics (ver. 21.0) using frequency analysis, exploratory factor analysis, test-retest reliability correlation analysis, reliability analysis, and decision tree analysis. Regarding the online consumption propensity for sports goods by generation, respondents were more likely to be classified into the Generation Z group when the leisure, joy, and environment factors were high. The classification accuracy of the model was 69.7%.

Developing a regional fog prediction model using tree-based machine-learning techniques and automated visibility observations (시정계 자료와 기계학습 기법을 이용한 지역 안개예측 모형 개발)

  • Kim, Daeha
    • Journal of Korea Water Resources Association / v.54 no.12 / pp.1255-1263 / 2021
  • While it could become an alternative water resource, fog can undermine traffic safety and the operational performance of infrastructure. To reduce such adverse impacts, spatially continuous fog risk information is necessary. In this work, tree-based machine-learning models were developed to quantify fog risks from routine meteorological observations alone. Extreme Gradient Boosting (XGB), Light Gradient Boosting (LGB), and Random Forests (RF) were chosen for the regional fog models, using operational weather and visibility observations within Jeollabuk-do province. Results showed that RF appeared to be the most robust at distinguishing fog from non-fog situations during the training and evaluation period of 2017-2019. While LGB predicted fog occurrences better than the others, its false alarm ratio was the highest (0.695) among the three models. The predictability of the three models declined considerably when they were applied to an independent period in 2020, potentially due to the distinctly improved air quality that year under the global lockdown. Nonetheless, even in 2020, all three models were able to produce fog risk information consistent with the spatial variation of observed fog occurrences. This work suggests that tree-based machine learning models could be used as tools to find locations with relatively high fog risks.
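The false alarm ratio quoted in this abstract is a standard categorical verification score. The sketch below assumes the usual contingency-table definitions: false alarm ratio = false alarms / (hits + false alarms), and probability of detection = hits / (hits + misses). The abstract does not spell out its formulas, and the forecast and observation vectors are invented for illustration.

```python
def contingency(forecast, observed):
    """Count hits, false alarms, and misses for binary fog forecasts (1 = fog)."""
    hits = sum(1 for f, o in zip(forecast, observed) if f == 1 and o == 1)
    false_alarms = sum(1 for f, o in zip(forecast, observed) if f == 1 and o == 0)
    misses = sum(1 for f, o in zip(forecast, observed) if f == 0 and o == 1)
    return hits, false_alarms, misses


def false_alarm_ratio(hits, false_alarms):
    """Fraction of fog forecasts that were not followed by observed fog."""
    return false_alarms / (hits + false_alarms)


def probability_of_detection(hits, misses):
    """Fraction of observed fog events that were forecast."""
    return hits / (hits + misses)


# Hypothetical forecasts and observations for ten time steps
forecast = [1, 1, 1, 0, 0, 1, 0, 1, 0, 0]
observed = [1, 0, 1, 0, 1, 0, 0, 1, 0, 0]

hits, false_alarms, misses = contingency(forecast, observed)
far = false_alarm_ratio(hits, false_alarms)
pod = probability_of_detection(hits, misses)
```

A model can raise its detection rate simply by forecasting fog more often, which also raises the false alarm ratio. That is the trade-off the abstract reports for LGB.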

A prediction model for adolescents' skipping breakfast using the CART algorithm for decision trees: 7th (2016-2018) Korea National Health and Nutrition Examination Survey (의사결정나무 CART 알고리즘을 이용한 청소년 아침결식 예측 모형: 제7기 (2016-2018년) 국민건강영양조사 자료분석)

  • Sun A Choi;Sung Suk Chung;Jeong Ok Rho
    • Journal of Nutrition and Health / v.56 no.3 / pp.300-314 / 2023
  • Purpose: This study sought to predict the reasons for skipping breakfast by adolescents aged 13-18 years using the 7th Korea National Health and Nutrition Examination Survey (KNHANES). Methods: The participants included 1,024 adolescents. The data were analyzed using a complex-sample t-test, the Rao-Scott χ² test, and the classification and regression tree (CART) algorithm for decision tree analysis with SPSS v. 27.0. The participants were divided into two groups, one regularly eating breakfast and the other skipping it. Results: A total of 579 and 445 study participants were found to be breakfast consumers and breakfast skippers, respectively. Breakfast consumers were significantly younger than those who skipped breakfast. In addition, breakfast consumers had a significantly higher frequency of eating dinner, had been taught about nutrition, and had a lower frequency of eating out. The breakfast skippers did so to lose weight. Adolescents who skipped breakfast consumed less energy, carbohydrates, proteins, fats, fiber, cholesterol, vitamin C, vitamin A, calcium, vitamin B1, vitamin B2, phosphorus, sodium, iron, potassium, and niacin than those who consumed breakfast. The best predictor of skipping breakfast was whether adolescents sought to control their weight by not eating meals. Other participants who had low and middle-low household incomes, ate dinner 3-4 times a week, were more than 14.5 years old, and ate out once a day showed a higher frequency of skipping breakfast. Conclusion: Based on these results, nutrition education targeted at losing weight correctly and emphasizing the importance of breakfast, especially for adolescents, is required. Moreover, nutrition educators should consider designing and implementing specific action plans to encourage adolescents to improve their breakfast-eating practices by also eating dinner regularly and reducing eating out.

Soil Moisture Estimation Using CART Algorithm and Ancillary Data (CART기법과 보조자료를 이용한 토양수분 추정)

  • Kim, Gwang-Seob;Park, Han-Gyun
    • Journal of Korea Water Resources Association / v.43 no.7 / pp.597-608 / 2010
  • In this study, a soil moisture estimation method was proposed to produce a nationwide soil moisture distribution map using on-site soil moisture observations, rainfall, surface temperature, NDVI, land cover, effective soil depth, and the CART (Classification And Regression Tree) algorithm. The method was applied to the Yong-dam dam basin because the basin's soil moisture data (4 sites) were reliable. Soil moisture observations from 3 sites (Bu-gui, San-jeon, Cheon-cheon2) were used to train the algorithm, and 1 site (Gye-buk2) was used for validation. The correlation coefficient between observed and estimated soil moisture at the validation site is about 0.737. The results show that, despite the lack of reliable soil moisture observations across varied land use, soil type, and topographic conditions, estimating soil moisture with ancillary data and the CART algorithm can be a reasonable approach, since the algorithm provided a fairly good estimate of the soil moisture distribution in the study area.
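The validation figure in this abstract is a correlation coefficient between observed and estimated soil moisture. The sketch below assumes it is the Pearson coefficient, which the abstract does not specify, and shows how such a value is computed. The two short series are invented for illustration and are not the Gye-buk2 data.

```python
from math import sqrt


def pearson_r(observed, estimated):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(observed)
    mean_o = sum(observed) / n
    mean_e = sum(estimated) / n
    cov = sum((o - mean_o) * (e - mean_e) for o, e in zip(observed, estimated))
    spread_o = sqrt(sum((o - mean_o) ** 2 for o in observed))
    spread_e = sqrt(sum((e - mean_e) ** 2 for e in estimated))
    return cov / (spread_o * spread_e)


# Hypothetical volumetric soil moisture values at one validation site
observed = [0.18, 0.22, 0.27, 0.31, 0.25]
estimated = [0.20, 0.21, 0.25, 0.33, 0.26]

r = pearson_r(observed, estimated)
```

A high correlation shows that the estimates track the observed variation, but it does not reveal a constant bias, so an error measure such as RMSE is usually reported alongside it.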

A Study on Propriety of Pilot Aptitude Test Using Phased Analysis of Pilot Training (비행교육과정 단계별 분석을 통한 조종적성검사 항목 타당성 연구)

  • Kim, HeeYoung;Kim, SuHwan;Moon, HoSeok
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.3 / pp.218-225 / 2016
  • Given dramatically advancing aircraft performance and the complexity of military operations brought about by highly developed science and technology, it is important to select personnel with the ideal pilot aptitude. The opportunity cost of dropouts and the fact that human error is the leading cause of aviation accidents are practical reasons why aptitude-based personnel selection matters. This study analyzes the ROKAF pilot aptitude test, which was improved in 2004, using various classification models. It discusses the significance of the selected variables and the direction for future development of the ROKAF pilot aptitude test. The accuracy of the classification models was improved by taking into account the differing characteristics of the individuals who took the test.

An Analysis of Ordinary Mail Service Quality Attributes using Kano Model and Decision Tree Model (카노모형에서 의사결정나무모형을 이용한 통상우편서비스 품질속성 분석)

  • Choi, Hyeon Deok;Riew, Moon Charn
    • Journal of Korean Society for Quality Management / v.44 no.4 / pp.883-895 / 2016
  • Purpose: Demand for ordinary mail services supplied by 'Korea POST' is decreasing due to the opening of the mail service market and the growth of alternative communication media such as e-mail and SNS. To overcome this situation, it is urgent to introduce new services that appeal to customers and to improve existing services. Methods: A field survey was conducted of corporate customers who send ordinary mail and of individual customers who receive it. Quality attributes of ordinary mail services are classified from the two-dimensional perspective of the Kano model. A decision tree model is used to classify the quality attributes. Comparative analyses examine whether corporate and individual customers perceive each quality attribute differently. Results: Quality attributes such as 'discount postal charges', 'sending small packages by simply dropping them into a mail box', 'sending a mail of any appearance', 'delivering a mail anywhere', and 'receiving a mail at a preferred time where a customer is located' are classified differently across some market segments, while most of the quality attributes are classified as attractive or one-dimensional. Conclusion: The decision tree model was found to be most effective for classifying quality attributes for each market segment, especially attributes belonging to 'gray areas'. Based on the perceived differences in quality attributes among customers, strategic implications are suggested for winning potential customers and gaining competitive advantage.

An Optimized Combination of π-fuzzy Logic and Support Vector Machine for Stock Market Prediction (주식 시장 예측을 위한 π-퍼지 논리와 SVM의 최적 결합)

  • Dao, Tuanhung;Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.20 no.4 / pp.43-58 / 2014
  • As the use of trading systems has increased rapidly, many researchers have become interested in developing effective stock market prediction models using artificial intelligence techniques. Stock market prediction involves multifaceted interactions between market-controlling factors and unknown random processes. A successful stock prediction model achieves the most accurate result from minimum input data with the least complex model. In this research, we develop a combination model of π-fuzzy logic and support vector machine (SVM) models, using a genetic algorithm to optimize the parameters of the SVM and π-fuzzy functions, as well as feature subset selection to improve the performance of stock market prediction. To evaluate the performance of our proposed model, we compare the performance of our model to other comparative models, including the logistic regression, multiple discriminant analysis, classification and regression tree, artificial neural network, SVM, and fuzzy SVM models, with the same data. The results show that our model outperforms all other comparative models in prediction accuracy as well as return on investment.

Development of prediction model identifying high-risk older persons in need of long-term care (장기요양 필요 발생의 고위험 대상자 발굴을 위한 예측모형 개발)

  • Song, Mi Kyung;Park, Yeongwoo;Han, Eun-Jeong
    • The Korean Journal of Applied Statistics / v.35 no.4 / pp.457-468 / 2022
  • In an aged society, it is important to prevent older people from developing disabilities that require long-term care. The purpose of this study is to develop a prediction model to identify high-risk groups who are likely to become beneficiaries of Long-Term Care Insurance. This is a retrospective study using previously collected National Health Insurance Service (NHIS) data on the study subjects. The study subjects are the 7,724,101 people over 65 years of age registered for medical insurance. To develop the prediction model, we used logistic regression, decision tree, random forest, and multi-layer perceptron neural network models. Random forest was ultimately selected as the prediction model based on model performance in internal and external validation. Random forest could predict about 90% of the older people in need of long-term care using the database, without any information from the long-term care eligibility assessment. The findings might be useful in evidence-based health management for prevention services and can contribute to preemptively identifying older people who need preventive services.