• Title/Summary/Keyword: Tree mining

Efficient DRG Fraud Candidate Detection Method Using Data Mining Techniques (데이터마이닝 기법을 이용한 효율적인 DRG 확인심사대상건 검색방법)

  • Lee, Jung-Kyu;Jo, Min-Woo;Park, Ki-Dong;Lee, Moo-Song;Lee, Sang-Il;Kim, Chang-Yup;Kim, Yong-Ik;Hong, Du-Ho
    • Journal of Preventive Medicine and Public Health / v.36 no.2 / pp.147-152 / 2003
  • Objectives: To develop a Diagnosis-Related Group (DRG) fraud candidate detection method using data mining techniques, and to examine the efficiency of the developed method. Methods: The study included 79,790 DRGs and their related claims from 8 disease groups (lens procedures with or without vitrectomy; tonsillectomy and/or adenoidectomy only; appendectomy; Cesarean section; vaginal delivery; anal and/or perianal procedures; inguinal and/or femoral hernia procedures; uterine and/or adnexa procedures for nonmalignancy), which had been examined manually over a 32-month period. To construct an optimal prediction model, 38 variables were applied, and the correction rates and lift values of 3 models (decision tree, logistic regression, neural network) were compared. The analyses were performed separately by disease group. Results: The correction rates of the developed method ranged from 15.4% to 81.9% across disease groups, with an overall correction rate of 60.7%. The lift values ranged from 1.9 to 7.3 across disease groups, with an overall lift value of 4.1. Conclusions: These findings suggest that applying data mining techniques is necessary to improve the efficiency of DRG fraud candidate detection.
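
The lift value reported above can be illustrated with a minimal sketch. It assumes that the correction rate is the share of selected claims that reviewers actually corrected, and that lift is the correction rate among the top-scored claims divided by the correction rate over all claims. The abstract does not spell out either definition. The function name `lift_at_top` and the sample scores and labels are hypothetical.

```python
def lift_at_top(scores, labels, fraction):
    """Lift of the top-scoring `fraction` of claims over the overall rate.

    scores: model scores, higher means more likely to need correction
    labels: 1 if the claim was corrected on manual review, else 0
    """
    if not 0 < fraction <= 1:
        raise ValueError("fraction must be in (0, 1]")
    if not scores or len(scores) != len(labels):
        raise ValueError("scores and labels must be non-empty and equal length")

    # Rank claims from highest to lowest score and keep the top slice.
    ranked = sorted(zip(scores, labels), key=lambda pair: pair[0], reverse=True)
    k = max(1, round(len(ranked) * fraction))

    top_rate = sum(label for _, label in ranked[:k]) / k   # rate among flagged claims
    base_rate = sum(labels) / len(labels)                  # rate among all claims
    if base_rate == 0:
        raise ValueError("no corrected claims, lift is undefined")
    return top_rate / base_rate


# Hypothetical data: 10 claims, 3 of which were corrected on review.
scores = [0.9, 0.8, 0.7, 0.6, 0.5, 0.4, 0.3, 0.2, 0.1, 0.05]
labels = [1, 1, 0, 1, 0, 0, 0, 0, 0, 0]
top20_lift = lift_at_top(scores, labels, 0.2)
```

A lift above 1 means that reviewing only the flagged claims finds corrections more often than reviewing claims at random.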

Monitoring and Analytical Techniques for the Discharged Radiocarbon from Nuclear Facility (핵시설로부터 발생되는 방사성탄소 분석기술 및 감시)

  • Chun, Sang-Ki;Kim, Nak-Bae;Kim, Kun-Han;Choi, Su-Young;Park, Chan-Jo;Lee, Joung-Dae;Shin, Jang-Sik
    • Analytical Science and Technology / v.13 no.6 / pp.693-698 / 2000
  • The aim of this series of experiments was systematic, long-term radioactivity monitoring through indirect tracking of changes in C-14 concentration under natural conditions around operating nuclear facilities. Tree-ring analysis showed that the environmental radioactivity level increased after the nuclear facilities began operating, and that this level is closely related to power generation. The ${\delta}^{13}C$ value measured after cellulose treatment was about -30‰. This differs considerably from the -17‰ measured in air samples collected by passive sampling and the -8‰ measured in air samples collected by active sampling. These differences may be attributable to isotope fractionation during photosynthesis, but further study is needed.

A Study on Monitoring Method of Citizen Opinion based on Big Data : Focused on Gyeonggi Local Currency (Gyeonggi Money) (빅데이터 기반 시민의견 모니터링 방안 연구 : "경기지역화폐"를 중심으로)

  • Ahn, Soon-Jae;Lee, Sae-Mi;Ryu, Seung-Ei
    • Journal of Digital Convergence / v.18 no.7 / pp.93-99 / 2020
  • Text mining is a big data analysis method that extracts meaningful information from large-scale unstructured text data. In this study, text mining was used to monitor citizens' opinions on policies and systems currently being implemented. We collected 5,108 newspaper articles and 748 online cafe posts related to 'Gyeonggi Local Currency' and performed frequency analysis, TF-IDF analysis, association analysis, and word tree visualization analysis. The results showed that many newspaper articles dealt with the purpose of introducing local currency, the benefits provided, and the method of use, whereas content related to the actual use of local currency appeared in the online cafe posts. The news thus acted as an informant, promoting local currency in order to revitalize it, while the online cafe posts consisted of the opinions of citizens who are local currency users. SNS and text mining are expected to help effectively promote various policies as well as local currency.
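
The TF-IDF analysis mentioned above can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' pipeline: it uses term frequency normalized by document length and the unsmoothed inverse document frequency log(N / df), and it assumes documents are already tokenized. The function name `tf_idf` and the sample tokens are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: weight} dict per tokenized document.

    tf  = term count / document length
    idf = log(number of documents / number of documents containing the term)
    """
    n_docs = len(docs)

    # Document frequency: in how many documents each term appears.
    doc_freq = Counter()
    for doc in docs:
        doc_freq.update(set(doc))

    weights = []
    for doc in docs:
        counts = Counter(doc)
        weights.append({
            term: (count / len(doc)) * math.log(n_docs / doc_freq[term])
            for term, count in counts.items()
        })
    return weights


# Hypothetical tokenized texts standing in for articles and cafe posts.
docs = [
    ["local", "currency", "benefit"],
    ["local", "currency", "use"],
]
weights = tf_idf(docs)
```

Terms that occur in every document, such as "local" here, receive a weight of zero, so the terms that distinguish one source from another stand out.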

Anomaly Intrusion Detection based on Association Rule Mining in a Database System (데이터베이스 시스템에서 연관 규칙 탐사 기법을 이용한 비정상 행위 탐지)

  • Park, Jeong-Ho;Oh, Sang-Hyun;Lee, Won-Suk
    • The KIPS Transactions:PartC / v.9C no.6 / pp.831-840 / 2002
  • With the advance of computer and communication technology, intrusions and crimes involving computers have increased rapidly, while a tremendous amount of information has become conveniently available to users. In particular, to secure databases that store important information, such as customers' private information or a company's confidential information, basic security methods of the database management system itself or conventional misuse detection methods have been used. However, problems caused by an internal user abusing their authority, such as the leakage of confidential information, are more serious than a system breakdown caused by an external intruder. Therefore, to maintain the security of a database effectively, an anomaly detection technique is necessary. This paper proposes a method that generates the normal behavior profile of a user from the user's database log, based on an association mining method. For this purpose, the information in a database log is structured as a semantically organized pattern tree. An online transaction of a user is then compared with the user's profile, so that any anomaly can be effectively detected.

A Study on Detection of Small Size Malicious Code using Data Mining Method (데이터 마이닝 기법을 이용한 소규모 악성코드 탐지에 관한 연구)

  • Lee, Taek-Hyun;Kook, Kwang-Ho
    • Convergence Security Journal / v.19 no.1 / pp.11-17 / 2019
  • Recently, the abuse of Internet technology has caused economic and mental harm to society as a whole. In particular, newly created or modified malicious code bypasses existing information protection systems and serves as a basic means for various application hacks and cyber security threats. However, research on small-capacity executable files, which account for a large portion of actual malicious code, is rather limited. In this paper, we propose a model that analyzes the characteristics of known small-capacity executable files using data mining techniques and uses them to detect unknown malicious code. Several data mining techniques were applied, including naive Bayes, SVM, decision tree, random forest, and artificial neural network, and their accuracy was compared according to the VirusTotal detection level. As a result, a classification accuracy of more than 80% was verified on 34,646 analysis files.

A study on forecasting attendance rate of reserve forces training based on Data Mining (데이터마이닝에 기반한 예비군훈련 입소율 예측에 관한 연구)

  • Cho, Sangjoon;Ma, Jungmok
    • Journal of the Korea Academia-Industrial cooperation Society / v.22 no.3 / pp.261-267 / 2021
  • The mission of a reserve forces unit is to prepare good training for reserve forces during peacetime. Good training requires a properly organized group of support agents, but units have difficulty providing one because they lack members. For that reason, units forecast the monthly attendance rate of reserve forces (using the previous year's results) to organize support agents and the unit schedule. However, the existing planning method can deviate considerably from the actual attendance rate, which has a negative effect on training performance. More accurate forecast models are therefore required to reduce attendance rate errors. This paper proposes an attendance rate forecast model using data mining. To verify the proposed data mining based model, the existing planning method was compared with the proposed model using real data. The results showed that the proposed model outperforms the existing planning method.

Incremental Generation of A Decision Tree Using Global Discretization For Large Data (대용량 데이터를 위한 전역적 범주화를 이용한 결정 트리의 순차적 생성)

  • Han, Kyong-Sik;Lee, Soo-Won
    • The KIPS Transactions:PartB / v.12B no.4 s.100 / pp.487-498 / 2005
  • Recently, attention has focused on decision tree algorithms that can handle large datasets. However, because most of these algorithms process data in batch mode, they have to rebuild the tree from scratch whenever new data is added. A more efficient way to reduce the cost of rebuilding is to build the tree incrementally. Representative algorithms for incremental tree construction are BOAT and ITI, and most of these algorithms use a local discretization method to handle numeric data. However, because discretization requires sorted numeric data, when processing large datasets a global discretization method that sorts all data only once is more suitable than a local discretization method that sorts at every node. This paper proposes an incremental tree construction method that efficiently rebuilds a tree using a global discretization method to handle numeric data. When new data is added, the categories influenced by the data must be recreated, and the tree structure must then be changed in accordance with the category changes. The proposed method extracts sample points and performs discretization from these sample points to recreate categories efficiently, and it uses confidence intervals and a tree restructuring method to adjust the tree structure to category changes. In this study, an experiment using a people database was conducted to compare the proposed method with an existing one that uses local discretization.
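
The idea of global discretization, sorting a numeric attribute once and reusing the resulting categories at every tree node, can be sketched as below. This is a simplified stand-in, not the paper's method: it uses plain equal-frequency binning instead of the sample-point and confidence-interval procedure described in the abstract. The function names `global_cut_points` and `to_category` and the sample values are hypothetical.

```python
from bisect import bisect_right


def global_cut_points(values, n_bins):
    """Compute equal-frequency cut points from a single global sort."""
    if n_bins < 2 or not values:
        raise ValueError("need at least 2 bins and a non-empty value list")

    ordered = sorted(values)  # the only sort, done once for the whole dataset
    cuts = []
    for i in range(1, n_bins):
        cut = ordered[(i * len(ordered)) // n_bins]
        if not cuts or cut > cuts[-1]:  # skip duplicate boundaries
            cuts.append(cut)
    return cuts


def to_category(value, cuts):
    """Map a numeric value to its category index by binary search."""
    return bisect_right(cuts, value)


# Hypothetical numeric attribute, e.g. ages in a people database.
ages = [35, 22, 58, 41, 19, 64, 47, 30]
cuts = global_cut_points(ages, 4)

# A newly added record is categorized without re-sorting the data.
new_category = to_category(25, cuts)
```

Because every node reuses the same cut points, adding a record only requires a binary search, whereas a local method would sort the values reaching each node again.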

A Study on the Combined Decision Tree(C4.5) and Neural Network Algorithm for Classification of Mobile Telecommunication Customer (이동통신고객 분류를 위한 의사결정나무(C4.5)와 신경망 결합 알고리즘에 관한 연구)

  • 이극노;이홍철
    • Journal of Intelligence and Information Systems / v.9 no.1 / pp.139-155 / 2003
  • This paper presents a new methodology, based on a decision tree and a neural network, for analyzing and classifying customer patterns in the mobile telecommunication market in order to improve the prediction of credit information. By applying the variable selection process of the decision tree, a systematic process for defining the input vector's values and for generating rules was developed. From a customer management standpoint, this research analyzes current customers and produces their patterns so that the company can maintain good customer relationships and manage in advance those customers who have a high potential of terminating their contracts. A real implementation of the proposed method shows that its prediction accuracy is higher than that of existing methods such as decision tree (CART, C4.5), regression, neural network, and a combined model (CART and NN).

Customer Segmentation of a Home Study Company using a Hybrid Decision Tree and Artificial Neural Network Model (하이브리드 의사결정나무와 인공신경망 모델을 이용한 방문학습지사의 고객세분화)

  • Seo Kwang-Kyu;Ahn Beum-Jun
    • Journal of the Korea Academia-Industrial cooperation Society / v.7 no.3 / pp.518-523 / 2006
  • Because of keen competition, companies have segmented their customers and are trying to serve specifically targeted customers with differentiated methods. Accordingly, data mining techniques have drawn attention as an effective way to extract useful information. This paper explores customer segmentation for a home study company using a hybrid decision tree and artificial neural network model. By applying the variable selection process of the decision tree, a systematic process for defining the input vector's values and for generating rules was developed. From a customer management standpoint, this research analyzes current customers and produces their patterns so that the company can maintain good customer relationships. The case study shows that the prediction accuracy of the proposed model is higher than that of regression, decision tree (CART), and artificial neural networks.

A Study on the Design of Tolerance for Process Parameter using Decision Tree and Loss Function (의사결정나무와 손실함수를 이용한 공정파라미터 허용차 설계에 관한 연구)

  • Kim, Yong-Jun;Chung, Young-Bae
    • Journal of Korean Society of Industrial and Systems Engineering / v.39 no.1 / pp.123-129 / 2016
  • In manufacturing, thousands of quality characteristics are measured every day because processes have been automated through advances in computing and technique, and processes are monitored in databases in real time. In particular, data from the design step of a process help deliver the products customers require, because useful information is extracted from the data and reflected in the product design. In this study, first, the quality characteristics and the variables affecting them in the design-step data were analyzed with a decision tree to find the relation between explanatory and target variables. Second, the tolerance of the continuous variables that primarily influence the target variable was derived by applying the C4.5 decision tree algorithm. Finally, the target variable, loss, was calculated with a Taguchi loss function and analyzed. The paper compares the general method, in which the values of continuous explanatory variables are used as-is without being transformed into discrete values, with a new method in which the values of continuous explanatory variables are divided into 3 categories. The results show that the tolerance obtained from the new method was more effective in decreasing the target variable, loss, than that of the general method. In addition, tolerance levels were calculated for the continuous explanatory variables selected as major variables. In further research, a systematic method using data mining decision trees needs to be developed to categorize continuous variables under various loss function scenarios.
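
A Taguchi loss calculation of the kind mentioned above can be sketched as follows. The abstract does not say which form of the loss function was used. This sketch assumes the standard nominal-the-best quadratic form L(y) = k(y - m)^2 with k = A / Δ^2, where m is the target value, Δ the tolerance, and A the cost incurred at the tolerance limit. The function names and sample numbers are hypothetical.

```python
def taguchi_loss(y, target, tolerance, cost_at_limit):
    """Nominal-the-best Taguchi loss: k * (y - target)^2, k = A / tolerance^2."""
    if tolerance <= 0:
        raise ValueError("tolerance must be positive")
    k = cost_at_limit / tolerance ** 2  # loss coefficient
    return k * (y - target) ** 2


def mean_loss(measurements, target, tolerance, cost_at_limit):
    """Average loss over a batch of measured values."""
    if not measurements:
        raise ValueError("measurements must be non-empty")
    losses = [taguchi_loss(y, target, tolerance, cost_at_limit) for y in measurements]
    return sum(losses) / len(losses)


# Hypothetical process parameter: target 10.0, tolerance 0.5, cost 100 at the limit.
loss_on_target = taguchi_loss(10.0, 10.0, 0.5, 100.0)
loss_at_limit = taguchi_loss(10.5, 10.0, 0.5, 100.0)
loss_halfway = taguchi_loss(10.25, 10.0, 0.5, 100.0)
```

Under this form, loss grows with the square of the deviation, so a value halfway to the tolerance limit incurs a quarter of the cost at the limit. Tightening a tolerance therefore lowers the expected loss even when all values are within specification.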