• Title/Summary/Keyword: Tree mining

Severity-Adjusted LOS Model of AMI patients based on the Korean National Hospital Discharge in-depth Injury Survey Data (퇴원손상심층조사 자료를 기반으로 한 급성심근경색환자 재원일수의 중증도 보정 모형 개발)

  • Kim, Won-Joong;Kim, Sung-Soo;Kim, Eun-Ju;Kang, Sung-Hong
    • Journal of the Korea Academia-Industrial cooperation Society / v.14 no.10 / pp.4910-4918 / 2013
  • This study designs a severity-adjusted length of stay (LOS) model to help manage the LOS of acute myocardial infarction (AMI) patients efficiently. The model was built with data-mining methods (multiple regression analysis, decision trees, and neural networks) on 6,074 AMI patients with a diagnosis of I21 in the 2004-2009 Korean National Hospital Discharge In-depth Injury Survey. The decision tree model produced superior results and was chosen as the final model. Major factors in the severity adjustment of LOS for AMI patients included whether CABG was performed, status at discharge (alive or dead), and the comorbidity index. The difference between actual LOS and adjusted LOS differed by hospital location and bed size. Managing the LOS of AMI patients efficiently requires identifying these differentiating factors and then acting on them. The factors can be specified by applying each hospital's data to the newly designed severity-adjusted LOS model.

Research of recognition factors of folk medicine using statistical testing and data mining (통계적 검정과 데이터마이닝기법의 융합을 통한 민간요법 인식 요인 탐색조사)

  • Yoo, Jin Ah;Choi, Kyoung-Ho;Cho, Jung-Keun
    • Journal of Digital Convergence / v.13 no.2 / pp.393-399 / 2015
  • Nowadays, beyond the era of well-being and LOHAS, many people take great interest in self-therapy, so much so that the present is called the healing era. As folk medicine is actively industrialized and interest grows in improving health rather than curing disease, much research on alternative medicine and therapy is being carried out in various fields. At a time when interest in health improvement and in the body's spontaneous, natural healing ability is increasing, it is meaningful to search for the factors that make up the recognition of folk medicine. In this study, we developed a questionnaire on the basis of previous studies, used factor analysis to examine the factors affecting recognition of folk medicine, and statistically tested differences in recognition according to demographic traits. As a result, the twenty-four measured variables related to folk medicine were grouped into four factors: a health improvement factor, a safety factor, a psychological factor, and a substitutional factor. Overall, middle-aged and older people (forties to sixties) and more highly educated people had more experience with folk medicine than younger people (thirties and below) and less educated people. Sex made little difference.

Development of severity-adjusted length of stay in knee replacement surgery (무릎관절치환술 환자의 중증도 보정 재원일수 모형 개발)

  • Hong, Sung-Ok;Kim, Young-Teak;Choi, Youn-Hee;Park, Jong-Ho;Kang, Sung-Hong
    • Journal of Digital Convergence / v.13 no.2 / pp.215-225 / 2015
  • This study was conducted to develop a severity-adjusted length of stay (LOS) model for knee replacement patients and to identify factors that influence LOS, using the Korean National Hospital Discharge In-depth Injury Survey data. Comorbidity scoring systems and data-mining methods were used to design a severity-adjusted LOS model covering 4,102 knee replacement patients. A decision tree model using the CCS comorbidity scoring index produced superior results and was chosen as the final model. Factors such as the presence of arthritis, patient sex, and admission route influenced length of stay. There were statistically significant differences between actual LOS and adjusted LOS by health-insurance type, bed size, and hospital location. Therefore, policy alternatives addressing excessive medical utilization are needed to reduce variation in length of hospital stay among patients who undergo knee replacement.

Developing data quality management algorithm for Hypertension Patients accompanied with Diabetes Mellitus By Data Mining (데이터 마이닝을 이용한 고혈압환자의 당뇨질환 동반에 관한 데이터 질 관리 알고리즘 개발)

  • Hwang, Kyu-Yeon;Lee, Eun-Sook;Kim, Go-Won;Hong, Sung-Ok;Park, Jong-Son;Kwak, Mi-Sook;Lee, Ye-Jin;Im, Chae-Hyuk;Park, Tae-Hyun;Park, Jong-Ho;Kang, Sung-Hong
    • Journal of Digital Convergence / v.14 no.7 / pp.309-319 / 2016
  • A data quality management algorithm is needed to improve the quality of health care data. In this study, we developed a data quality management algorithm for diabetes as an accompanying disease in patients with hypertension. To build the algorithm, we extracted hypertension patients from the 2011 and 2012 hospital discharge injury survey data. The significant factors for diabetes in hypertension patients were gender, age, glomerular disorders in diabetes mellitus, diabetic retinopathy, diabetic polyneuropathy, and closed [percutaneous] [needle] biopsy of kidney. Based on the decision tree results, we defined outliers as groups in which the probability of a hypertension patient also having diabetes was 80% or more, or 20% or less, and found six groups with such extreme values. The actual data in these outlier (extreme value) groups therefore need to be checked to improve the quality of the data.
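The outlier rule in this abstract (probability of accompanying diabetes of 80% or more, or 20% or less) can be illustrated with a minimal Python sketch. The node names and probabilities below are hypothetical and are not taken from the paper; the function name `flag_outlier_nodes` is likewise an assumption made for illustration.

```python
def flag_outlier_nodes(node_probs, high=0.80, low=0.20):
    """Return the ids of decision-tree groups whose predicted probability
    of accompanying diabetes is >= high or <= low (the abstract's rule)."""
    return [node for node, p in node_probs.items() if p >= high or p <= low]


# Hypothetical leaf-node probabilities, for illustration only.
leaf_probs = {"node_3": 0.92, "node_5": 0.45, "node_8": 0.12, "node_9": 0.80}

outliers = flag_outlier_nodes(leaf_probs)
print(outliers)
```

Records falling in the flagged groups would then be the candidates for a manual check against the actual data, as the abstract suggests.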

A Study on the Analysis Effect Factors of Illegal Parking Using Data Mining Techniques (데이터마이닝 기법을 활용한 불법주차 영향요인 분석)

  • Lee, Chang-Hee;Kim, Myung-Soo;Seo, So-Min
    • The Journal of The Korea Institute of Intelligent Transport Systems / v.13 no.4 / pp.63-72 / 2014
  • With rapid development in the economy and other fields, the standard of living in South Korea has improved, and the demand for automobiles has consequently increased quickly. This has led to various traffic issues such as congestion, accidents, and parking problems. In particular, illegal parking caused by the growing number of automobiles is considered one of the main causes of traffic congestion, and it intensifies disputes between neighbors over parking spaces, which has also come to the fore as a social issue. This study therefore looked at Daejeon Metropolitan City, which is understood to have the highest automobile sharing rate in South Korea but relatively few illegal parking crackdowns. To investigate the problem of illegal parking, the study conducted an Exhaustive CHAID analysis, a decision tree method, to identify what makes drivers park illegally and which factors tempt them to do so, and then proposes solutions. According to the analysis, the influential factors were, in order, distance, the driver's experience of being caught, occupation, and use time. The prediction model finally yielded four nodes. Given these results, the proposed solutions are to establish additional public parking lots, to secure parking space first for vehicles used for living and working, and to promote campaigns that strengthen illegal parking crackdowns and encourage civic consciousness.

A Convergence Study in the Severity-adjusted Mortality Ratio on inpatients with multiple chronic conditions (복합만성질환 입원환자의 중증도 보정 사망비에 대한 융복합 연구)

  • Seo, Young-Suk;Kang, Sung-Hong
    • Journal of Digital Convergence / v.13 no.12 / pp.245-257 / 2015
  • This study aimed to develop a predictive model for severity-adjusted mortality of inpatients with multiple chronic conditions and to analyze the factors behind variation in the hospital standardized mortality ratio (HSMR), in order to propose a plan to reduce that variation. We collected data from the Korean National Hospital Discharge In-depth Injury Survey from 2008 to 2010 and selected 110,700 final study subjects who had a chronic disease as the principal diagnosis, were over the age of 30, and had more than 2 chronic diseases including the principal diagnosis. We designed a severity-adjusted mortality prediction model using data-mining methods (logistic regression analysis, decision tree, and neural network). The model adopted was the decision tree using the Elixhauser comorbidity index. The HSMR of inpatients with multiple chronic conditions differed statistically significantly by insurance type, number of hospital beds, and hospital location. Based on these results, methods should be found to manage the mortality ratio of inpatients with multiple chronic conditions efficiently at the national level, and efforts should be made to improve the quality of medical treatment for these inpatients and to reduce growing medical expenses.
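As a rough illustration of the HSMR referred to in this abstract, the sketch below uses the common definition: observed deaths divided by expected deaths, multiplied by 100, where expected deaths are the sum of each patient's predicted death probability from a severity-adjustment model. The abstract does not spell out its formula, so this definition, the function name `hsmr`, and the patient data are assumptions for illustration only.

```python
def hsmr(died_flags, predicted_probs):
    """Hospital standardized mortality ratio under the common definition:
    100 * observed deaths / expected deaths.

    died_flags      -- 1 if the patient died in hospital, else 0
    predicted_probs -- model-predicted probability of death per patient
    """
    if len(died_flags) != len(predicted_probs):
        raise ValueError("inputs must have the same length")
    observed = sum(died_flags)
    expected = sum(predicted_probs)
    if expected == 0:
        raise ValueError("expected deaths must be positive")
    return 100.0 * observed / expected


# Hypothetical hospital with five inpatients (not data from the paper).
died = [1, 0, 0, 1, 0]
probs = [0.5, 0.25, 0.25, 0.125, 0.125]  # expected deaths = 1.25

ratio = hsmr(died, probs)
print(ratio)  # 160.0: more deaths than the model expected
```

A value above 100 means more deaths were observed than the model predicted for that case mix; comparing such ratios across insurance types, bed sizes, or locations is the kind of analysis the abstract describes.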

Multi-family Housing Complex Breakdown Structure for Decision Making on Rehabilitation (노후 공동주택 개선여부 의사결정을 위한 공동주택 분류체계 개발)

  • Hong, Tae-Hoon;Kim, Hyun-Joong;Koo, Choong-Wan;Park, Sung-Ki
    • Korean Journal of Construction Engineering and Management / v.12 no.6 / pp.101-109 / 2011
  • As climate change becomes a major issue, various efforts at home and abroad focus on reducing building energy consumption. Saving energy through the maintenance, repair, and rehabilitation of existing multi-family housing complexes is particularly important, because energy consumption in residential buildings forms a large part of gross energy consumption in Korea and the number of deteriorated complexes is increasing sharply. However, energy saving is not considered a main factor in decision making on rehabilitation projects, and the existing process lacks an appropriate supporting tool. As a first step toward a decision support system for rehabilitation, this paper developed a breakdown structure that clusters multi-family housing complexes. A decision tree, one of the data-mining methods, was used to form clusters based on the characteristics and energy consumption data of the complexes. Building on these results and follow-up research, considering energy consumption during the rehabilitation process of multi-family housing complexes will help maximize energy savings and CO2 reduction.

An analysis of the signaling effect of FOMC statements (미 연준 통화정책방향 의결문의 시그널링 효과 분석)

  • Woo, Shinwook;Chang, Youngjae
    • The Korean Journal of Applied Statistics / v.33 no.3 / pp.321-334 / 2020
  • The US Federal Reserve (Fed) has decided to cut interest rates. Looking at the wording of FOMC statements around periods of policy change, we can see that the Fed has been communicating with markets through changes in word choice. However, analyzing the wording of the statements through context has been criticized as potentially subjective and limited to qualitative analysis. In this paper, we evaluate the signaling effect of FOMC statements based on previous research. We analyze decision-making characteristics from the viewpoint of text mining and try to predict future changes in policy direction by capturing changes in expressions between statements. For this purpose, decision tree and neural network models are used. The analysis suggests that discrepancy indicators between statements could be used to predict future policy changes and that the Fed has systematically implemented policy signaling through its policy statements.

Development of Advanced TB Case Classification Model Using NHI Claims Data (국민건강보험 청구자료 기반의 결핵환자 분류 고도화 모형 개발)

  • Park, Il-Su;Kim, Yoo-Mi;Choi, Youn-Hee;Kim, Sung-Soo;Kim, Eun-Ju;Won, Si-Yeon;Kang, Sung-Hong
    • Journal of Digital Convergence / v.11 no.9 / pp.289-299 / 2013
  • The aim of this study was to enhance the NHI claims data-based tuberculosis (TB) classification rule of the KCDC (Korea Centers for Disease Control & Prevention) for an effective TB surveillance system. A 10% sample of 8,118 cases, drawn from 81,199 TB cases in the 2009 NHI claims data, was subjected to a medical record survey to determine whether they were real TB patients. The final study population was the 7,132 cases whose medical records were surveyed. The decision tree model was evaluated as the best TB patient detection model. Its main independent variables were age, the number of anti-tuberculosis drugs, type of medical institution, tuberculosis tests, prescription days, and type of TB. The model had a sensitivity of 90.6%, a PPV of 96.1%, and a correct classification rate of 93.8%, better than the KCDC's TB detection model based on two or more NHI claims for TB and TB drugs (sensitivity of 82.6%, PPV of 95%, and correct classification rate of 80%).
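The three measures quoted in this abstract (sensitivity, PPV, and correct classification rate) follow from a 2x2 confusion matrix. The sketch below shows the standard formulas in Python; the counts are hypothetical and are not the paper's data, and the function name `classification_metrics` is an assumption for illustration.

```python
def classification_metrics(tp, fp, fn, tn):
    """Standard measures from confusion-matrix counts.

    tp -- true TB patients classified as TB
    fp -- non-TB cases classified as TB
    fn -- true TB patients classified as non-TB
    tn -- non-TB cases classified as non-TB
    """
    return {
        "sensitivity": tp / (tp + fn),                  # share of true cases detected
        "ppv": tp / (tp + fp),                          # share of detections that are true
        "correct_rate": (tp + tn) / (tp + fp + fn + tn),  # overall agreement
    }


# Hypothetical counts, for illustration only.
metrics = classification_metrics(tp=90, fp=5, fn=10, tn=95)
for name, value in metrics.items():
    print(f"{name}: {value:.3f}")
```

Reporting all three together matters for surveillance: a rule can reach a high PPV while still missing many true cases, which is the gap in sensitivity the abstract reports between the two models.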

Prediction of golf scores on the PGA tour using statistical models (PGA 투어의 골프 스코어 예측 및 분석)

  • Lim, Jungeun;Lim, Youngin;Song, Jongwoo
    • The Korean Journal of Applied Statistics / v.30 no.1 / pp.41-55 / 2017
  • This study predicts the average scores of the top 150 PGA golf players in 132 PGA Tour tournaments (2013-2015) using data mining techniques and statistical analysis. It also aims to predict the Top 10 and Top 25 players in 4 different playoffs. Linear and nonlinear regression methods were used to predict average scores. The linear methods were stepwise regression, best subset selection, LASSO, ridge regression, and principal component regression. The nonlinear methods were tree, bagging, gradient boosting, neural network, random forests, and KNN. We found that the average score increases as fairway firmness, green height, or average maximum wind speed increases, and decreases as the number of one-putts, the scrambling variable, or the longest driving distance increases. All 11 models have low prediction error when predicting the average scores of the 2015 PGA tournaments, which were not included in the training set. However, the Bagging and Random Forest models perform best, and these two models have the highest prediction accuracy when predicting the Top 10 and Top 25 players in the 4 playoffs.