Developing data quality management algorithm for Hypertension Patients accompanied with Diabetes Mellitus By Data Mining

Development of a Data Quality Management Algorithm for Comorbid Diabetes in Hypertension Patients Using Data Mining

  • Received : 2016.06.01
  • Accepted : 2016.07.20
  • Published : 2016.07.28


A data quality management algorithm is needed to improve the quality of health care data. In this study, we developed a data quality management algorithm for diabetes-related comorbid conditions in patients with hypertension. To build the algorithm, we extracted hypertension patients from the 2011 and 2012 discharge injury survey data. The algorithm identified the following significant factors for diabetes in hypertension patients: gender, age, glomerular disorders in diabetes mellitus, diabetic retinopathy, diabetic polyneuropathy, and closed [percutaneous] [needle] biopsy of kidney. Based on the decision tree results, we defined as outliers the groups in which the probability of a hypertension patient also having diabetes was 80% or more, or 20% or less, and found six such extreme-value groups. The actual records in these outlier (extreme value) groups therefore need to be checked to improve the quality of the data.
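The outlier rule described in the abstract can be illustrated with a minimal sketch. It assumes each patient record has already been assigned to a group (standing in for a terminal node of a decision tree) and carries a flag for comorbid diabetes. The function name `flag_extreme_groups`, the group labels, and the toy records are hypothetical and are not taken from the study; only the 80% and 20% thresholds come from the text, read here as inclusive bounds.

```python
from collections import defaultdict


def flag_extreme_groups(records, high=0.80, low=0.20):
    """Return {group_id: diabetes_rate} for groups with an extreme rate.

    records: iterable of (group_id, has_diabetes) pairs, one per
             hypertension patient. group_id is a hypothetical stand-in
             for a decision tree terminal node.
    high, low: thresholds from the abstract (80% or more, 20% or less).
    """
    # group_id -> [number of patients, number with diabetes]
    counts = defaultdict(lambda: [0, 0])
    for group_id, has_diabetes in records:
        counts[group_id][0] += 1
        counts[group_id][1] += int(has_diabetes)

    flagged = {}
    for group_id, (n_patients, n_diabetes) in counts.items():
        rate = n_diabetes / n_patients
        if rate >= high or rate <= low:
            flagged[group_id] = rate
    return flagged


# Hypothetical toy data, not results from the study.
records = (
    [("leaf_A", True)] * 9 + [("leaf_A", False)] * 1    # rate 0.9
    + [("leaf_B", True)] * 1 + [("leaf_B", False)] * 9  # rate 0.1
    + [("leaf_C", True)] * 4 + [("leaf_C", False)] * 6  # rate 0.4
)
flagged = flag_extreme_groups(records)
```

In this toy input, `leaf_A` and `leaf_B` fall in the extreme ranges and would be the groups whose underlying records are reviewed, while `leaf_C` is left alone.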


Data Mining; Data Quality Management Algorithm; Outlier Detection Method; Hypertension; Diabetes Mellitus


Grant : Convergence Health and Medical Technology

Supported by : Korea Health Industry Development Institute

