• Title/Summary/Keyword: Heart Data mining

Analysis of Defection Customer Using Customer Segmentation on Bank -Focusing on Personal Deposit- (은행고객 세분화를 통한 이탈고객 관리분석 -가계성 예금을 중심으로-)

  • 이건창;권순재;신경식
    • Journal of Intelligence and Information Systems / v.7 no.1 / pp.177-197 / 2001
  • This paper proposes a data mining-driven analysis for managing the customer defection rate at a bank. After the 1997 IMF crisis, Korean banks underwent severe restructuring. At the heart of that restructuring was the need to manage customers more effectively than ever. Until now, many banks in Korea have managed customers poorly, without sophisticated techniques. We therefore compare several data mining techniques to determine which is most effective for managing customer defection. We applied three techniques: a logit model, a neural network, and C5.0. The experimental data were collected from personal deposit account data of a specific bank in Korea. In our experiments, C5.0 showed more robust performance than the other two techniques. On the basis of these results, we propose a customer defection management policy.

Analysis of Media Articles on COVID-19 and Nurses Using Text Mining and Topic Modeling (텍스트 마이닝과 토픽모델링 분석을 활용한 코로나19와 간호사에 대한 언론기사 분석)

  • An, Jiyeon;Yi, Yunjeong;Lee, Bokim
    • Research in Community and Public Health Nursing / v.32 no.4 / pp.467-476 / 2021
  • Purpose: The purpose of this study is to understand the social perceptions of nurses in the context of the COVID-19 outbreak through analysis of media articles. Methods: Among the media articles published from January 1st to September 30th, 2020, those containing the keywords '[corona or Wuhan pneumonia or covid] and [nurse or nursing]' were extracted. After the selection process, text mining and topic modeling were performed on 454 media articles using Textom version 4.5. Results: The 30 most frequent keywords include 'Nurse', 'Corona', 'Isolation', 'Support', 'Shortage', 'Protective Clothing', and so on. Keywords that ranked high in Term Frequency-Inverse Document Frequency (TF-IDF) values are 'Daegu', 'President', 'Gwangju', 'manpower', and so on. The topic analysis yielded 10 topics, such as 'Local infection', 'Dispatch of personnel', 'Message for thanks', and 'Delivery of one's heart'. Conclusion: Nurses are both contributors to and victims of COVID-19 prevention. The government and the nursing community should make efforts to improve poor working conditions and manpower shortages.
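
As a rough illustration of the TF-IDF ranking mentioned in the abstract above, the following sketch scores terms in a small tokenized corpus. It is not the Textom implementation: the weighting (term count divided by document length, multiplied by the natural log of N over document frequency) is one common variant chosen as an assumption, and the function name `tf_idf` and the toy corpus are hypothetical.

```python
import math
from collections import Counter


def tf_idf(docs):
    """Return one {term: score} dict per tokenized document.

    Assumed weighting: tf = count / document length, idf = ln(N / df).
    Other variants exist; the cited studies may use a different one.
    """
    n_docs = len(docs)
    # Document frequency: number of documents that contain each term.
    df = Counter(term for doc in docs for term in set(doc))
    scores = []
    for doc in docs:
        counts = Counter(doc)
        scores.append({
            term: (count / len(doc)) * math.log(n_docs / df[term])
            for term, count in counts.items()
        })
    return scores


# Hypothetical toy corpus, not data from the study.
docs = [
    ["nurse", "corona", "daegu"],
    ["nurse", "corona", "shortage"],
    ["nurse", "isolation", "support"],
]
scores = tf_idf(docs)
```

In this toy corpus a term that appears in every document ("nurse") scores zero, while a term confined to one document ("daegu") scores highest, which is consistent with place names ranking high by TF-IDF even when they are not the most frequent words.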

Analysis of Unstructured Data on Detecting of New Drug Indication of Atorvastatin (아토바스타틴의 새로운 약물 적응증 탐색을 위한 비정형 데이터 분석)

  • Jeong, Hwee-Soo;Kang, Gil-Won;Choi, Woong;Park, Jong-Hyock;Shin, Kwang-Soo;Suh, Young-Sung
    • Journal of health informatics and statistics / v.43 no.4 / pp.329-335 / 2018
  • Objectives: In recent years, there has been an increased need for a way to extract desired information from many medical publications at once. This study was conducted to confirm the usefulness of unstructured data analysis of previously published medical literature in searching for new indications. Methods: New indications were searched for through text mining, network analysis, and topic modeling using 5,057 articles on atorvastatin, a treatment for hyperlipidemia, published from 1990 to 2017. Results: A total of 273 keywords were extracted. In the frequency-based text mining and network analysis, the existing indications of atorvastatin ranked at the top. The novel indications identified by Term Frequency-Inverse Document Frequency (TF-IDF) were atrial fibrillation, heart failure, breast cancer, rheumatoid arthritis, combined hyperlipidemia, arrhythmias, multiple sclerosis, non-alcoholic fatty liver disease, contrast-induced acute kidney injury, and prostate cancer. Conclusions: Unstructured data analysis for discovering new indications from large volumes of medical literature is expected to be used in the drug repositioning industry.

Development of Healthcare Data Quality Control Algorithm Using Interactive Decision Tree: Focusing on Hypertension in Diabetes Mellitus Patients (대화식 의사결정나무를 이용한 보건의료 데이터 질 관리 알고리즘 개발: 당뇨환자의 고혈압 동반을 중심으로)

  • Hwang, Kyu-Yeon;Lee, Eun-Sook;Kim, Go-Won;Hong, Seong-Ok;Park, Jung-Sun;Kwak, Mi-Sook;Lee, Ye-Jin;Lim, Chae-Hyeok;Park, Tae-Hyun;Park, Jong-Ho;Kang, Sung-Hong
    • The Korean Journal of Health Service Management / v.10 no.3 / pp.63-74 / 2016
  • Objectives: There is a need to develop a data quality management algorithm to improve the quality of healthcare data using a data quality management system. In this study, we developed a data quality control algorithm for hypertension-related diseases in patients with diabetes mellitus. Methods: To build the data quality algorithm, we extracted the 2011 and 2012 discharge injury survey data for diabetes mellitus patients. Derived variables were created using the primary diagnosis, diagnostic unit, primary surgery and treatment, and minor surgery and treatment items. Results: Significant factors in diabetes mellitus patients with hypertension were sex, age, ischemic heart disease, and diagnostic ultrasound of the heart. Based on the decision tree results, we found four groups with extreme values among diabetes patients with accompanying hypertension. Conclusions: The actual data contained in the outlier (extreme value) groups need to be checked to improve the quality of the data.

A study on analysis of factors on in-hospital mortality for community-acquired pneumonia (지역사회획득 폐렴 환자의 퇴원시 사망 요인 분석)

  • Kim, Yoo-Mi
    • Journal of the Korean Data and Information Science Society / v.22 no.3 / pp.389-400 / 2011
  • This study was carried out to analyze factors related to in-hospital mortality from community-acquired pneumonia using an administrative database. The subjects were 5,353 community-acquired pneumonia inpatients in the Korean National Hospital Discharge Injury Survey 2004-2006 data. The data were analyzed using the chi-squared test and decision tree models, a data mining technique. Among the decision tree models, C4.5 had the best performance. The critical factors for in-hospital mortality from community-acquired pneumonia are admission route, respiratory failure, and congenital heart failure, as well as age, comorbidity, and bed size. This study used an administrative database that includes patients' characteristics and comorbidity. However, further studies should also cover hospital characteristics, regional medical resources, and patient management practices.

The Variation Factors of Severity-Adjusted Length of Stay in CABG (관상동맥우회술 시행환자의 중증도 보정 재원일수 변이에 관한 연구)

  • Kim, Sun-Ja;Kang, Sung-Hong;Kim, Won-Joong;Kim, Yoo-Mi
    • Journal of Korean Society for Quality Management / v.39 no.3 / pp.391-399 / 2011
  • Our study was carried out to analyze the variation factors of severity-adjusted length of stay (LOS) in coronary artery bypass graft (CABG). The subjects were 932 CABG inpatients in the Korean National Hospital Discharge In-depth Injury Survey from 2004 through 2008. The data were analyzed using the $\chi^2$ test, and the severity-adjusted model was developed using a data mining technique. The results of the study were as follows: patients who were male (71.1%), older than 61 years of age (61.6%), treated in hospitals with more than 500 beds (92.8%), and admitted via ambulatory care (70.0%) accounted for higher proportions than their counterparts. In-hospital mortality of CABG inpatients was 2.8%. In addition, 46.4% of the patients received their care outside their region of residence. Angina pectoris (45.6%) was the most common principal diagnosis, followed by chronic ischemic heart disease (36.9%) and acute myocardial infarction (12.0%). We developed a severity-adjusted LOS model using variables such as gender, age, and comorbidity. Comparison of adjusted values in predicted LOS revealed significant variations in LOS by hospital location, bed size, and whether patients received care in their region of residence. The variations in LOS can be interpreted as an indirect indicator of quality variation in the medical process. We suggest that the severity-adjusted LOS model developed in this study be used as a benchmarking method for hospitals, and that a national standard clinical practice guideline be developed.

CADICA: Diagnosis of Coronary Artery Disease Using the Imperialist Competitive Algorithm

  • Mahmoodabadi, Zahra;Abadeh, Mohammad Saniee
    • Journal of Computing Science and Engineering / v.8 no.2 / pp.87-93 / 2014
  • Coronary artery disease (CAD) is currently a prevalent disease from which many people suffer. Early detection and treatment could reduce the risk of heart attack. Currently, the gold standard for the diagnosis of CAD is angiography, which is an invasive procedure. In this article, we propose an algorithm that uses data mining techniques, a fuzzy expert system, and the imperialist competitive algorithm (ICA) to diagnose CAD by a non-invasive procedure. The ICA is used to adjust the fuzzy membership functions. The proposed method has been evaluated with the Cleveland and Hungarian datasets. The advantage of this method, compared with others, is its interpretability. The proposed method achieves an accuracy of 94.92% with 11 rules of average length 4. To compare the ICA with other metaheuristic algorithms, the proposed method has also been implemented with the particle swarm optimization (PSO) algorithm. The results indicate that the ICA is more efficient than the PSO algorithm.

Prediction Model of User Physical Activity using Data Characteristics-based Long Short-term Memory Recurrent Neural Networks

  • Kim, Joo-Chang;Chung, Kyungyong
    • KSII Transactions on Internet and Information Systems (TIIS) / v.13 no.4 / pp.2060-2077 / 2019
  • Recently, mobile healthcare services have attracted significant attention because of the emerging development and supply of diverse wearable devices. Smartwatches and health bands are the most common type of mobile-based wearable devices and their market size is increasing considerably. However, simple value comparisons based on accumulated data have revealed certain problems, such as the standardized nature of health management and the lack of personalized health management service models. The convergence of information technology (IT) and biotechnology (BT) has shifted the medical paradigm from continuous health management and disease prevention to the development of a system that can be used to provide ground-based medical services regardless of the user's location. Moreover, the IT-BT convergence has necessitated the development of lifestyle improvement models and services that utilize big data analysis and machine learning to provide mobile healthcare-based personal health management and disease prevention information. Users' health data, which are specific as they change over time, are collected by different means according to the users' lifestyle and surrounding circumstances. In this paper, we propose a prediction model of user physical activity that uses data characteristics-based long short-term memory (DC-LSTM) recurrent neural networks (RNNs). To provide personalized services, the characteristics and surrounding circumstances of data collectable from mobile host devices were considered in the selection of variables for the model. The data characteristics considered were ease of collection, which represents whether or not variables are collectable, and frequency of occurrence, which represents whether or not changes made to input values constitute significant variables in terms of activity. 
The variables selected for providing personalized services were activity, weather, temperature, mean daily temperature, humidity, UV, fine dust, asthma and lung disease probability index, skin disease probability index, cadence, travel distance, mean heart rate, and sleep hours. The selected variables were classified according to the data characteristics. To predict activity, an LSTM RNN was built that uses the classified variables as input data and learns the dynamic characteristics of time series data. LSTM RNNs resolve the vanishing gradient problem that occurs in existing RNNs. The LSTMs are classified into three types according to data characteristics, and the network is constructed through connections among them. The constructed neural network learns from training data and predicts user activity. To evaluate the proposed model, its root mean square error (RMSE) on the user physical activity prediction task was compared with that of an autoregressive integrated moving average (ARIMA) model, a convolutional neural network (CNN), and an RNN. The results show that the proposed DC-LSTM RNN method yields an excellent mean RMSE value of 0.616. The proposed method can predict significant activity by taking the surrounding circumstances and user status into account while building on existing standardized activity prediction services. It can also be used to predict user physical activity and provide personalized healthcare based on the data collectable from mobile host devices.
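
For readers unfamiliar with the RMSE metric reported in the abstract above, the following is a minimal sketch of how it is computed from paired actual and predicted values. The function name `rmse` and the sample numbers are illustrative assumptions; they are not the paper's data or evaluation code.

```python
import math


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted) or len(actual) == 0:
        raise ValueError("inputs must be non-empty and of equal length")
    # Square each residual, average the squares, then take the square root.
    squared_errors = [(a - p) ** 2 for a, p in zip(actual, predicted)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


# Hypothetical activity values, for illustration only.
actual = [3.0, 5.0, 2.0, 7.0]
predicted = [2.0, 5.0, 4.0, 7.0]
error = rmse(actual, predicted)
```

Lower values indicate predictions closer to the observed values. Because the residuals are squared before averaging, a few large misses raise the RMSE more than many small ones.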

A Machine Learning Approach for Stress Status Identification of Early Childhood by Using Bio-Signals (생체신호를 활용한 학습기반 영유아 스트레스 상태 식별 모델 연구)

  • Jeon, Yu-Mi;Han, Tae Seong;Kim, Kwanho
    • The Journal of Society for e-Business Studies / v.22 no.2 / pp.1-18 / 2017
  • Because incidents involving children have increased dramatically in recent years, identifying when a child is under extreme stress has become essential for recognizing dangerous situations in real time. In this paper, we therefore present a machine learning-based model that identifies a child's stress status using bio-signals such as voice and heart rate, which are major indicators of a child's emotion. In addition, we suggest a smart band for collecting such bio-signals and a mobile application for monitoring a child's stress status. Specifically, the proposed method uses stress patterns of children obtained in advance to train the stress status identification model. The model, which is based on conventional machine learning algorithms, is then used to predict a child's current stress status. Experiments on a real-world dataset demonstrated that a child's stress status can be detected automatically with a satisfactory level of accuracy. Furthermore, the research results are expected to be used to prevent dangerous situations for children.

A Study on the Improving Method of Academic Effect based on Arduino sensors (아두이노 센서 기반 학업 효과 개선 방안 연구)

  • Bae, Youngchul;Hong, YouSik
    • Journal of the Korean Institute of Intelligent Systems / v.26 no.3 / pp.226-232 / 2016
  • Research is active on improving math and science scores through brain exercises, stress relief, and emotion-sensitive lighting. The underlying principle is that the brain works most effectively when brain waves remain stable and relaxed during critical thinking in science, and when stress is minimized and lighting is comfortable during math problem solving. In this paper, to support effective learning of mathematics and science, we conduct simulations to find optimal learning conditions using stress relief. However, the effectiveness of favorite music or color therapy does not converge on a single setting; it varies widely with each user's tastes. To solve this problem, we propose an optimal lighting and music therapy treatment based on a fuzzy inference method.