• Title/Summary/Keyword: Non-Prediction Algorithm

Search Results: 218

A Study on the Prediction of Korean NPL Market Return (한국 NPL시장 수익률 예측에 관한 연구)

  • Lee, Hyeon Su;Jeong, Seung Hwan;Oh, Kyong Joo
    • Journal of Intelligence and Information Systems
    • /
    • v.25 no.2
    • /
    • pp.123-139
    • /
    • 2019
  • The Korean NPL (non-performing loan) market was formed by the government and foreign capital shortly after the 1997 IMF crisis. The market has a short history, however, and bad debt began to increase again after the 2009 global financial crisis due to the real economic recession. NPLs have become a major investment vehicle in recent years, as investment capital from the domestic capital market began to enter the NPL market in earnest. Although the domestic NPL market has received considerable attention due to its recent overheating, research on the market remains scarce because the history of capital-market investment in domestic NPLs is short. In addition, declining profitability and price volatility driven by fluctuations in the real estate business call for more scientific and systematic decision-making. In this study, we propose a prediction model that determines whether a benchmark yield is achieved, using NPL-market-related data in line with market demand. To build the model, we used Korean NPL data covering about four years, from December 2013 to December 2017, comprising 2,291 assets in total. From 11 variables describing the characteristics of the underlying real estate, only the variables related to the dependent variable were selected as independent variables; the selection was performed with one-to-one t-tests, stepwise logistic regression, and decision trees. Seven independent variables were selected: purchase year, SPC (Special Purpose Company), municipality, appraisal value, purchase cost, OPB (Outstanding Principal Balance), and HP (Holding Period). The dependent variable is a binary variable indicating whether the benchmark rate of return is reached.
We chose a binary target because models predicting binary variables are more accurate than models predicting continuous variables, and this accuracy is directly related to the model's practical usefulness. Moreover, for a special purpose company the main concern is whether or not to purchase a property, so knowing whether a certain level of return will be achieved is sufficient for the decision. To ascertain whether 12%, the standard rate of return used in the industry, is a meaningful reference value, we constructed and compared predictive models with dependent variables calculated at several threshold values. The models built with the dependent variable defined by the 12% standard rate of return showed the best average hit ratio, 64.60%. To propose an optimal prediction model based on the chosen dependent variable and the 7 independent variables, we built and compared prediction models using five methodologies: discriminant analysis, logistic regression, decision tree, artificial neural network, and a genetic-algorithm linear model. Ten sets of training and testing data were extracted using 10-fold cross-validation; after building the models on these data, the hit ratio of each set was averaged and performance was compared. The average hit ratios of the models built with discriminant analysis, logistic regression, decision tree, artificial neural network, and the genetic-algorithm linear model were 64.40%, 65.12%, 63.54%, 67.40%, and 60.51%, respectively, confirming that the artificial neural network model performs best. This study shows that using the 7 independent variables with an artificial neural network prediction model is effective for the NPL market.
The proposed model predicts in advance whether a new asset will achieve the 12% return, which will help special purpose companies make investment decisions. Furthermore, we anticipate that transactions at appropriate prices will make the NPL market more liquid.
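The model-comparison procedure described in this abstract can be sketched as follows. This is a minimal illustration on synthetic data, not the paper's pipeline: the 2,291 records and seven predictors are replaced by random stand-ins, the genetic-algorithm linear model is omitted (scikit-learn has no direct equivalent), and the "hit ratio" is taken to mean classification accuracy under 10-fold cross-validation.

```python
# Hypothetical sketch of the five-method comparison: four classifiers
# evaluated by 10-fold cross-validation on a binary target (benchmark
# return reached or not). Data is synthetic, not the Korean NPL records.
from sklearn.datasets import make_classification
from sklearn.model_selection import cross_val_score, StratifiedKFold
from sklearn.discriminant_analysis import LinearDiscriminantAnalysis
from sklearn.linear_model import LogisticRegression
from sklearn.tree import DecisionTreeClassifier
from sklearn.neural_network import MLPClassifier

# Synthetic stand-in for 2,291 assets described by 7 predictors.
X, y = make_classification(n_samples=2291, n_features=7, random_state=0)

models = {
    "discriminant": LinearDiscriminantAnalysis(),
    "logistic": LogisticRegression(max_iter=1000),
    "decision_tree": DecisionTreeClassifier(max_depth=4, random_state=0),
    "neural_net": MLPClassifier(hidden_layer_sizes=(16,), max_iter=500, random_state=0),
}

cv = StratifiedKFold(n_splits=10, shuffle=True, random_state=0)
# Average accuracy over the 10 folds, analogous to the averaged hit ratio.
hit_ratios = {name: cross_val_score(m, X, y, cv=cv).mean() for name, m in models.items()}
for name, score in hit_ratios.items():
    print(f"{name}: {score:.4f}")
```

The fold-averaged scores are then compared directly, mirroring how the abstract ranks the five methodologies.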

The Prediction of Survival of Breast Cancer Patients Based on Machine Learning Using Health Insurance Claim Data (건강보험 청구 데이터를 활용한 머신러닝 기반유방암 환자의 생존 여부 예측)

  • Doeggyu Lee;Kyungkeun Byun;Hyungdong Lee;Sunhee Shin
    • Journal of Korea Society of Industrial Information Systems
    • /
    • v.28 no.2
    • /
    • pp.1-9
    • /
    • 2023
  • Research using AI and big data is being actively conducted in health and medical fields such as disease diagnosis and treatment. Most existing studies have used cohort data from research institutes or data from a subset of patients. In this paper, using health insurance review claim data held by the HIRA (Health Insurance Review and Assessment Service), we examined the difference in survival prediction rates, and in the factors affecting survival, between breast cancer patients in their 40s-50s and other age groups. The accuracy of predicting patient survival averaged 0.93 for patients in their 40s-50s, higher than the 0.86 for those in their 60s-80s. Among the influential factors, the number of treatments ranked highest for patients in their 40s-50s, while age ranked highest for those in their 60s-80s. In a performance comparison with previous studies, the average precision was 0.90, higher than the 0.81 reported in the existing paper. Comparing the applied algorithms, Decision Tree, Random Forest, and Gradient Boosting achieved an overall average precision of 0.90 with a recall of 1.0, and the multi-layer perceptron achieved a precision of 0.89 with a recall of 1.0. We hope that further research using machine learning automation (AutoML) tools accessible to non-experts will enhance the value of the health insurance review claim data held by the HIRA.
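The precision/recall comparison of the abstract's tree-based algorithms can be illustrated as below. This is a hedged sketch on an invented, class-imbalanced synthetic task; it does not use HIRA data, and the feature set is a placeholder.

```python
# Illustrative comparison (not the HIRA pipeline): the three ensemble
# algorithms named in the abstract, scored by precision and recall on a
# synthetic survival-prediction task with invented features.
from sklearn.datasets import make_classification
from sklearn.ensemble import RandomForestClassifier, GradientBoostingClassifier
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import train_test_split
from sklearn.metrics import precision_score, recall_score

X, y = make_classification(n_samples=1000, n_features=10,
                           weights=[0.2, 0.8], random_state=1)
X_tr, X_te, y_tr, y_te = train_test_split(X, y, stratify=y, random_state=1)

results = {}
for name, model in [
    ("decision_tree", DecisionTreeClassifier(random_state=1)),
    ("random_forest", RandomForestClassifier(random_state=1)),
    ("gradient_boosting", GradientBoostingClassifier(random_state=1)),
]:
    model.fit(X_tr, y_tr)
    pred = model.predict(X_te)
    results[name] = (precision_score(y_te, pred), recall_score(y_te, pred))

for name, (p, r) in results.items():
    print(f"{name}: precision={p:.2f} recall={r:.2f}")
```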

Analytical Evaluation of PPG Blood Glucose Monitoring System - researcher clinical trial (PPG 혈당 모니터링 시스템의 분석적 평가 - 연구자 임상)

  • Cheol-Gu Park;Sang-Ki Choi;Seong-Geun Jo;Kwon-Min Kim
    • Journal of Digital Convergence
    • /
    • v.21 no.3
    • /
    • pp.33-39
    • /
    • 2023
  • This study evaluates the performance of a blood glucose monitoring system (PPG-BGMS) that combines a PPG sensor with a DNN algorithm when monitoring capillary blood glucose. The study is a researcher-led clinical trial conducted with participants from September 2023 to November 2023. Blood glucose levels predicted by PPG-BGMS, generated by the DNN prediction algorithm from 1-minute heart rate and heart rate variability information, were compared with capillary blood glucose levels measured with the blood glucose meter of a standard personal blood glucose management system. Of the 100 participants, 50 had type 2 diabetes (T2DM), and the average age was 67 years (range, 28 to 89 years). 100% of the blood glucose levels predicted by PPG-BGMS fell within the A+B zones of both the Clarke error grid and the Parkes (Consensus) error grid. The MARD of the PPG-BGMS-predicted blood glucose was 5.3 ± 4.0%. Consequently, the non-blood-based PPG-BGMS was found to be non-inferior to instantaneous measurements from the clinical-standard blood-based personal blood glucose measurement system.
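MARD, the accuracy metric quoted in this abstract, is the mean absolute relative difference between predicted and reference glucose values. A minimal computation looks like this; the readings below are made-up illustrative values, not trial data.

```python
# MARD = mean of |predicted - reference| / reference, expressed in percent.
import numpy as np

def mard(predicted, reference):
    """Mean absolute relative difference, in percent."""
    predicted = np.asarray(predicted, dtype=float)
    reference = np.asarray(reference, dtype=float)
    return float(np.mean(np.abs(predicted - reference) / reference) * 100)

pred = [102, 148, 95, 210]   # PPG-predicted glucose, mg/dL (hypothetical)
ref = [100, 150, 100, 200]   # fingerstick reference, mg/dL (hypothetical)
print(f"MARD = {mard(pred, ref):.1f}%")
```

A lower MARD means the predicted values track the reference more closely; the 5.3% reported above is well within the range generally considered clinically usable.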

Predicting Crime Risky Area Using Machine Learning (머신러닝기반 범죄발생 위험지역 예측)

  • HEO, Sun-Young;KIM, Ju-Young;MOON, Tae-Heon
    • Journal of the Korean Association of Geographic Information Studies
    • /
    • v.21 no.4
    • /
    • pp.64-80
    • /
    • 2018
  • In Korea, citizens have access only to general information about crime, so it is difficult for them to know how much they are exposed to it. If the police could predict crime-risky areas, crime could be handled efficiently even with insufficient police and enforcement resources. However, there is no such prediction system in Korea, and related research is scarce. Against this background, the final goal of this study is to develop an automated crime prediction system. As a first step, we built a big data set consisting of local real crime records and urban physical and non-physical data; we then developed a crime prediction model through machine learning; finally, we assumed several possible scenarios, calculated the probability of crime, and visualized the results on a map to aid public understanding. Among the factors affecting crime occurrence identified in previous and case studies, the following were processed into a machine-learning-ready data set: real crime records, weather information (temperature, rainfall, wind speed, humidity, sunshine, insolation, snowfall, cloud cover), and local information (average building coverage, average floor area ratio, average building height, number of buildings, average appraised land value, average area of residential buildings, average number of ground floors). Among supervised machine learning algorithms, the decision tree, random forest, and SVM models, known to be powerful and accurate in various fields, were used to construct the crime prediction model. The decision tree model, having the lowest RMSE, was selected as the optimal prediction model. Based on this model, several scenarios were set for theft and violence, the most frequent crimes in the case city J, and the probability of crime was estimated on a 250 × 250 m grid. The high-crime-risk areas in case city J were found to occur in three patterns. The probability of crime was divided into three classes and visualized on the map by 250 × 250 m grid cells. In conclusion, we developed a crime prediction model using machine learning and visualized the risky areas on a map that can recalculate the model and re-visualize the results as time and urban conditions change.
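The RMSE-based model selection described above can be sketched as follows. Everything here is a synthetic stand-in: the grid-cell features and crime rates are invented, and only the three candidate families named in the abstract are compared.

```python
# Minimal sketch of the selection step: fit three candidate regressors on
# synthetic grid-cell features and keep the one with the lowest RMSE.
import numpy as np
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.svm import SVR
from sklearn.model_selection import train_test_split
from sklearn.metrics import mean_squared_error

rng = np.random.default_rng(0)
# Columns stand in for weather and built-environment variables per grid cell.
X = rng.random((500, 6))
y = X[:, 0] * 2 + X[:, 1] + rng.normal(scale=0.1, size=500)  # synthetic crime rate

X_tr, X_te, y_tr, y_te = train_test_split(X, y, random_state=0)
candidates = {
    "decision_tree": DecisionTreeRegressor(max_depth=5, random_state=0),
    "random_forest": RandomForestRegressor(random_state=0),
    "svm": SVR(),
}
rmse = {}
for name, model in candidates.items():
    model.fit(X_tr, y_tr)
    rmse[name] = mean_squared_error(y_te, model.predict(X_te)) ** 0.5

best = min(rmse, key=rmse.get)
print(f"best model: {best} (RMSE={rmse[best]:.3f})")
```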

Development of a New Personal Magnetic Field Exposure Estimation Method for Use in Epidemiological EMF Surveys among Children under 17 Years of Age

  • Yang, Kwang-Ho;Ju, Mun-No;Myung, Sung-Ho;Shin, Koo-Yong;Hwang, Gi-Hyun;Park, June-Ho
    • Journal of Electrical Engineering and Technology
    • /
    • v.7 no.3
    • /
    • pp.376-383
    • /
    • 2012
  • A number of scientific studies are currently being conducted on the potential health hazards of power-frequency electric and magnetic fields (EMF). A non-objective, psychological belief persists that they are harmful, although no scientific and objective proof of this exists. The possible health risk of ELF magnetic field (MF) exposure, especially for children under 17 years of age, is currently one of Korea's most highly contested social issues. Therefore, to assess the magnetic field exposure levels of these children in their general living environments, the personal MF exposure levels of 436 subjects were measured over about 6 years with government funding. Using the measured database, estimation formulas were developed to predict personal MF exposure levels. These formulas can serve as valuable tools for estimating 24-hour personal MF exposure without measuring it directly. Three types of estimation formulas were developed by applying evolutionary computation methods, namely genetic algorithm (GA) and genetic programming (GP). After tuning against the database, the three formulas with the smallest estimation error were selected, with a target estimation error of approximately 0.03 μT. The seven parameters of each formula are gender (G), age (A), house type (H), house size (HS), distance between the subject's residence and a power line (RD), power line voltage class (KV), and the usage conditions of electric appliances (RULE).
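The GA-based formula fitting can be illustrated in miniature. This is a hypothetical sketch, not the paper's method: the seven real parameters (G, A, H, HS, RD, KV, RULE) are replaced by random features, the formula is restricted to a linear form, and the GA is a bare elitist scheme with Gaussian mutation.

```python
# Toy GA: evolve the coefficients of a linear exposure formula to
# minimize mean absolute estimation error on synthetic (subject, exposure)
# data. All values and the 0.05 uT offset are invented for illustration.
import numpy as np

rng = np.random.default_rng(42)
X = rng.random((200, 7))                        # stand-ins for the 7 parameters
true_w = np.array([0.02, 0.01, 0.0, 0.03, -0.01, 0.02, 0.01])
y = X @ true_w + 0.05                           # synthetic 24-h exposure (uT)

def fitness(w):
    # Mean absolute estimation error of candidate formula X @ w[:7] + w[7].
    return float(np.mean(np.abs(X @ w[:7] + w[7] - y)))

pop = rng.normal(size=(50, 8)) * 0.1            # 50 candidate coefficient sets
for _ in range(200):                            # generations
    errors = np.array([fitness(w) for w in pop])
    parents = pop[np.argsort(errors)[:10]]      # keep the 10 best (elitism)
    children = parents[rng.integers(0, 10, 40)] + rng.normal(scale=0.01, size=(40, 8))
    pop = np.vstack([parents, children])        # elites + mutated offspring

best = pop[np.argmin([fitness(w) for w in pop])]
print(f"best estimation error: {fitness(best):.4f} uT")
```

Genetic programming, as also used in the study, would additionally evolve the *structure* of the formula rather than only its coefficients.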

Spectogram analysis of active power of appliances and LSTM-based Energy Disaggregation (다수 가전기기 유효전력의 스팩토그램 분석 및 LSTM기반의 전력 분해 알고리즘)

  • Kim, Imgyu;Kim, Hyuncheol;Kim, Seung Yun;Shin, Sangyong
    • Journal of the Korea Convergence Society
    • /
    • v.12 no.2
    • /
    • pp.21-28
    • /
    • 2021
  • In this study, we propose a deep-learning-based NILM (non-intrusive load monitoring) technique using measured power data for five kinds of home appliances and verify its effectiveness. For about 3 weeks, the active power of a central power measuring device and of five home appliances (refrigerator, induction cooktop, TV, washing machine, air cleaner) was measured individually. We introduce the preprocessing method for the measured data and analyze the characteristics of each appliance through spectrogram analysis. These per-appliance characteristics were organized into a training data set. The power data measured by the central device and the five appliances were mapped as time series, and training was performed with an LSTM neural network, which is well suited to time-series prediction. We propose an algorithm that can disaggregate the five appliances' energy using only the power data of the main central measuring device.
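The spectrogram step in the preprocessing can be illustrated as below. The trace is a synthetic active-power signal with a periodic on/off appliance cycle, an assumption standing in for the paper's measured data.

```python
# Illustrative spectrogram of a synthetic active-power trace: a 70 W
# appliance cycling on and off every 10 minutes on top of a 50 W base load.
import numpy as np
from scipy.signal import spectrogram

fs = 1.0                                   # one power sample per second
t = np.arange(0, 3600)                     # one hour of samples
power = 50 + 20 * (np.sin(2 * np.pi * t / 600) > 0)   # on/off cycling
power = power + np.random.default_rng(0).normal(scale=1.0, size=t.size)

# Short-time frequency content of the power signal; appliance cycling
# shows up as energy at low frequencies across time segments.
f, times, Sxx = spectrogram(power, fs=fs, nperseg=256)
print(Sxx.shape)   # (frequency bins, time segments)
```

In the NILM setting, such spectrogram features, computed per appliance, help distinguish devices whose raw power draws look similar.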

Application of POD reduced-order algorithm on data-driven modeling of rod bundle

  • Kang, Huilun;Tian, Zhaofei;Chen, Guangliang;Li, Lei;Wang, Tianyu
    • Nuclear Engineering and Technology
    • /
    • v.54 no.1
    • /
    • pp.36-48
    • /
    • 2022
  • As a valid numerical method for obtaining high-resolution flow-field results, computational fluid dynamics (CFD) has been widely used to study coolant flow and heat transfer characteristics in fuel rod bundles. However, the time-consuming, iterative solution of the Navier-Stokes equations makes CFD unsuitable for scenarios that require efficient simulation, such as sensitivity analysis and uncertainty quantification. To solve this problem, a reduced-order model (ROM) based on proper orthogonal decomposition (POD) and machine learning (ML) is proposed to simulate the flow field efficiently. Firstly, a validated CFD model is established to output the flow-field data set of the rod bundle. Secondly, the modes and corresponding coefficients of the flow field are extracted with the POD method. Then, a deep feed-forward neural network, selected for its efficiency in approximating arbitrary functions and its ability to handle high-dimensional, strongly nonlinear problems, is used to build a model that maps the nonlinear relationship between the mode coefficients and the boundary conditions. A trained surrogate model for mode-coefficient prediction is obtained after a certain number of training iterations. Finally, the flow field is reconstructed by combining the product of the POD basis and the coefficients. An evaluation of the ROM on the test data set shows that the proposed POD-ROM accurately describes the flow in the rod bundle at high resolution in only a few milliseconds.
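The POD extraction and reconstruction steps can be sketched compactly via the SVD. The snapshots below are a synthetic low-rank field, an assumption standing in for the paper's CFD output; the neural-network surrogate that predicts the coefficients from boundary conditions is omitted.

```python
# POD sketch: snapshot matrix -> SVD -> truncated modes and coefficients
# -> reconstruction. Snapshots are synthetic, not rod-bundle CFD data.
import numpy as np

rng = np.random.default_rng(0)
n_points, n_snapshots = 500, 40
# Synthetic flow-field snapshots dominated by two spatial patterns.
modes_true = rng.random((n_points, 2))
snapshots = (modes_true @ rng.random((2, n_snapshots))
             + 0.01 * rng.normal(size=(n_points, n_snapshots)))

# Left singular vectors U are the spatial POD modes; diag(S) @ Vt gives
# the per-snapshot mode coefficients the surrogate model would predict.
U, S, Vt = np.linalg.svd(snapshots, full_matrices=False)
r = 2                                         # retain the two dominant modes
coeffs = np.diag(S[:r]) @ Vt[:r]
reconstructed = U[:, :r] @ coeffs

rel_error = np.linalg.norm(snapshots - reconstructed) / np.linalg.norm(snapshots)
print(f"rank-{r} reconstruction error: {rel_error:.4f}")
```

Because the reconstruction is just a small matrix product, evaluating the ROM is orders of magnitude cheaper than re-running the CFD solver, which is the efficiency the abstract reports.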

Decision based uncertainty model to predict rockburst in underground engineering structures using gradient boosting algorithms

  • Kidega, Richard;Ondiaka, Mary Nelima;Maina, Duncan;Jonah, Kiptanui Arap Too;Kamran, Muhammad
    • Geomechanics and Engineering
    • /
    • v.30 no.3
    • /
    • pp.259-272
    • /
    • 2022
  • Rockburst is a dynamic, multivariate, and nonlinear phenomenon that occurs in underground mining and civil engineering structures. Predicting rockburst is challenging because conventional models are not standardized; machine learning techniques can therefore improve prediction accuracy. This study describes decision-based uncertainty models to predict rockburst in underground engineering structures using gradient boosting algorithms (GBM). The model input variables were uniaxial compressive strength (UCS), uniaxial tensile strength (UTS), maximum tangential stress (MTS), excavation depth (D), stress ratio (SR), and brittleness coefficient (BC). Several models were trained using different combinations of the input variables and a 3-fold cross-validation resampling procedure. The hyperparameters, comprising learning rate, number of boosting iterations, tree depth, and minimum number of observations, were tuned to obtain the optimum models. Model performance was tested using classification accuracy, Cohen's kappa coefficient (k), sensitivity, and specificity. After optimizing the ROC metrics, the best-performing model showed classification accuracy, k, sensitivity, and specificity of 98%, 93%, 1.00, and 0.957, respectively. The most and least influential input variables were MTS and BC, respectively. Partial dependence plots revealed the relationship between changes in the input variables and the model predictions. The findings show that GBM can be used to anticipate rockburst and to guide decisions about support requirements before mining development.
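The tuning procedure described above can be sketched as a small grid search. This is a hedged stand-in, not the paper's setup: the rockburst data is replaced by a synthetic six-feature set standing in for UCS, UTS, MTS, D, SR, and BC, and only three of the hyperparameters are varied.

```python
# Gradient boosting with 3-fold cross-validation over learning rate,
# number of boosting iterations, and tree depth, on synthetic data.
from sklearn.datasets import make_classification
from sklearn.ensemble import GradientBoostingClassifier
from sklearn.model_selection import GridSearchCV

# Synthetic stand-in for the six rock-mechanics input variables.
X, y = make_classification(n_samples=300, n_features=6, random_state=0)

grid = {
    "learning_rate": [0.05, 0.1],
    "n_estimators": [50, 100],
    "max_depth": [2, 3],
}
search = GridSearchCV(GradientBoostingClassifier(random_state=0), grid, cv=3)
search.fit(X, y)
print(search.best_params_, f"accuracy={search.best_score_:.3f}")
```

The `min_samples_leaf` parameter of `GradientBoostingClassifier` plays the role of the "minimum number of observations" hyperparameter mentioned in the abstract and could be added to the grid the same way.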

Application Verification of AI&Thermal Imaging-Based Concrete Crack Depth Evaluation Technique through Mock-up Test (Mock-up Test를 통한 AI 및 열화상 기반 콘크리트 균열 깊이 평가 기법의 적용성 검증)

  • Jeong, Sang-Gi;Jang, Arum;Park, Jinhan;Kang, Chang-hoon;Ju, Young K.
    • Journal of Korean Association for Spatial Structures
    • /
    • v.23 no.3
    • /
    • pp.95-103
    • /
    • 2023
  • With the increasing number of aging buildings across Korea, maintenance technologies have emerged rapidly; one such technology is non-contact detection of concrete cracks from thermal images. This study aims to develop a technique that accurately predicts crack depth by analyzing the temperature difference between the cracked area and the sound area in a thermal image of concrete. We obtained temperature data through thermal imaging experiments and constructed a big data set including outdoor variables, such as air temperature, illumination, and humidity, that can influence the temperature differences. Based on the collected data, we designed a machine learning algorithm for learning and predicting crack depth. Initially, standardized crack specimens were used in the experiments, and the data set was then extended with specimens similar to actual cracks. Finally, a crack depth prediction technique was implemented using five regression algorithms on approximately 24,000 data points. To confirm the practicality of the developed technique, crack simulators with various shapes were added to the study.
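A five-algorithm regression comparison of the kind described above might look as follows. This is an illustrative stand-in: the data, the feature set (temperature difference plus outdoor variables), and the particular choice of five regressors are assumptions, since the abstract does not name the algorithms.

```python
# Synthetic sketch: predicting crack depth (mm) from the crack-vs-sound
# temperature gap and outdoor variables, comparing five regressors by R^2.
import numpy as np
from sklearn.linear_model import LinearRegression, Ridge, Lasso
from sklearn.tree import DecisionTreeRegressor
from sklearn.ensemble import RandomForestRegressor
from sklearn.model_selection import cross_val_score

rng = np.random.default_rng(3)
n = 1000
temp_diff = rng.uniform(0.1, 5.0, n)     # crack vs. sound-area gap (deg C)
air_temp = rng.uniform(-5, 35, n)
humidity = rng.uniform(20, 90, n)
illum = rng.uniform(0, 100, n)
X = np.column_stack([temp_diff, air_temp, humidity, illum])
# Invented ground truth: depth driven mainly by the temperature gap.
depth = 2.0 * temp_diff + 0.01 * air_temp + rng.normal(scale=0.2, size=n)

models = {
    "linear": LinearRegression(),
    "ridge": Ridge(),
    "lasso": Lasso(alpha=0.01),
    "tree": DecisionTreeRegressor(max_depth=6, random_state=0),
    "forest": RandomForestRegressor(n_estimators=50, random_state=0),
}
scores = {name: cross_val_score(m, X, depth, cv=5, scoring="r2").mean()
          for name, m in models.items()}
for name, s in sorted(scores.items(), key=lambda kv: -kv[1]):
    print(f"{name}: R2={s:.3f}")
```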

Identifying Atrial Fibrillation With Sinus Rhythm Electrocardiogram in Embolic Stroke of Undetermined Source: A Validation Study With Insertable Cardiac Monitors

  • Ki-Hyun Jeon;Jong-Hwan Jang;Sora Kang;Hak Seung Lee;Min Sung Lee;Jeong Min Son;Yong-Yeon Jo;Tae Jun Park;Il-Young Oh;Joon-myoung Kwon;Ji Hyun Lee
    • Korean Circulation Journal
    • /
    • v.53 no.11
    • /
    • pp.758-771
    • /
    • 2023
  • Background and Objectives: Paroxysmal atrial fibrillation (AF) is a major potential cause of embolic stroke of undetermined source (ESUS). However, identifying AF remains challenging because it occurs sporadically. Deep learning can identify hidden AF from the sinus rhythm (SR) electrocardiogram (ECG). We combined known AF risk factors with a deep learning algorithm (DLA) for predicting AF to optimize diagnostic performance in ESUS patients. Methods: A DLA was developed to identify AF from SR 12-lead ECGs using a database of AF and non-AF patients. The accuracy of the DLA was validated in 221 ESUS patients who underwent insertable cardiac monitor (ICM) insertion to detect AF. Results: A total of 44,085 ECGs from 12,666 patients were used to develop the DLA. Internal validation of the DLA showed an area under the curve (AUC) of 0.862 (95% confidence interval, 0.850-0.873) in receiver operating characteristic analysis. In external validation on the 221 ESUS patients, the diagnostic accuracy and AUC of the DLA were 0.811 and 0.827, respectively, and the DLA outperformed conventional predictive models, including CHARGE-AF, C2HEST, and HATCH. A combined model comprising atrial ectopic burden, left atrial diameter, and the DLA showed excellent performance in AF prediction, with an AUC of 0.906. Conclusions: The DLA accurately identified paroxysmal AF from 12-lead SR ECGs in patients with ESUS and outperformed the conventional models. The DLA together with traditional AF risk factors could be a useful tool to identify paroxysmal AF in ESUS patients.
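The combined-model idea, fusing a network's AF score with clinical features, can be sketched as a logistic regression evaluated by AUC. All data below is synthetic and the relationships are invented assumptions; this is not the study's model or cohort.

```python
# Hedged sketch: combine a synthetic DLA score with two clinical features
# (atrial ectopic burden, left atrial diameter) via logistic regression
# and score the fused model by AUC against ICM-confirmed AF labels.
import numpy as np
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import roc_auc_score
from sklearn.model_selection import train_test_split

rng = np.random.default_rng(7)
n = 1000
af = rng.integers(0, 2, n)                                       # AF label (hypothetical)
dla_score = np.clip(0.6 * af + rng.normal(0.2, 0.2, n), 0, 1)    # synthetic DLA output
ectopic_burden = rng.exponential(1.0, n) * (1 + af)              # synthetic ectopy
la_diameter = 38 + 4 * af + rng.normal(0, 3, n)                  # synthetic LA size (mm)

X = np.column_stack([dla_score, ectopic_burden, la_diameter])
X_tr, X_te, y_tr, y_te = train_test_split(X, af, random_state=7, stratify=af)
combined = LogisticRegression(max_iter=1000).fit(X_tr, y_tr)
auc = roc_auc_score(y_te, combined.predict_proba(X_te)[:, 1])
print(f"combined-model AUC: {auc:.3f}")
```

On the synthetic data the fused model scores higher than any single feature would alone, which is the qualitative effect the abstract reports for its combined model.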