• Title/Summary/Keyword: financial machine learning

Selecting Stock by Value Investing based on Machine Learning: Focusing on Intrinsic Value (머신러닝 기반 가치투자를 통한 주식 종목 선정 연구: 내재가치를 중심으로)

  • Kim, Youn Seung;Yoo, Dong Hee
    • The Journal of Information Systems / v.32 no.1 / pp.179-199 / 2023
  • Purpose: This study builds a prediction model to find stocks likely to reach their intrinsic value among KOSPI- and KOSDAQ-listed companies, with the aim of improving the stability and profitability of stock investment. Investment simulations then verify whether investment performance improves by comparing the prediction model against random stock selection and the market indexes. Design/methodology/approach: Value investment theory and machine learning techniques are applied to build the model. Experiments identify the best conditions, including the algorithm with the best predictive performance, the learning period, and the intrinsic value-reaching period. The study selects stocks through the prediction model trained with inventive variables, does not limit the holding period after purchase until a stock reaches its intrinsic value, and targets all KOSPI and KOSDAQ companies. Stock and financial data were collected for 21 years (2001-2021). Findings: In the experiments, the random forest model performed best with a one-year learning period and an intrinsic value-reaching period of within one year. In the investment simulation, the cumulative return of the prediction model was up to 1.68 times higher than random stock selection and 17 times higher than the KOSPI index. The usefulness of the prediction model was confirmed in that the number of predicted stocks reaching their intrinsic value was up to 70% higher than under random selection.

Stock Price Direction Prediction Using Convolutional Neural Network: Emphasis on Correlation Feature Selection (합성곱 신경망을 이용한 주가방향 예측: 상관관계 속성선택 방법을 중심으로)

  • Kyun Sun Eo;Kun Chang Lee
    • Information Systems Review / v.22 no.4 / pp.21-39 / 2020
  • Recently, deep learning has shown high performance in various applications such as pattern analysis and image classification. Stock market forecasting, known as an especially difficult task in machine learning research, is an area where many researchers are verifying the effectiveness of deep learning techniques. This study proposes a Convolutional Neural Network (CNN) model to predict the direction of stock prices. We then use a feature selection method to improve the performance of the model and compare the performance of machine learning classifiers against the CNN. The classifiers used in this study are Logistic Regression, Decision Tree, Neural Network, Support Vector Machine, AdaBoost, Bagging, and Random Forest. The results confirm that the CNN showed higher performance than the other classifiers when feature selection was applied. The results also show that the CNN model effectively predicted the stock price direction by analyzing the embedded values of the financial data.

Hybrid Machine Learning Model for Predicting the Direction of KOSPI Securities (코스피 방향 예측을 위한 하이브리드 머신러닝 모델)

  • Hwang, Heesoo
    • Journal of the Korea Convergence Society / v.12 no.6 / pp.9-16 / 2021
  • There have been various studies on predicting the stock market with machine learning techniques using stock price data and financial big data. As stock index ETFs that can be traded through HTS and MTS have been created, research on predicting stock indices has recently attracted attention. In this paper, machine learning models for predicting KOSPI's up and down movements are implemented separately. These models are optimized through a grid search of their control parameters. In addition, a hybrid machine learning model that combines the individual models is proposed to improve the precision and increase the ETF trading return. The performance of the prediction models is evaluated by the accuracy and by the precision, which determines the ETF trading return. The accuracy and precision of the hybrid up prediction model are 72.1% and 63.8%, and those of the down prediction model are 79.8% and 64.3%. The precision of the hybrid down prediction model is improved by at least 14.3% and at most 20.5%. The hybrid up and down prediction models show ETF trading returns of 10.49% and 25.91%, respectively. Trading inverse ×2 and leveraged ETFs can increase the return by 1.5 to 2 times. Further research on a down prediction machine learning model is expected to increase the rate of return.
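
The abstract above evaluates the models by accuracy and precision. A minimal Python sketch of the two metrics follows, assuming binary labels where 1 marks the predicted event (for example, an up day). The function name and the toy data are illustrative assumptions and do not come from the paper.

```python
def accuracy_and_precision(y_true, y_pred):
    """Return (accuracy, precision) for binary labels, where 1 is the positive class."""
    if len(y_true) != len(y_pred) or not y_true:
        raise ValueError("inputs must be non-empty and of equal length")
    pairs = list(zip(y_true, y_pred))
    tp = sum(1 for t, p in pairs if t == 1 and p == 1)   # true positives
    fp = sum(1 for t, p in pairs if t == 0 and p == 1)   # false positives
    correct = sum(1 for t, p in pairs if t == p)
    accuracy = correct / len(pairs)
    # Precision is undefined when the model never predicts the positive class.
    precision = tp / (tp + fp) if (tp + fp) else float("nan")
    return accuracy, precision

# Toy data (hypothetical): 1 = index went up, 0 = it did not.
actual    = [1, 0, 1, 1, 0, 0, 1, 0]
predicted = [1, 0, 0, 1, 1, 0, 1, 0]
acc, prec = accuracy_and_precision(actual, predicted)
print(f"accuracy={acc:.3f} precision={prec:.3f}")
```

Precision counts only the cases where the model signals the event, which is presumably why the abstract ties it to the trading return: a trade would be placed only on a positive prediction.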

Trends in Patents for Numerical Analysis-Based Financial Instruments Valuation Systems (수치해석 기반 금융상품 가치평가 시스템 특허 동향)

  • Moonseong Kim
    • Journal of Internet Computing and Services / v.24 no.6 / pp.41-47 / 2023
  • Financial instrument valuation continues to evolve due to various technological changes. Recently, there has been increased interest in valuation using machine learning and artificial intelligence, enabling the financial market to adapt swiftly to changes. This technological advancement caters to the demand for real-time data processing and facilitates accurate and effective valuation, considering the diverse nature of the financial market. Numerical analysis techniques serve as crucial decision-making tools among financial institutions and investors, and are acknowledged as essential for performance prediction and risk management in investments. This paper analyzes Korean patent trends in numerical analysis-based financial systems, considering the diverse shifts in the financial market and asset data to provide accurate predictions. This study could shed light on the advancement of financial technology and serve as a gauge for technological standards within the financial market.

Corporate Default Prediction Model Using Deep Learning Time Series Algorithm, RNN and LSTM (딥러닝 시계열 알고리즘 적용한 기업부도예측모형 유용성 검증)

  • Cha, Sungjae;Kang, Jungseok
    • Journal of Intelligence and Information Systems / v.24 no.4 / pp.1-32 / 2018
  • Corporate defaults affect not only the stakeholders of bankrupt companies, including managers, employees, creditors, and investors, but also have a ripple effect on the local and national economy. Before the Asian financial crisis, the Korean government only analyzed SMEs and tried to improve the forecasting power of a default prediction model, rather than developing various corporate default models. As a result, even large corporations called 'chaebol enterprises' went bankrupt. Even after that, the analysis of past corporate defaults has focused on specific variables, and when the government restructured immediately after the global financial crisis, it focused only on certain main variables such as 'debt ratio'. A multifaceted study of corporate default prediction models is essential to protect diverse interests and to avoid a sudden total collapse like the 'Lehman Brothers Case' of the global financial crisis. The key variables used in corporate defaults vary over time. This is confirmed by the analyses of Beaver (1967, 1968) and Altman (1968), and Deakin's (1972) study shows that the major factors affecting corporate failure have changed. In Grice's (2001) study, the importance of predictive variables was also found through Zmijewski's (1984) and Ohlson's (1980) models. However, past studies use static models, and most of them do not consider changes that occur over time. Therefore, in order to construct consistent prediction models, it is necessary to compensate for the time-dependent bias by means of a time series analysis algorithm that reflects dynamic change. Centered on the global financial crisis, which had a significant impact on Korea, this study uses 10 years of annual corporate data from 2000 to 2009. The data are divided into training, validation, and test sets covering 7, 2, and 1 years, respectively.
In order to construct a consistent bankruptcy model as conditions change over time, we first train a time series deep learning algorithm model using the data before the financial crisis (2000~2006). The parameters of the existing models and the deep learning time series algorithm are tuned with validation data that include the financial crisis period (2007~2008). As a result, we construct a model that shows a pattern similar to the results on the learning data and has excellent prediction power. After that, each bankruptcy prediction model is rebuilt by integrating the learning data and validation data again (2000~2008), applying the optimal parameters found in the previous validation. Finally, each corporate default prediction model, trained on nine years of data, is evaluated and compared using the test data (2009). The usefulness of the corporate default prediction model based on the deep learning time series algorithm is thereby demonstrated. In addition, by adding Lasso regression analysis to the existing variable selection methods (multiple discriminant analysis, logit model), it is shown that the deep learning time series algorithm model based on the three bundles of variables is useful for robust corporate default prediction. The definition of bankruptcy used is the same as that of Lee (2015). Independent variables include financial information such as financial ratios used in previous studies. Multivariate discriminant analysis, the logit model, and the Lasso regression model are used to select the optimal variable group. The multivariate discriminant analysis model proposed by Altman (1968), the logit model proposed by Ohlson (1980), the non-time series machine learning algorithms, and the deep learning time series algorithms are compared. In the case of corporate data, there are limitations of 'nonlinear variables', 'multi-collinearity' of variables, and 'lack of data'.
The logit model is nonlinear, the Lasso regression model addresses the multi-collinearity problem, and the deep learning time series algorithm using the variable data generation method compensates for the lack of data. Big data technology, a leading technology of the future, is moving from simple human analysis to automated AI analysis, and finally towards intertwined AI applications. Although the study of corporate default prediction models using time series algorithms is still in its early stages, the deep learning algorithm is much faster than regression analysis at corporate default prediction modeling, and it also has greater predictive power. Through the Fourth Industrial Revolution, the current government and other overseas governments are working hard to integrate the system into the everyday life of their nation and society. Yet deep learning time series research for the financial industry is still insufficient. This is an initial study on deep learning time series algorithm analysis of corporate defaults. It is therefore hoped that it will serve as comparative analysis material for non-specialists beginning research that combines financial data and deep learning time series algorithms.
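
The chronological split described in this abstract (training on 2000-2006, validation on 2007-2008, testing on 2009, then refitting on 2000-2008) can be sketched in Python as follows. The record layout, the function name `split_by_year`, and the placeholder labels are assumptions for illustration, not the authors' code.

```python
def split_by_year(records, train_end=2006, valid_end=2008):
    """Split records chronologically; each record is a dict with a 'year' key."""
    train = [r for r in records if r["year"] <= train_end]
    valid = [r for r in records if train_end < r["year"] <= valid_end]
    test = [r for r in records if r["year"] > valid_end]
    return train, valid, test

# Hypothetical firm-year records; 'default' is a placeholder label.
records = [{"firm": "A", "year": y, "default": 0} for y in range(2000, 2010)]
train, valid, test = split_by_year(records)

# After hyperparameters are tuned on the validation years, the abstract
# describes refitting on training + validation data before the final test.
refit = train + valid
print(len(train), len(valid), len(test), len(refit))
```

Splitting by year rather than at random keeps future observations out of the training set, which matters when the predictors of default drift over time as the abstract argues.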

A Case Study on the Establishment of an Equity Investment Optimization Model based on FinTech: For Institutional Investors (핀테크 기반 주식투자 최적화 모델 구축 사례 연구 : 기관투자자 대상)

  • Kim, Hong Gon;Kim, Sodam;Kim, Hee-Woong
    • Knowledge Management Research / v.19 no.1 / pp.97-118 / 2018
  • The finance-investment industry is currently focusing on research related to artificial intelligence and big data, moving beyond conventional theories of financial engineering. However, cases of equity portfolio optimization using artificial intelligence and big data, and reports of their performance, are rarely seen in practice. The purpose of this study is therefore to propose process improvements in equity selection, information analysis, and portfolio composition, and ultimately an improvement in portfolio returns, through the case of an equity optimization model based on quantitative research by artificial intelligence. This paper is an empirical study of the artificial intelligence-based portfolio of "D" asset management, the largest domestic active-quant fiduciary manager, which fits the purpose of this paper. The study applies artificial intelligence to finance, analyzing financial and demand-supply information and automating factor selection and equity weighting through machine learning based on an artificial neural network. The learning process for portfolio optimization and its performance when genetic algorithms are applied to the models are also documented. This study presents a model with which the asset management industry can achieve continuous and stable excess performance, low costs, and high efficiency in the investment process.

Experimental Analysis of Bankruptcy Prediction with SHAP framework on Polish Companies

  • Tuguldur Enkhtuya;Dae-Ki Kang
    • International journal of advanced smart convergence / v.12 no.1 / pp.53-58 / 2023
  • With the rapid development of artificial intelligence, users are demanding explanations of algorithm results and want to know which parameters influence them. In this paper, we propose an interpretable model for bankruptcy prediction using the SHAP framework. SHAP (SHapley Additive exPlanations) is a framework that gives visualized results that can be used to explain and interpret machine learning models. As a result, we can describe which features are important for the output of our deep learning model. The SHAP force plot shows the top features that mainly drive the overall model score. Even though Fully Connected Neural Networks (FCNNs) are a "black box" model, Shapley values help alleviate the "black box" problem. FCNNs perform well on a complex dataset with more than 60 financial ratios. Combined with the SHAP framework, we create an effective model with an understandable interpretation. Because bankruptcy is a rare event, we address the imbalanced dataset problem with SMOTE. SMOTE is an oversampling technique that generates synthetic samples for the minority class. It uses the k-nearest neighbors algorithm and produces new examples along the line segments connecting a sample to its neighbors. We expect our model results to assist financial analysts who are interested in forecasting company bankruptcy in detail.
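
The SMOTE idea summarized in this abstract, interpolating between a minority sample and one of its nearest minority neighbors, can be illustrated with a small standard-library Python sketch. This is a simplified illustration under stated assumptions, not the authors' pipeline or the implementation in any particular library. The function name `smote_sketch`, its parameters, and the sample data are hypothetical.

```python
import random

def smote_sketch(minority, n_new, k=2, seed=0):
    """Generate n_new synthetic points by interpolating toward nearest minority neighbors."""
    if len(minority) < 2:
        raise ValueError("need at least two minority samples")
    rng = random.Random(seed)
    k = min(k, len(minority) - 1)
    synthetic = []
    for _ in range(n_new):
        base = rng.choice(minority)
        # Rank the other minority points by squared Euclidean distance to base.
        others = [p for p in minority if p is not base]
        others.sort(key=lambda p: sum((a - b) ** 2 for a, b in zip(base, p)))
        neighbor = rng.choice(others[:k])      # one of the k nearest neighbors
        gap = rng.random()                     # position along the segment base -> neighbor
        synthetic.append(tuple(a + gap * (b - a) for a, b in zip(base, neighbor)))
    return synthetic

# Hypothetical minority-class (bankrupt) samples with two financial ratios each.
bankrupt = [(0.1, 0.9), (0.2, 0.8), (0.15, 0.95), (0.3, 0.7)]
new_points = smote_sketch(bankrupt, n_new=5)
print(new_points)
```

Every synthetic point lies on a segment between two real minority samples, so the oversampled class stays inside the region the original minority data already occupies.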

A Comparative Analysis of Ensemble Learning-Based Classification Models for Explainable Term Deposit Subscription Forecasting (설명 가능한 정기예금 가입 여부 예측을 위한 앙상블 학습 기반 분류 모델들의 비교 분석)

  • Shin, Zian;Moon, Jihoon;Rho, Seungmin
    • The Journal of Society for e-Business Studies / v.26 no.3 / pp.97-117 / 2021
  • Predicting term deposit subscriptions is a representative financial marketing task in banks, and banks can build a prediction model using various customer information. To improve the classification accuracy for term deposit subscriptions, many studies have been conducted based on machine learning techniques. However, even if these models achieve satisfactory performance, using them in industry is not easy when their decision-making process is not adequately explained. To address this issue, this paper proposes an explainable scheme for term deposit subscription forecasting. We first construct several classification models using decision tree-based ensemble learning methods, which yield excellent performance on tabular data: random forest, gradient boosting machine (GBM), extreme gradient boosting (XGB), and light gradient boosting machine (LightGBM). We then analyze their classification performance in depth through 10-fold cross-validation. After that, we provide the rationale for interpreting the influence of customer information and the decision-making process by applying Shapley additive explanations (SHAP), an explainable artificial intelligence technique, to the best classification model. To verify the practicality and validity of our scheme, experiments were conducted with the bank marketing dataset provided by Kaggle; we applied SHAP to the GBM and LightGBM models, respectively, according to different dataset configurations and then performed analysis and visualization for explainable term deposit subscriptions.
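
SHAP attributions are built on Shapley values. As a hedged illustration of that underlying concept only, the Python sketch below computes exact Shapley values for a toy scoring function by averaging each feature's marginal contribution over all feature orderings. It does not use the SHAP library or its tree-specific algorithms, and the feature names and score values are made up.

```python
from itertools import permutations

def shapley_values(value_fn, features):
    """Exact Shapley values: average marginal contribution over all feature orderings."""
    totals = {f: 0.0 for f in features}
    orderings = list(permutations(features))
    for order in orderings:
        present = set()
        for f in order:
            before = value_fn(frozenset(present))
            present.add(f)
            totals[f] += value_fn(frozenset(present)) - before
    return {f: t / len(orderings) for f, t in totals.items()}

# Hypothetical model output for one customer as a function of which
# features are "known"; all numbers are invented for illustration.
def toy_model(known):
    score = 0.10                       # baseline subscription score
    if "duration" in known:
        score += 0.30
    if "balance" in known:
        score += 0.10
    if "duration" in known and "balance" in known:
        score += 0.20                  # interaction shared between the two features
    return score

phi = shapley_values(toy_model, ["duration", "balance", "age"])
print(phi)
```

The attributions sum to the difference between the full-feature score and the baseline, which is the additive property that makes this kind of explanation readable. Exact enumeration grows factorially with the number of features, so it is practical only for toy cases like this one.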

P-Triple Barrier Labeling: Unifying Pair Trading Strategies and Triple Barrier Labeling Through Genetic Algorithm Optimization

  • Ning Fu;Suntae Kim
    • International journal of advanced smart convergence / v.12 no.4 / pp.111-118 / 2023
  • In the ever-changing landscape of finance, the fusion of artificial intelligence (AI) and pair trading strategies has captured the interest of investors and institutions alike. In supervised machine learning, crafting precise and accurate labels is crucial if AI models are to surpass traditional pair trading methods. However, prevailing labeling techniques in the financial sector concentrate predominantly on individual assets, which makes them difficult to align with pair trading strategies. To address this issue, we propose an approach that combines the Triple Barrier Labeling technique with pair trading and optimizes the resulting labels through genetic algorithms. Backtesting on cryptocurrency datasets shows that the proposed labeling method outperforms traditional pair trading methods and the corresponding buy-and-hold strategies in both profitability and risk control. This method offers a novel perspective on trading strategies and risk management within the financial domain, laying groundwork for further improving the precision and reliability of pair trading strategies that use AI models.
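
For readers unfamiliar with triple barrier labeling, the Python sketch below labels entry points on a single series with an upper barrier, a lower barrier, and a time limit. It shows the generic technique only. The pair-trading adaptation and genetic algorithm optimization proposed in the paper are not reproduced here, and the function name, thresholds, and prices are illustrative assumptions.

```python
def triple_barrier_label(prices, start, upper_pct, lower_pct, max_holding):
    """Label one entry: +1 if the upper barrier is hit first, -1 if the lower, 0 on timeout."""
    entry = prices[start]
    upper = entry * (1 + upper_pct)    # profit-taking barrier
    lower = entry * (1 - lower_pct)    # stop-loss barrier
    end = min(start + max_holding, len(prices) - 1)
    for t in range(start + 1, end + 1):
        if prices[t] >= upper:
            return 1
        if prices[t] <= lower:
            return -1
    return 0                           # vertical (time) barrier reached first

# Hypothetical price series; for pair trading the input could instead be a spread.
prices = [100, 101, 103, 99, 96, 97, 98, 105]
labels = [triple_barrier_label(prices, i, 0.02, 0.03, 3) for i in range(len(prices) - 1)]
print(labels)
```

The barrier widths and the holding limit are the kind of parameters an optimizer could search over, which is where a genetic algorithm might plug in.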

Predictive Analysis of Financial Fraud Detection using Azure and Spark ML

  • Priyanka Purushu;Niklas Melcher;Bhagyashree Bhagwat;Jongwook Woo
    • Asia pacific journal of information systems / v.28 no.4 / pp.308-319 / 2018
  • This paper aims to provide insights into financial fraud detection on mobile money transaction activity. We predicted and classified transactions as normal or fraudulent with a small sample and a massive data set using Azure and Spark ML, which represent traditional systems and Big Data, respectively. Experimenting with the sample dataset in Azure, we found that the Decision Forest model is the most accurate in terms of recall. For the massive data set using Spark ML, the Random Forest classifier proved to be the best algorithm. The Spark cluster builds and evaluates models much faster as more servers are added to the cluster, with the same accuracy, which shows that large-scale data sets can be used for prediction on a Big Data platform. Finally, we reached a recall score of 0.73, which implies satisfactory quality in predicting fraudulent transactions.