• Title/Summary/Keyword: financial machine learning

Search Result 145, Processing Time 0.025 seconds

Some Observations for Portfolio Management Applications of Modern Machine Learning Methods

  • Park, Jooyoung;Heo, Seongman;Kim, Taehwan;Park, Jeongho;Kim, Jaein;Park, Kyungwook
    • International Journal of Fuzzy Logic and Intelligent Systems
    • /
    • v.16 no.1
    • /
    • pp.44-51
    • /
    • 2016
  • Recently, artificial intelligence has reached the level of top information technologies that will have significant influence over many aspects of our future lifestyles. In particular, in the fields of machine learning technologies for classification and decision-making, there have been a lot of research efforts for solving estimation and control problems that appear in the various kinds of portfolio management problems via data-driven approaches. Note that these modern data-driven approaches, which try to find solutions to the problems based on relevant empirical data rather than mathematical analyses, are useful particularly in practical application domains. In this paper, we consider some applications of modern data-driven machine learning methods for portfolio management problems. More precisely, we apply a simplified version of the sparse Gaussian process (GP) classification method for classifying users' sensitivity with respect to financial risk, and then present two portfolio management issues in which the GP application results can be useful. Experimental results show that the GP applications work well in handling simulated data sets.

Identifying Cluster Patterns in Relationship Between Municipal Revenue Configuration and Fiscal Surplus: Application of Machine Learning Methodologies

  • Im Chunghyeok;Ryou Jaemin;Han JunHyun;Bae Jayon
    • International Journal of Advanced Culture Technology
    • /
    • v.12 no.3
    • /
    • pp.159-164
    • /
    • 2024
  • Net surplus serves as a crucial indicator of how efficiently local governments utilize their resources. This study aims to analyze and categorize the patterns of net surplus across 75 local governments in Korea. By employing machine learning techniques such as K-means clustering and silhouette analysis, this research delves into surplus patterns, revealing insights that differ from those provided by traditional analytical methods. Machine learning enables a broader spectrum of discoveries, leading us to identify three distinct clusters in the net surplus of Korean local finances. The characteristics of these three clusters show that the wealthiest cities have the highest surplus ratios. In contrast, mid-sized municipalities, constrained by limited central government support and scarce local resources, exhibit the lowest surplus ratios. Interestingly, a significant number of cities maintain a median surplus ratio even under challenging fiscal conditions. Additionally, we identify critical thresholds that differentiate the three clusters: a grant-in-aid ratio of 19.31%, a debt ratio of 3.52%, and a local tax ratio of 25.58%. This identification of thresholds is a key contribution of our study, as these specific thresholds have not been previously addressed in the literature.

A Classification Model for Illegal Debt Collection Using Rule and Machine Learning Based Methods

  • Kim, Tae-Ho;Lim, Jong-In
    • Journal of the Korea Society of Computer and Information
    • /
    • v.26 no.4
    • /
    • pp.93-103
    • /
    • 2021
  • Despite the efforts of financial authorities in conducting the direct management and supervision of collection agents and bond-collecting guideline, the illegal and unfair collection of debts still exist. To effectively prevent such illegal and unfair debt collection activities, we need a method for strengthening the monitoring of illegal collection activities even with little manpower using technologies such as unstructured data machine learning. In this study, we propose a classification model for illegal debt collection that combine machine learning such as Support Vector Machine (SVM) with a rule-based technique that obtains the collection transcript of loan companies and converts them into text data to identify illegal activities. Moreover, the study also compares how accurate identification was made in accordance with the machine learning algorithm. The study shows that a case of using the combination of the rule-based illegal rules and machine learning for classification has higher accuracy than the classification model of the previous study that applied only machine learning. This study is the first attempt to classify illegalities by combining rule-based illegal detection rules with machine learning. If further research will be conducted to improve the model's completeness, it will greatly contribute in preventing consumer damage from illegal debt collection activities.

Financial Instruments Recommendation based on Classification Financial Consumer by Text Mining Techniques (비정형 데이터 분석을 통한 금융소비자 유형화 및 그에 따른 금융상품 추천 방법)

  • Lee, Jaewoong;Kim, Young-Sik;Kwon, Ohbyung
    • Journal of Information Technology Services
    • /
    • v.15 no.4
    • /
    • pp.1-24
    • /
    • 2016
  • With the innovation of information technology, non-face-to-face robo advisor with high accessibility and convenience is spreading. The current robot advisor recommends appropriate investment products after understanding the investment propensity based on the structured data entered directly or indirectly by individuals. However, it is an inconvenient and obtrusive way for financial consumers to inquire or input their own subjective propensity to invest. Hence, this study proposes a way to deduce the propensity to invest in unstructured data that customers voluntarily exposed during consultation or online. Since prediction performance based on unstructured document differs according to the characteristics of text, in this study, classification algorithm optimized for the characteristic of text left by financial consumers is selected by performing prediction performance evaluation of various learning discrimination algorithms and proposed an intelligent method that automatically recommends investment products. User tests were given to MBA students. After showing the recommended investment and list of investment products, satisfaction was asked. Financial consumers' satisfaction was measured by dividing them into investment propensity and recommendation goods. The results suggest that the users high satisfaction with investment products recommended by the method proposed in this paper. The results showed that it can be applies to non-face-to-face robo advisor.

A Methodology for Bankruptcy Prediction in Imbalanced Datasets using eXplainable AI (데이터 불균형을 고려한 설명 가능한 인공지능 기반 기업부도예측 방법론 연구)

  • Heo, Sun-Woo;Baek, Dong Hyun
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.45 no.2
    • /
    • pp.65-76
    • /
    • 2022
  • Recently, not only traditional statistical techniques but also machine learning algorithms have been used to make more accurate bankruptcy predictions. But the insolvency rate of companies dealing with financial institutions is very low, resulting in a data imbalance problem. In particular, since data imbalance negatively affects the performance of artificial intelligence models, it is necessary to first perform the data imbalance process. In additional, as artificial intelligence algorithms are advanced for precise decision-making, regulatory pressure related to securing transparency of Artificial Intelligence models is gradually increasing, such as mandating the installation of explanation functions for Artificial Intelligence models. Therefore, this study aims to present guidelines for eXplainable Artificial Intelligence-based corporate bankruptcy prediction methodology applying SMOTE techniques and LIME algorithms to solve a data imbalance problem and model transparency problem in predicting corporate bankruptcy. The implications of this study are as follows. First, it was confirmed that SMOTE can effectively solve the data imbalance issue, a problem that can be easily overlooked in predicting corporate bankruptcy. Second, through the LIME algorithm, the basis for predicting bankruptcy of the machine learning model was visualized, and derive improvement priorities of financial variables that increase the possibility of bankruptcy of companies. Third, the scope of application of the algorithm in future research was expanded by confirming the possibility of using SMOTE and LIME through case application.

A Pre-processing Process Using TadGAN-based Time-series Anomaly Detection (TadGAN 기반 시계열 이상 탐지를 활용한 전처리 프로세스 연구)

  • Lee, Seung Hoon;Kim, Yong Soo
    • Journal of Korean Society for Quality Management
    • /
    • v.50 no.3
    • /
    • pp.459-471
    • /
    • 2022
  • Purpose: The purpose of this study was to increase prediction accuracy for an anomaly interval identified using an artificial intelligence-based time series anomaly detection technique by establishing a pre-processing process. Methods: Significant variables were extracted by applying feature selection techniques, and anomalies were derived using the TadGAN time series anomaly detection algorithm. After applying machine learning and deep learning methodologies using normal section data (excluding anomaly sections), the explanatory power of the anomaly sections was demonstrated through performance comparison. Results: The results of the machine learning methodology, the performance was the best when SHAP and TadGAN were applied, and the results in the deep learning, the performance was excellent when Chi-square Test and TadGAN were applied. Comparing each performance with the papers applied with a Conventional methodology using the same data, it can be seen that the performance of the MLR was significantly improved to 15%, Random Forest to 24%, XGBoost to 30%, Lasso Regression to 73%, LSTM to 17% and GRU to 19%. Conclusion: Based on the proposed process, when detecting unsupervised learning anomalies of data that are not actually labeled in various fields such as cyber security, financial sector, behavior pattern field, SNS. It is expected to prove the accuracy and explanation of the anomaly detection section and improve the performance of the model.

An Ensemble Approach to Detect Fake News Spreaders on Twitter

  • Sarwar, Muhammad Nabeel;UlAmin, Riaz;Jabeen, Sidra
    • International Journal of Computer Science & Network Security
    • /
    • v.22 no.5
    • /
    • pp.294-302
    • /
    • 2022
  • Detection of fake news is a complex and a challenging task. Generation of fake news is very hard to stop, only steps to control its circulation may help in minimizing its impacts. Humans tend to believe in misleading false information. Researcher started with social media sites to categorize in terms of real or fake news. False information misleads any individual or an organization that may cause of big failure and any financial loss. Automatic system for detection of false information circulating on social media is an emerging area of research. It is gaining attention of both industry and academia since US presidential elections 2016. Fake news has negative and severe effects on individuals and organizations elongating its hostile effects on the society. Prediction of fake news in timely manner is important. This research focuses on detection of fake news spreaders. In this context, overall, 6 models are developed during this research, trained and tested with dataset of PAN 2020. Four approaches N-gram based; user statistics-based models are trained with different values of hyper parameters. Extensive grid search with cross validation is applied in each machine learning model. In N-gram based models, out of numerous machine learning models this research focused on better results yielding algorithms, assessed by deep reading of state-of-the-art related work in the field. For better accuracy, author aimed at developing models using Random Forest, Logistic Regression, SVM, and XGBoost. All four machine learning algorithms were trained with cross validated grid search hyper parameters. Advantages of this research over previous work is user statistics-based model and then ensemble learning model. Which were designed in a way to help classifying Twitter users as fake news spreader or not with highest reliability. User statistical model used 17 features, on the basis of which it categorized a Twitter user as malicious. New dataset based on predictions of machine learning models was constructed. And then Three techniques of simple mean, logistic regression and random forest in combination with ensemble model is applied. Logistic regression combined in ensemble model gave best training and testing results, achieving an accuracy of 72%.

Developing the Automated Sentiment Learning Algorithm to Build the Korean Sentiment Lexicon for Finance (재무분야 감성사전 구축을 위한 자동화된 감성학습 알고리즘 개발)

  • Su-Ji Cho;Ki-Kwang Lee;Cheol-Won Yang
    • Journal of Korean Society of Industrial and Systems Engineering
    • /
    • v.46 no.1
    • /
    • pp.32-41
    • /
    • 2023
  • Recently, many studies are being conducted to extract emotion from text and verify its information power in the field of finance, along with the recent development of big data analysis technology. A number of prior studies use pre-defined sentiment dictionaries or machine learning methods to extract sentiment from the financial documents. However, both methods have the disadvantage of being labor-intensive and subjective because it requires a manual sentiment learning process. In this study, we developed a financial sentiment dictionary that automatically extracts sentiment from the body text of analyst reports by using modified Bayes rule and verified the performance of the model through a binary classification model which predicts actual stock price movements. As a result of the prediction, it was found that the proposed financial dictionary from this research has about 4% better predictive power for actual stock price movements than the representative Loughran and McDonald's (2011) financial dictionary. The sentiment extraction method proposed in this study enables efficient and objective judgment because it automatically learns the sentiment of words using both the change in target price and the cumulative abnormal returns. In addition, the dictionary can be easily updated by re-calculating conditional probabilities. The results of this study are expected to be readily expandable and applicable not only to analyst reports, but also to financial field texts such as performance reports, IR reports, press articles, and social media.

URL Phishing Detection System Utilizing Catboost Machine Learning Approach

  • Fang, Lim Chian;Ayop, Zakiah;Anawar, Syarulnaziah;Othman, Nur Fadzilah;Harum, Norharyati;Abdullah, Raihana Syahirah
    • International Journal of Computer Science & Network Security
    • /
    • v.21 no.9
    • /
    • pp.297-302
    • /
    • 2021
  • The development of various phishing websites enables hackers to access confidential personal or financial data, thus, decreasing the trust in e-business. This paper compared the detection techniques utilizing URL-based features. To analyze and compare the performance of supervised machine learning classifiers, the machine learning classifiers were trained by using more than 11,005 phishing and legitimate URLs. 30 features were extracted from the URLs to detect a phishing or legitimate URL. Logistic Regression, Random Forest, and CatBoost classifiers were then analyzed and their performances were evaluated. The results yielded that CatBoost was much better classifier than Random Forest and Logistic Regression with up to 96% of detection accuracy.

Android Botnet Detection Using Hybrid Analysis

  • Mamoona Arhsad;Ahmad Karim
    • KSII Transactions on Internet and Information Systems (TIIS)
    • /
    • v.18 no.3
    • /
    • pp.704-719
    • /
    • 2024
  • Botnet pandemics are becoming more prevalent with the growing use of mobile phone technologies. Mobile phone technologies provide a wide range of applications, including entertainment, commerce, education, and finance. In addition, botnet refers to the collection of compromised devices managed by a botmaster and engaging with each other via a command server to initiate an attack including phishing email, ad-click fraud, blockchain, and much more. As the number of botnet attacks rises, detecting harmful activities is becoming more challenging in handheld devices. Therefore, it is crucial to evaluate mobile botnet assaults to find the security vulnerabilities that occur through coordinated command servers causing major financial and ethical harm. For this purpose, we propose a hybrid analysis approach that integrates permissions and API and experiments on the machine-learning classifiers to detect mobile botnet applications. In this paper, the experiment employed benign, botnet, and malware applications for validation of the performance and accuracy of classifiers. The results conclude that a classifier model based on a simple decision tree obtained 99% accuracy with a low 0.003 false-positive rate than other machine learning classifiers for botnet applications detection. As an outcome of this paper, a hybrid approach enhances the accuracy of mobile botnet detection as compared to static and dynamic features when both are taken separately.