• Title/Summary/Keyword: financial machine learning

Search results: 145

Differences and Multi-dimensionality of the Perception of Career Success among Korean Employees: A Topic Modeling Approach (기업근로자 경력성공 인식의 다차원성과 차이: 토픽모델링의 적용)

  • Lee, Jaeeun;Chae, Chungil
    • The Journal of the Korea Contents Association / v.19 no.6 / pp.58-71 / 2019
  • The purpose of this study is to explore the multi-dimensionality of, and differences in, career success as revealed by employees' perceptions. To this end, LDA topic modeling was applied to extract latent topics of career success from 126 Korean employees' open-ended survey responses. The extracted latent topics were social recognition, continued service within an organization, expertise, financial rewards, and the pursuit of personal meaning. The occurrence probability of each topic differed by individual characteristics such as gender, education, and position. The findings show that career success is multi-dimensional and that topic occurrence probabilities differ across demographic characteristics. Additionally, this study demonstrates how a recently developed machine learning approach can reduce researcher bias by applying LDA topic modeling to qualitative open-ended survey data.
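A minimal sketch of the kind of pipeline the abstract describes — collapsed-Gibbs LDA applied to short free-text answers — is below. The six toy "responses", the topic count `K = 3`, and the hyperparameters are all invented for illustration; the study itself worked with 126 Korean open-ended responses and its own topic count.

```python
import random
from collections import defaultdict

random.seed(0)

# Toy stand-ins for open-ended survey answers (hypothetical tokens)
docs = [
    "salary bonus reward money".split(),
    "promotion recognition status reputation".split(),
    "skill expertise mastery learning".split(),
    "salary money reward bonus".split(),
    "recognition reputation promotion status".split(),
    "meaning growth learning expertise".split(),
]
K, ALPHA, BETA = 3, 0.1, 0.01
vocab = sorted({w for d in docs for w in d})
V = len(vocab)

# z[d][i] = topic currently assigned to the i-th word of document d
z = [[random.randrange(K) for _ in d] for d in docs]
ndk = [[0] * K for _ in docs]                 # document-topic counts
nkw = [defaultdict(int) for _ in range(K)]    # topic-word counts
nk = [0] * K
for d, doc in enumerate(docs):
    for i, w in enumerate(doc):
        t = z[d][i]
        ndk[d][t] += 1; nkw[t][w] += 1; nk[t] += 1

for _ in range(200):                          # collapsed Gibbs sweeps
    for d, doc in enumerate(docs):
        for i, w in enumerate(doc):
            t = z[d][i]
            ndk[d][t] -= 1; nkw[t][w] -= 1; nk[t] -= 1
            # P(topic | rest) is proportional to P(topic|doc) * P(word|topic)
            ws = [(ndk[d][k] + ALPHA) * (nkw[k][w] + BETA) / (nk[k] + V * BETA)
                  for k in range(K)]
            r = random.uniform(0, sum(ws))
            k = 0
            while r > ws[k] and k < K - 1:
                r -= ws[k]
                k += 1
            z[d][i] = k
            ndk[d][k] += 1; nkw[k][w] += 1; nk[k] += 1

# the most probable words per topic play the role of the latent dimensions
top_words = [sorted(vocab, key=lambda w: -nkw[k][w])[:3] for k in range(K)]
print(top_words)
```

Each sweep resamples every token's topic from the product of its document's topic proportions and the topic's word distribution; the stable top-words lists are what a researcher would then label as dimensions such as "financial rewards" or "expertise".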

A New Method to Detect Anomalous State of Network using Information of Clusters (클러스터 정보를 이용한 네트워크 이상상태 탐지방법)

  • Lee, Ho-Sub;Park, Eung-Ki;Seo, Jung-Taek
    • Journal of the Korea Institute of Information Security & Cryptology / v.22 no.3 / pp.545-552 / 2012
  • The rapid development of information technology is producing large changes in our lives, and as infrastructure and services combine with information technology, further sweeping changes can be expected. However, this development also brings various side effects, which not only cause financial loss but can escalate into a nationwide crisis. Detecting and reacting quickly to these side effects is therefore critical, and much research is being done; intrusion detection systems are one example. However, intrusion detection systems mostly focus on judging whether particular traffic or files are malicious, and they have difficulty detecting newly developed malicious code. This paper therefore proposes a method that determines whether the present network state is normal or abnormal by comparing it with past network situations.
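One way to read the abstract's idea — compare the present state against clusters of past states — is sketched below. The feature choice (packets/sec, distinct destination ports), the cluster count, and the 1.5x threshold are assumptions for illustration, not the paper's actual design.

```python
import math, random
random.seed(1)

def dist(a, b):
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))

def kmeans(points, k, iters=20):
    # plain Lloyd's algorithm; enough for a sketch
    centers = random.sample(points, k)
    for _ in range(iters):
        groups = [[] for _ in range(k)]
        for p in points:
            groups[min(range(k), key=lambda c: dist(p, centers[c]))].append(p)
        centers = [tuple(sum(xs) / len(g) for xs in zip(*g)) if g else centers[j]
                   for j, g in enumerate(groups)]
    return centers

# Hypothetical past (normal) network states: (packets/sec, distinct dest ports)
past = [(100 + random.gauss(0, 5), 10 + random.gauss(0, 1)) for _ in range(50)]
centers = kmeans(past, 2)

# Anomalous = farther from every cluster of past states than normal traffic ever was
threshold = 1.5 * max(min(dist(p, c) for c in centers) for p in past)

def is_anomalous(state):
    return min(dist(state, c) for c in centers) > threshold

print(is_anomalous((102, 10)))   # close to past clusters -> False
print(is_anomalous((900, 60)))   # far from every cluster -> True
```

Because the decision is "does the current state resemble any past state" rather than "does this match a known signature", such a scheme can in principle flag traffic produced by previously unseen malicious code.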

Predicting Default Risk among Young Adults with Random Forest Algorithm (랜덤포레스트 모델을 활용한 청년층 차입자의 채무 불이행 위험 연구)

  • Lee, Jonghee
    • Journal of Family Resource Management and Policy Review / v.26 no.3 / pp.19-34 / 2022
  • There are growing concerns about debt insolvency among young and low-income households. The deterioration of household debt quality among young people stems from a combination of sluggish employment, a growing student loan burden, and an increase in high-interest loans from the secondary financial sector. The purpose of this study was to explore the possibility of household debt default among young borrowers in Korea and to identify the factors affecting it. The study utilized the 2021 Household Finance and Welfare Survey and used a random forest algorithm to comprehensively analyze factors related to default risk among young adults, presenting variable importance indices and partial dependence charts for the major determinants. It found that the debt-to-assets ratio (DTA), medical costs, the household default risk index (HDRI), communication costs, and housing costs were the focal independent variables.
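The abstract's random-forest-plus-importance-plus-PDP workflow can be sketched with a toy forest of bootstrapped stumps on fabricated borrower records. The data-generating rule (default driven mainly by the debt-to-assets ratio) and the three features are assumptions made so the importance ranking is checkable; the study used survey data and a full random forest.

```python
import random
random.seed(2)

# Hypothetical borrower records: (debt_to_assets, medical_cost, comm_cost) -> default flag.
# Default here is driven mainly by debt-to-assets, so the forest should rank it first.
def make_row():
    x = (random.random(), random.random(), random.random())
    return x, 1 if x[0] > 0.6 else 0

data = [make_row() for _ in range(200)]

def stump(rows, feats):
    # best (feature, threshold) on a random feature subset, by misclassification count
    best = None
    for f in feats:
        for x, _ in rows:
            t = x[f]
            err = sum((1 if xx[f] > t else 0) != y for xx, y in rows)
            if best is None or err < best[0]:
                best = (err, f, t)
    return best[1], best[2]

def forest(rows, n_trees=25):
    trees = []
    for _ in range(n_trees):
        boot = [random.choice(rows) for _ in rows]   # bootstrap sample (bagging)
        feats = random.sample(range(3), 2)           # random feature subset per tree
        trees.append(stump(boot, feats))
    return trees

def predict(trees, x):
    votes = sum(1 if x[f] > t else 0 for f, t in trees)
    return 1 if 2 * votes > len(trees) else 0

def partial_dependence(trees, rows, feat, value):
    # average prediction with `feat` pinned to `value` (one PDP grid point)
    return sum(predict(trees, x[:feat] + (value,) + x[feat + 1:])
               for x, _ in rows) / len(rows)

trees = forest(data)
# split-frequency importance: share of trees that chose each feature
importance = [sum(1 for f, _ in trees if f == i) / len(trees) for i in range(3)]
print(importance)  # index 0 (debt-to-assets) should dominate
```

The partial dependence values at a grid of DTA levels, plotted, would be the kind of chart the study reports for its major determinants.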

Cryptocurrency Recommendation Model using the Similarity and Association Rule Mining (유사도와 연관규칙분석을 이용한 암호화폐 추천모형)

  • Kim, Yechan;Kim, Jinyoung;Kim, Chaerin;Kim, Kyoung-jae
    • Journal of Intelligence and Information Systems / v.28 no.4 / pp.287-308 / 2022
  • The explosive growth of cryptocurrency, led by Bitcoin, has recently emerged as a major issue in the financial market. As a result, interest in cryptocurrency investment is increasing, but a market that is open 24 hours a day, 365 days a year, high price volatility, and an exponentially growing number of cryptocurrencies all pose risks to investors. This raises the need for research that reduces investors' risk by filtering out cryptocurrencies unsuitable for recommendation. Unlike previous studies that simply predicted future cryptocurrency prices or constructed portfolios focused on maximizing returns, this paper reflects the tendencies of investors and presents an interpretable recommendation method that can reduce investors' risk by selecting suitable Altcoins, recommended with the Apriori algorithm, a machine learning technique, on the basis of similarity and association rules with Bitcoin.
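The association-rule half of the abstract can be illustrated with a bare-bones Apriori run over invented investor portfolios; the portfolios, support/confidence thresholds, and coin names are assumptions, not the paper's data.

```python
from itertools import combinations

# Hypothetical investor portfolios (the coins each investor holds)
portfolios = [
    {"BTC", "ETH", "ADA"},
    {"BTC", "ETH"},
    {"BTC", "ADA"},
    {"ETH", "SOL"},
    {"BTC", "ETH", "SOL"},
    {"BTC", "ADA", "ETH"},
]

def support(itemset):
    return sum(itemset <= p for p in portfolios) / len(portfolios)

def apriori(min_sup=0.3):
    # level-wise search: only supersets of frequent itemsets can be frequent
    items = sorted({c for p in portfolios for c in p})
    current = [frozenset([i]) for i in items if support({i}) >= min_sup]
    frequent, size = [], 1
    while current:
        frequent += current
        size += 1
        cand = {a | b for a in current for b in current if len(a | b) == size}
        current = [c for c in cand if support(c) >= min_sup]
    return frequent

rules = []
for itemset in apriori():
    if len(itemset) < 2:
        continue
    for k in range(1, len(itemset)):
        for lhs in map(frozenset, combinations(itemset, k)):
            conf = support(itemset) / support(lhs)   # confidence of lhs -> rest
            if conf >= 0.7:
                rules.append((set(lhs), set(itemset - lhs), round(conf, 2)))

for lhs, rhs, conf in rules:
    print(sorted(lhs), "->", sorted(rhs), "conf", conf)
```

A rule such as {ADA} -> {BTC} with high confidence is the kind of interpretable output the abstract refers to: the recommendation can be explained by which holdings imply which others.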

A Study on the Forecasting Trend of Apartment Prices: Focusing on Government Policy, Economy, Supply and Demand Characteristics (아파트 매매가 추이 예측에 관한 연구: 정부 정책, 경제, 수요·공급 속성을 중심으로)

  • Lee, Jung-Mok;Choi, Su An;Yu, Su-Han;Kim, Seonghun;Kim, Tae-Jun;Yu, Jong-Pil
    • The Journal of Bigdata / v.6 no.1 / pp.91-113 / 2021
  • Despite the influence of real estate in the Korean asset market, it is not easy to predict market trends; apartments in particular are hard to predict because they are both residential spaces and investment assets. The factors affecting apartment prices vary, and regional characteristics must also be considered. This study compares the factors and characteristics that affect apartment prices in Seoul as a whole, in the three Gangnam districts, and in the Nowon, Dobong, Gangbuk, Geumcheon, Gwanak, and Guro districts, and assesses the possibility of price prediction on that basis. The analysis used machine learning algorithms such as neural networks, CHAID, linear regression, and random forests. The most important factor affecting the average selling price of all apartments in Seoul was government policy, with easing measures such as relaxed transaction and financial regulations being highly influential. In the three Gangnam districts the policy influence was low, and in Gangnam-gu housing supply was the most important factor. In contrast, in the six mid- and lower-level districts government policies acted as important variables, with financial regulatory policy a common influence.
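Of the algorithms the abstract lists, linear regression is the simplest to sketch. Below, hypothetical monthly records are generated from a known rule (price = 2 + 5*policy + 0.1*supply - 1*rate) so the recovered coefficients are checkable; the feature names and units stand in for the study's policy, supply, and economic variables and are not its actual data.

```python
# (policy easing index, new supply, interest rate, avg price in 100M KRW),
# generated exactly from: price = 2 + 5*policy + 0.1*supply - 1*rate
data = [
    (0.2, 30, 3.0, 3.0), (0.5, 28, 2.5, 4.8), (0.8, 25, 2.0, 6.5),
    (0.3, 35, 2.8, 4.2), (0.9, 27, 1.8, 7.4), (0.6, 32, 2.2, 6.0),
    (0.1, 40, 3.2, 3.3), (0.7, 26, 2.1, 6.0),
]

def solve(A, b):
    # Gauss-Jordan elimination with partial pivoting
    n = len(A)
    M = [row[:] + [b[i]] for i, row in enumerate(A)]
    for c in range(n):
        p = max(range(c, n), key=lambda r: abs(M[r][c]))
        M[c], M[p] = M[p], M[c]
        for r in range(n):
            if r != c:
                f = M[r][c] / M[c][c]
                M[r] = [x - f * v for x, v in zip(M[r], M[c])]
    return [M[i][n] / M[i][i] for i in range(n)]

# ordinary least squares via the normal equations X'X beta = X'y
X = [[1.0, p, s, r] for p, s, r, _ in data]
y = [row[3] for row in data]
XtX = [[sum(a[i] * a[j] for a in X) for j in range(4)] for i in range(4)]
Xty = [sum(a[i] * yy for a, yy in zip(X, y)) for i in range(4)]
beta = solve(XtX, Xty)
print([round(b, 2) for b in beta])  # -> [2.0, 5.0, 0.1, -1.0]
```

Comparing the magnitude and sign of such coefficients across regions is one simple way to see, as the study does, where policy variables dominate and where supply does.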

Systemic literature review on the impact of government financial support on innovation in private firms (정부의 기술혁신 재정지원 정책효과에 대한 체계적 문헌연구)

  • Ahn, Joon Mo
    • Journal of Technology Innovation / v.30 no.1 / pp.57-104 / 2022
  • The government has supported the innovation of private firms by intervening in the market for various purposes, such as preventing market failure, alleviating information asymmetry, and allocating resources efficiently. Although the government's R&D budget increased rapidly in the 2000s, it is not clear whether this intervention has had the desired impact on the market. To address this, the current study explores the issue through a systematic literature review of foreign and domestic papers in an integrated way. In total, 168 studies are analyzed using a content analysis approach, adopting various lenses such as policy additionality, policy tools, firm size, unit of analysis, and data and method. Overlapping policy targets, the time lag between government intervention and policy effects, the non-linearity of financial support, interference between different policies, and an outdated R&D tax incentive system are reported as factors hampering the effect of government intervention. Many policy prescriptions, such as program evaluation indices reflecting behavioral additionality and the introduction of policy mixes and evidence-based policy using machine learning, are suggested to address these hurdles.

Text Filtering using Iterative Boosting Algorithms (반복적 부스팅 학습을 이용한 문서 여과)

  • Hahn, Sang-Youn;Zang, Byoung-Tak
    • Journal of KIISE:Software and Applications / v.29 no.4 / pp.270-277 / 2002
  • Text filtering is the task of deciding whether a document is relevant to a specified topic. As the Internet and the Web become widespread and the number of documents delivered by e-mail grows explosively, the importance of text filtering increases as well. The aim of this paper is to improve the accuracy of text filtering systems by using machine learning techniques. We apply AdaBoost algorithms to the filtering task. An AdaBoost algorithm generates and combines a series of simple hypotheses, each of which decides the relevance of a document to a topic on the basis of whether or not the document includes a certain word. We begin with an existing AdaBoost algorithm whose weak hypotheses output 1 or -1, then extend it to use weak hypotheses with real-valued outputs, which were proposed recently to improve error reduction rates and final filtering performance. Next, we attempt further improvement by setting the initial weights randomly according to a continuous Poisson distribution, executing AdaBoost, repeating these steps several times, and then combining all the hypotheses learned. This mitigates the overfitting that may occur when learning from a small amount of data. Experiments were performed on the real document collections used in TREC-8, a well-established text retrieval contest; the dataset includes Financial Times articles from 1992 to 1994. The experimental results show that AdaBoost with real-valued hypotheses outperforms AdaBoost with binary-valued hypotheses, and that AdaBoost iterated with random weights further improves filtering accuracy. Comparison results for all participants in the TREC-8 filtering task are also provided.
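The baseline the paper starts from — discrete AdaBoost over word-presence stumps — can be sketched as follows. The six labeled "documents" are invented; the real-valued-output and Poisson-restart extensions the paper studies are not shown.

```python
import math

# Hypothetical labeled documents: +1 relevant to a finance topic, -1 not
docs = [
    ("stock market rates", 1),
    ("bank profits shares", 1),
    ("rates bond yield market", 1),
    ("football match score market", -1),
    ("concert music stage", -1),
    ("recipe cooking dinner", -1),
]
data = [(set(t.split()), y) for t, y in docs]
vocab = sorted({w for s, _ in data for w in s})

def weighted_error(word, w):
    # weak hypothesis: "document is relevant iff it contains `word`"
    return sum(wi for wi, (s, y) in zip(w, data) if (1 if word in s else -1) != y)

def train_adaboost(rounds=5):
    n = len(data)
    w = [1.0 / n] * n
    ensemble = []
    for _ in range(rounds):
        best = min(vocab, key=lambda word: weighted_error(word, w))
        err = max(weighted_error(best, w), 1e-10)
        alpha = 0.5 * math.log((1 - err) / err)
        ensemble.append((best, alpha))
        # re-weight: misclassified documents gain weight for the next round
        w = [wi * math.exp(-alpha * y * (1 if best in s else -1))
             for wi, (s, y) in zip(w, data)]
        total = sum(w)
        w = [wi / total for wi in w]
    return ensemble

def classify(ensemble, text):
    s = set(text.split())
    score = sum(alpha * (1 if word in s else -1) for word, alpha in ensemble)
    return 1 if score > 0 else -1

model = train_adaboost()
print(classify(model, "bank rates news"))       # -> 1
print(classify(model, "music concert tonight")) # -> -1
```

The paper's first extension replaces the hard 1/-1 stump outputs with real-valued confidences, and its second draws the initial weights `w` from a continuous Poisson distribution across several restarts, averaging all learned hypotheses.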

Predicting stock movements based on financial news with systematic group identification (시스템적인 군집 확인과 뉴스를 이용한 주가 예측)

  • Seong, NohYoon;Nam, Kihwan
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.1-17 / 2019
  • Because stock price forecasting is important both academically and practically, research on it has been conducted actively. Such research divides into work using structured data and work using unstructured data. With structured data such as historical stock prices and financial statements, past studies usually followed technical and fundamental analysis. In the big data era, the amount of information has increased rapidly, and artificial intelligence methodologies that can quantify textual information, the unstructured data that accounts for a large share of that information, have developed quickly. With these developments, many attempts are being made to predict stock prices from online news by applying text mining. The methodology adopted in many papers is to forecast a stock's price using news about the target company. However, according to previous research, not only a target company's own news but also news about related companies can affect its stock price. Finding highly relevant companies is not easy because of market-wide effects and random signals, so existing studies have identified them primarily through pre-determined international industry classification standards. Yet recent research shows that homogeneity differs across sectors of the global industry classification standard, so lumping companies together without isolating the truly relevant ones can hurt predictive performance. To overcome this limitation, we combine random matrix theory with text mining for stock prediction. When the dimension of the data is large, classical limit theorems are no longer suitable because statistical efficiency is reduced; a simple correlation analysis in the financial market therefore does not reflect the true correlation. To solve this, we adopt random matrix theory, which is used mainly in econophysics, to remove market-wide effects and random signals and recover the true correlation between companies. With the true correlation, we perform cluster analysis to find relevant companies. Based on the clustering, we use a multiple kernel learning algorithm, an ensemble of support vector machines, to incorporate the effects of the target firm and its relevant firms simultaneously: each kernel predicts stock prices from features of the financial news of the target firm or one of its relevant firms. The results are as follows. (1) Following the existing research flow, we confirm that using news from relevant companies is an effective way to forecast stock prices. (2) Choosing relevant companies in the wrong way can lower prediction performance. (3) The proposed approach with random matrix theory outperforms previous studies when cluster analysis is performed on the true correlation, with market-wide effects and random signals removed. The contributions of this study are as follows. First, it shows that random matrix theory, used mainly in econophysics, can be combined with artificial intelligence to produce a sound methodology, suggesting that adopting physics theory matters as much as developing AI algorithms; this extends earlier work that integrated artificial intelligence with complex-system theory through transfer entropy. Second, it stresses that finding the right companies in the stock market is an important issue: not only the artificial intelligence algorithm matters, but also how the input values are theoretically chosen. Third, we confirm that firms grouped under the Global Industry Classification Standard (GICS) may have low relevance, and suggest that relevance should be defined theoretically rather than simply read off the GICS.
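The market-mode-removal step can be illustrated in miniature. Four synthetic return series share one market factor plus a sector factor per pair; after deflating the correlation matrix's top eigenvector (the market-wide mode), only the within-sector "true" correlation survives. The factor loadings, thresholds, and tickers are all invented, and full random matrix theory would also filter eigenvalues inside the Marchenko-Pastur bulk, which this sketch omits.

```python
import math, random
random.seed(3)

# Hypothetical daily returns: one market factor shared by all four stocks,
# plus a sector factor shared within pairs (A1, A2) and (B1, B2).
T = 250
market = [random.gauss(0, 1) for _ in range(T)]
sector = [[random.gauss(0, 1) for _ in range(T)] for _ in range(2)]

def series(sec):
    return [0.7 * market[t] + 0.7 * sector[sec][t] + 0.15 * random.gauss(0, 1)
            for t in range(T)]

names = ["A1", "A2", "B1", "B2"]
R = [series(0), series(0), series(1), series(1)]
n = len(R)

def corr(x, z):
    mx, mz = sum(x) / T, sum(z) / T
    sx = math.sqrt(sum((a - mx) ** 2 for a in x))
    sz = math.sqrt(sum((b - mz) ** 2 for b in z))
    return sum((a - mx) * (b - mz) for a, b in zip(x, z)) / (sx * sz)

C = [[corr(R[i], R[j]) for j in range(n)] for i in range(n)]

# power iteration: the top eigenvector of C is the market-wide mode
v = [1.0] * n
for _ in range(100):
    v = [sum(C[i][j] * v[j] for j in range(n)) for i in range(n)]
    norm = math.sqrt(sum(x * x for x in v))
    v = [x / norm for x in v]
lam = sum(v[i] * C[i][j] * v[j] for i in range(n) for j in range(n))

# deflate the market mode; what survives is sector-level ("true") correlation
Cres = [[C[i][j] - lam * v[i] * v[j] for j in range(n)] for i in range(n)]
pairs = [(names[i], names[j]) for i in range(n) for j in range(i + 1, n)
         if Cres[i][j] > 0.1]
print(pairs)
```

The surviving pairs are exactly the within-sector ones, which is the clustering input the paper then feeds to multiple kernel learning.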

A Study on the Development Trend of Artificial Intelligence Using Text Mining Technique: Focused on Open Source Software Projects on Github (텍스트 마이닝 기법을 활용한 인공지능 기술개발 동향 분석 연구: 깃허브 상의 오픈 소스 소프트웨어 프로젝트를 대상으로)

  • Chong, JiSeon;Kim, Dongsung;Lee, Hong Joo;Kim, Jong Woo
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.1-19 / 2019
  • Artificial intelligence (AI) is one of the main driving forces of the Fourth Industrial Revolution. AI technologies have already shown abilities equal or superior to people in many fields, including image and speech recognition. Because AI can be utilized across medical, financial, manufacturing, service, and education fields, many efforts have been made to identify current technology trends and analyze development directions. Major platforms for developing complex AI algorithms for learning, reasoning, and recognition have been opened to the public as open source projects, and technologies and services utilizing them have increased rapidly; this is seen as one of the major reasons for the fast development of AI. The spread of the technology is also greatly indebted to open source software developed by major global companies supporting natural language recognition, speech recognition, and image recognition. This study therefore aimed to identify the practical trend of AI technology development by analyzing open source software (OSS) projects associated with AI, which have been developed by the online collaboration of many parties. We collected a list of major AI-related projects created on GitHub from 2009 to July 2018 and examined the development trends of major technologies in detail by applying text mining to the topic information that characterizes the collected projects and their technical fields. The number of software development projects per year was below 100 until 2013, then rose to 229 in 2014 and 597 in 2015, and the number of AI-related open source projects increased sharply in 2016 (2,559 projects). The number of projects initiated in 2017 was 14,213, almost fourfold the total generated from 2009 to 2016 (3,555 projects), and 8,737 projects were initiated from January to July 2018. The development trend of AI-related technologies was evaluated by dividing the study period into three phases, with the appearance frequency of topics indicating the technology trends of AI-related OSS projects. Natural language processing remained at the top in all years, implying continuous OSS development. Until 2015 the programming languages Python, C++, and Java were among the ten most frequent topics, but after 2016 every programming language other than Python dropped out of the top ten; instead, platforms supporting the development of AI algorithms, such as TensorFlow and Keras, show high appearance frequency, along with reinforcement learning algorithms and convolutional neural networks, which are used in various fields. Topic network analysis showed that the most important topics by degree centrality were similar to those with high appearance frequency. The main difference was that visualization and medical imaging topics rose to the top of the centrality list, although they were not there from 2009 to 2012, indicating that OSS was being developed to apply AI in the medical field. Moreover, computer vision, although among the ten most frequent topics from 2013 to 2015, was not in the top ten by degree centrality, and the ranks of convolutional neural networks and reinforcement learning shifted slightly. Examining the trend of technology development through both appearance frequency and degree centrality, machine learning showed the highest frequency and the highest centrality in all years. Notably, although the deep learning topic had low frequency and low centrality between 2009 and 2012, its rank rose abruptly between 2013 and 2015, and in recent years both technologies have had high appearance frequency and degree centrality. TensorFlow first appeared in the 2013-2015 phase, and its frequency and centrality soared between 2016 and 2018, placing it at the top of the lists just after deep learning and Python. Computer vision and reinforcement learning showed no abrupt increase or decrease and had relatively low frequency and centrality compared with the above-mentioned topics. Based on these results, it is possible to identify the fields in which AI technologies are actively developed, and the results can serve as a baseline dataset for more empirical analysis of converging future technology trends.
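The two measures the study tracks — topic appearance frequency and degree centrality in the topic co-occurrence network — can be computed on a handful of invented project topic lists as follows; the projects and topics are made up for illustration.

```python
from collections import Counter
from itertools import combinations

# Hypothetical GitHub project topic lists
projects = [
    ["machine-learning", "deep-learning", "tensorflow"],
    ["machine-learning", "python", "nlp"],
    ["deep-learning", "tensorflow", "keras"],
    ["machine-learning", "nlp", "deep-learning"],
    ["computer-vision", "deep-learning", "python"],
]

# appearance frequency of each topic
freq = Counter(t for p in projects for t in p)

# co-occurrence network: topics are nodes, an edge links topics used together
neigh = {t: set() for t in freq}
for p in projects:
    for a, b in combinations(sorted(set(p)), 2):
        neigh[a].add(b)
        neigh[b].add(a)

# degree centrality: distinct co-occurring topics / (number of nodes - 1)
centrality = {t: len(neigh[t]) / (len(freq) - 1) for t in freq}

print(freq.most_common(2))
print(max(centrality, key=centrality.get))
```

Recomputing both rankings per phase (2009-2012, 2013-2015, 2016-2018) and comparing them is how the study detects shifts such as TensorFlow's rise or the fading of non-Python languages.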

Who Gets Government SME R&D Subsidy? Application of Gradient Boosting Model (Gradient Boosting 모형을 이용한 중소기업 R&D 지원금 결정요인 분석)

  • Kang, Sung Won;Kang, HeeChan
    • The Journal of Society for e-Business Studies / v.25 no.4 / pp.77-109 / 2020
  • In this paper, we build a Gradient Boosting model to predict government SME R&D subsidies, select features of high importance, and measure the impact of each feature on the predicted subsidy using partial dependence plots (PDP) and SHAP values. Unlike previous empirical research, we focus on the effect of the subsidy distribution pattern on the incentives of firms competing for subsidies. We used firm data constructed by KISTEP, which links government R&D subsidy records with financial statements provided by NICE, and applied a Gradient Boosting model to predict R&D subsidies. We found that firms with higher R&D performance and larger R&D investment tend to receive higher R&D subsidies, while firms with higher operating profit or total asset turnover tend to receive lower ones. Our results suggest that the current government R&D subsidy distribution pattern provides an incentive to improve R&D project performance, but not business performance.
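A miniature version of the abstract's pipeline — gradient boosting for a subsidy regression, then a partial dependence query — is sketched below. The eight firm records, the two features, and the learning rate are fabricated so that the PDP slope is checkable; the study used KISTEP/NICE data, a full model, and SHAP values in addition to PDPs.

```python
# Hypothetical firm records: (R&D intensity, operating margin) -> subsidy (KRW 100M).
# Subsidy rises with R&D intensity and falls with operating margin, mirroring the findings.
data = [
    ((0.9, 0.02), 9.5), ((0.8, 0.05), 8.8), ((0.7, 0.20), 6.0),
    ((0.3, 0.25), 2.5), ((0.2, 0.08), 3.8), ((0.6, 0.10), 6.5),
    ((0.1, 0.30), 1.5), ((0.5, 0.15), 5.0),
]
X = [x for x, _ in data]
y = [t for _, t in data]
LR = 0.1  # learning rate (shrinkage)

def fit_stump(resid):
    # depth-1 regression tree minimizing squared error on the residuals
    best = None
    for f in range(2):
        for xs in X:
            t = xs[f]
            left = [r for x, r in zip(X, resid) if x[f] <= t]
            right = [r for x, r in zip(X, resid) if x[f] > t]
            if not right:
                continue
            lm, rm = sum(left) / len(left), sum(right) / len(right)
            sse = (sum((r - lm) ** 2 for r in left)
                   + sum((r - rm) ** 2 for r in right))
            if best is None or sse < best[0]:
                best = (sse, f, t, lm, rm)
    return best[1:]

def boost(n_rounds=50):
    base = sum(y) / len(y)
    pred = [base] * len(y)
    stumps = []
    for _ in range(n_rounds):
        resid = [yi - pi for yi, pi in zip(y, pred)]   # gradient of squared loss
        f, t, lm, rm = fit_stump(resid)
        stumps.append((f, t, lm, rm))
        pred = [p + LR * (lm if x[f] <= t else rm) for p, x in zip(pred, X)]
    return base, stumps

def predict(model, x):
    base, stumps = model
    return base + sum(LR * (lm if x[f] <= t else rm) for f, t, lm, rm in stumps)

def pdp(model, feat, value):
    # partial dependence: average prediction with one feature pinned
    return sum(predict(model, x[:feat] + (value,) + x[feat + 1:]) for x in X) / len(X)

model = boost()
print(round(pdp(model, 0, 0.8) - pdp(model, 0, 0.2), 2))  # expected to be positive
```

A positive PDP slope in R&D intensity and a negative one in operating margin would be the toy analogue of the paper's finding that the subsidy pattern rewards R&D performance but not business performance.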