• Title/Summary/Keyword: artificial intelligence techniques

Corporate Credit Rating based on Bankruptcy Probability Using AdaBoost Algorithm-based Support Vector Machine (AdaBoost 알고리즘기반 SVM을 이용한 부실 확률분포 기반의 기업신용평가)

  • Shin, Taek-Soo;Hong, Tae-Ho
    • Journal of Intelligence and Information Systems / v.17 no.3 / pp.25-41 / 2011
  • Support vector machines (SVMs) are increasingly recognized as competitive with other data mining techniques for pattern recognition and classification problems. Many studies, in particular, have shown them to be more powerful than traditional artificial neural networks (ANNs) (Amendolia et al., 2003; Huang et al., 2004; Huang et al., 2005; Tay and Cao, 2001; Min and Lee, 2005; Shin et al., 2005; Kim, 2003). Classification decisions, whether binary or multi-class, are highly cost-sensitive in financial problems such as credit rating: a misclassified credit rating can cause severe economic losses for investors and financial decision makers. It is therefore necessary to convert classifier outputs into well-calibrated posterior probabilities and to derive multi-class credit ratings from the resulting bankruptcy probabilities. SVMs, however, do not natively provide such probabilities, so an additional method is required to produce them (Platt, 1999; Drish, 2001). This paper applies AdaBoost algorithm-based SVMs to bankruptcy prediction, framed as a binary classification problem, for IT companies in Korea. It then assigns multi-class credit ratings to the companies by shaping the posterior bankruptcy probabilities, obtained from the loss functions extracted from the SVMs, into a normal distribution. The proposed approach also shows that misclassification can be minimized by adjusting the interval ranges of the credit grades, on the condition that each credit grade for loan borrowers has its own credit risk, i.e. bankruptcy probability.
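The score-to-probability conversion that the abstract attributes to Platt (1999) can be sketched as follows. This is a minimal illustration under stated assumptions, not the authors' implementation: it fits a sigmoid to raw classifier scores by plain gradient descent (Platt's original procedure uses a different optimizer and smoothed targets), it assumes a positive score means "bankrupt", and the function names, sample scores, grade labels, and cutoffs are all hypothetical.

```python
import math
from bisect import bisect_right


def platt_probability(score, a, b):
    """Sigmoid mapping from a raw classifier score to P(bankrupt)."""
    return 1.0 / (1.0 + math.exp(a * score + b))


def fit_platt(scores, labels, lr=0.1, epochs=5000):
    """Fit a and b by gradient descent on the mean log loss.

    labels use 1 for bankrupt and 0 for healthy (an assumption of this sketch).
    """
    a, b = 0.0, 0.0
    n = len(scores)
    for _ in range(epochs):
        grad_a = grad_b = 0.0
        for s, y in zip(scores, labels):
            p = platt_probability(s, a, b)
            # With p = 1 / (1 + exp(a*s + b)), d(loss)/d(a*s + b) = y - p.
            grad_a += (y - p) * s
            grad_b += (y - p)
        a -= lr * grad_a / n
        b -= lr * grad_b / n
    return a, b


def to_grade(prob, cutoffs=(0.1, 0.3, 0.6), grades=("A", "B", "C", "D")):
    """Map a bankruptcy probability to a grade via hypothetical interval cutoffs."""
    return grades[bisect_right(cutoffs, prob)]


if __name__ == "__main__":
    # Hypothetical decision values and outcomes; deliberately not separable.
    scores = [-2.0, -1.5, -1.0, -0.5, 0.5, -0.2, 0.3, 1.0, 1.5, 2.0]
    labels = [0, 0, 0, 0, 0, 1, 1, 1, 1, 1]
    a, b = fit_platt(scores, labels)
    for s in (-2.0, 0.0, 2.0):
        p = platt_probability(s, a, b)
        print(s, round(p, 3), to_grade(p))
```

Moving the values in `cutoffs` corresponds to the grade interval adjustment the abstract describes; the specific numbers here carry no empirical meaning.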

Development of a Stock Trading System Using M & W Wave Patterns and Genetic Algorithms (M&W 파동 패턴과 유전자 알고리즘을 이용한 주식 매매 시스템 개발)

  • Yang, Hoonseok;Kim, Sunwoong;Choi, Heung Sik
    • Journal of Intelligence and Information Systems / v.25 no.1 / pp.63-83 / 2019
  • Investors prefer to look for trading points based on the shapes shown in price charts rather than on complex analysis, such as corporate intrinsic value analysis or technical auxiliary index analysis. However, pattern analysis is difficult and has been computerized less than users need. In recent years, many studies have examined stock price patterns using machine learning techniques, including neural networks, in the field of artificial intelligence (AI). In particular, advances in IT have made it easier to analyze huge amounts of chart data to find patterns that can predict stock prices. Although short-term price forecasting performance has improved, long-term forecasting power remains limited, so these techniques are used in short-term trading rather than long-term investment. Other studies have focused on mechanically and accurately identifying patterns that earlier technology could not recognize, but this can be fragile in practice because whether the patterns found are suitable for trading is a separate matter. When such studies find a meaningful pattern, they locate a point that matches it and then measure performance after n days, assuming a purchase was made at that point. Because this approach calculates hypothetical returns, the results can differ considerably from reality. Whereas existing research tries to find patterns with price-predictive power, this study proposes defining the patterns first and trading when a pattern with a high success probability appears. The M & W wave patterns published by Merrill (1980) are simple because they can be distinguished by five turning points. Although some of the patterns have been reported to have price predictability, no performance results from the actual market had been reported. The simplicity of a pattern consisting of five turning points has the advantage of reducing the cost of increasing pattern recognition accuracy. In this study, the 16 up-conversion patterns and 16 down-conversion patterns are reclassified into ten groups so that the system can implement them easily. Only one pattern with a high success rate per group is selected for trading. Patterns that had a high probability of success in the past are likely to succeed in the future, so we trade when such a pattern occurs. The measurement reflects a realistic situation because it assumes that both the buy and the sell have been executed. We tested three ways of calculating turning points. The first, the minimum change rate zig-zag method, removes price movements below a certain percentage and then calculates the vertices. The second, the high-low line zig-zag method, takes a high price that meets the n-day high price line as a peak and a low price that meets the n-day low price line as a valley. The third, the swing wave method, takes a central high price that is higher than the n high prices on its left and right as a peak, and a central low price that is lower than the n low prices on its left and right as a valley. The swing wave method was superior to the other methods in the test results. We interpret this to mean that trading after confirming the completion of a pattern is more effective than trading while the pattern is still unfinished. Because the number of cases in this simulation was too large, it was virtually impossible to find the patterns with high success rates directly, and genetic algorithms (GA) were the most suitable solution. We also performed the simulation using walk-forward analysis (WFA), which handles the test section and the application section separately, so that we could respond appropriately to market changes. We optimize at the level of the stock portfolio because optimizing the variables for each individual stock carries a risk of over-optimization. Therefore, we set the number of constituent stocks to 20 to increase the diversification effect while avoiding over-optimization. We tested the KOSPI market by dividing it into six categories. In the results, the small-cap portfolio was the most successful and the high-volatility portfolio was the second best. This shows that some price volatility is needed for patterns to take shape, but that higher volatility is not necessarily better.
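The swing wave rule described in the abstract can be sketched as below. This is an illustrative reading of the one-sentence description, not the authors' code: the function name, the strict inequalities, the choice to test for a peak before a valley on the same bar, and the sample prices are all assumptions.

```python
def swing_turning_points(highs, lows, n):
    """Return (index, kind, price) tuples found by the swing wave rule.

    A bar is a peak if its high exceeds the n highs on each side, and a
    valley if its low is below the n lows on each side. The first and last
    n bars cannot be classified because they lack a full window.
    """
    points = []
    for i in range(n, len(highs) - n):
        left_highs, right_highs = highs[i - n:i], highs[i + 1:i + n + 1]
        left_lows, right_lows = lows[i - n:i], lows[i + 1:i + n + 1]
        if highs[i] > max(left_highs) and highs[i] > max(right_highs):
            points.append((i, "peak", highs[i]))
        elif lows[i] < min(left_lows) and lows[i] < min(right_lows):
            points.append((i, "valley", lows[i]))
    return points


if __name__ == "__main__":
    # Hypothetical daily high and low prices.
    highs = [10, 11, 13, 12, 11, 10, 11, 12, 14, 13, 12]
    lows = [9, 10, 12, 11, 10, 8, 10, 11, 13, 12, 11]
    for point in swing_turning_points(highs, lows, n=2):
        print(point)
```

Because a peak is only confirmed once n later bars are known, a signal from this rule always arrives after the turn, which matches the abstract's remark about trading after the pattern is complete.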

A Study on Market Size Estimation Method by Product Group Using Word2Vec Algorithm (Word2Vec을 활용한 제품군별 시장규모 추정 방법에 관한 연구)

  • Jung, Ye Lim;Kim, Ji Hui;Yoo, Hyoung Sun
    • Journal of Intelligence and Information Systems / v.26 no.1 / pp.1-21 / 2020
  • With the rapid development of artificial intelligence technology, various techniques have been developed to extract meaningful information from unstructured text data, which constitutes a large portion of big data. Over the past decades, text mining technologies have been used in various industries for practical applications. In business intelligence, text mining has been employed to discover new market and/or technology opportunities and to support rational decision making by business participants. Market information such as market size, market growth rate, and market share is essential for setting companies' business strategies, and there has been continuous demand in various fields for market information at the level of specific products. However, such information has generally been provided at the industry level or for broad categories based on classification standards, making it difficult to obtain specific and appropriate information. We therefore propose a new methodology that estimates the market sizes of product groups at a more detailed level than previously offered. We applied the Word2Vec algorithm, a neural network-based semantic word embedding model, to enable automatic market size estimation from individual companies' product information in a bottom-up manner. The overall process is as follows. First, data on product information is collected, refined, and restructured into a form suitable for the Word2Vec model. Next, the preprocessed data is embedded into a vector space by Word2Vec, and product groups are derived by extracting similar product names based on cosine similarity. Finally, the sales data for the extracted products is summed to estimate the market size of each product group. As experimental data, product names from Statistics Korea's microdata (345,103 cases) were mapped into a multidimensional vector space by Word2Vec training. We optimized the training parameters and then used a vector dimension of 300 and a window size of 15 for further experiments. We employed the index words of the Korean Standard Industry Classification (KSIC) as a product name dataset to cluster product groups more efficiently. Product names similar to the KSIC index words were extracted based on cosine similarity, and the market size of the extracted products, treated as one product category, was calculated from individual companies' sales data. The market sizes of 11,654 specific product lines were automatically estimated by the proposed model. For performance verification, the results were compared with the actual market sizes of some items; the Pearson correlation coefficient was 0.513. Our approach has several advantages over previous studies. First, text mining and machine learning techniques were applied to market size estimation for the first time, overcoming the limitations of traditional methods that rely on sampling or require multiple assumptions. In addition, the level of the market category can be adjusted easily and efficiently according to the purpose of use by changing the cosine similarity threshold. Furthermore, the approach has high potential for practical application because it can meet unmet needs for detailed market size information in the public and private sectors. Specifically, it can be used in technology evaluation and technology commercialization support programs conducted by governmental institutions, as well as in business strategy consulting and market analysis reports by private firms. The limitation of our study is that the presented model needs to be improved in accuracy and reliability. The semantic word embedding module could be improved by imposing a proper order on the preprocessed dataset or by combining Word2Vec with another measure such as Jaccard similarity. The product group clustering method could also be replaced with other types of unsupervised machine learning algorithm. Our group is currently working on follow-up studies, which we expect will further improve the performance of the basic model proposed conceptually in this study.
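The last two steps of the process described above (grouping products by cosine similarity to an index word, then summing sales) can be sketched as follows. The vectors here are tiny hand-written stand-ins for trained Word2Vec embeddings, and the function names, product names, sales figures, and thresholds are hypothetical; the sketch shows only the thresholding and summation logic.

```python
import math


def cosine(u, v):
    """Cosine similarity between two equal-length vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def estimate_market_size(index_vector, products, threshold):
    """Group products similar to an index word and sum their sales.

    products is a list of (name, vector, sales) tuples. A product joins the
    group when its cosine similarity to index_vector is at least threshold.
    """
    members = [
        (name, sales)
        for name, vector, sales in products
        if cosine(index_vector, vector) >= threshold
    ]
    names = [name for name, _ in members]
    total_sales = sum(sales for _, sales in members)
    return names, total_sales


if __name__ == "__main__":
    # Hypothetical 2-dimensional embeddings; the study reports using 300 dimensions.
    index_vector = [1.0, 0.0]
    products = [
        ("laptop", [0.9, 0.1], 100.0),
        ("notebook pc", [0.8, 0.3], 50.0),
        ("rice", [0.0, 1.0], 70.0),
    ]
    for threshold in (0.90, 0.95):
        print(threshold, estimate_market_size(index_vector, products, threshold))
```

Raising the threshold narrows the product group and lowering it widens the group, which is the adjustment of market category level that the abstract mentions.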

Deep Learning-based Hyperspectral Image Classification with Application to Environmental Geographic Information Systems (딥러닝 기반의 초분광영상 분류를 사용한 환경공간정보시스템 활용)

  • Song, Ahram;Kim, Yongil
    • Korean Journal of Remote Sensing / v.33 no.6_2 / pp.1061-1073 / 2017
  • In this study, images were classified using a convolutional neural network (CNN), a deep learning technique, to investigate the feasibility of producing information by combining artificial intelligence and spatial data. A CNN determines kernel attributes based on a classification criterion and extracts information from feature maps to classify each pixel. A CNN was constructed to classify materials with similar spectral characteristics and attribute information, which is difficult to achieve with conventional image processing techniques. A Compact Airborne Spectrographic Imager (CASI) and an Airborne Imaging Spectrometer for Application (AISA) were used at three study sites to test this method. Site 1 and Site 2 were agricultural lands covered with various crops, such as potato, onion, and rice. Site 3 included different buildings, such as single and joint residential facilities. The classification of crop species at Site 1 and Site 2 yielded accuracies of 96% and 99%, respectively. At Site 3, the designation of buildings according to their purpose yielded an accuracy of 96%. By combining existing land cover maps with spatial data, we propose a thematic environmental map that provides seasonal crop types and facilitates the creation of a land cover map.

Design and Implementation of Optimal Smart Home Control System (최적의 스마트 홈 제어 시스템 설계 및 구현)

  • Lee, Hyoung-Ro;Lin, Chi-Ho
    • The Journal of the Institute of Internet, Broadcasting and Communication / v.18 no.1 / pp.135-141 / 2018
  • In this paper, we describe the design and implementation of an optimal smart home control system. Recent developments in technologies such as sensors and communication have enabled the Internet of Things (IoT) to control a wide range of objects, such as light bulbs, socket outlets, and clothing. Many businesses rely on launching collaborative services among themselves. However, traditional IoT systems often support only a single protocol, even though data is transmitted across multiple protocols between end-to-end devices. In addition, each IoT manufacturer provides its own dedicated application, which makes registering and controlling different IoT devices highly complex. In the ARIoT system, special marking points and edge extraction techniques are used to detect objects, but recognition rates deviate depending on the sampling data. To compensate for these problems, the proposed system implements an IoT gateway for objects based on OneM2M. It supports the diverse protocols of end-to-end devices through a single application. In addition, devices were learned using deep learning from the artificial intelligence field, and the object recognition of existing systems was improved through inference and detection, reducing the deviation in recognition rates.

IoT Open-Source and AI based Automatic Door Lock Access Control Solution

  • Yoon, Sung Hoon;Lee, Kil Soo;Cha, Jae Sang;Mariappan, Vinayagam;Young, Ko Eun;Woo, Deok Gun;Kim, Jeong Uk
    • International Journal of Internet, Broadcasting and Communication / v.12 no.2 / pp.8-14 / 2020
  • Demand has recently been increasing for integrated access control systems capable of user recognition, door control, and facility operations control for smart building automation. Commercially available door lock access control solutions need to improve on the current level of door lock security, which is compromised when a password or digital key is exposed to strangers. At present, access control solution providers are focusing on developing automatic access control systems using radio frequency (RF) based technologies such as Bluetooth and WiFi. All existing automatic door access control technologies require an additional hardware interface and remain vulnerable to security threats. This paper proposes a user identification and authentication solution for automatic door lock control using camera-based visible light communication (VLC) technology. The proposed approach uses the cameras installed in the building facility, users' smart devices, and IoT open-source controller-based LED light sensors installed in the building infrastructure. The IoT LED light sensors installed in the facility transmit the authorized user and facility information as a color grid code. The smart device camera decodes the user information, verifies it against stored user information, indicates the authentication status to the user, and sends an authentication acknowledgement to the camera integrated with the facility door lock to control the door lock operations. The camera-based VLC receiver uses artificial intelligence (AI) methods to decode VLC data and improve VLC performance. This paper implements a testbed model using an IoT open-source based LED light sensor with a CCTV camera and user smartphone devices. The experimental results are verified with custom-made convolutional neural network (CNN) based AI techniques for the VLC decoding method on smart devices and PC-based CCTV monitoring solutions. The achieved experimental results confirm that the proposed door access control solution is effective and robust for automatic door access control.

Exploratory Research on Automating the Analysis of Scientific Argumentation Using Machine Learning (머신 러닝을 활용한 과학 논변 구성 요소 코딩 자동화 가능성 탐색 연구)

  • Lee, Gyeong-Geon;Ha, Heesoo;Hong, Hun-Gi;Kim, Heui-Baik
    • Journal of The Korean Association For Science Education / v.38 no.2 / pp.219-234 / 2018
  • In this study, we explored the possibility of automating the analysis of the elements of scientific argument in the context of a Korean classroom. To gather training data, we collected 990 sentences from science education journals that illustrate the results of coding elements of argumentation according to Toulmin's argumentation structure framework. As a test data set, we extracted 483 sentences from transcripts of students' discourse in scientific argumentation activities. The words and morphemes of each argument were analyzed using the Python 'KoNLPy' package and the 'Kkma' module for Korean natural language processing. After constructing the 'argument-morpheme:class' matrix for the 1,473 sentences, five machine learning techniques were applied to generate predictive models relating each sentence to the element of argument to which it corresponded. The accuracy of the predictive models was investigated by comparing their output with the researchers' pre-coding and confirming the degree of agreement. The predictive model generated by the k-nearest neighbor algorithm (KNN) demonstrated the highest degree of agreement [54.04% (${\kappa}=0.22$)] when machine learning considered the morphemes of each sentence. The KNN model exhibited higher agreement [55.07% (${\kappa}=0.24$)] when the coding results of the previous sentence were added to the prediction process. The results thus indicate the importance of considering the context of discourse by reflecting the codes of previous sentences in the analysis. The results are significant in that they show the possibility of automating the analysis of students' argumentation activities in Korean by applying machine learning.
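The two agreement figures reported above (percent agreement and Cohen's kappa) can be computed as in the following sketch. The function name and the sample codes are hypothetical; only the standard definitions of observed agreement, chance agreement, and kappa are used.

```python
from collections import Counter


def agreement_and_kappa(human_codes, model_codes):
    """Return (observed agreement, Cohen's kappa) for two equal-length codings."""
    n = len(human_codes)
    # Observed agreement: share of sentences given the same code by both.
    p_observed = sum(h == m for h, m in zip(human_codes, model_codes)) / n
    # Chance agreement: product of the two marginal shares, summed over codes.
    human_counts = Counter(human_codes)
    model_counts = Counter(model_codes)
    p_chance = sum(human_counts[c] * model_counts[c] for c in human_counts) / (n * n)
    if p_chance == 1:
        return p_observed, float("nan")  # kappa is undefined in this case
    kappa = (p_observed - p_chance) / (1 - p_chance)
    return p_observed, kappa


if __name__ == "__main__":
    # Hypothetical codes for six sentences, using Toulmin-style element names.
    human = ["claim", "data", "claim", "warrant", "data", "claim"]
    model = ["claim", "data", "data", "warrant", "claim", "claim"]
    print(agreement_and_kappa(human, model))
```

Kappa discounts the agreement expected by chance, which is why a raw agreement around 55% can correspond to a kappa as low as the 0.22 to 0.24 reported in the abstract.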

Artificial Intelligence Algorithms, Model-Based Social Data Collection and Content Exploration (소셜데이터 분석 및 인공지능 알고리즘 기반 범죄 수사 기법 연구)

  • An, Dong-Uk;Leem, Choon Seong
    • The Journal of Bigdata / v.4 no.2 / pp.23-34 / 2019
  • Crime that exploits digital platforms is continuously increasing: about 140,000 cases occurred in 2015 and about 150,000 in 2016. Traditional investigation techniques are therefore considered limited in handling such online crimes. The manual online searches and cognitive investigation methods that investigators widely use today are not enough to cope proactively with rapidly changing civil crimes. In addition, the fact that content is posted to unspecified users of social media makes investigations more difficult. Considering the characteristics of the online media where infringement crimes occur, this study suggests site-based collection and the Open API among the web content collection methods. Because illegal content is published and deleted quickly, and new words and variations are generated quickly and in many forms, it is difficult to recognize them promptly through morphological analysis based on a manually registered dictionary. To solve this problem, we propose a tokenizing method that applies WPM (Word Piece Model), a data preprocessing method, within the existing dictionary-based morphological analysis, so that illegal content posted in online infringement crimes can be recognized and responded to quickly. In the data analysis, the optimal precision is verified through a vote-based ensemble method using classification models based on supervised learning for the investigation of illegal content. This study applies a sorting algorithm model centered on illegal multilevel business cases to proactively recognize crimes that invade the public economy, and presents an empirical study for dealing effectively with social data collection and content investigation.
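A vote-based ensemble of the kind mentioned above can be sketched as simple majority (hard) voting. The abstract does not specify the voting scheme or the base classifiers, so the hard-voting rule, the function name, and the sample predictions below are assumptions made for illustration.

```python
from collections import Counter


def majority_vote(predictions):
    """Combine per-classifier label lists into one list by majority vote.

    predictions is a list with one entry per classifier; each entry is a
    list of labels, one per document, and all entries have the same length.
    Using an odd number of classifiers avoids ties for binary labels.
    """
    voted = []
    for labels_for_document in zip(*predictions):
        winner, _count = Counter(labels_for_document).most_common(1)[0]
        voted.append(winner)
    return voted


if __name__ == "__main__":
    # Hypothetical outputs of three supervised classifiers on three posts.
    predictions = [
        ["illegal", "legal", "legal"],
        ["illegal", "illegal", "legal"],
        ["legal", "illegal", "legal"],
    ]
    print(majority_vote(predictions))
```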

Integer Programming-based Local Search Techniques for the Multidimensional Knapsack Problem (다차원 배낭 문제를 위한 정수계획법 기반 지역 탐색 기법)

  • Hwang, Jun-Ha
    • Journal of the Korea Society of Computer and Information / v.17 no.6 / pp.13-27 / 2012
  • Integer programming-based local search (IPbLS) is a kind of local search that is based on simple hill-climbing search and, unlike general local search, adopts integer programming for neighbor generation. According to existing research [1], IPbLS is an effective method for the multidimensional knapsack problem (MKP), which has received wide attention in operations research and artificial intelligence. However, the existing research has the shortcoming that it verified the superiority of IPbLS only on the largest-scale problems among the MKP test problems in the OR-Library. In this paper, I verify the superiority of IPbLS more objectively by applying it to other problems. In addition, unlike the existing IPbLS, which combines simple hill-climbing search with integer programming, I propose methods that combine other local search algorithms, such as hill-climbing search, tabu search, and simulated annealing, with integer programming. The experimental results confirm that IPbLS shows comparable or better performance than the best known heuristic search on mid- and small-scale MKP test problems as well.
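For orientation, the MKP and a plain hill-climbing baseline can be sketched as below. This is not IPbLS: it uses an ordinary single-flip neighborhood, whereas IPbLS as described above generates neighbors with integer programming. The function names and the small instance are hypothetical, and positive profits are assumed, so removing an item never improves the objective.

```python
def is_feasible(x, weights, capacities):
    """Check every knapsack dimension: sum_j weights[i][j] * x[j] <= capacities[i]."""
    return all(
        sum(w * xj for w, xj in zip(row, x)) <= capacity
        for row, capacity in zip(weights, capacities)
    )


def hill_climb_mkp(profits, weights, capacities):
    """Steepest-ascent hill climbing over single-item flips, starting empty.

    With positive profits only 0 -> 1 flips can improve the objective, so
    each step adds the feasible item with the largest profit and the search
    stops at the first solution with no improving feasible flip.
    """
    x = [0] * len(profits)
    while True:
        best_item, best_gain = None, 0
        for j, profit in enumerate(profits):
            if x[j] == 1:
                continue
            x[j] = 1  # tentatively add item j
            if profit > best_gain and is_feasible(x, weights, capacities):
                best_item, best_gain = j, profit
            x[j] = 0  # undo the tentative move
        if best_item is None:
            break
        x[best_item] = 1
    return x, sum(p * xj for p, xj in zip(profits, x))


if __name__ == "__main__":
    # Hypothetical instance: 4 items, 2 resource dimensions.
    profits = [10, 7, 4, 3]
    weights = [[5, 4, 3, 2], [4, 5, 1, 2]]
    capacities = [9, 8]
    print(hill_climb_mkp(profits, weights, capacities))
```

A local optimum of this single-flip search can be far from the global optimum on larger instances, which is the motivation for richer neighborhoods such as those the abstract describes.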

Artificial Intelligence Techniques for Predicting Online Peer-to-Peer(P2P) Loan Default (인공지능기법을 이용한 온라인 P2P 대출거래의 채무불이행 예측에 관한 실증연구)

  • Bae, Jae Kwon;Lee, Seung Yeon;Seo, Hee Jin
    • The Journal of Society for e-Business Studies / v.23 no.3 / pp.207-224 / 2018
  • This article reports an empirical study using a public dataset from Lending Club Corporation, the largest online peer-to-peer (P2P) lending platform in the world. We explore significant predictor variables related to P2P lending default and find that housing situation, length of employment, average current balance, debt-to-income ratio, loan amount, loan purpose, interest rate, public records, number of finance trades, total credit/credit limit, number of delinquent accounts, number of mortgage accounts, and number of bank card accounts are significant factors in whether a loan is funded successfully on the Lending Club platform. We developed online P2P lending default prediction models using discriminant analysis, logistic regression, neural networks, and decision trees (i.e., CART and C5.0). To verify the feasibility and effectiveness of the prediction models, borrower loan data and credit data were used. Empirical results indicated that neural networks outperform the other classifiers, namely discriminant analysis, logistic regression, CART, and C5.0. In P2P loan default prediction, neural networks always outperformed the other classifiers.