• Title/Summary/Keyword: Decision support techniques


A Study on the Effect of the Document Summarization Technique on the Fake News Detection Model (문서 요약 기법이 가짜 뉴스 탐지 모형에 미치는 영향에 관한 연구)

  • Shim, Jae-Seung; Won, Ha-Ram; Ahn, Hyunchul
    • Journal of Intelligence and Information Systems / v.25 no.3 / pp.201-220 / 2019
  • Fake news has emerged as a significant issue over the last few years, igniting discussion and research on how to solve this problem. In particular, studies on automated fact-checking and fake news detection using artificial intelligence and text analysis techniques have drawn attention. Fake news detection research entails a form of document classification; thus, document classification techniques have been widely used in this type of research. However, document summarization techniques have been inconspicuous in this field. At the same time, automatic news summarization services have become popular, and a recent study found that the use of news summarized through abstractive summarization strengthened the predictive performance of fake news detection models. Therefore, the need to study the integration of document summarization technology in the domestic news data environment has become evident. In order to examine the effect of extractive summarization on the fake news detection model, we first summarized news articles through extractive summarization. Second, we created a summarized-news-based detection model. Finally, we compared our model with the full-text-based detection model. The study found that BPN (Back Propagation Neural Network) and SVM (Support Vector Machine) did not exhibit a large difference in performance; however, for DT (Decision Tree), the full-text-based model performed somewhat better. In the case of LR (Logistic Regression), our model exhibited superior performance. Nonetheless, the results did not show a statistically significant difference between our model and the full-text-based model. These results suggest that, when summarization is applied, at least the core information of the fake news is preserved, and that an LR-based model may even improve in performance. This study features an experimental application of extractive summarization to fake news detection research employing various machine-learning algorithms. The study's limitations are, essentially, the relatively small amount of data and the lack of comparison among various summarization techniques. Therefore, an in-depth analysis that applies various analytical techniques to a larger data volume would be helpful in the future.
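
As a rough illustration of the pipeline this abstract describes, the sketch below scores sentences by mean TF-IDF weight for a simple extractive summary and then cross-validates LR, SVM, and DT detectors on full-text versus summarized input. It is a minimal sketch under assumed data (the `articles` and `y` variables are hypothetical placeholders), not the authors' implementation.

```python
# Minimal sketch: compare fake-news detectors trained on full text vs.
# extractive summaries. Dataset and variable names are hypothetical.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.svm import SVC
from sklearn.tree import DecisionTreeClassifier
from sklearn.model_selection import cross_val_score

def extractive_summary(text, k=3):
    """Keep the k sentences with the highest mean TF-IDF weight."""
    sents = [s.strip() for s in text.split(".") if s.strip()]
    if len(sents) <= k:
        return text
    tfidf = TfidfVectorizer().fit_transform(sents)
    scores = np.asarray(tfidf.mean(axis=1)).ravel()
    top = sorted(np.argsort(scores)[-k:])  # keep original sentence order
    return ". ".join(sents[i] for i in top)

def evaluate(texts, labels):
    """5-fold accuracy for each classifier family compared in the paper."""
    X = TfidfVectorizer(max_features=5000).fit_transform(texts)
    for name, clf in [("LR", LogisticRegression(max_iter=1000)),
                      ("SVM", SVC()),
                      ("DT", DecisionTreeClassifier())]:
        print(name, cross_val_score(clf, X, labels, cv=5).mean())

# usage (hypothetical data):
# evaluate(articles, y)                                   # full-text model
# evaluate([extractive_summary(a) for a in articles], y)  # summary model
```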

Development of an Intelligent Trading System Using Support Vector Machines and Genetic Algorithms (Support Vector Machines와 유전자 알고리즘을 이용한 지능형 트레이딩 시스템 개발)

  • Kim, Sun-Woong; Ahn, Hyun-Chul
    • Journal of Intelligence and Information Systems / v.16 no.1 / pp.71-92 / 2010
  • As the use of trading systems has increased recently, many researchers are interested in developing intelligent trading systems using artificial intelligence techniques. However, most prior studies on trading systems share common limitations. First, they adopted only a few technical indicators based on stock indices as independent variables, although a variety of other variables could be used to predict the market. In addition, most of them focus on developing a model that predicts the direction of stock market indices rather than one that can generate trading signals for maximizing returns. Thus, in this study, we propose a novel intelligent trading system that mitigates these limitations. It is designed to use both technical indicators and other non-price variables on the market. Also, it adopts a 'two-threshold mechanism' so that it can transform the outcome of the stock market prediction model, based on support vector machines, into trading decision signals such as buy, sell, or hold. To validate the usefulness of the proposed system, we applied it to real-world data: the KOSPI200 index from May 2004 to December 2009. As a result, we found that the proposed system outperformed other comparative models from the perspective of rate of return.
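
The 'two-threshold mechanism' lends itself to a short sketch: a prediction score is mapped to buy, sell, or hold depending on which of two thresholds it crosses. The threshold values below are illustrative only; in the paper's spirit they would be tuned (e.g., by the genetic algorithm) to maximize returns.

```python
# Sketch of a 'two-threshold mechanism': map a market-prediction score
# (e.g., an SVM-based probability of an up-move) to trading signals.
# Threshold values are illustrative, not the paper's tuned values.
def trading_signal(score, buy_threshold=0.6, sell_threshold=0.4):
    if score >= buy_threshold:
        return "buy"
    if score <= sell_threshold:
        return "sell"
    return "hold"

# usage with hypothetical prediction scores:
for s in (0.72, 0.55, 0.31):
    print(s, "->", trading_signal(s))
```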

Development of Sentiment Analysis Model for the hot topic detection of online stock forums (온라인 주식 포럼의 핫토픽 탐지를 위한 감성분석 모형의 개발)

  • Hong, Taeho; Lee, Taewon; Li, Jingjing
    • Journal of Intelligence and Information Systems / v.22 no.1 / pp.187-204 / 2016
  • Document classification based on emotional polarity has become a welcome emerging task owing to the great explosion of data on the Web. In the big data age, there are too many information sources to refer to when making decisions. For example, when considering travel to a city, a person may search reviews from a search engine such as Google or social networking services (SNSs) such as blogs, Twitter, and Facebook. The emotional polarity of positive and negative reviews helps a user decide whether or not to make a trip. Sentiment analysis of customer reviews has become an important research topic as data-mining technology is widely accepted for text mining of the Web. Sentiment analysis has been used to classify documents through machine learning techniques, such as the decision tree, neural networks, and support vector machines (SVMs), and is used to determine the attitude, position, and sensibility of people who write articles about various topics published on the Web. Regardless of the polarity of customer reviews, emotional reviews are very helpful materials for analyzing customers' opinions. Sentiment analysis helps with understanding what customers really want, instantly, through automated text mining techniques: it applies text mining to text on the Web to extract subjective information and to determine the attitude or position of the person who wrote the article about a particular topic. In this study, we developed a model that selects a hot topic from user posts at China's online stock forum by using the k-means algorithm and self-organizing map (SOM). In addition, we developed a detection model to predict a hot topic by using machine learning techniques such as logit, the decision tree, and SVM. We employed sentiment analysis to develop our model for the selection and detection of hot topics from China's online stock forum. The sentiment analysis calculates a sentiment value for a document based on contrast and classification according to a polarity sentiment dictionary (positive or negative). The online stock forum was an attractive site because of its information about stock investment. Users post numerous texts about stock movement, analyzing the market according to government policy announcements, market reports, reports from research institutes on the economy, and even rumors. We divided the online forum's topics into 21 categories to utilize sentiment analysis, and 144 topics were selected among these categories. The posts were crawled to build a positive and negative text database. We ultimately obtained 21,141 posts on 88 topics after preprocessing the text from March 2013 to February 2015. An interest index was defined to select the hot topics, and the k-means algorithm and SOM presented equivalent results on this data. We developed decision tree models to detect hot topics with three algorithms: CHAID, CART, and C4.5. The results of CHAID were subpar compared to the others. We also employed SVM to detect the hot topics from negative data. The SVM models were trained with the radial basis function (RBF) kernel, tuned by a grid search, to detect the hot topics. The detection of hot topics using sentiment analysis provides investors with the latest trends and hot topics in the stock forum so that they no longer need to search the vast amounts of information on the Web. Our proposed model is also helpful for rapidly determining customers' signals or attitudes towards government policy and firms' products and services.
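
Two of the concrete steps in this abstract translate naturally into code: the dictionary-based sentiment value and the grid-searched RBF SVM. The sketch below is a generic rendering under assumed inputs (the word lists and the `X`, `y` feature matrix are hypothetical placeholders), not the authors' exact setup.

```python
# Sketch of two steps from the abstract. Word lists and features are
# hypothetical placeholders, not the paper's polarity dictionary or data.
from sklearn.model_selection import GridSearchCV
from sklearn.svm import SVC

# 1) Dictionary-based sentiment value: net polarity of a tokenized post.
POS = {"rise", "gain", "bull"}   # placeholder positive terms
NEG = {"fall", "loss", "bear"}   # placeholder negative terms

def sentiment_value(tokens):
    pos = sum(t in POS for t in tokens)
    neg = sum(t in NEG for t in tokens)
    return (pos - neg) / max(pos + neg, 1)

# 2) RBF-kernel SVM tuned by grid search, as described for hot-topic
#    detection. X: per-topic features (e.g., sentiment values); y: labels.
param_grid = {"C": [0.1, 1, 10, 100], "gamma": [0.001, 0.01, 0.1, 1]}
search = GridSearchCV(SVC(kernel="rbf"), param_grid, cv=5, scoring="f1")
# search.fit(X, y)
# print(search.best_params_, search.best_score_)
```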

Stock Price Direction Prediction Using Convolutional Neural Network: Emphasis on Correlation Feature Selection (합성곱 신경망을 이용한 주가방향 예측: 상관관계 속성선택 방법을 중심으로)

  • Kyun Sun Eo; Kun Chang Lee
    • Information Systems Review / v.22 no.4 / pp.21-39 / 2020
  • Recently, deep learning has shown high performance in various applications such as pattern analysis and image classification. Stock market forecasting, known as a difficult task in machine learning research, is an area where the effectiveness of deep learning techniques is being verified by many researchers. This study proposed a deep learning Convolutional Neural Network (CNN) model to predict the direction of stock prices. We then used a feature selection method to improve the performance of the model, and compared the performance of machine learning classifiers against the CNN. The classifiers used in this study are as follows: Logistic Regression, Decision Tree, Neural Network, Support Vector Machine, AdaBoost, Bagging, and Random Forest. The results of this study confirmed that the CNN showed higher performance compared with the other classifiers when feature selection was applied. The results show that the CNN model effectively predicted the stock price direction by analyzing the embedded values of the financial data.
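
A minimal sketch of the described approach, assuming a pandas DataFrame of financial indicators with a binary `direction` column: a simple absolute-correlation filter stands in for the paper's correlation feature selection, followed by a small 1-D CNN. The architecture and column names are illustrative, not the authors' exact model.

```python
# Sketch: correlation-based feature filtering, then a small 1-D CNN for
# up/down direction prediction. Simplified stand-in for the paper's setup.
import numpy as np
import pandas as pd
import tensorflow as tf

def select_by_correlation(df, target, k=10):
    """Keep the k features most correlated (in absolute value) with target."""
    corr = df.corr()[target].drop(target).abs()
    return corr.nlargest(k).index.tolist()

def build_cnn(n_features):
    return tf.keras.Sequential([
        tf.keras.Input(shape=(n_features, 1)),
        tf.keras.layers.Conv1D(32, kernel_size=3, activation="relu"),
        tf.keras.layers.Flatten(),
        tf.keras.layers.Dense(16, activation="relu"),
        tf.keras.layers.Dense(1, activation="sigmoid"),  # P(price goes up)
    ])

# usage (hypothetical DataFrame `df` with a binary 'direction' column):
# feats = select_by_correlation(df, "direction", k=10)
# X = df[feats].to_numpy()[..., np.newaxis]   # shape: (samples, features, 1)
# model = build_cnn(len(feats))
# model.compile(optimizer="adam", loss="binary_crossentropy",
#               metrics=["accuracy"])
# model.fit(X, df["direction"].to_numpy(), epochs=20, validation_split=0.2)
```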

A Framework for Preliminary Ship Design Process Management System (선박 초기 설계 프로세스 관리 시스템을 위한 프레임워크 제안)

  • Jang, Beom-Seon; Yang, Young-Soon; Lee, Chang-Hyun
    • Journal of the Computational Structural Engineering Institute of Korea / v.21 no.6 / pp.535-541 / 2008
  • As the concurrent engineering concept has emerged along with the support of optimization techniques, many endeavors have been made to apply optimization techniques to actual design problems for holistic decisions. Even though the range of design problems to which optimization is applicable has been extended, most ship designs still follow an iterative approach due to the difficulty of seamlessly integrating all related design activities. In this approach, an entire design problem is divided into many sub-problems and carried out by many different disciplines through complicated internal interactions. This paper focuses on the preliminary ship design process and proposes a process-centric integrated framework as the first step toward establishing a workflow-based design process management system. The framework consists of two parts: a schedule management part that supports a manager in monitoring current progress and adjusting the current schedule, and a process management part that assists a designer in effectively performing a series of design activities by following a predefined procedure. The overall system is decomposed into modules according to the target to be managed in each module, and appropriate interactions between the decomposed modules are designed to achieve consistency of the entire system. The design process model is also built on a thorough analysis of actual ship design practice. The proposed framework will be implemented using a commercial workflow package.
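
The two-part decomposition can be illustrated with a toy sketch: a process-management object that walks a designer through a predefined activity sequence, and a schedule-management object that reports progress to a manager. Class and method names are hypothetical, not the paper's actual interfaces.

```python
# Toy sketch of the paper's two-part decomposition. All names are
# hypothetical illustrations, not the framework's real modules.
from dataclasses import dataclass, field

@dataclass
class Activity:
    name: str
    done: bool = False

@dataclass
class ProcessManager:
    """Guides a designer through a predefined sequence of design activities."""
    activities: list = field(default_factory=list)

    def next_activity(self):
        return next((a for a in self.activities if not a.done), None)

    def complete(self, name):
        for a in self.activities:
            if a.name == name:
                a.done = True

@dataclass
class ScheduleManager:
    """Lets a manager monitor progress status against the current schedule."""
    process: ProcessManager

    def progress(self):
        acts = self.process.activities
        return sum(a.done for a in acts) / len(acts) if acts else 1.0

# usage:
pm = ProcessManager([Activity("hull form"), Activity("general arrangement")])
pm.complete("hull form")
print(ScheduleManager(pm).progress())  # 0.5
```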

A Decision Support Model for the Exchange Risk Management of Overseas Construction Projects (해외 건설 프로젝트의 환리스크 관리를 위한 의사결정 지원 모델)

  • An, Chi-Hoon; Yoo, Hyun-Seok; Kim, Young-Suk
    • Korean Journal of Construction Engineering and Management / v.13 no.3 / pp.109-121 / 2012
  • Overseas construction project orders have shown a steady increase since 2001 and accounted for 44.5% of total construction project orders in 2010. Overseas construction projects need more complex risk management because they are affected by more varied environmental factors than domestic construction. Previous studies have centered on internal risk factors to assist decision-making, but there is little research on the importance and techniques of foreign exchange risk management. Inadequate management of foreign exchange risk, stemming from a lack of recognition of its importance, has been found to cause huge damages. Therefore, the current study designed a foreign exchange risk management model to support efficient management and decision-making. This model was developed as a technique to meet the demand of the increasing number of overseas construction projects for efficient management of foreign exchange risk, and the technique will lower risk with increasingly accurate outcomes as profit-and-loss data accumulate.

A Recommending System for Care Plan(Res-CP) in Long-Term Care Insurance System (데이터마이닝 기법을 활용한 노인장기요양급여 권고모형 개발)

  • Han, Eun-Jeong; Lee, Jung-Suk; Kim, Dong-Geon; Ka, Im-Ok
    • The Korean Journal of Applied Statistics / v.22 no.6 / pp.1229-1237 / 2009
  • In the long-term care insurance (LTCI) system, the question of how to provide the most appropriate care has become a major issue for the elderly, their families, and policy makers. To help beneficiaries use LTC services appropriately to their care needs, the National Health Insurance Corporation (NHIC) provides them with an individualized care plan, named the Long-term Care User Guide, which includes recommendations for the most appropriate type of care. The purpose of this study is to develop a recommending system for care plans (Res-CP) in the LTCI system. We used the data set for the Long-term Care User Guide in the 3rd long-term care insurance pilot programs. To develop the model, we tested four approaches: a decision-tree model from data mining, a logistic regression model, and bagging and boosting techniques in an ensemble model. The decision-tree model was selected to describe Res-CP because its algorithm may be easy to explain to the working groups. Res-CP might be useful for evidence-based care planning in the LTCI system and may contribute to the efficient use of LTC services.
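
Since the decision tree was chosen for its explainability, a minimal sketch (with hypothetical assessment features and labels) shows how such a recommender's rules can be printed in human-readable form:

```python
# Sketch in the spirit of Res-CP: a decision-tree care-plan recommender
# whose rules can be read directly. Feature and class names are
# hypothetical placeholders, not the study's actual variables.
from sklearn.tree import DecisionTreeClassifier, export_text

features = ["adl_score", "cognition_score", "lives_alone"]
clf = DecisionTreeClassifier(max_depth=3)  # shallow tree stays explainable

# X: beneficiaries' assessment data (columns as in `features`);
# y: recommended care type, e.g. "home care" vs. "facility care".
# clf.fit(X, y)
# print(export_text(clf, feature_names=features))  # human-readable rules
```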

Study on Predicting the Designation of Administrative Issue in the KOSDAQ Market Based on Machine Learning Based on Financial Data (머신러닝 기반 KOSDAQ 시장의 관리종목 지정 예측 연구: 재무적 데이터를 중심으로)

  • Yoon, Yanghyun; Kim, Taekyung; Kim, Suyeong
    • Asia-Pacific Journal of Business Venturing and Entrepreneurship / v.17 no.1 / pp.229-249 / 2022
  • This paper investigates machine learning models for predicting the designation of administrative issues in the KOSDAQ market through various techniques. When a company in the Korean stock market is designated as an administrative issue, the market recognizes the event itself as negative information, causing losses to the company and investors. The purpose of this study is to evaluate alternative methods for developing an artificial intelligence service that examines the possibility of detecting the designation of administrative issues early through companies' financial ratios and helps investors manage portfolio risks. In this study, 21 financial ratios representing profitability, stability, activity, and growth were used as independent variables. From 2011 to 2020, the period in which K-IFRS was applied, financial data of administrative-issue and non-administrative-issue companies were sampled. Logistic regression analysis, decision tree, support vector machine, random forest, and LightGBM were used to predict the designation of administrative issues. According to the results of the analysis, LightGBM, with 82.73% classification accuracy, is the best prediction model, and the prediction model with the lowest classification accuracy is the decision tree, with 71.94% accuracy. Checking the top three variables by importance in the decision-tree-based learning models shows that the financial variables common to each model are ROE (net profit) and the capital stock turnover ratio, which are relatively important variables in designating administrative issues. In general, it is confirmed that learning models using ensembles had higher predictive performance than single learning models.
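
A minimal sketch of the winning setup, assuming a DataFrame of financial ratios with a binary `designated` label (the column names are hypothetical): fit a LightGBM classifier and inspect the top feature importances, as the study does.

```python
# Sketch: LightGBM classifier for administrative-issue designation plus
# feature-importance inspection. Data and column names are hypothetical.
import lightgbm as lgb
import pandas as pd
from sklearn.model_selection import train_test_split
from sklearn.metrics import accuracy_score

# df: one row per firm-year of financial ratios; 'designated' is the label.
# X_train, X_test, y_train, y_test = train_test_split(
#     df.drop(columns="designated"), df["designated"], test_size=0.2)
clf = lgb.LGBMClassifier(n_estimators=200)
# clf.fit(X_train, y_train)
# print(accuracy_score(y_test, clf.predict(X_test)))
# importance = pd.Series(clf.feature_importances_, index=X_train.columns)
# print(importance.nlargest(3))  # e.g., ROE, capital stock turnover ratio
```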

Improving the Accuracy of Document Classification by Learning Heterogeneity (이질성 학습을 통한 문서 분류의 정확성 향상 기법)

  • Wong, William Xiu Shun; Hyun, Yoonjin; Kim, Namgyu
    • Journal of Intelligence and Information Systems / v.24 no.3 / pp.21-44 / 2018
  • In recent years, the rapid development of internet technology and the popularization of smart devices have resulted in massive amounts of text data, produced and distributed through various media platforms such as the World Wide Web, Internet news feeds, microblogs, and social media. However, this enormous amount of easily obtained information lacks organization. This problem has raised the interest of many researchers and requires professionals capable of classifying relevant information; hence, text classification is introduced. Text classification is a challenging task in modern data analysis, in which a text document must be assigned to one or more predefined categories or classes. In the text classification field, different kinds of techniques are available, such as K-Nearest Neighbor, the Naïve Bayes algorithm, Support Vector Machine, Decision Tree, and Artificial Neural Network. However, when dealing with huge amounts of text data, model performance and accuracy become a challenge. Depending on the type of words used in the corpus and the type of features created for classification, the performance of a text classification model can vary. Most attempts have been made by proposing a new algorithm or modifying an existing one, and this kind of research can be said to have reached its limits for further improvement. In this study, rather than proposing or modifying an algorithm, we focus on searching for a way to modify the use of data. It is widely known that classifier performance is influenced by the quality of the training data upon which the classifier is built. Real-world datasets usually contain noise, or in other words noisy data, which can affect the decisions made by the classifiers built from these data. In this study, we consider that data from different domains, that is, heterogeneous data, might have noise-like characteristics which can be utilized in the classification process. In order to build a classifier, a machine learning algorithm is performed based on the assumption that the characteristics of the training data and the target data are the same or very similar. However, in the case of unstructured data such as text, the features are determined according to the vocabulary included in the documents; if the viewpoints of the learning data and target data differ, the features may appear different between the two. In this study, we attempt to improve classification accuracy by strengthening the robustness of the document classifier through artificially injecting noise into the process of constructing it. With data coming from various kinds of sources, the data are likely formatted differently, which causes difficulties for traditional machine learning algorithms because they are not developed to recognize different types of data representation at one time and to put them together in the same generalization. Therefore, in order to utilize heterogeneous data in the learning process of the document classifier, we apply semi-supervised learning in our study. However, unlabeled data may degrade the performance of the document classifier. Therefore, we further propose a method called the Rule Selection-Based Ensemble Semi-Supervised Learning Algorithm (RSESLA) to select only the documents that contribute to the accuracy improvement of the classifier. RSESLA creates multiple views by manipulating the features using different types of classification models and different types of heterogeneous data. The most confident classification rules are selected and applied for the final decision making. In this paper, three different types of real-world data sources were used: news, Twitter, and blogs.
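
RSESLA itself is the authors' algorithm, but the underlying semi-supervised idea of learning from labeled plus unlabeled documents can be illustrated generically with scikit-learn's self-training wrapper, where unlabeled samples carry the label -1. This is a stand-in sketch on toy data, not an implementation of RSESLA.

```python
# Generic semi-supervised document classification (not the authors' RSESLA):
# self-training learns from labeled and unlabeled texts; -1 marks unlabeled.
import numpy as np
from sklearn.feature_extraction.text import TfidfVectorizer
from sklearn.linear_model import LogisticRegression
from sklearn.semi_supervised import SelfTrainingClassifier

texts = ["markets rallied today", "new phone released",
         "rates cut again", "camera specs leaked"]  # mixed-domain documents
labels = np.array([0, 1, -1, -1])  # -1 = unlabeled (heterogeneous source)

X = TfidfVectorizer().fit_transform(texts)
clf = SelfTrainingClassifier(LogisticRegression(), threshold=0.6)
clf.fit(X, labels)           # pseudo-labels confident unlabeled documents
print(clf.predict(X))
```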

Stakeholder Awareness of Rural Spatial Planning Data Utilization Based on Survey (농촌공간계획 데이터 수급에 대한 이해당사자 인식조사)

  • Zaewoong Rhee; Sang-Hyun Lee; Sungyun Lee; Jinsung Kim; Rui Qu; Seung-Jong Bae; Soo-Jin Kim; Sangbum Kim
    • Journal of Korean Society of Rural Planning / v.29 no.3 / pp.25-37 / 2023
  • According to the 「Rural Spatial Reconstruction and Regeneration Support Act」, enacted on March 29, 2024, all local governments are required to establish a 'Rural Spatial Reconstruction and Regeneration Plan' (hereinafter the 'Rural Spatial Plan'). So that the 'Rural Spatial Plan' can be appropriately established, this study analyzed the supply and demand of spatial data from the perspective of user stakeholders and derived implications for improving rural spatial planning data utilization. In conclusion, three key recommendations follow from the results. First, it is necessary to establish an integrated database for rural spatial planning data. This can solve the problem of low awareness of scattered data-providing websites, reduce the processing time of non-GIS data, and reduce the time required to acquire data by ensuring that data can be searched and downloaded. In particular, research should be conducted on establishing a spatial analysis simulation system to support stakeholders' decision-making, considering that many stakeholders have difficulty with spatial analysis because spatial analysis techniques were not actively used in rural projects before the implementation of the rural agreement system in 2020. Second, research on how to improve data acquisition should be conducted for each data sector; the sectors with the lowest ease of acquisition are 'Local Community Domain', 'Changes in Domestic and International Conditions', and 'Provision and Utilization of Daily Life Services'. Lastly, in-depth research is needed on how to raise each rural spatial planning data supply stakeholder to the position of an active player. Stakeholders at 'University Institutions' and 'Public Enterprises and Research Institutes' should give those who participate in formulating rural spatial plans access to the raw data collected for public work. 'Private company' stakeholders need realistic measures to build a data pool centered on consultative bodies among existing private companies, followed by a step-by-step strategy to open it fully through the participation of various stakeholders. To induce 'Village Residents and Associations' stakeholders to play a leading role as owners and producers of data, personnel should be trained to collect and record data related to the village, and support measures should be prepared to sustain these activities.