Search | Korea Science

A Study on the Evaluation Indicators for the Establishment of Marine Fisheries Safety Education Facilities (해양수산안전 교육시설 설립을 위한 입지평가요인 도출에 관한 연구)

Shin-Young Ha;Bo-Young Kim;Sung-Ho Park
- Journal of the Korean Society of Marine Environment & Safety, v.30 no.4, pp.340-347, 2024
In this study, an expert survey was conducted using the Delphi technique to select the items and indicators for evaluating candidate sites before educational facilities are installed in the marine fisheries safety field, where the gap in educational infrastructure between regions is wide. Seven indicators were selected across geographic, social, and administrative factors. To evaluate each indicator objectively, measures were developed that can be assessed using public data such as the "Comprehensive National Balanced Development Information System" and the "National Statistical Portal". The Analytic Hierarchy Process (AHP) was applied to determine the weight of each indicator, yielding the 10 factors with the greatest influence on selecting a location for marine fisheries safety education facilities: the distribution of marine officers, access to high-speed railways, the number of small ships under 5 tons, access to highway interchanges, the distribution of fishing boats, close ties with related industries, planned new ports, the distribution of commercial ports, the number of marine leisure users, and the availability of long-term land leases from local government councils. The location evaluation index developed in this study can be used to evaluate each region with national public data, and therefore has the advantage of enabling objective evaluation. This evaluation index is thus judged to be usable for verifying the feasibility of installing marine fisheries safety education facilities as well as other marine-related facilities.
https://doi.org/10.7837/kosomes.2024.30.4.340
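
The abstract above says that AHP was used to weight the indicators but does not give the comparison matrices. As a minimal sketch of how such weights are commonly derived, the example below uses the row geometric-mean approximation and Saaty's consistency ratio. The 3x3 judgment matrix (geographic vs. social vs. administrative factors) is hypothetical and is not taken from the paper, and the paper may have used a different AHP variant.

```python
import math

def ahp_weights(pairwise):
    """Approximate AHP priority weights with the row geometric-mean method."""
    n = len(pairwise)
    geo_means = [math.prod(row) ** (1.0 / n) for row in pairwise]
    total = sum(geo_means)
    return [g / total for g in geo_means]

def consistency_ratio(pairwise, weights):
    """Saaty's consistency ratio CR = CI / RI (supports n = 3, 4, 5 only here)."""
    n = len(pairwise)
    # Estimate lambda_max as the mean of (A w)_i / w_i
    aw = [sum(pairwise[i][j] * weights[j] for j in range(n)) for i in range(n)]
    lambda_max = sum(aw[i] / weights[i] for i in range(n)) / n
    ci = (lambda_max - n) / (n - 1)
    random_index = {3: 0.58, 4: 0.90, 5: 1.12}  # Saaty's random index values
    return ci / random_index[n]

# Hypothetical expert judgments: geographic vs. social vs. administrative
matrix = [
    [1.0, 3.0, 5.0],
    [1 / 3, 1.0, 2.0],
    [1 / 5, 1 / 2, 1.0],
]
weights = ahp_weights(matrix)
cr = consistency_ratio(matrix, weights)
print([round(w, 3) for w in weights], round(cr, 4))
```

A consistency ratio below 0.1 is the usual threshold for accepting a set of pairwise judgments; above that, the judgments are normally revisited.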

Intelligent Brand Positioning Visualization System Based on Web Search Traffic Information : Focusing on Tablet PC (웹검색 트래픽 정보를 활용한 지능형 브랜드 포지셔닝 시스템 : 태블릿 PC 사례를 중심으로)

Jun, Seung-Pyo;Park, Do-Hyung
- Journal of Intelligence and Information Systems, v.19 no.3, pp.93-111, 2013
As Internet and information technology (IT) continues to develop and evolve, the issue of big data has emerged at the foreground of scholarly and industrial attention. Big data is generally defined as data that exceed the range that can be collected, stored, managed and analyzed by existing conventional information systems and it also refers to the new technologies designed to effectively extract values from such data. With the widespread dissemination of IT systems, continual efforts have been made in various fields of industry such as R&D, manufacturing, and finance to collect and analyze immense quantities of data in order to extract meaningful information and to use this information to solve various problems. Since IT has converged with various industries in many aspects, digital data are now being generated at a remarkably accelerating rate while developments in state-of-the-art technology have led to continual enhancements in system performance. The types of big data that are currently receiving the most attention include information available within companies, such as information on consumer characteristics, information on purchase records, logistics information and log information indicating the usage of products and services by consumers, as well as information accumulated outside companies, such as information on the web search traffic of online users, social network information, and patent information. Among these various types of big data, web searches performed by online users constitute one of the most effective and important sources of information for marketing purposes because consumers search for information on the internet in order to make efficient and rational choices. Recently, Google has provided public access to its information on the web search traffic of online users through a service named Google Trends. 
Research that uses this web search traffic information to analyze the information search behavior of online users is now receiving much attention in academia and in fields of industry. Studies using web search traffic information can be broadly classified into two fields. The first field consists of empirical demonstrations that show how web search information can be used to forecast social phenomena, the purchasing power of consumers, the outcomes of political elections, etc. The other field focuses on using web search traffic information to observe consumer behavior, identifying the attributes of a product that consumers regard as important or tracking changes on consumers' expectations, for example, but relatively less research has been completed in this field. In particular, to the extent of our knowledge, hardly any studies related to brands have yet attempted to use web search traffic information to analyze the factors that influence consumers' purchasing activities. This study aims to demonstrate that consumers' web search traffic information can be used to derive the relations among brands and the relations between an individual brand and product attributes. When consumers input their search words on the web, they may use a single keyword for the search, but they also often input multiple keywords to seek related information (this is referred to as simultaneous searching). A consumer performs a simultaneous search either to simultaneously compare two product brands to obtain information on their similarities and differences, or to acquire more in-depth information about a specific attribute in a specific brand. Web search traffic information shows that the quantity of simultaneous searches using certain keywords increases when the relation is closer in the consumer's mind and it will be possible to derive the relations between each of the keywords by collecting this relational data and subjecting it to network analysis. 
Accordingly, this study proposes a method of analyzing how brands are positioned by consumers and what relationships exist between product attributes and an individual brand, using simultaneous search traffic information. It also presents case studies demonstrating the actual application of this method, with a focus on tablets, belonging to innovative product groups.
https://doi.org/10.13088/jiis.2013.19.3.093

User-Perspective Issue Clustering Using Multi-Layered Two-Mode Network Analysis (다계층 이원 네트워크를 활용한 사용자 관점의 이슈 클러스터링)

Kim, Jieun;Kim, Namgyu;Cho, Yoonho
- Journal of Intelligence and Information Systems, v.20 no.2, pp.93-107, 2014
In this paper, we report what we have observed with regard to user-perspective issue clustering based on multi-layered two-mode network analysis. This work is significant in the context of data collection by companies about customer needs. Most companies have failed to uncover such needs for products or services properly in terms of demographic data such as age, income levels, and purchase history. Because of excessive reliance on limited internal data, most recommendation systems do not provide decision makers with appropriate business information for current business circumstances. However, part of the problem is the increasing regulation of personal data gathering and privacy. This makes demographic or transaction data collection more difficult, and is a significant hurdle for traditional recommendation approaches because these systems demand a great deal of personal data or transaction logs. Our motivation for presenting this paper to academia is our strong belief, and evidence, that most customers' requirements for products can be effectively and efficiently analyzed from unstructured textual data such as Internet news text. In order to derive users' requirements from textual data obtained online, the proposed approach in this paper attempts to construct double two-mode networks, such as a user-news network and news-issue network, and to integrate these into one quasi-network as the input for issue clustering. One of the contributions of this research is the development of a methodology utilizing enormous amounts of unstructured textual data for user-oriented issue clustering by leveraging existing text mining and social network analysis. In order to build multi-layered two-mode networks of news logs, we need some tools such as text mining and topic analysis. We used not only SAS Enterprise Miner 12.1, which provides a text miner module and cluster module for textual data analysis, but also NetMiner 4 for network visualization and analysis. 
Our approach for user-perspective issue clustering is composed of six main phases: crawling, topic analysis, access pattern analysis, network merging, network conversion, and clustering. In the first phase, we collect visit logs for news sites with a crawler. After the unstructured news article data are gathered, the topic analysis phase extracts issues from each news article in order to build an article-issue network. For simplicity, 100 topics are extracted from 13,652 articles. In the third phase, a user-article network is constructed from access patterns derived from web transaction logs. The double two-mode networks are then merged into a user-issue quasi-network. Finally, in the user-oriented issue-clustering phase, we classify issues through structural equivalence and compare the results with the clustering results from statistical tools and network analysis. An experiment with a large dataset was performed to build a multi-layer two-mode network, after which we compared the issue-clustering results from SAS with those of network analysis. The experimental dataset came from a web site ranking site and the biggest portal site in Korea. The sample dataset contains 150 million transaction logs and 13,652 news articles for 5,000 panel members over one year. User-article and article-issue networks are constructed and merged into a user-issue quasi-network using NetMiner. Our issue-clustering results, obtained with the Partitioning Around Medoids (PAM) algorithm and Multidimensional Scaling (MDS), are consistent with the results from SAS clustering. Despite extensive efforts to provide user information through recommendation systems, most projects are successful only when companies have sufficient data about users and transactions. Our proposed methodology, user-perspective issue clustering, can provide practical support for decision-making in companies because it enhances user-related data using unstructured textual data. 
To overcome the problem of insufficient data from traditional approaches, our methodology infers customers' real interests by utilizing web transaction logs. In addition, we suggest topic analysis and issue clustering as a practical means of issue identification.
https://doi.org/10.13088/jiis.2014.20.2.093
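
The network-merging phase described in this abstract combines a user-article network and an article-issue network into a user-issue quasi-network. One simple way to sketch that step, assuming the merge is a weighted composition of the two two-mode networks (the abstract does not specify the exact operation used in NetMiner), is shown below. All users, articles, issues, and weights are hypothetical.

```python
from collections import defaultdict

def merge_two_mode(user_article, article_issue):
    """Compose user->article and article->issue weights into user->issue weights.

    Equivalent to multiplying the two adjacency matrices: each visit to an
    article contributes that article's issue weights to the visiting user.
    """
    user_issue = defaultdict(lambda: defaultdict(float))
    for user, articles in user_article.items():
        for article, visits in articles.items():
            for issue, weight in article_issue.get(article, {}).items():
                user_issue[user][issue] += visits * weight
    return {user: dict(issues) for user, issues in user_issue.items()}

# Hypothetical visit counts (user -> article) and topic weights (article -> issue)
user_article = {"u1": {"a1": 2, "a2": 1}, "u2": {"a2": 3}}
article_issue = {
    "a1": {"economy": 1.0},
    "a2": {"economy": 0.5, "sports": 0.5},
}
user_issue = merge_two_mode(user_article, article_issue)
print(user_issue)
```

The resulting user-issue weights could then serve as input to a clustering step such as PAM, with each issue represented by its vector of user weights.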

Mapping Categories of Heterogeneous Sources Using Text Analytics (텍스트 분석을 통한 이종 매체 카테고리 다중 매핑 방법론)

Kim, Dasom;Kim, Namgyu
- Journal of Intelligence and Information Systems, v.22 no.4, pp.193-215, 2016
In recent years, the proliferation of diverse social networking services has led users to use many mediums simultaneously depending on their individual purpose and taste. Besides, while collecting information about particular themes, they usually employ various mediums such as social networking services, Internet news, and blogs. However, in terms of management, each document circulated through diverse mediums is placed in different categories on the basis of each source's policy and standards, hindering any attempt to conduct research on a specific category across different kinds of sources. For example, documents containing content on "Application for a foreign travel" can be classified into "Information Technology," "Travel," or "Life and Culture" according to the peculiar standard of each source. Likewise, with different viewpoints of definition and levels of specification for each source, similar categories can be named and structured differently in accordance with each source. To overcome these limitations, this study proposes a plan for conducting category mapping between different sources with various mediums while maintaining the existing category system of the medium as it is. Specifically, by re-classifying individual documents from the viewpoint of diverse sources and storing the result of such a classification as extra attributes, this study proposes a logical layer by which users can search for a specific document from multiple heterogeneous sources with different category names as if they belong to the same source. Besides, by collecting 6,000 articles of news from two Internet news portals, experiments were conducted to compare accuracy among sources, supervised learning and semi-supervised learning, and homogeneous and heterogeneous learning data. 
It is particularly interesting that in some categories, classifying accuracy of semi-supervised learning using heterogeneous learning data proved to be higher than that of supervised learning and semi-supervised learning, which used homogeneous learning data. This study has the following significances. First, it proposes a logical plan for establishing a system to integrate and manage all the heterogeneous mediums in different classifying systems while maintaining the existing physical classifying system as it is. This study's results particularly exhibit very different classifying accuracies in accordance with the heterogeneity of learning data; this is expected to spur further studies for enhancing the performance of the proposed methodology through the analysis of characteristics by category. In addition, with an increasing demand for search, collection, and analysis of documents from diverse mediums, the scope of the Internet search is not restricted to one medium. However, since each medium has a different categorical structure and name, it is actually very difficult to search for a specific category insofar as encompassing heterogeneous mediums. The proposed methodology is also significant for presenting a plan that enquires into all the documents regarding the standards of the relevant sites' categorical classification when the users select the desired site, while maintaining the existing site's characteristics and structure as it is. This study's proposed methodology needs to be further complemented in the following aspects. First, though only an indirect comparison and evaluation was made on the performance of this proposed methodology, future studies would need to conduct more direct tests on its accuracy. That is, after re-classifying documents of the object source on the basis of the categorical system of the existing source, the extent to which the classification was accurate needs to be verified through evaluation by actual users. 
In addition, classification accuracy needs to be increased by making the methodology more sophisticated. Furthermore, understanding the characteristics of the categories in which heterogeneous semi-supervised learning achieved higher classification accuracy than supervised learning may help in obtaining heterogeneous documents from diverse mediums and in devising ways to use them to improve document classification accuracy.
https://doi.org/10.13088/jiis.2016.22.4.193

Implementation of integrated monitoring system for trace and path prediction of infectious disease (전염병의 경로 추적 및 예측을 위한 통합 정보 시스템 구현)

Kim, Eungyeong;Lee, Seok;Byun, Young Tae;Lee, Hyuk-Jae;Lee, Taikjin
- Journal of Internet Computing and Services, v.14 no.5, pp.69-76, 2013
The incidence of globally spreading infectious and pathogenic diseases such as H1N1 (swine flu) and Avian Influenza (AI) has recently increased. An infectious disease is a pathogen-caused disease that can be passed from an infected person to a susceptible host. The pathogens of infectious diseases, which include bacilli, spirochaetes, rickettsiae, viruses, fungi, and parasites, cause various symptoms such as respiratory disease, gastrointestinal disease, liver disease, and acute febrile illness. They can spread through various means such as food, water, insects, breathing, and contact with other persons. Most countries around the world now use mathematical models to predict and prepare for the spread of infectious diseases. In modern society, however, infectious diseases spread in a fast and complicated manner because of the rapid development of transportation (both ground and underground), which leaves too little time to predict their spread. A new system that can help prevent the spread of infectious diseases by predicting their pathways therefore needs to be developed. To address this problem, this study develops an integrated monitoring system that can track and predict the pathway of infectious diseases for real-time monitoring and control. The system is implemented on the basis of the conventional mathematical model known as the Susceptible-Infectious-Recovered (SIR) model. The proposed model considers both inter- and intra-city modes of transportation, including bus, train, car, and airplane, to express interpersonal contact (i.e., migration flow). In addition, real data modified according to the geographical characteristics of Korea are employed to reflect realistic circumstances of possible disease spread in Korea. By controlling the parameters of this model, we can predict where and when vaccination needs to be performed. 
The simulation includes several assumptions and scenarios. Using data from Statistics Korea, five major cities assumed to have the most population migration were chosen: Seoul, Incheon (Incheon International Airport), Gangneung, Pyeongchang, and Wonju. It was assumed that the cities were connected in one network and that infectious disease spread only through the designated transportation methods. Daily traffic volume was obtained from the Korean Statistical Information Service (KOSIS), and the population of each city was acquired from Statistics Korea. Data on H1N1 (swine flu) were provided by the Korea Centers for Disease Control and Prevention, and air transport statistics were obtained from the Aeronautical Information Portal System. These daily traffic volume, population, H1N1 (swine flu), and air transport data were adjusted in consideration of current conditions in Korea and several realistic assumptions and scenarios. Three scenarios (an H1N1 outbreak at Incheon International Airport with no vaccination in any city, with vaccination in Seoul, and with vaccination in Pyeongchang) were simulated, and the number of days taken for the number of infected to reach its peak and the proportion of Infectious (I) were compared. According to the simulation, when vaccination was not considered, the peak was reached fastest in Seoul, at 37 days, and slowest in Pyeongchang, at 43 days. In terms of the proportion of I, Seoul was the highest while Pyeongchang was the lowest. With vaccination in Seoul, the peak was again reached fastest in Seoul, at 37 days, and slowest in Pyeongchang, at 43 days. In terms of the proportion of I, Gangneung was the highest while Pyeongchang was the lowest. 
With vaccination in Pyeongchang, the peak was reached fastest in Seoul, at 37 days, and slowest in Pyeongchang, at 43 days. In terms of the proportion of I, Gangneung was the highest while Pyeongchang was the lowest. Based on these results, it was confirmed that H1N1, upon its first occurrence, spreads in proportion to the traffic volume in each city. Because the infection pathway differs according to the traffic volume in each city, preventive measures against infectious disease can be devised by tracking and predicting its pathway through the analysis of traffic volume.
https://doi.org/10.7472/jksii.2013.14.5.69
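
The abstract above describes an SIR model extended with migration flows between cities, but does not give its equations or parameter values. The following is a minimal sketch of that general idea, assuming a daily Euler step and a fixed fraction of each compartment travelling between cities. The two cities, the rates `beta` and `gamma`, and the `travel` matrix are all hypothetical and are not the values or the exact formulation used in the paper.

```python
def simulate_sir(populations, infected0, beta, gamma, travel, days):
    """Daily-step SIR per city, followed by travel between cities.

    travel[i][j] is the assumed fraction of city i's residents who move to
    city j each day (travel[i][i] must be 0). Returns one (S, I, R) snapshot
    per simulated day.
    """
    n = len(populations)
    S = [populations[i] - infected0[i] for i in range(n)]
    I = list(infected0)
    R = [0.0] * n
    history = []
    for _ in range(days):
        # Local infection and recovery within each city
        for i in range(n):
            N = S[i] + I[i] + R[i]
            new_inf = beta * S[i] * I[i] / N
            new_rec = gamma * I[i]
            S[i] -= new_inf
            I[i] += new_inf - new_rec
            R[i] += new_rec
        # Travel: move the same fraction of every compartment between cities
        for comp in (S, I, R):
            out = [[travel[i][j] * comp[i] for j in range(n)] for i in range(n)]
            for i in range(n):
                comp[i] += sum(out[j][i] for j in range(n)) - sum(out[i])
        history.append((list(S), list(I), list(R)))
    return history

def peak_day(history, city):
    """Index of the day on which the infectious count in `city` is largest."""
    return max(range(len(history)), key=lambda d: history[d][1][city])

# Hypothetical two-city setup: the outbreak starts in city 0 only
history = simulate_sir(
    populations=[1000.0, 500.0],
    infected0=[1.0, 0.0],
    beta=0.3,
    gamma=0.1,
    travel=[[0.0, 0.01], [0.02, 0.0]],
    days=200,
)
print(peak_day(history, 0), peak_day(history, 1))
```

In a sketch like this, vaccination in a city could be represented by moving part of that city's susceptible population to the recovered compartment before the simulation starts, which is one way to compare peak timing across scenarios.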

Validation of Surface Reflectance Product of KOMPSAT-3A Image Data: Application of RadCalNet Baotou (BTCN) Data (다목적실용위성 3A 영상 자료의 지표 반사도 성과 검증: RadCalNet Baotou(BTCN) 자료 적용 사례)

Kim, Kwangseob;Lee, Kiwon
- Korean Journal of Remote Sensing, v.36 no.6_2, pp.1509-1521, 2020
Experiments to validate the surface reflectance produced from Korea Multi-Purpose Satellite (KOMPSAT-3A) imagery were conducted using data from the Chinese Baotou (BTCN) site, one of the four sites of the Radiometric Calibration Network (RadCalNet), a portal that provides spectrophotometric reflectance measurements. The atmosphere reflectance and surface reflectance products were generated using an extension program of the open-source Orfeo ToolBox (OTB), which was redesigned and implemented to extract those reflectance products in batches. Three image data sets from 2016, 2017, and 2018 were processed with two versions of the sensor model variables, ver. 1.4 released in 2017 and ver. 1.5 released in 2019, such as the gain and offset applied to the absolute atmospheric correction. The results showed that the reflectance products from ver. 1.4 matched the RadCalNet BTCN data relatively well compared with those from ver. 1.5. In addition, reflectance products obtained from Landsat-8 by the USGS LaSRC algorithm and from Sentinel-2B images using the SNAP Sen2Cor program were used to quantitatively verify the differences from those of KOMPSAT-3A. Relative to the RadCalNet BTCN data, the differences in the surface reflectance of the KOMPSAT-3A images were small, indicating high consistency: -0.031 to 0.034 for the B band, -0.001 to 0.055 for the G band, -0.072 to 0.037 for the R band, and -0.060 to 0.022 for the NIR band. Compared with that of the Landsat-8 and Sentinel-2B images, the surface reflectance of KOMPSAT-3A also showed a level of accuracy suitable for further applications. The results of this study are meaningful in confirming the applicability of Analysis Ready Data (ARD) to surface reflectance from high-resolution satellites.
https://doi.org/10.7780/kjrs.2020.36.6.2.3

A Comparative Study of Domestic Travel Patterns and Determinant Factors Affecting Satisfaction by Generations (대한민국 국민의 세대별 국내여행 방식 및 만족도 영향요인)

Mi-Sook Lee;Yoon-Joo Park
- Information Systems Review, v.22 no.2, pp.137-166, 2020
While South Koreans' overseas travel rate has increased every year, the domestic travel rate has been at a standstill for several years. The purpose of this study is to analyze the domestic travel styles of Koreans by generation in order to provide generation-specific travel services. For this purpose, we categorized the survey respondents into four generations, Millennials (ages 19~34), Generation X (35~54), Baby Boomers (55~64), and Seniors, following the criteria of the Korea National Tourism Organization. We then analyzed factors related to the travel preparation process, the actual travel activities, and satisfaction after the trip. The study uses 16,713 records collected by the Ministry of Culture, Sports and Tourism. The results show that Korean people tend to acquire domestic travel information from their own or their acquaintances' past experiences. They also do not prefer organized trips for domestic travel and thus rarely buy package products. In addition, natural scenery, rich cultural heritage, and convenient accommodation are the most important determinants of overall travel satisfaction for all generations. The travel characteristics of each generation are as follows. Millennials often get travel information from the internet, referring in many cases to portal sites and social network services (SNS). They also tend to travel to popular destinations in the summer peak season and pursue active travel experiences. Generation X has travel patterns similar to those of Millennials, but their major mode of transportation is their own car. Transportation convenience and satisfactory leisure activities are also important factors affecting the overall satisfaction level of Generation X. 
Baby Boomers, on the other hand, place greater emphasis on appreciating nature, visiting famous restaurants, and relaxing than on actively participating in experience programs. They travel evenly across the summer and spring/fall seasons to many different areas instead of focusing on popular tourist spots. In addition, shopping and eating delicious food are important factors affecting their overall satisfaction level. Lastly, Seniors resemble Baby Boomers in many ways, but they take many same-day trips using public transportation or car rental services. They prefer spring and autumn trips to the summer peak season and tend to buy more packaged travel products than other generations. If these generation-specific travel characteristics are considered when organizing and customizing tourism services, the satisfaction level of domestic tourism is expected to increase.
https://doi.org/10.14329/isr.2020.22.2.137

Text Mining-Based Emerging Trend Analysis for e-Learning Contents Targeting for CEO (텍스트마이닝을 통한 최고경영자 대상 이러닝 콘텐츠 트렌드 분석)

Kyung-Hoon Kim;Myungsin Chae;Byungtae Lee
- Information Systems Review, v.19 no.2, pp.1-19, 2017
Original scripts of e-learning lectures for the CEOs of corporation S were analyzed using topic analysis, which is a text mining method. Twenty-two topics were extracted based on keywords chosen from five years of records, from 2011 to 2015. Research analysis was then conducted on various issues. Promising topics were selected through evaluation and element analysis of the members of each topic. In management and economics, members demonstrated high satisfaction and interest in topics on marketing strategy, human resource management, and communication. In the humanities, philosophy, history of war, and history drew high interest and satisfaction, whereas in the lifestyle field, mental health drew high interest and satisfaction. Studies were also conducted to identify topics on the proportion of content, but these studies failed to increase member satisfaction. In the field of IT, educational content responds sensitively to changes of the times, but it may not increase the interest and satisfaction of members. The present study found that content production for CEOs should draw out deep implications for value innovation through technology application instead of stopping at the delivery of technical information. Previous studies classified content superficially, based on the name of the content program, when analyzing the status of content operation. Text mining, however, can derive deep content and subject classifications from the unstructured data of the scripts themselves. This approach can reveal current shortages and necessary fields if the service content of each theme is displayed by year. This study was based on data obtained from influential e-learning companies in Korea. Obtaining practical results was difficult because data were not acquired from portal sites or social networking services. The content of e-learning trends for CEOs was analyzed. 
Data analysis was also conducted on the intellectual interests of CEOs in each field.
https://doi.org/10.14329/isr.2017.19.2.001

The Analysis of the Current Status of Medical Accidents and Disputes Researched in the Korean Web Sites (인터넷 사이트를 통해 살펴본 의료사고 및 의료분쟁의 현황에 관한 분석)

Cha, Yu-Rim;Kwon, Jeong-Seung;Choi, Jong-Hoon;Kim, Chong-Youl
- Journal of Oral Medicine and Pain, v.31 no.4, pp.297-316, 2006
The rising number of medical disputes is a remarkable social phenomenon. In particular, we must not overlook the fact that the production and circulation of information related to medical accidents is increasing rapidly through the internet. In this research, we evaluated the web sites that provided information related to medical accidents, found using the keyword "medical accidents" in March 2006, and classified the 28 web sites according to the type of establisher. We also analyzed the contents of the sites, and checked and compared their current status and the problems that need to be improved. Finally, we suggested possible solutions to prevent medical accidents. The detailed results are listed below.
1. Medical practitioners, the general public, and lawyers were all familiar with, and mainly preferred, the term "medical accidents".
2. Among the sites found with the keyword "medical accidents", lawyers had the most sites and medical practitioners had the fewest.
3. Many sites run by the general public and lawyers had their own medical record analysts, but there were few professional analysts for dentistry.
4. The general public was more interested in the prevention of medical accidents, whereas lawyers were more interested in the process after medical accidents. The sites run by medical practitioners dealt the least with remedies for medical accidents, compared with other sites.
5. The general public wanted third-party intervention in disputes, such as by the government, including a medical dispute arbitration law and/or the establishment of an independent medical dispute judgment institution.
6. In the comparison among the establishers of web sites, medical practitioners dealt with the fewest examples of medical accidents.
7. Cases presented in counseling articles related to dental accidents were treated as less important than they are in reality.
8. Whereas many articles about domestic cases concerned invasive (bloody) dental treatment, a large number of the open counseling articles concerned dental treatment not covered by insurance.
9. In comparing the information on medical accidents offered by each type of establisher, the general public tended to offer vocabulary, lawyers related laws, and medical practitioners medical knowledge.
10. All of them cited news reported by the media to present the current status of domestic medical accidents. Among the web sites run by the general public, NGOs in particular provided plentiful statistical data related to medical accidents.
11. Only two web sites collected medical accident cases.
As a result of our research, we found that, in the flood of information, medical disputes can be caused by wrong information from third parties, and that medical practitioners have the most passive attitude toward medical accidents. It is therefore crucial for lawyers, patients, and medical practitioners to exchange information with one another, so that accidents and disputes can be resolved more positively and actively on the basis of clear mutual understanding.

Influence analysis of Internet buzz to corporate performance : Individual stock price prediction using sentiment analysis of online news (온라인 언급이 기업 성과에 미치는 영향 분석 : 뉴스 감성분석을 통한 기업별 주가 예측)

Jeong, Ji Seon;Kim, Dong Sung;Kim, Jong Woo
- Journal of Intelligence and Information Systems, v.21 no.4, pp.37-51, 2015
Due to the development of internet technology and the rapid increase of internet data, various studies are actively being conducted on how to use and analyze internet data for various purposes. In recent years in particular, a number of studies have applied text mining techniques in order to overcome the limitations of relying on structured data. Among them are various studies on sentiment analysis, which scores opinions based on the distribution of polarity, such as the positivity or negativity of the vocabulary or sentences in documents. As part of this line of research, this study tries to predict the ups and downs of companies' stock prices by performing sentiment analysis on news content about particular companies on the Internet. A variety of news on companies is produced online by different economic agents, and it is diffused quickly and accessed easily on the Internet. Based on the inefficient market hypothesis, we can therefore expect that news information about an individual company can be used to predict fluctuations in that company's stock price if proper data analysis techniques are applied. However, because the areas of corporate management activity differ, a machine-learning-based analysis of text data needs to consider the characteristics of each company. In addition, since news containing positive or negative information about certain companies has various impacts on other companies or industry fields, a separate analysis is necessary to predict the stock price of each company. This study therefore attempted to predict changes in the stock prices of individual companies by applying sentiment analysis to online news data. Accordingly, this study chose top companies in the KOSPI 200 as the subjects of the analysis, and collected and analyzed the online news data produced for each company over two years on a representative domestic search portal service, Naver. 
In addition, considering that the meaning of vocabulary differs for each economic subject, the study aims to improve performance by building a lexicon for each individual company and applying it to the analysis. The results show that prediction accuracy differs by company, and that the prediction accuracy was 56% on average. Comparing prediction accuracy across industry sectors, 'energy/chemical', 'consumer goods for living', and 'consumer discretionary' showed relatively higher stock price prediction accuracy than other industries, while sectors such as 'information technology' and 'shipbuilding/transportation' had lower prediction accuracy. Only five representative companies were collected for each industry, so it is somewhat difficult to generalize, but it could be confirmed that stock price prediction accuracy differs by industry sector. At the individual company level, companies such as 'Kangwon Land', 'KT & G', and 'SK Innovation' showed relatively higher prediction accuracy than other companies, while companies such as 'Young Poong', 'LG', 'Samsung Life Insurance', and 'Doosan' had a low prediction accuracy of less than 50%. In this paper, we predicted the stock price movements of individual companies from online news information, using a sentiment lexicon built in advance for each company. By predicting stock prices company by company, we aimed to improve the performance of news-based stock price prediction. Based on this, it will be possible in the future to find ways to increase stock price prediction accuracy by addressing the problem of unnecessary words being added to the sentiment dictionary.
https://doi.org/10.13088/jiis.2015.21.4.037
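
The abstract above describes predicting stock price direction from news using a sentiment lexicon built per company, and reports accuracy as the share of correct predictions. A minimal sketch of that kind of lexicon-based scoring follows. The lexicon entries, the tokenized news, the observed movements, and the rule "positive total score means up" are all hypothetical illustrations, not the paper's lexicon or decision rule.

```python
def sentiment_score(tokens, lexicon):
    """Sum the polarity of every token found in the company-specific lexicon."""
    return sum(lexicon.get(token, 0) for token in tokens)

def predict_direction(tokens, lexicon):
    """Predict 'up' when the net sentiment is positive, otherwise 'down'."""
    return "up" if sentiment_score(tokens, lexicon) > 0 else "down"

def accuracy(predicted, actual):
    """Fraction of days on which the predicted direction matched the actual one."""
    hits = sum(1 for p, a in zip(predicted, actual) if p == a)
    return hits / len(actual)

# Hypothetical lexicon for one company: +1 positive terms, -1 negative terms
lexicon = {"surge": 1, "record": 1, "profit": 1,
           "lawsuit": -1, "recall": -1, "loss": -1}

# Hypothetical tokenized news, one list per trading day
daily_news = [
    ["record", "profit", "announced"],   # net score +2
    ["product", "recall", "lawsuit"],    # net score -2
    ["profit", "loss", "flat"],          # net score 0, treated as 'down'
]
actual = ["up", "down", "up"]            # hypothetical observed movements

predicted = [predict_direction(tokens, lexicon) for tokens in daily_news]
acc = accuracy(predicted, actual)
print(predicted, round(acc, 3))
```

Keeping one lexicon per company, as in this sketch, lets the same word carry different polarity for different firms, which is the motivation the abstract gives for company-level lexicons.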
