• Title/Summary/Keyword: enterprise performance system (기업성과시스템)

A Comparative Study of Information Delivery Method in Networks According to Off-line Communication (오프라인 커뮤니케이션 유무에 따른 네트워크 별 정보전달 방법 비교 분석)

  • Park, Won-Kuk;Choi, Chan;Moon, Hyun-Sil;Choi, Il-Young;Kim, Jae-Kyeong
    • Journal of Intelligence and Information Systems / v.17 no.4 / pp.131-142 / 2011
  • Social Network Services are web-based services that allow an individual to construct a public or semi-public profile within a bounded system, articulate a list of other users with whom they share connections, and traverse their list of connections. Facebook and Twitter are representative examples and have attracted worldwide attention. Many people use Social Network Services to build and maintain social relationships, and the number of users has recently increased dramatically. Accordingly, many organizations have become interested in Social Network Services as a means of marketing, media, and communication with their customers, because these services offer a variety of benefits to organizations such as companies and associations. Organizations can respond rapidly to varied user behavior because Social Network Services make communication between users easier and faster. Marketing through a Social Network Service also costs less than existing tools such as broadcasts, newspapers, and direct mail. In addition, Social Network Services are a growing market, so organizations can acquire potential future customers. However, organizations communicate with users through Social Network Services in a uniform way, without considering the characteristics of each network, even though networks differ in their effect on information delivery. For example, member cohesion is higher in a network based on offline communication than in one based on online communication because its members are close to one another; that is, the offline communication network has strong ties. Accordingly, information delivery is fast in the offline communication network. 
In this study, we compose two Twitter networks with different communication characteristics. The first network is built from data based on offline relationships such as friends, family, and seniors and juniors at school. The second is built from randomly selected users who want to make friends online. Each network consists of 250 people divided into three groups. The first group is the ego, the person at the center of the network. The second group is the ego's followers. The third group consists of the followers of the ego's followers. We compare the networks through social network analysis and follower reaction analysis. We measure density and centrality to characterize each network, and we analyze follower reactions such as replies and retweets to find differences in information delivery between the networks. The experimental results indicate that the density and centrality of the offline communication-based network are higher than those of the online-based network. In the offline communication-based network, replies outnumber retweets; in the online-based network, retweets outnumber replies. The experiments thus show that the effect of information delivery in the offline communication-based network differs from that in the online communication-based network. Organizations that want to use social networks as an effective marketing tool should therefore configure the appropriate network type in light of each network's characteristics.
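
The abstract does not state how density and centrality were computed. As a minimal sketch, assuming a directed follower graph and normalized in-degree as the centrality measure (both are assumptions, as are the toy node names and edges), the two quantities could be computed like this:

```python
from collections import defaultdict


def density(edges, nodes):
    """Directed density: distinct observed edges / possible ordered pairs."""
    n = len(nodes)
    if n < 2:
        return 0.0
    return len(set(edges)) / (n * (n - 1))


def in_degree_centrality(edges, nodes):
    """Normalized in-degree: share of the other nodes that follow each node."""
    n = len(nodes)
    counts = defaultdict(int)
    for _follower, followee in set(edges):
        counts[followee] += 1
    return {v: counts[v] / (n - 1) for v in nodes}


# Hypothetical toy network of (follower, followee) pairs, not data from the study.
nodes = ["ego", "a", "b", "c"]
edges = [("a", "ego"), ("b", "ego"), ("c", "a"), ("a", "b")]

net_density = density(edges, nodes)
centrality = in_degree_centrality(edges, nodes)
```

Comparing these values across an offline-based and an online-based network of equal size is the kind of comparison the abstract describes; other centrality measures (betweenness, closeness) would need a graph library.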

A Study on the Influence of Information Security on Consumer's Preference of Android and iOS based Smartphone (정보보안이 안드로이드와 iOS 기반 스마트폰 소비자 선호에 미치는 영향)

  • Park, Jong-jin;Choi, Min-kyong;Ahn, Jong-chang
    • Journal of Internet Computing and Services / v.18 no.1 / pp.105-119 / 2017
  • Smartphone users account for over eighty-five percent of the Korean population, and personal items and various information are stored on each user's smartphone. There are many cases of malicious code or spyware being spread to capture this information illegally for pecuniary gain. The need for information security in smartphone use is therefore prominent, and users' perception of security is also important. In this paper, we investigate how information security affects users' choice of smartphone operating system. For statistical analysis, we conducted an online questionnaire survey of smartphone users and collected 218 valid responses. We test hypotheses through communalities analysis using factor analysis, reliability analysis, independent-sample t-tests, and linear regression analysis in the IBM SPSS statistical package. The results show that hardware environment influences perceived ease of use. Brand power affects both perceived usefulness and perceived ease of use, and the degree of personal risk acceptance influences the perception of smartphone spyware risk. In addition, perceived usefulness, perceived ease of use, degree of personal risk acceptance, and smartphone spyware risk significantly influence the intention to purchase a smartphone. However, independent-sample t-tests comparing Android and iOS users do not show statistically significant differences between the two OS user groups. In addition, the hypothesis-test results for each OS user group differ from those for the total sample. These results can offer important suggestions to organizations and managers in the smartphone ecosystem and contribute to information systems (IS) research through a new perspective.

Die Problematik auf gesetzliche Terminologie und gewerbliche Nutzung von Drohne (드론의 현행 법적 정의와 상업적 운용에 따른 문제점)

  • Kim, Sung-Mi
    • The Korean Journal of Air & Space Law and Policy / v.33 no.1 / pp.3-43 / 2018
  • Around the world, unmanned aerial devices (so-called drones) are advancing rapidly and finding applications in many fields. Remote-controlled drones were originally developed primarily for military purposes, but their civilian use is now growing steadily in both the leisure and service sectors (parcel drones, drone taxis). With increased drone use, however, the associated risks and challenges also rise. This raises the question of whether current aviation law has kept pace. Two key issues stand out. First, passenger transport by unmanned aircraft (more than 150 kg) cannot be accommodated under current aviation law, because the Korean Aviation Safety Act (Luftsicherheitsgesetz) and its enforcement decree define unmanned aircraft and unmanned aerial devices as those flown "with no person on board and under remote control". Under this statutory definition, a drone may therefore be flown only "without a person", meaning without a pilot and without passengers. Second, the Commercial Act, which governs the bases of claims and the attribution rules for commercial air transport, does not apply to unmanned aerial devices (less than 150 kg). Unmanned aircraft delivery services entail not only the risk of damage to cargo but also the risk of ground damage involving third parties. Under § 896 of the Commercial Act, however, application to unmanned aerial devices (less than 150 kg) is limited, because such devices fall under ultralight aerial devices, which the Commercial Act excludes. 
Technical progress and the commercial applications it enables will stimulate demand for unmanned aerial devices. Implementation of the relevant regulations should actively accompany this development and be drafted and communicated early, so that manufacturers and users have planning certainty at an early stage.

The Prediction of Purchase Amount of Customers Using Support Vector Regression with Separated Learning Method (Support Vector Regression에서 분리학습을 이용한 고객의 구매액 예측모형)

  • Hong, Tae-Ho;Kim, Eun-Mi
    • Journal of Intelligence and Information Systems / v.16 no.4 / pp.213-225 / 2010
  • With the rapid growth of information technology, data mining has empowered managers to present personalized and differentiated marketing programs to their customers. Most studies on customer response have focused on predicting whether customers would respond to a marketing promotion, since marketing managers have been eager to identify likely respondents. Many data mining studies have addressed binary decision problems such as bankruptcy prediction, network intrusion detection, and credit card fraud detection. Customer response prediction has been studied with similar methods because it is also a dichotomous decision problem. A number of competitive data mining techniques, such as neural networks, SVM (support vector machine), decision trees, logit, and genetic algorithms, have been applied to predicting customer response to marketing promotions. Marketing managers have also tried to classify their customers with quantitative measures such as recency, frequency, and monetary value acquired from their transaction database. These measures capture how recently customers purchased, how frequently they purchased in a period, and how much they spent per purchase. Using segmented customers, we propose an approach that differentiates customers within the same rating. Our approach employs support vector regression to forecast the purchase amount of customers in each customer rating. Our study used a sample of 41,924 customers extracted from the DMEF04 Data Set who had purchased at least once in the last two years. We classified customers from the first rating to the fifth rating based on their purchase amount after a marketing promotion. 
Here, the first rating comprises customers with a large purchase amount, and the fifth rating comprises non-respondents to the promotion. Our proposed model forecasts the purchase amount of customers in the same rating, so marketing managers can design a differentiated and personalized marketing program for each customer even when customers belong to the same rating. In addition, we propose a more efficient learning method that separates the learning samples. We employed two learning methods to compare the performance of the proposed method with a general learning method for SVRs. LMW (Learning Method using Whole data for purchasing customers) is a general learning method for forecasting the purchase amount of customers. The proposed method, LMS (Learning Method using Separated data for classification purchasing customers), builds four different SVR models, one for each class of customers. To evaluate the models, we calculated MAE (Mean Absolute Error) and MAPE (Mean Absolute Percent Error) for each model's predictions of customer purchase amounts. In LMW, the overall performance was 0.670 MAPE and the best performance was 0.327 MAPE. In LMS, the best performance was 0.275 MAPE. The proposed LMS model generally outperformed the LMW model, and it did so in each class of customers. Our approach will be useful for marketing managers when they need to target customers for a promotion. Even if customers belong to the same class, marketing managers can offer them a differentiated and personalized marketing promotion.
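
The two error measures named in the abstract can be sketched as follows. This is an illustration, not the authors' code: the purchase amounts are made up, and MAPE is returned as a fraction on the assumption that this matches the scale of the reported values (for example 0.275).

```python
def mae(actual, predicted):
    """Mean Absolute Error: average of |actual - predicted|."""
    return sum(abs(a - p) for a, p in zip(actual, predicted)) / len(actual)


def mape(actual, predicted):
    """Mean Absolute Percent Error as a fraction.

    Every actual value must be non-zero, otherwise the ratio is undefined.
    """
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


# Hypothetical purchase amounts for three customers in one rating class.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 180.0, 400.0]

mae_value = mae(actual, predicted)
mape_value = mape(actual, predicted)
```

Because MAPE divides by the actual amount, customers with a zero purchase amount cannot be scored with it; how the study handled such cases is not stated in the abstract.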

A Study on the Impact Factors of Contents Diffusion in Youtube using Integrated Content Network Analysis (일반영향요인과 댓글기반 콘텐츠 네트워크 분석을 통합한 유튜브(Youtube)상의 콘텐츠 확산 영향요인 연구)

  • Park, Byung Eun;Lim, Gyoo Gun
    • Journal of Intelligence and Information Systems / v.21 no.3 / pp.19-36 / 2015
  • Social media is an emerging issue in content services and in the current business environment. YouTube is the most representative social media service in the world. It differs from conventional content services in its open user participation and content creation methods. To promote content on YouTube, it is important to understand how content diffuses and the structural characteristics of the network. Most previous studies analyzed the factors affecting content diffusion from the viewpoint of general behavioral factors, and some recent studies use network structure factors, but the two approaches have been used separately. This study analyzes the general factors affecting view count together with content-based network structure. In addition, rather than using a network of individual users, this study builds the content-based network by analyzing user comments on 22,370 YouTube contents. We statistically re-confirmed the causal relations between view count and both general factors and network factors. Moreover, by analyzing this integrated research model, we found that the factors affect YouTube view count in the following order: Uploader Followers, Video Age, Betweenness Centrality, Comments, Closeness Centrality, Clustering Coefficient, and Rating. Degree Centrality and Eigenvector Centrality, however, affect view count negatively. The study suggests the following strategic points for content diffusion. First, general factors need to be managed, such as the number of uploader followers or subscribers, the video age, the number of comments, and average rating points. The impact of average rating points is less important than previously thought. However, it is necessary to increase the number of uploader followers strategically and to keep content in the service as long as possible. 
Second, we need to pay attention to the impact of betweenness centrality and closeness centrality among the network factors. Users seem to search for related subjects or similar content after watching a content item, so the distance to other popular content in the service should be shortened. That is, view counts benefit from decreasing the number of search attempts and increasing similarity with many other contents. This is consistent with the result of the clustering coefficient impact analysis. Third, it is important to note the negative impact of degree centrality and eigenvector centrality on view count. Too many connections with other contents mean that there are many similar contents, which may spread the view counts across them. Moreover, very high eigenvector centrality means that the content is connected to popular contents, and it may lose views to them. It would be better to avoid connections with overly popular contents. In this study we analyzed the phenomenon and verified the diffusion factors of YouTube content using an integrated model consisting of general factors and network structure factors. In terms of social contribution, this study may provide useful information to the music and movie industries and other content vendors for effective content services. It provides basic schemes that can be applied strategically in online content marketing. One limitation is that the study formed a content-based network for the network structure analysis, which may be an indirect way to observe the content network structure. More varied methods could be used to establish a direct content network. 
Further research includes more detailed analyses by content type, domain, or the characteristics of contents or users.

The prediction of the stock price movement after IPO using machine learning and text analysis based on TF-IDF (증권신고서의 TF-IDF 텍스트 분석과 기계학습을 이용한 공모주의 상장 이후 주가 등락 예측)

  • Yang, Suyeon;Lee, Chaerok;Won, Jonggwan;Hong, Taeho
    • Journal of Intelligence and Information Systems / v.28 no.2 / pp.237-262 / 2022
  • There has been a growing interest in IPOs (Initial Public Offerings) due to the profitable returns that IPO stocks can offer to investors. However, IPOs can be speculative investments that may involve substantial risk as well because shares tend to be volatile, and the supply of IPO shares is often highly limited. Therefore, it is crucially important that IPO investors are well informed of the issuing firms and the market before deciding whether to invest or not. Unlike institutional investors, individual investors are at a disadvantage since there are few opportunities for individuals to obtain information on the IPOs. In this regard, the purpose of this study is to provide individual investors with the information they may consider when making an IPO investment decision. This study presents a model that uses machine learning and text analysis to predict whether an IPO stock price would move up or down after the first 5 trading days. Our sample includes 691 Korean IPOs from June 2009 to December 2020. The input variables for the prediction are three tone variables created from IPO prospectuses and quantitative variables that are either firm-specific, issue-specific, or market-specific. The three prospectus tone variables indicate the percentage of positive, neutral, and negative sentences in a prospectus, respectively. We considered only the sentences in the Risk Factors section of a prospectus for the tone analysis in this study. All sentences were classified into 'positive', 'neutral', and 'negative' via text analysis using TF-IDF (Term Frequency - Inverse Document Frequency). Measuring the tone of each sentence was conducted by machine learning instead of a lexicon-based approach due to the lack of sentiment dictionaries suitable for Korean text analysis in the context of finance. 
For this reason, the training set was created by randomly selecting 10% of the sentences from each prospectus, and the sentence classification task on the training set was performed after reading each sentence in person. Then, based on the training set, a Support Vector Machine model was utilized to predict the tone of sentences in the test set. Finally, the machine learning model calculated the percentages of positive, neutral, and negative sentences in each prospectus. To predict the price movement of an IPO stock, four different machine learning techniques were applied: Logistic Regression, Random Forest, Support Vector Machine, and Artificial Neural Network. According to the results, models that use quantitative variables using technical analysis and prospectus tone variables together show higher accuracy than models that use only quantitative variables. More specifically, the prediction accuracy was improved by 1.45% points in the Random Forest model, 4.34% points in the Artificial Neural Network model, and 5.07% points in the Support Vector Machine model. After testing the performance of these machine learning techniques, the Artificial Neural Network model using both quantitative variables and prospectus tone variables was the model with the highest prediction accuracy rate, which was 61.59%. The results indicate that the tone of a prospectus is a significant factor in predicting the price movement of an IPO stock. In addition, the McNemar test was used to verify the statistically significant difference between the models. The model using only quantitative variables and the model using both the quantitative variables and the prospectus tone variables were compared, and it was confirmed that the predictive performance improved significantly at a 1% significance level.
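
The abstract names the McNemar test but not the variant used. As a minimal sketch, assuming the exact (binomial) form and hypothetical discordant counts, the comparison of two classifiers on the same test cases could look like this:

```python
from math import comb


def mcnemar_exact(b, c):
    """Two-sided exact McNemar p-value.

    b: test cases model A classified correctly and model B incorrectly.
    c: test cases model B classified correctly and model A incorrectly.
    Cases where both models agree do not enter the test.
    """
    n = b + c
    if n == 0:
        return 1.0
    k = min(b, c)
    # Probability of k or fewer successes under Binomial(n, 0.5), doubled.
    tail = sum(comb(n, i) for i in range(k + 1)) / 2 ** n
    return min(1.0, 2 * tail)


# Hypothetical counts, not figures from the study.
p_value = mcnemar_exact(2, 12)
```

A small p-value indicates that the two models disagree asymmetrically, which is the sense in which the abstract reports a significant improvement from adding the prospectus tone variables.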

A Study on the Support System for Reinforcement of Competitiveness of Small Business persons - Mainly Focused on Support System for Small Business Persons - (소상공인 경쟁력 강화의 지원제도에 관한 연구 - 소상공인 지원제도를 중심으로 -)

  • Woo, Dae-IL;Lee, Sang-Youn
    • The Korean Journal of Franchise Management / v.2 no.2 / pp.95-110 / 2011
  • As global economic conditions grow uneasy and the polarization of our economy intensifies, the economic sentiment of small businesses remains low and unstable. The collapse of banking systems worldwide following the subprime crisis in 2007 shook the financial industries of every country; small businesspersons, who are the most sensitive to such conditions, are also struggling to find a way to survive and face a great crisis. Organizations across manufacturing, wholesale, and retail are becoming more interrelated and interdependent, and unstable risk factors are also increasing. External environmental and physical risk factors that small businesses cannot overcome on their own are increasing, and those who cannot cope with them face a great crisis. Although the government is trying hard to overcome this situation in many ways, the effects do not last. Because employment growth is centered on short-term labor that lacks durability and household income is declining, the independent business market, including overall employment and business start-ups, is caught in a vicious cycle that prevents improvement. Recently, growth has shown signs of slowing because of multinational risk factors, including financial crises in European countries, the death of Kim Jong-il, relations with North Korea, and the unstable war situation in the Middle East. Experts expect a growth rate of about 4%, and the outlook for the independent businesses that ordinary people experience remains gloomy. In reality, there is no adequate alternative for the lack of jobs, unstable employment, and means of living after retirement. The entry of large companies into a narrow and excessively competitive market is a further environmental factor that worsens the situation. 
Franchising, as a business concept, deserves consideration as a possible institutional solution that can guarantee independent business owners a stable life. Major companies are aggressively entering the market today, breaking down barriers to entry while proclaiming a win-win with independent businesses. It is small businesspersons who endure the painful domestic recession, cannot predict the future, and manage confusing and unstable independent businesses. It is very important to restore the domestic economy by wisely boosting consumption as soon as possible. It is also important for the government and related organizations to pool their efforts, deliberate, suggest solutions, and establish accurate directions. The purpose of this study, therefore, is to suggest ways to strengthen the competitiveness of small businesspersons by examining the small business support policies currently in place.

The Effect on Air Transport Sector by Korea-China FTA and Aviation Policy Direction of Korea (한·중 FTA가 항공운송 부문에 미치는 영향과 우리나라 항공정책의 방향)

  • Lee, Kang-Bin
    • The Korean Journal of Air & Space Law and Policy / v.32 no.1 / pp.83-138 / 2017
  • Korea-China FTA entered into force on the 20th of December 2015, and one year elapsed after its effectuation as the FTA with China, our country's largest trading partner. Therefore, this study looks at the trends of air transport trade between Korea and China, and examines the contents of concessions to the air transport services sector in Korea-China FTA, and analyzes the impact on the air transport sector by Korea-China FTA, and proposes our country's aviation policy direction in order to respond to such impact. In 2016 the trends of air transport trade between Korea and China are as follows : The export amount of air transport trade to China was 40.03 billion dollars, down by 9.3% from the last year, and occupied 32.2% of the total export amount to China. The import amount of air transport trade from China was 24.26 billion dollars, down by 9.1% from the last year, and occupied 27.7% of the total import amount from China. The contents of concessions to the air transport services sector in Korea-China FTA are as follows : China made concessions to the aircraft repair and maintenance services and the computer reservation system services with limitations on market access and national treatment in the air transport services sector of the China Schedule of Specific Commitments of Korea-China FTA Chapter 8 Annex. Korea made concessions to the computer reservation system services, selling and marketing of air transport services, and aircraft repair and maintenance without limitations on market access and national treatment in the air transport services sector of the Korea Schedule of Specific Commitments of Korea-China FTA Chapter 8 Annex. The impact on the air transport sector by Korea-China FTA are as follows : As for the impact on the air passenger market, in 2016 the arrival passengers of the international flight from China were 9.96 million, up by 20.6% from the last year, and the departure passengers to China were 9.90 million, up by 34.8% from the last year. 
As for the impact on the air cargo market, in 2016 the volume of air cargo exported to China was 105,220.2 tons, up by 6.6% from the previous year, and the volume imported from China was 133,750.9 tons, up by 12.3% from the previous year. Among the major items of air cargo exported to China, the volumes of items benefiting under the Tariff Schedule of China of the Korea-China FTA increased, and among the major items of air cargo imported from China, the volumes of items benefiting under the Tariff Schedule of Korea of the Korea-China FTA increased. As for the impact on the logistics market, in 2016 domestic forwarders handled 119,618 tons of air cargo exported to China, down by 2.1% from the previous year, and 79,430 tons of air cargo imported from China, down by 4.4% from the previous year. In 2016 e-commerce exports to China amounted to 109.16 million dollars, up by 27.7% from the previous year, and e-commerce imports from China amounted to 89.43 million dollars, up by 72% from the previous year. The author proposes the following aviation policy direction for Korea under the Korea-China FTA. First, open skies between Korea and China shall be pushed ahead. In June 2006 Korea and China concluded an open skies agreement covering the third and fourth freedoms of the air for passengers and cargo in Shandong Province and Hainan Province of China, and agreed to full open skies for flights between the two countries from the summer season of 2010. However, China objected to the interpretation of the draft memorandum of understanding to the air services agreement, so further open skies did not take place. Gradual and selective open skies for the air passenger and air cargo markets shall be pushed ahead through aviation talks with China held separately from the Korea-China FTA. 
Second, the competitiveness of air transport industry and airport shall be secured. As for the strengthening methods of the competitiveness of Korea's air transport industry, the support system for the strengthening of national air carriers' competitiveness shall be prepared, and the new basis for competition of national air carriers shall be made, and the strategic network based on national interest shall be built. As for the strengthening methods of the competitiveness of Korea's airports, particularly Incheon Airport, the competitiveness of the network for aviation demand creation shall be strengthened, and the airport facilities and safety infrastructure shall be expanded, and the new added value through the airport shall be created, and the world's No.1 level of services shall be maintained. Third, the competitiveness of aviation logistics enterprises shall be strengthened. As for the strengthening methods of the competitiveness of Korea's aviation logistics enterprises, as the upbringing strategy of higher added value in response to the industry trends changes, the new logistics market shall be developed, and the logistics infrastructure shall be expanded, and the logistics professionals shall be trained. Additionally, as the expanding strategy of global logistics market, the support system for overseas investment of logistics enterprises shall be built, and according to expanding the global transport network, the international cooperation shall be strengthened, and the network infrastructure shall be secured. As for the strengthening methods of aviation logistics competitiveness of Incheon Airport, the enterprises' demand of moving in the logistics complex shall be responded, and the comparative advantage in the field of new growth cargo shall be preoccupied, and the logistics hub's capability shall be strengthened, and the competitiveness of cargo processing speed in the airport shall be advanced. 
Fourth, in the subsequent negotiation of the Korea-China FTA, further opening of the air transport services sector shall be secured. In the subsequent negotiation to be initiated within two years after the FTA's entry into force, it is necessary to ask for further opening of the concessions on computer reservation system services and aircraft repair and maintenance services, where China's level of concessions in the air transport services sector is insufficient compared with the level in the existing FTAs concluded by China. In conclusion, to respond to the impact of the Korea-China FTA on Korea's air passenger, air cargo, and aviation logistics markets, the following policy tasks shall be pushed ahead: gradual and selective open skies shall be pursued with national air carriers' competitiveness and the nation's benefit in mind; a support system to strengthen the competitiveness of the air transport industry and airports shall be built; entry into the aviation logistics market by logistics enterprises shall be expanded; and preparations shall be made to ask for further opening of the air transport services sector, where China's level of concessions is low.

Stock Price Prediction by Utilizing Category Neutral Terms: Text Mining Approach (카테고리 중립 단어 활용을 통한 주가 예측 방안: 텍스트 마이닝 활용)

  • Lee, Minsik;Lee, Hong Joo
    • Journal of Intelligence and Information Systems / v.23 no.2 / pp.123-138 / 2017
  • Since the stock market is driven by traders' expectations, studies have sought to predict stock price movements by analyzing various sources of text data. Research has examined not only the relationship between text data and stock price fluctuations but also stock trading based on news articles and social media responses. Studies that predict stock price movements have applied classification algorithms to a term-document matrix, as in other text mining approaches. Because documents contain many words, it is better to select the words that contribute most when building a term-document matrix. Words with too low a frequency or importance are removed, and words are also selected by measuring how much each contributes to classifying a document correctly. The basic idea of constructing a term-document matrix has been to collect all the documents to be analyzed and to select and use the words that influence the classification. In this study, we analyze the documents for each individual stock and select the words that are irrelevant to all categories as neutral words. We extract the words around each selected neutral word and use them to generate the term-document matrix. The idea is that stock movement is less related to the presence of the neutral words themselves, and that the words surrounding a neutral word are more likely to affect stock price movements. The generated term-document matrix is then passed to an algorithm that classifies stock price fluctuations. We first removed stop words and selected neutral words for each stock. Among the selected words, we then excluded those that also appear in news articles about other stocks. 
Through an online news portal, we collected four months of news articles on the top 10 stocks by market capitalization. We used three months of news articles as training data and applied the remaining month of articles to the model to predict the next day's stock price movements. We used SVM, Boosting, and Random Forest to build models and predict stock price movements. The stock market was open for a total of 80 days over the four months (2016/02/01 ~ 2016/05/31); we used the first 60 days as the training set and the remaining 20 days as the test set. The word-based algorithm proposed in this study showed better classification performance than word selection based on sparsity. This study predicted stock price movements by collecting and analyzing news articles on the top 10 stocks by market capitalization. We used a classification model based on the term-document matrix to estimate stock price fluctuations and compared the performance of the existing sparsity-based word extraction method with the suggested method of removing words from the term-document matrix. The suggested method differs from the existing word extraction method in that it uses not only the news articles for the corresponding stock but also news about other stocks to determine which words to extract. In other words, it removed not only the words that appeared in both rising and falling cases but also the words that appeared commonly in the news for other stocks. When prediction accuracy was compared, the suggested method showed higher accuracy. One limitation of this study is that stock price prediction was set up to classify only rise and fall, and the experiment covered only the top ten stocks, which do not represent the entire stock market. In addition, it is difficult to show investment performance because stock price fluctuation and profit rate may differ. 
Therefore, further research is needed using more stocks and predicting yield through trading simulation.
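
The step of building a term-document matrix from the words around neutral words can be sketched roughly as follows. The window size, the exclusion of the neutral words themselves, the function names, and the toy documents are all assumptions made for illustration; the abstract does not specify them.

```python
from collections import Counter


def context_terms(tokens, neutral_words, window=2):
    """Return tokens within `window` positions of any neutral word,
    leaving out the neutral words themselves."""
    keep = set()
    for i, tok in enumerate(tokens):
        if tok in neutral_words:
            lo = max(0, i - window)
            hi = min(len(tokens), i + window + 1)
            keep.update(range(lo, hi))
    return [tokens[i] for i in sorted(keep) if tokens[i] not in neutral_words]


def term_document_matrix(docs, neutral_words, window=2):
    """Build (vocabulary, rows): one row of term counts per document."""
    counts = [Counter(context_terms(d, neutral_words, window)) for d in docs]
    vocab = sorted(set().union(*counts))
    return vocab, [[c[t] for t in vocab] for c in counts]


# Hypothetical tokenized news snippets and a hypothetical neutral word.
docs = [
    ["company", "said", "profit", "rose", "sharply"],
    ["company", "said", "loss", "widened"],
]
vocab, matrix = term_document_matrix(docs, {"said"}, window=1)
```

Each row of the matrix would then be paired with the next day's rise or fall label and passed to a classifier such as the SVM, Boosting, or Random Forest models the abstract mentions.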