Stock-Index Invest Model Using News Big Data Opinion Mining

News and Stock Prices: An Intelligent Investment Decision-Making Model Based on Big Data Sentiment Analysis

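  • 김유신 (Graduate School of Business IT, Kookmin University) ;
  • 김남규 (School of Management Information Systems, Kookmin University) ;
  • 정승렬 (Graduate School of Business IT, Kookmin University)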
  • Received : 2012.05.16
  • Accepted : 2012.06.18
  • Published : 2012.06.30

Abstract

It is widely believed that news and stock prices are closely related, and that obtaining news before anyone else makes it possible to forecast stock prices and capture profitable investment opportunities. In practice, however, it is hard to determine how strongly the two are related, to turn news into an investment decision, or to verify that the resulting investment information is valid. If the significance of news and its impact on the stock market could be analyzed, information that supports investment decisions could be extracted. The obstacle is that news arrives in massive volume in real time and consists of unstructured text. This study proposes a stock-index investment model based on "News Big Data" opinion mining, which systematically collects, categorizes, and analyzes news and turns it into investment information. To verify the validity of the model, the relationship between the opinion mining results and the stock index was analyzed empirically using statistical methods.

The mining process that converts news into investment decision information consists of three steps. First, news is obtained from a provider that collects it in real time, and its information is indexed: not only the content but also attributes such as media outlet, time, and news type are collected, classified, and reworked into variables from which investment decisions can be inferred. Second, the news text is split into morphemes to extract words that can carry polarity, and each word is tagged as positive or negative by comparing it against a sentiment dictionary. Third, the positive/negative polarity of each news item is judged using the indexed classification information and a scoring rule, and the final investment decision information is derived according to daily scoring criteria.
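The dictionary-tagging and daily-scoring steps can be illustrated with a minimal sketch. Everything specific in it is an assumption made for illustration, not the study's actual method: the tiny English `SENTIMENT_DICT`, the whitespace `tokenize` function standing in for Korean morpheme analysis, the rule that an article is positive when it has more positive than negative words, and the daily score defined as the share of positive articles among polar articles.

```python
from collections import defaultdict

# Hypothetical sentiment dictionary: word -> +1 (positive) or -1 (negative).
SENTIMENT_DICT = {"surge": 1, "rally": 1, "gain": 1,
                  "plunge": -1, "fear": -1, "loss": -1}


def tokenize(text):
    """Stand-in for morpheme analysis: lowercase whitespace tokens, punctuation stripped."""
    return [tok.strip(".,!?\"'()").lower() for tok in text.split()]


def article_polarity(text):
    """Return +1, -1, or 0 depending on which polarity has more dictionary hits."""
    score = sum(SENTIMENT_DICT.get(tok, 0) for tok in tokenize(text))
    return (score > 0) - (score < 0)


def daily_positive_ratio(articles):
    """articles: iterable of (date, text). Returns date -> share of positive
    articles among articles with a non-neutral polarity (assumed scoring rule)."""
    counts = defaultdict(lambda: [0, 0])  # date -> [positive, negative]
    for date, text in articles:
        polarity = article_polarity(text)
        if polarity > 0:
            counts[date][0] += 1
        elif polarity < 0:
            counts[date][1] += 1
    return {date: pos / (pos + neg) for date, (pos, neg) in counts.items()}


if __name__ == "__main__":
    sample = [
        ("2011-07-01", "Stocks rally as exports surge."),
        ("2011-07-01", "Markets plunge on debt fear."),
        ("2011-07-01", "Tech shares gain."),
        ("2011-07-04", "Investors fear further loss."),
    ]
    print(daily_positive_ratio(sample))
```

A ratio rather than a binary label is used here because the study reports that a positive/negative ratio explained index changes better than a simple positive-or-negative judgment.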
For this study, the KOSPI index and its fluctuation range were collected for the 63 trading days in the three months from July to September 2011 on the Korea Exchange. News data were collected by parsing 766 articles from the economic news outlet "M" among those carried under Stock Information > News > Main News on the portal site Naver.com. Over the three months the index rose on 33 days and fell on 30 days; the news comprised 197 articles published before the market opened, 385 during the session, and 184 after the close. Comparing the mining results with the index showed that the positive/negative opinion of news content had a significant relationship with the stock price, and that index changes were better explained when news opinion was expressed as a positive/negative ratio rather than as a simple binary positive-or-negative judgment. To check whether news affected, or at least preceded, stock price movements, index changes were also compared only with news published before the market opened, and this relationship was statistically significant as well. In addition, news covers many types and topics, such as social, economic, and overseas news, corporate earnings, industry conditions, market outlook, and market conditions, so its influence on the stock market was expected to differ by type. Each type of news was therefore compared with stock price fluctuation, and the results showed that market condition, outlook, and overseas news were the most useful for explaining stock price fluctuation.
By contrast, news about individual companies was not statistically significant, and its opinion mining value tended to move opposite to the stock price; a possible reason is the appearance of promotional and planned news intended to keep the stock price from falling. Finally, multiple regression analysis and logistic regression analysis were carried out to derive an investment decision function from the relationship between news opinion and stock price. The regression equation using market condition, outlook, and overseas news published before the market opened was statistically significant, and the classification accuracy of the logistic regression was 70.0% for days on which the index fell, 78.8% for days on which it rose, and 74.6% overall. This study analyzed the relationship between news and stock price by quantifying the sentiment of unstructured news content using opinion mining, one of the big data analysis techniques, and proposed and verified an intelligent investment decision-making model that systematically carries out opinion mining and derives investment information. The results show that news can be used as a variable for predicting the stock price index, and the model is expected to be usable as a real investment support system if it is implemented as a system and verified in the future.
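As a rough illustration of the logistic regression step, the sketch below fits a one-variable logistic model that maps a daily positive-news ratio to a rise/fall prediction. It is a toy under stated assumptions, not the study's regression: the data in `TOY_RATIOS` and `TOY_ROSE` are invented, the single feature (ratio minus 0.5) is an assumption, and the plain gradient-descent fit replaces whatever statistical package the authors used.

```python
import math

# Invented toy data: daily positive-news ratio and whether the index rose (1) or fell (0).
TOY_RATIOS = [0.20, 0.30, 0.35, 0.45, 0.55, 0.60, 0.70, 0.80]
TOY_ROSE = [0, 0, 0, 0, 1, 1, 1, 1]


def sigmoid(z):
    return 1.0 / (1.0 + math.exp(-z))


def fit_logistic(ratios, rose, lr=0.5, epochs=2000):
    """Fit P(rise) = sigmoid(w * (ratio - 0.5) + b) by batch gradient descent."""
    w = b = 0.0
    n = len(ratios)
    for _ in range(epochs):
        grad_w = grad_b = 0.0
        for ratio, y in zip(ratios, rose):
            x = ratio - 0.5                  # center the feature around a neutral ratio
            err = sigmoid(w * x + b) - y     # gradient of the log-loss w.r.t. the logit
            grad_w += err * x
            grad_b += err
        w -= lr * grad_w / n
        b -= lr * grad_b / n
    return w, b


def predict_rise(ratio, w, b):
    """Classify a day as 'rise' when the fitted probability is at least 0.5."""
    return sigmoid(w * (ratio - 0.5) + b) >= 0.5


if __name__ == "__main__":
    w, b = fit_logistic(TOY_RATIOS, TOY_ROSE)
    hits = sum(predict_rise(r, w, b) == bool(y) for r, y in zip(TOY_RATIOS, TOY_ROSE))
    print(f"w={w:.3f}, b={b:.3f}, training accuracy={hits / len(TOY_ROSE):.1%}")
```

Real daily data would not be cleanly separable like this toy set, so accuracy on held-out days, rather than on the training days, would be the meaningful figure.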

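Everyone assumes that news and stock prices are closely related, and therefore expects to find investment opportunities and earn investment returns through news. However, so much news is generated and spread in real time that it is not easy to tell which news matters or how much news affects stock prices. This study collected and analyzed such news to determine how it relates to stock prices. By nature, news consists of unstructured text with no fixed format. To analyze this news content, the study applied opinion mining, a big data sentiment analysis technique, and used it to propose an intelligent investment decision-making model that predicts rises and falls of the stock index. To verify the validity of the model, the relationship between the mining results and the rises and falls of the stock index was analyzed statistically. The sentiment analysis values of news content had a significant relationship with the rises and falls of the stock index; more specifically, the relationship between news published before the market opened and index movements was also statistically significant, so the study judged that index movements could be predicted from the sentiment analysis of news. In the resulting investment decision model, market condition, outlook, and overseas news predicted index movements best among the news types, and the logistic regression classification accuracy was 70.0% when the index fell, 78.8% when it rose, and 74.6% overall.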
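The accuracy figures can be cross-checked against the 30 falling and 33 rising trading days. The sketch below assumes the classifier was evaluated on all 63 days and that 21 falling days and 26 rising days were classified correctly. Those two counts are not stated in the abstract; they are inferred only because they reproduce the reported percentages.

```python
# Day counts stated in the abstract.
FALL_DAYS, RISE_DAYS = 30, 33

# Assumed numbers of correctly classified days (inferred, not reported).
FALL_CORRECT, RISE_CORRECT = 21, 26


def pct(correct, total):
    """Accuracy as a percentage rounded to one decimal place."""
    return round(100.0 * correct / total, 1)


fall_acc = pct(FALL_CORRECT, FALL_DAYS)
rise_acc = pct(RISE_CORRECT, RISE_DAYS)
overall_acc = pct(FALL_CORRECT + RISE_CORRECT, FALL_DAYS + RISE_DAYS)

print(fall_acc, rise_acc, overall_acc)
```

Under these assumptions the overall figure is a day-weighted accuracy (47 of 63 days), not the simple mean of the two class accuracies.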
