Sentiment Analysis of News Based on Generative AI and Real Estate Price Prediction: Application of LSTM and VAR Models

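  • Sua Kim (김수아; Department of Artificial Intelligence, Sogang University) ;
  • Miju Kwon (권미주; Department of Information Statistics, Dongduk Women's University) ;
  • Hyunhee Kim (김현희; Department of Information Statistics, Dongduk Women's University)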
  • Received : 2023.12.27
  • Accepted : 2024.04.06
  • Published : 2024.05.31

Abstract

Real estate prices are shaped not only by macroeconomic variables but also by unstructured text data such as news articles and social media posts. News articles in particular reflect the public's economic sentiment, which makes them an important input for predicting real estate transaction prices. This study performs sentiment analysis on news articles, converts the results into a News Sentiment Index (NSI) score, and feeds that score into a real estate price prediction model. To compute the index, each article body is first summarized; generative AI then classifies each summary as positive, negative, or neutral, and the classifications are aggregated into a total score. Two prediction models are used: a multi-head attention LSTM model and a Vector Autoregression (VAR) model. Without the NSI, the LSTM model yielded Root Mean Square Error (RMSE) values of 0.60, 0.872, and 1.117 for the 1-month, 2-month, and 3-month forecasts, respectively; with the NSI, the RMSE values fell to 0.40, 0.724, and 1.03. The VAR model without the NSI yielded RMSE values of 1.6484, 0.6254, and 0.9220 for the same horizons, and 1.1315, 0.3413, and 1.6227 with the NSI, so the NSI lowered the VAR error at the 1-month and 2-month horizons but not at the 3-month horizon. These results suggest that the proposed apartment transaction price index prediction model can forecast real estate market price fluctuations that reflect socio-economic trends.
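The scoring and evaluation steps described in the abstract can be illustrated with a minimal Python sketch. The abstract does not give the scoring formula, so the label weights (+1, 0, -1), the monthly summation, and the function names `monthly_nsi`, `rmse`, and `relative_change` are all assumptions made for illustration, not the authors' method. The `REPORTED` table simply restates the RMSE values quoted in the abstract.

```python
from collections import defaultdict
from math import sqrt

# Assumed label-to-score mapping; the abstract does not state the actual weights.
LABEL_SCORE = {"positive": 1, "neutral": 0, "negative": -1}


def monthly_nsi(labeled_articles):
    """Sum per-article sentiment scores by month (a hypothetical NSI).

    labeled_articles: iterable of (month, label) pairs,
    e.g. ("2023-01", "positive").
    """
    totals = defaultdict(int)
    for month, label in labeled_articles:
        totals[month] += LABEL_SCORE[label]
    return dict(totals)


def rmse(actual, predicted):
    """Root mean square error between two equal-length sequences."""
    if len(actual) != len(predicted):
        raise ValueError("sequences must have the same length")
    squared = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    return sqrt(squared / len(actual))


def relative_change(before, after):
    """Fractional change in RMSE; a negative value means the error decreased."""
    return (after - before) / before


# RMSE values reported in the abstract: horizon -> (without NSI, with NSI).
REPORTED = {
    "LSTM": {1: (0.60, 0.40), 2: (0.872, 0.724), 3: (1.117, 1.03)},
    "VAR": {1: (1.6484, 1.1315), 2: (0.6254, 0.3413), 3: (0.9220, 1.6227)},
}

if __name__ == "__main__":
    # Made-up labels for illustration only; not data from the study.
    sample = [
        ("2023-01", "positive"),
        ("2023-01", "negative"),
        ("2023-01", "positive"),
        ("2023-02", "neutral"),
    ]
    print(monthly_nsi(sample))
    for model, horizons in REPORTED.items():
        for months, (before, after) in horizons.items():
            change = relative_change(before, after)
            print(f"{model} {months}-month: {change:+.1%}")
```

Running the script prints the relative RMSE change for each model and horizon. The sign of each value shows whether adding the NSI lowered or raised the error, which makes the mixed VAR result at the 3-month horizon easy to spot.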

Acknowledgement

This research was supported by the Dongduk Women's University Research Grant in 2022.
