Predicting Stock Liquidity by Using Ensemble Data Mining Methods

  • Bae, Eun Chan (Department of Global Business Administration, Sungkyunkwan University)
  • Lee, Kun Chang (SKK Business School/SAIHST (Samsung Advanced Institute of Health Sciences & Technology), Sungkyunkwan University)
  • Received : 2016.04.11
  • Accepted : 2016.06.07
  • Published : 2016.06.30

Abstract

In the finance literature, stock liquidity, which indicates how readily stocks can be converted to cash in the market, has received considerable attention from both academics and practitioners. There are several reasons. First, stock liquidity is known to affect asset pricing significantly. Second, macroeconomic announcements influence liquidity in the stock market. Stock liquidity therefore affects the decisions of both investors and managers. Although there is a large body of finance literature on stock liquidity, no existing studies appear to investigate it as a decision-making problem. Most stock liquidity studies have taken limited views, such as how much liquidity influences stock prices or which variables are significantly associated with it. This paper instead posits that stock liquidity can be a serious decision-making problem that can be handled with data mining techniques, which estimate its future level with statistical validity. To this end, we collected financial data from manufacturing companies listed on the KRX (Korea Exchange) for the period 2010 to 2013. We started the sample in 2010 to avoid the aftershocks of the 2008 financial crisis. We used the Fn-GuidPro system to gather a total of 5,700 financial data records. The stock liquidity measure was computed following the procedure proposed by Amihud (2002), which is known as the measure that best captures the relationship with daily returns. We applied five types of data mining techniques (classifiers): Bayesian networks, support vector machines (SVM), decision trees, neural networks, and ensemble methods. The Bayesian networks include GBN (General Bayesian Network), NBN (Naive Bayesian Network), and TAN (Tree-Augmented NBN). The decision trees use CART and C4.5. Regression results served as the benchmark. The ensemble methods are of two types, integrating either two or three classifiers, and combine the classifiers by voting. Among the single classifiers, CART performed best at 48.2%, compared with 37.18% for regression. Among the ensemble methods, the combination of TAN, CART, and SVM performed best at 49.25%. In an additional analysis of individual industries, relatively stable industries such as electronic appliances, wholesale and retail, wood, and leather, bags, and shoes showed better performance, exceeding 50%.
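The two computational steps named in the abstract, the Amihud (2002) illiquidity measure and voting-based integration of classifiers, can be illustrated with a minimal sketch. It assumes the standard form of the Amihud measure (the average over trading days of the absolute daily return divided by that day's traded value) and simple plurality voting over class labels. The function names, the tie-breaking rule, and the class labels are hypothetical choices for illustration, and the paper's exact preprocessing and liquidity classes are not given in the abstract.

```python
from collections import Counter
from typing import List, Sequence


def amihud_illiquidity(returns: Sequence[float],
                       traded_values: Sequence[float]) -> float:
    """Mean of |daily return| / daily traded value (standard Amihud form).

    Days with zero traded value are skipped, which is an assumption of
    this sketch rather than a rule stated in the abstract.
    """
    ratios = [abs(r) / v for r, v in zip(returns, traded_values) if v > 0]
    if not ratios:
        raise ValueError("no trading days with positive traded value")
    return sum(ratios) / len(ratios)


def majority_vote(predictions: Sequence[Sequence[str]]) -> List[str]:
    """Combine per-classifier label lists by plurality vote.

    predictions[i][j] is the label classifier i assigns to sample j.
    Ties are resolved in favor of the earliest-listed classifier whose
    label is among the tied ones (a hypothetical tie-breaking rule).
    """
    combined = []
    for votes in zip(*predictions):
        counts = Counter(votes)
        top = max(counts.values())
        combined.append(next(v for v in votes if counts[v] == top))
    return combined


if __name__ == "__main__":
    # Hypothetical daily returns and traded values for one stock.
    print(amihud_illiquidity([0.01, -0.02, 0.0],
                             [1_000_000, 2_000_000, 500_000]))

    # Hypothetical liquidity-class predictions from three classifiers.
    tan = ["high", "low", "mid"]
    cart = ["high", "mid", "mid"]
    svm = ["low", "mid", "low"]
    print(majority_vote([tan, cart, svm]))
```

A larger value of the measure indicates a less liquid stock, because a given traded value moves the price more. The voting function accepts any number of classifiers, so it covers both the two-classifier and three-classifier integrations described above.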
