A Big Data-Driven Business Data Analysis System: Applications of Artificial Intelligence Techniques in Problem Solving

  • Donggeun Kim (Department of Big Data Science, College of Public Policy, Korea University)
  • Sangjin Kim (Department of National Statistics, College of Public Policy, Korea University)
  • Juyong Ko (Department of Big Data Science, College of Public Policy, Korea University)
  • Jai Woo Lee (Department of Big Data Science, College of Public Policy, Korea University)
  • Received : 2023.05.22
  • Accepted : 2023.06.16
  • Published : 2023.06.30

Abstract

Effective and efficient big data analytics methods are crucial for business problem-solving: they improve analytical performance and reduce the costs and risks of analyzing customer data. In this study, we design a big data-driven data analysis system that uses artificial intelligence techniques to increase the accuracy of big data analytics, in step with the rapid growth of data science. We present a key direction for big data analysis systems built on missing value imputation, outlier detection, feature extraction, explainable artificial intelligence techniques, and exploratory data analysis. Our objective is not only to develop big data analysis techniques for business data with complex structures but also to bridge the gap between theoretical ideas in artificial intelligence and the analysis of real-world business data.
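
As a rough illustration of the first two stages named above (missing value imputation followed by outlier detection), the following is a minimal sketch only. The abstract does not specify which techniques the system uses, so median imputation and Tukey's IQR rule are assumptions chosen for brevity, and the function names and the `spend` data are hypothetical.

```python
from statistics import median, quantiles


def impute_median(values):
    """Replace None entries with the median of the observed values.

    Assumption: simple median imputation stands in for whatever
    imputation method the actual system applies.
    """
    observed = [v for v in values if v is not None]
    fill = median(observed)
    return [fill if v is None else v for v in values]


def iqr_outlier_flags(values, k=1.5):
    """Flag values outside [Q1 - k*IQR, Q3 + k*IQR] (Tukey's rule).

    Assumption: a univariate rule is used here only for illustration.
    """
    q1, _, q3 = quantiles(values, n=4)
    iqr = q3 - q1
    low, high = q1 - k * iqr, q3 + k * iqr
    return [v < low or v > high for v in values]


# Hypothetical monthly spend per customer; None marks a missing record.
spend = [120.0, 135.0, None, 128.0, 131.0, 2500.0, 125.0, None, 140.0]

filled = impute_median(spend)       # stage 1: fill in missing values
flags = iqr_outlier_flags(filled)   # stage 2: flag suspicious records
outliers = [v for v, is_out in zip(filled, flags) if is_out]
```

In this sketch imputation runs before outlier detection so that every record has a value to test; a real pipeline might reverse or iterate these steps, since extreme values can distort the imputed figures.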

Acknowledgement

This work was financially supported by the College of Public Policy at Korea University.
