Machine Learning for Big Data Analysis

  • Published : 2014.10.31

Abstract

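This article describes machine learning for information analysis that can create new value in the era of big data. It reviews the general definition and characteristics of machine learning and examines how the characteristics of big data have changed machine learning. Among these changes, it focuses on scalable machine learning through distribution and parallelization, and describes the platforms and frameworks that can efficiently analyze a given body of big data. It also aims to convey a broad understanding of big data analysis with machine learning through big data analytics projects, such as the Google API, that already support a variety of applications.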
