Performance Comparison of Machine Learning Algorithms for TAB Digit Recognition

  • 허재혁 (Department of Computer Engineering, Dankook University)
  • 이현종 (Department of Software, Dankook University)
  • 황두성 (Department of Software, Dankook University)
  • Received : 2018.06.05
  • Accepted : 2018.07.25
  • Published : 2019.01.31

Abstract

This paper compares the classification performance of learning algorithms for TAB digit recognition, using fret numbers extracted from guitar TAB scores. Because the TAB digits segmented from the scores still contain TAB lines and musical symbols, a labeling method and a non-linear filter are designed and applied to extract the fret digits only. To generate more data, the preprocessed digits are shifted in four directions. The selected models are a Bayesian classifier, a support vector machine, prototype-based learning, a multi-layer perceptron, and a convolutional neural network. The results show that the mean accuracy of the Bayesian classifier is about 85.0%, while that of the other classifiers exceeds 99.0%. In addition, the convolutional neural network outperforms the other models when both generalization and the data preprocessing step are taken into account.
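
The four-direction shift used for data augmentation can be illustrated with a minimal sketch. The abstract states only that the digits are shifted in four directions, so the one-pixel step, the zero fill for vacated pixels, the choice to keep the unshifted original, and the names `shift` and `augment_four_directions` are all assumptions made for illustration, not the authors' implementation.

```python
from typing import List

Image = List[List[int]]  # rows of pixel values; 0 is assumed to be background


def shift(image: Image, dx: int, dy: int, fill: int = 0) -> Image:
    """Return a copy of `image` moved dx columns right and dy rows down.

    Pixels pushed past the border are dropped; vacated cells get `fill`.
    """
    height = len(image)
    width = len(image[0]) if height else 0
    shifted = [[fill] * width for _ in range(height)]
    for y in range(height):
        for x in range(width):
            ny, nx = y + dy, x + dx
            if 0 <= ny < height and 0 <= nx < width:
                shifted[ny][nx] = image[y][x]
    return shifted


def augment_four_directions(image: Image, step: int = 1) -> List[Image]:
    """Return the original image plus its left, right, up and down shifts.

    The step size is an assumption; the source does not state it.
    """
    offsets = [(-step, 0), (step, 0), (0, -step), (0, step)]
    return [image] + [shift(image, dx, dy) for dx, dy in offsets]


# A toy 3x3 "digit" with a single foreground pixel in the centre.
digit = [
    [0, 0, 0],
    [0, 1, 0],
    [0, 0, 0],
]
samples = augment_four_directions(digit)
```

Under these assumptions each preprocessed digit yields five training samples, all sharing the label of the original image.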

Fig. 1. The Process for Extracting Fret Digits

Fig. 2. Examples of TAB Digit Segmentation

Fig. 3. Removing TAB Lines

Fig. 4. The Limitation of Labeling Method

Fig. 5. The Example of Non-linear Filtering

Fig. 6. Examples of Segmented Data and Preprocessed Data

Fig. 7. A Visualization of TAB Digits Using the t-SNE Method

Fig. 8. Feature Map Examples

Table 1. The Number of Data Per Segment Size

Table 2. The Number of Fret Digits Per Class

Table 3. The Performance Comparison of TAB Digit Recognition

Table 4. The Structure of an MLP Network

Table 5. The Structure of a CNN-PRE Network

Table 6. The Structure of a CNN-ORG Network
