A Study on Improving Performance of the Deep Neural Network Model for Relational Reasoning

관계 추론 심층 신경망 모델의 성능개선 연구

  • Hyun-Ok Lee (Department of Big Data Convergence, Korea University);
  • Heuiseok Lim (Department of Computer Science, Korea University)
  • Received : 2018.07.06
  • Accepted : 2018.08.08
  • Published : 2018.12.31

Abstract

So far, deep learning, a field of artificial intelligence, has achieved remarkable results in solving problems from unstructured data. However, it still cannot judge situations comprehensively as humans do, and it has not reached the level of intelligence required to infer the relations among situations and predict what comes next. Recently, deep neural networks have shown that artificial intelligence can possess powerful relational reasoning, a core intellectual ability of human beings. In this paper, to analyze and observe the performance of Relation Networks (RN) among neural networks for relational reasoning, two types of RN-based deep neural network models were constructed and compared with baseline models: a visual question answering RN model using Sort-of-CLEVR and a text-based question answering RN model using the bAbI tasks. To maximize the performance of the RN-based models, various performance-improvement experiments, such as hyperparameter tuning, were proposed and performed. The effectiveness of the proposed methods was verified by applying them to the visual QA RN model, the text-based QA RN model, and a new-domain model using the Dialog-based LL dataset. The experiments show that the initial learning rate is a key factor in determining performance in both types of RN models, and that the optimal initial learning rate found by the proposed random-search method can improve model accuracy to up to 99.8%.

To date, deep learning, a branch of artificial intelligence, has achieved remarkable results in solving problems from unstructured data, but it has not reached a level of intelligence at which, like humans, it can judge multiple situations comprehensively, infer the relations among them, and predict the next situation. Recently published deep neural networks that perform complex relational reasoning have demonstrated that artificial intelligence can possess relational reasoning, a core human intellectual ability. In this paper, to analyze and observe the performance of Relation Networks (RN) among relational-reasoning deep neural networks, we built two types of RN-based deep neural network models, a visual question answering model using the Sort-of-CLEVR dataset and a text-based question answering model using the bAbI tasks, and verified their performance against baseline models. In addition, to maximize model performance, we proposed performance-improvement methods for RN-based deep neural network models through experiments from multiple angles, such as hyperparameter tuning. The proposed methods were verified by applying them to the visual QA model and the text-based QA model, and verified once more in a new domain using the Dialog-based LL dataset, which had not previously been used with RN models. The experiments show that the initial learning rate is a key factor determining model performance in both types of RN models, and that the optimal initial learning rate found by the proposed random-search method can improve model accuracy to up to 99.8%.
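The RN module analyzed in this paper applies a relation function g to every pair of object representations, sums the results, and passes the aggregate through a function f (Santoro et al., 2017). A minimal sketch of that computation in plain NumPy, with hypothetical dimensions and simple stand-ins for the MLPs used in the paper:

```python
import numpy as np

def relation_network(objects, g, f):
    """RN forward pass: apply g to every ordered pair of objects,
    sum the pairwise outputs, then apply f to the aggregate."""
    pair_sum = sum(
        g(np.concatenate([o_i, o_j]))
        for o_i in objects
        for o_j in objects
    )
    return f(pair_sum)

# Hypothetical tiny g and f (the paper uses multilayer MLPs).
rng = np.random.default_rng(0)
W_g = rng.standard_normal((8, 16))   # g: object pair (2 x 4 dims) -> 16-dim relation
W_f = rng.standard_normal((16, 3))   # f: aggregated relation -> 3 answer logits

g = lambda pair: np.maximum(W_g.T @ pair, 0)  # single ReLU layer as stand-in
f = lambda h: W_f.T @ h

objects = rng.standard_normal((5, 4))         # 5 objects, 4 features each
logits = relation_network(objects, g, f)
print(logits.shape)
```

Because the sum over pairs is permutation-invariant, the module's output does not depend on the order in which objects are presented, which is the property that lets the RN reason over sets of objects.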

Fig. 1. Visual QA - Standard (A) vs. Relational Reasoning (B)

Fig. 2. Visual QA Requiring Relational Reasoning

Fig. 3. Visual QA Architecture with RN (Adam Santoro et al., 2017, Figure 2)

Fig. 4. Text-based QA Architecture with RN

Fig. 5. Our RN-based Visual QA Model Architecture

Fig. 6. Neural Network with Batch Normalization
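Fig. 6 depicts batch normalization (Ioffe and Szegedy, 2015), one of the techniques applied in the performance-improvement experiments. A minimal forward-pass sketch of the transform, assuming a small hypothetical batch of 2-feature activations:

```python
import numpy as np

def batch_norm(x, gamma, beta, eps=1e-5):
    """Batch normalization: normalize each feature over the
    mini-batch, then apply a learned scale (gamma) and shift (beta)."""
    mu = x.mean(axis=0)                 # per-feature mean over the batch
    var = x.var(axis=0)                 # per-feature variance over the batch
    x_hat = (x - mu) / np.sqrt(var + eps)
    return gamma * x_hat + beta

x = np.array([[1.0, 2.0],
              [3.0, 4.0],
              [5.0, 6.0]])              # batch of 3 samples, 2 features
out = batch_norm(x, gamma=np.ones(2), beta=np.zeros(2))
print(out.mean(axis=0))                 # each feature is ~zero-mean over the batch
```

With gamma = 1 and beta = 0 the output has approximately zero mean and unit variance per feature; during training gamma and beta are learned so the network can recover any needed scale.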

Fig. 7. Relational Question (A) and Non-relational Question (B) Generated on Our Model

Fig. 8. Our RN-based Model on Visual QA Task

Fig. 9. Accuracy of Each Model with Different Hyper Parameters

Fig. 10. Loss of Each Model with Different Hyper Parameters

Fig. 11. Accuracy of Each Model with Different Learning Rate Set by Random Search Method

Fig. 12. Loss of Each Model with Different Learning Rate Set by Random Search Method

Table 1. Performance Comparison Between RN and Baseline on Our Visual QA Task

Table 2. Improved Performance by Hyper Parameters Tuning

Table 3. Comparison of Results on bAbI QA Task Using Different Learning Rate Set by Random Search Method

Table 4. Comparison of Results on Dialog-based LL QA Task Using Different Learning Rate Set by Random Search Method
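The random search over initial learning rates behind Tables 3 and 4 follows Bergstra and Bengio (2012): sample candidate rates log-uniformly, train with each, and keep the best. A minimal sketch, where the objective function is a hypothetical stand-in for training and evaluating the model:

```python
import math
import random

def random_search_lr(train_and_eval, n_trials=10, low=1e-5, high=1e-1, seed=0):
    """Random search over the initial learning rate, sampled
    log-uniformly between `low` and `high`."""
    rng = random.Random(seed)
    best_lr, best_acc = None, -1.0
    for _ in range(n_trials):
        lr = math.exp(rng.uniform(math.log(low), math.log(high)))
        acc = train_and_eval(lr)        # train the model, return validation accuracy
        if acc > best_acc:
            best_lr, best_acc = lr, acc
    return best_lr, best_acc

# Stand-in objective: a hypothetical accuracy curve peaking near lr = 1e-3.
toy_eval = lambda lr: 1.0 - abs(math.log10(lr) + 3) / 4
lr, acc = random_search_lr(toy_eval, n_trials=50)
print(lr, acc)   # best rate found should lie near the assumed optimum of 1e-3
```

Sampling on a log scale matters here: learning rates that differ by orders of magnitude behave very differently, so uniform sampling in log space covers the useful range far more evenly than uniform sampling in linear space.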

References

  1. Adam Santoro, David Raposo, David G. T. Barrett, Mateusz Malinowski, Razvan Pascanu, Peter Battaglia, and Timothy Lillicrap, "A simple neural network module for relational reasoning," arXiv:1706.01427, 2017.
  2. David Raposo, Adam Santoro, David Barrett, Razvan Pascanu, Timothy Lillicrap, and Peter Battaglia, "Discovering objects and their relations from entangled scene representations," arXiv:1702.05068, 2017.
  3. Nicholas Watters, Andrea Tacchetti, Theophane Weber, Razvan Pascanu, Peter Battaglia, and Daniel Zoran, "Visual Interaction Networks," arXiv:1706.01433, 2017.
  4. Stanislaw Antol, Aishwarya Agrawal, Jiasen Lu, Margaret Mitchell, Dhruv Batra, C. Lawrence Zitnick, and Devi Parikh, "VQA: Visual Question Answering," arXiv:1505.00468, 2015.
  5. Antoine Bordes, Jason Weston, Sumit Chopra, and Tomas Mikolov, "Towards AI-complete question answering: A set of prerequisite toy tasks," arXiv:1502.05698, 2015.
  6. Sainbayar Sukhbaatar, Arthur Szlam, Jason Weston, and Rob Fergus, "End-To-End Memory Networks," arXiv:1503.08895, 2015.
  7. Sergey Ioffe and Christian Szegedy, "Batch Normalization: Accelerating Deep Network Training by Reducing Internal Covariate Shift," arXiv:1502.03167, 2015.
  8. N. Srivastava, G. Hinton, A. Krizhevsky, I. Sutskever, and R. Salakhutdinov, "Dropout: A Simple Way to Prevent Neural Networks from Overfitting," The Journal of Machine Learning Research, Vol.15, pp.1929-1958, 2014.
  9. Diederik Kingma and Jimmy Ba, "Adam: A Method for Stochastic Optimization," arXiv:1412.6980, 2014.
  10. James Bergstra and Yoshua Bengio, "Random Search for Hyper-Parameter Optimization," Journal of Machine Learning Research, Vol.13, pp.281-305, 2012.
  11. Jason Weston, "Dialog-based Language Learning," arXiv:1604.06045, 2016.
  12. Sebastian Ruder, "An overview of gradient descent optimization algorithms," arXiv:1609.04747, 2016.
  13. Ilya Sutskever, James Martens, George Dahl, and Geoffrey Hinton, "On the importance of initialization and momentum in deep learning," Proceedings of the 30th International Conference on Machine Learning, pp.1139-1147, 2013.
  14. John Duchi, Elad Hazan, and Yoram Singer, "Adaptive Subgradient Methods for Online Learning and Stochastic Optimization," Journal of Machine Learning Research, Vol.12, pp.2121-2159, 2011.
  15. Jasper Snoek, Hugo Larochelle, and Ryan P. Adams, "Practical Bayesian optimization of machine learning algorithms," arXiv:1206.2944, 2012.
  16. Matthias Feurer, Benjamin Letham, and Eytan Bakshy, "Scalable Meta-Learning for Bayesian Optimization," arXiv:1802.02219, 2018.
  17. Justin Johnson, Bharath Hariharan, Laurens van der Maaten, Li Fei-Fei, C. Lawrence Zitnick, and Ross Girshick, "CLEVR: A diagnostic dataset for compositional language and elementary visual reasoning," arXiv:1612.06890, 2017.