Adaptive Weight Control for Improvement of Catastrophic Forgetting in LwF

  • Park, Seong-Hyeon (Department of Embedded Systems Engineering, Incheon National University)
  • Kang, Seok-Hoon (Department of Embedded Systems Engineering, Incheon National University)
  • Received : 2021.09.06
  • Accepted : 2021.09.30
  • Published : 2022.01.31

Abstract

Among learning methods for continual-learning environments, Learning without Forgetting (LwF) uses a fixed regularization strength, which can degrade performance in environments where varied data arrive. We propose a method that sets the regularization weight adaptively by analyzing the characteristics of the data to be learned, applying the weight according to the correlation and complexity of the incoming task. We evaluated the method on scenarios containing varied data; in the experiments, accuracy increased by up to 5% on the new task and up to 11% on previous tasks. Moreover, the adaptive weight produced by the proposed algorithm approached the optimal weight found manually, through repeated experiments, for each scenario: the correlation coefficient between the two was 0.739, and overall average task accuracy increased. These results show that the proposed method sets an appropriate lambda value each time a new task is learned and yields optimal results across the scenarios presented.
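To make the mechanism concrete, the sketch below shows one way an adaptively weighted LwF objective could look in PyTorch. It is a minimal illustration under stated assumptions, not the authors' implementation: the abstract says only that the weight (lambda) is derived from the correlation and complexity of the incoming data, so estimate_correlation, estimate_complexity, and the combining rule in adaptive_lambda are hypothetical placeholders.

```python
import torch
import torch.nn.functional as F

def distillation_loss(new_logits, old_logits, T=2.0):
    # Knowledge-distillation term used by LwF (Li & Hoiem, 2017):
    # soften both output distributions with temperature T and match them.
    log_p = F.log_softmax(new_logits / T, dim=1)
    q = F.softmax(old_logits / T, dim=1)
    return F.kl_div(log_p, q, reduction="batchmean") * (T * T)

def estimate_correlation(new_feats, old_feats):
    # Placeholder: cosine similarity between mean feature vectors,
    # rescaled into [0, 1]. The paper's actual correlation measure
    # is not specified in the abstract.
    sim = F.cosine_similarity(new_feats.mean(dim=0), old_feats.mean(dim=0), dim=0)
    return (sim + 1.0) / 2.0

def estimate_complexity(feats):
    # Placeholder: mean per-dimension standard deviation squashed
    # into (0, 1) as a crude proxy for data complexity.
    return torch.sigmoid(feats.std(dim=0).mean())

def adaptive_lambda(new_feats, old_feats, base=1.0):
    # Hypothetical rule: regularize more when the new task resembles
    # previous ones (high correlation), less when the new data is
    # complex and the network needs more plasticity to learn it.
    corr = estimate_correlation(new_feats, old_feats)
    complexity = estimate_complexity(new_feats)
    return base * corr / (complexity + 1e-8)

def lwf_step(model, old_model, x, y, optimizer, lam):
    # One LwF training step: cross-entropy on the new task plus
    # lambda-weighted distillation toward the frozen pre-task model.
    new_logits = model(x)
    with torch.no_grad():
        old_logits = old_model(x)
    loss = F.cross_entropy(new_logits, y) + lam * distillation_loss(new_logits, old_logits)
    optimizer.zero_grad()
    loss.backward()
    optimizer.step()
    return loss.item()
```

Under this reading, high correlation with previous tasks raises lambda to protect old knowledge, while high data complexity lowers it so the new task can still be learned; the fixed lambda of standard LwF cannot make this trade-off per task.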
