Regularization Strength Control for Continuous Learning based on Attention Transfer

  • Kang, Seok-Hoon (Dept. of Embedded Systems Engineering, Incheon National University)
  • Park, Seong-Hyeon (Dept. of Embedded Systems Engineering, Incheon National University)
  • Received : 2021.12.28
  • Accepted : 2022.03.19
  • Published : 2022.03.31

Abstract

In this paper, we propose an algorithm that applies a separate, variable lambda to each loss term to address the performance degradation caused by domain differences in LwF, and we show that it improves the retention of past knowledge. The attention-transfer-based knowledge distillation is combined with LwF to strengthen the retention of knowledge from previously learned tasks, and the variable-lambda method adjusts each lambda so that the current task can still be learned well. With the proposed method, accuracy improved by an average of 5% regardless of the scenario; in particular, the retention of past knowledge, the goal of this paper, improved by up to 70%, and the accuracy on previously learned data increased by an average of 22% compared with the original LwF.

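The mechanism described above can be sketched as a weighted sum of loss terms in which each regularization term carries its own variable lambda. The following is a minimal, hedged illustration in PyTorch, not the authors' implementation: the function names (`attention_map`, `total_loss`), the specific loss formulations, and the lambda arguments are assumptions made for illustration, and the abstract does not specify the exact rule by which the lambdas are varied.

```python
# Minimal sketch (assumed names, not the authors' code): LwF-style distillation
# plus attention transfer, with a separate variable lambda on each loss term.
import torch.nn.functional as F

def attention_map(feat):
    # Activation-based attention map in the style of Zagoruyko & Komodakis:
    # square the activations, average over channels, then L2-normalize per sample.
    amap = feat.pow(2).mean(dim=1).flatten(1)      # [B, C, H, W] -> [B, H*W]
    return F.normalize(amap, dim=1)

def total_loss(new_logits, labels,
               old_logits, old_logits_teacher,     # old-task outputs: current model vs. frozen copy
               new_feats, old_feats,               # matching intermediate feature maps (lists of tensors)
               lambda_distill, lambda_attn, temperature=2.0):
    # 1) Current-task loss: plain cross-entropy on the new data.
    loss_new = F.cross_entropy(new_logits, labels)

    # 2) LwF distillation loss: keep old-task outputs close to the frozen
    #    teacher's soft targets (temperature-scaled KL divergence).
    loss_distill = F.kl_div(
        F.log_softmax(old_logits / temperature, dim=1),
        F.softmax(old_logits_teacher / temperature, dim=1),
        reduction="batchmean",
    ) * (temperature ** 2)

    # 3) Attention-transfer loss: match attention maps layer by layer.
    loss_attn = sum(
        (attention_map(f_new) - attention_map(f_old)).pow(2).mean()
        for f_new, f_old in zip(new_feats, old_feats)
    )

    # Each regularization term is weighted by its own (variable) lambda.
    return loss_new + lambda_distill * loss_distill + lambda_attn * loss_attn
```

Under this sketch, regularization strength control would amount to re-computing `lambda_distill` and `lambda_attn` during training, for example from how well the current task is converging, so that preserving old knowledge does not prevent the new task from being learned; the concrete control rule used in the paper is not given in the abstract.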


References

  1. R. M. French, "Catastrophic forgetting in connectionist networks," Trends in Cognitive Sciences, vol.3, no.4, pp.128-135, 1999. DOI: 10.1016/S1364-6613(99)01294-2
  2. G. I. Parisi, R. Kemker, J. L. Part, C. Kanan, and S. Wermter, "Continual lifelong learning with neural networks: A review," Neural Networks, vol.113, pp.54-71, 2019. DOI: 10.1016/j.neunet.2019.01.012
  3. F. Zenke, B. Poole, and S. Ganguli, "Continual learning through synaptic intelligence," Proceedings of the 34th International Conference on Machine Learning, vol.70, pp.3987-3995, 2017. DOI: 10.5555/3305890.3306093
  4. Y. Hsu, Y. Liu, A. Ramasamy, and Z. Kira, "Re-evaluating continual learning scenarios: A categorization and case for strong baselines," arXiv:1810.12488, 2019.
  5. J. Yoon, E. Yang, J. Lee, and S. J. Hwang, "Lifelong learning with dynamically expandable networks," arXiv:1708.01547, 2017.
  6. H. Shin, J. K. Lee, J. Kim, and J. Kim, "Continual learning with deep generative replay," arXiv:1705.08690, 2017.
  7. G. Hinton, O. Vinyals, and J. Dean, "Distilling the knowledge in a neural network," NIPS Workshop, arXiv:1503.02531, 2014.
  8. K. McRae and P. A. Hetherington, "Catastrophic interference is eliminated in pretrained networks," Proceedings of the 15th Annual Conference of the Cognitive Science Society, pp.723-728, 1993.
  9. R. M. French, "Catastrophic forgetting in connectionist networks," Trends in Cognitive Sciences, vol.3, no.4, pp.128-135, 1999. DOI: 10.1016/S1364-6613(99)01294-2
  10. J. Kirkpatrick, R. Pascanu, N. Rabinowitz, J. Veness, G. Desjardins, A. A. Rusu, K. Milan, J. Quan, T. Ramalho, A. Grabska-Barwinska, D. Hassabis, C. Clopath, D. Kumaran, and R. Hadsell, "Overcoming catastrophic forgetting in neural networks," Proceedings of the National Academy of Sciences, vol.114, no.13, pp.3521-3526, 2017. DOI: 10.1073/pnas.1611835114
  11. Z. Li and D. Hoiem, "Learning without forgetting," IEEE Transactions on Pattern Analysis and Machine Intelligence, vol.40, no.12, pp.2935-2947, 2017. DOI: 10.48550/arXiv.1606.09282
  12. S. Zagoruyko and N. Komodakis, "Paying more attention to attention: Improving the performance of convolutional neural networks via attention transfer," arXiv:1612.03928, 2016.
  13. B. Heo, M. Lee, S. Yun, and J. Y. Choi, "Knowledge transfer via distillation of activation boundaries formed by hidden neurons," Proceedings of the AAAI Conference on Artificial Intelligence, vol.33, no.1, pp.3779-3787, 2019. DOI: 10.48550/arXiv.1811.03233