Weighted Fast Adaptation Prior on Meta-Learning

  • Received : 2019.10.02
  • Accepted : 2019.10.11
  • Published : 2019.12.31

Abstract

As deep learning architectures grow deeper, their demand for data grows with them. In practice, collecting large datasets is very costly in some disciplines, so learning from limited data has become an appealing research area in recent years. Meta-learning offers a new perspective on training a model under this limitation. Meta-SGD, a state-of-the-art model built on a meta-learning framework, is based on the key idea of learning a hyperparameter, namely the learning rate of the fast adaptation stage, in the outer update. However, this learning rate is usually set to a very small value. Consequently, the SGD objective yields only a small improvement to the weight parameters; in other words, the prior becomes the key factor in achieving a good adaptation. Although adapting with a single gradient step in the inner update is a goal of meta-learning approaches, it may lead to poor performance, especially when the prior is far from the expected one or, conversely, when the prior is already very effective for adapting the model. For this reason, we propose adding a weight term that decreases, or in some conditions increases, the effect of this prior. Experiments on few-shot learning show that emphasizing or weakening the prior can give better performance than using its original value.
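The idea of weighting the prior during fast adaptation can be illustrated with a minimal sketch. The abstract does not give the exact update rule, so the form below is an assumption: the prior `theta` is multiplied by a scalar `prior_weight` before the gradient step, and `alpha` stands in for the per-parameter learning rate that Meta-SGD learns in the outer update. The linear model, the `mse_grad` helper, and all variable names are hypothetical and chosen only to keep the example self-contained.

```python
import numpy as np


def mse_grad(theta, x, y):
    """Gradient of the mean squared error of a linear model y ~ x @ theta."""
    residual = x @ theta - y
    return 2.0 * x.T @ residual / len(y)


def weighted_fast_adaptation(theta, alpha, x_support, y_support, prior_weight=1.0):
    """Single inner-update step with a weighted prior (assumed form).

    theta'  = prior_weight * theta - alpha * grad
    prior_weight = 1.0 recovers the plain single-step update;
    values below 1 weaken the prior and values above 1 emphasize it.
    """
    grad = mse_grad(theta, x_support, y_support)
    return prior_weight * theta - alpha * grad


# Toy support set for one task (synthetic data, for illustration only).
rng = np.random.default_rng(0)
x_support = rng.normal(size=(5, 3))
y_support = x_support @ np.array([1.0, -2.0, 0.5])

theta = np.array([0.5, 0.5, 0.5])   # prior (meta-learned initialization)
alpha = np.full(3, 0.01)            # small per-parameter learning rate

adapted_plain = weighted_fast_adaptation(theta, alpha, x_support, y_support)
adapted_weak = weighted_fast_adaptation(
    theta, alpha, x_support, y_support, prior_weight=0.8
)
```

Because `alpha` is small, the gradient term moves the parameters only slightly, so in this sketch the adapted parameters are dominated by the (weighted) prior, which is the situation the abstract describes.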
