References
- Victor Sanh et al., "DistilBERT, a distilled version of BERT: smaller, faster, cheaper and lighter", arXiv:1910.01108, 2019
- Geoffrey Hinton et al., "Distilling the Knowledge in a Neural Network", arXiv:1503.02531, 2015
- https://alexnim.com/coding-projects-knowledge-distillation.html