Knowledge Distillation with Relative Representations for Image Representation Learning
Document Type
Conference Proceeding
Publication Date
1-1-2023
Abstract
Relative representations allow the alignment of latent spaces that embed data in extrinsically different ways but preserve similar relative distances between data points. This ability to compare different latent spaces for the same input lends itself to knowledge distillation techniques. We explore the applicability of relative representations to knowledge distillation by training a student model such that the relative representations of its outputs match the relative representations of the outputs of a teacher model. We test our Relative Representation Knowledge Distillation (RRKD) scheme on supervised and self-supervised image representation learning with MNIST and show that an encoder can be compressed to 47.71% of its original size while maintaining 91.92% of its full performance. We demonstrate that RRKD is competitive with or outperforms other relation-based distillation schemes in traditional distillation setups (CIFAR-10, CIFAR-100, SVHN) and in a transfer learning setting (Stanford Cars, Oxford-IIIT Pets, Oxford Flowers-102). Our results indicate that relative representations are an effective signal for knowledge distillation. Code is made available at https://github.com/Ramos-Ramos/rrkd.
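To illustrate the idea in the abstract, the following is a minimal sketch in PyTorch, not the authors' implementation (see the linked repository for that). It assumes relative representations are vectors of cosine similarities between an encoder's output and its embeddings of a shared set of anchor inputs, and that the distillation objective is a simple MSE between the student's and teacher's relative representations; the function names, the MSE choice, and the stand-in linear encoders are illustrative assumptions.

import torch
import torch.nn.functional as F

def relative_representation(embeddings: torch.Tensor,
                            anchor_embeddings: torch.Tensor) -> torch.Tensor:
    # Cosine similarity of each embedding to each anchor embedding:
    # shape (batch_size, num_anchors)
    emb = F.normalize(embeddings, dim=-1)
    anchors = F.normalize(anchor_embeddings, dim=-1)
    return emb @ anchors.T

def rrkd_loss(student, teacher, images, anchor_images):
    # Distillation signal: make the student's relative representations
    # match the teacher's (MSE used here as one possible choice).
    with torch.no_grad():  # teacher is frozen
        teacher_rel = relative_representation(teacher(images), teacher(anchor_images))
    student_rel = relative_representation(student(images), student(anchor_images))
    return F.mse_loss(student_rel, teacher_rel)

# Illustrative usage with stand-in encoders and random data (assumed, not from the paper)
teacher = torch.nn.Linear(784, 128)   # stand-in for a large pretrained encoder
student = torch.nn.Linear(784, 32)    # smaller student; latent sizes need not match
images = torch.randn(16, 784)         # a batch of flattened images
anchor_images = torch.randn(64, 784)  # anchor inputs seen by both models
loss = rrkd_loss(student, teacher, images, anchor_images)
loss.backward()

Because both encoders are projected onto a num_anchors-dimensional similarity vector, the loss is well defined even when the student's and teacher's latent dimensions differ, which is what makes relative representations a convenient signal for comparing the two latent spaces during distillation.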
Recommended Citation
Ramos, P., Alampay, R., Abu, P. (2023). Knowledge Distillation with Relative Representations for Image Representation Learning. In: Burduk, R., Choraś, M., Kozik, R., Ksieniewicz, P., Marciniak, T., Trajdos, P. (eds) Progress on Pattern Classification, Image Processing and Communications. CORES 2023, IP&C 2023. Lecture Notes in Networks and Systems, vol 766. Springer, Cham. https://doi.org/10.1007/978-3-031-41630-9_14