#73
Summarized by: Kaikai Zhao
Masked Generative Distillation

What is this paper about?

Proposes a general feature-based distillation method (MGD) that is effective across various tasks (classification, detection, segmentation); generating the teacher's features from a masked student feature works better than directly mimicking the features.

Novelty

Randomly mask pixels of the student's feature map, then force the student to generate the teacher's full feature through a small generation block of two 3x3 conv layers.
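The masking-and-generation idea can be illustrated with a minimal NumPy sketch. This is not the paper's implementation (which would use a deep-learning framework); the function names (`conv3x3`, `mgd_loss`), the shared-across-channels spatial mask, and the weight shapes are assumptions made for illustration.

```python
import numpy as np

def conv3x3(x, w):
    # Naive 3x3 convolution, stride 1, zero padding 1.
    # x: (C_in, H, W), w: (C_out, C_in, 3, 3) -> out: (C_out, H, W)
    C_out = w.shape[0]
    C_in, H, W = x.shape
    xp = np.pad(x, ((0, 0), (1, 1), (1, 1)))
    out = np.zeros((C_out, H, W), dtype=x.dtype)
    for co in range(C_out):
        for ci in range(C_in):
            for di in range(3):
                for dj in range(3):
                    out[co] += w[co, ci, di, dj] * xp[ci, di:di + H, dj:dj + W]
    return out

def mgd_loss(student_feat, teacher_feat, w1, w2, mask_ratio=0.5, rng=None):
    # student_feat, teacher_feat: (C, H, W); w1, w2: generation-block weights.
    rng = np.random.default_rng(0) if rng is None else rng
    _, H, W = student_feat.shape
    # Random spatial mask (zero out ~mask_ratio of pixels), shared over channels.
    keep = (rng.random((1, H, W)) >= mask_ratio).astype(student_feat.dtype)
    masked = student_feat * keep
    # Generation block: 3x3 conv -> ReLU -> 3x3 conv (an assumed minimal form).
    hidden = np.maximum(conv3x3(masked, w1), 0.0)
    generated = conv3x3(hidden, w2)
    # MSE between the generated feature and the teacher's *full* feature.
    return float(np.mean((generated - teacher_feat) ** 2))
```

In a real training loop the generation-block weights would be learned jointly with the student, and this loss would be added to the task loss with a weighting factor.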

Results

Consistent improvements observed across various tasks.

Other notes (e.g., why was it accepted?)

Targets CNN models; MSE loss is used for the feature distillation.