- …
- …
#73
summarized by: kaikai zhao
What kind of paper is it?
Proposes a general feature-based distillation method that is effective across tasks (classification, detection, segmentation); generating the teacher's features from a masked student feature works better than directly mimicking the features.
Novelty
Randomly masks pixels of the student's feature map and forces the student to regenerate the teacher's full feature map through two 3x3 conv layers.
Results
Consistent improvements observed across various tasks.
Other (e.g., why was it accepted?)
Targets CNN models; an MSE loss is used for feature distillation (a minimal sketch is given below).
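
A minimal PyTorch sketch of the idea described in this summary: mask random spatial positions of the student's feature map, regenerate the teacher's full feature map with a block of two 3x3 conv layers, and supervise with an MSE loss. The class and argument names (MaskedGenerativeDistillation, mask_ratio, align) and the hyperparameter values are my own illustrative choices under these assumptions, not the paper's exact settings.

```python
import torch
import torch.nn as nn
import torch.nn.functional as F

class MaskedGenerativeDistillation(nn.Module):
    """Sketch of masked generative feature distillation (names/values assumed)."""

    def __init__(self, student_channels: int, teacher_channels: int, mask_ratio: float = 0.5):
        super().__init__()
        self.mask_ratio = mask_ratio
        # 1x1 conv to align channel dimensions when student and teacher differ
        self.align = (nn.Conv2d(student_channels, teacher_channels, kernel_size=1)
                      if student_channels != teacher_channels else nn.Identity())
        # generation block: two 3x3 conv layers (ReLU in between, assumed)
        self.generation = nn.Sequential(
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
            nn.ReLU(inplace=True),
            nn.Conv2d(teacher_channels, teacher_channels, kernel_size=3, padding=1),
        )

    def forward(self, feat_student: torch.Tensor, feat_teacher: torch.Tensor) -> torch.Tensor:
        # feat_student: (N, Cs, H, W); feat_teacher: (N, Ct, H, W), same spatial size assumed
        n, _, h, w = feat_student.shape
        feat_student = self.align(feat_student)
        # zero out roughly mask_ratio of the spatial positions of the student feature
        keep = (torch.rand(n, 1, h, w, device=feat_student.device) > self.mask_ratio).float()
        masked = feat_student * keep
        # force the student to regenerate the teacher's full feature map
        generated = self.generation(masked)
        return F.mse_loss(generated, feat_teacher)
```

Usage would add this loss, scaled by a weight, to the ordinary task loss during student training, e.g. `loss = task_loss + w * mgd(f_s, f_t)` with `mgd = MaskedGenerativeDistillation(256, 256)` (the weight and channel sizes here are placeholders).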
- …
- …