#70
summarized by: kaikai zhao
TinyViT: Fast Pretraining Distillation for Small Vision Transformers

What kind of paper is this?

Proposes TinyViT, a new family of small, efficient vision transformers pretrained on large-scale datasets with the proposed fast distillation framework.

Novelty

1) The teacher's sparse soft labels (top-K logits) and the data-augmentation encodings are saved to disk in advance, so the expensive teacher forward pass does not have to be repeated during student training (see the sketch below); 2) the architectures are obtained by automatically scaling down a large model under computation and parameter constraints.
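
A minimal sketch of the caching idea, assuming PyTorch. The helper names (`precompute_teacher_logits`, `sparse_distill_loss`, `augment`), the dummy models, and the choice K=10 are illustrative assumptions, not the paper's code; the augmentation encoding is simplified to a single RNG seed.

```python
import torch
import torch.nn.functional as F

K = 10             # number of teacher logits kept per image (top-K sparsification)
NUM_CLASSES = 1000

def augment(img, seed):
    """Reproducible augmentation: the stored seed stands in for the paper's
    data-augmentation encoding (a single random flip here, purely illustrative)."""
    g = torch.Generator().manual_seed(seed)
    if torch.rand(1, generator=g).item() < 0.5:
        img = torch.flip(img, dims=[-1])
    return img

def precompute_teacher_logits(teacher, images, seeds, path="teacher_topk.pt"):
    """Offline pass: store only the top-K teacher logits plus the augmentation seed."""
    teacher.eval()
    records = []
    with torch.no_grad():
        for img, seed in zip(images, seeds):
            view = augment(img, seed)                    # seed makes the view reproducible
            logits = teacher(view.unsqueeze(0)).squeeze(0)
            values, indices = logits.topk(K)
            records.append({"seed": seed, "values": values, "indices": indices})
    torch.save(records, path)

def sparse_distill_loss(student_logits, topk_values, topk_indices, T=1.0):
    """KD loss against a sparse soft label rebuilt from the stored top-K logits."""
    target = F.softmax(topk_values / T, dim=-1)          # renormalized over the K classes
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    picked = log_probs.gather(-1, topk_indices)          # student log-probs at those classes
    return -(target * picked).sum(-1).mean() * (T * T)

# Toy usage with dummy models and data (illustrative only).
teacher = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, NUM_CLASSES))
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, NUM_CLASSES))
images = [torch.randn(3, 32, 32) for _ in range(4)]
seeds = list(range(4))
precompute_teacher_logits(teacher, images, seeds)

for img, rec in zip(images, torch.load("teacher_topk.pt")):
    view = augment(img, rec["seed"])                     # replay the exact same view
    s_logits = student(view.unsqueeze(0)).squeeze(0)
    sparse_distill_loss(s_logits, rec["values"], rec["indices"]).backward()
```

Keeping only the top-K logits keeps the cached file small while preserving most of the soft-label information, which is what lets the teacher be dropped from the training loop entirely.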

Results

Achieves 84.8% top-1 accuracy on ImageNet-1k with only 21M parameters, comparable to Swin-B pretrained on ImageNet-21k while using 4.2× fewer parameters.
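
As a rough sanity check on that ratio: Swin-B is commonly reported at about 88M parameters, and 88M / 21M ≈ 4.2, consistent with the claim (the Swin-B parameter count is an assumption, not stated in this summary).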

Other (e.g., why was it accepted?)