#70
summarized by: kaikai zhao
TinyViT: Fast Pretraining Distillation for Small Vision Transformers

What kind of paper is this?

Proposes TinyViT, a new family of small, efficient vision transformers pretrained on large-scale datasets with the proposed fast distillation framework.

Novelty

1) The teacher's sparse soft labels (top-K logits) and the data-augmentation encodings are saved to disk in advance, so the expensive teacher forward pass does not have to be repeated during student training (see the sketch below); 2) the architectures are obtained by automatically scaling down a large model under computation and parameter constraints.
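
A minimal sketch of the caching idea, assuming PyTorch. The helper names (`precompute_teacher_logits`, `sparse_distill_loss`, `augment`), the dummy models, and the choice K=10 are illustrative assumptions, not the paper's code; the augmentation encoding is simplified to a single RNG seed.

```python
import torch
import torch.nn.functional as F

K = 10             # number of teacher logits kept per image (top-K sparsification)
NUM_CLASSES = 1000

def augment(img, seed):
    """Reproducible augmentation: the stored seed stands in for the paper's
    data-augmentation encoding (a single random flip here, purely illustrative)."""
    g = torch.Generator().manual_seed(seed)
    if torch.rand(1, generator=g).item() < 0.5:
        img = torch.flip(img, dims=[-1])
    return img

def precompute_teacher_logits(teacher, images, seeds, path="teacher_topk.pt"):
    """Offline pass: store only the top-K teacher logits plus the augmentation seed."""
    teacher.eval()
    records = []
    with torch.no_grad():
        for img, seed in zip(images, seeds):
            view = augment(img, seed)                    # seed makes the view reproducible
            logits = teacher(view.unsqueeze(0)).squeeze(0)
            values, indices = logits.topk(K)
            records.append({"seed": seed, "values": values, "indices": indices})
    torch.save(records, path)

def sparse_distill_loss(student_logits, topk_values, topk_indices, T=1.0):
    """KD loss against a sparse soft label rebuilt from the stored top-K logits."""
    target = F.softmax(topk_values / T, dim=-1)          # renormalized over the K classes
    log_probs = F.log_softmax(student_logits / T, dim=-1)
    picked = log_probs.gather(-1, topk_indices)          # student log-probs at those classes
    return -(target * picked).sum(-1).mean() * (T * T)

# Toy usage with dummy models and data (illustrative only).
teacher = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, NUM_CLASSES))
student = torch.nn.Sequential(torch.nn.Flatten(), torch.nn.Linear(3 * 32 * 32, NUM_CLASSES))
images = [torch.randn(3, 32, 32) for _ in range(4)]
seeds = list(range(4))
precompute_teacher_logits(teacher, images, seeds)

for img, rec in zip(images, torch.load("teacher_topk.pt")):
    view = augment(img, rec["seed"])                     # replay the exact same view
    s_logits = student(view.unsqueeze(0)).squeeze(0)
    sparse_distill_loss(s_logits, rec["values"], rec["indices"]).backward()
```

Keeping only the top-K logits keeps the cached file small while preserving most of the soft-label information, which is what lets the teacher be dropped from the training loop entirely.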

Results

Achieves 84.8% top-1 accuracy on ImageNet-1k with only 21M parameters, comparable to Swin-B pretrained on ImageNet-21k while using 4.2× fewer parameters.
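
As a rough sanity check on that ratio: Swin-B is commonly reported at about 88M parameters, and 88M / 21M ≈ 4.2, consistent with the claim (the Swin-B parameter count is an assumption, not stated in this summary).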

Other (e.g., why was it accepted?)