- …
- …
#138
summarized by : Anonymous
What is this paper about?
Improves inference speed for recognition tasks by reducing, depending on the input image, the number of patches, self-attention heads, and Transformer blocks that are used.
Novelty
Uses a small decision network (a binary classification task) inside the Transformer block to decide how to reduce the number of patches, self-attention heads, and Transformer blocks.
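The keep-or-drop idea for patches can be sketched as follows. This is a minimal illustration, not the paper's implementation: it assumes the decision network is a single linear layer whose logit is thresholded at zero (equivalent to a sigmoid probability of at least 0.5), and the names `keep_decisions` and `prune_tokens`, the weights, and the token values are all hypothetical. How such discrete decisions are trained is not covered by the summary and is left out here.

```python
def keep_decisions(tokens, weights, bias):
    """Return one binary keep/drop decision per patch token.

    Assumption: the decision network is a single linear layer; a token is
    kept when its logit is >= 0 (i.e. sigmoid(logit) >= 0.5).
    """
    decisions = []
    for token in tokens:
        logit = sum(w * x for w, x in zip(weights, token)) + bias
        decisions.append(logit >= 0.0)
    return decisions


def prune_tokens(tokens, decisions):
    """Keep only the tokens whose decision is True."""
    return [token for token, keep in zip(tokens, decisions) if keep]


# Hypothetical example: three 2-dimensional patch tokens.
tokens = [[1.0, 0.0], [-1.0, 0.0], [0.5, 0.5]]
weights = [2.0, 0.0]  # made-up decision-layer weights
bias = 0.0

decisions = keep_decisions(tokens, weights, bias)
kept = prune_tokens(tokens, decisions)
print(decisions)  # [True, False, True]
print(len(kept))  # 2
```

The same pattern would apply to self-attention heads and whole blocks: one binary decision per unit, with the dropped units skipped so that later computation runs on less input.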
Results
Achieved more than a 2× improvement in efficiency compared to state-of-the-art vision transformers, with only a 0.8% drop in accuracy on ImageNet.
Other (why was it accepted? etc.)
- …
- …