#209
summarized by: Anonymous
MiniViT: Compressing Vision Transformers With Weight Multiplexing

What is this paper about?

A new compression framework (MiniViT) for vision transformers based on weight multiplexing: the weights of both the MSA and MLP modules are shared across consecutive transformer layers, so they are stored only once. A minimal sketch of this idea follows.
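A minimal sketch of weight multiplexing, assuming a generic PyTorch transformer block. The class name `MultiplexedEncoder` and the per-layer LayerNorms are hypothetical illustrations, not the MiniViT codebase; the LayerNorms merely stand in for the lightweight per-layer components that keep shared layers from being strictly identical.

```python
import torch
import torch.nn as nn


class MultiplexedEncoder(nn.Module):
    """Stores one transformer block's weights once and reuses them
    for `depth` layers (weight multiplexing)."""

    def __init__(self, shared_block: nn.Module, depth: int, dim: int):
        super().__init__()
        self.shared_block = shared_block  # MSA + MLP weights, stored once
        # tiny per-layer modules so the shared layers stay diverse
        self.layer_transforms = nn.ModuleList(
            [nn.LayerNorm(dim) for _ in range(depth)]
        )

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        for transform in self.layer_transforms:
            x = self.shared_block(transform(x))  # same weights at every depth
        return x


# usage: 12 layers that all reuse a single block's parameters
block = nn.TransformerEncoderLayer(d_model=192, nhead=3, batch_first=True)
encoder = MultiplexedEncoder(block, depth=12, dim=192)
out = encoder(torch.randn(2, 197, 192))  # (batch, tokens, dim)
```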

Novelty

Naive weight sharing across layers causes training instability and performance degradation. MiniViT therefore adds (i) weight transformations to increase the diversity of the shared layers, and (ii) weight distillation from the original model on prediction logits, self-attention maps, and hidden states to recover performance; a sketch of the combined loss follows.
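A minimal sketch of the three-part distillation loss the summary mentions. The function name, the loss weights, and the use of KL for logits plus MSE for attention maps and hidden states are generic distillation conventions assumed here; MiniViT's exact formulation may differ.

```python
import torch
import torch.nn.functional as F


def distillation_loss(student_logits, teacher_logits,
                      student_attn, teacher_attn,
                      student_hidden, teacher_hidden,
                      tau: float = 1.0,
                      w_logit: float = 1.0,
                      w_attn: float = 1.0,
                      w_hidden: float = 1.0) -> torch.Tensor:
    # soft-label distillation on prediction logits (temperature-scaled KL)
    logit_loss = F.kl_div(
        F.log_softmax(student_logits / tau, dim=-1),
        F.softmax(teacher_logits / tau, dim=-1),
        reduction="batchmean",
    ) * tau ** 2
    # match self-attention maps layer by layer
    attn_loss = sum(F.mse_loss(s, t)
                    for s, t in zip(student_attn, teacher_attn))
    # match intermediate hidden states layer by layer
    hidden_loss = sum(F.mse_loss(s, t)
                      for s, t in zip(student_hidden, teacher_hidden))
    return w_logit * logit_loss + w_attn * attn_loss + w_hidden * hidden_loss
```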

Results

The compressed models (Mini-DeiT, Mini-Swin) achieve performance competitive with or better than the original DeiT and Swin models despite the reduced parameter count, and show good generalization to downstream tasks (classification, detection).

Other (e.g., why was it accepted?)