- …
- …
#160
summarized by: Anonymous
What kind of paper is it?
It asks what contributes to the strong performance of Transformers, and argues that it is the general architecture of the Transformer rather than the specific token-mixer module.
Novelty
Proposed a general architecture, "MetaFormer", abstracted from the Transformer without specifying the token mixer, and showed that even a very naive token mixer (spatial pooling, yielding the model PoolFormer) achieves promising results; a rough sketch of the abstraction follows below.
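A minimal PyTorch sketch of the MetaFormer abstraction as described above (not the authors' official code; class names and hyperparameters are illustrative assumptions): the token mixer is a pluggable module, while the norms, channel MLP, and residual connections stay fixed.

```python
import torch
import torch.nn as nn

class MetaFormerBlock(nn.Module):
    """One MetaFormer block: token mixing + channel MLP, each with a residual."""
    def __init__(self, dim, token_mixer, mlp_ratio=4):
        super().__init__()
        self.norm1 = nn.LayerNorm(dim)
        self.token_mixer = token_mixer            # e.g. attention, pooling, spatial MLP ...
        self.norm2 = nn.LayerNorm(dim)
        hidden = int(dim * mlp_ratio)
        self.mlp = nn.Sequential(                 # channel MLP shared by all variants
            nn.Linear(dim, hidden),
            nn.GELU(),
            nn.Linear(hidden, dim),
        )

    def forward(self, x):                         # x: (batch, tokens, dim)
        x = x + self.token_mixer(self.norm1(x))   # token mixing + residual
        x = x + self.mlp(self.norm2(x))           # channel mixing + residual
        return x
```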
Results
Replacing the attention module in the Transformer with a simple spatial pooling operation as the token mixer (the resulting model is named PoolFormer) yields competitive performance on image classification, object detection, and segmentation tasks.
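A rough sketch of what such a pooling token mixer can look like (assumed behavior, close in spirit to the released PoolFormer code; the exact pooling parameters and the channel-first layout are assumptions): plain average pooling over a local window, with the input subtracted so the block's residual connection does not add it twice.

```python
import torch
import torch.nn as nn

class PoolingTokenMixer(nn.Module):
    """Token mixer that averages each token with its spatial neighbors."""
    def __init__(self, pool_size=3):
        super().__init__()
        self.pool = nn.AvgPool2d(
            pool_size, stride=1, padding=pool_size // 2,
            count_include_pad=False,
        )

    def forward(self, x):            # x: (batch, channels, height, width)
        return self.pool(x) - x      # subtract x; the surrounding residual adds it back
```

Note that PoolFormer keeps feature maps in (B, C, H, W), so in practice the block would use a channel-first norm and 1x1 convolutions for the channel MLP, but the overall structure matches the MetaFormer sketch above.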
Other (e.g., why was it accepted?)
- …
- …