A ConvNet for the 2020s

#143

summarized by : Anonymous

Zhuang Liu; Hanzi Mao; Chao-Yuan Wu; Christoph Feichtenhofer; Trevor Darrell; Saining Xie

What is this paper about?

The paper identifies the confounding variables in comparisons between ConvNets and ViTs, examines how ViT design choices affect ConvNet performance, and asks how a ConvNet can be modernized to close the gap between pre-ViT and post-ViT ConvNets.

Novelty

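The paper improves a standard ConvNet step by step: an enhanced training recipe; a patchify stem; adjusted stage ratios; depthwise convolution; inverted channel dimensions (an inverted bottleneck); increased depth; moving the depthwise conv up in the block; fewer activation and normalization layers; ReLU replaced with GELU; and a larger kernel size.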
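To make a few of these design changes concrete, here is a rough pure-Python sketch that counts convolution weights for one block in the style described above: a large-kernel depthwise conv placed first, followed by a 1x1 expansion and a 1x1 projection (the inverted dimension). The function names are hypothetical, and the specific numbers (224-pixel input, 4x4 patches, 96 channels, 7x7 kernel, 4x expansion) are assumptions chosen for illustration; this summary does not state them. Biases and normalization parameters are ignored.

```python
# Illustrative sketch only: counts conv weights, ignoring biases and norm layers.
# All names and default values here are assumptions, not taken from this summary.


def patchify_output_size(image_size: int, patch_size: int) -> int:
    """Spatial size after a non-overlapping conv (kernel == stride == patch_size)."""
    return image_size // patch_size


def depthwise_conv_weights(channels: int, kernel_size: int) -> int:
    """Depthwise conv: one kernel_size x kernel_size filter per channel."""
    return channels * kernel_size * kernel_size


def dense_conv_weights(channels: int, kernel_size: int) -> int:
    """Ordinary conv with equal input and output channels, for comparison."""
    return channels * channels * kernel_size * kernel_size


def pointwise_conv_weights(in_channels: int, out_channels: int) -> int:
    """1x1 conv: mixes channels independently at each spatial position."""
    return in_channels * out_channels


def block_weights(channels: int, kernel_size: int = 7, expansion: int = 4) -> dict:
    """Weights per layer for: depthwise conv -> 1x1 expand -> 1x1 project."""
    hidden = channels * expansion  # the wide middle of the inverted bottleneck
    return {
        "depthwise": depthwise_conv_weights(channels, kernel_size),
        "expand": pointwise_conv_weights(channels, hidden),
        "project": pointwise_conv_weights(hidden, channels),
    }


if __name__ == "__main__":
    print("stem output size:", patchify_output_size(224, 4))
    print("block weights:", block_weights(96))
    print("dense 7x7 weights:", dense_conv_weights(96, 7))
```

Under these assumed values, the depthwise layer holds 4,704 weights while a dense 7x7 conv over the same 96 channels would hold 451,584, which suggests why a large kernel is affordable once the spatial convolution is depthwise and runs on the narrow side of the block.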

Results

The proposed ConvNeXt models compete favorably with SOTA hierarchical vision transformers across multiple computer vision benchmarks.

Other (why was it accepted, etc.)

The images used on this page are taken from the paper.