#191
summarized by: Anonymous
MPViT: Multi-Path Vision Transformer for Dense Prediction

What kind of paper is this?

The paper improves multi-scale feature representation for dense prediction tasks through multi-scale patch embedding and multi-path feature aggregation.

Novelty

Multi-scale patch embedding combined with a multi-path structure, so that fine and coarse features are represented simultaneously for dense prediction tasks; a global-to-local feature interaction module.
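
To make the multi-scale embedding idea concrete, here is a minimal sketch of why patches of different sizes can feed parallel paths that are later aggregated: if every path uses the same stride and a padding of half the kernel size, all paths produce token grids of the same side length. The kernel sizes (3, 5, 7), the stride of 2, and the 56-pixel input are illustrative assumptions, not values taken from this summary, and the function names are hypothetical.

```python
def conv_output_size(size: int, kernel: int, stride: int, padding: int) -> int:
    """Spatial output size of a standard convolution without dilation."""
    return (size + 2 * padding - kernel) // stride + 1


def multi_scale_token_grids(size: int, kernels=(3, 5, 7), stride: int = 2) -> dict:
    """Token-grid side length for each path, assuming padding = kernel // 2.

    The kernel sizes and stride are assumptions made for illustration only.
    """
    return {k: conv_output_size(size, k, stride, k // 2) for k in kernels}


# Assumed 56x56 input feature map; each path sees a different patch size.
grids = multi_scale_token_grids(56)
for kernel, side in grids.items():
    print(f"patch {kernel}x{kernel} -> {side}x{side} tokens")

# Equal grid sizes are what allow the per-path features to be
# concatenated along the channel dimension when aggregating.
same_grid = len(set(grids.values())) == 1
```

Because the grids match, fine (small-patch) and coarse (large-patch) tokens describe the same spatial locations, which is the property that feature aggregation across paths relies on.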

Results

State-of-the-art results on ImageNet-1K classification, COCO detection, and ADE20K segmentation compared with other backbones.

Other (e.g., why was it accepted?)