#362
summarized by : Anonymous
Beyond Fixation: Dynamic Window Visual Transformer

What is this paper about?

The paper uses multi-scale windows to improve on backbones that rely on a single fixed window scale. Information from the different window scales is fused dynamically by assigning a weight to each scale.

Novelty

Presented as the first method to exploit multi-scale windows. It proposes a novel plug-and-play module with dynamic multi-scale windows for multi-head self-attention in transformers.
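
To make the idea of dynamically weighted multi-scale windows concrete, here is a minimal NumPy sketch. It is not the paper's implementation: the function names, the gating matrix `w_gate`, the pooled-feature gate, and the choice to run every window size over all channels with no learned Q/K/V projections are all assumptions made for illustration.

```python
import numpy as np


def softmax(x, axis=-1):
    # Numerically stable softmax.
    x = x - x.max(axis=axis, keepdims=True)
    e = np.exp(x)
    return e / e.sum(axis=axis, keepdims=True)


def window_attention(x, window):
    """Self-attention inside non-overlapping window x window patches.

    x: (H, W, C) feature map, with H and W divisible by `window`.
    Simplification (assumption): Q = K = V = x, no learned projections.
    """
    H, W, C = x.shape
    nh, nw = H // window, W // window
    # Partition into windows: (nh, nw, window*window, C).
    t = x.reshape(nh, window, nw, window, C).transpose(0, 2, 1, 3, 4)
    t = t.reshape(nh, nw, window * window, C)
    # Scaled dot-product attention restricted to each window.
    attn = softmax(t @ t.transpose(0, 1, 3, 2) / np.sqrt(C), axis=-1)
    out = attn @ t
    # Undo the window partition to recover (H, W, C).
    out = out.reshape(nh, nw, window, window, C).transpose(0, 2, 1, 3, 4)
    return out.reshape(H, W, C)


def dynamic_multi_window_attention(x, windows, w_gate):
    """Fuse attention branches computed at several window sizes.

    windows: list of window sizes, one branch per size.
    w_gate:  (C, len(windows)) hypothetical gating weights (assumption).
    Returns the fused (H, W, C) map and the per-branch weights.
    """
    branches = np.stack([window_attention(x, w) for w in windows])  # (S, H, W, C)
    # Global descriptor: sum the branches, then average over spatial positions.
    pooled = branches.sum(axis=0).mean(axis=(0, 1))  # (C,)
    # Input-dependent weights over the scales, summing to 1.
    weights = softmax(pooled @ w_gate)  # (S,)
    fused = np.tensordot(weights, branches, axes=1)  # (H, W, C)
    return fused, weights
```

Calling `dynamic_multi_window_attention(x, [2, 3, 4], w_gate)` on a 12x12 feature map returns a fused map of the same shape plus one weight per window size. The weights depend on the input, which is the sense in which the fusion is dynamic.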

Results

Outperforms other backbones on classification, object detection, and segmentation tasks.

Other (why was it accepted? etc.)