#33
summarized by: Anonymous
Recurrent Glimpse-Based Decoder for Detection With Transformer

What kind of paper is this?

Reduces the training difficulty of DETR by appending recurrent glimpse-based decoders (REGO) after the DETR decoder. Instead of attending over the whole feature map, REGO computes attention only within enlarged RoI regions derived from the boxes DETR has already detected.
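Below is a minimal PyTorch sketch of the glimpse-feature extraction step this describes: previous-stage boxes are enlarged around their centers and pooled with RoIAlign. The function name, the enlargement factor, and the pooling size are illustrative assumptions, not the paper's exact values.

```python
import torch
from torchvision.ops import roi_align

def glimpse_features(feature_map, boxes, enlarge=2.0, output_size=7, spatial_scale=1.0):
    """Pool RoI ("glimpse") features from regions enlarged around predicted boxes.

    feature_map: (N, C, H, W) encoder/backbone features.
    boxes: (K, 5) RoIs as (batch_idx, x1, y1, x2, y2) in feature-map coordinates.
    enlarge: box-widening factor (hypothetical; the paper uses stage-dependent scales).
    """
    idx = boxes[:, :1]
    x1, y1, x2, y2 = boxes[:, 1], boxes[:, 2], boxes[:, 3], boxes[:, 4]
    cx, cy = (x1 + x2) / 2, (y1 + y2) / 2
    w, h = (x2 - x1) * enlarge, (y2 - y1) * enlarge
    # Enlarge each box around its center, then pool the enlarged region.
    enlarged = torch.stack([cx - w / 2, cy - h / 2, cx + w / 2, cy + h / 2], dim=1)
    rois = torch.cat([idx, enlarged], dim=1)
    # Returns (K, C, output_size, output_size) glimpse features.
    return roi_align(feature_map, rois, output_size=output_size, spatial_scale=spatial_scale)
```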

Novelty

This is the first time RoI-based refinement has been applied to an attention model to achieve effective sparse attention. Queries are learned only from features inside the enlarged RoIs, while keys and values are learned from the decoder features (see the sketch below).
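A minimal sketch of how this query/key-value split could be wired up, under the summary's reading: queries come from the pooled glimpse features, keys and values from the previous decoder output. `GlimpseRefineStage`, the linear projection, and the tensor shapes are assumptions for illustration, not the paper's exact module.

```python
import torch
import torch.nn as nn

class GlimpseRefineStage(nn.Module):
    """One refinement stage: queries from pooled glimpse (RoI) features,
    keys/values from the previous DETR decoder output."""

    def __init__(self, d_model=256, n_heads=8):
        super().__init__()
        self.attn = nn.MultiheadAttention(d_model, n_heads, batch_first=True)
        self.proj = nn.Linear(d_model, d_model)  # hypothetical projection of glimpse features

    def forward(self, glimpse_feats, decoder_feats):
        # glimpse_feats: (B, Q, d_model) -- one pooled vector per object query's enlarged RoI
        # decoder_feats: (B, Q, d_model) -- previous decoder output
        q = self.proj(glimpse_feats)
        refined, _ = self.attn(q, decoder_feats, decoder_feats)
        return refined  # would feed refined box/class heads in the full model

# Example: refine 100 object queries with d_model=256.
stage = GlimpseRefineStage()
g = torch.randn(2, 100, 256)   # pooled glimpse features
d = torch.randn(2, 100, 256)   # previous decoder output
out = stage(g, d)              # (2, 100, 256)
```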

Results

Generally improves both the training speed and the accuracy of DETR variants (e.g., DETR and Deformable DETR).

Other (e.g., why was it accepted?)

It shows that effective locality modeling is important for reducing the training difficulty of attention in DETR.