ICCV 2021 Paper Summaries

tag: vision-and-language

MDETR - Modulated Detection for End-to-End Multi-Modal Understanding

by: 飯田啄巳

Object detection, Vision and language

Detector-Free Weakly Supervised Grounding by Separation

by: 飯田啄巳

Vision and language

UniT: Multimodal Multitask Learning With a Unified Transformer

by: Akihiro FUJII

Multi modal, Object detection, Segmentation, Vision and language

Adversarial VQA: A New Benchmark for Evaluating the Robustness of VQA Models

by: SY

Dataset, Robustness, Vision and language, VQA

Contrast and Classify: Training Robust VQA Models

by: SY

Vision and language, VQA