ECCV2022 Paper Summaries
MultiMAE: Multi-modal Multi-task Masked Autoencoders
by: Yui Iioka @Keio-Univ.
Attention
Depth estimation
Multi modal
Segmentation
Semantic segmentation
Learning Ego 3D Representation As Ray Tracing
by: Shoji Sonoyama
3D object detection
Segmentation
BEV
Towards Grand Unification of Object Tracking
by: Masanori Yano
Video
Object tracking
SOT
MOT
VOS
MOTS
Hunting Group Clues with Transformers for Social Group Activity Recognition
by: Chihiro Nakatani (中谷千洋)
Action recognition
Graph Neural Network for Cell Tracking in Microscopy Videos
by: Ryuichi Nakahara
Object detection
Cell tracking
Relationformer: A Unified Framework for Image-to-Graph Generation
by: Ryuichi Nakahara
Object detection
Medical
PointScatter: Point Set Representation for Tubular Structure Extraction
by: Ryuichi Nakahara
Point cloud
Medical
Differentiable Zooming for Multiple Instance Learning on Whole-Slide Images
by: Ryuichi Nakahara
Medical
Self-Supervised Sparse Representation for Video Anomaly Detection
by: Shota Nishiyama
Video
Anomaly detection
Detecting Twenty-Thousand Classes Using Image-Level Supervision
by: Hirokatsu Kataoka
Dataset
Object detection
Recognition
Abstracting Sketches through Simple Primitives
by: Hirokatsu Kataoka
Dataset
Self-supervised learning
Interpretability
DeiT III: Revenge of the ViT
by: Hirokatsu Kataoka
Recognition
Representation learning
Semantic segmentation
MVP: Multimodality-Guided Visual Pre-training
by: Hirokatsu Kataoka
Representation learning
Self-supervised learning
MovieCuts: A New Dataset and Benchmark for Cut Type Recognition
by: Hirokatsu Kataoka
Action recognition
Dataset
Video
Explicit Occlusion Reasoning for Multi-Person 3D Human Pose Estimation
by: Ryuichi Nakahara
Pose estimation
DeciWatch: A Simple Baseline for 10× Efficient 2D and 3D Pose Estimation
by: Ryuichi Nakahara
Pose estimation
Knowledge Condensation Distillation
by: Hirokatsu Kataoka
Knowledge distillation
Representation learning
Self-Supervision Can Be a Good Few-Shot Learner
by: Hirokatsu Kataoka
Self-supervised learning
Few-shot Learning
Are Vision Transformers Robust to Patch Perturbations?
by: Shuya Takahashi (髙橋 秀弥)
Adversarial examples
Object detection
Robustness
OccamNets: Mitigating Dataset Bias by Favoring Simpler Hypotheses
by: Shinnosuke Matsufusa
Robustness
Cross-Modal Knowledge Transfer without Task-Relevant Source Data
by: Shinnosuke Matsufusa
Depth estimation
Multi modal
Improving Robustness by Enhancing Weak Subnets
by: Shinnosuke Matsufusa
Adversarial examples
Neural architecture search (NAS)
Robustness
Real-Time Online Video Detection with Temporal Smoothing Transformers
by: Shinnosuke Matsufusa
Video
Toward Understanding WordArt: Corner-Guided Transformer for Scene Text Recognition
by: Shinnosuke Matsufusa
Recognition
Robustness
SeqFormer: Sequential Transformer for Video Instance Segmentation
by: Shinnosuke Matsufusa
Segmentation
Video
Understanding the Dynamics of DNNs Using Graph Modularity
by: Shuya Takahashi (髙橋 秀弥)
Recognition
Representation learning
Explainable AI for CV
Dense Siamese Network for Dense Unsupervised Learning
by: Ryo Nakamura
Contributions of Shape, Texture, and Color in Visual Recognition
by: Shuya Takahashi (髙橋 秀弥)
Recognition
Explainable AI
TISE: Bag of Metrics for Text-to-Image Synthesis Evaluation
by: Seitaro Shinagawa
Vision and language