CVPR 2022 Paper Summaries
Detector-Free Weakly Supervised Group Activity Recognition
by: Chihiro Nakatani(中谷 千洋)
Action recognition
Attention
Recognition
Video
360MonoDepth: High-Resolution 360° Monocular Depth Estimation
by: Ryunosuke Isikawa
3D
3D object detection
StyleMesh: Style Transfer for Indoor 3D Scene Reconstructions
by: Keiichi Sawada
3D
3D reconstruction
Mip-NeRF 360: Unbounded Anti-Aliased Neural Radiance Fields
by: Hirokatsu Kataoka
Neural radiance fields (NeRF)
Everything at Once – Multi-Modal Fusion Transformer for Video Retrieval
by: Chihiro Nakatani(中谷 千洋)
Multi modal
Video
Dual-AI: Dual-Path Actor Interaction Learning for Group Activity Recognition
by: Chihiro Nakatani(中谷 千洋)
Action recognition
Recognition
Video
ObjectFormer for Image Manipulation Detection and Localization
by: 岡本大和(LINE Computer Vision Lab)
Attention
Manipulation Detection
Proactive Image Manipulation Detection
by: 岡本大和(LINE Computer Vision Lab)
Adversarial examples
Attention
Manipulation Detection
Self-Supervised Learning of Adversarial Example: Towards Good Generalizations for Deepfake Detection
by: 岡本大和(LINE Computer Vision Lab)
Adversarial examples
Fake
TableFormer: Table Structure Understanding With Transformers
by: 岡本大和(LINE Computer Vision Lab)
Object detection
Document analysis
Block-NeRF: Scalable Large Scene Neural View Synthesis
by: Hirokatsu Kataoka
Neural radiance fields (NeRF)
Style-ERD: Responsive and Coherent Online Motion Style Transfer
by: Kosuke Fukazawa
3D
Motion Synthesis
FENeRF: Face Editing in Neural Radiance Fields
by: Hirokatsu Kataoka
3D reconstruction
GAN
Neural radiance fields (NeRF)
Deblur-NeRF: Neural Radiance Fields From Blurry Images
by: Hirokatsu Kataoka
Neural radiance fields (NeRF)
NeRF-Editing: Geometry Editing of Neural Radiance Fields
by: Hirokatsu Kataoka
Neural radiance fields (NeRF)
ZeroCap: Zero-Shot Image-to-Text Generation for Visual-Semantic Arithmetic
by: Ryuichi Nakahara
BEHAVE: Dataset and Method for Tracking Human Object Interactions
by: Hirokatsu Kataoka
Pose estimation
Vision-Language Pre-Training for Boosting Scene Text Detectors
by: Hirokatsu Kataoka
Object detection
GenDR: A Generalized Differentiable Renderer
by: Norikatsu Sumi
3D reconstruction
Differentiable renderer
I M Avatar: Implicit Morphable Head Avatars From Videos
by: 山田亮佑 (Ryosuke Yamada)
3D
3D reconstruction
Unsupervised Pre-Training for Temporal Action Localization Tasks
by: Ryota Hashiguchi
Representation learning
Video
OpenTAL: Towards Open Set Temporal Action Localization
by: Ryota Hashiguchi
Video
Temporal Action Localization
AutoSDF: Shape Priors for 3D Completion, Reconstruction and Generation
by: Hirokatsu Kataoka
3D
3D reconstruction
SGTR: End-to-End Scene Graph Generation With Transformer
by: Yoshiki Nagasaki
Attention
Scene Graph Generation
Backdoor Attacks on Self-Supervised Learning
by: 山田亮佑 (Ryosuke Yamada)
Representation learning
Self supervised learning
Contrastive Test-Time Adaptation
by: Hirokatsu Kataoka
Domain adaptation
Representation learning
Self supervised learning
Point-Level Region Contrast for Object Detection Pre-Training
by: 山田亮佑 (Ryosuke Yamada)
Object detection
Self supervised learning
Episodic Memory Question Answering
by: Yusuke Mori
3D
Dataset
Multi modal
Robustness
Vision and language
GazeOnce: Real-Time Multi-Person Gaze Estimation
by: Anonymous
Multimodal Material Segmentation
by: Masanori YANO
Dataset
Multi modal
Segmentation
Semantic segmentation
Noise Is Also Useful: Negative Correlation-Steered Latent Contrastive Learning
by: Yui Iioka (Keio University)
Dataset
Object detection
Robustness
Self supervised learning
Class-Incremental Learning With Strong Pre-Trained Models
by: Hirokatsu Kataoka
Class-Incremental Learning
Patch Slimming for Efficient Vision Transformers
by: Sora Takashima (高島 空良)
Attention
Recognition
Pruning
Semantic-Aware Domain Generalized Segmentation
by: Takehiro Matsuda
Domain adaptation
Semantic segmentation
Structured Sparse R-CNN for Direct Scene Graph Generation
by: Yoshiki Nagasaki
Scene Graph Generation
Language As Queries for Referring Video Object Segmentation
by: Ryuichi Nakahara
Segmentation
Video
Vision and language
Capturing and Inferring Dense Full-Body Human-Scene Contact
by: Kosuke Fukazawa
3D
Dataset
Human Scene Contact
Which Images To Label for Few-Shot Medical Landmark Detection?
by: Ryuichi Nakahara
Object detection
Incremental Cross-View Mutual Distillation for Self-Supervised Medical CT Synthesis
by: Ryuichi Nakahara
3D
IFOR: Iterative Flow Minimization for Robotic Object Rearrangement
by: Takahiro Suzuki
3D
Optical flow
Towards Low-Cost and Efficient Malaria Detection
by: Shunsuke Yoshizawa
Dataset
Domain adaptation
Object detection
Revisiting Skeleton-Based Action Recognition
by: Masanori YANO
Action recognition
Pose estimation
Video
Globetrotter: Connecting Languages by Connecting Images
by: Ryosuke Oshima
Dataset
Vision and language
LiT: Zero-Shot Transfer With Locked-Image Text Tuning
by: Sora Takashima (高島 空良)
N-shot learning
Recognition
Representation learning
Vision and language