CVPR 2021 Paper Summaries

3D 3D reconstruction Representation learning

NeRF in the Wild: Neural Radiance Fields for Unconstrained Photo Collections

by: Naoya Chiba

pixelNeRF: Neural Radiance Fields From One or Few Images

by: Naoya Chiba

3D 3D reconstruction Representation learning

Energy-Based Learning for Scene Graph Generation

by: Shuhei M Yoshida

Recognition Scene graph generation Energy-based model

Fully Convolutional Scene Graph Generation

by: Shuhei M Yoshida

Recognition Scene graph generation

Probabilistic Modeling of Semantic Ambiguity for Scene Graph Generation

by: Shuhei M Yoshida

Recognition Scene graph generation Uncertainty

Differentiable SLAM-Net: Learning Particle SLAM for Visual Navigation

by: shoji sonoyama

Pose estimation SLAM Navigation

What if We Only Use Real Datasets for Scene Text Recognition? Toward Scene Text Recognition With Fewer Labels

by: So Uchida

Dataset Self supervised learning Vision and language Text Recognition

Bipartite Graph Network With Adaptive Message Passing for Unbiased Scene Graph Generation

by: Shuhei M Yoshida

Recognition Scene graph generation Long-tail Imbalance

Read Like Humans: Autonomous, Bidirectional and Iterative Language Modeling for Scene Text Recognition

by: So Uchida

Attention Multi modal Recognition Vision and language Text Recognition

LOHO: Latent Optimization of Hairstyles via Orthogonalization

by: Shion Honda

Dataset Multi modal Recognition Vision and language Text Recognition

Dictionary-Guided Scene Text Recognition

by: So Uchida

DARCNN: Domain Adaptive Region-Based Convolutional Neural Network for Unsupervised Instance Segmentation in Biomedical Images

by: Anonymous

Domain adaptation Segmentation

Sequence-to-Sequence Contrastive Learning for Text Recognition

by: So Uchida

Self supervised learning Text Recognition

Learning Better Visual Dialog Agents With Pretrained Visual-Linguistic Representation

by: Seitaro Shinagawa

GAN Multi modal Vision and language

Cross-Modal Contrastive Learning for Text-to-Image Generation

by: Seitaro Shinagawa

Iterative Shrinking for Referring Expression Grounding Using Deep Reinforcement Learning

by: Seitaro Shinagawa

Multi modal Object detection Vision and language Referring expression

LayoutTransformer: Scene Layout Generation With Conceptual and Spatial Diversity

by: yasud

Vision and language Scene Graph

NeuralFusion: Online Depth Fusion in Latent Space

by: Itsuki Ueda

3D 3D reconstruction Representation learning

Learning by Planning: Language-Guided Global Image Editing

by: Seitaro Shinagawa

Dataset Vision and language

Neighbor2Neighbor: Self-Supervised Denoising From Single Noisy Images

by: Hiroki Nakamura

Self supervised learning Denoising

Scan2Cap: Context-Aware Dense Captioning in RGB-D Scans

by: Katsuyuki Nakamura

3D object detection Multi modal Vision and language

Deep Implicit Templates for 3D Shape Representation

by: Naoya Chiba

3D 3D reconstruction Representation learning Self supervised learning

D-NeRF: Neural Radiance Fields for Dynamic Scenes

by: Naoya Chiba

3D 3D reconstruction Representation learning Self supervised learning Video

Explicit Knowledge Incorporation for Visual Reasoning

by: Tasuku KINJO

Vision and language Visual Reasoning

Can Audio-Visual Integration Strengthen Robustness Under Multimodal Attacks?

by: Keita Goto

Action recognition Multi modal Robustness Video

FAIEr: Fidelity and Adequacy Ensured Image Caption Evaluation

by: Seitaro Shinagawa

Counterfactual VQA: A Cause-Effect Look at Language Bias

by: Seitaro Shinagawa

Recognition Vision and language Scene graph generation

Pushing It Out of the Way: Interactive Visual Navigation

by: Seitaro Shinagawa

3D Visual Navigation

Linguistic Structures As Weak Supervision for Visual Scene Graph Generation

by: Shuhei M Yoshida

Transitional Adaptation of Pretrained Models for Visual Storytelling

by: yasud

Multi modal Vision and language Metric Learning

Revamping Cross-Modal Recipe Retrieval With Hierarchical Transformers and Self-Supervised Learning

by: yasud

Re-Labeling ImageNet: From Single to Multi-Labels, From Global to Localized Labels

by: Akihiro FUJII

Dataset Object detection Recognition

Simple Copy-Paste Is a Strong Data Augmentation Method for Instance Segmentation

by: Akihiro FUJII

Instance segmentation Object detection data augmentation

Training Generative Adversarial Networks in One Stage

by: Akihiro FUJII

Transformation Driven Visual Reasoning

by: Tasuku KINJO

Self supervised learning Anomaly detection

Self-Supervised Motion Learning From Static Images

by: Hiroki Nakamura

Self supervised learning

CutPaste: Self-Supervised Learning for Anomaly Detection and Localization

by: Akihiro FUJII

Involution: Inverting the Inherence of Convolution for Visual Recognition

by: Akihiro FUJII

Attention Object detection Recognition Segmentation

RepVGG: Making VGG-Style ConvNets Great Again

by: Akihiro FUJII

3D Representation learning Self supervised learning Super resolution

Variational Transformer Networks for Layout Generation

by: yasud

Layout Generation

Neural Geometric Level of Detail: Real-Time Rendering With Implicit 3D Shapes

by: Naoya Chiba

Deep Optimized Priors for 3D Shape Modeling and Reconstruction

by: Naoya Chiba

3D Point cloud Representation learning Self supervised learning

GDR-Net: Geometry-Guided Direct Regression Network for Monocular 6D Object Pose Estimation

by: shoji sonoyama

3D object detection Pose estimation

Spatially Consistent Representation Learning

by: Shuhei M Yoshida

Instance segmentation Object detection Representation learning Self supervised learning Contrastive learning

MetricOpt: Learning To Optimize Black-Box Evaluation Metrics

by: Masanori YANO

Object detection Recognition Learning method

UP-DETR: Unsupervised Pre-Training for Object Detection With Transformers

by: 福原吉博 (Yoshihiro Fukuhara)

Object detection Unsupervised Learning

Pedestrian and Ego-Vehicle Trajectory Prediction From Monocular Camera

by: Fukuchi Nobuaki

Depth estimation Disentanglement Object detection Self supervised learning

Learning Probabilistic Ordinal Embeddings for Uncertainty-Aware Regression

by: Koki Obinata

Video Vision and language Retrieval

Rethinking and Improving the Robustness of Image Style Transfer

by: 福原吉博 (Yoshihiro Fukuhara)

Style Transfer

Thinking Fast and Slow: Efficient Text-to-Visual Retrieval With Transformers

by: Shintaro Yamamoto

Incremental Few-Shot Instance Segmentation

by: Hiroaki Aizawa

Instance segmentation N-shot learning

Seeing Out of the Box: End-to-End Pre-Training for Vision-Language Representation Learning

by: Yutaka Kawashima

3D 3D reconstruction Disentanglement Point cloud Representation learning

MoViNets: Mobile Video Networks for Efficient Video Recognition

by: Akihiro FUJII

Action recognition

Joint Learning of 3D Shape Retrieval and Deformation

by: Naoya Chiba

Point Cloud Instance Segmentation Using Probabilistic Embeddings

by: Naoya Chiba

3D Instance segmentation Representation learning Segmentation

HistoGAN: Controlling Colors of GAN-Generated and Real Images via Color Histograms

by: Shoma Iwai

GAN Style Transfer Colorization

Fashion IQ: A New Dataset Towards Retrieving Images by Natural Language Feedback

by: 岡本大和

Dataset Segmentation Semantic segmentation Self supervised learning

NeX: Real-Time View Synthesis With Neural Basis Expansion

by: yasud

3D Novel View Synthesis

Every Annotation Counts: Multi-Label Deep Supervision for Medical Image Segmentation

by: Shunsuke Yoshizawa

PANDA: Adapting Pretrained Features for Anomaly Detection and Segmentation

by: hiroyuki masuda

Anomaly Detection

ATSO: Asynchronous Teacher-Student Optimization for Semi-Supervised Image Segmentation

by: Shunsuke Yoshizawa

Segmentation Semantic segmentation Self supervised learning

Deep Two-View Structure-From-Motion Revisited

by: Yutaro Oguri

3D 3D reconstruction Depth estimation Optical flow Pose estimation

Non-Salient Region Object Mining for Weakly Supervised Semantic Segmentation

by: Shunsuke Yoshizawa

Segmentation Semantic segmentation CAM

Semi-Supervised Semantic Segmentation With Cross Pseudo Supervision

by: Shunsuke Yoshizawa

Segmentation Semantic segmentation Self supervised learning

Where and What? Examining Interpretable Disentangled Representations

by: 福原吉博 (Yoshihiro Fukuhara)

Disentanglement GAN

Parser-Free Virtual Try-On via Distilling Appearance Flows

by: 綱島秀樹

GAN Virtual Try-on

Soft-IntroVAE: Analyzing and Improving the Introspective Variational Autoencoder

by: 綱島秀樹

Disentanglement VAE

Exemplar-Based Open-Set Panoptic Segmentation Network

by: Hiroaki Aizawa

Bottleneck Transformers for Visual Recognition

by: Akihiro FUJII

Instance segmentation Recognition Segmentation

Divergence Optimization for Noisy Universal Domain Adaptation

by: Anonymous

Domain adaptation Robustness

The Lottery Ticket Hypothesis for Object Recognition

by: Akihiro FUJII

Instance segmentation Object detection Segmentation Lottery ticket hypothesis

Human POSEitioning System (HPS): 3D Human Pose Estimation and Self-Localization in Large Scenes From Body-Mounted Sensors

by: Katsuyuki Nakamura

3D Dataset Multi modal Pose estimation

SSAN: Separable Self-Attention Network for Video Representation Learning

by: Kensho Hara

Object detection Representation learning Robustness

GIRAFFE: Representing Scenes As Compositional Generative Neural Feature Fields

by: yasud

3D GAN

Learning Continuous Image Representation With Local Implicit Image Function

by: Hiroaki Aizawa

Explaining Classifiers Using Adversarial Perturbations on the Perceptual Ball

by: Tasuku KINJO

Invertible Image Signal Processing

by: Teppei Kurita

ISP Invertible

Boosting Monocular Depth Estimation Models to High-Resolution via Content-Adaptive Multi-Resolution Merging

by: Teppei Kurita

Depth estimation Monocular Depth Estimation High-Resolution

Weakly-Supervised Physically Unconstrained Gaze Estimation

by: Teppei Kurita

Gaze Estimation Weakly-Supervised

Separating Skills and Concepts for Novel Visual Question Answering

by: Shintaro Yamamoto

Generalized Domain Adaptation

by: Anonymous

Roses Are Red, Violets Are Blue… but Should VQA Expect Them To?

by: Jumpei Suzuki

TediGAN: Text-Guided Diverse Face Image Generation and Manipulation

by: Tadashi Ise

Dataset GAN Multi modal

Scaled-YOLOv4: Scaling Cross Stage Partial Network

by: Masanori YANO

Privacy-Preserving Image Features via Adversarial Affine Subspace Embeddings

by: 福原吉博 (Yoshihiro Fukuhara)

Privacy

Fostering Generalization in Single-View 3D Reconstruction by Learning a Hierarchy of Local and Global Shape Priors

by: Naoya Chiba

Action recognition Video Video Retrieval

Reconstructing 3D Human Pose by Watching Humans in the Mirror

by: Soma Nonaka

Pose estimation

Removing the Background by Adding the Background: Towards Background Robust Self-Supervised Video Representation Learning

by: 綱島秀樹

PAConv: Position Adaptive Convolution With Dynamic Kernel Assembling on Point Clouds

by: Naoya Chiba

3D Point cloud Segmentation Semantic segmentation

PLOP: Learning Without Forgetting for Continual Semantic Segmentation

by: Shunsuke Yoshizawa

Domain adaptation Knowledge distillation Segmentation Semantic segmentation Continuous learning

Instance Localization for Self-Supervised Detection Pretraining

by: 古澤嘉久

Object detection Representation learning Self supervised learning

Revisiting Knowledge Distillation: An Inheritance and Exploration Framework

by: Shunsuke Yoshizawa

Birds of a Feather: Capturing Avian Shape Models From Images

by: Teppei Kurita

Dataset Portrait Retouching

PPR10K: A Large-Scale Portrait Photo Retouching Dataset With Human-Region Mask and Group-Level Consistency

by: Teppei Kurita

Dynamic Head: Unifying Object Detection Heads With Attentions

by: y.inoue

Attention Object detection

Stochastic Image-to-Video Synthesis Using cINNs

by: Teppei Kurita

Video Image-to-Video cINN

The Temporal Opportunist: Self-Supervised Multi-Frame Monocular Depth

by: Teppei Kurita

3D Depth estimation Self supervised learning

Cuboids Revisited: Learning Robust 3D Shape Fitting to Single RGB Images

by: Teppei Kurita

3D Cuboids Primitive

NeRD: Neural 3D Reflection Symmetry Detector

by: Teppei Kurita

3D Reflection Symmetry

Multi-Label Learning From Single Positive Labels

by: Akihiro FUJII

Recognition multi label

Correlated Input-Dependent Label Noise in Large-Scale Image Classification

by: Akihiro FUJII

Recognition Label noise

Towards Compact CNNs via Collaborative Compression

by: Shunsuke Yoshizawa

Model compression

Learning Deep Classifiers Consistent With Fine-Grained Novelty Detection

by: Akihiro FUJII

Recognition Novelty detection

End-to-End Learning for Joint Image Demosaicing, Denoising and Super-Resolution

by: Teppei Kurita

Super resolution Demosaicing Denoising

Enhancing the Transferability of Adversarial Attacks Through Variance Tuning

by: 伊藤諒悟

Adversarial examples Adversarial Attack

Invertible Denoising Network: A Light Solution for Real Noise Removal

by: Shoma Iwai

Denoising Image Restoration Invertible Model

Adversarial Generation of Continuous Images

by: Hiroaki Aizawa

Pose Transfer Virtual Try-on Editing

Disentangled Cycle Consistency for Highly-Realistic Virtual Try-On

by: 綱島秀樹

GAN Virtual Try-on

PISE: Person Image Synthesis and Editing With Decoupled GAN

by: 綱島秀樹

Taming Transformers for High-Resolution Image Synthesis

by: 綱島秀樹

GAN VAE Transformer Inpainting Image Editing Super Resolution

Adaptive Cross-Modal Prototypes for Cross-Domain Visual-Language Retrieval

by: Shintaro Yamamoto

Domain adaptation Vision and language

Learned Initializations for Optimizing Coordinate-Based Neural Representations

by: Hiroaki Aizawa

Probabilistic Embeddings for Cross-Modal Retrieval

by: 金城忍

Multi modal

A Multiplexed Network for End-to-End, Multilingual OCR

by: 岡本大和

Object detection Recognition Vision and language

Understanding the Behaviour of Contrastive Loss

by: 金城忍

Exploring Simple Siamese Representation Learning

by: 福原吉博 (Yoshihiro Fukuhara)

Dataset Multi modal Recognition Vision and language

TextOCR: Towards Large-Scale End-to-End Reasoning for Arbitrary-Shaped Scene Text

by: 岡本大和

Distilling Audio-Visual Knowledge by Compositional Contrastive Learning

by: Keita Goto

Action recognition Knowledge distillation Video

DeepMetaHandles: Learning Deformation Meta-Handles of 3D Meshes With Biharmonic Coordinates

by: Naoya Chiba

3D Disentanglement Point cloud Representation learning Self supervised learning

DyCo3D: Robust Instance Segmentation of 3D Point Clouds Through Dynamic Convolution

by: Naoya Chiba

3D Instance segmentation Object detection Point cloud Robustness Segmentation

BoxInst: High-Performance Instance Segmentation With Box Annotations

by: Masanori YANO

Instance segmentation

Rethinking Semantic Segmentation From a Sequence-to-Sequence Perspective With Transformers

by: Shunsuke Yoshizawa

Attention Segmentation Transformer

Equalization Loss v2: A New Gradient Balance Approach for Long-Tailed Object Detection

by: Akihiro FUJII

Object detection long tailed

InverseForm: A Loss Function for Structured Boundary-Aware Segmentation

by: Shunsuke Yoshizawa

Segmentation Semantic segmentation Loss function

Rethinking BiSeNet for Real-Time Semantic Segmentation

by: Shion Honda

Semantic segmentation Video

Truly Shift-Invariant Convolutional Neural Networks

by: 金城忍

CNN Shift equivariant Shift invariance 2D reconstruction

Towards Semantic Segmentation of Urban-Scale 3D Point Clouds: A Dataset, Benchmarks and Challenges

by: Anonymous

3D Dataset Semantic segmentation

Can We Characterize Tasks Without Labels or Features?

by: Hiroaki Aizawa

SelfDoc: Self-Supervised Document Representation Learning

by: Shintaro Yamamoto

document recognition

Multiple Instance Active Learning for Object Detection

by: Shun.ishizaka

Object detection Active learning

SelfAugment: Automatic Augmentation Policies for Self-Supervised Learning

by: Akihiro FUJII

Representation learning Self supervised learning Data augmentation

Multimodal Contrastive Training for Visual Representation Learning

by: 金城忍

Multi modal Representation learning Self supervised learning

Image Change Captioning by Learning From an Auxiliary Task

by: Shintaro Yamamoto

Interpretability Explainable

Transformer Interpretability Beyond Attention Visualization

by: 平澤寅庄

Asymmetric Gained Deep Image Compression With Continuous Rate Adaptation

by: Shoma Iwai

Image Compression

Orthogonal Over-Parameterized Training

by: 金城忍

Orthogonal transformation Hyperspherical Learning

MaX-DeepLab: End-to-End Panoptic Segmentation With Mask Transformers

by: Akihiro FUJII

Attention Segmentation

Synthesizing Long-Term 3D Human Motion and Interaction in 3D Scenes

by: Shun.ishizaka

3D Video VAE

Towards Open World Object Detection

by: Akihiro FUJII

Dataset Object detection

Domain-Robust VQA With Diverse Datasets and Methods but No Target Labels

by: Shintaro Yamamoto

Domain adaptation Vision and language

Dense Contrastive Learning for Self-Supervised Visual Pre-Training

by: 金城忍

Object detection Segmentation Self supervised learning Contrastive learning

Generalizing to the Open World: Deep Visual Odometry With Online Adaptation

by: shoji sonoyama

Depth estimation Pose estimation Self supervised learning SLAM visual odometry

Visual Room Rearrangement

by: 綱島秀樹

Embodied Learning Visual Exploration Navigation Planning Rearrangement

VirTex: Learning Visual Representations From Textual Annotations

by: Shintaro Yamamoto

Recognition Representation learning Vision and language

Natural Adversarial Examples

by: 伊藤諒悟

Adversarial examples Dataset Recognition Robustness

Multiresolution Knowledge Distillation for Anomaly Detection

by: Shunsuke Nakatsuka

Knowledge distillation Anomaly detection

Neural Parts: Learning Expressive 3D Shape Abstractions With Invertible Neural Networks

by: Naoya Chiba

3D Disentanglement Point cloud Representation learning Self supervised learning

SpinNet: Learning a General Surface Descriptor for 3D Point Cloud Registration

by: Naoya Chiba

3D Point cloud Representation learning

Deep Stable Learning for Out-of-Distribution Generalization

by: 福原吉博 (Yoshihiro Fukuhara)

Generalization

Augmentation Strategies for Learning With Noisy Labels

by: Masanori YANO

Dataset Segmentation Semantic segmentation Active learning

Three Ways To Improve Semantic Segmentation With Self-Supervised Depth Estimation

by: Shunsuke Yoshizawa

Adversarial Robustness Under Long-Tailed Distribution

by: Akihiro FUJII

Adversarial examples Robustness long tailed

Boundary IoU: Improving Object-Centric Image Segmentation Evaluation

by: Shunsuke Yoshizawa

Instance segmentation Segmentation Semantic segmentation Evaluation

Capsule Network Is Not More Robust Than Convolutional Network

by: Akihiro FUJII

Meta learning Person re-identification Unsupervised domain adaptation Batch normalization Instance normalization

Meta Batch-Instance Normalization for Generalizable Person Re-Identification

by: 金城忍

Polarimetric Normal Stereo

by: Takahiro Kushida

3D 3D reconstruction Depth estimation

Deep Denoising of Flash and No-Flash Pairs for Photography in Low-Light Environments

by: Teppei Kurita

Denoising

Multi-Attentional Deepfake Detection

by: Akihiro FUJII

Attention deep fake

Anycost GANs for Interactive Image Synthesis and Editing

by: Shoma Iwai

Domain adaptation Self supervised learning Unsupervised domain adaptation Supervised learning Adversarial learning Clustering

Cross-Domain Gradient Discrepancy Minimization for Unsupervised Domain Adaptation

by: 金城忍

Robust Consistent Video Depth Estimation

by: Soma Nonaka

Depth estimation

Architectural Adversarial Robustness: The Case for Deep Pursuit

by: Takayuki Semitsu

Adversarial examples Robustness

KeypointDeformer: Unsupervised 3D Keypoint Discovery for Shape Control

by: Naoya Chiba

3D Disentanglement Representation learning Self supervised learning

Few-Shot 3D Point Cloud Semantic Segmentation

by: Naoya Chiba

3D N-shot learning Point cloud Segmentation Semantic segmentation

Regularizing Generative Adversarial Networks Under Limited Data

by: Takayuki Semitsu

How Well Do Self-Supervised Models Transfer?

by: 福原吉博 (Yoshihiro Fukuhara)

Structured Scene Memory for Vision-Language Navigation

by: hisaka koji

GAN Robustness Positional encoding Inductive bias CNN Translation invariant

Positional Encoding As Spatial Inductive Bias in GANs

by: 金城忍

Continual Semantic Segmentation via Repulsion-Attraction of Sparse and Disentangled Latent Representations

by: Shigemichi Matsuzaki

Knowledge distillation Representation learning Semantic segmentation

Uncertainty-Guided Model Generalization to Unseen Domains

by: 金城忍

Meta learning Robustness Domain generalization Curriculum learning

Graph-Based High-Order Relation Modeling for Long-Term Action Recognition

by: Kengo Ino

3D Point cloud Representation learning Self supervised learning

Cycle4Completion: Unpaired Point Cloud Completion Using Cycle Transformation With Missing Region Coding

by: Naoya Chiba

Deformed Implicit Field: Modeling 3D Shapes With Learned Dense Correspondence

by: Naoya Chiba

3D Disentanglement Representation learning Self supervised learning

FCPose: Fully Convolutional Multi-Person Pose Estimation With Dynamic Instance-Aware Convolutions

by: Masanori YANO

Object detection Pose estimation

Vx2Text: End-to-End Learning of Video-Based Text Generation From Multimodal Inputs

by: 金城忍

Multi modal Video Vision and language

LAFEAT: Piercing Through Adversarial Defenses With Latent Features

by: 福原吉博 (Yoshihiro Fukuhara)

Adversarial examples Robustness

Causal Attention for Vision-Language Tasks

by: hisaka koji

Domain adaptation GAN Unsupervised learning

DRANet: Disentangling Representation and Adaptation Networks for Unsupervised Cross-Domain Adaptation

by: 金城忍

Rethinking Class Relations: Absolute-Relative Supervised and Unsupervised Few-Shot Learning

by: Shuhei M Yoshida

Meta learning N-shot learning Self supervised learning

ORDisCo: Effective and Efficient Usage of Incremental Unlabeled Data for Semi-Supervised Continual Learning

by: Shunsuke Nakatsuka

Continual learning

Propagate Yourself: Exploring Pixel-Level Consistency for Unsupervised Visual Representation Learning

by: Hiroki Nakamura

Self supervised learning

Euro-PVI: Pedestrian Vehicle Interactions in Dense Urban Centers

by: fukuchi nobuaki

Dataset Disentanglement Multi modal

Learning Affinity-Aware Upsampling for Deep Image Matting

by: Masanori YANO

Image matting Upsampling

A Large-Scale Study on Unsupervised Spatiotemporal Representation Learning

by: Shuhei M Yoshida

Action recognition Representation learning Self supervised learning Video

DoDNet: Learning To Segment Multi-Organ and Tumors From Multiple Partially Labeled Datasets

by: Shunsuke Kogure

3D Dataset Segmentation

Learning Cross-Modal Retrieval With Noisy Labels

by: yusuke.okimoto

Multi modal image retrieval

DeepI2P: Image-to-Point Cloud Registration via Deep Classification

by: Naoya Chiba

3D Point cloud Pose estimation Self supervised learning

CorrNet3D: Unsupervised End-to-End Learning of Dense Correspondence for 3D Point Clouds

by: Naoya Chiba

3D Point cloud Self supervised learning

How Transferable Are Reasoning Patterns in VQA?

by: Seitaro Shinagawa

Multi-Scale Aligned Distillation for Low-Resolution Detection

by: Yutaro Oguri

Multi modal Contrastive learning

VisualVoice: Audio-Visual Speech Separation With Cross-Modal Consistency

by: 金城忍

Offboard 3D Object Detection From Point Cloud Sequences

by: Naoya Chiba

3D 3D object detection Disentanglement Point cloud Pose estimation Self supervised learning

House-GAN++: Generative Adversarial Layout Refinement Network towards Intelligent Computational Agent for Professional Architects

by: 金城忍

No Frame Left Behind: Full Video Action Recognition

by: Kengo Ino

3D Point cloud Representation learning Self supervised learning

Point2Skeleton: Learning Skeletal Representations from Point Clouds

by: Naoya Chiba

Mask-ToF: Learning Microlens Masks for Flying Pixel Correction in Time-of-Flight Imaging

by: Takahiro Kushida

3D 3D reconstruction Depth estimation Point cloud

Practical Single-Image Super-Resolution Using Look-Up Table

by: Shoma Iwai

Super resolution

Coarse-To-Fine Domain Adaptive Semantic Segmentation With Photometric Alignment and Category-Center Regularization

by: Shigemichi Matsuzaki

3D Video Vision and language

Bayesian Nested Neural Networks for Uncertainty Calibration and Adaptive Compression

by: Anonymous

Bayesian Neural Network

Less Is More: ClipBERT for Video-and-Language Learning via Sparse Sampling

by: yasud

Transformer Meets Tracker: Exploiting Temporal Context for Robust Visual Tracking

by: 福原吉博 (Yoshihiro Fukuhara)

Attention Object detection Transformer

Generic Perceptual Loss for Modeling Structured Output Dependencies

by: Masanori YANO

Depth estimation Instance segmentation Segmentation Semantic segmentation Super resolution Style transfer

Effective Snapshot Compressive-Spectral Imaging via Deep Denoising and Total Variation Priors

by: Shigekazu Takizawa

Computational imaging Compressive sensing

HLA-Face: Joint High-Low Adaptation for Low Light Face Detection

by: Takayuki Semitsu

Domain adaptation Robustness Self supervised learning

Sketch, Ground, and Refine: Top-Down Dense Video Captioning

by: QIU YUE

Self supervised learning Video Vision and language

Towards Bridging Event Captioner and Sentence Localizer for Weakly Supervised Dense Event Captioning

by: QIU YUE

Style-Aware Normalized Loss for Improving Arbitrary Style Transfer

by: Akihiro FUJII

GAN style transfer

Prototypical Cross-Domain Self-Supervised Learning for Few-Shot Unsupervised Domain Adaptation

by: 金城忍

Domain adaptation N-shot learning Self supervised learning Contrastive learning

Open-Book Video Captioning With Retrieve-Copy-Generate Network

by: QIU YUE

GAN Recognition Segmentation Semantic segmentation quantization

Zero-Shot Adversarial Quantization

by: Akihiro FUJII

Improving OCR-Based Image Captioning by Incorporating Geometrical Relationship

by: QIU YUE

Action recognition Attention Video fine grained

Temporal Query Networks for Fine-Grained Video Understanding

by: Akihiro FUJII

Zero-Shot Instance Segmentation

by: Akihiro FUJII

N-shot learning Segmentation zero shot

The Lottery Tickets Hypothesis for Supervised and Self-Supervised Pre-Training in Computer Vision Models

by: Akihiro FUJII

Representation learning Self supervised learning lottery tickets hypothesis transfer learning

Distilling Knowledge via Knowledge Review

by: Akihiro FUJII

CNN Pruning Model compression

Convolutional Neural Network Pruning With Structural Redundancy Reduction

by: 金城忍

One Shot Face Swapping on Megapixels

by: Akihiro FUJII

End-to-End Object Detection With Fully Convolutional Network

by: Akihiro FUJII

3D 3D reconstruction Depth estimation

Shape From Sky: Polarimetric Normal Recovery Under the Sky

by: Takahiro Kushida

ReMix: Towards Image-to-Image Translation With Limited Data

by: Akihiro FUJII

GAN style transfer

Uncertainty Reduction for Model Adaptation in Semantic Segmentation

by: Anonymous

Action recognition Knowledge distillation Self supervised learning Video

Towards Long-Form Video Understanding

by: Kensho Hara

Dataset Video

Anomaly Detection in Video via Self-Supervised and Multi-Task Learning

by: Ryo Nakamura

Pixel-Wise Anomaly Detection in Complex Driving Scenes

by: Shunsuke Nakatsuka

Dataset Segmentation Image matting

Semantic Image Matting

by: Masanori YANO

Predator: Registration of 3D Point Clouds With Low Overlap

by: Naoya Chiba

3D Attention Point cloud Self supervised learning

StyleMeUp: Towards Style-Agnostic Sketch-Based Image Retrieval

by: yusuke.okimoto

Disentanglement Meta learning Multi modal Representation learning image retrieval

Equivariant Point Network for 3D Point Cloud Analysis

by: Naoya Chiba

3D Attention Point cloud Pose estimation

A Sliced Wasserstein Loss for Neural Texture Synthesis

by: Takayuki Semitsu

Disentanglement Domain adaptation

Fair Feature Distillation for Visual Recognition

by: 福原吉博 (Yoshihiro Fukuhara)

Knowledge distillation Representation learning Fairness

Encoding in Style: A StyleGAN Encoder for Image-to-Image Translation

by: Tadashi Ise

GAN Multi modal

FrameExit: Conditional Early Exiting for Efficient Video Recognition

by: Chihiro Nakatani

Action recognition Recognition Video

How Does Topology Influence Gradient Propagation and Model Performance of Deep Networks With DenseNet-Type Skip Connections?

by: 金城忍

Semi-Supervised Semantic Segmentation With Directional Context-Aware Consistency

by: Shigemichi Matsuzaki

Representation learning Semantic segmentation Semi-supervised learning

Scene Essence

by: Teppei Kurita

Recognition Scene Essence

Time Adaptive Recurrent Neural Network

by: 金城忍

RNN Ordinary differential equations

Image Super-Resolution With Non-Local Sparse Attention

by: Shoma Iwai

Attention Super resolution

Towards Accurate Text-Based Image Captioning With Content Diversity Exploration

by: QIU YUE

Enriching ImageNet With Human Similarity Judgments and Psychological Embeddings

by: Ryota Suzuki

3D Domain adaptation Point cloud Segmentation Semantic segmentation

Complete & Label: A Domain Adaptation Approach to Semantic Segmentation of LiDAR Point Clouds

by: Takehiro Matsuda

Self-Point-Flow: Self-Supervised Scene Flow Estimation From Point Clouds With Optimal Transport and Random Walk

by: Naoya Chiba

3D Optical flow Point cloud Self supervised learning

A 3D GAN for Improved Large-Pose Facial Recognition

by: Katsuhiro Muto

3D Dataset GAN Robustness

PD-GAN: Probabilistic Diverse GAN for Image Inpainting

by: Teppei Kurita

GAN Inpainting

Function4D: Real-Time Human Volumetric Capture From Very Sparse Consumer RGBD Sensors

by: Teppei Kurita

3D 3D reconstruction Real-time Volumetric Capture

Camouflaged Object Segmentation With Distraction Mining

by: Teppei Kurita

Segmentation Camouflaged Object

Depth Completion Using Plane-Residual Representation

by: Teppei Kurita

Depth estimation Depth Completion

Contrastive Learning for Compact Single Image Dehazing

by: Teppei Kurita

Contrastive Learning Dehazing

Depth Completion With Twin Surface Extrapolation at Occlusion Boundaries

by: Teppei Kurita

Depth estimation Depth Completion

A Decomposition Model for Stereo Matching

by: Shoji Sonoyama

Depth estimation

SRWarp: Generalized Image Super-Resolution under Arbitrary Transformation

by: Teppei Kurita

Super resolution

Learning Optical Flow From Still Images

by: Teppei Kurita

Optical flow

Progressive Semantic Segmentation

by: Masanori YANO

3D GAN Point cloud Self supervised learning

Single-View 3D Object Reconstruction From Shape Priors in Memory

by: 牧原昂志

3D reconstruction

Style-Based Point Generator With Adversarial Rendering for Point Cloud Completion

by: Naoya Chiba

Im2Vec: Synthesizing Vector Graphics Without Vector Supervision

by: Naoya Chiba

Disentanglement Representation learning Self supervised learning

StyleMix: Separating Content and Style for Enhanced Data Augmentation

by: 福原吉博 (Yoshihiro Fukuhara)

Data Augmentation

Generative Classifiers as a Basis for Trustworthy Image Classification

by: Hiroaki Aizawa

Learning To Count Everything

by: 西村 和也 (Kyushu University)

Dataset N-shot learning visual counting

Towards Good Practices for Efficiently Annotating Large-Scale Image Classification Datasets

by: Anonymous

Dataset Self supervised learning

Adaptive Methods for Real-World Domain Generalization

by: 金城忍

Domain adaptation Self supervised learning Domain invariant

Are Labels Always Necessary for Classifier Accuracy Evaluation?

by: 金城忍

Classifier accuracy prediction

AutoInt: Automatic Integration for Fast Neural Volume Rendering

by: Naoya Chiba

3D Representation learning

Synthesize-It-Classifier: Learning a Generative Classifier Through Recurrent Self-Analysis

by: Hiroaki Aizawa

DriveGAN: Towards a Controllable High-Quality Neural Simulation

by: Akihiro FUJII

GAN neural simulation

User-Guided Line Art Flat Filling With Split Filling Mechanism

by: 伊藤諒悟

colorization

LightTrack: Finding Lightweight Neural Networks for Object Tracking via One-Shot Architecture Search

by: 小林範久

Object detection Tracking

SMD-Nets: Stereo Mixture Density Networks

by: Teppei Kurita

Depth estimation Stereo Depth

Pixel Codec Avatars

by: Teppei Kurita

Attention Dataset GAN Image translation

Saliency-Guided Image Translation

by: Masanori YANO

MagFace: A Universal Representation for Face Recognition and Quality Assessment

by: Kazuki Maeno

SuperMix: Supervising the Mixing Data Augmentation

by: 福原吉博 (Yoshihiro Fukuhara)

Data augmentation

VLN BERT: A Recurrent Vision-and-Language BERT for Navigation

by: hisaka koji

Explainability Human in the loop

Understanding Failures of Deep Networks via Robust Feature Extraction

by: 金城忍

Invisible Perturbations: Physical Adversarial Examples Exploiting the Rolling Shutter Effect

by: Teppei Kurita

Adversarial examples Perturbations

Learning Goals From Failure

by: Kensho Hara

Action recognition Dataset Video

Prototype Augmentation and Self-Supervision for Incremental Learning

by: Hiroaki Aizawa

Self supervised learning

Fair Attribute Classification Through Latent Space De-Biasing

by: Ryo Takahashi

Disentanglement GAN fairness

Adversarial Invariant Learning

by: 金城忍

Out of distribution Generalization Adversarial attack

L2M-GAN: Learning To Manipulate Latent Space Semantics for Facial Attribute Editing

by: Tadashi Ise

Quantifying Explainers of Graph Neural Networks in Computational Pathology

by: Tasuku KINJO

GNN

TDN: Temporal Difference Networks for Efficient Action Recognition

by: Kensho Hara

RSTNet: Captioning With Adaptive Attention on Visual and Non-Visual Words

by: QIU YUE

On Semantic Similarity in Video Retrieval

by: Kensho Hara

Information-Theoretic Segmentation by Inpainting Error Maximization

by: Anonymous

Segmentation

Towards Diverse Paragraph Captioning for Untrimmed Videos

by: QIU YUE

Multiple Instance Captioning: Learning Representations From Histopathology Textbooks and Articles

by: QIU YUE

3D Point cloud Semantic segmentation

ArtCoder: An End-to-End Method for Generating Scanning-Robust Stylized QR Codes

by: Ryota Suzuki

Style transfer

Semantic Segmentation for Real Point Cloud Scenes via Bilateral Augmentation and Adaptive Fusion

by: Naoya Chiba

Tracking Pedestrian Heads in Dense Crowd

by: 小林範久

Object detection Tracking

Deep Implicit Moving Least-Squares Functions for 3D Reconstruction

by: Naoya Chiba

3D 3D reconstruction Point cloud Representation learning Self supervised learning

Decoupled Dynamic Filter Networks

by: Masanori YANO

Depth estimation Object detection Recognition Upsampling

Stereo Radiance Fields (SRF): Learning View Synthesis for Sparse Views of Novel Scenes

by: Takayuki Semitsu

3D reconstruction Point cloud Robustness Self supervised learning

On Robustness and Transferability of Convolutional Neural Networks

by: 金城忍

Dataset Domain adaptation Robustness Out-of-Distribution Transfer learning preprocessing

LiDAR R-CNN: An Efficient and Universal 3D Object Detector

by: Fukuchi Nobuaki

3D 3D object detection Point cloud Pose estimation

Meta Pseudo Labels

by: Hiroki Ohashi

Meta learning Recognition

Denoise and Contrast for Category Agnostic Shape Completion

by: Naoya Chiba

3D 3D reconstruction Point cloud Self supervised learning

Fully Understanding Generic Objects: Modeling, Segmentation, and Reconstruction

by: Naoya Chiba

3D 3D reconstruction Depth estimation Disentanglement Point cloud Self supervised learning

NExT-QA: Next Phase of Question-Answering to Explaining Temporal Actions

by: QIU YUE

Dataset Video Vision and language

SUTD-TrafficQA: A Question Answering Benchmark and an Efficient Network for Video Reasoning Over Traffic Events

by: QIU YUE

Predicting Human Scanpaths in Visual Question Answering

by: QIU YUE

Attention Knowledge distillation XAI

A Peek Into the Reasoning of Neural Networks: Interpreting With Structural Visual Concepts

by: Tasuku KINJO

PQA: Perceptual Question Answering

by: QIU YUE

GAN Unsupervised learning Image translation

DG-Font: Deformable Generative Networks for Unsupervised Font Generation

by: Masanori YANO

Representation Learning via Global Temporal Alignment and Cycle-Consistency

by: 福原吉博 (Yoshihiro Fukuhara)

Representation learning Video

Learning the Best Pooling Strategy for Visual Semantic Embedding

by: 金城忍

Multi modal Average pool Max pool K-MAX pool

Found a Reason for me? Weakly-supervised Grounded Visual Question Answering using Capsules

by: QIU YUE

Recognition Vision and language

Bridge To Answer: Structure-Aware Graph Interaction Network for Video Question Answering

by: QIU YUE

Adversarial examples Video Object tracking

IoU Attack: Towards Temporally Coherent Black-Box Adversarial Attack for Visual Object Tracking

by: 伊藤諒悟

DexYCB: A Benchmark for Capturing Hand Grasping of Objects

by: 牧原昂志

3D Dataset

SOE-Net: A Self-Attention and Orientation Encoding Network for Point Cloud Based Place Recognition

by: Naoya Chiba

3D Adversarial examples Point cloud

PointGuard: Provably Robust 3D Point Cloud Classification

by: Naoya Chiba

Differentiable Patch Selection for Image Recognition

by: Shunsuke Nakatsuka

Selection

UPFlow: Upsampling Pyramid for Unsupervised Optical Flow Learning

by: Masanori YANO

Optical flow Unsupervised learning Upsampling

Faster Meta Update Strategy for Noise-Robust Deep Learning

by: Akihiro FUJII

Meta learning Robustness

MaxUp: Lightweight Adversarial Training With Data Augmentation Improves Neural Network Training

by: Akihiro FUJII

Recognition Robustness long tailed

Improving Calibration for Long-Tailed Recognition

by: Akihiro FUJII

Long-Tailed Multi-Label Visual Recognition by Collaborative Training on Uniform and Re-Balanced Samplings

by: Akihiro FUJII

Robustness long tailed

Learning Invariant Representations and Risks for Semi-Supervised Domain Adaptation

by: 金城忍

Domain adaptation GAN

Semantic Segmentation With Generative Models: Semi-Supervised Learning and Strong Out-of-Domain Generalization

by: Anonymous

Semantic segmentation Domain generalization

Scene Text Retrieval via Joint Text Detection and Similarity Learning

by: Shintaro Yamamoto

Multi modal Vision and language Retrieval

Open-Vocabulary Object Detection Using Captions

by: Shintaro Yamamoto

Attention Multi modal Video Affective computing

Topological Planning With Transformers for Vision-and-Language Navigation

by: QIU YUE

3D Vision and language

Affect2MM: Affective Analysis of Multimedia Content Using Emotion Causality

by: Kensho Hara

clDice – A Novel Topology-Preserving Loss Function for Tubular Structure Segmentation

by: Shunsuke Yoshizawa

Segmentation Topology

Binary TTC: A Temporal Geofence for Autonomous Navigation

by: QIU YUE

Optical flow Semantic segmentation Video

Video Object Segmentation Using Global and Instance Embedding Learning

by: Ryo Nakamura

Instance segmentation Video

Guided Interactive Video Object Segmentation Using Reliability-Based Attention Maps

by: Anonymous

Attention Semantic segmentation Video

SOON: Scenario Oriented Object Navigation With Graph-Based Exploration

by: QIU YUE

TAP: Text-Aware Pre-Training for Text-VQA and Text-Caption

by: Norihito Ishida

3D 3D object detection Instance segmentation Point cloud Segmentation Semantic segmentation

Panoptic-PolarNet: Proposal-Free LiDAR Point Cloud Panoptic Segmentation

by: Naoya Chiba

ArtEmis: Affective Language for Visual Art

by: Norihito Ishida

Dataset Vision and language Emotional estimation Explainable AI

Regularization Strategy for Point Cloud via Rigidly Mixed Sample

by: Naoya Chiba

3D 3D object detection 3D reconstruction

OCONet: Image Extrapolation by Object Completion

by: Masanori YANO

GAN Self supervised learning Image extrapolation Image completion

Distribution Alignment: A Unified Framework for Long-Tail Visual Recognition

by: Takayuki Semitsu

Instance segmentation Object detection Representation learning Robustness Semantic segmentation

Fast and Accurate Model Scaling

by: Akihiro FUJII

Neural architecture search(NAS) Recognition

Contrastive Learning Based Hybrid Networks for Long-Tailed Image Classification

by: Akihiro FUJII

Recognition Representation learning long tailed

Rethinking Channel Dimensions for Efficient Model Design

by: 金城忍

Neural architecture search(NAS)

Disentangling Label Distribution for Long-Tailed Visual Recognition

by: Akihiro FUJII

Recognition long tailed

Joint Generative and Contrastive Learning for Unsupervised Person Re-Identification

by: 金城忍

3D Disentanglement GAN Person re-identification Domain transfer Contrastive learning

Exploiting & Refining Depth Distributions With Triangulation Light Curtains

by: Teppei Kurita

Depth estimation Light Curtains

Adaptive Image Transformer for One-Shot Object Detection

by: Shintaro Yamamoto

N-shot learning Object detection

MP3: A Unified Model To Map, Perceive, Predict and Plan

by: QIU YUE

Point cloud Video

OpenRooms: An Open Framework for Photorealistic Indoor Scene Datasets

by: Shunsuke Kogure

Dataset Object detection Segmentation

On Feature Normalization and Data Augmentation

by: 金城忍

Normalization Decision boundary smoothing Augmentation

StyleSpace Analysis: Disentangled Controls for StyleGAN Image Generation

by: ianna

Disentanglement GAN image manipulation image generation

Objectron: A Large Scale Dataset of Object-Centric Videos in the Wild With Pose Annotations

by: Shunsuke Kogure

3D 3D object detection Dataset Object detection

MIST: Multiple Instance Self-Training Framework for Video Anomaly Detection

by: Ryo Nakamura

Recognition Representation learning

Primitive Representation Learning for Scene Text Recognition

by: Hirokatsu Kataoka

Discrimination-Aware Mechanism for Fine-Grained Representation Learning

by: Hirokatsu Kataoka

Recognition Representation learning

General Multi-Label Image Classification With Transformers

by: Shintaro Yamamoto

3D 3D object detection 3D reconstruction Point cloud Recognition

SceneGraphFusion: Incremental 3D Scene Graph Prediction From RGB-D Sequences

by: Hirokatsu Kataoka

Semantic Audio-Visual Navigation

by: QIU YUE

3D Dataset Vision and language

Blur, Noise, and Compression Robust Generative Adversarial Networks

by: 金城忍

GAN Robustness Cycle consistency loss

Room-and-Object Aware Knowledge Reasoning for Remote Embodied Referring Expression

by: 綱島秀樹

Embodied AI Navigation VLN

FESTA: Flow Estimation via Spatial-Temporal Attention for Scene Point Clouds

by: Naoya Chiba

3D Attention Point cloud

UnsupervisedR&R: Unsupervised Point Cloud Registration via Differentiable Rendering

by: Naoya Chiba

3D Point cloud Pose estimation Self supervised learning Video

PoseAug: A Differentiable Pose Augmentation Framework for 3D Human Pose Estimation

by: Takayuki Semitsu

3D Pose estimation Robustness

Learning To Warp for Style Transfer

by: Masanori YANO

Style transfer

Uncertainty-Aware Joint Salient Object and Camouflaged Object Detection

by: Takehiro Matsuda

Object detection Segmentation

MetaSAug: Meta Semantic Augmentation for Long-Tailed Visual Recognition

by: Akihiro FUJII

Recognition long tailed

Progressive Domain Expansion Network for Single Domain Generalization

by: 金城忍

Domain adaptation Representation learning VAE Contrastive learning

KeepAugment: A Simple Information-Preserving Data Augmentation Approach

by: Akihiro FUJII

Object detection Recognition Robustness

Improving the Efficiency and Robustness of Deepfakes Detection Through Precise Geometric Features

by: Akihiro FUJII

Recognition Robustness deep fake

Towards Evaluating and Training Verifiably Robust Neural Networks

by: Akihiro FUJII

N-shot learning Contrastive learning Supervised learning

Contrastive Embedding for Generalized Zero-Shot Learning

by: 金城忍

HyperSeg: Patch-Wise Hypernetwork for Real-Time Semantic Segmentation

by: Shigemichi Matsuzaki

Semantic segmentation Hypernetwork

Pre-Trained Image Processing Transformer

by: Shintaro Yamamoto

Super resolution Low-level vision

Exponential Moving Average Normalization for Self-Supervised and Semi-Supervised Learning

by: 金城忍

Self supervised learning Teacher-student network Semi-supervised learning Batch normalization

Bottom-Up Human Pose Estimation via Disentangled Keypoint Regression

by: 西村　和也 (九州大学)

Disentanglement Pose estimation

Differentiable Multi-Granularity Human Representation Learning for Instance-Aware Human Semantic Parsing

by: Soma Nonaka

Pose estimation Segmentation Semantic segmentation

Dynamic Domain Adaptation for Efficient Inference

by: 金城忍

Domain adaptation Robustness Batch normalization Instance normalization Layer normalization Encoder-decoder

Rethinking Text Segmentation: A Novel Dataset and a Text-Specific Refinement Approach

by: Ryoh Hayamizu

Dataset Segmentation

Post-Hoc Uncertainty Calibration for Domain Drift Scenarios

by: 平澤寅庄

Calibration

Adversarially Adaptive Normalization for Single Domain Generalization

by: 金城忍

Learning a Facial Expression Embedding Disentangled From Identity

by: Ryo Miyoshi

Disentanglement Facial Expression Recognition Feature Embedding

Recognizing Actions in Videos From Unseen Viewpoints

by: Ryo Nakamura

3D 3D reconstruction Action recognition Video

Unveiling the Potential of Structure Preserving for Weakly Supervised Object Localization

by: Ryo Nakamura

PV-RAFT: Point-Voxel Correlation Fields for Scene Flow Estimation of Point Clouds

by: Naoya Chiba

3D 3D object detection Object detection Pose estimation

GrooMeD-NMS: Grouped Mathematically Differentiable NMS for Monocular 3D Object Detection

by: Naoya Chiba

Depth-Aware Mirror Segmentation

by: Masanori YANO

Dataset Segmentation Semantic segmentation

Improving Sign Language Translation With Monolingual Data by Sign Back-Translation

by: hisaka koji

How2Sign: A Large-Scale Multimodal Dataset for Continuous American Sign Language

by: hisaka koji

UC2: Universal Cross-Lingual Cross-Modal Vision-and-Language Pre-Training

by: hisaka koji

Representation learning Neural Scene Representation

Neural Scene Flow Fields for Space-Time View Synthesis of Dynamic Scenes

by: 福原吉博 (Yoshihiro Fukuhara)

Neighborhood Contrastive Learning for Novel Class Discovery

by: 金城忍

Self supervised learning Contrastive learning Supervised learning

Quasi-Dense Similarity Learning for Multiple Object Tracking

by: Akihiro FUJII

multi object tracking

Actor-Context-Actor Relation Network for Spatio-Temporal Action Localization

by: Akihiro FUJII

Action recognition

Cross-View Cross-Scene Multi-View Crowd Counting

by: Akihiro FUJII

Dataset crowd counting

Transferable Semantic Augmentation for Domain Adaptation

by: 金城忍

Domain adaptation Distribution augmentation

MIST: Multiple Instance Spatial Transformer

by: Shintaro Yamamoto

Recognition Reconstruction

Learning Salient Boundary Feature for Anchor-free Temporal Action Localization

by: Shunsuke Kogure

Label smoothing Supervised learning

End-to-End Human Object Interaction Detection With HOI Transformer

by: Shintaro Yamamoto

Human Object Interaction

Improving Unsupervised Image Clustering With Robust Learning

by: 金城忍

Shallow Feature Matters for Weakly Supervised Object Localization

by: Ryo Nakamura

Recognition Segmentation

Region-Aware Adaptive Instance Normalization for Image Harmonization

by: Shoma Iwai

Image harmonization Image composition

Mitigating Face Recognition Bias via Group Adaptive Classifier

by: Shunsuke Kogure

Representation learning Fairness

Dual Contradistinctive Generative Autoencoder

by: 金城忍

GAN VAE Contrastive learning

Conceptual 12M: Pushing Web-Scale Image-Text Pre-Training To Recognize Long-Tail Visual Concepts

by: Shintaro Yamamoto

Dataset Vision and language

VoxelContext-Net: An Octree Based Framework for Point Cloud Compression

by: Naoya Chiba

3D Point cloud Representation learning Self supervised learning Video

Robust Point Cloud Registration Framework Based on Deep Graph Matching

by: Naoya Chiba

3D Point cloud Pose estimation Self supervised learning

SwiftNet: Real-Time Video Object Segmentation

by: Anonymous

Segmentation Video

DiNTS: Differentiable Neural Network Topology Search for 3D Medical Image Segmentation

by: Yuta Nakamura

3D Neural architecture search(NAS) Segmentation

Mirror3D: Depth Refinement for Mirror Surfaces

by: Masanori YANO

3D reconstruction Dataset Depth estimation Segmentation

Neural Surface Maps

by: Takayuki Semitsu

3D 3D reconstruction Point cloud

Consistent Instance False Positive Improves Fairness in Face Recognition

by: Kazuki Maeno

Fingerspelling Detection in American Sign Language

by: hisaka koji

Masksembles for Uncertainty Estimation

by: Akihiro FUJII

uncertainty

Kaleido-BERT: Vision-Language Pre-Training on Fashion Domain

by: hisaka koji

Collaborative Spatial-Temporal Modeling for Language-Queried Video Actor Segmentation

by: hisaka koji

Domain adaptation Knowledge distillation Transfer learning Semi supervised learning Fine tuning

Adaptive Consistency Regularization for Semi-Supervised Transfer Learning

by: 金城忍

Siamese Natural Language Tracker: Tracking by Natural Language Descriptions With Siamese Trackers

by: hisaka koji

Recognition Robustness noisy label

Partially View-Aligned Representation Learning With Noise-Robust Contrastive Loss

by: Akihiro FUJII

Multi modal Robustness

Jo-SRC: A Contrastive Approach for Combating Noisy Labels

by: Akihiro FUJII

S3: Learnable Sparse Signal Superdensity for Guided Depth Estimation

by: Teppei Kurita

Depth estimation Depth Completion

Diverse Part Discovery: Occluded Person Re-Identification With Part-Aware Transformer

by: Shintaro Yamamoto

Person re-identification

Facial Action Unit Detection With Transformers

by: Shintaro Yamamoto

Recognition facial expression

Joint Negative and Positive Learning for Noisy Labels

by: 金城忍

Noisy data Positive learning Negative learning

Rethinking the Heatmap Regression for Bottom-Up Human Pose Estimation

by: 西村　和也（九州大学）

Pose estimation

AdvSim: Generating Safety-Critical Scenarios for Self-Driving Vehicles

by: Takeru Oba

Adversarial examples Dataset Robustness

Regressive Domain Adaptation for Unsupervised Keypoint Detection

by: 西村　和也（九州大学）

Domain adaptation Pose estimation

Detection, Tracking, and Counting Meets Drones in Crowds: A Benchmark

by: Tasuku KINJO

Dataset Video crowd analysis

AGORA: Avatars in Geography Optimized for Regression Analysis

by: Ryoh Hayamizu

Look Before You Speak: Visually Contextualized Utterances

by: Shintaro Yamamoto

Dataset Super resolution Reference-SR

Robust Reference-Based Super-Resolution via C2-Matching

by: Shoma Iwai

Out-of-Distribution Detection Using Union of 1-Dimensional Subspaces

by: 金城忍

Robustness Out-of-domain detection Spectrum analysis Orthogonal

Exploring and Distilling Posterior and Prior Knowledge for Radiology Report Generation

by: Kazuma Nakata

Knowledge distillation Vision and language

From Points to Multi-Object 3D Reconstruction

by: Tomohiro Hayase

3D reconstruction

Adversarial Robustness Across Representation Spaces

by: 福原吉博 (Yoshihiro Fukuhara)

Adversarial examples Robustness

PMP-Net: Point Cloud Completion by Learning Multi-Step Point Moving Paths

by: Naoya Chiba

3D Point cloud

Image De-Raining via Continual Learning

by: Masanori YANO

Deraining Learning method Continual learning

View-Guided Point Cloud Completion

by: Naoya Chiba

3D 3D reconstruction Depth estimation Point cloud

Deep Compositional Metric Learning

by: 金城忍

Metric learning Ensemble learning

Stochastic Whitening Batch Normalization

by: 金城忍

Batch normalization Iterative normalization Whitening

Back-Tracing Representative Points for Voting-Based 3D Object Detection in Point Clouds

by: Naoya Chiba

3D 3D object detection Object detection Point cloud Pose estimation

Omni-Supervised Point Cloud Segmentation via Gradual Receptive Field Component Reasoning

by: Naoya Chiba

3D Point cloud Segmentation Semantic segmentation

Exploring Complementary Strengths of Invariant and Equivariant Representations for Few-Shot Learning

by: 金城忍

N-shot learning Inductive biases Transformation equivariance / invariance Contrastive learning

VideoMoCo: Contrastive Video Representation Learning With Temporally Adversarial Examples

by: 金城忍

GAN Representation learning Contrastive learning

Understanding and Simplifying Perceptual Distances

by: 金城忍

Convolutional neural network Maximum mean discrepancy

Cross-Iteration Batch Normalization

by: 金城忍

Batch normalization

Single Pair Cross-Modality Super Resolution

by: Masanori YANO

Multi modal Super resolution

Sketch2Model: View-Aware 3D Modeling From Single Free-Hand Sketches

by: yasud

3D 3D reconstruction Domain adaptation Sketch

Flow-Based Kernel Prior With Application to Blind Super-Resolution

by: 近藤佑樹 (Yuki Kondo)

Self supervised learning Super resolution Deep generative model

Human-Like Controllable Image Captioning With Verb-Specific Semantic Roles

by: Seitaro Shinagawa

Action recognition Dataset Representation learning Video

Dynamic Weighted Learning for Unsupervised Domain Adaptation

by: 金城忍

Domain adaptation GAN

Home Action Genome: Cooperative Compositional Action Understanding

by: Hirokatsu Kataoka

SCALE: Modeling Clothed Humans with a Surface Codec of Articulated Local Elements

by: Hirokatsu Kataoka

Densely Connected Multi-Dilated Convolutional Networks for Dense Prediction Tasks

by: Hirokatsu Kataoka

Recognition Architecture

SceneGen: Learning To Generate Realistic Traffic Scenes

by: Hirokatsu Kataoka

Self-driving Cars

Knowledge Evolution in Neural Networks

by: Hirokatsu Kataoka

Computational Photography

Unsupervised Pre-Training for Person Re-Identification

by: Hirokatsu Kataoka

Dataset Recognition

Intelligent Carpet: Inferring 3D Human Pose From Tactile Signals

by: Hirokatsu Kataoka

3D Recognition

Composing Photos Like a Photographer

by: masato tonouchi

Skip-Convolutions for Efficient Video Processing

by: 金城忍

Convolutional neural network Skip connection

StickyPillars: Robust and Efficient Feature Matching on Point Clouds Using Graph Neural Networks

by: Naoya Chiba

PWCLO-Net: Deep LiDAR Odometry in 3D Point Clouds Using Hierarchical Embedding Mask Optimization

by: Naoya Chiba

Action recognition Multi modal Video

Repetitive Activity Counting by Sight and Sound

by: Katsuyuki Nakamura

Surrogate Gradient Field for Latent Space Manipulation

by: 金城忍

Object detection Person re-identification Person search

Scene-Intuitive Agent for Remote Embodied Visual Grounding

by: 綱島秀樹

Embodied AI REVERIE VLN

Anchor-Free Person Search

by: Masanori YANO

Coming Down to Earth: Satellite-to-Street View Synthesis for Geo-Localization

by: Katsuhiro Muto

Multi-Stage Aggregated Transformer Network for Temporal Language Localization in Videos

by: hisaka koji

Structured Multi-Level Interaction Network for Video Moment Localization via Language Query

by: hisaka koji

Towards More Flexible and Accurate Object Tracking With Natural Language: Algorithms and Benchmark

by: hisaka koji

Knowledge distillation Recognition

Learning Student Networks in the Wild

by: Hirokatsu Kataoka

OBoW: Online Bag-of-Visual-Words Generation for Self-Supervised Learning

by: Hirokatsu Kataoka

Knowledge distillation Self supervised learning

Refer-It-in-RGBD: A Bottom-Up Approach for 3D Visual Grounding in RGBD Images

by: Seitaro Shinagawa

3D 3D object detection Vision and language

KOALAnet: Blind Super-Resolution Using Kernel-Oriented Adaptive Local Adjustment

by: 近藤佑樹 (Yuki Kondo)

Self supervised learning Super resolution

Relation-aware Instance Refinement for Weakly Supervised Visual Grounding

by: Seitaro Shinagawa

Instance segmentation Semantic segmentation Panoptic Segmentation Associative embeddings

Hierarchical Lovász Embeddings for Proposal-Free Panoptic Segmentation

by: 金城忍

Wasserstein Contrastive Representation Distillation

by: 金城忍

FVC: A New Framework Towards Deep Video Compression in Feature Space

by: Shoma Iwai

Video Video compression

Shape and Material Capture at Home

by: Teppei Kurita

Pose estimation Semantic segmentation Backbone

CodedStereo: Learned Phase Masks for Large Depth-of-Field Stereo

by: Teppei Kurita

Depth estimation

Lite-HRNet: A Lightweight High-Resolution Network

by: Teppei Kurita

Neural Camera Simulators

by: Teppei Kurita

Simulator

Pareidolia Face Reenactment

by: Teppei Kurita

Pareidolia Face Reenactment

Passive Inter-Photon Imaging

by: Teppei Kurita

SPAD

Explore Image Deblurring via Encoded Blur Kernel Space

by: Teppei Kurita

Deblur

Refine Myself by Teaching Myself: Feature Refinement via Self-Knowledge Distillation

by: Teppei Kurita

Knowledge distillation Self supervised learning

Learning Multi-Scale Photo Exposure Correction

by: Kensho Hara

Dataset Photo Exposure Correction

More Photos Are All You Need: Semi-Supervised Learning for Fine-Grained Sketch Based Image Retrieval

by: yasud

Information Retrieval Sketch

Semi-Supervised Video Deraining With Dynamical Rain Generator

by: Teppei Kurita

Deraining

SSLayout360: Semi-Supervised Indoor Layout Estimation From 360° Panorama

by: Teppei Kurita

Layout Estimation Panorama

Skeleton Merger: An Unsupervised Aligned Keypoint Detector

by: Teppei Kurita

3D Unsupervised Learning Keypoint Detector

Deep Gaussian Scale Mixture Prior for Spectral Compressive Imaging

by: Shigekazu Takizawa

Computational imaging Compressive imaging

Learning the Superpixel in a Non-Iterative and Lifelong Manner

by: Teppei Kurita

Superpixel

Student-Teacher Learning From Clean Inputs to Noisy Inputs

by: Teppei Kurita

Frequency-Aware Discriminative Feature Learning Supervised by Single-Center Loss for Face Forgery Detection

by: Teppei Kurita

Face Forgery Detection

Sewer-ML: A Multi-Label Sewer Defect Classification Dataset and Benchmark

by: Teppei Kurita

Depth estimation Multi modal Echo

Beyond Image to Depth: Improving Depth Prediction Using Echoes

by: Teppei Kurita

DatasetGAN: Efficient Labeled Data Factory With Minimal Human Effort

by: Soma Nonaka

Dataset GAN Semantic segmentation

A Fourier-Based Framework for Domain Generalization

by: Ryo Takahashi

Self supervised learning Video Vision and language

Audio-Visual Instance Discrimination with Cross-Modal Agreement

by: QIU YUE

All Labels Are Not Created Equal: Enhancing Semi-Supervision via Label Grouping and Co-Training

by: Hirokatsu Kataoka

Recognition Representation learning Pseudo Labels

Visual Navigation With Spatial Attention

by: QIU YUE

3D Vision and language

Point Cloud Upsampling via Disentangled Refinement

by: Naoya Chiba

3D Action recognition Disentanglement Point cloud

i3DMM: Deep Implicit 3D Morphable Model of Human Heads

by: Naoya Chiba

3D 3D reconstruction Disentanglement Representation learning

GeoSim: Realistic Video Simulation via Geometry-Aware Composition for Self-Driving

by: QIU YUE

3D Dataset Video

Privacy-Preserving Collaborative Learning With Automatic Transformation Search

by: Kai Watabe

Data augmentation Collaborative learning

Bidirectional Projection Network for Cross Dimension Scene Understanding

by: Kai Watabe

3D Segmentation Semantic segmentation

Learning To Segment Actions From Visual and Language Instructions via Differentiable Weak Sequence Alignment

by: hisaka koji

Read and Attend: Temporal Localisation in Sign Language Videos

by: hisaka koji

FSDR: Frequency Space Domain Randomization for Domain Generalization

by: Kai Watabe

Knowledge distillation Continual learning

Magic Layouts: Structural Prior for Component Detection in User Interface Designs

by: yasud

Object detection Layout

On Learning the Geodesic Path for Incremental Learning

by: Shunsuke Nakatsuka

Prototype-Guided Saliency Feature Learning for Person Search

by: Masanori YANO

Attention Object detection Person re-identification Person search

VITON-HD: High-Resolution Virtual Try-On via Misalignment-Aware Normalization

by: 綱島秀樹

Virtual Try-on

Generalized Focal Loss V2: Learning Reliable Localization Quality Estimation for Dense Object Detection

by: 古澤嘉久

Neural architecture search(NAS) Robustness

DSRNA: Differentiable Search of Robust Neural Architectures

by: Akihiro FUJII

Bilinear Parameterization for Non-Separable Singular Value Penalties

by: 金城忍

Low rank approximation

Dive Into Ambiguity: Latent Distribution Mining and Pairwise Uncertainty Estimation for Facial Expression Recognition

by: Akihiro FUJII

Recognition Robustness uncertainty

Multi-Objective Interpolation Training for Robustness To Label Noise

by: Akihiro FUJII

DAT: Training Deep Networks Robust To Label-Noise by Matching the Feature Distributions

by: Akihiro FUJII

3D reconstruction Depth estimation Point cloud

Learning To Recover 3D Scene Shape From a Single Image

by: Yusuke Saito

Variational Relational Point Completion Network

by: Shunsuke Yoshizawa

Dataset Point cloud

AGQA: A Benchmark for Compositional Spatio-Temporal Reasoning

by: QIU YUE

Dataset Video Vision and language

Few-Shot Classification With Feature Map Reconstruction Networks

by: 金城忍

N-shot learning

You See What I Want You To See: Exploring Targeted Black-Box Transferability Attack for Hash-Based Image Retrieval Systems

by: 伊藤諒悟

Adversarial examples Image Retrieval

Seesaw Loss for Long-Tailed Instance Segmentation

by: Shunsuke Yoshizawa

Instance segmentation Segmentation Loss function

Temporal Action Segmentation From Timestamp Supervision

by: Shunsuke Kogure

Action recognition Segmentation Video

Deep Animation Video Interpolation in the Wild

by: Anonymous

GAN Video

High-Fidelity Neural Human Motion Transfer From Monocular Video

by: Anonymous

3D Video

Intentonomy: A Dataset and Study Towards Human Intent Understanding

by: Shintaro Yamamoto

Representation learning Self supervised learning Video

Task Programming: Learning Data Efficient Behavior Representations

by: QIU YUE

Person30K: A Dual-Meta Generalization Network for Person Re-Identification

by: Ryoh Hayamizu

Multi-Target Domain Adaptation With Collaborative Consistency Learning

by: 金城忍

Nutrition5k: Towards Automatic Nutritional Understanding of Generic Food

by: Shintaro Yamamoto

Neural Lumigraph Rendering

by: QIU YUE

3D reconstruction

NewtonianVAE: Proportional Control and Goal Identification From Pixels via Physical Latent Spaces

by: QIU YUE

Dataset Object detection Recognition Robustness

Taskology: Utilizing Task Relations at Scale

by: Shintaro Yamamoto

Generalizable Pedestrian Detection: The Elephant in the Room

by: yoshiki miyazawa

Temporally-Weighted Hierarchical Clustering for Unsupervised Action Segmentation

by: Shunsuke Kogure

Action recognition Segmentation Video

Exploring Adversarial Fake Images on Face Manifold

by: 伊藤諒悟

Adversarial examples GAN Adversarial attack

Learning Camera Localization via Dense Scene Matching

by: Takayuki Semitsu

Object detection Segmentation Semantic segmentation

Simultaneously Localize, Segment and Rank the Camouflaged Objects

by: Takehiro Matsuda

Dynamic Region-Aware Convolution

by: Masanori YANO

Instance segmentation Object detection Recognition

Neural Body: Implicit Neural Representations With Structured Latent Codes for Novel View Synthesis of Dynamic Humans

by: Naoya Chiba

3D 3D reconstruction Representation learning Video

HVPR: Hybrid Voxel-Point Representation for Single-Stage 3D Object Detection

by: Naoya Chiba

3D 3D object detection Object detection Point cloud Pose estimation

Efficient Feature Transformations for Discriminative and Generative Continual Learning

by: 金城忍

Continual learning

LiBRe: A Practical Bayesian Approach to Adversarial Detection

by: 伊藤諒悟

Adversarial examples Adversarial detection

MagDR: Mask-Guided Detection and Reconstruction for Defending Deepfakes

by: Akihiro FUJII

Recognition Robustness deep fake

Representative Forgery Mining for Fake Face Detection

by: Akihiro FUJII

Recognition deep fake

A Second-Order Approach to Learning With Instance-Dependent Label Noise

by: Akihiro FUJII

Learning Semantic-Aware Dynamics for Video Prediction

by: Shoma Iwai

Adversarial examples Object detection Robustness

Domain-Independent Dominance of Adaptive Methods

by: 金城忍

Optimizer

Class-Aware Robust Adversarial Training for Object Detection

by: Akihiro FUJII

AdCo: Adversarial Contrast for Efficient Learning of Unsupervised Representations From Self-Trained Negative Adversaries

by: 福原吉博 (Yoshihiro Fukuhara)

Attention Data compression

Attention-Guided Image Compression by Deep Reconstruction of Compressive Sensed Saliency Skeleton

by: Shigekazu Takizawa

TrafficSim: Learning To Simulate Realistic Multi-Agent Behaviors

by: QIU YUE

Dataset Video

Jigsaw Clustering for Unsupervised Visual Representation Learning

by: Hiroki Abe

LaPred: Lane-Aware Prediction of Multi-Modal Future Trajectories of Dynamic Agents

by: QIU YUE

N-shot learning Object detection Vision and language

Semantic Relation Reasoning for Shot-Stable Few-Shot Object Detection

by: QIU YUE

Abstract Spatial-Temporal Reasoning via Probabilistic Abduction and Execution

by: QIU YUE

Scale-Localized Abstract Reasoning

by: QIU YUE

3D 3D object detection Attention Point cloud Pose estimation

AutoDO: Robust AutoAugment for Biased Data With Label Noise via Scalable Probabilistic Implicit Differentiation

by: Akihiro FUJII

Recognition augmentation

3D Object Detection With Pointformer

by: Naoya Chiba

VarifocalNet: An IoU-Aware Dense Object Detector

by: Masanori YANO

3D Disentanglement Representation learning

NeRV: Neural Reflectance and Visibility Fields for Relighting and View Synthesis

by: Naoya Chiba

Closed-Form Factorization of Latent Semantics in GANs

by: Takehiro Matsuda

Domain Adaptation With Auxiliary Target Domain-Oriented Classifier

by: 金城忍

Illumination Estimation White Balance

DivCo: Diverse Conditional Image Synthesis via Contrastive Generative Adversarial Network

by: 金城忍

GAN Contrastive learning

Leveraging the Availability of Two Cameras for Illuminant Estimation

by: Teppei Kurita

Hardness Sampling for Self-Training Based Transductive Zero-Shot Learning

by: 金城忍

N-shot learning

SPSG: Self-Supervised Photometric Scene Generation From RGB-D Scans

by: Yusuke Saito

3D 3D reconstruction GAN

Learning by Watching

by: Takeru Oba

Imitation Learning

M3P: Learning Universal Representations via Multitask Multilingual Multimodal Pre-Training

by: Shintaro Yamamoto

3D object detection data augmentation LiDAR

LiDAR-Aug: A General Rendering-Based Augmentation Framework for 3D Object Detection

by: Akihiro FUJII

Repopulating Street Scenes

by: Shintaro Yamamoto

Image synthesis

S3: Neural Shape, Skeleton, and Skinning Fields for 3D Human Modeling

by: Naoya Chiba

3D Disentanglement Multi modal Point cloud Pose estimation Semantic segmentation

Hierarchical Layout-Aware Graph Convolutional Network for Unified Aesthetics Assessment

by: yasud

GCN Image Aesthetics assessment

Robust and Accurate Object Detection via Adversarial Learning

by: yoshiki miyazawa

Adversarial examples Object detection Recognition Robustness

Gradient Forward-Propagation for Large-Scale Temporal Video Modelling

by: Kensho Hara

Dataset Video

Deep Learning in Latent Space for Video Prediction and Compression

by: Shoma Iwai

Video Video Compression

Generating Diverse Structure for Image Inpainting With Hierarchical VQ-VAE

by: Shoma Iwai

Image inpainting

Affordance Transfer Learning for Human-Object Interaction Detection

by: QIU YUE

Image-to-image translation

StEP: Style-Based Encoder Pre-Training for Multi-Modal Image Synthesis

by: Shoma Iwai

Few-Shot Image Generation via Cross-Domain Correspondence

by: 金城忍

Domain adaptation GAN N-shot learning

GLEAN: Generative Latent Bank for Large-Factor Image Super-Resolution

by: Shoma Iwai

GAN Super resolution

CT-Net: Complementary Transfering Network for Garment Transfer With Arbitrary Geometric Changes

by: 綱島秀樹

Virtual Try-on Garment Transfer

FS-Net: Fast Shape-Based Network for Category-Level 6D Object Pose Estimation With Decoupled Rotation Mechanism

by: Takahiro SUZUKI

3D 3D object detection Object detection Point cloud Pose estimation

PointAugmenting: Cross-Modal Augmentation for 3D Object Detection

by: Akihiro FUJII

3D 3D object detection Point cloud augmentation

Hijack-GAN: Unintended-Use of Pretrained, Black-Box GANs

by: Masanori YANO

3D 3D object detection 3D reconstruction Object detection Point cloud Segmentation

RfD-Net: Point Scene Understanding by Semantic Instance Reconstruction

by: Naoya Chiba

Learning to Track Instances without Video Annotations

by: Atsuki Osanai

Instance segmentation Self supervised learning Video

A2-FPN: Attention Aggregation Based Feature Pyramid Network for Instance Segmentation

by: Atsuki Osanai

Instance segmentation Object detection

Fully Convolutional Networks for Panoptic Segmentation

by: Atsuki Osanai

Panoptic segmentation

Monte Carlo Scene Search for 3D Scene Understanding

by: Yusuke Saito

3D Point cloud 3D Scene Understanding

Center-Based 3D Object Detection and Tracking

by: Yutaro Oguri

3D 3D object detection Point cloud

Relative Order Analysis and Optimization for Unsupervised Deep Metric Learning

by: 金城忍

Self supervised learning Metric learning Contrastive learning

Self-Supervised Augmentation Consistency for Adapting Semantic Segmentation

by: Akihiro FUJII

Domain adaptation Semantic segmentation Self supervised learning

Boosting Ensemble Accuracy by Revisiting Ensemble Diversity Metrics

by: Akihiro FUJII

Recognition ensemble

High-Fidelity and Arbitrary Face Editing

by: Akihiro FUJII

Adversarial Imaging Pipelines

by: Akihiro FUJII

Segmentation Semantic segmentation Post processing

Look Closer To Segment Better: Boundary Patch Refinement for Instance Segmentation

by: Shunsuke Yoshizawa

Neural Scene Graphs for Dynamic Scenes

by: QIU YUE

3D reconstruction Video

Towards Robust Classification Model by Counterfactual and Invariant Data Generation

by: Akihiro FUJII

Attention Object detection Recognition

Manifold Regularized Dynamic Network Pruning

by: 金城忍

Channel pruning

Exploiting Edge-Oriented Reasoning for 3D Point-Based Scene Graph Analysis

by: QIU YUE

3D Vision and language

HOTR: End-to-End Human-Object Interaction Detection With Transformers

by: Hiroki Ohashi

Exploring Data-Efficient 3D Scene Understanding With Contrastive Scene Contexts

by: Yusuke Saito

3D 3D reconstruction Instance segmentation Point cloud Segmentation 3D Scene Understanding

Regularizing Neural Networks via Adversarial Model Perturbation

by: 伊藤諒悟

Regularization

GATSBI: Generative Agent-Centric Spatio-Temporal Object Interaction

by: QIU YUE

Representation learning Video

Spherical Confidence Learning for Face Recognition

by: Akihiro FUJII

Adversarial examples GAN Robustness ensemble

Ensembling With Deep Generative Views

by: Akihiro FUJII

Tree-Like Decision Distillation

by: Akihiro FUJII

QAIR: Practical Query-Efficient Black-Box Attacks for Image Retrieval

by: Anonymous

GAN Video RNN Image-video synthesis

Understanding Object Dynamics for Interactive Image-to-Video Synthesis

by: 金城忍

Improving the Transferability of Adversarial Samples With Adversarial Transformations

by: 伊藤諒悟

Dataset Image restoration

Modular Interactive Video Object Segmentation: Interaction-to-Mask, Propagation and Difference-Aware Fusion

by: QIU YUE

Segmentation Video

Image Restoration for Under-Display Camera

by: Yamagata Eisuke

QPIC: Query-Based Pairwise Human-Object Interaction Detection With Image-Wide Contextual Information

by: QIU YUE

Your “Flamingo” is My “Bird”: Fine-Grained, or Not

by: Shintaro Yamamoto

Action recognition Knowledge distillation Representation learning Video

Ego-Exo: Transferring Visual Representations From Third-Person to First-Person Videos

by: Katsuyuki Nakamura

Adversarial Laser Beam: Effective Physical-World Attack to DNNs in a Blink

by: Takuto2N

Meta learning Representation learning

Stylized Neural Painting

by: Takehiro Matsuda

Physically-Aware Generative Network for 3D Shape Modeling

by: Eisuke Yamagata

3D 3D reconstruction GAN

Hierarchical Video Prediction Using Relational Layouts for Human-Object Interactions

by: QIU YUE

GAN Video

Glance and Gaze: Inferring Action-Aware Points for One-Stage Human-Object Interaction Detection

by: QIU YUE

Object detection Tracking

Track To Detect and Segment: An Online Multi-Object Tracker

by: 小林範久

RGB-D Local Implicit Function for Depth Completion of Transparent Objects

by: Naoya Chiba

3D 3D reconstruction Depth estimation Point cloud

Holistic 3D Scene Understanding From a Single Image With Implicit Representation

by: Naoya Chiba

3D 3D object detection 3D reconstruction Depth estimation Object detection Point cloud Pose estimation

Exploiting Aliasing for Manga Restoration

by: Masanori YANO

Attention Super resolution Manga restoration

PixMatch: Unsupervised Domain Adaptation via Pixelwise Consistency Training

by: Akihiro FUJII

Domain adaptation Segmentation Semantic segmentation

Improving Panoptic Segmentation at All Scales

by: 金城忍

Segmentation Panoptic segmentation

Multimodal Motion Prediction With Stacked Transformers

by: 金城忍

Autonomous driving Trajectory prediction Motion planning

D2IM-Net: Learning Detail Disentangled Implicit Fields From Single Images

by: Naoya Chiba

3D 3D reconstruction Depth estimation Disentanglement Representation learning

Multi-Person Implicit Reconstruction From a Single Image

by: Naoya Chiba

3D 3D object detection 3D reconstruction Dataset Depth estimation Instance segmentation Object detection Pose estimation Segmentation

In the Light of Feature Distributions: Moment Matching for Neural Style Transfer

by: Kai Watabe

Domain adaptation Neural style transfer

Toward Accurate and Realistic Outfits Visualization With Attention to Details

by: 綱島秀樹

Virtual Try-on Multi-garment Virtual Try-on Outfits Visualization

Inception Convolution With Efficient Dilation Search

by: Masanori YANO

Instance segmentation Neural architecture search(NAS) Object detection Pose estimation Recognition

Adaptive Convolutions for Structure-Aware Style Transfer

by: 金城忍

Style transfer

Activate or Not: Learning Customized Activation

by: 金城忍

Activation function

Transformer Tracking

by: 金城忍

Object tracking Transformer

Dynamic Slimmable Network

by: 金城忍

Model compression

Representative Batch Normalization With Feature Calibration

by: 金城忍

Batch normalization

Reducing Domain Gap by Reducing Style Bias

by: 金城忍

Convolutional Neural Network Domain shift

To the Point: Efficient 3D Object Detection in the Range Image With Graph Convolution Kernels

by: Naoya Chiba

3D 3D object detection Object detection Point cloud

Learning Fine-Grained Segmentation of 3D Shapes Without Part Labels

by: Naoya Chiba

3D Point cloud Representation learning Segmentation

Learning To Predict Visual Attributes in the Wild

by: Masanori YANO

Attention Dataset Recognition Segmentation Contrastive learning

Network Pruning via Performance Maximization

by: 金城忍

Channel pruning

Cross-View Regularization for Domain Adaptive Panoptic Segmentation

by: 金城忍

Domain adaptation Unsupervised learning Panoptic segmentation

Weakly-Supervised Instance Segmentation via Class-Agnostic Learning With Salient Images

by: 金城忍

Instance segmentation

Modeling Multi-Label Action Dependencies for Temporal Action Localization

by: Ryo Nakamura

LPSNet: A Lightweight Solution for Fast Panoptic Segmentation

by: 金城忍

Panoptic segmentation

PiCIE: Unsupervised Semantic Segmentation Using Invariance and Equivariance in Clustering

by: 金城忍

FixBi: Bridging Domain Spaces for Unsupervised Domain Adaptation

by: Kai Watabe

Domain adaptation Pose estimation Animal pose estimation

From Synthetic to Real: Unsupervised Domain Adaptation for Animal Pose Estimation

by: Kai Watabe

RPSRNet: End-to-End Trainable Rigid Point Set Registration Network Using Barnes-Hut 2D-Tree Representation

by: Naoya Chiba

Image harmonization Image composition

Intrinsic Image Harmonization

by: Masanori YANO

A Dual Iterative Refinement Method for Non-Rigid Shape Matching

by: Naoya Chiba

3D reconstruction Point cloud Online Reconstruction

DI-Fusion: Online Implicit 3D Reconstruction With Deep Priors

by: Yusuke Saito

Localizing Visual Sounds the Hard Way

by: 金城忍

Multi modal Visual sounds localization Contrastive learning

Turning Frequency to Resolution: Video Super-Resolution via Event Cameras

by: Yutaro Oguri

Super resolution Video

Transformation Invariant Few-Shot Object Detection

by: 金城忍

N-shot learning Object detection

SAIL-VOS 3D: A Synthetic Dataset and Baselines for Object Detection and 3D Mesh Reconstruction From Video Data

by: Kensho Hara

3D 3D object detection 3D reconstruction Video

Beyond Static Features for Temporally Consistent 3D Human Pose and Shape From a Video

by: Kensho Hara

3D Pose estimation Video

Unbiased Mean Teacher for Cross-Domain Object Detection

by: Kai Watabe

Domain adaptation GAN Object detection

Convolutional Dynamic Alignment Networks for Interpretable Classifications

by: 金城忍

CNN

Unsupervised Multi-Source Domain Adaptation Without Access to Source Data

by: 金城忍

Domain adaptation Unsupervised domain adaptation

Quality-Agnostic Image Recognition via Invertible Decoder

by: 金城忍

Robustness Data corruption

I3Net: Implicit Instance-Invariant Network for Adapting One-Stage Object Detectors

by: Ryo Takahashi

Domain adaptation Object detection

MOOD: Multi-Level Out-of-Distribution Detection

by: Masanori YANO

Recognition Out-of-distribution

Locally Aware Piecewise Transformation Fields for 3D Human Mesh Registration

by: Naoya Chiba

3D Point cloud Pose estimation Semantic segmentation

Unsupervised 3D Shape Completion Through GAN Inversion

by: Naoya Chiba

3D 3D reconstruction Point cloud Self supervised learning

Learning To Reconstruct High Speed and High Dynamic Range Videos From Events

by: Yutaro Oguri

3D Dataset Pose estimation Visual Localization

Large-Scale Localization Datasets in Crowded Indoor Spaces

by: Yusuke Saito

Smoothing the Disentangled Latent Style Space for Unsupervised Image-to-Image Translation

by: 金城忍

Disentanglement Image-to-Image translation

Over-the-Air Adversarial Flickering Attacks Against Video Recognition Networks

by: Kensho Hara

Action recognition Adversarial examples Video

BABEL: Bodies, Action and Behavior With English Labels

by: Kensho Hara

3D Action recognition Dataset Video

WOAD: Weakly Supervised Online Action Detection in Untrimmed Videos

by: Kensho Hara

Normalization Confounders Biases

Single Image Reflection Removal With Absorption Effect

by: Teppei Kurita

Reflection Removal

Metadata Normalization

by: 金城忍

SCF-Net: Learning Spatial Contextual Features for Large-Scale Point Cloud Segmentation

by: 山田亮佑 (Ryosuke Yamada)

3D 3D object detection

AdderSR: Towards Energy Efficient Image Super-Resolution

by: Teppei Kurita

Super resolution Energy Efficient

Dynamic Transfer for Multi-Source Domain Adaptation

by: 金城忍

Domain adaptation Meta learning

Continual Adaptation of Visual Representations via Domain Randomization and Meta-Learning

by: Kai Watabe

Intra-Inter Camera Similarity for Unsupervised Person Re-Identification

by: 金城忍

Person re-identification

Populating 3D Scenes by Learning Human-Scene Interaction

by: QIU YUE

3D Dataset

Learning Asynchronous and Sparse Human-Object Interaction in Videos

by: QIU YUE

Back to the Feature: Learning Robust Camera Localization From Pixels To Pose

by: Yusuke Saito

Optical flow Self supervised learning

Back to Event Basics: Self-Supervised Learning of Image Reconstruction for Event Cameras via Photometric Constancy

by: Yutaro Oguri

Compatibility-Aware Heterogeneous Visual Search

by: masato tonouchi

Neural architecture search(NAS) Representation learning

Learning Spatial-Semantic Relationship for Facial Attribute Recognition With Limited Labeled Data

by: Masanori YANO

Recognition Representation learning Segmentation

Space-Time Neural Irradiance Fields for Free-Viewpoint Video

by: Naoya Chiba

3D Point cloud Representation learning

Shelf-Supervised Mesh Prediction in the Wild

by: Naoya Chiba

3D 3D reconstruction Depth estimation Pose estimation Representation learning Self supervised learning

Rotation Equivariant Siamese Networks for Tracking

by: 金城忍

Object tracking Convolutional Neural Network Rotation equivariant

Scaling Local Self-Attention for Parameter Efficient Visual Backbones

by: 金城忍

Attention

Unsupervised Multi-Source Domain Adaptation for Person Re-Identification

by: 金城忍

Domain adaptation Person re-identification

Learning the Non-Differentiable Optimization for Blind Super-Resolution

by: So Uchida

Super resolution Reinforcement Learning

Visualizing Adapted Knowledge in Domain Transfer

by: 金城忍

Visualization

Generating Manga From Illustrations via Mimicking Manga Creation Workflow

by: Masanori YANO

Dataset Segmentation Image stylization Image translation Style transfer

VDSM: Unsupervised Video Disentanglement With State-Space Modeling and Deep Mixtures of Experts

by: Yoshikazu Hayashi

Multi modal Audio sound separation

Looking Into Your Speech: Learning Cross-Modal Affinity for Audio-Visual Speech Separation

by: 金城忍

Reconsidering Representation Alignment for Multi-View Clustering

by: 金城忍

Contrastive learning

ReAgent: Point Cloud Registration Using Imitation and Reinforcement Learning

by: Naoya Chiba

3D Optical flow Point cloud

HCRF-Flow: Scene Flow From Point Clouds With Continuous High-Order CRFs and Position-Aware Flow Embedding

by: Naoya Chiba

Deep RGB-D Saliency Detection With Depth-Sensitive Attention and Automatic Multi-Modal Fusion

by: Yusuke Saito

3D Multi modal Neural architecture search(NAS) Saliency

Spatiotemporal Registration for Event-Based Visual Odometry

by: Yutaro Oguri

Dataset Pose estimation

Semantic-Aware Video Text Detection

by: 金城忍

Semantic segmentation Video Text detection Text tracking

Mask Guided Matting via Progressive Refinement Network

by: Masanori YANO

Dataset Image matting Upsampling

Cross-Domain Adaptive Clustering for Semi-Supervised Domain Adaptation

by: 金城忍

Domain adaptation N-shot learning Semi supervised learning

Interventional Video Grounding With Dual Contrastive Learning

by: 金城忍

Multi modal Causal inference

Patchwise Generative ConvNet: Training Energy-Based Models From a Single Natural Image for Internal Learning

by: 金城忍

Self supervised learning Image synthesis CNN

Unsupervised Disentanglement of Linear-Encoded Facial Semantics

by: 金城忍

3D reconstruction GAN

SimPLE: Similar Pseudo Label Exploitation for Semi-Supervised Classification

by: Hiroki Yamaoka

Semi-Supervised Learning

Generative Hierarchical Features From Synthesizing Images

by: 金城忍

3D Pose estimation Robot Pose

Single-View Robot Pose and Joint Angle Estimation via Render & Compare

by: Yusuke Saito

3D CNNs With Adaptive Temporal Feature Resolutions

by: 金城忍

Video 3D CNN Computing complexity

End-to-End Human Pose and Mesh Reconstruction with Transformers

by: Naoya Chiba

3D 3D reconstruction Attention Pose estimation Robustness

Camera-Space Hand Mesh Recovery via Semantic Aggregation and Adaptive 2D-1D Registration

by: Naoya Chiba

3D 3D reconstruction Pose estimation

Glancing at the Patch: Anomaly Localization With Global and Local Feature Comparison

by: Hiroyuki Masuda

Anomaly Segmentation

Coarse-Fine Networks for Temporal Activity Detection in Videos

by: Katsuyuki Nakamura

Variational inference Batch normalization Uncertainty reduction

Fast Bayesian Uncertainty Estimation and Reduction of Batch Normalized Single Image Super-Resolution Network

by: 金城忍

Scene Text Telescope: Text-Focused Scene Image Super-Resolution

by: So Uchida

Attention Super resolution Text recognition

CoSMo: Content-Style Modulation for Image Retrieval With Text Feedback

by: Shunsuke Tokunaga

3D 3D reconstruction Video Monocular 3D reconstruction

Learning Monocular 3D Reconstruction of Articulated Categories From Motion

by: Yusuke Saito

Improving Multiple Object Tracking With Single Object Tracking

by: yoshiki miyazawa

Object detection Person re-identification Recognition Video

Learning Scene Structure Guidance via Cross-Task Knowledge Transfer for Single Depth Super-Resolution

by: Yusuke Saito

Depth estimation Knowledge distillation Multi modal Depth Super-Resolution

Representing Videos As Discriminative Sub-Graphs for Action Recognition

by: Narifumi Otobe

Action recognition

Image Inpainting With External-Internal Learning and Monochromic Bottleneck

by: Masanori YANO

Image inpainting Colorization

Physics-Based Iterative Projection Complex Neural Network for Phase Retrieval in Lensless Microscopy Imaging

by: Shigekazu Takizawa

Computational imaging Phase retrieval

Learning Calibrated Medical Image Segmentation via Multi-Rater Agreement Modeling

by: Anonymous

Segmentation Video Instance segmentation

End-to-End Video Instance Segmentation With Transformers

by: 金城忍

Linear Semantics in Generative Adversarial Networks

by: 金城忍

GAN Semantic segmentation

Adaptive Weighted Discriminator for Training Generative Adversarial Networks

by: 金城忍

N-shot learning Object detection Contrastive learning

FSCE: Few-Shot Object Detection via Contrastive Proposal Encoding

by: 金城忍

Automatic Correction of Internal Units in Generative Neural Networks

by: 金城忍

3D 3D reconstruction Pose estimation Video

LASR: Learning Articulated Shape Reconstruction From a Monocular Video

by: Naoya Chiba

RefineMask: Towards High-Quality Instance Segmentation With Fine-Grained Features

by: Masanori YANO

Instance segmentation Upsampling

NeuroMorph: Unsupervised Shape Interpolation and Correspondence in One Go

by: Naoya Chiba

3D Pose estimation Representation learning Self supervised learning

DECOR-GAN: 3D Shape Detailization by Conditional Refinement

by: Naoya Chiba

3D 3D reconstruction GAN

FFB6D: A Full Flow Bidirectional Fusion Network for 6D Pose Estimation

by: Naoya Chiba

3D 3D object detection Object detection Point cloud Pose estimation Semantic segmentation

Image-to-Image Translation via Hierarchical Style Disentanglement

by: 金城忍

GAN Image-to-Image translation multi-label multi-style multi-tag

Fine-Grained Angular Contrastive Learning With Coarse Labels

by: 金城忍

N-shot learning Transfer learning Contrastive learning

Exploit Visual Dependency Relations for Semantic Segmentation

by: 金城忍

Normalized Avatar Synthesis Using StyleGAN and Perceptual Refinement

by: 金城忍

3D reconstruction GAN

Progressive Semantic-Aware Style Transformation for Blind Face Restoration

by: 金城忍

3D Domain adaptation Person re-identification

UnrealPerson: An Adaptive Pipeline Towards Costless Person Re-Identification

by: Masanori YANO

Robust Neural Routing Through Space Partitions for Camera Relocalization in Dynamic Indoor Environments

by: Yusuke Saito

3D Pose estimation Neural Decision Tree

Discovering Interpretable Latent Space Directions of GANs Beyond Binary Attributes

by: 金城忍

Multi-Modal Relational Graph for Cross-Modal Video Moment Retrieval

by: QIU YUE

Representation learning Robustness Single image deraining

Robust Representation Learning With Feedback for Single Image Deraining

by: Hiroki Kobayashi

Progressive Unsupervised Learning for Visual Object Tracking

by: QIU YUE

Self supervised learning Video

Self-Supervised Wasserstein Pseudo-Labeling for Semi-Supervised Image Classification

by: 金城忍

Semi supervised learning

RPN Prototype Alignment for Domain Adaptive Object Detector

by: 金城忍

Domain adaptation Object detection

Learning To Filter: Siamese Relation Network for Robust Tracking

by: QIU YUE

Object detection Video

ManipulaTHOR: A Framework for Visual Object Manipulation

by: QIU YUE

Attention Segmentation Video

Delving Deep Into Many-to-Many Attention for Few-Shot Video Object Segmentation

by: 金城忍

Omnimatte: Associating Objects and Their Effects in Video

by: QIU YUE

Segmentation Video

Bilevel Online Adaptation for Out-of-Domain Human Mesh Reconstruction

by: Naoya Chiba

3D 3D reconstruction Domain adaptation Pose estimation Video

Multi-Stage Progressive Image Restoration

by: Masanori YANO

Attention Image restoration Deraining Deblurring Denoising

HybrIK: A Hybrid Analytical-Neural Inverse Kinematics Solution for 3D Human Pose and Shape Estimation

by: Naoya Chiba

Object detection Contrastive learning

Improving Weakly Supervised Visual Grounding by Contrastive Knowledge Distillation

by: 金城忍

SSTVOS: Sparse Spatiotemporal Transformers for Video Object Segmentation

by: QIU YUE

Object detection Self supervised learning Video

CoMoGAN: Continuous Model-Guided Image-to-Image Translation

by: 金城忍

Attention Convolutional Neural Network

Gaussian Context Transformer

by: 金城忍

Cross-Domain Similarity Learning for Face Recognition in Unseen Domains

by: 金城忍

Domain adaptation Face recognition Meta-learning

Learning Delaunay Surface Elements for Mesh Reconstruction

by: hisaka koji

3D

Fusing the Old with the New: Learning Relative Camera Pose with Geometry-Guided Uncertainty

by: hisaka koji

Pose estimation

Uncertainty Guided Collaborative Training for Weakly Supervised Temporal Action Detection

by: hisaka koji

Tuning IR-Cut Filter for Illumination-Aware Spectral Reconstruction From RGB

by: hisaka koji

Probabilistic 3D Human Shape and Pose Estimation From Multiple Unconstrained Images in the Wild

by: Naoya Chiba

Human Trajectory Prediction

Global2Local: Efficient Structure Search for Video Action Segmentation

by: hisaka koji

Segmentation

Introvert: Human Trajectory Prediction via Conditional 3D Attention

by: hisaka koji

Memory-Guided Unsupervised Image-to-Image Translation

by: Masanori YANO

GAN Image translation Memory network

Holistic 3D Human and Scene Mesh Estimation From Single View Images

by: Naoya Chiba

3D 3D reconstruction Instance segmentation Object detection Point cloud Pose estimation Segmentation

Co-Attention for Conditioned Image Matching

by: 金城忍

Attention

The Spatially-Correlative Loss for Various Image Translation Tasks

by: 金城忍

GAN Contrastive learning

Partial Feature Selection and Alignment for Multi-Source Domain Adaptation

by: 金城忍

3D 3D reconstruction Disentanglement Pose estimation Self supervised learning

DualAST: Dual Style-Learning Networks for Artistic Style Transfer

by: 金城忍

GAN Style transfer

Neural Deformation Graphs for Globally-Consistent Non-Rigid Reconstruction

by: Naoya Chiba

StablePose: Learning 6D Object Poses From Geometrically Stable Patches

by: Naoya Chiba

3D 3D object detection Point cloud Pose estimation Robustness

CoCosNet v2: Full-Resolution Correspondence Learning for Image Translation

by: Masanori YANO

GAN Image translation

ST3D: Self-Training for Unsupervised Domain Adaptation on 3D Object Detection

by: Yusuke Saito

3D 3D object detection Domain adaptation

Exploring Heterogeneous Clues for Weakly-Supervised Audio-Visual Video Parsing

by: 金城忍

Multi modal Video Audio-visual video parsing Contrastive learning

Cyclic Co-Learning of Sounding Object Visual Grounding and Sound Separation

by: 金城忍

Video Sound separation

Generative PointNet: Deep Energy-Based Learning on Unordered Point Sets for 3D Generation, Reconstruction and Classification

by: Naoya Chiba

3D Point cloud Self supervised learning

ReDet: A Rotation-Equivariant Detector for Aerial Object Detection

by: 金城忍

Object detection Rotation equivariant invariant

PointNetLK Revisited

by: Naoya Chiba

Playable Video Generation

by: QIU YUE

ACRE: Abstract Causal REasoning Beyond Covariation

by: QIU YUE

KRISP: Integrating Implicit and Symbolic Knowledge for Open-Domain Knowledge-Based VQA

by: QIU YUE

Roof-GAN: Learning To Generate Roof Geometry and Relations for Residential Houses

by: 金城忍

Perception Matters: Detecting Perception Failures of VQA Models Using Metamorphic Testing

by: QIU YUE

Convolutional Neural Network Spatial matching

Convolutional Hough Matching Networks

by: 金城忍

MOS: Towards Scaling Out-of-Distribution Detection for Large Semantic Space

by: Masanori YANO

Recognition Out-of-distribution

Semi-Supervised 3D Hand-Object Poses Estimation With Interactions in Time

by: Yusuke Saito

3D Pose estimation Video Hand-Object Pose

Cross-Modal Center Loss for 3D Cross-Modal Retrieval

by: 金城忍

Multi modal Cross modal retrieval

ZeroScatter: Domain Transfer for Long Distance Imaging and Vision Through Scattering Media

by: 金城忍

Navigating the GAN Parameter Space for Semantic Image Editing

by: 金城忍

Multiple Object Tracking With Correlation Learning

by: 金城忍

Multiple object tracking

Causal Hidden Markov Model for Time Series Disease Forecasting

by: worldblue

Robustness causal

Partition-Guided GANs

by: 金城忍

3D 3D object detection Adversarial examples Point cloud Robustness

Verifiability and Predictability: Interpreting Utilities of Network Architectures for Point Cloud Processing

by: Naoya Chiba

MonoRUn: Monocular 3D Object Detection by Reconstruction and Uncertainty Propagation

by: Naoya Chiba

3D 3D object detection 3D reconstruction Object detection Point cloud Pose estimation Self supervised learning

RankDetNet: Delving Into Ranking Constraints for Object Detection

by: 古澤嘉久

3D object detection Object detection

Learning To Segment Rigid Motions From Two Frames

by: Anonymous

Segmentation

Self-Supervised Learning for Semi-Supervised Temporal Action Proposal

by: 古澤嘉久

Action recognition Self supervised learning Video

Deep Burst Super-Resolution

by: Masanori YANO

Attention Dataset Super resolution

S2-BNN: Bridging the Gap Between Self-Supervised Real and 1-Bit Neural Networks via Guided Distribution Calibration

by: 古澤嘉久

Self supervised learning Binary neural networks

Scale-Aware Automatic Augmentation for Object Detection

by: 古澤嘉久