CVPR 2020 Survey Summary (List)

EfficientDet: Scalable and Efficient Object Detection

by: 中嶋航大

Object Detection

PSGAN: Pose and Expression Robust Spatial-Aware GAN for Customizable Makeup Transfer

by: 綱島秀樹

GAN Makeup

StereoGAN: Bridging Synthetic-to-Real Domain Gap by Joint Optimization of Domain Translation and Stereo Matching

by: Teppei Kurita

GAN Domain Adaptation Stereo Depth

Wavelet Integrated CNNs for Noise-Robust Image Classification

by: Teppei Kurita

Wavelet Object Recognition

Where Am I Looking At? Joint Location and Orientation Estimation by Cross-View Matching

by: Teppei Kurita

Cross-View Geo-Localization Aerial Localization

SurfelGAN: Synthesizing Realistic Sensor Data for Autonomous Driving

by: Teppei Kurita

Autonomous Vehicles GAN Simulator Data Driven CG

On the Uncertainty of Self-Supervised Monocular Depth Estimation

by: Teppei Kurita

Self Supervised Uncertainty Monocular Depth

FroDO: From Detections to 3D Objects

by: Teppei Kurita

Multi-View 3D Object Reconstruction

Binarizing MobileNet via Evolution-Based Searching

by: Teppei Kurita

MobileNet Binary Neural Networks Group Convolution

Large-Scale Object Detection in the Wild From Imbalanced Multi-Labels

by: Teppei Kurita

Object Detection Multi-Label Softmax

Background Matting: The World Is Your Green Screen

by: Teppei Kurita

Matting Self-Supervised GAN

Disentangled and Controllable Face Image Generation via 3D Imitative-Contrastive Learning

by: 綱島秀樹

GAN 3D Prior Disentanglement Image Manipulation

Context Prior for Scene Segmentation

by: Teppei Kurita

Semantic Segmentation Prior

Robust 3D Self-Portraits in Seconds

by: Teppei Kurita

3D Reconstruction 3D Self-Portrait Volumetric Bundle Adjustment

SEAN: Image Synthesis With Semantic Region-Adaptive Normalization

by: 綱島秀樹

GAN Normalization Style Injection Conditional Generation

Self-Supervised Viewpoint Learning From Image Collections

by: 榎本和馬

Self-Supervised Viewpoint

Transferable, Controllable, and Inconspicuous Adversarial Attacks on Person Re-identification With Deep Mis-Ranking

by: Rei Tamaru

adversarial person re-identification

Adversarial Vertex Mixup: Toward Better Adversarially Robust Generalization

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial example robustness data augmentation adversarial training

Blurry Video Frame Interpolation

by: 中嶋航大

Deblurring frame interpolation

REVERIE: Remote Embodied Visual Referring Expression in Real Indoor Environments

by: Yue Qiu

Vision and Language Embodied AI Dataset

Domain Adaptive Image-to-Image Translation

by: 綱島秀樹

GAN I2I DAI2I Domain Adaptation Attribute Transfer

Attentive Normalization for Conditional Image Generation

by: 中嶋航大

GAN Normalization

SketchyCOCO: Image Generation From Freehand Scene Sketches

by: 綱島秀樹

GAN Sketch2Img Domain Transfer Edge2Img Dataset

X3D: Expanding Architectures for Efficient Video Recognition

by: Kensho Hara

Action Recognition 3D CNN

Towards Learning a Generic Agent for Vision-and-Language Navigation via Pre-Training

by: Seitaro Shinagawa

vision-and-language pre-training masked language model navigation

Learning Better Lossless Compression Using Lossy Compression

by: Shoma Iwai

Image Compression Lossless Compression

Boosting Semantic Human Matting With Coarse Annotations

by: Masaki Taniguchi

human matting coarse annotated data

Dual Super-Resolution Learning for Semantic Segmentation

by: Takuji Tahara

Semantic Segmentation Single Image Super-Resolution

Image Search With Text Feedback by Visiolinguistic Attention Learning

by: Seitaro Shinagawa

vision-and-language image search fashion

Evolving Losses for Unsupervised Video Representation Learning

by: Kensho Hara

Self-supervised Learning Action Recognition

Weakly-Supervised Action Localization by Generative Attention Modeling

by: Komiki Maruyama

Weakly Supervised Learning Temporal Action Localization

SAPIEN: A SimulAted Part-Based Interactive ENvironment

by: Yue Qiu

Embodied AI Dataset

Guided Variational Autoencoder for Disentanglement Learning

by: hiroki tsujimoto

disentanglement VAE adversarial

Cascaded Deep Video Deblurring Using Temporal Sharpness Prior

by: Teppei Kurita

Deblur Prior Video

Inverse Rendering for Complex Indoor Scenes: Shape, Spatially-Varying Lighting and SVBRDF From a Single Image

by: Teppei Kurita

Inverse Rendering Depth Surface Normal Intrinsic Decomposition Light Estimation SVBRDF Dataset

Gold Seeker: Information Gain From Policy Distributions for Goal-Oriented Vision-and-Language Reasoning

by: Seitaro Shinagawa

vision-and-language goal-oriented reinforcement learning reasoning bayesian

Instance-Aware Image Colorization

by: Teppei Kurita

Colorization Object Detection

HyperSTAR: Task-Aware Hyperparameters for Deep Networks

by: Teppei Kurita

Hyperparameter Meta Learning

Vision-Language Navigation With Self-Supervised Auxiliary Reasoning Tasks

by: Yue Qiu

Vision and Language Vision-Language Navigation

Fashion Editing With Adversarial Parsing Learning

by: Seitaro Shinagawa

fashion editing GAN in-painting

Combating Noisy Labels by Agreement: A Joint Training Method with Co-Regularization

by: 飯田啄巳

Mislabel Noisy label robust learning

Actor-Transformers for Group Activity Recognition

by: Shunsuke Kogure

Action Recognition

SQuINTing at VQA Models: Introspecting VQA Models With Sub-Questions

by: Yue Qiu

Vision and Language Visual Question Answering Dataset

LiDARsim: Realistic LiDAR Simulation by Leveraging the Real World

by: Ryota Suzuki

LiDAR simulation self-driving

Intuitive, Interactive Beard and Hair Synthesis With Generative Models

by: Shoma Iwai

Image manipulation

Why Having 10,000 Parameters in Your Camera Model Is Better Than Twelve

by: Shoji Sonoyama

camera calibration bundle adjustment

Lighthouse: Predicting Lighting Volumes for Spatially-Coherent Illumination

by: Teppei Kurita

Inverse Rendering Stereo Multi-Plane Image Lighting Estimation

3D Packing for Self-Supervised Monocular Depth Estimation

by: Shoji Sonoyama

self-supervised learning monocular depth estimation SLAM

Novel View Synthesis of Dynamic Scenes With Globally Coherent Depths From a Monocular Camera

by: 綱島秀樹

NVS Novel View Synthesis Depth Prediction Video

Image Based Virtual Try-On Network From Unpaired Data

by: fnakamura

Virtual Try-on virtual fitting clothes swapping

MMTM: Multimodal Transfer Module for CNN Fusion

by: Yusuke Machii

Multi-modal gesture recognition audiovisual speech enhancement human action recognition

Prior Guided GAN Based Semantic Inpainting

by: Ho Ching Chiu

Image Inpainting GAN

Cascade Cost Volume for High-Resolution Multi-View Stereo and Stereo Matching

by: Shoji Sonoyama

multi view stereo stereo matching cost volume depth estimation

Neural Network Pruning With Residual-Connections and Limited-Data

by: Tomoro Tokusumi

pruning small-scale dataset

Benchmarking Adversarial Robustness on Image Classification

by: Tomoki Tanimura

Robustness Adversarial Robustness Adversarial Training Image Classification Adversarial Attack Adversarial Defense

Total3DUnderstanding: Joint Layout, Object Pose and Mesh Reconstruction for Indoor Scenes From a Single Image

by: yusuke saito

3D Object Detection Object Pose Prediction Object Reconstruction

Ego-Topo: Environment Affordances From Egocentric Video

by: Katsuyuki Nakamura

Egocentric vision Graph convolution

Improving Confidence Estimates for Unfamiliar Examples

by: Anonymous

familiar unfamiliar

Transformation GAN for Unsupervised Image Synthesis and Representation Learning

by: Hiroaki Aizawa

GAN self-supervised learning unsupervised representation learning

Generative Hybrid Representations for Activity Forecasting With No-Regret Learning

by: Shun.ishizaka

activity forecasting trajectory forecasting egocentric

What Makes Training Multi-Modal Classification Networks Hard?

by: pshiko

multi-modal classification video

A Neural Rendering Framework for Free-Viewpoint Relighting

by: Teppei Kurita

Inverse Rendering Relighting Multi-View

Focus on Defocus: Bridging the Synthetic to Real Domain Gap for Depth Estimation

by: Teppei Kurita

Focal Stack Depth Estimation Defocus

Filter Response Normalization Layer: Eliminating Batch Dependence in the Training of Deep Neural Networks

by: Obi

normalization activation function

Oops! Predicting Unintentional Action in Video

by: Shun.ishizaka

unintentional dataset self-supervised learning

BDD100K: A Diverse Driving Dataset for Heterogeneous Multitask Learning

by: Hiroki Yamamoto

Dataset Driving

Don't Hit Me! Glass Detection in Real-World Scenes

by: Hiroki Yamamoto

Segmentation Glass Dataset

Hardware-in-the-Loop End-to-End Optimization of Camera Image Processing Pipelines

by: MatsuokaHikaru

Camera Image Processing Pipelines Object detection

Learning to Autofocus

by: Teppei Kurita

Focal Stack Depth Estimation Auto Focus

Quasi-Newton Solver for Robust Non-Rigid Registration

by: MatsuokaHikaru

Robust Non-Rigid Registration Quasi-Newton

Uninformed Students: Student-Teacher Anomaly Detection With Discriminative Latent Embeddings

by: Shunsuke Nakatsuka

Anomaly Detection Teacher-Student Learning

Bringing Old Photos Back to Life

by: Teppei Kurita

Image Restoration VAE GAN

Towards Verifying Robustness of Neural Networks Against A Family of Semantic Perturbations

by: Tomoki Tanimura

Adversarial Examples Certified Robustness Adversarial Robustness Semantic Adversarial Attack

Camouflaged Object Detection

by: Teppei Kurita

Object Detection Camouflaged Object Detection Dataset

Peek-a-Boo: Occlusion Reasoning in Indoor Scenes With Plane Representations

by: Hiroaki Aizawa

3D scene understanding

Front2Back: Single View 3D Shape Reconstruction via Front to Back Prediction

by: Hiroaki Aizawa

3D Reconstruction

A Multigrid Method for Efficiently Training Video Models

by: Kensho Hara

Action Recognition Video Recognition Efficient

What Machines See Is Not What They Get: Fooling Scene Text Recognition Models With Adversarial Text Images

by: Shintaro Yamamoto

Scene Text Recognition Adversarial Attack

What It Thinks Is Important Is Important: Robustness Transfers Through Input Gradients

by: Rei Tamaru

Adversarial Example Transfer Learning

12-in-1: Multi-Task Vision and Language Representation Learning

by: Shintaro Yamamoto

Vision-and-Language Multi-Task Learning

Learning Temporal Co-Attention Models for Unsupervised Video Action Localization

by: Komiki Maruyama

Unsupervised Learning Temporal Action Localization

BSP-Net: Generating Compact Meshes via Binary Space Partitioning

by: Yue Qiu

3D Vision;Mesh

DeepCap: Monocular Human Performance Capture Using Weak Supervision

by: Yue Qiu

3D Surface Reconstruction Pose Estimation 3D Vision

CIAGAN: Conditional Identity Anonymization Generative Adversarial Networks

by: 綱島秀樹

GAN Anonymization

Syn2Real Transfer Learning for Image Deraining Using Gaussian Processes

by: Hao

syn2real image deraining

Cross-Modality Person Re-Identification With Shared-Specific Feature Transfer

by: Keita Goto

person re-identification cross modal

TITAN: Future Forecast Using Action Priors

by: Ryo Fujii

Trajectory prediction egocentric View Action recognition

Explorable Super Resolution

by: Hao

SR, super-resolution, editing

UC-Net: Uncertainty Inspired RGB-D Saliency Detection via Conditional Variational Autoencoders

by: Yue Qiu

Saliency Map Estimation;RGB-D;CVAE;

Transferring Cross-Domain Knowledge for Video Sign Language Recognition

by: Yue Qiu

Word-level Sign Language Recognition;Sign News;Video;

Cross-Batch Memory for Embedding Learning

by: Yue Qiu

Deep Metric Learning;Hard Negative Instances

Visual Chirality

by: Hirokatsu Kataoka

Visual Chirality Data Augmentation

Squeeze-and-Attention Networks for Semantic Segmentation

by: Anonymous

Attention semantic segmentation

A Spatial RNN Codec for End-to-End Image Compression

by: Shoma Iwai

Image Compression

METAL: Minimum Effort Temporal Activity Localization in Untrimmed Videos

by: Komiki Maruyama

Few-shot Learning Weakly Supervised Learning Temporal Action Localization

PIFuHD: Multi-Level Pixel-Aligned Implicit Function for High-Resolution 3D Human Digitization

by: Pavel Savkin

high resolution 3d reconstruction human shape implicit function monocular image

Efficient Neural Vision Systems Based on Convolutional Image Acquisition

by: Ryota Suzuki

optical system point spread function cnn

PQ-NET: A Generative Part Seq2Seq Network for 3D Shapes

by: Hiroaki Aizawa

3D Shape Generation Seq2Seq

Learning to Shadow Hand-Drawn Sketches

by: yasud

GAN hand-drawn shadow lighting

Geometry-Aware Satellite-to-Ground Image Synthesis for Urban Areas

by: Hiroaki Aizawa

cross-view synthesis

Total Deep Variation for Linear Inverse Problems

by: MatsuokaHikaru

image restoration Total Deep Variation Diverse inverse problems maximum a posterior estimator

Learning Unsupervised Hierarchical Part Decomposition of 3D Objects From a Single RGB Image

by: Hiroaki Aizawa

Part Decomposition 3D Reconstruction

DaST: Data-Free Substitute Training for Adversarial Attacks

by: 飯田啄巳

gan adversarial learning data-free online system substitute model black box

Hierarchical Human Parsing With Typed Part-Relation Reasoning

by: Seitaro Shinagawa

human parsing segmentation graph neural networks message-passing reasoning

Neural Data Server: A Large-Scale Search Engine for Transfer Learning Data

by: 岡本大和

Transfer Learning

Footprints and Free Space From a Single Color Image

by: Shoji Sonoyama

segmentation depth estimation

Learning Memory-Guided Normality for Anomaly Detection

by: Shunsuke Nakatsuka

Anomaly Detection

Through the Looking Glass: Neural 3D Reconstruction of Transparent Shapes

by: Teppei Kurita

3D Reconstruction Transparent Surface Normal

SpeedNet: Learning the Speediness in Videos

by: Teppei Kurita

Action Recognition Speediness Video

Self-Supervised Learning of Pretext-Invariant Representations

by: hiroki tsujimoto

representation learning Self-Supervised

RetinaTrack: Online Single Stage Joint Detection and Tracking

by: Michiya Abe

mot retinanet waymo

Revisiting Knowledge Distillation via Label Smoothing Regularization

by: Hiroki Yamamoto

Knowledge Distillation Transfer Learning

Action Genome: Actions As Compositions of Spatio-Temporal Scene Graphs

by: Shun.ishizaka

scene graph dataset action recognition relationships compositional

3D Photography Using Context-Aware Layered Depth Inpainting

by: Teppei Kurita

3D Reconstruction Depth Occlusion

TextureFusion: High-Quality Texture Acquisition for Real-Time RGB-D Scanning

by: yusuke saito

RGB-D Scanning Texture Acquisition

Light Field Spatial Super-Resolution via Deep Combinatorial Geometry Embedding and Structural Consistency Regularization

by: Teppei Kurita

Light Field Computational Photography 4D Super Resolution

Learning Event-Based Motion Deblurring

by: Teppei Kurita

Dynamic Vision Sensor Event Based Camera Deblur

Panoptic-Based Image Synthesis

by: Teppei Kurita

Panoptic Map Image Synthesis

Adaptive Subspaces for Few-Shot Learning

by: Shuhei M Yoshida

few-shot learning subspace method

Rethinking Data Augmentation for Image Super-resolution: A Comprehensive Analysis and a New Strategy

by: So Uchida

super-resolution data augmentation

A Self-supervised Approach for Adversarial Robustness

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples robustness self-supervised learning

End-to-End Illuminant Estimation Based on Deep Metric Learning

by: Teppei Kurita

Illuminant Estimation Triplet Loss Color Histogram Color Consistency ICDF

15 Keypoints Is All You Need

by: Shuhei M Yoshida

pose tracking PoseTrack Challenge tracking

Towards Fairness in Visual Recognition: Effective Strategies for Bias Mitigation

by: Ryo Takahashi

Probabilistic Structural Latent Representation for Unsupervised Embedding

by: 綱島秀樹

Unsupervised Embedding Retrieval Fine-grained Classification

Controllable Person Image Synthesis With Attribute-Decomposed GAN

by: 綱島秀樹

GAN PIS Disentanglement Pose Transfer Virtual Try-on

From Image Collections to Point Clouds With Self-Supervised Shape and Pose Networks

by: Hiroaki Aizawa

3D Reconstruction Point Cloud Differentiable Renderer

The Secret Revealer: Generative Model-Inversion Attacks Against Deep Neural Networks

by: Rei Tamaru

Adversarial Example Model-Inversion GAN

Unpaired Image Super-Resolution Using Pseudo-Supervision

by: Ho Ching Chiu

Super Resolution Cycle Consistency GAN

SPARE3D: A Dataset for SPAtial REasoning on Three-View Line Drawings

by: Hiroaki Aizawa

Spatial Reasoning

BiFuse: Monocular 360 Depth Estimation via Bi-Projection Fusion

by: Shoji Sonoyama

depth estimation 360 camera single image depth estimation

Unsupervised Learning of Probably Symmetric Deformable 3D Objects From Images in the Wild

by: Hiroaki Aizawa

3D Reconstruction

Webly Supervised Knowledge Embedding Model for Visual Reasoning

by: Shintaro Yamamoto

Visual Reasoning Knowledge Base

Towards Achieving Adversarial Robustness by Enforcing Feature Consistency Across Bit Planes

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples robustness

The GAN That Warped: Semantic Attribute Editing With Unpaired Data

by: 中嶋航大

Towards Backward-Compatible Representation Learning

by: Tomoki Tanimura

Representation Learning Life Long Learning Continuous Learning Image Retrieval Feature Extraction

Conditional Channel Gated Networks for Task-Aware Continual Learning

by: 岡本大和

Continual Learning

Iterative Answer Prediction With Pointer-Augmented Multimodal Transformers for TextVQA

by: Shintaro Yamamoto

VQA TextVQA Transformer

End-to-End Adversarial-Attention Network for Multi-Modal Clustering

by: Yusuke Machii

Multi-Modal Adversarial clustering attention

Articulation-Aware Canonical Surface Mapping

by: Hiroaki Aizawa

Canonical Surface Mapping Articulation

WCP: Worst-Case Perturbations for Semi-Supervised Deep Learning

by: 岡本大和

Semi-supervised Learning worst-case perturbations model-based robustness sample-based robustness additive perturbations dropconnect perturbations

S3VAE: Self-Supervised Sequential VAE for Representation Disentanglement and Data Generation

by: Shuhei M Yoshida

self-supervised learning sequential data representation disentanglement

Iterative Context-Aware Graph Inference for Visual Dialog

by: Yue Qiu

Visual Dialog Graph CNN

Learned Image Compression With Discretized Gaussian Mixture Likelihoods and Attention Modules

by: Shoma Iwai

Image Compression

TA-Student VQA: Multi-Agents Training by Self-Questioning

by: Shintaro Yamamoto

VQA Reinforcement Learning

Skeleton-Based Action Recognition With Shift Graph Convolutional Network

by: Wataru Kudo

Action Recognition Graph Convolutional Network

Use the Force, Luke! Learning to Predict Physical Forces by Simulating Effects

by: Kensho Hara

Physics Video Force Prediction

HybridPose: 6D Object Pose Estimation Under Hybrid Representations

by: Pavel Savkin

6D pose estimation edge symmetry keypoint SVD

Two Causal Principles for Improving Visual Dialog

by: Yue Qiu

Visual Dialog

UCTGAN: Diverse Image Inpainting Based on Unsupervised Cross-Space Translation

by: Ho Ching Chiu

Image Inpainting Unsupervised Learning Multiple-solution inpainting

Single-Shot Monocular RGB-D Imaging Using Uneven Double Refraction

by: Shoji Sonoyama

single shot depth estimation depth estimation double refraction

Intelligent Home 3D: Automatic 3D-House Design From Linguistic Descriptions Only

by: Ryota Suzuki

GAN GCN layout generation texture generation text-to-3d

Watch Your Up-Convolution: CNN Based Generative Deep Neural Networks Are Failing to Reproduce Spectral Distributions

by: Shoma Iwai

GAN Deep Fake

D2Det: Towards High Quality Object Detection and Instance Segmentation

by: Munetaka Minoguchi

object detection instance segmentation 2 stage detector

High-Resolution Daytime Translation Without Domain Labels

by: kiyo

Image-to-Image Translation

Few-Shot Open-Set Recognition Using Meta-Learning

by: 岡本大和

open-set few-shot meta-learning

Efficient Adversarial Training With Transferable Adversarial Examples

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial training adversarial example robustness

BachGAN: High-Resolution Image Synthesis From Salient Object Layout

by: Shoma Iwai

GAN Layout

Telling Left From Right: Learning Spatial Correspondence of Sight and Sound

by: Masuyama Yoshiki

self-supervised learning audio-visual multi modal object detection sound source separation

Adversarial Examples Improve Image Recognition

by: Rei Tamaru

Adversarial Example Image Recognition Batch Normalization

One Man's Trash Is Another Man's Treasure: Resisting Adversarial Examples by Adversarial Examples

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples robustness

Erasing Integrated Learning: A Simple Yet Effective Approach for Weakly Supervised Object Localization

by: 堀田大地

WSOL

Orderless Recurrent Models for Multi-Label Classification

by: 堀田大地

multi-label classification

Semantically Multi-Modal Image Synthesis

by: 綱島秀樹

GAN Conditional Image Synthesis SMIS

Counterfactual Samples Synthesizing for Robust Visual Question Answering

by: Shintaro Yamamoto

VQA

Auxiliary Training: Towards Accurate and Robust Models

by: 福原吉博 (Yoshihiro Fukuhara)

robustness distillation trade-off

Disentangled Image Generation Through Structured Noise Injection

by: Ho Ching Chiu

GAN Image Generation Disentanglement

Generating and Exploiting Probabilistic Monocular Depth Estimates

by: Shoji Sonoyama

depth estimation single depth estimation C-VAE MAP estimation

Listen to Look: Action Recognition by Previewing Audio

by: Shun.ishizaka

action recognition audio-visual learning multi-modal video understanding

BBN: Bilateral-Branch Network With Cumulative Learning for Long-Tailed Visual Recognition

by: Hiroki Yamamoto

Class Imbalance Long tail Classification

SynSin: End-to-End View Synthesis From a Single Image

by: Anonymous

Image and video synthesis Neural Generative Models Novel view synthesis

PhraseCut: Language-Based Image Segmentation in the Wild

by: Shun.ishizaka

VQA dataset vision and language segmentation

Learning Multiview 3D Point Cloud Registration

by: yusuke saito

multiview 3D point cloud registration RGB-D

STAViS: Spatio-Temporal AudioVisual Saliency Network

by: Masuyama Yoshiki

audio-visual multi modal saliency map

Don't Judge an Object by Its Context: Learning to Overcome Contextual Bias

by: pshiko

Few-Shot Class-Incremental Learning

by: Shuhei M Yoshida

few-shot learning class-incremental learning

Progressive Mirror Detection

by: Teppei Kurita

Mirror Mirror Detection Object Detection Edge Detection

Counterfactual Vision and Language Learning

by: Yue Qiu

Vision and Language Counterfactual Causal Reasoning Visual Question Answering

Learn2Perturb: An End-to-End Feature Perturbation Learning to Improve Adversarial Robustness

by: Tomoki Tanimura

Adversarial Robustness Adversarial Defense Adversarial Training Noise Injection

Visual Commonsense R-CNN

by: Yue Qiu

Recognition Object Detection Vision and Language

Cascade EF-GAN: Progressive Facial Expression Editing With Local Focuses

by: Ho Ching Chiu

Multi-Task Collaborative Network for Joint Referring Expression Comprehension and Segmentation

by: Shintaro Yamamoto

Referring Expression Multi-Task Learning

Nested Scale-Editing for Conditional Image Synthesis

by: 綱島秀樹

GAN Conditional Image Synthesis I2I Disentanglement

Differentiable Adaptive Computation Time for Visual Reasoning

by: Seitaro Shinagawa

differentiable visual reasoning adaptive computation time VQA

Active Speakers in Context

by: Yue Qiu

Active Speakers Detection Video Recognition

Predicting Goal-Directed Human Attention Using Inverse Reinforcement Learning

by: Kensho Hara

Human Attention Saliency Inverse Reinforcement Learning

UnrealText: Synthesizing Realistic Scene Text Images From the Unreal World

by: Yue Qiu

3D Dataset Scene Text Generation

Robustness Guarantees for Deep Neural Networks on Videos

by: Rei Tamaru

Adversarial Learning Optical Flow Video RNN

VQA With No Questions-Answers Training

by: Shintaro Yamamoto

VQA

MaskGAN: Towards Diverse and Interactive Facial Image Manipulation

by: 綱島秀樹

GAN Image Manipulation Mask2Img Face Style Injection

Show, Edit and Tell: A Framework for Editing Image Captions

by: Yue Qiu

Image Captioning Vision and Language

Disentangling and Unifying Graph Convolutions for Skeleton-Based Action Recognition

by: Mariko Nakano

Action Recognition Graph Convolutional Network

Deep Distance Transform for Tubular Structure Segmentation in CT Scans

by: 中村優太

Radiology Segmentation CT scan Tubular structure Multi-task learning Geometry-aware

Neural Topological SLAM for Visual Navigation

by: Yue Qiu

Visual Navigation SLAM Embodied AI

Embodied Language Grounding With 3D Visual Feature Representations

by: Yue Qiu

3D Visual Feature Representations Vision and Language

Neural Networks Are More Productive Teachers Than Human Raters: Active Mixup for Data-Efficient Knowledge Distillation From a Blackbox Model

by: Hiroki Ohashi

distillation active learning mix-up

Physically Realizable Adversarial Examples for LiDAR Object Detection

by: Rei Tamaru

Adversarial Example 3D Point Cloud

Vision-Dialog Navigation by Exploring Cross-Modal Memory

by: Yue Qiu

Visual Dialog Vision-Language Navigation

A Context-Aware Loss Function for Action Spotting in Soccer Videos

by: Komiki Maruyama

Weakly Supervised Learning Temporal Action Localization Video Summarization

RoboTHOR: An Open Simulation-to-Real Embodied AI Platform

by: Ryota Suzuki

sim2real embodied platform robot

Cross-View Correspondence Reasoning Based on Bipartite Graph Convolutional Network for Mammogram Mass Detection

by: 藤中彩乃

Crossmodal Learning Graph Convolutional Network Cross-view Correspondence Reasoning

An Internal Covariate Shift Bounding Algorithm for Deep Neural Networks by Unitizing Layers' Outputs

by: MatsuokaHikaru

Internal Covariate Shift Batch Normalization Earth Mover distance

D3VO: Deep Depth, Deep Pose and Deep Uncertainty for Monocular Visual Odometry

by: Pavel Savkin

SLAM visual odometry VO monocular unsupervised depth DSO

Graph Structured Network for Image-Text Matching

by: Shintaro Yamamoto

Image-Text Matching Graph Matching

SCT: Set Constrained Temporal Transformer for Set Supervised Action Segmentation

by: Komiki Maruyama

Weakly Supervised Learning Temporal Action Segmentation

Relative Interior Rule in Block-Coordinate Descent

by: MatsuokaHikaru

Max-Sum Potts model maximum a posteriori message-passing

End-to-End Pseudo-LiDAR for Image-Based 3D Object Detection

by: Ryota Suzuki

3D detection pseudo-LiDAR end-to-end

Dataless Model Selection With the Deep Frame Potential

by: 中村真裕

neural network deep neural network

Unsupervised Person Re-Identification via Multi-Label Classification

by: Masanori YANO

Unsupervised Learning Person Re-Identification Transfer Learning

FocalMix: Semi-Supervised Learning for 3D Medical Image Detection

by: 中村優太

3D Object detection Focal Loss Feature Pyramid Networks CT scan Lung Nodule Semi-supervised learning

SDC-Depth: Semantic Divide-and-Conquer Network for Monocular Depth Estimation

by: Shoji Sonoyama

single depth estimation semantic segmentation instance segmentation

MPM: Joint Representation of Motion and Position Map for Cell Tracking

by: Issei Tomotsuka

cell microscopy object detection

Cross-Domain Correspondence Learning for Exemplar-Based Image Translation

by: 綱島秀樹

GAN Domain Transfer Exemplar-guided

Music Gesture for Visual Sound Separation

by: Masuyama Yoshiki

audio-visual multi modal sound source separation

Learning a Neural Solver for Multiple Object Tracking

by: 遠藤大河

motion and tracking MOT MPN

Rotation Equivariant Graph Convolutional Network for Spherical Image Classification

by: 古川遼

spherical image equivariance graph convolutional network

Training Quantized Neural Networks With a Full-Precision Auxiliary Module

by: Tomoro Tokusumi

Quantization Object Detection

Semantic Image Manipulation Using Scene Graphs

by: Seitaro Shinagawa

image manipulation image editing scene graph image generation GAN

Multi-Dimensional Pruning: A Unified Framework for Model Compression

by: Hiroki Ohashi

pruning model compression

Speech2Action: Cross-Modal Supervision for Action Recognition

by: Shun.ishizaka

action recognition cross-modal weakly supervised learning BERT

AdversarialNAS: Adversarial Neural Architecture Search for GANs

by: 綱島秀樹

GAN NAS

Unsupervised Learning for Intrinsic Image Decomposition From a Single Image

by: Teppei Kurita

Intrinsic Decomposition Physics Constraint Albedo Shading

Strip Pooling: Rethinking Spatial Pooling for Scene Parsing

by: Anonymous

semantic segmentation pooling

Momentum Contrast for Unsupervised Visual Representation Learning

by: pshiko

contrastive-learning self-supervised unsupervised representation-learning

Towards Unsupervised Learning of Generative Models for 3D Controllable Image Synthesis

by: 堀田大地

GAN 3D

AdderNet: Do We Really Need Multiplications in Deep Learning?

by: Hiroki Ohashi

efficient training efficient inference

CPR-GCN: Conditional Partial-Residual Graph Convolutional Network in Automated Anatomical Labeling of Coronary Arteries

by: 藤中彩乃

CPR-GCN Conditional Partial-Residual Graph Convolutional Network 3D Hybrid Model Coronary Artery

Few-Shot Video Classification via Temporal Alignment

by: Hiroki Ohashi

few-shot learning video classification DTW

Learning Dynamic Routing for Semantic Segmentation

by: hiroki tsujimoto

semantic segmentation

Instance Segmentation of Biological Images Using Harmonic Embeddings

by: 藤中彩乃

Instance Segmentation Harmonic Embedding U-Net proposal-free approach

Computing the Testing Error Without a Testing Set

by: Anonymous

DNN without a Testing Set

Rethinking Classification and Localization for Object Detection

by: pshiko

object-detection R-CNN

Adversarial Latent Autoencoders

by: Shunsuke Nakatsuka

GANs Generative Models Autoencoder Representation Learning

Novel Object Viewpoint Estimation Through Reconstruction Alignment

by: Hiroaki Aizawa

Viewpoint Estimation

Unsupervised Learning From Video With Deep Neural Embeddings

by: hiroki tsujimoto

unsupervised learning action recognition

Unsupervised Person Re-Identification via Softened Similarity Learning

by: Masanori YANO

Unsupervised Learning Person Re-Identification Similarity Dissimilarity

Neural Voxel Renderer: Learning an Accurate and Controllable Rendering Tool

by: Hiroaki Aizawa

Differentiable Renderer Voxel

When NAS Meets Robustness: In Search of Robust Architectures Against Adversarial Attacks

by: Tomoki Tanimura

Adversarial Robustness Neural Architecture Search Adversarial Defense Adversarial Examples

Dynamic Fluid Surface Reconstruction Using Deep Neural Network

by: 榎本

3D reconstruction

Regularizing Discriminative Capability of CGANs for Semi-Supervised Generative Learning

by: 綱島秀樹

GAN Conditional Image Synthesis Semi-supervised Learning

MineGAN: Effective Knowledge Transfer From GANs to Target Domains With Few Images

by: Hiroaki Aizawa

GAN Knowledge Transfer

Recognizing Objects From Any View With Object and Viewer-Centered Representations

by: Hiroaki Aizawa

Object Recognition

Satellite Image Time Series Classification With Pixel-Set Encoders and Temporal Self-Attention

by: Hirokatsu Kataoka

satellite image pixel-set encoders self-attention

Wish You Were Here: Context-Aware Human Generation

by: Rei Tamaru

Pose Transfer Pose Generation GAN Semantic Maps

DMCP: Differentiable Markov Channel Pruning for Neural Networks

by: Tomoro Tokusumi

Channel Pruning Markov Process

Self-Supervised Scene De-Occlusion

by: Hiroaki Aizawa

Scene De-Occlusion Amodal Perception

View-GCN: View-Based Graph Convolutional Network for 3D Shape Analysis

by: yusuke saito

view-based graph-based 3D Shape Analysis RGB-D

Single-View View Synthesis With Multiplane Images

by: Hiroaki Aizawa

View Synthesis Multiplane Image

Dynamic Traffic Modeling From Overhead Imagery

by: Katsuyuki Nakamura

Application

UNAS: Differentiable Architecture Search Meets Reinforcement Learning

by: Obi

nas neural architecture search objective function

Single Image Reflection Removal Through Cascaded Refinement

by: Teppei Kurita

Reflection Removal Cascaded Refinement LSTM

Referring Image Segmentation via Cross-Modal Progressive Comprehension

by: Seitaro Shinagawa

semantic segmentation vision-and-language

What You See is What You Get: Exploiting Visibility for 3D Object Detection

by: Higaki Yoshinari

3D object detection LiDAR BEV

Adversarial Texture Optimization From RGB-D Scans

by: 榎本

reconstruction

Plug-and-Play Algorithms for Large-Scale Snapshot Compressive Imaging

by: Higaki Yoshinari

Compressive Imaging ADMM

Adversarial Robustness: From Self-Supervised Pre-Training to Fine-Tuning

by: Tomoki Tanimura

Adversarial Robustness Self-supervised Ensemble Adversarial Examples

Deep Generative Model for Robust Imbalance Classification

by: Shunsuke Nakatsuka

Imbalanced Data Classification Generative Models

Non-Line-of-Sight Surface Reconstruction Using the Directional Light-Cone Transform

by: Higaki Yoshinari

NLOS albedo normal

ViewAL: Active Learning With Viewpoint Entropy for Semantic Segmentation

by: 榎本

ActiveLearning SemanticSegmentation

Adversarial Feature Hallucination Networks for Few-Shot Learning

by: Shuhei M Yoshida

few-shot learning data augmentation GANs

RevealNet: Seeing Behind Objects in RGB-D Scans

by: Hiroaki Aizawa

3D instance segmentation 3D scan completion Semantic instance completion

Multi-scale Domain-adversarial Multiple-instance CNN for Cancer Subtype Classification with Unannotated Histopathological Images

by: 藤中彩乃

Multiple instance learning Domain adversarial normalization Multi-scale learning Weakly supervised learning

Learning to Manipulate Individual Objects in an Image

by: 綱島秀樹

Image Manipulation Object-aware Representation Learning Disentanglement VAE

Compositional Convolutional Neural Networks: A Deep Architecture With Innate Robustness to Partial Occlusion

by: Hiroaki Aizawa

VIBE: Video Inference for Human Body Pose and Shape Estimation

by: 榎本

3D GAN estimation

PatchVAE: Learning Local Latent Codes for Recognition

by: Rei Tamaru

Unsupervised Learning Disentangled Learning Image Recognition VAE Representation Learning

DAVD-Net: Deep Audio-Aided Video Decompression of Talking Heads

by: Yue Qiu

Video Decompression Videos of Talking Heads Audio and Vision

SOS: Selective Objective Switch for Rapid Immunofluorescence Whole Slide Image Classification

by: 藤中彩乃

Selective objective switch SOS Redundancy Paradoxical loss

Contextual Residual Aggregation for Ultra High-Resolution Image Inpainting

by: kiyo

Image Inpainting

A Disentangling Invertible Interpretation Network for Explaining Latent Representations

by: Hiroaki Aizawa

Interpretability Disentanglement

Correction Filter for Single Image Super-Resolution: Robustifying Off-the-Shelf Deep Super-Resolvers

by: So Uchida

Super-Resolution Filter Correction

Local Context Normalization: Revisiting Local Normalization

by: HayataEbisawa

Normalization

Boosting Few-Shot Learning With Adaptive Margin Loss

by: Shuhei M Yoshida

few-shot learning metric learning meta learning additive margin loss

Structure-Guided Ranking Loss for Single Image Depth Prediction

by: Shoji Sonoyama

single depth estimation ranking loss instance segmentation

MotionNet: Joint Perception and Motion Prediction for Autonomous Driving Based on Bird's Eye View Maps

by: HayataEbisawa

3D object detection motion prediction

Learning Rank-1 Diffractive Optics for Single-Shot High Dynamic Range Imaging

by: Higaki Yoshinari

HDR DOE PSF Glare

Organ at Risk Segmentation for Head and Neck Cancer Using Stratified Learning and Neural Architecture Search

by: 榎本

medical segmentation

Orthogonal Convolutional Neural Networks

by: Anonymous

Deep Residual Flow for Out of Distribution Detection

by: Shunsuke Nakatsuka

Out of Distribution Detection Flow Based Model Generative Models

CNN-Generated Images Are Surprisingly Easy to Spot… for Now

by: Ho Ching Chiu

Image Forensic Fake Detection GAN

Classifying, Segmenting, and Tracking Object Instances in Video with Mask Propagation

by: pshiko

video instance segmentation segmentation instance segmentation

Interpreting the Latent Space of GANs for Semantic Face Editing

by: Hiroaki Aizawa

Image Manipulation

RoutedFusion: Learning Real-Time Depth Map Fusion

by: yusuke saito

Depth Fusion RGB-D Multi-view 3D reconstruction

GLU-Net: Global-Local Universal Network for Dense Flow and Correspondences

by: 遠藤大河

Motion and Tracking Optical Flow Geometric Matching

Knowledge As Priors: Cross-Modal Knowledge Generalization for Datasets Without Superior Knowledge

by: Masuyama Yoshiki

cross-modal multi-modal knowledge distillation meta learning

Achieving Robustness in the Wild via Adversarial Mixing With Disentangled Representations

by: 山縣英介

disentangled adversarial

A Characteristic Function Approach to Deep Implicit Generative Modeling

by: 古川遼

implicit generative models GANs characteristic function

Semantics-Guided Neural Networks for Efficient Skeleton-Based Human Action Recognition

by: 山縣英介

semantic human action

KFNet: Learning Temporal Camera Relocalization Using Kalman Filtering

by: yusuke saito

camera relocalization temporal camera relocalization kalman filter

Training Noise-Robust Deep Neural Networks via Meta-Learning

by: 山縣英介

label noise meta learning loss correction

PointRend: Image Segmentation As Rendering

by: pshiko

semantic segmentation instance segmentation

Action Segmentation With Joint Self-Supervised Temporal Domain Adaptation

by: Hiroki Ohashi

action segmentation domain adaptation self-supervised learning transductive learning

Generating Accurate Pseudo-Labels in Semi-Supervised Learning and Avoiding Overconfident Predictions via Hermite Polynomial Activations

by: Obi

semi-supervised learning pseudo-label activation function

Agriculture-Vision: A Large Aerial Image Database for Agricultural Pattern Analysis

by: Teppei Kurita

Dataset Agriculture Semantic Segmentation NIR

Dynamic Refinement Network for Oriented and Densely Packed Object Detection

by: Hirokatsu Kataoka

Object Detection Densely Annotated Objects

Unsupervised Model Personalization While Preserving Privacy and Scalability: An Open Problem

by: Hirokatsu Kataoka

Privacy Preserving

Incremental Few-Shot Object Detection

by: Hiroki Ohashi

object detection few-shot learning incremental learning

Seeing without Looking: Contextual Rescoring of Object Detections for AP Maximization

by: Hirokatsu Kataoka

Object Detection Contextual Reasoning Faster R-CNN Cascade R-CNN

Multimodal Categorization of Crisis Events in Social Media

by: Hirokatsu Kataoka

Multimodal Categorization Crisis Event Detection SNS

Learning From Synthetic Animals

by: Hao

synthetic animals keypoints

Where, What, Whether: Multi-Modal Learning Meets Pedestrian Detection

by: Hirokatsu Kataoka

Pedestrian Detection Multimodal

Your Local GAN: Designing Two Dimensional Local Attention Mechanisms for Generative Models

by: Rei Tamaru

Attention GAN Sparsity

Reference-Based Sketch Image Colorization Using Augmented-Self Reference and Dense Semantic Correspondence

by: yasud

Style Transfer Painting Style Conversion GAN Colorization

Learning Physics-Guided Face Relighting Under Directional Light

by: Rei Tamaru

Image Transfer Relighting GAN

Few-Shot Learning via Embedding Adaptation With Set-to-Set Functions

by: Shuhei M Yoshida

few-shot learning metric learning transformer set-to-set function

Leveraging 2D Data to Learn Textured 3D Mesh Generation

by: Hiroaki Aizawa

3D Shape Generation Mesh

NETNet: Neighbor Erasing and Transferring Network for Better Single Shot Object Detection

by: Munetaka Minoguchi

single shot detection feature pyramid

Deep Grouping Model for Unified Perceptual Parsing

by: Anonymous

semantic segmentation grouping process graph convolution

Image Super-Resolution With Cross-Scale Non-Local Attention and Exhaustive Self-Exemplars Mining

by: 中嶋航大

Towards Causal VQA: Revealing and Reducing Spurious Correlations by Invariant and Covariant Semantic Editing

by: Yue Qiu

Visual Question Answering Causal Reasoning Dataset

FDA: Fourier Domain Adaptation for Semantic Segmentation

by: 中嶋航大

C2FNAS: Coarse-to-Fine Neural Architecture Search for 3D Medical Image Segmentation

by: 中村優太

Neural Architecture Search evolutionary algorithm medical image segmentation

Which Is Plagiarism: Fashion Image Retrieval Based on Regional Representation for Design Protection

by: Hirokatsu Kataoka

Plagiarized Fashion Landmark Detection Image Retrieval

Hierarchical Conditional Relation Networks for Video Question Answering

by: Yue Qiu

Video Question Answering Conditional Relation Networks

Breaking the Cycle - Colleagues Are All You Need

by: fnakamura

Image-to-Image Translation Council-GAN without cycle-consistency

FineGym: A Hierarchical Video Dataset for Fine-Grained Action Understanding

by: Hirokatsu Kataoka

Fine-grained Action Recognition FineGym

The Devil Is in the Details: Delving Into Unbiased Data Processing for Human Pose Estimation

by: 綱島秀樹

Pose Estimation Data Bias

Violin: A Large-Scale Dataset for Video-and-Language Inference

by: Shintaro Yamamoto

Vision-and-Language Dataset

MixNMatch: Multifactor Disentanglement and Encoding for Conditional Image Generation

by: Hiroaki Aizawa

Image Synthesis Disentanglement

Unsupervised Reinforcement Learning of Transferable Meta-Skills for Embodied Navigation

by: Yue Qiu

Visual Navigation Meta Learning Unsupervised Reinforcement Learning

Cops-Ref: A New Dataset and Task on Compositional Referring Expression Comprehension

by: Yue Qiu

Referring Expression Vision and Language Dataset

AugFPN: Improving Multi-Scale Feature Learning for Object Detection

by: Munetaka Minoguchi

object detection feature pyramid network

3D-ZeF: A 3D Zebrafish Tracking Benchmark Dataset

by: yamada ryosuke

dataset 3d

DeepEMD: Few-Shot Image Classification With Differentiable Earth Mover's Distance and Structured Classifiers

by: Shuhei M Yoshida

few-shot learning earth mover's distance

Cascaded Refinement Network for Point Cloud Completion

by: Naoya Chiba

Point Cloud Completion Point Cloud Upsampling

Vec2Face: Unveil Human Faces From Their Blackbox Features in Face Recognition

by: 野中琢登

GAN

NeuralScale: Efficient Scaling of Neurons for Resource-Constrained Deep Neural Networks

by: Tomoro Tokusumi

Pruning NAS

On Isometry Robustness of Deep 3D Point Cloud Models Under Adversarial Attacks

by: Naoya Chiba

Adversarial Attacks Point Cloud Isometry Robustness

Few-Shot Pill Recognition

by: Masanori YANO

Few-Shot Learning Recognition Image Classification Segmentation Dataset

Mapillary Street-Level Sequences: A Dataset for Lifelong Place Recognition

by: Anonymous

MSLS Place Recognition

Image Processing Using Multi-Code GAN Prior

by: Hao

GAN inversion prior image processing

Deep Optics for Single-Shot High-Dynamic-Range Imaging

by: Higaki Yoshinari

HDR DOE PSF

Rotate-and-Render: Unsupervised Photorealistic Face Rotation From Single-View Images

by: 綱島秀樹

GAN Neural Renderer 3D Generation Unsupervised Self-Supervised Face Verification Face Identification Data Augmentation

Universal Litmus Patterns: Revealing Backdoor Attacks in CNNs

by: 飯田啄巳

backdoor attack poisoning universal attack defense

ActiveMoCap: Optimized Viewpoint Selection for Active Human Motion Capture

by: Shoji Sonoyama

human pose estimation active vision drone

Temporal Pyramid Network for Action Recognition

by: Anonymous

Action Recognition

Visual-Textual Capsule Routing for Text-Based Video Segmentation

by: 日坂　幸次

Vision and Language

SuperGlue: Learning Feature Matching With Graph Neural Networks

by: Shoji Sonoyama

feature matching local feature graph neural network SfM SLAM

Meta-Transfer Learning for Zero-Shot Super-Resolution

by: So Uchida

Super-Resolution Internal Learning Meta Learning

Deep Image Spatial Transformation for Person Image Generation

by: Hiroaki Aizawa

Pose-guided person image generation

DUNIT: Detection-Based Unsupervised Image-to-Image Translation

by: kiyo

Image-to-Image Translation

Tracking by Instance Detection: A Meta-Learning Approach

by: 遠藤大河

Object Tracking Object Detection

Bridging the Gap Between Anchor-Based and Anchor-Free Detection via Adaptive Training Sample Selection

by: pshiko

object detection anchor-based anchor-free

CookGAN: Causality Based Text-to-Image Synthesis

by: Hirokatsu Kataoka

GAN Image Synthesis Cooking

You2Me: Inferring Body Pose in Egocentric Video via First and Second Person Interactions

by: Katsuyuki Nakamura

egocentric vision pose estimation

ClusterFit: Improving Generalization of Visual Representations

by: Hirokatsu Kataoka

Self-supervised Learning ClusterFit

CycleISP: Real Image Restoration via Improved Data Synthesis

by: Teppei Kurita

Noise Reduction RAW sRGB ISP Attention

Modality Shifting Attention Network for Multi-Modal Video Question Answering

by: Shintaro Yamamoto

Video Question Answering

Few-Shot Object Detection With Attention-RPN and Multi-Relation Detector

by: Hiroki Ohashi

object detection few-shot learning new dataset

Task Agnostic Robust Learning on Corrupt Outputs by Correlation-Guided Mixture Density Networks

by: 岡本大和

Robust Learning; Bayesian Deep Learning; Semi-supervised Learning

Graph-Structured Referring Expression Reasoning in the Wild

by: 日坂　幸次

Vision and Language

Unbiased Scene Graph Generation From Biased Training

by: 綱島秀樹

Causal Inference Counterfactual SGG Scene Graph Generation

Blindly Assess Image Quality in the Wild Guided by a Self-Adaptive Hyper Network

by: Shoma Iwai

IQA Image quality assessment

Learning 3D Semantic Scene Graphs From 3D Indoor Reconstructions

by: Yue Qiu

3D Vision 3D Scene Graph PointNet Graph NN

PhysGAN: Generating Physical-World-Resilient Adversarial Examples for Autonomous Driving

by: HayataEbisawa

ALFRED: A Benchmark for Interpreting Grounded Instructions for Everyday Tasks

by: Yue Qiu

Vision and Language Egocentric Vision Dataset

Deep Relational Reasoning Graph Network for Arbitrary Shape Text Detection

by: Yue Qiu

Scene Text Detection Graph NN Relational Reasoning

RiFeGAN: Rich Feature Generation for Text-to-Image Synthesis From Prior Knowledge

by: 中村優太

Text-to-image generation Text-to-Image Synthesis multi-captions generative adversarial networks

In Defense of Grid Features for Visual Question Answering

by: Shintaro Yamamoto

Vision-and-Language Attention

IntrA: 3D Intracranial Aneurysm Dataset for Deep Learning

by: Anonymous

3D Medicine disease

Transferring Dense Pose to Proximal Animal Classes

by: Hiroaki Aizawa

Transfer Learning

Visual Grounding in Video for Unsupervised Word Translation

by: Masuyama Yoshiki

multi-modal machine translation

Multi-Modality Cross Attention Network for Image and Sentence Matching

by: Shintaro Yamamoto

Image-Text Matching Transformer

Fantastic Answers and Where to Find Them: Immersive Question-Directed Visual Attention

by: Yue Qiu

Visual Attention Visual Question Answering Vision and Language

End-to-End Learning of Visual Representations From Uncurated Instructional Videos

by: Seito Kasai

Action Recognition Self-supervised Learning Multimodal

Deep Stereo Using Adaptive Thin Volume Representation With Uncertainty Awareness

by: Shoji Sonoyama

multi view stereo adaptive thin volume depth estimation

Multi-Path Region Mining for Weakly Supervised 3D Semantic Segmentation on Point Clouds

by: Naoya Chiba

Point Cloud Segmentation Weakly Supervised 3D Semantic Segmentation

DualSDF: Semantic Shape Manipulation Using a Two-Level Representation

by: 綱島秀樹

3D Generation 3D Manipulation

Fine-Grained Generalized Zero-Shot Learning via Dense Attribute-Based Attention

by: Shuhei M Yoshida

zero-shot learning attention mechanism

Point-GNN: Graph Neural Network for 3D Object Detection in a Point Cloud

by: Naoya Chiba

Point Cloud Object Detection LiDAR

Overcoming Classifier Imbalance for Long-Tail Object Detection With Balanced Group Softmax

by: Masanori YANO

Object Detection Instance Segmentation Imbalanced Data

Dynamic Graph Message Passing Networks

by: Takuji Tahara

Graph Semantic Segmentation Instance Segmentation Object Detection

MaskFlownet: Asymmetric Feature Matching With Learnable Occlusion Mask

by: 遠藤大河

Machine Learning Optical Flow AsymOFMM

On the General Value of Evidence, and Bilingual Scene-Text Visual Question Answering

by: Seitaro Shinagawa

vision-and-language VQA scene text detection

Multi-View Neural Human Rendering

by: yusuke saito

dynamic 3D reconstruction multi-view

AOWS: Adaptive and Optimal Network Width Search With Latency Constraints

by: Obi

NAS neural architecture search slimmable network width search

Recurrent Feature Reasoning for Image Inpainting

by: Seitaro Shinagawa

in-painting image generation partial convolution reasoning contextual attention

Learning Texture Transformer Network for Image Super-Resolution

by: So Uchida

Super-Resolution Reference-based Super-Resolution Transformer Attention

How Useful Is Self-Supervised Pretraining for Visual Tasks?

by: Hiroki Ohashi

self-supervised learning

Deblurring by Realistic Blurring

by: Teppei Kurita

Blur Deblur GAN SinGAN

Video Modeling With Correlation Networks

by: Kensho Hara

Action Recognition Motion Representation

ColorFool: Semantic Adversarial Colorization

by: Tomoki Tanimura

Adversarial Attack Semantic Segmentation Black Box Attack Adversarial Examples Color Space

Towards Photo-Realistic Virtual Try-On by Adaptively Generating-Preserving Image Content

by: 綱島秀樹

Virtual Try-on

Video Playback Rate Perception for Self-Supervised Spatio-Temporal Representation Learning

by: Hiroki Ohashi

self-supervised learning video data action recognition video retrieval

How Does Noise Help Robustness? Explanation and Exploration under the Neural SDE Framework

by: 榎本

Self-Supervised Learning of Interpretable Keypoints From Unlabelled Videos

by: Hiroaki Aizawa

Self-supervised Learning

Fine-Grained Image-to-Image Transformation Towards Visual Recognition

by: Shoma Iwai

Image Transformation

3D Part Guided Image Editing for Fine-Grained Object Understanding

by: Yue Qiu

3D Editing Semantic segmentation

Hierarchical Graph Attention Network for Visual Relationship Detection

by: Yue Qiu

Visual Relationship Detection Graph NN

Low-Rank Compression of Neural Nets: Learning the Rank of Each Layer

by: Kiro Otsu

Neural network compression low-rank optimization

Weakly Supervised Visual Semantic Parsing

by: Yue Qiu

Weakly Supervised Learning Visual Semantic Parsing

Attention-Guided Hierarchical Structure Aggregation for Image Matting

by: Masaki Taniguchi

alpha matting attention

Online Knowledge Distillation via Collaborative Learning

by: Yue Qiu

Knowledge Distillation

End-to-End Camera Calibration for Broadcast Videos

by: Hirokatsu Kataoka

Camera Calibration Sports Scene Basketball

ActionBytes: Learning From Trimmed Videos to Localize Actions

by: Hirokatsu Kataoka

Action Localization Action Recognition

Distilling Cross-Task Knowledge via Relationship Matching

by: 岡本大和

Knowledge Distillation Model Reuse Knowledge Transfer Cross-Task Learning Embedding Learning

Category-Level Articulated Object Pose Estimation

by: Hirokatsu Kataoka

Point Cloud Depth Image 3D Object Recognition

Cross-Domain Detection via Graph-Induced Prototype Alignment

by: 岡本大和

Cross-domain Detection Relation Graph Prototype-based Domain Adaptation Balanced Training

Mask Encoding for Single Shot Instance Segmentation

by: pshiko

instance segmentation single stage COCO

DIST: Rendering Deep Implicit Signed Distance Function With Differentiable Sphere Tracing

by: Naoya Chiba

Implicit Function DeepSDF Rendering

Deep Parametric Shape Predictions Using Distance Fields

by: Naoya Chiba

Distance Field Chamfer Distance Font

Rethinking Zero-Shot Video Classification: End-to-End Training for Realistic Applications

by: Shuhei M Yoshida

SDFDiff: Differentiable Rendering of Signed Distance Fields for 3D Shape Optimization

by: Naoya Chiba

Differentiable Rendering Signed Distance Fields Single-view 3D Reconstruction

TubeTK: Adopting Tubes to Track Multi-Object in a One-Step Training Model

by: pacifinapacific

MOT Tracking 3DCNN

Robust Learning Through Cross-Task Consistency

by: Shun.ishizaka

consistency multi-task 3D robust surface normals depth

Generalizing Hand Segmentation in Egocentric Videos With Uncertainty-Guided Model Adaptation

by: Katsuyuki Nakamura

egocentric vision hand segmentation domain adaptation self-supervision

Single-Stage 6D Object Pose Estimation

by: Shoji Sonoyama

pose estimation single image

Improving Convolutional Networks With Self-Calibrated Convolutions

by: Yusuke Kyokawa

recognition architecture cnn

ManiGAN: Text-Guided Image Manipulation

by: 綱島秀樹

GAN Image Manipulation Text-guided Image Manipulation

Learning Texture Invariant Representation for Domain Adaptation of Semantic Segmentation

by: 鏡川悠介

domain adaptation style transfer semantic segmentation

Joint 3D Instance Segmentation and Object Detection for Autonomous Driving

by: Higaki Yoshinari

LiDAR 3D instance segmentation bounding box

StyleRig: Rigging StyleGAN for 3D Control Over Portrait Images

by: Ho Ching Chiu

GAN Image Synthesis Cycle Consistency Rig-like control

Local Implicit Grid Representations for 3D Scenes

by: Naoya Chiba

Local Implicit Grid 3D Reconstruction Implicit Surface

Rethinking Class-Balanced Methods for Long-Tailed Visual Recognition From a Domain Adaptation Perspective

by: pshiko

class imbalanced data long tailed data recognition domain adaptation

TEA: Temporal Excitation and Aggregation for Action Recognition

by: Anonymous

Action Recognition

Self-Supervised Learning of Video-Induced Visual Invariances

by: Hiroki Ohashi

self-supervised learning video transfer learning

Straight to the Point: Fast-Forwarding Videos via Reinforcement Learning Using Textual Data

by: Shintaro Yamamoto

Fast-forwarding Reinforcement Learning

Real-Time Panoptic Segmentation From Dense Detections

by: Masaki Taniguchi

panoptic segmentation single-shot real-time

Exploiting Joint Robustness to Adversarial Perturbations

by: Tomoki Tanimura

Adversarial Robustness Ensemble Learning Adversarial Examples Adversarial Defense

Multi-Modal Graph Neural Network for Joint Reasoning on Vision and Scene Text

by: Shintaro Yamamoto

VQA GNN

A Physics-Based Noise Formation Model for Extreme Low-Light Raw Denoising

by: Teppei Kurita

CMOS Image Sensor Noise Model Noise Reduction NR

Cross-Domain Document Object Detection: Benchmark Suite and Method

by: Shintaro Yamamoto

Document Analysis Domain Adaptation Object Detection

Better Captioning With Sequence-Level Exploration

by: 中村優太

Image captioning Video captioning Reinforcement learning

Deep Snake for Real-Time Instance Segmentation

by: Masaki Taniguchi

instance segmentation snake algorithm graph convolutional network

Global Texture Enhancement for Fake Face Detection in the Wild

by: Shoma Iwai

Deepfake detection fake face detection

Reciprocal Learning Networks for Human Trajectory Prediction

by: Ryo Fujii

Trajectory prediction GAN adversarial attack

Revisiting Pose-Normalization for Fine-Grained Few-Shot Recognition

by: Hiroaki Aizawa

Few-shot Classification Fine-grained Classification

Deep Fair Clustering for Visual Learning

by: Ryo Takahashi

fairness fair clustering

DeepFaceFlow: In-the-Wild Dense 3D Facial Motion Estimation

by: 遠藤大河

Motion and Tracking Facial Recognition Optical Flow

PULSE: Self-Supervised Photo Upsampling via Latent Space Exploration of Generative Models

by: Ho Ching Chiu

Super Resolution GAN Self-supervised

SSRNet: Scalable 3D Surface Reconstruction Network

by: Naoya Chiba

3D Surface Reconstruction Implicit Surface 3D Reconstruction

Image2StyleGAN++: How to Edit the Embedded Images?

by: Anonymous

GAN StyleGAN Inpainting Style transfer Reconstruction

Deep Global Registration

by: Shoji Sonoyama

registration point cloud pose estimation

State-Relabeling Adversarial Active Learning

by: 岡本大和

active learning adversarial learning uncertainty relabel

Context Aware Graph Convolution for Skeleton-Based Action Recognition

by: Mariko Nakano

SpineNet: Learning Scale-Permuted Backbone for Recognition and Localization

by: pshiko

network architecture search recognition localization coco

Mixture Dense Regression for Object Detection and Human Pose Estimation

by: Anonymous

LatentFusion: End-to-End Differentiable Reconstruction and Rendering for Unseen Object Pose Estimation

by: Naoya Chiba

Object Pose Estimation 3D shape learning View Synthesis

ImVoteNet: Boosting 3D Object Detection in Point Clouds With Image Votes

by: Naoya Chiba

3D Object Detection RGB-D Deep Hough Voting

End-to-End Learning Local Multi-View Descriptors for 3D Point Clouds

by: Naoya Chiba

Point Cloud Registration Multi-View CNN View Pooling

Weakly-Supervised Domain Adaptation via GAN and Mesh Model for Estimating 3D Hand Poses Interacting Objects

by: Shun.ishizaka

hand pose estimation hand-object interaction GAN 3D mesh model

Learning to Super Resolve Intensity Images From Events

by: Teppei Kurita

Event Camera DVS Dynamic Vision Sensor Super Resolution RNN

SwapText: Image Based Texts Transfer in Scenes

by: Yue Qiu

Text Transfer GAN

Set-Constrained Viterbi for Set-Supervised Action Segmentation

by: Kensho Hara

Action Segmentation HMM Viterbi Weakly-supervised

More Grounded Image Captioning by Distilling Image-Text Matching Model

by: Yue Qiu

Image Captioning Knowledge Distillation

Upgrading Optical Flow to 3D Scene Flow Through Optical Expansion

by: Ryota Suzuki

optical flow 3D optical flow local affine transformation

OctSqueeze: Octree-Structured Entropy Model for LiDAR Compression

by: Ryota Suzuki

octree pointcloud compression

Local Class-Specific and Global Image-Level Generative Adversarial Networks for Semantic-Guided Scene Generation

by: Shoma Iwai

GAN Semantic Image Synthesis Semantic Cross-View Image Translation

Normalized and Geometry-Aware Self-Attention Network for Image Captioning

by: Yue Qiu

Image Captioning

Reinforced Feature Points: Optimizing Feature Detection and Description for a High-Level Task

by: Ryota Suzuki

reinforcement learning keypoint matching

Meshed-Memory Transformer for Image Captioning

by: Yue Qiu

Image Captioning Transformer

Single-Image HDR Reconstruction by Learning to Reverse the Camera Pipeline

by: Ryota Suzuki

HDR camera pipeline

Transform and Tell: Entity-Aware News Image Captioning

by: Yue Qiu

News Image Captioning Image Captioning Transformer

Detecting Attended Visual Targets in Video

by: Hiroki Ohashi

gaze attention target visual target video gaze estimation

Context-Aware and Scale-Insensitive Temporal Repetition Counting

by: Komiki Maruyama

Action Repetition Action Counting

Distilling Knowledge From Graph Convolutional Networks

by: Komiki Maruyama

Graph Convolutional Network Knowledge Distillation

End-to-End Optimization of Scene Layout

by: 綱島秀樹

Scene Graph Layout Generation Conditional Layout Generation

Uncertainty-Aware Score Distribution Learning for Action Quality Assessment

by: Komiki Maruyama

Action Quality Assessment

G-TAD: Sub-Graph Localization for Temporal Action Detection

by: Komiki Maruyama

Temporal Action Localization Graph Convolutional network

nuScenes: A Multimodal Dataset for Autonomous Driving

by: Michiya Abe

dataset autonomous driving lidar radar 3d bounding box nuScenes

Counting Out Time: Class Agnostic Video Repetition Counting in the Wild

by: Komiki Maruyama

Periodicity Detection Repetition Counting

Maintaining Discrimination and Fairness in Class Incremental Learning

by: yamada ryosuke

incremental learning fairness class imbalance

Optimizing Rank-Based Metrics With Blackbox Differentiation

by: MatsuokaHikaru

Learning to Rank LR mAP Blackbox Differentiation

Spatiotemporal Fusion in 3D CNNs: A Probabilistic View

by: 若宮天雅

Video Recognition

PointASNL: Robust Point Clouds Processing Using Nonlocal Neural Networks With Adaptive Sampling

by: Naoya Chiba

Point Cloud Processing Adaptive Sampling Object Detection Semantic Segmentation

Say As You Wish: Fine-Grained Control of Image Caption Generation With Abstract Scene Graphs

by: 日坂　幸次

Vision and Language

Enhanced Blind Face Restoration With Multi-Exemplar Images and Adaptive Spatial Feature Fusion

by: Teppei Kurita

Face Restoration Landmark MLS AdaIN

Learning to Structure an Image With Few Colors

by: Masanori YANO

Recognition Image Classification Color Quantization Regularization

Sign Language Transformers: Joint End-to-End Sign Language Recognition and Translation

by: 日坂　幸次

Vision and Language

CvxNet: Learnable Convex Decomposition

by: 寺田英雄

convex autoencoder cvxnet end-to-end 3d-reconstruction

What Can Be Transferred: Unsupervised Domain Adaptation for Endoscopic Lesions Segmentation

by: 藤中彩乃

Semi-supervised learning Transfer learning Unsupervised domain adaptation Semantic segmentation Crossmodal learning

ActBERT: Learning Global-Local Video-Text Representations

by: Shintaro Yamamoto

Vision-and-Language Pre-training

Exploit Clues From Views: Self-Supervised and Regularized Learning for Multiview Object Recognition

by: yamada ryosuke

multi-view self-supervised learning

A U-Net Based Discriminator for Generative Adversarial Networks

by: kiyo

GAN

Texture and Shape Biased Two-Stream Networks for Clothing Classification and Attribute Recognition

by: yamada ryosuke

fashion two-stream

Unsupervised Representation Learning for Gaze Estimation

by: hiroki tsujimoto

gaze estimation unsupervised learning representation learning few-shot learning disentanglement

Information-Driven Direct RGB-D Odometry

by: yusuke saito

visual odometry visual slam information-based rgb-d

DuDoRNet: Learning a Dual-Domain Recurrent Network for Fast MRI Reconstruction With Deep T1 Prior

by: 中村優太

MRI reconstruction

Advancing High Fidelity Identity Swapping for Forgery Detection

by: Ho Ching Chiu

Face Swap GAN Attention Denormalization Self-supervision

Mitigating Bias in Face Recognition Using Skewness-Aware Reinforcement Learning

by: yamada ryosuke

fairness face recognition reinforcement learning

GraphTER: Unsupervised Learning of Graph Transformation Equivariant Representations via Auto-Encoding Node-Wise Transformations

by: Naoya Chiba

Graph Convolutional Neural Networks Transformation Equivariant Representations Unsupervised Learning

BlendedMVS: A Large-Scale Dataset for Generalized Multi-View Stereo Networks

by: yamada ryosuke

multi-view dataset

HVNet: Hybrid Voxel Network for LiDAR Based 3D Object Detection

by: Higaki Yoshinari

Object detection LiDAR BEV Multi-scale

MLCVNet: Multi-Level Context VoteNet for 3D Object Detection

by: Higaki Yoshinari

3D point cloud object detection context

Fashion Outfit Complementary Item Retrieval

by: yamada ryosuke

fashion

PV-RCNN: Point-Voxel Feature Set Abstraction for 3D Object Detection

by: Naoya Chiba

3D Object Detection Point Cloud PointVoxel-RCNN

Fast Soft Color Segmentation

by: Masanori YANO

Segmentation Self-Supervised Learning

Inter-Task Association Critic for Cross-Resolution Person Re-Identification

by: Anonymous

Person Re-Identification resolution

Something-Else: Compositional Action Recognition With Spatial-Temporal Interaction Networks

by: Katsuyuki Nakamura

action recognition compositionality dataset

Learning Selective Self-Mutual Attention for RGB-D Saliency Detection

by: yusuke saito

saliency detection RGB-D Self-Mutual Attention Non-local network

Polishing Decision-Based Adversarial Noise With a Customized Sampling

by: 山縣英介

AE boundary attack

Flow Contrastive Estimation of Energy-Based Models

by: 古川遼

energy-based model flow-based model contrastive estimation joint training

Detecting Adversarial Samples Using Influence Functions and Nearest Neighbors

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples robustness adversarial detection

Dreaming to Distill: Data-Free Knowledge Transfer via DeepInversion

by: 岡本大和

model inversion data-free distillation transfer pruning compression incremental learning continual learning efficient image synthesis explainable ai

Unsupervised Adaptation Learning for Hyperspectral Imagery Super-Resolution

by: Teppei Kurita

HSI MSI Hyper Spectral Multi Spectral Super Resolution

Deformable Siamese Attention Networks for Visual Object Tracking

by: Anonymous

Object Tracking Siamese Deformable Siamese Attention

Predicting Lymph Node Metastasis Using Histopathological Images Based on Multiple Instance Learning With Deep Graph Convolution

by: 中村優太

Histopathological Images Multi-instance learning Lymph Node Metastasis Cancer Graph Convolutional Network VAE-GAN weakly-supervised learning

ADINet: Attribute Driven Incremental Network for Retinal Image Classification

by: 藤中彩乃

ADINet Attribute driven incremental network image classification Dynamical training Incremental learning framework

Unified Dynamic Convolutional Network for Super-Resolution With Variational Degradations

by: Ho Ching Chiu

Super Resolution Dynamic Convolution Multiple Degradation

Structure Boundary Preserving Segmentation for Medical Image With Ambiguous Boundary

by: 中村優太

Edge Detection Semantic Segmentation Skin Lesion Ultrasound

Unsupervised Instance Segmentation in Microscopy Images via Panoptic Domain Adaptation and Task Re-Weighting

by: 藤中彩乃

unsupervised learning Unsupervised domain adaptation (UDA) Instance segmentation Cycle-consistency panoptic domain adaptive Mask R-CNN (CyC-PDAM) task re-weighting

GrappaNet: Combining Parallel Imaging With Deep Learning for Multi-Coil MRI Reconstruction

by: 中村優太

Parallel Imaging MRI U-Net GRAPPA

GroupFace: Learning Latent Groups and Constructing Group-Based Representations for Face Recognition

by: yamada ryosuke

face recognition

Scale-Equalizing Pyramid Convolution for Object Detection

by: Higaki Yoshinari

Object Detection feature pyramid 3D convolution

Generating 3D People in Scenes Without People

by: Katsuyuki Nakamura

Optical Flow in Dense Foggy Scenes Using Semi-Supervised Learning

by: Higaki Yoshinari

Optical Flow fog

Where Does It Exist: Spatio-Temporal Video Grounding for Multi-Form Sentences

by: Seitaro Shinagawa

video grounding vision-and-language localization

Structure-Preserving Super Resolution With Gradient Guidance

by: Ho Ching Chiu

Super Resolution Gradient Guidance

Enhancing Generic Segmentation With Learned Region Representations

by: hiroki tsujimoto

representation learning semantic segmentation generic segmentation edge detection

ABCNet: Real-Time Scene Text Spotting With Adaptive Bezier-Curve Network

by: Masanori YANO

Object Detection Dataset

Joint Texture and Geometry Optimization for RGB-D Reconstruction

by: yusuke saito

3D reconstruction rgb-d joint texture optimization geometry optimization high-boost normal

C-Flow: Conditional Generative Flow Models for Images and 3D Point Clouds

by: Naoya Chiba

Generative Flow Model Point Cloud Generation Image Manipulation

3DRegNet: A Deep Neural Network for 3D Point Registration

by: Naoya Chiba

3D Registration Point Cloud Registration

Point Cloud Completion by Skip-Attention Network With Hierarchical Folding

by: Naoya Chiba

Point Cloud Completion Hierarchical Folding Point Cloud Auto Encoder

Enhancing Cross-Task Black-Box Transferability of Adversarial Examples With Dispersion Reduction

by: 山縣英介

AEs transferability feature map

Siam R-CNN: Visual Tracking by Re-Detection

by: 遠藤大河

Visual Object Tracking Object Tracking Faster R-CNN

Rethinking Performance Estimation in Neural Architecture Search

by: Obi

NAS neural architecture search performance estimation

Learning to Generate 3D Training Data Through Hybrid Gradient

by: 山縣英介

synthetic image blackbox optimization

Improved Few-Shot Visual Classification

by: Shuhei M Yoshida

few-shot learning deep metric learning Mahalanobis distance

Seeing the World in a Bag of Chips

by: Teppei Kurita

Surface Light Fields RGBD Illumination Estimation

Closed-Loop Matters: Dual Regression Networks for Single Image Super-Resolution

by: Ho Ching Chiu

Super Resolution Self-supervision

Fine-Grained Video-Text Retrieval With Hierarchical Graph Reasoning

by: Seito Kasai

Cross-modal Retrieval Vision & Language

Learning Oracle Attention for High-Fidelity Face Completion

by: Shoma Iwai

face image synthesis image inpainting

PointGroup: Dual-Set Point Grouping for 3D Instance Segmentation

by: Naoya Chiba

Point Cloud 3D Instance Segmentation

FPConv: Learning Local Flattening for Point Convolution

by: Naoya Chiba

Point Cloud Convolution 2D Projection

Understanding Adversarial Examples From the Mutual Influence of Images and Perturbations

by: 福原吉博 (Yoshihiro Fukuhara)

universal adversarial examples adversarial examples

A Spatiotemporal Volumetric Interpolation Network for 4D Dynamic Medical Image

by: 藤中彩乃

4D spatiotemporal volumetric interpolation network (SVIN) spatiotemporal model video interpolation unsupervised learning

Unity Style Transfer for Person Re-Identification

by: Yukitaka Tsuchiya

re-ID GAN

DOA-GAN: Dual-Order Attentive Generative Adversarial Network for Image Copy-Move Forgery Detection and Localization

by: 綱島秀樹

Copy-move Forgery Detection

PointAugment: An Auto-Augmentation Framework for Point Cloud Classification

by: Naoya Chiba

Point Cloud Classification Data Augmentation Adversarial Training

Fast Symmetric Diffeomorphic Image Registration with Convolutional Neural Networks

by: 藤中彩乃

registration Unsupervised symmetric image registration CNN inverse transformation

Extreme Relative Pose Network Under Hybrid Representations

by: Shoji Sonoyama

RGB-D pose estimation

Joint Semantic Segmentation and Boundary Detection Using Iterative Pyramid Contexts

by: Anonymous

semantic segmentation semantic boundary detection

RPM-Net: Robust Point Matching Using Learned Features

by: Naoya Chiba

Point Cloud Registration Point Cloud Matching

Semantic Pyramid for Image Generation

by: fnakamura

image generation GAN generative adversarial networks

Grid-GCN for Fast and Scalable Point Cloud Learning

by: Naoya Chiba

Graph Convolutional Network Point Cloud Classification Point Cloud Segmentation

Unpaired Portrait Drawing Generation via Asymmetric Cycle Mapping

by: fnakamura

image-to-image translation image generation GAN generative adversarial networks

P2B: Point-to-Box Network for 3D Object Tracking in Point Clouds

by: Naoya Chiba

Point Cloud Tracking 3D Object Tracking Point-to-Box

A Shared Multi-Attention Framework for Multi-Label Zero-Shot Learning

by: 岡本大和

multi-label learning zero-shot learning few-shot learning attention

Fast MSER

by: Masanori YANO

Object Detection Feature

Learning Interactions and Relationships Between Movie Characters

by: 若宮天雅

Video Recognition

Proxy Anchor Loss for Deep Metric Learning

by: hiroki tsujimoto

metric learning

Gate-Shift Networks for Video Action Recognition

by: Anonymous

Understanding Human Hands in Contact at Internet Scale

by: Katsuyuki Nakamura

Human-object interaction Hand detection Dataset

Cost Volume Pyramid Based Depth Inference for Multi-View Stereo

by: yusuke saito

3D reconstruction multi view Cost Volume Pyramid Depth Inference

Toward a Universal Model for Shape From Texture

by: 山縣英介

shape texture 3pgame

GAN Compression: Efficient Architectures for Interactive Conditional GANs

by: iida

Compression cGAN GAN

Deep White-Balance Editing

by: Teppei Kurita

White Balance AWB

From Paris to Berlin: Discovering Fashion Style Influences Around the World

by: Hirokatsu Kataoka

Fashion Trend SNS

Towards Global Explanations of Convolutional Neural Networks With Concept Attribution

by: Hirokatsu Kataoka

Explainability Interpretability

Learning Unseen Concepts via Hierarchical Decomposition and Composition

by: Yue Qiu

Zero-shot Learning

Self-Supervised Deep Visual Odometry With Online Adaptation

by: Shoji Sonoyama

visual odometry self supervised learning meta learning online adaptation

Discriminative Multi-Modality Speech Recognition

by: Yue Qiu

Audio Speech Recognition Multi-modality Speech Recognition

Learning From Web Data With Self-Organizing Memory Module

by: Yue Qiu

Web Data Learning from Web Data

Noise Robust Generative Adversarial Networks

by: fnakamura

image generation generative adversarial networks GANs

SAM: The Sensitivity of Attribution Methods to Hyperparameters

by: Ryo Takahashi

Learning Visual Emotion Representations From Web Data

by: Yue Qiu

Web data Emotion Recognition Zero-shot Learning

DeepFLASH: An Efficient Network for Learning-Based Medical Image Registration

by: 藤中彩乃

DeepFLASH registration Learning-based image registration band-limited speedup

Part-Aware Context Network for Human Parsing

by: Hirokatsu Kataoka

Human Pose Estimation

Visual-Semantic Matching by Exploring High-Order Attention and Distraction

by: Yue Qiu

Visual-Semantic Matching Image Retrieval Text Retrieval Scene Graph

RL-CycleGAN: Reinforcement Learning Aware Simulation-to-Real

by: fnakamura

reinforcement learning sim2real image-to-image translation CycleGAN

RankMI: A Mutual Information Maximizing Ranking Loss

by: Shuhei M Yoshida

image retrieval mutual information deep metric learning

Self-Trained Deep Ordinal Regression for End-to-End Video Anomaly Detection

by: 岡本大和

anomaly detection deep ordinal regression human-in-the-loop machine learning anomaly explanation self-training unsupervised representation learning abnormal activity detection video learning

A Programmatic and Semantic Approach to Explaining and Debugging Neural Network Based Object Detectors

by: 岡本大和

population-level explanation testing perception neural network blackbox scenario object detection machine learning autonomous driving

Future Video Synthesis With Object Motion Prediction

by: Yukitaka Tsuchiya

Video Synthesis Spatial Transformer GAN Inpainting

On Positive-Unlabeled Classification in GAN

by: 綱島秀樹

GAN Unconditional Generation

Regularization on Spatio-Temporally Smoothed Feature for Action Recognition

by: 若宮天雅

Action Recognition

Steering Self-Supervised Feature Learning Beyond Local Pixel Statistics

by: 岡本大和

self-supervised representation learning inpainting unsupervised feature learning self-supervision transformations image statistics

SampleNet: Differentiable Point Cloud Sampling

by: Naoya Chiba

Point Cloud Sampling Point Cloud Classification 3D Registration Point Cloud Reconstruction

Self-Supervised 3D Human Pose Estimation via Part Guided Novel Image Synthesis

by: a2kiti

Pose Estimation single image

Towards Discriminability and Diversity: Batch Nuclear-Norm Maximization Under Label Insufficient Situations

by: Asato Matsumoto

semi-supervised learning domain adaptation open domain recognition

Neural Point Cloud Rendering via Multi-Plane Projection

by: Naoya Chiba

Point Cloud Rendering Multi Plane Rendering

Weakly-Supervised 3D Human Pose Learning via Multi-View Images in the Wild

by: a2kiti

pose estimation

Boosting the Transferability of Adversarial Samples via Attention

by: 山縣英介

AEs transfer blackbox

Learning to Have an Ear for Face Super-Resolution

by: Teppei Kurita

Super Resolution Multi Modal Audio Style GAN

Explaining Knowledge Distillation by Quantifying the Knowledge

by: Shuhei M Yoshida

knowledge distillation visual concepts

Adversarial Camouflage: Hiding Physical-World Attacks With Natural Styles

by: Tomoki Tanimura

Adversarial Examples Adversarial Attack Camouflage Real World Style Transfer

Distortion Agnostic Deep Watermarking

by: Tomoki Tanimura

Watermarking Robustness Adversarial Training Noise Encoding

Deep 3D Portrait From a Single Image

by: Yukitaka Tsuchiya

3D face manipulation GAN

Universal Weighting Metric Learning for Cross-Modal Matching

by: Keita Goto

cross modal triplet loss

FOAL: Fast Online Adaptive Learning for Cardiac Motion Estimation

by: 中村優太

meta learning motion estimation MRI cardiac MRI optical flow

Weakly Supervised Discriminative Feature Learning With State Information for Person Identification

by: Masanori YANO

Weakly Supervised Learning Unsupervised Learning Person Re-Identification Recognition

Self-Learning Video Rain Streak Removal: When Cyclic Consistency Meets Temporal Correspondence

by: yamada ryosuke

MCEN: Bridging Cross-Modal Gap between Cooking Recipes and Dish Images with Latent Variable Model

by: Yue Qiu

Cross-Modal Retrieval Vision and Language

Context-Aware Attention Network for Image-Text Retrieval

by: Yue Qiu

Image-Text Retrieval

4D Visualization of Dynamic Events From Unconstrained Multi-View Videos

by: Yukitaka Tsuchiya

4D visualization video manipulation

IMRAM: Iterative Matching With Recurrent Attention Memory for Cross-Modal Image-Text Retrieval

by: Yue Qiu

Image-Text Retrieval Vision and Language

Gated Channel Transformation for Visual Recognition

by: Anonymous

visual recognition

Implicit Functions in Feature Space for 3D Shape Reconstruction and Completion

by: Hiroaki Aizawa

Implicit Function

TransMoMo: Invariance-Driven Unsupervised Video Motion Retargeting

by: Yukitaka Tsuchiya

Motion Retargeting Disentanglement Pose GAN

Active Vision for Early Recognition of Human Actions

by: Mariko Nakano

Action Recognition

Reusing Discriminators for Encoding: Towards Unsupervised Image-to-Image Translation

by: Shoma Iwai

image-to-image GAN style transfer

Spatial Pyramid Based Graph Reasoning for Semantic Segmentation

by: Anonymous

semantic segmentation graph convolution graph reasoning

Dynamic Neural Relational Inference

by: Takehiko Ohkawa

Neural Contours: Learning to Draw Lines From 3D Shapes

by: yasud

draw line 3D

Can We Learn Heuristics for Graphical Model Inference Using Reinforcement Learning?

by: Takehiko Ohkawa

Semantic Segmentation Reinforcement Learning

SaccadeNet: A Fast and Accurate Object Detector

by: Ryosuke Araki

ObjectDetection

Depth Sensing Beyond LiDAR Range

by: Shoji Sonoyama

depth estimation

Cross-Modal Cross-Domain Moment Alignment Network for Person Search

by: 若宮天雅

person search

iTAML: An Incremental Task-Agnostic Meta-learning Approach

by: Shuhei M Yoshida

incremental learning class-incremental learning meta learning

Search to Distill: Pearls Are Everywhere but Not the Eyes

by: Takehiko Ohkawa

Knowledge Distillation Neural Architecture Search

CentripetalNet: Pursuing High-Quality Keypoint Pairs for Object Detection

by: Ryosuke Araki

ObjectDetection

FReeNet: Multi-Identity Face Reenactment

by: yamada ryosuke

face reenactment

Look-Into-Object: Self-Supervised Structure Modeling for Object Recognition

by: Hiroaki Aizawa

A Semi-Supervised Assessor of Neural Architectures

by: 山縣英介

NAS semi-supervised auto-encoder

Domain Balancing: Face Recognition on Long-Tailed Domains

by: yamada ryosuke

face recognition fairness

Retina-Like Visual Image Reconstruction via Spiking Neural Model

by: Teppei Kurita

Spike Camera Retina

ScopeFlow: Dynamic Scene Scoping for Optical Flow

by: Hiroki Ohashi

optical flow learning process augmentation

Deep Representation Learning on Long-Tailed Data: A Learnable Embedding Augmentation Perspective

by: Tomoki Tanimura

Representation Learning Imbalance Dataset Augmentation Person ReIdentification Face Recognition

A Hierarchical Graph Network for 3D Object Detection on Point Clouds

by: 寺田英雄

point cloud 3d object detection hgnet graph network graph convolution pyramidal model

Local-Global Video-Text Interactions for Temporal Grounding

by: Shintaro Yamamoto

Temporal Grounding Vision and Language

Exploring Unlabeled Faces for Novel Attribute Discovery

by: Shoma Iwai

Towards Visually Explaining Variational Autoencoders

by: 古澤嘉久

ExplainableAI generative model anomaly detection

A Real-Time Cross-Modality Correlation Filtering Method for Referring Expression Comprehension

by: Yue Qiu

Referring Expression Vision and Language

On Vocabulary Reliance in Scene Text Recognition

by: Yue Qiu

Scene Text Recognition

ContourNet: Taking a Further Step Toward Accurate Arbitrary-Shaped Scene Text Detection

by: Yue Qiu

Scene Text Recognition

STEFANN: Scene Text Editor Using Font Adaptive Neural Network

by: Yue Qiu

Scene Text Editing

Semantic Drift Compensation for Class-Incremental Learning

by: 綱島秀樹

Incremental Learning Class-IL Semantic Drift

Optical Flow in the Dark

by: Komiki Maruyama

Optical Flow Optical Flow Estimation

Video Panoptic Segmentation

by: Komiki Maruyama

Video Panoptic Segmentation Panoptic Segmentation

On Translation Invariance in CNNs: Convolutional Layers Can Exploit Absolute Spatial Location

by: 榎本

CNN

FeatureFlow: Robust Video Interpolation via Structure-to-Texture Generation

by: 榎本

optical-flow interpolation

Training a Steerable CNN for Guidewire Detection

by: 榎本

medical detection

Semi-Supervised Semantic Image Segmentation With Self-Correcting Networks

by: 榎本

segmentation semi-supervised

Learning to Simulate Dynamic Environments With GameGAN

by: Shunsuke Nakatsuka

GANs Generative Models Reinforcement Learning

Benchmarking the Robustness of Semantic Segmentation Models

by: Hayata Ebisawa

segmentation robustness autonomous

Making Better Mistakes: Leveraging Class Hierarchies With Deep Networks

by: Shuhei M Yoshida

multi-class classification class hierarchy

RetinaFace: Single-Shot Multi-Level Face Localisation in the Wild

by: Masanori YANO

Localization Object Detection

BSP-Net: Generating Compact Meshes via Binary Space Partitioning

by: Shoji Sonoyama

3d reconstruction segmentation single view 3d reconstruction

Height and Uprightness Invariance for 3D Prediction From a Single View

by: Hiroaki Aizawa

Local Deep Implicit Functions for 3D Shape

by: yusuke saito

Deep Implicit Functions 3D shape representation

GhostNet: More Features From Cheap Operations

by: Teppei Kurita

Convolution Ghost Net Redundancy Feature Map

Hierarchically Robust Representation Learning

by: Shuhei M Yoshida

robust optimization representation learning

Learning in the Frequency Domain

by: Tomoki Tanimura

Frequency DCT Image Classification Instance Segmentation Feature Selection

Learning User Representations for Open Vocabulary Image Hashtag Prediction

by: Shintaro Yamamoto

Image Recognition

SEED: Semantics Enhanced Encoder-Decoder Framework for Scene Text Recognition

by: Yue Qiu

Scene Text Recognition Encoder-Decoder Framework

Learn to Augment: Joint Data Augmentation and Network Optimization for Text Recognition

by: Yue Qiu

Scene Text Generation Data Augmentation Scene Text Recognition

SCATTER: Selective Context Attentional Scene Text Recognizer

by: Shintaro Yamamoto

Scene Text Recognition

Fast(er) Reconstruction of Shredded Text Documents via Self-Supervised Deep Asymmetric Metric Learning

by: Yue Qiu

Shredded Text Reconstruction Metric Learning

Modeling the Background for Incremental Learning in Semantic Segmentation

by: 綱島秀樹

Semantic Segmentation Incremental Learning

Interactive Image Segmentation With First Click Attention

by: Shintaro Yamamoto

Segmentation

OrigamiNet: Weakly-Supervised, Segmentation-Free, One-Step, Full Page Text Recognition by learning to unfold

by: Yue Qiu

Text Recognition Full Page Text Recognition

Towards Transferable Targeted Attack

by: Keita Goto

adversarial attack triplet loss

Multi-Modal Domain Adaptation for Fine-Grained Action Recognition

by: Yasuhiko Tajiri

action recognition

PolarNet: An Improved Grid Representation for Online LiDAR Point Clouds Semantic Segmentation

by: Naoya Chiba

LiDAR Point Cloud Point Cloud Segmentation

RandLA-Net: Efficient Semantic Segmentation of Large-Scale Point Clouds

by: Naoya Chiba

Point Cloud Semantic Segmentation Point Cloud Sampling

CARP: Compression Through Adaptive Recursive Partitioning for Multi-Dimensional Images

by: Ryota Suzuki

image compression video compression pixel partitioning Bayesian model

Self-Supervised Equivariant Attention Mechanism for Weakly Supervised Semantic Segmentation

by: Shunsuke Nakatsuka

Weakly Supervised CAM Semantic Segmentation

The Edge of Depth: Explicit Constraints Between Segmentation and Depth

by: Ryota Suzuki

depth estimation self-supervised semantic segmentation

Deformation-Aware Unpaired Image Translation for Pose Estimation on Laboratory Animals

by: hiroki tsujimoto

pose estimation GAN style transfer

Learning Individual Speaking Styles for Accurate Lip to Speech Synthesis

by: Ryota Suzuki

lip reading personal dataset

Old Is Gold: Redefining the Adversarially Learned One-Class Classifier Training Paradigm

by: Shunsuke Nakatsuka

Anomaly Detection Generative Models

Feature-Metric Registration: A Fast Semi-Supervised Approach for Robust Point Cloud Registration Without Correspondences

by: Naoya Chiba

Point Cloud Registration PointNetLK Feature Learning

Screencast Tutorial Video Understanding

by: 若宮天雅

video recognition video caption

PointGMM: A Neural GMM Network for Point Clouds

by: Naoya Chiba

hierarchical Gaussian Mixture Model (hGMM) Point Cloud Registration Point Cloud Generation

Symmetry and Group in Attribute-Object Compositions

by: Anonymous

Combining Detection and Tracking for Human Pose Estimation in Videos

by: Masanori YANO

3D CNN Pose Estimation Tracking Object Detection

Dynamic Hierarchical Mimicking Towards Consistent Optimization Objectives

by: Anonymous

Optimization

DEPARA: Deep Attribution Graph for Deep Knowledge Transferability

by: Asato Matsumoto

knowledge transferability representation task similarity

Learning Video Stabilization Using Optical Flow

by: 遠藤大河

Optical Flow Motion and Tracking

Frequency Domain Compact 3D Convolutional Neural Networks

by: Teppei Kurita

DCT Frequency Domain 3D Convolution

Learning to Cartoonize Using White-Box Cartoon Representations

by: Yukitaka Tsuchiya

GAN Cartoon white-box

Embedding Expansion: Augmentation in Embedding Space for Deep Metric Learning

by: Tomoki Tanimura

Metric Learning Embedding Space Data Augmentation Embedding Expansion

Single Image Reflection Removal With Physically-Based Training Images

by: Anonymous

Reflection removal Physically-Based training dataset

Exploring Self-Attention for Image Recognition

by: 日坂　幸次

Spatio-Temporal Graph for Video Captioning With Knowledge Distillation

by: Katsuyuki Nakamura

video captioning object interaction knowledge distillation

A Local-to-Global Approach to Multi-Modal Movie Scene Segmentation

by: 日坂　幸次

ARShadowGAN: Shadow Generative Adversarial Network for Augmented Reality in Single Light Scenes

by: Yukitaka Tsuchiya

AR Shadow GAN

Detailed 2D-3D Joint Representation for Human-Object Interaction

by: 日坂　幸次

Human-Object Interaction

Weakly-Supervised Mesh-Convolutional Hand Reconstruction in the Wild

by: fnakamura

3D hand pose estimation 3D mesh reconstruction

Seeing Through Fog Without Seeing Fog: Deep Multimodal Sensor Fusion in Unseen Adverse Weather

by: 福沢栄治

autonomous driving, adverse weather, dataset, multi-sensor fusion

Semi-Supervised Learning for Few-Shot Image-to-Image Translation

by: kiyo

Image-to-Image Translation semi-supervised learning

Detection in Crowded Scenes: One Proposal, Multiple Predictions

by: 福沢栄治

person detection, crowded scenes

LG-GAN: Label Guided Adversarial Network for Flexible Targeted Attack of Point Cloud Based Deep Networks

by: 日坂　幸次

LG-GAN

Solving Mixed-Modal Jigsaw Puzzle for Fine-Grained Sketch-Based Image Retrieval

by: 日坂　幸次

FG-SBIR

Rethinking Computer-Aided Tuberculosis Diagnosis

by: Anonymous

CTD TB TBX11K Faster R-CNN

Disp R-CNN: Stereo 3D Object Detection via Shape Prior Guided Instance Disparity Estimation

by: 日坂　幸次

DispR-CNN

Gum-Net: Unsupervised Geometric Matching for Fast and Accurate 3D Subtomogram Image Alignment and Averaging

by: Issei Tomotsuka

Cryo-EM tomography alignment

Learning Geocentric Object Pose in Oblique Monocular Images

by: Katsuyuki Nakamura

geocentric pose depth estimation rectification

G3AN: Disentangling Appearance and Motion for Video Generation

by: Rei Tamaru

Generative Adversarial Learning Disentangled Representations

Hyperbolic Image Embeddings

by: 古川遼

hyperbolic space image embedding few-shot classification person re-identification

HRank: Filter Pruning Using High-Rank Feature Map

by: Asato Matsumoto

pruning feature map rank

Learning Generative Models of Shape Handles

by: Hiroaki Aizawa

Hyperbolic Visual Embedding Learning for Zero-Shot Recognition

by: Shuhei M Yoshida

zero-shot learning Poincare embedding

Fast Sparse ConvNets

by: Masanori YANO

Sparsity Kernel

Learning to Segment 3D Point Clouds in 2D Image Space

by: Naoya Chiba

Point Cloud Semantic Segmentation Kamada-Kawai Algorithm

Can Deep Learning Recognize Subtle Human Activities?

by: Katsuyuki Nakamura

deep learning activity recognition human performance

X-Linear Attention Networks for Image Captioning

by: Seitaro Shinagawa

image-captioning bilinear pooling attention

Filter Grafting for Deep Neural Networks

by: 遠藤大河

Deep Neural Networks DNN Filter Grafting

UniPose: Unified Human Pose Estimation in Single Images and Videos

by: a2kiti

Pose Estimation Atrous Convolution

Putting Visual Object Recognition in Context

by: Seitaro Shinagawa

image-recognition

Bodies at Rest: 3D Human Pose and Shape Estimation From a Pressure Image Using Synthetic Data

by: a2kiti

pose estimation pressure image

EmotiCon: Context-Aware Multimodal Emotion Recognition Using Frege's Principle

by: Seitaro Shinagawa

emotion recognition multi-modal

Context-Aware Group Captioning via Self-Attention and Contrastive Features

by: Seitaro Shinagawa

group-captioning image-captioning multi-source

High-Frequency Component Helps Explain the Generalization of Convolutional Neural Networks

by: 古澤嘉久

adversarial-training explainableAI

MemNAS: Memory-Efficient Neural Architecture Search With Grow-Trim Learning

by: 山縣英介

NAS memory

Equalization Loss for Long-Tailed Object Recognition

by: Hiroki Ohashi

loss function instance segmentation object detection image classification

Detail-recovery Image Deraining via Context Aggregation Networks

by: Seitaro Shinagawa

deraining image-inpainting

Defending Against Universal Attacks Through Selective Feature Regeneration

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples robustness universal perturbation

Universal Physical Camouflage Attacks on Object Detectors

by: 山縣英介

AEs physical

Instance Shadow Detection

by: Teppei Kurita

Shadow Detection Cast Shadow Object Detection Instance Segmentation Light Estimation

FastDVDnet: Towards Real-Time Deep Video Denoising Without Flow Estimation

by: Higaki Yoshinari

video denoising multiscale

Augmenting Colonoscopy Using Extended and Directional CycleGAN for Lossy Image Translation

by: 藤中彩乃

lossy image translation CycleGAN virtual colonoscopy depth estimation

MAGSAC++, a Fast, Reliable and Accurate Robust Estimator

by: Higaki Yoshinari

robust estimator RANSAC fundamental matrix homography

Joint Filtering of Intensity Images and Neuromorphic Events for High-Resolution Noise-Robust Imaging

by: Higaki Yoshinari

event camera super resolution denoising

Dense Regression Network for Video Grounding

by: Yue Qiu

Video Grounding

MAST: A Memory-Augmented Self-Supervised Tracker

by: pacifinapacific

Tracking self supervised

Label Distribution Learning on Auxiliary Label Space Graphs for Facial Expression Recognition

by: Ryota Suzuki

latent space auxiliary task facial expression recognition

Hi-CMD: Hierarchical Cross-Modality Disentanglement for Visible-Infrared Person Re-Identification

by: Keita Goto

person re-identification cross modal infrared image

Syntax-Aware Action Targeting for Video Captioning

by: Yue Qiu

Video Captioning

Siamese Box Adaptive Network for Visual Tracking

by: pacifinapacific

VOT Tracking

Learning When and Where to Zoom With Deep Reinforcement Learning

by: Narifumi Otoeb

PatchDrop

Object Relational Graph With Teacher-Recommended Learning for Video Captioning

by: Yue Qiu

Video Captioning External Language Model

Large Scale Video Representation Learning via Relational Graph Clustering

by: Seito Kasai

representation learning retrieval search recommendation clustering

SG-NN: Sparse Generative Neural Networks for Self-Supervised Scene Completion of RGB-D Scans

by: Yue Qiu

RGB-D Scene Completion Self-Supervised 3D Vision

Learning Depth-Guided Convolutions for Monocular 3D Object Detection

by: 福沢　栄治

monocular camera, 3D object detection, KITTI

Pose-Guided Visible Part Matching for Occluded Person ReID

by: 福沢　栄治

occluded person, person re-identification, pose

There and Back Again: Revisiting Backpropagation Saliency Methods

by: 古澤嘉久

ExplainableAI Grad-CAM Saliency-Method

Robust Superpixel-Guided Attentional Adversarial Attack

by: Keita Goto

adversarial attack class activation mapping

When to Use Convolutional Neural Networks for Inverse Problems

by: 山縣英介

CSC CNN

Adaptive Hierarchical Down-Sampling for Point Cloud Classification

by: Naoya Chiba

Point Cloud Downsampling Point Cloud Classification

SCOUT: Self-Aware Discriminant Counterfactual Explanations

by: Seitaro Shinagawa

explanation counterfactual visual explanation

AutoTrack: Towards High-Performance Visual Tracking for UAV With Automatic Spatio-Temporal Regularization

by: 福沢　栄治

UAV, automatic tracking, visual tracking

Towards Universal Representation Learning for Deep Face Recognition

by: 山縣英介

face recognition universal model

From Depth What Can You See? Depth Completion via Auxiliary Image Reconstruction

by: 福沢栄治

depth completion, auxiliary image reconstruction, KITTI

Bidirectional Graph Reasoning Network for Panoptic Segmentation

by: Anonymous

semantic segmentation instance segmentation panoptic segmentation

Face X-Ray for More General Face Forgery Detection

by: 榎本

deepfake

Just Go With the Flow: Self-Supervised Scene Flow Estimation

by: 福沢　栄治

scene flow estimation, unsupervised learning, self-supervised loss

Boundary-Aware 3D Building Reconstruction From a Single Overhead Image

by: 寺田英雄

multi-task multi-feature 3d building modeling fpn mask r-cnn bpsh

Noisier2Noise: Learning to Denoise From Unpaired Noisy Data

by: 福沢栄治

image noise, removal, neural network

Multimodal Future Localization and Emergence Prediction for Objects in Egocentric View With a Reachability Prior

by: Katsuyuki Nakamura

Reachability prior

What's Hidden in a Randomly Weighted Neural Network?

by: Shuhei M Yoshida

random neural networks lottery ticket hypothesis

Self-Training With Noisy Student Improves ImageNet Classification

by: Anonymous

self-training semi-supervised learning ImageNet

Exploring Data Aggregation in Policy Learning for Vision-Based Urban Autonomous Driving

by: 福沢栄治

data aggregation, autonomous driving, policy learning

Predicting Semantic Map Representations From Images Using Pyramid Occupancy Networks

by: 福沢栄治

autonomous driving, semantic map, monocular camera

Noise-Aware Fully Webly Supervised Object Detection

by: 福沢栄治

object detection, web images, noisy images

Sub-Frame Appearance and 6D Pose Estimation of Fast Moving Objects

by: 遠藤大河

Motion and Tracking

Rethinking Depthwise Separable Convolutions: How Intra-Kernel Correlations Lead to Improved MobileNets

by: Masanori YANO

Recognition Image Classification Kernel

Deep Unfolding Network for Image Super-Resolution

by: So Uchida

Super-Resolution

Deep Degradation Prior for Low-Quality Image Classification

by: Anonymous

Image Classification

Associate-3Ddet: Perceptual-to-Conceptual Association for 3D Point Cloud Object Detection

by: Naoya Chiba

Object Detection LiDAR Point Cloud

TailorNet: Predicting Clothing in 3D as a Function of Human Pose, Shape and Garment Style

by: hirotaka hiraki

clothing 3d deformation pose shape garment style

Cross-Domain Semantic Segmentation via Domain-Invariant Interactive Relation Transfer

by: Anonymous

Domain Adaptation Semantic Segmentation

4D Association Graph for Realtime Multi-Person Motion Capture Using Multiple Video Cameras

by: yusuke saito

Multi-view Multi-person Realtime Motion Capture 4D Association Graph

AnimalWeb: A Large-Scale Hierarchical Dataset of Annotated Animal Faces

by: Anonymous

face analysis face detection fine-grained animal dataset

Cascaded Human-Object Interaction Recognition

by: Seitaro Shinagawa

image recognition human-object interaction recognition

Attention-Based Context Aware Reasoning for Situation Recognition

by: Seitaro Shinagawa

situation recognition

Unsupervised Multi-Modal Image Registration via Geometry Preserving Image-to-Image Translation

by: kiyo

Multi-modal Image Registration Generative Adversarial Models Unsupervised Learning

OASIS: A Large-Scale Dataset for Single Image 3D in the Wild

by: Anonymous

single-view 3D dataset annotation crowdsourcing depth estimation

A Morphable Face Albedo Model

by: hirotaka hiraki

photometric albedo 3d morphable model

Interpretable and Accurate Fine-grained Recognition via Region Grouping

by: Hirokatsu Kataoka

Fine-grained Recognition Interpretability Explainability

Reflection Scene Separation From a Single Image

by: Teppei Kurita

Reflection Enhancement Reflection Removal Reflection Separation

Beyond Short-Term Snippet: Video Relation Detection With Spatio-Temporal Global Context

by: Yue Qiu

Video visual relation detection Sliding Window Graph CN

PFRL: Pose-Free Reinforcement Learning for 6D Pose Estimation

by: Anonymous

Pose Estimation 6D

Weakly Supervised Semantic Point Cloud Segmentation: Towards 10x Fewer Labels

by: Naoya Chiba

Point Cloud Segmentation Weakly Supervised Learning

Learning Representations by Predicting Bags of Visual Words

by: Tomoki Tanimura

Self-Supervised Learning Bag of Visual Words Classification Detection

Scalability in Perception for Autonomous Driving: Waymo Open Dataset

by: Anonymous

autonomous driving dataset LiDAR multi-modal

NMS by Representative Region: Towards Crowded Pedestrian Detection by Proposal Pairing

by: Masanori YANO

Object Detection

Bi-Directional Relationship Inferring Network for Referring Image Segmentation

by: Anonymous

referring segmentation language attention

Domain Adaptation for Image Dehazing

by: Hao

image dehazing domain adaptation

DeeperForensics-1.0: A Large-Scale Dataset for Real-World Face Forgery Detection

by: Hao

face forgery detection dataset

Going Deeper With Lean Point Networks

by: Hirokatsu Kataoka

Point Cloud Segmentation

Cars Can't Fly Up in the Sky: Improving Urban-Scene Segmentation via Height-Driven Attention Networks

by: Hirokatsu Kataoka

Semantic Segmentation

Celeb-DF: A Large-Scale Challenging Dataset for DeepFake Forensics

by: Hao

DeepFake dataset

Anisotropic Convolutional Networks for 3D Semantic Scene Completion

by: Yue Qiu

Semantic Scene Completion 3D CNNs 3D Vision

Time Flies: Animating a Still Image With Time-Lapse Video As Reference

by: Hao

time lapse style transfer

High-Dimensional Convolutional Networks for Geometric Pattern Recognition

by: 中村真裕

convnets

StarGAN v2: Diverse Image Synthesis for Multiple Domains

by: Hao

image-image transfer

Designing Network Design Spaces

by: Hirokatsu Kataoka

EfficientNet AutoML NAS

ViBE: Dressing for Diverse Body Shapes

by: Hao

body dressing fashion

Select, Supplement and Focus for RGB-D Saliency Detection

by: Yue Qiu

RGB-D Saliency Detection

MSeg: A Composite Dataset for Multi-Domain Semantic Segmentation

by: Anonymous

dataset MSeg multi-domain semantic segmentation

Diverse Image Generation via Self-Conditioned GANs

by: Hao

GAN mode collapse Self-Conditioned

Discovering Human Interactions With Novel Objects via Zero-Shot Learning

by: 福沢栄治

zero-shot, human interaction, object recognition

Train in Germany, Test in the USA: Making 3D Object Detectors Generalize

by: 福沢栄治

3D object recognition, autonomous driving, dataset

PPDM: Parallel Point Detection and Matching for Real-Time Human-Object Interaction Detection

by: Wataru Kudo

Action recognition

DoveNet: Deep Image Harmonization via Domain Verification

by: Yukitaka Tsuchiya

image harmonization U-Net dataset

Learning Multi-View Camera Relocalization With Graph Neural Networks

by: 福沢栄治

graph neural networks, absolute camera pose estimation, autonomous driving

Learning Situational Driving

by: 裏優斗

Autonomous Driving imitation learning

STINet: Spatio-Temporal-Interactive Network for Pedestrian Detection and Trajectory Prediction

by: 福沢栄治

pedestrian tracking, spatio-temporal-interactive network, 3D point cloud

Memory Enhanced Global-Local Aggregation for Video Object Detection

by: Hiroki Ohashi

video object detection object detection video

Generalized Product Quantization Network for Semi-Supervised Image Retrieval

by: Anonymous

GPQ semi supervised image retrieval

Multi-Granularity Reference-Aided Attentive Feature Aggregation for Video-Based Person Re-Identification

by: Hirokatsu Kataoka

Person Re-ID

Adaptive Loss-Aware Quantization for Multi-Bit Networks

by: Kiro Otsu

Network Compression Quantization multi-bit networks

Deep Polarization Cues for Transparent Object Segmentation

by: Masaki Taniguchi

transparent instance segmentation polarization

How Much Time Do You Have? Modeling Multi-Duration Saliency

by: 榎本

Scene Analysis

Dynamic Multiscale Graph Neural Networks for 3D Skeleton Based Human Motion Prediction

by: Shun.ishizaka

pose 3D skeleton human motion prediction multi-scale graph

Fusion-Aware Point Convolution for Online Semantic 3D Scene Segmentation

by: Naoya Chiba

Point Cloud Segmentation Point Cloud Convolution

Searching for Actions on the Hyperbole

by: Shun.ishizaka

video retrieval hierarchical action

Scene Recomposition by Learning-Based ICP

by: Shoji Sonoyama

reinforcement learning scene recomposition pose estimation semantic segmentation

Don't Even Look Once: Synthesizing Features for Zero-Shot Detection

by: 福沢栄治

zero-shot detection, object detection, unseen objects

Open Compound Domain Adaptation

by: 岡本大和

domain adaptation compound domains open world curriculum learning visual memory

Deep Iterative Surface Normal Estimation

by: Ryosuke Araki

Surface Normal Estimation

Real-World Person Re-Identification via Degradation Invariance Learning

by: Masanori YANO

Person Re-Identification Self-Supervised Learning Disentangled Representation GAN

Bi3D: Stereo Depth Estimation via Binary Classifications

by: Shoji Sonoyama

depth estimation segmentation stereo depth estimation

MonoPair: Monocular 3D Object Detection Using Pairwise Spatial Relationships

by: Anonymous

3D Object Detection

PnPNet: End-to-End Perception and Prediction With Tracking in the Loop

by: Anonymous

Autonomous driving object detection tracking motion prediction lidar nuScenes

Self-Supervised Monocular Trained Depth Estimation Using Self-Attention and Discrete Disparity Volume

by: Shoji Sonoyama

monocular depth estimation depth estimation self attention self supervised learning

On the Detection of Digital Face Manipulation

by: Yukitaka Tsuchiya

deep fake detection dataset attention map

Weakly Supervised Fine-Grained Image Classification via Guassian Mixture Model Oriented Discriminative Learning

by: Ryosuke Araki

Classification Gaussian Mixture Model Weakly Supervised Fine-grained Image Recognition

Modeling Biological Immunity to Adversarial Examples

by: kiyo

Adversarial Examples Sparse Coding

Prime Sample Attention in Object Detection

by: Anonymous

object detection prime sampling

Towards Accurate Scene Text Recognition With Semantic Reasoning Networks

by: Seitaro Shinagawa

scene text recognition reasoning

Composed Query Image Retrieval Using Locally Bounded Features

by: Anonymous

composed query image retrieval

Deep Active Learning for Biased Datasets via Fisher Kernel Self-Supervision

by: Asato Matsumoto

active learning pseudo label fisher kernel Kullback–Leibler

Solving Jigsaw Puzzles With Eroded Boundaries

by: Teppei Kurita

Inpainting GAN Jigsaw Puzzles

Interactive Object Segmentation With Inside-Outside Guidance

by: Anonymous

Interactive Object Segmentation

Efficient Dynamic Scene Deblurring Using Spatially Variant Deconvolution Network With Optical Flow Guided Training

by: Hiroki Ohashi

deblur optical flow deformable convolution dilated convolution

FALCON: A Fourier Transform Based Approach for Fast and Secure Convolutional Neural Network Predictions

by: Tomoki Tanimura

Privacy Privacy Preserving Fourier Transform

MetaFuse: A Pre-trained Fusion Model for Human Pose Estimation

by: Ryota Suzuki

pose estimation multiple camera fusion meta learning

KeypointNet: A Large-Scale 3D Keypoint Dataset Aggregated From Numerous Human Annotations

by: Ryota Suzuki

3D keypoint dataset keypoints aggregation

Continual Learning With Extended Kronecker-Factored Approximate Curvature

by: Asato Matsumoto

K-FAC Continual Learning natural gradient learning

Label Decoupling Framework for Salient Object Detection

by: Anonymous

salient object detection distance transformation

Towards Robust Image Classification Using Sequential Attention Models

by: Tomoki Tanimura

Adversarial Examples Adversarial Robustness Adversarial Defense Sequential Process Attention Robustness LSTM

Sketch Less for More: On-the-Fly Fine-Grained Sketch-Based Image Retrieval

by: yasud

sketch RL reinforcement learning Siamese network

PaStaNet: Toward Human Activity Knowledge Engine

by: Kensho Hara

image-based activity recognition human-object interaction

Deep Semantic Clustering by Partition Confidence Maximisation

by: Asato Matsumoto

deep clustering unsupervised

Autolabeling 3D Objects With Differentiable Rendering of SDF Shape Priors

by: Pavel Savkin

SDF NOCS Differentiable rendering LiDAR annotation curriculum learning

Learning the Redundancy-Free Features for Generalized Zero-Shot Object Recognition

by: Anonymous

Zero-Shot Object Recognition

BlendMask: Top-Down Meets Bottom-Up for Instance Segmentation

by: Masaki Taniguchi

instance segmentation real-time attention

Data Uncertainty Learning in Face Recognition

by: yamada ryosuke

face recognition

Video Object Grounding Using Semantic Roles in Language Description

by: Shintaro Yamamoto

Video Object Grounding Vision-and-Language

ZSTAD: Zero-Shot Temporal Activity Detection

by: Hiroki Ohashi

zero-shot learning activity detection label embedding word2vec

Style Normalization and Restitution for Generalizable Person Re-Identification

by: Masa

Person Re-Identification Domain Adaptation Domain Generalization

Learning Instance Occlusion for Panoptic Segmentation

by: Hirokatsu Kataoka

Panoptic Segmentation Instance Segmentation Semantic Segmentation

RDCFace: Radial Distortion Correction for Face Recognition

by: yamada ryosuke

face recognition

Google Landmarks Dataset v2 - A Large-Scale Benchmark for Instance-Level Recognition and Retrieval

by: Shunsuke Kogure

Dataset Evaluation

Uncertainty-Aware CNNs for Depth Completion: Uncertainty from Beginning to End

by: Ryota Suzuki

Action Modifiers: Learning From Adverbs in Instructional Videos

by: Wataru Kudo

Weakly Supervised Embedding

BEDSR-Net: A Deep Shadow Removal Network From a Single Document Image

by: Shintaro Yamamoto

Document Recognition

ReDA: Reinforced Differentiable Attribute for 3D Face Reconstruction

by: Daiki Kimura

3D Face Reconstruction

MISC: Multi-Condition Injection and Spatially-Adaptive Compositing for Conditional Person Image Synthesis

by: Yukitaka Tsuchiya

image synthesis gan

Learning From Noisy Anchors for One-Stage Object Detection

by: Hirokatsu Kataoka

Object Detection MSCOCO Noisy Anchor

M2m: Imbalanced Classification via Major-to-Minor Translation

by: Shintaro Yamamoto

Classification Class-Imbalance

Attention Scaling for Crowd Counting

by: Hirokatsu Kataoka

Crowd Counting Density Attention

Intra- and Inter-Action Understanding via Temporal Action Parsing

by: Kensho Hara

Action Understanding Video Dataset Action Parsing

Two-Shot Spatially-Varying BRDF and Shape Estimation

by: yamada ryosuke

BRDF shape estimation

Learning to Dress 3D People in Generative Clothing

by: 中村真裕

3d people

Hypergraph Attention Networks for Multimodal Learning

by: Shintaro Yamamoto

VQA Vision-and-Language

Self-Supervised Monocular Scene Flow Estimation

by: Hirokatsu Kataoka

Scene Flow Depth Image

Parsing-Based View-Aware Embedding Network for Vehicle Re-Identification

by: Hirokatsu Kataoka

Vehicle Re-identification

Correspondence-Free Material Reconstruction using Sparse Surface Constraints

by: Ryota Suzuki

physical material parameter deformable object sparse observation

Analyzing and Improving the Image Quality of StyleGAN

by: 綱島秀樹

GAN Regularization Latent Space Mapping

Learning to Discriminate Information for Online Action Detection

by: Kensho Hara

Online Action Detection Temporal Action Detection RNN

WaveletStereo: Learning Wavelet Coefficients of Disparity Map in Stereo Matching

by: 若宮天雅

stereo matching

Three-Dimensional Reconstruction of Human Interactions

by: Hirokatsu Kataoka

Monocular 3D Reconstruction Human Interaction

Moving in the Right Direction: A Regularization for Deep Metric Learning

by: Shintaro Yamamoto

Metric Learning

Copy and Paste GAN: Face Hallucination From Shaded Thumbnails

by: Hirokatsu Kataoka

GAN Face Generation

Data-Efficient Semi-Supervised Learning by Reliable Edge Mining

by: Asato Matsumoto

Semi-Supervised Learning graph

3DV: 3D Dynamic Voxel for Action Recognition in Depth Video

by: Hiroaki Aizawa

CenterMask: Real-Time Anchor-Free Instance Segmentation

by: Masanori YANO

Instance Segmentation

Deep Homography Estimation for Dynamic Scenes

by: Yukitaka Tsuchiya

homography

CenterMask: Single Shot Instance Segmentation With Point Representation

by: Masanori YANO

Instance Segmentation

RGBD-Dog: Predicting Canine Pose from RGBD Sensors

by: yamada ryosuke

3DSSD: Point-Based 3D Single Stage Object Detector

by: Hirokatsu Kataoka

Point Cloud Object Detection

Variational Context-Deformable ConvNets for Indoor Scene Parsing

by: Seitaro Shinagawa

scene parsing image segmentation depth convolution

VSGNet: Spatial Attention Network for Detecting Human Object Interactions Using Graph Convolutions

by: Shintaro Yamamoto

Human Object Interaction Graph Convolution

Learning a Unified Sample Weighting Network for Object Detection

by: Ryota Suzuki

object detection sample weighting

Image Demoireing with Learnable Bandpass Filters

by: Teppei Kurita

Demoire Moire Bandpass Filter

Can Weight Sharing Outperform Random Architecture Search? An Investigation With TuNAS

by: Hirokatsu Kataoka

Neural Architecture Search Weight Sharing

Synthetic Learning: Learn From Distributed Asynchronized Discriminator GAN Without Sharing Medical Image Data

by: Shintaro Yamamoto

Medical Image GAN Privacy

Differentiable Volumetric Rendering: Learning Implicit 3D Representations Without 3D Supervision

by: Hiroaki Aizawa

Interactive Multi-Label CNN Learning With Partial Labels

by: Asato Matsumoto

multi-label partial label smoothing

SiamCAR: Siamese Fully Convolutional Classification and Regression for Visual Tracking

by: Shunsuke Kogure

object detection tracking

Circle Loss: A Unified Perspective of Pair Similarity Optimization

by: Masa

metric learning face recognition fine-grained image retrieval person re-id

An Adaptive Neural Network for Unsupervised Mosaic Consistency Analysis in Image Forensics

by: Ryota Suzuki

mosaic consistency analysis image forensics

Deep 3D Capture: Geometry and Reflectance From Sparse Multi-View Images

by: yamada ryosuke

multi-view

Progressive Relation Learning for Group Activity Recognition

by: Mariko Nakano

Group Activity Recognition Action Recognition

Meta-Learning of Neural Architectures for Few-Shot Learning

by: Asato Matsumoto

NAS meta-learning few-shot

Assessing Eye Aesthetics for Automatic Multi-Reference Eye In-Painting

by: Ryota Suzuki

inpainting aesthetics

Single-Stage Semantic Segmentation From Image Labels

by: Yuchi Ishikawa

weakly-supervised semantic segmentation

Collaborative Distillation for Ultra-Resolution Universal Style Transfer

by: 野中琢登

Style Transfer Distillation

Disparity-Aware Domain Adaptation in Stereo Image Restoration

by: Anonymous

Unsupervised Intra-Domain Adaptation for Semantic Segmentation Through Self-Supervision

by: Anonymous

Unsupervised Domain Adaptation Semantic Segmentation Self-Supervision

Sideways: Depth-Parallel Training of Video Models

by: Kensho Hara

Video Efficient

DualConvMesh-Net: Joint Geodesic and Euclidean Convolutions on 3D Meshes

by: Anonymous

3D Mesh Semantic Segmentation

Clean-Label Backdoor Attacks on Video Recognition Models

by: Kensho Hara

Backdoor Attack Video Recognition Action Recognition

Active 3D Motion Visualization Based on Spatiotemporal Light-Ray Integration

by: Ryota Suzuki

projector light field

Monocular Real-Time Hand Shape and Motion Capture Using Multi-Modal Data

by: fnakamura

3D hand pose estimation 3D mesh reconstruction

Flow2Stereo: Effective Self-Supervised Learning of Optical Flow and Stereo Matching

by: Yuchi Ishikawa

optical flow stereo matching self-supervised learning

Multiple Anchor Learning for Visual Object Detection

by: Ryosuke Araki

Object Detection

High-Order Information Matters: Learning Relation and Topology for Occluded Person Re-Identification

by: Masa

Occluded Person Re-ID

Video Super-Resolution With Temporal Group Attention

by: Yuchi Ishikawa

video super-resolution

PuppeteerGAN: Arbitrary Portrait Animation With Semantic-Aware Appearance Transformation

by: Anonymous

GAN Semantic Segmentation Image to Image

Towards Efficient Model Compression via Learned Global Ranking

by: Shuhei M Yoshida

pruning

Sketch-BERT: Learning Sketch Bidirectional Encoder Representation From Transformers by Self-Supervised Learning of Sketch Gestalt

by: yasud

sketch retrieval generation recognition

Towards Unified INT8 Training for Convolutional Neural Network

by: Shuhei M Yoshida

quantization quantized training

Leveraging Photometric Consistency Over Time for Sparsely Supervised Hand-Object Reconstruction

by: Wataru Kudo

SQE: a Self Quality Evaluation Metric for Parameters Optimization in Multi-Object Tracking

by: MatsuokaHikaru

Multi-Object Tracking ground truth Gaussian mixture model

PointPainting: Sequential Fusion for 3D Object Detection

by: Naoya Chiba

LiDAR Point Cloud 3D Semantic Segmentation

Deep Implicit Volume Compression

by: yusuke saito

3D reconstruction 3D voxel grids truncated signed distance fields compressing

Learning to Forget for Meta-Learning

by: Shuhei M Yoshida

meta learning MAML

Learning a Dynamic Map of Visual Appearance

by: Katsuyuki Nakamura

visual attribute dataset

Domain-Aware Visual Bias Eliminating for Generalized Zero-Shot Learning

by: Asato Matsumoto

zero-shot learning semantic-aligned visual representation

Spherical Space Domain Adaptation With Robust Pseudo-Label Loss

by: Ryo Takahashi

domain adaptation

Learning to Detect Important People in Unlabelled Images for Semi-Supervised Important People Detection

by: Rei Tamaru

Important People Detection pseudo-label estimation Object Detection

Understanding Road Layout From Videos as a Whole

by: ReiTamaru

3D Reconstruction

Generalized Zero-Shot Learning via Over-Complete Distribution

by: Asato Matsumoto

zero-shot learning CVAE

Social-STGCNN: A Social Spatio-Temporal Graph Convolutional Neural Network for Human Trajectory Prediction

by: 日坂　幸次

Human Trajectory Prediction

One-Shot Adversarial Attacks on Visual Tracking With Dual Attention

by: 日坂　幸次

adversarial attack

Correspondence Networks With Adaptive Neighbourhood Consensus

by: 日坂　幸次

visual correspondence

Neural Architecture Search for Lightweight Non-Local Networks

by: 日坂　幸次

Meshlet Priors for 3D Mesh Reconstruction

by: Naoya Chiba

Meshlet Priors Mesh Reconstruction

Attention Convolutional Binary Neural Tree for Fine-Grained Visual Categorization

by: 日坂　幸次

FGVC

PVN3D: A Deep Point-Wise 3D Keypoints Voting Network for 6DoF Pose Estimation

by: Ryosuke Araki

Object Pose Estimation

Learning Saliency Propagation for Semi-Supervised Instance Segmentation

by: 日坂　幸次

Instance segmentation

Memory Aggregation Networks for Efficient Interactive Video Object Segmentation

by: 日坂　幸次

iVOS Segmentation

Cloth in the Wind: A Case Study of Physical Measurement Through Simulation

by: 日坂　幸次

physical Measurement

Single-Step Adversarial Training With Dropout Scheduling

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial training adversarial robustness adversarial examples

From Two Rolling Shutters to One Global Shutter

by: Teppei Kurita

Rolling Shutter Global Shutter CMOS RANSAC

CascadePSP: Toward Class-Agnostic and Very High-Resolution Segmentation via Global and Local Refinement

by: Masaki Taniguchi

semantic segmentation cascade network high resolution image

COCAS: A Large-Scale Clothes Changing Person Dataset for Re-Identification

by: Masa

Person Re-Identification Dataset

Disentangling Physical Dynamics From Unknown Factors for Unsupervised Video Prediction

by: Rei Tamaru

PDE Video Prediction Video Forecasting

Heterogeneous Knowledge Distillation Using Information Flow Modeling

by: Shuhei M Yoshida

knowledge distillation

AvatarMe: Realistically Renderable 3D Facial Reconstruction "In-the-Wild"

by: Naoya Chiba

3D Facial Reconstruction Face Rendering BRDF Estimation

SESS: Self-Ensembling Semi-Supervised 3D Object Detection

by: Ryosuke Araki

3D Object Detection Semi-supervised Learning

Central Similarity Quantization for Efficient Image and Video Retrieval

by: Anonymous

hash CSQ image retrieval

Towards Inheritable Models for Open-Set Domain Adaptation

by: kiyo

Unsupervised Domain Adaptation Domain Adaptation

Spatially Attentive Output Layer for Image Classification

by: Masanori YANO

Recognition Image Classification Attention Distillation

Polarized Reflection Removal With Perfect Alignment in the Wild

by: Teppei Kurita

Reflection Removal Polarization Sensor Polarization

Warp to the Future: Joint Forecasting of Features and Feature Motion

by: 日坂　幸次

Semantic Segmentation Warp to the Future

Deepstrip: High-Resolution Boundary Refinement

by: 日坂　幸次

High Resolution Boundary

Smoothing Adversarial Domain Attack and P-Memory Reconsolidation for Cross-Domain Person Re-Identification

by: 日坂　幸次

Person Re-Identification

Instance-Aware, Context-Focused, and Memory-Efficient Weakly Supervised Object Detection

by: 日坂　幸次

Weakly Supervised Object Detection

Density-Based Clustering for 3D Object Detection in Point Clouds

by: 日坂　幸次

3D Object Detection

Densely Connected Search Space for More Flexible Neural Architecture Search

by: 日坂　幸次

Neural architecture search

Network Adjustment: Channel Search Guided by FLOPs Utilization Ratio

by: 日坂　幸次

Network Adjustment

Learning Longterm Representations for Person Re-Identification Using Radio Signals

by: 日坂　幸次

Person Re-Identification

What Deep CNNs Benefit From Global Covariance Pooling: An Optimization Perspective

by: 日坂　幸次

Global Covariance Pooling

Learning to Measure the Static Friction Coefficient in Cloth Contact

by: 日坂　幸次

Static Friction Coefficient

Fast Template Matching and Update for Video Object Tracking and Segmentation

by: 日坂　幸次

Video Object Tracking and Segmentation

Probabilistic Video Prediction From Noisy Data With a Posterior Confidence

by: 日坂　幸次

Video Prediction

Few-Shot Learning of Part-Specific Probability Space for 3D Shape Segmentation

by: Naoya Chiba

Point Cloud Semantic Segmentation Few Shot Learning

Neuromorphic Camera Guided High Dynamic Range Imaging

by: Teppei Kurita

Neuromorphic Camera DVS FSM HDR High Dynamic Range Sensor Fusion

TetraTSDF: 3D Human Reconstruction From a Single Image With a Tetrahedral Outer Shell

by: Naoya Chiba

3D Human Reconstruction Tetrahedral Volumetric Representation

Graph-Guided Architecture Search for Real-Time Semantic Segmentation

by: 福沢栄治

semantic segmentation, lightweight models, Cityscapes

All in One Bad Weather Removal Using Architectural Search

by: Anonymous

weather Architectural Search

Norm-Aware Embedding for Efficient Person Search

by: Masanori YANO

Person Search Person Re-Identification Object Detection

An Investigation Into the Stochasticity of Batch Whitening

by: kiyo

Normalization Principal Component Analysis Cholesky Decomposition Zero-phase Component Analysis

Joint Demosaicing and Denoising With Self Guidance

by: Teppei Kurita

JDD Demosaic NR Noise Reduction Bayer

StegaStamp: Invisible Hyperlinks in Physical Photographs

by: Tomoki Tanimura

Watermarking Steganography AR Augmented Reality Robustness

3D Human Mesh Regression With Dense Correspondence

by: Naoya Chiba

Human Mesh Regression Human Pose Estimation

DSGN: Deep Stereo Geometry Network for 3D Object Detection

by: Shoji Sonoyama

object detection depth estimation

TCTS: A Task-Consistent Two-Stage Framework for Person Search

by: Masanori YANO

Person Search Person Re-Identification Object Detection

Polarized Non-Line-of-Sight Imaging

by: Teppei Kurita

NLOS Polarization BRDF Transport Matrix

Learning a Reinforced Agent for Flexible Exposure Bracketing Selection

by: Teppei Kurita

Reinforcement Learning HDR Bracketing

Superpixel Segmentation With Fully Convolutional Networks

by: Anonymous

Superpixel Stereo matching

Robust Design of Deep Neural Networks Against Adversarial Attacks Based on Lyapunov Theory

by: 福原吉博 (Yoshihiro Fukuhara)

adversarial examples adversarial robustness

Learning to Observe: Approximating Human Perceptual Thresholds for Detection of Suprathreshold Image Transformations

by: Hiroaki Aizawa

KeyPose: Multi-View 3D Labeling and Keypoint Estimation for Transparent Objects

by: Naoya Chiba

Transparent Object Detection 3D Pose Estimation Passive Stereo

GraspNet-1Billion: A Large-Scale Benchmark for General Object Grasping

by: 福原吉博 (Yoshihiro Fukuhara)

dataset grasping grasp pose prediction

Improving the Robustness of Capsule Networks to Image Affine Transformations

by: 綱島秀樹

CapsNet Affine Transformation Routing

PREDICT & CLUSTER: Unsupervised Skeleton Based Action Recognition

by: 福原吉博 (Yoshihiro Fukuhara)

unsupervised learning action recognition

Adaptive Graph Convolutional Network With Attention Graph Clustering for Co-Saliency Detection

by: Ryo Takahashi

GP-NAS: Gaussian Process Based Neural Architecture Search

by: Obi

NAS neural architecture search Gaussian Process

Bi-Directional Interaction Network for Person Search

by: Masanori YANO

Person Search Person Re-Identification Object Detection Siamese Network

Cascaded Deep Monocular 3D Human Pose Estimation With Evolutionary Training Data

by: a2kiti

pose estimation data augmentation

BANet: Bidirectional Aggregation Network With Occlusion Handling for Panoptic Segmentation

by: 古川遼

panoptic segmentation occlusion handling

Perceptual Quality Assessment of Smartphone Photography

by: Teppei Kurita

Perceptual Quality Dataset

Learning Invariant Representation for Unsupervised Image Restoration

by: fnakamura

cross domain transfer image restoration disentangling representation

Exploring Spatial-Temporal Multi-Frequency Analysis for High-Fidelity and Temporal-Consistency Video Prediction

by: Rei Tamaru

Video Prediction Human Vision System Discrete Wavelet Transform

Two-Stage Peer-Regularized Feature Recombination for Arbitrary Image Style Transfer

by: Ryota Suzuki

style transfer metric learning graph convolution

Neural Cages for Detail-Preserving 3D Deformations

by: Naoya Chiba

Deformation Transfer Cage-based Deformation

Robust Partial Matching for Person Search in the Wild

by: Masanori YANO

Person Search Person Re-Identification Object Detection Dataset

CRNet: Cross-Reference Networks for Few-Shot Segmentation

by: Hiroki.Yamamoto

FewShot Segmentation

Multi-Scale Progressive Fusion Network for Single Image Deraining

by: Hiroki.Yamamoto

Deraining Segmentation

A Lighting-Invariant Point Processor for Shading

by: Hirokatsu Kataoka

Shading Geometry Depth

Projection & Probability-Driven Black-Box Attack

by: Hirokatsu Kataoka

Adversarial Examples

Adaptive Interaction Modeling via Graph Operations Search

by: Hirokatsu Kataoka

Graph Convolution Graph Operation Neural Architecture Search

Ensemble Generative Cleaning With Feedback Loops for Defending Adversarial Attacks

by: Hirokatsu Kataoka

Defense Adversarial Attack

In Perfect Shape: Certifiably Optimal 3D Shape Reconstruction From 2D Landmarks

by: Hirokatsu Kataoka

2D Landmark 3D Shape Reconstruction

Self2Self With Dropout: Learning Self-Supervised Denoising From Single Image

by: Teppei Kurita

Self-Supervised Learning NR Denoising Dropout

Recursive Social Behavior Graph for Trajectory Prediction

by: Hirokatsu Kataoka

Social Interaction Graph Convolutional Network

Lightweight Photometric Stereo for Facial Details Recovery

by: Hirokatsu Kataoka

CNN Photometric Stereo

StructEdit: Learning Structural Shape Variations

by: Hirokatsu Kataoka

3D Model 3D Shape

Harmonizing Transferability and Discriminability for Adapting Object Detectors

by: Hirokatsu Kataoka

Code is available here: https://github.com/chaoqichen/HTCN

Fast Video Object Segmentation With Temporal Aggregation Network and Dynamic Template Matching

by: Hirokatsu Kataoka

Video Segmentation DAVIS dataset

Correlating Edge, Pose With Parsing

by: Hirokatsu Kataoka

Human Parsing Human Pose Estimation

VecRoad: Point-Based Iterative Graph Exploration for Road Graphs Extraction

by: Hirokatsu Kataoka

Road Graph Extraction Road Segmentation

3D-MPA: Multi-Proposal Aggregation for 3D Semantic Instance Segmentation

by: Hirokatsu Kataoka

3D Point Cloud 3D Instance Segmentation

EcoNAS: Finding Proxies for Economical Neural Architecture Search

by: Hirokatsu Kataoka

NAS

Joint Spatial-Temporal Optimization for Stereo 3D Object Tracking

by: Naoya Chiba

3D Object Tracking Stereo Trajectory Tracking

Hierarchical Clustering With Hard-Batch Triplet Loss for Person Re-Identification

by: Masanori YANO

Unsupervised Learning Person Re-Identification

MSG-GAN: Multi-Scale Gradients for Generative Adversarial Networks

by: Hiroki.Yamamoto

GAN

Normal Assisted Stereo Depth Estimation

by: Teppei Kurita

Surface Normal Depth Multi-View Cost Volume 3D CNN

Deep Non-Line-of-Sight Reconstruction

by: Naoya Chiba

Non-Line-of-Sight 3D Reconstruction NLoS Depth Measurement

Context-Aware Human Motion Prediction

by: hirotaka hiraki

Motion Prediction Context Awareness Context RNN Graph Attention Network Edge Convolution

Gait Recognition via Semi-supervised Disentangled Representation Learning to Identity and Covariate Features

by: kiyo

Semi-Supervised Learning Disentangled Representation Learning gait recognition

Visually Imbalanced Stereo Matching

by: Teppei Kurita

Stereo Depth Imbalanced

Hit-Detector: Hierarchical Trinity Architecture Search for Object Detection

by: Hirokatsu Kataoka

Object Detection NAS

Geometrically Principled Connections in Graph Neural Networks

by: Hirokatsu Kataoka

Graph Convolution GCN Geometry

Differential Treatment for Stuff and Things: A Simple Unsupervised Domain Adaptation Method for Semantic Segmentation

by: Hirokatsu Kataoka

Unsupervised Domain Adaptation Cityscapes SYNTHIA GTA5

Robust Object Detection Under Occlusion With Context-Aware CompositionalNets

by: Hirokatsu Kataoka

Object Detection Occlusion Context

Resolution Adaptive Networks for Efficient Inference

by: Masanori YANO

Recognition Image Classification

GNN3DMOT: Graph Neural Network for 3D Multi-Object Tracking With 2D-3D Multi-Feature Learning

by: Naoya Chiba

Multi Object Tracking Graph Neural Networks

Self-Supervised Human Depth Estimation From Monocular Videos

by: 福沢　栄治

depth estimation, monocular camera, 3D non-rigid motion

VPLNet: Deep Single View Normal Estimation With Vanishing Points and Lines

by: 福沢　栄治

vanishing point estimation, Manhattan line map, RGB images

MEBOW: Monocular Estimation of Body Orientation in the Wild

by: 福沢　栄治

3D pose estimation, dataset, COCO-MEBOW

Zooming Slow-Mo: Fast and Accurate One-Stage Space-Time Video Super-Resolution

by: 福沢　栄治

super-resolution, spatio-temporal information, video frame interpolation

FaceScape: A Large-Scale High Quality 3D Face Dataset and Detailed Riggable 3D Face Prediction

by: 福沢　栄治

3D face dataset, 3D face model, single image

Spatial-Temporal Graph Convolutional Network for Video-Based Person Re-Identification

by: 福沢　栄治

Person Re-Identification, Spatio-Temporal Information, Graph Convolutional Network

Online Depth Learning Against Forgetting in Monocular Videos

by: 福沢　栄治

Online Depth Learning, Learning Against Forgetting, Monocular Video

Relation-Aware Global Attention for Person Re-Identification

by: 福沢　栄治

Person Re-Identification, Attention Mechanism, Global Structural Information, Correlation

Deep Metric Learning via Adaptive Learnable Assessment

by: 福沢　栄治

Deep Metric Learning, Adaptive Learnable Assessment, Episode-Based Training

Exemplar Normalization for Learning Deep Representation

by: Hirokatsu Kataoka

Exemplar Normalization Image Recognition

Imitative Non-Autoregressive Modeling for Trajectory Forecasting and Imputation

by: Hirokatsu Kataoka

Trajectory Forecasting VAE

Sparse Layered Graphs for Multi-Object Segmentation

by: Hirokatsu Kataoka

Segmentation Ishikawa Layered Technique

End-to-End 3D Point Cloud Instance Segmentation Without Detection

by: Hirokatsu Kataoka

3D Point Cloud Instance Segmentation

Exploring Bottom-Up and Top-Down Cues With Attentive Learning for Webly Supervised Object Detection

by: Hirokatsu Kataoka

Webly Supervision Object Detection

IDA-3D: Instance-Depth-Aware 3D Object Detection From Stereo Vision for Autonomous Driving

by: Hirokatsu Kataoka

Stereo Vision 3D Object Detection

Learning to Restore Low-Light Images via Decomposition-and-Enhancement

by: Teppei Kurita

Low-Light Frequency Decomposition

EventCap: Monocular 3D Capture of High-Speed Human Motions Using an Event Camera

by: a2kiti

pose estimation; event camera

Structure Aware Single-Stage 3D Object Detection From Point Cloud

by: Anonymous

3D object detection

OccuSeg: Occupancy-Aware 3D Instance Segmentation

by: Naoya Chiba

3D Instance Segmentation Spatial Segmentation

AD-Cluster: Augmented Discriminative Clustering for Domain Adaptive Person Re-Identification

by: Masanori YANO

Transfer Learning Person Re-Identification Domain Adaptation

Neural Head Reenactment with Latent Pose Descriptors

by: Anonymous

Deep Shutter Unrolling Network

by: Teppei Kurita

Rolling Shutter Global Shutter

Object-Occluded Human Shape and Pose Estimation From a Single Color Image

by: a2kiti

pose estimation shape estimation

PolarMask: Single Shot Instance Segmentation With Polar Representation

by: Masanori YANO

Instance Segmentation

Optical Non-Line-of-Sight Physics-Based 3D Human Pose Estimation

by: Naoya Chiba

Non-Line-Of-Sight Human Pose Estimation

Epipolar Transformers

by: Shoji Sonoyama

pose estimation keypoint detection

DeepLPF: Deep Local Parametric Filters for Image Enhancement

by: Teppei Kurita

Local Parametric Filter Image Enhancement

Explainable Object-Induced Action Decision for Autonomous Vehicles

by: Ryo Takahashi

Salience-Guided Cascaded Suppression Network for Person Re-Identification

by: Masanori YANO

Person Re-Identification Attention

Learning Fast and Robust Target Models for Video Object Segmentation

by: Yukitaka Tsuchiya

VOS segmentation

Generalized ODIN: Detecting Out-of-Distribution Image Without Learning From Out-of-Distribution Data

by: Masanori YANO

Recognition Image Classification

Attention Mechanism Exploits Temporal Contexts: Real-Time 3D Human Pose Reconstruction

by: a2kiti

pose estimation

Revisiting the Sibling Head in Object Detector

by: Hirokatsu Kataoka

Object Detection Sibling Head

Visual Reaction: Learning to Play Catch With Your Drone

by: Hirokatsu Kataoka

Visual Reaction Drone

Assessing Image Quality Issues for Real-World Problems

by: Teppei Kurita

Image Quality Assessment Blind

ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks

by: Hirokatsu Kataoka

MTL-NAS: Task-Agnostic Neural Architecture Search Towards General-Purpose Multi-Task Learning

by: Hirokatsu Kataoka

NAS Multi-task Learning

Learning to Segment the Tail

by: Hirokatsu Kataoka

Instance Segmentation LVIS dataset Few-shot Learning

CoverNet: Multimodal Behavior Prediction Using Trajectory Sets

by: Hirokatsu Kataoka

Trajectory Prediction

Multi-Scale Interactive Network for Salient Object Detection

by: 綱島秀樹

SOD Salient Object Detection

Learning to See Through Obstructions

by: Hirokatsu Kataoka

See Through Vision Obstruction

MoreFusion: Multi-object Reasoning for 6D Pose Estimation from Volumetric Fusion

by: Hirokatsu Kataoka

6D Pose Estimation MoreFusion DenseFusion

Instance Guided Proposal Network for Person Search

by: 渡部海

Person Re-ID ObjectDetection

Cross-domain Object Detection through Coarse-to-Fine Feature Adaptation

by: Hirokatsu Kataoka

Domain Adaptation Coarse-to-Fine

FGN: Fully Guided Network for Few-Shot Instance Segmentation

by: Hirokatsu Kataoka

Instance Segmentation Few-Shot Learning

LSM: Learning Subspace Minimization for Low-Level Vision

by: Teppei Kurita

Low-Level Vision Energy Minimization

GHUM & GHUML: Generative 3D Human Shape and Articulated Pose Models

by: a2kiti

human shape estimation

Coherent Reconstruction of Multiple Humans From a Single Image

by: a2kiti

human shape estimation

"Looking at the Right Stuff" - Guided Semantic-Gaze for Autonomous Driving

by: Anonymous

Dynamic Convolution: Attention Over Convolution Kernels

by: Masanori YANO

Recognition Image Classification Pose Estimation Attention

Dynamic Convolutions: Exploiting Spatial Sparsity for Faster Inference

by: Masanori YANO

Recognition Image Classification Pose Estimation Sparsity

Bidirectional Graph Reasoning Network for Panoptic Segmentation

by: Anonymous

panoptic segmentation semantic segmentation instance segmentation

Panoptic-DeepLab: A Simple, Strong, and Fast Baseline for Bottom-Up Panoptic Segmentation

by: Anonymous

CVPR2020 Paper Summaries