CVPR 2019 Survey Summaries (Index)

Mask Scoring R-CNN

by: Ryota Suzuki

Mask R-CNN Instance segmentation

Double-DIP: Unsupervised Image Decomposition via Coupled Deep-Image-Priors

by: Yoshiki

image decomposition unsupervised learning deep image prior

SoPhie: An Attentive GAN for Predicting Paths Compliant to Social and Physical Constraints

by: Kazuma_Asano

pedestrian prediction LSTM Prediction path prediction multiple interacting agents GAN

Heavy Rain Image Restoration: Integrating Physics Model and Conditional Adversarial Learning

by: Kazuma_Asano

rain removal rain heavy rain derain dehaze GAN

Large-Scale Weakly-Supervised Pre-Training for Video Action Recognition

by: GOTO Keita

weakly supervised learning Action Recognition pre-training

Dance With Flow: Two-In-One Stream Action Detection

by: Tsubura Kazuki

Action Detection Two-in-One stream Two stream

Feature Denoising for Improving Adversarial Robustness

by: Hideki Tsunashima

Adversarial_Examples defense self_attention

On Stabilizing Generative Adversarial Training With Noise

by: yasud

GAN Training Stabilize training stabilization

Recurrent Neural Networks With Intra-Frame Iterations for Video Deblurring

by: shirouchi satoshi

Recurrent Neural Networks Video Deblurring

Completeness Modeling and Context Separation for Weakly Supervised Temporal Action Localization

by: Hirokatsu Kataoka

Temporal Action Localization Weakly Supervised Learning

Unsupervised Deep Tracking

by: Hirokatsu Kataoka

Unsupervised Tracking Correlation Filter Siamese Network

Tracking by Animation: Unsupervised Learning of Multi-Object Attentive Trackers

by: Hirokatsu Kataoka

Multiple Object Tracking Tracking-by-detection

Fast Online Object Tracking and Segmentation: A Unifying Approach

by: Hirokatsu Kataoka

Tracking Semi-supervised Learning Video Object Segmentation

Practical Full Resolution Learned Lossless Image Compression

by: Takaya Yamazoe

lossless compression parallel

Privacy Protection in Street-View Panoramas Using Depth and Multi-View Imagery

by: Takaya Yamazoe

street view privacy GAN multi view

Cross-Modality Personalization for Retrieval

by: maokura

cross-modal search gaze personality caption

Towards Universal Object Detection by Domain Attention

by: Shuhei M Yoshida

object detection universal object detection domain attention

A Cross-Season Correspondence Dataset for Robust Semantic Segmentation

by: Tomoki Tanimura

Dataset Domain Semantic Segmentation point cloud 3D

Toward Convolutional Blind Denoising of Real Photographs

by: Katsuya Shimabukuro

Image Denoising Blind Denoising Noise Level Estimation

Engaging Image Captioning via Personality

by: Eisuke Yamagata

image-captioning personality

A Content Transformation Block for Image Style Transfer

by: Sou Uchida

style-transfer

Local to Global Learning: Gradually Adding Classes for Training Deep Neural Networks

by: Koki Obinata

learning paradigm learning framework

Representation Flow for Action Recognition

by: Tsubura Kazuki

optical flow action recognition representation flow

AE2-Nets: Autoencoder in Autoencoder Networks

by: kodai nakashima

Event-Based High Dynamic Range Image and Very High Frame Rate Video Generation Using Conditional Generative Adversarial Networks

by: Sou Uchida

Event Camera HDR High Frame Rate

Self-Supervised GANs via Auxiliary Rotation Loss

by: yasud

GAN training stabilization Discriminator Forgetting

All-Weather Deep Outdoor Lighting Estimation

by: Sou Uchida

Lighting Estimation HDR Panorama

Object Tracking by Reconstruction With View-Specific Discriminative Correlation Filters

by: Hirokatsu Kataoka

Correlation Filter

Grid R-CNN

by: Shuhei M Yoshida

Grid RCNN object detection two-stage detection

Photo Wake-Up: 3D Character Animation From a Single Photo

by: Anonymous

3D SMPL animation

Leveraging Shape Completion for 3D Siamese Tracking

by: Hirokatsu Kataoka

3D Siamese Network Point Cloud

Target-Aware Deep Tracking

by: Hirokatsu Kataoka

Object Tracking Siamese Network

Learning Binary Code for Personalized Fashion Recommendation

by: Takaya Yamazoe

Fashion Recommendation Hash technique

Spatiotemporal CNN for Video Object Segmentation

by: Hirokatsu Kataoka

Video Object Segmentation

Character Region Awareness for Text Detection

by: Tomoki Tanimura

Text Detection Arbitrary Shape Pseudo Labeling Weakly Supervised Dataset Polygon

Learning Multi-Class Segmentations From Single-Class Datasets

by: Yuta Tokuoka

multi-class segmentation conditioning biomedical image

Towards Rich Feature Discovery With Class Activation Maps Augmentation for Person Re-Identification

by: Hirokatsu Kataoka

Person Re-identification Class Activation Maps

Text2Scene: Generating Compositional Scenes From Textual Descriptions

by: Kiro Otsu

network compression

Inverse Cooking: Recipe Generation From Food Images

by: Takaya Yamazoe

Food Recipe Cooking

Wide-Context Semantic Image Extrapolation

by: Hirokatsu Kataoka

Image Extrapolation

A Compact Embedding for Facial Expression Similarity

by: rindybell

image embedding resource facial expression analysis

Spatially Variant Linear Representation Models for Joint Filtering

by: Katsuya Shimabukuro

Joint Filtering Upsampling Image Denoising Image Deblurring

Mode Seeking Generative Adversarial Networks for Diverse Image Synthesis

by: Kazuma_Asano

GAN mode collapse Diverse Image Synthesis diversity

Domain-Specific Batch Normalization for Unsupervised Domain Adaptation

by: Shuhei M Yoshida

unsupervised domain adaptation domain-specific batch normalization

Atlas of Digital Pathology: A Generalized Hierarchical Histological Tissue Type-Annotated Database for Deep Learning

by: Yuta Nakamura

Dataset Medical Pathology Diagnosis Hierarchical

RENAS: Reinforced Evolutionary Neural Architecture Search

by: neka-nat

Pluralistic Image Completion

by: Kazuma_Asano

GAN image completion pluralistic diversity

A Generative Adversarial Density Estimator

by: Yoshiki

generative model

Structured Knowledge Distillation for Semantic Segmentation

by: kodai nakashima

Towards Scene Understanding: Unsupervised Monocular Depth Estimation With Semantic-Aware Representation

by: kodai nakashima

Tell Me Where I Am: Object-Level Scene Context Prediction

by: kodai nakashima

End-To-End Time-Lapse Video Synthesis From a Single Outdoor Image

by: Hirokatsu Kataoka

Time Lapse Video

Do Better ImageNet Models Transfer Better?

by: kodai nakashima

GIF2Video: Color Dequantization and Temporal Interpolation of GIF Images

by: Hirokatsu Kataoka

GIF to Video

Gotta Adapt 'Em All: Joint Pixel and Feature-Level Domain Adaptation for Recognition in the Wild

by: kodai nakashima

DMC-Net: Generating Discriminative Motion Cues for Fast Compressed Video Action Recognition

by: Kensho Hara

Action Recognition GAN

C-MIL: Continuation Multiple Instance Learning for Weakly Supervised Object Detection

by: Munetaka Minoguchi

Weakly Supervised Object Detection Multiple Instance Learning

Attention-Based Dropout Layer for Weakly Supervised Object Localization

by: Munetaka Minoguchi

Weakly Supervised Object Detection self-attention

D2-Net: A Trainable CNN for Joint Description and Detection of Local Features

by: shirouchi satoshi

Joint Detection and Description

ScratchDet: Training Single-Shot Object Detectors From Scratch

by: Munetaka Minoguchi

Object Detection pretrain

Style Transfer by Relaxed Optimal Transport and Self-Similarity

by: Sou Uchida

Style-transfer Earth Mover Distance

Learning Correspondence From the Cycle-Consistency of Time

by: Kensho Hara

Self-supervised Learning Visual Correspondence

Cascaded Partial Decoder for Fast and Accurate Salient Object Detection

by: Munetaka Minoguchi

Salient Object Detection

A Simple Pooling-Based Design for Real-Time Salient Object Detection

by: Munetaka Minoguchi

Salient Object Detection

MirrorGAN: Learning Text-To-Image Generation by Redescription

by: Kazuma_Asano

Text2Image Cycle Consistency GAN NLP

Contrast Prior and Fluid Pyramid Integration for RGBD Salient Object Detection

by: Munetaka Minoguchi

Salient Object Detection

Grounding Human-To-Vehicle Advice for Self-Driving Vehicles

by: Takaya Yamazoe

Self Driving HAD Honda NLP

X2CT-GAN: Reconstructing CT From Biplanar X-Rays With Generative Adversarial Networks

by: Takaya Yamazoe

CT X-ray reconstruction GAN 3D

Semi-Supervised Learning With Graph Learning-Convolutional Networks

by: Yusuke Mori

Graph Convolutional Network Semi-supervised Learning

Label Efficient Semi-Supervised Learning via Graph Filtering

by: uchi_k

graph graph convolutional neural network semi-supervised learning graph signal processing

Learning Active Contour Models for Medical Image Segmentation

by: Yuta Nakamura

segmentation active contour model medical image loss function

Salient Object Detection With Pyramid Attention and Salient Edges

by: Kazuma_Asano

Object Detection SOTA Saliency

Deep Network Interpolation for Continuous Imagery Effect Transition

by: Katsuya Shimabukuro

Image Interpolation Image Denoising Super Resolution Style Transfer Image-to-Image

DM-GAN: Dynamic Memory Generative Adversarial Networks for Text-To-Image Synthesis

by: Anonymous

GAN

Catastrophic Child's Play: Easy to Perform, Hard to Defend Adversarial Attacks

by: Yoshihiro Fukuhara

Adversarial_Examples AEs Adversarial_Attack

FEELVOS: Fast End-To-End Embedding Learning for Video Object Segmentation

by: Yuta Tokuoka

video object segmentation pixel-wise metric learning embedding

Explore-Exploit Graph Traversal for Image Retrieval

by: uchi_k

Weakly Supervised Deep Image Hashing Through Tag Embeddings

by: maokura

hashing word2vec search weakly-supervised

Latent Filter Scaling for Multimodal Unsupervised Image-To-Image Translation

by: Kazuma_Asano

GAN Image-to-Image Translation Multimodal Unsupervised Image-to-Image Translation

Learning to Detect Human-Object Interactions With Knowledge

by: kota yoshida

human-object interactions

Complete the Look: Scene-Based Complementary Product Recommendation

by: Takaya Yamazoe

Fashion Recommendation Scene

Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks

by: Takaya Yamazoe

Self Driving OGM RNN

Panoptic Segmentation

by: Tomoki Tanimura

Panoptic Segmentation Segmentation Semantic Segmentation Instance Segmentation Task Metrics

Hardness-Aware Deep Metric Learning

by: Akihiro Yoshida

Metric Learning Data Augmentation

Attention-Aware Multi-Stroke Style Transfer

by: Kazuma_Asano

style transfer attention

Zoom-In-To-Check: Boosting Video Interpolation via Instance-Level Discrimination

by: yasud

GAN Frame Interpolation RoI Align

Convolutional Recurrent Network for Road Boundary Extraction

by: Tomoki Tanimura

Road Detection Autonomous Driving Feature Extraction Iterative

Action4D: Online Action Recognition in the Crowd and Clutter

by: Tenga Wakamiya

Action recognition

A Generative Appearance Model for End-To-End Video Object Segmentation

by: Takuma Yagi

video object segmentation gaussian mixture model

Deformable ConvNets V2: More Deformable, Better Results

by: Takuma Yagi

object detection instance segmentation deformable convolution

Discovering Visual Patterns in Art Collections With Spatially-Consistent Feature Learning

by: Takuma Yagi

computer vision and art unsupervised feature learning metric learning

SpherePHD: Applying CNNs on a Spherical PolyHeDron Representation of 360deg Images

by: Takuma Yagi

omni-directional images image representation equirectangular projection

Occlusion-Net: 2D/3D Occluded Keypoint Localization Using Graph Networks

by: Shuhei M Yoshida

keypoint detection occlusion graph networks

Unsupervised Image Captioning

by: Munetaka Minoguchi

Image Captioning Unsupervised

Learning Shape-Aware Embedding for Scene Text Detection

by: Munetaka Minoguchi

Text Detection

Describing Like Humans: On Diversity in Image Captioning

by: Munetaka Minoguchi

Image Captioning

LSTA: Long Short-Term Attention for Egocentric Action Recognition

by: Tsubura Kazuki

LSTM LSTA attention Action Recognition

Learning Pyramid-Context Encoder Network for High-Quality Image Inpainting

by: Kazuma_Asano

Inpainting Pyramid Encoder Network

Unsupervised Domain Adaptation Using Feature-Whitening and Consensus Loss

by: uchi_k

Learning Personalized Modular Network Guided by Structured Knowledge

by: SohOhara

classification segmentation structured knowledge graph modular network PMN

SketchGAN: Joint Sketch Completion and Recognition With Generative Adversarial Network

by: Ryosuke Araki

GANs Sketch Completion

Action Recognition From Single Timestamp Supervision in Untrimmed Videos

by: Tsubura Kazuki

action recognition timestamp

Interpretable and Fine-Grained Visual Explanations for Convolutional Neural Networks

by: takumuikeya

visual explanation

Gait Recognition via Disentangled Representation Learning

by: Koki Obinata

gait disentanglement auto encoder dataset

Attention-Based Adaptive Selection of Operations for Image Restoration in the Presence of Unknown Combined Distortions

by: Takuma Yagi

image restoration attention

Modulating Image Restoration With Continual Levels via Adaptive Feature Modification Layers

by: Yoshiki

Denoising Super resolution DeJPEG Adaptive layer

Memory in Memory: A Predictive Neural Network for Learning Higher-Order Non-Stationarity From Spatiotemporal Dynamics

by: Takuma Yagi

recurrent neural network spatiotemporal dynamics non-stationarity forecasting

Deep Multimodal Clustering for Unsupervised Audiovisual Learning

by: Takuma Yagi

audiovisual learning clustering sound localization multisource detection

Multi-Task Learning of Hierarchical Vision-Language Representation

by: Takaya Yamazoe

Multi Task Vision Language VQA Coattention

TraVeLGAN: Image-To-Image Translation by Transformation Vector Learning

by: Takuma Yagi

image-to-image translation siamese network

Object-Centric Auto-Encoders and Dummy Anomalies for Abnormal Event Detection in Video

by: Shunsuke NAKATSUKA

abnormal event detection

Class-Balanced Loss Based on Effective Number of Samples

by: Takuma Yagi

image classification

Dual Encoding for Zero-Example Video Retrieval

by: Takuma Yagi

video retrieval

Learning for Single-Shot Confidence Calibration in Deep Neural Networks Through Stochastic Inferences

by: Takuma Yagi

confidence calibration stochastic inference deep neural networks

IRLAS: Inverse Reinforcement Learning for Architecture Search

by: Takuma Yagi

inverse reinforcement learning architecture search

A Variational Auto-Encoder Model for Stochastic Point Processes

by: maokura

VAE PointProcess ActionPrediction video

Hybrid Task Cascade for Instance Segmentation

by: neka-nat

Object Instance Annotation With Deep Extreme Level Set Evolution

by: Anonymous

Cross-Modal Self-Attention Network for Referring Image Segmentation

by: Yasuhide Miura

Referring Image Segmentation cross-modal

Deep Supervised Cross-Modal Retrieval

by: Yasuhide Miura

Cross-modal cross-modal retrieval

Image-Question-Answer Synergistic Network for Visual Dialog

by: Yasuhide Miura

Visual dialog

Adversarial Semantic Alignment for Improved Image Captions

by: Yasuhide Miura

Image captioning NLP

Example-Guided Style-Consistent Image Synthesis From Semantic Labeling

by: Kazuma_Asano

Image2Image Translation Style Consistent Semantic Labeling

Soft Labels for Ordinal Regression

by: Koki Obinata

ordinal regression

BeautyGlow: On-Demand Makeup Transfer Framework With Reversible Generative Network

by: Sou Uchida

Makeup Transfer Glow

Learning Parallax Attention for Stereo Image Super-Resolution

by: yasud

GAN SR stereo images Attention Atrous Convolution

Answer Them All! Toward Universal Visual Question Answering Models

by: Yasuhide Miura

VQA NLP CV

Multi-Step Prediction of Occupancy Grid Maps With Recurrent Neural Networks

by: Takaya Yamazoe

tactile vision cross-modal

TACNet: Transition-Aware Context Network for Spatio-Temporal Action Detection

by: GOTO Keita

Spatio-Temporal Action Detection Action Recognition ConvLSTM

Learning to Remember: A Synaptic Plasticity Driven Framework for Continual Learning

by: Yusuke Mori

continual learning catastrophic forgetting

Attention Based Glaucoma Detection: A Large-Scale Database and CNN Model

by: Yasuhide Miura

Glaucoma attention

Look More Than Once: An Accurate Detector for Text of Arbitrary Shapes

by: Yasuhide Miura

scene text detection

Adapting Object Detectors via Selective Cross-Domain Alignment

by: Takanori Ebihara

object recognition, cross-domain, reinforcement learning

AIRD: Adversarial Learning Framework for Image Repurposing Detection

by: Yusuke Mori

Image repurposing

Im2Pencil: Controllable Pencil Illustration From Photographs

by: Kazuma_Asano

Image2Image Translation Style Transfer Image2Pencil

LO-Net: Deep Real-Time Lidar Odometry

by: Anonymous

LiDAR Odometry

Pseudo-LiDAR From Visual Depth Estimation: Bridging the Gap in 3D Object Detection for Autonomous Driving

by: Anonymous

Stereo Depth point cloud

Linkage Based Face Clustering via Graph Convolution Network

by: Hiromasa Sakata

faceclassification graph

Attentive Region Embedding Network for Zero-Shot Learning

by: uchi_k

zero-shot generalized zero-shot embedding

MetaCleaner: Learning to Hallucinate Clean Representations for Noisy-Labeled Visual Recognition

by: Shuhei M Yoshida

MetaCleaner noisy labels meta learning

Video Relationship Reasoning Using Gated Spatio-Temporal Energy Graph

by: Takaya Yamazoe

Video Relationship Graph

PartNet: A Recursive Part Decomposition Network for Fine-Grained and Hierarchical Shape Segmentation

by: Yuta Tokuoka

fine-grained segmentation hierarchical shape segmentation recursive neural network

Capture, Learning, and Synthesis of 3D Speaking Styles

by: Shintaro Yamamoto

Speech synthesis 3D character animation

A Parametric Top-View Representation of Complex Road Scenes

by: Takaya Yamazoe

Driving Top View parametric

MVTec AD -- A Comprehensive Real-World Dataset for Unsupervised Anomaly Detection

by: Yuta Tokuoka

unsupervised anomaly detection anomaly detection dataset segmentation of anomalous regions

Automatic Face Aging in Videos via Deep Reinforcement Learning

by: Shintaro Yamamoto

age progression reinforcement learning

Efficient Featurized Image Pyramid Network for Single Shot Detector

by: Shuhei M Yoshida

object detection feature pyramid featurized image pyramid

Peeking Into the Future: Predicting Future Person Activities and Locations in Videos

by: Anonymous

Explicit Bias Discovery in Visual Question Answering Models

by: Tomoki Tanimura

VQA statistical correlation rule bias discover mining

Triply Supervised Decoder Networks for Joint Detection and Segmentation

by: Shuhei M Yoshida

joint detection and segmentation object detection semantic segmentation

Neural Task Graphs: Generalizing to Unseen Tasks From a Single Video Demonstration

by: SohOhara

Neural Task Graph conjugate task graph video

Learning Actor Relation Graphs for Group Activity Recognition

by: Tsubura Kazuki

GCN Activity Recognition

Overcoming Limitations of Mixture Density Networks: A Sampling and Fitting Framework for Multimodal Future Prediction

by: Satoshi Inose

mixture density networks multimodal prediction

Effective Aesthetics Prediction With Multi-Level Spatially Pooled Features

by: Tomoki Tanimura

Aesthetics Quality Assessment Inception pretrained feature map AVA

LaserNet: An Efficient Probabilistic 3D Object Detector for Autonomous Driving

by: Eisuke Yamagata

autonomous driving object detection 3Dobjectdetection LiDAR

Multispectral Imaging for Fine-Grained Recognition of Powders on Complex Backgrounds

by: takeshi miura

Time-Conditioned Action Anticipation in One Shot

by: Tsubura Kazuki

Action Anticipation

Improving Action Localization by Progressive Cross-Stream Cooperation

by: GOTO Keita

Action Localization R-CNN

Max-Sliced Wasserstein Distance and Its Use for GANs

by: Hideki Tsunashima

Generative Adversarial Nets GANs Wasserstein Distance word translation image generation

Light Field Messaging With Deep Photographic Steganography

by: Hirokatsu Kataoka

When Color Constancy Goes Wrong: Correcting Improperly White-Balanced Images

by: Hirokatsu Kataoka

Relational Knowledge Distillation

by: Munetaka Minoguchi

Knowledge Distillation

Global Second-Order Pooling Convolutional Networks

by: Hideki Tsunashima

CNN Pooling Global Average Pooling second-order pooling CIFAR-100 ImageNet 1k

Beyond Volumetric Albedo -- A Surface Optimization Framework for Non-Line-Of-Sight Imaging

by: Hirokatsu Kataoka

Events-To-Video: Bringing Modern Computer Vision to Event Cameras

by: Munetaka Minoguchi

Event Camera

Emotion-Aware Human Attention Prediction

by: Masaki Miyamoto

AttI EASal DNN human gaze prediction

Feedback Adversarial Learning: Spatial Feedback for Improving Generative Adversarial Networks

by: Kazuma_Asano

GAN Image2Image pix2pix voxel voxelGAN feedback

Understanding and Visualizing Deep Visual Saliency Models

by: Yamada Yoshihiro

SoDeep: A Sorting Deep Net to Learn Ranking Loss Surrogates

by: Yoshiki

Ranking Cross-modal retrieval Multi-label classification Visual memorability ranking

Reasoning-RCNN: Unifying Adaptive Global Reasoning Into Large-Scale Object Detection

by: Ryota Suzuki

object detection RCNN visual reasoning

Composing Text and Image for Image Retrieval - an Empirical Odyssey

by: Ryota Suzuki

image retrieval vision and language

Unsupervised Domain-Specific Deblurring via Disentangled Representations

by: Takaya Yamazoe

deblurring blur unsupervised

3D Appearance Super-Resolution With Deep Learning

by: Yuta Tokuoka

super resolution 3D appearance super-resolution dataset texture map 3D geometric information

An Attention Enhanced Graph Convolutional LSTM Network for Skeleton-Based Action Recognition

by: Hiromasa Sakata

actionrecognition skeletonbased

Inverse Discriminative Networks for Handwritten Signature Verification

by: rindybell

signature recognition handwritten signature recognition signature

Exploring the Bounds of the Utility of Context for Object Detection

by: Shuhei M Yoshida

object detection context utility

AutoAugment: Learning Augmentation Strategies From Data

by: cfiken

Data Augmentation Image Classification Transfer Learning

Semantics Disentangling for Text-To-Image Generation

by: Keito Ishihara

text2image GAN

The Visual Centrifuge: Model-Free Layered Video Representations

by: Kensho Hara

Layer Decomposition

Did It Change? Learning to Detect Point-Of-Interest Changes for Proactive Map Updates

by: Munetaka Minoguchi

Point-of-Interest(PoI) Detect Point-of-Interest Changes

Unsupervised Person Re-Identification by Soft Multilabel Learning

by: Kensho Hara

Person Reidentification Unsupervised Learning

Animating Arbitrary Objects via Deep Motion Transfer

by: Kensho Hara

Motion Transfer

Sensitive-Sample Fingerprinting of Deep Neural Networks

by: Koki Obinata

security attack detection

Learning Context Graph for Person Search

by: Kensho Hara

Person Search Group Reidentification

Speech2Face: Learning the Face Behind a Voice

by: Shion Honda

cross-modal learning speech recognition image generation

Improved Road Connectivity by Joint Learning of Orientation and Segmentation

by: Takaya Yamazoe

Road Map Satellite Orientation SpaceNet DeepGlobe

Lending Orientation to Neural Networks for Cross-View Geo-Localization

by: rindybell

satellite imagery position estimation

Attentive Feedback Network for Boundary-Aware Salient Object Detection

by: Kazuma_Asano

Salient Object Detection RealTime SOTA

Towards Natural and Accurate Future Motion Prediction of Humans and Animals

by: Shintaro Yamamoto

Motion prediction

Attention-Guided Network for Ghost-Free High Dynamic Range Imaging

by: Takeru Suda

HDR Attention

3D Guided Fine-Grained Face Manipulation

by: Shintaro Yamamoto

facial expression 3D face model

Dual Attention Network for Scene Segmentation

by: Hideki Tsunashima

Semantic Segmentation SOTA Attention Cityscapes PASCAL Context COCO stuff dataset

Reflection Removal Using a Dual-Pixel Sensor

by: Hirokatsu Kataoka

Reflection Removal Dual Pixel Sensor Photography

Practical Coding Function Design for Time-Of-Flight Imaging

by: Hirokatsu Kataoka

Time-of-Flight (ToF) Coding Function 3D Reconstruction

Neural Sequential Phrase Grounding (SeqGROUND)

by: Keito Ishihara

PhraseGrounding LSTM

Temporal Cycle-Consistency Learning

by: kotayoshida

Cycle-Consistency video recognition Few-shot learning

Meta-SR: A Magnification-Arbitrary Network for Super-Resolution

by: Hirokatsu Kataoka

Super Resolution

Multispectral and Hyperspectral Image Fusion by MS/HS Fusion Net

by: Hirokatsu Kataoka

Multispectral Image Hyperspectral Image FusionNet

Object-Driven Text-To-Image Synthesis via Adversarial Training

by: Mitani Tomohiro

GAN COCO text-to-image

Learning Attraction Field Representation for Robust Line Segment Detection

by: Hirokatsu Kataoka

Line Segment Detection (LSD) U-Net

Self-Calibrating Deep Photometric Stereo Networks

by: Takeru Suda

photometric stereo light source estimation

Blind Super-Resolution With Iterative Kernel Correction

by: Kazuma_Asano

SR Super Resolution SOTA blind SR

Learning Spatio-Temporal Representation With Local and Global Diffusion

by: GOTO Keita

Action Recognition Spatio-Temporal Action Detection

Improving Few-Shot User-Specific Gaze Adaptation via Gaze Redirection Synthesis

by: Mitani Tomohiro

Gaze redirection gaze adaptation

PointFlowNet: Learning Representations for Rigid Motion Estimation From Point Clouds

by: shirouchi satoshi

Rigid Motion Estimation

Learning to Explain With Complemental Examples

by: SohOhara

multimodal explanation complemental explanation classification

Unifying Heterogeneous Classifiers With Distillation

by: Hideki Tsunashima

Distillation Classifier Heterogeneous Classifiers ImageNet LSUN Places365

AdaFrame: Adaptive Frame Selection for Fast Video Recognition

by: Hiromasa Sakata

videorecognition lstm reinforcementlearning

Revisiting Local Descriptor Based Image-To-Class Measure for Few-Shot Learning

by: Shuhei M Yoshida

few-shot learning metric learning local descriptor naive-Bayes nearest neighbor

Video Magnification in the Wild Using Fractional Anisotropy in Temporal Distribution

by: Kazuma_Asano

Video Magnification Fractional Anisotropy NTT

Graph-Based Global Reasoning Networks

by: cfiken

Global Reasoning Relational Reasoning CNN Graph Convolution

Learning to Calibrate Straight Lines for Fisheye Image Rectification

by: Hirokatsu Kataoka

Distortion Fisheye Camera Line

Sea-Thru: A Method for Removing Water From Underwater Images

by: Katsuya Shimabukuro

Image Reconstruction Removing Water

Towards Robust Curve Text Detection With Conditional Spatial Expansion

by: Shuhei M Yoshida

curve text detection conditional spatial expansion

Camera Lens Super-Resolution

by: Hirokatsu Kataoka

Super Resolution Camera Lens

Deep Plug-And-Play Super-Resolution for Arbitrary Blur Kernels

by: Katsuya Shimabukuro

Image Restoration Super Resolution

Frame-Consistent Recurrent Video Deraining With Dual-Level Flow

by: Kazuma_Asano

Frame-Consistent Recurrent Video deraining

Art2Real: Unfolding the Reality of Artworks via Semantically-Aware Image-To-Image Translation

by: Anonymous

Cycle-GAN Artworks Image2Image

Revisiting Perspective Information for Efficient Crowd Counting

by: Shuhei M Yoshida

crowd counting perspective map

KE-GAN: Knowledge Embedded Generative Adversarial Networks for Semi-Supervised Scene Parsing

by: Rei Tamaru

semantic segmentation generative adversarial networks knowledge graph

OK-VQA: A Visual Question Answering Benchmark Requiring External Knowledge

by: maokura

VQA dataset

Enhanced Pix2pix Dehazing Network

by: Anonymous

Hierarchical Cross-Modal Talking Face Generation With Dynamic Pixel-Wise Loss

by: Shunsuke NAKATSUKA

Face Generation Audio Signal GANs

EventNet: Asynchronous Recursive Event Processing

by: Munetaka Minoguchi

event camera

Exact Adversarial Attack to Image Captioning via Structured Output Learning With Latent Variables

by: Munetaka Minoguchi

Image Captioning Adversarial Noise Adversarial Attack

DenseFusion: 6D Object Pose Estimation by Iterative Dense Fusion

by: Hideki Tsunashima

6D Object Pose Estimation YCB-Video LineMOD Point Cloud Object Detection SOTA

Putting Humans in a Scene: Learning Affordance in 3D Indoor Environments

by: Eisuke Yamagata

affordance prediction

Unsupervised Learning of Action Classes With Continuous Temporal Embedding

by: GOTO Keita

unsupervised learning Temporal Segmentation

Interaction-And-Aggregation Network for Person Re-Identification

by: takumuikeya

preid

CrDoCo: Pixel-Level Domain Transfer With Cross-Domain Consistency

by: Hiroaki Aizawa

Unsupervised Domain Adaptation

Shape Robust Text Detection With Progressive Scale Expansion Network

by: takumuikeya

text detection

Attention-Guided Unified Network for Panoptic Segmentation

by: Ryuta Shitomi

Panoptic Segmentation Attention

Learning Semantic Segmentation From Synthetic Data: A Geometrically Guided Input-Output Adaptation Approach

by: Hiroaki Aizawa

Cross-domain Semantic Segmentation Domain Adaptation

All About Structure: Adapting Structural Information Across Domains for Boosting Semantic Segmentation

by: Hiroaki Aizawa

Unsupervised Domain Adaptation Semantic Segmentation

Predicting Future Frames Using Retrospective Cycle GAN

by: kotayoshida

video recognition cycle-constraints frame prediction

Towards Accurate One-Stage Object Detection With AP-Loss

by: Ryota Nishijima

detection one-shot loss

SpotTune: Transfer Learning Through Adaptive Fine-Tuning

by: Koki Obinata

fine-tuning transfer learning

Adversarial Defense Through Network Profiling Based Path Extraction

by: neka-nat

Tightness-Aware Evaluation Protocol for Scene Text Detection

by: Tomoki Tanimura

Text detection Evaluation metrics Score

Listen to the Image

by: shirouchi satoshi

Sensory Substitution devices Generative Adversarial Networks

F-VAEGAN-D2: A Feature Generating Framework for Any-Shot Learning

by: Takaya Yamazoe

Few shot learning zero shot learning GAN VAE

SIGNet: Semantic Instance Aided Unsupervised 3D Geometry Perception

by: Shintaro Yamamoto

depth prediction optical flow unsupervised learning

Monocular Depth Estimation Using Relative Depth Maps

by: Tomoki Tanimura

Depth estimation Relative depth

MSCap: Multi-Style Image Captioning With Unpaired Stylized Text

by: Keito Ishihara

image captioning

COIN: A Large-Scale Dataset for Comprehensive Instructional Video Analysis

by: Hiromasa Sakata

crowdcounting

SparseFool: A Few Pixels Make a Big Difference

by: Yoshihiro Fukuhara

Adversarial Example Adversarial Attack Sparse Attack

Deep Spectral Clustering Using Dual Autoencoder Network

by: Munetaka Minoguchi

Clustering Deep Clustering Unsupervised

Unsupervised Multi-Modal Neural Machine Translation

by: Yasuhide Miura

multi-modal unsupervised NMT

Attentive Relational Networks for Mapping Images to Scene Graphs

by: Munetaka Minoguchi

Scene Graph

Compressing Convolutional Neural Networks via Factorized Convolutional Filters

by: Munetaka Minoguchi

Filter Pruning

Gradient Matching Generative Networks for Zero-Shot Learning

by: Kensho Hara

Zero-shot Learning GAN

On the Intrinsic Dimensionality of Image Representations

by: Munetaka Minoguchi

Robot Arm 3D model

Adaptive NMS: Refining Pedestrian Detection in a Crowd

by: Ryota Suzuki

non maximum suppression detection

Point in, Box Out: Beyond Counting Persons in Crowds

by: Ryota Suzuki

crowd counting detection annotation cost

Locating Objects Without Bounding Boxes

by: Ryota Suzuki

object detection Hausdorff distance

Douglas-Rachford Networks: Learning Both the Image Prior and Data Fidelity Terms for Blind Image Deconvolution

by: Takaya Yamazoe

Deconvolution Blur Blind Douglas-Rachford

Multi-Task Self-Supervised Object Detection via Recycling of Bounding Box Annotations

by: Ryota Nishijima

self-supervised learning object detection multi-task learning

Towards Visual Feature Translation

by: maokura

feature-transfer handcraft search

3D-SIS: 3D Semantic Instance Segmentation of RGB-D Scans

by: Naoya Chiba

3D semantic instance segmentation RGB-D 3D CNN

Occupancy Networks: Learning 3D Reconstruction in Function Space

by: Naoya Chiba

Occupancy Networks 3D mesh point cloud 3D reconstruction

Efficient Neural Network Compression

by: Kiro Otsu

network compression eigenvalue decomposition

Revisiting Self-Supervised Visual Representation Learning

by: Hiroaki Aizawa

Self-supervised Learning

DeepFashion2: A Versatile Benchmark for Detection, Pose Estimation, Segmentation and Re-Identification of Clothing Images

by: Rei Tamaru

DeepFashion Mask R-CNN Image Detection Match R-CNN

Taking a Deeper Look at the Inverse Compositional Algorithm

by: Naoya Chiba

Lucas-Kanade (LK) Inverse Compositional (IC) Algorithm 6DoF Estimation Registration Rigid Motion Estimation

PVNet: Pixel-Wise Voting Network for 6DoF Pose Estimation

by: Naoya Chiba

6DoF Estimation Vector Field Keypoints Detection

VITAMIN-E: VIsual Tracking and MappINg With Extremely Dense Feature Points

by: Tomoki Tanimura

SLAM Indirect method Dense geometry reconstruction dense feature points dominant flow optical flow 3D

NetTailor: Tuning the Architecture, Not Just the Weights

by: Hideki Tsunashima

Image Recognition Pretrained model fine-tuning Task Transfer Distillation Pruning SVHN Flowers PASCAL VOC

PlaneRCNN: 3D Plane Detection and Reconstruction From a Single Image

by: Naoya Chiba

Plane Estimation Plane Segmentation

Causes and Corrections for Bimodal Multi-Path Scanning With Structured Light

by: Naoya Chiba

Structured Light Phase Measuring Profilometry 3D Measurement 3D Scanning

H+O: Unified Egocentric Recognition of 3D Hand-Object Poses and Interactions

by: Naoya Chiba

Hand-Object Pose Action Recognition

Multimodal Explanations by Predicting Counterfactuality in Videos

by: SohOhara

multimodal explanation counterfactuality explanation video classification clipping

Deep Asymmetric Metric Learning via Rich Relationship Mining

by: Keito Ishihara

metric learning graph

Ensemble Deep Manifold Similarity Learning Using Hard Proxies

by: Shuhei M Yoshida

metric learning ensemble learning N-pair loss hard proxy estimation

Audio Visual Scene-Aware Dialog

by: Shion Honda

question answering video audio dialogue

Constrained Generative Adversarial Networks for Interactive Image Generation

by: Yoshiki

GAN Generative model Interactive

StoryGAN: A Sequential Conditional GAN for Story Visualization

by: Ryosuke Tanno

StoryGAN story-to-image-sequence generation image generation

Noise-Aware Unsupervised Deep Lidar-Stereo Fusion

by: Ryosuke Tanno

Lidar-Stereo Unsupervised stereo matching depth completion

Versatile Multiple Choice Learning and Its Application to Vision Computing

by: Ryosuke Tanno

ensemble method multiple choice learning MCL

EV-Gait: Event-Based Robust Gait Recognition Using Dynamic Vision Sensors

by: Ryosuke Tanno

Gait Recognition Dynamic Vision Sensor Event Camera

Learning 3D Human Dynamics From Video

by: Anonymous

3D motion

ToothNet: Automatic Tooth Instance Segmentation and Identification From Cone Beam CT Images

by: Ryosuke Tanno

Instance Segmentation Identification CBCT image RPN medical image teeth

Image Generation From Layout

by: Takeru Suda

generative model Disentangle

Inverse Procedural Modeling of Knitwear

by: Takeru Suda

Knitwear fashion unsupervised learning

Sampling Techniques for Large-Scale Object Detection From Sparsely Annotated Objects

by: Ryota Suzuki

object detection sparse annotation sampling

Barrage of Random Transforms for Adversarially Robust Defense

by: Ryota Suzuki

adversarial example

Large-Scale Interactive Object Segmentation With Human Annotators

by: Anonymous

segmentation annotation dataset

Iterative Reorganization With Weak Spatial Constraints: Solving Arbitrary Jigsaw Puzzles for Unsupervised Representation Learning

by: Hiroaki Aizawa

Self-supervised Learning Jigsaw Puzzle

Iterative Alignment Network for Continuous Sign Language Recognition

by: Masaki Miyamoto

sign language recognition 3D-ResNet SLR LSTM CTC

TAFE-Net: Task-Aware Feature Embeddings for Low Shot Learning

by: Hiroaki Aizawa

Few-shot Learning Meta Learning

PPGNet: Learning Point-Pair Graph for Line Segment Detection

by: Ryuta Shitomi

Line Segment Detection Graph theory

Collaborative Spatiotemporal Feature Learning for Video Action Recognition

by: Shunsuke NAKATSUKA

Spatiotemporal Feature Learning Video Action Recognition Conv3D

Pyramid Feature Attention Network for Saliency Detection

by: Hideki Tsunashima

saliency detection object detection visual tracking image retrieval semantic segmentation SOTA DUTS-test ECSSD HKU-IS PASCAL-S DUT-OMRON

End-To-End Interpretable Neural Motion Planner

by: Takeru Suda

autonomous driving LiDAR HD map

Expressive Body Capture: 3D Hands, Face, and Body From a Single Image

by: Yoitsu Takahashi

Pose Estimation 3D Human Body Model

Refine and Distill: Exploiting Cycle-Inconsistency and Knowledge Distillation for Unsupervised Monocular Depth Estimation

by: Tomoki Tanimura

depth estimation disparity map cycle inconsistency distillation

Fast Spatially-Varying Indoor Lighting Estimation

by: Satoshi Inose

CNN Spherical harmonics

Speed Invariant Time Surface for Learning to Detect Corner Points With Event-Based Cameras

by: Takaya Yamazoe

Event Camera Corner Detection Classifier

What's to Know? Uncertainty as a Guide to Asking Goal-Oriented Questions

by: Masaki Miyamoto

reinforcement agent visual dialogue bayesian

CityFlow: A City-Scale Benchmark for Multi-Target Multi-Camera Vehicle Tracking and Re-Identification

by: takeshi miura

multi target multi-camera tracking image-based vehicle re-identification a city-scale benchmark

Estimating 3D Motion and Forces of Person-Object Interactions From Monocular Video

by: takeshi miura

Estimating 3D Motion and Forces MoCap

Beyond Tracking: Selecting Memory and Refining Poses for Deep Visual Odometry

by: takeshi miura

monocular Visual Odometry CNN RNN

Assessment of Faster R-CNN in Man-Machine Collaborative Search

by: maokura

ubiquitous search eye-tracking

What Correspondences Reveal About Unknown Camera and Motion Models?

by: takeshi miura

3D vision

Unsupervised Face Normalization With Extreme Pose and Expression in the Wild

by: Shintaro Yamamoto

GAN face recognition

Rethinking the Evaluation of Video Summaries

by: Shion Honda

video summarization metrics

Feedback Network for Image Super-Resolution

by: Masaki Miyamoto

feedback block Image Super Resolution SR image

Learning Monocular Depth Estimation Infusing Traditional Stereo Knowledge

by: Shintaro Yamamoto

depth prediction unsupervised learning

Signal-To-Noise Ratio: A Robust Distance Metric for Deep Metric Learning

by: neka-nat

Information Maximizing Visual Question Generation

by: kotayoshida

visual question generation visual question answering language and vision

IP102: A Large-Scale Benchmark Dataset for Insect Pest Recognition

by: takeshi miura

dataset pest recognition

Defense Against Adversarial Images Using Web-Scale Nearest-Neighbor Search

by: Takeru Suda

Adversarial Attack Nearest-Neighbor Search Big Data

ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification

by: kota yoshida

text text recognition image rectification

Argoverse: 3D Tracking and Forecasting With Rich Maps

by: takeshi miura

dataset

Learning Not to Learn: Training Deep Neural Networks With Biased Data

by: Yoshihiro Fukuhara

UPSNet: A Unified Panoptic Segmentation Network

by: Takeru Suda

semantic segmentation instance segmentation panoptic segmentation

Geometry-Aware Symmetric Domain Adaptation for Monocular Depth Estimation

by: Shintaro Yamamoto

depth prediction domain adaptation

Side Window Filtering

by: Takeru Suda

Window Filtering image smoothing denoising enhancement structure preserving texture-removing mutual-structure extraction and high dynamic range image tone mapping Colorization

Sequence-To-Sequence Domain Adaptation Network for Robust Text Image Recognition

by: hiroki iida

DomainAdaptation TextImageRecognition

Quantization Networks

by: Shuhei M Yoshida

quantization

RES-PCA: A Scalable Approach to Recovering Low-Rank Matrices

by: Shuhei M Yoshida

robust principal component analysis RPCA

Neuro-Inspired Eye Tracking With Eye Movement Dynamics

by: Shintaro Yamamoto

eye tracking

ZigZagNet: Fusing Top-Down and Bottom-Up Context for Object Segmentation

by: Shuhei M Yoshida

semantic segmentation instance segmentation multi-scale context ZigZagNet

Facial Emotion Distribution Learning by Exploiting Low-Rank Label Correlations Locally

by: Shintaro Yamamoto

facial emotion

Robust Point Cloud Based Reconstruction of Large-Scale Outdoor Scenes

by: Tomoki Tanimura

SLAM 3D reconstruction odometry cauchy probabilistic bayes data association loop-closures

Cyclic Guidance for Weakly Supervised Joint Detection and Segmentation

by: Takanori Ebihara

weakly supervised learning object recognition object segmentation

Variational Prototyping-Encoder: One-Shot Learning With Prototypical Images

by: Yuta Tokuoka

one-shot learning embedding variational encoder

Rare Event Detection Using Disentangled Representation Learning

by: Tomoki Tsujimura

disentanglement learning representation learning change detection

DeepLight: Learning Illumination for Unconstrained Mobile Mixed Reality

by: Anonymous

CNN 3D HDR

DDLSTM: Dual-Domain LSTM for Cross-Dataset Action Recognition

by: Shunsuke NAKATSUKA

Video Action Recognition Cross Domain

Deep Dual Relation Modeling for Egocentric Interaction Recognition

by: Shunsuke NAKATSUKA

Egocentric Interaction Recognition Relation Modeling

Hyperspectral Imaging With Random Printed Mask

by: Sou Uchida

Hyperspectral Imaging

PA3D: Pose-Action 3D Machine for Video Recognition

by: Shunsuke NAKATSUKA

Conv3D Video Action Recognition Pose

Visual Localization by Learning Objects-Of-Interest Dense Match Regression

by: Anonymous

CNN localization

Inserting Videos Into Videos

by: Sou Uchida

Video Insertion

ESIR: End-To-End Scene Text Recognition via Iterative Image Rectification

by: kota yoshida

video question answering video

DFANet: Deep Feature Aggregation for Real-Time Semantic Segmentation

by: Yuta Tokuoka

real-time semantic segmentation feature aggregation multi scale

MOTS: Multi-Object Tracking and Segmentation

by: Shunsuke NAKATSUKA

Multi-Object Tracking and Segmentation Dataset

Two-Stream Adaptive Graph Convolutional Networks for Skeleton-Based Action Recognition

by: GOTO Keita

Graph Convolutional Networks Two-stream Skeleton Action Recognition

Noise2Void - Learning Denoising From Single Noisy Images

by: kota yoshida

denoising biomedical

Siamese Cascaded Region Proposal Networks for Real-Time Visual Tracking

by: Shunsuke NAKATSUKA

Object Tracking Region Proposal Network Siamese Network

Learning Image and Video Compression Through Spatial-Temporal Energy Compaction

by: Sou Uchida

Image Compression Video Compression Energy Function

Object Discovery in Videos as Foreground Motion Clustering

by: Tsubura Kazuki

object detection segmentation pixel-trajectory RNN U-Net Y-Net

Deep RNN Framework for Visual Sequential Applications

by: cfiken

Visual sequence Deep RNN

Pose2Seg: Detection Free Human Instance Segmentation

by: Hiromasa Sakata

instance-segmentation

Neighbourhood Watch: Referring Expression Comprehension via Language-Guided Graph Attention Networks

by: kota yoshida

referring expression comprehension language and vision

Deep Blind Video Decaptioning by Temporal Aggregation and Recurrence

by: Keito Ishihara

decaptioning encoder decoder

Scene Graph Generation With External Knowledge and Image Reconstruction

by: kota yoshida

Scene Graph Generation Image Reconstruction

Long-Term Feature Banks for Detailed Video Understanding

by: cfiken

Video Understanding

SSN: Learning Sparse Switchable Normalization via SparsestMax

by: Daisuke Makino

Normalization

Semantic Component Decomposition for Face Attribute Manipulation

by: Shintaro Yamamoto

facial attribute image editing

What Object Should I Use? - Task Driven Object Detection

by: Shuhei M Yoshida

task driven object detection graph neural network

Multi-Task Multi-Sensor Fusion for 3D Object Detection

by: Shuhei M Yoshida

3D object detection multi-task learning LiDAR

Multi-Similarity Loss With General Pair Weighting for Deep Metric Learning

by: Koki Obinata

metric learning pair loss framework

Domain-Symmetric Networks for Adversarial Domain Adaptation

by: Koki Obinata

domain adaptation adversarial learning

Learning to Learn From Noisy Labeled Data

by: Koki Obinata

meta learning noisy data

ApolloCar3D: A Large 3D Car Instance Understanding Benchmark for Autonomous Driving

by: Rei Tamaru

3D Car Dataset 3D-RCNN DeepMANTA

Task-Free Continual Learning

by: NSD

Continual Learning

Context-Aware Visual Compatibility Prediction

by: Eisuke Yamagata

fashion compatibility

Scalable Convolutional Neural Network for Image Compressed Sensing

by: yasud

CS compressed sensing SCSNet

Edge-Labeling Graph Neural Network for Few-Shot Learning

by: Akihiro Yoshida

graph neural network few-shot learning

Weakly Supervised Learning of Instance Segmentation With Inter-Pixel Relations

by: Kensho Hara

Instance Segmentation Weakly Supervised Learning

Efficient Video Classification Using Fewer Frames

by: Akihiro Yoshida

video classification distillation

A Neural Network Based on SPD Manifold Learning for Skeleton-Based Hand Gesture Recognition

by: GOTO Keita

Manifold Learning Gesture Recognition

Hierarchy Denoising Recursive Autoencoders for 3D Scene Layout Prediction

by: Hiroaki Aizawa

3D object detection point cloud

Adaptively Connected Neural Networks

by: Hiroaki Aizawa

Attentive Single-Tasking of Multiple Tasks

by: Hiroaki Aizawa

Multi-task Learning Attention

End-To-End Multi-Task Learning With Attention

by: Hiroaki Aizawa

Multi-task Learning Attention

Joint Discriminative and Generative Learning for Person Re-Identification

by: Hiroaki Aizawa

Person Re-Identification

Learning Regularity in Skeleton Trajectories for Anomaly Detection in Videos

by: Mitani Tomohiro

anomaly detection skeleton features

A Structured Model for Action Detection

by: Tsubura Kazuki

Action Detection Graph Convolutional Network I3D Mask R-CNN

Learning Words by Drawing Images

by: Hiroaki Aizawa

Audio-visual Model GAN

What and How Well You Performed? A Multitask Learning Approach to Action Quality Assessment

by: Akihiro Yoshida

Action Quality Assessment Multi Task Learning

DSFD: Dual Shot Face Detector

by: Ryota Nishijima

face detection object detection

MaxpoolNMS: Getting Rid of NMS Bottlenecks in Two-Stage Object Detectors

by: takumuikeya

NMS object detection detection

Context-Reinforced Semantic Segmentation

by: Masaki Miyamoto

segmentation p-map context reinforce

Translate-to-Recognize Networks for RGB-D Scene Recognition

by: Tenga Wakamiya

RGB-D Scene Recognition

AdaptiveFace: Adaptive Margin and Sampling for Face Recognition

by: yasud

Face Recognition cos face margin based loss arc face

Self-Supervised Learning via Conditional Motion Propagation

by: Hiroaki Aizawa

Self-supervised Learning

SR-LSTM: State Refinement for LSTM Towards Pedestrian Trajectory Prediction

by: Mitani Tomohiro

Collaborative Learning of Semi-Supervised Segmentation and Classification for Medical Images

by: Hiroaki Aizawa

Semi-supervised Learning

Curls & Whey: Boosting Black-Box Adversarial Attacks

by: Ryota Suzuki

adversarial example

Disentangled Representation Learning for 3D Face Shape

by: Mitani Tomohiro

3d mesh face model disentangle spectral graph convolution

Distilling Object Detectors With Fine-Grained Feature Imitation

by: Ryota Nishijima

knowledge distillation object detection

NDDR-CNN: Layerwise Feature Fusing in Multi-Task CNNs by Neural Discriminative Dimensionality Reduction

by: Hideki Tsunashima

Multi Task Learning NYU v2 dataset IMDB-WIKI dataset

Min-Max Statistical Alignment for Transfer Learning

by: Takuma Yagi

statistical alignment unsupervised domain adaptation zero-shot learning

On Exploring Undetermined Relationships for Visual Relationship Detection

by: QIUYUE

Visual Relationship Detection

Densely Semantically Aligned Person Re-Identification

by: Takanori Ebihara

person detection

Weakly Supervised Image Classification Through Noise Regularization

by: Tenga Wakamiya

weakly supervised Image Classification

Stochastic Class-Based Hard Example Mining for Deep Metric Learning

by: s.kasai

Metric Learning Sample Mining

Learning Without Memorizing

by: QIUYUE

Incremental Learning

Dynamic Recursive Neural Network

by: QIUYUE

Recursive neural network

Weakly Supervised Video Moment Retrieval From Text Queries

by: Tenga Wakamiya

weakly supervised

Learning Loss for Active Learning

by: cfiken

Active Learning

Classification-Reconstruction Learning for Open-Set Recognition

by: Masaki Miyamoto

Open-set classification CROSR reconstruction recognition

Deep Sky Modeling for Single Image Outdoor Lighting Estimation

by: Satoshi Inose

outdoor lighting HDR

GANFIT: Generative Adversarial Network Fitting for High Fidelity 3D Face Reconstruction

by: Hiromasa Sakata

GAN face-reconstruction

A Flexible Convolutional Solver for Fast Style Transfers

by: Takuma Yagi

style transfer

Mapping, Localization and Path Planning for Image-Based Navigation Using Visual Features and Map

by: Shuhei M Yoshida

image-based navigation retrieval-based localization map construction

Face Parsing With RoI Tanh-Warping

by: Anonymous

face parsing Mask R-CNN FCN

LiveSketch: Query Perturbations for Guided Sketch-Based Visual Search

by: Anonymous

LiveSketch Visual Search

Learning Cross-Modal Embeddings With Adversarial Networks for Cooking Recipes and Food Images

by: Yuta Nakamura

recipe food cross-modal adversarial loss retrieval loss metric learning modality alignment

Deep Embedding Learning With Discriminative Sampling Policy

by: neka-nat

InverseRenderNet: Learning Single Image Inverse Rendering

by: maokura

InverseRendering Renderer MultiViewStereo

Destruction and Construction Learning for Fine-Grained Image Recognition

by: QIUYUE

Fine-grained Image Recognition

Distraction-Aware Shadow Detection

by: QIUYUE

Shadow Detection

Mask-Guided Portrait Editing With Conditional GANs

by: kubo.takahiro

GAN

Neural Rejuvenation: Improving Deep Network Training by Enhancing Computational Resource Utilization

by: Hirokatsu Kataoka

Neural Rejuvenation Initialization Reallocation

Group Sampling for Scale Invariant Face Detection

by: kubo.takahiro

Object Detection

Perceive Where to Focus: Learning Visibility-Aware Part-Level Features for Partial Person Re-Identification

by: Akihiro Yoshida

re-identification metric learning

Revealing Scenes by Inverting Structure From Motion Reconstructions

by: Kazuma_Asano

GAN Structure from Motion Reconstructions SfM

Dichromatic Model Based Temporal Color Constancy for AC Light Sources

by: Anonymous

Multi-Label Image Recognition With Graph Convolutional Networks

by: QIUYUE

Multi-label image recognition Graph Convolutional Network

High-Level Semantic Feature Detection: A New Perspective for Pedestrian Detection

by: QIUYUE

Pedestrian Detection

Towards Social Artificial Intelligence: Nonverbal Social Signal Prediction in a Triadic Interaction

by: Takuma Yagi

social interaction social signal prediction motion capture

Kernel Transformer Networks for Compact Spherical Convolution

by: Tomoki Tanimura

spherical convolution scalability transfer

Isospectralization, or How to Hear Shape, Style, and Correspondence

by: Shion Honda

mesh object reconstruction graph

Semantic Image Synthesis With Spatially-Adaptive Normalization

by: Shion Honda

GAN semantic segmentation

Social Relation Recognition From Videos via Multi-Scale Spatial-Temporal Reasoning

by: kubo.takahiro

GraphConvolution Video

Relational Action Forecasting

by: Takuma Yagi

action forecasting early action prediction graph neural networks

MS-TCN: Multi-Stage Temporal Convolutional Network for Action Segmentation

by: Masaki Taniguchi

Temporal Action Segmentation Temporal Convolutional Network

Semi-Supervised Transfer Learning for Image Rain Removal

by: Masaki Miyamoto

Semi supervised transfer rain removal SIRR deep

Photometric Mesh Optimization for Video-Aligned 3D Object Reconstruction

by: Takuma Yagi

mesh reconstruction photometric consistency shape prior

3D Motion Decomposition for RGBD Future Dynamic Scene Synthesis

by: Takuma Yagi

future prediction visual forecasting motion decomposition

Joint Representation and Estimator Learning for Facial Action Unit Intensity Estimation

by: kubo.takahiro

Face

DeepVoxels: Learning Persistent 3D Feature Embeddings

by: Takuma Yagi

novel view synthesis 3d vision scene representation

What Do Single-View 3D Reconstruction Networks Learn?

by: Hideki Tsunashima

3D reconstruction Reconstruction Single-view 3D reconstruction

Semantic Alignment: Finding Semantically Consistent Ground-Truth for Facial Landmark Detection

by: kubo.takahiro

FaceLandmark

Efficient Parameter-Free Clustering Using First Neighbor Relations

by: Takeru Suda

hierarchical agglomerative method clustering DeepClustering

Learning View Priors for Single-View 3D Reconstruction

by: Takuma Yagi

single-view 3D reconstruction view prior

Shape2Motion: Joint Analysis of Motion Parts and Attributes From 3D Shapes

by: Takeru Suda

point cloud mobility PointNet

Image Super-Resolution by Neural Texture Transfer

by: shirouchi satoshi

Neural Texture Transfer

Deep Single Image Camera Calibration With Radial Distortion

by: Tenga Wakamiya

Camera Calibration

Second-Order Attention Network for Single Image Super-Resolution

by: Sou Uchida

Super Resolution Attention

Collaborative Global-Local Networks for Memory-Efficient Segmentation of Ultra-High Resolution Images

by: Takeru Suda

Segmentation Memory-Efficient

RepMet: Representative-Based Metric Learning for Classification and Few-Shot Object Detection

by: QIUYUE

Object Detection Distance Metric Learning few-shot learning

Extreme Relative Pose Estimation for RGB-D Scans via Scene Completion

by: Takuma Yagi

relative pose estimation scene completion spectral matching

Ranked List Loss for Deep Metric Learning

by: QIUYUE

Deep Metric Learning (DML) Ranking-motivated structured loss function

CANet: Class-Agnostic Segmentation Networks With Iterative Refinement and Attentive Few-Shot Learning

by: QIUYUE

Semantic Segmentation few-shot learning

R3 Adversarial Network for Cross Model Face Recognition

by: Shintaro Yamamoto

feature transformation

Co-Occurrent Features in Semantic Segmentation

by: ERLYN MANGUILIMOTAN

semantic segmentation co-occurrent features

PartNet: A Large-Scale Benchmark for Fine-Grained and Hierarchical Part-Level 3D Object Understanding

by: Hiromasa Sakata

Deep Spherical Quantization for Image Search

by: Anonymous

image search quantization sparse coding

Learning Joint Reconstruction of Hands and Manipulated Objects

by: Anonymous

hand-object manipulation reconstruction

PifPaf: Composite Fields for Human Pose Estimation

by: yasud

Human Pose Estimation pose estimation HPE

Fully Learnable Group Convolution for Acceleration of Deep Neural Networks

by: takumuikeya

acceleration DNN

Local Temporal Bilinear Pooling for Fine-Grained Action Parsing

by: GOTO Keita

Temporal Action Segmentation Bilinear Pooling

Point-To-Pose Voting Based Hand Pose Estimation Using Residual Permutation Equivariant Layer

by: yasud

3D Hand Pose Estimation Point Net

Divergence Triangle for Joint Training of Generator Model, Energy-Based Model, and Inferential Model

by: takeshi miura

Learning Single-Image Depth From Videos Using Quality Assessment Networks

by: Anonymous

SfM

Sliced Wasserstein Generative Models

by: asato matsumoto

image generation AE GAN Wasserstein Distribution

Deep Flow-Guided Video Inpainting

by: asato matsumoto

Video Inpainting Optical Flow Field

Video Generation From Single Semantic Label Map

by: asato matsumoto

image-to-video Video Generation Semantic

Cross-Modal Relationship Inference for Grounding Referring Expressions

by: Keito Ishihara

grounding Cross-Modal

CLEVR-Ref+: Diagnosing Visual Reasoning With Referring Expressions

by: Keito Ishihara

dataset Visual Reasoning

Precise Detection in Densely Packed Scenes

by: QIUYUE

Precise Object Detection Densely Packed Scenes

Recurrent Neural Network for (Un-)Supervised Learning of Monocular Video Visual Odometry and Depth

by: QIUYUE

Depth Estimation Odometry Estimation Recurrent Neural Network

The Perfect Match: 3D Point Cloud Matching With Smoothed Densities

by: QIUYUE

Point Cloud Matching

Efficient Multi-Domain Learning by Covariance Normalization

by: NSD

Multi Domain Learning

Fast User-Guided Video Object Segmentation by Interaction-And-Propagation Networks

by: QIUYUE

Video Object Segmentation Interactive

LAEO-Net: Revisiting People Looking at Each Other in Videos

by: kubo.takahiro

Video Face

Robust Facial Landmark Detection via Occlusion-Adaptive Deep Networks

by: kubo.takahiro

Face

Learning Individual Styles of Conversational Gesture

by: kubo.takahiro

GestureGeneration Multimodal

Object Counting and Instance Segmentation With Image-Level Supervision

by: Eisuke Yamagata

common object counting density map

CRAVES: Controlling Robotic Arm With a Vision-Based Economic System

by: Masaki Miyamoto

robot arm 3D pose estimation key point reinforcement robot

Coordinate-Based Texture Inpainting for Pose-Guided Human Image Generation

by: yasud

Hourglass Deep Fashion Coordinate-Based resynthesis

Learning to Film From Professional Human Motion Videos

by: Masaki Miyamoto

film drone ConvLSTM planning motion DOF

How to Make a Pizza: Learning a Compositional Layer-Based GAN Model

by: shirouchi satoshi

Pizza GAN

UniformFace: Learning Deep Equidistributed Representation for Face Recognition

by: maiki.okura

FaceVerification FaceIdentification Loss

Generalizing Eye Tracking With Bayesian Adversarial Learning

by: GOTO Keita

Gaze Estimation Eye Tracking Bayesian Adversarial Learning

Learning to Learn Image Classifiers With Visual Analogy

by: Yuta Nakamura

graph embedding visual analogy few-shot learning

Spatial Fusion GAN for Image Synthesis

by: Kazuma_Asano

GAN Spatial Fusion Image Synthesis

Efficient Neural Network Compression

by: Kiro Otsu

3D CNN human identification multi-spectral image

Part-Regularized Near-Duplicate Vehicle Re-Identification

by: Masaki Miyamoto

Near-duplicate vehicle re-identification part-regularized framework

Spatial-Aware Graph Relation Network for Large-Scale Object Detection

by: takumuikeya

object detection detection

Spherical Fractal Convolutional Neural Networks for Point Cloud Recognition

by: Daisuke Makino

Generating Multiple Hypotheses for 3D Human Pose Estimation With Mixture Density Network

by: Tsubura Kazuki

3D pose estimation

Recurrent Back-Projection Network for Video Super-Resolution

by: Masaki Miyamoto

SR RBPN RNN Super-Resolution video VSR

A Mutual Learning Method for Salient Object Detection With Intertwined Multi-Supervision

by: Munetaka Minoguchi

Salient Object Detection Contour Detection Edge Detection

Show, Control and Tell: A Framework for Generating Controllable and Grounded Captions

by: Munetaka Minoguchi

Image Captioning

Texture Mixer: A Network for Controllable Synthesis and Interpolation of Texture

by: yasud

Texture GAN

Dissimilarity Coefficient Based Weakly Supervised Object Detection

by: Tomoki Tanimura

object detection weakly supervised conditional distribution discrete generative model

HetConv: Heterogeneous Kernel-Based Convolutions for Deep CNNs

by: Ryota Nishijima

convolution CNN architecture

Detect-To-Retrieve: Efficient Regional Aggregation for Image Search

by: Ryota Nishijima

image retrieval image search

Transferable Interactiveness Knowledge for Human-Object Interaction Detection

by: Masaki Taniguchi

HOI Detection Interactiveness Network Spatial Map

Fast Interactive Object Annotation With Curve-GCN

by: QIUYUE

Interactive Object Annotation Graph Convolutional Network

TransGaGa: Geometry-Aware Unsupervised Image-To-Image Translation

by: shirouchi satoshi

Unsupervised image-to-image translation

FickleNet: Weakly and Semi-Supervised Semantic Image Segmentation Using Stochastic Inference

by: QIUYUE

Weakly Supervised Learning Semi-supervised Learning Semantic Segmentation

Robustness via Curvature Regularization, and Vice Versa

by: Yoshihiro Fukuhara

adversarial examples adversarial training adversarial robustness geometry curvature

RVOS: End-To-End Recurrent Network for Video Object Segmentation

by: QIUYUE

Video Object Segmentation

Disentangling Latent Hands for Image Synthesis and Pose Estimation

by: Shintaro Yamamoto

hand pose estimation disentangled representation

Actional-Structural Graph Convolutional Networks for Skeleton-Based Action Recognition

by: Masaki Taniguchi

Skeleton-Based Action Recognition Graph Convolutional Network

4D Spatio-Temporal ConvNets: Minkowski Convolutional Neural Networks

by: Hideki Tsunashima

3D Video point cloud semantic segmentation SOTA LIDAR ScanNet Stanford 3D Indoor Spaces(S3DIS) RueMonge 2014(Varcity) Synthia 4D

3D Point Capsule Networks

by: Hiromasa Sakata

point-clouds

Towards VQA Models That Can Read

by: siida

Visual QA VQA QA multimodal

Content-Aware Multi-Level Guidance for Interactive Instance Segmentation

by: Yuta Nakamura

segmentation interactive instance segmentation superpixel FCN

ADVENT: Adversarial Entropy Minimization for Domain Adaptation in Semantic Segmentation

by: Anonymous

Learning Independent Object Motion From Unlabelled Stereoscopic Videos

by: Anonymous

scene flow prediction

Multi-Channel Attention Selection GAN With Cascaded Semantic Guidance for Cross-View Image Translation

by: Anonymous

Large Scale Incremental Learning

by: cfiken

Incremental Learning Catastrophic Forgetting

Re-Ranking via Metric Fusion for Object Retrieval and Person Re-Identification

by: ERLYN MANGUILIMOTAN

diffusion methods re-ranking retrieval

Dissecting Person Re-Identification From the Viewpoint of Viewpoint

by: Takanori Ebihara

dataset augmentation

Viewport Proposal CNN for 360° Video Quality Assessment

by: Shintaro Yamamoto

360° video video quality assessment

Leveraging the Invariant Side of Generative Zero-Shot Learning

by: Shuhei M Yoshida

zero-shot learning GAN soul sample regularization

P2SGrad: Refined Gradients for Optimizing Deep Face Models

by: Shintaro Yamamoto

face recognition gradient

A-CNN: Annularly Convolutional Neural Networks on Point Clouds

by: Shuhei M Yoshida

point cloud annular convolution

Amodal Instance Segmentation With KINS Dataset

by: maokura

Segmentation dataset Mask R-CNN AmodalSegmentation

Video Action Transformer Network

by: Hirokatsu Kataoka

Action Localization ActionRecognition Self-attention

MARS: Motion-Augmented RGB Stream for Action Recognition

by: Hirokatsu Kataoka

Action Recognition Video Recognition Knowledge Distillation Optical Flow

Face Anti-Spoofing: Model Matters, so Does Data

by: kubo.takahiro

Face

Fast Human Pose Estimation

by: kubo.takahiro

PoseEstimation

Learning Unsupervised Video Object Segmentation Through Visual Attention

by: Kotaro Kitayama

UVOS dataset unsupervised

Decorrelated Adversarial Learning for Age-Invariant Face Recognition

by: kubo.takahiro

Face

Segmentation-Driven 6D Object Pose Estimation

by: Kotaro Kitayama

pose-estimation

Pointing Novel Objects in Image Captioning

by: Eisuke Yamagata

novel object image captioning

VERI-Wild: A Large Dataset and a New Method for Vehicle Re-Identification in the Wild

by: Kotaro Kitayama

dataset VERI-Wild

Decoupling Direction and Norm for Efficient Gradient-Based L2 Adversarial Attacks and Defenses

by: Motokawa Tetsuya

AEs

Cross-Task Weakly Supervised Learning From Instructional Videos

by: kubo.takahiro

ActionDetection

A General and Adaptive Robust Loss Function

by: Motokawa Tetsuya

Adaptive loss function

DeepFlux for Skeletons in the Wild

by: QIUYUE

Skeleton Detection

D3TW: Discriminative Differentiable Dynamic Time Warping for Weakly Supervised Action Alignment and Segmentation

by: kubo.takahiro

ActionDetection

Progressive Teacher-Student Learning for Early Action Prediction

by: kubo.takahiro

ActionPrediction

Depth-Attentional Features for Single-Image Rain Removal

by: shirouchi satoshi

Depth-attentional Features Single-image Rain Removal

Monocular 3D Object Detection Leveraging Accurate Proposals and Shape Reconstruction

by: Tenga Wakamiya

3D object detection

Multi-Granularity Generator for Temporal Action Proposal

by: Masaki Taniguchi

Temporal Action Proposal action recognition

Interactive Image Segmentation via Backpropagating Refinement Scheme

by: QIUYUE

Interactive Image Segmentation

Polarimetric Camera Calibration Using an LCD Monitor

by: asato matsumoto

Calibration LCD Monitor CRF

Scene Parsing via Integrated Classification Model and Variance-Based Regularization

by: QIUYUE

Scene Parsing

RAVEN: A Dataset for Relational and Analogical Visual REasoNing

by: QIUYUE

Dataset Visual Reasoning Visual Question Answering

Doodle to Search: Practical Zero-Shot Sketch-Based Image Retrieval

by: Anonymous

CrossInfoNet: Multi-Task Information Sharing Based Hand Pose Estimation

by: Tsubura Kazuki

hand pose estimation heat-map depth

Analysis of Feature Visibility in Non-Line-Of-Sight Measurements

by: Tsubura Kazuki

Non-Line-of-Sight Measurements

Learning to Cluster Faces on an Affinity Graph

by: Keita Yanome

Learning to Cluster Faces on an Affinity Graph

by: Anonymous

Spectral Metric for Dataset Complexity Assessment

by: Hideki Tsunashima

Dataset Complexity c-measure dataset reduction

Surface Reconstruction From Normals: A Robust DGP-Based Discontinuity Preservation Approach

by: QIUYUE

3D surface reconstruction surface reconstruction from normals

DARNet: Deep Active Ray Network for Building Segmentation

by: Shuhei M Yoshida

segmentation active ray DARNet

Beyond Gradient Descent for Regularized Segmentation Losses

by: Tsubura Kazuki

Gradient descent Loss Segmentation

Multi-Adversarial Discriminative Deep Domain Generalization for Face Presentation Attack Detection

by: Sou Uchida

Domain Generalization Anti-spoofing

Enhancing TripleGAN for Semi-Supervised Conditional Instance Synthesis and Classification

by: Tsubura Kazuki

GAN TripleGAN Semi-Supervised Classification Instance Synthesis

Privacy Preserving Image-Based Localization

by: QIUYUE

Image-based localization Privacy Preserving

Cascaded Generative and Discriminative Learning for Microcalcification Detection in Breast Mammograms

by: Anonymous

A Variational EM Framework With Adaptive Edge Selection for Blind Motion Deblurring

by: Tsubura Kazuki

blind motion deblurring Bayes inference

FilterReg: Robust and Efficient Probabilistic Point-Set Registration Using Gaussian Filter and Twist Parameterization

by: neka-nat

Variational Convolutional Neural Network Pruning

by: hiroki iida

Learning Metrics From Teachers: Compact Networks for Image Embedding

by: Anonymous

Metric learning network distillation

MAGSAC: Marginalizing Sample Consensus

by: Sou Uchida

Sample Consensus

Fully Automatic Video Colorization With Self-Regularization and Diversity

by: asato matsumoto

Video Colorization Self-Regularization Diversity

SDRSAC: Semidefinite-Based Randomized Approach for Robust Point Cloud Registration Without Correspondences

by: neka-nat

Nesti-Net: Normal Estimation for Unstructured 3D Point Clouds Using Convolutional Neural Networks

by: Sou Uchida

Point Cloud Normal Estimation Unstructured Point Cloud

Normalized Object Coordinate Space for Category-Level 6D Object Pose and Size Estimation

by: neka-nat

Zoom to Learn, Learn to Zoom

by: asato matsumoto

Zoom Super-Resolution Raw sensor data sensor data

Single Image Reflection Removal Beyond Linearity

by: asato matsumoto

Reflection Removal non-linearity synthesize

Re-Identification With Consistent Attentive Siamese Networks

by: Anonymous

Consistent Attentive Siamese Network Attention

Deep Rigid Instance Scene Flow

by: Masaki Taniguchi

Scene Flow Optical Flow Stereo

Learning to Separate Multiple Illuminants in a Single Image

by: asato matsumoto

Separate Multiple Illuminants reflectance chromaticity shading chromaticity Separated Images

Graphonomy: Universal Human Parsing via Graph Transfer Learning

by: Shion Honda

semantic segmentation human parsing graph neural networks

Explicit Spatial Encoding for Deep Local Descriptors

by: uchi_k

Shape Unicode: A Unified Shape Representation

by: asato matsumoto

3D Shape Representation Voxel Point Cloud Multi View Auto Encoder

Point Cloud Oversegmentation With Graph-Structured Deep Metric Learning

by: Shuhei M Yoshida

3D point cloud oversegmentation metric learning

Learning-Based Sampling for Natural Image Matting

by: maokura

ImageMatting inpainting image restoration

Deep High-Resolution Representation Learning for Human Pose Estimation

by: yasud

Human Pose Estimation pose estimation HRNet

Robust Video Stabilization by Optimization in CNN Weight Space

by: asato matsumoto

Video Stabilization Optimization CNN Weight Space Optical Flow

Weakly Supervised Complementary Parts Models for Fine-Grained Image Classification From the Bottom Up

by: maokura

Classification fine-grained CRF CAM MaskR-CNN

Networks for Joint Affine and Non-Parametric Image Registration

by: Masaki Miyamoto

end-to-end framework medical registration vSVF affine 3D non-parametric

Learning Linear Transformations for Fast Image and Video Style Transfer

by: asato matsumoto

Style Transfer Image Video SPN data-driven

Variational Information Distillation for Knowledge Transfer

by: Tomoki Tsujimura

transfer learning distillation

Jumping Manifolds: Geometry Aware Dense Non-Rigid Structure From Motion

by: QIUYUE

Non-rigid 3D Shape Reconstruction Grassmann manifold

What Does It Mean to Learn in Deep Networks? And, How Does One Detect Adversarial Attacks?

by: Yoshihro Fukuhara

interpretability explainability generalization adversarial attack

Unsupervised Open Domain Recognition by Semantic Discrepancy Minimization

by: ERLYN MANGUILIMOTAN

unsupervised domain adaptation gcn

LVIS: A Dataset for Large Vocabulary Instance Segmentation

by: QIUYUE

Instance Segmentation Large Scale Dataset

LaSOT: A High-Quality Benchmark for Large-Scale Single Object Tracking

by: QIUYUE

Single Object Tracking Large Scale Dataset

Local Detection of Stereo Occlusion Boundaries

by: asato matsumoto

Local Detection Stereo Occlusion Boundaries Small NN

Bi-Directional Cascade Network for Perceptual Edge Detection

by: asato matsumoto

Edge Detection dilated convolution individual layer Bi-Directional Cascade multi-scale

C3AE: Exploring the Limits of Compact Model for Age Estimation

by: Eisuke Yamagata

age estimation

Local Relationship Learning With Person-Specific Shape Regularization for Facial Action Unit Detection

by: GOTO Keita

Action Unit Detection Regularization

Single Image Deraining: A Comprehensive Benchmark Analysis

by: asato matsumoto

De-Raining Dataset Detection Car Analysis

Aggregation Cross-Entropy for Sequence Recognition

by: Ryota Suzuki

sequence recognition

LaSO: Label-Set Operations Networks for Multi-Label Few-Shot Learning

by: Ryota Suzuki

ADCrowdNet: An Attention-Injective Deformable Convolutional Network for Crowd Understanding

by: Hideki Tsunashima

Crowd Crowd understanding Attention SOTA ShanghaiTech dataset UCF_CC_50 dataset WorldExpo'10 dataset UCSD dataset

Ray-Space Projection Model for Light Field Camera

by: Sou Uchida

Light Field Camera Camera Calibration Plucker Coordinates

Few-Shot Learning With Localization in Realistic Settings

by: Ryota Suzuki

few-shot learning

Dynamic Scene Deblurring With Parameter Selective Sharing and Nested Skip Connections

by: asato matsumoto

Deblurring Parameter Sharing Skip Connection Dataset

Learning With Batch-Wise Optimal Transport Loss for 3D Shape Recognition

by: Hideki Tsunashima

metric learning 3D shape recognition 3D shape retrieval classification MNIST CIFAR-10 SHREC13 SHREC14 ModelNet10 ModelNet40 metric

Deep Geometric Prior for Surface Reconstruction

by: Sou Uchida

Geometric Prior Point Cloud Surface Reconstruction

Weakly Supervised Person Re-Identification

by: ERLYN MANGUILIMOTAN

weakly supervised person re-id

Co-Saliency Detection via Mask-Guided Fully Convolutional Networks With Multi-Scale Label Smoothing

by: maokura

SaliencyDetection MaskLearning

Scene Categorization From Contours: Medial Axis Based Salience Measures

by: Masaki Miyamoto

scene categorization CNN line classification medial-axis contour

A Kernelized Manifold Mapping to Diminish the Effect of Adversarial Perturbations

by: Yusuke Mori

adversarial perturbations Mahalanobis-like distance radial basis function

Domain Generalization by Solving Jigsaw Puzzles

by: Kensho Hara

Domain Generalization Self-supervised Learning

Pay Attention! - Robustifying a Deep Visuomotor Policy Through Task-Focused Visual Attention

by: Masaki Miyamoto

visuomotor manipulator policy TFA attention visual natural language

DuLa-Net: A Dual-Projection Network for Estimating Room Layouts From a Single RGB Panorama

by: Hideki Tsunashima

single RGB panorama room layout E2P 3D room layout Realtor360 SOTA

AdaGraph: Unifying Predictive and Continuous Domain Adaptation Through Graphs

by: Ryota Suzuki

domain adaptation graph

A Neural Temporal Model for Human Motion Prediction

by: yasud

Human Motion Prediction evaluation metric

SDC - Stacked Dilated Convolution: A Unified Descriptor Network for Dense Matching Tasks

by: Kensho Hara

Descriptor Dense Matching

SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines

by: maokura

dataset synthetic VideoSegmentation AmodalSegmentation MaskR-CNN

ContextDesc: Local Descriptor Augmentation With Cross-Modality Context

by: Kensho Hara

Local Descriptor

Deep Exemplar-Based Video Colorization

by: shirouchi satoshi

Video Colorization

C2AE: Class Conditioned Auto-Encoder for Open-Set Recognition

by: Kensho Hara

Open-set Recognition Auto Encoder

Creative Flow+ Dataset

by: QIUYUE

Large Scale Dataset Optical Flow Messy Stylized Content

Textured Neural Avatars

by: Kensho Hara

Renderer Graphics

Content Authentication for Neural Imaging Pipelines: End-To-End Optimization of Photo Provenance in Complex Distribution Channels

by: Takeru Suda

DLOW: Domain Flow for Adaptation and Generalization

by: Kensho Hara

Domain Adaptation Style Transfer

Weakly Supervised Open-Set Domain Adaptation by Dual-Domain Collaboration

by: QIUYUE

Domain Adaptation Weakly Supervised Learning Open-set

3D Human Pose Estimation in Video With Temporal Convolutions and Semi-Supervised Training

by: yasud

3D Human Pose Estimation pose estimation dilated convolution

Shifting More Attention to Video Salient Object Detection

by: Takeru Suda

VSOD Attention object detection video

A Neurobiological Evaluation Metric for Neural Network Model Search

by: QIUYUE

fMRI(functional Magnetic Resonance Imaging) Neuroscience

Attribute-Aware Face Aging With Wavelet-Based Generative Adversarial Networks

by: Tenga Wakamiya

Iterative Projection and Matching: Finding Structure-Preserving Representatives and Its Application to Computer Vision

by: QIUYUE

Data Selection Algorithm

See More, Know More: Unsupervised Video Object Segmentation With Co-Attention Siamese Networks

by: Masaki Taniguchi

Unsupervised Video Object Segmentation Co-attention COSNet subject extraction

PointRCNN: 3D Object Proposal Generation and Detection From Point Cloud

by: ERLYN MANGUILIMOTAN

3D object detection

Fitting Multiple Heterogeneous Models by Multi-Class Cascaded T-Linkage

by: Shuhei M Yoshida

model fitting T-linkage

Self-Supervised Spatio-Temporal Representation Learning for Videos by Predicting Motion and Appearance Statistics

by: Keito Ishihara

Video self supervised CNN

A Late Fusion CNN for Digital Matting

by: Shuhei M Yoshida

alpha matting FCN

Leveraging Crowdsourced GPS Data for Road Extraction From Aerial Imagery

by: Shuhei M Yoshida

GPS aerial imagery

Residual Networks for Light Field Image Super-Resolution

by: Yoitsu Takahashi

Light Field Image Super-Resolution resLF

Automatic Adaptation of Object Detectors to New Domains Using Self-Training

by: ERLYN MANGUILIMOTAN

domain adaptation self-training unsupervised

Fast Single Image Reflection Suppression via Convex Optimization

by: Anonymous

Convex Optimization

RF-Net: An End-To-End Image Matching Network Based on Receptive Field

by: Anonymous

Image Matching

Towards Real Scene Super-Resolution With Raw Images

by: Kobayashi Koga

Informative Object Annotations: Tell Me Something I Don't Know

by: Kiro Otsu

Predicting Visible Image Differences Under Varying Display Brightness and Viewing Distance

by: NSD

Image Difference

Compressing Unknown Images With Product Quantizer for Efficient Zero-Shot Classification

by: QIUYUE

Zero-Shot Learning Nearest Neighbor Search

Deep Sketch-Shape Hashing With Segmented 3D Stochastic Viewing

by: ERLYN MANGUILIMOTAN

Sketch-based 3D shape retrieval

Multi-Level Context Ultra-Aggregation for Stereo Matching

by: Hideki Tsunashima

stereo matching The Scene Flow datasets KITTI2015/2012 datasets PSM-Net SOTA

Self-Supervised Convolutional Subspace Clustering Network

by: QIUYUE

Self-supervised Learning Subspace Clustering

Multi-Scale Geometric Consistency Guided Multi-View Stereo

by: QIUYUE

Depth Map Estimation

Decoders Matter for Semantic Segmentation: Data-Dependent Decoding Enables Flexible Feature Aggregation

by: Hideki Tsunashima

semantic segmentation upsampling 1x1 convolution DUpsampling SOTA PASCAL VOC PASCAL Context Cityscapes

Recurrent MVSNet for High-Resolution Multi-View Stereo Depth Inference

by: QIUYUE

Multi-view Stereo Depth Map Estimation

Hierarchical Deep Stereo Matching on High-Resolution Images

by: QIUYUE

Stereo Matching High-resolution Images

Trust Region Based Adversarial Attack on Neural Networks

by: Yusuke Mori

adversarial attack trust region

SimulCap: Single-View Human Performance Capture With Cloth Simulation

by: QIUYUE

Human Performance Capture Cloth Simulation

Metric Learning for Image Registration

by: Anonymous

registration

Isospectralization, or How to Hear Shape, Style, and Correspondence

by: Shion Honda

semantic segmentation

World From Blur

by: Anonymous

Depth Estimation Camera Pose Estimation Blur 3D reconstruction

PointPillars: Fast Encoders for Object Detection From Point Clouds

by: Akihiro Matsufuji

Point Cloud Object Detection Convolutional Neural Networks LIDAR

Deep Defocus Map Estimation Using Domain Adaptation

by: GOTO Keita

Defocus Map Estimation Domain Adaptation

Sim-To-Real via Sim-To-Sim: Data-Efficient Robotic Grasping via Randomized-To-Canonical Adaptation Networks

by: Akihiro Matsufuji

Explicit Bias Discovery in Visual Question Answering Models

by: uchi_k

zero-shot ImageNet bias semi-automation

Object Detection With Location-Aware Deformable Convolution and Backward Attention Filtering

by: Tomoki Tanimura

object detection deformable convolution attention context information high resolution

End-To-End Efficient Representation Learning via Cascading Combinatorial Optimization

by: Yusuke Mori

quantizable representations hierarchically similarity-based search

An Alternative Deep Feature Approach to Line Level Keyword Spotting

by: Akihiro Matsufuji

Keyword Spotting Matching Convolutional Neural Networks

Representation Similarity Analysis for Efficient Task Taxonomy & Transfer Learning

by: Akihiro Matsufuji

Transfer Learning Relation Estimation

Residual Regression With Semantic Prior for Crowd Counting

by: Keito Ishihara

crowd counting

Good News, Everyone! Context Driven Entity-Aware Captioning for News Images

by: Akihiro Matsufuji

Image Captioning Convolutional Neural Networks Recurrent Neural Networks Natural Language Processing

Sim-Real Joint Reinforcement Transfer for 3D Indoor Navigation

by: Yusuke Mori

3D indoor navigation adversarial feature adaption policy mimic

Factor Graph Attention

by: kota yoshida

visual dialog attention mechanism

Learning Structure-And-Motion-Aware Rolling Shutter Correction

by: Naoya Chiba

Rolling Shutter RS SfM

SAIL-VOS: Semantic Amodal Instance Level Video Object Segmentation - A Synthetic Dataset and Baselines

by: maokura

InstanceSegmentation ActivationMap DifferentialFillingModule

Accel: A Corrective Fusion Network for Efficient Semantic Segmentation on Video

by: Takeru Suda

semantic segmentation video

T-Net: Parametrizing Fully Convolutional Nets With a Single High-Order Tensor

by: Shunsuke NAKATSUKA

Tensorization Network Compression

Multi-Person Articulated Tracking With Spatial and Temporal Embeddings

by: Anonymous

pose estimation pose tracking

A Theoretically Sound Upper Bound on the Triplet Loss for Improving the Efficiency of Deep Distance Metric Learning

by: Takaya Yamazoe

metric learning DML triplet loss clustering

Large-Scale, Metric Structure From Motion for Unordered Light Fields

by: maokura

LightFieldCamera StructureFromMotion

3D Hand Shape and Pose Estimation From a Single RGB Image

by: Yoitsu Takahashi

3D Hand Pose Estimation Single RGB image Graph CNN

Data Augmentation Using Learned Transformations for One-Shot Medical Image Segmentation

by: Mitani Tomohiro

segmentation medical image few-shot one-shot

Learning Video Representations From Correspondence Proposals

by: Keito Ishihara

video point cloud

Progressive Image Deraining Networks: A Better and Simpler Baseline

by: Keito Ishihara

ResNet RNN rain

Octree Guided CNN With Spherical Kernels for 3D Point Clouds

by: uchi_k

3D point cloud convolution deconvolution computational efficiency

Understanding the Limitations of CNN-Based Absolute Camera Pose Regression

by: maokura

VisualLocalization SfM SLAM PoseRegression

Learning to Explore Intrinsic Saliency for Stereoscopic Video

by: uchi_k

unsupervised learning 3D Generative Model

Elastic Boundary Projection for 3D Medical Image Segmentation

by: Mitani Tomohiro

3D segmentation medical image

Panoptic Segmentation

by: uchi_k

convolution spatial encoding local descriptor

Tightness-Aware Evaluation Protocol for Scene Text Detection

by: uchi_k

dataset CAD surface normal

Joint Face Detection and Facial Motion Retargeting for Multiple Faces

by: uchi_k

facial motion retargeting joint face detection

Octree Guided CNN With Spherical Kernels for 3D Point Clouds

by: uchi_k

3D point cloud spherical kernel CNN convolution

You Reap What You Sow: Using Videos to Generate High Precision Object Proposals for Weakly-Supervised Object Detection

by: uchi_k

weakly-supervised object detection selective search object mining

Learning to Adapt for Stereo

by: uchi_k

stereo depth estimation domain adaptation unsupervised online adaptation real-world

3D Hand Shape and Pose From Images in the Wild

by: Yoitsu Takahashi

3D Hand Pose Estimation Single RGB image E2E

Conditional Single-View Shape Generation for Multi-View Stereo Reconstruction

by: uchi_k

shape generation stereo reconstruction conditional gan

Self-Supervised 3D Hand Pose Estimation Through Training by Fitting

by: Yoitsu Takahashi

3D Hand Pose Estimation Self-supervision Depth Map

ODE-Inspired Network Design for Single Image Super-Resolution

by: Kobayashi Koga

Generative Dual Adversarial Network for Generalized Zero-Shot Learning

by: ERLYN MANGUILIMOTAN

generalized zero-shot learning GAN DualGAN

Fast Object Class Labelling via Speech

by: QIUYUE

Object Class Labelling Annotation Tool

Query-Guided End-To-End Person Search

by: ERLYN MANGUILIMOTAN

person search person re-id person detection

SFNet: Learning Object-Aware Semantic Correspondence

by: Anonymous

Deep Metric Learning Beyond Binary Supervision

by: Anonymous

Evading Defenses to Transferable Adversarial Examples by Translation-Invariant Attacks

by: Motokawa Tetsuya

DeepMapping: Unsupervised Map Estimation From Multiple Point Clouds

by: SohOhara

mapping registration unsupervised learning

Zero-Shot Task Transfer

by: Anonymous

Adversarial Attacks Beyond the Image Space

by: Motokawa Tetsuya

AEs CG

MUREL: Multimodal Relational Reasoning for Visual Question Answering

by: kota yoshida

VQA

Robust Histopathology Image Analysis: To Label or to Synthesize?

by: SohOhara

segmentation data augmentation histopathology image

Libra R-CNN: Towards Balanced Learning for Object Detection

by: ERLYN MANGUILIMOTAN

object detection training process imbalance

Blind Image Deblurring With Local Maximum Gradient Prior

by: kota yoshida

deblurring image prior

Image Deformation Meta-Networks for One-Shot Learning

by: SohOhara

data augmentation one-shot learning image deformation

ManTra-Net: Manipulation Tracing Network for Detection and Localization of Image Forgeries With Anomalous Features

by: Yuta Tokuoka

anomaly detection localization image forgeries anomalous feature

A Style-Based Generator Architecture for Generative Adversarial Networks

by: Motokawa Tetsuya

GAN

Patch-Based Discriminative Feature Learning for Unsupervised Person Re-Identification

by: Masaki Taniguchi

Re-Identification Patch-Based person matching unsupervised learning

Learning a Unified Classifier Incrementally via Rebalancing

by: ERLYN MANGUILIMOTAN

incremental learning

Online High Rank Matrix Completion

by: SohOhara

matrix completion online method high rank matrix completion

From Coarse to Fine: Robust Hierarchical Localization at Large Scale

by: Katsuya Shimabukuro

Visual Localization Hierarchical Localization

Large-Scale Distributed Second-Order Optimization Using Kronecker-Factored Approximate Curvature for Deep Convolutional Neural Networks

by: Motokawa Tetsuya

Second-order-Optimizer Distributed

Progressive Attention Memory Network for Movie Story Question Answering

by: siida

MovieQA multimodal multi encoder multi source

The Pros and Cons: Rank-Aware Temporal Attention for Skill Determination in Long Videos

by: Shunsuke NAKATSUKA

Skill Determination Video Attention

Memory-Attended Recurrent Network for Video Captioning

by: siida

Video captioning Image captioning RNN multi decoder

SPM-Tracker: Series-Parallel Matching for Real-Time Visual Object Tracking

by: Masaki Taniguchi

Visual Object Tracking object tracking

A Simple Baseline for Audio-Visual Scene-Aware Dialog

by: Katsuya Shimabukuro

Audio-Visual Scene Aware Dialog Dialog Multimodal

TextureNet: Consistent Local Parametrizations for Learning From High-Resolution Signals on Meshes

by: Naoya Chiba

4-RoSy Semantic Segmentation Texture

Transfer Learning via Unsupervised Task Discovery for Visual Question Answering

by: siida

Visual QA VQA OOV out-of-vocabulary transfer learning unsupervised learning

CollaGAN: Collaborative GAN for Missing Image Data Imputation

by: Kyota Masuyama

GAN StarGAN CycleGAN Image imputation

Spatio-Temporal Dynamics and Semantic Attribute Enriched Visual Encoding for Video Captioning

by: Katsuya Shimabukuro

Video Captioning

A Skeleton-Bridged Deep Learning Approach for Generating Meshes of Complex Topologies From Single RGB Images

by: Naoya Chiba

Skeleton 3D Reconstruction

Look Back and Predict Forward in Image Captioning

by: siida

Image Captioning context coherence RNN

Joint Manifold Diffusion for Combining Predictions on Decoupled Observations

by: Shuhei M Yoshida

manifold denoising

Generalized Zero-Shot Recognition Based on Visually Semantic Embedding

by: hiroki iida

Learning to Minify Photometric Stereo

by: Shuhei M Yoshida

photometric stereo

Noise-Tolerant Paradigm for Training Face Recognition CNNs

by: Tenga Wakamiya

face recognition

Low-Rank Laplacian-Uniform Mixed Model for Robust Face Recognition

by: Tenga Wakamiya

face recognition

Text Guided Person Image Synthesis

by: Masaki Taniguchi

GAN NLP Person Image Synthesis

Reflective and Fluorescent Separation Under Narrow-Band Illumination

by: Shuhei M Yoshida

reflective and fluorescent separation

Multi-Level Multimodal Common Semantic Space for Image-Phrase Grounding

by: Katsuya Shimabukuro

Phrase Grounding Weakly Supervised Learning Multimodal

Dense Depth Posterior (DDP) From Single Image and Sparse Range

by: maokura

Depth Map KITTI Point Cloud

Intention Oriented Image Captions With Guiding Objects

by: siida

Image Captioning multimodal

Greedy Structure Learning of Hierarchical Compositional Models

by: Tenga Wakamiya

Hierarchical Compositional Models

Object-Aware Aggregation With Bidirectional Temporal Graph for Video Captioning

by: siida

Video Captioning multi encoder hierarchical attention multimodal

Feature Selective Anchor-Free Module for Single-Shot Object Detection

by: ERLYN MANGUILIMOTAN

single-shot object detection

Efficient Decision-Based Black-Box Adversarial Attacks on Face Recognition

by: Takahiro Itazuri

adversarial examples black-box attack face recognition

Visual Query Answering by Entity-Attribute Graph Matching and Reasoning

by: siida

Visual QA VQA

Single Image Depth Estimation Trained via Depth From Defocus Cues

by: Takahiro Itazuri

depth estimation depth from defocus unsupervised

RGBD Based Dimensional Decomposition Residual Network for 3D Semantic Scene Completion

by: Takahiro Itazuri

3D Semantic Scene Completion 3D Shape Completion Dimensional Decomposition Residual block

Synthesizing 3D Shapes From Silhouette Image Collections Using Multi-Projection Generative Adversarial Networks

by: NSD

GAN MP-GAN

Video Summarization by Learning From Unpaired Data

by: Shunsuke NAKATSUKA

Video Summarization

Neural Scene Decomposition for Multi-Person Motion Capture

by: Takahiro Itazuri

Neural Scene Decomposition

Bottom-Up Object Detection by Grouping Extreme and Center Points

by: ERLYN MANGUILIMOTAN

object detection bottom-up approach bounding box

Multi-Person Pose Estimation With Enhanced Channel-Wise and Spatial Information

by: Anonymous

multi pose estimation CSM SCARB

FA-RPN: Floating Region Proposals for Face Detection

by: Takahiro itazuri

face detection floating anchor region proposal network FA-RPN

A Bayesian Perspective on the Deep Image Prior

by: Rei Tamaru

Bayes Deep Image Prior stochastic gradient Langevin dynamics Gaussian Process

In Defense of Pre-Trained ImageNet Architectures for Real-Time Semantic Segmentation of Road-Driving Images

by: Eisuke Yamagata

semantic segmentation

Holistic and Comprehensive Annotation of Clinically Significant Findings on Diverse CT Images: Learning From Radiology Reports and Label Ontology

by: Mitani Tomohiro

medical image classification lesion annotation

Explainable and Explicit Visual Reasoning Over Scene Graphs

by: siida

Visual QA NMN Neural Module Networks Scene Graphs

Local Features and Visual Words Emerge in Activations

by: Katsuya Shimabukuro

Image Retrieval Local Feature Detection Local Feature Matching Reranking

Uncertainty Guided Multi-Scale Residual Learning-Using a Cycle Spinning CNN for Single Image De-Raining

by: siida

Image de-raining

Grounded Video Description

by: Ryota Suzuki

video description

Leveraging Heterogeneous Auxiliary Tasks to Assist Crowd Counting

by: Eisuke Yamagata

crowd counting density estimation

Unsupervised Deep Epipolar Flow for Stationary or Dynamic Scenes

by: GOTO Keita

Optical Flow unsupervised learning Epipolar Geometry

Bringing Alive Blurred Moments

by: Ryota Suzuki

motion reconstruction deblurring

Bringing a Blurry Frame Alive at High Frame-Rate With an Event Camera

by: Ryota Suzuki

event-based camera deblurring

End-To-End Projector Photometric Compensation

by: Ryota Suzuki

projector compensation projector-camera

Incremental Object Learning From Contiguous Views

by: SohOhara

incremental learning learning environment baby infants

Spatial Attentive Single-Image Deraining With a High Quality Real Rain Dataset

by: GOTO Keita

Single-Image Deraining Deraining

SiCloPe: Silhouette-Based Clothed People

by: yasud

3D Reconstruction Silhouette visual hull

Toward Realistic Image Compositing With Adversarial Learning

by: siida

GAN Image Compositing

Binary Ensemble Neural Network: More Bits per Network or More Networks per Bit?

by: Ryota Nishijima

binary neural network ensemble

Direct Object Recognition Without Line-Of-Sight Using Optical Coherence

by: Tenga Wakamiya

coherent light object recognition

Feature Distillation: DNN-Oriented JPEG Compression Against Adversarial Examples

by: ERLYN MANGUILIMOTAN

JPEG Compression

Unsupervised Learning of Dense Shape Correspondence

by: Naoya Chiba

Shape Deformation Mesh Deformation Functional Maps Correspondence Estimation

GSPN: Generative Shape Proposal Network for 3D Instance Segmentation in Point Cloud

by: Masaki Miyamoto

instance segmentation proposal 3D GSPN framework

Detailed Human Shape Estimation From a Single Image by Hierarchical Mesh Deformation

by: Naoya Chiba

Human Shape Estimation Human Pose Estimation Mesh Deformation

Associatively Segmenting Instances and Semantics in Point Clouds

by: Masaki Miyamoto

segmentation 3D point cloud instance win-win

Streamlined Dense Video Captioning

by: QIUYUE

Dense Video Captioning

Unified Visual-Semantic Embeddings: Bridging Vision and Language With Structured Meaning Representations

by: QIUYUE

Visual-Semantic Embedding Image Captioning Contrastive Example Mining

Cycle-Consistency for Robust Visual Question Answering

by: QIUYUE

Visual Question Answering Robustness Cycle-Consistency

Embodied Question Answering in Photorealistic Environments With Point Cloud Perception

by: QIUYUE

Embodied Question Answering Point Cloud Visual Question Answering Matterport 3D

GQA: A New Dataset for Real-World Visual Reasoning and Compositional Question Answering

by: QIUYUE

Visual Question Answering Dataset Compositional Visual Reasoning

Variational Autoencoders Pursue PCA Directions (by Accident)

by: Kiro Otsu

LBS Autoencoder: Self-Supervised Fitting of Articulated Meshes to Point Clouds

by: yasud

autoencoder Structured Chamfer Distance self-supervised unsupervised

UnOS: Unified Unsupervised Optical-Flow and Stereo-Depth Estimation by Watching Videos

by: shirouchi satoshi

Unsupervised Optical-flow and Stereo-depth Estimation

Wide-Area Crowd Counting via Ground-Plane Density Maps and Multi-View Fusion CNNs

by: siida

Crowd Counting CNN

Box-Driven Class-Wise Region Masking and Filling Rate Guided Loss for Weakly Supervised Semantic Segmentation

by: Hideki Tsunashima

semantic segmentation PASCAL VOC 2012 SOTA weakly supervised learning bounding box

Fast Neural Architecture Search of Compact Semantic Segmentation Models via Auxiliary Cells

by: Kiro Otsu

NAS semantic segmentation Neural Architecture Search

Convolutional Relational Machine for Group Activity Recognition

by: Shunsuke NAKATSUKA

Group Activity Recognition Relation

Depth From a Polarisation + RGB Stereo Pair

by: Shuhei M Yoshida

polarization image

Timeception for Complex Action Recognition

by: cfiken

Video Action Recognition

Partial Order Pruning: For Best Speed/Accuracy Trade-Off in Neural Architecture Search

by: Kiro Otsu

PEPSI: Fast Image Inpainting With Parallel Decoding Network

by: Katsuya Shimabukuro

Inpainting Coarse-to-Fine

FBNet: Hardware-Aware Efficient ConvNet Design via Differentiable Neural Architecture Search

by: Kiro Otsu

NAS AutoML edge AI

Exploiting Temporal Context for 3D Human Pose Estimation in the Wild

by: mokura

Pose Estimation 3D Pose Video bundle adjustment

Hybrid Scene Compression for Visual Localization

by: Shuhei M Yoshida

MMFace: A Multi-Metric Regression Network for Unconstrained Face Reconstruction

by: Shuhei M Yoshida

face reconstruction

Bilateral Cyclic Constraint and Adaptive Regularization for Unsupervised Monocular Depth Prediction

by: Anonymous

depth prediction make3d

Semantic Graph Convolutional Networks for 3D Human Pose Regression

by: mokura

Graph Convolutional Network 3D Pose Estimation Regression

Reinforced Cross-Modal Matching and Self-Supervised Imitation Learning for Vision-Language Navigation

by: Yoshitaka Ushiku

Two Body Problem: Collaborative Visual Task Completion

by: QIUYUE

Vision-Language Navigation Embodied Two Agents Collaboration

Text2Scene: Generating Compositional Scenes From Textual Descriptions

by: QIUYUE

Image Generation from Text Sequence-to-Sequence

Tactical Rewind: Self-Correction via Backtracking in Vision-And-Language Navigation

by: QIUYUE

Vision-Language Navigation (VLN)

HPLFlowNet: Hierarchical Permutohedral Lattice FlowNet for Scene Flow Estimation on Large-Scale Point Clouds

by: Hideki Tsunashima

point cloud scene flow SLAM LIDAR Bilateral Convolutional Layers permutohedral lattice FlyingThings3D KITTI Scene Flow 2015

Actor-Critic Instance Segmentation

by: siida

Instance Segmentation reinforcement learning

Robust Subspace Clustering With Independent and Piecewise Identically Distributed Noise Modeling

by: takeshi miura

RobustSC I.p.i.d

Interactive Full Image Segmentation by Considering All Regions Jointly

by: Anonymous

segmentation interactive segmentation

Group-Wise Correlation Stereo Network

by: mokura

Stereo Matching PSMNet Depth Prediction

STGAN: A Unified Selective Transfer Network for Arbitrary Image Attribute Editing

by: Masaki Taniguchi

GAN Image Attribute Editing image translation

Hyperspectral Image Reconstruction Using a Deep Spatial-Spectral Prior

by: shirouchi satoshi

Hyperspectral Image Reconstruction

Learning to Learn How to Learn: Self-Adaptive Visual Navigation Using Meta-Learning

by: QIUYUE

Meta Learning Reinforcement Learning Visual Navigation

Dense Intrinsic Appearance Flow for Human Pose Transfer

by: Masaki Taniguchi

Human Pose Transfer GAN

GPSfM: Global Projective SFM Using Algebraic Constraints on Multi-View Fundamental Matrices

by: Hideki Tsunashima

SfM algebra SOTA

Improving Semantic Segmentation via Video Propagation and Label Relaxation

by: Takeru Suda

Semantic segmentation video Label Relaxation data augmentation

Depth-Aware Video Frame Interpolation

by: Masaki Taniguchi

Video Frame Interpolation frame interpolation

Towards Instance-Level Image-To-Image Translation

by: Masaki Taniguchi

Image-To-Image Translation image domain translation

Graph Convolutional Tracking

by: Naoya Chiba

Tracking Graph Convolution

LiFF: Light Field Features in Scale and Depth

by: shirouchi satoshi

Light Field Features

PoseFix: Model-Agnostic General Human Pose Refinement Network

by: Takahiro Itazuri

pose estimation pose refinement

Face-Focused Cross-Stream Network for Deception Detection in Videos

by: Takahiro Itazuri

automated deception detection cross-stream network

Fast and Robust Multi-Person 3D Pose Estimation From Multiple Views

by: Takahiro Itazuri

3D pose estimation

Learning to Explore Intrinsic Saliency for Stereoscopic Video

by: Tomoki Tanimura

saliency detection stereoscopic video dataset spatio-temporal depth

BASNet: Boundary-Aware Salient Object Detection

by: Anonymous

Fast, Diverse and Accurate Image Captioning Guided by Part-Of-Speech

by: Katsuya Shimabukuro

Image Captioning POS-tag

Learning a Deep ConvNet for Multi-Label Classification With Partial Labels

by: Shuhei M Yoshida

partial labels multi-label problem

Scan2Mesh: From Unstructured Range Scans to 3D Meshes

by: Anonymous

graph neural networks tsdf

Triangulation Learning Network: From Monocular to Stereo 3D Object Detection

by: Shion Honda

object detection 3D

Spherical Regression: Learning Viewpoints, Surface Normals and 3D Rotations on N-Spheres

by: Takuma Yagi

regression viewpoint estimation surface normal estimation 3d rotation estimation n-sphere

DeepLiDAR: Deep Surface Normal Guided Depth Prediction for Outdoor Scene From Sparse LiDAR Data and Single Color Image

by: mokura

depth prediction LiDAR attention

Minimal Solvers for Mini-Loop Closures in 3D Multi-Scan Alignment

by: Yuta Tokuoka

3D scan alignment registration

Deep Video Inpainting

by: rindybell

image inpainting video inpainting CNN DNN inpainting

Foreground-Aware Image Inpainting

by: rindybell

image inpainting contour extraction foreground

Learning Non-Volumetric Depth Fusion Using Successive Reprojections

by: Anonymous

DynTypo: Example-Based Dynamic Text Effects Transfer

by: rindybell

style transfer PatchMatch font

Stereo R-CNN Based 3D Object Detection for Autonomous Driving

by: Anonymous

Coloring With Limited Data: Few-Shot Colorization via Memory Augmented Networks

by: yasud

few-shot learning automatic colorization colorization memory network

GCAN: Graph Convolutional Adversarial Network for Unsupervised Domain Adaptation

by: Takehiko Ohkawa

Domain Adaptation Graph Convolution

Deep ChArUco: Dark ChArUco Marker Pose Estimation

by: Takehiko Ohkawa

ChArUco Detection Pose Estimation

On Finding Gray Pixels

by: shirouchi satoshi

grayness index

IM-Net for High Resolution Video Frame Interpolation

by: Kensho Hara

Video Frame Interpolation

Auto-DeepLab: Hierarchical Neural Architecture Search for Semantic Image Segmentation

by: Daiki Kimura

network architecture search

Convolutional Mesh Regression for Single-Image Human Shape Reconstruction

by: Naoya Chiba

Mesh Regression Human Shape Estimation Human Pose Graph CNN

VizWiz-Priv: A Dataset for Recognizing the Presence and Purpose of Private Visual Information in Images Taken by Blind People

by: Kensho Hara

Privacy

GFrames: Gradient-Based Local Reference Frame for 3D Shape Matching

by: Naoya Chiba

Local Reference Frame 3D Descriptor

R2GAN: Cross-Modal Recipe Retrieval With Generative Adversarial Network

by: Yukitaka Tsuchiya

GAN cross-modal retrieval

3D Shape Reconstruction From Images in the Frequency Domain

by: Naoya Chiba

3D Reconstruction Thickness Map Voxel

Neural RGB(r)D Sensing: Depth and Uncertainty From a Video Camera

by: Takuma Yagi

monocular depth estimation uncertainty bayesian filtering

Unequal-Training for Deep Face Recognition With Long-Tailed Noisy Data

by: Takahiro Itazuri

face recognition

Learning the Depths of Moving People by Watching Frozen People

by: Takuma Yagi

depth estimation motion parallax moving people multi-view stereo

Rules of the Road: Predicting Driving Behavior With a Convolutional Model of Semantic Interactions

by: Takuma Yagi

trajectory prediction driving behavior

Sphere Generative Adversarial Network Based on Geometric Moment Matching

by: Motokawa Tetsuya

GAN

Probabilistic End-To-End Noise Correction for Learning With Noisy Labels

by: Shuhei M Yoshida

noisy labels label correction

Parallel Optimal Transport GAN

by: Motokawa Tetsuya

GAN

PointWeb: Enhancing Local Neighborhood Features for Point Cloud Processing

by: Anonymous

kNN point cloud

Cross Domain Model Compression by Structurally Weight Sharing

by: Kiro Otsu

Network compression graph embedding weight sharing

From Recognition to Cognition: Visual Commonsense Reasoning

by: QIUYUE

Commonsense Reasoning Visual Question Answering

FineGAN: Unsupervised Hierarchical Disentanglement for Fine-Grained Object Generation and Discovery

by: Ryota Natsume

Modeling Point Clouds With Self-Attention and Gumbel Subset Sampling

by: Hideki Tsunashima

point cloud ModelNet40 Stanford Large-Scale 3D Indoor Spaces Dataset(S3DIS) DVS128 Gesture Dataset PAT GSA subset sampling PointCNN DGCNN

Pattern-Affinitive Propagation Across Depth, Surface Normal and Semantic Segmentation

by: Keito Ishihara

cross-task

Connecting the Dots: Learning Representations for Active Monocular Depth Estimation

by: Shion Honda

depth estimation monocular camera photometry geometry

Devil Is in the Edges: Learning Semantic Boundaries From Noisy Annotations

by: Yukitaka Tsuchiya

semantic boundary segmentation

Radial Distortion Triangulation

by: Tomoki Tanimura

radial distortion triangulation Gröbner basis optimization

A Theory of Fermat Paths for Non-Line-Of-Sight Shape Reconstruction

by: Ryota Suzuki

Non-Line-of-Sight reconstruction transient measurement

SiamRPN++: Evolution of Siamese Visual Tracking With Very Deep Networks

by: Naoya Chiba

Visual Tracking Siamese Network Cross Correlation

Steady-State Non-Line-Of-Sight Imaging

by: Ryota Suzuki

Non-Line-of-Sight

Acoustic Non-Line-Of-Sight Imaging

by: Ryota Suzuki

Non-Line-of-Sight acoustic imaging

Less Is More: Learning Highlight Detection From Video Duration

by: Mitani Tomohiro

video highlight highlight detection FAIR weakly supervised learning

Deeper and Wider Siamese Networks for Real-Time Visual Tracking

by: Naoya Chiba

Siamese Tracker Visual Tracking

Learning Transformation Synchronization

by: shirouchi satoshi

Transformation Synchronization

Mutual Learning of Complementary Networks via Residual Correction for Improving Semi-Supervised Classification

by: Shuhei M Yoshida

semi-supervised learning mutual learning

Pixel-Adaptive Convolutional Neural Networks

by: Yoshiki

Adaptive filtering Dynamic Filter Networks Upsampling Conditional random field

Iterative Residual Refinement for Joint Optical Flow and Occlusion Estimation

by: Yoshiki

Optical flow Occlusion estimation Iterative refinement

An End-To-End Network for Panoptic Segmentation

by: Ryuta Shitomi

Panoptic Segmentation Semantic Segmentation

Recursive Visual Attention in Visual Dialog

by: Ryota Natsume

Reasoning Visual Dialogs With Structural and Partial Observations

by: Ryota Natsume

Adversarial Inference for Multi-Sentence Video Description

by: Ryota Natsume

Generalising Fine-Grained Sketch-Based Image Retrieval

by: yasud

FG-SBIR

Actively Seeking and Learning From Live Data

by: QIUYUE

Visual Question Answering Meta Learning

The Regretful Agent: Heuristic-Aided Navigation Through Progress Estimation

by: QIUYUE

Vision-Language Navigation Self-monitoring Mechanism

Visual Tracking via Adaptive Spatially-Regularized Correlation Filters

by: Naoya Chiba

Visual Tracking Correlation Filter ADMM

SelFlow: Self-Supervised Learning of Optical Flow

by: Naoya Chiba

Optical Flow Semi-supervised Occlusion

EIGEN: Ecologically-Inspired GENetic Approach for Neural Network Structure Searching From Scratch

by: Kiro Otsu

NAS genetic approach genetic algorithm automl

MnasNet: Platform-Aware Neural Architecture Search for Mobile

by: Kiro Otsu

NAS AutoML Edge AI

Iterative Residual CNNs for Burst Photography Applications

by: Anonymous

cnn burst photography

Dynamic Fusion With Intra- and Inter-Modality Attention Flow for Visual Question Answering

by: Keito Ishihara

VQA attention

Learning to Compose Dynamic Tree Structures for Visual Contexts

by: QIUYUE

Visual Reasoning Scene Graph Generation Visual Question Answering

Bounding Box Regression With Uncertainty for Accurate Object Detection

by: Shuhei M Yoshida

object detection uncertainty

ChamNet: Towards Efficient Network Design Through Platform-Aware Model Adaptation

by: Yusuke Mori

platform-aware model adaptation adaptive genetic algorithm accuracy latency energy consumption

Regularizing Activation Distribution for Training Binarized Deep Networks

by: Yusuke Mori

Binarized Neural Network Activation Distribution Degeneration Saturation Gradient Mismatch

Why ReLU Networks Yield High-Confidence Predictions Far Away From the Training Data and How to Mitigate the Problem

by: Shuhei M Yoshida

theory ReLU adversarial training

Efficient Online Multi-Person 2D Pose Tracking With Recurrent Spatio-Temporal Affinity Fields

by: Naoya Chiba

Pose Estimation Part Affinity Field Human Pose Estimation

Selective Sensor Fusion for Neural Visual-Inertial Odometry

by: Takaya Yamazoe

Sensor VIO robust

2.5D Visual Sound

by: cfiken

2.5D Visual Sound audio visual source separation generating sound

Fast Spatio-Temporal Residual Network for Video Super-Resolution

by: Takaya Yamazoe

Super Resolution Video Residual Skip Connection

On the Structural Sensitivity of Deep Convolutional Networks to the Directions of Fourier Basis Functions

by: Shuhei M Yoshida

adversarial examples universal adversarial perturbations

Fast Spatio-Temporal Residual Network for Video Super-Resolution

by: Takaya Yamazoe

CT MAR Reconstruction Dual Domain Medical

Self-Supervised Representation Learning by Rotation Feature Decoupling

by: Takaya Yamazoe

Self supervised Rotation Representation Learning

Learning Implicit Fields for Generative Shape Modeling

by: Anonymous

gan 3d

Learning Spatial Common Sense With Geometry-Aware Recurrent Networks

by: QIUYUE

3D CNN mobile visual scene understanding

The Alignment of the Spheres: Globally-Optimal Spherical Mixture Alignment for Camera Pose Estimation

by: Yuta Nakamura

2D-3D correspondence position orientation cross-modality 3D model mixture model

Attention Branch Network: Learning of Attention Mechanism for Visual Explanation

by: Katsuya Shimabukuro

Visual Explanation Response-Based Visual Explanation Fine-Grained recognition Multi-Task Learning

Improving Referring Expression Grounding With Cross-Modal Attention-Guided Erasing

by: kota yoshida

Referring Expression attention

Multi-Source Weak Supervision for Saliency Detection

by: roy29fuku

saliency detection weak supervision

APDrawingGAN: Generating Artistic Portrait Drawings From Face Photos With Hierarchical GANs

by: Katsuya Shimabukuro

GAN Hierarchical GANs Portrait Drawings Face Photos

Polysemous Visual-Semantic Embedding for Cross-Modal Retrieval

by: kota yoshida

Visual-semantic embedding

Eliminating Exposure Bias and Metric Mismatch in Multiple Object Tracking

by: Naoya Chiba

Multiple Object Tracking MOT

Progressive Pose Attention Transfer for Person Image Generation

by: Kyota Masuyama

GAN Pose-Transfer Attention

Striking the Right Balance With Uncertainty

by: cfiken

Class Imbalanced Bayesian Uncertainty Estimates

WarpGAN: Automatic Caricature Generation

by: Katsuya Shimabukuro

GAN Style Transfer Caricature Warping

Deep Transfer Learning for Multiple Class Novelty Detection

by: Yuta Nakamura

novelty detection loss function transfer learning fine-tuning

Skeleton-Based Action Recognition With Directed Graph Neural Networks

by: Shunsuke NAKATSUKA

Action Recognition Graph Graph Convolution

An Iterative and Cooperative Top-Down and Bottom-Up Inference Network for Salient Object Detection

by: rindybell

object detection salient object detection

Graphical Contrastive Losses for Scene Graph Parsing

by: Yuta Nakamura

scene graph parsing loss function graph semantic relationship

Label-Noise Robust Generative Adversarial Networks

by: Kensho Hara

GAN Label Noise

Homomorphic Latent Space Interpolation for Unpaired Image-To-Image Translation

by: Kensho Hara

Image-to-Image Translation

Learning to Quantize Deep Networks by Optimizing Quantization Intervals With Task Loss

by: Naoya Chiba

Quantization low bit

Not All Areas Are Equal: Transfer Learning for Semantic Segmentation via Hierarchical Region Selection

by: Naoya Chiba

Semantic Segmentation Transfer Learning

Structured Binary Neural Networks for Accurate Image Classification and Semantic Segmentation

by: cfiken

BNN Binary Neural Networks Structure approximation

Probabilistic Permutation Synchronization Using the Riemannian Structure of the Birkhoff Polytope

by: Yukitaka Tsuchiya

CAM-Convs: Camera-Aware Multi-Scale Convolutions for Single-View Depth

by: Anonymous

depth estimation camera parameter

Robustness Verification of Classification Deep Neural Networks via Linear Programming

by: Yusuke Mori

classification deep neural networks linear programming robustness verification sigmoid

Rob-GAN: Generator, Discriminator, and Adversarial Attacker

by: Yukitaka Tsuchiya

GAN Adversarial training

StereoDRNet: Dilated Residual StereoNet

by: Anonymous

3D reconstruction passive stereo image depth estimation 3d KITTI

ArcFace: Additive Angular Margin Loss for Deep Face Recognition

by: Shuhei M Yoshida

face recognition metric learning additive angular margin loss

Blind Visual Motif Removal From a Single Image

by: Katsuya Shimabukuro

Inpainting Blind Inpainting Motif Removal

Adaptive Confidence Smoothing for Generalized Zero-Shot Learning

by: Anonymous

zero shot zero-shot generalized zero-shot learning

Progressive Ensemble Networks for Zero-Shot Recognition

by: Anonymous

zero-shot progressive ensemble

Volumetric Capture of Humans With a Single RGBD Camera via Semi-Parametric Learning

by: Yuta Tokuoka

image rendering RGBD image volumetric capture

Rethinking Knowledge Graph Propagation for Zero-Shot Learning

by: Anonymous

zero-shot learning graph convolution

ATOM: Accurate Tracking by Overlap Maximization

by: Naoya Chiba

Visual Tracking Bounding Box

Robustness of 3D Deep Learning in an Adversarial Setting

by: OKIMOTO Yusuke

Conditional Adversarial Generative Flow for Controllable Image Synthesis

by: shirouchi satoshi

Conditional Adversarial Generative Flow

Arbitrary Shape Scene Text Detection With Adaptive Text Region Representation

by: Ruka Funaki

scene text detection

Shapes and Context: In-The-Wild Image Synthesis & Manipulation

by: Kensho Hara

Image Synthesis

Learning Joint Gait Representation via Quintuplet Loss Minimization

by: Naoya Chiba

Gait Recognition Metric Learning Quintuplet Loss

Spot and Learn: A Maximum-Entropy Patch Sampler for Few-Shot Image Classification

by: Ryosuke Sato

Few-ShotLearning ImageClassification ReinforcementLearning Q-Learning

Context-Aware Crowd Counting

by: Ryota Nishijima

crowd counting density map

Veritatem Dies Aperit - Temporally Consistent Depth Prediction Enabled by a Multi-Task Geometric and Semantic Scene Understanding Approach

by: Kotaro Kitayama

A Relation-Augmented Fully Convolutional Network for Semantic Segmentation in Aerial Scenes

by: Ryuta Shitomi

Semantic Segmentation

SceneCode: Monocular Dense Semantic Reconstruction Using Learned Encoded Scene Representations

by: Yuta Nakamura

SLAM semantic segmentation 3D CodeSLAM monocular dense reconstruction

Task Agnostic Meta-Learning for Few-Shot Learning

by: Anonymous

meta learning few-shot learning MAML

RepNet: Weakly Supervised Training of an Adversarial Reprojection Network for 3D Human Pose Estimation

by: Takahiro Itazuri

3D Pose Estimation Reprojection Error

Re-Identification Supervised Texture Generation

by: Anonymous

re-identification texture generation 3D human pose SMPL

GeoNet: Deep Geodesic Networks for Point Cloud Analysis

by: Satoshi Inose

PointNet PU-Net geodesic neighborhood

PoseFix: Model-Agnostic General Human Pose Refinement Network

by: Takahiro Itazuri

3D Face Shape Estimation RingNet

Retrieval-Augmented Convolutional Neural Networks Against Adversarial Examples

by: Masahito Kumada

RaCNN Local Mixup adversarial examples

QATM: Quality-Aware Template Matching for Deep Learning

by: Yuta Nakamura

template matching semantic image alignment image-to-GPS

Latent Space Autoregression for Novelty Detection

by: Shunsuke NAKATSUKA

Anomaly Detection Autoregression Autoencoder

Guaranteed Matrix Completion Under Multiple Linear Transformations

by: Yukitaka Tsuchiya

LRMC

A Poisson-Gaussian Denoising Dataset With Real Fluorescence Microscopy Images

by: Yuta Nakamura

denoising Poisson-Gaussian noise dataset biology microscope

CrowdPose: Efficient Crowded Scenes Pose Estimation and a New Benchmark

by: Anonymous

Multi-person pose estimation Crowd Pose Crowd Index JC SPPE

ROI Pooled Correlation Filters for Visual Tracking

by: Anonymous

ROI correlation filter

Data-Driven Neuron Allocation for Scale Aggregation Networks

by: Anonymous

multi scale convolutional neural network image classification object detection

Exploiting Edge Features for Graph Neural Networks

by: Tomoki Tsujimura

graph neural network

Neural Illumination: Lighting Prediction for Indoor Environments

by: Satoshi Inose

geometry estimation scene completion LDR to HDR U-Net ResNet50

Adaptive Weighting Multi-Field-Of-View CNN for Semantic Segmentation in Pathology

by: Anonymous

pathology semantic segmentation

Auto-Encoding Scene Graphs for Image Captioning

by: Anonymous

Scene Graph Auto-Encoder Image Captioning

Visual Question Answering as Reading Comprehension

by: Katsuya Shimabukuro

Visual Question Answering Machine Reading Comprehension

Knowledge Distillation via Instance Relationship Graph

by: Satoshi Inose

knowledge distillation teacher-student framework

Deeply-Supervised Knowledge Synergy

by: Satoshi Inose

deeply-supervised learning knowledge distillation

Semantically Tied Paired Cycle Consistency for Zero-Shot Sketch-Based Image Retrieval

by: Koki Obinata

sketch-based image retrieval zero-shot cycle-consistency

Unsupervised Visual Domain Adaptation: A Deep Max-Margin Gaussian Process Approach

by: Motokawa Tetsuya

Domain Adaptation UDA

Transferrable Prototypical Networks for Unsupervised Domain Adaptation

by: Kensho Hara

Unsupervised Domain Adaptation

Label Propagation for Deep Semi-Supervised Learning

by: Koki Obinata

semi-supervised learning transductive learning

Balanced Self-Paced Learning for Generative Adversarial Clustering Network

by: Motokawa Tetsuya

GAN ClusterGAN

K-Nearest Neighbors Hashing

by: mokura

ANN search hashing entropy

Handwriting Recognition in Low-Resource Scripts Using Adversarial Learning

by: Koki Obinata

adversarial learning data augmentation handwritten word recognition handwritten word spotting

Meta-Transfer Learning for Few-Shot Learning

by: cfiken

Few-Shot Learning Meta Learning Transfer Learning

Contrastive Adaptation Network for Unsupervised Domain Adaptation

by: Koki Obinata

unsupervised domain adaptation unsupervised learning

STEP: Spatio-Temporal Progressive Learning for Video Action Detection

by: Shuhei M Yoshida

action localization

CVPR 2019 Paper Summaries