Image from ECA-Net
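A roundup of open-source computer vision code released over the past week.
Topics covered include domain adaptation, bias-resilient network training, visual attention models, robot action search, robotic grasping, autonomous driving, neural architecture search, text recognition, and visual search, among others.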
Domain adaptation
MixMatch Domain Adaptaion: Prize-winning solution for both tracks of VisDA 2019 challenge
Danila Rukhovich, Danil Galeev
TASK-CV 2019 at ICCV
https://arxiv.org/abs/1910.03903v1
https://github.com/filaPro/visda2019
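Removing the influence of bias in network training | An adversarial-training-based method for learning unbiased, invariant discriminative features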
Bias-Resilient Neural Network
Ehsan Adeli, Qingyu Zhao, Adolf Pfefferbaum, Edith V. Sullivan, Li Fei-Fei, Juan Carlos Niebles, Kilian M. Pohl
https://arxiv.org/abs/1910.03676v1
http://github.com/QingyuZhao/BR-Net/
An extremely lightweight channel attention module for improving deep CNNs
ECA-Net: Efficient Channel Attention for Deep Convolutional Neural Networks
Qilong Wang, Banggu Wu, Pengfei Zhu, Peihua Li, Wangmeng Zuo, Qinghua Hu
https://arxiv.org/abs/1910.03151v1
https://github.com/BangguWu/ECANet
Multi-source domain adaptation + semi-supervised domain adaptation
Multi-Source Domain Adaptation and Semi-Supervised Domain Adaptation with Focus on Visual Domain Adaptation Challenge 2019
Yingwei Pan, Yehao Li, Qi Cai, Yang Chen, Ting Yao
https://arxiv.org/abs/1910.03548v1
https://github.com/Panda-Peter/visda2019-multisource
https://github.com/Panda-Peter/visda2019-semisupervised
Robot action search and planning
Object-centric Forward Modeling for Model Predictive Control
Yufei Ye, Dhiraj Gandhi, Abhinav Gupta, Shubham Tulsiani
https://arxiv.org/abs/1910.03568v1
https://judyye.github.io/ocmpc/
Using computer vision to help detect hate speech on the internet
Hate Speech in Pixels: Detection of Offensive Memes towards Automatic Moderation
Benet Oriol Sabat, Cristian Canton Ferrer, Xavier Giro-i-Nieto
AI for Social Good Workshop at NeurIPS 2019
https://arxiv.org/abs/1910.02334v1
https://github.com/imatge-upc/hate-speech-detection
3D shape estimation of transparent objects for robotic grasping
ClearGrasp: 3D Shape Estimation of Transparent Objects for Manipulation
Shreeyak S. Sajjan (1), Matthew Moore (1), Mike Pan (1), Ganesh Nagaraja (1), Johnny Lee (2), Andy Zeng (2), Shuran Song (2 and 3) ((1) Synthesis.ai (2) Google (3) Columbia University)
https://arxiv.org/abs/1910.02550v1
https://sites.google.com/view/cleargrasp
Autonomous driving: long-range vision-and-language navigation
Talk2Nav: Long-Range Vision-and-Language Navigation in Cities
Arun Balajee Vasudevan, Dengxin Dai, Luc Van Gool
https://arxiv.org/abs/1910.02029v1
https://www.trace.ethz.ch/publications/2019/talk2nav/index.html
Robust neural architecture search in four GPU hours
Searching for A Robust Neural Architecture in Four GPU Hours
Xuanyi Dong, Yi Yang
https://arxiv.org/abs/1910.04465v1
https://github.com/D-X-Y/NAS-Projects
AlignNet-3D: a fast point cloud registration algorithm for partially observed objects
AlignNet-3D: Fast Point Cloud Registration of Partially Observed Objects
Johannes Groß, Aljosa Osep, Bastian Leibe
3DV'19
https://arxiv.org/abs/1910.04668v1
https://www.vision.rwth-aachen.de/page/alignnet
Recognizing text of arbitrary shapes with 2D self-attention
On Recognizing Texts of Arbitrary Shapes with 2D Self-Attention
Junyeop Lee, Sungrae Park, Jeonghun Baek, Seong Joon Oh, Seonghyeon Kim, Hwalsuk Lee
https://arxiv.org/abs/1910.04396v1
(The code will be open-sourced; the repository address has not been released yet.)
JD's open-source real-time visual search system
The Design and Implementation of a Real Time Visual Search System on JD E-commerce Platform
Jie Li, Haifeng Liu, Chuanghua Gui, Jianyu Chen, Zhenyun Ni, Ning Wang
https://arxiv.org/abs/1908.07389
https://github.com/vearch/vearch