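[Paper Recommendations] Seven Recent Visual Question Answering (VQA) Papers: Fusion Operators, Question-Type-Guided Attention, Interactive Environments, Interpretability, Dense Symmetric Co-Attention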

WZEARW
Published: 2018-06-05 08:06:13
Column: 专知 (Zhuanzhi)

[Overview] Following yesterday's seven Visual Question Answering (VQA) papers, the Zhuanzhi content team presents seven more recent VQA papers today.

1. Generalized Hadamard-Product Fusion Operators for Visual Question Answering

Authors: Brendan Duke, Graham W. Taylor

Affiliation: University of Guelph

Abstract: We propose a generalized class of multimodal fusion operators for the task of visual question answering (VQA). We identify generalizations of existing multimodal fusion operators based on the Hadamard product, and show that specific non-trivial instantiations of this generalized fusion operator exhibit superior performance in terms of OpenEnded accuracy on the VQA task. In particular, we introduce Nonlinearity Ensembling, Feature Gating, and post-fusion neural network layers as fusion operator components, culminating in an absolute percentage point improvement of $1.1\%$ on the VQA 2.0 test-dev set over baseline fusion operators, which use the same features as input. We use our findings as evidence that our generalized class of fusion operators could lead to the discovery of even superior task-specific operators when used as a search space in an architecture search over fusion operators.

Venue: arXiv, April 6, 2018

URL:

http://www.zhuanzhi.ai/document/444d5e32145432bb8bc7e703d21c237e
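
To make the idea of Hadamard-product fusion more concrete, the sketch below projects an image feature and a question feature into a shared space, multiplies them element-wise, and then applies an optional sigmoid gate. It is only an illustration and not the authors' operator: the function names (`hadamard_fusion`, `gated_fusion`), the tanh nonlinearity, the toy weight matrices, and the form of the gate are all assumptions chosen for readability.

```python
import math

def matvec(W, x):
    """Multiply matrix W (a list of rows) by vector x."""
    return [sum(w * xi for w, xi in zip(row, x)) for row in W]

def hadamard_fusion(v, q, W_v, W_q):
    """Project both modalities to a shared size, then multiply element-wise."""
    pv = [math.tanh(a) for a in matvec(W_v, v)]
    pq = [math.tanh(a) for a in matvec(W_q, q)]
    return [a * b for a, b in zip(pv, pq)]

def gated_fusion(v, q, W_v, W_q, W_g):
    """Hypothetical feature gate: a sigmoid of the question scales each fused feature."""
    fused = hadamard_fusion(v, q, W_v, W_q)
    gate = [1.0 / (1.0 + math.exp(-a)) for a in matvec(W_g, q)]
    return [g * f for g, f in zip(gate, fused)]

# Toy inputs (assumed): 3-d image feature, 2-d question feature, 2-d shared space.
v = [1.0, 0.0, -1.0]
q = [0.5, 2.0]
W_v = [[1.0, 0.0, 0.0], [0.0, 1.0, 1.0]]
W_q = [[1.0, 0.0], [0.0, 1.0]]
W_g = [[0.0, 0.0], [0.0, 0.0]]  # zero weights give a gate of 0.5 everywhere

fused = hadamard_fusion(v, q, W_v, W_q)
gated = gated_fusion(v, q, W_v, W_q, W_g)
```

With the all-zero gate weights used here, every gate value is 0.5, so `gated` is simply half of `fused`; learned gate weights would instead scale each fused feature differently depending on the question.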

2. Fooling Vision and Language Models Despite Localization and Attention Mechanism

Authors: Xiaojun Xu, Xinyun Chen, Chang Liu, Anna Rohrbach, Trevor Darrell, Dawn Song

Affiliation: Shanghai Jiao Tong University

Abstract: Adversarial attacks are known to succeed on classifiers, but it has been an open question whether more complex vision systems are vulnerable. In this paper, we study adversarial examples for vision and language models, which incorporate natural language understanding and complex structures such as attention, localization, and modular architectures. In particular, we investigate attacks on a dense captioning model and on two visual question answering (VQA) models. Our evaluation shows that we can generate adversarial examples with a high success rate (i.e., > 90%) for these models. Our work sheds new light on understanding adversarial attacks on vision systems which have a language component and shows that attention, bounding box localization, and compositional internal structures are vulnerable to adversarial attacks. These observations will inform future work towards building effective defenses.

Venue: arXiv, April 6, 2018

URL:

http://www.zhuanzhi.ai/document/82f0f9c5c92d2529a224eddf8aba4099

3. Question Type Guided Attention in Visual Question Answering

Authors: Yang Shi, Tommaso Furlanello, Sheng Zha, Animashree Anandkumar

Affiliations: University of California, University of Southern California

Abstract: Visual Question Answering (VQA) requires integration of feature maps with drastically different structures and focus of the correct regions. Image descriptors have structures at multiple spatial scales, while lexical inputs inherently follow a temporal sequence and naturally cluster into semantically different question types. A lot of previous works use complex models to extract feature representations but neglect to use high-level information summary such as question types in learning. In this work, we propose Question Type-guided Attention (QTA). It utilizes the information of question type to dynamically balance between bottom-up and top-down visual features, respectively extracted from ResNet and Faster R-CNN networks. We experiment with multiple VQA architectures with extensive input ablation studies over the TDIUC dataset and show that QTA systematically improves the performance by more than 5% across multiple question type categories such as "Activity Recognition", "Utility" and "Counting" on TDIUC dataset. By adding QTA on the state-of-art model MCB, we achieve 3% improvement for overall accuracy. Finally, we propose a multi-task extension to predict question types which generalizes QTA to applications that lack of question type, with minimal performance loss.

Venue: arXiv, April 6, 2018

URL:

http://www.zhuanzhi.ai/document/d612c4a406f3ae6232b0633c1e6aa433
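
The balancing step described in the abstract can be pictured as a gate, computed from the question type, that re-weights the concatenated ResNet and Faster R-CNN features. The sketch below is a simplified reading of that idea and not the paper's implementation: the sigmoid gate, the table `W_type` of per-type logits, the function names, and the toy feature sizes are all assumptions.

```python
import math

def sigmoid(x):
    return 1.0 / (1.0 + math.exp(-x))

def question_type_gate(type_index, W_type):
    """Look up the gate logits for one question type and squash them to (0, 1)."""
    # Multiplying a weight matrix by a one-hot type vector selects one slice of it;
    # W_type is stored here as one row of logits per question type.
    return [sigmoid(w) for w in W_type[type_index]]

def qta_features(resnet_feat, rcnn_feat, type_index, W_type):
    """Concatenate the two visual features and re-weight them by question type."""
    visual = resnet_feat + rcnn_feat  # list concatenation
    gate = question_type_gate(type_index, W_type)
    return [g * x for g, x in zip(gate, visual)]

# Toy setup (assumed): 2-d ResNet feature, 2-d Faster R-CNN feature, 2 question types.
resnet_feat = [1.0, 2.0]
rcnn_feat = [3.0, 4.0]
W_type = [
    [10.0, 10.0, -10.0, -10.0],  # hypothetical type 0: favors the ResNet features
    [-10.0, -10.0, 10.0, 10.0],  # hypothetical type 1: favors the Faster R-CNN features
]

weighted = qta_features(resnet_feat, rcnn_feat, 1, W_type)
```

For hypothetical type 1, the ResNet half of `weighted` is pushed close to zero while the Faster R-CNN half passes through almost unchanged, which is the kind of per-type balancing the abstract refers to.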

4. IQA: Visual Question Answering in Interactive Environments

Authors: Daniel Gordon, Aniruddha Kembhavi, Mohammad Rastegari, Joseph Redmon, Dieter Fox, Ali Farhadi

Affiliation: University of Washington

Abstract: We introduce Interactive Question Answering (IQA), the task of answering questions that require an autonomous agent to interact with a dynamic visual environment. IQA presents the agent with a scene and a question, like: "Are there any apples in the fridge?" The agent must navigate around the scene, acquire visual understanding of scene elements, interact with objects (e.g. open refrigerators) and plan for a series of actions conditioned on the question. Popular reinforcement learning approaches with a single controller perform poorly on IQA owing to the large and diverse state space. We propose the Hierarchical Interactive Memory Network (HIMN), consisting of a factorized set of controllers, allowing the system to operate at multiple levels of temporal abstraction. To evaluate HIMN, we introduce IQUAD V1, a new dataset built upon AI2-THOR, a simulated photo-realistic environment of configurable indoor scenes with interactive objects. IQUAD V1 has 75,000 questions, each paired with a unique scene configuration. Our experiments show that our proposed model outperforms popular single controller based methods on IQUAD V1. For sample questions and results, please view our video: https://youtu.be/pXd3C-1jr98.

Venue: arXiv, April 6, 2018

URL:

http://www.zhuanzhi.ai/document/9b21295cf8d06b563639339aa7c5ac34

5. Vision-and-Language Navigation: Interpreting visually-grounded navigation instructions in real environments

Authors: Peter Anderson, Qi Wu, Damien Teney, Jake Bruce, Mark Johnson, Niko Sünderhauf, Ian Reid, Stephen Gould, Anton van den Hengel

Affiliations: Macquarie University, Queensland University of Technology, University of Adelaide, Australian National University

Abstract: A robot that can carry out a natural-language instruction has been a dream since before the Jetsons cartoon series imagined a life of leisure mediated by a fleet of attentive robot helpers. It is a dream that remains stubbornly distant. However, recent advances in vision and language methods have made incredible progress in closely related areas. This is significant because a robot interpreting a natural-language navigation instruction on the basis of what it sees is carrying out a vision and language process that is similar to Visual Question Answering. Both tasks can be interpreted as visually grounded sequence-to-sequence translation problems, and many of the same methods are applicable. To enable and encourage the application of vision and language methods to the problem of interpreting visually-grounded navigation instructions, we present the Matterport3D Simulator -- a large-scale reinforcement learning environment based on real imagery. Using this simulator, which can in future support a range of embodied vision and language tasks, we provide the first benchmark dataset for visually-grounded natural language navigation in real buildings -- the Room-to-Room (R2R) dataset.

Venue: arXiv, April 6, 2018

URL:

http://www.zhuanzhi.ai/document/39ff789c654a6703403f697e0e625184

6. Improved Fusion of Visual and Language Representations by Dense Symmetric Co-Attention for Visual Question Answering

Authors: Duy-Kien Nguyen, Takayuki Okatani

Affiliation: Tohoku University

Abstract: A key solution to visual question answering (VQA) exists in how to fuse visual and language features extracted from an input image and question. We show that an attention mechanism that enables dense, bi-directional interactions between the two modalities contributes to boost accuracy of prediction of answers. Specifically, we present a simple architecture that is fully symmetric between visual and language representations, in which each question word attends on image regions and each image region attends on question words. It can be stacked to form a hierarchy for multi-step interactions between an image-question pair. We show through experiments that the proposed architecture achieves a new state-of-the-art on VQA and VQA 2.0 despite its small size. We also present qualitative evaluation, demonstrating how the proposed attention mechanism can generate reasonable attention maps on images and questions, which leads to the correct answer prediction.

Venue: arXiv, April 3, 2018

URL:

http://www.zhuanzhi.ai/document/c12b18f71fd47e9773e1a66dd45420fb
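
The symmetric attention described in the abstract, where each question word attends over image regions and each image region attends over question words, can be sketched from a single word-by-region affinity matrix. This is a bare-bones illustration rather than the architecture from the paper: the plain dot-product affinity, the helper names, and the toy 2-d features are assumptions.

```python
import math

def softmax(xs):
    m = max(xs)
    exps = [math.exp(x - m) for x in xs]
    total = sum(exps)
    return [e / total for e in exps]

def dot(a, b):
    return sum(x * y for x, y in zip(a, b))

def weighted_sum(weights, vectors):
    dim = len(vectors[0])
    return [sum(w * vec[i] for w, vec in zip(weights, vectors)) for i in range(dim)]

def co_attention(words, regions):
    """Return (attended region summary per word, attended word summary per region)."""
    # Affinity between every question word and every image region.
    affinity = [[dot(w, r) for r in regions] for w in words]
    # Each word attends over regions: normalize along each row.
    word_ctx = [weighted_sum(softmax(row), regions) for row in affinity]
    # Each region attends over words: normalize along each column.
    columns = [list(col) for col in zip(*affinity)]
    region_ctx = [weighted_sum(softmax(col), words) for col in columns]
    return word_ctx, region_ctx

# Toy features (assumed): 2 question words and 3 image regions, all 2-d.
words = [[1.0, 0.0], [0.0, 1.0]]
regions = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
word_ctx, region_ctx = co_attention(words, regions)
```

Both directions reuse the same affinity matrix, which is what makes the layer symmetric; the outputs have one context vector per word and one per region, so the layer could be stacked as the abstract suggests.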

7. VizWiz Grand Challenge: Answering Visual Questions from Blind People

Authors: Danna Gurari, Qing Li, Abigale J. Stangl, Anhong Guo, Chi Lin, Kristen Grauman, Jiebo Luo, Jeffrey P. Bigham

Affiliations: University of Texas at Austin, University of Science and Technology of China, University of Colorado Boulder, Carnegie Mellon University, University of Rochester

Abstract: The study of algorithms to automatically answer visual questions currently is motivated by visual question answering (VQA) datasets constructed in artificial VQA settings. We propose VizWiz, the first goal-oriented VQA dataset arising from a natural VQA setting. VizWiz consists of over 31,000 visual questions originating from blind people who each took a picture using a mobile phone and recorded a spoken question about it, together with 10 crowdsourced answers per visual question. VizWiz differs from the many existing VQA datasets because (1) images are captured by blind photographers and so are often poor quality, (2) questions are spoken and so are more conversational, and (3) often visual questions cannot be answered. Evaluation of modern algorithms for answering visual questions and deciding if a visual question is answerable reveals that VizWiz is a challenging dataset. We introduce this dataset to encourage a larger community to develop more generalized algorithms that can assist blind people.

Venue: arXiv, April 2, 2018

URL:

http://www.zhuanzhi.ai/document/710dfb4f5e0a252b6265c8ef05962a86

-END-

Originally published on 2018-04-20 by the 专知 (Zhuanzhi) WeChat official account.
