
cs.RO (Robotics): 31 papers in total
【1】 Hierarchical Neural Dynamic Policies
Authors: Shikhar Bahl, Abhinav Gupta, Deepak Pathak
Affiliation: Carnegie Mellon University
Comments: Accepted at RSS 2021. Videos and code at this https URL
Link: https://arxiv.org/abs/2107.05627
Abstract: We tackle the problem of generalization to unseen configurations for dynamic tasks in the real world while learning from high-dimensional image input. The family of nonlinear dynamical system-based methods has successfully demonstrated dynamic robot behaviors but has difficulty generalizing to unseen configurations as well as learning from image inputs. Recent works approach this issue by using deep network policies and reparameterizing actions to embed the structure of dynamical systems, but they still struggle in domains with diverse configurations of image goals and hence find it difficult to generalize. In this paper, we address this dichotomy by embedding the structure of dynamical systems in a hierarchical deep policy learning framework called Hierarchical Neural Dynamical Policies (H-NDPs). Instead of fitting deep dynamical systems to diverse data directly, H-NDPs form a curriculum by learning local dynamical system-based policies on small regions in state space and then distilling them into a global dynamical system-based policy that operates only from high-dimensional images. H-NDPs additionally provide smooth trajectories, a strong safety benefit in the real world. We perform extensive experiments on dynamic tasks both in the real world (digit writing, scooping, and pouring) and in simulation (catching, throwing, picking). We show that H-NDPs are easily integrated with both imitation and reinforcement learning setups and achieve state-of-the-art results. Video results are at https://shikharbahl.github.io/hierarchical-ndps/
【2】 Kinematic Parameter Optimization of a Miniaturized Surgical Instrument Based on Dexterous Workspace Determination
Authors: Xin Zhi, Weibang Bai, Eric M. Yeatman
Affiliation: Weibang Bai is with the Hamlyn Centre and the Department of Computing
Comments: IEEE ICARM 2021, Best Paper Award Finalist, 7 pages, 10 figures
Link: https://arxiv.org/abs/2107.05625
Abstract: Miniaturized instruments are highly needed for robot-assisted medical healthcare and treatment, especially for less invasive surgery, as they enable more flexible access to restricted anatomic intervention sites. However, the robotic design is more challenging because of the contradictory needs of miniaturization and the capability of manipulating within a large dexterous workspace. Kinematic parameter optimization is therefore of great significance in this case. To this end, this paper proposes an approach based on dexterous workspace determination for designing a miniaturized tendon-driven surgical instrument under the necessary constraints. The workspace determination is achieved by boundary determination and volume estimation with partition and least-squares polynomial fitting methods. The final robotic configuration with optimized kinematic parameters is shown to be suitable, with a sufficiently large dexterous workspace and the targeted miniature size.
【3】 Let's Play for Action: Recognizing Activities of Daily Living by Learning from Life Simulation Video Games
Authors: Alina Roitberg, David Schneider, Aulia Djamal, Constantin Seibold, Simon Reiß, Rainer Stiefelhagen
Affiliation: Institute for Anthropomatics and Robotics, Karlsruhe Institute of Technology, Germany
Comments: Accepted at IROS. © IEEE.
Link: https://arxiv.org/abs/2107.05617
Abstract: Recognizing Activities of Daily Living (ADL) is a vital process for intelligent assistive robots, but collecting large annotated datasets requires time-consuming temporal labeling and raises privacy concerns, e.g., if the data is collected in a real household. In this work, we explore the concept of constructing training examples for ADL recognition by playing life simulation video games and introduce the SIMS4ACTION dataset created with the popular commercial game THE SIMS 4. We build Sims4Action by specifically executing actions of interest in a "top-down" manner, while the gaming circumstances allow us to freely switch between environments, camera angles, and subject appearances. While ADL recognition on gaming data is interesting from a theoretical perspective, the key challenge lies in transferring it to real-world applications such as smart homes or assistive robotics. To meet this requirement, Sims4Action is accompanied by a GamingToReal benchmark, where the models are evaluated on real videos derived from an existing ADL dataset. We integrate two modern algorithms for video-based activity recognition in our framework, revealing the value of life simulation video games as an inexpensive and far less intrusive source of training data. However, our results also indicate that tasks involving a mixture of gaming and real data are challenging, opening a new research direction. We will make our dataset publicly available at https://github.com/aroitberg/sims4action.
【4】 Linear Contact-Implicit Model-Predictive Control
Authors: Simon Le Cleac'h, Taylor Howell, Mac Schwager, Zachary Manchester
Affiliation: Stanford University
Link: https://arxiv.org/abs/2107.05616
Abstract: We present a general approach for controlling robotic systems that make and break contact with their environments: linear contact-implicit model-predictive control (LCI-MPC). Our use of differentiable contact dynamics provides a natural extension of linear model-predictive control to contact-rich settings. The policy leverages precomputed linearizations about a reference state or trajectory while contact modes, encoded via complementarity constraints, are explicitly retained, resulting in policies that can be efficiently evaluated while maintaining robustness to changes in contact timings. In many cases, the algorithm is even capable of generating entirely new contact sequences. To enable real-time performance, we devise a custom structure-exploiting linear solver for the contact dynamics. We demonstrate that the policy can respond to disturbances by discovering and exploiting new contact modes and is robust to model mismatch and unmodeled environments for a collection of simulated robotic systems, including pushbot, hopper, quadruped, and biped.
【5】 A Persistent Spatial Semantic Representation for High-level Natural Language Instruction Execution
Authors: Valts Blukis, Chris Paxton, Dieter Fox, Animesh Garg, Yoav Artzi
Affiliation: NVIDIA, Cornell University, University of Washington, University of Toronto, Vector Institute
Comments: Submitted to CoRL 2021
Link: https://arxiv.org/abs/2107.05612
Abstract: Natural language provides an accessible and expressive interface to specify long-term tasks for robotic agents. However, non-experts are likely to specify such tasks with high-level instructions, which abstract over specific robot actions through several layers of abstraction. We propose that the key to bridging this gap between language and robot actions over long execution horizons is persistent representations. We propose a persistent spatial semantic representation method and show how it enables building an agent that performs hierarchical reasoning to effectively execute long-term tasks. We evaluate our approach on the ALFRED benchmark and achieve state-of-the-art results, despite completely avoiding the commonly used step-by-step instructions.
【6】 Leveraging Explainability for Comprehending Referring Expressions in the Real World
Authors: Fethiye Irmak Dogan, Gaspar I. Melsion, Iolanda Leite
Affiliation: Perception and Learning from the School of Electrical Engineering and Computer Science at KTH Royal Institute of Technology
Link: https://arxiv.org/abs/2107.05593
Abstract: For effective human-robot collaboration, it is crucial for robots to understand requests from users and ask reasonable follow-up questions when there are ambiguities. In comprehending users' object descriptions in such requests, existing studies have focused on limited object categories that can be detected or localized with existing object detection and localization modules. In the wild, however, it is impossible to limit the object categories that can be encountered during the interaction. To understand described objects and resolve ambiguities in the wild, we suggest, for the first time, a method that leverages explainability. Our method focuses on the active regions of a scene to find the described objects without imposing the previous constraints on object categories and natural language instructions. We evaluate our method on varied real-world images and observe that the regions suggested by our method can help resolve ambiguities. When we compare our method with a state-of-the-art baseline, we show that our method performs better in scenes with ambiguous objects that cannot be recognized by existing object detectors.
【7】 Infrastructure-less Wireless Connectivity for Mobile Robotic Systems in Logistics: Why Bluetooth Mesh Networking is Important?
Authors: Adnan Aijaz
Affiliation: Bristol Research and Innovation Laboratory, Toshiba Europe Ltd., Bristol, United Kingdom
Comments: To appear in IEEE ETFA 2021
Link: https://arxiv.org/abs/2107.05563
Abstract: Mobile robots have disrupted the material handling industry, which is witnessing radical changes. The requirement for enhanced automation across various industry segments often entails mobile robotic systems operating in logistics facilities with little or no infrastructure. In such environments, out-of-box low-cost robotic solutions are desirable. Wireless connectivity plays a crucial role in the successful operation of such mobile robotic systems. A wireless mesh network of mobile robots is an attractive solution; however, a number of system-level challenges create unique and stringent service requirements. The focus of this paper is the role of Bluetooth mesh technology, the latest addition to the Internet-of-Things (IoT) connectivity landscape, in addressing the challenges of infrastructure-less connectivity for mobile robotic systems. It articulates the key system-level design challenges from communication, control, cooperation, coverage, security, and navigation/localization perspectives, and explores different capabilities of Bluetooth mesh technology for such challenges. It also provides performance insights through real-world experimental evaluation of Bluetooth mesh while investigating its differentiating features against competing solutions.
【8】 A new metaheuristic approach for the art gallery problem
Authors: Bahram Sadeghi Bigham, Sahar Badri, Nazanin Padkan
Affiliation: Institute for Advanced Studies in Basic Sciences (IASBS), Zanjan, Iran; Department of Computer Science and Information Technology
Comments: A metaheuristic approach for an old NP-hard problem applicable in Robotics and Telecommunication
Link: https://arxiv.org/abs/2107.05540
Abstract: In the problem of localization and trilateration with a minimum number of landmarks, we face the 3-guard and the classic Art Gallery Problem. The goal of the art gallery problem is to find the minimum number of guards within a simple polygon needed to observe and protect all of it. It has many applications in robotics, telecommunication, and other fields, and there are several approaches to handling the art gallery problem, which is theoretically NP-hard. This paper offers an efficient method based on the Particle Filter algorithm that solves the most fundamental version of the problem near-optimally. The experimental results on the random polygons generated by Bottino show that the new method is more accurate, using fewer or equally many guards. Furthermore, we discuss resampling and particle numbers to minimize the running time.
【9】 Shared Control for Bimanual Telesurgery with Optimized Robotic Partner
Authors: Ziwei Wang, Yanpei Huang, Xiaoxiao Cheng, Pakorn Uttayopas, Etienne Burdet
Affiliation: The authors are with the Department of Bioengineering
Link: https://arxiv.org/abs/2107.05531
Abstract: Traditional telesurgery relies on the surgeon's full control of the robot on the patient's side, which tends to increase surgeon fatigue and may reduce the efficiency of the operation. This paper introduces a Robotic Partner (RP) to facilitate intuitive bimanual telesurgery, aiming at reducing the surgeon workload and enhancing surgeon-assisted capability. An interval type-2 polynomial fuzzy-model-based learning algorithm is employed to extract expert domain knowledge from surgeons and reflect environmental interaction information. Based on this, a bimanual shared control is developed to interact with the other robot teleoperated by the surgeon, understanding their control and providing assistance. As prior information about the environment model is not required, the approach reduces reliance on force sensors in control design. Experimental results on the DaVinci Surgical System show that the RP could assist peg-transfer tasks and reduce the surgeon's workload by 51% in force-sensor-free scenarios.
【10】 SoRoSim: a MATLAB Toolbox for Soft Robotics Based on the Geometric Variable-strain Approach
Authors: Anup Teejo Mathew, Ikhlas Ben Hmida, Costanza Armanini, Frederic Boyer, Federico Renda
Affiliation: Department of Mechanical Engineering, Khalifa University of Science and Technology; ..., France; Khalifa University Center for Autonomous Robotic Systems (KUCARS)
Comments: 45 pages including references, 18 figures, and 6 tables
Link: https://arxiv.org/abs/2107.05494
Abstract: Soft robotics has been a trending topic within the robotics community for almost two decades. However, the tools available to the community for modeling and analyzing soft robotics artifacts are still limited. This paper presents the development of a user-friendly MATLAB toolbox, SoRoSim, that integrates the Geometric Variable Strain model to facilitate the modeling, analysis, and simulation of hybrid rigid-soft open-chain robotic systems. The toolbox implements a recursive, two-level nested quadrature scheme to solve the model. We demonstrate several examples and applications to validate the toolbox and explore its capability to efficiently model a vast range of robotic systems, considering different actuators and external loads, including fluid-structure interactions. We think that the soft-robotics research community will benefit from the SoRoSim toolbox for a wide variety of applications.
【11】 Sniffy Bug: A Fully Autonomous Swarm of Gas-Seeking Nano Quadcopters in Cluttered Environments
Authors: Bardienus P. Duisterhof, Shushuai Li, Javier Burgués, Vijay Janapa Reddi, Guido C. H. E. de Croon
Link: https://arxiv.org/abs/2107.05490
Abstract: Nano quadcopters are ideal for gas source localization (GSL) as they are safe, agile, and inexpensive. However, their extremely restricted sensors and computational resources make GSL a daunting challenge. In this work, we propose a novel bug algorithm named 'Sniffy Bug', which allows a fully autonomous swarm of gas-seeking nano quadcopters to localize a gas source in unknown, cluttered, and GPS-denied environments. The computationally efficient, mapless algorithm provides for the avoidance of obstacles and other swarm members while pursuing desired waypoints. The waypoints are first set for exploration and, once a single swarm member has sensed the gas, by a particle swarm optimization (PSO)-based procedure. We evolve all the parameters of the bug (and PSO) algorithm using our novel simulation pipeline, 'AutoGDM'. It builds on and expands open-source tools in order to enable fully automated end-to-end environment generation and gas dispersion modeling, allowing for learning in simulation. Flight tests show that Sniffy Bug with evolved parameters outperforms manually selected parameters in cluttered, real-world environments.
【12】 Visual-Tactile Cross-Modal Data Generation using Residue-Fusion GAN with Feature-Matching and Perceptual Losses
Authors: Shaoyu Cai, Kening Zhu, Yuki Ban, Takuji Narumi
Affiliation: City University of Hong Kong Shenzhen Research Institute
Comments: 8 pages, 6 figures, Accepted by IEEE Robotics and Automation Letters
Link: https://arxiv.org/abs/2107.05468
Abstract: Existing psychophysical studies have revealed that cross-modal visual-tactile perception is common for humans performing daily activities. However, it is still challenging to build the algorithmic mapping from one modality space to another, namely cross-modal visual-tactile data translation/generation, which could be important for robotic operation. In this paper, we propose a deep-learning-based approach for cross-modal visual-tactile data generation by leveraging the framework of generative adversarial networks (GANs). Our approach takes the visual image of a material surface as the visual data, and the accelerometer signal induced by the pen-sliding movement on the surface as the tactile data. We adopt the conditional-GAN (cGAN) structure together with the residue-fusion (RF) module, and train the model with additional feature-matching (FM) and perceptual losses to achieve cross-modal data generation. The experimental results show that the inclusion of the RF module and of the FM and perceptual losses significantly improves cross-modal data generation performance in terms of the classification accuracy on the generated data and the visual similarity between the ground truth and the generated data.
【13】 Post Triangular Rewiring Method for Shorter RRT Robot Path Planning
Authors: Jin-Gu Kang, Jin-Woo Jung
Affiliation: Department of Computer Science and Engineering, Dongguk University, Seoul, Korea
Comments: Under review at IJFIS (International Journal of Fuzzy Logic and Intelligent Systems; this http URL)
Link: https://arxiv.org/abs/2107.05344
Abstract: This paper proposes the 'Post Triangular Rewiring' method, which minimizes the sacrifice in planning time and overcomes the optimality limit of sampling-based algorithms such as the Rapidly-exploring Random Tree (RRT) algorithm. Using the triangle inequality principle, the proposed 'Post Triangular Rewiring' method produces a path closer to the optimal one than the path the RRT algorithm yields before the method is applied. Experiments were conducted to verify the performance of the proposed method. When the method proposed in this paper is applied to the RRT algorithm, the optimality efficiency increases relative to the planning time.
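The triangle-inequality idea in entry 【13】 can be illustrated with a generic path-shortcutting pass. This is a minimal sketch under stated assumptions, not the authors' 'Post Triangular Rewiring' algorithm: the 2-D waypoints, the circular obstacles, the sampled collision check `segment_is_free`, and the function name `triangular_shortcut` are all hypothetical choices made for illustration.

```python
import math


def path_length(path):
    """Total Euclidean length of a polyline given as a list of (x, y) points."""
    return sum(math.dist(a, b) for a, b in zip(path, path[1:]))


def segment_is_free(a, b, obstacles, samples=50):
    """Hypothetical collision check: sample points along segment a-b and
    test them against circular obstacles given as (cx, cy, radius)."""
    for i in range(samples + 1):
        t = i / samples
        x = a[0] + t * (b[0] - a[0])
        y = a[1] + t * (b[1] - a[1])
        for cx, cy, r in obstacles:
            if math.hypot(x - cx, y - cy) <= r:
                return False
    return True


def triangular_shortcut(path, obstacles):
    """For each consecutive triple (a, b, c), drop b when the direct segment
    a-c is collision-free. By the triangle inequality |ac| <= |ab| + |bc|,
    so the path never gets longer."""
    path = list(path)
    changed = True
    while changed:
        changed = False
        i = 0
        while i + 2 < len(path):
            if segment_is_free(path[i], path[i + 2], obstacles):
                del path[i + 1]
                changed = True
            else:
                i += 1
    return path


if __name__ == "__main__":
    zigzag = [(0.0, 0.0), (1.0, 1.0), (2.0, 0.0), (3.0, 1.0), (4.0, 0.0)]
    shorter = triangular_shortcut(zigzag, obstacles=[])
    print(path_length(zigzag), path_length(shorter))
```

Running such a pass after planning leaves the planner itself untouched, which is one plausible reading of why the abstract stresses a small cost in planning time.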
【14】 End-to-end Trainable Deep Neural Network for Robotic Grasp Detection and Semantic Segmentation from RGB
Authors: Stefan Ainetter, Friedrich Fraundorfer
Affiliation: All authors are with the Institute of Computer Graphics and Vision, Graz University of Technology
Link: https://arxiv.org/abs/2107.05287
Abstract: In this work, we introduce a novel, end-to-end trainable CNN-based architecture to deliver high-quality results for grasp detection suitable for a parallel-plate gripper, and for semantic segmentation. Building on this, we propose a novel refinement module that takes advantage of previously calculated grasp detection and semantic segmentation and further increases grasp detection accuracy. Our proposed network delivers state-of-the-art accuracy on two popular grasp datasets, namely Cornell and Jacquard. As an additional contribution, we provide a novel dataset extension for the OCID dataset, making it possible to evaluate grasp detection in highly challenging scenes. Using this dataset, we show that semantic segmentation can additionally be used to assign grasp candidates to object classes, which can be used to pick specific objects in the scene.
【15】 Benchmark of visual and 3D lidar SLAM systems in simulation environment for vineyards
Authors: Ibrahim Hroob, Riccardo Polvara, Sergi Molina, Grzegorz Cielniak, Marc Hanheide
Affiliation: Lincoln Center for Autonomous Systems, University of Lincoln, UK
Link: https://arxiv.org/abs/2107.05283
Abstract: In this work, we present a comparative analysis of the trajectories estimated by various Simultaneous Localization and Mapping (SLAM) systems in a simulation environment for vineyards. Vineyard environments are challenging for SLAM methods because of visual appearance changes over time, uneven terrain, and repeated visual patterns. For this reason, we created a simulation environment specifically for vineyards to help study SLAM systems in such a challenging environment. We evaluated the following SLAM systems in four different scenarios: LIO-SAM, StaticMapping, ORB-SLAM2, and RTAB-MAP. The mobile robot used in this study was equipped with 2D and 3D lidars, an IMU, and an RGB-D camera (Kinect v2). The results show good and encouraging performance of RTAB-MAP in such an environment.
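Entry 【15】 compares estimated trajectories but the abstract does not say which error metric is used. A common choice for such comparisons is the root-mean-square absolute trajectory error (ATE); the sketch below computes it under the assumption that the estimated and ground-truth poses are already time-synchronised and expressed in the same frame. The function name `ate_rmse` and the 2-D positions are illustrative assumptions, not taken from the paper.

```python
import math


def ate_rmse(estimated, ground_truth):
    """Root-mean-square position error between two equally long,
    time-aligned trajectories given as lists of coordinate tuples.
    Assumes both trajectories are already in the same reference frame."""
    if not estimated or len(estimated) != len(ground_truth):
        raise ValueError("trajectories must be non-empty and equally long")
    squared_errors = [math.dist(e, g) ** 2 for e, g in zip(estimated, ground_truth)]
    return math.sqrt(sum(squared_errors) / len(squared_errors))


if __name__ == "__main__":
    truth = [(0.0, 0.0), (1.0, 0.0), (2.0, 0.0)]
    estimate = [(0.0, 0.0), (1.0, 0.3), (2.0, -0.4)]
    print(f"ATE RMSE: {ate_rmse(estimate, truth):.3f} m")
```

In practice an alignment step (for example a rigid-body fit between the two trajectories) usually precedes this computation; it is omitted here to keep the sketch short.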
【16】 Impact of Energy Efficiency on the Morphology and Behaviour of Evolved Robots
Authors: Margarita Rebolledo, Daan Zeeuwe, Thomas Bartz-Beielstein, A. E. Eiben
Affiliation: Institute for Data Science, Engineering, and Analytics, TH Köln, Gummersbach, Germany; Department of Computer Science, Vrije Universiteit Amsterdam, Netherlands
Link: https://arxiv.org/abs/2107.05249
Abstract: Most evolutionary robotics studies focus on evolving some targeted behavior without taking energy usage into account. This limits the practical value of such systems because energy efficiency is an important property for real-world autonomous robots. In this paper, we mitigate this problem by extending our simulator with a battery model and taking energy consumption into account during fitness evaluations. Using this system, we investigate how energy awareness affects the evolution of robots. Since our system evolves morphologies as well as controllers, the main research question is twofold: (i) what is the impact on the morphologies of the evolved robots, and (ii) what is the impact on the behavior of the evolved robots if energy consumption is included in the fitness evaluation? The results show that including the energy consumption in the fitness in a multi-objective fashion (by NSGA-II) reduces the average size of robot bodies while at the same time reducing their speed. However, robots generated without size reduction can achieve speeds comparable to robots from the baseline set.
【17】 Entropy Regularized Motion Planning via Stein Variational Inference
Authors: Alexander Lambert, Byron Boots
Affiliation: Georgia Institute of Technology, University of Washington
Comments: RSS 2021 Workshop on Integrating Planning and Learning
Link: https://arxiv.org/abs/2107.05146
Abstract: Many imitation and reinforcement learning approaches rely on the availability of expert-generated demonstrations for learning policies or value functions from data. Obtaining a reliable distribution of trajectories from motion planners is non-trivial, since it must broadly cover the space of states likely to be encountered during execution while also satisfying task-based constraints. We propose a sampling strategy based on variational inference to generate distributions of feasible, low-cost trajectories for high-DOF motion planning tasks. This includes a distributed, particle-based motion planning algorithm that leverages structured graphical representations for inference over multi-modal posterior distributions. We also make explicit connections to both approximate inference for trajectory optimization and entropy-regularized reinforcement learning.
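To give a feel for the particle-based variational inference named in the title of entry 【17】, here is a generic Stein variational gradient descent (SVGD) update on a 1-D toy target. It is a sketch under stated assumptions and not the paper's motion planner: the Gaussian target centred at 2.0, the fixed RBF kernel bandwidth, the step size, and the function name `svgd_step` are all illustrative choices.

```python
import math


def svgd_step(particles, grad_log_p, step=0.1, bandwidth=1.0):
    """One SVGD update for scalar particles with an RBF kernel
    k(x, y) = exp(-(x - y)^2 / (2 * bandwidth^2))."""
    n = len(particles)
    updated = []
    for xi in particles:
        phi = 0.0
        for xj in particles:
            k = math.exp(-((xj - xi) ** 2) / (2.0 * bandwidth ** 2))
            attraction = k * grad_log_p(xj)            # pulls toward high density
            repulsion = -(xj - xi) / bandwidth ** 2 * k  # d k(xj, xi) / d xj, keeps particles apart
            phi += attraction + repulsion
        updated.append(xi + step * phi / n)
    return updated


def run_svgd(particles, grad_log_p, iterations=1000):
    """Apply svgd_step repeatedly and return the final particle set."""
    for _ in range(iterations):
        particles = svgd_step(particles, grad_log_p)
    return particles


if __name__ == "__main__":
    # Toy target: unit-variance Gaussian centred at 2.0 (an assumption for this demo).
    score = lambda x: -(x - 2.0)
    start = [-1.0 + 0.2 * i for i in range(10)]
    final = run_svgd(start, score)
    print(sum(final) / len(final), max(final) - min(final))
```

The repulsion term is what distinguishes this from plain gradient ascent on the log-density: without it all particles would collapse onto the mode, whereas here they stay spread out and can cover several modes of a multi-modal posterior.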
【18】 Anomaly Detection in Smart Manufacturing with an Application Focus on Robotic Finishing Systems: A Review
Authors: Tareq Tayeh, Abdallah Shami
Affiliation: ECE Department, Western University, London, Canada
Comments: 7 pages, 1 figure
Link: https://arxiv.org/abs/2107.05053
Abstract: As systems in smart manufacturing become increasingly complex and produce an abundance of data, production failures become increasingly likely. There arises the need to minimize or eradicate production failures, and one means of doing so is anomaly detection. However, with the deployment of anomaly detection systems, there are many aspects to be considered. In this work, an overview of the components, benefits, challenges, methods, and open problems of anomaly detection in smart manufacturing and robotic finishing systems is presented.
【19】 Stabilizing Neural Control Using Self-Learned Almost Lyapunov Critics
Authors: Ya-Chien Chang, Sicun Gao
Affiliation: The authors are with the Department of Computer Science and Engineering
Comments: ICRA 2021
Link: https://arxiv.org/abs/2107.04989
Abstract: The lack of stability guarantees restricts the practical use of learning-based methods in core control problems in robotics. We develop new methods for learning neural control policies and neural Lyapunov critic functions in the model-free reinforcement learning (RL) setting. We use sample-based approaches and the Almost Lyapunov function conditions to estimate the region of attraction and invariance properties through the learned Lyapunov critic functions. The methods enhance the stability of neural controllers for various nonlinear systems, including automobile and quadrotor control.
【20】 Toward Certifiable Motion Planning for Medical Steerable Needles
Authors: Mengyu Fu, Oren Salzman, Ron Alterovitz
Affiliation: Department of Computer Science, University of North Carolina at Chapel Hill, Chapel Hill, NC, USA; Computer Science Department, Technion - Israel Institute of Technology, Israel
Comments: To be published in Robotics: Science and Systems (RSS) 2021
Link: https://arxiv.org/abs/2107.04939
Abstract: Medical steerable needles can move along 3D curvilinear trajectories to avoid anatomical obstacles and reach clinically significant targets inside the human body. Automating steerable needle procedures can enable physicians and patients to harness the full potential of steerable needles by maximally leveraging their steerability to safely and accurately reach targets for medical procedures such as biopsies and localized therapy delivery for cancer. For the automation of medical procedures to be clinically accepted, it is critical from a patient care, safety, and regulatory perspective to certify the correctness and effectiveness of the motion planning algorithms involved in procedure automation. In this paper, we take an important step toward creating a certifiable motion planner for steerable needles. We introduce the first motion planner for steerable needles that offers a guarantee, under clinically appropriate assumptions, that it will, in finite time, compute an exact, obstacle-avoiding motion plan to a specified target, or notify the user that no such plan exists. We present an efficient, resolution-complete motion planner for steerable needles based on a novel adaptation of multi-resolution planning. Compared to state-of-the-art steerable needle motion planners (none of which provide any completeness guarantees), we demonstrate that our new resolution-complete motion planner computes plans faster and with a higher success rate.
【21】 BEV-MODNet: Monocular Camera based Bird's Eye View Moving Object Detection for Autonomous Driving
Authors: Hazem Rashed, Mariam Essam, Maha Mohamed, Ahmad El Sallab, Senthil Yogamani Affiliations: Valeo R&D Cairo, Egypt; Valeo Vision Systems, Ireland Comments: Accepted for Oral Presentation at IEEE Intelligent Transportation Systems Conference (ITSC) 2021 Link: https://arxiv.org/abs/2107.04937 Abstract: Detection of moving objects is a very important task in autonomous driving systems. After the perception phase, motion planning is typically performed in Bird's Eye View (BEV) space. This requires projecting objects detected on the image plane onto the top-view BEV plane. Such a projection is prone to errors due to the lack of depth information and noisy mapping in faraway areas. CNNs can leverage the global context in the scene to project better. In this work, we explore end-to-end Moving Object Detection (MOD) on the BEV map directly using monocular images as input. To the best of our knowledge, such a dataset does not exist, so we create an extended KITTI-raw dataset consisting of 12.9k images with annotations of moving object masks in BEV space for five classes. The dataset is intended to be used for class-agnostic, motion-cue-based object detection, and classes are provided as metadata for better tuning. We design and implement a two-stream RGB and optical flow fusion architecture which outputs motion segmentation directly in BEV space. We compare it with inverse perspective mapping of state-of-the-art motion segmentation predictions on the image plane. We observe a significant improvement of 13% in mIoU using the simple baseline implementation. This demonstrates the ability to directly learn motion segmentation output in BEV space. Qualitative results of our baseline and the dataset annotations can be found at https://sites.google.com/view/bev-modnet.
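The mIoU figure quoted above is a mean intersection-over-union score. As a minimal sketch of how such a score can be computed for a binary moving/static mask, the Python below averages the IoU of the two classes. The function names, the two-class averaging, and the convention of scoring an empty union as 1.0 are assumptions made for illustration; the abstract does not specify the authors' exact evaluation protocol.

```python
import numpy as np


def iou(pred, target):
    """Intersection over union of two boolean masks of equal shape."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    union = np.logical_or(pred, target).sum()
    if union == 0:
        # Assumed convention: both masks empty counts as a perfect match.
        return 1.0
    return float(np.logical_and(pred, target).sum() / union)


def mean_iou(pred, target):
    """Mean IoU over the 'moving' (True) and 'static' (False) classes."""
    pred = np.asarray(pred, dtype=bool)
    target = np.asarray(target, dtype=bool)
    return 0.5 * (iou(pred, target) + iou(~pred, ~target))


if __name__ == "__main__":
    # Toy 2x2 BEV grid: 1 marks a cell labelled as moving.
    predicted = np.array([[1, 1], [0, 0]])
    annotated = np.array([[1, 0], [0, 0]])
    print("mIoU:", mean_iou(predicted, annotated))
```

In the toy grid the moving class overlaps on one of two cells and the static class on two of three, so the score is the average of those two ratios.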
【22】 Potential iLQR: A Potential-Minimizing Controller for Planning Multi-Agent Interactive Trajectories
Authors: Talha Kavuncu, Ayberk Yaraneri, Negar Mehr Affiliation: Aerospace Engineering, UIUC Link: https://arxiv.org/abs/2107.04926 Abstract: Many robotic applications involve interactions between multiple agents where an agent's decisions affect the behavior of other agents. Such behaviors can be captured by the equilibria of differential games, which provide an expressive framework for modeling the agents' mutual influence. However, finding the equilibria of differential games is in general challenging, as it involves solving a set of coupled optimal control problems. In this work, we propose to leverage the special structure of multi-agent interactions to generate interactive trajectories by simply solving a single optimal control problem, namely, the optimal control problem associated with minimizing the potential function of the differential game. Our key insight is that for a certain class of multi-agent interactions, the underlying differential game is indeed a potential differential game for which equilibria can be found by solving a single optimal control problem. We introduce such an optimal control problem and build on single-agent trajectory optimization methods to develop a computationally tractable and scalable algorithm for planning multi-agent interactive trajectories. We will demonstrate the performance of our algorithm in simulation and show that our algorithm outperforms state-of-the-art game solvers. To further show the real-time capabilities of our algorithm, we will demonstrate the application of our proposed algorithm in a set of experiments involving interactive trajectories for two quadcopters.
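The key idea above, that an equilibrium of a potential game can be found by minimizing a single potential function, can be illustrated with a static toy example. The sketch below is not the authors' iLQR-based method and involves no dynamics: the two quadratic costs, the potential, the step size, and all function names are assumptions chosen so that each agent's cost gradient with respect to its own decision equals the matching partial derivative of the potential.

```python
import numpy as np


def cost_agent1(u):
    """Agent 1 wants u[0] near +1 and dislikes disagreeing with agent 2."""
    return (u[0] - 1.0) ** 2 + 0.5 * (u[0] - u[1]) ** 2


def cost_agent2(u):
    """Agent 2 wants u[1] near -1 and shares the same coupling term."""
    return (u[1] + 1.0) ** 2 + 0.5 * (u[0] - u[1]) ** 2


def potential(u):
    """Potential: both individual terms plus the shared coupling, counted once."""
    return (u[0] - 1.0) ** 2 + (u[1] + 1.0) ** 2 + 0.5 * (u[0] - u[1]) ** 2


def potential_gradient(u):
    """Gradient of the potential; entry i equals d(cost_i)/d(u[i])."""
    return np.array([
        2.0 * (u[0] - 1.0) + (u[0] - u[1]),
        2.0 * (u[1] + 1.0) - (u[0] - u[1]),
    ])


def minimize_potential(u0, step=0.1, iters=500):
    """Plain gradient descent on the potential (a single optimization problem)."""
    u = np.asarray(u0, dtype=float)
    for _ in range(iters):
        u = u - step * potential_gradient(u)
    return u


if __name__ == "__main__":
    u_star = minimize_potential([0.0, 0.0])
    print("equilibrium candidate:", u_star)
    print("agent costs:", cost_agent1(u_star), cost_agent2(u_star))
```

At the minimizer neither agent can lower its own cost by changing only its own decision, which is the Nash condition for this toy game.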
【23】 Distributed Deep Reinforcement Learning for Intelligent Traffic Monitoring with a Team of Aerial Robots
Authors: Behzad Khamidehi, Elvino S. Sousa Affiliation: University of Toronto Comments: IEEE International Conference on Intelligent Transportation - ITSC 2021 Link: https://arxiv.org/abs/2107.04924 Abstract: This paper studies the traffic monitoring problem in a road network using a team of aerial robots. The problem is challenging for two main reasons. First, the traffic events are stochastic, both temporally and spatially. Second, the problem has a non-homogeneous structure, as the traffic events arrive at different locations of the road network at different rates. Accordingly, some locations require more visits by the robots than others. To address these issues, we define an uncertainty metric for each location of the road network and formulate a path planning problem for the aerial robots to minimize the network's average uncertainty. We express this problem as a partially observable Markov decision process (POMDP) and propose a distributed and scalable algorithm based on deep reinforcement learning to solve it. We consider two different scenarios depending on the communication mode between the agents (aerial robots) and the traffic management center (TMC). The first scenario assumes that the agents continuously communicate with the TMC to send/receive real-time information about the traffic events. Hence, the agents have global and real-time knowledge of the environment. However, in the second scenario, we consider a challenging setting where the observation of the aerial robots is partial and limited to their sensing ranges. Moreover, in contrast to the first scenario, the information exchange between the aerial robots and the TMC is restricted to specific time instances. We evaluate the performance of our proposed algorithm in both scenarios for a real road network topology and demonstrate its functionality in a traffic monitoring system.
【24】 Feature-based Event Stereo Visual Odometry
Authors: Antea Hadviger, Igor Cvišić, Ivan Marković, Sacha Vražić, Ivan Petrović Affiliation: [...] and Ivan Petrović are with the University of Zagreb Faculty of Electrical Engineering and Computing Comments: Accepted to European Conference on Mobile Robots (ECMR) 2021 Link: https://arxiv.org/abs/2107.04921 Abstract: Event-based cameras are biologically inspired sensors that output events, i.e., asynchronous pixel-wise brightness changes in the scene. Their high dynamic range and microsecond temporal resolution make them more reliable than standard cameras in environments with challenging illumination and in high-speed scenarios; developing odometry algorithms based solely on event cameras therefore offers exciting new possibilities for autonomous systems and robots. In this paper, we propose a novel stereo visual odometry method for event cameras based on feature detection and matching with careful feature management, while pose estimation is done by reprojection error minimization. We evaluate the performance of the proposed method on two publicly available datasets: MVSEC sequences captured by an indoor flying drone and DSEC outdoor driving sequences. MVSEC offers accurate ground truth from motion capture. DSEC does not offer ground truth, so to obtain a reference trajectory on the standard camera frames we used our SOFT visual odometry, one of the highest-ranking algorithms on the KITTI scoreboards. We compared our method to the ESVO method, which is the first and still the only stereo event odometry method, showing on-par performance on the MVSEC sequences, while on the DSEC dataset ESVO, unlike our method, was unable to handle the outdoor driving scenario with default parameters. Furthermore, two important advantages of our method over ESVO are that it adapts the tracking frequency to the asynchronous event rate and does not require initialization.
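The abstract states that pose estimation is done by reprojection error minimization. As a minimal sketch of the quantity being minimized, the Python below projects 3D points through an assumed pinhole camera model and sums the squared pixel residuals for a candidate pose. The pinhole model without distortion, the parameter names (`fx`, `fy`, `cx`, `cy`), and the function names are illustrative assumptions; the paper's actual cost, feature handling, and optimizer are not described in the abstract.

```python
import numpy as np


def project(points_world, rotation, translation, fx, fy, cx, cy):
    """Project Nx3 world points to pixels with a pinhole model (no distortion)."""
    points_cam = points_world @ rotation.T + translation
    u = fx * points_cam[:, 0] / points_cam[:, 2] + cx
    v = fy * points_cam[:, 1] / points_cam[:, 2] + cy
    return np.stack([u, v], axis=1)


def reprojection_error(points_world, observed_px, rotation, translation,
                       fx, fy, cx, cy):
    """Sum of squared pixel residuals for a candidate camera pose."""
    predicted = project(points_world, rotation, translation, fx, fy, cx, cy)
    residual = predicted - observed_px
    return float(np.sum(residual ** 2))


if __name__ == "__main__":
    intrinsics = dict(fx=100.0, fy=100.0, cx=320.0, cy=240.0)
    points = np.array([[0.0, 0.0, 2.0], [1.0, 0.0, 2.0]])
    identity = np.eye(3)
    # Observations generated from the true pose (identity rotation, zero translation).
    observed = project(points, identity, np.zeros(3), **intrinsics)
    shifted = np.array([0.1, 0.0, 0.0])
    print("error at true pose:", reprojection_error(
        points, observed, identity, np.zeros(3), **intrinsics))
    print("error at shifted pose:", reprojection_error(
        points, observed, identity, shifted, **intrinsics))
```

A pose estimator would search over the rotation and translation to drive this error down, typically with a nonlinear least-squares solver.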
【25】 SynPick: A Dataset for Dynamic Bin Picking Scene Understanding
Authors: Arul Selvam Periyasamy, Max Schwarz, Sven Behnke Affiliation: Autonomous Intelligent Systems group, University of Bonn Comments: Accepted for 17th IEEE International Conference on Automation Science and Engineering (CASE), Lyon, France, August 2021 Link: https://arxiv.org/abs/2107.04852 Abstract: We present SynPick, a synthetic dataset for dynamic scene understanding in bin-picking scenarios. In contrast to existing datasets, our dataset is both situated in a realistic industrial application domain, inspired by the well-known Amazon Robotics Challenge (ARC), and features dynamic scenes with authentic picking actions as chosen by our picking heuristic developed for the ARC 2017. The dataset is compatible with the popular BOP dataset format. We describe the dataset generation process in detail, including object arrangement generation and manipulation simulation using the NVIDIA PhysX physics engine. To cover a large action space, we perform untargeted and targeted picking actions, as well as random moving actions. To establish a baseline for object perception, a state-of-the-art pose estimation approach is evaluated on the dataset. We demonstrate the usefulness of tracking poses during manipulation instead of single-shot estimation, even with a naive filtering approach. The generator source code and dataset are publicly available.
【26】 Informing Real-time Corrections in Corrective Shared Autonomy Through Expert Demonstrations
Authors: Michael Hagenow, Emmanuel Senft, Robert Radwin, Michael Gleicher, Bilge Mutlu, Michael Zinn Affiliation: University of Wisconsin–Madison Comments: IEEE Robotics and Automation Letters (RA-L) Link: https://arxiv.org/abs/2107.04836 Abstract: Corrective Shared Autonomy is a method where human corrections are layered on top of an otherwise autonomous robot behavior. Specifically, a Corrective Shared Autonomy system leverages an external controller to allow corrections across a range of task variables (e.g., spinning speed of a tool, applied force, path) to address the specific needs of a task. However, this inherent flexibility makes the choice of what corrections to allow at any given instant difficult to determine. This choice of corrections includes determining appropriate robot state variables, scaling for these variables, and a way to allow a user to specify the corrections in an intuitive manner. This paper enables efficient Corrective Shared Autonomy by providing an automated solution based on Learning from Demonstration to both extract the nominal behavior and address these core problems. Our evaluation shows that this solution enables users to successfully complete a surface cleaning task, identifies different strategies users employed in applying corrections, and points to future improvements for our solution.
【27】 LS3: Latent Space Safe Sets for Long-Horizon Visuomotor Control of Iterative Tasks
Authors: Albert Wilcox, Ashwin Balakrishna, Brijen Thananjeyan, Joseph E. Gonzalez, Ken Goldberg Comments: Preprint, Under Review. First two authors contributed equally Link: https://arxiv.org/abs/2107.04775 Abstract: Reinforcement learning (RL) algorithms have shown impressive success in exploring high-dimensional environments to learn complex, long-horizon tasks, but can often exhibit unsafe behaviors and require extensive environment interaction when exploration is unconstrained. A promising strategy for safe learning in dynamically uncertain environments is requiring that the agent can robustly return to states where task success (and therefore safety) can be guaranteed. While this approach has been successful in low dimensions, enforcing this constraint in environments with high-dimensional state spaces, such as images, is challenging. We present Latent Space Safe Sets (LS3), which extends this strategy to iterative, long-horizon tasks with image observations by using suboptimal demonstrations and a learned dynamics model to restrict exploration to the neighborhood of a learned Safe Set where task completion is likely. We evaluate LS3 on 4 domains, including a challenging sequential pushing task in simulation and a physical cable routing task. We find that LS3 can use prior task successes to restrict exploration and learn more efficiently than prior algorithms while satisfying constraints. See https://tinyurl.com/latent-ss for code and supplementary material.
【28】 Attitude Reconstruction from Inertial Measurement: Mitigating Runge Effect for Dynamic Applications
Authors: Yuanxin Wu, Maoran Zhu Comments: 9 pages, 8 figures Link: https://arxiv.org/abs/2107.04722 Abstract: Time-equispaced inertial measurements are used in practice as inputs for motion determination. Polynomial interpolation is a common technique for recovering the gyroscope signal but is subject to a fundamental numerical stability problem due to the Runge effect on equispaced samples. This paper reviews the theoretical results on the Runge phenomenon in related areas and proposes a straightforward borrowing-and-cutting (BAC) strategy to suppress it. It employs the neighboring samples for higher-order polynomial interpolation but only uses the middle polynomial segment in the actual time interval. The BAC strategy has been incorporated into attitude computation by functional iteration, leading to an accuracy benefit of several orders of magnitude under the classical coning motion. It would potentially bring significant benefits to the inertial navigation computation under sustained dynamic motions.
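The Runge effect and the idea of keeping only the middle polynomial segment can be seen on a scalar toy signal. The sketch below is not the authors' attitude computation: it uses the classical Runge function as a stand-in signal, and the function names, the window size (`borrow`), and the handling of intervals near the ends (where fewer neighbors exist, so the interval is not centered) are assumptions made for illustration. It compares one high-degree polynomial through all equispaced samples with a sliding fit that borrows neighboring samples and evaluates each fit only on its own interval.

```python
import numpy as np


def runge(x):
    """Classical test function that triggers the Runge effect."""
    return 1.0 / (1.0 + 25.0 * x ** 2)


def global_interp(nodes, values, query):
    """One polynomial through all equispaced samples (prone to edge oscillation)."""
    coeffs = np.polyfit(nodes, values, len(nodes) - 1)
    return np.polyval(coeffs, query)


def windowed_middle_interp(nodes, values, query, borrow=2):
    """For each sample interval, fit through the interval's two endpoints plus
    up to `borrow` neighbors on each side, then evaluate only on that interval."""
    n = len(nodes)
    out = np.empty_like(query, dtype=float)
    # Index k of the interval [nodes[k], nodes[k+1]] containing each query point.
    idx = np.clip(np.searchsorted(nodes, query, side="right") - 1, 0, n - 2)
    for k in np.unique(idx):
        lo = max(0, k - borrow)          # window is truncated near the ends
        hi = min(n, k + 2 + borrow)
        coeffs = np.polyfit(nodes[lo:hi], values[lo:hi], hi - lo - 1)
        mask = idx == k
        out[mask] = np.polyval(coeffs, query[mask])
    return out


if __name__ == "__main__":
    nodes = np.linspace(-1.0, 1.0, 11)
    values = runge(nodes)
    query = np.linspace(-1.0, 1.0, 2001)
    truth = runge(query)
    print("max error, single global polynomial:",
          np.max(np.abs(global_interp(nodes, values, query) - truth)))
    print("max error, windowed middle segments:",
          np.max(np.abs(windowed_middle_interp(nodes, values, query) - truth)))
```

The global fit oscillates strongly near the ends of the sampling range, while the windowed fit stays close to the signal because each polynomial is used only where equispaced interpolation is well behaved.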
【29】 Using Depth for Improving Referring Expression Comprehension in Real-World Environments
Authors: Fethiye Irmak Dogan, Iolanda Leite Affiliation: Perception and Learning from the School of Electrical Engineering and Computer Science at KTH Royal Institute of Technology Link: https://arxiv.org/abs/2107.04658 Abstract: In a human-robot collaborative task where a robot helps its partner by finding described objects, the depth dimension plays a critical role in successful task completion. Existing studies have mostly focused on comprehending the object descriptions using RGB images. However, 3-dimensional space perception that includes depth information is fundamental in real-world environments. In this work, we propose a method to identify the described objects considering depth dimension data. Using depth features significantly improves performance in scenes where depth data is critical to disambiguate the objects and across our whole evaluation dataset that contains objects that can be specified with and without the depth dimension.
【30】 Diverse Video Generation using a Gaussian Process Trigger
Authors: Gaurav Shrivastava, Abhinav Shrivastava Affiliation: University of Maryland, College Park Comments: International Conference on Learning Representations, 2021 Link: https://arxiv.org/abs/2107.04619 Abstract: Generating future frames given a few context (or past) frames is a challenging task. It requires modeling the temporal coherence of videos and multi-modality in terms of diversity in the potential future states. Current variational approaches for video generation tend to marginalize over multi-modal future outcomes. Instead, we propose to explicitly model the multi-modality in the future outcomes and leverage it to sample diverse futures. Our approach, Diverse Video Generator, uses a Gaussian Process (GP) to learn priors on future states given the past and maintains a probability distribution over possible futures given a particular sample. In addition, we leverage the changes in this distribution over time to control the sampling of diverse future states by estimating the end of ongoing sequences. That is, we use the variance of GP over the output function space to trigger a change in an action sequence. We achieve state-of-the-art results on diverse future frame generation in terms of reconstruction quality and diversity of the generated sequences.
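The variance-based trigger mentioned above can be sketched with a textbook Gaussian Process. The example below is a toy under stated assumptions, not the paper's model: it uses scalar inputs, a unit-variance RBF kernel, a fixed length scale, and a hypothetical threshold of 0.5, whereas the paper applies a GP to learned video states. It computes the GP posterior variance at query points and flags those where the variance exceeds the threshold.

```python
import numpy as np


def rbf_kernel(a, b, length_scale=1.0):
    """Unit-variance RBF kernel between two sets of scalar inputs."""
    a = np.asarray(a, dtype=float).reshape(-1, 1)
    b = np.asarray(b, dtype=float).reshape(-1, 1)
    return np.exp(-0.5 * ((a - b.T) / length_scale) ** 2)


def gp_posterior_variance(x_train, x_query, length_scale=1.0, noise=1e-6):
    """Posterior variance k(x,x) - k(x,X) (K + noise*I)^-1 k(X,x) per query."""
    k_tt = rbf_kernel(x_train, x_train, length_scale)
    k_tt = k_tt + noise * np.eye(len(x_train))
    k_qt = rbf_kernel(x_query, x_train, length_scale)
    solved = np.linalg.solve(k_tt, k_qt.T)        # shape (n_train, n_query)
    variance = 1.0 - np.sum(k_qt * solved.T, axis=1)
    return np.clip(variance, 0.0, None)           # guard against round-off


def should_trigger(variance, threshold=0.5):
    """Flag query points whose predictive variance exceeds the threshold."""
    return variance > threshold


if __name__ == "__main__":
    seen = [0.0, 1.0, 2.0]        # inputs the GP has observed
    ahead = [1.0, 2.5, 10.0]      # inputs at which to test the trigger
    var = gp_posterior_variance(seen, ahead)
    print("variance:", var)
    print("trigger:", should_trigger(var))
```

Variance stays near zero close to observed inputs and approaches the prior variance far from them, which is the signal a trigger of this kind relies on.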
【31】 Work in Progress -- Automated Generation of Robotic Planning Domains from Observations
Authors: Maximilian Diehl, Karinne Ramirez-Amaro Affiliation: Chalmers University of Technology Comments: Accepted at Ubiquitous Robots 2021 -- 4 pages Link: https://arxiv.org/abs/2107.04614 Abstract: In this paper, we report the results of our latest work on the automated generation of planning operators from human demonstrations, and we present some of our future research ideas. To automatically generate planning operators, our system segments and recognizes different observed actions from human demonstrations. We then propose an automatic extraction method to detect the relevant preconditions and effects from these demonstrations. Finally, our system generates the associated planning operators and finds a sequence of actions that satisfies a user-defined goal using a symbolic planner. The plan is deployed on a simulated TIAGo robot. Our future research directions include learning from and explaining execution failures and detecting cause-effect relationships between demonstrated hand activities and their consequences on the robot's environment. The former is crucial for trust-based and efficient human-robot collaboration and the latter for learning in realistic and dynamic environments.
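A planning operator with preconditions and effects, as referred to above, can be represented in a STRIPS-like form. The sketch below is a generic illustration rather than the authors' representation: the `Operator` class, the add/delete split of effects, and the predicate strings such as `hand_empty` and `on_table(cup)` are all hypothetical.

```python
from dataclasses import dataclass


@dataclass(frozen=True)
class Operator:
    """STRIPS-like operator: applicable when all preconditions hold in a state."""
    name: str
    preconditions: frozenset
    add_effects: frozenset
    delete_effects: frozenset

    def applicable(self, state):
        """True if every precondition is a fact in the state."""
        return self.preconditions <= state

    def apply(self, state):
        """Return the successor state: remove deleted facts, then add new ones."""
        if not self.applicable(state):
            raise ValueError(f"{self.name} is not applicable in this state")
        return (state - self.delete_effects) | self.add_effects


if __name__ == "__main__":
    # Hypothetical operator, as might be extracted from a pick-up demonstration.
    pick_cup = Operator(
        name="pick(cup)",
        preconditions=frozenset({"hand_empty", "on_table(cup)"}),
        add_effects=frozenset({"holding(cup)"}),
        delete_effects=frozenset({"hand_empty", "on_table(cup)"}),
    )
    start = frozenset({"hand_empty", "on_table(cup)"})
    print(sorted(pick_cup.apply(start)))
```

A symbolic planner searches over sequences of such operators until the goal facts are contained in the resulting state.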