[Information Technology] [2016.07] Multimodal Learning from Visual and Remotely Sensed Data

This is a PhD thesis from the University of Sydney, Australia (author: Dushyant Rao), 164 pages in total.

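Autonomous vehicles are often deployed in unknown environments to carry out exploration and monitoring missions. In such applications there is usually a trade-off between the information richness of a sensor modality and its acquisition cost. Visual data is typically very information-rich, but it must be collected in situ by the robot. Remotely sensed data, by contrast, covers a larger area and may be available before the mission begins. To explore and monitor an environment effectively and efficiently, it is essential to use all of the sensory information available to the robot. One important application is surveying the sea floor with an Autonomous Underwater Vehicle (AUV). An AUV can take high-resolution in-situ photographs of the sea floor, which can be used to classify regions into habitat classes that summarise the observed physical and biological properties; this is known as benthic habitat mapping. However, because an AUV can image only a tiny fraction of the sea floor, habitat mapping is usually performed with remotely sensed bathymetry (ocean depth) data obtained from shipborne multibeam sonar.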

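With the recent surge in unsupervised feature learning and deep learning, a number of existing techniques have explored multimodal learning: capturing the relationship between different sensor modalities in order to perform classification and other inference tasks. The thesis proposes related techniques for visual and remotely sensed data and applies them to autonomous exploration and monitoring with an AUV. Doing so enables more accurate classification of the benthic environment and also assists autonomous survey planning. The first contribution is the application of unsupervised feature learning to marine data: the proposed techniques extract features from image data and bathymetric data separately, and their performance is compared with that of the features traditionally used for each sensor modality. The second contribution is a multimodal learning architecture that captures the relationship between the two modalities. The model is robust to missing modalities, so it can extract better features for large-scale benthic habitat mapping when only bathymetry is available. It is used to perform classification with various combinations of modalities, showing that multimodal learning gives a large performance improvement over the baseline case. The third contribution extends the standard learning architecture with a gated feature learning model, which lets the model better capture the "one-to-many" relationship between visual and bathymetric data. This opens up further inference capabilities: visual features can be predicted from bathymetric data, which allows image-based queries. Such queries are useful for AUV survey planning, especially when supervised labels are unavailable. The final contribution is a set of information-theoretic measures to aid survey planning. The proposed measures predict the utility of unobserved areas in terms of the expected amount of additional visual information. They can therefore produce utility maps over a large region, which the AUV can use to determine the most informative locations from a set of candidate missions. The proposed models are validated through extensive experiments on real marine data. The techniques also have applications in other areas of robotics, so the thesis closes with a discussion of the broader implications of these contributions and the future research directions that arise from them.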

Autonomous vehicles are often deployed to perform exploration and monitoring missions in unseen environments. In such applications, there is often a compromise between the information richness and the acquisition cost of different sensor modalities. Visual data is usually very information-rich, but requires in-situ acquisition with the robot. In contrast, remotely sensed data has a larger range and footprint, and may be available prior to a mission. In order to effectively and efficiently explore and monitor the environment, it is critical to make use of all of the sensory information available to the robot. One important application is the use of an Autonomous Underwater Vehicle (AUV) to survey the ocean floor. AUVs can take high resolution in-situ photographs of the sea floor, which can be used to classify different regions into various habitat classes that summarise the observed physical and biological properties. This is known as benthic habitat mapping. However, since AUVs can only image a tiny fraction of the ocean floor, habitat mapping is usually performed with remotely sensed bathymetry (ocean depth) data, obtained from shipborne multibeam sonar.

With the recent surge in unsupervised feature learning and deep learning techniques, a number of previous techniques have investigated the concept of multimodal learning: capturing the relationship between different sensor modalities in order to perform classification and other inference tasks. This thesis proposes related techniques for visual and remotely sensed data, applied to the task of autonomous exploration and monitoring with an AUV. Doing so enables more accurate classification of the benthic environment, and also assists autonomous survey planning. The first contribution of this thesis is to apply unsupervised feature learning techniques to marine data. The proposed techniques are used to extract features from image and bathymetric data separately, and the performance is compared to that with more traditionally used features for each sensor modality. The second contribution is the development of a multimodal learning architecture that captures the relationship between the two modalities. The model is robust to missing modalities, which means it can extract better features for large-scale benthic habitat mapping, where only bathymetry is available. The model is used to perform classification with various combinations of modalities, demonstrating that multimodal learning provides a large performance improvement over the baseline case. The third contribution is an extension of the standard learning architecture using a gated feature learning model, which enables the model to better capture the 'one-to-many' relationship between visual and bathymetric data. This opens up further inference capabilities, with the ability to predict visual features from bathymetric data, which allows image-based queries. Such queries are useful for AUV survey planning, especially when supervised labels are unavailable. The final contribution is the novel derivation of a number of information-theoretic measures to aid survey planning. The proposed measures predict the utility of unobserved areas, in terms of the amount of expected additional visual information. As such, they are able to produce utility maps over a large region that can be used by the AUV to determine the most informative locations from a set of candidate missions. The models proposed in this thesis are validated through extensive experiments on real marine data. Furthermore, the introduced techniques have applications in various other areas within robotics. As such, this thesis concludes with a discussion on the broader implications of these contributions, and the future research directions that arise as a result of this work.
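To make the idea of a utility map more concrete, here is a minimal Python sketch. The abstract does not state the thesis's actual measures, so this uses a stand-in: the Shannon entropy of a predicted habitat-class distribution per grid cell, assuming a more uncertain cell would yield more new visual information. The function names, grid cells, probabilities and mission names are all hypothetical.

```python
import math


def entropy(probs):
    """Shannon entropy (in nats) of a discrete probability distribution."""
    return -sum(p * math.log(p) for p in probs if p > 0.0)


def utility_map(class_probs):
    """Score each grid cell by the entropy of its predicted class distribution.

    Assumption: entropy stands in for 'expected additional visual information';
    the thesis derives its own measures, which are not reproduced here.
    """
    return {cell: entropy(p) for cell, p in class_probs.items()}


def rank_missions(missions, utilities):
    """Order candidate missions by the summed utility of the distinct cells they visit."""
    scores = {
        name: sum(utilities[cell] for cell in set(cells))
        for name, cells in missions.items()
    }
    return sorted(scores, key=scores.get, reverse=True)


# Hypothetical per-cell habitat class probabilities predicted from bathymetry alone.
class_probs = {
    (0, 0): [0.98, 0.01, 0.01],  # confidently one habitat: little to learn
    (0, 1): [0.34, 0.33, 0.33],  # nearly uniform: highly uncertain
    (1, 0): [0.70, 0.20, 0.10],
    (1, 1): [0.50, 0.50, 0.00],
}

# Hypothetical candidate missions, each a list of grid cells the AUV would image.
missions = {
    "transect_a": [(0, 0), (1, 0)],
    "transect_b": [(0, 1), (1, 1)],
}

utilities = utility_map(class_probs)
ranking = rank_missions(missions, utilities)
print(ranking)  # ['transect_b', 'transect_a']
```

In this toy setup, transect_b ranks first because it covers the two cells whose predicted habitat class is least certain.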

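  1. Introduction
  2. Background
  3. Learning Features from Marine Data
  4. Multimodal Learning from Visual and Bathymetric Features
  5. Extended Multimodal Learning with Gated Models
  6. Information-Theoretic Measures for AUV Survey Planning
  7. Conclusion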
