

This is a master's thesis from the Technical University of Lisbon, Portugal (author: Daniel Luis Simões Marta), 95 pages in total.



This thesis focuses on the challenge of decoupling state perception and function approximation when applying Deep Learning methods within Reinforcement Learning. As a starting point, high-dimensional states were considered, since they are the fundamental limitation when applying Reinforcement Learning to real-world tasks. To address the Curse of Dimensionality, we propose reducing the dimensionality of the data in order to obtain succinct codes (internal representations of the environment), to be used as alternative states in a Reinforcement Learning framework. Different approaches have been taken over the last few decades, including Kernel Machines with hand-crafted features, where the choice of appropriate filters was task dependent and consumed a considerable amount of research. In this work, various Deep Learning methods with unsupervised learning mechanisms were considered. Another key theme is the estimation of Q-values for large state spaces, where tabular approaches are no longer feasible. As a means to perform Q-function approximation, we look to supervised learning methods within Deep Learning. The objectives of this thesis include a detailed exploration and understanding of the proposed methods, together with the implementation of a neural controller. Several simulations were performed, taking into account a variety of optimization procedures and increased parameters, to draw several conclusions. Several architectures were used for Q-value function approximation. To identify better approaches and give hints for larger-scale applications, a comparison between two similar types of Q-networks was conducted. Implementations of state-of-the-art techniques were tested on classic control problems.
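The two-stage idea in the abstract (compress a high-dimensional state into a short code, then approximate Q-values on the code instead of using a table) can be illustrated with a minimal sketch. It is not the thesis's implementation: a fixed random linear projection stands in for the trained unsupervised encoder, and a linear Q-function stands in for the Q-network. All names (`encode`, `q_values`, `q_update`), the dimensions, and the hyperparameters `alpha` and `gamma` are assumptions chosen for illustration.

```python
import random


def encode(state, weights):
    """Map a high-dimensional state to a short code.

    Assumption: a fixed linear projection replaces the trained encoder.
    """
    return [sum(w * s for w, s in zip(row, state)) for row in weights]


def q_values(code, theta):
    """Linear Q-function approximator: one weight vector per action."""
    return [sum(t * c for t, c in zip(row, code)) for row in theta]


def q_update(theta, code, action, reward, next_code, done, alpha=0.1, gamma=0.9):
    """One semi-gradient Q-learning step on the weights of the chosen action."""
    target = reward if done else reward + gamma * max(q_values(next_code, theta))
    td_error = target - q_values(code, theta)[action]
    theta[action] = [t + alpha * td_error * c for t, c in zip(theta[action], code)]
    return td_error


rng = random.Random(0)
STATE_DIM, CODE_DIM, N_ACTIONS = 16, 3, 2  # hypothetical sizes

# Stand-in encoder weights and zero-initialised Q weights.
encoder = [[rng.uniform(-1, 1) for _ in range(STATE_DIM)] for _ in range(CODE_DIM)]
theta = [[0.0] * CODE_DIM for _ in range(N_ACTIONS)]

# A single made-up transition (state, action, reward, next_state).
state = [rng.random() for _ in range(STATE_DIM)]
next_state = [rng.random() for _ in range(STATE_DIM)]
code, next_code = encode(state, encoder), encode(next_state, encoder)

td_error = q_update(theta, code, action=1, reward=1.0, next_code=next_code, done=False)
```

The Q-function here only ever sees the 3-dimensional code, never the 16-dimensional state, which is the decoupling of perception from function approximation that the abstract describes. In the thesis both stages are neural networks, so the update would be a gradient step on network weights rather than on a single weight vector.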

  1. Introduction
  2. Deep Learning Concepts
  3. Reinforcement Learning
  4. Experimental Architecture
  5. Experimental Results
  6. Conclusions
