A kernel based true online Sarsa(λ) for continuous space control problems

Fei Zhu1, 2, Haijun Zhu1, Yuchen Fu3 and Xiaoke Zhou4

  1. School of Computer Science and Technology, Soochow University
    Shizi Street No.1 158 box, 215006, Suzhou, Jiangsu, China
    zhufei@suda.edu.cn, 1017942265@qq.com, dhchen@suda.edu.cn
  2. Provincial Key Laboratory for Computer Information Processing Technology, Soochow University
    Shizi Street No.1 158 box, 215006, Suzhou, Jiangsu, China
  3. School of Computer Science and Engineering, Changshu Institute of Technology
    yuchenfu@suda.edu.cn
  4. University of Basque Country
    Spain
    xzhou001@ikasle.ehu.eus

Abstract

Reinforcement learning is an effective approach to control problems: an agent learns an optimal policy by interacting with its environment. However, it suffers from low convergence accuracy and slow convergence, and conventional reinforcement learning algorithms can hardly solve continuous control problems. Kernel-based methods can speed up convergence and improve convergence accuracy, and policy gradient methods are well suited to continuous space problems. We propose a Sarsa(λ) version of the true online temporal difference algorithm, named True Online Sarsa(λ) (TOSarsa(λ)), built on a clustering-based sample specification method and a selective kernel-based value function. The results of TOSarsa(λ) are consistent between the forward view and the backward view, which ensures that an optimal policy is obtained in less time. We also combine TOSarsa(λ) with heuristic dynamic programming. Experiments show that the proposed algorithm works well on continuous control problems.
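
As a rough illustration of the true online idea mentioned above, the sketch below shows a single weight update of the generic textbook true online Sarsa(λ) rule with a linear value function. It is not the kernel-based TOSarsa(λ) of this paper: the function name `true_online_sarsa_step`, the plain feature vectors, and the parameter values are all assumptions made for the example, and the clustering-based sample selection and kernel features are left out.

```python
def dot(u, v):
    """Inner product of two equal-length lists."""
    return sum(a * b for a, b in zip(u, v))


def true_online_sarsa_step(w, z, q_old, x, x_next, reward, alpha, gamma, lam):
    """One true online Sarsa(lambda) update with linear features.

    w      : weight vector
    z      : eligibility (dutch) trace
    q_old  : estimate of the current pair carried over from the previous step
    x      : features of the current state-action pair
    x_next : features of the next pair (all zeros if terminal)
    Hypothetical helper for illustration; not the paper's kernel-based variant.
    """
    q = dot(w, x)
    q_next = dot(w, x_next)
    delta = reward + gamma * q_next - q  # TD error
    zx = dot(z, x)
    # Dutch trace update, computed from the old trace.
    z = [gamma * lam * zi + (1.0 - alpha * gamma * lam * zx) * xi
         for zi, xi in zip(z, x)]
    # Weight update with the correction term that distinguishes the
    # true online rule from conventional Sarsa(lambda).
    w = [wi + alpha * (delta + q - q_old) * zi - alpha * (q - q_old) * xi
         for wi, zi, xi in zip(w, z, x)]
    return w, z, q_next


# Toy usage: two one-hot features, one transition with reward 1.
w, z, q_old = [0.0, 0.0], [0.0, 0.0], 0.0
w, z, q_old = true_online_sarsa_step(
    w, z, q_old, x=[1.0, 0.0], x_next=[0.0, 1.0],
    reward=1.0, alpha=0.5, gamma=0.9, lam=0.8)
```

In a kernel-based setting, the feature vector `x` would presumably be replaced by kernel evaluations against a set of selected samples; that step is specific to the paper and is not reproduced here.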

Key words

reinforcement learning, kernel method, true online, policy gradient, Sarsa(λ)

Digital Object Identifier (DOI)

https://doi.org/10.2298/CSIS170107029Z

Publication information

Volume 14, Issue 3 (September 2017)
Advances in Information Technology, Distributed and Model Driven Systems
Year of Publication: 2017
ISSN: 2406-1018 (Online)
Publisher: ComSIS Consortium

How to cite

Zhu, F., Zhu, H., Fu, Y., Zhou, X.: A kernel based true online Sarsa(λ) for continuous space control problems. Computer Science and Information Systems, Vol. 14, No. 3, 789–804. (2017), https://doi.org/10.2298/CSIS170107029Z