Reinforcement learning with converging goal space and binary reward function

Wooseok Ro, Wonseok Jeon, Hamid Bamshad, Hyunseok Yang

Research output: Contribution to journal › Conference article › peer-review

Abstract

Learning from a sparse, binary reward function remains one of the most challenging problems in reinforcement learning. In particular, when the environments in which robotic agents learn are sufficiently vast, tasks are much harder to learn because the probability of reaching the goal is minimal. The Hindsight Experience Replay algorithm was proposed to overcome these difficulties; however, learning is still slowed and delayed when an agent cannot receive proper rewards at the beginning of the learning process. In this paper, we present a simple method called Converging Goal Space and Binary Reward Function, which helps agents learn tasks easily and efficiently in large environments while still providing a binary reward. At an early stage in training, a larger goal space margin makes rewards easier to obtain, which speeds up policy learning. As the number of successes increases, the goal space is gradually reduced to the size used in the test. We apply this reward function to two different task experiments, sliding and throwing, which must be explored over a wider range than the reach of the robotic arms, and then compare the learning efficiency to that of experiments that only employ a sparse and binary reward function. We show that the proposed reward function performs better in large environments using physics simulation, and we demonstrate that the function is applicable to real-world robotic arms.
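The mechanism described in the abstract can be illustrated with a minimal Python sketch. It is not the authors' implementation: the names `binary_reward` and `ConvergingMargin`, the multiplicative shrink rule, the `shrink_factor` parameter, and the 0 / -1 reward values are all assumptions made for illustration, since the abstract only states that the goal space margin starts large and is gradually reduced to the test size as successes accumulate.

```python
import numpy as np


def binary_reward(achieved, goal, margin):
    """Binary reward: 0.0 inside the goal space, -1.0 outside.

    The 0 / -1 values are an assumed convention, not taken from the abstract.
    """
    distance = np.linalg.norm(np.asarray(achieved) - np.asarray(goal))
    return 0.0 if distance <= margin else -1.0


class ConvergingMargin:
    """Hypothetical schedule that shrinks the goal margin on each success."""

    def __init__(self, initial_margin, test_margin, shrink_factor=0.9):
        self.margin = initial_margin        # large margin used early in training
        self.test_margin = test_margin      # margin used at test time
        self.shrink_factor = shrink_factor  # assumed multiplicative decay

    def record_success(self):
        # Shrink the margin, but never below the size used in the test.
        self.margin = max(self.test_margin, self.margin * self.shrink_factor)
        return self.margin


schedule = ConvergingMargin(initial_margin=0.5, test_margin=0.05, shrink_factor=0.5)
early_reward = binary_reward([0.3, 0.0], [0.0, 0.0], schedule.margin)
schedule.record_success()
later_reward = binary_reward([0.3, 0.0], [0.0, 0.0], schedule.margin)
```

In this sketch the same outcome, 0.3 units from the goal, counts as a success under the initial margin but not after the margin has shrunk, which is the intended effect: rewards are easy to obtain early and become as strict as the test condition later.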

Original language: English
Article number: 09249227
Pages (from-to): 921-927
Number of pages: 7
Journal: IEEE International Conference on Automation Science and Engineering
Volume: 2020-January
DOIs
Publication status: Published - 2020
Event: 16th IEEE International Conference on Automation Science and Engineering, CASE 2020 - Hong Kong, Hong Kong
Duration: 2020 Aug 20 → 2020 Aug 21

Bibliographical note

Funding Information:
*This research was supported by the Basic Science Research Program through the National Research Foundation of Korea (NRF) funded by the Ministry of Education (NRF-2018R1D1A1B07049267) and funded by the Korea government (MSIT) (2018R1A4A1025986).
†Corresponding author.
1Department of Mechanical Engineering, Yonsei University, Seodaemun-gu, Seoul 03722, Korea. {wsro0224, wsjeonno, hamidbamshad, hsyang}@yonsei.ac.kr

Publisher Copyright:
© 2020 IEEE.

All Science Journal Classification (ASJC) codes

  • Control and Systems Engineering
  • Electrical and Electronic Engineering

