Q-learning is the most representative reinforcement learning algorithm: it learns to make the best choice by using the reward obtained through interaction between the agent and its environment. However, standard Q-learning suffers performance degradation when the environment is complex, when multiple agents are involved, or when memory is limited. To address these problems, various algorithms such as deep Q-learning, modular Q-learning, and Nash Q-learning have been developed. In this paper, we evaluate their performance through grid world experiments comparing Q-learning, deep Q-learning, modular Q-learning, and Nash Q-learning. Q-learning performs better than deep Q-learning on simple problems; on harder problems, however, Q-learning is limited and deep Q-learning is more efficient. Nash Q-learning and modular Q-learning show similar performance overall, but Nash Q-learning performs better on harder problems.
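The tabular Q-learning update at the heart of all four compared algorithms can be sketched on a toy grid world. The environment below (a one-dimensional corridor with a reward at the right end) and all parameter values (`alpha`, `gamma`, `epsilon`, `N_STATES`) are illustrative assumptions, not the paper's experimental setup:

```python
import random

# Minimal tabular Q-learning sketch on a 1-D corridor "grid world".
# States 0..4; reaching state 4 yields reward 1 and ends the episode.
# All parameters here are illustrative, not taken from the paper.
N_STATES = 5
ACTIONS = [-1, +1]            # move left / move right
alpha, gamma, epsilon = 0.1, 0.9, 0.3

Q = {(s, a): 0.0 for s in range(N_STATES) for a in ACTIONS}

def step(s, a):
    """Apply action a in state s; return (next state, reward, done)."""
    s2 = min(max(s + a, 0), N_STATES - 1)
    r = 1.0 if s2 == N_STATES - 1 else 0.0
    return s2, r, s2 == N_STATES - 1

random.seed(0)
for episode in range(500):
    s, done = 0, False
    while not done:
        # epsilon-greedy action selection
        if random.random() < epsilon:
            a = random.choice(ACTIONS)
        else:
            a = max(ACTIONS, key=lambda act: Q[(s, act)])
        s2, r, done = step(s, a)
        # Q-learning update: move Q(s, a) toward the bootstrapped target
        target = r + (0.0 if done else gamma * max(Q[(s2, b)] for b in ACTIONS))
        Q[(s, a)] += alpha * (target - Q[(s, a)])
        s = s2

# The learned greedy policy should move right (+1) from every non-terminal state.
policy = {s: max(ACTIONS, key=lambda act: Q[(s, act)]) for s in range(N_STATES - 1)}
```

Deep Q-learning replaces the table `Q` with a neural network approximator, which is what lets it scale to the more complex problems where, as the experiments show, tabular Q-learning breaks down.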
Bibliographical note: Publisher Copyright © 2019 Pushpa Publishing House, Prayagraj, India.