This work presents steer-to-optimal-target (S2OT), a novel control algorithm for effective real-time planning in redirected walking. S2OT estimates the optimal steering target, that is, the target that avoids collisions along the user's future path, from the user's virtual and physical paths. We design and train a machine learning model that estimates this target through reinforcement learning, specifically Deep Q-Learning. Compared to well-known algorithms such as steer-to-center (S2C) and Model Predictive Control Redirection (MPCred), S2OT significantly reduces the number of resets caused by collisions between the user and the boundaries of the physical space. The results are consistent across all combinations of room-scale and large-scale physical spaces and virtual maps with or without predefined paths. S2OT is also fast to compute, taking 0.763 ms per redirection, which is sufficient for real-time redirected walking.
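
To make the Deep Q-Learning formulation concrete, the following minimal Python sketch shows how a steering target might be chosen from learned Q-values and how a standard Q-learning training target is formed. It is an illustration under stated assumptions, not the model described in this work: it assumes a discrete set of candidate steering targets with one Q-value each, and the function names, the reward convention, and the discount factor are all hypothetical.

```python
import math
import random


def select_target(q_values, epsilon, rng=random):
    """Epsilon-greedy choice over candidate steering targets.

    q_values[i] is the estimated value of steering toward candidate i
    (assumption: targets form a discrete, indexed set).
    """
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))  # explore
    return max(range(len(q_values)), key=lambda i: q_values[i])  # exploit


def td_target(reward, next_q_values, gamma, done):
    """Standard Q-learning target: r + gamma * max_a' Q(s', a').

    `done` marks a terminal transition (assumption: e.g. end of an episode),
    in which case no future value is bootstrapped.
    """
    if done:
        return reward
    return reward + gamma * max(next_q_values)


def steering_angle(position, heading, target):
    """Signed angle in radians, wrapped to [-pi, pi], from the user's
    heading to the direction of the steering target in physical space."""
    to_target = math.atan2(target[1] - position[1], target[0] - position[0])
    diff = to_target - heading
    return math.atan2(math.sin(diff), math.cos(diff))


if __name__ == "__main__":
    # Hypothetical Q-values for three candidate targets.
    best = select_target([0.1, 0.9, 0.3], epsilon=0.0)
    targets = [(1.0, 0.0), (0.0, 1.0), (-1.0, 0.0)]
    angle = steering_angle((0.0, 0.0), 0.0, targets[best])
    print(best, angle)
```

With `epsilon=0.0` the choice is purely greedy, so the candidate with the highest Q-value is selected. The sign of `steering_angle` indicates which way the user would need to be rotated toward that target; the same helper would apply to S2C by passing the center of the physical space as the target.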