This paper addresses the problem that policy iteration (PI) for continuous-time (CT) systems requires exploration of the state space, a requirement known in the adaptive control community as persistency of excitation. To solve this problem, it proposes a PI scheme that is explorized by an additional probing signal. The proposed PI method efficiently finds the associated CT linear quadratic (LQ) optimal control online without knowledge of the system matrix A, and its stability and convergence to the LQ optimal control are proven in the presence of the probing signal. A design method for the probing signal is also presented to balance exploration of the state space against control performance. Finally, several simulation results are provided to verify the effectiveness of the proposed explorized PI method.
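To make the idea concrete, the following is a minimal scalar sketch of data-driven PI with a probing signal. It is not the algorithm from the paper, whose details are not given in this abstract. It assumes a scalar plant dx/dt = a x + b u with known b, cost integrand q x^2 + r u^2, and input u = -k x + e(t), and it evaluates each policy from the integral identity p [x(t+T)^2 - x(t)^2 - 2b ∫ x e dt] = -∫ (q + r k^2) x^2 dt, which follows from the Lyapunov equation for the gain k. The function names, the sinusoidal probing signal, and all numerical values are hypothetical choices for illustration.

```python
import math


def rk4_interval(x, t, k, a, b, q, r, probe, T, dt=1e-3):
    """Simulate the plant over [t, t+T] and return (x(t+T), cost integral, x*e integral).

    The true parameter `a` is used only here, to generate data; the learner never sees it.
    """
    def deriv(s, tau):
        e = probe(tau)
        u = -k * s[0] + e                      # current policy plus probing signal
        return (a * s[0] + b * u,              # plant dynamics
                (q + r * k * k) * s[0] ** 2,   # running cost of the policy u = -k x
                s[0] * e)                      # cross term caused by the probing signal

    def step(s, d, h):
        return tuple(si + h * di for si, di in zip(s, d))

    s = (x, 0.0, 0.0)
    for i in range(round(T / dt)):
        tau = t + i * dt
        d1 = deriv(s, tau)
        d2 = deriv(step(s, d1, dt / 2), tau + dt / 2)
        d3 = deriv(step(s, d2, dt / 2), tau + dt / 2)
        d4 = deriv(step(s, d3, dt), tau + dt)
        s = tuple(si + dt / 6 * (p1 + 2 * p2 + 2 * p3 + p4)
                  for si, p1, p2, p3, p4 in zip(s, d1, d2, d3, d4))
    return s


def probed_policy_iteration(a, b, q, r, k0, probe, iters=6, intervals=10, T=0.1):
    """Scalar PI sketch: evaluate p from measured data, then improve k = b p / r."""
    k, x, t = k0, 1.0, 0.0
    p = None
    for _ in range(iters):
        num = den = 0.0
        for _ in range(intervals):
            x_next, cost, xe = rk4_interval(x, t, k, a, b, q, r, probe, T)
            # Regressor with the probing-signal contribution removed (uses known b only).
            phi = x_next ** 2 - x ** 2 - 2.0 * b * xe
            num += phi * (-cost)
            den += phi * phi
            x, t = x_next, t + T
        p = num / den          # least-squares policy evaluation
        k = b * p / r          # policy improvement
    return p, k


if __name__ == "__main__":
    probe = lambda tau: 0.1 * math.sin(5.0 * tau)   # hypothetical probing signal
    p, k = probed_policy_iteration(a=1.0, b=1.0, q=1.0, r=1.0, k0=3.0, probe=probe)
    print(p, k)
```

For the values used here (a = b = q = r = 1), the scalar Riccati equation has the solution p = 1 + √2, so the learned gain can be checked against that value. The initial gain k0 = 3 is chosen to be stabilizing, which PI requires. The probing signal keeps the state from decaying to zero, so the regressor stays informative in later iterations; its amplitude trades exploration against control performance, which is the balance the abstract refers to.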
|Number of pages||8|
|Journal||Transactions of the Korean Institute of Electrical Engineers|
|Publication status||Published - 2012 Mar 1|
All Science Journal Classification (ASJC) codes
- Electrical and Electronic Engineering