Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems

Tae Yoon Chun, Jin Bae Park, Yoon Ho Choi

Research output: Chapter in Book/Report/Conference proceeding › Conference contribution

1 Citation (Scopus)

Abstract

This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is one of the reinforcement-learning-based dynamic programming (DP) methods for solving optimal control problems, and it interleaves policy evaluation and policy improvement steps. To analyze the convergence and stability of GPI, several equivalent equations are derived. As a result, the PI-mode monotone convergence (in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions that are closely related to the Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.
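
To make the algorithm family concrete, below is a minimal Python sketch of GPI applied to a discrete-time linear-quadratic regulation (LQR) problem. The system matrices A, B, Q, R, the iteration counts, and the variable names are illustrative assumptions, not taken from the paper. The parameter N_EVAL sets how many policy-evaluation sweeps run per improvement step, which is what places GPI between value iteration (one sweep) and PI (evaluation to convergence, the "PI mode" studied here).

```python
import numpy as np

# GPI sketch for a discrete-time LQR problem:
#   x_{k+1} = A x_k + B u_k,   cost = sum_k (x_k' Q x_k + u_k' R u_k),
# with quadratic value function V(x) = x' P x and linear policy u = -K x.
# Hypothetical system/cost matrices chosen only for illustration.
A = np.array([[0.9, 0.1],
              [0.0, 0.8]])
B = np.array([[0.0],
              [0.1]])
Q = np.eye(2)
R = np.eye(1)

K = np.zeros((1, 2))   # initial policy gain (stabilizing here, since A is stable)
P = np.zeros((2, 2))   # initial value-function matrix

N_EVAL = 10            # evaluation sweeps per cycle: 1 ~ value iteration,
                       # large N_EVAL ~ policy iteration ("PI mode")

for _ in range(50):                          # GPI outer loop
    Ac = A - B @ K                           # closed-loop dynamics under u = -Kx
    for _ in range(N_EVAL):                  # partial policy evaluation:
        P = Q + K.T @ R @ K + Ac.T @ P @ Ac  # Lyapunov-recursion sweeps
    # Policy improvement: greedy gain with respect to the current P.
    K = np.linalg.solve(R + B.T @ P @ B, B.T @ P @ A)

print("GPI gain K:\n", K)   # approaches the optimal LQR gain
```

Under a stabilizing initial policy and a suitable initial P (the kind of Lyapunov-related initial condition the paper analyzes), the sequence of P matrices produced this way converges monotonically, as it would under pure PI.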

Original language: English
Title of host publication: ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems
Pages: 454-458
Number of pages: 5
DOIs: 10.1109/ICCAS.2013.6703973
Publication status: Published - 2013 Dec 1
Event: 2013 13th International Conference on Control, Automation and Systems, ICCAS 2013 - Gwangju, Korea, Republic of
Duration: 2013 Oct 20 - 2013 Oct 23

Publication series

Name: International Conference on Control, Automation and Systems
ISSN (Print): 1598-7833

Other

Other: 2013 13th International Conference on Control, Automation and Systems, ICCAS 2013
Country: Korea, Republic of
City: Gwangju
Period: 13/10/20 - 13/10/23

Fingerprint

  • Linear systems
  • Reinforcement learning
  • Dynamic programming
  • Computer simulation

All Science Journal Classification (ASJC) codes

  • Artificial Intelligence
  • Computer Science Applications
  • Control and Systems Engineering
  • Electrical and Electronic Engineering

Cite this

Chun, T. Y., Park, J. B., & Choi, Y. H. (2013). Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. In ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems (pp. 454-458). [6703973] (International Conference on Control, Automation and Systems). https://doi.org/10.1109/ICCAS.2013.6703973
Chun, Tae Yoon ; Park, Jin Bae ; Choi, Yoon Ho. / Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems. 2013. pp. 454-458 (International Conference on Control, Automation and Systems).
@inproceedings{e84b0678beee4c0db870edd4f94fc18c,
title = "Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems",
abstract = "This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is one of the reinforcement-learning-based dynamic programming (DP) methods for solving optimal control problems, and it interleaves policy evaluation and policy improvement steps. To analyze the convergence and stability of GPI, several equivalent equations are derived. As a result, the PI-mode monotone convergence (in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions that are closely related to the Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.",
author = "Chun, {Tae Yoon} and Park, {Jin Bae} and Choi, {Yoon Ho}",
year = "2013",
month = "12",
day = "1",
doi = "10.1109/ICCAS.2013.6703973",
language = "English",
isbn = "9788993215052",
series = "International Conference on Control, Automation and Systems",
pages = "454--458",
booktitle = "ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems",

}

Chun, TY, Park, JB & Choi, YH 2013, Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. in ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems., 6703973, International Conference on Control, Automation and Systems, pp. 454-458, 2013 13th International Conference on Control, Automation and Systems, ICCAS 2013, Gwangju, Korea, Republic of, 13/10/20. https://doi.org/10.1109/ICCAS.2013.6703973

Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. / Chun, Tae Yoon; Park, Jin Bae; Choi, Yoon Ho.

ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems. 2013. p. 454-458 6703973 (International Conference on Control, Automation and Systems).

Research output: Chapter in Book/Report/Conference proceedingConference contribution

TY - GEN

T1 - Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems

AU - Chun, Tae Yoon

AU - Park, Jin Bae

AU - Choi, Yoon Ho

PY - 2013/12/1

Y1 - 2013/12/1

N2 - This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is one of the reinforcement-learning-based dynamic programming (DP) methods for solving optimal control problems, and it interleaves policy evaluation and policy improvement steps. To analyze the convergence and stability of GPI, several equivalent equations are derived. As a result, the PI-mode monotone convergence (in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions that are closely related to the Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.

AB - This paper presents the policy iteration (PI)-mode monotone convergence and stability properties of generalized policy iteration (GPI) algorithms for discrete-time (DT) linear systems. GPI is one of the reinforcement-learning-based dynamic programming (DP) methods for solving optimal control problems, and it interleaves policy evaluation and policy improvement steps. To analyze the convergence and stability of GPI, several equivalent equations are derived. As a result, the PI-mode monotone convergence (in which GPI behaves like PI) and the stability of the GPI algorithm are proved under certain initial conditions that are closely related to the Lyapunov approach. Finally, numerical simulations are performed to verify the proposed convergence and stability properties.

UR - http://www.scopus.com/inward/record.url?scp=84893524435&partnerID=8YFLogxK

UR - http://www.scopus.com/inward/citedby.url?scp=84893524435&partnerID=8YFLogxK

U2 - 10.1109/ICCAS.2013.6703973

DO - 10.1109/ICCAS.2013.6703973

M3 - Conference contribution

AN - SCOPUS:84893524435

SN - 9788993215052

T3 - International Conference on Control, Automation and Systems

SP - 454

EP - 458

BT - ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems

ER -

Chun TY, Park JB, Choi YH. Policy iteration-mode monotone convergence of generalized policy iteration for discrete-time linear systems. In ICCAS 2013 - 2013 13th International Conference on Control, Automation and Systems. 2013. p. 454-458. 6703973. (International Conference on Control, Automation and Systems). https://doi.org/10.1109/ICCAS.2013.6703973