CPL - Chalmers Publication Library
Infinite horizon discounted dynamic programming subject to total variation ambiguity on conditional distribution

I. Tzortzis ; C. D. Charalambous ; Themistoklis Charalambous (Department of Signals and Systems, Communication Systems)
Proceedings of the 55th IEEE Conference on Decision and Control (CDC 2016); Las Vegas; United States; 12-14 December 2016 (ISSN 0743-1546). Art. no. 7798559, pp. 2010-2015. (2016)
[Conference paper, peer-reviewed]

We analyze the infinite horizon minimax discounted cost Markov Control Model (MCM) for a class of controlled process conditional distributions. These distributions belong to a ball, with respect to the total variation distance metric, centered at a known nominal controlled conditional distribution with radius R ∈ [0, 2]. The minimization is over the control strategies and the maximization is over the conditional distributions. Through our analysis (i) we derive a new discounted dynamic programming equation, (ii) we show the associated contraction property, and (iii) we develop a new policy iteration algorithm. Finally, the application of the new dynamic programming equation and the corresponding policy iteration algorithm is shown via an illustrative example.
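To make the minimax setup concrete, the following is a minimal sketch and not the paper's algorithm. It assumes finite state and control spaces, and it takes the total variation distance to be the L1 norm of the difference between two distributions, which is consistent with R ∈ [0, 2]. It uses plain value iteration in place of the policy iteration the abstract mentions. All function names, array layouts, and the numbers in the example are hypothetical.

```python
import numpy as np


def worst_case_expectation(mu, v, R):
    """Maximize sum(nu * v) over distributions nu with ||nu - mu||_1 <= R.

    Assumption: finite state space, total variation taken as the L1 norm.
    """
    nu = np.array(mu, dtype=float)
    v = np.asarray(v, dtype=float)
    top = int(np.argmax(v))
    # At most R/2 of mass can be added, and no more than the mass held elsewhere.
    alpha = min(R / 2.0, 1.0 - nu[top])
    nu[top] += alpha
    remaining = alpha
    # Remove the same amount of mass, starting from the lowest-value states.
    for i in np.argsort(v):
        if remaining <= 0.0:
            break
        if i == top:
            continue
        take = min(nu[i], remaining)
        nu[i] -= take
        remaining -= take
    return float(nu @ v)


def robust_value_iteration(P, cost, R, beta, tol=1e-10, max_iter=10_000):
    """Iterate V(x) <- min_u [cost(x, u) + beta * max_nu E_nu V].

    P[u, x, y] is the nominal transition probability (hypothetical layout),
    cost[x, u] is the stage cost, beta in (0, 1) is the discount factor.
    """
    n_u, n_x, _ = P.shape
    V = np.zeros(n_x)
    for _ in range(max_iter):
        Q = np.array([[cost[x, u] + beta * worst_case_expectation(P[u, x], V, R)
                       for u in range(n_u)] for x in range(n_x)])
        V_new = Q.min(axis=1)
        converged = np.max(np.abs(V_new - V)) < tol
        V = V_new
        if converged:
            break
    # The returned policy is greedy with respect to the last evaluated Q.
    return V, Q.argmin(axis=1)


# Hypothetical two-state, two-control example.
P = np.array([[[0.9, 0.1], [0.2, 0.8]],
              [[0.5, 0.5], [0.6, 0.4]]])
cost = np.array([[1.0, 2.0], [3.0, 0.5]])
V_nominal, _ = robust_value_iteration(P, cost, R=0.0, beta=0.9)
V_robust, policy = robust_value_iteration(P, cost, R=0.5, beta=0.9)
```

With R = 0 the inner maximization leaves the nominal distribution unchanged, so the recursion reduces to ordinary discounted value iteration. Increasing R can only raise the computed value function, since the maximizer is given a larger set of distributions.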

