CPL - Chalmers Publication Library

Linear Bayesian reinforcement learning

Nikolaos Tziortziotis ; Christos Dimitrakakis (Department of Computer Science and Engineering, Computer Science, Algorithms (Chalmers)) ; Konstantinos Blekas
IJCAI 2013, Proceedings of the 23rd International Joint Conference on Artificial Intelligence (2013)
[Conference paper, peer-reviewed]

This paper proposes a simple linear Bayesian approach to reinforcement learning. We show that with an appropriate basis, a Bayesian linear Gaussian model is sufficient for accurately estimating the system dynamics, particularly when we allow for correlated noise. Policies are estimated by first sampling a transition model from the current posterior, and then performing approximate dynamic programming on the sampled model. This form of approximate Thompson sampling results in good exploration in unknown environments. The approach can also be seen as a Bayesian generalisation of least-squares policy iteration, where the empirical transition matrix is replaced with a sample from the posterior.
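The abstract outlines a loop: maintain a Bayesian linear Gaussian model of the dynamics, draw one model from the posterior, and plan on that sample (approximate Thompson sampling). The Python sketch below illustrates that loop on an assumed toy discrete-state problem; the one-hot basis phi, the prior scales, the known isotropic noise variance, the value-iteration planner, and the toy MDP are all illustrative choices, not details taken from the paper (which, for instance, also models correlated noise and uses least-squares policy iteration rather than value iteration).

    # Minimal sketch (not the authors' code) of posterior sampling + planning.
    # All constants and the toy environment are assumptions for illustration.
    import numpy as np

    rng = np.random.default_rng(0)

    # Toy linear-Gaussian dynamics model: s' ~ W @ phi(s, a) + noise,
    # with an assumed one-hot basis over (state, action) pairs.
    n_states, n_actions = 5, 2
    d = n_states * n_actions

    def phi(s, a):
        v = np.zeros(d)
        v[s * n_actions + a] = 1.0
        return v

    # Bayesian linear regression: Gaussian prior on each column of W and a
    # known (assumed) noise variance sigma2. The paper additionally allows
    # correlated noise, which this sketch omits for brevity.
    sigma2, tau2 = 0.1, 1.0
    Lam = np.eye(d) / tau2          # prior precision
    XtX = np.zeros((d, d))          # sufficient statistics of transitions
    XtY = np.zeros((d, n_states))   # targets: one-hot next-state encoding

    def posterior():
        """Mean and shared covariance of the posterior over W's columns."""
        prec = Lam + XtX / sigma2
        cov = np.linalg.inv(prec)
        mean = cov @ XtY / sigma2
        return mean, cov

    def sample_model():
        """Thompson step: draw one transition model W from the posterior."""
        mean, cov = posterior()
        return np.column_stack([rng.multivariate_normal(mean[:, j], cov)
                                for j in range(n_states)])

    def plan(W, reward, gamma=0.95, iters=200):
        """Approximate DP (value iteration) on the sampled model, after
        clipping and normalising its rows into transition probabilities."""
        P = np.maximum(W, 0)
        P /= np.maximum(P.sum(axis=1, keepdims=True), 1e-9)
        V = np.zeros(n_states)
        for _ in range(iters):
            Q = np.array([[reward[s] + gamma * P[s * n_actions + a] @ V
                           for a in range(n_actions)]
                          for s in range(n_states)])
            V = Q.max(axis=1)
        return Q.argmax(axis=1)    # greedy policy on the sampled model

    # Interaction loop: act greedily w.r.t. a fresh posterior sample.
    reward = np.zeros(n_states); reward[-1] = 1.0      # assumed goal reward
    true_P = rng.dirichlet(np.ones(n_states), size=d)  # hidden toy dynamics
    s = 0
    for step in range(500):
        if step % 50 == 0:                  # resample a model periodically
            policy = plan(sample_model(), reward)
        a = policy[s]
        s_next = rng.choice(n_states, p=true_P[s * n_actions + a])
        x, y = phi(s, a), np.eye(n_states)[s_next]
        XtX += np.outer(x, x)               # update posterior statistics
        XtY += np.outer(x, y)
        s = s_next

Resampling between planning phases is what drives the exploration the abstract mentions: an optimistic sample sends the agent into poorly modelled regions, and the posterior tightens as transitions accumulate.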


This record was created 2013-12-17. Last modified 2015-01-08.
CPL Pubid: 189619