CPL - Chalmers Publication Library

Probabilistic inverse reinforcement learning in unknown environments

Aristide C. Y. Tossou ; Christos Dimitrakakis (Department of Computer Science and Engineering, Computer Science, Algorithms (Chalmers))
Conference on Uncertainty in Artificial Intelligence, UAI 2013 (2013)
[Conference paper, peer-reviewed]

We consider the problem of learning by demonstration from agents acting in unknown stochastic Markov environments or games. Our aim is to estimate agent preferences in order to construct improved policies for the same task that the agents are trying to solve. To do so, we extend previous probabilistic approaches for inverse reinforcement learning in known MDPs to the case of unknown dynamics or opponents. We do this by deriving two simplified probabilistic models of the demonstrator's policy and utility. For tractability, we use maximum a posteriori estimation rather than full Bayesian inference. Under a flat prior, this results in a convex optimisation problem. We find that the resulting algorithms are highly competitive against a variety of other methods for inverse reinforcement learning that do have knowledge of the dynamics.
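The abstract's key computational point, that MAP estimation under a flat prior reduces to a convex (here, concave log-likelihood) optimisation problem, can be illustrated with a minimal sketch. This is not the paper's algorithm; it is a hypothetical toy in which the demonstrator's policy is modelled as a softmax over state-action utilities, and the utilities are recovered from demonstrations by gradient ascent on the concave log-likelihood (all names and sizes are illustrative):

```python
import numpy as np

rng = np.random.default_rng(0)
n_states, n_actions = 4, 3

def softmax(x):
    e = np.exp(x - x.max(axis=-1, keepdims=True))
    return e / e.sum(axis=-1, keepdims=True)

# Hidden "true" utilities generate the demonstrations via a softmax policy.
true_u = rng.normal(size=(n_states, n_actions))
demos = []
for _ in range(2000):
    s = rng.integers(n_states)
    a = rng.choice(n_actions, p=softmax(true_u[s]))
    demos.append((s, a))

# Sufficient statistics: per-state action counts.
counts = np.zeros((n_states, n_actions))
for s, a in demos:
    counts[s, a] += 1.0

# With a flat prior, MAP estimation is maximum likelihood.  The
# log-likelihood sum_i log softmax(u[s_i])[a_i] is concave in u, so
# plain gradient ascent converges to the global optimum.
u = np.zeros((n_states, n_actions))
n, lr = len(demos), 1.0
for _ in range(500):
    grad = counts - counts.sum(axis=1, keepdims=True) * softmax(u)
    u += lr * grad / n

# The estimated policy should be close to the demonstrator's.
est_pi = softmax(u)
print(np.abs(est_pi - softmax(true_u)).max())
```

At the optimum, the softmax policy matches the empirical per-state action frequencies, which is what the concavity guarantees; the gap to the true policy is then just sampling noise from the finite demonstration set.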


This record was created on 2013-12-17. Last modified on 2016-09-28.
CPL Pubid: 189616