RLPy is a framework for conducting sequential decision-making experiments with value-function-based reinforcement learning methods. It provides a modular toolbox in which the various components can be linked together to create experiments.
Setting up an RL experiment requires selecting the following 4 key components:

- Agent
- Policy
- Representation
- Domain
The Experiment class serves as the glue that connects all these pieces together.
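The wiring pattern described above can be sketched in plain Python. The classes below are illustrative stand-ins, not RLPy's actual API: they show how a Domain, a Representation, a Policy, and an Agent plug into an Experiment, using tabular Q-learning on a toy chain domain.

```python
import random

class ChainDomain:
    """Toy domain: a 3-state chain; entering the last state ends the episode with reward 1."""
    n_states, n_actions = 3, 2  # actions: 0 = left, 1 = right

    def reset(self):
        self.state = 0
        return self.state

    def step(self, action):
        self.state = max(0, self.state - 1) if action == 0 else min(self.n_states - 1, self.state + 1)
        done = self.state == self.n_states - 1
        return self.state, (1.0 if done else 0.0), done

class TabularRepresentation:
    """One Q-value per (state, action) pair."""
    def __init__(self, domain):
        self.q = [[0.0] * domain.n_actions for _ in range(domain.n_states)]

class EGreedyPolicy:
    """Greedy action with respect to Q, exploring randomly with probability epsilon."""
    def __init__(self, representation, epsilon=0.3):
        self.representation, self.epsilon = representation, epsilon

    def choose(self, state):
        qs = self.representation.q[state]
        if random.random() < self.epsilon:
            return random.randrange(len(qs))
        return qs.index(max(qs))

class QLearningAgent:
    """Standard Q-learning update applied to the tabular representation."""
    def __init__(self, policy, representation, alpha=0.5, gamma=0.9):
        self.policy, self.representation = policy, representation
        self.alpha, self.gamma = alpha, gamma

    def learn(self, s, a, r, s2, done):
        target = r + (0.0 if done else self.gamma * max(self.representation.q[s2]))
        self.representation.q[s][a] += self.alpha * (target - self.representation.q[s][a])

class Experiment:
    """The glue: runs the agent's policy on the domain and feeds back transitions."""
    def __init__(self, agent, domain):
        self.agent, self.domain = agent, domain

    def run(self, episodes=300):
        for _ in range(episodes):
            s, done = self.domain.reset(), False
            while not done:
                a = self.agent.policy.choose(s)
                s2, r, done = self.domain.step(a)
                self.agent.learn(s, a, r, s2, done)
                s = s2

random.seed(0)
domain = ChainDomain()
representation = TabularRepresentation(domain)
policy = EGreedyPolicy(representation, epsilon=0.3)
agent = QLearningAgent(policy, representation)
Experiment(agent, domain).run()
best_action = representation.q[0].index(max(representation.q[0]))  # 1 = "right", toward the goal
```

Note how each component only talks to its neighbors (the agent sees the representation and policy, the experiment sees the agent and domain), which is what lets RLPy swap realizations of each abstract class independently.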
If the full model of the MDP is known, Dynamic Programming techniques can be used to solve the MDP exactly. To set up a DP experiment, the following 3 components have to be specified:

- MDP Solver
- Representation
- Domain
Note
Each of the components mentioned here has several realizations in RLPy, but this website provides guidance only on the main abstract classes, namely: Agent, MDP Solver, Representation, Policy, Domain, and Experiment.
See also
The tutorial page provides simple 10-15 minute examples of how various experiments can be set up and used.
The project was partially funded by ONR and AFOSR grants.
If you use RLPy to conduct your research, please cite
Alborz Geramifard, Robert H. Klein, Christoph Dann, William Dabney, and Jonathan P. How. RLPy: The Reinforcement Learning Library for Education and Research, April 2013. http://acl.mit.edu/RLPy
Bibtex:
@ONLINE{RLPy,
author = {Alborz Geramifard and Robert H Klein and Christoph Dann and
William Dabney and Jonathan P How},
title = {{RLPy: The Reinforcement Learning Library for Education and Research}},
month = apr,
year = {2013},
howpublished = {\url{http://acl.mit.edu/RLPy}},
}
Feel free to join the rlpy mailing list, rlpy@mit.edu. This list is intended for open discussion of questions, potential improvements, etc.