.. _overview:

Overview
========

Vision
------

RLPy is a framework for conducting sequential decision-making experiments
that involve value-function-based approaches. It provides a modular toolbox
in which various components can be linked together to create experiments.

The Big Picture
---------------

.. image:: overview.*
   :width: 85 %

**Reinforcement Learning (RL)**
^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^^

Setting up an RL experiment requires selecting the following 4 key components:

1. :ref:`Agent `: This is the box where learning happens. Learning is often
   done by changing the weight vector corresponding to the features.
2. :ref:`Policy `: This box is responsible for generating actions based on
   the current state. The action-selection mechanism often depends on the
   estimated value function.
3. :ref:`Representation `: In this framework, we assume the use of linear
   function approximators to represent the value function. This box realizes
   the underlying representation used for capturing the value function. Note
   that the features used for approximation can be non-linear.
4. :ref:`Domain `: This box is the MDP that we are interested in solving.

The :ref:`Experiment ` class works as the glue that connects all these pieces
together; short code sketches at the end of this page show how.

**Dynamic Programming**
^^^^^^^^^^^^^^^^^^^^^^^

If the full model of the MDP is known, Dynamic Programming techniques can be
used to solve the MDP. To set up a DP experiment, the following 3 components
have to be selected:

1. :ref:`MDP Solver `: The dynamic programming algorithm.
2. :ref:`Representation `: Same as in the RL case. Note that the Value
   Iteration and Policy Iteration techniques can only be coupled with the
   tabular representation.
3. :ref:`Domain `: Same as in the RL case.

.. note::

   Each of the components mentioned here has several realizations in RLPy,
   yet this website provides guidance only on the main abstract classes,
   namely: :ref:`Agent `, :ref:`MDP Solver `, :ref:`Representation `,
   :ref:`Policy `, :ref:`Domain ` and :ref:`Experiment `.

.. seealso::

   The :ref:`tutorial page ` provides simple 10-15 minute examples of how
   various experiments can be set up and used.

Acknowledgements
================

The project was partially funded by **ONR** and **AFOSR** grants.

Citing RLPy
===========

If you use RLPy to conduct your research, please cite:

    Alborz Geramifard, Robert H Klein, Christoph Dann, William Dabney and
    Jonathan P How, "RLPy: The Reinforcement Learning Library for Education
    and Research", http://acl.mit.edu/RLPy, April 2013.

Bibtex::

    @ONLINE{RLPy,
      author = {Alborz Geramifard and Robert H Klein and Christoph Dann and William Dabney and Jonathan P How},
      title = {{RLPy: The Reinforcement Learning Library for Education and Research}},
      month = April,
      year = {2013},
      howpublished = {\url{http://acl.mit.edu/RLPy}},
    }

Staying Connected
=================

Feel free to join the rlpy mailing list, rlpy@mit.edu, by `clicking here `_.
This list is intended for open discussion about questions, potential
improvements, etc.
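A Minimal Example
=================

To make the big picture above concrete, the sketch below wires the four RL
components together in the spirit of the RLPy tutorials. It is an
illustration rather than a reference: the ``GridWorld`` domain ships with
the library, but constructor arguments such as ``epsilon``,
``discount_factor``, ``initial_learn_rate`` and the ``Experiment`` options
(``max_steps``, ``num_policy_checks``, ``checks_per_policy``) are assumptions
that should be checked against the class documentation of the version you
have installed::

    from rlpy.Domains import GridWorld
    from rlpy.Agents import Q_Learning
    from rlpy.Representations import Tabular
    from rlpy.Policies import eGreedy
    from rlpy.Experiments import Experiment


    def make_experiment(exp_id=1, path="./Results/Overview/gridworld"):
        """Wire the four components together and return an Experiment."""
        opt = {"exp_id": exp_id, "path": path}

        # Domain: the MDP to be solved (here the bundled GridWorld).
        domain = GridWorld()
        opt["domain"] = domain

        # Representation: a tabular value function (one feature per state).
        representation = Tabular(domain)

        # Policy: epsilon-greedy action selection over the estimated values.
        policy = eGreedy(representation, epsilon=0.2)

        # Agent: Q-Learning adjusts the representation's weight vector.
        opt["agent"] = Q_Learning(policy=policy,
                                  representation=representation,
                                  discount_factor=domain.discount_factor,
                                  initial_learn_rate=0.1)

        # Experiment: the glue that runs the agent in the domain and logs results.
        opt["max_steps"] = 10000
        opt["num_policy_checks"] = 10
        opt["checks_per_policy"] = 10
        return Experiment(**opt)


    if __name__ == "__main__":
        experiment = make_experiment()
        experiment.run()
        experiment.plot()
        experiment.save()

Running the file executes the experiment, plots the learning curve, and
saves the results under ``path``.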
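For the Dynamic Programming setting, the agent and policy are replaced by an
MDP solver. The sketch below assumes, based on the solver classes in
``rlpy.MDPSolvers``, a constructor taking a job id, the representation and
the domain, together with a ``solve()`` method; the actual signature may
differ in your installed version::

    from rlpy.Domains import GridWorld
    from rlpy.MDPSolvers import ValueIteration
    from rlpy.Representations import Tabular

    # Domain and tabular representation, exactly as in the RL case.
    domain = GridWorld()
    representation = Tabular(domain)

    # MDP solver: value iteration over the tabular representation.
    # (job id, representation, domain) is the assumed argument order.
    solver = ValueIteration(1, representation, domain)
    solver.solve()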