Have a look at Getting Started or the examples directory, which contains many ready-to-run reinforcement learning experiments.
See the Getting Started documentation. A typical line of output looks like this:
88825: E[0:01:23]-R[0:00:10]: Return=-1.00, Steps=56, Features = 174
Field | Meaning |
---|---|
88825 | Number of learning steps so far |
E[0:01:23] | Elapsed time (h:mm:ss) |
R[0:00:10] | Estimated remaining time (h:mm:ss) |
Return=-1.00 | Sum of rewards for the last episode |
Steps=56 | Number of steps for the last episode |
Features = 174 | Number of features used for the last episode |
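If you want to post-process these lines, for example to plot the return over learning steps, the fields in the table can be extracted with a regular expression. The following is a minimal sketch: the pattern is inferred from the single example line above and is an assumption, not a specification of the output format, and `parse_progress_line` is a hypothetical helper that is not part of RLPy.

```python
import re

# Pattern inferred from the one example line shown above (an assumption,
# not an official description of the log format).
LINE_RE = re.compile(
    r"(?P<steps>\d+): "
    r"E\[(?P<elapsed>[\d:]+)\]-R\[(?P<remaining>[\d:]+)\]: "
    r"Return=(?P<ret>-?\d+(?:\.\d+)?), "
    r"Steps=(?P<episode_steps>\d+), "
    r"Features\s*=\s*(?P<features>\d+)"
)


def parse_progress_line(line):
    """Hypothetical helper: return the fields of one output line as a dict,
    or None if the line does not match the assumed pattern."""
    m = LINE_RE.search(line)
    if m is None:
        return None
    return {
        "learning_steps": int(m.group("steps")),
        "elapsed": m.group("elapsed"),        # kept as a string
        "remaining": m.group("remaining"),    # kept as a string
        "return": float(m.group("ret")),
        "episode_steps": int(m.group("episode_steps")),
        "features": int(m.group("features")),
    }


example = "88825: E[0:01:23]-R[0:00:10]: Return=-1.00, Steps=56, Features = 174"
parsed = parse_progress_line(example)
print(parsed)
```

Lines that do not match the assumed pattern yield `None`, so the helper can be applied to a whole log file and the non-matching lines filtered out.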
You can use the rlpy.Tools.run.run_profiled() function, which takes a make_experiment function and generates a graphical profile of the resulting running time in PDF format (see the API doc for details on where to find these files). Each node shows the proportion of the total time taken to finish the function, the proportion of time spent within the function itself (in parentheses), and the number of times it was called. Nodes are color coded based on their time. Focus your optimization effort on the nodes with the highest proportion of time spent within them. As an example, you can look at Profiling/Example.pdf
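The idea of ranking functions by the time spent inside them can also be sketched with Python's standard `cProfile` and `pstats` modules. This example does not use `run_profiled()` (see the API doc for its arguments) and produces a text report rather than a PDF graph; `slow_feature_computation` and `run_toy_experiment` are hypothetical stand-ins for real experiment code.

```python
import cProfile
import io
import pstats


def slow_feature_computation(n):
    """Hypothetical stand-in for an expensive step of an experiment."""
    return sum(i * i for i in range(n))


def run_toy_experiment():
    """Hypothetical stand-in for running an experiment."""
    total = 0
    for _ in range(50):
        total += slow_feature_computation(2000)
    return total


# Collect timing data while the toy experiment runs.
profiler = cProfile.Profile()
profiler.enable()
result = run_toy_experiment()
profiler.disable()

# Sort by "tottime": time spent within each function itself,
# excluding the functions it calls.
stream = io.StringIO()
stats = pstats.Stats(profiler, stream=stream)
stats.sort_stats("tottime").print_stats(10)
report = stream.getvalue()
print(report)
```

The entries at the top of the report play the same role as the nodes with the highest within-function time in the PDF graph: they are the first candidates for optimization.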
Please see the Install page.
Plotting against episode numbers does not give accurate comparisons, because the number of samples can vary from episode to episode. Using steps guarantees that all methods have seen exactly the same amount of data before being tested.
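A small numeric sketch makes the difference concrete. The episode lengths below are made-up numbers for two hypothetical methods, and both helper functions are illustrative only.

```python
def samples_after_episodes(episode_lengths, n_episodes):
    """Total number of samples seen after the first n_episodes episodes."""
    return sum(episode_lengths[:n_episodes])


def episodes_within_budget(episode_lengths, step_budget):
    """Number of episodes fully completed within step_budget steps."""
    used = 0
    completed = 0
    for length in episode_lengths:
        if used + length > step_budget:
            break
        used += length
        completed += 1
    return completed


# Made-up episode lengths (steps per episode) for two hypothetical methods.
method_a = [56, 40, 12, 90, 33]
method_b = [10, 15, 8, 20, 9]

# Comparing after 3 episodes: the methods have seen very different amounts of data.
seen_a = samples_after_episodes(method_a, 3)
seen_b = samples_after_episodes(method_b, 3)
print(seen_a, seen_b)  # 108 33

# Comparing after 60 steps: both have seen exactly 60 samples,
# even though that covers a different number of completed episodes.
done_a = episodes_within_budget(method_a, 60)
done_b = episodes_within_budget(method_b, 60)
print(done_a, done_b)  # 1 4
```

After three episodes, one method has seen more than three times as much data as the other, so their performance is not comparable at that point on an episode axis. With a step budget, the amount of data is identical by construction.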