Frequently Asked Questions (FAQ)¶
How do I use the framework?¶
Have a look at Getting Started or at the examples directory, where you will find many ready-to-run reinforcement learning experiments.
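If you want a feel for the overall structure first, below is a minimal sketch of a complete experiment, modeled on the GridWorld tutorial. The specific constructor arguments (noise, epsilon, learning-rate settings, step counts) are illustrative values only, and exact signatures may differ between versions:

```python
import os
from rlpy.Domains import GridWorld
from rlpy.Agents import Q_Learning
from rlpy.Representations import Tabular
from rlpy.Policies import eGreedy
from rlpy.Experiments import Experiment

def make_experiment(exp_id=1, path="./Results/Tutorial/gridworld-qlearning"):
    """Assemble a domain, representation, policy, and agent into an Experiment."""
    opt = {}
    opt["exp_id"] = exp_id
    opt["path"] = path

    # A small grid-world domain shipped with the framework.
    maze = os.path.join(GridWorld.default_map_dir, "4x5.txt")
    domain = GridWorld(maze, noise=0.3)
    opt["domain"] = domain

    # Tabular representation with epsilon-greedy exploration.
    representation = Tabular(domain, discretization=20)
    policy = eGreedy(representation, epsilon=0.2)
    opt["agent"] = Q_Learning(representation=representation, policy=policy,
                              discount_factor=domain.discount_factor,
                              initial_learn_rate=0.1,
                              learn_rate_decay_mode="boyan", boyan_N0=100,
                              lambda_=0.0)

    # Learn for 10000 steps, pausing 10 times to evaluate the policy.
    opt["max_steps"] = 10000
    opt["num_policy_checks"] = 10
    opt["checks_per_policy"] = 100
    return Experiment(**opt)

if __name__ == "__main__":
    experiment = make_experiment(1)
    experiment.run()   # run the experiment
    experiment.save()  # write results to the path given above
```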
What does each line of output mean?¶
See the Getting Started section for the full documentation. A typical line of output looks like this:
88825: E[0:01:23]-R[0:00:10]: Return=-1.00, Steps=56, Features = 174
| Token | Meaning |
|---|---|
| 88825 | Number of learning steps completed so far |
| E[0:01:23] | Elapsed time (h:mm:ss) |
| R[0:00:10] | Estimated remaining time (h:mm:ss) |
| Return=-1.00 | Sum of rewards for the last episode |
| Steps=56 | Number of steps in the last episode |
| Features = 174 | Number of features used in the last episode |
My code is slow, how can I improve its speed?¶
You can use the
rlpy.Tools.run.run_profiled() function, which takes a
make_experiment function and generates a pictorial profile of the
resulting running time in PDF format (see the API documentation for details
on where to find these files).
Each node in the resulting call graph shows the proportional time until
the function finished, the proportional time spent within the function itself,
and the number of times it was called. Nodes are color-coded by their time.
To speed up your code, concentrate on the nodes with the highest
proportional time spent within them, shown in parentheses.
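As a rough sketch of how this is invoked (assuming your make_experiment lives in a module of your own; my_experiment below is only a placeholder, and run_profiled's optional arguments, such as the output location, are described in the API documentation):

```python
from rlpy.Tools.run import run_profiled

# Placeholder import: replace with the module that defines your
# own make_experiment function.
from my_experiment import make_experiment

# Run the experiment under the profiler; a call-graph profile of the
# running time is written out as a PDF.
run_profiled(make_experiment)
```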
I used to plot my figures against the number of episodes; why do you prefer steps?¶
Plotting against episode numbers does not produce comparable curves, because episode lengths, and hence the number of samples per episode, vary. Plotting against steps guarantees that all methods have seen exactly the same amount of data before each evaluation.