Resets and time limits on the real robot

Hi,

I am looking to perform some reinforcement learning training on the real robot. I was wondering:

Is it possible to ‘reset’ the real environment multiple times during a job submission? Based on the example environment it seems like it is not possible (unless we were to implement a reset mechanism ourselves). What mechanism is used to reset the robot and cube to their initial positions between jobs?

Is there a limit to the length of time a single job can run on the real robot?

If you want to reset during a single submission, you do indeed have to implement this yourself. The function we use for this is implemented here: trifingerpro_post_submission.py#L195. You should be able to copy it into your own code. I think you would just need to replace robot.frontend with a TriFingerPlatformWithObjectFrontend instance and set trajectory_file = "trifingerpro_shuffle_cube_trajectory_fast.csv".
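To give a rough idea of what such a reset amounts to, here is a minimal sketch that replays a recorded trajectory through a frontend. It is not the code from trifingerpro_post_submission.py. It assumes that each CSV row holds one joint-position target, that the frontend offers append_desired_action (as the robot_interfaces frontends do), and that make_action is a callable you supply to build an action from a list of positions. The names replay_trajectory and make_action are hypothetical.

```python
import csv
import io


def replay_trajectory(frontend, make_action, csv_text):
    """Send each numeric CSV row to the robot as one position action.

    Assumptions (not taken from the original reset code): every data row
    is a list of joint positions, and non-numeric rows are headers.
    Returns the number of actions that were sent.
    """
    sent = 0
    for row in csv.reader(io.StringIO(csv_text)):
        if not row:
            continue  # skip blank lines
        try:
            position = [float(value) for value in row]
        except ValueError:
            continue  # skip header or other non-numeric rows
        frontend.append_desired_action(make_action(position))
        sent += 1
    return sent
```

On the real system you would pass the TriFingerPlatformWithObjectFrontend instance as frontend and the contents of the trajectory file as csv_text. Check the linked function for the actual file format before relying on this.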

By default, the length of a job is limited to 120000 steps. There is an (I think undocumented) option, episode_length, in roboch.json to change this. Example:

  "episode_length": 20000,

The value is capped at a maximum of 300000 steps, though.
Note that during evaluation, the episode length is always set to the default value.
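As a rough illustration of these limits, the following sketch checks a requested episode_length against the numbers above and converts it to wall-clock time. The 1 kHz control rate (one step per millisecond) is an assumption, and the helper names are made up for this example. They are not part of the robot software.

```python
CONTROL_RATE_HZ = 1000            # assumption: one step per millisecond
DEFAULT_EPISODE_LENGTH = 120_000  # default job length in steps
MAX_EPISODE_LENGTH = 300_000      # upper limit in steps


def episode_duration_seconds(steps):
    """Convert a number of steps to seconds at the assumed control rate."""
    return steps / CONTROL_RATE_HZ


def with_episode_length(config, steps):
    """Return a copy of a roboch.json-style dict with episode_length set.

    Raises ValueError if steps is outside the allowed range.
    """
    if not 0 < steps <= MAX_EPISODE_LENGTH:
        raise ValueError(
            f"episode_length must be in 1..{MAX_EPISODE_LENGTH}, got {steps}"
        )
    updated = dict(config)
    updated["episode_length"] = steps
    return updated
```

Under that assumption the default of 120000 steps corresponds to 2 minutes and the maximum of 300000 steps to 5 minutes.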

Thanks, I’ll try to copy that code for the resets.

Is there any chance the maximum duration could be increased? 300,000 steps is only 5 minutes, which is a very short amount of time for performing any sort of reinforcement learning.

We are unfortunately limited in the maximum duration because all data recorded during the run needs to fit into memory (the data is only written to files after the robot has stopped, so that writing does not cause timing issues).

Instead, you can automate making submissions to some degree, e.g. using this script from the example package as a template. The idea is to run a loop that

  • makes a submission to run one episode,
  • waits until it is finished,
  • downloads the data,
  • processes it in some way (e.g. do some training) to update parameters, and
  • goes back to the first step.
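The loop described above could be sketched roughly as follows. This is not the script from the example package. The four callables (submit_job, is_finished, download_data, update_parameters) are hypothetical placeholders that you would implement against the submission system yourself.

```python
import time


def training_loop(submit_job, is_finished, download_data, update_parameters,
                  params, num_iterations, poll_interval_s=30.0,
                  sleep=time.sleep):
    """Run one robot episode per iteration and train between episodes.

    All four callables are placeholders supplied by the caller:
      submit_job(params) -> job_id
      is_finished(job_id) -> bool
      download_data(job_id) -> data
      update_parameters(params, data) -> new params
    """
    for _ in range(num_iterations):
        job_id = submit_job(params)           # make a submission
        while not is_finished(job_id):        # wait until it is finished
            sleep(poll_interval_s)
        data = download_data(job_id)          # download the data
        params = update_parameters(params, data)  # e.g. do some training
    return params
```

Passing the steps in as functions keeps the loop independent of how submissions are actually made, so it can be tried out with dummy functions before it touches the real robot.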

Great, thank you. I’ll give that a go and get back to you if I have any issues.