RolloutRunner(n_episodes: int, max_episode_steps: int, record_trajectory: bool, record_event_logs: bool)
General abstract class for rollout runners.
Provides the general structure, plus helper methods for instantiating the environment and performing the rollout.
n_episodes – Number of episodes to run.
max_episode_steps – Maximum number of steps to run in each episode (the episode ends earlier if the environment returns done).
record_trajectory – Whether to record trajectory data.
record_event_logs – Whether to record event logs.
init_env_and_agent(env_config: omegaconf.DictConfig, wrappers_config: Union[List[Union[None, Mapping[str, Any], Any]], Mapping[Union[str, Type], Union[None, Mapping[str, Any], Any]]], max_episode_steps: int, agent_config: omegaconf.DictConfig, input_dir: str, env_instance_seed: int, agent_instance_seed: int) → (maze.core.env.base_env.BaseEnv, maze.core.agent.policy.Policy)
Build the environment (including wrappers) and agent according to given configuration.
env_config – Environment config.
wrappers_config – Wrapper config.
max_episode_steps – Maximum number of steps per episode; the env is limited to this many steps.
agent_config – Policies config.
input_dir – Directory to load the model from.
env_instance_seed – The seed for this particular env.
agent_instance_seed – The seed for this particular agent.
Returns a tuple of (instantiated environment, instantiated agent).
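The effect of limiting an env to max_episode_steps can be pictured with a small wrapper. This is only a sketch under stated assumptions: `CountingEnv`, `StepLimit` and `episode_length` are hypothetical names made up for this illustration, they are not Maze classes, and Maze may apply the limit differently.

```python
from typing import Any, Dict, Optional, Tuple


class CountingEnv:
    """Hypothetical toy env (not a Maze class); optionally signals done by itself."""

    def __init__(self, done_after: Optional[int] = None) -> None:
        self.done_after = done_after
        self.t = 0

    def reset(self) -> int:
        self.t = 0
        return self.t

    def step(self, action: Any) -> Tuple[int, float, bool, Dict]:
        self.t += 1
        done = self.done_after is not None and self.t >= self.done_after
        return self.t, 0.0, done, {}


class StepLimit:
    """Hypothetical wrapper: forces done once max_episode_steps is reached."""

    def __init__(self, env: CountingEnv, max_episode_steps: int) -> None:
        self.env = env
        self.max_episode_steps = max_episode_steps
        self.elapsed = 0

    def reset(self) -> int:
        self.elapsed = 0
        return self.env.reset()

    def step(self, action: Any) -> Tuple[int, float, bool, Dict]:
        obs, reward, done, info = self.env.step(action)
        self.elapsed += 1
        # The episode ends at whichever comes first: env done or the step limit.
        if self.elapsed >= self.max_episode_steps:
            done = True
        return obs, reward, done, info


def episode_length(env: StepLimit) -> int:
    """Run one episode with a dummy action and return its length in steps."""
    env.reset()
    steps, done = 0, False
    while not done:
        _, _, done, _ = env.step(None)
        steps += 1
    return steps
```

With a limit of 7, an env that never finishes on its own is cut off after 7 steps, while an env that returns done after 3 steps still ends after 3.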
run_interaction_loop(env: maze.core.env.structured_env.StructuredEnv, agent: maze.core.agent.policy.Policy, n_episodes: int, render: bool = False, episode_end_callback: Callable = None) → None
Helper function for running the agent-environment interaction loop for the specified number of steps and episodes.
env – Environment to run.
agent – Agent to use.
n_episodes – Number of episodes to perform.
render – Whether to render the environment after every step.
episode_end_callback – If supplied, this will be executed after each episode to notify the observer.
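The general shape of such an interaction loop can be sketched as follows. This is not the Maze implementation: `ToyEnv`, `ToyAgent`, the `act` method and `run_interaction_loop_sketch` are hypothetical stand-ins, and unlike the documented method (which returns None) the sketch returns the total step count so that its behavior is easy to check.

```python
from typing import Any, Callable, Dict, Optional, Tuple


class ToyEnv:
    """Hypothetical env (not a Maze class) that ends each episode after a fixed number of steps."""

    def __init__(self, max_episode_steps: int) -> None:
        self.max_episode_steps = max_episode_steps
        self.steps = 0

    def reset(self) -> int:
        self.steps = 0
        return self.steps

    def step(self, action: Any) -> Tuple[int, float, bool, Dict]:
        self.steps += 1
        done = self.steps >= self.max_episode_steps
        return self.steps, 1.0, done, {}

    def render(self) -> None:
        pass  # A real env would draw its current state here.


class ToyAgent:
    """Hypothetical agent that always picks action 0."""

    def act(self, observation: Any) -> int:
        return 0


def run_interaction_loop_sketch(
    env: ToyEnv,
    agent: ToyAgent,
    n_episodes: int,
    render: bool = False,
    episode_end_callback: Optional[Callable[[], None]] = None,
) -> int:
    """Run n_episodes episodes and return the total number of steps taken."""
    total_steps = 0
    for _ in range(n_episodes):
        observation = env.reset()
        done = False
        while not done:
            action = agent.act(observation)
            observation, reward, done, info = env.step(action)
            total_steps += 1
            if render:
                env.render()  # Render after every step, as the render flag describes.
        if episode_end_callback is not None:
            episode_end_callback()  # Notify the observer once per finished episode.
    return total_steps
```

Running three episodes against `ToyEnv(max_episode_steps=5)` takes 15 steps in total and invokes the callback three times.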
run_with(env: Union[None, Mapping[str, Any], Any], wrappers: Union[List[Union[None, Mapping[str, Any], Any]], Mapping[Union[str, Type], Union[None, Mapping[str, Any], Any]]], agent: Union[None, Mapping[str, Any], Any]) → None
Run the rollout with the given env, wrappers and agent configuration. A helper method that makes rollouts easy to run directly from Python, without building the Hydra config object.
Note that this method is designed to run only once. If you call it directly from Python (rather than through Hydra from the command line, which is the main use case), you should respect this. Otherwise, you may see unexpected behavior, especially from the statistics and event logging system: the rollout runners register their own stats and event writers (so you might get duplicate stats), and the order of operations sometimes matters (especially with parallel rollouts, where the writers should not be carried into child processes).
env – Env config or object.
wrappers – Wrappers config (see
agent – Agent config or object.
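The run-once caveat for run_with can be illustrated with a toy model of writer registration. Everything here is an assumption made for illustration: `ToyStatsRegistry`, `demo_duplicate_stats` and `GuardedRunner` are hypothetical names, not part of Maze, and the sketch only mimics how registering the same writer twice leads to duplicate records.

```python
from typing import Callable, List


class ToyStatsRegistry:
    """Hypothetical process-wide registry of stats writers (not a Maze class)."""

    def __init__(self) -> None:
        self.writers: List[Callable[[str], None]] = []

    def register(self, writer: Callable[[str], None]) -> None:
        self.writers.append(writer)

    def emit(self, record: str) -> None:
        # Every registered writer receives every record.
        for writer in self.writers:
            writer(record)


def demo_duplicate_stats(n_runs: int) -> int:
    """Simulate n_runs unguarded runs; return how many records were written."""
    registry = ToyStatsRegistry()
    sink: List[str] = []
    for _ in range(n_runs):
        registry.register(sink.append)  # Each run registers its writer again.
        registry.emit("episode_stats")  # One record fans out to all writers.
    return len(sink)


class GuardedRunner:
    """Hypothetical runner that refuses a second run instead of duplicating stats."""

    def __init__(self, registry: ToyStatsRegistry) -> None:
        self.registry = registry
        self.output: List[str] = []
        self._has_run = False

    def run_with(self) -> None:
        if self._has_run:
            raise RuntimeError("this runner is designed to run only once")
        self._has_run = True
        self.registry.register(self.output.append)
        self.registry.emit("episode_stats")
```

In this toy model a single run writes one record, but a second run on the same registry writes two more because the writer is now registered twice. Creating a fresh runner (and, in this model, a fresh registry) per rollout avoids the duplication.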