ActionRecordingWrapper

class maze.core.wrappers.action_recording_wrapper.ActionRecordingWrapper(*args, **kwds)

An action recording wrapper that records, for each (sub-)step, the respective MazeAction or agent action taken.

Parameters
  • env – Environment to wrap.

  • record_maze_actions – If True, MazeAction objects are recorded.

  • record_actions – If True, agent actions are recorded.

  • output_dir – Path where to store the action records.
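The recording idea can be illustrated with a minimal, self-contained sketch. It does not use Maze itself: ToyEnv and ToyActionRecorder are hypothetical stand-ins, and the per-episode dictionary keyed by step index is an assumed record layout, not the format used by ActionRecordingWrapper.

```python
from typing import Any, Dict, List, Tuple


class ToyEnv:
    """Hypothetical stand-in environment; not part of Maze."""

    def __init__(self) -> None:
        self.total = 0

    def reset(self) -> int:
        self.total = 0
        return self.total

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        # The episode ends once the accumulated actions reach 3.
        self.total += action
        done = self.total >= 3
        return self.total, float(action), done, {}


class ToyActionRecorder:
    """Hypothetical wrapper that records the agent action of every step."""

    def __init__(self, env: ToyEnv, record_actions: bool = True) -> None:
        self.env = env
        self.record_actions = record_actions
        # One dict per episode, mapping step index to the action taken (assumed layout).
        self.records: List[Dict[int, int]] = []

    def reset(self) -> int:
        # Start a fresh record for the new episode.
        self.records.append({})
        return self.env.reset()

    def step(self, action: int) -> Tuple[int, float, bool, Dict[str, Any]]:
        if self.record_actions:
            episode = self.records[-1]
            episode[len(episode)] = action
        return self.env.step(action)


env = ToyActionRecorder(ToyEnv())
env.reset()
done = False
while not done:
    _, _, done, _ = env.step(1)
```

After the loop, env.records holds a single episode with one entry per step taken.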

clone_from(env: maze.core.wrappers.action_recording_wrapper.ActionRecordingWrapper) → None

Reset this gym environment to the given state by creating a deep copy of the env.state instance variable.
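Why a deep copy matters here can be shown with a small sketch. ToyStatefulEnv is a hypothetical class, and its state layout is an assumption made for illustration only; the point is that the clone must not share mutable objects with the source environment.

```python
import copy
from typing import Any, Dict


class ToyStatefulEnv:
    """Hypothetical environment holding its state in a nested, mutable structure."""

    def __init__(self) -> None:
        self.state: Dict[str, Any] = {"position": [0, 0], "step_count": 0}

    def clone_from(self, env: "ToyStatefulEnv") -> None:
        # A deep copy duplicates nested containers too, so later changes
        # to the source environment do not leak into this one.
        self.state = copy.deepcopy(env.state)


source = ToyStatefulEnv()
source.state["position"][0] = 5

target = ToyStatefulEnv()
target.clone_from(source)

# Mutating the source afterwards leaves the clone untouched.
source.state["position"][0] = 9
```

With a shallow copy, the nested position list would be shared and the last assignment would also change the clone.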

dump() → None

Dump recorded trajectory to file.
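The on-disk format is not specified in this reference. As a rough sketch only, the following assumes one pickle file per episode inside output_dir; dump_records, the episode_id argument, and the file naming are all hypothetical.

```python
import pickle
import tempfile
from pathlib import Path
from typing import Any, Dict


def dump_records(records: Dict[int, Any], output_dir: str, episode_id: str) -> Path:
    """Write one episode's records to <output_dir>/<episode_id>.pkl (assumed layout)."""
    target_dir = Path(output_dir)
    target_dir.mkdir(parents=True, exist_ok=True)
    target = target_dir / f"{episode_id}.pkl"
    with open(target, "wb") as handle:
        pickle.dump(records, handle)
    return target


# Round trip through a temporary directory to check that the file is readable.
with tempfile.TemporaryDirectory() as tmp_dir:
    path = dump_records({0: 1, 1: 0}, tmp_dir, "episode_0")
    with open(path, "rb") as handle:
        restored = pickle.load(handle)
```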

get_observation_and_action_dicts(maze_state: Optional[Any], maze_action: Optional[Any], first_step_in_episode: bool) → Tuple[Optional[Dict[Union[int, str], Any]], Optional[Dict[Union[int, str], Any]]]

Convert MazeState and MazeAction back into raw action and observation.

This method is mostly used when working with trajectory data, e.g. for imitation learning. As part of trajectory data, MazeStates and MazeActions are recorded. For imitation learning, they then need to be converted to raw observations and actions in the desired format (i.e. using all the required wrappers etc.).

The conversion is done by first transforming the MazeState and MazeAction using the space interfaces in MazeEnv, and then running them through the entire wrapper stack (“back up”).

Both the MazeState and the MazeAction that follows it are converted as part of this single method, because some wrappers (mostly multi-step ones) need both together in order to split them into the observations and actions of the individual sub-steps. If you are not using multi-step wrappers, you do not need to convert both: you can pass in just the MazeState or just the MazeAction. Note, however, that not every wrapper supports converting only one of them.

See below for an example implementation.

Note: The conversion of MazeState to observation is in the “natural” direction, how it takes place when stepping the env. This is not true for the MazeAction to action conversion – when stepping the env, actions are converted to MazeActions, whereas here the MazeAction needs to be converted back into the “raw” action (i.e. in reverse direction).

(!) Attention: If there are stateful wrappers in the wrapper stack (e.g. a wrapper stacking observations from previous steps), you should ensure that (1) the first_step_in_episode flag is passed to this function correctly and (2) all MazeStates and MazeActions are converted in order, exactly as they occurred during the recorded episode.

Parameters
  • maze_state – MazeState to convert. If None, only the MazeAction will be converted (not all wrappers support this).

  • maze_action – MazeAction to convert (the one following the state given as the first parameter). If None, only the MazeState will be converted (not all wrappers support this, some need both).

  • first_step_in_episode – True if this is the first step in the episode. Serves to notify stateful wrappers (e.g. observation stacking) that they should reset their state.

Returns

Observation and action dictionaries (keys are IDs of sub-steps).
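The ordering and first_step_in_episode requirements can be illustrated with a toy stateful converter. This is a sketch under stated assumptions, not the Maze implementation: ToyStackingConverter, the "stacked" and "move" keys, and the use of plain floats and ints as stand-ins for MazeState and MazeAction are all hypothetical. Only the method name and its three parameters mirror the signature documented above.

```python
from typing import Any, Dict, List, Optional, Tuple, Union

StepKey = Union[int, str]
StepDict = Optional[Dict[StepKey, Any]]


class ToyStackingConverter:
    """Hypothetical stateful converter that stacks the last two observations."""

    def __init__(self) -> None:
        self._history: List[float] = []

    def get_observation_and_action_dicts(
        self,
        maze_state: Optional[float],
        maze_action: Optional[int],
        first_step_in_episode: bool,
    ) -> Tuple[StepDict, StepDict]:
        if first_step_in_episode:
            # Drop observations left over from the previous episode.
            self._history = []

        observation_dict: StepDict = None
        if maze_state is not None:
            self._history = (self._history + [maze_state])[-2:]
            observation_dict = {0: {"stacked": list(self._history)}}

        action_dict: StepDict = None
        if maze_action is not None:
            action_dict = {0: {"move": maze_action}}

        # Keys are sub-step IDs; this toy example has a single sub-step 0.
        return observation_dict, action_dict


# Two recorded episodes, each a list of (maze_state, maze_action) pairs.
episodes = [[(0.1, 1), (0.2, 0)], [(0.7, 1)]]

converter = ToyStackingConverter()
converted = []
for episode in episodes:
    # Convert steps in recorded order and flag the first step of each episode.
    for index, (state, action) in enumerate(episode):
        converted.append(
            converter.get_observation_and_action_dicts(
                state, action, first_step_in_episode=(index == 0)
            )
        )
```

If the flag were left False for the first step of the second episode, its stacked observation would still contain 0.2 from the first episode.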

reset() → Any

(overrides ObservationWrapper)

Intercept ObservationWrapper.reset and map observation.

seed(seed: int) → None

(overrides CoreEnv)

Sets the seed for this environment’s random number generator(s).

Parameters
  • seed – The seed integer initializing the random number generator.

step(action) → Tuple[Any, Any, bool, Dict[Any, Any]]

(overrides ObservationWrapper)

Intercept ObservationWrapper.step and map observation.