ActionRecordingWrapper¶
-
class maze.core.wrappers.action_recording_wrapper.ActionRecordingWrapper(*args, **kwds)¶
An action recording wrapper that records, for each (sub-)step, the respective MazeAction or agent action taken.
- Parameters
env – Environment to wrap.
record_maze_actions – If True, MazeAction objects are recorded.
record_actions – If True, agent actions are recorded.
output_dir – Directory where the action records are stored.
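To illustrate what this wrapper does, here is a minimal self-contained sketch of an action-recording wrapper. The class and environment names are illustrative stand-ins, not the actual Maze API; the real wrapper additionally handles MazeActions, sub-steps, and serialization to `output_dir`.

```python
class ActionRecordingSketch:
    """Minimal sketch of an action-recording wrapper.

    Names and structure are illustrative, not the actual Maze API.
    """

    def __init__(self, env, record_actions=True):
        self.env = env
        self.record_actions = record_actions
        self.action_record = {}  # maps step count -> agent action taken
        self._step_count = 0

    def step(self, action):
        # record the action before forwarding it to the wrapped env
        if self.record_actions:
            self.action_record[self._step_count] = action
        self._step_count += 1
        return self.env.step(action)


class DummyEnv:
    """Trivial stand-in environment that echoes the action back."""

    def step(self, action):
        return action, 0.0, False, {}


wrapper = ActionRecordingSketch(DummyEnv())
wrapper.step({"move": 1})
wrapper.step({"move": 0})
# every action is now available in the record, keyed by step count
assert wrapper.action_record == {0: {"move": 1}, 1: {"move": 0}}
```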
-
clone_from(env: maze.core.wrappers.action_recording_wrapper.ActionRecordingWrapper) → None¶
Reset this gym environment to the given state by creating a deep copy of the env.state instance variable.
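The deep-copy semantics matter here: after cloning, the two environments share no mutable state. A sketch under assumed names (`DummyEnv` and the module-level `clone_from` helper are hypothetical, used only to demonstrate the deep-copy behavior):

```python
import copy


class DummyEnv:
    """Hypothetical env with a mutable state attribute."""

    def __init__(self):
        self.state = {"position": 0}

    def step(self):
        self.state["position"] += 1


def clone_from(target, source):
    # deep-copy so the clone is independent of the source env
    target.state = copy.deepcopy(source.state)


a, b = DummyEnv(), DummyEnv()
a.step()
a.step()
clone_from(b, a)
assert b.state == {"position": 2}

a.step()
# deep copy: further steps on the source do not affect the clone
assert b.state["position"] == 2
```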
-
get_observation_and_action_dicts(maze_state: Optional[Any], maze_action: Optional[Any], first_step_in_episode: bool) → Tuple[Optional[Dict[Union[int, str], Any]], Optional[Dict[Union[int, str], Any]]]¶
Convert MazeState and MazeAction back into raw action and observation.
This method is mostly used when working with trajectory data, e.g. for imitation learning. As part of trajectory data, MazeState and MazeActions are recorded. For imitation learning, they then need to be converted to raw observations and actions in the desired format (i.e. using all the required wrappers etc.)
The conversion is done by first transforming the MazeState and MazeAction using the space interfaces in MazeEnv, and then running them through the entire wrapper stack (“back up”).
Both the MazeState and the MazeAction on top of it are converted in this single method because some wrappers (mostly multi-step ones) need both together, to be able to split them into observations and actions taken in different sub-steps. If you are not using multi-step wrappers, you don’t need to convert both at once and can pass in just one of them; note that not all wrappers support this.
See below for an example implementation.
Note: The conversion of MazeState to observation proceeds in the “natural” direction, as it takes place when stepping the env. This is not true for the MazeAction-to-action conversion: when stepping the env, actions are converted to MazeActions, whereas here the MazeAction needs to be converted back into the “raw” action (i.e. in the reverse direction).
Attention: If there are stateful wrappers in the wrapper stack (e.g. a wrapper stacking observations from previous steps), ensure that (1) the first_step_in_episode flag is passed to this function correctly, and (2) all MazeStates and MazeActions are converted in order, as they happened during the recorded episode.
- Parameters
maze_state – MazeState to convert. If none, only MazeAction will be converted (not all wrappers support this).
maze_action – MazeAction (the one following the state given as the first param). If none, only MazeState will be converted (not all wrappers support this, some need both).
first_step_in_episode – True if this is the first step in the episode. Serves to notify stateful wrappers (e.g. observation stacking) that they should reset their state.
- Returns
observation and action dictionaries (keys are IDs of sub-steps)
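The in-order replay requirement can be made concrete with a sketch. The class below is illustrative (not the Maze API): a stateful conversion that stacks the previous and current state, showing why first_step_in_episode must reset internal state and why records must be converted in episode order.

```python
class StackingConversionSketch:
    """Illustrative sketch of a stateful wrapper's conversion method.

    Not the actual Maze API: it stacks the previous and current state,
    so correct output depends on in-order replay and on resetting at
    episode start via first_step_in_episode.
    """

    def __init__(self):
        self._last_state = None

    def get_observation_and_action_dicts(self, maze_state, maze_action,
                                         first_step_in_episode):
        if first_step_in_episode:
            # reset the stacking state at the start of each episode
            self._last_state = None
        stacked = (self._last_state, maze_state)
        self._last_state = maze_state
        # keys are sub-step IDs; a single-step env uses a single key
        return {0: stacked}, {0: maze_action}


conv = StackingConversionSketch()
obs0, act0 = conv.get_observation_and_action_dicts(
    "s0", "a0", first_step_in_episode=True)
obs1, act1 = conv.get_observation_and_action_dicts(
    "s1", "a1", first_step_in_episode=False)

assert obs0 == {0: (None, "s0")}   # nothing to stack at episode start
assert obs1 == {0: ("s0", "s1")}   # second step sees the previous state
assert act0 == {0: "a0"}
```

Converting the records out of order, or omitting the first_step_in_episode reset between episodes, would leak state from one step (or episode) into another and produce wrong stacked observations.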
-
reset() → Any¶ (overrides ObservationWrapper)
Intercept ObservationWrapper.reset and map the observation.
-
seed(seed: int) → None¶ (overrides CoreEnv)
Sets the seed for this environment’s random number generator(s).
- Parameters
seed – The seed integer initializing the random number generator.
-
step(action) → Tuple[Any, Any, bool, Dict[Any, Any]]¶ (overrides ObservationWrapper)
Intercept ObservationWrapper.step and map the observation.