MazeEnv¶
- class maze.core.env.maze_env.MazeEnv(core_env: CoreEnv, action_conversion_dict: Dict[str | int, ActionConversionInterface], observation_conversion_dict: Dict[str | int, ObservationConversionInterface])¶
Base class for gym-style environments that wrap a core environment and define the state and execution interfaces. It provides functionality that can be reused across different gym environments, for example the reset, step, and render functions.
- Parameters:
core_env – Core environment.
action_conversion_dict – A dictionary of action conversion interface implementations, with policy names as keys.
observation_conversion_dict – A dictionary of observation conversion interface implementations, with policy names as keys.
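The per-policy lookup implied by the two conversion dictionaries can be sketched as follows. This is a minimal, self-contained illustration and not Maze's implementation: `ToyActionConversion`, its `convert` method, `ToyEnv`, and the `current_policy_id` attribute are hypothetical stand-ins. Only the idea that conversions are keyed by policy name comes from the parameter descriptions above.

```python
from typing import Dict, Union


class ToyActionConversion:
    """Hypothetical stand-in for an action conversion interface implementation."""

    def __init__(self, scale: float) -> None:
        self.scale = scale

    def convert(self, action: Dict[str, float]) -> Dict[str, float]:
        # Hypothetical method: turn a policy-level action into an env-level one.
        return {key: value * self.scale for key, value in action.items()}


class ToyEnv:
    """Hypothetical env illustrating per-policy lookup (not the real MazeEnv)."""

    def __init__(
        self, action_conversion_dict: Dict[Union[str, int], ToyActionConversion]
    ) -> None:
        self.action_conversion_dict = action_conversion_dict
        # Assumption: the env tracks which policy is currently acting.
        self.current_policy_id = next(iter(action_conversion_dict))

    @property
    def action_conversion(self) -> ToyActionConversion:
        # Select the conversion registered for the current policy.
        return self.action_conversion_dict[self.current_policy_id]


env = ToyEnv({"pick": ToyActionConversion(1.0), "place": ToyActionConversion(2.0)})
env.current_policy_id = "place"
converted = env.action_conversion.convert({"x": 1.5})
```

Keying both dictionaries by the same policy names keeps action and observation handling aligned for each policy.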
- property action_conversion¶
Return the action conversion mapping for the current policy.
- action_conversion_dict¶
The action conversion mapping used by this env.
- property action_space¶
Keep this env compatible with the gym interface by returning the action space of the current policy.
- property action_spaces_dict: Dict[int | str, gymnasium.spaces.Space]¶
(overrides StructuredEnvSpacesMixin) Policy action spaces as dict.
- actor_id() ActorID¶
(overrides StructuredEnv) Forward call to self.core_env.
- property agent_counts_dict: Dict[str | int, int]¶
(overrides StructuredEnv) Forward call to self.core_env.
- clone_from(env: MazeEnv) None¶
(overrides SimulatedEnvMixin) Reset the maze env to the state of the provided env.
Note that this also clones the CoreEnv and its member variables, including the environment context.
- Parameters:
env – The environment to clone from.
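The cloning behaviour described for clone_from can be illustrated with a small stand-alone sketch. It assumes a deep copy of the core env is an acceptable way to model "cloning the CoreEnv and its member variables". How MazeEnv actually copies state is not stated here, and `ToyCoreEnv` and `ToyEnv` are hypothetical classes.

```python
import copy


class ToyCoreEnv:
    """Hypothetical core env holding a state and an environment context."""

    def __init__(self) -> None:
        self.state = {"inventory": [0, 0]}
        self.context = {"step_id": 0}


class ToyEnv:
    """Hypothetical wrapper env (not the real MazeEnv)."""

    def __init__(self) -> None:
        self.core_env = ToyCoreEnv()

    def clone_from(self, env: "ToyEnv") -> None:
        # Assumption: a deep copy, so later changes to the source env
        # do not leak into this env.
        self.core_env = copy.deepcopy(env.core_env)


source = ToyEnv()
source.core_env.state["inventory"][0] = 5
source.core_env.context["step_id"] = 3

target = ToyEnv()
target.clone_from(source)

# Mutating the source afterwards leaves the clone untouched.
source.core_env.state["inventory"][0] = 9
```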
- close() None¶
(overrides BaseEnv) Forward call to self.core_env.
- get_actor_rewards() numpy.ndarray | None¶
(overrides StructuredEnv) Forward call to self.core_env.
- static get_done_info(done: bool, info: Dict[str, str]) Tuple[bool, bool]¶
Distinguish the end of episode between terminated and truncated by looking into the info dict.
- Parameters:
done – The done flag from the last env step.
info – The info of the last env step.
- Returns:
A tuple of (terminated, truncated).
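One way to split a single done flag into terminated and truncated is sketched below. The info key is an assumption: `"TimeLimit.truncated"` follows the older Gym TimeLimit wrapper convention, and this reference does not state which key MazeEnv actually inspects. `split_done` is a hypothetical helper, not the library function.

```python
from typing import Any, Dict, Tuple

# Assumed key (older Gym TimeLimit convention); the key MazeEnv actually
# looks up is not documented in this section.
TRUNCATION_KEY = "TimeLimit.truncated"


def split_done(done: bool, info: Dict[str, Any]) -> Tuple[bool, bool]:
    """Return (terminated, truncated) for a finished or running episode."""
    # The episode counts as truncated only if it is done and the info
    # dict flags a truncation.
    truncated = bool(done and info.get(TRUNCATION_KEY, False))
    # Any other end of episode is treated as a regular termination.
    terminated = bool(done and not truncated)
    return terminated, truncated


natural_end = split_done(True, {})
time_limit_end = split_done(True, {TRUNCATION_KEY: True})
still_running = split_done(False, {})
```

Under this sketch the two flags are never both true, and both are false while the episode is still running.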
- get_env_time() int¶
(overrides TimeEnvMixin) Forward call to self.core_env.
- get_episode_id() str¶
(overrides RecordableEnvMixin) Return the ID of the current episode (the ID changes on env reset).
- get_kpi_calculator() KpiCalculator | None¶
(overrides CoreEnv) Forward call to self.core_env.
- get_maze_action() Any¶
(overrides RecordableEnvMixin) Return the last MazeAction object for trajectory recording.
- get_maze_state() Any¶
(overrides RecordableEnvMixin) Return the current State object of the core env for trajectory recording.
- get_observation_and_action_dicts(maze_state: Any | None, maze_action: Any | None, first_step_in_episode: bool) Tuple[Dict[int | str, Any] | None, Dict[int | str, Any] | None]¶
(overrides Wrapper) Convert MazeState and MazeAction back into observations and actions using the space conversion interfaces.
- Parameters:
maze_state – State of the environment.
maze_action – MazeAction (the one following the state given as the first parameter).
first_step_in_episode – True if this is the first step in the episode.
- Returns:
Observation and action dictionaries (keys are substep_ids).
- get_renderer() Renderer¶
(overrides RecordableEnvMixin) Return the renderer exposed by the underlying core env.
- get_step_events() Iterable[EventRecord]¶
(overrides CoreEnv) Forward call to self.core_env.
- is_actor_done() bool¶
(overrides StructuredEnv) Forward call to self.core_env.
- is_flat_step() bool¶
Forward call to self.core_env.
- property is_single_substep_env: bool¶
Checks whether this env is a single sub-step environment.
- Returns:
[bool] True if there is a single agent in the environment.
- maze_env¶
Direct access to the maze env (useful for bypassing the wrapper hierarchy).
- metadata¶
Only there to be compatible with gymnasium.core.Env
- noop_action() Dict[str, int | numpy.ndarray]¶
(overrides Wrapper) Helper function for accessing the noop action for the current step, compatible with the Wrapper interface.
- property observation_conversion¶
Return the state to observation mapping for the current policy.
- observation_conversion_dict¶
The observation conversion mapping used by this env.
- property observation_space¶
Keep this env compatible with the gym interface by returning the observation space of the current policy.
- property observation_spaces_dict: Dict[int | str, gymnasium.spaces.Space]¶
(overrides StructuredEnvSpacesMixin) Policy observation spaces as dict.
- reset() Dict[str, numpy.ndarray]¶
(overrides BaseEnv) Resets the environment and returns the initial observation.
- Returns:
The initial observation after resetting.
- reward_range¶
A tuple (reward min value, reward max value) to be compatible with gymnasium.core.Env
- seed(seed: Any) None¶
(overrides BaseEnv) Forward call to self.core_env.
- set_core_env(core_env: CoreEnv) None¶
Helper method for setting the core env to a new, different core env instance while maintaining the same core env context object (to not break event reporting, callbacks etc.).
Helpful e.g. during deployment, when simulation in core env is not needed, and we are just mirroring the production environment instead.
The old core env instance is no longer referenced and should be discarded.
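The swap described for set_core_env, replacing the core env while keeping the same context object, might look roughly like the sketch below. All classes and attribute names (`ToyContext`, `ToyCoreEnv`, `ToyEnv`, `context`, `events`) are hypothetical, and the real method may transfer the context differently.

```python
from typing import List, Optional


class ToyContext:
    """Hypothetical context object carrying event subscriptions."""

    def __init__(self) -> None:
        self.events: List[str] = []


class ToyCoreEnv:
    """Hypothetical core env that owns a context."""

    def __init__(self, name: str, context: Optional[ToyContext] = None) -> None:
        self.name = name
        self.context = context if context is not None else ToyContext()


class ToyEnv:
    """Hypothetical wrapper env (not the real MazeEnv)."""

    def __init__(self, core_env: ToyCoreEnv) -> None:
        self.core_env = core_env

    def set_core_env(self, core_env: ToyCoreEnv) -> None:
        # Hand the existing context to the new core env so that anything
        # registered on it (events, callbacks) keeps working.
        core_env.context = self.core_env.context
        # Drop the reference to the old core env.
        self.core_env = core_env


env = ToyEnv(ToyCoreEnv("simulation"))
shared_context = env.core_env.context
shared_context.events.append("subscribed")

env.set_core_env(ToyCoreEnv("production-mirror"))
```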
- spec¶
Only there to be compatible with gymnasium.core.Env