RewardWrapper

class maze.core.wrappers.wrapper.RewardWrapper(env: EnvType)

A Wrapper with typing support that modifies the reward before it is passed to the agent.

get_observation_and_action_dicts(maze_state: Any | None, maze_action: Any | None, first_step_in_episode: bool) → Tuple[Dict[int | str, Any] | None, Dict[int | str, Any] | None]

(overrides Wrapper)

Keeps both actions and observations unchanged.

abstract reward(reward: Any) → Any: Reward mapping method. Because it is abstract, each subclass must implement it.

step(action) → Tuple[Any, Any, bool, Dict[Any, Any]]: Intercepts BaseEnv.step and maps the reward.
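
The interplay between reward and step can be illustrated with a small, self-contained sketch. It does not import Maze: `ToyEnv`, `SketchRewardWrapper`, and `ClipReward` are hypothetical stand-ins written for this example. The way `step` unpacks a four-element tuple and passes only the reward through `reward` is an assumption based on the signatures above, not a copy of the library's implementation.

```python
from abc import ABC, abstractmethod
from typing import Any, Dict, Tuple


class ToyEnv:
    """Hypothetical stand-in environment (not part of Maze)."""

    def step(self, action: int) -> Tuple[Any, Any, bool, Dict[Any, Any]]:
        # The reward is ten times the action; the episode never ends.
        return {"obs": 0}, 10.0 * action, False, {}


class SketchRewardWrapper(ABC):
    """Stand-in mirroring the documented interface (assumed structure)."""

    def __init__(self, env: Any):
        self.env = env

    @abstractmethod
    def reward(self, reward: Any) -> Any:
        """Reward mapping method, implemented by subclasses."""

    def step(self, action) -> Tuple[Any, Any, bool, Dict[Any, Any]]:
        # Intercept the wrapped env's step and map only the reward.
        observation, reward, done, info = self.env.step(action)
        return observation, self.reward(reward), done, info


class ClipReward(SketchRewardWrapper):
    """Example subclass: clip rewards to the interval [low, high]."""

    def __init__(self, env: Any, low: float = -1.0, high: float = 1.0):
        super().__init__(env)
        self.low, self.high = low, high

    def reward(self, reward: float) -> float:
        return max(self.low, min(self.high, reward))


env = ClipReward(ToyEnv())
obs, rew, done, info = env.step(action=3)  # raw reward 30.0 is clipped
```

With the real class, a subclass would presumably only need to implement `reward`, since `step` is already provided by `RewardWrapper`.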