ReplayRecordedActionsPolicy

class maze.core.agent.replay_recorded_actions_policy.ReplayRecordedActionsPolicy(action_record_path: ActionRecord | str | None, with_agent_actions: bool)

A policy that replays an action record: in each (sub-)step it executes the action stored for that step in the provided record.

Parameters:
  • action_record_path – Action record or path to action record dump.

  • with_agent_actions – If True, agent actions are returned; otherwise MazeActions are returned.

compute_action(observation: Dict[str, numpy.ndarray], maze_state: Any | None, env: MazeEnv | None, actor_id: ActorID | None = None, deterministic: bool = False) Dict[str, int | numpy.ndarray] | Any

(overrides Policy)

Deterministically returns the action stored in the action record for the current step.
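The replay behaviour described above can be illustrated with a minimal, self-contained sketch. It does not use the Maze classes: ToyActionRecord and ToyReplayPolicy are hypothetical stand-ins, and the internal step counter is an assumption made for illustration (the actual policy may derive the current step differently, for example from the state object).

```python
from typing import Any, Dict, List, Optional


class ToyActionRecord:
    """Hypothetical stand-in for an action record: one entry per (sub-)step."""

    def __init__(self, agent_actions: List[Dict[str, int]], maze_actions: List[Any]):
        self.agent_actions = agent_actions
        self.maze_actions = maze_actions


class ToyReplayPolicy:
    """Illustrative replay policy; not the Maze implementation."""

    def __init__(self, record: ToyActionRecord, with_agent_actions: bool):
        self.record = record
        self.with_agent_actions = with_agent_actions
        self.step = 0  # assumption: this sketch tracks the step index itself

    def compute_action(self, observation: Optional[Dict[str, Any]] = None) -> Any:
        # The observation is ignored: the recorded action is returned as-is.
        if self.with_agent_actions:
            actions = self.record.agent_actions
        else:
            actions = self.record.maze_actions
        action = actions[self.step]
        self.step += 1
        return action


record = ToyActionRecord(
    agent_actions=[{"move": 0}, {"move": 2}],
    maze_actions=["north", "south"],
)
policy = ToyReplayPolicy(record, with_agent_actions=True)
replayed = [policy.compute_action(), policy.compute_action()]
```

Because the output depends only on the record and the step index, replaying the same record twice yields the same action sequence, which is what makes such a policy useful for reproducing recorded episodes.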

compute_top_action_candidates(observation: Dict[str, numpy.ndarray], num_candidates: int, maze_state: Any | None, env: MazeEnv | None, actor_id: str | int | None = None) Tuple[Sequence[Dict[str, int | numpy.ndarray]], Sequence[float]]

(overrides Policy)

Implementation of compute_top_action_candidates.

load_action_record(action_record_path: ActionRecord | str) None

Load the action record, either from an ActionRecord instance or from the path to an action record dump.

Parameters:
  • action_record_path – Action record or path to action record dump.
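The "record or path" argument pattern can be sketched as follows. This is an illustration only: load_toy_record is a hypothetical helper, the record is modelled as a plain list, and the use of pickle as the dump format is an assumption rather than something this page states.

```python
import os
import pickle
import tempfile
from typing import Dict, List, Union

# Hypothetical record type for this sketch: one action dict per step.
ToyRecord = List[Dict[str, int]]


def load_toy_record(record_or_path: Union[ToyRecord, str]) -> ToyRecord:
    """Return the record itself, or load it from a dump if a path is given."""
    if isinstance(record_or_path, str):
        # Assumption: the dump is a pickle file.
        with open(record_or_path, "rb") as f:
            return pickle.load(f)
    return record_or_path


original = [{"move": 1}, {"move": 3}]

# Write a dump to a temporary directory and load it back via its path.
with tempfile.TemporaryDirectory() as tmp_dir:
    dump_path = os.path.join(tmp_dir, "record.pkl")
    with open(dump_path, "wb") as f:
        pickle.dump(original, f)
    from_path = load_toy_record(dump_path)

# Passing the in-memory record returns it unchanged.
from_object = load_toy_record(original)
```

Accepting both forms lets callers replay a record they already hold in memory without writing it to disk first.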

needs_env() bool

(overrides Policy)

This policy does not require the env object to compute the action.

needs_state() bool

(overrides Policy)

This policy requires the state object to compute the action.

seed(seed: int) None

(overrides Policy)

Seed the policy.