ReplayRecordedActionsPolicy

class maze.core.agent.replay_recorded_actions_policy.ReplayRecordedActionsPolicy(action_record_path: Optional[Union[maze.core.trajectory_recording.records.action_record.ActionRecord, str]], with_agent_actions: bool)

A replay policy that, in each (sub-)step, executes the action stored for that step in the provided action record.

Parameters
  • action_record_path – Action record or path to action record dump.

  • with_agent_actions – If True, agent actions are returned; otherwise MazeActions are returned.

compute_action(observation: Dict[str, numpy.ndarray], maze_state: Optional[Any], env: Optional[maze.core.env.maze_env.MazeEnv], actor_id: Optional[maze.core.env.structured_env.ActorID] = None, deterministic: bool = False) → Union[Dict[str, Union[int, numpy.ndarray]], Any]

(overrides Policy)

Deterministically returns the action stored in the action record for the respective step.
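The replay idea can be sketched without Maze itself. The block below is a minimal, hypothetical stand-in and not the library's implementation: `ToyActionRecord`, `ToyReplayPolicy`, the internal step counter, and the `"move"` action key are all assumptions made for illustration.

```python
from dataclasses import dataclass, field
from typing import Any, Dict, List


@dataclass
class ToyActionRecord:
    """Hypothetical stand-in for an action record: one action per step."""
    actions: List[Dict[str, int]] = field(default_factory=list)


class ToyReplayPolicy:
    """Hypothetical replay policy: returns the recorded action for each step."""

    def __init__(self, record: ToyActionRecord) -> None:
        self.record = record
        self.step = 0  # assumption: steps are counted inside the policy

    def compute_action(self, observation: Dict[str, Any]) -> Dict[str, int]:
        # The observation is ignored; the result depends only on the step index.
        if self.step >= len(self.record.actions):
            raise IndexError("no recorded action left for step %d" % self.step)
        action = self.record.actions[self.step]
        self.step += 1
        return action


record = ToyActionRecord(actions=[{"move": 0}, {"move": 2}, {"move": 1}])
policy = ToyReplayPolicy(record)

# Replaying three steps yields the recorded actions in their original order.
replayed = [policy.compute_action({"obs": None}) for _ in range(3)]
```

Because the returned action depends only on the step index, two replays of the same record produce identical action sequences, which is what makes such a policy useful for reproducing a recorded episode.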

compute_top_action_candidates(observation: Dict[str, numpy.ndarray], num_candidates: int, maze_state: Optional[Any], env: Optional[maze.core.env.maze_env.MazeEnv], actor_id: Union[str, int] = None) → Tuple[Sequence[Dict[str, Union[int, numpy.ndarray]]], Sequence[float]]

(overrides Policy)

Implementation of compute_top_action_candidates.

load_action_record(action_record_path: Union[maze.core.trajectory_recording.records.action_record.ActionRecord, str]) → None

Load the action record, given either as a record object or as a path to a record dump.

Parameters
  • action_record_path – Action record or path to action record dump.

needs_env() → bool

(overrides Policy)

This policy does not require the env object to compute the action.

needs_state() → bool

(overrides Policy)

This policy requires the state object to compute the action.
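The two flags above tell a caller which optional arguments of compute_action actually need to be supplied. The following is a hedged sketch of how a caller could honor them; `ToyPolicy` and `gather_inputs` are hypothetical names, and whether Maze's own rollout code works this way is not stated here.

```python
from typing import Any, Dict


class ToyPolicy:
    """Hypothetical policy exposing the two capability flags described above."""

    def needs_state(self) -> bool:
        return True   # the state object is required

    def needs_env(self) -> bool:
        return False  # the env object is not required


def gather_inputs(policy: ToyPolicy, state: Any, env: Any) -> Dict[str, Any]:
    """Pass only what the policy asks for; everything else stays None."""
    return {
        "maze_state": state if policy.needs_state() else None,
        "env": env if policy.needs_env() else None,
    }


# The state is forwarded, while the env is withheld because it is not needed.
inputs = gather_inputs(ToyPolicy(), state={"step": 3}, env=object())
```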

seed(seed: int) → None

(overrides Policy)

Seed the policy.