DefaultPolicy
-
class maze.core.agent.default_policy.DefaultPolicy(policies: Union[List[Union[None, Mapping[str, Any], Any]], Mapping[Union[str, Type], Union[None, Mapping[str, Any], Any]]])
Encapsulates one or more policies identified by policy IDs.
- Parameters
policies – Policy IDs and their corresponding policies, given as a dict (the type signature also accepts a list).
-
compute_action(observation: Dict[str, numpy.ndarray], maze_state: Optional[Any] = None, env: Optional[maze.core.env.base_env.BaseEnv] = None, actor_id: Optional[maze.core.env.structured_env.ActorID] = None, deterministic: bool = False) → Dict[str, Union[int, numpy.ndarray]]
(overrides Policy) Implementation of the Policy interface.
-
compute_top_action_candidates(observation: Dict[str, numpy.ndarray], num_candidates: Optional[int], maze_state: Optional[Any], env: Optional[maze.core.env.base_env.BaseEnv], actor_id: Optional[maze.core.env.structured_env.ActorID] = None) → Tuple[Sequence[Dict[str, Union[int, numpy.ndarray]]], Sequence[float]]
(overrides Policy) Implementation of the Policy interface.
-
needs_state() → bool
(overrides Policy) This policy does not require the state() object to compute the action.
-
policy_for(actor_id: Optional[maze.core.env.structured_env.ActorID]) → maze.core.agent.flat_policy.FlatPolicy
Return the policy corresponding to the given actor ID (or the single available policy if no actor ID is provided).
- Parameters
actor_id – Actor ID to get policy for
- Returns
Flat policy corresponding to the actor ID
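
The lookup described for policy_for can be illustrated with a small, self-contained sketch. It does not import maze: ActorID, ConstantPolicy and PolicyRouter are hypothetical stand-ins written for this example, and using the actor ID's step key as the policy ID is an assumption rather than something stated above.

```python
from typing import Any, Dict, Mapping, NamedTuple, Optional, Union


class ActorID(NamedTuple):
    """Hypothetical stand-in for an actor ID: a step key plus an agent index."""
    step_key: Union[str, int]
    agent_id: int


class ConstantPolicy:
    """Hypothetical flat policy that always returns the same action."""

    def __init__(self, action: Dict[str, int]):
        self.action = action

    def compute_action(self, observation: Dict[str, Any]) -> Dict[str, int]:
        return self.action


class PolicyRouter:
    """Sketch of a container that routes requests to sub-policies by policy ID."""

    def __init__(self, policies: Mapping[Union[str, int], ConstantPolicy]):
        self.policies = dict(policies)

    def policy_for(self, actor_id: Optional[ActorID]) -> ConstantPolicy:
        if actor_id is None:
            # Without an actor ID, the choice is only unambiguous for a single policy.
            if len(self.policies) != 1:
                raise ValueError("actor_id is required when multiple policies are available")
            return next(iter(self.policies.values()))
        # Assumption: the policy ID is the step key of the actor ID.
        return self.policies[actor_id.step_key]

    def compute_action(self, observation: Dict[str, Any],
                       actor_id: Optional[ActorID] = None) -> Dict[str, int]:
        # Delegate to whichever sub-policy is responsible for this actor.
        return self.policy_for(actor_id).compute_action(observation)


# Two sub-policies: an actor ID is needed to select one of them.
router = PolicyRouter({
    "move": ConstantPolicy({"direction": 1}),
    "pick": ConstantPolicy({"item": 0}),
})
move_action = router.compute_action({}, actor_id=ActorID("move", 0))

# A single sub-policy: the actor ID can be omitted.
single = PolicyRouter({0: ConstantPolicy({"direction": 2})})
single_action = single.compute_action({})
```

In this sketch, omitting the actor ID while several policies are registered raises an error, since there is no "single available policy" to fall back on. How the actual class handles that case is not specified here.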