TemplateModelComposer

class maze.perception.models.template_model_composer.TemplateModelComposer(action_spaces_dict: Dict[Union[str, int], gym.spaces.Dict], observation_spaces_dict: Dict[Union[str, int], gym.spaces.Dict], agent_counts_dict: Dict[Union[str, int], int], distribution_mapper_config: Union[None, Mapping[str, Any], Any], model_builder: Union[None, Mapping[str, Any], Any, Type[maze.perception.builders.base.BaseModelBuilder]], policy: Union[None, Mapping[str, Any], Any], critic: Union[None, Mapping[str, Any], Any])

Composes template models from configs.

Parameters

action_spaces_dict – Dict of sub-step id to action space.
observation_spaces_dict – Dict of sub-step id to observation space.
distribution_mapper_config – Distribution mapper configuration.
model_builder – The model builder (template) to create the model from.
policy – Specifies the policy type as a configType. E.g. {'type': maze.perception.models.policies.ProbabilisticPolicyComposer} specifies a probabilistic policy.
critic – Specifies the critic type as a configType. E.g. {'type': maze.perception.models.critics.StateCriticComposer} specifies the single-step state critic.
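
The shape of these arguments can be sketched in plain Python. This is a minimal illustration, not the library's own code: the class paths are written as strings so the snippet runs without Maze installed (the parameter descriptions above show the class object itself under the 'type' key), the placeholder strings stand in for gym.spaces.Dict instances, and check_substep_keys is a hypothetical helper that is not part of Maze.

```python
from typing import Any, Dict, Mapping, Union

StepKey = Union[str, int]

# Assumption: class paths given as strings so the sketch runs without Maze.
# The parameter docs show the class object itself under the 'type' key.
policy_config: Mapping[str, Any] = {
    "type": "maze.perception.models.policies.ProbabilisticPolicyComposer"
}
critic_config: Mapping[str, Any] = {
    "type": "maze.perception.models.critics.StateCriticComposer"
}


def check_substep_keys(**dicts: Dict[StepKey, Any]) -> None:
    """Hypothetical helper: verify that all per-sub-step dicts share the same ids."""
    key_sets = {name: set(d) for name, d in dicts.items()}
    if not key_sets:
        return
    reference = next(iter(key_sets.values()))
    for name, keys in key_sets.items():
        if keys != reference:
            raise ValueError(
                f"{name} has sub-step ids {sorted(keys, key=str)}, "
                f"expected {sorted(reference, key=str)}"
            )


# Placeholder values stand in for gym.spaces.Dict instances.
action_spaces_dict = {0: "action space of sub-step 0", 1: "action space of sub-step 1"}
observation_spaces_dict = {0: "observation space of sub-step 0", 1: "observation space of sub-step 1"}
agent_counts_dict = {0: 1, 1: 1}

# All three dicts are keyed by the same sub-step ids, so this passes silently.
check_substep_keys(
    action_spaces_dict=action_spaces_dict,
    observation_spaces_dict=observation_spaces_dict,
    agent_counts_dict=agent_counts_dict,
)
```

The three per-sub-step dictionaries are keyed by the same sub-step ids, which is why the sketch checks them together before they would be handed to the composer.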

classmethod check_model_config(model_config: Union[None, Mapping[str, Any], Any]) → None

Asserts the provided model config for consistency.

Parameters

model_config – The model config to check.

property critic

(overrides BaseModelComposer)

Implementation of the BaseModelComposer interface, returns the value networks.

property policy

(overrides BaseModelComposer)

Implementation of the BaseModelComposer interface, returns the policy networks.

template_perception_net(observation_space: gym.spaces.Dict) → maze.perception.blocks.inference.InferenceBlock

Compiles a template perception network for a given observation space.

Parameters

observation_space – The observation space to build the model for.

Returns

A perception InferenceBlock.

template_policy_net(observation_space: gym.spaces.Dict, action_space: gym.spaces.Dict, shared_embedding_keys: List[str]) → Tuple[maze.perception.blocks.inference.InferenceBlock, maze.perception.blocks.inference.InferenceBlock]

Compiles a template policy network.

Parameters

observation_space – The input observations for the perception network.
action_space – The action space that defines the network action heads.
shared_embedding_keys – The list of embedding keys for this substep’s model.

Returns

A policy network (actor) InferenceBlock, as well as the embedding net InferenceBlock if shared keys have been specified.

template_q_value_net(observation_space: Optional[gym.spaces.Dict], action_space: gym.spaces.Dict, only_discrete_spaces: bool, perception_net: Optional[maze.perception.blocks.inference.InferenceBlock] = None) → maze.perception.blocks.inference.InferenceBlock

Compiles a template state-action (Q) value network.

Parameters

observation_space – The input observations for the perception network.
action_space – The action space that defines the network action heads.
perception_net – An initial network to continue from (e.g. useful for shared weights). Model building continues from the key 'latent'.
only_discrete_spaces – A flag specifying whether the action spaces of this step hold only discrete action spaces.

Returns

A Q-value network (critic) InferenceBlock.

template_value_net(observation_space: Optional[gym.spaces.Dict], shared_embedding_keys: List[str] = None, perception_net: Optional[maze.perception.blocks.inference.InferenceBlock] = None) → maze.perception.blocks.inference.InferenceBlock

Compiles a template value network.

Parameters

observation_space – The input observations for the perception network.
shared_embedding_keys – The shared embedding keys for this substep's model (input).
perception_net – The embedding network of the policy network if shared keys have been specified, in order to reuse the block and share the embedding.

Returns

A value network (critic) InferenceBlock.