TemplateModelComposer

class maze.perception.models.template_model_composer.TemplateModelComposer(action_spaces_dict: Dict[Union[str, int], gym.spaces.Dict], observation_spaces_dict: Dict[Union[str, int], gym.spaces.Dict], agent_counts_dict: Dict[Union[str, int], int], distribution_mapper_config: Union[None, Mapping[str, Any], Any], model_builder: Union[None, Mapping[str, Any], Any, Type[maze.perception.builders.base.BaseModelBuilder]], policy: Union[None, Mapping[str, Any], Any], critic: Union[None, Mapping[str, Any], Any])

Composes template models from configs.

Parameters
  • action_spaces_dict – Dict of sub-step id to action space.

  • observation_spaces_dict – Dict of sub-step id to observation space.

  • agent_counts_dict – Dict of sub-step id to the number of agents.

  • distribution_mapper_config – Distribution mapper configuration.

  • model_builder – The model builder (template) to create the model from.

  • policy – Specifies the policy type as a configType. E.g. {'type': maze.perception.models.policies.ProbabilisticPolicyComposer} specifies a probabilistic policy.

  • critic – Specifies the critic type as a configType. E.g. {'type': maze.perception.models.critics.StateCriticComposer} specifies the single step state critic.
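As a minimal sketch of the two config entries described above, the snippet below writes the policy and critic configs as plain dicts. It assumes the type may be given as a dotted-path string (the text above shows the path without quotes, so it may equally be the class object itself), and the name composer_kwargs is hypothetical, used only to group the two entries.

```python
from typing import Any, Dict

# Config dicts in the {'type': ...} form shown in the parameter descriptions.
# Assumption: the type is written as a dotted-path string so that this snippet
# runs without Maze installed; the class object itself may be expected instead.
policy_config: Dict[str, Any] = {
    "type": "maze.perception.models.policies.ProbabilisticPolicyComposer",
}
critic_config: Dict[str, Any] = {
    "type": "maze.perception.models.critics.StateCriticComposer",
}

# Hypothetical container grouping the two entries by constructor argument name.
composer_kwargs: Dict[str, Dict[str, Any]] = {
    "policy": policy_config,
    "critic": critic_config,
}

# Sanity check: each entry names the type it refers to.
for name, cfg in composer_kwargs.items():
    if "type" not in cfg:
        raise KeyError(f"{name} config is missing the 'type' entry")
```

The remaining constructor arguments (the space dicts, agent counts, distribution mapper config and model builder) are left out here because their concrete values depend on the environment.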

classmethod check_model_config(model_config: Union[None, Mapping[str, Any], Any]) → None

Checks the provided model config for consistency.

Parameters

model_config – The model config to check.

property critic

(overrides BaseModelComposer)

Implementation of the BaseModelComposer interface, returns the value networks.

property policy

(overrides BaseModelComposer)

Implementation of the BaseModelComposer interface, returns the policy networks.

template_perception_net(observation_space: gym.spaces.Dict) → maze.perception.blocks.inference.InferenceBlock

Compiles a template perception network for a given observation space.

Parameters

observation_space – The observation space to build the model for.

Returns

A Perception Inference Block.

template_policy_net(observation_space: gym.spaces.Dict, action_space: gym.spaces.Dict, shared_embedding_keys: List[str]) → Tuple[maze.perception.blocks.inference.InferenceBlock, maze.perception.blocks.inference.InferenceBlock]

Compiles a template policy network.

Parameters
  • observation_space – The input observations for the perception network.

  • action_space – The action space that defines the network action heads.

  • shared_embedding_keys – The list of embedding keys for this substep’s model.

Returns

A policy network (actor) InferenceBlock, as well as the embedding net InferenceBlock if shared keys have been specified.

template_q_value_net(observation_space: Optional[gym.spaces.Dict], action_space: gym.spaces.Dict, only_discrete_spaces: bool, perception_net: Optional[maze.perception.blocks.inference.InferenceBlock] = None) → maze.perception.blocks.inference.InferenceBlock

Compiles a template state action (Q) value network.

Parameters
  • observation_space – The input observations for the perception network.

  • action_space – The action space that defines the network action heads.

  • perception_net – An initial network to continue from (e.g. useful for shared weights). Model building continues from the key 'latent'.

  • only_discrete_spaces – Whether the action space of this step holds only discrete action spaces.

Returns

A Q-value network (critic) InferenceBlock.

template_value_net(observation_space: Optional[gym.spaces.Dict], shared_embedding_keys: List[str] = None, perception_net: Optional[maze.perception.blocks.inference.InferenceBlock] = None) → maze.perception.blocks.inference.InferenceBlock

Compiles a template value network.

Parameters
  • observation_space – The input observations for the perception network.

  • shared_embedding_keys – The shared embedding keys for this substep’s model (input).

  • perception_net – The embedding network of the policy network if shared keys have been specified, in order to reuse the block and share the embedding.

Returns

A value network (critic) InferenceBlock.
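The descriptions of template_policy_net and template_value_net suggest that the embedding net returned by the former can be handed to the latter as perception_net. The sketch below chains the two calls using only the parameter names from the signatures on this page. The helper name build_shared_actor_critic is hypothetical, and passing observation_space through to template_value_net when a perception_net is supplied is an assumption (the signature marks it as Optional, but the expected value in that case is not stated here).

```python
from typing import Any, List, Tuple


def build_shared_actor_critic(
    composer: Any,
    observation_space: Any,
    action_space: Any,
    shared_embedding_keys: List[str],
) -> Tuple[Any, Any]:
    """Hypothetical helper: build a policy net and a value net sharing one embedding."""
    # template_policy_net returns the policy net together with the embedding net
    # (the latter is documented as returned when shared keys are specified).
    policy_net, embedding_net = composer.template_policy_net(
        observation_space=observation_space,
        action_space=action_space,
        shared_embedding_keys=shared_embedding_keys,
    )
    # Reuse the embedding block for the critic via perception_net.
    # Assumption: observation_space is passed through unchanged.
    value_net = composer.template_value_net(
        observation_space=observation_space,
        shared_embedding_keys=shared_embedding_keys,
        perception_net=embedding_net,
    )
    return policy_net, value_net
```

Here composer stands for an already constructed TemplateModelComposer; the helper relies only on the two method signatures and treats the returned blocks as opaque objects.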