BCLoss

class maze.train.trainers.imitation.bc_loss.BCLoss(action_spaces_dict: Dict[int | str, gymnasium.spaces.Dict], entropy_coef: float, loss_discrete: torch.nn.Module = torch.nn.CrossEntropyLoss, loss_box: torch.nn.Module = torch.nn.MSELoss, loss_multi_binary: torch.nn.Module = torch.nn.BCEWithLogitsLoss)

Loss function for behavioral cloning.

action_spaces_dict: Dict[int | str, gymnasium.spaces.Dict]

Action spaces we are training on (used to determine the appropriate loss function for each action)

calculate_loss(policy: TorchPolicy, observations: List[Dict[str, numpy.ndarray]], actions: List[Dict[str, torch.Tensor]], action_logits: List[Dict[str, torch.Tensor]] | None, actor_ids: List[ActorID], events: ImitationEvents, log_substep_events: bool) → torch.Tensor

Calculate and return the training loss for one step (= multiple sub-steps in structured scenarios).

Parameters:
  • policy – Structured policy to evaluate.

  • observations – List of observations, aligned with actor_ids.

  • actions – List of actions, aligned with actor_ids.

  • action_logits – Optional action logits of the policy (may be None).

  • actor_ids – List of actor ids.

  • events – Events of current episode.

  • log_substep_events – Whether to log the sub-step events individually.

Returns:

Total loss
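To make the idea of one step spanning several sub-steps concrete, the sketch below combines per-sub-step losses into a single scalar tensor. It does not call calculate_loss itself: the tensors, the action names, and the choice to sum (rather than average) the sub-step losses are assumptions made for illustration.

```python
import torch
from torch import nn

loss_discrete = nn.CrossEntropyLoss()

# One list entry per sub-step; each dict maps an action name to a batch of 3 values.
target_actions = [
    {"move": torch.tensor([0, 2, 1])},
    {"pick": torch.tensor([1, 1, 0])},
]
# Stand-in logits that a policy might produce for the same sub-steps.
predicted_logits = [
    {"move": torch.zeros(3, 4, requires_grad=True)},
    {"pick": torch.zeros(3, 2, requires_grad=True)},
]

substep_losses = []
for targets, logits in zip(target_actions, predicted_logits):
    for name, target in targets.items():
        substep_losses.append(loss_discrete(logits[name], target))

# Assumption: sub-step losses are summed into one scalar total.
total_loss = torch.stack(substep_losses).sum()
total_loss.backward()  # gradients flow back to the logits of every sub-step
```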

entropy_coef: float

Weight of the entropy loss term
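A minimal sketch of how an entropy weight is commonly applied to an imitation loss. The sign convention (entropy subtracted as an exploration bonus), the use of the batch mean, and the example values are assumptions; they are not taken from the BCLoss implementation.

```python
import torch
from torch.distributions import Categorical

entropy_coef = 0.01  # example value, not a recommended default

logits = torch.tensor([[2.0, 0.5, 0.1], [0.3, 0.3, 0.3]])
targets = torch.tensor([0, 2])

# Imitation term: cross-entropy between policy logits and demonstrated actions.
imitation_loss = torch.nn.functional.cross_entropy(logits, targets)

# Mean entropy of the predicted action distribution over the batch.
entropy = Categorical(logits=logits).entropy().mean()

# Assumption: higher entropy is rewarded, so it lowers the total loss.
total_loss = imitation_loss - entropy_coef * entropy
```

With entropy_coef set to 0 the total reduces to the plain imitation loss.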

loss_box: torch.nn.Module

Loss function used for box (continuous) spaces

loss_discrete: torch.nn.Module

Loss function used for discrete (categorical) spaces

loss_multi_binary: torch.nn.Module

Loss function used for multi-binary spaces
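The three default loss modules expect differently shaped and typed inputs. The sketch below calls each on toy tensors using standard PyTorch only; the batch size and values are made up for illustration, and how BCLoss prepares its inputs internally is not shown here.

```python
import torch
from torch import nn

# Discrete: logits of shape (batch, n_classes), integer class indices as targets.
discrete_loss = nn.CrossEntropyLoss()(
    torch.zeros(2, 4),
    torch.tensor([1, 3]),
)

# Box: float predictions and float targets of the same shape.
box_loss = nn.MSELoss()(
    torch.tensor([[0.5], [-0.5]]),
    torch.tensor([[1.0], [0.0]]),
)

# Multi-binary: raw logits and float targets in {0, 1} of the same shape.
multi_binary_loss = nn.BCEWithLogitsLoss()(
    torch.zeros(2, 3),
    torch.tensor([[1.0, 0.0, 1.0], [0.0, 0.0, 1.0]]),
)
```

Note that CrossEntropyLoss and BCEWithLogitsLoss both take unnormalized logits, so no softmax or sigmoid should be applied beforehand.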