BCLoss

class maze.train.trainers.imitation.bc_loss.BCLoss(action_spaces_dict: Dict[int | str, gymnasium.spaces.Dict], entropy_coef: float, loss_discrete: torch.nn.Module = torch.nn.CrossEntropyLoss, loss_box: torch.nn.Module = torch.nn.MSELoss, loss_multi_binary: torch.nn.Module = torch.nn.BCEWithLogitsLoss)

Loss function for behavioral cloning.

action_spaces_dict: Dict[int | str, gymnasium.spaces.Dict]: Action spaces we are training on (used to determine the appropriate loss function for each sub-space)

calculate_loss(policy: TorchPolicy, observations: List[Dict[str, numpy.ndarray]], actions: List[Dict[str, torch.Tensor]], action_logits: List[Dict[str, torch.Tensor]] | None, actor_ids: List[ActorID], events: ImitationEvents, log_substep_events: bool) → torch.Tensor

Calculate and return the training loss for one step (= multiple sub-steps in structured scenarios).

Parameters:

policy – Structured policy to evaluate.
observations – List of observations, one per entry in actor_ids.
actions – List of actions, one per entry in actor_ids.
action_logits – The optional action logits of the policy.
actor_ids – List of actor ids.
events – Events of current episode.
log_substep_events – Whether to log the sub-step events individually.

Returns:

Total loss
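To illustrate what such a total loss can look like for a single discrete action head, the sketch below combines a cross-entropy imitation term with an entropy bonus weighted by entropy_coef. It uses only standard PyTorch calls. The helper name bc_loss_discrete and the sign convention (subtracting the weighted entropy) are assumptions made for illustration and are not taken from the Maze source, so the actual calculate_loss may differ.

```python
import torch
from torch import nn
from torch.distributions import Categorical


def bc_loss_discrete(logits: torch.Tensor,
                     target_actions: torch.Tensor,
                     entropy_coef: float) -> torch.Tensor:
    """Hypothetical helper: imitation loss for one discrete action head.

    logits:         (batch, num_actions) unnormalized policy outputs
    target_actions: (batch,) expert action indices (int64)
    """
    # Imitation term: how well the policy logits predict the expert actions.
    imitation = nn.CrossEntropyLoss()(logits, target_actions)
    # Mean policy entropy over the batch.
    entropy = Categorical(logits=logits).entropy().mean()
    # Assumed convention: higher entropy lowers the loss.
    return imitation - entropy_coef * entropy


torch.manual_seed(0)
logits = torch.randn(4, 3, requires_grad=True)  # batch of 4, 3 actions
targets = torch.tensor([0, 2, 1, 2])            # expert actions

loss = bc_loss_discrete(logits, targets, entropy_coef=0.01)
loss.backward()  # the result is a scalar tensor, ready for an optimizer step
```

With entropy_coef set to 0.0 this reduces to plain cross-entropy against the expert actions.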

entropy_coef: float: Weight of entropy loss

loss_box: torch.nn.Module: Loss function used for box (continuous) spaces

loss_discrete: torch.nn.Module: Loss function used for discrete (categorical) spaces

loss_multi_binary: torch.nn.Module: Loss function used for multi-binary spaces
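The three default loss modules expect different input shapes and target dtypes. The following sketch shows them on dummy tensors shaped like the outputs for a Discrete(3), a Box with shape (2,), and a MultiBinary(4) space. The tensor names and the choice to instantiate the modules directly are assumptions for illustration; how BCLoss invokes them internally is not shown here.

```python
import torch
from torch import nn

# Instances of the default loss modules named in the signature above.
loss_discrete = nn.CrossEntropyLoss()
loss_box = nn.MSELoss()
loss_multi_binary = nn.BCEWithLogitsLoss()

batch = 5

# Discrete(3): logits of shape (batch, 3), integer class targets of shape (batch,).
discrete_logits = torch.zeros(batch, 3)
discrete_target = torch.tensor([0, 1, 2, 1, 0])

# Box with shape (2,): real-valued predictions and targets of equal shape.
box_pred = torch.zeros(batch, 2)
box_target = torch.ones(batch, 2)

# MultiBinary(4): one logit per bit, float targets in {0.0, 1.0}.
mb_logits = torch.zeros(batch, 4)
mb_target = torch.tensor([[1, 0, 1, 0]] * batch, dtype=torch.float32)

losses = {
    "discrete": loss_discrete(discrete_logits, discrete_target).item(),
    "box": loss_box(box_pred, box_target).item(),
    "multi_binary": loss_multi_binary(mb_logits, mb_target).item(),
}
print(losses)
```

Note that CrossEntropyLoss takes int64 class indices, whereas BCEWithLogitsLoss requires float targets of the same shape as the logits; passing the wrong dtype raises an error.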