ImitationEvents

class maze.train.trainers.imitation.imitation_events.ImitationEvents

Event interface defining statistics emitted by the imitation learning trainers.
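As a rough illustration of what such an event interface looks like from the calling side, the sketch below defines a hypothetical RecordingEvents class that mirrors two of the signatures documented for this class and keeps the emitted values in memory. It does not import or subclass anything from Maze; the class name, the storage layout and the mean helper are assumptions made purely for illustration.

```python
from collections import defaultdict
from statistics import fmean
from typing import Dict, List, Tuple, Union

StepId = Union[str, int]


class RecordingEvents:
    """Hypothetical stand-in mirroring two ImitationEvents signatures (not part of Maze)."""

    def __init__(self) -> None:
        # Maps (event name, step_id, agent_id) to the list of emitted values.
        self.values: Dict[Tuple[str, StepId, int], List[float]] = defaultdict(list)

    def policy_loss(self, step_id: StepId, agent_id: int, value: float) -> None:
        # Same parameters as the documented policy_loss event.
        self.values[("policy_loss", step_id, agent_id)].append(value)

    def policy_entropy(self, step_id: StepId, agent_id: int, value: float) -> None:
        # Same parameters as the documented policy_entropy event.
        self.values[("policy_entropy", step_id, agent_id)].append(value)

    def mean(self, event: str, step_id: StepId, agent_id: int) -> float:
        # Assumed aggregation: arithmetic mean of everything recorded so far.
        return fmean(self.values[(event, step_id, agent_id)])


events = RecordingEvents()
events.policy_loss(step_id=0, agent_id=0, value=0.5)
events.policy_loss(step_id=0, agent_id=0, value=1.5)
events.policy_entropy(step_id=0, agent_id=0, value=1.0)
mean_loss = events.mean("policy_loss", step_id=0, agent_id=0)
```

Keying the store by event name, step and agent follows the parameter lists of the documented events; how the actual trainers aggregate these values is not specified here.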

box_mean_abs_deviation(step_id: str | int, agent_id: int, subspace_name: str, value: int): Mean absolute deviation for box (continuous) subspaces.

discrete_accuracy(step_id: str | int, agent_id: int, subspace_name: str, value: int): Accuracy for discrete (categorical) subspaces.

discrete_action_rank(step_id: str | int, agent_id: int, subspace_name: str, value: int): Rank of target action in discrete (categorical) subspaces.

discrete_top_10_accuracy(step_id: str | int, agent_id: int, subspace_name: str, value: int): Top-10 accuracy for discrete (categorical) subspaces.

discrete_top_5_accuracy(step_id: str | int, agent_id: int, subspace_name: str, value: int): Top-5 accuracy for discrete (categorical) subspaces.

mean_step_discrete_accuracy(step_id: str | int, value: int): Accuracy for discrete (categorical) subspaces.

mean_step_discrete_action_rank(subspace_name: str, value: int): Rank of target action in discrete (categorical) subspaces.

mean_step_policy_entropy(step_id: str | int, value: float): Entropy of the step policies.

mean_step_policy_grad_norm(step_id: str | int, value: float): Gradient norm of the step policies.

mean_step_policy_l2_norm(step_id: str | int, value: float): L2 norm of the step policies.

mean_step_policy_loss(step_id: str | int, value: float): Optimization loss of the step policy.

multi_binary_accuracy(step_id: str | int, agent_id: int, subspace_name: str, value: int): Accuracy for multi-binary subspaces.

policy_entropy(step_id: str | int, agent_id: int, value: float): Entropy of the step policies.

policy_grad_norm(step_id: str | int, agent_id: int, value: float): Gradient norm of the step policies.

policy_l2_norm(step_id: str | int, agent_id: int, value: float): L2 norm of the step policies.

policy_loss(step_id: str | int, agent_id: int, value: float): Optimization loss of the step policy.

training_iterations(value: int): The number of training iterations before early stopping.
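To make the quantities behind the rank, top-k accuracy and mean absolute deviation events more concrete, here is a minimal sketch of how such values could be computed for a single sample. The helper names (action_rank, top_k_hit, mean_abs_deviation), the 0-based rank convention and the handling of ties are assumptions for illustration; the trainers may compute these statistics differently.

```python
from typing import Sequence


def action_rank(scores: Sequence[float], target: int) -> int:
    """Return the 0-based rank of the target action when sorted by descending score."""
    target_score = scores[target]
    # Count the actions that score strictly higher than the target action.
    return sum(1 for score in scores if score > target_score)


def top_k_hit(scores: Sequence[float], target: int, k: int) -> int:
    """Return 1 if the target action is among the k highest-scoring actions, else 0."""
    return int(action_rank(scores, target) < k)


def mean_abs_deviation(predicted: Sequence[float], target: Sequence[float]) -> float:
    """Mean absolute difference between predicted and target continuous actions."""
    return sum(abs(p - t) for p, t in zip(predicted, target)) / len(predicted)


# Hypothetical policy scores for four discrete actions; the target action is index 2.
scores = [0.1, 2.0, 0.7, 1.5]
rank = action_rank(scores, target=2)
hit_top_1 = top_k_hit(scores, target=2, k=1)
hit_top_5 = top_k_hit(scores, target=2, k=5)
mad = mean_abs_deviation([0.0, 1.0], [0.5, 2.0])
```

Under these assumptions, averaging top_k_hit over a batch with k set to 1, 5 or 10 would give plain, top-5 and top-10 accuracy respectively.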