maze.train.trainers.common.actor_critic.actor_critic_events.ActorCriticEvents
Event interface, defining statistics emitted by the A2CTrainer.
critic_grad_norm
gradient norm of the step critic
critic_value
critic value of the step critic
critic_value_loss
optimization loss of the step critic
learning_rate
optimizer learning rate
policy_entropy
entropy of the step policies
policy_grad_norm
gradient norm of the step policies
policy_loss
optimization loss of the step policy
time_epoch
time required for epoch
time_rollout
time required for rollout
time_update
time required for update
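
The statistics listed above can be pictured as named channels that a trainer writes scalar values into. The following is a minimal, self-contained sketch of that idea and is not Maze's implementation: the class `MeanRecorder`, the `record` and `summary` methods, and the assumption that every statistic is a single float are all hypothetical and introduced only for illustration. Only the statistic names are taken from the list above.

```python
import time
from collections import defaultdict
from statistics import mean

# Statistic names copied from the ActorCriticEvents listing above.
STAT_NAMES = (
    "critic_grad_norm", "critic_value", "critic_value_loss", "learning_rate",
    "policy_entropy", "policy_grad_norm", "policy_loss",
    "time_epoch", "time_rollout", "time_update",
)


class MeanRecorder:
    """Hypothetical stand-in (not part of Maze): stores values per statistic."""

    def __init__(self):
        self._values = defaultdict(list)

    def record(self, name: str, value: float) -> None:
        # Reject names that are not in the listing, to catch typos early.
        if name not in STAT_NAMES:
            raise KeyError(f"unknown statistic: {name}")
        self._values[name].append(float(value))

    def summary(self) -> dict:
        # Average all values recorded so far for each statistic.
        return {name: mean(vals) for name, vals in self._values.items()}


if __name__ == "__main__":
    recorder = MeanRecorder()
    recorder.record("policy_loss", 0.5)
    recorder.record("policy_loss", 1.5)
    recorder.record("learning_rate", 0.001)

    # Time a placeholder workload the way a rollout might be timed.
    start = time.perf_counter()
    sum(range(1000))
    recorder.record("time_rollout", time.perf_counter() - start)

    print(recorder.summary()["policy_loss"])  # prints 1.0
```

Averaging is only one possible aggregation, chosen here for brevity. How the actual trainer aggregates and logs these events is not shown in this section.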