ESEvents

class maze.train.trainers.es.es_events.ESEvents

Event interface defining the statistics emitted by the ESTrainer.

policy_grad_norm(policy_id: int, value: float)

Gradient norm of the step policy identified by policy_id.

policy_norm(policy_id: int, value: float)

L2 norm of the parameters of the step policy identified by policy_id.
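
As an illustration of the statistic, the following minimal sketch computes an L2 norm over a set of parameter arrays by treating them as one flat vector. It is not taken from the Maze source: the helper name l2_norm and the nested-list parameter layout are assumptions made for this example.

```python
import math
from typing import Sequence


def l2_norm(param_arrays: Sequence[Sequence[float]]) -> float:
    """L2 norm of all values in all arrays, treated as one flat vector.

    Hypothetical helper for illustration; not part of the Maze API.
    """
    return math.sqrt(sum(v * v for arr in param_arrays for v in arr))


# Two toy "parameter arrays" standing in for the weights of one policy.
weights = [[3.0, 0.0], [4.0]]
print(l2_norm(weights))  # 5.0
```

The same computation applied to per-parameter gradients instead of parameter values would yield a gradient norm of the kind reported by policy_grad_norm.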

real_time(value: float)

Elapsed real (wall-clock) time per iteration (one iteration = one epoch).

update_ratio(value: float)

Ratio of the optimizer step size to the overall parameter magnitude: norm(optimizer step) / norm(all parameters).
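
The ratio above can be sketched in a few lines. This is a hedged example, not the ESTrainer implementation: the function names, the use of the L2 norm, and the flat-list representation of the step and the parameters are all assumptions made for illustration.

```python
import math
from typing import Sequence


def flat_l2_norm(values: Sequence[float]) -> float:
    """L2 norm of a flat vector (hypothetical helper)."""
    return math.sqrt(sum(v * v for v in values))


def update_ratio(step: Sequence[float], params: Sequence[float]) -> float:
    """norm(optimizer step) / norm(all parameters).

    Assumes both inputs are flattened to one vector each.
    Raises ZeroDivisionError if all parameters are zero.
    """
    return flat_l2_norm(step) / flat_l2_norm(params)


# Toy values: the step has norm 0.5, the parameters have norm 5.0.
step = [0.5, 0.0]
params = [3.0, 4.0]
print(update_ratio(step, params))  # 0.1
```

A small ratio means the optimizer step changes the parameters only slightly relative to their overall magnitude; a large ratio indicates a comparatively aggressive update.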