ImpalaAlgorithmConfig
-
class maze.train.trainers.impala.impala_algorithm_config.ImpalaAlgorithmConfig(n_epochs: int, epoch_length: int, patience: int, critic_burn_in_epochs: int, n_rollout_steps: int, lr: float, gamma: float, policy_loss_coef: float, value_loss_coef: float, entropy_coef: float, max_grad_norm: float, device: str, queue_out_of_sync_factor: float, actors_batch_size: int, num_actors: int, vtrace_clip_rho_threshold: float, vtrace_clip_pg_rho_threshold: float, rollout_evaluator: maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator)
Algorithm parameters for Impala.
-
queue_out_of_sync_factor: float
This factor multiplied by actors_batch_size gives the size of the queue that holds the actors' output collected by the learner. Consequently, the rollouts can be at most (queue_out_of_sync_factor + num_actors / actors_batch_size) out of sync with the learner policy.
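The arithmetic in this description can be made concrete with a small sketch. It uses the parameter names from the class signature (actors_batch_size, num_actors); the two helper functions are hypothetical and not part of maze, and whether the library rounds the queue size to an integer is not stated here, so the product is returned as is.

```python
def queue_size(queue_out_of_sync_factor: float, actors_batch_size: int) -> float:
    """Hypothetical helper: queue size as described in the docstring.

    Any rounding applied by the library itself is not covered here.
    """
    return queue_out_of_sync_factor * actors_batch_size


def max_out_of_sync(queue_out_of_sync_factor: float,
                    num_actors: int,
                    actors_batch_size: int) -> float:
    """Hypothetical helper: upper bound on how far rollouts can lag the learner policy."""
    return queue_out_of_sync_factor + num_actors / actors_batch_size


# Illustrative values only, not recommended settings.
size = queue_size(queue_out_of_sync_factor=1.0, actors_batch_size=8)
bound = max_out_of_sync(queue_out_of_sync_factor=1.0, num_actors=4, actors_batch_size=8)
print(size, bound)
```

With these illustrative values the queue holds 8 actor outputs and the bound evaluates to 1.5. Raising queue_out_of_sync_factor enlarges the queue and loosens the bound by the same amount.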
-
rollout_evaluator: maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator
Rollout evaluator.
-