IMPALA

class maze.train.trainers.impala.impala_trainer.IMPALA(algorithm_config: maze.train.trainers.impala.impala_algorithm_config.ImpalaAlgorithmConfig, rollout_generator: maze.train.parallelization.distributed_actors.distributed_actors.DistributedActors, evaluator: Optional[maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator], model: maze.core.agent.torch_actor_critic.TorchActorCritic, model_selection: Optional[maze.train.trainers.common.model_selection.best_model_selection.BestModelSelection])

Trainer for IMPALA (Importance Weighted Actor-Learner Architecture), a multi-step advantage actor-critic algorithm.
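To make the "multi-step advantage" idea concrete, here is a minimal sketch of how bootstrapped multi-step returns and advantages can be computed for one rollout. It is an illustration only, not the Maze implementation: the function name `n_step_advantages` and its parameters are hypothetical, and the off-policy importance weighting (V-trace) described in the IMPALA paper is deliberately left out.

```python
from typing import List


def n_step_advantages(rewards: List[float], values: List[float],
                      bootstrap_value: float, gamma: float) -> List[float]:
    """Hypothetical helper: advantage = discounted multi-step return - value estimate.

    rewards[t] and values[t] belong to step t of a single rollout;
    bootstrap_value is the critic's estimate for the state after the last step.
    """
    returns: List[float] = []
    running = bootstrap_value
    # Walk the rollout backwards, accumulating the discounted return.
    for reward in reversed(rewards):
        running = reward + gamma * running
        returns.append(running)
    returns.reverse()
    # Subtract the critic's estimate to obtain the advantage at each step.
    return [ret - value for ret, value in zip(returns, values)]


advantages = n_step_advantages(
    rewards=[1.0, 1.0, 1.0], values=[0.0, 0.0, 0.0],
    bootstrap_value=0.0, gamma=0.5)
```

With zero value estimates the advantages equal the discounted returns, so earlier steps in the rollout get larger values than later ones.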

train(n_epochs: Optional[int] = None) → None

(overrides Trainer)

Wraps the regular train function so that all processes are closed properly.

Parameters

n_epochs – Number of epochs to train.
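The wrapping described for train can be sketched with a try/finally block. This is a sketch under stated assumptions, not the Maze source: `SketchActors`, `SketchTrainer`, `stop` and `_train_loop` are hypothetical names, and treating `n_epochs=None` as a single epoch is an assumption made only so the example runs.

```python
from typing import List, Optional


class SketchActors:
    """Hypothetical stand-in for a pool of actor processes."""

    def __init__(self) -> None:
        self.running = True

    def stop(self) -> None:
        self.running = False


class SketchTrainer:
    """Hypothetical trainer; illustrates the wrapping pattern only."""

    def __init__(self, actors: SketchActors) -> None:
        self.actors = actors
        self.epochs_run: List[int] = []

    def _train_loop(self, n_epochs: Optional[int]) -> None:
        # Assumption: None falls back to one epoch in this sketch only.
        for epoch in range(1 if n_epochs is None else n_epochs):
            self.epochs_run.append(epoch)

    def train(self, n_epochs: Optional[int] = None) -> None:
        try:
            self._train_loop(n_epochs)
        finally:
            # Runs after normal completion and after an exception alike,
            # so the actors are never left running.
            self.actors.stop()


actors = SketchActors()
trainer = SketchTrainer(actors)
trainer.train(n_epochs=3)
```

Putting the cleanup in `finally` rather than after the loop means the actors are also stopped when training is interrupted or raises.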