PPOAlgorithmConfig
class maze.train.trainers.ppo.ppo_algorithm_config.PPOAlgorithmConfig(n_epochs: int, epoch_length: int, patience: int, critic_burn_in_epochs: int, n_rollout_steps: int, lr: float, gamma: float, gae_lambda: float, policy_loss_coef: float, value_loss_coef: float, entropy_coef: float, max_grad_norm: float, device: str, batch_size: int, n_optimization_epochs: int, clip_range: float, rollout_evaluator: maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator)

    Algorithm parameters for the multi-step PPO model.

    rollout_evaluator: maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator
        Rollout evaluator.
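
The field list in the signature can be illustrated with a minimal sketch. It does not import Maze: `PPOConfigSketch` is a hypothetical stand-in dataclass that only mirrors the documented parameter names and types, every value shown is an illustrative assumption rather than a recommended or default setting, and `rollout_evaluator` is left as a `None` placeholder where a `RolloutEvaluator` instance would go. The helper `clipped_surrogate` is likewise an assumption: it shows how a `clip_range` parameter is used in the standard PPO clipped objective, not necessarily how Maze implements it.

```python
from dataclasses import dataclass
from typing import Any


@dataclass
class PPOConfigSketch:
    """Hypothetical stand-in mirroring the documented PPOAlgorithmConfig fields."""
    n_epochs: int
    epoch_length: int
    patience: int
    critic_burn_in_epochs: int
    n_rollout_steps: int
    lr: float
    gamma: float
    gae_lambda: float
    policy_loss_coef: float
    value_loss_coef: float
    entropy_coef: float
    max_grad_norm: float
    device: str
    batch_size: int
    n_optimization_epochs: int
    clip_range: float
    rollout_evaluator: Any  # a RolloutEvaluator in Maze; placeholder type here


def clipped_surrogate(ratio: float, advantage: float, clip_range: float) -> float:
    """Standard PPO clipped surrogate term for a single sample (assumed form)."""
    # Clamp the probability ratio to [1 - clip_range, 1 + clip_range].
    clipped_ratio = max(1.0 - clip_range, min(ratio, 1.0 + clip_range))
    # Take the pessimistic (smaller) of the unclipped and clipped objectives.
    return min(ratio * advantage, clipped_ratio * advantage)


# All values below are illustrative assumptions, not library defaults.
config = PPOConfigSketch(
    n_epochs=10,
    epoch_length=25,
    patience=5,
    critic_burn_in_epochs=0,
    n_rollout_steps=100,
    lr=0.0005,
    gamma=0.98,
    gae_lambda=1.0,
    policy_loss_coef=1.0,
    value_loss_coef=0.5,
    entropy_coef=0.00025,
    max_grad_norm=0.0,
    device="cpu",
    batch_size=100,
    n_optimization_epochs=4,
    clip_range=0.2,
    rollout_evaluator=None,  # placeholder for a RolloutEvaluator instance
)

# A ratio above 1 + clip_range with a positive advantage gets clipped.
surrogate = clipped_surrogate(ratio=1.5, advantage=1.0, clip_range=config.clip_range)
```

With the real class, the same keyword arguments would be passed to `PPOAlgorithmConfig` directly, with an actual `RolloutEvaluator` in place of the placeholder.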