ESTrainer

class maze.train.trainers.es.es_trainer.ESTrainer(algorithm_config: ESAlgorithmConfig, torch_policy: TorchPolicy, shared_noise: SharedNoiseTable, normalization_stats: Dict[str, Tuple[numpy.ndarray, numpy.ndarray]] | None)

Trainer class for OpenAI Evolution Strategies.

Parameters:
  • algorithm_config – Algorithm parameters.

  • torch_policy – Multi-step policy encapsulating the policy networks.

  • shared_noise – The noise table, with the same content for every worker and the master.

  • normalization_stats – Normalization statistics as calculated by the NormalizeObservationWrapper.
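
The shared_noise parameter above relies on the table holding identical content on every worker and on the master. A minimal sketch of why that matters, using only the standard library: ToyNoiseTable, its methods, and the seed values are hypothetical illustrations and are not the API of Maze's SharedNoiseTable. The assumption here is that each process builds the table from the same seed, so a worker only has to report an index into the table rather than the perturbation vector itself.

```python
import random
from typing import List


class ToyNoiseTable:
    """Hypothetical stand-in for a shared noise table (not Maze's SharedNoiseTable)."""

    def __init__(self, count: int, seed: int) -> None:
        rng = random.Random(seed)
        # The same seed produces the same table in every process.
        self.noise: List[float] = [rng.gauss(0.0, 1.0) for _ in range(count)]

    def get(self, index: int, size: int) -> List[float]:
        """Return `size` consecutive noise values starting at `index`."""
        return self.noise[index:index + size]

    def sample_index(self, rng: random.Random, size: int) -> int:
        """Pick a start index that leaves room for `size` values."""
        return rng.randint(0, len(self.noise) - size)


# Master and worker each build their own copy from the same seed.
master_table = ToyNoiseTable(count=10_000, seed=123)
worker_table = ToyNoiseTable(count=10_000, seed=123)

# The worker samples a perturbation and would send back only `idx`.
n_params = 5
worker_rng = random.Random(7)
idx = worker_table.sample_index(worker_rng, n_params)
worker_perturbation = worker_table.get(idx, n_params)

# The master reconstructs the identical perturbation from the index alone.
master_perturbation = master_table.get(idx, n_params)
```

In this sketch the only value that crosses the process boundary is an integer, which keeps communication cost independent of the number of policy parameters.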

load_state(file_path: str | BinaryIO) → None

(overrides Trainer)

implementation of Trainer

load_state_dict(state_dict: Dict) → None

Set the model and optimizer state.

Parameters:
  • state_dict – The state dict.

state_dict()

(overrides Trainer)

implementation of Trainer
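
The three state methods above fit together as a save and restore round trip. The following is a rough sketch of that pattern only, assuming a state dict that holds model and optimizer entries: ToyTrainer, its save_state helper, the dictionary keys, and the use of pickle are all hypothetical and do not reflect how ESTrainer serializes its state. It also shows one way a method can accept either a path or an open binary stream, as the load_state signature suggests.

```python
import io
import pickle
from typing import BinaryIO, Dict, List, Union


class ToyTrainer:
    """Hypothetical trainer used only to illustrate the state round trip."""

    def __init__(self) -> None:
        self.params: List[float] = [0.0, 0.0]
        self.optimizer_step: int = 0

    def state_dict(self) -> Dict:
        # Bundle model and optimizer state (keys are assumptions).
        return {"policy": list(self.params),
                "optimizer": {"step": self.optimizer_step}}

    def load_state_dict(self, state_dict: Dict) -> None:
        self.params = list(state_dict["policy"])
        self.optimizer_step = state_dict["optimizer"]["step"]

    def save_state(self, file_path: Union[str, BinaryIO]) -> None:
        # Hypothetical counterpart to load_state, added for the round trip.
        if isinstance(file_path, str):
            with open(file_path, "wb") as f:
                pickle.dump(self.state_dict(), f)
        else:
            pickle.dump(self.state_dict(), file_path)

    def load_state(self, file_path: Union[str, BinaryIO]) -> None:
        # Accept either a path on disk or an already opened binary stream.
        if isinstance(file_path, str):
            with open(file_path, "rb") as f:
                state = pickle.load(f)
        else:
            state = pickle.load(file_path)
        self.load_state_dict(state)


# Round trip through an in-memory stream instead of a file on disk.
trained = ToyTrainer()
trained.params = [1.5, -2.0]
trained.optimizer_step = 3

buffer = io.BytesIO()
trained.save_state(buffer)
buffer.seek(0)

restored = ToyTrainer()
restored.load_state(buffer)
```

Routing load_state through load_state_dict keeps a single place where state is applied, regardless of where the state came from.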

train(distributed_rollouts: ESDistributedRollouts, n_epochs: int | None = None, model_selection: ModelSelectionBase | None = None) → None

(overrides Trainer)

Run the ES training loop.

Parameters:
  • distributed_rollouts – The distribution interface for experience collection.

  • n_epochs – Number of epochs to train.

  • model_selection – Optional model selection class, receives model evaluation results.
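
For orientation, the general shape of an Evolution Strategies training loop can be sketched on a toy problem. This is not the ESTrainer implementation: es_train, the fitness function, and the hyperparameters population, sigma, and lr are hypothetical, the rollouts run in a single process instead of through a distribution interface, and the returns are standardized, which is one common choice among several. Each epoch perturbs the parameters with Gaussian noise, evaluates every perturbation, and moves the parameters along the return-weighted sum of the noise vectors.

```python
import random
from typing import Callable, List, Sequence


def es_train(fitness: Callable[[Sequence[float]], float],
             theta: Sequence[float],
             n_epochs: int,
             population: int = 20,
             sigma: float = 0.1,
             lr: float = 0.01,
             seed: int = 0) -> List[float]:
    """Toy ES loop (illustrative sketch, not the Maze implementation)."""
    rng = random.Random(seed)
    theta = list(theta)
    for _ in range(n_epochs):
        noises, returns = [], []
        for _ in range(population):
            # One "rollout": evaluate a perturbed copy of the parameters.
            eps = [rng.gauss(0.0, 1.0) for _ in theta]
            noises.append(eps)
            returns.append(fitness([t + sigma * e for t, e in zip(theta, eps)]))
        # Standardize returns so the update is invariant to reward scale.
        mean = sum(returns) / len(returns)
        std = (sum((r - mean) ** 2 for r in returns) / len(returns)) ** 0.5 or 1.0
        weights = [(r - mean) / std for r in returns]
        # Gradient estimate: return-weighted average of the noise vectors.
        for i in range(len(theta)):
            grad = sum(w * eps[i] for w, eps in zip(weights, noises))
            theta[i] += lr * grad / (population * sigma)
    return theta


TARGET = [1.0, -2.0]


def toy_fitness(params: Sequence[float]) -> float:
    # Higher is better; the maximum is at TARGET.
    return -sum((p - t) ** 2 for p, t in zip(params, TARGET))


initial = [0.0, 0.0]
final = es_train(toy_fitness, initial, n_epochs=200)
```

In a distributed setting the inner loop over the population is the part that would be spread across workers, with the master performing the weighted update.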