ESDistributedRollouts
- class maze.train.trainers.es.distributed.es_distributed_rollouts.ESDistributedRollouts
Abstract base class for ES rollout distribution strategies.
- abstract generate_rollouts(policy: Policy | TorchModel, max_steps: int | None, noise_stddev: float, normalization_stats: Dict[str, Tuple[numpy.ndarray, numpy.ndarray]]) → Generator[ESRolloutResult, None, None]
Declare a new rollout task and start producing results, which are obtained by iterating over the returned generator.
Note that each distribution strategy balances evaluation and training rollouts in its own way.
- Parameters:
policy – Multi-step policy encapsulating the policy networks.
max_steps – Optional limit on the number of environment steps per rollout (the horizon).
noise_stddev – The standard deviation of the applied parameter noise.
normalization_stats – Normalization statistics as calculated by the NormalizeObservationWrapper.
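The generator contract described above can be illustrated with a minimal, self-contained sketch. It does not import Maze and is not the library's implementation: `RolloutResult`, `RolloutDistribution`, and `SequentialRollouts` are hypothetical stand-ins, the policy is reduced to a plain NumPy parameter vector, the environment is a toy one, and treating each `normalization_stats` entry as a `(mean, std)` pair is an assumption made for this example.

```python
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Dict, Generator, Optional, Tuple

import numpy as np


@dataclass
class RolloutResult:
    """Hypothetical stand-in for ESRolloutResult."""
    noise_seed: int
    total_reward: float
    steps: int


class RolloutDistribution(ABC):
    """Hypothetical stand-in mirroring the abstract interface documented above."""

    @abstractmethod
    def generate_rollouts(
        self,
        policy: np.ndarray,
        max_steps: Optional[int],
        noise_stddev: float,
        normalization_stats: Dict[str, Tuple[np.ndarray, np.ndarray]],
    ) -> Generator[RolloutResult, None, None]:
        """Declare a rollout task and lazily yield results."""


class SequentialRollouts(RolloutDistribution):
    """Toy strategy: runs a fixed number of rollouts one after another, in-process."""

    def __init__(self, n_rollouts: int, default_horizon: int = 10):
        self.n_rollouts = n_rollouts
        self.default_horizon = default_horizon

    def generate_rollouts(self, policy, max_steps, noise_stddev, normalization_stats):
        # Assumption: each stats entry is a (mean, std) pair for one observation key.
        mean, std = normalization_stats["observation"]
        steps = max_steps if max_steps is not None else self.default_horizon

        for seed in range(self.n_rollouts):
            # Perturb the parameters with Gaussian noise scaled by noise_stddev.
            rng = np.random.default_rng(seed)
            perturbed = policy + noise_stddev * rng.standard_normal(policy.shape)

            # Toy environment: a constant observation, normalized before use.
            observation = (np.ones_like(policy) - mean) / std
            action = float(perturbed @ observation)

            # Toy reward: penalize large actions at every step of the horizon.
            total_reward = -(action ** 2) * steps
            yield RolloutResult(noise_seed=seed, total_reward=total_reward, steps=steps)


if __name__ == "__main__":
    stats = {"observation": (np.zeros(2), np.ones(2))}
    rollouts = SequentialRollouts(n_rollouts=3)
    # Results are produced only as the generator is consumed.
    for result in rollouts.generate_rollouts(np.array([0.5, -0.5]), 5, 0.1, stats):
        print(result)
```

Because results are yielded one at a time, a caller can stop iterating as soon as it has collected enough rollouts. A real distribution strategy would typically dispatch the work to parallel workers instead of looping in-process.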