ESDistributedRollouts
- class maze.train.trainers.es.distributed.es_distributed_rollouts.ESDistributedRollouts
Abstract base class of ES rollout distribution.
- abstract generate_rollouts(policy: Union[maze.core.agent.policy.Policy, maze.core.agent.torch_model.TorchModel], max_steps: Optional[int], noise_stddev: float, normalization_stats: Dict[str, Tuple[numpy.ndarray, numpy.ndarray]]) → Generator[maze.train.trainers.es.distributed.es_distributed_rollouts.ESRolloutResult, None, None]
Declare a new rollout task and start producing results that can be obtained from the returned generator.
Note that different distribution strategies have different ways of balancing evaluation and training rollouts.
- Parameters
policy – Multi-step policy encapsulating the policy networks.
max_steps – Optionally limit the rollout to a number of environment steps (horizon).
noise_stddev – The standard deviation of the applied parameter noise.
normalization_stats – Normalization statistics as calculated by the NormalizeObservationWrapper.
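The generator-based contract described above can be illustrated with a small stand-alone sketch. It does not import or reproduce the Maze API: `ToyRolloutResult`, `ToyRollouts`, `SequentialToyRollouts`, `collect` and the `eval_every` argument are hypothetical names invented for illustration, the `policy` and `normalization_stats` arguments are omitted, and the fields of the real `ESRolloutResult` are not assumed. It only shows how an abstract method can return a generator that a consumer drains lazily, and one possible way a strategy might interleave evaluation and training rollouts.

```python
import random
from abc import ABC, abstractmethod
from dataclasses import dataclass
from typing import Generator, List, Optional


@dataclass
class ToyRolloutResult:
    """Hypothetical stand-in for ESRolloutResult (the real fields are not assumed)."""
    is_eval: bool
    noise_seed: int
    episode_return: float
    episode_length: int


class ToyRollouts(ABC):
    """Simplified interface mirroring only the shape of generate_rollouts."""

    @abstractmethod
    def generate_rollouts(
        self, max_steps: Optional[int], noise_stddev: float
    ) -> Generator[ToyRolloutResult, None, None]:
        """Start a rollout task and lazily yield results."""


class SequentialToyRollouts(ToyRollouts):
    """Produces fake rollouts one by one in the current process (no real distribution)."""

    def __init__(self, eval_every: int = 4):
        # Assumption of this sketch: every `eval_every`-th rollout is an evaluation rollout.
        self.eval_every = eval_every

    def generate_rollouts(self, max_steps, noise_stddev):
        seed = 0
        while True:  # endless stream; the consumer decides when to stop
            is_eval = seed % self.eval_every == 0
            rng = random.Random(seed)
            # In this toy, evaluation rollouts are unperturbed and training rollouts are noisy.
            perturbation = 0.0 if is_eval else rng.gauss(0.0, noise_stddev)
            length = max_steps if max_steps is not None else 10
            yield ToyRolloutResult(is_eval, seed, 1.0 + perturbation, length)
            seed += 1


def collect(rollouts: ToyRollouts, n: int, max_steps: Optional[int],
            noise_stddev: float) -> List[ToyRolloutResult]:
    """Drain the generator until n results have been gathered."""
    results = []
    for result in rollouts.generate_rollouts(max_steps, noise_stddev):
        results.append(result)
        if len(results) >= n:
            break
    return results


results = collect(SequentialToyRollouts(eval_every=4), n=8, max_steps=5, noise_stddev=0.1)
print(sum(r.is_eval for r in results), "evaluation rollouts out of", len(results))
```

Because the consumer breaks out of the loop once it has enough results, the producer never needs to know in advance how many rollouts one iteration requires. That is one reason a generator is a natural return type for this kind of interface.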
-
abstract