ESDummyDistributedRollouts

class maze.train.trainers.es.distributed.es_dummy_distributed_rollouts.ESDummyDistributedRollouts(env: maze.core.env.structured_env.StructuredEnv, n_eval_rollouts: int, shared_noise: maze.train.trainers.es.es_shared_noise_table.SharedNoiseTable, agent_instance_seed: int)

Implements ES rollout distribution by running all rollouts synchronously in the calling process, without spawning separate workers.

generate_rollouts(policy: maze.core.agent.torch_policy.TorchPolicy, max_steps: Optional[int], noise_stddev: float, normalization_stats: Dict[str, Dict[str, Union[numpy.ndarray, float, int, Iterable[Union[float, int]]]]]) → Generator[maze.train.trainers.es.distributed.es_distributed_rollouts.ESRolloutResult, None, None]

(overrides ESDistributedRollouts)

Executes a fixed number of evaluation rollouts first, then continues to produce training samples.
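
The two-phase behaviour described above can be sketched with a plain Python generator. This is a minimal illustration only, not the Maze implementation: `RolloutResult`, `run_episode`, and the simplified `generate_rollouts` signature are hypothetical stand-ins, and the use of an unperturbed policy for evaluation is an assumption. Because the generator runs in the calling process, each rollout is computed only when the consumer asks for the next result.

```python
import random
from dataclasses import dataclass
from itertools import islice
from typing import Generator


@dataclass
class RolloutResult:
    """Hypothetical stand-in for ESRolloutResult; the real fields are not shown here."""
    is_eval: bool
    episode_return: float


def run_episode(rng: random.Random, perturbed: bool) -> float:
    """Placeholder for a real environment rollout; returns a made-up episode return."""
    return rng.random() + (0.1 if perturbed else 0.0)


def generate_rollouts(n_eval_rollouts: int, seed: int) -> Generator[RolloutResult, None, None]:
    """Yield a fixed number of eval results, then training results indefinitely."""
    rng = random.Random(seed)

    # Phase 1: a fixed number of evaluation rollouts
    # (assumed here to use the unperturbed policy).
    for _ in range(n_eval_rollouts):
        yield RolloutResult(is_eval=True, episode_return=run_episode(rng, perturbed=False))

    # Phase 2: training rollouts, produced for as long as the caller keeps iterating.
    while True:
        yield RolloutResult(is_eval=False, episode_return=run_episode(rng, perturbed=True))


# The consumer decides how many results to pull; here, the first five.
results = list(islice(generate_rollouts(n_eval_rollouts=2, seed=0), 5))
print([r.is_eval for r in results])
```

Since the training phase never terminates on its own, the caller is responsible for stopping iteration, for example with `itertools.islice` as shown or by breaking out of a loop once enough samples have been collected.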