SequentialDistributedActors

class maze.train.parallelization.distributed_actors.sequential_distributed_actors.SequentialDistributedActors(env_factory: Callable[[], StructuredEnv | StructuredEnvSpacesMixin | LogStatsEnv], policy: TorchPolicy, n_rollout_steps: int, n_actors: int, batch_size: int, actor_env_seeds: List[int])
Dummy implementation of distributed actors that creates the actors as a list. Once the outputs are to be collected, it simply rolls them out in a loop until it has enough to be returned.

Parameters:

actor_env_seeds – A list of seeds, one for each actor’s env.

broadcast_updated_policy(state_dict: Dict) → None

(overrides DistributedActors)

Store the newest policy in the shared network object.

collect_outputs(learner_device: str) → Tuple[StructuredSpacesRecord, float, float, float]

(overrides DistributedActors)

Run the rollouts and collect the outputs.
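The sequential collection pattern described for this class can be illustrated with a small standalone sketch. It does not use Maze itself: `ToyActor` and `ToySequentialActors` are hypothetical stand-ins invented for illustration, a rollout is reduced to a list of random floats, and the three float values in the real return type are not modelled. It only shows the idea of keeping the actors in a list and rolling them out one after another until `batch_size` outputs are available.

```python
import random
from typing import List


class ToyActor:
    """Hypothetical stand-in for a rollout actor (not part of Maze)."""

    def __init__(self, env_seed: int, n_rollout_steps: int) -> None:
        # Each actor gets its own seeded random generator, mimicking a seeded env.
        self.rng = random.Random(env_seed)
        self.n_rollout_steps = n_rollout_steps

    def rollout(self) -> List[float]:
        # A "rollout" here is just n_rollout_steps random numbers.
        return [self.rng.random() for _ in range(self.n_rollout_steps)]


class ToySequentialActors:
    """Hypothetical sketch: actors live in a list and are rolled out in a loop."""

    def __init__(self, n_rollout_steps: int, n_actors: int,
                 batch_size: int, actor_env_seeds: List[int]) -> None:
        if len(actor_env_seeds) != n_actors:
            raise ValueError("expected one env seed per actor")
        self.batch_size = batch_size
        self.actors = [ToyActor(seed, n_rollout_steps) for seed in actor_env_seeds]
        self._next_actor = 0  # index of the actor to roll out next

    def collect_outputs(self) -> List[List[float]]:
        outputs: List[List[float]] = []
        # Roll the actors out one after another until the batch is full.
        while len(outputs) < self.batch_size:
            outputs.append(self.actors[self._next_actor].rollout())
            self._next_actor = (self._next_actor + 1) % len(self.actors)
        return outputs


actors = ToySequentialActors(n_rollout_steps=5, n_actors=2,
                             batch_size=4, actor_env_seeds=[0, 1])
batch = actors.collect_outputs()
print(len(batch), len(batch[0]))  # prints "4 5"
```

Because everything runs in a single process, a layout like this is convenient for debugging and for reproducible runs with fixed seeds, at the cost of any parallel speed-up.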

start() → None

(overrides DistributedActors)

Nothing to do in the dummy implementation.

stop() → None

(overrides DistributedActors)

Nothing to do in the dummy implementation.