SubprocVectorEnv

class maze.train.parallelization.vector_env.subproc_vector_env.SubprocVectorEnv(env_factories: List[Callable[[], maze.core.env.maze_env.MazeEnv]], logging_prefix: Optional[str] = None, start_method: str = None)

Creates a multiprocess wrapper for multiple environments, running each environment in its own process. This can give a significant speedup when the environment is computationally expensive. For performance reasons, if your environment is not IO-bound, the number of environments should not exceed the number of logical cores on your CPU.
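The core-count guideline above can be sketched as a small helper. This is only an illustration and not part of Maze: the name capped_env_count is hypothetical, and it relies on os.cpu_count(), which reports logical cores and may return None on some platforms.

```python
import os


def capped_env_count(requested: int) -> int:
    """Cap the requested number of environments at the logical core count.

    Hypothetical helper for illustration; it is not part of the Maze API.
    """
    # os.cpu_count() returns the number of logical CPUs, or None if unknown.
    logical_cores = os.cpu_count() or 1
    # Always run at least one environment, never more than the core count.
    return max(1, min(requested, logical_cores))
```

The result could then be used as the length of the env_factories list. For IO-bound environments the cap may be unnecessary, as noted above.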

Warning

Only ‘forkserver’ and ‘spawn’ start methods are thread-safe, which matters when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). However, compared to ‘fork’, they incur a small start-up cost and place restrictions on global variables. With these methods, users must wrap the launching code in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.
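The need for the __main__ guard can be illustrated with the standard multiprocessing module alone. The sketch below does not use Maze; square_worker and run_in_child are hypothetical names, and the script is assumed to be run from a file. With ‘spawn’, the child process re-imports the main module, so any process-launching code at module level would run again in the child unless it sits behind the guard.

```python
import multiprocessing as mp


def square_worker(conn, value):
    """Runs in the child process: send back the square of the value."""
    conn.send(value * value)
    conn.close()


def run_in_child(value, start_method="spawn"):
    """Compute value * value in a subprocess started with the given method."""
    ctx = mp.get_context(start_method)
    parent_conn, child_conn = ctx.Pipe()
    proc = ctx.Process(target=square_worker, args=(child_conn, value))
    proc.start()
    child_conn.close()  # the parent only needs its own end of the pipe
    result = parent_conn.recv()
    proc.join()
    return result


if __name__ == "__main__":
    # Without this guard, the re-import performed by 'spawn' would try to
    # start another process while the child is still bootstrapping.
    print(run_in_child(7))
```

The worker function is defined at module level so that the child can locate it after re-importing the module.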

Parameters

env_factories – A list of functions that will create the environments (each callable returns a MultiStepEnvironment instance when called).
start_method – Method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
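The documented default for start_method can be sketched with the standard library as follows. This is an assumption about the selection logic based on the description above, not Maze's actual implementation; pick_start_method is a hypothetical name.

```python
import multiprocessing
from typing import Optional


def pick_start_method(requested: Optional[str] = None) -> str:
    """Return a valid start method, preferring 'forkserver' over 'spawn'.

    Hypothetical helper mirroring the documented default; not Maze code.
    """
    available = multiprocessing.get_all_start_methods()
    if requested is not None:
        # An explicit choice must be one of the platform's start methods.
        if requested not in available:
            raise ValueError(
                f"Unknown start method {requested!r}; available: {available}"
            )
        return requested
    # 'forkserver' is not offered on every platform (for example Windows).
    return "forkserver" if "forkserver" in available else "spawn"
```

A context for the chosen method can then be obtained with multiprocessing.get_context(pick_start_method()).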

close() → None: VectorEnv implementation

get_actor_rewards() → Optional[numpy.ndarray]

(overrides StructuredVectorEnv)

Stack actor rewards from encapsulated environments.

reset() → Dict[str, numpy.ndarray]: VectorEnv implementation

seed(seeds: List[Any]) → None

(overrides VectorEnv)

VectorEnv implementation

step(actions: Dict[str, Union[int, numpy.ndarray]]) → Tuple[Dict[str, numpy.ndarray], numpy.ndarray, numpy.ndarray, Iterable[Dict[Any, Any]]]

Step the environments with the given actions.

Parameters: actions – the actions for the respective envs, passed as a dict (see the signature).
Returns: observations, rewards, dones, and info dicts, all in env-aggregated form.
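One plausible reading of "env-aggregated form" is that per-environment results are stacked along a new leading axis. The sketch below illustrates that reading with NumPy; it is an assumption made for illustration, not Maze's implementation, and aggregate_step_results is a hypothetical name.

```python
from typing import Any, Dict, List, Tuple

import numpy as np

StepResult = Tuple[Dict[str, np.ndarray], float, bool, Dict[Any, Any]]


def aggregate_step_results(results: List[StepResult]):
    """Combine per-env (obs, reward, done, info) tuples into batched form.

    Hypothetical helper; assumes every env returns the same observation keys.
    """
    obs_list, rewards, dones, infos = zip(*results)
    # Stack each observation key across envs: shape (n_envs, *obs_shape).
    observations = {
        key: np.stack([obs[key] for obs in obs_list]) for key in obs_list[0]
    }
    return (
        observations,
        np.asarray(rewards, dtype=np.float32),
        np.asarray(dones, dtype=bool),
        list(infos),
    )
```

With two environments that each return an observation of shape (3,), the stacked observation has shape (2, 3), and the rewards and dones arrays each have shape (2,).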