SubprocVectorEnv(env_factories: List[Callable[[], maze.core.env.maze_env.MazeEnv]], logging_prefix: Optional[str] = None, start_method: str = None)
Creates a multiprocess wrapper for multiple environments, distributing each environment to its own process. This allows a significant speed-up when the environment is computationally complex. For performance reasons, if your environment is not IO-bound, the number of environments should not exceed the number of logical cores on your CPU.
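As a rough illustration of the sizing advice above, the sketch below caps a requested environment count at the number of logical cores reported by os.cpu_count(). The helper name capped_num_envs is hypothetical and not part of the library.

```python
import os


def capped_num_envs(requested: int) -> int:
    """Cap the requested number of environments at the logical core count.

    Hypothetical helper for illustration only; not part of the library.
    """
    # os.cpu_count() can return None when the count cannot be determined.
    logical_cores = os.cpu_count() or 1
    # Always run at least one environment.
    return max(1, min(requested, logical_cores))
```

The resulting value could then be used as the length of the env_factories list. For IO-bound environments, this cap may be unnecessarily conservative.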
Only the ‘forkserver’ and ‘spawn’ start methods are thread-safe, which is important when TensorFlow sessions or other non-thread-safe libraries are used in the parent process (see issue #217). However, compared to ‘fork’, they incur a small start-up cost and place restrictions on global variables. With those methods, users must wrap the code in an if __name__ == "__main__": block. For more information, see the multiprocessing documentation.
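The guard requirement can be sketched with the standard library alone. The example below uses plain multiprocessing processes as a stand-in for the vectorized environment; the describe, worker, and main functions are hypothetical names chosen for illustration. With ‘spawn’ (and ‘forkserver’), child processes re-import the main module, so any process-starting code must sit behind the guard to avoid being executed again in each child.

```python
import multiprocessing as mp


def describe(env_id: int) -> str:
    """Build the message a worker prints (kept separate so it is easy to check)."""
    return f"worker {env_id} running in a separate process"


def worker(env_id: int) -> None:
    # Stand-in for stepping one environment inside a child process.
    print(describe(env_id))


def main() -> None:
    # Explicitly request the 'spawn' start method for this context only.
    ctx = mp.get_context("spawn")
    procs = [ctx.Process(target=worker, args=(i,)) for i in range(2)]
    for proc in procs:
        proc.start()
    for proc in procs:
        proc.join()


# Without this guard, 'spawn' children would re-run the process-starting
# code when they re-import this module.
if __name__ == "__main__":
    main()
```

In a real script, the construction of the SubprocVectorEnv would take the place of the body of main().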
env_factories – A list of functions that will create the environments (each callable returns a MultiStepEnvironment instance when called).
start_method – Method used to start the subprocesses. Must be one of the methods returned by multiprocessing.get_all_start_methods(). Defaults to ‘forkserver’ on available platforms, and ‘spawn’ otherwise.
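The default described for start_method can be approximated as follows. This is a sketch of the documented behaviour, not the library's actual implementation, and the function names default_start_method and resolve_start_method are hypothetical.

```python
import multiprocessing
from typing import Optional


def default_start_method() -> str:
    """Prefer 'forkserver' where the platform offers it, else fall back to 'spawn'."""
    available = multiprocessing.get_all_start_methods()
    return "forkserver" if "forkserver" in available else "spawn"


def resolve_start_method(start_method: Optional[str] = None) -> str:
    """Return a valid start method, applying the default when None is given."""
    if start_method is None:
        return default_start_method()
    if start_method not in multiprocessing.get_all_start_methods():
        raise ValueError(f"Unsupported start method: {start_method!r}")
    return start_method
```

Calling resolve_start_method() without arguments yields ‘forkserver’ on platforms that support it and ‘spawn’ elsewhere.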
get_actor_rewards() → Optional[numpy.ndarray]
Stack actor rewards from encapsulated environments.
step(actions: Dict[str, Union[int, numpy.ndarray]]) → Tuple[Dict[str, numpy.ndarray], numpy.ndarray, numpy.ndarray, Iterable[Dict[Any, Any]]]
Step the environments with the given actions.
actions – the list of actions for the respective envs.
Returns the observations, rewards, dones, and information dicts, all in env-aggregated form.
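One plausible reading of "env-aggregated form", matching the return type in the signature, is that per-environment results are stacked along a leading environment axis. The sketch below illustrates that reading under this assumption; aggregate_step_results and StepResult are hypothetical names and do not reflect the library's internals.

```python
from typing import Any, Dict, List, Tuple

import numpy as np

# One environment's step result: (observation dict, reward, done, info dict).
StepResult = Tuple[Dict[str, np.ndarray], float, bool, Dict[Any, Any]]


def aggregate_step_results(
    results: List[StepResult],
) -> Tuple[Dict[str, np.ndarray], np.ndarray, np.ndarray, List[Dict[Any, Any]]]:
    """Stack per-environment step results along a new leading axis."""
    # Transpose [(obs, reward, done, info), ...] into four per-field tuples.
    obs_list, rewards, dones, infos = zip(*results)
    # Assumes every environment returns the same observation keys and shapes.
    stacked_obs = {
        key: np.stack([obs[key] for obs in obs_list]) for key in obs_list[0]
    }
    return (
        stacked_obs,
        np.asarray(rewards, dtype=np.float32),
        np.asarray(dones, dtype=bool),
        list(infos),
    )
```

With two environments whose observation "x" has shape (3,), the aggregated observation "x" has shape (2, 3), and the rewards and dones arrays each have shape (2,).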