Abstract base class for vectorised environments.
An instance of this class encapsulates multiple environments under the hood and steps them synchronously.
Note that actions and observations are handled and returned in stacked form, i.e. not as a list, but as a single action/observation dict whose items have an additional leading dimension corresponding to the number of encapsulated environments (such a setting is more convenient when working with Torch policies). To convert these to/from a list, use the training helpers provided alongside this class.
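The stacked layout can be illustrated with plain NumPy: a list of per-env observation dicts is combined into a single dict whose values gain a leading axis of size `n_envs`, and the conversion is invertible. The helper names below are illustrative sketches, not part of this API:

```python
import numpy as np

def stack_obs(obs_list):
    """Combine a list of per-env observation dicts into one stacked dict.

    Each value gains a leading axis of size n_envs (illustrative helper,
    not part of this API).
    """
    return {k: np.stack([obs[k] for obs in obs_list]) for k in obs_list[0]}

def unstack_obs(stacked):
    """Inverse operation: split a stacked dict back into a list of dicts."""
    n_envs = next(iter(stacked.values())).shape[0]
    return [{k: v[i] for k, v in stacked.items()} for i in range(n_envs)]

# Two environments, each observing a 3-dimensional feature vector:
per_env = [{"features": np.zeros(3)}, {"features": np.ones(3)}]
stacked = stack_obs(per_env)
print(stacked["features"].shape)  # (2, 3) -- leading dim = n_envs
```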
Also note that in structured scenarios, only synchronous environments are supported – i.e., in each sub-step, the actor ID must be the same for all environments.
n_envs – The number of encapsulated environments.
Reset all the environments and return respective observations in env-aggregated form.
Returns: observations in env-aggregated form.
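A toy sketch of what a reset that aggregates over sub-envs might look like; the `_ToySubEnv` class and `reset_all` helper are assumptions for illustration only, not part of this API:

```python
import numpy as np

class _ToySubEnv:
    """Trivial sub-env whose initial observation is a 2D position."""
    def reset(self):
        return {"pos": np.zeros(2)}

def reset_all(envs):
    """Reset every sub-env and stack the initial observations into
    env-aggregated form (illustrative helper, not part of this API)."""
    obs_list = [env.reset() for env in envs]
    return {k: np.stack([o[k] for o in obs_list]) for k in obs_list[0]}

init_obs = reset_all([_ToySubEnv(), _ToySubEnv()])
print(init_obs["pos"].shape)  # (2, 2) -- leading dim = number of envs
```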
seed(seeds: List[Any]) → None
Sets the seed for this vectorised env’s random number generator(s) and its contained parallel envs.
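A minimal sketch of how such a `seed` method could distribute one seed per contained env. The `_ToySeedEnv` class and its `_rngs` attribute are assumptions for illustration; the real class's internals may differ:

```python
import numpy as np

class _ToySeedEnv:
    """Illustrative stand-in for a vectorised env (not the real class)."""

    def __init__(self, n_envs):
        self.n_envs = n_envs
        self._rngs = [np.random.default_rng() for _ in range(n_envs)]

    def seed(self, seeds):
        """Re-seed each contained env's RNG from the given list of seeds."""
        assert len(seeds) == self.n_envs, "one seed per encapsulated env"
        self._rngs = [np.random.default_rng(s) for s in seeds]

venv = _ToySeedEnv(n_envs=2)
venv.seed([0, 1])
```

Passing one seed per sub-env keeps the parallel rollouts reproducible while still decorrelated from each other.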
step(actions: Dict[str, Union[int, numpy.ndarray]]) → Tuple[Dict[str, numpy.ndarray], numpy.ndarray, numpy.ndarray, Iterable[Dict[Any, Any]]]
Step the environments with the given actions.
actions – the stacked actions for the respective envs, i.e. a single dict whose items carry a leading dimension over the envs.
Returns: observations, rewards, dones, and info dicts, all in env-aggregated form.
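To make the env-aggregated return shapes concrete, here is a toy vectorised env over two trivial counter sub-envs. All names (`_CounterEnv`, `_ToyVectorEnv`, the `"move"` action key) are illustrative assumptions; the documented class is abstract and its concrete subclasses define the actual spaces:

```python
import numpy as np

class _CounterEnv:
    """Trivial sub-env: the observation counts the steps taken so far."""
    def __init__(self):
        self.t = 0
    def step(self, action):
        self.t += 1
        obs = {"count": np.array([self.t])}
        reward, done, info = float(action), self.t >= 3, {}
        return obs, reward, done, info

class _ToyVectorEnv:
    """Steps all sub-envs synchronously and stacks their outputs."""
    def __init__(self, n_envs):
        self.envs = [_CounterEnv() for _ in range(n_envs)]
    def step(self, actions):
        # Split the stacked action dict into one action dict per env:
        per_env = [{k: v[i] for k, v in actions.items()}
                   for i in range(len(self.envs))]
        results = [env.step(a["move"]) for env, a in zip(self.envs, per_env)]
        obs, rewards, dones, infos = zip(*results)
        # Re-aggregate: stacked obs dict, reward/done arrays, list of infos.
        stacked_obs = {k: np.stack([o[k] for o in obs]) for k in obs[0]}
        return stacked_obs, np.array(rewards), np.array(dones), list(infos)

vec = _ToyVectorEnv(n_envs=2)
obs, rewards, dones, infos = vec.step({"move": np.array([1, 2])})
print(obs["count"].shape, rewards.shape)  # (2, 1) (2,)
```

Note that the actions go in stacked (not listed) and every return value comes back with a leading dimension of size `n_envs`, matching the signature above.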