VectorEnv

class maze.train.parallelization.vector_env.vector_env.VectorEnv(n_envs: int)

Abstract base class for vectorised environments.

An instance of this class encapsulates multiple environments under the hood and steps them synchronously.

Note that actions and observations are handled and returned in stacked form: not as a list, but as a single action/observation dict whose items have an additional dimension corresponding to the number of encapsulated environments. This layout is more convenient when working with Torch policies. To convert to or from a list, use training helpers such as maze.train.utils.train_utils.stack_numpy_dict_list() and maze.train.utils.train_utils.unstack_numpy_list_dict().
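To make the stacked form concrete, the following is a minimal sketch of the conversion between a list of per-env dicts and a single stacked dict. The helpers stack_dicts() and unstack_dict() are hypothetical stand-ins written for this illustration; they are not the Maze helpers named above, whose exact signatures and behaviour may differ. The observation keys are made up as well.

```python
from typing import Dict, List

import numpy as np


def stack_dicts(dict_list: List[Dict[str, np.ndarray]]) -> Dict[str, np.ndarray]:
    """Hypothetical helper: stack per-env dicts into one dict with a leading env dimension."""
    keys = dict_list[0].keys()
    return {key: np.stack([d[key] for d in dict_list], axis=0) for key in keys}


def unstack_dict(stacked: Dict[str, np.ndarray]) -> List[Dict[str, np.ndarray]]:
    """Hypothetical helper: split a stacked dict back into one dict per env."""
    n_envs = len(next(iter(stacked.values())))
    return [{key: value[i] for key, value in stacked.items()} for i in range(n_envs)]


# Two environments, each returning an observation dict with the same keys.
per_env_obs = [
    {"position": np.array([0.0, 1.0]), "inventory": np.array([3])},
    {"position": np.array([2.0, 5.0]), "inventory": np.array([7])},
]

stacked_obs = stack_dicts(per_env_obs)
print(stacked_obs["position"].shape)   # (2, 2): first dimension indexes the envs
print(stacked_obs["inventory"].shape)  # (2, 1)

# Round trip back to a list with one dict per env.
restored = unstack_dict(stacked_obs)
print(len(restored))  # 2
```

With n_envs environments, every item of the stacked dict has n_envs as its first dimension, so a policy can process all environments in a single batched forward pass.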

Also note that in structured scenarios, only synchronous environments are supported – i.e., in each sub-step, the actor ID must be the same for all environments.
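The synchronicity requirement can be pictured as a simple check over the actor IDs reported by the individual environments. The sketch below is only an illustration: common_actor_id() is a hypothetical function, and representing an actor ID as a (step_key, agent_id) tuple is an assumption made for the example.

```python
from typing import Hashable, Sequence


def common_actor_id(actor_ids: Sequence[Hashable]) -> Hashable:
    """Hypothetical check: return the shared actor ID, or fail if the envs disagree."""
    if not actor_ids:
        raise ValueError("At least one environment is required.")
    first = actor_ids[0]
    if any(actor_id != first for actor_id in actor_ids[1:]):
        raise ValueError(f"Environments are out of sync, actor IDs: {list(actor_ids)}")
    return first


# All envs are in the same sub-step: this is the supported case.
shared = common_actor_id([("move", 0), ("move", 0), ("move", 0)])
print(shared)  # ('move', 0)

# One env is in a different sub-step: this is the unsupported case.
try:
    common_actor_id([("move", 0), ("attack", 0)])
    out_of_sync_detected = False
except ValueError:
    out_of_sync_detected = True
print(out_of_sync_detected)  # True
```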

Parameters

n_envs – The number of encapsulated environments.

abstract reset()

Reset all the environments and return the respective observations in env-aggregated form.

Returns

observations in env-aggregated form.

abstract seed(seeds: List[Any]) → None

Sets the seed for this vectorised env’s random number generator(s) and its contained parallel envs.

abstract step(actions: Dict[str, Union[int, numpy.ndarray]]) → Tuple[Dict[str, numpy.ndarray], numpy.ndarray, numpy.ndarray, Iterable[Dict[Any, Any]]]

Step the environments with the given actions.

Parameters

actions – the actions for the respective envs, in stacked (env-aggregated) form, i.e. a single dict rather than a list.

Returns

observations, rewards, dones, and information dicts, all in env-aggregated form.
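As a rough illustration of the reset()/step() contract, here is a minimal, self-contained sketch of a vectorised env that steps its sub-environments one after another. Both CounterEnv and ToySequentialVectorEnv are hypothetical classes invented for this example. The sketch deliberately does not subclass VectorEnv, omits seed(), and does not handle resetting environments that are done, so it should not be read as how any concrete Maze implementation works.

```python
from typing import Any, Dict, List, Tuple

import numpy as np


class CounterEnv:
    """Hypothetical toy env: the observation is a counter; done once it reaches `limit`."""

    def __init__(self, limit: int):
        self.limit = limit
        self.count = 0

    def reset(self) -> Dict[str, np.ndarray]:
        self.count = 0
        return {"count": np.array([self.count], dtype=np.float32)}

    def step(self, action: Dict[str, Any]) -> Tuple[Dict[str, np.ndarray], float, bool, Dict[str, Any]]:
        self.count += int(action["increment"])
        obs = {"count": np.array([self.count], dtype=np.float32)}
        reward = float(action["increment"])
        done = self.count >= self.limit
        return obs, reward, done, {"count": self.count}


class ToySequentialVectorEnv:
    """Hypothetical sketch: steps all sub-envs in turn and returns stacked results."""

    def __init__(self, envs: List[CounterEnv]):
        self.envs = envs
        self.n_envs = len(envs)

    @staticmethod
    def _stack(dicts) -> Dict[str, np.ndarray]:
        # Add a leading env dimension to every item of the dict.
        return {key: np.stack([d[key] for d in dicts]) for key in dicts[0]}

    def reset(self) -> Dict[str, np.ndarray]:
        return self._stack([env.reset() for env in self.envs])

    def step(self, actions: Dict[str, np.ndarray]):
        results = []
        for i, env in enumerate(self.envs):
            # Slice the i-th entry out of every stacked action item.
            action_i = {key: value[i] for key, value in actions.items()}
            results.append(env.step(action_i))
        obs, rewards, dones, infos = zip(*results)
        return self._stack(obs), np.array(rewards), np.array(dones), list(infos)


vec_env = ToySequentialVectorEnv([CounterEnv(limit=2), CounterEnv(limit=5)])
first_obs = vec_env.reset()
obs, rewards, dones, infos = vec_env.step({"increment": np.array([2, 1])})
print(obs["count"].shape)  # (2, 1)
print(dones)               # [ True False]
```

Note that the action passed to step() is one dict whose items carry one entry per environment, mirroring the stacked observations that come back.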