class maze.train.trainers.imitation.bc_runners.BCRunner(state_dict_dump_file: str, dump_interval: Optional[int], spaces_config_dump_file: str, normalization_samples: int, dataset: omegaconf.DictConfig, eval_concurrency: int)

Dev runner for imitation learning. Loads the given trajectory data and trains a policy on top of it using supervised learning.
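As a rough illustration of training a policy on recorded trajectories with supervised learning, the sketch below fits a tabular majority-vote policy to demonstrated (observation, action) pairs. It is a toy stand-in, not the training procedure BCRunner uses: the trajectory format and the name `fit_majority_policy` are assumptions made for this example only.

```python
from collections import Counter, defaultdict
from typing import Dict, Hashable, List, Tuple

# Hypothetical trajectory format: a list of (observation, action) pairs.
Trajectory = List[Tuple[Hashable, Hashable]]


def fit_majority_policy(trajectories: List[Trajectory]) -> Dict[Hashable, Hashable]:
    """Fit a tabular policy that imitates the most frequently demonstrated action."""
    counts = defaultdict(Counter)
    for trajectory in trajectories:
        for observation, action in trajectory:
            counts[observation][action] += 1
    # For each observation, pick the action the demonstrator chose most often.
    return {obs: action_counts.most_common(1)[0][0]
            for obs, action_counts in counts.items()}


# Two short, made-up demonstration trajectories.
demonstrations = [
    [("low", "up"), ("mid", "up"), ("high", "down")],
    [("low", "up"), ("mid", "down"), ("mid", "up")],
]
policy = fit_majority_policy(demonstrations)
```

A real behavioral cloning setup would typically replace the lookup table with a parametric model trained by gradient descent on the same (observation, action) pairs.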

abstract classmethod create_distributed_eval_env(env_factory: Callable[[], Union[maze.core.env.structured_env.StructuredEnv, maze.core.env.structured_env_spaces_mixin.StructuredEnvSpacesMixin]], eval_concurrency: int, logging_prefix: str) → maze.train.parallelization.vector_env.vector_env.VectorEnv

Sets up the distributed evaluation env. Each concrete runner provides its own implementation.
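The abstract classmethod pattern used here can be sketched with plain Python. The classes `ToyEnv`, `ToyRunner`, and `ToySequentialRunner` below are hypothetical stand-ins invented for this example. They are not Maze classes, and the returned list only loosely plays the role of a VectorEnv.

```python
from abc import ABC, abstractmethod
from typing import Callable, List


class ToyEnv:
    """Hypothetical stand-in for an environment."""


class ToyRunner(ABC):
    """Hypothetical base runner that defers eval env construction to subclasses."""

    @classmethod
    @abstractmethod
    def create_distributed_eval_env(cls, env_factory: Callable[[], ToyEnv],
                                    eval_concurrency: int,
                                    logging_prefix: str) -> List[ToyEnv]:
        """Build the evaluation environments."""


class ToySequentialRunner(ToyRunner):
    """Hypothetical concrete runner that keeps all eval envs in the current process."""

    @classmethod
    def create_distributed_eval_env(cls, env_factory, eval_concurrency, logging_prefix):
        # Instantiate one env per concurrent evaluation slot.
        # logging_prefix is accepted for signature parity but unused in this sketch.
        return [env_factory() for _ in range(eval_concurrency)]


eval_envs = ToySequentialRunner.create_distributed_eval_env(
    ToyEnv, eval_concurrency=3, logging_prefix="eval")
```

Because the method is abstract, instantiating `ToyRunner` directly raises a `TypeError`; only subclasses that override the classmethod can be instantiated.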

dataset: omegaconf.DictConfig

Specifies the Dataset class used to load the trajectory data for training.

eval_concurrency: int

Number of concurrent evaluation envs.

run(n_epochs: Optional[int] = None, evaluator: Optional[maze.train.trainers.common.evaluators.evaluator.Evaluator] = None, eval_every_k_iterations: Optional[int] = None) → None

(overrides TrainingRunner)

Run the training master node. See TrainingRunner.run().

:param n_epochs: How many epochs to train for.
:param evaluator: Evaluator to use for evaluation rollouts.
:param eval_every_k_iterations: Number of iterations after which to run evaluation (in addition to the evaluations that run automatically at the end of each epoch). If set to None, evaluations run at epoch end only.
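The interaction between epoch-end evaluations and `eval_every_k_iterations` can be illustrated with a small scheduling sketch. The function `evaluation_points` and its `iterations_per_epoch` argument are hypothetical. The sketch also assumes that the iteration counter restarts every epoch and that an interval evaluation coinciding with the epoch end runs only once, neither of which is confirmed by this reference.

```python
from typing import List, Optional, Tuple


def evaluation_points(n_epochs: int, iterations_per_epoch: int,
                      eval_every_k_iterations: Optional[int] = None
                      ) -> List[Tuple[int, int]]:
    """Return the (epoch, iteration) pairs after which an evaluation would run."""
    points = []
    for epoch in range(n_epochs):
        # Assumption: iterations are counted from 1 within each epoch.
        for iteration in range(1, iterations_per_epoch + 1):
            at_epoch_end = iteration == iterations_per_epoch
            at_interval = (eval_every_k_iterations is not None
                           and iteration % eval_every_k_iterations == 0)
            if at_epoch_end or at_interval:
                points.append((epoch, iteration))
    return points


epoch_end_only = evaluation_points(n_epochs=2, iterations_per_epoch=5)
with_interval = evaluation_points(n_epochs=2, iterations_per_epoch=5,
                                  eval_every_k_iterations=2)
```

With `eval_every_k_iterations=None` only the last iteration of each epoch appears in the result; with a value of 2 the schedule additionally contains iterations 2 and 4 of every epoch.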

setup(cfg: omegaconf.DictConfig) → None

(overrides TrainingRunner)

See TrainingRunner.setup().