BCAlgorithmConfig

class maze.train.trainers.imitation.bc_algorithm_config.BCAlgorithmConfig(n_epochs: int, device: str, batch_size: int, n_eval_workers: int, validation_percentage: float, eval_every_k_iterations: int, n_eval_episodes: int, max_episode_steps: int, optimizer: Any, loss: maze.train.trainers.imitation.bc_loss.BCLoss)

Algorithm parameters for behavioral cloning.

batch_size: int

Batch size used for training.

device: str

Device to use, either “cpu” or “cuda”.

eval_every_k_iterations: int

Number of iterations after which to run evaluation (in addition to evaluations at the end of each epoch, which are run automatically). If set to None, evaluations will run on epoch end only.
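As a rough illustration of this schedule, the sketch below lists the points at which evaluation would run. It is not Maze's trainer loop: `evaluation_points` and `iterations_per_epoch` are hypothetical names, and the sketch assumes that iterations are counted across the whole run and that every k-th iteration triggers an evaluation. The actual trainer may count differently.

```python
from typing import List, Optional, Tuple


def evaluation_points(
    n_epochs: int,
    iterations_per_epoch: int,
    eval_every_k_iterations: Optional[int],
) -> List[Tuple[int, int, str]]:
    """Hypothetical helper: return (epoch, iteration, reason) per evaluation."""
    points = []
    global_iteration = 0  # assumption: iterations are counted across epochs
    for epoch in range(n_epochs):
        for _ in range(iterations_per_epoch):
            global_iteration += 1
            # Extra evaluation every k iterations, skipped when k is None.
            if (
                eval_every_k_iterations is not None
                and global_iteration % eval_every_k_iterations == 0
            ):
                points.append((epoch, global_iteration, "every_k"))
        # Evaluation at the end of each epoch always runs.
        points.append((epoch, global_iteration, "epoch_end"))
    return points


# With None, only the epoch-end evaluations remain.
epoch_only = evaluation_points(2, 3, None)
# With k=2 and 4 iterations per epoch, two extra evaluations are added.
with_extra = evaluation_points(1, 4, 2)
```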

loss: maze.train.trainers.imitation.bc_loss.BCLoss

The loss used for behavioral cloning.

max_episode_steps: int

Maximum number of steps per episode during each evaluation rollout.

n_epochs: int

Number of epochs to train.

n_eval_episodes: int

Number of episodes to run during each evaluation rollout.

n_eval_workers: int

Number of workers in which to perform evaluation runs. If set to 1, evaluation is performed in the main process.

optimizer: Any

The optimizer to use to update the policy.

validation_percentage: float

Percentage of the data used for validation.
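To show how the fields above fit together, here is a minimal sketch using a stand-in dataclass. `BCConfigSketch` is hypothetical and only mirrors the documented field names; it is not the Maze class and does not import Maze. The validation checks, the 0 to 100 scale for `validation_percentage`, and all example values are assumptions made for illustration. `optimizer` and `loss` are left as `None` placeholders where a real configuration would pass an optimizer and a `BCLoss` instance.

```python
from dataclasses import dataclass
from typing import Any, Optional


@dataclass
class BCConfigSketch:
    """Hypothetical stand-in mirroring the documented fields (not the Maze class)."""

    n_epochs: int
    device: str
    batch_size: int
    n_eval_workers: int
    validation_percentage: float
    eval_every_k_iterations: Optional[int]  # None: evaluate on epoch end only
    n_eval_episodes: int
    max_episode_steps: int
    optimizer: Any
    loss: Any

    def __post_init__(self) -> None:
        # These checks are illustrative assumptions, not Maze behavior.
        if self.device not in ("cpu", "cuda"):
            raise ValueError("device must be 'cpu' or 'cuda'")
        if self.n_eval_workers < 1:
            raise ValueError("n_eval_workers must be at least 1")
        # Assumption: the percentage is given on a 0 to 100 scale.
        if not 0 <= self.validation_percentage <= 100:
            raise ValueError("validation_percentage must be in [0, 100]")


# Example values are arbitrary and chosen only for illustration.
config = BCConfigSketch(
    n_epochs=10,
    device="cpu",
    batch_size=32,
    n_eval_workers=1,  # 1 means evaluation runs in the main process
    validation_percentage=20,
    eval_every_k_iterations=None,
    n_eval_episodes=5,
    max_episode_steps=200,
    optimizer=None,  # placeholder
    loss=None,  # placeholder
)
```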