RolloutEvaluator
- class maze.train.trainers.common.evaluators.rollout_evaluator.RolloutEvaluator(eval_env: StructuredVectorEnv, n_episodes: int, model_selection: ModelSelectionBase | None, deterministic: bool = False)
Evaluates a given policy by rolling it out and collecting the mean reward.
- Parameters:
eval_env – Distributed environment to run evaluation rollouts in.
n_episodes – Number of evaluation episodes to run. Note that the actual number might be slightly larger due to the distributed nature of the environment.
model_selection – Model selection to notify about the recorded rewards.
deterministic – If True, actions are selected deterministically; if False (the default), they are sampled stochastically.
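The note on n_episodes can be illustrated with a minimal, self-contained sketch. It does not use the Maze API: ToyVectorEnv, evaluate_rollouts, and the constant policy are hypothetical stand-ins, assumed here only to show why a vectorized environment may finish more episodes than requested. All sub-environments step in lockstep, so several episodes can complete in the same step that crosses the n_episodes threshold.

```python
from statistics import mean


class ToyVectorEnv:
    """Hypothetical stand-in for a vectorized env (not the Maze API)."""

    def __init__(self, n_envs: int, episode_length: int):
        self.n_envs = n_envs
        self.episode_length = episode_length
        self._steps = [0] * n_envs

    def step(self, actions):
        """Step all sub-environments at once; the reward equals the action."""
        rewards, dones = [], []
        for i, action in enumerate(actions):
            self._steps[i] += 1
            rewards.append(float(action))
            done = self._steps[i] >= self.episode_length
            if done:
                self._steps[i] = 0  # auto-reset the finished sub-environment
            dones.append(done)
        return rewards, dones


def evaluate_rollouts(env, policy, n_episodes):
    """Roll out until at least n_episodes have finished.

    Returns the mean episode reward and the actual number of episodes.
    """
    running = [0.0] * env.n_envs   # reward accumulated in each running episode
    finished = []                  # total rewards of completed episodes
    while len(finished) < n_episodes:
        actions = [policy() for _ in range(env.n_envs)]
        rewards, dones = env.step(actions)
        for i, (reward, done) in enumerate(zip(rewards, dones)):
            running[i] += reward
            if done:
                finished.append(running[i])
                running[i] = 0.0
    return mean(finished), len(finished)


# Four parallel environments, six requested episodes: episodes finish in
# batches of four, so the loop stops only after the second batch.
mean_reward, actual_episodes = evaluate_rollouts(
    ToyVectorEnv(n_envs=4, episode_length=3), policy=lambda: 1, n_episodes=6
)
```

In this toy setup the requested count is rounded up to the next multiple of the number of parallel environments; how the real evaluator bounds the overshoot is not stated in this reference.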
- evaluate(policy: TorchPolicy) → None
(overrides Evaluator) Evaluate the given policy (results are stored in the stat logs) and dump the model if the reward improved.
- Parameters:
policy – Policy to evaluate.
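The "dump the model if the reward improved" behaviour can be sketched as follows. This is an assumption-laden illustration, not the ModelSelectionBase interface: BestRewardTracker and its update method are hypothetical names, and appending to a list stands in for writing a checkpoint.

```python
class BestRewardTracker:
    """Hypothetical model-selection helper (not the Maze API)."""

    def __init__(self):
        self.best_reward = float("-inf")
        self.saved_epochs = []  # stands in for checkpoints written to disk

    def update(self, mean_reward: float, epoch: int) -> bool:
        """Record a new mean reward; 'dump' only on strict improvement."""
        if mean_reward > self.best_reward:
            self.best_reward = mean_reward
            self.saved_epochs.append(epoch)
            return True
        return False


tracker = BestRewardTracker()
# Mean rewards from four consecutive evaluation runs (made-up numbers).
improved = [tracker.update(r, epoch) for epoch, r in enumerate([1.0, 0.5, 2.0, 2.0])]
```

Only the first and third evaluations trigger a dump here, because the second is worse and the fourth merely ties the best reward seen so far. Whether ties count as an improvement in the actual library is not specified in this reference.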