# Structured Environments and Action Masking

This tutorial is a step-by-step guide to implementing a decision problem as a structured environment and to training an agent for such a StructuredEnv with a structured Maze Trainer. The examples are again based on the online version of the Guillotine 2D Cutting Stock Problem, which lends itself well to introducing the underlying concepts.

In particular, we will see how to improve the performance of an RL agent by going through the following stages:

1. Flat Gym-style environment with vanilla feed-forward models

2. Structured environment (e.g., with hierarchical sub-steps) with task-specific policy networks

3. Structured environment (e.g., with hierarchical sub-steps) with masked task-specific policy networks
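To make the idea behind stage 3 concrete, the following is a minimal, library-independent sketch of action masking for a single discrete sub-step. It is not Maze's implementation: the function name `masked_softmax`, the four-slot inventory, and the example logits are assumptions chosen purely for illustration. The mask marks which actions are currently valid, and invalid actions receive zero probability before the policy samples.

```python
import math


def masked_softmax(logits, mask):
    """Return action probabilities with invalid actions (mask == 0) set to zero."""
    if not any(mask):
        raise ValueError("at least one action must be valid")
    # Replace logits of invalid actions with -inf so that exp() yields 0.
    masked = [l if m else -math.inf for l, m in zip(logits, mask)]
    # Subtract the largest valid logit for numerical stability.
    peak = max(masked)
    exps = [math.exp(l - peak) for l in masked]
    total = sum(exps)
    return [e / total for e in exps]


# Hypothetical sub-step: pick one of four inventory slots, two of which are empty.
logits = [1.0, 2.0, 0.5, 0.1]  # raw policy network outputs (illustrative values)
mask = [1, 0, 1, 0]            # 1 = slot holds a piece, 0 = slot is empty
probs = masked_softmax(logits, mask)
print(probs)
```

Because the probability mass is redistributed over the valid actions only, the agent never has to learn from experience that selecting an empty slot is pointless, which is the intuition behind adding masks to the task-specific policy networks.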

Before diving into this tutorial, we recommend familiarizing yourself with Control Flows with Structured Environments and the basic Maze - step by step tutorial.

The remainder of this tutorial is structured as follows:

Turning a “flat” MazeEnv into a StructuredEnv
We will reformulate the problem from a “flat” Gym-style environment into a structured environment.

Training the Structured Environment
We will train the structured environment with a Maze Trainer.