1. Cutting-2D Problem Specification

This page introduces the problem we would like to address with a Deep Reinforcement Learning agent: an online version of the Guillotine 2D Cutting Stock Problem.

Description of Problem:

  • In each step there is one new incoming customer order generated according to a certain demand pattern.

  • This customer order has to be fulfilled by cutting a rectangle with the exact x/y-dimensions from one of the candidate pieces available in the inventory.

  • A new raw piece is transferred to the inventory every time the current raw piece in inventory is used to cut and deliver a customer order.

  • The goal is to use as few raw pieces as possible throughout the episode, which can be achieved by following a clever cutting policy.
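To make these dynamics concrete, the basic bookkeeping could be sketched as follows. All names here (`Order`, `Piece`, `fits`) are illustrative placeholders, not part of the original specification; note that an order may fit a piece in either of its two orientations:

```python
from dataclasses import dataclass

@dataclass
class Order:
    x: int  # required width of the customer order
    y: int  # required height of the customer order

@dataclass
class Piece:
    x: int  # available width of an inventory piece
    y: int  # available height of an inventory piece

def fits(order: Order, piece: Piece) -> bool:
    """An order can be cut from a piece if its dimensions fit
    in at least one orientation (the order may be rotated 90 degrees)."""
    return (order.x <= piece.x and order.y <= piece.y) or \
           (order.y <= piece.x and order.x <= piece.y)

# Example: a 3x5 order fits a 5x4 piece only after rotation.
print(fits(Order(3, 5), Piece(5, 4)))  # True
```

A cutting policy that prefers small leftover pieces over fresh raw pieces whenever `fits` holds is exactly the kind of behavior the agent is expected to learn.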

Agent-Environment Interaction Loop:

To make the problem more explicit from an RL perspective, we formulate it according to the agent-environment interaction loop shown below.

  • The State contains the dimensions of the currently pending customer order and of all pieces in the inventory.

  • The Reward is specified to discourage the usage of raw inventory pieces.

  • The Action is a joint action consisting of the following components (see image below for details):

    • Action \(a_0\): Cutting piece selection (decides which piece from inventory to use for cutting)

    • Action \(a_1\): Cutting orientation selection (decides the orientation of the customer order on the selected piece)

    • Action \(a_2\): Cutting order selection (decides which cut to perform first: along the x- or the y-axis)
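The effect of the joint action on a piece can be sketched with a small helper. This is a hypothetical illustration, not the actual simulation code: given a piece, an order, the orientation choice \(a_1\), and the cut-order choice \(a_2\), two straight guillotine cuts separate the order and leave up to two rectangular offcuts that return to the inventory:

```python
def guillotine_cut(piece, order_w, order_h, rotate, x_first):
    """Illustrative sketch: cut an order from a rectangular piece
    with two guillotine (edge-to-edge) cuts.

    rotate  -- orientation choice (action a_1)
    x_first -- cut-order choice (action a_2)
    Returns the rectangular offcuts; zero-area offcuts are dropped.
    """
    w, h = (order_h, order_w) if rotate else (order_w, order_h)
    W, H = piece
    assert w <= W and h <= H, "order does not fit in this orientation"
    if x_first:
        # first cut parallel to the y-axis at x = w,
        # then cut the left strip at y = h
        offcuts = [(W - w, H), (w, H - h)]
    else:
        # first cut parallel to the x-axis at y = h,
        # then cut the bottom strip at x = w
        offcuts = [(W, H - h), (W - w, h)]
    return [(a, b) for (a, b) in offcuts if a > 0 and b > 0]

print(guillotine_cut((10, 8), 4, 3, rotate=False, x_first=True))
# [(6, 8), (4, 5)]
```

Note how \(a_2\) changes the shapes of the offcuts even though the delivered order is identical, which is precisely why the cut order matters for long-term raw-piece usage.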


Given this description of the problem we will now proceed with implementing a corresponding simulation.