1. Cutting-2D Problem Specification

This page introduces the problem we would like to address with a Deep Reinforcement Learning agent: an online version of the Guillotine 2D Cutting Stock Problem.

Description of Problem:

• In each step there is one new incoming customer order generated according to a certain demand pattern.

• This customer order has to be fulfilled by cutting the exact x/y-dimensions from a set of available candidate pieces in the inventory.

• A new raw piece is transferred to the inventory every time the current raw piece in inventory is used to cut and deliver a customer order.

• The goal is to use as few raw pieces as possible throughout the episode, which can be achieved by following a clever cutting policy.
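The first two points above can be sketched in a few lines of Python. This is a minimal illustration, not the simulation itself: the uniform demand pattern, the maximum order size, and the names `sample_order`, `fits`, and `candidate_pieces` are all assumptions made for this example.

```python
import random
from typing import List, Tuple

Piece = Tuple[int, int]  # (x, y) dimensions of a rectangular piece


def sample_order(rng: random.Random, max_dims: Piece = (50, 50)) -> Piece:
    """Hypothetical demand pattern: uniform integer dimensions (an assumption)."""
    return rng.randint(1, max_dims[0]), rng.randint(1, max_dims[1])


def fits(order: Piece, piece: Piece) -> bool:
    """True if the order can be cut from the piece in at least one orientation."""
    ox, oy = order
    px, py = piece
    return (ox <= px and oy <= py) or (oy <= px and ox <= py)


def candidate_pieces(order: Piece, inventory: List[Piece]) -> List[int]:
    """Indices of all inventory pieces the order could be cut from."""
    return [i for i, piece in enumerate(inventory) if fits(order, piece)]


if __name__ == "__main__":
    rng = random.Random(0)
    inventory = [(20, 30), (25, 25), (100, 100)]
    order = sample_order(rng)
    print(order, candidate_pieces(order, inventory))
```

A cutting policy then has to choose among the candidate indices; preferring small leftover pieces over the raw piece is what keeps the raw piece count low.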

Agent-Environment Interaction Loop:

To make the problem more explicit from an RL perspective we formulate it according to the agent-environment interaction loop shown below.

• The State contains the dimensions of the currently pending customer orders and all pieces in inventory.

• The Reward is specified to discourage the usage of raw inventory pieces.

• The Action is a joint action consisting of the following components (see image below for details):

• Action $$a_0$$: Cutting piece selection (decides which piece from inventory to use for cutting)

• Action $$a_1$$: Cutting orientation selection (decides the orientation of the customer order)

• Action $$a_2$$: Cutting order selection (decides which cut to make first: x or y)
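To make the joint action concrete, the following sketch applies $$a_0$$, $$a_1$$ and $$a_2$$ to a small inventory. Several details here are assumptions for illustration and are not fixed by the description above: the names `CuttingAction`, `guillotine_cut`, and `step`, the convention that the raw piece is always the last inventory entry, the exact residual pieces produced by each cutting order, and a reward of -1 whenever the raw piece is cut (0 otherwise).

```python
from dataclasses import dataclass
from typing import List, Tuple

Piece = Tuple[int, int]  # (x, y) dimensions of a rectangular piece


@dataclass(frozen=True)
class CuttingAction:
    piece_idx: int     # a_0: index of the inventory piece to cut from
    rotate: bool       # a_1: if True, swap the order's x and y dimensions
    cut_x_first: bool  # a_2: if True, make the x cut first, else the y cut


def guillotine_cut(piece: Piece, order: Piece, action: CuttingAction) -> List[Piece]:
    """Return the residual pieces left after cutting the order out of the piece."""
    ox, oy = (order[1], order[0]) if action.rotate else order
    px, py = piece
    if ox > px or oy > py:
        raise ValueError("order does not fit into the selected piece")
    if action.cut_x_first:
        # Full-height cut at x = ox, then cut the resulting strip at y = oy.
        residuals = [(px - ox, py), (ox, py - oy)]
    else:
        # Full-width cut at y = oy, then cut the resulting strip at x = ox.
        residuals = [(px, py - oy), (px - ox, oy)]
    # Drop degenerate residuals with zero width or height.
    return [(x, y) for x, y in residuals if x > 0 and y > 0]


def step(inventory: List[Piece], order: Piece, action: CuttingAction,
         raw_piece: Piece) -> Tuple[List[Piece], float]:
    """Apply one joint action. Assumes the last inventory entry is the raw piece."""
    used_raw = action.piece_idx == len(inventory) - 1
    residuals = guillotine_cut(inventory[action.piece_idx], order, action)
    # Keep all non-raw pieces except the one that was just cut.
    rest = [p for i, p in enumerate(inventory[:-1]) if i != action.piece_idx]
    # A fresh raw piece is added only if the current one was used.
    last = raw_piece if used_raw else inventory[-1]
    reward = -1.0 if used_raw else 0.0  # assumed penalty for using a raw piece
    return rest + residuals + [last], reward


if __name__ == "__main__":
    raw = (100, 100)
    inventory, reward = step([raw], (30, 20), CuttingAction(0, False, True), raw)
    print(inventory, reward)
```

Under these assumptions, the cutting order changes only the shape of the two residual pieces, not their total area, which is why $$a_2$$ matters for how well later orders fit.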

Given this description of the problem we will now proceed with implementing a corresponding simulation.