# GraphAttentionBlock

class maze.perception.blocks.feed_forward.graph_attention.GraphAttentionBlock(*args: Any, **kwargs: Any)

A block containing multiple subsequent graph (multi-head) attention stacks.

One attention stack consists of a graph multi-head attention layer followed by an activation layer. The block expects the input tensors to have the form:

• Feature matrix: first in_key: (batch-dim, num-of-nodes, feature-dim)

• Adjacency matrix: second in_key: (batch-dim, num-of-nodes, num-of-nodes) (expected to be symmetric)

The block returns a tensor of the form (batch-dim, num-of-nodes, feature-out-dim).
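As an illustrative sketch of this shape contract (not the Maze implementation), a single dense graph attention head maps exactly these shapes; the function name and random parameters below are hypothetical:

```python
import numpy as np

def graph_attention_head(h, adj, w, a_src, a_dst, alpha=0.2):
    """Single graph attention head over a dense adjacency matrix.

    h:            (batch, num_nodes, in_feat)   feature matrix
    adj:          (batch, num_nodes, num_nodes) symmetric adjacency matrix
    w:            (in_feat, out_feat)           linear projection
    a_src, a_dst: (out_feat,)                   attention parameter halves
    returns:      (batch, num_nodes, out_feat)
    """
    z = h @ w                                        # (B, N, F_out)
    # e[b, i, j] = LeakyReLU(a_src . z_i + a_dst . z_j)
    e = (z @ a_src)[:, :, None] + (z @ a_dst)[:, None, :]
    e = np.where(e > 0, e, alpha * e)                # LeakyReLU, negative slope alpha
    e = np.where(adj > 0, e, -1e9)                   # mask non-edges before softmax
    att = np.exp(e - e.max(axis=-1, keepdims=True))
    att = att / att.sum(axis=-1, keepdims=True)      # softmax over neighbours j
    return att @ z                                   # (B, N, F_out)

B, N, F_in, F_out = 4, 5, 8, 16
rng = np.random.default_rng(0)
h = rng.normal(size=(B, N, F_in))
adj = np.ones((B, N, N))                             # fully connected toy graph
out = graph_attention_head(h, adj,
                           rng.normal(size=(F_in, F_out)),
                           rng.normal(size=F_out), rng.normal(size=F_out))
print(out.shape)  # (4, 5, 16)
```

Each node's output row is a weighted combination of its neighbours' projected features, so the node dimension is preserved while the feature dimension changes.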

Parameters
• in_keys – Two keys identifying the feature matrix and adjacency matrix respectively.

• out_keys – One key identifying the output tensor.

• in_shapes – List of input shapes.

• hidden_features – List containing the number of hidden features for hidden layers.

• non_lins – The non-linearity/ies to apply after each layer (the same in all layers, or a list corresponding to each layer).

• n_heads – The number of heads each stack should have. (default suggestion 8)

• attention_alpha – The negative slope of the LeakyReLU in each of the attention layers. (default suggestion 0.2)

• avg_last_head_attentions – Specify whether to average the outputs of the attention heads in the last layer of the attention stack rather than concatenating them. (default suggestion True, or equivalently use a single head in the last layer)

• attention_dropout – The dropout rate applied within each layer to the computed attention coefficients.
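The interplay of n_heads, attention_alpha, and avg_last_head_attentions can be sketched with a small self-contained numpy example (again hypothetical names, not the Maze implementation): hidden layers concatenate the head outputs, while the last layer averages them.

```python
import numpy as np

def gat_layer(h, adj, head_params, avg_heads, alpha=0.2):
    """One multi-head graph attention layer.

    head_params: list of (w, a_src, a_dst) tuples, one per head
    avg_heads:   average head outputs (last layer) instead of concatenating
    alpha:       negative slope of the LeakyReLU (cf. attention_alpha)
    """
    outs = []
    for w, a_src, a_dst in head_params:
        z = h @ w
        e = (z @ a_src)[:, :, None] + (z @ a_dst)[:, None, :]
        e = np.where(e > 0, e, alpha * e)            # LeakyReLU(attention_alpha)
        e = np.where(adj > 0, e, -1e9)               # restrict attention to edges
        att = np.exp(e - e.max(axis=-1, keepdims=True))
        att = att / att.sum(axis=-1, keepdims=True)
        outs.append(att @ z)
    return np.mean(outs, axis=0) if avg_heads else np.concatenate(outs, axis=-1)

B, N, F_in, F_hid, n_heads = 2, 6, 4, 8, 3
rng = np.random.default_rng(1)
h = rng.normal(size=(B, N, F_in))
adj = (rng.random((B, N, N)) > 0.5).astype(float)
adj = np.maximum(adj, adj.transpose(0, 2, 1))        # symmetrize
adj = adj + np.eye(N)                                # self-loops: every row has a neighbour

params = [(rng.normal(size=(F_in, F_hid)), rng.normal(size=F_hid), rng.normal(size=F_hid))
          for _ in range(n_heads)]
hidden = np.tanh(gat_layer(h, adj, params, avg_heads=False))  # hidden layer: concat heads
print(hidden.shape)  # (2, 6, 24) = n_heads * F_hid features

params2 = [(rng.normal(size=(n_heads * F_hid, F_hid)), rng.normal(size=F_hid), rng.normal(size=F_hid))
           for _ in range(n_heads)]
out = gat_layer(hidden, adj, params2, avg_heads=True)         # last layer: average heads
print(out.shape)     # (2, 6, 8)
```

Concatenation multiplies the hidden feature dimension by n_heads, which is why averaging (or a single head) is the usual choice for the final layer.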

build_layer_dict() → collections.OrderedDict

Compiles a block-specific dictionary of network layers.

This can be overridden by derived blocks (e.g. to obtain a ‘BatchNormalizedConvolutionBlock’).

Returns

Ordered dictionary of torch modules [str, nn.Module].

normalized_forward(block_input: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor]

(overrides ShapeNormalizationBlock)

Implementation of the ShapeNormalizationBlock interface.