GraphAttentionBlock

class maze.perception.blocks.feed_forward.graph_attention.GraphAttentionBlock(*args: Any, **kwargs: Any)

A block containing multiple consecutive graph (multi-head) attention stacks.

One attention stack consists of one graph multi-head attention layer followed by an activation layer. The block expects the input tensors to have the form:

Feature matrix: first in_key: (batch-dim, num-of-nodes, feature-dim)
Adjacency matrix: second in_key: (batch-dim, num-of-nodes, num-of-nodes) (expected to be symmetric)

The block returns a tensor of the form (batch-dim, num-of-nodes, feature-out-dim).
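
To make these shapes concrete, the following is a minimal NumPy sketch of a single attention head in the style of the common graph attention formulation. It is not Maze's implementation (which is built from torch modules); the function name attention_head, the weight and attention-vector shapes, and the masking constant are assumptions chosen for illustration only.

```python
import numpy as np


def attention_head(features, adjacency, weight, attn_vec, negative_slope=0.2):
    """Hypothetical single graph attention head (illustration only).

    features:  (batch, nodes, feature_dim)
    adjacency: (batch, nodes, nodes), non-zero where an edge exists
    weight:    (feature_dim, feature_out_dim)
    attn_vec:  (2 * feature_out_dim,)
    """
    f_out = weight.shape[1]
    h = features @ weight                           # (batch, nodes, f_out)
    # Split the attention vector into a source part and a target part.
    src = h @ attn_vec[:f_out]                      # (batch, nodes)
    dst = h @ attn_vec[f_out:]                      # (batch, nodes)
    scores = src[:, :, None] + dst[:, None, :]      # (batch, nodes, nodes)
    # Leaky ReLU on the raw scores.
    scores = np.where(scores > 0, scores, negative_slope * scores)
    # Mask out node pairs that are not connected.
    scores = np.where(adjacency > 0, scores, -1e9)
    # Row-wise softmax over each node's neighbours.
    scores = scores - scores.max(axis=-1, keepdims=True)
    attention = np.exp(scores)
    attention = attention / attention.sum(axis=-1, keepdims=True)
    return attention @ h, attention


rng = np.random.default_rng(0)
batch, nodes, feature_dim, feature_out_dim = 2, 4, 3, 5

features = rng.normal(size=(batch, nodes, feature_dim))
# Symmetric adjacency of a 4-node path graph with self-loops, repeated per batch item.
path = np.eye(nodes) + np.eye(nodes, k=1) + np.eye(nodes, k=-1)
adjacency = np.repeat(path[None, :, :], batch, axis=0)

weight = rng.normal(size=(feature_dim, feature_out_dim))
attn_vec = rng.normal(size=(2 * feature_out_dim,))

out, attention = attention_head(features, adjacency, weight, attn_vec)
print(out.shape)  # (2, 4, 5)
```

The self-loops in the example ensure that every node attends to at least itself, so each softmax row has a non-masked entry.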

Parameters

in_keys – Two keys identifying the feature matrix and adjacency matrix respectively.
out_keys – One key identifying the output tensors.
in_shapes – List of input shapes.
hidden_features – List containing the number of hidden features for hidden layers.
non_lins – The non-linearity/ies to apply after each layer (the same in all layers, or a list corresponding to each layer).
n_heads – The number of heads each stack should have. (default suggestion 8)
attention_alpha – The negative slope of the leaky ReLU in each of the attention layers. (default suggestion 0.2)
avg_last_head_attentions – Whether to average the outputs of the attention heads in the last layer of the attention stack. (default suggestion True or n_heads=0 in the last layer)
attention_dropout – The dropout applied to the computed attention within the layers.
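
The effect of avg_last_head_attentions on the output shape can be sketched with plain arrays. The example below assumes the usual graph attention convention that head outputs are concatenated along the feature axis when they are not averaged; this page does not state that explicitly, so treat it as an assumption. The random arrays are stand-ins for the per-head outputs of a last layer.

```python
import numpy as np

rng = np.random.default_rng(0)
batch, nodes, feature_out_dim, n_heads = 2, 4, 5, 8

# Stand-ins for the outputs of each attention head in the last layer.
head_outputs = [
    rng.normal(size=(batch, nodes, feature_out_dim)) for _ in range(n_heads)
]

# Averaging keeps the feature dimension independent of the number of heads.
averaged = np.mean(np.stack(head_outputs, axis=0), axis=0)

# Concatenating (assumed alternative) multiplies the feature dimension by n_heads.
concatenated = np.concatenate(head_outputs, axis=-1)

print(averaged.shape)      # (2, 4, 5)
print(concatenated.shape)  # (2, 4, 40)
```

Averaging yields the (batch-dim, num-of-nodes, feature-out-dim) form described for the block's output.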

build_layer_dict() → collections.OrderedDict

Compiles a block-specific dictionary of network layers.

This can be overridden by derived layers (e.g. to get a ‘BatchNormalizedConvolutionBlock’).

Returns: Ordered dictionary of torch modules [str, nn.Module].

normalized_forward(block_input: Dict[str, torch.Tensor]) → Dict[str, torch.Tensor]

(overrides ShapeNormalizationBlock)

Implementation of the ShapeNormalizationBlock interface.