Simple Multi-Head Attention Model

class flood_forecast.transformer_xl.multi_head_base.MultiAttnHeadSimple(number_time_series: int, seq_len=10, output_seq_len=None, d_model=128, num_heads=8, dropout=0.1, output_dim=1, final_layer=False)[source]

A simple multi-head attention model inspired by Vaswani et al. (2017), "Attention Is All You Need".

__init__(number_time_series: int, seq_len=10, output_seq_len=None, d_model=128, num_heads=8, dropout=0.1, output_dim=1, final_layer=False)[source]

Initializes the model. Parameters:

number_time_series (int) – the number of input time series (M in forward).

seq_len – the length of the input sequence (default 10).

output_seq_len – the length of the forecast the model outputs (default None).

d_model – the dimensionality of the model's internal embedding (default 128).

num_heads – the number of attention heads (default 8).

dropout – the dropout probability (default 0.1).

output_dim – the dimensionality of each output step (default 1).

final_layer – whether to apply an additional final layer to the output (default False).
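
A minimal construction sketch; the argument values below are illustrative choices, not values taken from any shipped configuration:

>>> from flood_forecast.transformer_xl.multi_head_base import MultiAttnHeadSimple
>>> # 3 input time series, a 10-step input window; other arguments keep their defaults
>>> model = MultiAttnHeadSimple(number_time_series=3, seq_len=10)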

forward(x: Tensor, mask=None) → Tensor[source]
Parameters:

x (torch.Tensor) – a tensor of shape (B, L, M), where B is the batch size, L is the sequence length, and M is the number of time series.

mask – an optional mask (default None).

Returns:

a tensor of dimension (B, forecast_length)
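
A hedged usage sketch for forward, reusing the model constructed above with a random input batch; the expected output shape follows the return description:

>>> import torch
>>> x = torch.rand(4, 10, 3)  # (B=4, L=10, M=3), matching number_time_series=3
>>> out = model(x)            # expected shape: (4, forecast_length)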