mindformers.modules.transformer.AttentionMask¶
- class mindformers.modules.transformer.AttentionMask(**kwargs)[源代码]¶
Get the Lower triangular matrix from the input mask. The input mask is a 2D tensor (batch_size, seq_length) with 1 and 0, where 1 indicates the current position is a valid token, otherwise not.
- Args:
seq_length(int): The sequence length of the input tensor. parallel_config(OpParallelConfig): The parallel configure. Default default_dpmp_config,
an instance of OpParallelConfig with default args.
- Inputs:
input_mask (Tensor) - The mask indicating whether each position is a valid input with (batch_size, seq_length).
- Outputs:
Tensor. The attention mask matrix with shape (batch_size, seq_length, seq_length).
- Raises:
TypeError: seq_length is not an integer. ValueError: seq_length is not a positive value. TypeError: parallel_config is not a subclass of OpParallelConfig.
- Supported Platforms:
AscendGPU- Examples:
>>> import numpy as np >>> from mindformers.modules.transformer import AttentionMask >>> from mindspore import Tensor >>> mask = AttentionMask(seq_length=4) >>> mask_array = np.array([[1, 1, 1, 0]], np.float32) >>> inputs = Tensor(mask_array) >>> res = mask(inputs) >>> print(res) [[[1. 0. 0. 0] [1. 1. 0. 0] [1. 1. 1. 0] [0. 0. 0. 0]]]