mindformers.modules.transformer.VocabEmbedding

class mindformers.modules.transformer.VocabEmbedding(**kwargs)

Looks up embedding vectors along dimension 0 of the embedding table. When parallel_config.vocab_emb_dp is True and the AUTO_PARALLEL mode is in use, the lookup runs in a data-parallel way, with the full table replicated on every device. When it is False, the table is split into n shards along dimension 0, where n is the model-parallel degree set by parallel_config.model_parallel (parallel_config being an EmbeddingOpParallelConfig).
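
The difference between the two strategies can be illustrated without any parallel runtime. The following is a conceptual sketch in plain Python, not the MindFormers implementation: lists stand in for tensors and devices, and the helpers make_table, lookup, shard_rows and sharded_lookup are hypothetical names introduced only for this illustration. It assumes that a row-sharded lookup lets each shard answer only for the ids in its own row range and then sums the partial results.

```python
# Conceptual sketch only: plain Python lists stand in for tensors and devices.
# None of these helpers are part of mindformers or MindSpore.

def make_table(vocab_size, embedding_size):
    """Build a toy embedding table in which row i is filled with the value i."""
    return [[float(i)] * embedding_size for i in range(vocab_size)]

def lookup(table, input_ids):
    """Replicated lookup: index the full table along dimension 0."""
    return [[table[token] for token in row] for row in input_ids]

def shard_rows(table, n):
    """Split the table into n equal blocks of rows along dimension 0."""
    if len(table) % n != 0:
        raise ValueError("vocab_size must be a multiple of the number of shards")
    rows = len(table) // n
    return [table[k * rows:(k + 1) * rows] for k in range(n)]

def sharded_lookup(shards, input_ids):
    """Row-sharded lookup: each shard contributes vectors only for ids in its
    own row range and zeros otherwise; the partial results are summed."""
    rows = len(shards[0])
    emb = len(shards[0][0])
    out = [[[0.0] * emb for _ in row] for row in input_ids]
    for k, shard in enumerate(shards):
        start = k * rows
        for b, row in enumerate(input_ids):
            for s, token in enumerate(row):
                if start <= token < start + rows:
                    local = shard[token - start]
                    out[b][s] = [x + y for x, y in zip(out[b][s], local)]
    return out

table = make_table(vocab_size=8, embedding_size=4)
input_ids = [[0, 3, 7], [5, 5, 1]]  # shape (batch_size=2, seq_length=3)

full = lookup(table, input_ids)
split = sharded_lookup(shard_rows(table, n=2), input_ids)

# Both strategies should produce identical embedding vectors.
print(full == split)
# Output shape is (batch_size, seq_length, embedding_size).
print(len(full), len(full[0]), len(full[0][0]))
```

In this sketch the sharded path needs the table's row count to divide evenly by the number of shards, which is why shard_rows rejects any other split.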

Note

When AUTO_PARALLEL or SEMI_AUTO_PARALLEL mode is enabled, this layer supports only 2-D inputs, because the sharding strategy is designed for 2-D inputs.

Parameters
  • vocab_size (int) – Size of the dictionary of embeddings.

  • embedding_size (int) – The size of each embedding vector.

  • parallel_config (EmbeddingOpParallelConfig) – The parallel configuration of the network. Default: default_embedding_parallel_config, an instance of EmbeddingOpParallelConfig with default arguments.

  • param_init (Union[Tensor, str, Initializer, numbers.Number]) – Initializer for the embedding_table. When a string is given, refer to the initializer class for the accepted values. Default: 'normal'.

Inputs:
  • input_ids (Tensor) - The tokenized inputs, with data type int32 and shape (batch_size, seq_length).

Outputs:

Tuple, a tuple containing (output, embedding_table).

  • output (Tensor) - The embedding vector for the input with shape (batch_size, seq_length, embedding_size).

  • embedding_table (Tensor) - The embedding table with shape (vocab_size, embedding_size).

Raises
  • ValueError – If parallel_config.vocab_emb_dp is False and vocab_size is not a multiple of parallel_config.model_parallel.

  • ValueError – If vocab_size is not a positive value.

  • ValueError – If embedding_size is not a positive value.

  • TypeError – If parallel_config is not a subclass of OpParallelConfig.
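
The error conditions above can be summarized as a small validation routine. This is a hedged sketch rather than the library's code: ToyParallelConfig is a hypothetical stand-in for EmbeddingOpParallelConfig that carries only the two fields used here (its default values are assumptions), and check_embedding_args is a name invented for this illustration. It assumes the divisibility check applies when the table is sharded, that is, when vocab_emb_dp is False, as the class description implies.

```python
# Hypothetical helper mirroring the documented error conditions.
# Not the mindformers implementation; ToyParallelConfig stands in for
# EmbeddingOpParallelConfig and its defaults are assumptions.
from dataclasses import dataclass

@dataclass
class ToyParallelConfig:
    model_parallel: int = 1
    vocab_emb_dp: bool = True

def check_embedding_args(vocab_size, embedding_size, parallel_config):
    """Raise the documented exception type for each invalid argument."""
    if not isinstance(parallel_config, ToyParallelConfig):
        raise TypeError("parallel_config has an unsupported type")
    if not isinstance(vocab_size, int) or vocab_size <= 0:
        raise ValueError("vocab_size must be a positive integer")
    if not isinstance(embedding_size, int) or embedding_size <= 0:
        raise ValueError("embedding_size must be a positive integer")
    # Assumption: divisibility only matters when the table is row-sharded.
    sharded = not parallel_config.vocab_emb_dp
    if sharded and vocab_size % parallel_config.model_parallel != 0:
        raise ValueError("vocab_size must be a multiple of model_parallel")

# A table of 32 rows splits evenly across 4 model-parallel shards.
check_embedding_args(32, 16, ToyParallelConfig(model_parallel=4, vocab_emb_dp=False))

# A table of 30 rows does not, so the sharded configuration is rejected.
try:
    check_embedding_args(30, 16, ToyParallelConfig(model_parallel=4, vocab_emb_dp=False))
except ValueError as err:
    print("rejected:", err)
```

Under the same assumption, a vocabulary of 30 rows would be accepted with vocab_emb_dp=True, since the replicated table is never split.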

Supported Platforms:

Ascend GPU

Examples

>>> import numpy as np
>>> from mindformers.modules.transformer import VocabEmbedding
>>> from mindspore import Tensor
>>> from mindspore import dtype as mstype
>>> model = VocabEmbedding(vocab_size=30, embedding_size=30)
>>> tensor = Tensor(np.ones((20, 15)), mstype.int32)
>>> output, table = model(tensor)
>>> print(output.shape)
(20, 15, 30)
>>> print(table.shape)
(30, 30)