mindformers.models.Tokenizer

class mindformers.models.Tokenizer(**kwargs)[source]

The pretrained tokenizer class. It provides the tokenizer methods described in detail below.

convert_ids_to_tokens(input_ids, skip_special_tokens=False)[source]

Convert the ids to tokens using the vocab mapping.

convert_tokens_to_ids(input_tokens)[source]

Convert the tokens to ids using the vocab mapping.
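As a rough illustration of what the two vocab-mapping conversions above do, the sketch below uses a hypothetical four-entry vocabulary and plain dictionaries. It does not call mindformers: the names `toy_vocab`, `tokens_to_ids`, and `ids_to_tokens` are assumptions made for illustration, and mapping unknown tokens to `[UNK]` is an assumed behaviour rather than something this page states.

```python
# Hypothetical toy vocabulary; a real tokenizer loads its vocab from a file.
toy_vocab = {"[UNK]": 0, "[CLS]": 1, "hello": 2, "world": 3}
toy_inverse = {idx: tok for tok, idx in toy_vocab.items()}
toy_special = {"[UNK]", "[CLS]"}


def tokens_to_ids(tokens):
    """Look up each token; fall back to the [UNK] id (assumed behaviour)."""
    return [toy_vocab.get(tok, toy_vocab["[UNK]"]) for tok in tokens]


def ids_to_tokens(ids, skip_special_tokens=False):
    """Reverse lookup, optionally dropping special tokens."""
    tokens = [toy_inverse[i] for i in ids]
    if skip_special_tokens:
        tokens = [t for t in tokens if t not in toy_special]
    return tokens


ids = tokens_to_ids(["[CLS]", "hello", "world"])
print(ids)                                           # [1, 2, 3]
print(ids_to_tokens(ids, skip_special_tokens=True))  # ['hello', 'world']
```

With `skip_special_tokens=False` the round trip would return the `[CLS]` token as well.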

convert_tokens_to_string(tokens)[source]

Convert the tokens to a string.
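How tokens are joined back into text depends on the concrete tokenizer. The sketch below assumes a WordPiece-style scheme in which a `##` prefix marks a continuation piece; the function name `tokens_to_string` is hypothetical and the code does not call mindformers.

```python
def tokens_to_string(tokens):
    """Join tokens with spaces, then merge '##' continuation pieces.

    Assumes WordPiece-style tokens; other schemes (BPE, SentencePiece)
    use different markers and need different joining rules.
    """
    return " ".join(tokens).replace(" ##", "").strip()


print(tokens_to_string(["token", "##izer", "works"]))  # tokenizer works
```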

classmethod get_support_list()[source]

Return the support list of this tokenizer class.

num_special_tokens_to_add()[source]

Return the number of special tokens to be added to the ids and ids_pair.
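One common way to obtain such a count is to build the model inputs from empty sequences and measure the result. The sketch below assumes a BERT-like layout (`[CLS] A [SEP]` for one sequence, `[CLS] A [SEP] B [SEP]` for a pair). The ids, the helper names, and the `pair` parameter are all hypothetical and are not taken from the signature documented here.

```python
CLS_ID, SEP_ID = 101, 102  # hypothetical special-token ids


def build_inputs_with_special_tokens(ids, ids_pair=None):
    """Wrap one or two id sequences in BERT-style special tokens (assumed layout)."""
    out = [CLS_ID] + ids + [SEP_ID]
    if ids_pair is not None:
        out += ids_pair + [SEP_ID]
    return out


def count_special_tokens(pair=False):
    """Count special tokens by building inputs from empty sequences."""
    return len(build_inputs_with_special_tokens([], [] if pair else None))


print(count_special_tokens())           # 2
print(count_special_tokens(pair=True))  # 3
```

Counting from empty inputs keeps the number in sync with whatever layout the builder uses, so the two never drift apart.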

classmethod show_support_list()[source]

Show the support list of this tokenizer class.

property vocab_size

Get the vocab size of the tokenizer.