mindformers.AutoTokenizer

class mindformers.AutoTokenizer

Loads a tokenizer according to yaml_name_or_path. Two cases are supported:

1. yaml_name_or_path is a model name.
2. yaml_name_or_path is the path to the downloaded files.

Examples:
>>> from mindformers.auto_class import AutoTokenizer
>>>
>>> # 1) Instantiate a tokenizer by model name.
>>> tokenizer_a = AutoTokenizer.from_pretrained("clip_vit_b_32")
>>> # 2) Instantiate a tokenizer from the path to the downloaded files.
>>> from mindformers.models.clip.clip_tokenizer import CLIPTokenizer
>>> clip_tokenizer = CLIPTokenizer.from_pretrained("clip_vit_b_32")
>>> path_saved = "./clip_tokenizer"  # illustrative directory; choose your own
>>> clip_tokenizer.save_pretrained(path_saved)
>>> restore_tokenizer = AutoTokenizer.from_pretrained(path_saved)
classmethod from_pretrained(yaml_name_or_path, **kwargs)

Instantiates a tokenizer from a supported yaml name or a path.

Args:
yaml_name_or_path (str): A supported yaml name or a path to a .yaml file. Supported model names can be listed with .show_support_list(). If yaml_name_or_path is a model name, it may be given with or without the mindspore prefix, such as "mindspore/clip_vit_b_32" or "clip_vit_b_32".

pretrained_model_name_or_path (Optional[str]): Equivalent to yaml_name_or_path. If pretrained_model_name_or_path is set, yaml_name_or_path is ignored.

Returns:

A tokenizer instance that inherits from PretrainedTokenizer.
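
The argument rules described for from_pretrained can be illustrated with a small standalone sketch. It is not the library's implementation: the helper name pick_source is hypothetical, and telling a path from a model name by checking whether it exists on disk is an assumption made only for illustration.

```python
import os
from typing import Optional, Tuple


def pick_source(yaml_name_or_path: str,
                pretrained_model_name_or_path: Optional[str] = None) -> Tuple[str, str]:
    """Hypothetical helper: return (kind, value) for the argument that is used."""
    # Per the Args description, pretrained_model_name_or_path wins when it is set.
    value = pretrained_model_name_or_path or yaml_name_or_path
    # Assumption: a value that exists on disk is treated as a path,
    # anything else as a model name.
    kind = "path" if os.path.exists(value) else "name"
    return kind, value


# A bare model name is classified as a name.
print(pick_source("clip_vit_b_32"))
# The first argument is ignored once pretrained_model_name_or_path is given.
print(pick_source("unused", pretrained_model_name_or_path="mindspore/clip_vit_b_32"))
# An existing directory is classified as a path.
print(pick_source(os.getcwd())[0])
```

The sketch only mirrors the documented precedence between the two arguments; how the library itself resolves names and paths may differ.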

classmethod get_support_list()

Gets the list of supported models.

classmethod invalid_yaml_name(yaml_name_or_path)

Checks whether yaml_name_or_path is a valid yaml name.
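
Since from_pretrained accepts both "mindspore/clip_vit_b_32" and "clip_vit_b_32", a name check plausibly normalizes the prefix before looking the name up. The sketch below shows that idea only. The names SUPPORTED, normalize_name and is_supported_name are hypothetical, the flat support list is an assumption, and the actual return value and error behavior of invalid_yaml_name are not documented here.

```python
from typing import Iterable

# Hypothetical support list; the real one is maintained by the library.
SUPPORTED = ["clip_vit_b_32"]
PREFIX = "mindspore/"


def normalize_name(name: str) -> str:
    """Drop the optional 'mindspore/' prefix from a model name."""
    return name[len(PREFIX):] if name.startswith(PREFIX) else name


def is_supported_name(name: str, supported: Iterable[str] = SUPPORTED) -> bool:
    """Return True if the normalized name appears in the support list."""
    return normalize_name(name) in supported


for candidate in ("clip_vit_b_32", "mindspore/clip_vit_b_32", "unknown_model"):
    print(candidate, is_supported_name(candidate))
```

Both spellings of the CLIP model name map to the same entry, while a name missing from the list is rejected.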

classmethod show_support_list()

Shows the list of supported models.