mindformers.AutoTokenizer
-
class
mindformers.AutoTokenizer
Loads the tokenizer specified by yaml_name_or_path. Two cases are supported:
1. yaml_name_or_path is a model name.
2. yaml_name_or_path is the path to the downloaded files.
Examples
>>> from mindformers.auto_class import AutoTokenizer
>>>
>>> # 1) instantiate a tokenizer by model name
>>> tokenizer_a = AutoTokenizer.from_pretrained("clip_vit_b_32")
>>> # 2) instantiate a tokenizer from the path to the downloaded files
>>> from mindformers.models.clip.clip_tokenizer import CLIPTokenizer
>>> path_saved = "./clip_tokenizer"  # hypothetical output directory; any writable path works
>>> clip_tokenizer = CLIPTokenizer.from_pretrained("clip_vit_b_32")
>>> clip_tokenizer.save_pretrained(path_saved)
>>> restore_tokenizer = AutoTokenizer.from_pretrained(path_saved)
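The two situations above differ only in what the string refers to. The following is a minimal sketch of how a caller might tell them apart before calling from_pretrained, assuming that saved or downloaded tokenizer files live at a path that exists on disk. classify_tokenizer_source is a hypothetical helper written for illustration; it is not part of mindformers and does not describe how the library itself decides.

```python
import os
import tempfile


def classify_tokenizer_source(yaml_name_or_path: str) -> str:
    """Hypothetical helper: report whether the argument looks like a path or a model name.

    Assumption: anything that exists on disk is treated as a path to downloaded
    or saved files; everything else is treated as a model name.
    """
    if os.path.exists(yaml_name_or_path):
        return "path"
    return "model_name"


# A temporary directory stands in for the folder written by save_pretrained.
with tempfile.TemporaryDirectory() as path_saved:
    kind_of_saved_dir = classify_tokenizer_source(path_saved)

# A bare model name that does not exist as a local file or directory.
kind_of_name = classify_tokenizer_source("clip_vit_b_32")
```

A check like this can be useful for logging or validation, since a mistyped directory would otherwise be interpreted as a model name.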
-
classmethod
from_pretrained(yaml_name_or_path, **kwargs)
Instantiates a tokenizer from a yaml name or path.
- Parameters
yaml_name_or_path (str) – A supported yaml name or a path to a .yaml file. Supported model names can be listed with .show_support_list(). A model name may be given either with the mindspore prefix or on its own, for example "mindspore/clip_vit_b_32" or "clip_vit_b_32".
pretrained_model_name_or_path (Optional[str]) – Equivalent to yaml_name_or_path. If pretrained_model_name_or_path is set, yaml_name_or_path is ignored.
- Returns
A tokenizer that inherits from PretrainedTokenizer.
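The argument rules for from_pretrained can be sketched in plain Python. This is an illustration under stated assumptions, not the mindformers implementation: resolve_tokenizer_name is a hypothetical helper, and stripping the "mindspore/" prefix is only one way the two accepted spellings of a model name could be made equivalent.

```python
from typing import Optional

MINDSPORE_PREFIX = "mindspore/"  # prefix named in the parameter description


def resolve_tokenizer_name(
    yaml_name_or_path: Optional[str] = None,
    pretrained_model_name_or_path: Optional[str] = None,
) -> str:
    """Hypothetical helper mirroring the documented argument rules."""
    # pretrained_model_name_or_path takes precedence when it is set.
    chosen = pretrained_model_name_or_path or yaml_name_or_path
    if chosen is None:
        raise ValueError("either yaml_name_or_path or pretrained_model_name_or_path is required")
    # Assumption: "mindspore/<name>" and "<name>" refer to the same model.
    if chosen.startswith(MINDSPORE_PREFIX):
        chosen = chosen[len(MINDSPORE_PREFIX):]
    return chosen
```

With this sketch, resolve_tokenizer_name("mindspore/clip_vit_b_32") and resolve_tokenizer_name("clip_vit_b_32") yield the same name, and a value passed as pretrained_model_name_or_path overrides yaml_name_or_path.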
-
classmethod