mindformers.pipeline.ZeroShotImageClassificationPipeline

class mindformers.pipeline.ZeroShotImageClassificationPipeline(model: Union[str, BaseModel, Model], tokenizer: Optional[BaseTokenizer] = None, image_processor: Optional[BaseImageProcessor] = None, **kwargs)

Pipeline for zero-shot image classification. It scores an input image against a set of candidate text labels.

Args:
model (Union[str, BaseModel]): The model used to perform the task. It can be a supported model name or a model instance that inherits from BaseModel.

tokenizer (Optional[BaseTokenizer]): A tokenizer for text processing.

image_processor (Optional[BaseImageProcessor]): The image processor of the model. It can be None if the model does not need an image processor.

Raises:

TypeError: If the model, tokenizer, or image_processor is not of the correct type.

ValueError: If the input model is not in the support list.

Examples:
>>> from mindformers.tools.image_tools import load_image
>>> from mindformers.pipeline import ZeroShotImageClassificationPipeline
>>> classifier = ZeroShotImageClassificationPipeline(
...     model='clip_vit_b_32',
...     candidate_labels=["sunflower", "tree", "dog", "cat", "toy"],
...     hypothesis_template="This is a photo of {}."
...     )
>>> img = load_image("https://ascend-repo-modelzoo.obs.cn-east-2."
...                  "myhuaweicloud.com/XFormer_for_mindspore/clip/sunflower.png")
>>> classifier(img)
    [[{'score': 0.99995565, 'label': 'sunflower'},
    {'score': 2.5318595e-05, 'label': 'toy'},
    {'score': 9.903885e-06, 'label': 'dog'},
    {'score': 6.75336e-06, 'label': 'tree'},
    {'score': 2.396818e-06, 'label': 'cat'}]]
forward(model_inputs: dict, **forward_params)

Run the forward pass of the model on the preprocessed inputs.

Args:

model_inputs (dict): Outputs of preprocess.

Return:

A dict containing the predicted probabilities.
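The source does not say how the probabilities are computed. As a minimal sketch, assuming a CLIP-style model that yields one similarity logit per candidate label, a softmax could turn those logits into probabilities. The helper name `softmax_probs` and the logit values are hypothetical and are not part of the MindFormers API.

```python
import math
from typing import List


def softmax_probs(logits_per_image: List[float]) -> List[float]:
    """Convert one image's per-label logits into probabilities (hypothetical helper)."""
    # Subtract the largest logit first so that exp() cannot overflow.
    peak = max(logits_per_image)
    exps = [math.exp(x - peak) for x in logits_per_image]
    total = sum(exps)
    return [e / total for e in exps]


# Made-up logits for three candidate labels, for illustration only.
probs = softmax_probs([2.0, 1.0, 0.1])
print(probs)
```

The resulting values sum to 1 and keep the same ordering as the logits, which is what lets the postprocess step rank the labels.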

postprocess(model_outputs: dict, **postprocess_params)

Postprocess the model outputs into classification results.

Args:

model_outputs (dict): The outputs of the forward process.

top_k (int): The number of highest-probability results to return.

Return:

Classification results.
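The effect of top_k can be illustrated with a small sketch that pairs each probability with its label, sorts by score, and keeps the first top_k entries in the same `{'score': ..., 'label': ...}` shape as the example output. The function `rank_top_k` and its default value are assumptions made for illustration, not the library's implementation.

```python
from typing import Dict, List, Union


def rank_top_k(
    probs: List[float], labels: List[str], top_k: int = 3
) -> List[Dict[str, Union[float, str]]]:
    """Return the top_k (score, label) pairs, highest score first (hypothetical helper)."""
    if len(probs) != len(labels):
        raise ValueError("probs and labels must have the same length.")
    # Sort by probability in descending order, then truncate to top_k.
    pairs = sorted(zip(probs, labels), key=lambda pair: pair[0], reverse=True)
    return [{"score": score, "label": label} for score, label in pairs[:top_k]]


# Made-up probabilities for three candidate labels.
results = rank_top_k([0.1, 0.7, 0.2], ["tree", "sunflower", "dog"], top_k=2)
print(results)
```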

preprocess(inputs: dict, **preprocess_params)

Preprocess the inputs of ZeroShotImageClassificationPipeline.

Args:

inputs (Union[url, PIL.Image, tensor, numpy]): The image to be classified.

candidate_labels (List[str]): The candidate labels for classification.

hypothesis_template (Optional[str]): The prompt template for the text input, for example "This is a photo of {}."

max_length (Optional[int]): The maximum length of the tokenizer's output.

padding (Optional[Union[False, "max_length"]]): Whether to pad the tokenizer's output to max_length.

return_tensors (Optional["ms"]): The type of the returned tensors.

Return:

Processed data.

Raises:

ValueError: If candidate_labels or hypothesis_template is None.
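One plausible reading of how candidate_labels and hypothesis_template combine is that each label is substituted into the template to form one text prompt per label. The sketch below assumes exactly that, together with the None check described under Raises. The function `build_prompts` is hypothetical and only mirrors the documented behavior.

```python
from typing import List, Optional


def build_prompts(
    candidate_labels: Optional[List[str]], hypothesis_template: Optional[str]
) -> List[str]:
    """Fill each candidate label into the template (hypothetical helper)."""
    # Mirror the documented error condition: both arguments are required.
    if candidate_labels is None or hypothesis_template is None:
        raise ValueError("candidate_labels and hypothesis_template must both be provided.")
    return [hypothesis_template.format(label) for label in candidate_labels]


prompts = build_prompts(["sunflower", "tree"], "This is a photo of {}.")
print(prompts)
```

Under this assumption, the tokenizer would receive one prompt per label, so the number of text inputs always matches the number of candidate labels.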