mindformers.pipeline.TextClassificationPipeline
- class mindformers.pipeline.TextClassificationPipeline(model, tokenizer=None, **kwargs)[source]
Pipeline for text classification.
- Args:
- model (Union[str, BaseModel]): The model used to perform the task.
It can be the name of a supported model or a model instance that inherits from BaseModel.
tokenizer (None or Tokenizer): The tokenizer used for text processing.
- Raises:
TypeError: If the type of the input model or tokenizer is incorrect.
ValueError: If the input model is not in the support list.
- Examples:
>>> from mindformers.pipeline import TextClassificationPipeline
>>> from mindformers import AutoTokenizer, BertForMultipleChoice, AutoConfig
>>> input_data = ["The new rights are nice enough-Everyone really likes the newest benefits ",
...               "i don't know um do you do a lot of camping-I know exactly."]
>>> tokenizer = AutoTokenizer.from_pretrained('txtcls_bert_base_uncased_mnli')
>>> txtcls_mnli_config = AutoConfig.from_pretrained('txtcls_bert_base_uncased_mnli')
>>> model = BertForMultipleChoice(txtcls_mnli_config)
>>> txtcls_pipeline = TextClassificationPipeline(task='text_classification',
...                                              model=model,
...                                              tokenizer=tokenizer,
...                                              max_length=model.config.seq_length,
...                                              padding="max_length")
>>> results = txtcls_pipeline(input_data, top_k=1)
>>> print(results)
[[{'label': 'neutral', 'score': 0.9714198708534241}], [{'label': 'contradiction', 'score': 0.9967639446258545}]]
- forward(model_inputs, **forward_params)[source]
Runs the forward pass of the model.
- Args:
model_inputs (dict): The outputs of preprocess.
- Return:
A dict containing the predicted probabilities.
- inputs_process(inputs_zero, inputs_one)[source]
Processes a pair of sentences for sentence-relationship classification.
- Args:
inputs_zero (str): The first sentence.
inputs_one (str): The second sentence.
- Return:
The processed inputs, mask, and token types for the two sentences.
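The following is a minimal sketch of how a sentence pair is commonly turned into inputs, a mask, and token types in BERT-style models. It is not the library's implementation: `toy_tokenize`, `encode_pair`, the special-token ids, and the vocabulary are all hypothetical names and values chosen for illustration, and a real Tokenizer would use subword tokenization.

```python
from typing import Dict, List

# Hypothetical special-token ids, for illustration only.
PAD_ID, UNK_ID, CLS_ID, SEP_ID = 0, 100, 101, 102


def toy_tokenize(text: str, vocab: Dict[str, int]) -> List[int]:
    """Whitespace tokenizer standing in for a real Tokenizer (assumption)."""
    return [vocab.get(word, UNK_ID) for word in text.lower().split()]


def encode_pair(first: str, second: str, vocab: Dict[str, int],
                max_length: int) -> Dict[str, List[int]]:
    """Build BERT-style ids, attention mask and token types for a sentence pair."""
    ids_a = toy_tokenize(first, vocab)
    ids_b = toy_tokenize(second, vocab)

    # Layout: [CLS] first [SEP] second [SEP]
    input_ids = [CLS_ID] + ids_a + [SEP_ID] + ids_b + [SEP_ID]
    # Segment 0 covers [CLS], the first sentence and its [SEP]; segment 1 covers the rest.
    token_type_ids = [0] * (len(ids_a) + 2) + [1] * (len(ids_b) + 1)

    # Naive truncation; a real tokenizer would keep the final [SEP].
    input_ids = input_ids[:max_length]
    token_type_ids = token_type_ids[:max_length]
    attention_mask = [1] * len(input_ids)

    # Pad every sequence up to max_length.
    pad = max_length - len(input_ids)
    return {
        "input_ids": input_ids + [PAD_ID] * pad,
        "attention_mask": attention_mask + [0] * pad,
        "token_type_ids": token_type_ids + [0] * pad,
    }
```

The mask marks real tokens with 1 and padding with 0, while the token types tell the model which of the two sentences each position belongs to.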
- postprocess(model_outputs, **postprocess_params)[source]
Post-processes the model outputs.
- Args:
model_outputs (dict): The outputs of the forward process.
- Return:
The classification results.
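As a rough illustration of what a `top_k` post-processing step can look like, the sketch below converts raw scores into probabilities with softmax and keeps the highest-scoring labels, producing entries in the same `{'label': ..., 'score': ...}` shape as the example output above. The helper names and the label order are assumptions; the actual id-to-label mapping comes from the model configuration.

```python
import math
from typing import Dict, List, Sequence, Union

# Assumed label order for an MNLI-style classifier (illustrative only).
LABELS = ["entailment", "neutral", "contradiction"]


def softmax(logits: Sequence[float]) -> List[float]:
    """Numerically stable softmax over a list of raw scores."""
    peak = max(logits)
    exps = [math.exp(x - peak) for x in logits]
    total = sum(exps)
    return [e / total for e in exps]


def top_k_labels(logits: Sequence[float],
                 labels: Sequence[str] = LABELS,
                 top_k: int = 1) -> List[Dict[str, Union[str, float]]]:
    """Return the top_k labels with their probabilities, best first."""
    probs = softmax(logits)
    ranked = sorted(zip(labels, probs), key=lambda pair: pair[1], reverse=True)
    return [{"label": label, "score": score} for label, score in ranked[:top_k]]
```

With `top_k=1` each input yields a single-element list, which matches the nesting of the printed results in the class example.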
- preprocess(inputs, **preprocess_params)[source]
Preprocesses the text to be classified.
- Args:
inputs (str): The text to be classified.
max_length (int): The maximum length of the tokenizer's output.
padding (False / "max_length"): Whether to pad the output to max_length.
return_tensors ("ms"): The type of the returned tensors.
- Return:
The processed text.
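In the class example, each input string carries two sentences joined by a hyphen, while inputs_process takes the two sentences separately. The sketch below shows one way such a string could be split. It is an assumption drawn from the example data, not a description of the library's internals, and `split_pair` is a hypothetical helper.

```python
from typing import Tuple


def split_pair(text: str, sep: str = "-") -> Tuple[str, str]:
    """Split 'first<sep>second' at the first separator and trim whitespace."""
    first, found, second = text.partition(sep)
    if not found:
        # No separator means the input does not contain a sentence pair.
        raise ValueError(f"expected two sentences joined by {sep!r}")
    return first.strip(), second.strip()
```

Splitting at the first separator means a hyphenated word in the first sentence would be cut in the wrong place, so inputs in this format need to avoid hyphens before the intended boundary.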