mindformers.models.clip.CLIPConfig

class mindformers.models.clip.CLIPConfig(text_config: Optional[CLIPTextConfig] = None, vision_config: Optional[CLIPVisionConfig] = None, projection_dim: Optional[int] = 512, logit_scale_init_value: Optional[float] = 2.6592, checkpoint_name_or_path: Optional[str] = '', dtype: Optional[str] = 'float16', **kwargs)[source]

Configuration for the CLIP model.

Args:

text_config (Optional[CLIPTextConfig]): The config of the text transformer. Default: None.
vision_config (Optional[CLIPVisionConfig]): The config of the vision transformer. Default: None.
projection_dim (Optional[int]): The dimension of the projected features. Default: 512.
logit_scale_init_value (Optional[float]): The initial value of the logit_scale parameter. Default: 2.6592.
checkpoint_name_or_path (Optional[str]): The path of a checkpoint (.ckpt) file, or a supported model name listed by CLIPConfig.show_support_list(). Default: ''.
dtype (Optional[str]): The data type of tensors in the model, either "float16" or "float32". Default: "float16".
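The default logit_scale_init_value of 2.6592 looks arbitrary, but it agrees with the natural log of 1/0.07, the temperature initialization used in the original CLIP implementation. The sketch below checks that arithmetic with the standard library only. Whether MindFormers exponentiates the parameter before scaling the similarity logits is an assumption carried over from the original CLIP code, not something stated on this page.

```python
import math

# Assumption: 0.07 is the softmax temperature from the original CLIP
# implementation; this page only documents the default value 2.6592.
temperature = 0.07

# The logit scale is stored in log space, so the initial value is ln(1 / T).
expected_init = math.log(1.0 / temperature)

# Exponentiating the documented default recovers the multiplier that would
# be applied to the image-text cosine similarities (assumed behavior).
documented_default = 2.6592
logit_scale = math.exp(documented_default)

print(f"ln(1 / {temperature}) = {expected_init:.5f}")
print(f"exp({documented_default}) = {logit_scale:.3f}")
print(f"1 / {temperature} = {1.0 / temperature:.3f}")
```

A larger logit_scale_init_value therefore corresponds to a lower starting temperature, which sharpens the softmax over image-text similarities.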

Raises:

TypeError: If text_config is not a CLIPTextConfig, or if vision_config is not a CLIPVisionConfig.
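The type check described above can be sketched without installing MindFormers. TextConfig, VisionConfig, and check_sub_configs are hypothetical stand-ins written for this illustration; they are not part of the library. The fallback to a default sub-config when None is passed is also an assumption, inferred from the None defaults in the signature.

```python
from typing import Optional, Tuple


class TextConfig:
    """Hypothetical stand-in for CLIPTextConfig."""


class VisionConfig:
    """Hypothetical stand-in for CLIPVisionConfig."""


def check_sub_configs(
    text_config: Optional[object] = None,
    vision_config: Optional[object] = None,
) -> Tuple[TextConfig, VisionConfig]:
    """Validate the two sub-configs in the way the Raises section describes."""
    # Assumption: None falls back to a default-constructed sub-config.
    if text_config is None:
        text_config = TextConfig()
    elif not isinstance(text_config, TextConfig):
        raise TypeError(
            f"text_config should be a TextConfig, got {type(text_config).__name__}"
        )

    if vision_config is None:
        vision_config = VisionConfig()
    elif not isinstance(vision_config, VisionConfig):
        raise TypeError(
            f"vision_config should be a VisionConfig, got {type(vision_config).__name__}"
        )

    return text_config, vision_config


# Passing a plain dict instead of a config object triggers the TypeError.
try:
    check_sub_configs(text_config={"hidden_size": 512})
except TypeError as err:
    message = str(err)
    print(message)
```

Under this reading, a dict of settings has to be wrapped in the matching config class before it is handed to the parent config.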

Examples:

>>> from mindformers import CLIPConfig
>>> CLIPConfig.show_support_list()
    INFO - support list of CLIPConfig is:
    INFO -    ['clip_vit_b_32']
    INFO - -------------------------------------
>>> config = CLIPConfig.from_pretrained('clip_vit_b_32')
>>> config
    {'text_config': {'hidden_size': 512, 'vocab_size': 49408, 'max_position_embeddings': 77,
     'num_hidden_layers': 12}, 'vision_config': {'hidden_size': 768, 'image_size': 224,
      'patch_size': 32, 'num_hidden_layers': 12}, 'projection_dim': 512, 'ratio': 64,
       'checkpoint_name_or_path': 'clip_vit_b_32', 'dtype': 'float16'}
>>> config.save_pretrained(save_directory="./", save_name="clip_config")
    INFO - config saved successfully!