mindformers.models.clip.CLIPConfig

class mindformers.models.clip.CLIPConfig(text_config: Optional[mindformers.models.clip.clip_config.CLIPTextConfig] = None, vision_config: Optional[mindformers.models.clip.clip_config.CLIPVisionConfig] = None, projection_dim: Optional[int] = 512, logit_scale_init_value: Optional[float] = 2.6592, checkpoint_name_or_path: Optional[str] = '', dtype: Optional[str] = 'float16', **kwargs)

Configuration class for the CLIP model.

Parameters
  • text_config (Optional[CLIPTextConfig]) – The configuration of the text transformer.

  • vision_config (Optional[CLIPVisionConfig]) – The configuration of the vision transformer.

  • projection_dim (Optional[int]) – The dimension of the projected features.

  • logit_scale_init_value (Optional[float]) – The initial value of the logit_scale parameter.

  • checkpoint_name_or_path (Optional[str]) – The path to a checkpoint file (.ckpt), or a supported model name listed by CLIPConfig.show_support_list().

  • dtype (Optional[str]) – The data type of tensors in the model, either "float16" or "float32".
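The default logit_scale_init_value of 2.6592 corresponds to ln(1/0.07), the log of the inverse temperature used to initialize CLIP's contrastive loss. The sketch below only reproduces that arithmetic; the helper name logit_scale_from_temperature is hypothetical, and the assumption that the model exponentiates this parameter before scaling the similarity logits is not confirmed by this page.

```python
import math


def logit_scale_from_temperature(temperature: float) -> float:
    """Return the log of the inverse temperature (hypothetical helper)."""
    if temperature <= 0:
        raise ValueError("temperature must be positive")
    return math.log(1.0 / temperature)


# A temperature of 0.07 gives the documented default, up to rounding.
init_value = logit_scale_from_temperature(0.07)

# Assumption: the model applies exp() to the parameter to obtain the
# multiplier for the image-text cosine similarities.
scale = math.exp(init_value)

print(f"logit_scale_init_value ~ {init_value:.4f}")
print(f"similarity multiplier  ~ {scale:.2f}")
```

Storing the scale in log space keeps the effective multiplier positive however the parameter is updated during training.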

Raises

TypeError – If text_config is not a CLIPTextConfig or vision_config is not a CLIPVisionConfig.
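The type check described above can be illustrated without installing mindformers. This is a minimal sketch, not the library's implementation: TextConfigStub, VisionConfigStub, and validate_sub_configs are hypothetical stand-ins, and the fallback to a default sub-config when None is passed is an assumption.

```python
class TextConfigStub:
    """Hypothetical stand-in for CLIPTextConfig."""


class VisionConfigStub:
    """Hypothetical stand-in for CLIPVisionConfig."""


def validate_sub_configs(text_config=None, vision_config=None):
    """Check sub-config types the way the Raises entry describes."""
    # Assumption: None falls back to a default-constructed sub-config.
    if text_config is None:
        text_config = TextConfigStub()
    if vision_config is None:
        vision_config = VisionConfigStub()

    if not isinstance(text_config, TextConfigStub):
        raise TypeError(
            "text_config should be a TextConfigStub, "
            f"but got {type(text_config).__name__}"
        )
    if not isinstance(vision_config, VisionConfigStub):
        raise TypeError(
            "vision_config should be a VisionConfigStub, "
            f"but got {type(vision_config).__name__}"
        )
    return text_config, vision_config


# Passing a plain dict instead of a config object is rejected.
try:
    validate_sub_configs(text_config={"hidden_size": 512})
except TypeError as err:
    print(f"rejected: {err}")
```

In practice this means a dict of text or vision settings has to be wrapped in the matching config class before it is passed to CLIPConfig.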

Examples

>>> from mindformers import CLIPConfig
>>> CLIPConfig.show_support_list()
    INFO - support list of CLIPConfig is:
    INFO -    ['clip_vit_b_32']
    INFO - -------------------------------------
>>> config = CLIPConfig.from_pretrained('clip_vit_b_32')
>>> config
    {'text_config': {'hidden_size': 512, 'vocab_size': 49408, 'max_position_embeddings': 77,
     'num_hidden_layers': 12}, 'vision_config': {'hidden_size': 768, 'image_size': 224,
     'patch_size': 32, 'num_hidden_layers': 12}, 'projection_dim': 512, 'ratio': 64,
     'checkpoint_name_or_path': 'clip_vit_b_32', 'dtype': 'float16'}
>>> config.save_pretrained(save_directory="./", save_name="clip_config")
    INFO - config saved successfully!
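The vision_config values printed above (image_size 224, patch_size 32) fix how many patches the vision transformer sees per image. The following sketch works through that arithmetic on a plain dict; num_patches is a hypothetical helper, and the extra class token follows the standard ViT design rather than anything stated on this page.

```python
def num_patches(image_size: int, patch_size: int) -> int:
    """Count non-overlapping square patches in a square image."""
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    per_side = image_size // patch_size
    return per_side * per_side


# Values copied from the 'vision_config' entry of clip_vit_b_32.
vision_config = {"image_size": 224, "patch_size": 32}

patches = num_patches(vision_config["image_size"], vision_config["patch_size"])

# Assumption: one class token is prepended, as in the standard ViT.
sequence_length = patches + 1

print(patches, sequence_length)  # 49 50
```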