mindformers.models.clip.CLIPVisionConfig

class mindformers.models.clip.CLIPVisionConfig(hidden_size: Optional[int] = 768, intermediate_size: Optional[int] = 3072, num_hidden_layers: Optional[int] = 12, num_attention_heads: Optional[int] = 12, image_size: Optional[int] = 224, patch_size: Optional[int] = 32, hidden_act: Optional[str] = 'quick_gelu', dropout: Optional[float] = 0.0, attention_dropout: Optional[float] = 0.0, initializer_range: Optional[float] = 0.02, initializer_factor: Optional[float] = 1.0, **kwargs)

Configuration class for the CLIP vision module.

Parameters
  • hidden_size (Optional[int]) – Dimensionality of the encoder layers and the pooler layer.

  • intermediate_size (Optional[int]) – Dimensionality of the “intermediate” (i.e., feed-forward) layer in the Transformer encoder.

  • num_hidden_layers (Optional[int]) – Number of hidden layers in the Transformer encoder.

  • num_attention_heads (Optional[int]) – Number of attention heads for each attention layer in the Transformer encoder.

  • image_size (Optional[int]) – The size (resolution) of each image.

  • patch_size (Optional[int]) – The size (resolution) of each patch.

  • hidden_act (Optional[str]) – Name of the non-linear activation function used in the encoder and pooler. Only “quick_gelu” is currently supported.

  • dropout (Optional[float]) – The dropout probability for all fully connected layers in the embeddings, encoder, and pooler.

  • attention_dropout (Optional[float]) – The dropout ratio for the attention probabilities.

  • initializer_range (Optional[float]) – The standard deviation of the truncated_normal_initializer for initializing all weight matrices.

  • initializer_factor (Optional[float]) – A factor for initializing all weight matrices (should be kept to 1, used internally for initialization testing).
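
As a rough illustration of how these parameters interact, the sketch below derives the patch grid, token sequence length, and per-head dimension from the defaults, and evaluates the quick_gelu activation. It is standalone Python and does not call mindformers. The helper names vit_shape_summary and quick_gelu are hypothetical, and the divisibility checks and the extra class token are assumptions based on the standard CLIP vision transformer layout, not behavior confirmed for this class.

```python
import math


def vit_shape_summary(hidden_size=768, num_attention_heads=12,
                      image_size=224, patch_size=32):
    """Derive basic shapes from CLIPVisionConfig-style parameters.

    Hypothetical helper: assumes square images, non-overlapping patches,
    and one class token prepended to the patch sequence.
    """
    if image_size % patch_size != 0:
        raise ValueError("image_size must be divisible by patch_size")
    if hidden_size % num_attention_heads != 0:
        raise ValueError("hidden_size must be divisible by num_attention_heads")
    patches_per_side = image_size // patch_size
    num_patches = patches_per_side ** 2
    return {
        "patches_per_side": patches_per_side,
        "num_patches": num_patches,
        "sequence_length": num_patches + 1,  # patches plus the class token
        "head_dim": hidden_size // num_attention_heads,
    }


def quick_gelu(x):
    """QuickGELU approximation: x * sigmoid(1.702 * x), for a scalar float."""
    return x / (1.0 + math.exp(-1.702 * x))


summary = vit_shape_summary()
print(summary)
print(quick_gelu(1.0))
```

With the default values, a 224-pixel image and 32-pixel patches give a 7 x 7 grid of 49 patches, and a hidden size of 768 split over 12 heads gives 64 dimensions per head. Under the same assumptions, a hidden size of 512 with 12 heads would be rejected by the divisibility check, so num_attention_heads would need to be changed alongside hidden_size.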

Examples

>>> from mindformers import CLIPVisionConfig
>>> CLIPVisionConfig(hidden_size=512, image_size=256)
{'hidden_size': 512, 'intermediate_size': 3072, 'num_hidden_layers': 12,
 'num_attention_heads': 12, 'image_size': 256, 'patch_size': 32,
 'hidden_act': 'quick_gelu', 'dropout': 0.0, 'attention_dropout': 0.0,
 'initializer_range': 0.02, 'initializer_factor': 1.0}