MCPcopy
hub / github.com/deepspeedai/DeepSpeedExamples / BertConfig

Class BertConfig

BingBertSquad/turing/nvidia_modeling.py:199–285  ·  view source on GitHub ↗

Configuration class to store the configuration of a `BertModel`.

Source from the content-addressed store, hash-verified

197
198
199class BertConfig(object):
200 """Configuration class to store the configuration of a `BertModel`.
201 """
202 def __init__(self,
203 vocab_size_or_config_json_file,
204 hidden_size=768,
205 num_hidden_layers=12,
206 num_attention_heads=12,
207 intermediate_size=3072,
208 hidden_act="gelu",
209 hidden_dropout_prob=0.1,
210 attention_probs_dropout_prob=0.1,
211 max_position_embeddings=512,
212 type_vocab_size=2,
213 initializer_range=0.02):
214 """Constructs BertConfig.
215
216 Args:
217 vocab_size_or_config_json_file: Vocabulary size of `inputs_ids` in `BertModel`.
218 hidden_size: Size of the encoder layers and the pooler layer.
219 num_hidden_layers: Number of hidden layers in the Transformer encoder.
220 num_attention_heads: Number of attention heads for each attention layer in
221 the Transformer encoder.
222 intermediate_size: The size of the "intermediate" (i.e., feed-forward)
223 layer in the Transformer encoder.
224 hidden_act: The non-linear activation function (function or string) in the
225 encoder and pooler. If string, "gelu", "relu" and "swish" are supported.
226 hidden_dropout_prob: The dropout probabilitiy for all fully connected
227 layers in the embeddings, encoder, and pooler.
228 attention_probs_dropout_prob: The dropout ratio for the attention
229 probabilities.
230 max_position_embeddings: The maximum sequence length that this model might
231 ever be used with. Typically set this to something large just in case
232 (e.g., 512 or 1024 or 2048).
233 type_vocab_size: The vocabulary size of the `token_type_ids` passed into
234 `BertModel`.
235 initializer_range: The sttdev of the truncated_normal_initializer for
236 initializing all weight matrices.
237 """
238 if isinstance(vocab_size_or_config_json_file, str) or (sys.version_info[0] == 2
239 and isinstance(vocab_size_or_config_json_file, unicode)):
240 with open(vocab_size_or_config_json_file, "r", encoding='utf-8') as reader:
241 json_config = json.loads(reader.read())
242 for key, value in json_config.items():
243 self.__dict__[key] = value
244 elif isinstance(vocab_size_or_config_json_file, int):
245 self.vocab_size = vocab_size_or_config_json_file
246 self.hidden_size = hidden_size
247 self.num_hidden_layers = num_hidden_layers
248 self.num_attention_heads = num_attention_heads
249 self.hidden_act = hidden_act
250 self.intermediate_size = intermediate_size
251 self.hidden_dropout_prob = hidden_dropout_prob
252 self.attention_probs_dropout_prob = attention_probs_dropout_prob
253 self.max_position_embeddings = max_position_embeddings
254 self.type_vocab_size = type_vocab_size
255 self.initializer_range = initializer_range
256 else:

Callers 3

mainFunction · 0.90
mainFunction · 0.90
from_dictMethod · 0.70

Calls

no outgoing calls

Tested by

no test coverage detected