hub / github.com/deepspeedai/DeepSpeedExamples / prepare_optimizer_parameters

Function prepare_optimizer_parameters

bing_bert/deepspeed_train.py:329–347
(args, model)

Source from the content-addressed store, hash-verified

    return args

def prepare_optimizer_parameters(args, model):
    config = args.config

    param_optimizer = list(model.network.named_parameters())
    param_optimizer = [n for n in param_optimizer if 'pooler' not in n[0]]
    no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
    if "weight_decay" in config["training"].keys():
        weight_decay = config["training"]["weight_decay"]
    else:
        weight_decay = 0.01

    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(
            nd in n for nd in no_decay)], 'weight_decay': weight_decay},
        {'params': [p for n, p in param_optimizer if any(
            nd in n for nd in no_decay)], 'weight_decay': 0.0}
    ]

    return optimizer_grouped_parameters

def prepare_model_optimizer(args):
    # Loading Model
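
The function sorts parameters into two groups by substring match on their names. The sketch below replays that matching on a handful of names so the resulting split is easy to see. The parameter names and the helper `split_by_decay` are hypothetical and chosen only for illustration; they are not taken from the repository.

```python
# Hypothetical BERT-style parameter names (illustration only).
names = [
    "bert.embeddings.word_embeddings.weight",
    "bert.embeddings.LayerNorm.weight",
    "bert.embeddings.LayerNorm.bias",
    "bert.encoder.layer.0.attention.self.query.weight",
    "bert.encoder.layer.0.attention.self.query.bias",
    "bert.pooler.dense.weight",
]

# Same substrings as the no_decay list in prepare_optimizer_parameters.
no_decay = ["bias", "LayerNorm.bias", "LayerNorm.weight"]


def split_by_decay(names, no_decay):
    """Hypothetical helper: apply the same name filters to bare names."""
    # Drop anything whose name contains 'pooler', as the function does.
    kept = [n for n in names if "pooler" not in n]
    # Names matching no substring in no_decay receive weight decay.
    decay = [n for n in kept if not any(nd in n for nd in no_decay)]
    # Names matching at least one substring are exempt from weight decay.
    exempt = [n for n in kept if any(nd in n for nd in no_decay)]
    return decay, exempt


decay_names, no_decay_names = split_by_decay(names, no_decay)
print("decay:   ", decay_names)
print("no decay:", no_decay_names)
```

Because the match is a plain substring test, every name containing `bias` lands in the second group, so the `LayerNorm.bias` entry is already covered by `bias`.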

Callers 1

prepare_model_optimizer · Function · 0.85

Calls

no outgoing calls

Tested by

no test coverage detected
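
A minimal sketch of what a test for this function might look like, under stated assumptions: `StubNetwork` and `make_args` are hypothetical stand-ins for the real model and argument objects, plain `object()` instances take the place of tensors, and the function body is restated from the listing so the sketch runs on its own without PyTorch or DeepSpeed.

```python
from types import SimpleNamespace


def prepare_optimizer_parameters(args, model):
    # Restated from the listing so this sketch is self-contained.
    config = args.config

    param_optimizer = list(model.network.named_parameters())
    param_optimizer = [n for n in param_optimizer if 'pooler' not in n[0]]
    no_decay = ['bias', 'LayerNorm.bias', 'LayerNorm.weight']
    if "weight_decay" in config["training"].keys():
        weight_decay = config["training"]["weight_decay"]
    else:
        weight_decay = 0.01

    optimizer_grouped_parameters = [
        {'params': [p for n, p in param_optimizer if not any(
            nd in n for nd in no_decay)], 'weight_decay': weight_decay},
        {'params': [p for n, p in param_optimizer if any(
            nd in n for nd in no_decay)], 'weight_decay': 0.0}
    ]

    return optimizer_grouped_parameters


class StubNetwork:
    """Hypothetical stand-in exposing only named_parameters()."""

    def __init__(self, names):
        # Placeholder objects instead of real tensors.
        self._params = [(name, object()) for name in names]

    def named_parameters(self):
        return iter(self._params)


def make_args(training_cfg):
    """Hypothetical helper: build an args object with a config dict."""
    return SimpleNamespace(config={"training": training_cfg})


model = SimpleNamespace(network=StubNetwork([
    "encoder.dense.weight",
    "encoder.dense.bias",
    "encoder.LayerNorm.weight",
    "pooler.dense.weight",
]))

# No "weight_decay" key: the function falls back to 0.01.
default_groups = prepare_optimizer_parameters(make_args({}), model)
# Explicit value in the config overrides the fallback.
custom_groups = prepare_optimizer_parameters(
    make_args({"weight_decay": 0.1}), model)

assert default_groups[0]["weight_decay"] == 0.01
assert custom_groups[0]["weight_decay"] == 0.1
assert default_groups[1]["weight_decay"] == 0.0
```

The stub keeps the check focused on the grouping and the config fallback; a test against the real model would need the repository's own fixtures.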