hub / github.com/jingyaogong/minimind / init_model

Function init_model

trainer/trainer_utils.py:119–131
(lm_config, from_weight='pretrain', tokenizer_path='../model', save_dir='../out', device='cuda')

Source from the content-addressed store, hash-verified

def init_model(lm_config, from_weight='pretrain', tokenizer_path='../model', save_dir='../out', device='cuda'):
    tokenizer = AutoTokenizer.from_pretrained(tokenizer_path)
    model = MiniMindForCausalLM(lm_config)

    if from_weight != 'none':
        moe_suffix = '_moe' if lm_config.use_moe else ''
        weight_path = f'{save_dir}/{from_weight}_{lm_config.hidden_size}{moe_suffix}.pth'
        weights = torch.load(weight_path, map_location=device)
        model.load_state_dict(weights, strict=False)

    get_model_params(model, lm_config)
    Logger(f'Trainable Params: {sum(p.numel() for p in model.parameters() if p.requires_grad) / 1e6:.3f}M')
    return model.to(device), tokenizer
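
The checkpoint filename is derived from `from_weight`, `lm_config.hidden_size`, and `lm_config.use_moe`. The sketch below isolates that naming rule so it can be checked without loading a model. `resolve_weight_path` is a hypothetical helper, not part of the repository, and the `hidden_size` values and the `'full_sft'` prefix are illustrative assumptions. `SimpleNamespace` stands in for the real config object.

```python
from types import SimpleNamespace


def resolve_weight_path(lm_config, from_weight='pretrain', save_dir='../out'):
    """Hypothetical helper mirroring the path logic inside init_model."""
    # init_model skips weight loading entirely when from_weight is 'none'.
    if from_weight == 'none':
        return None
    # MoE checkpoints carry an extra '_moe' suffix before the extension.
    moe_suffix = '_moe' if lm_config.use_moe else ''
    return f'{save_dir}/{from_weight}_{lm_config.hidden_size}{moe_suffix}.pth'


# Stand-in configs; only the two attributes read by the path logic are set.
dense_cfg = SimpleNamespace(hidden_size=512, use_moe=False)
moe_cfg = SimpleNamespace(hidden_size=640, use_moe=True)

print(resolve_weight_path(dense_cfg))                        # ../out/pretrain_512.pth
print(resolve_weight_path(moe_cfg, from_weight='full_sft'))  # ../out/full_sft_640_moe.pth
print(resolve_weight_path(dense_cfg, from_weight='none'))    # None
```

Because the weights are loaded with `strict=False`, keys that are missing from or extra in the checkpoint do not raise an error, so a wrong but existing file can load without complaint. Checking the resolved path before training is a cheap safeguard.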

Callers 8

train_lora.py · File · 0.90
train_ppo.py · File · 0.90
train_agent.py · File · 0.90
train_pretrain.py · File · 0.90
train_full_sft.py · File · 0.90
train_grpo.py · File · 0.90
train_dpo.py · File · 0.90

Calls 3

MiniMindForCausalLM · Class · 0.90
get_model_params · Function · 0.85
Logger · Function · 0.85

Tested by

no test coverage detected