Loads weights from `oai` into `our` via in-place copy. `oai` is a Hugging Face GPT-2 model, while `our` is one of our models. `dst2src=True` reverses the direction and loads parameters from our model into Hugging Face's. Note: `dst2src=True` is still untested.
move_weights(our, oai, dst2src=False)
def move_weights(our, oai, dst2src=False):
    """
    Loads weights from `oai` into `our` via in-place copy.
    `oai` is a Hugging Face GPT-2 model, while `our` is one of our models.
    dst2src=True reverses the direction, loading parameters from our model
    into Hugging Face's.
    NOTE: dst2src=True is still untested.
    """
    # while isinstance(our, (torchDDP, model.distributed.DistributedDataParallel, FP16_Module)):
    #     our = our.module
    transformer_model = oai.transformer
    # Final layer norm, token embeddings, and position embeddings.
    load_weights(transformer_model.ln_f, our.transformer.final_layernorm, dst2src)
    load_weights(transformer_model.wte, our.word_embeddings, dst2src)
    load_weights(transformer_model.wpe, our.position_embeddings, dst2src)

    # Transformer blocks, paired in order.
    for our_layer, oai_layer in zip(our.transformer.layers, oai.transformer.h):
        load_transformer_layer(our_layer, oai_layer, dst2src)


def debug_finetune_data(local_vars, batch_id, tokenizer):
nothing calls this directly
no test coverage detected
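The listing does not show the body of `load_weights`, so the following is only a minimal sketch of what an in-place copy with a `dst2src` direction flag might look like in PyTorch. The name `copy_params_sketch` and the two `nn.LayerNorm` stand-ins are hypothetical and not part of the repository. The sketch assumes both modules expose a `weight` (and optionally a `bias`) of matching shape.

```python
import torch
from torch import nn


def copy_params_sketch(src: nn.Module, dst: nn.Module, dst2src: bool = False) -> None:
    """Copy weight (and bias, if both sides have one) in place between modules.

    Hypothetical stand-in for `load_weights`; the real helper may differ.
    """
    # With dst2src=True the roles flip: values flow from `dst` into `src`.
    source, target = (dst, src) if dst2src else (src, dst)
    with torch.no_grad():
        # copy_ overwrites the existing tensor storage instead of rebinding
        # the attribute, so optimizers holding the parameter still see it.
        target.weight.copy_(source.weight)
        source_bias = getattr(source, "bias", None)
        target_bias = getattr(target, "bias", None)
        if source_bias is not None and target_bias is not None:
            target_bias.copy_(source_bias)


# Stand-ins for a Hugging Face submodule and the matching module in our model.
hf_like = nn.LayerNorm(8)
ours = nn.LayerNorm(8)
with torch.no_grad():
    hf_like.weight.fill_(2.0)

weight_before = ours.weight          # keep a reference to check identity later
copy_params_sketch(hf_like, ours)    # default direction: hf_like -> ours
```

Because the copy is in place, `ours.weight` remains the same `Parameter` object after the call and only its values change. Passing `dst2src=True` with the same argument order would instead write the values of `ours` into `hf_like`.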