hub / github.com/karpathy/nanochat / load_hf_model

Function load_hf_model

scripts/base_eval.py:67–77

Load a HuggingFace model and tokenizer.

(hf_path: str, device)

Source from the content-addressed store, hash-verified

def load_hf_model(hf_path: str, device):
    """Load a HuggingFace model and tokenizer."""
    print0(f"Loading HuggingFace model from: {hf_path}")
    # Imported inside the function, so transformers is only needed when this path runs
    from transformers import AutoModelForCausalLM
    model = AutoModelForCausalLM.from_pretrained(hf_path)
    model.to(device)
    model.eval()  # inference mode: disables dropout
    # Paths containing "gpt2" get a 1024-token limit (GPT-2's context length); all others get None
    max_seq_len = 1024 if "gpt2" in hf_path else None
    model = ModelWrapper(model, max_seq_len=max_seq_len)
    tokenizer = HuggingFaceTokenizer.from_pretrained(hf_path)
    return model, tokenizer
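
The sequence-length rule in load_hf_model is a plain substring test on hf_path, so it is case-sensitive and matches any path containing "gpt2". A minimal sketch of that rule in isolation follows. The helper name infer_max_seq_len and the constant GPT2_CONTEXT_LEN are hypothetical and do not exist in the repository.

```python
from typing import Optional

# Hypothetical constant: the value hard-coded in load_hf_model for GPT-2 paths.
GPT2_CONTEXT_LEN = 1024


def infer_max_seq_len(hf_path: str) -> Optional[int]:
    """Hypothetical helper mirroring the one-line heuristic in load_hf_model."""
    # Case-sensitive substring match on the local path or hub id.
    return GPT2_CONTEXT_LEN if "gpt2" in hf_path else None


if __name__ == "__main__":
    for path in ["gpt2", "openai-community/gpt2-xl", "GPT2-local", "some-org/other-model"]:
        print(path, "->", infer_max_seq_len(path))
```

Under this rule a path such as "GPT2-local" gets no limit, because the match does not lowercase its input. Whether that case matters depends on how paths are supplied to the script.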

Callers 1

main · Function · 0.85

Calls 3

print0 · Function · 0.90
ModelWrapper · Class · 0.85
from_pretrained · Method · 0.45

Tested by

no test coverage detected