hub / github.com/lm-sys/FastChat / load_model

Method load_model

fastchat/model/model_adapter.py:613–660

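Loads the base model, then applies the PEFT adapter weights found at model_path, and returns a (model, tokenizer) pair. When peft_share_base_weights is set, the first PEFT model loaded for a given base is cached, and later adapters are added to that cached model instead of reloading the base weights.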

(self, model_path: str, from_pretrained_kwargs: dict)

Source

        return "peft" in model_path.lower()

    def load_model(self, model_path: str, from_pretrained_kwargs: dict):
        """Loads the base model then the (peft) adapter weights"""
        from peft import PeftConfig, PeftModel

        config = PeftConfig.from_pretrained(model_path)
        base_model_path = config.base_model_name_or_path
        if "peft" in base_model_path:
            raise ValueError(
                f"PeftModelAdapter cannot load a base model with 'peft' in the name: {config.base_model_name_or_path}"
            )

        # Basic proof of concept for loading peft adapters that share the base
        # weights. This is pretty messy because Peft re-writes the underlying
        # base model and internally stores a map of adapter layers.
        # So, to make this work we:
        #  1. Cache the first peft model loaded for a given base model.
        #  2. Call `load_model` for any follow-on Peft models.
        #  3. Make sure we load the adapters by the model_path. Why? This is
        #     what's accessible during inference time.
        #  4. In get_generate_stream_function, make sure we load the right
        #     adapter before doing inference. This *should* be safe when calls
        #     are blocked by the same semaphore.
        if peft_share_base_weights:
            if base_model_path in peft_model_cache:
                model, tokenizer = peft_model_cache[base_model_path]
                # Super important: make sure we use model_path as the
                # `adapter_name`.
                model.load_adapter(model_path, adapter_name=model_path)
            else:
                base_adapter = get_model_adapter(base_model_path)
                base_model, tokenizer = base_adapter.load_model(
                    base_model_path, from_pretrained_kwargs
                )
                # Super important: make sure we use model_path as the
                # `adapter_name`.
                model = PeftModel.from_pretrained(
                    base_model, model_path, adapter_name=model_path
                )
                peft_model_cache[base_model_path] = (model, tokenizer)
            return model, tokenizer

        # In the normal case, load up the base model weights again.
        base_adapter = get_model_adapter(base_model_path)
        base_model, tokenizer = base_adapter.load_model(
            base_model_path, from_pretrained_kwargs
        )
        model = PeftModel.from_pretrained(base_model, model_path)
        return model, tokenizer

    def get_default_conv_template(self, model_path: str) -> Conversation:
        """Uses the conv template of the base model"""

Callers

nothing calls this directly

Calls 2

get_model_adapter · Function · 0.85
load_model · Method · 0.45

Tested by

no test coverage detected
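
Since no tests are listed, one small piece that could be checked in isolation is the name guard at the top of load_model. The sketch below copies the two string checks visible in the source into hypothetical free functions (`is_peft_path`, `check_base_model_path`, and `guard_rejects` are illustrative names, not FastChat APIs) and exercises them with plain assertions. It also shows that, as written, the guard is case-sensitive while the check on the line before load_model lowercases the path first.

```python
def is_peft_path(model_path: str) -> bool:
    # Mirrors the lowercased check shown on the line before load_model.
    return "peft" in model_path.lower()


def check_base_model_path(base_model_path: str) -> None:
    # Mirrors the guard at the top of load_model (case-sensitive as written).
    if "peft" in base_model_path:
        raise ValueError(
            f"PeftModelAdapter cannot load a base model with 'peft' in the name: {base_model_path}"
        )


def guard_rejects(base_model_path: str) -> bool:
    """Return True when the guard raises ValueError for this path."""
    try:
        check_base_model_path(base_model_path)
    except ValueError:
        return True
    return False


# A base path containing lowercase 'peft' is rejected; an ordinary one is not.
assert guard_rejects("org/llama-peft")
assert not guard_rejects("org/llama-7b")

# A mixed-case name passes the guard but still satisfies the lowercased check.
assert not guard_rejects("org/llama-PEFT")
assert is_peft_path("org/llama-PEFT")
```

Whether the mixed-case behavior matters in practice depends on how base model paths are named; the sketch only records what the two checks do.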