@dataclass
class ExportArguments(MergeArguments, BaseArguments):
    """ExportArguments is a dataclass that inherits from MergeArguments and BaseArguments.

    Args:
        output_dir (Optional[str]): Directory to save the exported results. Defaults to None, which automatically sets
            a path with an appropriate suffix.
        quant_method (Optional[str]): The quantization method. Can be 'awq', 'gptq', 'bnb', 'fp8', or 'gptq_v2'.
            Defaults to None. See examples for more details.
        quant_n_samples (int): Number of samples for GPTQ/AWQ calibration. Defaults to 256.
        quant_batch_size (int): The batch size for quantization. Defaults to 1.
        group_size (int): The group size for quantization. Defaults to 128.
        to_cached_dataset (bool): Whether to tokenize and export the dataset in advance as a cached dataset. Defaults
            to False. Note: You can specify the validation set content through
            `--split_dataset_ratio` or `--val_dataset`.
        template_mode (str): The template mode used when building the cached dataset. Can be 'train', 'rlhf', or
            'kto'. Defaults to 'train'.
        to_ollama (bool): Whether to generate the `Modelfile` required by Ollama. Defaults to False.
        to_mcore (bool): Whether to convert Hugging Face format weights to Megatron-Core format. Defaults to False.
        to_hf (bool): Whether to convert Megatron-Core format weights to Hugging Face format. Defaults to False.
        mcore_model (Optional[str]): The path to the Megatron-Core format model. Defaults to None.
        mcore_adapter (Optional[str]): The path to the adapter for the Megatron-Core format model. Defaults to None.
        thread_count (Optional[int]): The number of model shards when `to_mcore` is True. Defaults to None, which
            automatically sets the number based on the model size to keep the largest shard under 10GB.
        test_convert_precision (bool): Whether to test the precision error of weight conversion between Hugging Face
            and Megatron-Core formats. Defaults to False.
        test_convert_dtype (str): The dtype to use for the conversion precision test. Defaults to 'float32'.
        push_to_hub (bool): Whether to push the output to the Model Hub. Defaults to False. See examples for more
            details.
        hub_model_id (Optional[str]): The model ID for pushing to the Hub (e.g., 'user_name/repo_name' or 'repo_name').
            Defaults to None.
        hub_private_repo (bool): Whether the Hub repository is private. Defaults to False.
        commit_message (str): The commit message for pushing to the Hub. Defaults to 'update files'.
        to_peft_format (bool): Whether to export in PEFT format. This argument is for compatibility and currently has
            no effect. Defaults to False.
        exist_ok (bool): If True, do not raise an exception when output_dir already exists; its contents are
            overwritten. Defaults to False.
    """
    output_dir: Optional[str] = None

    # awq/gptq
    quant_method: Literal['awq', 'gptq', 'bnb', 'fp8', 'gptq_v2'] = None
    quant_n_samples: int = 256
    quant_batch_size: int = 1
    group_size: int = 128

    # cached_dataset
    to_cached_dataset: bool = False
    template_mode: Literal['train', 'rlhf', 'kto'] = 'train'

    # ollama
    to_ollama: bool = False

    # megatron
    to_mcore: bool = False
    to_hf: bool = False
    mcore_model: Optional[str] = None
    mcore_adapter: Optional[str] = None
    thread_count: Optional[int] = None
    test_convert_precision: bool = False
    test_convert_dtype: str = 'float32'
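The `output_dir` default is documented as "automatically sets a path with an appropriate suffix". The sketch below illustrates one plausible reading of that behavior without depending on the library. It is not the project's implementation: `ExportConfigSketch`, `resolve_output_dir`, the `model_dir` field, and every suffix string are hypothetical and chosen only for illustration. Only the remaining field names and their defaults are taken from the class above.

```python
from dataclasses import dataclass
from typing import Literal, Optional


@dataclass
class ExportConfigSketch:
    """Hypothetical stand-in that mirrors a few ExportArguments fields and defaults."""
    model_dir: str  # hypothetical field; the real class gets model info from its base classes
    output_dir: Optional[str] = None
    quant_method: Optional[Literal['awq', 'gptq', 'bnb', 'fp8', 'gptq_v2']] = None
    to_cached_dataset: bool = False
    to_ollama: bool = False
    to_mcore: bool = False
    to_hf: bool = False


def resolve_output_dir(cfg: ExportConfigSketch) -> str:
    """Return cfg.output_dir if set, else derive a path from the export mode.

    The suffix scheme here is an assumption made for this example.
    """
    if cfg.output_dir is not None:
        return cfg.output_dir
    if cfg.quant_method is not None:
        suffix = cfg.quant_method
    elif cfg.to_cached_dataset:
        suffix = 'cached_dataset'
    elif cfg.to_ollama:
        suffix = 'ollama'
    elif cfg.to_mcore:
        suffix = 'mcore'
    elif cfg.to_hf:
        suffix = 'hf'
    else:
        suffix = 'export'
    # Strip a trailing slash so the suffix attaches to the directory name itself.
    return f'{cfg.model_dir.rstrip("/")}-{suffix}'


print(resolve_output_dir(ExportConfigSketch('/models/demo', quant_method='awq')))
print(resolve_output_dir(ExportConfigSketch('/models/demo', output_dir='/tmp/out')))
```

An explicitly supplied `output_dir` always wins in this sketch; the mode-dependent suffix is only a fallback, which matches the "Defaults to None" wording in the docstring.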
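The `thread_count` default is described as choosing a shard count from the model size so that the largest shard stays under 10GB. A rough sketch of that arithmetic, assuming roughly equal-sized shards and a model size already known in gigabytes, is a ceiling division. `default_shard_count` and its parameters are hypothetical names, and the library's actual heuristic may differ, for example by enforcing a different minimum.

```python
import math


def default_shard_count(model_size_gb: float, max_shard_gb: float = 10.0) -> int:
    """Smallest number of equal-sized shards with each shard no larger than max_shard_gb.

    Illustrative only; this is an assumed heuristic, not the library's code.
    """
    if model_size_gb <= 0 or max_shard_gb <= 0:
        raise ValueError('sizes must be positive')
    return max(1, math.ceil(model_size_gb / max_shard_gb))


for size_gb in (7.5, 28.0, 140.0):
    print(size_gb, default_shard_count(size_gb))
```

Passing an explicit `thread_count` would bypass a heuristic like this entirely, which is why the field is typed as `Optional[int]` with `None` meaning "choose automatically".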