Code
Hub
Workspaces
Connect
Indexed graphs
Engine
MCP
copy
hub
/
github.com/ymcui/Chinese-LLaMA-Alpaca-2
/ inference_hf.py
File
inference_hf.py
scripts/inference/inference_hf.py:None–None ·
view source on GitHub ↗
Source
from the content-addressed store, hash-verified
1
import
argparse
2
import
json, os
3
4
DEFAULT_SYSTEM_PROMPT =
""
"You are a helpful assistant. 你是一个乐于助人的助手。"
""
Callers
nothing calls this directly
Calls
9
replace_llama_attn_with_flash_attn
Function · 0.90
apply_attention_patch
Function · 0.90
apply_ntk_scaling_patch
Function · 0.90
speculative_sample
Function · 0.90
generate_prompt
Function · 0.70
from_pretrained
Method · 0.45
eval
Method · 0.45
generate
Method · 0.45
save_pretrained
Method · 0.45
Tested by
no test coverage detected