hub / github.com/kvcache-ai/ktransformers / test_config

Function test_config

kt-kernel/python/cli/utils/tuna_engine.py:87–281

Test if a configuration with the given num_gpu_experts works.

Args:
    num_gpu_experts: Number of GPU experts to test
    model_path: Path to the model
    config: Configuration dict with all parameters
    verbose: Whether to show detailed logs

Returns:
    (success: bool, elapsed_time: float)

(
    num_gpu_experts: int,
    model_path: Path,
    config: dict,
    verbose: bool = False,
)
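
The parameters above feed the command that launches the test server. The following is a minimal sketch of that step, mirroring only the flags and defaults visible in the source listing. The helper name `build_launch_command`, the explicit `port` argument, and the example values are hypothetical and are not part of the original module.

```python
import sys
from pathlib import Path


def build_launch_command(
    num_gpu_experts: int,
    model_path: Path,
    config: dict,
    port: int,
) -> list[str]:
    """Hypothetical helper: assemble the sglang launch command as a list of strings."""
    cmd = [
        sys.executable, "-m", "sglang.launch_server",
        "--model", str(model_path),
        "--port", str(port),
        "--host", "127.0.0.1",
        "--tensor-parallel-size", str(config["tensor_parallel_size"]),
        "--kt-num-gpu-experts", str(num_gpu_experts),
        "--max-total-tokens", str(config["max_total_tokens"]),
    ]
    # Fall back to the model directory when no separate weights path is configured.
    weights = config.get("weights_path") or model_path
    cmd.extend(["--kt-weight-path", str(weights)])
    # Defaults below are the ones shown in the source listing.
    cmd.extend([
        "--kt-cpuinfer", str(config.get("cpu_threads", 64)),
        "--kt-threadpool-count", str(config.get("numa_nodes", 2)),
        "--kt-method", config.get("kt_method", "AMXINT4"),
    ])
    return cmd


# Example values are made up for illustration.
example_cmd = build_launch_command(
    num_gpu_experts=8,
    model_path=Path("/models/example"),
    config={"tensor_parallel_size": 1, "max_total_tokens": 4096},
    port=31000,
)
```

Note that `tensor_parallel_size` and `max_total_tokens` are read with `config[...]`, so a config dict missing either key raises `KeyError`, while the kt-kernel options fall back to defaults.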

Source from the content-addressed store, hash-verified

def test_config(
    num_gpu_experts: int,
    model_path: Path,
    config: dict,
    verbose: bool = False,
) -> tuple[bool, float]:
    """
    Test if a configuration with given num_gpu_experts works.

    Args:
        num_gpu_experts: Number of GPU experts to test
        model_path: Path to the model
        config: Configuration dict with all parameters
        verbose: Whether to show detailed logs

    Returns:
        (success: bool, elapsed_time: float)
        - success: True if server starts and inference works
        - elapsed_time: Time taken for the test
    """
    start_time = time.time()

    # Use random port to avoid conflicts
    test_port = random.randint(30000, 40000)

    # Build command
    cmd = [
        sys.executable,
        "-m",
        "sglang.launch_server",
        "--model",
        str(model_path),
        "--port",
        str(test_port),
        "--host",
        "127.0.0.1",
        "--tensor-parallel-size",
        str(config["tensor_parallel_size"]),
        "--kt-num-gpu-experts",
        str(num_gpu_experts),
        "--max-total-tokens",
        str(config["max_total_tokens"]),
    ]

    # Add kt-kernel options
    if config.get("weights_path"):
        cmd.extend(["--kt-weight-path", str(config["weights_path"])])
    else:
        cmd.extend(["--kt-weight-path", str(model_path)])

    cmd.extend(
        [
            "--kt-cpuinfer",
            str(config.get("cpu_threads", 64)),
            "--kt-threadpool-count",
            str(config.get("numa_nodes", 2)),
            "--kt-method",
            config.get("kt_method", "AMXINT4"),
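
The listing picks its port with `random.randint(30000, 40000)`, which makes collisions unlikely but does not check that the port is actually free. As an illustration only, and not what the source does, a caller could probe candidate ports by binding to them first. The helper name `pick_free_port` and its `attempts` parameter are hypothetical.

```python
import random
import socket


def pick_free_port(low: int = 30000, high: int = 40000, attempts: int = 20) -> int:
    """Hypothetical helper: return a port in [low, high] that could be bound just now."""
    for _ in range(attempts):
        port = random.randint(low, high)
        with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
            try:
                # Binding succeeds only if nothing else holds this port on loopback.
                sock.bind(("127.0.0.1", port))
            except OSError:
                continue  # Port is in use; try another candidate.
            return port
    raise RuntimeError(f"no free port found in {low}-{high} after {attempts} attempts")


port = pick_free_port()
```

This check is still racy: another process can claim the port between the probe and the server start, so it reduces conflicts rather than eliminating them.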

Callers 2

find_max_gpu_experts · Function · 0.85
run_tuna · Function · 0.85
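
The source of these callers is not shown here, so how they use `test_config` is unknown. One plausible pattern for a caller such as `find_max_gpu_experts` is a binary search over `num_gpu_experts`, treating the `success` flag as a pass/fail probe. The sketch below assumes that if a given expert count works, every smaller count works too. The names `largest_passing` and `fake_probe` are hypothetical, and `fake_probe` stands in for a real call that would start a server.

```python
from typing import Callable, Optional


def largest_passing(probe: Callable[[int], bool], low: int, high: int) -> Optional[int]:
    """Return the largest n in [low, high] for which probe(n) is True, or None.

    Assumes monotonicity: if n passes, every smaller n passes as well.
    """
    best = None
    while low <= high:
        mid = (low + high) // 2
        if probe(mid):
            best = mid       # mid works; look for something larger
            low = mid + 1
        else:
            high = mid - 1   # mid fails; only smaller values can work
    return best


def fake_probe(n: int) -> bool:
    """Stand-in for a real test run: pretend anything above 37 runs out of memory."""
    return n <= 37


result = largest_passing(fake_probe, 0, 64)
```

With a real probe, each call launches a server, so the logarithmic number of probes matters more than it would for a cheap predicate.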

Calls 6

print_error · Function · 0.90
print_warning · Function · 0.90
detect_oom · Function · 0.85
test_inference · Function · 0.85
get · Method · 0.45
print · Method · 0.45

Tested by

no test coverage detected