hub / github.com/NVIDIA/TensorRT-LLM / apply_linear

Method apply_linear

tensorrt_llm/_torch/modules/linear.py:2266–2276
(self,
                     input,
                     bias,
                     lora_params: Optional[dict] | None = None,
                     layer_idx: Optional[int] | None = None)

Source from the content-addressed store, hash-verified

2264 )
2265
2266     def apply_linear(self,
2267                      input,
2268                      bias,
2269                      lora_params: Optional[dict] | None = None,
2270                      layer_idx: Optional[int] | None = None):
2271         output = self.quant_method.apply(self, input, bias)
2272         if self.lora is not None and bool(lora_params):
2273             lora_result = self.lora(input, lora_params, layer_idx)
2274             if lora_result is not None:
2275                 output = output + lora_result
2276         return output
2277
2278     def apply_linear_allreduce(self,
2279                                input,
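
The control flow of the listing can be exercised without TensorRT-LLM installed. The sketch below is illustrative only: ToyQuantMethod, ToyLinear, and toy_lora are hypothetical stand-ins invented for this example (plain Python floats instead of tensors), and only the body of apply_linear mirrors the source above. It shows the three cases in which the LoRA term is skipped and the one case in which it is added.

```python
from typing import Optional


class ToyQuantMethod:
    """Hypothetical stand-in for self.quant_method: dot product plus optional bias."""

    def apply(self, module, input, bias):
        out = sum(w * x for w, x in zip(module.weight, input))
        return out if bias is None else out + bias


class ToyLinear:
    """Illustrative module; only apply_linear follows the listed source."""

    def __init__(self, weight, lora=None):
        self.weight = weight
        self.quant_method = ToyQuantMethod()
        self.lora = lora  # a callable, or None when no adapter is attached

    def apply_linear(self,
                     input,
                     bias,
                     lora_params: Optional[dict] = None,
                     layer_idx: Optional[int] = None):
        output = self.quant_method.apply(self, input, bias)
        if self.lora is not None and bool(lora_params):
            lora_result = self.lora(input, lora_params, layer_idx)
            if lora_result is not None:
                output = output + lora_result
        return output


def toy_lora(input, lora_params, layer_idx):
    # Hypothetical adapter: returns None when it has no entry for this layer.
    scale = lora_params.get(layer_idx)
    return None if scale is None else scale * sum(input)


layer = ToyLinear(weight=[1.0, 2.0], lora=toy_lora)
x = [3.0, 4.0]

# lora_params omitted: only the base path runs.
base = layer.apply_linear(x, 0.5)
# An empty dict is falsy, so bool(lora_params) skips the adapter as well.
empty = layer.apply_linear(x, 0.5, lora_params={}, layer_idx=0)
# The adapter has an entry for layer 0, so its result is added to the base output.
hit = layer.apply_linear(x, 0.5, lora_params={0: 0.5}, layer_idx=0)
# The adapter returns None for layer 1, so the base output is returned unchanged.
miss = layer.apply_linear(x, 0.5, lora_params={0: 0.5}, layer_idx=1)

print(base, empty, hit, miss)
```

Two details of the listed method carry over directly: bool(lora_params) treats None and an empty dict the same way, and output + lora_result builds a new value rather than modifying the base output in place.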

Callers 1

forward · Method · 0.95

Calls 1

apply · Method · 0.45

Tested by

no test coverage detected