github.com/NVIDIA/TensorRT-LLM

Method apply_linear

tensorrt_llm/_torch/modules/linear.py:2266–2276

(self,
 input,
 bias,
 lora_params: Optional[dict] | None = None,
 layer_idx: Optional[int] | None = None)

Source from the content-addressed store, hash-verified

    def apply_linear(self,
                     input,
                     bias,
                     lora_params: Optional[dict] | None = None,
                     layer_idx: Optional[int] | None = None):
        output = self.quant_method.apply(self, input, bias)
        if self.lora is not None and bool(lora_params):
            lora_result = self.lora(input, lora_params, layer_idx)
            if lora_result is not None:
                output = output + lora_result
        return output
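
The control flow of apply_linear can be illustrated with a minimal, self-contained sketch. Everything here is a stand-in chosen for illustration, not TensorRT-LLM's actual API: `SketchLinear`, `quant_apply`, `base`, and `lora_delta` are hypothetical names, tensors are replaced by plain floats, and the stand-in `quant_apply` takes only `(input, bias)` whereas the real `quant_method.apply` also receives the module itself. The sketch shows the two guards: the LoRA path is skipped when `lora_params` is `None` or empty, and the base output is returned unchanged when the LoRA callable returns `None`.

```python
from typing import Callable, Optional


class SketchLinear:
    """Hypothetical stand-in mirroring the control flow of apply_linear.

    Not TensorRT-LLM's Linear class: tensors are plain floats here, and
    quant_apply / lora are simple callables chosen for illustration.
    """

    def __init__(self,
                 quant_apply: Callable[[float, float], float],
                 lora: Optional[Callable[..., Optional[float]]] = None):
        self.quant_apply = quant_apply
        self.lora = lora

    def apply_linear(self, input, bias, lora_params=None, layer_idx=None):
        # Base projection, delegated to the stand-in quantization method.
        output = self.quant_apply(input, bias)
        # The LoRA path runs only if a LoRA callable exists AND lora_params
        # is truthy; None and {} both skip it because bool({}) is False.
        if self.lora is not None and bool(lora_params):
            lora_result = self.lora(input, lora_params, layer_idx)
            # The LoRA callable may return None; then output is unchanged.
            if lora_result is not None:
                output = output + lora_result
        return output


def base(x, b):
    # Assumed base projection for the sketch: y = 2x + b.
    return 2.0 * x + b


def lora_delta(x, params, layer_idx):
    # Assumed behavior: a delta exists only for layers listed in params.
    scale = params.get(layer_idx)
    return None if scale is None else scale * x


layer = SketchLinear(base, lora_delta)
no_lora = layer.apply_linear(3.0, 1.0)
empty_params = layer.apply_linear(3.0, 1.0, lora_params={}, layer_idx=0)
with_lora = layer.apply_linear(3.0, 1.0, lora_params={0: 0.5}, layer_idx=0)
other_layer = layer.apply_linear(3.0, 1.0, lora_params={0: 0.5}, layer_idx=1)
```

In this sketch only `with_lora` differs from the base result, because it is the only call where `lora_params` is non-empty and the LoRA callable returns a value for the requested layer.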

Callers 1

forward · Method · 0.95

Calls 1

apply · Method · 0.45

Tested by

no test coverage detected