hub / github.com/deepseek-ai/DeepSeek-V3 / linear

Function linear

inference/model.py:129–161

Applies a linear transformation to the incoming data: y = xA^T + b. The function selects a specialized implementation based on quantization and tensor format.
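As a rough illustration of the formula y = xA^T + b, the sketch below computes it on nested Python lists. `linear_ref` is a hypothetical helper, not part of the repository; it only shows the shape convention, in which each row of `weight` produces one output feature.

```python
from typing import List, Optional

Matrix = List[List[float]]


def linear_ref(x: Matrix, weight: Matrix, bias: Optional[List[float]] = None) -> Matrix:
    """Reference y = x A^T + b on nested lists (hypothetical helper, illustration only).

    x has shape (batch, in_features); weight has shape (out_features, in_features).
    """
    out = []
    for row in x:
        # Each output feature is the dot product of the input row with one weight row.
        y = [sum(xi * wi for xi, wi in zip(row, w_row)) for w_row in weight]
        if bias is not None:
            y = [yi + bi for yi, bi in zip(y, bias)]
        out.append(y)
    return out


x = [[1.0, 2.0]]                               # one sample, two input features
weight = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]  # three output features
bias = [0.5, 0.0, -1.0]
result = linear_ref(x, weight, bias)
```

With these inputs the output has one row and three columns, one per row of `weight`.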

(x: torch.Tensor, weight: torch.Tensor, bias: Optional[torch.Tensor] = None)

Source from the content-addressed store, hash-verified

def linear(x: torch.Tensor, weight: torch.Tensor, bias: Optional[torch.Tensor] = None) -> torch.Tensor:
    """
    Applies a linear transformation to the incoming data: y = xA^T + b.
    This function supports specialized implementations based on quantization
    and tensor formats.

    Args:
        x (torch.Tensor): The input tensor.
        weight (torch.Tensor): The weight tensor. It may be quantized and
            requires dequantization for certain cases.
        bias (Optional[torch.Tensor]): The bias tensor to be added. Default is None.

    Returns:
        torch.Tensor: The result of the linear transformation, which may involve
            quantization-aware computations depending on the input parameters.

    Notes:
        - If `weight` is quantized (e.g., `element_size() == 1`), a dequantized version
          is used for computation.
        - If `gemm_impl == "bf16"`, dequantization and a `bf16` GEMM operation are applied.
        - For other cases, the function applies quantization to `x` and uses `fp8_gemm` for computation.
    """
    if weight.element_size() > 1:
        return F.linear(x, weight, bias)
    elif gemm_impl == "bf16":
        weight = weight_dequant(weight, weight.scale)
        return F.linear(x, weight, bias)
    else:
        x, scale = act_quant(x, block_size)
        y = fp8_gemm(x, scale, weight, weight.scale)
        if bias is not None:
            y += bias
        return y
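Reading the body, the branch order is: weights wider than one byte per element go straight to `F.linear`; one-byte weights are dequantized first when `gemm_impl == "bf16"`; everything else quantizes `x` and calls `fp8_gemm`, adding the bias afterwards. `gemm_impl` and `block_size` are not parameters of `linear`, so they are presumably module-level settings. The sketch below mirrors only that decision logic; `select_path` and its return labels are hypothetical and do not exist in the repository.

```python
def select_path(weight_element_size: int, gemm_impl: str) -> str:
    """Mirror the branch order of `linear` (hypothetical helper, labels are made up)."""
    if weight_element_size > 1:
        # Multi-byte weight dtype (e.g. bf16 is 2 bytes): plain F.linear.
        return "plain"
    elif gemm_impl == "bf16":
        # 1-byte (quantized) weight, GEMM requested in bf16: dequantize, then F.linear.
        return "dequant_then_plain"
    else:
        # 1-byte weight and any other gemm_impl: quantize x and call fp8_gemm.
        return "fp8_gemm"


paths = {
    "bf16 weight": select_path(2, "fp8"),
    "fp8 weight, bf16 gemm": select_path(1, "bf16"),
    "fp8 weight, fp8 gemm": select_path(1, "fp8"),
}
```

Note that `gemm_impl` is only consulted when the weight is stored in a one-byte dtype; for wider weights the first branch returns regardless of its value.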

Callers 4

forward · Method · 0.85
forward · Method · 0.85
forward · Method · 0.85
forward · Method · 0.85

Calls 3

weight_dequant · Function · 0.90
act_quant · Function · 0.90
fp8_gemm · Function · 0.90
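The source shows only that `act_quant(x, block_size)` returns a quantized tensor together with a scale; how that scale is computed is not visible on this page. One common scheme, assumed here purely for illustration, is per-block absolute-maximum scaling, where each block of `block_size` values is divided by a scale chosen so that its largest magnitude maps to the format's maximum (448 for float8 e4m3fn). The sketch below uses plain Python floats and skips the actual 8-bit rounding; `blockwise_scale` and `blockwise_unscale` are hypothetical names, not the repository's kernels.

```python
from typing import List, Tuple

FP8_E4M3_MAX = 448.0  # largest finite value of the float8 e4m3fn format


def blockwise_scale(x: List[float], block_size: int) -> Tuple[List[float], List[float]]:
    """Scale each block so its largest magnitude maps to FP8_E4M3_MAX.

    Hypothetical stand-in for a block-wise activation quantizer; it keeps
    Python floats and does not round to 8 bits.
    """
    if len(x) % block_size != 0:
        raise ValueError("len(x) must be a multiple of block_size")
    scaled: List[float] = []
    scales: List[float] = []
    for start in range(0, len(x), block_size):
        block = x[start:start + block_size]
        amax = max(abs(v) for v in block)
        # Guard against all-zero blocks so the division below is safe.
        scale = amax / FP8_E4M3_MAX if amax > 0 else 1.0
        scaled.extend(v / scale for v in block)
        scales.append(scale)
    return scaled, scales


def blockwise_unscale(scaled: List[float], scales: List[float], block_size: int) -> List[float]:
    """Undo blockwise_scale by multiplying each value by its block's scale."""
    return [v * scales[i // block_size] for i, v in enumerate(scaled)]


x = [0.5, -2.0, 0.0, 1.0, 100.0, -50.0, 25.0, 0.0]
q, s = blockwise_scale(x, block_size=4)
restored = blockwise_unscale(q, s, block_size=4)
```

Keeping one scale per block rather than one per tensor means a single large value, such as the 100.0 above, only reduces the resolution of its own block.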

Tested by

no test coverage detected