hub / github.com/NVIDIA/TensorRT-LLM / swiglu

Function swiglu

tensorrt_llm/functional.py:841–857

Add a SwiGLU (`x * SiLU(gate)`) operation. This function takes a tensor, splits it into two halves along the last dimension, applies SiLU to the second half (the gate), and multiplies the result element-wise by the first half. The behavior is undefined if the size of the last dimension is odd.

(input: Tensor)

Source from the content-addressed store, hash-verified

def swiglu(input: Tensor) -> Tensor:
    '''
    Add a SwiGLU (`x * SiLU(gate)`) operation.

    That function takes a tensor, splits it into two halves along the last
    dimension, applies SiLU to the second half and multiply the results. The
    behavior is undefined if the last dimension is not even.

    Parameters:
        input : Tensor
            The input tensor on which the activation function is applied.

    Returns:
        The tensor produced by the activation layer.
    '''
    x, gate = chunk(input, 2, dim=-1)
    return silu(gate) * x
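
To make the split-and-gate step concrete, here is a minimal pure-Python sketch of the same arithmetic on a single 1-D row. It is not part of TensorRT-LLM: the names `silu` and `swiglu_reference` below are hypothetical stand-ins that work on plain lists of floats, whereas the real function builds a network operation on `Tensor` objects. The sketch also assumes that `chunk(input, 2, dim=-1)` yields two equal halves in order, as the docstring describes, and it raises on an odd length where the real function leaves the behavior undefined.

```python
import math


def silu(v: float) -> float:
    """SiLU (swish) of a scalar: v * sigmoid(v), written to avoid overflow."""
    if v >= 0.0:
        return v / (1.0 + math.exp(-v))
    e = math.exp(v)  # v < 0, so exp(v) stays in (0, 1)
    return v * e / (1.0 + e)


def swiglu_reference(row: list[float]) -> list[float]:
    """Hypothetical list-based stand-in for swiglu on one row."""
    if len(row) % 2 != 0:
        # Assumption: the sketch rejects odd sizes instead of leaving
        # the result undefined.
        raise ValueError("last dimension must have an even size")
    half = len(row) // 2
    x, gate = row[:half], row[half:]  # first half is x, second half is gate
    return [silu(g) * xi for xi, g in zip(x, gate)]


if __name__ == "__main__":
    # Row of size 4: x = [2.0, 3.0], gate = [1.0, -1.0]
    print(swiglu_reference([2.0, 3.0, 1.0, -1.0]))
```

The output row has half the size of the input row, because one half of the input is consumed as the gate.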

Callers

nothing calls this directly

Calls 2

chunk · Function · 0.85

silu · Function · 0.85

Tested by

no test coverage detected