Implementation of the GELU activation function, computed exactly with the error function (erf). For reference, OpenAI GPT's gelu uses a tanh approximation instead, which gives slightly different results: 0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
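To make the "slightly different results" remark concrete, the following is a minimal sketch that compares the two formulas on plain Python floats. It uses only the standard-library `math` module rather than `torch`, and the names `gelu_erf` and `gelu_tanh` are illustrative assumptions, not identifiers from this file.

```python
import math


def gelu_erf(x: float) -> float:
    """Exact GELU: x * Phi(x), where Phi is the standard normal CDF."""
    return x * 0.5 * (1.0 + math.erf(x / math.sqrt(2.0)))


def gelu_tanh(x: float) -> float:
    """Tanh-based formula quoted in the docstring (the OpenAI GPT variant)."""
    inner = math.sqrt(2.0 / math.pi) * (x + 0.044715 * x ** 3)
    return 0.5 * x * (1.0 + math.tanh(inner))


# Largest absolute gap between the two formulas on a grid over [-5, 5].
xs = [i / 100.0 for i in range(-500, 501)]
max_gap = max(abs(gelu_erf(x) - gelu_tanh(x)) for x in xs)
print(f"max |erf - tanh| on [-5, 5]: {max_gap:.2e}")
```

The two functions agree at zero and stay close elsewhere, so the choice mostly matters when outputs must match another implementation's numerics.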
WEIGHTS_NAME = 'pytorch_model.bin'

def gelu(x):
    """Implementation of the gelu activation function.
    For information: OpenAI GPT's gelu is slightly different (and gives slightly different results):
    0.5 * x * (1 + torch.tanh(math.sqrt(2 / math.pi) * (x + 0.044715 * torch.pow(x, 3))))
    """
    # Remember the input dtype so the result can be cast back to it.
    pdtype = x.dtype
    # Compute in float32, e.g. when the input is half precision.
    x = x.float()
    y = x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))
    return y.to(pdtype)


def swish(x):