GELU (Gaussian Error Linear Units)
GELU(x) = x ∗ Φ(x)
where Φ(x) is the cumulative distribution function of the standard Gaussian distribution.
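A minimal sketch of the exact form, assuming Φ(x) is expressed via the error function as Φ(x) = 0.5 ∗ (1 + erf(x/√2)); the helper name `gelu_exact` is illustrative, and the result is checked against `torch.nn.GELU`:

```python
import math
import torch
import torch.nn as nn

def gelu_exact(x: torch.Tensor) -> torch.Tensor:
    # Phi(x) for the standard normal, written via the error function:
    # Phi(x) = 0.5 * (1 + erf(x / sqrt(2)))
    return x * 0.5 * (1.0 + torch.erf(x / math.sqrt(2.0)))

x = torch.randn(4)
print(torch.allclose(gelu_exact(x), nn.GELU()(x)))  # True
```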
When the `approximate` argument is 'tanh', GELU is estimated with:
GELU(x) = 0.5 ∗ x ∗ (1 + Tanh(√(2/π) ∗ (x + 0.044715 ∗ x³)))
- Input: (∗), where ∗ means any number of dimensions.
- Output: (∗), same shape as the input.
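A minimal sketch of the tanh approximation above, checked against `torch.nn.GELU(approximate='tanh')` (available in recent PyTorch releases); the helper name `gelu_tanh` is illustrative:

```python
import math
import torch
import torch.nn as nn

def gelu_tanh(x: torch.Tensor) -> torch.Tensor:
    # 0.5 * x * (1 + tanh(sqrt(2/pi) * (x + 0.044715 * x^3)))
    return 0.5 * x * (1.0 + torch.tanh(math.sqrt(2.0 / math.pi) * (x + 0.044715 * x.pow(3))))

x = torch.randn(2, 3)  # any input shape works; the output has the same shape
print(torch.allclose(gelu_tanh(x), nn.GELU(approximate='tanh')(x)))  # True
```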

Reference
[1] https://pytorch.org/docs/stable/generated/torch.nn.GELU.html#torch.nn.GELU