What is an Activation Function?

TL;DR

A function that introduces non-linearity into neural networks. ReLU, Sigmoid, and GELU are common examples.

Activation Function: Definition & Explanation

An activation function is a non-linear function applied to each neuron's output in a neural network. Without activation functions, even a multi-layer network reduces to a single linear transformation, unable to learn complex patterns. Key activation functions include:

- ReLU (Rectified Linear Unit): the most widely used today; outputs zero for negative inputs and the identity otherwise
- Sigmoid: outputs between 0 and 1, often used to represent probabilities
- Tanh: outputs between -1 and 1
- GELU: used in Transformers such as GPT and BERT
- Swish/SiLU: proposed by Google researchers as an improvement on ReLU

Choosing the right activation function significantly impacts training efficiency and model performance, and plays a crucial role in avoiding the vanishing gradient problem.
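The functions listed above are simple elementwise operations. As a rough sketch (using NumPy; the function names here are illustrative, not tied to any framework's API), they can be written as:

```python
import numpy as np

def relu(x):
    # Rectified Linear Unit: zero for negative inputs, identity otherwise.
    return np.maximum(0.0, x)

def sigmoid(x):
    # Squashes inputs into the (0, 1) range.
    return 1.0 / (1.0 + np.exp(-x))

def tanh(x):
    # Squashes inputs into the (-1, 1) range.
    return np.tanh(x)

def gelu(x):
    # Gaussian Error Linear Unit, tanh approximation (as used in GPT/BERT).
    return 0.5 * x * (1.0 + np.tanh(np.sqrt(2.0 / np.pi) * (x + 0.044715 * x**3)))

x = np.array([-2.0, 0.0, 2.0])
print(relu(x))     # negative inputs are zeroed out
print(sigmoid(x))  # all outputs lie strictly between 0 and 1
```

Each function maps an array of pre-activations to an array of the same shape; in a network, one of these is applied after each layer's linear transformation to introduce the non-linearity.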
