What is Gradient Descent?

TL;DR

The fundamental optimization algorithm for training AI model parameters. The core of the deep learning training process.

Gradient Descent: Definition & Explanation

Gradient descent is an optimization algorithm that iteratively updates a model's parameters to minimize a loss function (a measure of the error between predictions and ground truth). At each step it computes the gradient (slope) of the loss with respect to the parameters and adjusts them by a small amount in the direction opposite the gradient, i.e. the direction of steepest descent of the loss. Major variants include stochastic gradient descent (SGD), mini-batch gradient descent, and adaptive learning-rate optimizers such as AdaGrad, RMSProp, and Adam; Adam has become the default optimizer for many deep learning models. The learning rate is critical: too large and training diverges, too small and convergence is slow. Virtually all neural network training, including that of LLMs, is based on gradient descent.
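The update rule above can be sketched in a few lines of plain Python. This is a minimal illustration, not production code: it fits a one-parameter linear model to toy data by minimizing mean squared error, with an assumed learning rate and step count chosen for this example.

```python
# Minimal gradient descent sketch: fit y = w * x to toy data by
# minimizing mean squared error. Data, learning rate, and step count
# are illustrative assumptions.

def gradient_descent(xs, ys, lr=0.1, steps=100):
    w = 0.0  # initial parameter guess
    n = len(xs)
    for _ in range(steps):
        # Gradient of the loss L(w) = (1/n) * sum((w*x - y)^2) w.r.t. w
        grad = (2.0 / n) * sum(x * (w * x - y) for x, y in zip(xs, ys))
        # Update step: move opposite the gradient, scaled by the learning rate
        w -= lr * grad
    return w

# Toy data generated from y = 3x, so w should converge toward 3.
xs = [1.0, 2.0, 3.0, 4.0]
ys = [3.0, 6.0, 9.0, 12.0]
w = gradient_descent(xs, ys)
```

With a learning rate of 0.1 the parameter converges quickly here; raising it past a problem-dependent threshold would make the updates overshoot and diverge, which is the failure mode described above.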
