by Michele Laurelli
A gradient descent variant that updates weights using the gradient of a single randomly chosen training example at a time.
SGD is faster per update than full-batch gradient descent and enables online learning, but its single-example gradient estimates are noisy. That noise can help the optimizer escape shallow local minima. Mini-batch SGD, which averages gradients over small batches of examples, balances computational efficiency against gradient quality.
Online learning
Large-scale training
Escaping local minima
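To make the update rule concrete, here is a minimal sketch in Python with NumPy (an assumed choice, since the entry names no language): a toy linear model trained first one example at a time, then with mini-batches. The synthetic data, learning rate, batch size, and epoch counts are illustrative assumptions, not part of the original entry.

```python
# Minimal SGD sketch on a toy linear-regression problem.
# All data and hyperparameters here are illustrative assumptions.
import numpy as np

rng = np.random.default_rng(0)

# Synthetic data: y = 3x + 2 plus a little noise.
X = rng.normal(size=(200, 1))
y = 3.0 * X[:, 0] + 2.0 + 0.1 * rng.normal(size=200)

lr = 0.05  # learning rate (step size)

# --- Pure SGD: update on a single random example at a time ---
w, b = 0.0, 0.0
for epoch in range(20):
    for i in rng.permutation(len(X)):      # shuffle, then visit one example
        err = w * X[i, 0] + b - y[i]       # gradient of 0.5 * err**2
        w -= lr * err * X[i, 0]            # noisy single-example step
        b -= lr * err

# --- Mini-batch SGD: average the gradient over a small batch ---
w_mb, b_mb, batch_size = 0.0, 0.0, 16
for epoch in range(20):
    idx = rng.permutation(len(X))
    for start in range(0, len(X), batch_size):
        batch = idx[start:start + batch_size]
        err = w_mb * X[batch, 0] + b_mb - y[batch]
        w_mb -= lr * np.mean(err * X[batch, 0])  # averaged, less noisy step
        b_mb -= lr * np.mean(err)

print(f"pure SGD:       w={w:.2f}, b={b:.2f}  (target: 3, 2)")
print(f"mini-batch SGD: w={w_mb:.2f}, b={b_mb:.2f}")
```

Both loops do the same total work per epoch; the mini-batch version simply trades some of the single-example noise for more stable steps, which is the efficiency/quality balance the entry describes.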