by Michele Laurelli
Direction and magnitude of steepest increase in loss function with respect to parameters.
Gradients indicate how to adjust weights to reduce loss. Computed via backpropagation. Move in opposite direction (gradient descent).
Weight gradient: -0.5
Gradient flow through network
Vanishing gradients