Optimization

Topics

Notes

Linked

Stochastic Gradient Descent

Full-batch gradient descent computes the gradient over the entire dataset for every parameter update, which is too slow for large datasets. Stochastic gradient descent instead updates parameters using gradients estimated from small random mini-batches, trading noisier steps for far more frequent updates.
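The idea can be sketched as a minimal NumPy loop on a synthetic linear-regression problem (the data, learning rate, and batch size here are illustrative assumptions, not prescriptions):

```python
import numpy as np

# Synthetic linear-regression data (illustrative setup).
rng = np.random.default_rng(0)
X = rng.normal(size=(1000, 3))            # 1000 samples, 3 features
true_w = np.array([2.0, -1.0, 0.5])
y = X @ true_w + 0.01 * rng.normal(size=1000)

w = np.zeros(3)                            # parameters to learn
lr, batch_size = 0.1, 32                   # assumed hyperparameters

for epoch in range(20):
    perm = rng.permutation(len(X))         # reshuffle each epoch
    for start in range(0, len(X), batch_size):
        idx = perm[start:start + batch_size]
        Xb, yb = X[idx], y[idx]
        # Mini-batch gradient of mean-squared error.
        grad = 2 * Xb.T @ (Xb @ w - yb) / len(idx)
        w -= lr * grad                     # parameter update
```

Each epoch makes ~30 updates instead of one, which is why SGD makes progress long before a full-batch method would finish a single pass.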