SGD with Momentum
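Vanilla gradient descent can oscillate or make slow progress in narrow valleys. Momentum accelerates convergence by accumulating a velocity vector (written here in the form consistent with the example tests below):
$$v_t = \beta\, v_{t-1} + \eta\, \nabla L(w_{t-1}), \qquad w_t = w_{t-1} - v_t$$
where:
$v$ is the velocity (velocity), $\beta$ is the momentum coefficient (momentum), $\eta$ is the learning rate (learning_rate), and $\nabla L(w)$ is the gradient of the loss with respect to the weights $w$ (gradient).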
Think of it as a ball rolling downhill: it builds speed in consistent directions and dampens oscillations.
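The rolling-ball picture can be checked numerically. The sketch below is illustrative only: it assumes the scalar update v_new = momentum * v + learning_rate * g (the form implied by the example tests), and the helper names momentum_step and run are hypothetical, not part of the exercise.

```python
def momentum_step(velocity, gradient, learning_rate=0.1, momentum=0.9):
    """One scalar velocity update (assumed form): v_new = momentum * v + learning_rate * g."""
    return momentum * velocity + learning_rate * gradient


def run(gradients, learning_rate=0.1, momentum=0.9):
    """Feed a sequence of scalar gradients through the update, starting from rest."""
    velocity = 0.0
    for g in gradients:
        velocity = momentum_step(velocity, g, learning_rate, momentum)
    return velocity


# Consistent direction: the gradient is always +1, so the velocity keeps
# growing toward learning_rate / (1 - momentum) = 1.0.
consistent = run([1.0] * 50)

# Oscillating direction: the gradient flips sign every step, so successive
# contributions largely cancel and the velocity stays small.
oscillating = run([1.0, -1.0] * 25)

print(f"consistent:  {consistent:.3f}")
print(f"oscillating: {oscillating:.3f}")
```

With these settings the consistent case ends up close to ten times the size of a single plain gradient step (learning_rate * g = 0.1), while the oscillating case stays below that single-step size in magnitude.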
Your task:
Implement sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum). Return (weights_new, velocity_new) as a tuple.
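One possible solution is sketched below. It assumes the update order velocity_new = momentum * velocity + learning_rate * gradient followed by weights_new = weights - velocity_new, which is the form that reproduces the expected values in the example tests. It also assumes the inputs are plain Python lists of equal length; a NumPy-based version would work the same way.

```python
def sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum):
    """Return (weights_new, velocity_new) after one SGD-with-momentum step.

    Assumed update rule (inferred from the example tests):
        velocity_new = momentum * velocity + learning_rate * gradient
        weights_new  = weights - velocity_new
    """
    # Blend the previous velocity with the scaled current gradient.
    velocity_new = [
        momentum * v + learning_rate * g
        for v, g in zip(velocity, gradient)
    ]
    # Step the weights against the accumulated velocity.
    weights_new = [w - v for w, v in zip(weights, velocity_new)]
    return weights_new, velocity_new


# Cold start from the example tests: zero initial velocity.
weights_new, velocity_new = sgd_momentum_update(
    weights=[1, 2], velocity=[0, 0], gradient=[1, 1],
    learning_rate=0.1, momentum=0.9,
)
print(weights_new, velocity_new)
```

The function builds new lists rather than mutating its arguments, so the caller keeps the previous weights and velocity and can compare them with the returned tuple.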
Example Tests
Updated weights (accessor [0]): cold start
Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}
Expected: [0.9,1.9]
Updated velocity (accessor [1]): cold start
Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}
Expected: [0.1,0.1]
Updated weights (accessor [0]): warm start, prior velocity carries forward
Input: {"weights":[0,0],"gradient":[1,0],"momentum":0.9,"velocity":[0.5,0.5],"learning_rate":0.1}
Expected: [-0.55,-0.45]