Hard

SGD with Momentum

~20 min

code completion

Vanilla gradient descent can oscillate or move slowly in narrow valleys. Momentum accelerates convergence by accumulating a velocity vector:

v \leftarrow β v + α \nabla_{w} L

w \leftarrow w - v

where:

β (momentum): how much prior velocity to retain (typically 0.9)

α: learning rate

\nabla_{w} L: current gradient

Think of it as a ball rolling downhill: it builds speed in consistent directions and dampens oscillations.
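
To make the narrow-valley claim concrete, here is a minimal sketch that runs both update rules on a toy quadratic loss, L(w) = 0.5 (w_0^2 + 100 w_1^2), which is steep in one direction and shallow in the other. The loss, the starting point, the step count, and the helper names (grad, loss, run) are illustrative assumptions and not part of the problem statement.

```python
def grad(w):
    # Gradient of the toy loss L(w) = 0.5 * (w[0]**2 + 100 * w[1]**2).
    return [w[0], 100.0 * w[1]]


def loss(w):
    # Toy loss: shallow along w[0], steep along w[1].
    return 0.5 * (w[0] ** 2 + 100.0 * w[1] ** 2)


def run(momentum, learning_rate=0.01, steps=100):
    # Hypothetical driver: applies v <- beta*v + alpha*grad, then w <- w - v.
    # Setting momentum=0.0 reduces this to vanilla gradient descent.
    w = [1.0, 1.0]
    v = [0.0, 0.0]
    for _ in range(steps):
        g = grad(w)
        v = [momentum * vi + learning_rate * gi for vi, gi in zip(v, g)]
        w = [wi - vi for wi, vi in zip(w, v)]
    return w


w_vanilla = run(momentum=0.0)
w_momentum = run(momentum=0.9)
print("vanilla  loss:", loss(w_vanilla))
print("momentum loss:", loss(w_momentum))
```

In this setup the steep direction limits how large the learning rate can be, so vanilla descent creeps along the shallow direction, while the momentum run should end with a noticeably lower loss after the same number of steps.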

Your task:

Implement sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum). Return (weights_new, velocity_new) as a tuple.
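
One possible sketch of the function follows. It assumes the inputs are flat lists of numbers, as in the example tests, and returns new lists rather than mutating its arguments; the actual grader may expect a different container type (for example arrays), so treat this as an illustration of the two update equations rather than a reference solution.

```python
def sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum):
    # v <- beta * v + alpha * grad, computed element by element.
    velocity_new = [
        momentum * v + learning_rate * g for v, g in zip(velocity, gradient)
    ]
    # w <- w - v, using the freshly updated velocity.
    weights_new = [w - v for w, v in zip(weights, velocity_new)]
    return weights_new, velocity_new


# Cold start: zero prior velocity, so the first step is plain gradient descent.
weights_new, velocity_new = sgd_momentum_update(
    weights=[1, 2],
    velocity=[0, 0],
    gradient=[1, 1],
    learning_rate=0.1,
    momentum=0.9,
)
print(weights_new, velocity_new)
```

Note that the weight update uses the new velocity, not the old one; swapping that order would change every result after the first step.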

Example Tests

Updated weights (accessor [0]): cold start

Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}

Expected: [0.9,1.9]

Updated velocity (accessor [1]): cold start

Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}

Expected: [0.1,0.1]

Updated weights (accessor [0]): warm start, prior velocity carries forward

Input: {"weights":[0,0],"gradient":[1,0],"momentum":0.9,"velocity":[0.5,0.5],"learning_rate":0.1}

Expected: [-0.55,-0.45]
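
The warm-start expectation can be checked by hand by substituting the inputs into the two update equations. The result shows that the expected value is the updated weights, and that the second component moves even though its gradient is zero, purely because of the retained velocity:

```latex
v \leftarrow 0.9 \cdot [0.5,\ 0.5] + 0.1 \cdot [1,\ 0] = [0.45 + 0.1,\ 0.45 + 0] = [0.55,\ 0.45]

w \leftarrow [0,\ 0] - [0.55,\ 0.45] = [-0.55,\ -0.45]
```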
