SGD with Momentum

Hard
~20 min
code completion

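Vanilla gradient descent can oscillate across a narrow valley while making slow progress along it. Momentum speeds up convergence by accumulating a velocity vector and stepping along it:

$$v_t = \beta\, v_{t-1} + \eta\, g_t$$
$$w_t = w_{t-1} - v_t$$

where:

  • $\beta$ (momentum): how much prior velocity to retain (typically 0.9)
  • $\eta$: learning rate
  • $g_t$: current gradient

The velocity is updated first, and the updated velocity is then subtracted from the weights. Think of it as a ball rolling downhill: it builds speed in consistent directions and dampens oscillations.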
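The ball analogy can be checked numerically. The sketch below is illustrative only: it assumes the velocity recurrence velocity = momentum * velocity + learning_rate * gradient (the form the example tests on this page are consistent with), and the helper name velocity_after is hypothetical. It compares a gradient that always points the same way with one that flips sign every step.

```python
def velocity_after(gradients, learning_rate=0.1, momentum=0.9):
    """Run the assumed velocity recurrence over a sequence of scalar gradients."""
    velocity = 0.0
    for g in gradients:
        velocity = momentum * velocity + learning_rate * g
    return velocity


# Consistent direction: the same gradient on every step.
steady = velocity_after([1.0] * 200)

# Oscillating direction: the gradient flips sign on every step.
zigzag = velocity_after([1.0 if i % 2 == 0 else -1.0 for i in range(200)])

print(steady)       # approaches learning_rate / (1 - momentum) = 1.0
print(abs(zigzag))  # approaches learning_rate / (1 + momentum), about 0.053
```

With these settings, the step in the consistent direction grows to roughly ten times the plain gradient step of 0.1, while the step in the oscillating direction shrinks to roughly half of it.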

    Your task:

    Implement sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum). Return (weights_new, velocity_new) as a tuple.
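One possible shape for the function is sketched below. It makes two assumptions: inputs are plain Python lists of numbers (the platform may pass arrays instead), and the update computes the new velocity first and then subtracts it from the weights, which is the form the example tests on this page are consistent with.

```python
def sgd_momentum_update(weights, velocity, gradient, learning_rate, momentum):
    """Return (weights_new, velocity_new) for one SGD-with-momentum step."""
    # Assumed rule: new velocity = momentum * old velocity + learning_rate * gradient
    velocity_new = [
        momentum * v + learning_rate * g
        for v, g in zip(velocity, gradient)
    ]
    # The weights move against the freshly updated velocity.
    weights_new = [w - v for w, v in zip(weights, velocity_new)]
    return weights_new, velocity_new


# Cold start: zero velocity, so the step reduces to plain gradient descent.
weights_new, velocity_new = sgd_momentum_update([1, 2], [0, 0], [1, 1], 0.1, 0.9)
print(weights_new, velocity_new)
```

The inputs are left unmodified and new lists are returned, which keeps the function easy to test; an in-place variant would be an equally reasonable design.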

    Example Tests

    Updated weights (accessor [0]): cold start

    Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}

    Expected: [0.9,1.9]

    Updated velocity (accessor [1]): cold start

    Input: {"weights":[1,2],"gradient":[1,1],"momentum":0.9,"velocity":[0,0],"learning_rate":0.1}

    Expected: [0.1,0.1]

    Warm start: prior velocity carries forward

    Input: {"weights":[0,0],"gradient":[1,0],"momentum":0.9,"velocity":[0.5,0.5],"learning_rate":0.1}

    Expected: [-0.55,-0.45]
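The warm-start case can be worked through by hand, assuming the velocity is updated first and then subtracted from the weights. The names $v_{\text{new}}$ and $w_{\text{new}}$ are shorthand introduced here for the two returned values. The expected output matches the updated weights.

```latex
\begin{aligned}
v_{\text{new}} &= 0.9 \cdot (0.5,\ 0.5) + 0.1 \cdot (1,\ 0) = (0.55,\ 0.45) \\
w_{\text{new}} &= (0,\ 0) - (0.55,\ 0.45) = (-0.55,\ -0.45)
\end{aligned}
```

The second component has a zero gradient but still moves by 0.45, which is the prior velocity carrying forward.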
