Backpropagation
Hard
Sigmoid Gradient (Backprop)
~15 min
code completion
Sigmoid Gradient
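During backpropagation, we need the derivative of each activation function. For the sigmoid $\sigma(z) = 1 / (1 + e^{-z})$:
$$\sigma'(z) = \sigma(z)\,(1 - \sigma(z))$$
This formula means that, given the forward-pass output $\sigma(z)$, the gradient costs almost nothing to compute: one subtraction and one multiplication, with no further exponentials.
The sigmoid gradient is maximized at $z = 0$ (value: 0.25) and approaches 0 for large $|z|$. This causes the vanishing gradient problem in deep networks: gradients shrink as they flow backward through many sigmoid layers, because each sigmoid layer contributes a factor of at most 0.25.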
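To make the shrinkage concrete, here is a minimal sketch that reuses the forward-pass output to get the derivative and then multiplies the best-case factor across several layers. The helper names (`sigmoid`, `gradient_from_output`, `activation_gradient_bound`) are illustrative and not part of the exercise, and the bound deliberately ignores the weight matrices, which also scale the gradient in a real network.

```python
import math

def sigmoid(z):
    """Logistic sigmoid of a scalar z (fine for moderate |z|)."""
    return 1.0 / (1.0 + math.exp(-z))

def gradient_from_output(s):
    """Sigmoid derivative computed from the forward-pass output s = sigmoid(z)."""
    return s * (1.0 - s)

def activation_gradient_bound(depth):
    """Largest possible product of sigmoid derivatives across `depth` layers.

    Assumption: weights are ignored; only the activation factors are counted.
    """
    peak = gradient_from_output(sigmoid(0.0))  # 0.25, reached only at z = 0
    return peak ** depth

# Even in the best case the activation factors alone shrink geometrically.
for depth in (1, 5, 10, 20):
    print(depth, activation_gradient_bound(depth))
```

With ten sigmoid layers the activation factors alone are bounded by 0.25^10, which is below one millionth; any pre-activation away from 0 makes the product smaller still.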
Your task:
Implement sigmoid_gradient(z), which returns the derivative of the sigmoid at each element of z.
Example Tests
z=0: maximum gradient = 0.25
Input: {"z":0}
Expected: 0.25
Large positive z: gradient near 0
Input: {"z":100}
Expected: 0
Symmetric: gradient same at +1 and -1
Input: {"z":1}
Expected: 0.19661
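One possible approach, sketched with the standard library only, is shown below. It is an assumption that z may arrive as either a scalar or a (possibly nested) list; the exercise's actual input type and reference solution are not shown in the text, and an array-based version would apply the same formula elementwise. The private helper `_sigmoid` is hypothetical and branches on the sign of its argument so that `math.exp` never overflows.

```python
import math

def _sigmoid(x):
    """Numerically stable logistic sigmoid for a scalar x (hypothetical helper)."""
    if x >= 0:
        return 1.0 / (1.0 + math.exp(-x))
    e = math.exp(x)  # x < 0, so e is in (0, 1) and cannot overflow
    return e / (1.0 + e)

def sigmoid_gradient(z):
    """Derivative of the sigmoid at z, applied elementwise to lists or tuples."""
    if isinstance(z, (list, tuple)):
        return [sigmoid_gradient(v) for v in z]
    s = _sigmoid(z)
    return s * (1.0 - s)

# Checks mirroring the example tests above.
print(sigmoid_gradient(0))              # maximum gradient
print(sigmoid_gradient(100))            # saturated: effectively zero
print(round(sigmoid_gradient(1), 5))    # compare with the expected 0.19661
print(round(sigmoid_gradient(-1), 5))   # symmetric counterpart of z = 1
```

The symmetry check works because $\sigma(-z) = 1 - \sigma(z)$, so the product $\sigma(z)(1 - \sigma(z))$ is unchanged when z changes sign.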