Backpropagation
Medium
~15 min
code completion
Weight Gradient of a Linear Layer
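A linear layer computes:
$$Y = XW$$
where $X$ has shape $(N, D)$, $W$ has shape $(D, M)$, and $Y$ has shape $(N, M)$.
Given the upstream gradient $\partial L / \partial Y$ (d_out, shape $(N, M)$), the gradient with respect to $W$ is:
$$\frac{\partial L}{\partial W} = X^\top \frac{\partial L}{\partial Y}$$
Why? Each weight $W_{ij}$ affects output column $j$ through every input row. The matrix product sums those contributions over the whole batch.
Check the shapes: $(D \times N)(N \times M) = (D \times M)$. This matches $W$'s shape, which is a good sanity check.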
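The claim that the weight gradient equals the transpose of X times the upstream gradient can be checked numerically. The sketch below is only an illustration: the helper names (matmul, loss, numeric_weight_grad) and the scalar loss are assumptions made for this check and are not part of the exercise. The loss is chosen as the sum of Y times a fixed matrix G, so that the upstream gradient is exactly G.

```python
def matmul(A, B):
    """Plain-Python matrix product of two list-of-lists matrices."""
    return [[sum(a * b for a, b in zip(row, col)) for col in zip(*B)] for row in A]


def loss(X, W, G):
    """Hypothetical scalar loss L = sum(Y * G) with Y = X W, so dL/dY = G."""
    Y = matmul(X, W)
    return sum(y * g for y_row, g_row in zip(Y, G) for y, g in zip(y_row, g_row))


def numeric_weight_grad(X, W, G, eps=1e-6):
    """Central finite differences of the loss with respect to each W[i][j]."""
    grad = [[0.0] * len(W[0]) for _ in W]
    for i in range(len(W)):
        for j in range(len(W[0])):
            original = W[i][j]
            W[i][j] = original + eps
            up = loss(X, W, G)
            W[i][j] = original - eps
            down = loss(X, W, G)
            W[i][j] = original  # restore the weight before moving on
            grad[i][j] = (up - down) / (2 * eps)
    return grad


X = [[1.0, 2.0], [3.0, 4.0]]   # shape (2, 2)
W = [[0.5], [-1.0]]            # shape (2, 1), arbitrary values
G = [[1.0], [1.0]]             # upstream gradient, shape (2, 1)

analytic = matmul([list(col) for col in zip(*X)], G)  # X^T G
numeric = numeric_weight_grad(X, W, G)
```

Both results have the same shape as W, and the finite-difference estimate should agree with the analytic product up to floating-point error.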
Your task:
Implement linear_weight_grad(X, d_out) that returns $X^\top \cdot \text{d\_out}$, the gradient of the loss with respect to the weights.
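One possible solution is sketched below in plain Python, assuming X and d_out arrive as nested lists as in the example tests. It is a minimal illustration, not necessarily the reference solution; a NumPy version would typically be a single transposed matrix product.

```python
def linear_weight_grad(X, d_out):
    """Return X^T times d_out for list-of-lists inputs.

    X has one row per sample; d_out has one row per sample as well.
    The result has one row per input feature and one column per output.
    """
    # zip(*X) iterates over the columns of X, i.e. the rows of X^T.
    # zip(*d_out) iterates over the columns of d_out.
    return [
        [sum(x * g for x, g in zip(x_col, g_col)) for g_col in zip(*d_out)]
        for x_col in zip(*X)
    ]


print(linear_weight_grad([[1, 2], [3, 4]], [[1], [1]]))  # prints [[4], [6]]
```

Each output entry is a dot product between one feature column of X and one column of d_out, which is the sum over the batch described earlier.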
Example Tests
2x2 input, 2x1 upstream gradient
Input: {"X":[[1,2],[3,4]],"d_out":[[1],[1]]}
Expected: [[4],[6]]
Identity input: gradient equals d_out
Input: {"X":[[1,0],[0,1]],"d_out":[[2],[3]]}
Expected: [[2],[3]]
Single sample, 3 features, 1 output
Input: {"X":[[1,2,3]],"d_out":[[1]]}
Expected: [[1],[2],[3]]
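As a worked check, the first example test (2x2 input, 2x1 upstream gradient) can be computed by hand, transposing X and multiplying by d_out:

$$
X^\top \cdot \text{d\_out}
= \begin{pmatrix} 1 & 3 \\ 2 & 4 \end{pmatrix}
\begin{pmatrix} 1 \\ 1 \end{pmatrix}
= \begin{pmatrix} 1 \cdot 1 + 3 \cdot 1 \\ 2 \cdot 1 + 4 \cdot 1 \end{pmatrix}
= \begin{pmatrix} 4 \\ 6 \end{pmatrix}
$$

Each entry sums one input feature over both samples, weighted by the upstream gradient of that sample.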