Linear Head Prediction
Transfer Learning: Linear Head
In feature extraction (the simplest form of transfer learning), a pretrained network is used as a fixed feature extractor. We freeze its weights and train only a small linear head on top.
The pipeline:
1. Pass data through the frozen backbone → get embeddings (shape: m × d)
2. Apply a linear head: ŷ = ZW, where Z holds the embeddings and W has shape d × 1 (output shape: m × 1, for regression)
3. Compute MSE loss on the head's predictions
Only the head's weights are updated during training; the backbone's weights stay fixed.
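The pipeline above can be sketched in plain Python as follows. This is a minimal illustration under stated assumptions, not a reference implementation: the backbone here is a hypothetical fixed linear map followed by ReLU (BACKBONE_WEIGHTS, frozen_backbone), and the names head_predict, mse, and head_gradient_step, the learning rate, and the data are all made up for the example. The point it shows is that the embeddings are computed once and that the gradient step touches only the head's weights W.

```python
# Hypothetical frozen backbone: a fixed linear map followed by ReLU.
# These values are illustrative constants and are never updated.
BACKBONE_WEIGHTS = [[1.0, -1.0], [0.5, 2.0], [0.0, 1.0]]  # 3 inputs -> d = 2


def frozen_backbone(X):
    """Map raw inputs (m x 3) to embeddings Z (m x 2)."""
    d = len(BACKBONE_WEIGHTS[0])
    return [
        [max(0.0, sum(x_k * w_row[j] for x_k, w_row in zip(x, BACKBONE_WEIGHTS)))
         for j in range(d)]
        for x in X
    ]


def head_predict(Z, W):
    """Linear head: y_hat = Z W, with W of shape d x 1."""
    return [[sum(z_j * w_j[0] for z_j, w_j in zip(z_row, W))] for z_row in Z]


def mse(y_pred, y_true):
    """Mean of squared errors over the m samples."""
    return sum((p[0] - t[0]) ** 2 for p, t in zip(y_pred, y_true)) / len(y_true)


def head_gradient_step(Z, W, y_true, lr=0.1):
    """One gradient-descent step on W only; gradient is (2/m) Z^T (ZW - y)."""
    m = len(Z)
    residuals = [p[0] - t[0] for p, t in zip(head_predict(Z, W), y_true)]
    return [
        [W[j][0] - lr * (2.0 / m) * sum(Z[i][j] * residuals[i] for i in range(m))]
        for j in range(len(W))
    ]


X = [[1.0, 0.0, 0.0], [0.0, 1.0, 0.0], [0.0, 0.0, 1.0]]
y = [[1.0], [2.0], [1.0]]

Z = frozen_backbone(X)            # step 1: embeddings, computed once
W = [[0.0], [0.0]]                # head weights, the only trainable values
loss_before = mse(head_predict(Z, W), y)   # steps 2 and 3
W = head_gradient_step(Z, W, y)            # update the head, not the backbone
loss_after = mse(head_predict(Z, W), y)
```

Because the backbone is frozen, Z can be cached and reused across every training step, which is what makes feature extraction cheap compared with updating the whole network.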
Your task:
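Implement linear_head_predict(Z, W), which returns the predictions (the matrix product of Z and W), and linear_head_mse(Z, W, y_true), which returns the MSE loss between those predictions and y_true, averaged over the m samples. Implement them as two separate functions.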
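One possible shape for the two functions is sketched below. It assumes that Z, W, and y_true arrive as nested lists (as in the JSON-style test inputs) and that predictions should come back as a nested list of shape m × 1. If the grader expects NumPy arrays instead, the same logic would apply with a matrix product in place of the list comprehensions.

```python
def linear_head_predict(Z, W):
    """Return the matrix product Z W as a nested list of shape m x 1.

    Z: m x d embeddings, W: d x 1 head weights (both nested lists).
    """
    return [[sum(z_j * w_j[0] for z_j, w_j in zip(z_row, W))] for z_row in Z]


def linear_head_mse(Z, W, y_true):
    """Mean squared error between the head's predictions and y_true (m x 1)."""
    y_pred = linear_head_predict(Z, W)
    m = len(y_true)
    return sum((p[0] - t[0]) ** 2 for p, t in zip(y_pred, y_true)) / m
```

Keeping linear_head_mse as a thin wrapper around linear_head_predict means the prediction logic lives in one place, so a fix to one function carries over to the other.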
Example Tests
linear_head_predict: 3 samples, 2-d embedding
Input: {"W":[[2],[3]],"Z":[[1,0],[0,1],[1,1]]}
Expected: [[2],[3],[5]]
linear_head_mse: perfect predictions → 0 loss
Input: {"W":[[1],[1]],"Z":[[1,0],[0,1]],"y_true":[[1],[1]]}
Expected: 0
linear_head_mse: known error
Input: {"W":[[2],[3]],"Z":[[1,0],[0,1]],"y_true":[[1],[1]]}
Expected: 2.5
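The expected value in the last test can be checked by hand. The working below assumes MSE means the plain average of squared errors over the m = 2 samples, with no extra factor of 1/2, which is the convention consistent with the expected output.

```latex
\begin{align*}
\hat{y} &= ZW
  = \begin{bmatrix} 1 & 0 \\ 0 & 1 \end{bmatrix}
    \begin{bmatrix} 2 \\ 3 \end{bmatrix}
  = \begin{bmatrix} 2 \\ 3 \end{bmatrix}, \\
\hat{y} - y_{\text{true}} &= \begin{bmatrix} 2 \\ 3 \end{bmatrix} - \begin{bmatrix} 1 \\ 1 \end{bmatrix}
  = \begin{bmatrix} 1 \\ 2 \end{bmatrix}, \\
\text{MSE} &= \frac{1}{m} \sum_{i=1}^{m} \left(\hat{y}_i - y_i\right)^2
  = \frac{1^2 + 2^2}{2} = 2.5.
\end{align*}
```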