Center the Data Matrix
Centering Data for PCA
PCA finds the directions of maximum variance in the data. Before computing the covariance matrix or performing SVD, the data must be centered, so that each feature (column) is shifted to have zero mean:
$$X_{\text{centered}} = X - \mu$$
where $\mu$ is the column-wise mean (each mean is subtracted from its respective column).
This is critical: without centering, the first principal component tends to point from the origin toward the mean of the data rather than along the direction of greatest spread.
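The effect can be seen on a tiny example. The sketch below is illustrative only: `top_direction` is a hypothetical helper (not part of the task) that estimates the leading eigenvector of X^T X by power iteration, and the three data points are made up so that the mean lies at (10, 10) while the spread runs along (1, -1).

```python
import math

def top_direction(X, iters=100):
    """Leading eigenvector of X^T X via power iteration (hypothetical helper)."""
    d = len(X[0])
    # Second-moment matrix M = X^T X (d x d).
    M = [[sum(row[i] * row[j] for row in X) for j in range(d)] for i in range(d)]
    v = [1.0] + [0.0] * (d - 1)  # arbitrary starting vector
    for _ in range(iters):
        w = [sum(M[i][j] * v[j] for j in range(d)) for i in range(d)]
        norm = math.sqrt(sum(x * x for x in w))
        v = [x / norm for x in w]
    return v

raw = [[9, 11], [10, 10], [11, 9]]      # mean (10, 10), spread along (1, -1)
centered = [[-1, 1], [0, 0], [1, -1]]   # same data with column means removed

v_raw = top_direction(raw)            # roughly (0.707, 0.707): toward the mean
v_centered = top_direction(centered)  # roughly (0.707, -0.707): along the spread
```

On the raw data the leading direction lines up with the mean, and only after centering does it follow the actual spread of the points.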
Your task:
Implement center_data(X) that subtracts the column mean from each column.
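One possible approach is sketched below in plain Python. It assumes X arrives as a non-ragged list of rows (as the example inputs suggest) and returns a new list of lists; the empty-input handling is an assumption, since the task does not specify it.

```python
def center_data(X):
    """Subtract each column's mean from that column (sketch, pure Python)."""
    n_rows = len(X)
    if n_rows == 0:
        return []  # assumption: empty input yields empty output
    n_cols = len(X[0])
    # Column-wise means: one value per feature.
    means = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]
    # Shift every entry by the mean of its column.
    return [[row[j] - means[j] for j in range(n_cols)] for row in X]

result = center_data([[1, 4], [3, 6], [5, 8]])
```

If X is a NumPy array instead, the same operation can be written as `X - X.mean(axis=0)`, which relies on broadcasting to subtract the row of means from every row.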
Example Tests
Column means become 0
Input: {"X":[[1,4],[3,6],[5,8]]}
Expected: [[-2,-2],[0,0],[2,2]]
Already-centered data unchanged
Input: {"X":[[-1,1],[0,0],[1,-1]]}
Expected: [[-1,1],[0,0],[1,-1]]
Sum of centered matrix is 0 (all column means are 0)
Input: {"X":[[1,2],[3,4],[5,6]]}
Expected: 0