Center the Data Matrix

Beginner
~10 min
code completion

Centering Data for PCA

PCA finds the directions of maximum variance in the data. Before computing the covariance matrix or performing SVD, the data must be centered, meaning each feature (column) is shifted to have zero mean:

$$X_{\text{centered}} = X - \bar{x}$$

where $\bar{x}$ is the column-wise mean (each mean is subtracted from its respective column).

This is critical: without centering, the first principal component tends to point toward the mean of the data rather than along the direction of greatest spread.
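The effect of skipping the centering step can be seen on a tiny 2-D example. The sketch below is illustrative only: the three points and the helper name leading_direction are assumptions, not part of the exercise, and the helper uses the closed-form principal-axis angle of a symmetric 2x2 matrix instead of a full PCA routine.

```python
import math

def leading_direction(points):
    """Unit leading eigenvector of the 2x2 matrix sum(p p^T) over the points.

    Hypothetical helper for illustration; for a symmetric matrix
    [[a, b], [b, c]] the principal axis lies at angle 0.5 * atan2(2b, a - c).
    """
    a = sum(x * x for x, _ in points)
    b = sum(x * y for x, y in points)
    c = sum(y * y for _, y in points)
    theta = 0.5 * math.atan2(2 * b, a - c)
    return (math.cos(theta), math.sin(theta))

# Points clustered around (10, 10) that spread along the (1, -1) direction.
points = [(9, 11), (10, 10), (11, 9)]

mean_x = sum(x for x, _ in points) / len(points)
mean_y = sum(y for _, y in points) / len(points)
centered = [(x - mean_x, y - mean_y) for x, y in points]

uncentered_dir = leading_direction(points)    # roughly (0.707, 0.707): toward the mean
centered_dir = leading_direction(centered)    # roughly (0.707, -0.707): along the spread

print("uncentered:", uncentered_dir)
print("centered:  ", centered_dir)
```

On the raw points the leading direction lines up with the mean at (10, 10); after centering it lines up with the direction along which the points actually vary.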

Your task:

Implement center_data(X), which subtracts each column's mean from that column and returns the centered matrix.
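One possible way to approach the task is sketched below in plain Python, assuming X arrives as a non-ragged list of rows (as in the example inputs). This is a minimal sketch, not the reference solution; the empty-input behavior in particular is an assumption, since the problem statement does not specify it.

```python
def center_data(X):
    """Return a new matrix with each column's mean subtracted from that column."""
    if not X:
        return []  # assumption: empty input yields an empty result

    n_rows = len(X)
    n_cols = len(X[0])

    # Mean of each column, computed once.
    means = [sum(row[j] for row in X) / n_rows for j in range(n_cols)]

    # Subtract the matching column mean from every entry.
    return [[row[j] - means[j] for j in range(n_cols)] for row in X]


centered = center_data([[1, 4], [3, 6], [5, 8]])
print(centered)  # [[-2.0, -2.0], [0.0, 0.0], [2.0, 2.0]]
```

With NumPy available, the same operation is usually written as X - X.mean(axis=0), which relies on broadcasting to subtract the row of column means from every row.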

Example Tests

Column means become 0

Input: {"X":[[1,4],[3,6],[5,8]]}

Expected: [[-2,-2],[0,0],[2,2]]

Already-centered data unchanged

Input: {"X":[[-1,1],[0,0],[1,-1]]}

Expected: [[-1,1],[0,0],[1,-1]]

Sum of the centered matrix is 0 (all column means are 0)

Input: {"X":[[1,2],[3,4],[5,6]]}

Expected: 0
