Equal-Width Feature Binning
Binning (discretization) converts continuous values into discrete integer bucket indices. This can reduce the effect of noise in high-variance features and can improve the performance of certain model types.
Equal-width binning divides the observed range [lo, hi], where lo and hi are the minimum and maximum of arr, into n_bins equal-width intervals:
1. Create n_bins + 1 evenly-spaced edges: np.linspace(lo, hi, n_bins + 1)
2. Use np.digitize(arr, edges[1:-1]) — interior edges only — to assign 0-indexed bin labels
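The two steps above can be sketched directly in NumPy. This is only an illustration: the input values and n_bins are borrowed from the second example test below, and the zero-range edge case is deliberately left out.

```python
import numpy as np

# Illustrative input (same values as the [0, 20] example test).
arr = np.array([0.0, 5.0, 10.0, 15.0, 20.0])
n_bins = 4

# Step 1: n_bins + 1 evenly spaced edges across the observed range.
lo, hi = arr.min(), arr.max()
edges = np.linspace(lo, hi, n_bins + 1)

# Step 2: digitize against the interior edges only. With the default
# right=False, a value equal to an interior edge falls into the bin
# above it, and the maximum value lands in the last bin (n_bins - 1).
labels = np.digitize(arr, edges[1:-1])

print(edges)   # the n_bins + 1 edges
print(labels)  # one 0-indexed bin label per input value
```

Dropping the outer edges is what keeps the labels in [0, n_bins - 1]: passing the full edge array would produce labels starting at 1 and would push the maximum value into an extra bin.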
Example: arr = [0, 1, 2, 3, 4], n_bins = 2
Edges: [0, 2, 4]; interior edge: [2]
Values < 2 → bin 0; values >= 2 → bin 1
Result: [0, 0, 1, 1, 1]
Edge case: If all values are identical (zero range), return all zeros.
Your task:
Implement bin_features(arr, n_bins) that returns an integer array of bin indices in [0, n_bins - 1].
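One possible shape for such a function is sketched below. It is a minimal sketch rather than a reference solution. It assumes arr is a non-nested sequence of numbers and n_bins is a positive integer. The handling of empty input is an assumption of this sketch, since the task statement does not cover it.

```python
import numpy as np

def bin_features(arr, n_bins):
    """Assign each value a 0-indexed equal-width bin label."""
    values = np.asarray(arr, dtype=float)

    # Assumption (not specified by the task): empty input yields empty output.
    if values.size == 0:
        return np.zeros(0, dtype=int)

    lo, hi = values.min(), values.max()

    # Zero-range edge case: all values identical, so every label is 0.
    # Without this guard, all interior edges would equal the value itself
    # and np.digitize would place every element above them.
    if lo == hi:
        return np.zeros(values.shape, dtype=int)

    # n_bins + 1 evenly spaced edges; only the interior ones are used.
    edges = np.linspace(lo, hi, n_bins + 1)
    return np.digitize(values, edges[1:-1]).astype(int)
```

The explicit zero-range check matters because np.linspace(lo, lo, n_bins + 1) yields identical edges, which would otherwise send every value to a non-zero bin.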
Example Tests
Range [0,4] split into 2 bins: values below 2 and at-or-above 2
Input: {"arr":[0,1,2,3,4],"n_bins":2}
Expected: [0,0,1,1,1]
Range [0,20] split into 4 bins of width 5
Input: {"arr":[0,5,10,15,20],"n_bins":4}
Expected: [0,1,2,3,3]
All identical values: zero-range edge case returns all zeros
Input: {"arr":[7,7,7],"n_bins":3}
Expected: [0,0,0]