Bootstrap Sampling

Easy
~15 min
code completion

Bootstrap Sampling

Bagging (Bootstrap Aggregating) trains each model on a bootstrap sample: a random sample of the training data with replacement of the same size as the original dataset.

On average, each bootstrap sample contains ~63.2% unique examples from the original data; the remaining ~36.8% are duplicates. This diversity is what makes models in the ensemble disagree — reducing variance.

In NumPy:

indices = np.random.choice(n, size=n, replace=True)

Your task:

Implement bootstrap_sample(X, y, seed) that returns (X_sample, y_sample) — bootstrapped versions of the data using the given random seed.

Example Tests

Bootstrapped y_sample for seed 42

Input: {"X":[[1,2],[3,4],[5,6],[7,8]],"y":[0,1,0,1],"seed":42}

Expected: [0,1,0,1]

Bootstrapped y_sample for seed 0

Input: {"X":[[1],[2],[3],[4],[5]],"y":[1,2,3,4,5],"seed":0}

Expected: [5,4,3,2,2]

Seed=0 produces specific first label

Input: {"X":[[1],[2],[3],[4],[5]],"y":[10,20,30,40,50],"seed":0}

Expected: 50

Sign in to solve this problem

You can read the full problem statement above. Create a free account to run code in the browser, submit solutions, and track your progress.