Detect Label Leakage via Temporal Check
Label leakage occurs when features contain information that is only available *after* the prediction target is known. In time-series settings, a feature recorded at time t_f should not be used to predict a label recorded at time t_l if t_f >= t_l.
You are given:
feature_times: array of timestamps at which each feature was recorded
label_times: array of timestamps at which each label was recorded
A sample leaks if feature_times[i] >= label_times[i], that is, the feature was observed at or after the label event.
Return a sorted integer array of the indices of leaking samples.
Example:
feature_times = [1, 5, 3]
label_times = [3, 4, 7]
Result: [1]
Your task:
Implement find_leaking_indices(feature_times, label_times) returning a sorted integer array.
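One possible approach, sketched here in Python under a few assumptions: the two arrays have equal length, their timestamps are directly comparable (plain numbers, for instance), and a length mismatch is treated as an input error. The problem statement does not specify that last behavior, so the check is an assumption of this sketch.

```python
from typing import List, Sequence


def find_leaking_indices(
    feature_times: Sequence[float], label_times: Sequence[float]
) -> List[int]:
    """Return the sorted indices i where feature_times[i] >= label_times[i]."""
    # Assumption (not stated in the problem): unequal lengths are an error.
    if len(feature_times) != len(label_times):
        raise ValueError("feature_times and label_times must have the same length")

    # enumerate visits indices in ascending order, so the output is already
    # sorted and no explicit sort is needed.
    return [
        i
        for i, (feature_t, label_t) in enumerate(zip(feature_times, label_times))
        if feature_t >= label_t  # equal timestamps also count as leakage
    ]
```

For the example above, find_leaking_indices([1, 5, 3], [3, 4, 7]) returns [1]. The comparison is >= rather than >, so a feature recorded at exactly the label time is reported as a leak.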
Example Tests
One leak at index 1 where feature_time > label_time
Input: {"label_times":[3,4,7],"feature_times":[1,5,3]}
Expected: [1]
All features recorded before labels: no leakage
Input: {"label_times":[4,5,6],"feature_times":[1,2,3]}
Expected: []
Feature time exactly equal to label time counts as leakage
Input: {"label_times":[3,4,2],"feature_times":[3,3,3]}
Expected: [0,2]