Medium

Softmax Attention Weights

~12 min

code completion

Attention Weights via Softmax

In the attention mechanism, raw scores measure how relevant each key is to a query. Scores are converted to weights via softmax so they sum to 1 and can be interpreted as probabilities:

\alpha_i = \frac{e^{s_i}}{\sum_j e^{s_j}}

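For numerical stability, subtract the maximum score from every score before exponentiating. The shift cancels between numerator and denominator, so the weights are unchanged, but the largest exponent becomes e^0 = 1 and cannot overflow.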
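The max-subtraction trick can be sketched in plain Python as follows. This is one possible implementation under stated assumptions, not a reference solution: it takes scores as a list of numbers, and returning an empty list for empty input is a choice made here, not something the problem specifies.

```python
import math

def softmax_attention(scores):
    """Convert a 1D list of raw scores into attention weights."""
    if not scores:
        # Assumption: empty input yields empty output (not specified by the problem).
        return []
    # Shift by the maximum so the largest exponent is exp(0) = 1.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    # Normalize so the weights sum to 1.
    return [e / total for e in exps]

weights = softmax_attention([0, 10, 0])
large = softmax_attention([1000.0, 1001.0])  # would overflow without the shift
```

Without the shift, math.exp(1000.0) raises OverflowError; with it, the largest value passed to exp is 0.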

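Softmax is the normalization step inside the scaled dot-product attention formula, applied to the scaled scores before they weight the values:

\mathrm{Attention}(Q, K, V) = \mathrm{softmax}\left(\frac{Q K^{\top}}{\sqrt{d_k}}\right) V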
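To show where the softmax weights fit in the full computation, here is a minimal sketch of attention for a single query vector (one row of Q), using the standard scaling by the square root of d_k from scaled dot-product attention. The function names attention and softmax and the list-of-lists layout for keys and values are illustrative assumptions, not part of the problem.

```python
import math

def softmax(scores):
    # Numerically stable softmax over a 1D list of scores.
    max_score = max(scores)
    exps = [math.exp(s - max_score) for s in scores]
    total = sum(exps)
    return [e / total for e in exps]

def attention(query, keys, values):
    """Scaled dot-product attention for one query (hypothetical helper).

    query:  list of d_k floats
    keys:   list of n key vectors, each of length d_k
    values: list of n value vectors, each of length d_v
    """
    d_k = len(query)
    # Raw score for each key: dot(query, key) / sqrt(d_k).
    scores = [
        sum(q * k for q, k in zip(query, key)) / math.sqrt(d_k)
        for key in keys
    ]
    weights = softmax(scores)
    d_v = len(values[0])
    # Output is the weighted average of the value vectors.
    return [
        sum(w * v[j] for w, v in zip(weights, values))
        for j in range(d_v)
    ]

# Two identical keys give equal weights, so the output is the mean of the values.
out = attention([1.0, 0.0], [[1.0, 0.0], [1.0, 0.0]], [[1.0, 2.0], [3.0, 4.0]])
```

The task itself covers only the softmax step; the dot products and the weighted sum are included here just for context.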

Your task:

Implement softmax_attention(scores) that converts a 1D scores vector to attention weights.

Example Tests

Softmax weights sum to 1.0

Input: {"scores":[1,2,3]}

Expected: 1

Equal scores: uniform weights

Input: {"scores":[0,0,0,0]}

Expected: [0.25,0.25,0.25,0.25]

Dominant score takes most weight

Input: {"scores":[0,10,0]}

Expected: [0.0000454,0.9999092,0.0000454]