Medium

Sinusoidal Positional Encoding

Medium

~12 min

code completion

Self-attention is permutation-invariant: without extra information, the model cannot tell token order.

Transformers inject position information by adding a deterministic encoding vector to each token embedding.

For position $p os$ and model dimension $d$ :

P E [p os, 2 i] = sin (\frac{p os}{1000 0 ^{2 i / d}}), P E [p os, 2 i + 1] = cos (\frac{p os}{1000 0 ^{2 i / d}})

Key intuition:

Even dimensions use sine, odd dimensions use cosine.

Different dimensions use different frequencies.

Nearby positions have similar encodings; far positions differ more.

Your task: Implement positional_encoding(seq_len, d_model) returning a NumPy array of shape (seq_len, d_model).

Implementation requirements:

1. Positions are 0-indexed.

2. Use vectorized NumPy operations (no Python loop over positions).

3. Use a numerically stable expression for the denominator term.

Example Tests

Single token, d_model=2

Input: {"d_model":2,"seq_len":1}

Expected: [[0,1]]

Two tokens, d_model=2

Input: {"d_model":2,"seq_len":2}

Expected: [[0,1],[0.84147,0.5403]]

Shape check: seq_len=4, d_model=4

Input: {"d_model":4,"seq_len":4}

Expected: [4,4]

You can read the full problem statement above. Create a free account to run code in the browser, submit solutions, and track your progress.