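Agents learn by interacting with an environment: they take actions, receive rewards, and adjust their behavior to maximize cumulative reward. The Markov Decision Process (MDP) framework formalizes this interaction and underlies much of game-playing AI.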
Discounted Return
~12 min · Hard
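For orientation, the discounted return of a reward sequence is usually defined as G = r_0 + gamma * r_1 + gamma^2 * r_2 + ..., where gamma in [0, 1] is the discount factor. The following is a minimal sketch under that standard definition, not necessarily the interface the problem expects; the function name `discounted_return` and its signature are assumptions made for illustration.

```python
from typing import Sequence


def discounted_return(rewards: Sequence[float], gamma: float) -> float:
    """Compute sum over k of gamma**k * rewards[k].

    Hypothetical helper: the name and signature are illustrative only.
    """
    g = 0.0
    # Accumulate from the last reward backwards, so each step applies
    # one more factor of gamma to everything that comes after it.
    for r in reversed(rewards):
        g = r + gamma * g
    return g


if __name__ == "__main__":
    # Three rewards of 1.0 with gamma = 0.5: 1 + 0.5 + 0.25
    print(discounted_return([1.0, 1.0, 1.0], 0.5))
```

Working backwards avoids computing explicit powers of gamma and mirrors the recursive form G_t = r_t + gamma * G_{t+1}.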
Epsilon-Greedy Action Selection
~15 min · Hard
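As a rough illustration of the idea, epsilon-greedy selection picks a uniformly random action with probability epsilon (exploration) and otherwise picks the action with the highest estimated value (exploitation). The sketch below assumes action values are given as a plain list indexed by action; the function name `epsilon_greedy`, its parameters, and the tie-breaking rule (first maximum wins) are assumptions, not the problem's required interface.

```python
import random
from typing import Optional, Sequence


def epsilon_greedy(
    q_values: Sequence[float],
    epsilon: float,
    rng: Optional[random.Random] = None,
) -> int:
    """Return an action index chosen by the epsilon-greedy rule.

    Hypothetical helper: the name, signature, and tie-breaking are illustrative.
    """
    if rng is None:
        rng = random.Random()
    # Explore: with probability epsilon, choose any action uniformly.
    if rng.random() < epsilon:
        return rng.randrange(len(q_values))
    # Exploit: choose the action with the largest estimated value
    # (the first such index if several are tied).
    return max(range(len(q_values)), key=lambda a: q_values[a])


if __name__ == "__main__":
    q = [0.1, 0.9, 0.3]
    print(epsilon_greedy(q, epsilon=0.0))  # always the greedy action
    print(epsilon_greedy(q, epsilon=0.1, rng=random.Random(0)))
```

Passing a seeded `random.Random` instance keeps runs reproducible, which helps when checking a solution against fixed test cases.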