Browse machine learning concepts and coding problems. Each topic includes learning objectives and hands-on Python exercises.
NumPy Arrays
Master the fundamental data structure of numerical computing in Python. Learn to create, index, and manipulate n-dimensional arrays efficiently.
NumPy Operations & Broadcasting
Unlock the performance of vectorized operations. Understand broadcasting rules that eliminate explicit loops in numerical code.
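As a minimal sketch of broadcasting, the snippet below centers each column of a small matrix without writing a loop. The array `X` and its values are made up for illustration.

```python
import numpy as np

# Hypothetical data: 3 samples (rows), 2 features (columns).
X = np.array([[1.0, 10.0],
              [2.0, 20.0],
              [3.0, 30.0]])

# col_means has shape (2,). NumPy stretches it across the 3 rows of X,
# so the subtraction happens row by row with no explicit loop.
col_means = X.mean(axis=0)
centered = X - col_means

print(centered)
```

The shapes (3, 2) and (2,) are compatible because broadcasting compares trailing dimensions, and here both end in 2.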
Linear Algebra for ML
The mathematical backbone of machine learning. Dot products, matrix multiplication, and eigendecomposition form the language of models.
Probabilistic Models & Bayesian Thinking
ML is fundamentally about uncertainty. Bayes' theorem, Gaussian distributions, and maximum likelihood estimation give models a rigorous probabilistic foundation.
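A small worked example of Bayes' theorem, using invented numbers for a diagnostic test: the prior, sensitivity, and false-positive rate below are assumptions chosen only to make the arithmetic concrete.

```python
# Assumed, illustrative probabilities (not taken from any real test).
prior = 0.01                # P(condition)
sensitivity = 0.95          # P(positive | condition)
false_positive_rate = 0.05  # P(positive | no condition)

# Total probability of a positive result (the evidence term).
evidence = sensitivity * prior + false_positive_rate * (1 - prior)

# Bayes' theorem: P(condition | positive).
posterior = sensitivity * prior / evidence

print(f"P(condition | positive) = {posterior:.3f}")
```

Even with a fairly accurate test, the posterior stays well below the sensitivity because the condition is rare to begin with.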
Loss Functions & Task Design
The choice of loss function defines what your model optimizes. MSE, cross-entropy, hinge loss, and contrastive losses each encode different assumptions.
Data Preprocessing
Raw data is messy. Learn normalization, standardization, and feature engineering — the critical steps that happen before any model sees data.
Linear Regression
The foundation of supervised learning. Fit a line through data using the least squares criterion and understand the geometry of prediction.
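One way to sketch a least squares line fit is with `np.linalg.lstsq`. The data points here are hypothetical and lie exactly on y = 2x + 1, so the recovered slope and intercept are easy to check.

```python
import numpy as np

# Hypothetical data generated from y = 2x + 1.
x = np.array([0.0, 1.0, 2.0, 3.0])
y = np.array([1.0, 3.0, 5.0, 7.0])

# Design matrix with a column of ones so the model includes an intercept.
A = np.column_stack([x, np.ones_like(x)])

# Solve min ||A @ coef - y||^2 for coef = [slope, intercept].
coef, residuals, rank, singular_values = np.linalg.lstsq(A, y, rcond=None)
slope, intercept = coef

print(slope, intercept)
```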
Gradient Descent
The optimization engine behind much of modern machine learning. Learn how parameters update iteratively by stepping in the direction of the negative gradient, the direction of steepest descent.
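The update rule can be sketched on a one-dimensional toy problem. The objective f(w) = (w - 3)^2, the learning rate, and the step count are all assumptions picked for illustration.

```python
def grad(w):
    """Gradient of the toy objective f(w) = (w - 3)^2."""
    return 2.0 * (w - 3.0)

w = 0.0               # arbitrary starting point
learning_rate = 0.1   # assumed step size

for _ in range(100):
    # Step against the gradient, i.e. in the direction of steepest descent.
    w -= learning_rate * grad(w)

print(w)
```

Each step shrinks the distance to the minimum at w = 3 by a constant factor, so the iterate converges geometrically for this step size.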
Neural Network Basics
Build your intuition for deep learning from scratch. Understand layers, activations, and why depth enables hierarchical feature learning.
Backpropagation
The chain rule made computational. Learn how gradients flow backward through a network, yielding the gradient for every parameter in a single backward pass.
Regularization Techniques
Prevent your model from memorizing noise. L1 and L2 penalties, dropout, and early stopping each impose different inductive biases.
K-Means Clustering
One of the simplest unsupervised algorithms: assign each point to its nearest centroid, recompute each centroid as the mean of its assigned points, and repeat until convergence. Deceptively powerful in practice.
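The assign-and-update loop can be sketched in a few lines of NumPy. The function name `kmeans`, the toy points, and the hand-picked initial centroids are illustrative assumptions. Practical implementations usually add smarter initialization and multiple restarts.

```python
import numpy as np

def kmeans(points, centroids, max_iter=100):
    """Plain Lloyd-style k-means from given initial centroids."""
    for _ in range(max_iter):
        # Distance from every point to every centroid: shape (n, k).
        dists = np.linalg.norm(points[:, None, :] - centroids[None, :, :], axis=2)
        labels = dists.argmin(axis=1)

        # Move each centroid to the mean of its points (keep it if empty).
        new_centroids = np.array([
            points[labels == j].mean(axis=0) if np.any(labels == j) else centroids[j]
            for j in range(len(centroids))
        ])

        if np.allclose(new_centroids, centroids):
            break  # assignments have stabilized
        centroids = new_centroids
    return centroids, labels

# Hypothetical data: two well-separated groups of three points each.
points = np.array([[0.0, 0.0], [0.0, 1.0], [1.0, 0.0],
                   [10.0, 10.0], [10.0, 11.0], [11.0, 10.0]])
init = points[[0, 3]].copy()  # one starting centroid per group, chosen by hand

centroids, labels = kmeans(points, init)
print(labels)
print(centroids)
```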
PCA & Dimensionality Reduction
High-dimensional data is sparse and hard to visualize. PCA finds the directions of maximum variance, compressing the data while retaining as much of that variance as possible.
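A minimal PCA sketch via the singular value decomposition, assuming a tiny made-up dataset whose points lie close to the line y = x. The variable names are illustrative.

```python
import numpy as np

# Hypothetical 2-D data lying near the line y = x.
X = np.array([[1.0, 1.0],
              [2.0, 2.1],
              [3.0, 2.9],
              [4.0, 4.0]])

# PCA operates on mean-centered data.
Xc = X - X.mean(axis=0)

# Rows of Vt are the principal directions, ordered by singular value.
U, S, Vt = np.linalg.svd(Xc, full_matrices=False)

k = 1
components = Vt[:k]               # top-k directions of maximum variance
projected = Xc @ components.T     # data compressed to k dimensions
explained = S**2 / np.sum(S**2)   # fraction of variance per component

print(components)
print(explained)
```

The sign of each component is arbitrary, so the first direction may come out as roughly (0.71, 0.71) or (-0.71, -0.71).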
Data Cleaning & Integrity
Real-world data arrives with outliers, duplicates, missing values, and evolving schemas. Mastering cleaning pipelines is the unglamorous work that makes or breaks model performance.
Feature Engineering & Representation
Raw features rarely speak the language of your model. One-hot encoding, target encoding, standardization, and binning transform data into representations that amplify signal and suppress noise.
Dataset Splitting & Sampling
How you split data determines whether your evaluation metrics are trustworthy. Leakage, class imbalance, and temporal ordering each demand different splitting strategies — and getting them wrong silently inflates your scores.
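As one illustration of a leakage-aware split, the sketch below holds out the most recent records and computes standardization statistics from the training portion only. The records, the 80/20 cutoff, and the variable names are assumptions for the example.

```python
from statistics import mean, pstdev

# Hypothetical time-ordered records: (timestamp, feature value).
values = [3, 5, 4, 6, 8, 7, 9, 12, 11, 14]
records = [(t, float(v)) for t, v in enumerate(values)]

# Chronological split: the last 20% of the timeline is held out,
# so the model is never evaluated on data older than its training set.
cutoff = int(len(records) * 0.8)
train, test = records[:cutoff], records[cutoff:]

# Fit scaling statistics on the training portion only. Using the full
# dataset here would leak information about the test period.
train_values = [v for _, v in train]
mu, sigma = mean(train_values), pstdev(train_values)

train_scaled = [(v - mu) / sigma for v in train_values]
test_scaled = [(v - mu) / sigma for _, v in test]

print(mu, sigma)
```

A random shuffle before splitting would mix future observations into the training set, which is the kind of mistake that inflates scores without any visible error.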