Q-Learning & MCTS: Reinforcement Learning for Connect 4


Two reinforcement learning agents learn to play Connect 4 through fundamentally different strategies. Monte Carlo Tree Search (MCTS) reasons ahead by simulating random game continuations from the current position. Q-Learning learns a value function by playing thousands of training games against an MCTS opponent. Both agents operate on a configurable grid, expose a consistent action interface, and can be matched against each other or benchmarked against a random baseline player.

Quickstart

Run your first agent matchup in under five minutes

MCTS Agent

Full API reference for the Monte Carlo Tree Search agent

Q-Learning Agent

Full API reference for the Q-Learning agent

Training Guide

Train the Q-Learning agent against an MCTS opponent

What’s inside

Connect 4 Environment

Board representation, valid moves, and terminal-state detection

MCTS Concepts

Selection, expansion, simulation, and back-propagation explained

Q-Learning Concepts

Bellman updates, epsilon-greedy policy, and mirror symmetry

Evaluation

Run head-to-head matchups and interpret win/draw/loss results

Highlights

UCB1-guided tree search — MCTS balances exploration vs. exploitation using the Upper Confidence Bound formula.
Mirror-state symmetry — Q-Learning halves the state space by treating a board and its horizontal reflection as equivalent.
Pluggable board size — both agents accept arbitrary rows × cols dimensions at construction time.
Persistent Q-tables — trained value functions are saved as gzip-compressed pickle files and reloaded for evaluation.
Reward shaping — the Q-Learning agent uses shaped rewards (+50 win, −50 loss, −10 draw, −1 per step) for stable convergence.
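
To make three of the highlights above concrete, here is a minimal Python sketch of a UCB1 score, a mirror-canonical state key, and a Q-table update that uses the shaped rewards listed. It is an illustration under stated assumptions, not the project's actual code: the function names (`ucb1`, `canonical`, `q_update`), the exploration constant `c`, the learning rate `alpha`, the discount `gamma`, and the dictionary-based Q-table are all hypothetical choices made for this example.

```python
import math

# Reward values quoted in the Highlights list.
REWARDS = {"win": 50.0, "loss": -50.0, "draw": -10.0, "step": -1.0}


def ucb1(total_value, visits, parent_visits, c=math.sqrt(2)):
    """UCB1 score: mean value plus an exploration bonus (c is an assumed default)."""
    if visits == 0:
        return math.inf  # unvisited children are tried first
    return total_value / visits + c * math.sqrt(math.log(parent_visits) / visits)


def canonical(board):
    """Return (key, flipped): the smaller of the board and its left-right mirror."""
    original = tuple(tuple(row) for row in board)
    mirrored = tuple(tuple(reversed(row)) for row in board)
    return (original, False) if original <= mirrored else (mirrored, True)


def q_update(q, board, action, reward, next_board, next_actions,
             alpha=0.1, gamma=0.95):
    """One tabular Q-learning update on mirror-canonical (state, action) keys.

    alpha and gamma are assumed values; pass next_board=None for terminal moves.
    """
    cols = len(board[0])
    key, flipped = canonical(board)
    # When the board is stored mirrored, the column index must be mirrored too.
    a = cols - 1 - action if flipped else action

    best_next = 0.0
    if next_board is not None and next_actions:
        nkey, nflipped = canonical(next_board)
        best_next = max(
            q.get((nkey, cols - 1 - b if nflipped else b), 0.0)
            for b in next_actions
        )

    old = q.get((key, a), 0.0)
    q[(key, a)] = old + alpha * (reward + gamma * best_next - old)
    return q[(key, a)]
```

Because a board and its reflection share one key, a winning move in column 0 of a position and the same move in the last column of the mirrored position update the same Q-table entry. Boards that are their own mirror image map to themselves, so the saving is slightly less than a full factor of two.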
