Simple Reinforcement Learning is a hands-on notebook series that takes you from the basics of reinforcement learning (stateless bandit problems) through widely used deep RL algorithms such as PPO, DDPG, and SAC. Every topic is a self-contained Jupyter notebook with minimal Python code built on PyTorch and OpenAI Gym.

Get Started

Understand what this course covers and how to navigate the notebooks.

Environment Setup

Install Python 3.9, PyTorch 1.12.1, and Gym 0.26.2 to run every notebook.

OpenAI Gym Basics

Learn how to create, reset, step through, and render Gym environments.

Bandit Algorithms

Explore Greedy, UCB, and Thompson Sampling on the multi-armed bandit problem.
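As a taste of the bandit material, here is a minimal sketch of UCB on a Bernoulli bandit using only the standard library. It is not taken from the notebooks: the function name `run_ucb`, the arm probabilities, and the exploration constant `c` are all assumptions chosen for illustration.

```python
import math
import random


def run_ucb(probs, steps=2000, c=1.0, seed=0):
    """Play a Bernoulli bandit with UCB1-style action selection.

    probs: hypothetical win probability of each arm (hidden from the agent).
    Returns (pull counts per arm, estimated mean reward per arm).
    """
    rng = random.Random(seed)
    n_arms = len(probs)
    counts = [0] * n_arms      # how often each arm was pulled
    values = [0.0] * n_arms    # running mean reward of each arm

    for t in range(1, steps + 1):
        if 0 in counts:
            # Pull every arm once before trusting any estimate.
            arm = counts.index(0)
        else:
            # Estimated value plus a bonus that shrinks as an arm is pulled more.
            scores = [
                values[a] + c * math.sqrt(2.0 * math.log(t) / counts[a])
                for a in range(n_arms)
            ]
            arm = scores.index(max(scores))

        reward = 1.0 if rng.random() < probs[arm] else 0.0
        counts[arm] += 1
        # Incremental update of the running mean.
        values[arm] += (reward - values[arm]) / counts[arm]

    return counts, values


if __name__ == "__main__":
    counts, values = run_ucb([0.1, 0.3, 0.9])
    print("pulls per arm:", counts)
    print("estimated values:", [round(v, 2) for v in values])
```

Setting `c` to 0 removes the exploration bonus, which turns the rule into plain greedy selection after the initial round of pulls, so the two strategies can be compared with the same function.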

What You’ll Learn

This series is organized into four progressive sections:

Foundations

Gym environments, Markov Decision Processes, Monte Carlo methods, Bellman equations, and dynamic programming.

Tabular & Model-Based Methods

Sarsa, N-step Sarsa, Q-Learning, and DynaQ — classic tabular and model-assisted planning algorithms.

Deep RL Algorithms

DQN, Double DQN, Dueling DQN, REINFORCE, Actor-Critic, PPO, DDPG, and SAC using PyTorch neural networks.

Advanced Topics

Imitation Learning, Offline RL, Model Predictive Control, MBPO, Goal-conditioned RL, and Multi-agent systems.
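To give a feel for how the first two sections fit together, the sketch below runs tabular Q-Learning against an environment that follows the Gym 0.26 calling convention, where `reset()` returns `(observation, info)` and `step()` returns `(observation, reward, terminated, truncated, info)`. `ChainEnv` is a hypothetical stand-in written for this example so that it runs without Gym installed; the hyperparameters are assumptions, and none of this is the notebooks' own code.

```python
import random


class ChainEnv:
    """Hypothetical 5-state corridor that mimics the Gym 0.26 call signatures.

    The agent starts in state 0. Action 1 moves right, action 0 moves left.
    Reaching the last state pays reward 1 and terminates the episode.
    """

    n_states = 5
    n_actions = 2
    max_steps = 50

    def reset(self, seed=None):
        self.state = 0
        self.steps = 0
        return self.state, {}  # (observation, info)

    def step(self, action):
        move = 1 if action == 1 else -1
        self.state = min(max(self.state + move, 0), self.n_states - 1)
        self.steps += 1
        terminated = self.state == self.n_states - 1
        truncated = self.steps >= self.max_steps and not terminated
        reward = 1.0 if terminated else 0.0
        return self.state, reward, terminated, truncated, {}


def q_learning(env, episodes=500, alpha=0.1, gamma=0.9, epsilon=0.1, seed=0):
    """Epsilon-greedy tabular Q-Learning; returns the learned Q-table."""
    rng = random.Random(seed)
    q = [[0.0] * env.n_actions for _ in range(env.n_states)]

    for _ in range(episodes):
        state, _ = env.reset()
        done = False
        while not done:
            if rng.random() < epsilon:
                action = rng.randrange(env.n_actions)
            else:
                # Greedy choice, breaking ties at random.
                best = max(q[state])
                action = rng.choice(
                    [a for a in range(env.n_actions) if q[state][a] == best]
                )

            next_state, reward, terminated, truncated, _ = env.step(action)
            # Bootstrap from the next state unless the episode truly ended.
            bootstrap = 0.0 if terminated else gamma * max(q[next_state])
            q[state][action] += alpha * (reward + bootstrap - q[state][action])

            state = next_state
            done = terminated or truncated

    return q


if __name__ == "__main__":
    for state, row in enumerate(q_learning(ChainEnv())):
        print(state, [round(v, 3) for v in row])
```

The update bootstraps only when `terminated` is false. A truncated episode hit the step limit rather than a real end state, so its next-state value still counts toward the target.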

Algorithm Coverage

| Section | Algorithms |
|---|---|
| Stateless Bandits | Greedy, Decaying Greedy, UCB, Thompson Sampling |
| MDP Foundations | Monte Carlo, Bellman Equation |
| Dynamic Programming | Policy Iteration, Value Iteration |
| Temporal Difference | Sarsa, N-step Sarsa, Q-Learning |
| Model-Based | DynaQ, MPC, MBPO |
| Deep Value-Based | DQN, Double DQN, Dueling DQN |
| Policy Gradient | REINFORCE, Actor-Critic, PPO |
| Continuous Action | DDPG, SAC |
| Advanced | Imitation Learning, Offline RL, Goal-conditioned RL, Multi-agent |

Prerequisites

You should be comfortable with Python and have a basic understanding of neural networks. No prior RL experience is required — the course builds all concepts from scratch.
  • Python — familiarity with NumPy and basic Python scripting
  • PyTorch — basic tensor operations and nn.Sequential models
  • Math — high-school probability and linear algebra are sufficient

Quick Setup

1. Install Python 3.9

   Use Anaconda or pyenv to create an isolated Python 3.9 environment.

2. Install dependencies

   pip install torch==1.12.1 gym==0.26.2 matplotlib numpy

3. Clone the repository

   git clone https://github.com/lansinuote/Simple_Reinforcement_Learning.git
   cd Simple_Reinforcement_Learning

4. Open a notebook

   jupyter notebook

   Navigate to any numbered folder and open the first notebook to begin. The pip command in step 2 does not install Jupyter, so if the jupyter command is not found, install it first (for example with pip install notebook).
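After installing, it can help to confirm that the environment matches the pinned versions before opening a notebook. The following is a small sketch, not part of the repository; the `check_versions` helper and the `PINNED` mapping are names invented here, and the check relies only on the standard-library `importlib.metadata` module.

```python
from importlib import metadata

# Versions named in the setup steps above.
PINNED = {"torch": "1.12.1", "gym": "0.26.2"}


def check_versions(pinned=PINNED):
    """Return {package: (expected, installed, matches)} for each pinned package."""
    report = {}
    for name, expected in pinned.items():
        try:
            installed = metadata.version(name)
        except metadata.PackageNotFoundError:
            installed = None  # package is not installed in this environment
        # Local build tags such as "1.12.1+cu113" still count as a match.
        ok = installed is not None and installed.split("+")[0] == expected
        report[name] = (expected, installed, ok)
    return report


if __name__ == "__main__":
    for name, (expected, installed, ok) in check_versions().items():
        status = "OK" if ok else "MISMATCH"
        print(f"{name}: expected {expected}, found {installed} [{status}]")
```

A mismatch does not necessarily mean the notebooks will fail, but Gym in particular changed the return values of `reset()` and `step()` across releases, so a different version is the first thing to rule out when a notebook cell raises an unpacking error.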
