Before you open your first notebook, you need a Python environment with the correct package versions pinned. The Simple Reinforcement Learning series was updated on 2023-05-05 to target Python 3.9, PyTorch 1.12.1, and Gym 0.26.2 — small version mismatches in any of these three packages can cause silent behavioural differences or outright import errors. This guide walks you through every step from a fresh machine to a running Jupyter session.

System Requirements

  • Operating system: Windows 10/11, macOS 12+, or any modern Linux distribution
  • Python: 3.9 (3.10+ is not guaranteed compatible)
  • Package manager: Conda (recommended) or pyenv
  • Disk space: ~3 GB (PyTorch CPU build; add ~2 GB for a CUDA build)
  • GPU: Optional — every notebook runs on CPU; a CUDA-capable GPU speeds up training but is not required
The series was upgraded on 2023-05-05: Gym was bumped to 0.26.2, Python to 3.9, and PyTorch to 1.12.1. If you cloned the repository before that date, pull the latest changes and recreate your environment using the versions shown below.

Installation Steps

1. Install Conda (or pyenv)

Download and install Miniconda for your operating system. Miniconda gives you the conda command without bundling hundreds of packages you do not need. Verify the installation:
conda --version
If you prefer pyenv, follow the pyenv installation guide and use pyenv install 3.9.x to get Python 3.9.
2. Create a dedicated Conda environment

Create a new environment named rl with Python 3.9 pinned, then activate it:
conda create -n rl python=3.9
conda activate rl
Using an isolated environment keeps the project’s pinned versions separate from your other Python projects.
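To confirm that a shell or notebook is really using the rl environment, you can inspect the running interpreter. The sketch below is a minimal check, assuming the environment was created by Conda so that its directory carries the environment's name. The helper names python_is_supported and prefix_matches_env are hypothetical and not part of the series.

```python
import sys
from pathlib import Path


def python_is_supported(version_info=sys.version_info):
    """Return True when the interpreter is Python 3.9.x, the version this guide pins."""
    return tuple(version_info[:2]) == (3, 9)


def prefix_matches_env(env_name, prefix=sys.prefix):
    """Return True when the interpreter's install prefix ends in env_name.

    Assumption: Conda places each named environment in a directory with
    the same name (for example .../envs/rl), so the last path component
    of sys.prefix identifies the active environment.
    """
    return Path(prefix).name == env_name


if __name__ == "__main__":
    print("Executable :", sys.executable)
    print("Python 3.9 :", python_is_supported())
    print("Env is 'rl':", prefix_matches_env("rl"))
```

If either check prints False, the most likely explanation is that the rl environment was not activated in the current shell.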
3. Install the required packages

With the rl environment active, install PyTorch, Gym, and the supporting libraries:
pip install torch==1.12.1 gym==0.26.2 matplotlib numpy
Do not upgrade PyTorch or Gym beyond the versions listed above without testing first. Gym’s environment API changed significantly between 0.21 and 0.26, and the notebooks rely on the terminated / truncated step return signature introduced in 0.26.
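To make the 0.26 signature concrete, here is a minimal sketch of an episode loop written against it. CountdownEnv is a hypothetical stand-in that mimics only the reset() and step() return shapes, so the example runs without Gym installed. With Gym 0.26.2, an environment created by gym.make would be passed to run_episode in its place.

```python
class CountdownEnv:
    """Hypothetical stand-in mimicking the Gym 0.26 reset()/step() return shapes."""

    def __init__(self, start=3):
        self.start = start
        self.state = start

    def reset(self, seed=None):
        # Gym 0.26: reset() returns (observation, info).
        self.state = self.start
        return self.state, {}

    def step(self, action):
        # Gym 0.26: step() returns
        # (observation, reward, terminated, truncated, info).
        self.state -= 1
        terminated = self.state <= 0  # the task itself ended
        truncated = False             # no time limit in this toy example
        return self.state, 1.0, terminated, truncated, {}


def run_episode(env, policy, max_steps=100):
    """Roll out one episode and return the total reward."""
    obs, info = env.reset(seed=0)
    total_reward = 0.0
    for _ in range(max_steps):
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        # An episode is over when either flag is set.
        if terminated or truncated:
            break
    return total_reward


print(run_episode(CountdownEnv(start=3), policy=lambda obs: 0))  # 3.0
```

Code written for older Gym releases typically unpacks four values from step() and a single observation from reset(), which is why it fails against 0.26.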
4. Install Jupyter

Install Jupyter Notebook inside the same environment so the kernel picks up all the packages you just installed:
pip install jupyter
5. Clone the repository

Clone the Simple Reinforcement Learning repository from GitHub:
git clone https://github.com/lansinuote/Simple_Reinforcement_Learning.git
Then change into the project directory:
cd Simple_Reinforcement_Learning
The repository contains 18 numbered folders. Each folder holds one or more .ipynb notebooks and any helper files that chapter needs.
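If you want an overview of the chapters before opening Jupyter, a short script can list the notebooks per folder. This is a sketch under one assumption: that the chapter folders have names starting with a digit. The function name list_notebooks is hypothetical.

```python
from pathlib import Path


def list_notebooks(repo_root):
    """Map each digit-prefixed folder name to its sorted .ipynb file names.

    Assumption: chapter folders start with a digit; adjust the filter if
    the repository uses a different naming scheme.
    """
    chapters = {}
    for folder in sorted(Path(repo_root).iterdir()):
        if folder.is_dir() and folder.name[:1].isdigit():
            chapters[folder.name] = sorted(p.name for p in folder.glob("*.ipynb"))
    return chapters


if __name__ == "__main__":
    # Run from inside the cloned Simple_Reinforcement_Learning directory.
    for chapter, notebooks in list_notebooks(".").items():
        print(chapter, notebooks)
```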
6. Launch Jupyter Notebook

Start the Jupyter server from inside the project directory:
jupyter notebook
Your browser will open automatically at http://localhost:8888. Click any numbered folder to navigate to a chapter, then open its .ipynb file to start reading and running cells.

Verifying Your Installation

After launching Jupyter, open a new notebook and run the following snippet to confirm that Python, PyTorch, and Gym are present at the expected versions:
import sys, torch, gym

print("Python :", sys.version)
print("PyTorch:", torch.__version__)
print("Gym    :", gym.__version__)
You should see output similar to:
Python : 3.9.x | ...
PyTorch: 1.12.1
Gym    : 0.26.2
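Reading the versions by eye works, but the comparison can also be automated. The sketch below uses the standard-library importlib.metadata module to compare installed distributions against the pins from this guide. The names PINS and check_pins are hypothetical, and the get_version parameter exists only so the function can be exercised without the packages installed.

```python
from importlib import metadata

# Versions pinned by this guide.
PINS = {"torch": "1.12.1", "gym": "0.26.2"}


def check_pins(pins, get_version=metadata.version):
    """Return a list of human-readable problems; an empty list means all pins match."""
    problems = []
    for name, wanted in pins.items():
        try:
            found = get_version(name)
        except metadata.PackageNotFoundError:
            problems.append(f"{name}: not installed (expected {wanted})")
            continue
        # Wheels from the PyTorch index may carry a local suffix such as "+cu113".
        if found.split("+")[0] != wanted:
            problems.append(f"{name}: found {found}, expected {wanted}")
    return problems


if __name__ == "__main__":
    for line in check_pins(PINS) or ["All pinned packages match."]:
        print(line)
```

Run it in a notebook cell or from the terminal with the rl environment active, so that it inspects the same interpreter the notebooks use.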

Troubleshooting

Some notebooks import additional packages such as pygame, box2d-py, or ale-py for specific environments. If a cell raises a ModuleNotFoundError, install the missing package with pip install <package-name> in your terminal (with the rl environment active), then restart the Jupyter kernel and re-run the cell.
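Rather than waiting for a cell to fail, you can check up front which of these optional packages are importable. The sketch below uses importlib.util.find_spec from the standard library. The mapping from pip names to import names reflects how these packages are commonly imported and should be treated as an assumption, and missing_packages is a hypothetical helper.

```python
from importlib.util import find_spec

# pip package name -> import name (the two differ for some packages).
OPTIONAL = {"pygame": "pygame", "box2d-py": "Box2D", "ale-py": "ale_py"}


def missing_packages(packages):
    """Return the pip names whose import name cannot be found in this environment."""
    return [pip_name for pip_name, module in packages.items() if find_spec(module) is None]


if __name__ == "__main__":
    for pip_name in missing_packages(OPTIONAL):
        print(f"missing: pip install {pip_name}")
```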
| Symptom | Likely cause | Fix |
| --- | --- | --- |
| ModuleNotFoundError: No module named 'gym' | Wrong Python kernel selected | In Jupyter, go to Kernel → Change Kernel and select the rl environment |
| AttributeError: 'tuple' object has no attribute 'shape' | Gym version does not match the notebooks; reset() and step() return different structures before and after 0.26 | Install gym==0.26.2 and pull the latest notebooks |
| RuntimeError: expected scalar type Float but found Double | NumPy / PyTorch dtype mismatch | Add .float() when converting NumPy arrays to tensors |
| Slow training even with a GPU | PyTorch CUDA build not installed | Install a CUDA build of PyTorch 1.12.1, then confirm that torch.cuda.is_available() returns True |

Next Steps

With your environment ready, open notebook 01 to begin with the multi-armed bandit problem — the simplest RL setting and the perfect place to build intuition before moving on to Markov Decision Processes and deep networks. Refer back to the Introduction at any time for a map of where each chapter fits in the overall curriculum.