Environment Setup for Simple Reinforcement Learning

Before you open your first notebook, you need a Python environment with the correct package versions pinned. The Simple Reinforcement Learning series was updated on 2023-05-05 to target Python 3.9, PyTorch 1.12.1, and Gym 0.26.2 — small version mismatches in any of these three packages can cause silent behavioural differences or outright import errors. This guide walks you through every step from a fresh machine to a running Jupyter session.

System Requirements

Operating system: Windows 10/11, macOS 12+, or any modern Linux distribution
Python: 3.9 (3.10+ is not guaranteed compatible)
Package manager: Conda (recommended) or pyenv
Disk space: ~3 GB (PyTorch CPU build; add ~2 GB for a CUDA build)
GPU: Optional — every notebook runs on CPU; a CUDA-capable GPU speeds up training but is not required

If you cloned the repository before the 2023-05-05 upgrade (Gym 0.26.2, Python 3.9, PyTorch 1.12.1), pull the latest changes and recreate your environment using the versions shown below.

Installation Steps

Install Conda (or pyenv)

Download and install Miniconda for your operating system. Miniconda gives you the conda command without bundling hundreds of packages you do not need. Verify the installation:

conda --version

If you prefer pyenv, follow the pyenv installation guide and use pyenv install 3.9.x to get Python 3.9.

Create a dedicated Conda environment

Create a new environment named rl with Python 3.9 pinned, then activate it:

conda create -n rl python=3.9
conda activate rl

Using an isolated environment keeps the project’s pinned versions separate from your other Python projects.
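To confirm from Python itself that the interpreter belongs to the rl environment, a small check along these lines can help. It is a sketch that assumes Conda's default layout, where environments live under an envs/ directory; the helper name conda_env_name is made up for this example, and a pyenv or base install will simply report no environment name.

```python
import sys


def conda_env_name(executable):
    """Return the directory name that follows 'envs' in the path, or None.

    Hypothetical helper: assumes Conda's default <root>/envs/<name>/ layout.
    """
    # Normalise Windows separators so one code path handles both styles.
    parts = executable.replace("\\", "/").split("/")
    # Skip the last two parts so the file name itself is never returned.
    for i, part in enumerate(parts[:-2]):
        if part == "envs":
            return parts[i + 1]
    return None


print("Interpreter:", sys.executable)
print("Conda env  :", conda_env_name(sys.executable) or "none detected")
```

If the second line does not show rl, run conda activate rl again before installing anything.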

Install the required packages

With the rl environment active, install PyTorch, Gym, and the supporting libraries:

pip install torch==1.12.1 gym==0.26.2 matplotlib numpy

Do not upgrade PyTorch or Gym beyond the versions listed above without testing first. Gym’s environment API changed significantly between 0.21 and 0.26, and the notebooks rely on the terminated / truncated step return signature introduced in 0.26.
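The sketch below illustrates the 0.26-style loop: reset() returns an (observation, info) pair and step() returns five values, with the episode ending when either terminated or truncated is true. CountdownEnv is a hypothetical stand-in written only so the example runs without Gym installed; it is not part of Gym or of the notebooks.

```python
class CountdownEnv:
    """Hypothetical toy environment that mimics the Gym 0.26 signatures."""

    def __init__(self, start=3, max_steps=10):
        self.start = start
        self.max_steps = max_steps

    def reset(self, seed=None):
        # Gym 0.26 style: reset() returns (observation, info).
        self.value = self.start
        self.steps = 0
        return self.value, {}

    def step(self, action):
        # Gym 0.26 style: step() returns five values.
        self.value -= action
        self.steps += 1
        reward = -1.0
        terminated = self.value <= 0              # the task itself has ended
        truncated = self.steps >= self.max_steps  # a step limit cut it short
        return self.value, reward, terminated, truncated, {}


def run_episode(env, policy):
    """Run one episode and return the summed reward."""
    obs, info = env.reset(seed=0)
    total_reward, done = 0.0, False
    while not done:
        obs, reward, terminated, truncated, info = env.step(policy(obs))
        total_reward += reward
        done = terminated or truncated
    return total_reward


print(run_episode(CountdownEnv(), policy=lambda obs: 1))  # -3.0
```

With Gym 0.26.2 installed, an environment created by gym.make("CartPole-v1") should fit the same run_episode loop, given a policy that returns valid actions for that environment.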

Install Jupyter

Install Jupyter Notebook inside the same environment so the kernel picks up all the packages you just installed:

pip install jupyter

Clone the repository

Clone the Simple Reinforcement Learning repository from GitHub:

git clone https://github.com/lansinuote/Simple_Reinforcement_Learning.git

Then change into the project directory:

cd Simple_Reinforcement_Learning

The repository contains 18 numbered folders. Each folder holds one or more .ipynb notebooks and any helper files that chapter needs.

Launch Jupyter Notebook

Start the Jupyter server from inside the project directory:

jupyter notebook

Your browser will open automatically at http://localhost:8888. Click any numbered folder to navigate to a chapter, then open its .ipynb file to start reading and running cells.

Verifying Your Installation

After launching Jupyter, open a new notebook and run the following snippet to confirm all three core packages are present at the correct versions:

import sys, torch, gym

print("Python :", sys.version)
print("PyTorch:", torch.__version__)
print("Gym    :", gym.__version__)

You should see output similar to:

Python : 3.9.x | ...
PyTorch: 1.12.1
Gym    : 0.26.2
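If you would rather have the comparison done for you than read the versions by eye, a check like the following is one option. It is a sketch: the EXPECTED dictionary and the find_mismatches helper are names invented for this example, and it compares only the two pinned packages from this guide.

```python
from importlib.metadata import PackageNotFoundError, version

# Pins taken from this guide.
EXPECTED = {"torch": "1.12.1", "gym": "0.26.2"}


def find_mismatches(expected, get_version=version):
    """Return {package: problem} for packages that are missing or off-pin."""
    problems = {}
    for package, wanted in expected.items():
        try:
            installed = get_version(package)
        except PackageNotFoundError:
            problems[package] = "not installed"
            continue
        # Drop a local build tag such as "+cu116" before comparing.
        if installed.split("+")[0] != wanted:
            problems[package] = installed
    return problems


problems = find_mismatches(EXPECTED)
print(problems or "All pinned packages match.")
```

An empty result means both packages match the pins; anything else lists the package together with the version that was actually found.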

Troubleshooting

Some notebooks import additional packages such as pygame, box2d-py, or ale-py for specific environments. If a cell raises a ModuleNotFoundError, install the missing package with pip install <package-name> in your terminal (with the rl environment active), then restart the Jupyter kernel and re-run the cell.
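To find out ahead of time which of these optional packages are absent, you can probe for them without importing anything. The sketch below assumes the usual import names for the three packages mentioned above (pygame, Box2D, ale_py); adjust the mapping if a notebook imports something different. The helper name missing_packages is made up for this example.

```python
from importlib.util import find_spec

# pip package name -> top-level import name (assumed mapping; adjust as needed)
OPTIONAL = {"pygame": "pygame", "box2d-py": "Box2D", "ale-py": "ale_py"}


def missing_packages(mapping):
    """Return the pip names whose import name cannot be found."""
    return [pip_name for pip_name, module in mapping.items()
            if find_spec(module) is None]


for pip_name in missing_packages(OPTIONAL):
    print(f"missing: pip install {pip_name}")
```

Nothing is printed when every listed package is available in the active environment.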

Symptom	Likely cause	Fix
`ModuleNotFoundError: No module named 'gym'`	Wrong Python kernel selected	In Jupyter, go to Kernel → Change Kernel and select the `rl` environment
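`AttributeError: 'tuple' object has no attribute 'shape'`	Gym version and notebook code do not match: since 0.26, `reset()` returns an `(observation, info)` tuple, where older releases returned only the observation	Install `gym==0.26.2` and pull the latest notebooks so both use the 0.26 API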
`RuntimeError: expected scalar type Float but found Double`	NumPy / PyTorch dtype mismatch	Add `.float()` when converting NumPy arrays to tensors
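Slow training even with a GPU	PyTorch CUDA build not installed	Replace the build from the "Install the required packages" step with a CUDA build of PyTorch 1.12.1, following the official PyTorch install instructions for your CUDA version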
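The dtype mismatch in the table comes from NumPy defaulting to float64 while PyTorch layers default to float32. The snippet below is a minimal illustration with made-up values; the variable names and the 4-input linear layer are assumptions chosen for the example, not code from the notebooks.

```python
import numpy as np
import torch

state = np.array([0.1, -0.2, 0.3, 0.4])      # NumPy defaults to float64
layer = torch.nn.Linear(4, 2)                # PyTorch parameters default to float32

as_double = torch.from_numpy(state)          # keeps float64
as_float = torch.from_numpy(state).float()   # converts to float32

print(as_double.dtype, as_float.dtype)

# Works because the input and the layer weights are both float32.
output = layer(as_float)
print(output.shape)
```

Passing dtype=torch.float32 to torch.as_tensor is an equivalent way to do the conversion in one call.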

Next Steps

With your environment ready, open notebook 01 to begin with the multi-armed bandit problem — the simplest RL setting and the perfect place to build intuition before moving on to Markov Decision Processes and deep networks. Refer back to the Introduction at any time for a map of where each chapter fits in the overall curriculum.
