Overview

This guide will help you set up your environment to run all the code examples from the book. We provide multiple setup options to accommodate different preferences and hardware configurations.
Recommended for Beginners: Google Colab offers the easiest setup. All examples in the book were built and tested on Google Colab with a free T4 GPU (16GB VRAM).

Setup Options

You can choose from three main approaches:
  1. Cloud-based (Recommended): Google Colab with free GPU access
  2. Local with Conda: Full control with version-managed environment
  3. Local with pip: Quick setup if you already have Python 3.10

Cloud Setup (Google Colab)

Google Colab provides free access to GPUs and comes with most dependencies pre-installed, making it the most stable and hassle-free option.
1. Open a Chapter Notebook

Click on any “Open in Colab” badge from the book’s repository table of contents.
2. Enable GPU Runtime

In Google Colab, navigate to: Runtime → Change runtime type → Hardware accelerator → GPU → GPU type → T4
3. Install Chapter Dependencies

Each notebook includes an installation cell at the top. Uncomment and run it to install the required packages. For example, Chapter 1 requires:
!pip install transformers==4.41.2 accelerate==0.31.0
4. Run the Notebook

Execute cells sequentially to follow along with the book examples.
Google Colab’s free tier includes:
  • NVIDIA T4 GPU with 16GB VRAM
  • 12GB RAM
  • Sessions disconnect after idle periods; total runtime is capped at roughly 12 hours

Local Setup with Conda

Conda provides the most reliable local setup, with full version control and dependency management. This method does not require a separate C++ compiler installation.

Prerequisites

  • Storage: At least 10GB free disk space
  • RAM: 8GB minimum (16GB recommended)
  • GPU: NVIDIA GPU with CUDA support (optional but highly recommended)
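As a quick sanity check on the storage prerequisite above, you can query free disk space with Python's standard library (a minimal sketch; the 10 GB threshold is the book's stated requirement):

```python
import shutil

# Compare free disk space against the ~10 GB the book's environment needs
total, used, free = shutil.disk_usage(".")
free_gb = free / 1e9
print(f"Free disk space: {free_gb:.1f} GB")
if free_gb < 10:
    print("Warning: less than 10 GB free — consider clearing space first.")
```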
1. Install Miniconda

Download and install Miniconda with Python 3.10 for your operating system.
# Download installer from:
# https://docs.anaconda.com/free/miniconda/miniconda-other-installer-links/
# Select the Miniconda3 installer with Python 3.10 for your OS
# (e.g., Miniconda3 Windows 64-bit)
2. Create Conda Environment

Open your terminal and create a new environment named thellmbook:
conda create -n thellmbook python=3.10
3. Activate Environment

Activate the newly created environment:
conda activate thellmbook
4. Install Dependencies

Clone the repository and install dependencies:
# Clone the repository
git clone https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.git
cd Hands-On-Large-Language-Models

# Install all dependencies into the thellmbook environment
conda env update -n thellmbook -f environment.yml
5. Install PyTorch with GPU Support

Visit pytorch.org and select your configuration to get the appropriate installation command. For CUDA 11.8 (most common):
pip3 install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
The --upgrade flag ensures the CPU version is replaced with the GPU version.
6. Verify GPU Access

Test that PyTorch can access your GPU:
import torch
print(torch.cuda.is_available())  # Should return True
print(torch.cuda.get_device_name(0))  # Shows your GPU model
7. Start JupyterLab

Launch JupyterLab to run the notebooks:
jupyter lab
Make sure to select the thellmbook kernel (ipykernel) in the top-right corner of each notebook.

Local Setup with pip

For users who already have Python 3.10 installed and want a quick setup.
This method requires Microsoft Visual C++ 14.0 or greater on Windows. See the Troubleshooting section if you encounter C++ errors.
1. Verify Python Version

Ensure you have Python 3.10 installed:
python --version  # Should show Python 3.10.x
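The same check can be done from inside the interpreter; a minimal sketch (the helper name `is_supported` is ours, not from the book):

```python
import sys

def is_supported(version_info=sys.version_info):
    """The book's pinned dependencies target Python 3.10.x."""
    return (version_info[0], version_info[1]) == (3, 10)

print(sys.version, "->", "supported" if is_supported() else "unsupported")
```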
2. Clone Repository

git clone https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.git
cd Hands-On-Large-Language-Models
3. Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt
4. Install PyTorch GPU

Follow the same PyTorch installation steps from the Conda setup.

Core Dependencies

The following packages are required throughout the book:
| Category | Packages | Purpose |
| --- | --- | --- |
| Deep Learning | torch==2.3.1, transformers==4.41.2, sentence-transformers==3.0.1 | Core LLM framework |
| Data Processing | numpy==1.26.4, pandas==2.2.2, datasets==2.20.0 | Data manipulation |
| Visualization | matplotlib==3.9.0 | Plotting and visualization |
| ML Tools | scikit-learn==1.5.0, evaluate==0.4.2, scipy>=1.15.0 | Machine learning utilities |
| NLP | sentencepiece==0.2.0, nltk==3.8.1 | Text processing |
| Environment | jupyterlab==4.2.2, ipywidgets==8.1.3 | Interactive notebooks |

Chapter-Specific Dependencies

Some chapters require additional packages:
pip install bertopic==0.16.3 datamapplot==0.3.0

GPU Requirements

While you can run some examples on CPU, most chapters require GPU acceleration for practical performance.

Minimum Requirements

  • VRAM: 4GB (6GB+ recommended)
  • CUDA: Version 11.8 or later
  • GPU Examples: NVIDIA RTX 3060, T4 (Colab), A10G, or better
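A useful rule of thumb when judging whether a model fits in VRAM: the weights alone take parameters × bytes-per-parameter (2 bytes in float16), before counting activations and the KV cache. A sketch of that arithmetic (the function name is ours):

```python
def weights_vram_gb(n_params, bytes_per_param=2):
    """Rough VRAM needed for model weights alone (float16 by default).
    Ignores activations, optimizer state, and the KV cache."""
    return n_params * bytes_per_param / 1e9

# A 1.1B-parameter model in float16 needs about 2.2 GB just for weights,
# so it fits in a 4GB card; a 7B model (~14 GB) needs a T4-class GPU or larger.
print(weights_vram_gb(1.1e9))  # 2.2
print(weights_vram_gb(7e9))    # 14.0
```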

Cloud Alternatives

If you don’t have a local GPU:
| Platform | GPU Options | Free Tier | Notes |
| --- | --- | --- | --- |
| Google Colab | T4 (16GB) | ✅ Yes | Recommended; session limits |
| Kaggle | P100 (16GB), T4 | ✅ Yes | 30 hours/week free |
| AWS SageMaker | Various | 🟡 Limited | ml.t3.medium free tier |
| Azure ML | Various | 🟡 Limited | Some free credits |
| Paperspace | Various | ❌ No | Affordable hourly rates |

Troubleshooting

“Microsoft Visual C++ 14.0 or greater is required”

This error occurs on Windows when installing packages that need compilation.
1. Download Build Tools

Visit visualstudio.microsoft.com/visual-cpp-build-tools and click “Download Build Tools”.
2. Run Installer

Launch the installer and click “Continue” to prepare the installation.
3. Select C++ Development

Choose “Desktop development with C++” from the workload options.
4. Install

Click “Install” to install the necessary C++ tools.
5. Retry pip install

After installation completes, retry your pip install command.
Alternative: Use the conda installation method, which doesn’t require C++ build tools.

CUDA Not Available

If torch.cuda.is_available() returns False:
  1. Verify GPU drivers: Update to the latest NVIDIA drivers
    nvidia-smi  # Should show your GPU
    
  2. Reinstall PyTorch: Make sure you installed the CUDA version
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    
  3. Check CUDA compatibility: Your GPU must support CUDA 11.8 or later

Out of Memory Errors

If you get CUDA out-of-memory errors:
# Reduce batch size
batch_size = 4  # Try 2 or 1 if still failing

# Use gradient checkpointing (trades compute for memory)
model.gradient_checkpointing_enable()

# Use mixed precision training
from torch.cuda.amp import autocast
with autocast():
    outputs = model(**inputs)  # forward pass runs in float16 where safe
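The "reduce batch size" advice can also be automated. Below is a minimal sketch of a halving retry loop; we catch `MemoryError` as a stand-in (with PyTorch you would catch `torch.cuda.OutOfMemoryError`), and `step` is any callable of yours that runs one training step at a given batch size:

```python
def run_with_fallback(step, batch_size, min_batch=1):
    """Call step(batch_size), halving the batch on out-of-memory errors."""
    while batch_size >= min_batch:
        try:
            return step(batch_size)
        except MemoryError:  # stand-in for torch.cuda.OutOfMemoryError
            batch_size //= 2  # halve and retry
    raise MemoryError("even the minimum batch size does not fit")

# Toy step that only "fits" at batch size <= 2:
def toy_step(bs):
    if bs > 2:
        raise MemoryError
    return bs

print(run_with_fallback(toy_step, 8))  # 2
```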

Package Version Conflicts

If you encounter version conflicts:
# requirements.txt has exact versions that work together
pip install -r requirements.txt

JupyterLab Kernel Issues

If JupyterLab doesn’t show the correct kernel:
# Install ipykernel in your environment
conda activate thellmbook
pip install ipykernel
python -m ipykernel install --user --name=thellmbook

# Restart JupyterLab
jupyter lab

Import Errors

If you get import errors:
  1. Verify environment: Make sure you’re in the correct conda environment
    conda activate thellmbook
    python -c "import sys; print(sys.executable)"
    
  2. Reinstall package: Try reinstalling the problematic package
    pip uninstall package_name
    pip install package_name==version
    
  3. Check dependencies: Some packages need others to be installed first
    pip install -r requirements.txt  # Installs in correct order
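A quick way to see which environment a module actually resolves from, stdlib-only (the helper name `locate` is ours):

```python
import importlib.util

def locate(module_name):
    """Return the file Python would import `module_name` from, or None."""
    spec = importlib.util.find_spec(module_name)
    return getattr(spec, "origin", None) if spec else None

print(locate("json"))             # a path inside your active environment
print(locate("no_such_package"))  # None — invisible to this interpreter
```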
    

Next Steps

Once your environment is set up:
  1. Start with Chapter 1: Introduction to Language Models
  2. Review the Prerequisites to ensure you have the necessary background
  3. Join the community discussions on GitHub
Save your environment configuration! After successfully setting up, you can export it:
# Conda users
conda env export > my_environment.yml

# pip users  
pip freeze > my_requirements.txt