Overview

This guide will help you set up your environment to run all the code examples from the book. We provide multiple setup options to accommodate different preferences and hardware configurations.
Recommended for Beginners: Google Colab offers the easiest setup. All examples in the book were built and tested on Google Colab with a free T4 GPU (16GB VRAM).

Setup Options

You can choose from three main approaches:
  1. Cloud-based (Recommended): Google Colab with free GPU access
  2. Local with Conda: Full control with version-managed environment
  3. Local with pip: Quick setup if you already have Python 3.10

Cloud Setup (Google Colab)

Google Colab provides free access to GPUs and comes with most dependencies pre-installed, making it the most stable and hassle-free option.
1. Open a Chapter Notebook

Click on any “Open in Colab” badge from the book’s repository table of contents.
2. Enable GPU Runtime

In Google Colab, navigate to: Runtime → Change runtime type → Hardware accelerator → GPU → GPU type → T4
3. Install Chapter Dependencies

Each notebook includes an installation cell at the top. Uncomment and run it to install the required packages. For example, Chapter 1 requires:
!pip install transformers==4.41.2 accelerate==0.31.0
4. Run the Notebook

Execute cells sequentially to follow along with the book examples.
Google Colab’s free tier includes:
  • NVIDIA T4 GPU with 16GB VRAM
  • 12GB RAM
  • Sessions disconnect after idle periods; total runtime is capped at roughly 12 hours

Local Setup with Conda

Conda provides the most reliable local setup, with full version control and dependency management. This method does not require a separate C++ compiler installation.

Prerequisites

  • Storage: At least 10GB free disk space
  • RAM: 8GB minimum (16GB recommended)
  • GPU: NVIDIA GPU with CUDA support (optional but highly recommended)
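As a quick sanity check on the storage prerequisite above, you can query free disk space with Python's standard library (a minimal sketch; the 10 GB threshold is the book's stated requirement):

```python
import shutil

# Compare free disk space against the ~10 GB the book's environment needs
total, used, free = shutil.disk_usage(".")
free_gb = free / 1e9
print(f"Free disk space: {free_gb:.1f} GB")
if free_gb < 10:
    print("Warning: less than 10 GB free — consider clearing space first.")
```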
1. Install Miniconda

Download and install Miniconda with Python 3.10 for your operating system.
# Download installer from:
# https://docs.anaconda.com/free/miniconda/miniconda-other-installer-links/
# Select the Miniconda3 installer with Python 3.10 for your OS
# (e.g., Miniconda3 Windows 64-bit)
2. Create Conda Environment

Open your terminal and create a new environment named thellmbook:
conda create -n thellmbook python=3.10
3. Activate Environment

Activate the newly created environment:
conda activate thellmbook
4. Install Dependencies

Clone the repository and install dependencies:
# Clone the repository
git clone https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.git
cd Hands-On-Large-Language-Models

# Install all dependencies into the thellmbook environment
conda env update -n thellmbook -f environment.yml
5. Install PyTorch with GPU Support

Visit pytorch.org and select your configuration to get the appropriate installation command. For CUDA 11.8 (most common):
pip3 install --upgrade torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
The --upgrade flag ensures the CPU version is replaced with the GPU version.
6. Verify GPU Access

Test that PyTorch can access your GPU:
import torch
print(torch.cuda.is_available())  # Should return True
print(torch.cuda.get_device_name(0))  # Shows your GPU model
7. Start JupyterLab

Launch JupyterLab to run the notebooks:
jupyter lab
Make sure to select the thellmbook kernel (ipykernel) in the top-right corner of each notebook.

Local Setup with pip

For users who already have Python 3.10 installed and want a quick setup.
This method requires Microsoft Visual C++ 14.0 or greater on Windows. See the Troubleshooting section if you encounter C++ errors.
1. Verify Python Version

Ensure you have Python 3.10 installed:
python --version  # Should show Python 3.10.x
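The same check can be done from inside the interpreter; a minimal sketch (the helper name `is_supported` is ours, not from the book):

```python
import sys

def is_supported(version_info=sys.version_info):
    """The book's pinned dependencies target Python 3.10.x."""
    return (version_info[0], version_info[1]) == (3, 10)

print(sys.version, "->", "supported" if is_supported() else "unsupported")
```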
2. Clone Repository

git clone https://github.com/HandsOnLLM/Hands-On-Large-Language-Models.git
cd Hands-On-Large-Language-Models
3. Install Dependencies

pip install --upgrade pip
pip install -r requirements.txt
4. Install PyTorch GPU

Follow the same PyTorch installation steps from the Conda setup.

Core Dependencies

The following packages are required throughout the book:
| Category | Packages | Purpose |
| --- | --- | --- |
| Deep Learning | torch==2.3.1, transformers==4.41.2, sentence-transformers==3.0.1 | Core LLM framework |
| Data Processing | numpy==1.26.4, pandas==2.2.2, datasets==2.20.0 | Data manipulation |
| Visualization | matplotlib==3.9.0 | Plotting and visualization |
| ML Tools | scikit-learn==1.5.0, evaluate==0.4.2, scipy>=1.15.0 | Machine learning utilities |
| NLP | sentencepiece==0.2.0, nltk==3.8.1 | Text processing |
| Environment | jupyterlab==4.2.2, ipywidgets==8.1.3 | Interactive notebooks |

Chapter-Specific Dependencies

Some chapters require additional packages:
pip install bertopic==0.16.3 datamapplot==0.3.0

GPU Requirements

While you can run some examples on CPU, most chapters require GPU acceleration for practical performance.

Minimum Requirements

  • VRAM: 4GB (6GB+ recommended)
  • CUDA: Version 11.8 or later
  • GPU Examples: NVIDIA RTX 3060, T4 (Colab), A10G, or better
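A useful rule of thumb when judging whether a model fits in VRAM: the weights alone take parameters × bytes-per-parameter (2 bytes in float16), before counting activations and the KV cache. A sketch of that arithmetic (the function name is ours):

```python
def weights_vram_gb(n_params, bytes_per_param=2):
    """Rough VRAM needed for model weights alone (float16 by default).
    Ignores activations, optimizer state, and the KV cache."""
    return n_params * bytes_per_param / 1e9

# A 1.1B-parameter model in float16 needs about 2.2 GB just for weights,
# so it fits in a 4GB card; a 7B model (~14 GB) needs a T4-class GPU or larger.
print(weights_vram_gb(1.1e9))  # 2.2
print(weights_vram_gb(7e9))    # 14.0
```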

Cloud Alternatives

If you don’t have a local GPU:
| Platform | GPU Options | Free Tier | Notes |
| --- | --- | --- | --- |
| Google Colab | T4 (16GB) | ✅ Yes | Recommended; session limits |
| Kaggle | P100 (16GB), T4 | ✅ Yes | 30 hours/week free |
| AWS SageMaker | Various | 🟡 Limited | ml.t3.medium free tier |
| Azure ML | Various | 🟡 Limited | Some free credits |
| Paperspace | Various | ❌ No | Affordable hourly rates |

Troubleshooting

“Microsoft Visual C++ 14.0 or greater is required”

This error occurs on Windows when installing packages that need compilation.
1. Download Build Tools

Visit visualstudio.microsoft.com/visual-cpp-build-tools and click “Download Build Tools”.
2. Run Installer

Launch the installer and click “Continue” to prepare the installation.
3. Select C++ Development

Choose “Desktop development with C++” from the workload options.
4. Install

Click “Install” to install the necessary C++ tools.
5. Retry pip install

After installation completes, retry your pip install command.
Alternative: Use the conda installation method, which doesn’t require C++ build tools.

CUDA Not Available

If torch.cuda.is_available() returns False:
  1. Verify GPU drivers: Update to the latest NVIDIA drivers
    nvidia-smi  # Should show your GPU
    
  2. Reinstall PyTorch: Make sure you installed the CUDA version
    pip install torch torchvision torchaudio --index-url https://download.pytorch.org/whl/cu118
    
  3. Check CUDA compatibility: Your GPU must support CUDA 11.8 or later

Out of Memory Errors

If you get CUDA out-of-memory errors:
# Reduce batch size
batch_size = 4  # Try 2 or 1 if still failing

# Use gradient checkpointing (trades compute for memory)
model.gradient_checkpointing_enable()

# Use mixed precision training
from torch.cuda.amp import autocast
with autocast():
    outputs = model(**inputs)  # forward pass runs in float16 where safe
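The "reduce batch size" advice can also be automated. Below is a minimal sketch of a halving retry loop; we catch `MemoryError` as a stand-in (with PyTorch you would catch `torch.cuda.OutOfMemoryError`), and `step` is any callable of yours that runs one training step at a given batch size:

```python
def run_with_fallback(step, batch_size, min_batch=1):
    """Call step(batch_size), halving the batch on out-of-memory errors."""
    while batch_size >= min_batch:
        try:
            return step(batch_size)
        except MemoryError:  # stand-in for torch.cuda.OutOfMemoryError
            batch_size //= 2  # halve and retry
    raise MemoryError("even the minimum batch size does not fit")

# Toy step that only "fits" at batch size <= 2:
def toy_step(bs):
    if bs > 2:
        raise MemoryError
    return bs

print(run_with_fallback(toy_step, 8))  # 2
```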

Package Version Conflicts

If you encounter version conflicts:
# requirements.txt has exact versions that work together
pip install -r requirements.txt

JupyterLab Kernel Issues

If JupyterLab doesn’t show the correct kernel:
# Install ipykernel in your environment
conda activate thellmbook
pip install ipykernel
python -m ipykernel install --user --name=thellmbook

# Restart JupyterLab
jupyter lab

Import Errors

If you get import errors:
  1. Verify environment: Make sure you’re in the correct conda environment
    conda activate thellmbook
    python -c "import sys; print(sys.executable)"
    
  2. Reinstall package: Try reinstalling the problematic package
    pip uninstall package_name
    pip install package_name==version
    
  3. Check dependencies: Some packages need others to be installed first
    pip install -r requirements.txt  # Installs in correct order
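A quick way to see which environment a module actually resolves from, stdlib-only (the helper name `locate` is ours):

```python
import importlib.util

def locate(module_name):
    """Return the file Python would import `module_name` from, or None."""
    spec = importlib.util.find_spec(module_name)
    return getattr(spec, "origin", None) if spec else None

print(locate("json"))             # a path inside your active environment
print(locate("no_such_package"))  # None — invisible to this interpreter
```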
    

Next Steps

Once your environment is set up:
  1. Start with Chapter 1: Introduction to Language Models
  2. Review the Prerequisites to ensure you have the necessary background
  3. Join the community discussions on GitHub
Save your environment configuration! After successfully setting up, you can export it:
# Conda users
conda env export > my_environment.yml

# pip users  
pip freeze > my_requirements.txt