Overview
“Hands-On Large Language Models” is designed to be accessible to readers with a foundational understanding of Python and machine learning concepts. This page outlines what you should know before diving into the book and provides resources to help you prepare.

Don’t worry if you don’t have all the prerequisites mastered! The book’s visual learning approach, with nearly 300 custom illustrations, makes complex concepts easier to grasp, even for relative newcomers to the field.
Required Knowledge
Python Programming
You should be comfortable with Python fundamentals, as all code examples are in Python. What you need to know:

- Basic syntax (variables, functions, loops, conditionals)
- Data structures (lists, dictionaries, tuples)
- Working with libraries (importing modules, using pip/conda)
- Object-oriented programming basics (classes, methods)
- File I/O operations
- Using Jupyter notebooks
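As a rough self-test of this level, you should be able to read a short snippet like the following (a made-up example, not from the book) without difficulty:

```python
# Self-test: if this reads naturally, your Python is ready for the book.
from collections import Counter

def word_frequencies(text: str) -> dict[str, int]:
    """Count how often each lowercase word appears in the text."""
    words = text.lower().split()
    return dict(Counter(words))

freqs = word_frequencies("the cat sat on the mat")
print(freqs["the"])  # "the" appears twice
```

If the function definition, type hints, and dictionary handling here feel comfortable, you have the Python background the book assumes.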
Recommended resources:

- Python for Everybody (free course)
- Python.org Official Tutorial
- Real Python (practical tutorials)
Machine Learning Fundamentals
A basic understanding of ML concepts will help you follow along more easily. Core concepts you should be familiar with:

Supervised Learning
- Training and test sets
- Features and labels
- Classification vs. regression
- Model evaluation (accuracy, precision, recall, F1)
- Overfitting and underfitting
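The evaluation metrics listed above all follow from the same four confusion-matrix counts. A minimal sketch with made-up counts for a binary classifier:

```python
# Toy confusion-matrix counts for a binary classifier (made-up numbers).
tp, fp, fn, tn = 40, 10, 5, 45  # true/false positives, false/true negatives

accuracy = (tp + tn) / (tp + fp + fn + tn)
precision = tp / (tp + fp)  # of predicted positives, how many were right
recall = tp / (tp + fn)     # of actual positives, how many were found
f1 = 2 * precision * recall / (precision + recall)

print(f"accuracy={accuracy:.2f} precision={precision:.2f} "
      f"recall={recall:.2f} f1={f1:.2f}")
```

If you can predict roughly what these four numbers will be from the counts, you understand model evaluation well enough for the book.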
Neural Networks Basics
- What a neural network is conceptually
- Layers, neurons, and activation functions
- Forward pass and backpropagation (high-level understanding)
- Loss functions and optimization
- Training, validation, and testing
Key ML Terms
- Embeddings: Dense vector representations of data
- Fine-tuning: Adapting a pre-trained model to specific tasks
- Transfer learning: Using knowledge from one task for another
- Batch processing: Processing multiple examples simultaneously
- Gradient descent: Optimization algorithm for training
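Gradient descent, for instance, is nothing more than repeatedly stepping against the gradient. A minimal sketch minimizing f(x) = (x - 3)², whose minimum is at x = 3:

```python
# Minimal gradient descent on f(x) = (x - 3)^2; its gradient is 2*(x - 3).
x = 0.0
learning_rate = 0.1

for _ in range(100):
    gradient = 2 * (x - 3)
    x -= learning_rate * gradient  # step opposite the gradient

print(round(x, 4))  # converges toward the minimum at x = 3
```

Training a neural network applies this same loop to millions of parameters at once, with the loss function playing the role of f.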
Quick self-check: if you can answer these questions, you have enough ML background.

- Can you explain the difference between training and inference?
- What is the purpose of a validation set?
- Why do we use activation functions in neural networks?
- What does “learning rate” mean in the context of training?
Recommended resources:

- Andrew Ng’s Machine Learning Course (Coursera)
- Fast.ai Practical Deep Learning
- 3Blue1Brown’s Neural Networks Series (visual explanations)
Deep Learning & NLP Basics
While the book teaches you about LLMs, some foundational deep learning and NLP knowledge is helpful. What helps (but isn’t strictly required):

- Basic NLP concepts: Tokenization, text preprocessing, n-grams
- Word embeddings: Understanding that words can be represented as vectors (Word2Vec, GloVe concepts)
- Sequence models: High-level awareness of RNNs/LSTMs
- Attention mechanism: Conceptual understanding (the book explains this in detail)
- Pre-training and fine-tuning: General idea of transfer learning in NLP
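The idea that words can be represented as vectors becomes concrete once you compute a similarity between them. A sketch using NumPy with made-up three-dimensional "embeddings" (real word vectors have hundreds of dimensions):

```python
import numpy as np

# Made-up 3-d "embeddings"; real word vectors have hundreds of dimensions.
cat = np.array([0.9, 0.8, 0.1])
dog = np.array([0.85, 0.75, 0.2])
car = np.array([0.1, 0.2, 0.95])

def cosine_similarity(a, b):
    """Cosine of the angle between two vectors: 1.0 means same direction."""
    return float(np.dot(a, b) / (np.linalg.norm(a) * np.linalg.norm(b)))

print(cosine_similarity(cat, dog))  # high: similar "meanings"
print(cosine_similarity(cat, car))  # lower: dissimilar
```

Word2Vec, GloVe, and the embeddings inside LLMs all rest on this same intuition: semantically similar items get geometrically close vectors.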
Recommended resources:

- Stanford CS224N: NLP with Deep Learning (lectures available free)
- The Illustrated Transformer by Jay Alammar (book co-author)
- StatQuest’s Neural Networks by Josh Starmer
Recommended Background
Mathematics
You don’t need to be a math expert, but familiarity with these concepts helps.

Linear Algebra (most important):
- Vectors and matrices
- Matrix multiplication
- Dot products
- Vector spaces and dimensions

Calculus (conceptual):
- Derivatives and gradients (conceptually)
- Chain rule (for understanding backpropagation)

Probability & Statistics:
- Probability distributions
- Mean, variance, standard deviation
- Sampling and randomness
The book focuses on practical implementation rather than mathematical theory. Understanding these concepts at a high level is sufficient for most readers.
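As a sense of scale, each of the operations listed above is a single NumPy call. A small sketch (all values made up) showing them in action:

```python
import numpy as np

v = np.array([1.0, 2.0, 3.0])
w = np.array([4.0, 5.0, 6.0])
print(np.dot(v, w))  # dot product: 1*4 + 2*5 + 3*6 = 32

M = np.array([[1.0, 0.0, 0.0],
              [0.0, 2.0, 0.0]])  # a 2x3 matrix
print(M @ v)  # matrix-vector product: [1.0, 4.0]

samples = np.array([2.0, 4.0, 4.0, 4.0, 5.0, 5.0, 7.0, 9.0])
print(samples.mean(), samples.std())  # mean 5.0, population std 2.0
```

If you can follow what each line computes, your linear algebra and statistics are at the level the book expects.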
Recommended resources:

- 3Blue1Brown’s Essence of Linear Algebra
- Khan Academy: Linear Algebra
- StatQuest (statistics concepts)
Libraries & Tools
You’ll work extensively with these libraries throughout the book:

| Library | Purpose | Required Experience |
|---|---|---|
| NumPy | Numerical computing | Basic array operations |
| Pandas | Data manipulation | Reading CSV, DataFrame basics |
| Matplotlib | Visualization | Simple plotting |
| PyTorch | Deep learning framework | Helpful but taught in book |
| Transformers | Hugging Face library | Not required - taught in book |
| Scikit-learn | ML utilities | Basic familiarity helpful |
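A quick way to check which of these libraries are already installed is to try importing each one. A sketch, assuming the standard PyPI distribution names (note that Scikit-learn imports as `sklearn` and PyTorch as `torch`):

```python
# Environment check: report which of the book's core libraries import cleanly.
import importlib

libraries = ["numpy", "pandas", "matplotlib", "torch", "transformers", "sklearn"]
for name in libraries:
    try:
        module = importlib.import_module(name)
        version = getattr(module, "__version__", "unknown")
        print(f"{name:<14} OK  (version {version})")
    except ImportError:
        print(f"{name:<14} MISSING - install it before the relevant chapter")
```

Missing libraries aren't a problem on day one; the Environment Setup guide covers installing them.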
The Visual Learning Approach
One of the unique aspects of “Hands-On Large Language Models” is its heavy use of visual explanations.

What Makes This Book Different
- ~300 custom illustrations: Complex concepts explained through clear, custom-made diagrams
- Visual-first approach: Every major concept is accompanied by visual aids
- Step-by-step breakdowns: Complex architectures shown layer by layer
- Code-to-diagram mapping: Code examples directly connected to visual representations
This visual learning approach, pioneered by co-author Jay Alammar in his popular blog posts like “The Illustrated Transformer,” makes LLM concepts accessible even to those without extensive ML backgrounds.
Who This Book Is For
This book is ideal if you:

- Are a software engineer wanting to work with LLMs
- Have ML experience but are new to transformers
- Want to understand how LLMs work under the hood
- Need practical, hands-on examples (not just theory)
- Prefer learning through visualization and code
- Want to build real applications with LLMs
This book may not be the right fit (yet) if you:

- Have never programmed before
- Are completely new to machine learning concepts
- Don’t have access to a GPU (though Colab is free!)
- Prefer pure mathematical theory over practical implementation
Preparatory Learning Path
If you’re starting from scratch, here’s a recommended learning path.

Learn Python Basics (2-4 weeks)
Complete a Python fundamentals course covering:
- Variables, functions, and control flow
- Lists, dictionaries, and data structures
- Object-oriented programming basics
- Working with libraries
Get Familiar with NumPy/Pandas (1 week)
Learn basic data manipulation:
- Creating and manipulating arrays
- Reading and processing CSV files
- Basic data analysis
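The level of Pandas fluency needed is modest; the typical pattern looks like the following sketch (the CSV content and column names are made up, and `io.StringIO` stands in for a real file):

```python
import io
import pandas as pd

# Simulate reading a small CSV; in practice you'd pass a filename to read_csv.
csv_text = """name,score
alice,90
bob,85
carol,95
"""
df = pd.read_csv(io.StringIO(csv_text))

print(len(df))             # number of rows: 3
print(df["score"].mean())  # average score: 90.0
print(df[df["score"] > 88]["name"].tolist())  # names with score above 88
```

Reading a file, selecting a column, computing a summary statistic, and filtering rows: that's most of the Pandas you'll need.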
Understand ML Fundamentals (2-3 weeks)
Grasp core machine learning concepts:
- Supervised learning basics
- Training vs. testing
- Classification and regression
- Model evaluation
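These four ideas come together in a few lines of scikit-learn. A minimal sketch using a toy dataset bundled with the library (the model choice and split ratio are illustrative, not from the book):

```python
from sklearn.datasets import load_iris
from sklearn.linear_model import LogisticRegression
from sklearn.metrics import accuracy_score
from sklearn.model_selection import train_test_split

# Split a toy classification dataset into training and test sets.
X, y = load_iris(return_X_y=True)
X_train, X_test, y_train, y_test = train_test_split(
    X, y, test_size=0.25, random_state=42
)

# Train on one part, evaluate on the held-out part.
model = LogisticRegression(max_iter=1000)
model.fit(X_train, y_train)
accuracy = accuracy_score(y_test, model.predict(X_test))
print(f"test accuracy: {accuracy:.2f}")
```

The train/fit/predict/score rhythm here is the supervised-learning workflow the book assumes you recognize.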
Learn Basic Neural Networks (1-2 weeks)
Understand how neural networks work:
- Layers and neurons
- Activation functions
- Training process (high-level)
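A forward pass through one layer is just a matrix multiply, a bias addition, and an activation function. A tiny NumPy sketch with made-up weights:

```python
import numpy as np

def relu(x):
    """ReLU activation: keep positives, zero out negatives."""
    return np.maximum(0, x)

# One hidden layer: 3 inputs -> 2 neurons (weights and biases are made up).
x = np.array([1.0, 2.0, 3.0])
W = np.array([[0.5, -1.0, 0.25],
              [-0.5, 0.5, 0.5]])
b = np.array([0.1, -0.2])

hidden = relu(W @ x + b)  # forward pass through the layer
print(hidden)
```

Deep networks stack many such layers; training adjusts W and b via backpropagation, which the previous step's resources cover at a high level.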
Explore NLP Basics (1 week)
Get familiar with text processing:
- Tokenization concepts
- Word embeddings (conceptual)
- Common NLP tasks
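At its simplest, tokenization splits text into units and maps each one to an integer ID. Real LLM tokenizers use subword pieces (the book covers this in depth), but a naive whitespace version shows the idea:

```python
# Naive whitespace tokenizer; real LLM tokenizers split into subword pieces.
def tokenize(text: str) -> list[str]:
    return text.lower().split()

# Map each unique token to an integer ID, the way model vocabularies do.
tokens = tokenize("LLMs turn text into token IDs")
vocab = {token: idx for idx, token in enumerate(sorted(set(tokens)))}
ids = [vocab[token] for token in tokens]

print(tokens)
print(ids)
```

Everything a language model sees is a sequence of such IDs; the embeddings discussed earlier turn each ID into a vector.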
Skill Level Assessment
Use this checklist to determine if you’re ready:

✅ Ready to Start Now
- Comfortable writing Python scripts
- Understand basic ML concepts (training, testing, evaluation)
- Can work with NumPy arrays and Pandas DataFrames
- Know what neural networks are conceptually
- Familiar with Jupyter notebooks
- Have access to Google Colab or a local GPU
📚 Need Some Preparation (1-2 weeks)
- Know Python but rusty on ML concepts
- Haven’t used NumPy/Pandas extensively
- Unclear on neural network basics
- Need to set up development environment
🎓 Recommended Preparatory Study (4-8 weeks)
- Python beginner
- No ML experience
- Never worked with data science libraries
- Not familiar with deep learning concepts
Complementary Resources
While Reading the Book
Enhance your learning with these companion resources.

From the Authors:
- The Illustrated Transformer - Jay Alammar
- The Illustrated BERT - Jay Alammar
- A Visual Guide to Quantization - Maarten Grootendorst
- A Visual Guide to Mamba and State Space Models - Maarten Grootendorst
Courses & Videos:
- How Transformer LLMs Work - DeepLearning.AI
- StatQuest Machine Learning - Josh Starmer
Community & Support
Official Resources:
- GitHub Repository - Code examples and issues
- Follow Jay Alammar and Maarten Grootendorst on LinkedIn
Getting Help:
- GitHub Issues for code-specific questions
- Hugging Face Forums for library questions
- Stack Overflow with relevant tags
What You’ll Learn
By working through the book, you’ll gain practical experience with:

Core Concepts
- How transformers work
- Tokenization and embeddings
- Attention mechanisms
- Model architectures
Practical Skills
- Using Hugging Face Transformers
- Text classification and clustering
- Prompt engineering
- Fine-tuning models
Advanced Techniques
- Retrieval-Augmented Generation (RAG)
- Creating custom embeddings
- Multimodal models
- Parameter-efficient fine-tuning (PEFT)
Real Applications
- Semantic search systems
- Topic modeling
- Text generation pipelines
- Production-ready LLM apps
Next Steps
Set Up Your Environment
Follow the Environment Setup guide to prepare your development environment.
Start Chapter 1
Open the Chapter 1 notebook and begin your LLM journey!
