PDF Viewer

ThinkEx includes a built-in PDF viewer that lets you work with PDF documents directly in your workspace. Each PDF is added as a card, and OCR text extraction makes its content searchable and available to the AI.

Adding PDFs

There are two ways to add PDF documents to your workspace: upload a file, or ask the AI assistant to fetch one from a URL.

Upload

Click the + button: Open the add card menu in your workspace
Select PDF: Choose a PDF file from your computer
Wait for upload: The file uploads to storage and OCR processing begins automatically

AI Assistant

Ask the AI to process a PDF URL:

Add this PDF to the workspace: https://example.com/document.pdf

The AI will download and add it as a PDF card.

PDF files are uploaded to Supabase storage (or local storage in development mode) before OCR processing begins. This avoids Next.js body size limits for large files.

OCR Text Extraction

ThinkEx automatically performs Optical Character Recognition (OCR) on uploaded PDFs using Azure Document Intelligence.

OCR Process

Upload

PDF file is uploaded to storage

OCR Processing

Azure Document Intelligence analyzes the document

Text Extraction

Text content is extracted and converted to markdown

Indexing

Extracted text is stored for search and AI access
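
The stages above can be sketched as a single function. This is a minimal illustration, not ThinkEx's actual implementation: `analyzeDocument` is a hypothetical wrapper around whatever call the app makes to Azure Document Intelligence, and the `OcrPage` and `OcrResult` shapes are assumptions chosen for the example.

```typescript
// Hypothetical shapes; the real ThinkEx types may differ.
interface OcrPage {
  pageNumber: number; // 1-based
  markdown: string; // page text converted to markdown
}

type OcrResult =
  | { ocrStatus: "complete"; textContent: string; ocrPages: OcrPage[] }
  | { ocrStatus: "failed"; error: string };

// Hypothetical wrapper around the OCR service call (not a real SDK signature).
type AnalyzeDocument = (fileUrl: string) => Promise<OcrPage[]>;

async function runOcr(
  fileUrl: string,
  analyzeDocument: AnalyzeDocument
): Promise<OcrResult> {
  try {
    // OCR processing and text extraction: one markdown string per page.
    const ocrPages = await analyzeDocument(fileUrl);
    // Indexing: join pages in order so search and the AI can read the full text.
    const textContent = ocrPages
      .slice()
      .sort((a, b) => a.pageNumber - b.pageNumber)
      .map((page) => page.markdown)
      .join("\n\n");
    return { ocrStatus: "complete", textContent, ocrPages };
  } catch (err) {
    // Any error from the OCR call maps to the "failed" state.
    return {
      ocrStatus: "failed",
      error: err instanceof Error ? err.message : String(err),
    };
  }
}
```

Passing the OCR call in as a parameter keeps the sketch independent of any particular SDK and makes the failure path easy to exercise with a stub.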

What Gets Extracted

The OCR process captures:

Text content: All readable text from the PDF
Page structure: Headers, footers, and layout information
Tables: Structured table data
Hyperlinks: Links within the document
Images: Embedded images (metadata only)

OCR works best with clear, well-formatted PDFs. Scanned documents with low resolution or poor contrast may have reduced accuracy.

PDF Viewer Features

The built-in PDF viewer provides a rich reading experience:

Navigation

Page scrolling: Smooth vertical scrolling through pages
Page numbers: Current page indicator and total page count
Zoom controls: Adjust zoom level for comfortable reading
Search: Find text within the PDF

Annotations

Highlight and annotate PDFs directly in the viewer:

Text highlighting: Select and highlight important passages
Notes: Add comments to specific sections
Bookmarks: Mark pages for quick reference

Annotation features are provided by the @embedpdf/react library, which offers a modern PDF viewing experience.

Working with PDF Content

AI Integration

The AI can read and work with PDF content thanks to OCR:

Summarize the main points from the "Research Paper" PDF

Create flashcards from pages 5-10 of the textbook PDF

What does the contract say about payment terms?

Page-Specific Queries

Reference specific pages in your questions:

read Research Paper pdf pages 1-3

The AI will focus on the specified page range.
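
As a rough sketch of how a page reference like this could be resolved, the range can be parsed and used to select entries from the card's page-by-page results. `parsePageRange` and `selectPages` are hypothetical helpers, and the `OcrPage` shape is an assumption rather than ThinkEx's actual schema.

```typescript
// Assumed shape of one entry in a card's page-by-page OCR results.
interface OcrPage {
  pageNumber: number; // 1-based
  markdown: string;
}

interface PageRange {
  start: number;
  end: number; // inclusive
}

// Parses "5-10" or "7" into an inclusive 1-based range; returns null if malformed.
function parsePageRange(input: string): PageRange | null {
  const match = /^\s*(\d+)\s*(?:-\s*(\d+))?\s*$/.exec(input);
  if (!match) return null;
  const start = Number(match[1]);
  const end = match[2] !== undefined ? Number(match[2]) : start;
  if (start < 1 || end < start) return null;
  return { start, end };
}

// Keeps only the pages whose number falls inside the range.
function selectPages(pages: OcrPage[], range: PageRange): OcrPage[] {
  return pages.filter(
    (page) => page.pageNumber >= range.start && page.pageNumber <= range.end
  );
}
```

For the prompt above, `parsePageRange("1-3")` yields `{ start: 1, end: 3 }`, and `selectPages` then returns only the first three pages of the document.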

Search

Once OCR completes, PDF text content is searchable:

search for "quantum computing" in workspace

The AI will search through all PDFs and return relevant matches with page numbers.
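
To illustrate what "matches with page numbers" means in practice, here is a minimal sketch of a per-page text search. It is not how ThinkEx implements search (which may use an index or a different ranking); `searchPdfs` and the `SearchablePdf`, `OcrPage`, and `PdfMatch` shapes are hypothetical.

```typescript
// Assumed shapes for illustration only.
interface OcrPage {
  pageNumber: number; // 1-based
  markdown: string;
}

interface SearchablePdf {
  filename: string;
  ocrPages: OcrPage[];
}

interface PdfMatch {
  filename: string;
  pageNumber: number;
  snippet: string; // text around the first match on that page
}

// Case-insensitive substring search; reports the first hit on each matching page.
function searchPdfs(
  pdfs: SearchablePdf[],
  query: string,
  context = 30
): PdfMatch[] {
  const needle = query.trim().toLowerCase();
  if (needle === "") return [];
  const matches: PdfMatch[] = [];
  for (const pdf of pdfs) {
    for (const page of pdf.ocrPages) {
      const index = page.markdown.toLowerCase().indexOf(needle);
      if (index === -1) continue;
      const from = Math.max(0, index - context);
      const to = Math.min(page.markdown.length, index + needle.length + context);
      matches.push({
        filename: pdf.filename,
        pageNumber: page.pageNumber,
        snippet: page.markdown.slice(from, to),
      });
    }
  }
  return matches;
}
```

Searching per page rather than over the whole cached text is what lets each result carry a page number.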

PDF Card Properties

PDF cards include these properties:

fileUrl: Storage URL for the PDF file
filename: Original filename
fileSize: File size in bytes
textContent: Cached extracted text (for quick access)
ocrStatus: Processing status (processing, complete, or failed)
ocrPages: Detailed page-by-page extraction results
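
The properties above could be modeled with a type along these lines. The property names come from the list; everything else is an assumption made for illustration, including which fields are optional and the shape of each `ocrPages` entry.

```typescript
type OcrStatus = "processing" | "complete" | "failed";

// Assumed shape of one page-by-page result; the real structure may be richer.
interface OcrPage {
  pageNumber: number; // 1-based
  markdown: string;
}

interface PdfCard {
  fileUrl: string; // storage URL for the PDF file
  filename: string; // original filename
  fileSize: number; // size in bytes
  textContent?: string; // cached extracted text (assumed absent until OCR completes)
  ocrStatus: OcrStatus;
  ocrPages?: OcrPage[]; // assumed absent until OCR completes
}

// A freshly uploaded card, before OCR has produced any text.
const exampleCard: PdfCard = {
  fileUrl: "https://example.com/document.pdf",
  filename: "document.pdf",
  fileSize: 1_048_576,
  ocrStatus: "processing",
};
```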

OCR Status

PDFs can be in one of three OCR states:

Processing

OCR is currently running

Complete

Text extraction finished successfully

Failed

OCR encountered an error

If OCR fails, the PDF card will still display the document, but text search and AI access will be limited.
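
A small sketch of how UI code might branch on these states. `describeOcrStatus` is a hypothetical helper, and treating text as unavailable in both the processing and failed states is a simplifying assumption (the text above only says access is "limited" after a failure).

```typescript
type OcrStatus = "processing" | "complete" | "failed";

interface OcrStatusInfo {
  viewable: boolean; // the document itself can still be displayed
  textAvailable: boolean; // extracted text can be searched and read by the AI
  message: string;
}

function describeOcrStatus(ocrStatus: OcrStatus): OcrStatusInfo {
  switch (ocrStatus) {
    case "processing":
      return { viewable: true, textAvailable: false, message: "OCR is currently running" };
    case "complete":
      return { viewable: true, textAvailable: true, message: "Text extraction finished successfully" };
    case "failed":
      return { viewable: true, textAvailable: false, message: "OCR encountered an error" };
    default: {
      // Compile-time exhaustiveness check: adding a new status without a case fails to type-check.
      const unreachable: never = ocrStatus;
      throw new Error("Unknown OCR status: " + String(unreachable));
    }
  }
}
```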

PDF Uploads and Storage

Upload Process

PDFs are handled with a non-blocking upload flow:

File validation: Check file type and size
Direct upload: Upload to storage (bypasses Next.js API limits)
Card creation: Add PDF card to workspace immediately
Background OCR: Text extraction happens asynchronously
Content update: Extracted text is added when OCR completes
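
The flow above can be sketched as follows. This is an illustration under stated assumptions, not the actual ThinkEx code: `validatePdf`, `addPdf`, and every method on `UploadDeps` are hypothetical names, and the 50MB cap is borrowed from the recommended size in Best Practices rather than a documented hard limit.

```typescript
// Assumption: reuse the recommended 50MB ceiling as a validation limit.
const MAX_PDF_BYTES = 50 * 1024 * 1024;

interface PdfFileInfo {
  filename: string;
  mimeType: string;
  size: number; // bytes
}

type OcrStatus = "processing" | "complete" | "failed";

// Hypothetical storage, card, and OCR operations, injected so the flow stays testable.
interface UploadDeps {
  uploadToStorage(file: PdfFileInfo): Promise<string>; // resolves to the file URL
  createCard(card: { fileUrl: string; filename: string; fileSize: number; ocrStatus: OcrStatus }): Promise<string>; // resolves to the card id
  runOcr(fileUrl: string): Promise<string>; // resolves to the extracted text
  updateCard(cardId: string, patch: { ocrStatus: OcrStatus; textContent?: string }): Promise<void>;
}

// File validation: returns an error message, or null when the file is acceptable.
function validatePdf(file: PdfFileInfo): string | null {
  if (file.mimeType !== "application/pdf") return "Only PDF files are supported";
  if (file.size <= 0) return "File is empty";
  if (file.size > MAX_PDF_BYTES) return "File is too large";
  return null;
}

async function addPdf(
  file: PdfFileInfo,
  deps: UploadDeps
): Promise<{ cardId: string; ocr: Promise<void> }> {
  const problem = validatePdf(file);
  if (problem !== null) throw new Error(problem);

  // Direct upload, then create the card right away in the "processing" state.
  const fileUrl = await deps.uploadToStorage(file);
  const cardId = await deps.createCard({
    fileUrl,
    filename: file.filename,
    fileSize: file.size,
    ocrStatus: "processing",
  });

  // Background OCR: deliberately not awaited, so the caller gets the card immediately.
  const ocr = deps
    .runOcr(fileUrl)
    .then((textContent) => deps.updateCard(cardId, { ocrStatus: "complete", textContent }))
    .catch(() => deps.updateCard(cardId, { ocrStatus: "failed" }));

  return { cardId, ocr };
}
```

Returning the `ocr` promise alongside the card id lets a caller ignore it for a non-blocking experience, or await it in tests to observe the final content update.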

Storage Locations

Production: Supabase Storage
Development: Local filesystem or Supabase

Large PDFs are supported because the upload goes directly to storage rather than through the Next.js API route, which has a 10MB body limit.

Use Cases

Research

Import research papers and articles
Highlight key findings
Extract quotes and citations
Generate summaries with AI

Study

Add textbooks and course materials
Create flashcards from chapters
Quiz yourself on PDF content
Keep notes alongside source PDFs

Work

Review contracts and documents
Extract key information with AI
Annotate important sections
Keep project documentation accessible

Best Practices

File Organization

Use descriptive names for PDF cards
Group related PDFs in folders
Add notes about key takeaways from each PDF
Use color coding for different document types

AI Workflow

Wait for OCR to complete before asking detailed questions
Reference specific pages for targeted queries
Use PDFs as source material for notes and flashcards
Ask the AI to summarize long documents

Performance

Keep PDF file sizes reasonable (under 50MB recommended)
Use high-quality scans for better OCR accuracy
Be patient with large multi-page documents during OCR

PDF text extraction uses Azure Document Intelligence, which provides high-quality OCR with support for complex layouts, tables, and multiple languages.
