visualize()

Overview

The visualize() function creates interactive HTML visualizations of extraction results. It displays the original text with highlighted extractions, allowing you to step through each entity and view its attributes.

import langextract as lx

# Extract information. `examples` is a list of lx.data.ExampleData
# objects defined elsewhere.
result = lx.extract(
    text="John Smith works at Google.",
    prompt_description="Extract people and companies",
    examples=examples,
)

# Visualize the results
lx.visualize(result)

Function Signature

def visualize(
    data_source: data.AnnotatedDocument | str | pathlib.Path,
    *,
    animation_speed: float = 1.0,
    show_legend: bool = True,
    gif_optimized: bool = True,
) -> HTML | str

Parameters

data_source

AnnotatedDocument | str | pathlib.Path

required

The source of extraction data to visualize. Can be:

An AnnotatedDocument object (returned from lx.extract())
A string path to a JSONL file containing saved extractions
A pathlib.Path object pointing to a JSONL file

When loading from a file, the first document in the JSONL file will be visualized.
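
The "first document" rule can be pictured with a small standard-library sketch. It assumes one JSON object per line, and first_jsonl_record is a hypothetical helper name used only for illustration; the library's own loader may differ in detail.

```python
import json
import pathlib


def first_jsonl_record(path):
    """Return the first non-blank JSON record in a JSONL file.

    Hypothetical helper: it mirrors the documented behavior (first document
    wins) but is not part of langextract.
    """
    file_path = pathlib.Path(path)
    if not file_path.exists():
        raise FileNotFoundError(f"JSONL file not found: {file_path}")
    with file_path.open(encoding="utf-8") as f:
        for line in f:
            if line.strip():  # skip blank lines
                return json.loads(line)
    raise ValueError(f"No documents found in JSONL file: {file_path}")
```

Any later records in the file are ignored by this sketch, so a file holding several documents would need to be split or re-saved to visualize the others.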

animation_speed

float

default: 1.0

Delay, in seconds, between extractions during auto-play.

Lower values (e.g., 0.5) create faster animations
Higher values (e.g., 2.0) slow down the animation
Default is 1.0 second per extraction

This is a keyword-only parameter.

show_legend

bool

default: True

If True, displays a color legend mapping extraction classes to their highlight colors at the top of the visualization.

This is a keyword-only parameter.

gif_optimized

bool

default: True

If True, applies GIF-optimized styling with:

Larger fonts for better readability
Better contrast and improved dimensions
Enhanced styling for video capture

Useful when recording the visualization as a GIF or video for presentations.

This is a keyword-only parameter.

Returns

result

IPython.display.HTML | str

Returns an IPython.display.HTML object if IPython is available (Jupyter notebook environment); otherwise returns the generated HTML string. The HTML includes:

Highlighted source text with a colored span for each extraction
Interactive controls (Play/Pause, Previous, Next)
Progress slider to jump to any extraction
Attributes panel showing details of the current extraction
Status text showing current position and extraction count

Exceptions

FileNotFoundError

exception

Raised when data_source is a file path that does not exist.

ValueError

exception

Raised when:

The JSONL file contains no documents
The AnnotatedDocument contains no text
The AnnotatedDocument contains no extractions
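
A caller-side sketch of handling these two exceptions follows. The function name render_or_report and the fallback messages are illustrative assumptions; the visualize callable is passed in so the sketch runs without langextract installed, and in real use it would be lx.visualize.

```python
def render_or_report(visualize, data_source):
    """Call `visualize` and turn the documented failures into a short message.

    `visualize` is any callable with the same contract as lx.visualize.
    The fallback HTML snippets are placeholders chosen for this sketch.
    """
    try:
        return visualize(data_source)
    except FileNotFoundError as err:
        # data_source was a path that does not exist
        return f"<p>Missing file: {err}</p>"
    except ValueError as err:
        # empty JSONL file, or a document without text or extractions
        return f"<p>Nothing to visualize: {err}</p>"
```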

Visualization Features

Interactive Controls

The visualization includes several interactive controls:

Play/Pause Button: Automatically cycle through extractions
Previous Button: Jump to the previous extraction
Next Button: Jump to the next extraction
Progress Slider: Manually navigate to any extraction
Auto-scroll: Automatically scrolls the current extraction into view

Color Coding

Each extraction class is automatically assigned a unique color from a Material Design-inspired palette:

Light Blue (#D2E3FC)
Light Green (#C8E6C9)
Light Yellow (#FEF0C3)
Light Red (#F9DEDC)
Light Orange (#FFDDBE)
Light Purple (#EADDFF)
Light Teal (#C4E9E4)
Light Pink (#FCE4EC)
Very Light Grey (#E8EAED)
Pale Cyan (#DDE8E8)

Class names are sorted alphabetically and assigned colors in palette order, so the same set of classes always produces the same mapping.
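
The assignment rule can be sketched as follows. The palette values are the ones listed above; the function name assign_colors and the wrap-around behavior for more than ten classes are assumptions of this sketch, not confirmed details of the library.

```python
import itertools

# Palette as listed in this section, in order.
PALETTE = [
    "#D2E3FC",  # Light Blue
    "#C8E6C9",  # Light Green
    "#FEF0C3",  # Light Yellow
    "#F9DEDC",  # Light Red
    "#FFDDBE",  # Light Orange
    "#EADDFF",  # Light Purple
    "#C4E9E4",  # Light Teal
    "#FCE4EC",  # Light Pink
    "#E8EAED",  # Very Light Grey
    "#DDE8E8",  # Pale Cyan
]


def assign_colors(extraction_classes):
    """Map each distinct class name to a palette color, in alphabetical order.

    Assumption: with more classes than palette entries, colors repeat.
    """
    colors = itertools.cycle(PALETTE)
    return {name: next(colors) for name in sorted(set(extraction_classes))}
```

Under this sketch, assign_colors(["person", "company", "person"]) maps "company" to #D2E3FC and "person" to #C8E6C9, regardless of the order in which the extractions appear.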

Attributes Display

For each extraction, the attributes panel shows:

class: The extraction class/entity type
attributes: All extracted attributes as key-value pairs

Empty or null attributes are filtered out for a cleaner display.
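
A minimal sketch of that filtering step is shown here. What counts as "empty" is an assumption of the sketch (None and blank strings); visible_attributes is a hypothetical name.

```python
def visible_attributes(attributes):
    """Drop attributes whose value is None or a blank string.

    Assumption: "empty or null" means None, "" or whitespace-only strings.
    Other falsy values such as 0 or False are kept.
    """
    if not attributes:
        return {}
    return {
        key: value
        for key, value in attributes.items()
        if value is not None and not (isinstance(value, str) and not value.strip())
    }
```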

Examples

Basic Visualization

import langextract as lx

# Extract entities
result = lx.extract(
    text="Sarah Johnson founded TechCorp in 2020.",
    prompt_description="Extract people, companies, and dates",
    examples=examples
)

# Display visualization in Jupyter
lx.visualize(result)

Load from JSONL File

import langextract as lx

# Save extractions to file
lx.io.save_annotated_documents(
    [result], output_name="extractions.jsonl", output_dir="."
)

# Later, visualize from file
lx.visualize("extractions.jsonl")

Customize Animation Speed

import langextract as lx

# Slower animation (2 seconds per extraction)
lx.visualize(
    result,
    animation_speed=2.0
)

# Faster animation (0.5 seconds per extraction)
lx.visualize(
    result,
    animation_speed=0.5
)

Without Legend

import langextract as lx

# Hide the color legend
lx.visualize(
    result,
    show_legend=False
)

Optimize for Recording

import langextract as lx

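# gif_optimized=True (the default) applies the recording-friendly styling
lx.visualize(result, gif_optimized=True)

# Pass gif_optimized=False for the standard styling in a notebook
lx.visualize(
    result,
    gif_optimized=False
)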

Save as HTML File

import langextract as lx

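# In Jupyter this is an IPython.display.HTML object; otherwise a str
html_content = lx.visualize(result)

# Save to file, unwrapping the HTML object's .data when present
with open("visualization.html", "w", encoding="utf-8") as f:
    f.write(getattr(html_content, "data", html_content))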

Use with pathlib

import langextract as lx
from pathlib import Path

# Visualize from Path object
data_file = Path("data") / "extractions.jsonl"
lx.visualize(data_file)

Technical Details

Extraction Filtering

Only extractions with valid character intervals are displayed. An extraction is considered valid if:

It has a non-null char_interval
The start_pos is not None
The end_pos is not None
start_pos < end_pos
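
The four conditions above can be expressed as a single predicate. The dataclasses below are simplified stand-ins written for this sketch, not the library's own classes, and has_valid_interval is a hypothetical name.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CharInterval:
    """Stand-in for a character interval with optional bounds."""
    start_pos: Optional[int] = None
    end_pos: Optional[int] = None


@dataclass
class Extraction:
    """Stand-in for an extraction; only the fields the check needs."""
    extraction_class: str
    extraction_text: str
    char_interval: Optional[CharInterval] = None


def has_valid_interval(extraction) -> bool:
    """Return True only if all four documented conditions hold."""
    interval = extraction.char_interval
    return (
        interval is not None
        and interval.start_pos is not None
        and interval.end_pos is not None
        and interval.start_pos < interval.end_pos
    )
```

Filtering a list is then a one-liner, for example [e for e in extractions if has_valid_interval(e)].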

HTML Structure

The generated HTML includes:

CSS styles: Embedded styles for highlighting, controls, and animations
Text window: Scrollable container with highlighted text
Attributes panel: Shows details of the current extraction
Controls: Interactive buttons and slider
JavaScript: Handles interactivity and state management

Nested Extractions

The visualization properly handles nested and overlapping extractions:

Spans are sorted by position, then by length
At the same position, longer spans open first and shorter spans close first
This ordering keeps the generated span tags validly nested
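
The ordering rule can be illustrated with a small sketch that turns each span into an open and a close event and sorts them. It assumes spans are either disjoint or fully nested; the function name highlight, the tuple layout, and the CSS class names are illustrative, not the library's internals.

```python
import html


def highlight(text, spans):
    """Wrap each (start, end, css_class) span of `text` in a <span> tag.

    Assumes spans are disjoint or fully nested (no partial overlaps).
    """
    events = []
    for start, end, css_class in spans:
        length = end - start
        # kind 0 = close, 1 = open: at the same position, closes come first.
        events.append((start, 1, -length, css_class))  # longer spans open first
        events.append((end, 0, length, css_class))     # shorter spans close first
    events.sort(key=lambda event: event[:3])

    out, cursor = [], 0
    for pos, kind, _, css_class in events:
        out.append(html.escape(text[cursor:pos]))  # plain text before this tag
        cursor = pos
        out.append(f'<span class="{css_class}">' if kind == 1 else "</span>")
    out.append(html.escape(text[cursor:]))
    return "".join(out)
```

For the text "New York City" with a "place" span over the whole string and a "city" span over "New York", the outer span opens first and the inner span closes first, which is what keeps the tags balanced.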

Performance

The visualization is optimized for documents with many extractions:

Uses efficient DOM updates
Smooth scrolling with scrollIntoView
CSS animations with hardware acceleration
Minimal JavaScript overhead

Environment Detection

The function automatically detects the execution environment:

Jupyter/IPython: Returns IPython.display.HTML object for inline display
Standard Python: Returns raw HTML string
Checks for IPython availability and notebook context
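
One common way to implement this kind of detection is sketched below. The shell class-name check is a widely used heuristic and an assumption here, not a confirmed description of what the function does internally; in_notebook and wrap_html are hypothetical names.

```python
def in_notebook() -> bool:
    """Best-effort check for a Jupyter kernel (heuristic, see lead-in)."""
    try:
        from IPython import get_ipython
    except ImportError:
        return False  # IPython is not installed
    shell = get_ipython()
    # ZMQInteractiveShell is the shell class used by Jupyter kernels.
    return shell is not None and shell.__class__.__name__ == "ZMQInteractiveShell"


def wrap_html(html_text: str):
    """Return an IPython HTML object in a notebook, else the raw string."""
    if in_notebook():
        from IPython.display import HTML
        return HTML(html_text)
    return html_text
```

Code that must work in both environments can read the markup with getattr(result, "data", result), since the HTML object exposes its content as .data.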
