
Overview

The visualize() function creates interactive HTML visualizations of extraction results. It displays the original text with highlighted extractions, allowing you to step through each entity and view its attributes.

import langextract as lx

# Extract information (examples is a list of lx.data.ExampleData objects defined elsewhere)
result = lx.extract(
    text="John Smith works at Google.",
    prompt_description="Extract people and companies",
    examples=examples
)

# Visualize the results
lx.visualize(result)

Function Signature

def visualize(
    data_source: data.AnnotatedDocument | str | pathlib.Path,
    *,
    animation_speed: float = 1.0,
    show_legend: bool = True,
    gif_optimized: bool = True,
) -> HTML | str

Parameters

data_source
AnnotatedDocument | str | pathlib.Path
required
The source of extraction data to visualize. Can be:
  • An AnnotatedDocument object (returned from lx.extract())
  • A string path to a JSONL file containing saved extractions
  • A pathlib.Path object pointing to a JSONL file
When loading from a file, the first document in the JSONL file will be visualized.
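The "first document" rule can be pictured with a small standalone sketch. It is not the library's loader: first_record is a hypothetical helper that treats each non-empty line of a JSONL file as one JSON document and returns the first one, raising the same exception types listed under Exceptions.

```python
import json
import tempfile
from pathlib import Path


def first_record(path):
    """Return the first JSON document in a JSONL file (hypothetical helper)."""
    path = Path(path)
    if not path.exists():
        raise FileNotFoundError(f"JSONL file not found: {path}")
    with path.open(encoding="utf-8") as handle:
        for line in handle:
            if line.strip():  # skip blank lines
                return json.loads(line)
    raise ValueError(f"No documents in {path}")


# Demo: only the first of the two documents is returned.
with tempfile.TemporaryDirectory() as tmp:
    demo = Path(tmp) / "demo.jsonl"
    demo.write_text('{"text": "first"}\n{"text": "second"}\n', encoding="utf-8")
    record = first_record(demo)
```

To visualize a later document from a multi-document file, one option is to load the documents yourself and pass the chosen AnnotatedDocument to visualize() directly.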
animation_speed
float
default:"1.0"
Delay, in seconds, between extractions during auto-play.
  • Lower values (e.g., 0.5) create faster animations
  • Higher values (e.g., 2.0) slow down the animation
  • Default is 1.0 second per extraction
This is a keyword-only parameter.
show_legend
bool
default:"True"
If True, displays a color legend mapping extraction classes to their highlight colors at the top of the visualization.
This is a keyword-only parameter.
gif_optimized
bool
default:"True"
If True, applies GIF-optimized styling with:
  • Larger fonts for better readability
  • Better contrast and improved dimensions
  • Enhanced styling for video capture
Useful when recording the visualization as a GIF or video for presentations.
This is a keyword-only parameter.

Returns

result
IPython.display.HTML | str
Returns an IPython.display.HTML object if IPython is available (Jupyter notebook environment); otherwise returns the generated HTML as a string.
The HTML includes:
  • Highlighted text with a colored span for each extraction
  • Interactive controls (Play/Pause, Previous, Next)
  • Progress slider to jump to any extraction
  • Attributes panel showing details of the current extraction
  • Status text showing current position and extraction count

Exceptions

FileNotFoundError
exception
Raised when data_source is a file path that does not exist.
ValueError
exception
Raised when:
  • The JSONL file contains no documents
  • The AnnotatedDocument contains no text
  • The AnnotatedDocument contains no extractions
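A minimal sketch of handling both exception types in calling code. The wrapper try_visualize and the stand-in fake_visualize are hypothetical names used so the example runs without the library; in real code you would pass lx.visualize as visualize_fn.

```python
def try_visualize(visualize_fn, source):
    """Call visualize_fn(source); return (result, None) or (None, message)."""
    try:
        return visualize_fn(source), None
    except FileNotFoundError:
        return None, f"file not found: {source}"
    except ValueError as err:
        return None, f"nothing to visualize: {err}"


def fake_visualize(source):
    """Stand-in that raises the documented exceptions (not the real function)."""
    if source == "missing.jsonl":
        raise FileNotFoundError(source)
    if source == "empty.jsonl":
        raise ValueError("no documents")
    return "<div>ok</div>"


ok = try_visualize(fake_visualize, "extractions.jsonl")
missing = try_visualize(fake_visualize, "missing.jsonl")
empty = try_visualize(fake_visualize, "empty.jsonl")
```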

Visualization Features

Interactive Controls

The visualization includes several interactive controls:
  • Play/Pause Button: Automatically cycle through extractions
  • Previous Button: Jump to the previous extraction
  • Next Button: Jump to the next extraction
  • Progress Slider: Manually navigate to any extraction
  • Auto-scroll: Automatically scrolls the current extraction into view

Color Coding

Each extraction class is automatically assigned a color from a ten-color, Material Design-inspired palette:
  • Light Blue (#D2E3FC)
  • Light Green (#C8E6C9)
  • Light Yellow (#FEF0C3)
  • Light Red (#F9DEDC)
  • Light Orange (#FFDDBE)
  • Light Purple (#EADDFF)
  • Light Teal (#C4E9E4)
  • Light Pink (#FCE4EC)
  • Very Light Grey (#E8EAED)
  • Pale Cyan (#DDE8E8)
Colors are assigned to the extraction class names in alphabetical order, so the same set of classes always maps to the same colors.
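The assignment rule can be sketched as follows. This is an illustration rather than the library's code: assign_colors is a hypothetical name, and wrapping around to the start of the palette when there are more than ten classes is an assumption.

```python
from itertools import cycle

# Palette copied from the list above.
PALETTE = [
    "#D2E3FC", "#C8E6C9", "#FEF0C3", "#F9DEDC", "#FFDDBE",
    "#EADDFF", "#C4E9E4", "#FCE4EC", "#E8EAED", "#DDE8E8",
]


def assign_colors(class_names):
    """Map each distinct class name to a palette color, in alphabetical order."""
    colors = cycle(PALETTE)  # assumption: reuse colors past ten classes
    return {name: next(colors) for name in sorted(set(class_names))}


color_map = assign_colors(["person", "company", "date", "person"])
```

Because the names are sorted before colors are handed out, adding a new class can shift the colors of the classes that sort after it.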

Attributes Display

For each extraction, the attributes panel shows:
  • class: The extraction class/entity type
  • attributes: All extracted attributes as key-value pairs
  • Empty or null attributes are filtered out for cleaner display
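A sketch of the filtering step, assuming "empty or null" means None, an empty string, or an empty list or dict. The exact rule the library applies may differ, and visible_attributes is a hypothetical name.

```python
def visible_attributes(attributes):
    """Drop attributes whose value is None or an empty string, list, or dict."""
    if not attributes:
        return {}
    return {
        key: value
        for key, value in attributes.items()
        if value is not None and value != "" and value != [] and value != {}
    }


shown = visible_attributes(
    {"role": "founder", "age": None, "note": "", "tags": [], "count": 0}
)
```

Note that falsy but meaningful values such as 0 are kept under this rule.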

Examples

Basic Visualization

import langextract as lx

# Extract entities
result = lx.extract(
    text="Sarah Johnson founded TechCorp in 2020.",
    prompt_description="Extract people, companies, and dates",
    examples=examples
)

# Display visualization in Jupyter
lx.visualize(result)

Load from JSONL File

import langextract as lx

# Save extractions to a JSONL file in the current directory
lx.io.save_annotated_documents([result], output_name="extractions.jsonl", output_dir=".")

# Later, visualize from file
lx.visualize("extractions.jsonl")

Customize Animation Speed

import langextract as lx

# Slower animation (2 seconds per extraction)
lx.visualize(
    result,
    animation_speed=2.0
)

# Faster animation (0.5 seconds per extraction)
lx.visualize(
    result,
    animation_speed=0.5
)

Without Legend

import langextract as lx

# Hide the color legend
lx.visualize(
    result,
    show_legend=False
)

Optimize for Recording

import langextract as lx

# GIF-optimized styling is on by default, which suits screen recording
lx.visualize(result)

# Disable it for a plainer display in a notebook
lx.visualize(
    result,
    gif_optimized=False
)

Save as HTML File

import langextract as lx

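# visualize() returns an IPython HTML object in notebooks, a plain string otherwise
html_content = lx.visualize(result)

# Save to file; the HTML object keeps its markup in .data
with open("visualization.html", "w", encoding="utf-8") as f:
    f.write(html_content.data if hasattr(html_content, "data") else html_content)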

Use with pathlib

import langextract as lx
from pathlib import Path

# Visualize from Path object
data_file = Path("data") / "extractions.jsonl"
lx.visualize(data_file)

Technical Details

Extraction Filtering

Only extractions with valid character intervals are displayed. An extraction is considered valid if:
  • It has a non-null char_interval
  • The start_pos is not None
  • The end_pos is not None
  • start_pos < end_pos
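The four conditions combine into a single predicate. The sketch below uses small stand-in dataclasses so it runs on its own; only the field names char_interval, start_pos, and end_pos come from the text above, while the class definitions and has_valid_interval are hypothetical.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class CharInterval:  # stand-in, not the library class
    start_pos: Optional[int] = None
    end_pos: Optional[int] = None


@dataclass
class Extraction:  # stand-in, not the library class
    extraction_text: str
    char_interval: Optional[CharInterval] = None


def has_valid_interval(extraction):
    """True only when all four conditions listed above hold."""
    interval = extraction.char_interval
    return (
        interval is not None
        and interval.start_pos is not None
        and interval.end_pos is not None
        and interval.start_pos < interval.end_pos
    )


extractions = [
    Extraction("John Smith", CharInterval(0, 10)),  # valid
    Extraction("Google"),                            # no char_interval
    Extraction("works", CharInterval(11, None)),     # missing end_pos
    Extraction("at", CharInterval(20, 20)),          # empty span
]
displayed = [e for e in extractions if has_valid_interval(e)]
```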

HTML Structure

The generated HTML includes:
  • CSS styles: Embedded styles for highlighting, controls, and animations
  • Text window: Scrollable container with highlighted text
  • Attributes panel: Shows details of the current extraction
  • Controls: Interactive buttons and slider
  • JavaScript: Handles interactivity and state management

Nested Extractions

The visualization properly handles nested and overlapping extractions:
  • Spans are sorted by position, then by length
  • Longer spans open first, shorter spans close first
  • Ensures valid HTML nesting
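The ordering rule can be illustrated with a short sketch that turns spans into opening and closing tags. It is not the library's renderer: highlight is a hypothetical function, the class attribute is illustrative markup, and the sketch only covers spans that are fully nested or disjoint, since partially overlapping spans need extra splitting that is not shown here.

```python
import html


def highlight(text, spans):
    """Wrap (start, end, class_name) spans in <span> tags; assumes start < end."""
    events = []
    for start, end, name in spans:
        length = end - start
        # Sort key: position, then closes (0) before opens (1), then length.
        # Opens use -length so longer spans open first;
        # closes use +length so shorter spans close first.
        events.append((start, 1, -length, f'<span class="{html.escape(name)}">'))
        events.append((end, 0, length, "</span>"))
    events.sort(key=lambda event: event[:3])

    out, cursor = [], 0
    for pos, _, _, tag in events:
        out.append(html.escape(text[cursor:pos]))  # text since the last tag
        out.append(tag)
        cursor = pos
    out.append(html.escape(text[cursor:]))
    return "".join(out)


markup = highlight("New York City", [(0, 8, "city"), (0, 13, "place")])
```

Here both spans start at position 0, so the longer "place" span opens first and the shorter "city" span closes first, which keeps the tags properly nested.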

Performance

The visualization is optimized for documents with many extractions:
  • Uses efficient DOM updates
  • Smooth scrolling with scrollIntoView
  • CSS animations with hardware acceleration
  • Minimal JavaScript overhead

Environment Detection

The function automatically detects the execution environment:
  • Jupyter/IPython: Returns IPython.display.HTML object for inline display
  • Standard Python: Returns raw HTML string
  • Checks for IPython availability and notebook context
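A common way to implement this kind of fallback is an optional import. The sketch below shows that pattern only; wrap_html and html_text are hypothetical helpers, and unlike the function described above, the sketch checks whether IPython can be imported but not whether the code is running inside a notebook.

```python
def wrap_html(markup):
    """Return an IPython HTML object if IPython is importable, else the string."""
    try:
        from IPython.display import HTML
    except ImportError:
        return markup
    return HTML(markup)


def html_text(result):
    """Get the raw markup back from either return type."""
    return result.data if hasattr(result, "data") else result


result = wrap_html("<p>hi</p>")
text = html_text(result)
```

Calling code that needs a string, for example to write a file, can use the same hasattr check on the value returned by visualize().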
