LangExtract provides built-in visualization to generate interactive HTML files that display extracted entities highlighted in their original context. This feature makes it easy to review and verify thousands of extractions.
```python
import langextract as lx

# Save extraction results
lx.io.save_annotated_documents([result], output_name="extraction_results.jsonl", output_dir=".")

# Generate visualization from the file
html_content = lx.visualize("extraction_results.jsonl")
with open("visualization.html", "w") as f:
    if hasattr(html_content, 'data'):
        f.write(html_content.data)  # For Jupyter/Colab
    else:
        f.write(html_content)
```
The visualization works in both Jupyter/Colab environments, where it returns an `IPython.display.HTML` object, and standard Python scripts, where it returns an HTML string.
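The `hasattr(html_content, 'data')` check in the examples can be factored into a small helper so the same save logic works for either return type. This is a minimal sketch, not part of LangExtract: the name `save_visualization` is hypothetical, and it assumes the notebook object exposes its markup through a `.data` attribute, as the examples in this section do.

```python
from pathlib import Path


def save_visualization(html_content, path):
    """Write visualization output to `path` and return the path.

    Hypothetical helper: accepts either a plain HTML string or an
    object that carries its markup in a `.data` attribute.
    """
    # Assumption: notebook display objects expose the markup via `.data`.
    if hasattr(html_content, "data"):
        html_text = html_content.data
    else:
        html_text = html_content

    output_path = Path(path)
    # Write as UTF-8 so non-ASCII source text survives on every platform.
    output_path.write_text(html_text, encoding="utf-8")
    return output_path
```

With this in place, `save_visualization(lx.visualize("extraction_results.jsonl"), "visualization.html")` would replace the `with open(...)` block.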
```python
import langextract as lx
import textwrap

# Define extraction task
prompt = textwrap.dedent("""\
    Extract characters, emotions, and relationships in order of appearance.
    Use exact text for extractions. Do not paraphrase or overlap entities.
    Provide meaningful attributes for each entity to add context.""")

examples = [
    lx.data.ExampleData(
        text="ROMEO. But soft! What light through yonder window breaks?",
        extractions=[
            lx.data.Extraction(
                extraction_class="character",
                extraction_text="ROMEO",
                attributes={"emotional_state": "wonder"}
            ),
            lx.data.Extraction(
                extraction_class="emotion",
                extraction_text="But soft!",
                attributes={"feeling": "gentle awe"}
            ),
        ]
    )
]

# Perform extraction
result = lx.extract(
    text_or_documents="Lady Juliet gazed longingly at the stars, her heart aching for Romeo",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash"
)

# Save results
lx.io.save_annotated_documents(
    [result],
    output_name="extraction_results.jsonl",
    output_dir="."
)

# Generate visualization
html_content = lx.visualize(
    "extraction_results.jsonl",
    animation_speed=1.0,
    show_legend=True,
    gif_optimized=True
)

# Save to HTML file
with open("visualization.html", "w") as f:
    if hasattr(html_content, 'data'):
        f.write(html_content.data)
    else:
        f.write(html_content)
```
The visualization seamlessly handles large result sets:
- Efficiently renders hundreds or thousands of entities
- Smooth scrolling and navigation
- Optimized for performance with large documents
- Progress slider for quick navigation
```python
# Visualize large extraction from full novel
result = lx.extract(
    text_or_documents="https://www.gutenberg.org/files/1513/1513-0.txt",
    prompt_description=prompt,
    examples=examples,
    model_id="gemini-2.5-flash",
    extraction_passes=3,
    max_workers=20,
    max_char_buffer=1000
)

lx.io.save_annotated_documents([result], output_name="romeo_juliet.jsonl", output_dir=".")
html_content = lx.visualize("romeo_juliet.jsonl")

# Save the visualization
with open("romeo_juliet_viz.html", "w") as f:
    if hasattr(html_content, 'data'):
        f.write(html_content.data)
    else:
        f.write(html_content)
```
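For a result set this large, a quick tally per extraction class can help decide where to look before opening the HTML. The sketch below reads the saved JSONL file directly. It assumes each line is a JSON object with an `extractions` list whose items carry an `extraction_class` key, so check that against your own file first. The function name `count_extraction_classes` is hypothetical and not part of LangExtract.

```python
import json
from collections import Counter


def count_extraction_classes(jsonl_path):
    """Count extractions per class in a saved JSONL file.

    Assumed layout (verify against your output): one JSON object per
    line, each with an "extractions" list of dicts that include an
    "extraction_class" key.
    """
    counts = Counter()
    with open(jsonl_path, encoding="utf-8") as f:
        for line in f:
            line = line.strip()
            if not line:
                continue  # Skip blank lines between records.
            document = json.loads(line)
            # Treat a missing or null "extractions" field as empty.
            for extraction in document.get("extractions") or []:
                counts[extraction.get("extraction_class", "unknown")] += 1
    return counts
```

Calling `count_extraction_classes("romeo_juliet.jsonl").most_common()` would then list the classes from most to least frequent.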