
Overview

The prompting module provides tools for building structured prompts with few-shot examples, handling format-specific output, and managing cross-chunk context for improved coreference resolution.

Module

from langextract import prompting

Classes

PromptTemplateStructured

A structured prompt template for few-shot examples.
@dataclasses.dataclass
class PromptTemplateStructured:
    description: str
    examples: list[data.ExampleData] = dataclasses.field(default_factory=list)
  • description (str, required): Instructions or guidelines for the LLM.
  • examples (list[ExampleData], default: []): ExampleData objects demonstrating expected input→output behavior for few-shot learning.

QAPromptGenerator

Generates question-answer style prompts from a template.
@dataclasses.dataclass
class QAPromptGenerator:
    template: PromptTemplateStructured
    format_handler: format_handler.FormatHandler
    examples_heading: str = "Examples"
    question_prefix: str = "Q: "
    answer_prefix: str = "A: "
  • template (PromptTemplateStructured, required): The prompt template with description and examples.
  • format_handler (FormatHandler, required): FormatHandler for managing format-specific output (JSON/YAML).
  • examples_heading (str, default: "Examples"): Heading text for the examples section.
  • question_prefix (str, default: "Q: "): Prefix for question text.
  • answer_prefix (str, default: "A: "): Prefix for answer text.
Methods:

render()

Generates a text representation of the prompt.
def render(self, question: str, additional_context: str | None = None) -> str
  • question (str, required): The question/text to be presented to the model.
  • additional_context (str | None, default: None): Additional context to include in the prompt. Empty strings are ignored.
  • Returns (str): Text prompt with a question to be presented to the language model.

format_example_as_text()

Formats a single example for the prompt.
def format_example_as_text(self, example: data.ExampleData) -> str
  • example (ExampleData, required): The example data to format.
  • Returns (str): A string representation of the example, including the question and answer.

PromptBuilder

Base class for building prompts for text chunks.
class PromptBuilder:
    def __init__(self, generator: QAPromptGenerator)
  • generator (QAPromptGenerator, required): The underlying prompt generator to use.
Methods:

build_prompt()

Builds a prompt for the given chunk.
def build_prompt(
    self,
    chunk_text: str,
    document_id: str,
    additional_context: str | None = None
) -> str
  • chunk_text (str, required): The text of the current chunk to process.
  • document_id (str, required): Identifier for the source document.
  • additional_context (str | None, default: None): Optional additional context from the document.
  • Returns (str): The rendered prompt string, ready for the language model.

ContextAwarePromptBuilder

Prompt builder with cross-chunk context tracking for coreference resolution.
class ContextAwarePromptBuilder(PromptBuilder):
    def __init__(
        self,
        generator: QAPromptGenerator,
        context_window_chars: int | None = None
    )
  • generator (QAPromptGenerator, required): The underlying prompt generator to use.
  • context_window_chars (int | None, default: None): Number of characters from the previous chunk’s tail to include as context. Defaults to None (disabled).
Properties:
  • context_window_chars: Number of trailing characters from previous chunk to include
Behavior: The builder tracks the previous chunk per document_id and injects trailing text from the previous chunk as context. This helps resolve pronouns and coreferences across chunk boundaries. Example context format:
[Previous text]: ...ending of previous chunk

Current chunk text here...

Functions

read_prompt_template_structured_from_file()

Reads a structured prompt template from a file.
def read_prompt_template_structured_from_file(
    prompt_path: str,
    format_type: data.FormatType = data.FormatType.YAML
) -> PromptTemplateStructured
  • prompt_path (str, required): Path to a file containing PromptTemplateStructured data.
  • format_type (FormatType, default: FormatType.YAML): The format of the file (YAML or JSON).
  • Returns (PromptTemplateStructured): A PromptTemplateStructured object loaded from the file.
Raises: ParseError if the file cannot be parsed successfully.

Usage Examples

Basic Prompt Template

from langextract.prompting import PromptTemplateStructured, QAPromptGenerator
from langextract.core.data import ExampleData, FormatType
from langextract.core.format_handler import FormatHandler

# Create a prompt template
template = PromptTemplateStructured(
    description="Extract person names and their roles from the text.",
    examples=[
        ExampleData(
            text="Dr. Sarah Johnson is the lead researcher.",
            extractions=[
                {"person": "Dr. Sarah Johnson", "person_index": 1},
                {"role": "lead researcher", "role_index": 2}
            ]
        ),
        ExampleData(
            text="Professor Mike Chen teaches computer science.",
            extractions=[
                {"person": "Professor Mike Chen", "person_index": 1},
                {"role": "Professor", "role_index": 2}
            ]
        )
    ]
)

# Create a generator
format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)

# Render a prompt
prompt = generator.render("John Smith is a software engineer at Google.")
print(prompt)

Loading Template from File

from langextract.prompting import read_prompt_template_structured_from_file
from langextract.core.data import FormatType

# template.yaml contains:
# description: Extract medical entities
# examples:
#   - text: Patient has diabetes
#     extractions:
#       - condition: diabetes
#         condition_index: 1

template = read_prompt_template_structured_from_file(
    "template.yaml",
    format_type=FormatType.YAML
)

print(template.description)
print(f"Loaded {len(template.examples)} examples")
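
The same template can also be stored as JSON. Assuming the field names carry over unchanged from the YAML form, an equivalent file would look like:

```
{
  "description": "Extract medical entities",
  "examples": [
    {
      "text": "Patient has diabetes",
      "extractions": [
        {"condition": "diabetes", "condition_index": 1}
      ]
    }
  ]
}
```

Load it by passing format_type=FormatType.JSON instead of FormatType.YAML.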

Context-Aware Prompting

from langextract.prompting import (
    PromptTemplateStructured,
    QAPromptGenerator,
    ContextAwarePromptBuilder
)
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType

template = PromptTemplateStructured(
    description="Extract entities from medical records."
)

format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)

# Create builder with context window
builder = ContextAwarePromptBuilder(
    generator=generator,
    context_window_chars=100  # Include last 100 chars from previous chunk
)

# Process chunks sequentially
chunks = [
    "Dr. Sarah Johnson examined the patient on Monday.",
    "She prescribed antibiotics and scheduled a follow-up."
]

doc_id = "doc1"
for chunk in chunks:
    prompt = builder.build_prompt(
        chunk_text=chunk,
        document_id=doc_id
    )
    print(f"Prompt for chunk:\n{prompt}\n")
    # Second prompt will include context from first chunk

Custom Prompt Formatting

from langextract.prompting import QAPromptGenerator, PromptTemplateStructured
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType

template = PromptTemplateStructured(
    description="Extract key information.",
    examples=[]
)

format_handler = FormatHandler(format_type=FormatType.JSON)
generator = QAPromptGenerator(
    template=template,
    format_handler=format_handler,
    examples_heading="Here are some examples:",
    question_prefix="Input: ",
    answer_prefix="Output: "
)

prompt = generator.render(
    question="Extract names from this text.",
    additional_context="Focus on proper nouns."
)
print(prompt)

Format Example Output

from langextract.prompting import QAPromptGenerator, PromptTemplateStructured
from langextract.core.format_handler import FormatHandler
from langextract.core.data import ExampleData, FormatType

template = PromptTemplateStructured(
    description="Extract entities",
    examples=[]
)

format_handler = FormatHandler(format_type=FormatType.JSON, use_fences=True)
generator = QAPromptGenerator(template=template, format_handler=format_handler)

example = ExampleData(
    text="John works at Google",
    extractions=[
        {"person": "John", "person_index": 1},
        {"org": "Google", "org_index": 2}
    ]
)

formatted = generator.format_example_as_text(example)
print(formatted)
# Output:
# Q: John works at Google
# A: ```json
# [{"person": "John", "person_index": 1}, {"org": "Google", "org_index": 2}]
# ```

Multi-Document Context Tracking

from langextract.prompting import (
    ContextAwarePromptBuilder,
    QAPromptGenerator,
    PromptTemplateStructured
)
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType

template = PromptTemplateStructured(description="Extract entities")
format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)

builder = ContextAwarePromptBuilder(
    generator=generator,
    context_window_chars=50
)

# Process multiple documents - context is tracked separately per doc_id
doc1_chunks = ["First chunk of doc1", "Second chunk of doc1"]
doc2_chunks = ["First chunk of doc2", "Second chunk of doc2"]

for chunk in doc1_chunks:
    prompt = builder.build_prompt(chunk, document_id="doc1")
    # Process...

for chunk in doc2_chunks:
    prompt = builder.build_prompt(chunk, document_id="doc2")
    # Process...
    # Context from doc1 won't bleed into doc2

Notes

  • PromptTemplateStructured supports both YAML and JSON format files
  • QAPromptGenerator formats examples according to the specified format_handler
  • ContextAwarePromptBuilder tracks context per document_id to prevent cross-document contamination
  • Context window is measured in characters from the end of the previous chunk
  • The context prefix [Previous text]: ... helps the model understand the injected context
  • Use additional_context to provide domain-specific instructions or metadata
  • Examples are automatically formatted with proper JSON/YAML syntax and code fences if enabled
