Overview
The prompting module provides tools for building structured prompts with few-shot examples, handling format-specific output, and managing cross-chunk context for improved coreference resolution.
Module
from langextract import prompting
Classes
PromptTemplateStructured
A structured prompt template for few-shot examples.
@dataclasses.dataclass
class PromptTemplateStructured:
description: str
examples: list[data.ExampleData] = dataclasses.field(default_factory=list)
Fields:
- description (str, required): Instructions or guidelines for the LLM.
- examples (list[ExampleData], default: []): ExampleData objects demonstrating expected input→output behavior for few-shot learning.
QAPromptGenerator
Generates question-answer style prompts from a template.
@dataclasses.dataclass
class QAPromptGenerator:
template: PromptTemplateStructured
format_handler: format_handler.FormatHandler
examples_heading: str = "Examples"
question_prefix: str = "Q: "
answer_prefix: str = "A: "
Fields:
- template (PromptTemplateStructured, required): The prompt template with description and examples.
- format_handler (FormatHandler, required): FormatHandler for managing format-specific output (JSON/YAML).
- examples_heading (str, default: "Examples"): Heading text for the examples section.
- question_prefix (str, default: "Q: "): Prefix for question text.
- answer_prefix (str, default: "A: "): Prefix for answer text.
Methods:
render()
Generates a text representation of the prompt.
def render(self, question: str, additional_context: str | None = None) -> str
Parameters:
- question (str): The question/text to be presented to the model.
- additional_context (str | None, default: None): Additional context to include in the prompt. Empty strings are ignored.
Returns: A text prompt containing the question, ready to be presented to the language model.
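To make the render() contract concrete, here is a hedged, self-contained sketch of how such QA prompt assembly could work; this is illustrative only (the function name and internals are invented, not the library's actual implementation), but it mirrors the documented behavior, including ignoring empty additional_context strings:

```python
def render_sketch(description, question, examples_text="",
                  additional_context=None,
                  question_prefix="Q: ", answer_prefix="A: "):
    """Illustrative sketch of QA prompt assembly (not the real render())."""
    parts = [description]
    if examples_text:
        parts.append("Examples")          # examples_heading default
        parts.append(examples_text)
    if additional_context:                # empty strings are falsy, hence ignored
        parts.append(additional_context)
    parts.append(f"{question_prefix}{question}")
    parts.append(answer_prefix)           # the model completes after "A: "
    return "\n\n".join(parts)
```

The trailing answer prefix cues the model to produce the structured answer directly after it.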
format_example_as_text()
Formats a single example for the prompt.
def format_example_as_text(self, example: data.ExampleData) -> str
Parameters:
- example (ExampleData): The example data to format.
Returns: A string representation of the example, including the question and answer.
PromptBuilder
Base class for building prompts for text chunks.
class PromptBuilder:
def __init__(self, generator: QAPromptGenerator)
Parameters:
- generator (QAPromptGenerator, required): The underlying prompt generator to use.
Methods:
build_prompt()
Builds a prompt for the given chunk.
def build_prompt(
self,
chunk_text: str,
document_id: str,
additional_context: str | None = None
) -> str
Parameters:
- chunk_text (str): The text of the current chunk to process.
- document_id (str): Identifier for the source document.
- additional_context (str | None, default: None): Optional additional context from the document.
Returns: The rendered prompt string, ready for the language model.
ContextAwarePromptBuilder
Prompt builder with cross-chunk context tracking for coreference resolution.
class ContextAwarePromptBuilder(PromptBuilder):
def __init__(
self,
generator: QAPromptGenerator,
context_window_chars: int | None = None
)
Parameters:
- generator (QAPromptGenerator, required): The underlying prompt generator to use.
- context_window_chars (int | None, default: None): Number of characters from the previous chunk's tail to include as context. None disables context injection.
Properties:
context_window_chars: Number of trailing characters from the previous chunk to include as context.
Behavior:
The builder tracks the previous chunk per document_id and injects trailing text from the previous chunk as context. This helps resolve pronouns and coreferences across chunk boundaries.
Example context format:
[Previous text]: ...ending of previous chunk
Current chunk text here...
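The per-document tracking and tail injection described above can be sketched as follows; this is a hedged illustration (the class name and internals here are invented, not the actual ContextAwarePromptBuilder implementation), showing only the context-selection logic:

```python
class ContextTrackerSketch:
    """Illustrative per-document context tracking (not the library's internals)."""

    def __init__(self, context_window_chars=None):
        self.context_window_chars = context_window_chars
        self._previous_chunks = {}  # document_id -> last chunk seen

    def context_for(self, chunk_text, document_id):
        """Return chunk_text, prefixed with the previous chunk's tail if any."""
        prev = self._previous_chunks.get(document_id)
        self._previous_chunks[document_id] = chunk_text  # remember for next call
        if prev is None or not self.context_window_chars:
            return chunk_text
        tail = prev[-self.context_window_chars:]
        return f"[Previous text]: ...{tail}\n{chunk_text}"
```

Keying the state on document_id is what keeps context from one document from bleeding into another when chunks of several documents are interleaved.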
Functions
read_prompt_template_structured_from_file()
Reads a structured prompt template from a file.
def read_prompt_template_structured_from_file(
prompt_path: str,
format_type: data.FormatType = data.FormatType.YAML
) -> PromptTemplateStructured
Parameters:
- prompt_path (str): Path to a file containing PromptTemplateStructured data.
- format_type (FormatType, default: FormatType.YAML): The format of the file (YAML or JSON).
Returns: A PromptTemplateStructured object loaded from the file.
Raises: ParseError if the file cannot be parsed successfully.
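To illustrate the on-disk shape such a template file takes, the following standard-library-only sketch writes a minimal JSON-format template and reads its fields back. This is an assumption-laden illustration of the file layout implied by the docs above (the real function returns a PromptTemplateStructured and also supports YAML):

```python
import json
import pathlib
import tempfile

# A minimal template file body: a description plus one few-shot example.
template_data = {
    "description": "Extract medical entities",
    "examples": [
        {
            "text": "Patient has diabetes",
            "extractions": [{"condition": "diabetes", "condition_index": 1}],
        }
    ],
}

# Write it to a temporary .json file, as you might ship alongside your code.
with tempfile.NamedTemporaryFile(mode="w", suffix=".json", delete=False) as f:
    json.dump(template_data, f)
    path = f.name

# Reading it back recovers the same fields the loader would populate.
raw = json.loads(pathlib.Path(path).read_text())
print(raw["description"])
print(len(raw["examples"]))
```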
Usage Examples
Basic Prompt Template
from langextract.prompting import PromptTemplateStructured, QAPromptGenerator
from langextract.core.data import ExampleData, FormatType
from langextract.core.format_handler import FormatHandler
# Create a prompt template
template = PromptTemplateStructured(
description="Extract person names and their roles from the text.",
examples=[
ExampleData(
text="Dr. Sarah Johnson is the lead researcher.",
extractions=[
{"person": "Dr. Sarah Johnson", "person_index": 1},
{"role": "lead researcher", "role_index": 2}
]
),
ExampleData(
text="Professor Mike Chen teaches computer science.",
extractions=[
{"person": "Professor Mike Chen", "person_index": 1},
{"role": "teacher", "role_index": 2}
]
)
]
)
# Create a generator
format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)
# Render a prompt
prompt = generator.render("John Smith is a software engineer at Google.")
print(prompt)
Loading Template from File
from langextract.prompting import read_prompt_template_structured_from_file
from langextract.core.data import FormatType
# template.yaml contains:
# description: Extract medical entities
# examples:
# - text: Patient has diabetes
# extractions:
# - condition: diabetes
# condition_index: 1
template = read_prompt_template_structured_from_file(
"template.yaml",
format_type=FormatType.YAML
)
print(template.description)
print(f"Loaded {len(template.examples)} examples")
Context-Aware Prompting
from langextract.prompting import (
PromptTemplateStructured,
QAPromptGenerator,
ContextAwarePromptBuilder
)
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType
template = PromptTemplateStructured(
description="Extract entities from medical records."
)
format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)
# Create builder with context window
builder = ContextAwarePromptBuilder(
generator=generator,
context_window_chars=100 # Include last 100 chars from previous chunk
)
# Process chunks sequentially
chunks = [
"Dr. Sarah Johnson examined the patient on Monday.",
"She prescribed antibiotics and scheduled a follow-up."
]
doc_id = "doc1"
for chunk in chunks:
prompt = builder.build_prompt(
chunk_text=chunk,
document_id=doc_id
)
print(f"Prompt for chunk:\n{prompt}\n")
# Second prompt will include context from first chunk
Custom Prefixes and Headings
from langextract.prompting import QAPromptGenerator, PromptTemplateStructured
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType
template = PromptTemplateStructured(
description="Extract key information.",
examples=[]
)
format_handler = FormatHandler(format_type=FormatType.JSON)
generator = QAPromptGenerator(
template=template,
format_handler=format_handler,
examples_heading="Here are some examples:",
question_prefix="Input: ",
answer_prefix="Output: "
)
prompt = generator.render(
question="Extract names from this text.",
additional_context="Focus on proper nouns."
)
print(prompt)
Formatting a Single Example
from langextract.prompting import QAPromptGenerator, PromptTemplateStructured
from langextract.core.format_handler import FormatHandler
from langextract.core.data import ExampleData, FormatType
template = PromptTemplateStructured(
description="Extract entities",
examples=[]
)
format_handler = FormatHandler(format_type=FormatType.JSON, use_fences=True)
generator = QAPromptGenerator(template=template, format_handler=format_handler)
example = ExampleData(
text="John works at Google",
extractions=[
{"person": "John", "person_index": 1},
{"org": "Google", "org_index": 2}
]
)
formatted = generator.format_example_as_text(example)
print(formatted)
# Output:
# Q: John works at Google
# A: ```json
# [{"person": "John", "person_index": 1}, {"org": "Google", "org_index": 2}]
# ```
Multi-Document Context Tracking
from langextract.prompting import (
ContextAwarePromptBuilder,
QAPromptGenerator,
PromptTemplateStructured
)
from langextract.core.format_handler import FormatHandler
from langextract.core.data import FormatType
template = PromptTemplateStructured(description="Extract entities")
format_handler = FormatHandler(format_type=FormatType.YAML)
generator = QAPromptGenerator(template=template, format_handler=format_handler)
builder = ContextAwarePromptBuilder(
generator=generator,
context_window_chars=50
)
# Process multiple documents - context is tracked separately per doc_id
doc1_chunks = ["First chunk of doc1", "Second chunk of doc1"]
doc2_chunks = ["First chunk of doc2", "Second chunk of doc2"]
for chunk in doc1_chunks:
prompt = builder.build_prompt(chunk, document_id="doc1")
# Process...
for chunk in doc2_chunks:
prompt = builder.build_prompt(chunk, document_id="doc2")
# Process...
# Context from doc1 won't bleed into doc2
Notes
- PromptTemplateStructured supports both YAML and JSON format files.
- QAPromptGenerator formats examples according to the specified format_handler.
- ContextAwarePromptBuilder tracks context per document_id to prevent cross-document contamination.
- The context window is measured in characters from the end of the previous chunk.
- The context prefix [Previous text]: ... helps the model understand the injected context.
- Use additional_context to provide domain-specific instructions or metadata.
- Examples are automatically formatted with proper JSON/YAML syntax, with code fences if enabled.