TypeAgent provides flexible configuration options through ConversationSettings to control indexing, knowledge extraction, and storage behavior.

ConversationSettings

The ConversationSettings class is the primary configuration interface:
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model

# Default configuration
settings = ConversationSettings()

# Custom embedding model
embedding_model = create_embedding_model("openai:text-embedding-3-small")
settings = ConversationSettings(model=embedding_model)

# With storage provider
from typeagent.storage.sqlite import SqliteStorageProvider
from typeagent.transcripts.transcript import TranscriptMessage

storage_provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage
)
settings = ConversationSettings(
    model=embedding_model,
    storage_provider=storage_provider
)

Embedding Model Configuration

Configure the embedding model used for semantic search:
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.knowpro.convsettings import ConversationSettings

# OpenAI models (default: text-embedding-ada-002); pick one of the following
model = create_embedding_model("openai:text-embedding-ada-002")
model = create_embedding_model("openai:text-embedding-3-small")
model = create_embedding_model("openai:text-embedding-3-large")

# Use in settings
settings = ConversationSettings(model=model)
Once you create a database with a specific embedding model, you must use the same model for all subsequent operations. Mixing embedding models will cause errors.
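
To illustrate why mixed models are a problem, here is a minimal sketch in plain Python. It does not use TypeAgent itself: `cosine_similarity` and `check_model` are hypothetical helpers written for this example, and the vectors are toy data. Vectors produced by different models can differ in length, and even when the lengths happen to match they live in unrelated vector spaces, so similarity scores between them are not meaningful.

```python
import math
from typing import Optional


def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity that refuses vectors of different lengths."""
    if len(a) != len(b):
        raise ValueError(f"dimension mismatch: {len(a)} vs {len(b)}")
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def check_model(stored_name: Optional[str], configured_name: str) -> None:
    """Hypothetical guard: fail fast if a database was built with another model.

    stored_name is None for a fresh database that has no model recorded yet.
    """
    if stored_name is not None and stored_name != configured_name:
        raise RuntimeError(
            f"database was indexed with {stored_name!r}, "
            f"but settings use {configured_name!r}"
        )


if __name__ == "__main__":
    # Same model on both sides: the comparison is well defined.
    print(cosine_similarity([1.0, 0.0], [1.0, 0.0]))
    # Fresh database: nothing recorded yet, so any model is acceptable.
    check_model(None, "text-embedding-3-small")
```

A guard like this could run before opening an existing database, comparing the model name recorded in the conversation metadata with the one in the current settings.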

Embedding Model Properties

Access embedding model information:
print(f"Model name: {settings.embedding_model.model_name}")
print(f"Embedding dimensions: {settings.embedding_model.dimensions}")

Knowledge Extraction Settings

Control automatic knowledge extraction from messages:
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Enable/disable automatic knowledge extraction
settings.semantic_ref_index_settings.auto_extract_knowledge = True  # Default

# Configure batch size for concurrent extraction
settings.semantic_ref_index_settings.batch_size = 4  # Process 4 messages concurrently
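
As a rough mental model of what a batch size means for concurrent work, the sketch below processes messages in groups with `asyncio.gather`. It is an assumption about the general pattern, not TypeAgent's actual implementation: `extract_knowledge` and `extract_in_batches` are hypothetical names, and the "extraction" is a placeholder that returns a string.

```python
import asyncio


async def extract_knowledge(message: str) -> str:
    """Hypothetical stand-in for a model call that extracts knowledge."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return f"knowledge({message})"


async def extract_in_batches(messages: list[str], batch_size: int) -> list[str]:
    """Run extraction concurrently within each batch, one batch after another."""
    results: list[str] = []
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        # gather() preserves input order, so results line up with messages.
        results.extend(await asyncio.gather(*(extract_knowledge(m) for m in batch)))
    return results


if __name__ == "__main__":
    messages = [f"msg{i}" for i in range(10)]
    # With batch_size=4, ten messages are processed as groups of 4, 4, and 2.
    extracted = asyncio.run(extract_in_batches(messages, batch_size=4))
    print(len(extracted))
```

Under this model, a larger batch size means more requests in flight at once, which raises throughput but also the load on the model endpoint.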

Extraction Modes

Knowledge is extracted automatically during ingestion:
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = True

# Extraction happens automatically
result = await conversation.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")

Message Text Index Settings

Configure the message text index for semantic search:
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings

# Access message text index settings
msg_settings = settings.message_text_index_settings

# View underlying embedding settings
embedding_settings = msg_settings.embedding_index_settings
print(f"Min score threshold: {embedding_settings.min_score}")
print(f"Max matches: {embedding_settings.max_matches}")

Default Settings

TypeAgent uses these defaults for message text indexing:
from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Message text index
# - min_score: 0.7 (70% similarity threshold)
# - max_matches: unlimited

# Related term index  
# - min_score: 0.85 (85% similarity threshold)
# - max_matches: 50
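
To make the two knobs concrete, the following sketch shows how a minimum score and a match cap typically combine when filtering similarity results. It is illustrative only: `select_matches` is a hypothetical helper, the scores are made up, and treating the threshold as inclusive is an assumption rather than documented TypeAgent behavior.

```python
from typing import Optional


def select_matches(
    scored: list[tuple[str, float]],
    min_score: float,
    max_matches: Optional[int] = None,
) -> list[tuple[str, float]]:
    """Keep candidates scoring at least min_score, best first, capped at max_matches."""
    kept = [pair for pair in scored if pair[1] >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    # max_matches=None means "no cap", mirroring the unlimited default above.
    return kept if max_matches is None else kept[:max_matches]


if __name__ == "__main__":
    candidates = [("budget", 0.91), ("price", 0.72), ("cost", 0.86), ("weather", 0.40)]

    # Message-text style settings: threshold 0.7, no cap.
    print(select_matches(candidates, min_score=0.7))

    # Related-term style settings: threshold 0.85, at most 50 matches.
    print(select_matches(candidates, min_score=0.85, max_matches=50))
```

The stricter 0.85 threshold keeps only close matches, which suits term expansion where a loose match would pull in unrelated vocabulary.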

Related Terms Index Settings

Configure fuzzy matching and synonym expansion:
settings = ConversationSettings()

# Access related terms settings
related_settings = settings.related_term_index_settings

# View embedding settings
embedding_settings = related_settings.embedding_index_settings
print(f"Min score: {embedding_settings.min_score}")  # 0.85
print(f"Max matches: {embedding_settings.max_matches}")  # 50
The related terms index enables:
  • Fuzzy matching of entity names
  • Synonym expansion for verbs and actions
  • Alias resolution (e.g., “Dr. Smith” → “John Smith”)

Thread Detection Settings

Configure conversation thread detection:
settings = ConversationSettings()

# Thread settings use the same embedding model
thread_settings = settings.thread_settings
print(f"Thread detection min score: {thread_settings.min_score}")  # 0.85

Storage Provider Configuration

Step 1: Choose a Storage Provider

Select between in-memory and SQLite storage:

from typeagent.knowpro.convsettings import ConversationSettings

settings = ConversationSettings()

# Option 1: Let TypeAgent choose (SQLite if dbname provided, else Memory)
provider = await settings.get_storage_provider()

# Option 2: Explicitly set storage provider
from typeagent.storage.sqlite import SqliteStorageProvider

provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage
)
settings.storage_provider = provider

Step 2: Configure Provider Settings

Pass settings when creating storage:

from typeagent.storage.utils import create_storage_provider

provider = await create_storage_provider(
    message_text_settings=settings.message_text_index_settings,
    related_terms_settings=settings.related_term_index_settings,
    dbname="conversation.db",
    message_type=TranscriptMessage
)

Step 3: Access Provider Properties

Inspect storage provider configuration:

# Get storage provider
provider = settings.storage_provider

# Access collections
messages = await provider.get_message_collection()
semantic_refs = await provider.get_semantic_ref_collection()

# Access indexes
semantic_ref_index = await provider.get_semantic_ref_index()
message_text_index = await provider.get_message_text_index()
related_terms_index = await provider.get_related_terms_index()
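
The selection rule mentioned in Step 1 (SQLite when a database name is provided, in-memory otherwise) can be sketched as follows. The classes here are hypothetical stand-ins rather than TypeAgent's providers, and treating an empty string the same as a missing name is an assumption of this example.

```python
from dataclasses import dataclass
from typing import Optional, Union


@dataclass
class MemoryStore:
    """Hypothetical stand-in for an in-memory provider (data lost on exit)."""
    kind: str = "memory"


@dataclass
class SqliteStore:
    """Hypothetical stand-in for a SQLite-backed provider (data persisted)."""
    db_path: str
    kind: str = "sqlite"


def choose_store(dbname: Optional[str]) -> Union[MemoryStore, SqliteStore]:
    """Pick persistent storage only when a database name is given."""
    if dbname:
        return SqliteStore(db_path=dbname)
    return MemoryStore()


if __name__ == "__main__":
    print(choose_store("conversation.db").kind)
    print(choose_store(None).kind)
```

In-memory storage is convenient for tests and short-lived scripts, while a database file is needed whenever the indexed conversation must survive a restart.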

Conversation Metadata

Configure metadata for tracking and organization:
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation

# Create with metadata
conversation = await create_conversation(
    dbname="demo.db",
    message_type=TranscriptMessage,
    name="Q1 2024 Team Meetings",
    tags=["engineering", "weekly-sync"],
    extras={
        "team": "Platform Engineering",
        "quarter": "Q1-2024"
    }
)

Reading Metadata

Access conversation metadata:
# Get metadata from storage provider
metadata = await provider.get_conversation_metadata()

print(f"Name: {metadata.name_tag}")
print(f"Tags: {metadata.tags}")
print(f"Created: {metadata.created_at}")
print(f"Updated: {metadata.updated_at}")
print(f"Embedding model: {metadata.embedding_model}")
print(f"Schema version: {metadata.schema_version}")

# Access custom fields
if metadata.extra:
    for key, value in metadata.extra.items():
        print(f"{key}: {value}")

Updating Metadata

Modify conversation metadata:
# Update metadata fields
await provider.set_conversation_metadata(
    name_tag="Updated Name",
    tag=["new-tag", "another-tag"],
    custom_field="custom value"
)

# Update timestamps
from datetime import datetime, timezone

await provider.update_conversation_timestamps(
    updated_at=datetime.now(timezone.utc)
)
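
The timestamp above is created with `timezone.utc`, which makes it timezone-aware. A small standard-library sketch shows what that buys you; `utc_now_iso` is a hypothetical helper for this example, and how the storage provider serializes timestamps internally is not covered here.

```python
from datetime import datetime, timedelta, timezone


def utc_now_iso() -> str:
    """Current time as a timezone-aware ISO 8601 string in UTC."""
    return datetime.now(timezone.utc).isoformat()


if __name__ == "__main__":
    stamp = utc_now_iso()
    parsed = datetime.fromisoformat(stamp)

    # The offset survives a round trip through text, so the value stays unambiguous.
    print(stamp)
    print(parsed.utcoffset() == timedelta(0))
```

A naive `datetime.now()` carries no offset, so its meaning depends on the machine's local time zone, and ordering comparisons between naive and aware values raise `TypeError`.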

Complete Configuration Example

Here’s a complete example that combines the embedding model, knowledge extraction, metadata, and storage settings:
import asyncio
from datetime import datetime, timezone
from dotenv import load_dotenv

from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.storage.utils import create_storage_provider
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation

load_dotenv()

async def main():
    # 1. Configure embedding model
    embedding_model = create_embedding_model("openai:text-embedding-3-small")
    
    # 2. Create conversation settings
    settings = ConversationSettings(model=embedding_model)
    
    # 3. Configure knowledge extraction
    settings.semantic_ref_index_settings.auto_extract_knowledge = True
    settings.semantic_ref_index_settings.batch_size = 4
    
    # 4. Create metadata
    metadata = ConversationMetadata(
        name_tag="Engineering Standup",
        tags=["daily", "engineering"],
        extra={
            "team": "Backend",
            "quarter": "Q1-2024"
        }
    )
    
    # 5. Create storage provider
    provider = await create_storage_provider(
        message_text_settings=settings.message_text_index_settings,
        related_terms_settings=settings.related_term_index_settings,
        dbname="standup.db",
        message_type=TranscriptMessage,
        metadata=metadata
    )
    
    # 6. Set provider in settings
    settings.storage_provider = provider
    
    # 7. Create conversation
    conversation = await create_conversation(
        dbname=None,  # Already have provider
        message_type=TranscriptMessage,
        settings=settings,
        name="Engineering Standup",
        tags=["daily", "engineering"]
    )
    
    print("Conversation configured successfully")
    print(f"Embedding model: {settings.embedding_model.model_name}")
    print(f"Auto extract: {settings.semantic_ref_index_settings.auto_extract_knowledge}")
    print(f"Batch size: {settings.semantic_ref_index_settings.batch_size}")

if __name__ == "__main__":
    asyncio.run(main())

Environment Variables

Use environment variables to configure API keys and model access.
Create a .env file:
# OpenAI API
OPENAI_API_KEY=sk-...
OPENAI_ORG_ID=org-...  # Optional

# Azure OpenAI (alternative)
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://....openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=...
Load in your application:
from dotenv import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings

load_dotenv()  # Load from .env file

# Now create settings - API keys are automatically picked up
settings = ConversationSettings()
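
Before creating settings, it can help to check that credentials are actually present. The sketch below is a preflight check using only the standard library and the variable names from the `.env` example above. `detect_provider` is a hypothetical helper, and the order in which it prefers OpenAI over Azure is an assumption of this example, not a statement about how TypeAgent resolves credentials.

```python
import os
from typing import Mapping, Optional


def detect_provider(env: Mapping[str, str] = os.environ) -> Optional[str]:
    """Report which set of credentials is available, or None if neither is."""
    if env.get("OPENAI_API_KEY"):
        return "openai"
    # Azure needs both a key and an endpoint to be usable.
    if env.get("AZURE_OPENAI_API_KEY") and env.get("AZURE_OPENAI_ENDPOINT"):
        return "azure"
    return None


if __name__ == "__main__":
    provider = detect_provider()
    if provider is None:
        print("No API credentials found; check your .env file")
    else:
        print(f"Using {provider} credentials")
```

Running a check like this right after `load_dotenv()` turns a missing key into a clear message at startup instead of a failure on the first embedding request.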

Configuration Best Practices

Use consistent embedding models

Always use the same embedding model for a database:

# Store model name with metadata
await provider.set_conversation_metadata(
    embedding_name="text-embedding-3-small"
)

Set appropriate batch sizes

Balance throughput and resource usage:

# For development: smaller batches, faster feedback
settings.semantic_ref_index_settings.batch_size = 2

# For production: larger batches, better throughput
settings.semantic_ref_index_settings.batch_size = 10

Enable knowledge extraction

Always enable extraction for interactive applications:

# Enable for query capabilities
settings.semantic_ref_index_settings.auto_extract_knowledge = True

Use meaningful metadata

Add context for organization:

conversation = await create_conversation(
    dbname="project-sync.db",
    message_type=TranscriptMessage,
    name="Project Sync - 2024-01-15",
    tags=["project-alpha", "sync-meeting"],
    extras={
        "project": "Alpha",
        "meeting_type": "sync",
        "date": "2024-01-15"
    }
)
