TypeAgent provides flexible configuration options through ConversationSettings to control indexing, knowledge extraction, and storage behavior.
ConversationSettings
The ConversationSettings class is the primary configuration interface:
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.aitools.model_adapters import create_embedding_model
# Default configuration
settings = ConversationSettings()
# Custom embedding model
embedding_model = create_embedding_model("openai:text-embedding-3-small")
settings = ConversationSettings(model=embedding_model)
# With storage provider
from typeagent.storage.sqlite import SqliteStorageProvider
from typeagent.transcripts.transcript import TranscriptMessage
storage_provider = SqliteStorageProvider(
    db_path="conversation.db",
    message_type=TranscriptMessage
)
settings = ConversationSettings(
model=embedding_model,
storage_provider=storage_provider
)
Embedding Model Configuration
Configure the embedding model used for semantic search:
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.knowpro.convsettings import ConversationSettings
# OpenAI models (default: text-embedding-ada-002)
model = create_embedding_model("openai:text-embedding-ada-002")
model = create_embedding_model("openai:text-embedding-3-small")
model = create_embedding_model("openai:text-embedding-3-large")
# Use in settings
settings = ConversationSettings(model=model)
Once you create a database with a specific embedding model, you must use the same model for all subsequent operations. Mixing embedding models will cause errors.
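One way to guard against mixing models is to compare the model name recorded for a database with the one you are about to use. The sketch below is a hypothetical helper, not part of TypeAgent; it assumes you have retrieved the stored name yourself (for example, from the conversation metadata).

```python
from typing import Optional


def check_embedding_model(stored_name: Optional[str], requested_name: str) -> None:
    """Raise if a database was built with a different embedding model.

    Hypothetical helper: `stored_name` is whatever model name was recorded
    when the database was created; None means a fresh database.
    """
    if stored_name is not None and stored_name != requested_name:
        raise ValueError(
            f"Database was created with {stored_name!r}, "
            f"but {requested_name!r} was requested"
        )


# A fresh database accepts any model; a matching name passes silently.
check_embedding_model(None, "text-embedding-3-small")
check_embedding_model("text-embedding-3-small", "text-embedding-3-small")
```

Calling the helper with two different names raises `ValueError` before any embeddings are written.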
Embedding Model Properties
Access embedding model information:
print(f"Model name: {settings.embedding_model.model_name}")
print(f"Embedding dimensions: {settings.embedding_model.dimensions}")
Control automatic knowledge extraction from messages:
from typeagent.knowpro.convsettings import ConversationSettings
settings = ConversationSettings()
# Enable/disable automatic knowledge extraction
settings.semantic_ref_index_settings.auto_extract_knowledge = True # Default
# Configure batch size for concurrent extraction
settings.semantic_ref_index_settings.batch_size = 4 # Process 4 messages concurrently
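To illustrate what a batch size of 4 means for concurrency, here is a minimal standalone sketch. It is not TypeAgent's implementation: `extract_one` and `extract_in_batches` are hypothetical names, and the sketch only assumes that each batch is processed concurrently before the next one starts.

```python
import asyncio


async def extract_one(message: str) -> str:
    """Stand-in for a per-message extraction call (hypothetical)."""
    await asyncio.sleep(0)  # yield control, as a real network call would
    return message.upper()


async def extract_in_batches(messages: list[str], batch_size: int) -> list[str]:
    """Process messages batch_size at a time, running each batch concurrently."""
    if batch_size < 1:
        raise ValueError("batch_size must be at least 1")
    results: list[str] = []
    for start in range(0, len(messages), batch_size):
        batch = messages[start:start + batch_size]
        # gather() returns results in the same order as its inputs
        results.extend(await asyncio.gather(*(extract_one(m) for m in batch)))
    return results


# Five messages with batch_size=4 run as one batch of 4, then one batch of 1.
extracted = asyncio.run(extract_in_batches(["a", "b", "c", "d", "e"], batch_size=4))
```

Under this assumption, a larger batch size means more requests in flight at once, which raises throughput but also the load on the model endpoint.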
Automatic Mode
Knowledge is extracted automatically during ingestion:
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = True
# Extraction happens automatically
result = await conversation.add_messages_with_indexing(messages)
print(f"Extracted {result.semrefs_added} semantic references")
Manual Mode
Extract knowledge manually after ingestion:
settings = ConversationSettings()
settings.semantic_ref_index_settings.auto_extract_knowledge = False
# Add messages without extraction
await conversation.messages.extend(messages)
# Extract knowledge later
from typeagent.knowpro import secindex
await secindex.extract_and_index_knowledge(
conversation,
settings,
messages
)
Message Text Index Settings
Configure the message text index for semantic search:
from typeagent.aitools.vectorbase import TextEmbeddingIndexSettings
# Access message text index settings
msg_settings = settings.message_text_index_settings
# View underlying embedding settings
embedding_settings = msg_settings.embedding_index_settings
print(f"Min score threshold: {embedding_settings.min_score}")
print(f"Max matches: {embedding_settings.max_matches}")
Default Settings
TypeAgent uses these defaults for message text indexing:
from typeagent.knowpro.convsettings import ConversationSettings
settings = ConversationSettings()
# Message text index
# - min_score: 0.7 (70% similarity threshold)
# - max_matches: unlimited
# Related term index
# - min_score: 0.85 (85% similarity threshold)
# - max_matches: 50
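The following standalone sketch shows how a similarity threshold and a match limit typically interact. It is an illustration only, not TypeAgent's code: `cosine` and `top_matches` are hypothetical helpers, and it assumes scores are cosine similarities kept when they are greater than or equal to `min_score`.

```python
import math
from typing import Optional


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity of two vectors (0.0 if either has zero length)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm = math.sqrt(sum(x * x for x in a)) * math.sqrt(sum(y * y for y in b))
    return dot / norm if norm else 0.0


def top_matches(
    query: list[float],
    candidates: dict[str, list[float]],
    min_score: float,
    max_matches: Optional[int] = None,
) -> list[tuple[str, float]]:
    """Keep candidates scoring at least min_score, best first, capped at max_matches."""
    scored = [(name, cosine(query, vec)) for name, vec in candidates.items()]
    kept = [(name, score) for name, score in scored if score >= min_score]
    kept.sort(key=lambda pair: pair[1], reverse=True)
    return kept if max_matches is None else kept[:max_matches]


vectors = {
    "same": [1.0, 0.0],
    "close": [0.9, 0.1],
    "orthogonal": [0.0, 1.0],
}
# With a 0.85 threshold and no limit, the orthogonal vector is dropped.
matches = top_matches([1.0, 0.0], vectors, min_score=0.85)
```

A higher `min_score` trades recall for precision, while `max_matches` bounds how many results are returned regardless of how many pass the threshold.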
Configure fuzzy matching and synonym expansion:
settings = ConversationSettings()
# Access related terms settings
related_settings = settings.related_term_index_settings
# View embedding settings
embedding_settings = related_settings.embedding_index_settings
print(f"Min score: {embedding_settings.min_score}") # 0.85
print(f"Max matches: {embedding_settings.max_matches}") # 50
The related terms index enables:
- Fuzzy matching of entity names
- Synonym expansion for verbs and actions
- Alias resolution (e.g., “Dr. Smith” → “John Smith”)
Thread Detection Settings
Configure conversation thread detection:
settings = ConversationSettings()
# Thread settings use the same embedding model
thread_settings = settings.thread_settings
print(f"Thread detection min score: {thread_settings.min_score}") # 0.85
Storage Provider Configuration
Step 1: Choose a Storage Provider
Select between in-memory and SQLite storage:
from typeagent.knowpro.convsettings import ConversationSettings
settings = ConversationSettings()
# Option 1: Let TypeAgent choose (SQLite if dbname provided, else Memory)
provider = await settings.get_storage_provider()
# Option 2: Explicitly set storage provider
from typeagent.storage.sqlite import SqliteStorageProvider
provider = SqliteStorageProvider(
db_path="conversation.db",
message_type=TranscriptMessage
)
settings.storage_provider = provider
Step 2: Pass Settings When Creating Storage
Pass settings when creating storage:
from typeagent.storage.utils import create_storage_provider
provider = await create_storage_provider(
message_text_settings=settings.message_text_index_settings,
related_terms_settings=settings.related_term_index_settings,
dbname="conversation.db",
message_type=TranscriptMessage
)
Step 3: Access Provider Properties
Inspect storage provider configuration:
# Get storage provider
provider = settings.storage_provider
# Access collections
messages = await provider.get_message_collection()
semantic_refs = await provider.get_semantic_ref_collection()
# Access indexes
semantic_ref_index = await provider.get_semantic_ref_index()
message_text_index = await provider.get_message_text_index()
related_terms_index = await provider.get_related_terms_index()
Configure metadata for tracking and organization:
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation
# Create with metadata
conversation = await create_conversation(
dbname="demo.db",
message_type=TranscriptMessage,
name="Q1 2024 Team Meetings",
tags=["engineering", "weekly-sync"],
extras={
"team": "Platform Engineering",
"quarter": "Q1-2024"
}
)
Access conversation metadata:
# Get metadata from storage provider
metadata = await provider.get_conversation_metadata()
print(f"Name: {metadata.name_tag}")
print(f"Tags: {metadata.tags}")
print(f"Created: {metadata.created_at}")
print(f"Updated: {metadata.updated_at}")
print(f"Embedding model: {metadata.embedding_model}")
print(f"Schema version: {metadata.schema_version}")
# Access custom fields
if metadata.extra:
    for key, value in metadata.extra.items():
        print(f"{key}: {value}")
Modify conversation metadata:
# Update metadata fields
await provider.set_conversation_metadata(
name_tag="Updated Name",
tag=["new-tag", "another-tag"],
custom_field="custom value"
)
# Update timestamps
from datetime import datetime, timezone
await provider.update_conversation_timestamps(
updated_at=datetime.now(timezone.utc)
)
Complete Configuration Example
Here’s a complete example with all configuration options:
import asyncio
from datetime import datetime, timezone
from dotenv import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings
from typeagent.knowpro.interfaces import ConversationMetadata
from typeagent.aitools.model_adapters import create_embedding_model
from typeagent.storage.utils import create_storage_provider
from typeagent.transcripts.transcript import TranscriptMessage
from typeagent import create_conversation
load_dotenv()
async def main():
    # 1. Configure embedding model
    embedding_model = create_embedding_model("openai:text-embedding-3-small")
    # 2. Create conversation settings
    settings = ConversationSettings(model=embedding_model)
    # 3. Configure knowledge extraction
    settings.semantic_ref_index_settings.auto_extract_knowledge = True
    settings.semantic_ref_index_settings.batch_size = 4
    # 4. Create metadata
    metadata = ConversationMetadata(
        name_tag="Engineering Standup",
        tags=["daily", "engineering"],
        extra={
            "team": "Backend",
            "quarter": "Q1-2024"
        }
    )
    # 5. Create storage provider
    provider = await create_storage_provider(
        message_text_settings=settings.message_text_index_settings,
        related_terms_settings=settings.related_term_index_settings,
        dbname="standup.db",
        message_type=TranscriptMessage,
        metadata=metadata
    )
    # 6. Set provider in settings
    settings.storage_provider = provider
    # 7. Create conversation
    conversation = await create_conversation(
        dbname=None,  # Already have provider
        message_type=TranscriptMessage,
        settings=settings,
        name="Engineering Standup",
        tags=["daily", "engineering"]
    )
    print("Conversation configured successfully")
    print(f"Embedding model: {settings.embedding_model.model_name}")
    print(f"Auto extract: {settings.semantic_ref_index_settings.auto_extract_knowledge}")
    print(f"Batch size: {settings.semantic_ref_index_settings.batch_size}")
if __name__ == "__main__":
    asyncio.run(main())
Environment Variables
Use environment variables to configure API keys and model access.
Create a .env file:
# OpenAI API
OPENAI_API_KEY=sk-...
OPENAI_ORG_ID=org-... # Optional
# Azure OpenAI (alternative)
AZURE_OPENAI_API_KEY=...
AZURE_OPENAI_ENDPOINT=https://....openai.azure.com/
AZURE_OPENAI_DEPLOYMENT_NAME=...
Load in your application:
from dotenv import load_dotenv
from typeagent.knowpro.convsettings import ConversationSettings
load_dotenv() # Load from .env file
# Now create settings - API keys are automatically picked up
settings = ConversationSettings()
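Before creating settings, it can help to fail early when a required key is missing. The sketch below is a hypothetical helper, not part of TypeAgent; which variables count as required depends on the provider you use, so the list passed in is an assumption.

```python
import os
from typing import Mapping, Sequence


def missing_env_vars(
    required: Sequence[str], env: Mapping[str, str] = os.environ
) -> list[str]:
    """Return the names in `required` that are unset or empty in `env`."""
    return [name for name in required if not env.get(name)]


# Example with an explicit mapping instead of the real environment.
example_env = {"OPENAI_API_KEY": "sk-example", "AZURE_OPENAI_API_KEY": ""}
missing = missing_env_vars(["OPENAI_API_KEY", "AZURE_OPENAI_API_KEY"], example_env)
```

In an application you would call `missing_env_vars(["OPENAI_API_KEY"])` after `load_dotenv()` and raise a clear error if the returned list is not empty.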
Configuration Best Practices
Use consistent embedding models
Always use the same embedding model for a database:
# Store model name with metadata
await provider.set_conversation_metadata(
embedding_name="text-embedding-3-small"
)
Set appropriate batch sizes
Balance throughput and resource usage:
# For development: smaller batches, faster feedback
settings.semantic_ref_index_settings.batch_size = 2
# For production: larger batches, better throughput
settings.semantic_ref_index_settings.batch_size = 10
Always enable automatic knowledge extraction for interactive applications:
# Enable for query capabilities
settings.semantic_ref_index_settings.auto_extract_knowledge = True
Add descriptive metadata so conversations are easier to organize and find:
conversation = await create_conversation(
dbname="project-sync.db",
message_type=TranscriptMessage,
name="Project Sync - 2024-01-15",
tags=["project-alpha", "sync-meeting"],
extras={
"project": "Alpha",
"meeting_type": "sync",
"date": "2024-01-15"
}
)