OpenAIEmbeddingModel
OpenAI-based embedding model for converting text to vector representations.
Constructor
OpenAIEmbeddingModel(
    model_name: str = "text-embedding-3-small",
    model_batch_size: int = 50,
    n_concurrent_jobs: int = 5,
)
model_name
str
default: "text-embedding-3-small"
OpenAI embedding model to use (e.g., "text-embedding-3-small", "text-embedding-3-large")
model_batch_size
int
default: 50
Number of texts to embed in each batch
n_concurrent_jobs
int
default: 5
Maximum number of concurrent API requests
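The two throughput parameters work together: texts are chunked into groups of model_batch_size, and at most n_concurrent_jobs batch requests run at once. A minimal sketch of that pattern (an illustration, not kura's actual implementation; embed_batch stands in for the API call):

```python
import asyncio

async def embed_batched(texts, embed_batch, batch_size=50, n_concurrent=5):
    """Chunk texts into batches and embed them with bounded concurrency."""
    sem = asyncio.Semaphore(n_concurrent)
    batches = [texts[i:i + batch_size] for i in range(0, len(texts), batch_size)]

    async def run(batch):
        async with sem:  # at most n_concurrent requests in flight
            return await embed_batch(batch)

    per_batch = await asyncio.gather(*(run(b) for b in batches))
    # gather preserves batch order, so flattening keeps embeddings aligned with texts
    return [vec for batch in per_batch for vec in batch]
```

Larger batches mean fewer requests; more concurrency means higher throughput until the provider's rate limits push back.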
Methods
embed()
Embed a list of texts into vector representations.
async def embed(texts: list[str]) -> list[list[float]]
List of text strings to embed
List of embedding vectors (one per input text)
Example:
from kura.embedding import OpenAIEmbeddingModel
model = OpenAIEmbeddingModel(
    model_name="text-embedding-3-large",
    model_batch_size=100,
    n_concurrent_jobs=10,
)
texts = ["Hello world", "Embedding example"]
embeddings = await model.embed(texts)
print(f"Generated {len(embeddings)} embeddings")
SentenceTransformerEmbeddingModel
Local embedding model using Sentence Transformers (requires the sentence-transformers package).
Constructor
SentenceTransformerEmbeddingModel(
    model_name: str = "all-MiniLM-L6-v2",
    model_batch_size: int = 128,
    device: str = "cpu",
)
model_name
str
default: "all-MiniLM-L6-v2"
Sentence Transformer model name from HuggingFace
model_batch_size
int
default: 128
Number of texts to embed in each batch
device
str
default: "cpu"
Device to run the model on ("cpu", "cuda", "mps")
Methods
embed()
Embed a list of texts into vector representations.
async def embed(texts: list[str]) -> list[list[float]]
List of text strings to embed
List of embedding vectors (one per input text)
Example:
from kura.embedding import SentenceTransformerEmbeddingModel
# Use local model (no API calls)
model = SentenceTransformerEmbeddingModel(
    model_name="all-MiniLM-L6-v2",
    device="cuda",  # use GPU if available
)
texts = ["Local embedding", "No API needed"]
embeddings = await model.embed(texts)
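Whichever backend you choose, embed() returns plain lists of floats, so downstream similarity math needs no extra dependencies. For instance, a small helper (illustrative, not part of kura) to compare two returned vectors:

```python
import math

def cosine_similarity(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two equal-length embedding vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    # 1.0 means identical direction, 0.0 means orthogonal (unrelated)
    return dot / (norm_a * norm_b)
```

This is the same measure clustering methods typically use to group nearby embeddings.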
CohereEmbeddingModel
Cohere-based embedding model (requires cohere package).
Constructor
CohereEmbeddingModel(
    model_name: str = "embed-v4.0",
    model_batch_size: int = 96,
    n_concurrent_jobs: int = 5,
    input_type: str = "clustering",
    api_key: str | None = None,
)
model_name
str
default: "embed-v4.0"
Cohere embedding model to use
model_batch_size
int
default: 96
Number of texts to embed in each batch
n_concurrent_jobs
int
default: 5
Maximum number of concurrent API requests
input_type
str
default: "clustering"
Type of input for Cohere ("clustering", "search_document", "search_query")
api_key
str | None
default: None
Cohere API key (if None, uses an environment variable)
Methods
embed()
Embed a list of texts into vector representations.
async def embed(texts: list[str]) -> list[list[float]]
List of text strings to embed
List of embedding vectors (one per input text)
Example:
from kura.embedding import CohereEmbeddingModel
model = CohereEmbeddingModel(
    model_name="embed-v4.0",
    input_type="clustering",
)
texts = ["Cohere embedding", "Alternative provider"]
embeddings = await model.embed(texts)
embed_summaries()
Embed conversation summaries and return items ready for clustering. This is a utility function that wraps the embedding model to produce the dictionary format expected by clustering methods.
async def embed_summaries(
    summaries: list[ConversationSummary],
    embedding_model: BaseEmbeddingModel,
) -> list[dict[str, Union[ConversationSummary, list[float]]]]
summaries
list[ConversationSummary]
required
List of conversation summaries to embed
embedding_model
BaseEmbeddingModel
required
Embedding model to use
Returns
list[dict[str, Union[ConversationSummary, list[float]]]]
List of dictionaries with "item" (ConversationSummary) and "embedding" (list[float]) keys
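The returned shape is straightforward to reproduce: pair each summary with its vector under the "item" and "embedding" keys. A self-contained sketch with stand-ins for kura's types (Summary and fake_embed below are hypothetical placeholders, not kura APIs):

```python
import asyncio
from dataclasses import dataclass

@dataclass
class Summary:
    """Stand-in for kura's ConversationSummary."""
    summary: str

async def fake_embed(texts: list[str]) -> list[list[float]]:
    # Stand-in for an embedding model's embed() call
    return [[float(len(t))] for t in texts]

async def embed_summaries_sketch(summaries: list[Summary]) -> list[dict]:
    vectors = await fake_embed([s.summary for s in summaries])
    # One dict per summary, in input order, matching the format described above
    return [{"item": s, "embedding": v} for s, v in zip(summaries, vectors)]
```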
Example:
from kura.embedding import embed_summaries, OpenAIEmbeddingModel
from kura.cluster import KmeansClusteringModel
# Embed summaries
embedding_model = OpenAIEmbeddingModel()
embedded_items = await embed_summaries(summaries, embedding_model)
# Use with clustering
clustering_method = KmeansClusteringModel()
cluster_mapping = clustering_method.cluster(embedded_items)