Embeddings
Embeddings power semantic search in Corvus memory. Instead of exact keyword matches, embeddings capture meaning and context.
How It Works
- Text → Vector: Convert text to a high-dimensional vector (e.g., 1536 floats)
- Store: Save vector alongside memory content
- Query: Convert query to vector, find nearest neighbors via cosine similarity
- Return: Memories with highest semantic similarity
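The four steps above can be sketched as a brute-force nearest-neighbor search. `StoredMemory` and `top_k` are hypothetical names used only for illustration, and the sketch assumes a linear scan over all stored vectors; the actual store may be organized differently.

```rust
/// Cosine similarity between two equal-length vectors; 0.0 if either is all zeros.
fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "Vectors must have same dimension");
    let dot: f32 = a.iter().zip(b).map(|(x, y)| x * y).sum();
    let mag_a = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let mag_b = b.iter().map(|x| x * x).sum::<f32>().sqrt();
    if mag_a == 0.0 || mag_b == 0.0 {
        return 0.0;
    }
    dot / (mag_a * mag_b)
}

/// Hypothetical stored memory: the content plus its embedding vector.
struct StoredMemory {
    content: String,
    embedding: Vec<f32>,
}

/// Score every memory against the query vector and return the `k` best matches.
fn top_k<'a>(query: &[f32], memories: &'a [StoredMemory], k: usize) -> Vec<(&'a str, f32)> {
    let mut scored: Vec<(&str, f32)> = memories
        .iter()
        .map(|m| (m.content.as_str(), cosine_similarity(query, &m.embedding)))
        .collect();
    // Highest similarity first; total_cmp gives a total order over f32.
    scored.sort_by(|a, b| b.1.total_cmp(&a.1));
    scored.truncate(k);
    scored
}
```

A linear scan is fine for a few thousand memories; larger stores usually need an approximate nearest-neighbor index instead.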
Configuration
[memory]
embedding_provider = "openai" # or "noop" to disable
vector_weight = 0.7 # 70% vector, 30% keyword
keyword_weight = 0.3
Embedding Providers
OpenAI (Default)
Uses OpenAI’s text-embedding-3-small model (1536 dimensions):
[memory]
embedding_provider = "openai"
Cost: $0.02 per 1M tokens (~$0.000002 per embedding)
API Key: Reads from api_key in config
Noop (Disable Vectors)
Keyword search only:
[memory]
embedding_provider = "noop"
Use when:
- No internet access
- Cost-sensitive deployments
- Keyword search is sufficient
Custom URL
Point to any OpenAI-compatible embedding API:
[memory]
embedding_provider = "custom"
embedding_url = "https://your-api.com/embeddings"
EmbeddingProvider Trait
From src/memory/embeddings.rs:13-20:
#[async_trait]
pub trait EmbeddingProvider: Send + Sync {
    /// Generate embedding vector for text
    async fn embed(&self, text: &str) -> anyhow::Result<Vec<f32>>;

    /// Embedding dimension (e.g., 1536 for OpenAI)
    fn dimension(&self) -> usize;
}
Implementation: OpenAI
From src/memory/embeddings.rs:
pub struct OpenAiEmbedding {
    api_key: String,
    client: reqwest::Client,
    model: String,
}

impl OpenAiEmbedding {
    pub fn new(api_key: &str) -> Self {
        Self {
            api_key: api_key.to_string(),
            client: reqwest::Client::new(),
            model: "text-embedding-3-small".to_string(),
        }
    }
}

#[async_trait]
impl EmbeddingProvider for OpenAiEmbedding {
    async fn embed(&self, text: &str) -> anyhow::Result<Vec<f32>> {
        let resp = self.client
            .post("https://api.openai.com/v1/embeddings")
            .header("Authorization", format!("Bearer {}", self.api_key))
            .json(&serde_json::json!({
                "input": text,
                "model": self.model,
            }))
            .send()
            .await?
            .json::<serde_json::Value>()
            .await?;

        let embedding = resp["data"][0]["embedding"]
            .as_array()
            .ok_or_else(|| anyhow::anyhow!("No embedding in response"))?
            .iter()
            .map(|v| v.as_f64().unwrap() as f32)
            .collect();

        Ok(embedding)
    }

    fn dimension(&self) -> usize {
        1536
    }
}
Cosine Similarity
From src/memory/vector.rs:
pub fn cosine_similarity(a: &[f32], b: &[f32]) -> f32 {
    assert_eq!(a.len(), b.len(), "Vectors must have same dimension");

    let dot: f32 = a.iter().zip(b.iter()).map(|(x, y)| x * y).sum();
    let mag_a: f32 = a.iter().map(|x| x * x).sum::<f32>().sqrt();
    let mag_b: f32 = b.iter().map(|x| x * x).sum::<f32>().sqrt();

    if mag_a == 0.0 || mag_b == 0.0 {
        return 0.0;
    }

    dot / (mag_a * mag_b)
}
Range: [-1, 1] where:
- 1.0 = same direction (semantically identical)
- 0.0 = orthogonal (unrelated)
- -1.0 = opposite direction
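For reference, the function above computes the standard cosine formula. The vectors in the worked values are made up for illustration and cover the three landmark cases of the range.

```latex
\[
\cos(\mathbf{a}, \mathbf{b})
  = \frac{\mathbf{a} \cdot \mathbf{b}}{\lVert \mathbf{a} \rVert \, \lVert \mathbf{b} \rVert}
  = \frac{\sum_i a_i b_i}{\sqrt{\sum_i a_i^2} \, \sqrt{\sum_i b_i^2}}
\]
\begin{align*}
\mathbf{a} = (1, 2, 2),\ \mathbf{b} = (2, 4, 4) &: \quad \frac{18}{3 \cdot 6} = 1 \\
\mathbf{a} = (1, 0),\ \mathbf{b} = (0, 1) &: \quad \frac{0}{1 \cdot 1} = 0 \\
\mathbf{a} = (1, 2, 2),\ \mathbf{b} = (-1, -2, -2) &: \quad \frac{-9}{3 \cdot 3} = -1
\end{align*}
```

The first case shows that only direction matters: scaling a vector does not change its similarity to another.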
Embedding Cache
To avoid redundant API calls, embeddings are cached by content hash:
use sha2::{Sha256, Digest};

fn content_hash(text: &str) -> String {
    let mut hasher = Sha256::new();
    hasher.update(text.as_bytes());
    format!("{:x}", hasher.finalize())
}

pub async fn get_or_embed(&self, text: &str) -> anyhow::Result<Vec<f32>> {
    let hash = content_hash(text);

    // Check cache
    if let Some(cached) = self.cache.get(&hash) {
        return Ok(cached.clone());
    }

    // Generate embedding
    let embedding = self.embedder.embed(text).await?;

    // Store in cache
    self.cache.insert(hash, embedding.clone());

    Ok(embedding)
}
Cache hit rate: ~70-80% in typical usage
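To see how a hit rate like this could be measured, here is a minimal, dependency-free sketch of the same get-or-embed pattern with hit and miss counters. `CachedEmbedder`, `content_key`, and `hit_rate` are hypothetical names. The sketch swaps SHA-256 for the standard library's `DefaultHasher` (not a cryptographic hash) and uses a synchronous closure in place of the async provider.

```rust
use std::collections::hash_map::DefaultHasher;
use std::collections::HashMap;
use std::hash::{Hash, Hasher};

/// Stand-in for the SHA-256 content hash: keeps the sketch std-only,
/// but a 64-bit non-cryptographic hash can collide.
fn content_key(text: &str) -> u64 {
    let mut hasher = DefaultHasher::new();
    text.hash(&mut hasher);
    hasher.finish()
}

/// Hypothetical cache wrapper that counts hits and misses.
struct CachedEmbedder<F: Fn(&str) -> Vec<f32>> {
    embed: F, // stand-in for the real (async) provider call
    cache: HashMap<u64, Vec<f32>>,
    hits: u64,
    misses: u64,
}

impl<F: Fn(&str) -> Vec<f32>> CachedEmbedder<F> {
    fn new(embed: F) -> Self {
        Self { embed, cache: HashMap::new(), hits: 0, misses: 0 }
    }

    /// Return the cached vector if present; otherwise embed and store it.
    fn get_or_embed(&mut self, text: &str) -> Vec<f32> {
        let key = content_key(text);
        if let Some(cached) = self.cache.get(&key) {
            self.hits += 1;
            return cached.clone();
        }
        self.misses += 1;
        let embedding = (self.embed)(text);
        self.cache.insert(key, embedding.clone());
        embedding
    }

    /// Fraction of lookups served from the cache (0.0 before any lookup).
    fn hit_rate(&self) -> f64 {
        let total = self.hits + self.misses;
        if total == 0 { 0.0 } else { self.hits as f64 / total as f64 }
    }
}
```

Keying on a content hash means identical text is embedded only once, however often it is stored or queried.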
Text Chunking
Long documents are split into chunks before embedding:
From src/memory/chunker.rs:
pub fn chunk_text(text: &str, max_lines: usize) -> Vec<String> {
    let lines: Vec<&str> = text.lines().collect();
    let mut chunks = Vec::new();
    let mut current_chunk = String::new();
    let mut current_heading = String::new();

    for line in lines {
        // Preserve headings across chunks
        if line.starts_with('#') {
            current_heading = line.to_string();
        }

        if current_chunk.lines().count() >= max_lines {
            chunks.push(current_chunk.trim().to_string());
            current_chunk = format!("{}\n", current_heading);
        }

        current_chunk.push_str(line);
        current_chunk.push('\n');
    }

    if !current_chunk.is_empty() {
        chunks.push(current_chunk.trim().to_string());
    }

    chunks
}
Default: 50 lines per chunk
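A small worked example may make the heading carry-over easier to see. The block repeats `chunk_text` as listed above so that it compiles on its own; `demo_chunks` and its sample input are made up for illustration, and `max_lines` is set to 3 instead of the default to keep the output short.

```rust
pub fn chunk_text(text: &str, max_lines: usize) -> Vec<String> {
    let mut chunks = Vec::new();
    let mut current_chunk = String::new();
    let mut current_heading = String::new();

    for line in text.lines() {
        // Remember the most recent heading so later chunks can repeat it.
        if line.starts_with('#') {
            current_heading = line.to_string();
        }
        // Chunk is full: emit it and start the next one with the heading.
        if current_chunk.lines().count() >= max_lines {
            chunks.push(current_chunk.trim().to_string());
            current_chunk = format!("{}\n", current_heading);
        }
        current_chunk.push_str(line);
        current_chunk.push('\n');
    }
    if !current_chunk.is_empty() {
        chunks.push(current_chunk.trim().to_string());
    }
    chunks
}

/// Hypothetical helper: chunk a five-line document three lines at a time.
fn demo_chunks() -> Vec<String> {
    chunk_text("# Title\na\nb\nc\nd", 3)
}
```

With this input, `demo_chunks()` returns two chunks, `"# Title\na\nb"` and `"# Title\nc\nd"`. The repeated heading counts toward `max_lines`, so each follow-on chunk holds one fewer body line.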
Hybrid Search Weights
Tune weights based on your use case:
Semantic-Heavy (Default)
vector_weight = 0.7
keyword_weight = 0.3
Best for: Conceptual queries, natural language
Keyword-Heavy
vector_weight = 0.3
keyword_weight = 0.7
Best for: Exact terms, technical queries, code search
Balanced
vector_weight = 0.5
keyword_weight = 0.5
Best for: Mixed queries
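To illustrate how the weights shift rankings, here is a sketch that assumes the two scores are combined as a weighted linear sum and are both normalized to [0, 1]. This page does not state the exact combination formula, so treat `hybrid_score` as a hypothetical helper rather than the Corvus implementation.

```rust
/// Blend a vector-similarity score and a keyword score into one ranking score.
/// Assumption: a weighted linear sum over scores already normalized to [0, 1].
fn hybrid_score(
    vector_score: f32,
    keyword_score: f32,
    vector_weight: f32,
    keyword_weight: f32,
) -> f32 {
    vector_weight * vector_score + keyword_weight * keyword_score
}

/// True if result A outranks result B under the given weights.
/// Each result is a (vector_score, keyword_score) pair.
fn a_outranks_b(a: (f32, f32), b: (f32, f32), vector_weight: f32, keyword_weight: f32) -> bool {
    hybrid_score(a.0, a.1, vector_weight, keyword_weight)
        > hybrid_score(b.0, b.1, vector_weight, keyword_weight)
}
```

Take a result A that is a strong semantic match (0.9 vector, 0.2 keyword) and a result B that is a strong keyword match (0.4 vector, 0.95 keyword). Under these assumptions A ranks first with the semantic-heavy weights, while B ranks first with the keyword-heavy and balanced weights.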
Custom Embedding Provider
Implement the trait for your own provider:
use async_trait::async_trait;
use corvus::memory::embeddings::EmbeddingProvider;

pub struct LocalEmbedding {
    model: YourEmbeddingModel,
}

#[async_trait]
impl EmbeddingProvider for LocalEmbedding {
    async fn embed(&self, text: &str) -> anyhow::Result<Vec<f32>> {
        // Your embedding logic
        let embedding = self.model.encode(text)?;
        Ok(embedding)
    }

    fn dimension(&self) -> usize {
        384 // Your model's dimension
    }
}
Register in src/memory/mod.rs:
let embedder: Arc<dyn EmbeddingProvider> = match config.embedding_provider.as_str() {
    "openai" => Arc::new(OpenAiEmbedding::new(&config.api_key)),
    "local" => Arc::new(LocalEmbedding::new()),
    _ => Arc::new(NoopEmbedding),
};
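The selection pattern can be tried without any dependencies using the sketch below. It replaces the async trait with a simplified synchronous stand-in, and everything in it is an assumption for illustration: the empty-vector behavior of `NoopEmbedding`, the toy `LocalEmbedding`, and the `select_provider` helper are not taken from the Corvus source.

```rust
use std::sync::Arc;

/// Simplified, synchronous stand-in for the async `EmbeddingProvider` trait,
/// so the sketch compiles with the standard library alone.
trait EmbeddingProvider: Send + Sync {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String>;
    fn dimension(&self) -> usize;
}

/// Assumed behavior: a no-op provider yields an empty vector, leaving
/// only keyword search with anything to score.
struct NoopEmbedding;

impl EmbeddingProvider for NoopEmbedding {
    fn embed(&self, _text: &str) -> Result<Vec<f32>, String> {
        Ok(Vec::new())
    }
    fn dimension(&self) -> usize {
        0
    }
}

/// Hypothetical local provider: a toy 4-bucket byte histogram.
struct LocalEmbedding;

impl EmbeddingProvider for LocalEmbedding {
    fn embed(&self, text: &str) -> Result<Vec<f32>, String> {
        let mut v = vec![0.0_f32; self.dimension()];
        for b in text.bytes() {
            v[b as usize % 4] += 1.0;
        }
        Ok(v)
    }
    fn dimension(&self) -> usize {
        4
    }
}

/// Map a configured provider name to an implementation; unknown names
/// fall back to the no-op provider, mirroring the match above.
fn select_provider(name: &str) -> Arc<dyn EmbeddingProvider> {
    match name {
        "local" => Arc::new(LocalEmbedding),
        _ => Arc::new(NoopEmbedding),
    }
}
```

Whatever provider you add, a useful sanity check is that the length of every vector returned by `embed` equals `dimension()`, since vectors of mixed length cannot be compared.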
Latency
| Provider | Latency | Notes |
|---|---|---|
| OpenAI API | 100-300ms | Network call |
| Local model | 10-50ms | CPU/GPU bound |
| Noop | <1ms | No embedding |
Cost
OpenAI pricing (as of 2026):
- text-embedding-3-small: $0.02 / 1M tokens
- text-embedding-3-large: $0.13 / 1M tokens
Estimate: 1000 memories × 200 tokens each = 200,000 tokens ≈ $0.004 with text-embedding-3-small
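The same arithmetic as a small helper, in case you want to plug in your own volumes. `embedding_cost_usd` is a hypothetical function, and the sketch assumes every memory has the same token count.

```rust
/// Estimated embedding cost in USD for `count` texts of `tokens_each` tokens,
/// at a price quoted per one million tokens.
fn embedding_cost_usd(count: u64, tokens_each: u64, usd_per_million_tokens: f64) -> f64 {
    (count * tokens_each) as f64 / 1_000_000.0 * usd_per_million_tokens
}
```

For 1000 memories of 200 tokens, this gives about $0.004 at the text-embedding-3-small price and about $0.026 at the text-embedding-3-large price.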
Caching Impact
With an 80% cache hit rate and 100-300 ms per sequential API call:
- Without cache: 1000 embeds = 100-300 seconds
- With cache: 200 embeds = 20-60 seconds
Result: 5× speedup, plus proportionally lower API cost
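These figures follow from simple arithmetic, sketched below with hypothetical helper names. The sketch assumes calls are made one after another and that cache lookups take negligible time.

```rust
/// Provider calls that remain after cache hits, rounded to the nearest call.
fn uncached_calls(total: u64, hit_rate: f64) -> u64 {
    (total as f64 * (1.0 - hit_rate)).round() as u64
}

/// Sequential wall-clock range in seconds, given per-call latency bounds in ms.
fn sequential_seconds(calls: u64, min_ms: f64, max_ms: f64) -> (f64, f64) {
    (calls as f64 * min_ms / 1000.0, calls as f64 * max_ms / 1000.0)
}
```

With 1000 lookups and a 0.8 hit rate, 200 calls remain, which at 100-300 ms each take 20-60 seconds. Batching or concurrent requests would shorten both figures.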
Best Practices
- Enable the embedding cache (default) to avoid redundant API calls
- Use text-embedding-3-small (1536d) for the best cost/performance ratio
- Don’t embed secrets or PII: with a remote provider, the text is sent to an external API to be embedded