OpenAPI specifications — Files containing openapi or swagger in the name

If your repository includes OpenAPI/Swagger specs, Beacon uses them to generate precise endpoint documentation.
```rust
fn is_openapi(path: &Path) -> bool {
    let name = path
        .file_name()
        .unwrap_or_default()
        .to_string_lossy()
        .to_lowercase();
    name.contains("openapi") || name.contains("swagger")
}
```
This structured context is what gets sent to the AI in Phase 2.
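The exact shape of that context isn't shown in the post; a minimal sketch, with field names assumed from the `build_prompt` excerpt below (`ctx.readme`) and the other inputs the prompt includes, might look like:

```rust
// Hypothetical sketch of the structured context handed from Phase 1 (scanning)
// to Phase 2 (inference). Field names are assumptions, not Beacon's actual
// definitions.
#[derive(Debug, Default)]
struct RepoContext {
    readme: Option<String>,               // README contents, if found
    manifest: Option<String>,             // package manifest (Cargo.toml, package.json, ...)
    openapi_spec: Option<String>,         // raw OpenAPI/Swagger spec, if found
    source_files: Vec<(String, String)>,  // (path, contents) pairs
    existing_agents_md: Option<String>,   // prior AGENTS.md, for refinement
}
```

Keeping this as a plain data struct is what lets the inference phase stay ignorant of how the repository was walked.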
Beacon also detects existing AGENTS.md files during scanning. This allows the AI to refine and update existing documentation rather than starting from scratch.
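The detection itself can be as simple as a file-existence check at the repository root. A minimal sketch (the function name is illustrative, not Beacon's actual code):

```rust
use std::path::{Path, PathBuf};

// Hypothetical sketch: check for an existing AGENTS.md during the scan so the
// AI can refine it rather than start from scratch.
fn find_existing_agents_md(repo_root: &Path) -> Option<PathBuf> {
    let candidate = repo_root.join("AGENTS.md");
    // Return the path only if the file actually exists.
    candidate.is_file().then_some(candidate)
}
```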
Beacon constructs a detailed prompt in src/inferrer.rs:272:
```rust
fn build_prompt(ctx: &RepoContext) -> String {
    // Note: the literals need .to_string() to satisfy Vec<String>.
    let mut parts: Vec<String> = vec![
        "You are an expert at analyzing software repositories...".to_string(),
        "Analyze the following repository context and return a JSON object...".to_string(),
        "GUIDANCE: Look beyond just utility scripts. Identify server-side capabilities, REST API endpoints...".to_string(),
        "CRITICAL: Return ONLY valid JSON. No markdown, no explanation...".to_string(),
    ];

    // Include README (truncated to 3000 chars)
    if let Some(readme) = &ctx.readme {
        parts.push(format!("\n## README\n{}", truncate(readme, 3000)));
    }

    // Include package manifest (1000 chars)
    // Include OpenAPI spec (3000 chars)
    // Include up to 10 source files (1500 chars each)

    parts.join("\n")
}
```
The prompt provides:
- Explicit instructions to focus on agent-usable capabilities (APIs, services, not just scripts)
- The exact JSON schema the AI must follow
- Truncated context from the scanned repository
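The post doesn't reproduce that JSON schema, but its shape can be reconstructed from the fields the generator reads later (`m.authentication`, `m.capabilities`, `m.endpoints`). A hedged Rust-side sketch, with `input_schema`/`output_schema` simplified to `String` (the generator code suggests they are JSON values in the real tool):

```rust
// Hypothetical reconstruction of the manifest shape the AI must return.
// Names mirror the generator excerpts; they are not Beacon's actual types.
struct Authentication {
    r#type: String,
    description: Option<String>,
}

struct Capability {
    name: String,
    description: String,
    input_schema: Option<String>,  // serde_json::Value in the real code; simplified here
    output_schema: Option<String>,
    examples: Vec<String>,
}

struct Parameter {
    name: String,
    r#type: String,
    required: bool,
    description: String,
}

struct Endpoint {
    method: String,
    path: String,
    description: String,
    parameters: Vec<Parameter>,
}

struct Manifest {
    authentication: Option<Authentication>,
    capabilities: Vec<Capability>,
    endpoints: Vec<Endpoint>,
}
```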
Beacon truncates long files to keep token usage reasonable while still providing enough context for accurate inference. README gets 3000 chars, source files get 1500 chars each.
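The `truncate` helper called in `build_prompt` isn't shown; a plausible character-budget sketch (the `[truncated]` marker and char-based counting are assumptions) is:

```rust
// Hypothetical sketch of the truncation helper used by build_prompt.
// Counts chars rather than bytes to avoid splitting multi-byte UTF-8
// sequences, and appends a marker so the model knows content was cut.
fn truncate(s: &str, max_chars: usize) -> String {
    if s.chars().count() <= max_chars {
        s.to_string()
    } else {
        let kept: String = s.chars().take(max_chars).collect();
        format!("{}... [truncated]", kept)
    }
}
```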
```rust
if let Some(auth) = &m.authentication {
    out.push_str("## Authentication\n\n");
    out.push_str(&format!("**Type:** `{}`\n\n", auth.r#type));
    if let Some(desc) = &auth.description {
        out.push_str(&format!("{}\n\n", desc));
    }
}
```
See src/generator.rs:24
Capabilities with Schemas
```rust
for cap in &m.capabilities {
    out.push_str(&format!("### `{}`\n\n", cap.name));
    out.push_str(&format!("{}\n\n", cap.description));
    if let Some(input) = &cap.input_schema {
        out.push_str("**Input:**\n\n```json\n");
        out.push_str(&serde_json::to_string_pretty(input).unwrap_or_default());
        out.push_str("\n```\n\n");
    }
    if let Some(output) = &cap.output_schema {
        out.push_str("**Output:**\n\n```json\n");
        out.push_str(&serde_json::to_string_pretty(output).unwrap_or_default());
        out.push_str("\n```\n\n");
    }
    if !cap.examples.is_empty() {
        out.push_str("**Examples:**\n\n");
        for ex in &cap.examples {
            out.push_str(&format!("- {}\n", ex));
        }
    }
}
```
See src/generator.rs:37
Endpoints Table
```rust
for ep in &m.endpoints {
    out.push_str(&format!(
        "### `{} {}`\n\n{}\n\n",
        ep.method.to_uppercase(),
        ep.path,
        ep.description
    ));
    if !ep.parameters.is_empty() {
        out.push_str("| Parameter | Type | Required | Description |\n");
        out.push_str("|-----------|------|----------|-------------|\n");
        for p in &ep.parameters {
            out.push_str(&format!(
                "| `{}` | `{}` | {} | {} |\n",
                p.name,
                p.r#type,
                if p.required { "✅" } else { "❌" },
                p.description
            ));
        }
    }
}
```
The three-phase design cleanly separates data collection (scanning) from AI inference. This means you can swap providers without changing the scanning or generation logic.
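That provider seam is naturally expressed as a trait. A minimal sketch, assuming a trait name and signature that are not Beacon's actual API:

```rust
// Hypothetical provider abstraction enabled by the phase separation:
// scanning builds a prompt, any provider turns it into JSON text,
// generation consumes the JSON. Swapping providers touches only this trait.
trait InferenceProvider {
    fn infer(&self, prompt: &str) -> Result<String, String>;
}

// A stub provider, e.g. for tests or offline runs.
struct EchoProvider;

impl InferenceProvider for EchoProvider {
    fn infer(&self, prompt: &str) -> Result<String, String> {
        // Returns trivial JSON reporting the prompt length.
        Ok(format!("{{\"echo\": {}}}", prompt.len()))
    }
}
```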
Deterministic Output
By using structured JSON schemas and low temperature (0.2), Beacon produces consistent results. Running the same repo through Beacon twice yields nearly identical AGENTS.md files.
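Those determinism knobs can be bundled into one settings struct. A sketch with illustrative field names; only the 0.2 temperature comes from the text, and `max_tokens` is an invented placeholder:

```rust
// Hypothetical inference settings implied by the text: low temperature for
// reproducibility, plus a JSON-only flag. Not Beacon's actual configuration.
struct InferenceSettings {
    temperature: f32, // 0.2 per the text; lower = more deterministic
    max_tokens: u32,  // placeholder value, not from the source
    json_only: bool,  // enforce the "return ONLY valid JSON" rule
}

impl Default for InferenceSettings {
    fn default() -> Self {
        Self {
            temperature: 0.2,
            max_tokens: 4096,
            json_only: true,
        }
    }
}
```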
Context Optimization
Truncation limits keep token usage reasonable while preserving the most important information (README, key source files). This makes Beacon fast and cost-effective even for large codebases.
Standards Compliant
The generated markdown follows the AAIF specification exactly, ensuring compatibility with any agent that supports the standard.