Tier 1 for clean text generation, Tier 2 for full coding tasks
Magpie uses two tiers of LLM interaction, each optimized for its purpose. This architecture avoids forcing a single tool (Goose) to handle tasks it’s not designed for.
Tier 1: Text Generation
One-shot Claude CLI calls for branch names, classifications, and commit messages
Tier 2: Coding Tasks
Full Goose agent loop with streaming and tool access for file edits
Using Goose for simple text generation causes streaming fragmentation — tokens arrive split across multiple events, breaking single-line outputs like branch slugs. Tier 1 bypasses this by calling claude -p directly.
```rust
async fn claude_call(prompt: &str, step_name: &str, trace_dir: Option<&PathBuf>) -> Result<String> {
    let mut tb = TraceBuilder::new(step_name, prompt);

    let output = tokio::process::Command::new("claude")
        .args(["-p", prompt])
        .env_remove("CLAUDECODE")
        .output()
        .await
        .context("failed to run `claude` CLI — is it installed and on PATH?")?;

    if !output.status.success() {
        let stderr = String::from_utf8_lossy(&output.stderr);
        tb.record_event(EventKind::Error, &stderr);
        let trace = tb.finish("");
        if let Some(dir) = trace_dir {
            let _ = trace::write_trace(&trace, dir);
        }
        anyhow::bail!("claude CLI failed: {stderr}");
    }

    let response = String::from_utf8_lossy(&output.stdout).trim().to_string();
    let trace = tb.finish(&response);
    if let Some(dir) = trace_dir {
        let _ = trace::write_trace(&trace, dir);
    }
    Ok(response)
}
```
```rust
async fn generate_branch_slug(task: &str, trace_dir: Option<&PathBuf>) -> String {
    let prompt = format!(
        "Generate a short branch name slug (3-6 words, hyphen-separated) for this task.\n\n\
         Task: {task}\n\n\
         Reply with ONLY the slug, nothing else. Example: add-oauth2-pkce-flow"
    );

    match claude_call(&prompt, "branch-slug", trace_dir).await {
        Ok(response) => {
            let slug = response.trim().to_lowercase();
            ensure_multi_word_slug(&slug, task)
        }
        Err(e) => {
            warn!("branch slug generation failed: {e} → falling back to slugify");
            crate::git::slugify(task)
        }
    }
}
```
If Claude returns a single word like "authentication", the pipeline enriches it by extracting a verb from the task: "add-authentication". See ensure_multi_word_slug() in pipeline.rs:1275-1312.
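The enrichment step can be illustrated with a minimal sketch. This is not the code in pipeline.rs: the function name `ensure_multi_word_slug_sketch`, the `KNOWN_VERBS` list, and the fallback of returning the slug unchanged when no verb is found are all assumptions made for illustration.

```rust
/// Hypothetical verb list; the real list used by Magpie is not shown in this document.
const KNOWN_VERBS: &[&str] = &["add", "fix", "update", "remove", "refactor", "implement"];

/// Sketch: if `slug` is a single word, prefix it with the first known verb found in `task`.
fn ensure_multi_word_slug_sketch(slug: &str, task: &str) -> String {
    // Split on hyphens and ignore empty pieces so "-auth-" counts as one word.
    let words: Vec<&str> = slug.split('-').filter(|w| !w.is_empty()).collect();
    if words.len() != 1 {
        // Already multi-word (or empty): nothing to enrich.
        return slug.to_string();
    }
    let word = words[0];

    // Find the first task word that matches a known verb, ignoring case and punctuation.
    let verb = task
        .split_whitespace()
        .map(|w| w.trim_matches(|c: char| !c.is_alphanumeric()).to_lowercase())
        .find(|w| KNOWN_VERBS.iter().any(|v| *v == w.as_str()));

    match verb {
        // Avoid producing something like "add-add".
        Some(v) if v != word => format!("{v}-{word}"),
        // Assumed fallback: no usable verb, keep the slug as it was.
        _ => slug.to_string(),
    }
}
```

Under these assumptions, calling the sketch with the slug "authentication" and the task "Add authentication to the API" yields "add-authentication", while a slug that already has several words passes through untouched.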
Agent steps dispatch to MagpieAgent in local sandboxes:
blueprint/steps/agent.rs:49-119
```rust
pub async fn execute(
    &self,
    ctx: &StepContext,
    step_name: &str,
    sandbox: &dyn Sandbox,
) -> Result<StepContext> {
    let mut prompt = self.prompt.clone();

    // Prepend metadata context (e.g. chat history) if configured
    if let Some(ref key) = self.context_metadata_key {
        if let Some(value) = ctx.metadata.get(key) {
            prompt = format!("Context from conversation:\n```\n{value}\n```\n\n{prompt}");
        }
    }

    if self.include_last_output {
        if let Some(ref output) = ctx.last_output {
            prompt = format!("Previous step output:\n```\n{output}\n```\n\n{prompt}");
        }
    }

    let response = match sandbox.name() {
        // Remote sandbox: agent can't use Goose locally, so exec `claude -p` inside the sandbox.
        "daytona" => {
            let max_turns = self.max_turns.unwrap_or(10);
            let output = sandbox
                .exec(
                    "claude",
                    &["-p", &prompt, "--max-turns", &max_turns.to_string()],
                )
                .await?;
            if output.exit_code != 0 {
                anyhow::bail!(
                    "claude -p failed inside sandbox (exit {}): {}",
                    output.exit_code,
                    output.stdout.trim()
                );
            }
            output.stdout.trim().to_string()
        }
        // Local sandbox: use the full Goose agent loop (current behavior).
        _ => {
            let mut config = MagpieConfig::default();
            if let Some(turns) = self.max_turns {
                config.max_turns = turns;
            }
            // Read trace_dir from context metadata (injected by pipeline)
            if let Some(dir) = ctx.metadata.get("trace_dir") {
                config.trace_dir = Some(PathBuf::from(dir));
            }
            let agent = MagpieAgent::new(config)?;
            let working_dir = PathBuf::from(sandbox.working_dir());
            agent
                .run(&prompt, Some(&working_dir), Some(step_name))
                .await?
        }
    };

    let mut new_ctx = ctx.clone();
    new_ctx.last_output = Some(response);
    new_ctx.last_exit_code = Some(0);
    Ok(new_ctx)
}
```
Remote sandbox behavior: When running in a Daytona sandbox, agent steps exec claude -p inside the sandbox instead of using Goose locally. This ensures the agent has access to the sandbox’s filesystem and tools.
Why not use Goose for everything?
Goose streams tokens, which fragments single-line outputs. A branch slug like add-oauth2-login arrives as separate events: "add", "-", "oauth2", etc. Turning those events back into one clean line requires complex reassembly logic and still risks corruption.
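To make the reassembly problem concrete, here is a minimal sketch of what a consumer of a token stream has to do to recover a one-line answer. The `StreamEvent` enum and `reassemble` function are hypothetical stand-ins; Goose's actual event types are not shown in this document.

```rust
/// Hypothetical stream event type, for illustration only.
#[allow(dead_code)]
enum StreamEvent {
    /// A fragment of generated text, possibly as small as a single "-".
    Token(String),
    /// A non-text event interleaved with the text.
    ToolCall(String),
    /// End of the stream.
    Done,
}

/// Concatenate text fragments in order until `Done`, skipping non-text events.
fn reassemble(events: &[StreamEvent]) -> String {
    let mut out = String::new();
    for event in events {
        match event {
            StreamEvent::Token(fragment) => out.push_str(fragment),
            StreamEvent::ToolCall(_) => {} // not part of the text output
            StreamEvent::Done => break,
        }
    }
    // Fragments often carry stray whitespace or a trailing newline.
    out.trim().to_string()
}
```

Even this simple version has to decide which events count as text and when the stream has ended. If a fragment is dropped or the stream is cut short, the result is a truncated slug, which is the kind of corruption a one-shot call avoids by returning the whole response at once.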
Why not use Claude CLI for everything?
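As Tier 1 invokes it (a single claude -p call that passes only the prompt), the Claude CLI is a pure text-in, text-out interface. It is not driven through an agent loop, so it is not used to read source code, run tests, or edit files; local coding tasks get that from Goose. The exception is a remote Daytona sandbox, where agent steps exec claude -p with --max-turns inside the sandbox.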
Could we use the Claude API directly instead of CLI?
Yes, but the CLI handles auth, rate limiting, and error retries for us. It’s a simpler interface for one-shot calls.
How does tracing work across both tiers?
Both tiers use TraceBuilder to emit JSONL traces to .magpie/traces/. Each call records prompt, response, duration, and any errors. Tier 2 also records tool use events.
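As a rough illustration of what one JSONL trace line could contain, the sketch below serializes a record with the fields listed above. The `TraceRecord` struct, its field names, and the `duration_ms` key are assumptions, not Magpie's actual schema, and the hand-written escaping only stands in for a proper JSON serializer.

```rust
use std::time::Duration;

/// Hypothetical trace record; field names are illustrative only.
struct TraceRecord<'a> {
    step: &'a str,
    prompt: &'a str,
    response: &'a str,
    duration: Duration,
    error: Option<&'a str>,
}

/// Escape the characters that JSON requires to be escaped inside a string.
fn json_escape(s: &str) -> String {
    let mut out = String::with_capacity(s.len());
    for c in s.chars() {
        match c {
            '"' => out.push_str("\\\""),
            '\\' => out.push_str("\\\\"),
            '\n' => out.push_str("\\n"),
            '\r' => out.push_str("\\r"),
            '\t' => out.push_str("\\t"),
            c if (c as u32) < 0x20 => out.push_str(&format!("\\u{:04x}", c as u32)),
            c => out.push(c),
        }
    }
    out
}

/// Render one record as a single JSONL line (no trailing newline).
fn to_jsonl(r: &TraceRecord) -> String {
    let error = match r.error {
        Some(e) => format!("\"{}\"", json_escape(e)),
        None => "null".to_string(),
    };
    format!(
        "{{\"step\":\"{}\",\"prompt\":\"{}\",\"response\":\"{}\",\"duration_ms\":{},\"error\":{}}}",
        json_escape(r.step),
        json_escape(r.prompt),
        json_escape(r.response),
        r.duration.as_millis(),
        error
    )
}
```

Escaping newlines is what keeps each record on one line, so a multi-line prompt still produces exactly one JSONL entry. In a real codebase a serializer would normally handle this rather than hand-written escaping.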