Debugging

Debugging AI applications requires specialized tools to understand model behavior, trace execution flows, and diagnose issues. Genkit provides comprehensive debugging capabilities through traces, the Developer UI, and CLI tools.

Trace-Based Debugging

Genkit automatically collects detailed execution traces that show every step of your flow execution.

Enabling Traces

Traces are automatically collected when:

Using genkit start:
```
genkit start -- npm run dev
```
Running with GENKIT_ENV=dev:
```
GENKIT_ENV=dev npm run dev
```

Executing flows via CLI:

genkit flow:run myFlow '{"input":"data"}'

Viewing Traces in Developer UI

The Developer UI provides the most powerful trace inspection:

Start the Developer UI:
```
genkit start -- npm run dev
```
Navigate to the Traces section at http://localhost:4000
Select a trace to inspect:
- Recent executions appear at the top
- Failed executions are highlighted
- Click any trace to view details

Trace Information

Each trace includes:

- Span Tree: Hierarchical view of all operations
- Timing Data: Duration of each step
- Input/Output: Data passed between steps
- Model Interactions: Prompts sent and responses received
- Error Details: Stack traces and error messages
- Metadata: Flow name, version, labels, and attributes

Example trace for a greeting flow:

└─ Flow: simpleGreeting (234ms)
   ├─ Input: {"customerName": "Sam"}
   ├─ Prompt: greetingPrompt (210ms)
   │  ├─ Model: gemini-flash-latest
   │  ├─ Request: "You're a barista...Sam enters..."
   │  └─ Response: "Welcome back, Sam! How about..."
   └─ Output: "Welcome back, Sam! How about a cappuccino?"

Debugging Common Issues

Flow Execution Failures

Symptom: Flow throws an error or returns unexpected results

Debug Steps:

Check the trace in the Developer UI:
- Identify which step failed
- Review error message and stack trace
- Examine input data to that step

Run the flow with test data:

genkit flow:run myFlow '{"test":"data"}' --stream

Verify input schema:

// Add validation logging
const flow = ai.defineFlow(
  {
    name: 'myFlow',
    inputSchema: z.object({
      question: z.string(),
    }),
  },
  async (input) => {
    console.log('Received input:', input);
    // ... rest of flow
  }
);

Check each step:
- Review the trace to see where execution stopped
- Check if null or undefined values are passed
- Verify model responses are as expected
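
The trace checks above (find the step that failed, then look for null or undefined values in its input) can also be applied to trace data you have exported. The following is a minimal sketch under stated assumptions: the `Span` shape, `findFailedPath`, `findMissingValues`, and `exampleTrace` are hypothetical names for illustration and do not reflect Genkit's actual trace format.

```typescript
// Simplified, hypothetical span shape; real Genkit traces carry more fields.
interface Span {
  name: string;
  input?: unknown;
  error?: string;
  children: Span[];
}

// Depth-first search that returns the chain of spans leading to the deepest failure.
function findFailedPath(span: Span): Span[] | null {
  for (const child of span.children) {
    const path = findFailedPath(child);
    if (path) return [span, ...path];
  }
  return span.error !== undefined ? [span] : null;
}

// Collects the paths of all null or undefined values inside a span input.
function findMissingValues(value: unknown, path: string = 'input'): string[] {
  if (value === null || value === undefined) return [path];
  if (typeof value !== 'object') return [];
  const missing: string[] = [];
  if (Array.isArray(value)) {
    value.forEach((item, index) => {
      missing.push(...findMissingValues(item, `${path}[${index}]`));
    });
    return missing;
  }
  const record = value as Record<string, unknown>;
  for (const key of Object.keys(record)) {
    missing.push(...findMissingValues(record[key], `${path}.${key}`));
  }
  return missing;
}

// Made-up trace data for illustration only.
const exampleTrace: Span = {
  name: 'myFlow',
  input: { question: 'hi' },
  error: 'generate failed',
  children: [
    { name: 'retrieve', input: { query: 'hi' }, children: [] },
    {
      name: 'generate',
      input: { prompt: { text: 'hi', context: undefined }, docs: [null] },
      error: 'generate failed',
      children: [],
    },
  ],
};

const failedPath = findFailedPath(exampleTrace);
if (failedPath) {
  const failedSpan = failedPath[failedPath.length - 1];
  console.log('Failed at:', failedPath.map((s) => s.name).join(' > '));
  console.log('Missing values:', findMissingValues(failedSpan.input));
}
```

Reporting the deepest failed span first tends to point at the root cause, since parent spans usually fail only because a child did.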

Model Response Issues

Symptom: Model returns unexpected or low-quality responses

Debug Steps:

Inspect the prompt in traces:
- View the exact prompt sent to the model
- Check if template variables were substituted correctly
- Verify context and examples are included
Test prompt directly in Developer UI:
- Navigate to Prompts section
- Select your prompt
- Try different inputs
- Compare outputs from different models
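
To check whether template variables were substituted correctly, you can scan the rendered prompt text copied from a trace for leftover placeholders. This sketch assumes Handlebars-style `{{variable}}` placeholders; `findUnrenderedPlaceholders` is a hypothetical helper, not part of Genkit.

```typescript
// Returns the names of Handlebars-style placeholders (e.g. {{name}}) that
// still appear in text that should already have been rendered.
function findUnrenderedPlaceholders(rendered: string): string[] {
  const found: string[] = [];
  const pattern = /\{\{\s*([^{}]+?)\s*\}\}/g;
  let match: RegExpExecArray | null;
  while ((match = pattern.exec(rendered)) !== null) {
    found.push(match[1]);
  }
  return found;
}

// Paste the prompt text shown in the trace here.
const renderedPrompt = 'Greet {{ customerName }} and suggest a drink from {{menu}}.';
const leftovers = findUnrenderedPlaceholders(renderedPrompt);
if (leftovers.length > 0) {
  console.warn('Unrendered template variables:', leftovers);
}
```

An empty result does not prove the prompt is right, only that no placeholder survived rendering; the values themselves still need a visual check in the trace.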

Add prompt logging:

const prompt = ai.definePrompt(
  {
    name: 'myPrompt',
    model: googleAI.model('gemini-flash-latest'),
  },
  async (input) => {
    console.log('Prompt input:', input);
    const result = await ai.generate({
      prompt: `Process this: ${JSON.stringify(input)}`, // stringify so object inputs don't render as "[object Object]"
    });
    console.log('Model response:', result.text);
    return result;
  }
);

Check model configuration:

// Verify generation settings such as temperature and maxOutputTokens
// (topK and topP, if you use them, go in the same config object)
const result = await ai.generate({
  model: googleAI.model('gemini-flash-latest'),
  config: {
    temperature: 0.7,
    maxOutputTokens: 1000,
  },
  prompt: 'Your prompt here',
});

Streaming Issues

Symptom: Streaming output doesn’t work or is incomplete

Debug Steps:

Test streaming with CLI:

genkit flow:run myFlow '{"input":"data"}' --stream

Verify streaming implementation:

const streamingFlow = ai.defineFlow(
  { name: 'streamingFlow' },
  async (input, { sendChunk }) => {
    // In Genkit 1.x, generateStream returns the chunk stream and a promise
    // for the final response as properties, not as functions
    const { stream, response } = ai.generateStream({
      model: googleAI.model('gemini-flash-latest'),
      prompt: 'Tell me a story',
    });

    for await (const chunk of stream) {
      console.log('Chunk:', chunk.text); // Debug log
      sendChunk(chunk.text);
    }

    return (await response).text;
  }
);

Check for blocking operations:
- Ensure you’re not awaiting the full response before streaming
- Verify no synchronous operations block the event loop
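
The first point above is easiest to see side by side. The sketch below uses a fake in-memory stream instead of a model call; `fakeModelStream`, `bufferThenSend`, and `forwardAsReceived` are hypothetical names chosen for illustration.

```typescript
// Stand-in for a model stream; a real one would come from the model call.
async function* fakeModelStream(): AsyncGenerator<string> {
  yield 'Once ';
  yield 'upon ';
  yield 'a time';
}

// Anti-pattern: drains the whole stream first, so the client receives
// a single chunk only after generation has finished.
async function bufferThenSend(
  stream: AsyncIterable<string>,
  sendChunk: (chunk: string) => void,
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    full += chunk;
  }
  sendChunk(full);
  return full;
}

// Preferred: forwards every chunk as soon as it arrives.
async function forwardAsReceived(
  stream: AsyncIterable<string>,
  sendChunk: (chunk: string) => void,
): Promise<string> {
  let full = '';
  for await (const chunk of stream) {
    sendChunk(chunk);
    full += chunk;
  }
  return full;
}

async function compareStreaming(): Promise<void> {
  const buffered: string[] = [];
  const forwarded: string[] = [];
  await bufferThenSend(fakeModelStream(), (c) => { buffered.push(c); });
  await forwardAsReceived(fakeModelStream(), (c) => { forwarded.push(c); });
  console.log('Chunks sent when buffering:', buffered.length);
  console.log('Chunks sent when forwarding:', forwarded.length);
}

compareStreaming();
```

If a flow streams but the client sees everything at once, counting `sendChunk` calls like this is a quick way to tell whether the flow or a layer in front of it is doing the buffering.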

Performance Issues

Symptom: Flows are slow or time out

Debug Steps:

Analyze timing in traces:
- Open the trace in Developer UI
- Identify the slowest spans
- Check if model calls are taking too long
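
When a trace has many nested spans, total durations can mislead because a parent always includes its children. One way to rank spans is by self time. This is a rough sketch that assumes children run one after another; the `TimedSpan` shape and `slowestSpans` helper are hypothetical, and the sample numbers come from the greeting trace shown earlier.

```typescript
// Simplified, hypothetical span shape with durations in milliseconds.
interface TimedSpan {
  name: string;
  durationMs: number;
  children: TimedSpan[];
}

interface SelfTime {
  name: string;
  selfMs: number;
}

// Self time = span duration minus the time spent in its direct children.
// Assumes children run sequentially; parallel children would overcount,
// so the result is clamped at zero.
function selfTimes(span: TimedSpan): SelfTime[] {
  let childTotal = 0;
  const nested: SelfTime[] = [];
  for (const child of span.children) {
    childTotal += child.durationMs;
    nested.push(...selfTimes(child));
  }
  const own: SelfTime = {
    name: span.name,
    selfMs: Math.max(0, span.durationMs - childTotal),
  };
  return [own, ...nested];
}

// Returns the `limit` spans with the largest self time, slowest first.
function slowestSpans(root: TimedSpan, limit: number): SelfTime[] {
  return selfTimes(root)
    .sort((a, b) => b.selfMs - a.selfMs)
    .slice(0, limit);
}

const greetingTrace: TimedSpan = {
  name: 'simpleGreeting',
  durationMs: 234,
  children: [{ name: 'greetingPrompt', durationMs: 210, children: [] }],
};

console.log(slowestSpans(greetingTrace, 2));
```

Here the flow itself accounts for only 24 ms; the remaining 210 ms sit in the prompt span, which is where optimization effort would pay off.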

Measure specific operations:

const flow = ai.defineFlow(
  { name: 'myFlow' },
  async (input) => {
    console.time('model-call');
    const result = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: input.question,
    });
    console.timeEnd('model-call');
    
    return result.text;
  }
);

Check for unnecessary operations:
- Review trace to find redundant model calls
- Look for sequential operations that could be parallel
- Verify retrieval queries are optimized
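
Independent steps that run one after another are a common cause of slow flows. The sketch below contrasts sequential awaits with `Promise.all`, using a timer as a stand-in for a model or retrieval call; `makeTrackedCall`, `runSequential`, and `runParallel` are hypothetical names, and the peak counter exists only to make the difference observable.

```typescript
type Call = (label: string) => Promise<string>;

// Creates a fake async call that records how many calls were in flight at once.
function makeTrackedCall(): { call: Call; peak: () => number } {
  let active = 0;
  let peak = 0;
  const call: Call = async (label) => {
    active += 1;
    peak = Math.max(peak, active);
    // Stand-in for a model or retrieval call.
    await new Promise<void>((resolve) => setTimeout(resolve, 20));
    active -= 1;
    return label;
  };
  return { call, peak: () => peak };
}

// Each call waits for the previous one, even though they are independent.
async function runSequential(call: Call): Promise<string[]> {
  const summary = await call('summary');
  const keywords = await call('keywords');
  return [summary, keywords];
}

// Both calls start immediately; results keep the input order.
async function runParallel(call: Call): Promise<string[]> {
  return Promise.all([call('summary'), call('keywords')]);
}

async function compareConcurrency(): Promise<void> {
  const sequential = makeTrackedCall();
  const parallel = makeTrackedCall();
  await runSequential(sequential.call);
  await runParallel(parallel.call);
  console.log('Peak concurrent calls (sequential):', sequential.peak());
  console.log('Peak concurrent calls (parallel):', parallel.peak());
}

compareConcurrency();
```

Only parallelize steps that do not depend on each other's output, and keep provider rate limits in mind when fanning out model calls.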

Optimize model configuration:

// Reduce max tokens if output is too long
config: {
  maxOutputTokens: 500,  // Instead of 2000
}

Context and RAG Issues

Symptom: Model doesn’t use provided context or retrieval fails

Debug Steps:

Verify context is passed:

genkit flow:run myFlow '{"question":"test"}' --context '["context1","context2"]'

Inspect retrieval in traces:

const ragFlow = ai.defineFlow(
  { name: 'ragFlow' },
  async (input) => {
    // `retriever` is a retriever reference defined elsewhere in your app
    const docs = await ai.retrieve({
      retriever,
      query: input.question,
    });
    
    console.log('Retrieved docs:', docs); // Debug log
    
    const result = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: `Context: ${docs.map(d => d.text).join('\n')}\n\nQuestion: ${input.question}`,
    });
    
    return result.text;
  }
);

Check document retrieval:
- View retriever span in trace
- Verify documents were found
- Check similarity scores
- Ensure embeddings are generated correctly
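
If similarity scores look suspicious, it can help to recompute one by hand from the raw embedding vectors. The sketch below assumes your vector store ranks by cosine similarity, which depends on its configuration; `cosineSimilarity` is a hypothetical helper and the vectors are made-up values.

```typescript
// Cosine similarity between two embedding vectors of equal dimension.
// Throws on the two problems that most often indicate a broken embedding step.
function cosineSimilarity(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new Error(`Dimension mismatch: ${a.length} vs ${b.length}`);
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  if (normA === 0 || normB === 0) {
    throw new Error('Zero-length vector: the embedding may not have been generated');
  }
  return dot / (Math.sqrt(normA) * Math.sqrt(normB));
}

// Made-up vectors standing in for a query embedding and a document embedding.
const queryEmbedding = [0.2, 0.7, 0.1];
const docEmbedding = [0.25, 0.6, 0.05];
console.log('Similarity:', cosineSimilarity(queryEmbedding, docEmbedding).toFixed(3));
```

A dimension mismatch usually means the query and the documents were embedded with different models, which makes the scores meaningless.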

Schema Validation Errors

Symptom: Input/output validation fails

Debug Steps:

Add detailed error handling:

const flow = ai.defineFlow(
  {
    name: 'myFlow',
    inputSchema: z.object({
      name: z.string(),
      age: z.number(),
    }),
  },
  async (input) => {
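    // Input is validated against inputSchema before this function runs,
    // so this catch only sees errors thrown while processing the input
    try {
      // Process input
    } catch (error) {
      console.error('Flow processing error:', error);
      throw error;
    }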
  }
);

Test schema directly:

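// inputSchema is the same z.object({ name: z.string(), age: z.number() }) passed to defineFlow
const testInput = { name: 'Alice', age: '30' }; // age is a string, so validation fails
const result = inputSchema.safeParse(testInput);
if (!result.success) console.log('Validation issues:', result.error.issues);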

Review trace for schema errors:
- Check error message in trace
- Verify actual vs expected types
- Ensure all required fields are provided
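
Comparing actual and expected types by eye gets tedious for larger inputs. The sketch below is a dependency-free stand-in for that comparison, not a replacement for Zod; `FieldType` and `describeTypeMismatches` are hypothetical names, and it only handles flat objects with primitive fields.

```typescript
type FieldType = 'string' | 'number' | 'boolean';

// Lists every field whose runtime type differs from the expected one,
// and every expected field that is missing.
function describeTypeMismatches(
  expected: Record<string, FieldType>,
  actual: Record<string, unknown>,
): string[] {
  const problems: string[] = [];
  for (const field of Object.keys(expected)) {
    const value = actual[field];
    if (value === undefined) {
      problems.push(`${field}: missing (expected ${expected[field]})`);
    } else if (typeof value !== expected[field]) {
      problems.push(`${field}: expected ${expected[field]}, got ${typeof value}`);
    }
  }
  return problems;
}

// Same shape as the flow input above: name is a string, age is a number.
const expectedShape: Record<string, FieldType> = { name: 'string', age: 'number' };
console.log(describeTypeMismatches(expectedShape, { name: 'Alice', age: '30' }));
```

A string where a number is expected, as with `age` here, often comes from form fields or query parameters that were never converted.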

Developer UI Debugging Features

Real-Time Trace Inspection

When running with genkit start, traces appear immediately:

- Run a flow from the Flows section
- Click “View Trace” to inspect execution
- Expand each span to see details
- Review timing to identify bottlenecks

Comparing Executions

Compare multiple executions to identify patterns:

- Run the same flow with different inputs
- View traces side-by-side
- Compare model responses
- Identify inconsistencies

Error Highlighting

Failed executions are clearly marked:

- Red indicators for errors
- Stack traces in span details
- Error messages at the top level
- Failed step highlighted in span tree

CLI Debugging Commands

Running Flows with Verbose Output

# Stream output to see progress
genkit flow:run myFlow '{"input":"data"}' --stream

# Save output for inspection
genkit flow:run myFlow '{"input":"data"}' --output debug-output.json

Extracting Debug Data

Extract traces for offline analysis:

# Extract recent executions
genkit eval:extractData myFlow --maxRows 10 --output debug-traces.json

# Extract labeled runs
genkit eval:extractData myFlow --label "debug-session" --output debug-data.json

Batch Testing for Debugging

Test multiple scenarios:

genkit flow:batchRun myFlow test-cases.json --label "debug-batch" --output results.json

Then review all traces in the Developer UI filtered by label.

Logging Best Practices

Strategic Console Logs

Add logs at key points:

const debugFlow = ai.defineFlow(
  { name: 'debugFlow' },
  async (input) => {
    console.log('[DEBUG] Flow started with:', input);
    
    const docs = await ai.retrieve({ retriever, query: input.query });
    console.log('[DEBUG] Retrieved docs:', docs.length);
    
    const result = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: `Answer: ${input.query}`,
    });
    console.log('[DEBUG] Model response:', result.text);
    
    return result.text;
  }
);

Structured Logging

Use JSON for structured logs:

function debugLog(stage: string, data: any) {
  console.log(JSON.stringify({
    timestamp: new Date().toISOString(),
    stage,
    data,
  }));
}

// Usage
debugLog('input-received', input);
debugLog('model-response', result);
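
One caveat with JSON logs: `JSON.stringify` turns an `Error` into `{}` and throws on circular references, both of which are common in debug data. The sketch below shows one possible way to make values safe to log; `toLoggable` and `formatDebugLog` are hypothetical helpers, not part of Genkit.

```typescript
// Converts a value into something JSON.stringify can handle:
// Errors keep their name, message, and stack; Dates become ISO strings;
// circular references are replaced with a marker.
function toLoggable(value: unknown, ancestors: Set<object> = new Set()): unknown {
  if (value instanceof Error) {
    return { name: value.name, message: value.message, stack: value.stack };
  }
  if (value instanceof Date) return value.toISOString();
  if (value === null || typeof value !== 'object') return value;
  if (ancestors.has(value)) return '[Circular]';

  ancestors.add(value);
  let result: unknown;
  if (Array.isArray(value)) {
    result = value.map((item) => toLoggable(item, ancestors));
  } else {
    const record = value as Record<string, unknown>;
    const copy: Record<string, unknown> = {};
    for (const key of Object.keys(record)) {
      copy[key] = toLoggable(record[key], ancestors);
    }
    result = copy;
  }
  // Remove again so shared (non-circular) references are not flagged.
  ancestors.delete(value);
  return result;
}

// Builds one structured log line in the same shape as debugLog above.
function formatDebugLog(stage: string, data: unknown): string {
  return JSON.stringify({
    timestamp: new Date().toISOString(),
    stage,
    data: toLoggable(data),
  });
}

console.log(formatDebugLog('model-error', new Error('Request timed out')));
```

Keeping the formatter separate from `console.log` also makes it straightforward to send the same line to a file or a log collector.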

Debugging in Production

While you shouldn’t run with GENKIT_ENV=dev in production, you can:

Use evaluation datasets to reproduce issues:

# Extract from production (if telemetry is enabled)
genkit eval:extractData myFlow --label "production-errors"

Test locally with production data:

genkit flow:run myFlow '{"actual":"production-data"}'

Enable production monitoring (see Observability docs)

Tips for Effective Debugging

- Always check traces first: they contain the most complete information
- Use labels to organize debug sessions
- Test incrementally: debug one component at a time
- Compare working vs broken: run working examples alongside failing ones
- Save traces: extract and save traces for complex issues
- Use streaming: it helps identify where generation stops
- Review prompts: ensure templates render correctly
- Check schemas: validate that input/output types match expectations

Common Debugging Patterns

Isolate the Issue

// Create minimal reproduction flow
const minimalFlow = ai.defineFlow(
  { name: 'minimal' },
  async (input) => {
    // Simplest possible version: one model call, return only the text
    const result = await ai.generate({
      model: googleAI.model('gemini-flash-latest'),
      prompt: 'Say hello',
    });
    return result.text;
  }
);

Binary Search Debugging

Comment out half the flow to find the problematic section:

const flow = ai.defineFlow({ name: 'test' }, async (input) => {
  const step1 = await doStep1();
  console.log('Step 1 done');
  
  // const step2 = await doStep2();
  // console.log('Step 2 done');
  
  // const step3 = await doStep3();
  // console.log('Step 3 done');
  
  return step1;
});
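
For flows with many steps, the same idea can be written down as an actual binary search. The sketch below assumes that once a prefix of steps fails, every longer prefix fails too; `firstFailingStep` is a hypothetical helper, and in practice each probe would be a run of the flow with the later steps disabled.

```typescript
// Returns the 1-based index of the first step whose inclusion makes the run
// fail, or 0 if running all steps succeeds.
// `failsWithFirst(n)` reports whether running only the first n steps fails.
function firstFailingStep(
  stepCount: number,
  failsWithFirst: (n: number) => boolean,
): number {
  if (stepCount === 0 || !failsWithFirst(stepCount)) return 0;
  let low = 1;
  let high = stepCount;
  while (low < high) {
    const mid = Math.floor((low + high) / 2);
    if (failsWithFirst(mid)) {
      high = mid; // failure is at mid or earlier
    } else {
      low = mid + 1; // first mid steps are fine, look later
    }
  }
  return low;
}

// Simulated flow with 8 steps where step 5 is broken.
let probes = 0;
const culprit = firstFailingStep(8, (n) => {
  probes += 1;
  return n >= 5;
});
console.log(`First failing step: ${culprit} (found in ${probes} probes)`);
```

The number of probes grows with the logarithm of the step count, so the approach pays off mainly for long flows; for three or four steps, commenting out by hand is just as quick.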

Add Intermediate Outputs

const flow = ai.defineFlow(
  { 
    name: 'debug',
    outputSchema: z.object({
      final: z.string(),
      debug: z.any(),
    }),
  },
  async (input) => {
    const intermediate = await someOperation();
    
    return {
      final: intermediate.result,
      debug: {
        rawData: intermediate,
        processedAt: new Date(),
      },
    };
  }
);

Next Steps

- Explore the Developer UI features in depth
- Learn about Testing strategies
- Review CLI commands for advanced debugging
- Check out Observability for production monitoring
