Learn how to track costs across users, features, and environments to understand your AI application’s unit economics and identify optimization opportunities.

What You’ll Learn

How to:
  • Track costs per user and feature
  • Set up cost alerts before budget overruns
  • Enable caching to reduce redundant API costs
  • Analyze cost trends over time

Prerequisites

  • Helicone API key
  • An LLM application with API calls
  • 5 minutes to implement tracking

Step 1: Add Cost Tracking Headers

Start by tagging your requests with metadata for cost segmentation.
import { OpenAI } from "openai";

const client = new OpenAI({
  apiKey: process.env.OPENAI_API_KEY,
  baseURL: "https://oai.helicone.ai/v1",
  defaultHeaders: {
    "Helicone-Auth": `Bearer ${process.env.HELICONE_API_KEY}`,
  },
});

// Track cost by user and feature
const response = await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [{ role: "user", content: "Hello!" }],
  },
  {
    headers: {
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "chat",
      "Helicone-Property-Environment": "production",
      "Helicone-Property-UserTier": "premium",
    },
  }
);
Key Headers:
  • Helicone-User-Id: Track costs per user for unit economics
  • Helicone-Property-Feature: Identify which features drive costs
  • Helicone-Property-Environment: Separate dev/staging/production costs
  • Helicone-Property-UserTier: Compare free vs. paid user costs
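To keep tagging consistent across call sites, the headers listed above can be produced by a single helper. The following is a minimal sketch: `RequestContext` and `buildCostHeaders` are hypothetical names introduced for illustration, not part of the Helicone or OpenAI SDKs. Only the header names come from this guide.

```typescript
// Hypothetical shape describing who and what a request is for.
interface RequestContext {
  userId: string;
  feature: string;
  environment: "development" | "staging" | "production";
  userTier: "free" | "premium";
}

// Build the per-request cost tracking headers described in Step 1.
function buildCostHeaders(ctx: RequestContext): Record<string, string> {
  return {
    "Helicone-User-Id": ctx.userId,
    "Helicone-Property-Feature": ctx.feature,
    "Helicone-Property-Environment": ctx.environment,
    "Helicone-Property-UserTier": ctx.userTier,
  };
}

// Example values only; substitute your own identifiers.
const chatHeaders = buildCostHeaders({
  userId: "user-123",
  feature: "chat",
  environment: "production",
  userTier: "premium",
});
```

The result can be passed as `{ headers: chatHeaders }` in the second argument of `client.chat.completions.create`, in place of the inline object shown earlier.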

Step 2: Organize Multi-Step Workflows

For complex workflows (like AI agents), use sessions to track the total cost of completing a task.
import { randomUUID } from "crypto";

const sessionId = randomUUID();

// Initial question
await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Summarize this document..." }],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId,
      "Helicone-Session-Name": "Document Analysis",
      "Helicone-Session-Path": "/analyze",
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "document-analysis",
    },
  }
);

// Follow-up analysis
await client.chat.completions.create(
  {
    model: "gpt-4o",
    messages: [{ role: "user", content: "Extract key points..." }],
  },
  {
    headers: {
      "Helicone-Session-Id": sessionId, // Same session ID
      "Helicone-Session-Name": "Document Analysis",
      "Helicone-Session-Path": "/analyze/extract",
      "Helicone-User-Id": "user-123",
      "Helicone-Property-Feature": "document-analysis",
    },
  }
);
Sessions show the total cost of completing a task, so you get insights like “document analysis costs $0.45 on average” instead of a list of individual API calls.
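The per-session average can be reasoned about as a two-stage aggregation: sum the requests in each session, then average those totals per session name. The sketch below illustrates that arithmetic on made-up records. `SessionRequest` and `averageSessionCost` are hypothetical names, and the sample costs are illustrative, not real Helicone output.

```typescript
// Hypothetical record: one logged request with its session and cost.
interface SessionRequest {
  sessionId: string;
  sessionName: string;
  costUsd: number;
}

function averageSessionCost(requests: SessionRequest[]): Record<string, number> {
  // Stage 1: total cost of each session (one session = one completed task).
  const sessionTotals: Record<string, { name: string; total: number }> = {};
  for (const r of requests) {
    if (!sessionTotals[r.sessionId]) {
      sessionTotals[r.sessionId] = { name: r.sessionName, total: 0 };
    }
    sessionTotals[r.sessionId].total += r.costUsd;
  }

  // Stage 2: average the session totals for each session name.
  const sums: Record<string, { sum: number; count: number }> = {};
  for (const id of Object.keys(sessionTotals)) {
    const { name, total } = sessionTotals[id];
    if (!sums[name]) sums[name] = { sum: 0, count: 0 };
    sums[name].sum += total;
    sums[name].count += 1;
  }

  const averages: Record<string, number> = {};
  for (const name of Object.keys(sums)) {
    averages[name] = sums[name].sum / sums[name].count;
  }
  return averages;
}

// Illustrative data: session "a" has two steps, session "b" has one.
const sessionAverages = averageSessionCost([
  { sessionId: "a", sessionName: "Document Analysis", costUsd: 0.3 },
  { sessionId: "a", sessionName: "Document Analysis", costUsd: 0.2 },
  { sessionId: "b", sessionName: "Document Analysis", costUsd: 0.4 },
]);
```

Averaging per session rather than per request is the design choice that matters here: a workflow with many cheap calls can still be the most expensive task overall.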

Step 3: View Cost Analytics

1. Dashboard Overview

Navigate to your Helicone dashboard to see:
  • Total costs (today, this week, this month)
  • Cost trends over time
  • Top cost-driving models and features
  • Cost per user breakdown

2. Filter by Properties

Use the filters to segment costs:
Filter by Property: Feature = "document-analysis"
Result: $127 spent on document analysis this week

Filter by Property: Environment = "development"
Result: $43 spent on development testing

Filter by Property: UserTier = "premium"
Result: Premium users generate $1,200 in value vs. $380 in costs
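A property filter is conceptually a match on one property value followed by a cost sum. The sketch below reproduces that logic client-side on sample records, assuming a hypothetical `LoggedRequest` shape. It is not how the dashboard is implemented, and the records are illustrative.

```typescript
// Hypothetical record: a logged request with its cost and custom properties.
interface LoggedRequest {
  costUsd: number;
  properties: Record<string, string>;
}

// Sum the cost of all requests whose property equals the given value.
function totalCostWhere(
  requests: LoggedRequest[],
  property: string,
  value: string
): number {
  let total = 0;
  for (const r of requests) {
    if (r.properties[property] === value) {
      total += r.costUsd;
    }
  }
  return total;
}

// Illustrative data only.
const sampleRequests: LoggedRequest[] = [
  { costUsd: 100, properties: { Feature: "document-analysis", Environment: "production" } },
  { costUsd: 27, properties: { Feature: "document-analysis", Environment: "production" } },
  { costUsd: 43, properties: { Feature: "chat", Environment: "development" } },
];

const documentAnalysisCost = totalCostWhere(sampleRequests, "Feature", "document-analysis");
const developmentCost = totalCostWhere(sampleRequests, "Environment", "development");
```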

3. Session Cost Analysis

View Sessions to see:
  • Average cost per workflow type
  • Cost distribution across steps
  • Expensive outliers to investigate

Step 4: Set Up Cost Alerts

Prevent budget overruns before they happen.

1. Navigate to Alerts

Go to Settings → Alerts in your dashboard.

2. Create Cost Alert

  1. Click “Create Alert”
  2. Select Cost as the metric
  3. Set your threshold (e.g., $100/day)
  4. Choose time window (e.g., 1 day)
  5. Add filters (optional):
    • Environment = “production” (exclude dev costs)
    • Feature = “document-analysis” (monitor specific features)

3. Configure Notifications

Add notification channels:
  • Email: finance@company.com
  • Slack: #alerts channel
Recommended alert structure:
  • Daily alert at 80% of budget (warning)
  • Daily alert at 100% of budget (critical)
  • Separate alerts for production vs. development
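The 80% and 100% thresholds in the recommended structure can be expressed as a small classification function. This is a sketch for reasoning about thresholds in your own code, for example in a budget report. `alertLevel` is a hypothetical helper and does not describe how Helicone evaluates alerts internally.

```typescript
type AlertLevel = "ok" | "warning" | "critical";

// Classify daily spend against a daily budget:
// below 80% is ok, 80% up to 100% is a warning, 100% or more is critical.
function alertLevel(spendUsd: number, dailyBudgetUsd: number): AlertLevel {
  if (dailyBudgetUsd <= 0) {
    throw new Error("dailyBudgetUsd must be positive");
  }
  const ratio = spendUsd / dailyBudgetUsd;
  if (ratio >= 1) return "critical";
  if (ratio >= 0.8) return "warning";
  return "ok";
}

// Example with the $100/day threshold used earlier in this step.
const morningLevel = alertLevel(42, 100);
const afternoonLevel = alertLevel(85, 100);
const eveningLevel = alertLevel(100, 100);
```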

Step 5: Enable Caching for Cost Reduction

Cache repetitive requests to eliminate redundant API costs.
// Enable caching for FAQ responses
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Answer FAQ questions" },
      { role: "user", content: "What are your business hours?" }
    ],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": "max-age=86400", // 24 hours
      "Helicone-Property-Feature": "faq",
    },
  }
);

// Second identical request = $0 cost (cached)
await client.chat.completions.create(
  {
    model: "gpt-4o-mini",
    messages: [
      { role: "system", content: "Answer FAQ questions" },
      { role: "user", content: "What are your business hours?" }
    ],
  },
  {
    headers: {
      "Helicone-Cache-Enabled": "true",
      "Cache-Control": "max-age=86400",
      "Helicone-Property-Feature": "faq",
    },
  }
);
Best caching opportunities:
  • FAQ and support responses
  • Static content generation
  • Development/testing environments
  • Repeated queries with identical inputs
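Because both cached requests above repeat the same header pair, it can help to generate the pair from a TTL in seconds. A minimal sketch follows, assuming the two cache headers shown in this step. `cacheHeaders` and `ONE_DAY_SECONDS` are hypothetical names.

```typescript
// Build the cache headers from Step 5 for a given time-to-live in seconds.
function cacheHeaders(ttlSeconds: number): Record<string, string> {
  if (ttlSeconds <= 0 || Math.floor(ttlSeconds) !== ttlSeconds) {
    throw new Error("ttlSeconds must be a positive integer");
  }
  return {
    "Helicone-Cache-Enabled": "true",
    "Cache-Control": `max-age=${ttlSeconds}`,
  };
}

const ONE_DAY_SECONDS = 24 * 60 * 60; // 86400, the value used in the FAQ example

// Combine cache headers with a feature tag, as in the FAQ example.
const faqCacheHeaders: Record<string, string> = {
  ...cacheHeaders(ONE_DAY_SECONDS),
  "Helicone-Property-Feature": "faq",
};
```

Deriving the TTL from a named constant avoids typos in the `max-age` value, which would otherwise silently change how long responses stay cached.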

Expected Results

After implementing cost tracking:

Week 1

Total Costs: $487
├── Production: $423 (87%)
│   ├── chat: $245 (58%)
│   ├── document-analysis: $127 (30%)
│   └── search: $51 (12%)
└── Development: $64 (13%)

Top 5 Users by Cost:
1. user-789: $42.50 (premium tier)
2. user-456: $38.20 (premium tier)
3. user-123: $31.80 (free tier)
4. user-234: $28.90 (free tier)
5. user-567: $24.10 (premium tier)

Cache Performance:
- Hit rate: 23%
- Savings: $112

Insights

  • Premium users cost $35/month on average and generate $120 in value (3.4x ROI)
  • Free users cost $28/month, which is unsustainable without limits
  • Document analysis is the most expensive feature at $0.45/session
  • Caching FAQ responses saved $112 (23% hit rate)
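The ratios in this report follow from simple arithmetic on the dollar figures. The sketch below recomputes the premium ROI multiple (120 in value against 35 in cost) and two of the percentage shares from the Week 1 breakdown. `roi` and `sharePercent` are hypothetical helper names.

```typescript
// Value generated per dollar of cost.
function roi(valueUsd: number, costUsd: number): number {
  return valueUsd / costUsd;
}

// Share of a total, rounded to the nearest whole percent.
function sharePercent(partUsd: number, totalUsd: number): number {
  return Math.round((partUsd / totalUsd) * 100);
}

const premiumRoi = roi(120, 35);                // roughly 3.4
const productionShare = sharePercent(423, 487); // production share of total cost
const chatShare = sharePercent(245, 423);       // chat share of production cost
```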

Step 6: Analyze and Optimize

1. Identify Cost Drivers

Look for:
  • High-cost users to potentially upgrade or limit
  • Features with poor cost-to-value ratios
  • Unexpected development environment costs
  • Cache opportunities (repeated similar requests)

2. Take Action

Based on insights:
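// Enforce a monthly spend cap for free tier users.
// userTier and monthlyCost come from your own user and cost data.
function enforceSpendCap(userTier: string, monthlyCost: number): void {
  if (userTier === "free" && monthlyCost > 25) {
    throw new Error("Monthly limit reached. Upgrade to premium.");
  }
}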

3. Monitor Impact

Track changes over time:
  • Did rate limiting reduce free tier costs?
  • Is model switching maintaining quality?
  • What’s the new cache hit rate?

Advanced: Query Costs Programmatically

Use the Helicone API to build custom cost dashboards:
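// Read the API key from the environment rather than hard-coding it
const HELICONE_API_KEY = process.env.HELICONE_API_KEY;

const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      // Requests created since the start date, tagged UserTier = "premium"
      filter: {
        request_response_rmt: {
          request_created_at: {
            gte: "2024-01-01T00:00:00Z"
          },
          properties: {
            UserTier: { equals: "premium" }
          }
        }
      },
      // The total below only covers the rows returned (at most 1000)
      limit: 1000,
    }),
  }
);

// Fail loudly on HTTP errors instead of treating an error body as data
if (!response.ok) {
  throw new Error(`Helicone query failed with status ${response.status}`);
}

const data = await response.json();
// Sum the cost of every returned request; a missing cost counts as 0
const totalCost = data.data.reduce(
  (sum, req) => sum + (req.cost_usd || 0),
  0
);

console.log(`Premium user costs: $${totalCost.toFixed(2)}`);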
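Once rows are fetched, the same data can feed a "top users by cost" view like the one in Expected Results. The sketch below works on rows already in memory. The `cost_usd` field matches the example above, while `user_id` is an assumed field name that should be checked against the actual response. `topUsersByCost` is a hypothetical helper, and the sample rows are illustrative.

```typescript
// Assumed row shape; verify field names against the real API response.
interface CostRow {
  user_id?: string;
  cost_usd?: number;
}

// Aggregate cost per user and return the n most expensive users.
function topUsersByCost(
  rows: CostRow[],
  n: number
): Array<{ userId: string; costUsd: number }> {
  const totals: Record<string, number> = {};
  for (const row of rows) {
    const userId = row.user_id ?? "unknown"; // requests without a user ID
    totals[userId] = (totals[userId] ?? 0) + (row.cost_usd ?? 0);
  }
  return Object.keys(totals)
    .map((userId) => ({ userId, costUsd: totals[userId] }))
    .sort((a, b) => b.costUsd - a.costUsd)
    .slice(0, n);
}

// Illustrative rows only.
const topUsers = topUsersByCost(
  [
    { user_id: "user-789", cost_usd: 30 },
    { user_id: "user-456", cost_usd: 20 },
    { user_id: "user-789", cost_usd: 12 },
    { cost_usd: 5 },
  ],
  2
);
```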

Best Practices

Start with high-level tracking: Add User ID, Feature, and Environment headers to all requests
Use sessions for complex workflows: Group related requests to see true unit costs
Set graduated alerts: 50%, 80%, 95% of budget to catch issues early
Don’t over-optimize prematurely: Track for 1-2 weeks to understand patterns before making changes

Troubleshooting

Helicone calculates costs based on model detection:
  • Using AI Gateway: 100% accurate costs
  • Direct integration: Best-effort based on 300+ model pricing
If your model isn’t supported, contact help@helicone.ai to add it.
Properties can take a few minutes to appear in filters after first use. If they still do not appear, ensure:
  • Header format: Helicone-Property-[Name]
  • Values are strings (not numbers or booleans)
  • Requests are successfully logging (check dashboard)
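The first two checks can be enforced in code by building property headers through one function that applies the `Helicone-Property-` prefix and converts every value to a string. A minimal sketch follows; `toPropertyHeaders` and the property names in the example are hypothetical.

```typescript
type PropertyValue = string | number | boolean;

// Prefix each property name and coerce each value to a string,
// since header values must be strings.
function toPropertyHeaders(
  props: Record<string, PropertyValue>
): Record<string, string> {
  const headers: Record<string, string> = {};
  for (const name of Object.keys(props)) {
    headers[`Helicone-Property-${name}`] = String(props[name]);
  }
  return headers;
}

// Example: a number and a boolean are converted before being sent.
const propertyHeaders = toPropertyHeaders({
  Feature: "chat",
  RetryCount: 3,
  Beta: true,
});
```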
If an alert does not trigger, check:
  • Alert threshold and time window
  • Minimum request count (low traffic may not trigger)
  • Filters (too restrictive may exclude all requests)
  • Notification channels are configured correctly

Next Steps

  • Cost Tracking Guide: In-depth cost optimization strategies
  • User Metrics: Track per-user usage and costs
  • Sessions: Group requests to understand workflow costs
  • Alerts: Configure cost and error alerts
