Export your LLM request and response data from Helicone for analysis, backup, compliance, or migration to other systems.

Why Export Data

Common use cases:
  • Fine-tuning preparation: Export production data as training examples
  • Custom analytics: Analyze in your own BI tools (Tableau, PowerBI)
  • Compliance: Meet data retention and audit requirements
  • Backup: Keep local copies of critical data
  • Migration: Move data between systems or regions

Export Methods

Helicone provides three ways to export data:

  • NPM tool: command-line tool with resume support
  • REST API: programmatic access for automation
  • Dashboard: manual export via the UI

Method 1: NPM Tool

The easiest and most reliable way to export large datasets.

Quick Start

# No installation required - use npx
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --limit 10000 \
  --include-body

Features

  • Auto-recovery: resumes from the last checkpoint if interrupted
  • Retry logic: exponential backoff for transient failures
  • Progress tracking: real-time progress with ETA
  • Multiple formats: JSON, JSONL, or CSV output
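
Custom scripts that call the REST API do not get the tool's retry logic for free. The following is a minimal sketch of exponential backoff, not the export tool's actual implementation; `backoffDelayMs`, `withRetry`, and the default values are hypothetical names and numbers chosen for illustration.

```typescript
// Illustrative sketch only: names and defaults are assumptions, not the
// export tool's internals.

// Delay before retry number `attempt` (0-based): base * 2^attempt, capped at maxMs.
function backoffDelayMs(attempt: number, baseMs = 500, maxMs = 30000): number {
  return Math.min(maxMs, baseMs * 2 ** attempt);
}

// Run `fn`, retrying up to `maxRetries` times with exponential backoff between tries.
async function withRetry<T>(fn: () => Promise<T>, maxRetries = 5): Promise<T> {
  for (let attempt = 0; ; attempt++) {
    try {
      return await fn();
    } catch (err) {
      if (attempt >= maxRetries) throw err; // out of retries: surface the error
      await new Promise((resolve) => setTimeout(resolve, backoffDelayMs(attempt)));
    }
  }
}
```

A call such as `withRetry(() => fetch(url, options))` would then wait 500 ms, 1 s, 2 s, and so on between failed attempts. Note that `fetch` only rejects on network errors, so HTTP error statuses would need to be turned into thrown errors inside the callback.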

Common Usage Examples

Export all requests from a date range:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --format jsonl \
  --output ./data/helicone-export.jsonl \
  --include-body
Output:
┏━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┓
┃ Helicone Data Export Tool            ┃
┗━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━━┛

Fetching total count...
Total records: 45,231
Exporting to: ./data/helicone-export.jsonl

Progress: [====================] 100% | 45,231/45,231 | ETA: 0s

✅ Export complete!
├── Records exported: 45,231
├── Output file: ./data/helicone-export.jsonl
├── File size: 1.2 GB
└── Duration: 3m 42s

Configuration Options

| Option | Description | Default | Example |
| --- | --- | --- | --- |
| --start-date | Start date (ISO 8601) | 30 days ago | 2024-01-01 |
| --end-date | End date (ISO 8601) | Now | 2024-12-31 |
| --limit | Max records to export | Unlimited | 10000 |
| --format | Output format | jsonl | json, jsonl, csv |
| --output | Output file path | helicone-export.* | ./data/export.jsonl |
| --include-body | Include request/response bodies | false | (flag) |
| --property | Filter by property | None | Environment=prod |
| --region | API region | us | us, eu |
| --batch-size | Records per API call | 1000 | 500 |
| --resume | Resume from checkpoint | false | (flag) |
| --clean-state | Clear checkpoint and restart | false | (flag) |
| --log-level | Logging verbosity | normal | quiet, verbose |

Method 2: REST API

For programmatic export and automation.

Basic Query

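import fs from "fs";

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;
if (!HELICONE_API_KEY) {
  throw new Error("Set the HELICONE_API_KEY environment variable");
}

// Fetch up to `limit` requests created between startDate and endDate (ISO 8601)
async function exportData(
  startDate: string,
  endDate: string,
  limit: number = 1000
): Promise<unknown[]> {
  const response = await fetch(
    "https://api.helicone.ai/v1/request/query-clickhouse",
    {
      method: "POST",
      headers: {
        "Authorization": `Bearer ${HELICONE_API_KEY}`,
        "Content-Type": "application/json",
      },
      body: JSON.stringify({
        filter: {
          request_response_rmt: {
            request_created_at: {
              gte: startDate,
              lte: endDate,
            },
          },
        },
        limit,
      }),
    }
  );

  // Fail loudly on HTTP errors instead of parsing an error body as data
  if (!response.ok) {
    throw new Error(`Helicone API returned ${response.status}`);
  }

  const data = await response.json();
  return data.data;
}

// Export and save
const requests = await exportData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z",
  10000
);

// Write one JSON object per line (JSONL), ending with a newline
fs.writeFileSync(
  "export.jsonl",
  requests.map((r) => JSON.stringify(r)).join("\n") + "\n"
);

console.log(`Exported ${requests.length} requests`);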

Advanced Filtering

{
  "filter": {
    "request_response_rmt": {
      "properties": {
        "Environment": { "equals": "production" },
        "Feature": { "equals": "chat" }
      },
      "request_created_at": {
        "gte": "2024-01-01T00:00:00Z"
      }
    }
  },
  "limit": 1000
}
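
When filters are assembled in code rather than written by hand, a small builder keeps the body consistent. This is a sketch that only mirrors the shape of the JSON example above; `buildExportQuery` and its types are hypothetical helpers, not part of any Helicone SDK.

```typescript
// Hypothetical helper: builds a request body shaped like the JSON example above.
type PropertyFilter = Record<string, { equals: string }>;

interface ExportQuery {
  filter: {
    request_response_rmt: {
      properties?: PropertyFilter;
      request_created_at?: { gte?: string; lte?: string };
    };
  };
  limit: number;
}

function buildExportQuery(
  properties: Record<string, string>,
  range: { gte?: string; lte?: string } = {},
  limit = 1000
): ExportQuery {
  const propertyFilter: PropertyFilter = {};
  for (const [name, value] of Object.entries(properties)) {
    propertyFilter[name] = { equals: value }; // property names are case sensitive
  }
  const rmt: ExportQuery["filter"]["request_response_rmt"] = {};
  if (Object.keys(propertyFilter).length > 0) rmt.properties = propertyFilter;
  if (range.gte || range.lte) rmt.request_created_at = range;
  return { filter: { request_response_rmt: rmt }, limit };
}

// Reproduces the example body above
const queryBody = buildExportQuery(
  { Environment: "production", Feature: "chat" },
  { gte: "2024-01-01T00:00:00Z" }
);
console.log(JSON.stringify(queryBody, null, 2));
```

The result can be passed to `JSON.stringify` as the `body` of the POST request.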

Pagination for Large Exports

// Assumes HELICONE_API_KEY is defined as in the basic query example
async function exportAllData(
  startDate: string,
  endDate: string
): Promise<unknown[]> {
  const allRequests: unknown[] = [];
  let offset = 0;
  const batchSize = 1000;

  while (true) {
    console.log(`Fetching batch at offset ${offset}...`);

    const response = await fetch(
      "https://api.helicone.ai/v1/request/query-clickhouse",
      {
        method: "POST",
        headers: {
          "Authorization": `Bearer ${HELICONE_API_KEY}`,
          "Content-Type": "application/json",
        },
        body: JSON.stringify({
          filter: {
            request_response_rmt: {
              request_created_at: {
                gte: startDate,
                lte: endDate,
              },
            },
          },
          limit: batchSize,
          offset,
        }),
      }
    );

    // Stop on HTTP errors rather than treating an error body as an empty batch
    if (!response.ok) {
      throw new Error(`Helicone API returned ${response.status} at offset ${offset}`);
    }

    const data = await response.json();
    const batch: unknown[] = data.data;

    if (batch.length === 0) {
      break; // No more data
    }

    allRequests.push(...batch);
    offset += batch.length;

    console.log(`Total fetched: ${allRequests.length}`);

    // Respect rate limits
    await new Promise(resolve => setTimeout(resolve, 100));
  }

  return allRequests;
}

// Usage
const allData = await exportAllData(
  "2024-01-01T00:00:00Z",
  "2024-12-31T23:59:59Z"
);

console.log(`Exported ${allData.length} total requests`);
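
The pagination example above keeps every record in memory, which can become a problem for very large exports. One alternative is to hand each batch to a sink as soon as it arrives. The sketch below assumes a hypothetical `fetchPage` callback (for instance, a wrapper around the fetch call above) and a hypothetical `sink` callback; neither name comes from Helicone.

```typescript
// Hypothetical callbacks: `fetchPage` returns one page of records, and
// `sink` receives each batch serialized as JSONL text.
type FetchPage = (offset: number, limit: number) => Promise<unknown[]>;
type Sink = (chunk: string) => void;

async function streamPages(
  fetchPage: FetchPage,
  sink: Sink,
  batchSize = 1000
): Promise<number> {
  let offset = 0;
  while (true) {
    const batch = await fetchPage(offset, batchSize);
    if (batch.length === 0) break; // no more data
    // Emit this batch as JSONL so memory use stays bounded by one batch
    sink(batch.map((r) => JSON.stringify(r)).join("\n") + "\n");
    offset += batch.length;
  }
  return offset; // total records emitted
}
```

With Node's `fs` module, a sink such as `(chunk) => fs.appendFileSync("export.jsonl", chunk)` would write the export to disk incrementally.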

Method 3: Dashboard Export

Manual export for small datasets.
  1. Navigate to Requests.
  2. Apply filters to select the data to export:
       • Date range
       • Properties (Environment, Feature, etc.)
       • User ID
       • Model
       • Status
  3. Click “Export” and choose a format:
       • JSON
       • CSV

Dashboard export is limited to 10,000 records. For larger datasets, use the NPM tool or API.

Data Format

JSONL Format

One JSON object per line:
{"request_id":"req_abc123","created_at":"2024-01-15T10:30:00Z","model":"gpt-4o","prompt_tokens":50,"completion_tokens":100,"cost_usd":0.015}
{"request_id":"req_def456","created_at":"2024-01-15T10:31:00Z","model":"gpt-4o-mini","prompt_tokens":30,"completion_tokens":80,"cost_usd":0.003}
Benefits:
  • Streamable (process line by line)
  • Efficient for large files
  • Easy to split/merge
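
To illustrate the line-by-line property, here is a small sketch that folds JSONL lines into running totals one at a time. `Totals` and `addLine` are hypothetical names, and the field names are taken from the two sample lines above.

```typescript
// Running totals updated one line at a time, so the whole file never has to
// be parsed as a single document.
interface Totals {
  records: number;
  tokens: number;
  costUsd: number;
}

function addLine(totals: Totals, line: string): Totals {
  if (line.trim() === "") return totals; // tolerate blank lines
  const r = JSON.parse(line) as {
    prompt_tokens: number;
    completion_tokens: number;
    cost_usd: number;
  };
  return {
    records: totals.records + 1,
    tokens: totals.tokens + r.prompt_tokens + r.completion_tokens,
    costUsd: totals.costUsd + r.cost_usd,
  };
}

// The two sample lines from above
const sampleLines = [
  '{"request_id":"req_abc123","created_at":"2024-01-15T10:30:00Z","model":"gpt-4o","prompt_tokens":50,"completion_tokens":100,"cost_usd":0.015}',
  '{"request_id":"req_def456","created_at":"2024-01-15T10:31:00Z","model":"gpt-4o-mini","prompt_tokens":30,"completion_tokens":80,"cost_usd":0.003}',
];
const sampleTotals = sampleLines.reduce(addLine, { records: 0, tokens: 0, costUsd: 0 });
console.log(sampleTotals);
```

For a real file, the lines could come from Node's `readline.createInterface` over a read stream, calling `addLine` inside a `for await` loop.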

JSON Format

Array of objects:
[
  {
    "request_id": "req_abc123",
    "created_at": "2024-01-15T10:30:00Z",
    "model": "gpt-4o",
    "prompt_tokens": 50,
    "completion_tokens": 100,
    "cost_usd": 0.015
  },
  {
    "request_id": "req_def456",
    "created_at": "2024-01-15T10:31:00Z",
    "model": "gpt-4o-mini",
    "prompt_tokens": 30,
    "completion_tokens": 80,
    "cost_usd": 0.003
  }
]

CSV Format

Comma-separated values:
request_id,created_at,model,prompt_tokens,completion_tokens,cost_usd
req_abc123,2024-01-15T10:30:00Z,gpt-4o,50,100,0.015
req_def456,2024-01-15T10:31:00Z,gpt-4o-mini,30,80,0.003
Best for:
  • Excel/Google Sheets
  • BI tools (Tableau, PowerBI)
  • Simple analysis

Included Fields

| Field | Description | Type |
| --- | --- | --- |
| request_id | Unique request identifier | string |
| created_at | Timestamp (ISO 8601) | string |
| user_id | User identifier | string |
| model | Model name | string |
| prompt_tokens | Input tokens | number |
| completion_tokens | Output tokens | number |
| total_tokens | Total tokens | number |
| cost_usd | Cost in USD | number |
| latency | Response time (ms) | number |
| status | HTTP status code | number |
| properties | Custom properties | object |
| request_body | Request payload (if --include-body) | object |
| response_body | Response payload (if --include-body) | object |

Use Case Examples

Fine-Tuning Dataset

Export successful requests for training:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --property Task=sentiment-analysis \
  --property Environment=production \
  --start-date 2024-01-01 \
  --format jsonl \
  --include-body \
  --output training-data.jsonl

# Post-process to OpenAI format
node convert-to-openai-format.js training-data.jsonl
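
The conversion script itself is not shown here. As a rough sketch of what its core might look like, the function below maps one exported record to one line of OpenAI's chat fine-tuning format (a JSON object with a `messages` array). It assumes that `request_body` and `response_body` follow the OpenAI chat completions shape, which may not hold for other providers; `toFineTuneLine` and the interfaces are hypothetical.

```typescript
// Assumed shapes: OpenAI chat completions request and response bodies.
interface ChatMessage {
  role: string;
  content: string;
}

interface ExportedRecord {
  status?: number;
  request_body?: { messages?: ChatMessage[] };
  response_body?: { choices?: { message?: ChatMessage }[] };
}

// Returns one fine-tuning line, or null if the record is unusable.
function toFineTuneLine(record: ExportedRecord): string | null {
  // Keep successful requests only (status is the HTTP status code)
  if (record.status !== undefined && record.status !== 200) return null;
  const prompt = record.request_body?.messages;
  const reply = record.response_body?.choices?.[0]?.message;
  if (!prompt || prompt.length === 0 || !reply) return null;
  return JSON.stringify({
    messages: [...prompt, { role: reply.role, content: reply.content }],
  });
}
```

Each non-null result is one line of the training file; records without bodies (for example, exports made without --include-body) are skipped.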

Cost Analysis

Export for custom analytics:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date 2024-01-01 \
  --end-date 2024-12-31 \
  --format csv \
  --output costs-2024.csv

# Import into Excel/Tableau for analysis
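
For a quick check without a BI tool, the CSV can also be aggregated in a few lines of code. This sketch sums `cost_usd` per model; `costByModel` is a hypothetical helper, and the naive comma split assumes no quoted fields containing commas, so a proper CSV parser would be needed for exports that include bodies or properties.

```typescript
// Sums cost_usd per model from CSV text with a header row.
// Assumes simple CSV: no quoted fields, no embedded commas.
function costByModel(csv: string): Map<string, number> {
  const [header, ...rows] = csv.trim().split("\n");
  const columns = header.split(",");
  const modelIdx = columns.indexOf("model");
  const costIdx = columns.indexOf("cost_usd");
  if (modelIdx < 0 || costIdx < 0) {
    throw new Error("CSV is missing the model or cost_usd column");
  }
  const totals = new Map<string, number>();
  for (const row of rows) {
    const cells = row.split(",");
    const model = cells[modelIdx];
    totals.set(model, (totals.get(model) ?? 0) + Number(cells[costIdx]));
  }
  return totals;
}
```

Reading `costs-2024.csv` with `fs.readFileSync(path, "utf8")` and passing the text to `costByModel` would give a per-model cost map.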

Compliance Backup

Monthly backup for audit trail:
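#!/bin/bash
# backup-monthly.sh (requires GNU date)
set -euo pipefail

# Anchor on the 1st of the current month so "-1 month" never lands in the wrong month
MONTH=$(date -u -d "$(date -u +%Y-%m-01) -1 month" +%Y-%m)
START_DATE="${MONTH}-01T00:00:00Z"
END_DATE=$(date -u -d "${MONTH}-01 +1 month" +%Y-%m-%dT00:00:00Z)

mkdir -p backups

HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --start-date "$START_DATE" \
  --end-date "$END_DATE" \
  --format jsonl \
  --include-body \
  --output "backups/helicone-${MONTH}.jsonl"

# Compress explicitly; naming the output .gz does not compress it
gzip -f "backups/helicone-${MONTH}.jsonl"

echo "Backup complete for $MONTH"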

User Data Export (GDPR)

Export all data for a specific user:
import fs from "fs";

const HELICONE_API_KEY = process.env.HELICONE_API_KEY;

const response = await fetch(
  "https://api.helicone.ai/v1/request/query-clickhouse",
  {
    method: "POST",
    headers: {
      "Authorization": `Bearer ${HELICONE_API_KEY}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({
      filter: {
        request_response_rmt: {
          user_id: { equals: "user-123" },
        },
      },
      limit: 100000,
    }),
  }
);

// Do not write an error response into the GDPR export file
if (!response.ok) {
  throw new Error(`Helicone API returned ${response.status}`);
}

const userData = await response.json();

// Save for GDPR request
fs.writeFileSync(
  "user-123-data-export.json",
  JSON.stringify(userData.data, null, 2)
);

Best Practices

  • Use JSONL for large exports: more efficient than JSON arrays
  • Export incrementally: daily or weekly exports are easier to manage than one large export
  • Compress backups: JSONL compresses well with gzip (80-90% reduction)
  • Filter early: apply filters at export time to reduce data size
  • Include bodies only when needed: request bodies can be large, so pass --include-body only when you need them
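
For incremental exports, a long date range can be split into one-day windows and exported one window at a time. The helper below is a sketch; `dailyWindows` is a hypothetical name, and whether the export tool treats --end-date as inclusive is not stated here, so check for duplicate records at window boundaries.

```typescript
// Splits [start, end) into consecutive UTC day windows. Dates are YYYY-MM-DD.
function dailyWindows(
  start: string,
  end: string
): { start: string; end: string }[] {
  const dayMs = 24 * 60 * 60 * 1000;
  const endMs = Date.parse(`${end}T00:00:00Z`);
  const windows: { start: string; end: string }[] = [];
  // UTC has no daylight saving shifts, so adding 24 hours always lands on midnight
  for (let t = Date.parse(`${start}T00:00:00Z`); t < endMs; t += dayMs) {
    windows.push({
      start: new Date(t).toISOString().slice(0, 10),
      end: new Date(t + dayMs).toISOString().slice(0, 10),
    });
  }
  return windows;
}

console.log(dailyWindows("2024-02-28", "2024-03-02"));
```

Each window's `start` and `end` could then be passed as --start-date and --end-date to a separate export run.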

Troubleshooting

If an export is slow, these tips can speed it up:
  • Use --batch-size 500 so each batch is smaller and returns faster
  • Apply filters to reduce data volume
  • Export during off-peak hours
  • Check your network connection
If an export is interrupted, use --resume to continue:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --resume
Or clean state and restart:
HELICONE_API_KEY="sk-xxx" npx @helicone/export --clean-state ...
If you hit rate limits, reduce the batch size:
HELICONE_API_KEY="sk-xxx" npx @helicone/export \
  --batch-size 250 \
  ...
Or add delays in custom scripts:
await new Promise(resolve => setTimeout(resolve, 500));
If a property filter returns no results, ensure the property name matches exactly:
# Correct
--property Environment=production

# Wrong (case sensitive)
--property environment=production
Check property exists in your data:
  1. Go to Helicone dashboard
  2. View a request
  3. Check exact property names

Automated Exports

Schedule regular exports:

Cron Job (Linux/Mac)

# Add to crontab (crontab -e)
# Run daily at 2 AM
# Note: date -d requires GNU date (Linux); on macOS use $(date -v-1d +\%Y-\%m-\%d) for yesterday
0 2 * * * cd /path/to/project && HELICONE_API_KEY=sk-xxx npx @helicone/export --start-date $(date -d "yesterday" +\%Y-\%m-\%d) --output backups/daily-$(date +\%Y-\%m-\%d).jsonl

GitHub Actions

name: Daily Helicone Backup

on:
  schedule:
    - cron: '0 2 * * *'  # Daily at 2 AM UTC

jobs:
  export:
    runs-on: ubuntu-latest
    steps:
      - name: Export Helicone data
        env:
          HELICONE_API_KEY: ${{ secrets.HELICONE_API_KEY }}
        run: |
          npx @helicone/export \
            --start-date $(date -d "yesterday" +%Y-%m-%d) \
            --format jsonl \
            --output backup-$(date +%Y-%m-%d).jsonl
      
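      # Secret names and region are placeholders; adjust them to your AWS setup
      - name: Configure AWS credentials
        uses: aws-actions/configure-aws-credentials@v4
        with:
          aws-access-key-id: ${{ secrets.AWS_ACCESS_KEY_ID }}
          aws-secret-access-key: ${{ secrets.AWS_SECRET_ACCESS_KEY }}
          aws-region: us-east-1

      # Use a run step so $(date ...) is expanded by the shell
      - name: Upload to S3
        run: aws s3 cp "backup-$(date +%Y-%m-%d).jsonl" s3://my-backups/helicone/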

Next Steps

  • Query API Docs: full API documentation for queries
  • Fine-Tuning Prep: use exported data for fine-tuning
  • Custom Properties: add metadata for better filtering
  • Sessions: export complete workflows
