Overview

Kura provides a command-line interface (CLI) built with Typer for starting the web application and managing checkpoints.

Installation

The CLI is installed automatically with Kura:
pip install kura
# or
uv pip install kura
Verify installation:
kura --help

Commands

start-app

Start the FastAPI web server with the Kura analysis UI.
kura start-app [OPTIONS]

Options

--dir (str, default: "./checkpoints")
Directory to use for checkpoints, relative to the current directory.
--checkpoint-format (str, default: "jsonl")
Checkpoint format to use:
  • jsonl: Legacy JSONL format (default)
  • hf-dataset: HuggingFace datasets format (recommended for large datasets)

Examples

Basic usage:
kura start-app
This starts the server at http://localhost:8000 using the default ./checkpoints directory with JSONL format.
Custom checkpoint directory:
kura start-app --dir ./my-analysis/checkpoints
Use HuggingFace datasets format:
kura start-app --checkpoint-format hf-dataset
Full example with all options:
kura start-app \
  --dir ./production-checkpoints \
  --checkpoint-format hf-dataset

Output

🚀 Starting Kura with hf-dataset checkpoints at ./checkpoints
Access website at http://localhost:8000

INFO:     Started server process [12345]
INFO:     Waiting for application startup.
INFO:     Application startup complete.
INFO:     Uvicorn running on http://0.0.0.0:8000 (Press CTRL+C to quit)

Environment Variables

The command sets the following environment variables:
  • KURA_CHECKPOINT_DIR: Checkpoint directory path
  • KURA_CHECKPOINT_FORMAT: Checkpoint format (jsonl or hf-dataset)

Server Configuration

The server runs with these settings:
  • Host: 0.0.0.0 (accessible from all network interfaces)
  • Port: 8000
  • Auto-reload: Disabled (see Custom Server Configuration to enable it during development)

analyze-checkpoints

This command is documented in the project README but not yet implemented in the CLI.
Analyze existing JSONL checkpoints and estimate migration benefits.

Planned Usage

kura analyze-checkpoints <checkpoint-dir>

Expected Output

Analyzing checkpoints in ./checkpoints...

Checkpoint Statistics:
  Total files: 5
  Total size: 2.4 GB
  Total records: 125,430

Estimated Migration Benefits:
  HF Datasets size: ~1.2 GB (50% reduction)
  Parquet size: ~1.1 GB (54% reduction)
  
Recommendation: Migrate to HF Datasets format for large datasets
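The percentages in the sample output are consistent with a plain size ratio. As a quick check, assuming the reduction is computed as 1 minus target size over source size (the helper name `percent_reduction` is hypothetical and not part of Kura):

```python
def percent_reduction(source_gb: float, target_gb: float) -> int:
    """Return the size reduction as a whole-number percentage of the source size."""
    if source_gb <= 0:
        raise ValueError("source size must be positive")
    return round((1 - target_gb / source_gb) * 100)


print(percent_reduction(2.4, 1.2))  # 50
print(percent_reduction(2.4, 1.1))  # 54
```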

migrate-checkpoints

This command is documented in the project README but not yet implemented in the CLI.
Migrate JSONL checkpoints to HuggingFace datasets format.

Planned Usage

kura migrate-checkpoints <source-dir> <target-dir> [OPTIONS]

Planned Options

--hub-repo (str)
HuggingFace Hub repository to upload migrated checkpoints to (e.g., "username/repo-name")
--hub-token (str)
HuggingFace Hub authentication token for private repositories
--compression (str, default: "gzip")
Compression algorithm: gzip, lz4, zstd, or none

Planned Examples

Basic migration:
kura migrate-checkpoints ./old-checkpoints ./new-hf-checkpoints
Migrate with Hub upload:
kura migrate-checkpoints ./old-checkpoints ./new-hf-checkpoints \
  --hub-repo my-username/kura-analysis \
  --hub-token "$HF_TOKEN"
Use different compression:
kura migrate-checkpoints ./old-checkpoints ./new-hf-checkpoints \
  --compression zstd

Expected Output

Migrating checkpoints from ./old-checkpoints to ./new-hf-checkpoints...

Processing summaries.jsonl... ✓ (45,230 records)
Processing clusters.jsonl... ✓ (1,243 records)
Processing meta_clusters.jsonl... ✓ (156 records)
Processing projected_clusters.jsonl... ✓ (156 records)

Migration complete!
  Source size: 2.4 GB
  Target size: 1.2 GB (50% reduction)
  
Uploading to HuggingFace Hub: my-username/kura-analysis... ✓

Development Usage

Running from Source

When developing Kura, you can run the CLI directly:
# Install in development mode
uv pip install -e ".[dev]"

# Run CLI
kura start-app

Custom Server Configuration

For development with auto-reload:
# In kura/cli/cli.py
import uvicorn

uvicorn.run(
    "kura.cli.server:api",
    host="0.0.0.0",
    port=8000,
    reload=True,  # Enable auto-reload
    reload_dirs=["./kura"]  # Watch these directories
)

Adding New Commands

Extend the CLI by adding new commands:
from pathlib import Path

import typer
from kura.cli.cli import app


@app.command()
def analyze_checkpoints(
    checkpoint_dir: str = typer.Argument(..., help="Directory containing checkpoints")
):
    """Analyze checkpoint statistics and estimate migration benefits."""
    # Typer exposes this function as the `analyze-checkpoints` command.
    path = Path(checkpoint_dir)
    if not path.is_dir():
        typer.echo(f"Error: Directory {checkpoint_dir} not found", err=True)
        raise typer.Exit(1)

    # Analysis logic here
    typer.echo(f"Analyzing checkpoints in {checkpoint_dir}...")
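The "Analysis logic here" placeholder could be filled in many ways. The following is a minimal sketch using only the standard library. It assumes checkpoints are `*.jsonl` files with one record per non-blank line; the function name `summarize_jsonl_checkpoints` is hypothetical and this is not Kura's implementation.

```python
import tempfile
from pathlib import Path


def summarize_jsonl_checkpoints(checkpoint_dir: str) -> dict:
    """Count files, bytes, and non-blank lines across *.jsonl files in a directory."""
    files = sorted(Path(checkpoint_dir).glob("*.jsonl"))
    total_bytes = 0
    total_records = 0
    for file in files:
        total_bytes += file.stat().st_size
        with file.open("r", encoding="utf-8") as handle:
            # Assumption: each non-blank line holds exactly one JSON record.
            total_records += sum(1 for line in handle if line.strip())
    return {"files": len(files), "bytes": total_bytes, "records": total_records}


if __name__ == "__main__":
    # Demonstrate on a throwaway directory rather than real checkpoints.
    with tempfile.TemporaryDirectory() as tmp:
        Path(tmp, "summaries.jsonl").write_text('{"id": 1}\n{"id": 2}\n', encoding="utf-8")
        print(summarize_jsonl_checkpoints(tmp))
```

Inside the command, the returned dictionary could be passed to `typer.echo` to print the totals.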

Integration with Web UI

Accessing the Web Interface

After running kura start-app, open the web interface in a browser:
# macOS
open http://localhost:8000

# Linux
xdg-open http://localhost:8000

Web UI Features

  • Upload conversations: Upload and analyze conversation data
  • Cluster visualization: Interactive cluster maps and hierarchies
  • Conversation browser: Browse and search conversations
  • Export results: Download analysis results

API Endpoints

The server exposes FastAPI endpoints at:
  • http://localhost:8000/docs - Interactive API documentation (Swagger UI)
  • http://localhost:8000/redoc - Alternative API documentation (ReDoc)
Open either page in your browser for endpoint details.

Configuration Files

Environment Configuration

Create a .env file for persistent configuration:
# .env
KURA_CHECKPOINT_DIR=./production-checkpoints
KURA_CHECKPOINT_FORMAT=hf-dataset
HF_TOKEN=hf_your_token_here
Load it in your application (requires the python-dotenv package):
import os
from dotenv import load_dotenv

load_dotenv()

checkpoint_dir = os.getenv("KURA_CHECKPOINT_DIR", "./checkpoints")
checkpoint_format = os.getenv("KURA_CHECKPOINT_FORMAT", "jsonl")

Configuration Priority

  1. Command-line arguments (highest priority)
  2. Environment variables
  3. Default values (lowest priority)
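The priority order above can be sketched as a small lookup. This is an illustration of the rule only, not Kura's own resolution code; `resolve_setting` and its parameters are hypothetical names.

```python
import os
from typing import Mapping, Optional


def resolve_setting(
    cli_value: Optional[str],
    env_name: str,
    default: str,
    environ: Optional[Mapping[str, str]] = None,
) -> str:
    """Pick the CLI value if given, else the environment variable, else the default."""
    env = os.environ if environ is None else environ
    if cli_value is not None:
        return cli_value
    return env.get(env_name, default)


# A stand-in environment keeps the example independent of the real shell.
fake_env = {"KURA_CHECKPOINT_FORMAT": "hf-dataset"}
print(resolve_setting(None, "KURA_CHECKPOINT_DIR", "./checkpoints", fake_env))  # ./checkpoints
print(resolve_setting(None, "KURA_CHECKPOINT_FORMAT", "jsonl", fake_env))  # hf-dataset
print(resolve_setting("jsonl", "KURA_CHECKPOINT_FORMAT", "jsonl", fake_env))  # jsonl
```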

Best Practices

Use HuggingFace datasets format for large-scale analyses (>100K conversations) to reduce storage and improve performance.
Set a custom checkpoint directory for each project to keep analyses organized.
Use environment variables for production deployments to avoid hardcoding configuration.
Enable auto-reload during development to speed up the feedback loop.

Troubleshooting

Port Already in Use

# Error: Address already in use (errno 48 on macOS, 98 on Linux)
ERROR:    [Errno 48] Address already in use
Solution: Stop the process using port 8000, or run the server on a different port:
# Find and stop the process (kill -9 forces termination)
lsof -ti:8000 | xargs kill -9

# Or modify the uvicorn.run call in server.py (Python) to use a different port
uvicorn.run(api, host="0.0.0.0", port=8080)
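Before starting the server, it can help to check whether the port is already taken. A minimal sketch using only the standard library follows; `port_in_use` is a hypothetical helper, not a Kura function.

```python
import socket


def port_in_use(port: int, host: str = "127.0.0.1") -> bool:
    """Return True if something accepts TCP connections on host:port."""
    with socket.socket(socket.AF_INET, socket.SOCK_STREAM) as sock:
        sock.settimeout(1.0)
        # connect_ex returns 0 on success instead of raising an exception.
        return sock.connect_ex((host, port)) == 0


if __name__ == "__main__":
    print("port 8000 in use:", port_in_use(8000))
```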

Checkpoint Directory Not Found

kura start-app --dir ./missing-dir
Solution: Create the directory or use an existing one:
mkdir -p ./missing-dir
kura start-app --dir ./missing-dir
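The same fix can be applied from Python, for example in a script that prepares a project before launching the server. This is a small sketch; `ensure_checkpoint_dir` is a hypothetical helper name.

```python
import tempfile
from pathlib import Path


def ensure_checkpoint_dir(path: str) -> Path:
    """Create the directory (and any missing parents) and return it as a Path."""
    directory = Path(path)
    # exist_ok=True makes this safe to call when the directory already exists.
    directory.mkdir(parents=True, exist_ok=True)
    return directory


if __name__ == "__main__":
    with tempfile.TemporaryDirectory() as tmp:
        created = ensure_checkpoint_dir(str(Path(tmp) / "missing-dir" / "checkpoints"))
        print(created.is_dir())  # True
```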

Import Errors

ModuleNotFoundError: No module named 'kura'
Solution: Install Kura (pip install kura) or, when working from a source checkout, install it in development mode:
uv pip install -e ".[dev]"

HuggingFace Datasets Not Available

kura start-app --checkpoint-format hf-dataset
# Error: HuggingFace datasets is required
Solution: Install the optional dependency:
uv pip install datasets
# or install all optional dependencies
uv pip install -e ".[all]"
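To detect the missing dependency up front instead of at server start, a script can check whether the `datasets` module is importable. A minimal sketch, with `is_installed` as a hypothetical helper name:

```python
import importlib.util


def is_installed(module_name: str) -> bool:
    """Return True if a top-level module can be found, without importing it."""
    return importlib.util.find_spec(module_name) is not None


if __name__ == "__main__":
    if not is_installed("datasets"):
        print("The 'datasets' package is missing; install it before using hf-dataset.")
```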

Examples

Production Deployment

#!/bin/bash
# deploy.sh

export KURA_CHECKPOINT_DIR=/data/kura/checkpoints
export KURA_CHECKPOINT_FORMAT=hf-dataset

# Start server with production settings
kura start-app \
  --dir "$KURA_CHECKPOINT_DIR" \
  --checkpoint-format "$KURA_CHECKPOINT_FORMAT"

Development Workflow

# Terminal 1: Start the server (auto-reload is disabled by default)
kura start-app --dir ./dev-checkpoints

# Terminal 2: Run analysis
python scripts/analyze.py

# Terminal 3: Watch logs
tail -f logs/kura.log

CI/CD Integration

# .github/workflows/test.yml
name: Test Kura CLI

on: [push, pull_request]

jobs:
  test:
    runs-on: ubuntu-latest
    steps:
      - uses: actions/checkout@v3
      - uses: actions/setup-python@v4
        with:
          python-version: '3.11'
      
      - name: Install dependencies
        run: |
          pip install uv
          uv pip install --system -e ".[dev]"
      
      - name: Test CLI
        run: |
          kura --help
          # Add more CLI tests

Docker Deployment

# Dockerfile
FROM python:3.11-slim

WORKDIR /app

COPY . .
RUN pip install uv && uv pip install --system -e ".[all]"

EXPOSE 8000

CMD ["kura", "start-app", "--dir", "/data/checkpoints", "--checkpoint-format", "hf-dataset"]
Build and run the image:
docker build -t kura .
docker run -p 8000:8000 -v "$(pwd)/data:/data" kura
