JOIP uses AI models to generate explicit NSFW captions for images. You can use either OpenAI (direct) or OpenRouter (multi-model gateway) for caption generation.

Overview

The caption generation system supports:
  • Smart Captions: Detailed AI captions for uploaded images (50-400 characters)
  • Session Captions: Short captions for slideshow playback (50-150 characters)
  • Manual Captions: Contextual captions that maintain narrative continuity
  • Batch Generation: Generate captions for multiple slides at once
OpenRouter is recommended for production use as it provides access to multiple models, better rate limits, and fallback options.

Prerequisites

Choose one of the following: an OpenRouter API key (recommended) or an OpenAI API key.

Setup Steps

Step 1: Get API Key

  1. Sign up at openrouter.ai
  2. Navigate to Keys
  3. Click “Create Key”
  4. Name it (e.g., “JOIP Production”)
  5. Copy the key (starts with sk-or-)
API keys are shown only once. Save them securely immediately after creation.

Step 2: Configure Environment

Add your key to .env:
OPENROUTER_API_KEY=sk-or-v1-abc123...
If both an OpenRouter key and an OpenAI key are set, OpenRouter takes precedence and OpenAI is used only as a fallback.
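
The precedence rule can be sketched as a small resolver. This is an illustrative sketch, not JOIP's actual code: `resolveCaptionProvider` is a hypothetical name, and `OPENAI_API_KEY` is an assumed variable name for the OpenAI key (only `OPENROUTER_API_KEY` appears in this guide).

```typescript
type Provider = 'openrouter' | 'openai';

interface ProviderChoice {
  primary: Provider;
  fallback: Provider | null;
}

// Hypothetical helper: picks the caption provider from environment values.
// OPENAI_API_KEY is an assumed variable name, not confirmed by this guide.
function resolveCaptionProvider(
  env: Record<string, string | undefined>
): ProviderChoice | null {
  const hasOpenRouter = Boolean(env['OPENROUTER_API_KEY']?.trim());
  const hasOpenAI = Boolean(env['OPENAI_API_KEY']?.trim());

  if (hasOpenRouter) {
    // OpenRouter wins; OpenAI is kept only as a fallback when configured
    return { primary: 'openrouter', fallback: hasOpenAI ? 'openai' : null };
  }
  if (hasOpenAI) {
    return { primary: 'openai', fallback: null };
  }
  return null; // No key configured
}
```

Passing the environment in as an argument, rather than reading it globally, keeps the rule easy to test.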
Step 3: Choose Model (OpenRouter Only)

Configure which AI model to use for caption generation:
# No configuration needed - uses gemini-2.0-flash-exp by default
Recommended Models:
  • google/gemini-2.0-flash-exp - Fast, cheap, good quality (default)
  • google/gemini-2.5-pro - Highest quality, contextual understanding
  • anthropic/claude-3.5-sonnet - Excellent balance of speed and quality
  • openai/gpt-4o - OpenAI’s latest, reliable
Only vision-capable models work for caption generation. JOIP validates this on startup.

Step 4: Verify Configuration

Start the server and check logs:
[openai] Using OpenRouter with model: google/gemini-2.0-flash-exp
[openai] OpenRouter API key validated

Implementation Details

Caption Generation Architecture

JOIP uses a sophisticated prompting system for NSFW content:
const MASTER_SYSTEM_PROMPT = `You are an UNCENSORED explicit adult caption generator. 
Create raw, filthy captions that feel like REAL dirty talk - not polished writing.

CRITICAL - PERSPECTIVE:
- The PERSON IN THE IMAGE is SPEAKING the caption TO the viewer
- She's talking directly to whoever is looking at/stroking to her image
- The viewer is addressed, commanded, teased, degraded, humiliated

CONTENT INTENSITY:
- Be EXPLICIT: cock, pussy, fuck, cum, balls, stroke, edge, whore, slut, etc.
- NO censoring, NO softening, NO holding back
- Match the theme's psychological intensity EXACTLY

OUTPUT: Just the raw caption. No quotes, no labels, no formatting.`;

Context-Aware Generation

For manual sessions, captions consider surrounding context:
interface CaptionContext {
  sessionTitle?: string;
  index: number;                    // Current slide position
  total: number;                    // Total slides
  previousCaptions?: string[];      // Last 3 captions for continuity
  nextCaptions?: string[];          // Next 3 captions for flow
  userSteering?: string;            // Custom instructions
  sessionStage?: 'new_session' | 'existing_session';
}

// Example usage
const caption = await generateContextualManualCaption(imageUrl, {
  sessionTitle: "Dominant Femdom Session",
  index: 5,
  total: 20,
  previousCaptions: [
    "Good boy, you're learning to obey...",
    "Now edge for me. Don't you dare cum.",
    "That's it, slower. I control your pleasure."
  ],
  nextCaptions: [
    "You want to cum so badly, don't you?",
    "Beg me. Let me hear how desperate you are."
  ],
  userSteering: "Build tension, more teasing, less commands"
});

Theme System

JOIP provides pre-built caption themes:
Dominant, teasing control with explicit edging/denial commands.
"Stroke faster. Don't stop until I tell you to. 
You're mine to control."

Gemini Safety Settings

For Google Gemini models, JOIP disables content filtering:
const GEMINI_SAFETY_SETTINGS = [
  { category: "HARM_CATEGORY_HARASSMENT", threshold: "BLOCK_NONE" },
  { category: "HARM_CATEGORY_HATE_SPEECH", threshold: "BLOCK_NONE" },
  { category: "HARM_CATEGORY_SEXUALLY_EXPLICIT", threshold: "BLOCK_NONE" },
  { category: "HARM_CATEGORY_DANGEROUS_CONTENT", threshold: "BLOCK_NONE" },
];

// Applied automatically for gemini models
if (isGeminiModel(modelId)) {
  requestBody.safety_settings = GEMINI_SAFETY_SETTINGS;
}
Without these settings, Gemini returns empty responses for NSFW content.

API Usage Examples

Generate Smart Caption

import { generateCustomCaption } from './openai';

const caption = await generateCustomCaption(
  'https://example.com/image.jpg',  // Image URL
  'Focus on the outfit and pose',   // Custom prompt (optional)
  'joi',                             // Theme (optional)
  'smart_captions'                   // Context: smart_captions | session | other
);

console.log(caption);
// "That tight dress... you can't stop staring, can you? 
//  Edge for me while you imagine what's underneath."

Generate Session Caption

import { generateCaption } from './openai';

// Short captions for 2-7 second display
const caption = await generateCaption(
  imageUrl,
  'Post title from Reddit',  // Optional
  'gonewild'                 // Subreddit context
);

console.log(caption);
// "Stroke faster. You know you can't resist."

Generate Contextual Caption

import { generateContextualManualCaption } from './openai';

const caption = await generateContextualManualCaption(imageUrl, {
  sessionTitle: "Edging Challenge",
  index: 10,
  total: 25,
  previousCaptions: [
    "You're doing so well. Keep edging for me.",
    "Don't cum yet. I didn't give you permission."
  ],
  userSteering: "Increase intensity, add countdown"
});

// Maintains narrative flow and user preferences

Batch Generation

// For manual session editor
const captions = await Promise.all(
  slides.map((slide, index) => 
    generateContextualManualCaption(slide.imageUrl, {
      index,
      total: slides.length,
      previousCaptions: slides
        .slice(Math.max(0, index - 3), index)
        .map(s => s.caption),
      nextCaptions: slides
        .slice(index + 1, index + 4)
        .map(s => s.caption)
    })
  )
);
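
`Promise.all` starts every request at once, which can run into provider rate limits on large sessions. One possible mitigation is to cap the number of requests in flight. The helper below is a generic sketch; `mapWithConcurrency` is a hypothetical name and not part of JOIP.

```typescript
// Hypothetical helper: runs `worker` over `items` with at most `limit`
// calls in flight, preserving input order in the result.
async function mapWithConcurrency<T, R>(
  items: readonly T[],
  limit: number,
  worker: (item: T, index: number) => Promise<R>
): Promise<R[]> {
  const results: R[] = new Array(items.length);
  let next = 0; // Index of the next unclaimed item

  // Each lane claims items one at a time until none remain
  async function runLane(): Promise<void> {
    while (next < items.length) {
      const index = next++;
      results[index] = await worker(items[index], index);
    }
  }

  const laneCount = Math.max(1, Math.min(limit, items.length));
  await Promise.all(Array.from({ length: laneCount }, () => runLane()));
  return results;
}
```

The batch call above could then pass its per-slide callback as `worker`, for example with a limit of 3. Note that in either version `previousCaptions` contains whatever captions the slides already had when the batch started, not the captions produced during the same batch.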

Media Compatibility

JOIP validates media before generation:
Format | Supported | Max Size | Notes
JPEG   | ✅ Yes    | 20MB     | Recommended
PNG    | ✅ Yes    | 20MB     | Recommended
WebP   | ✅ Yes    | 20MB     | Recommended
GIF    | ❌ No     | -        | Static images only
Video  | ❌ No     | -        | Extract frame first
async function checkMediaCompatibility(imageUrl: string): Promise<void> {
  // Check for GIFs by file extension, ignoring query strings and fragments
  const { pathname } = new URL(imageUrl);
  if (pathname.toLowerCase().endsWith('.gif')) {
    throw new Error(
      'Animated GIFs are not supported. Use static images (JPEG, PNG, WebP).'
    );
  }

  // Check file size (OpenRouter has 21MB limit)
  const response = await fetch(imageUrl, { method: 'HEAD' });
  const contentLength = response.headers.get('content-length');

  // Servers may omit Content-Length on HEAD; the size check is skipped then
  if (contentLength) {
    const fileSizeMB = parseInt(contentLength, 10) / (1024 * 1024);
    if (!Number.isNaN(fileSizeMB) && fileSizeMB > 20) {
      throw new Error(
        `Image too large (${fileSizeMB.toFixed(1)}MB). Max 20MB.`
      );
    }
  }
}

Error Handling

Automatic Retries

JOIP automatically retries failed generations:
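const maxAttempts = 3;
const FALLBACK_CAPTION = "Ready to play? Let's see how long you can last...";

// Wrapper name is illustrative; arguments are passed through unchanged.
async function generateCaptionWithRetry(
  ...args: Parameters<typeof generateCaptionInternal>
): Promise<string> {
  for (let attempt = 1; attempt <= maxAttempts; attempt++) {
    try {
      return await generateCaptionInternal(...args);
    } catch (error) {
      const message = error instanceof Error ? error.message : '';
      // Retry on content filtering or empty responses
      const isRetriable =
        message === 'CONTENT_POLICY_REJECTION' ||
        message === 'EMPTY_RESPONSE_CONTENT_FILTERED';

      // Non-retriable errors propagate immediately
      if (!isRetriable) throw error;

      // Wait briefly before the next attempt
      if (attempt < maxAttempts) {
        await new Promise(r => setTimeout(r, 500));
      }
    }
  }

  // Fallback after all attempts
  return FALLBACK_CAPTION;
}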

Common Error Codes

401 Unauthorized
Cause: Invalid or expired API key
Solution:
  • Verify key in .env matches your dashboard
  • Check for spaces or quotes around the key
  • Regenerate key if compromised
  • Restart server after updating

429 Too Many Requests
Cause: Too many requests
Solution:
  • OpenRouter: Check rate limits at openrouter.ai/docs
  • OpenAI: Upgrade tier for higher limits
  • Implement user-level throttling
  • Add delays between batch generations

400 Bad Request
Cause: Invalid request parameters
Solution:
  • Check image URL is accessible
  • Verify image size < 20MB
  • Ensure model supports vision
  • Review error message for specifics

Content policy rejection
Cause: Model refuses to generate NSFW content
Solution:
  • Use Google Gemini models (best NSFW support)
  • Verify safety settings are applied
  • Try alternative model if issue persists
  • Check if prompt is too explicit (ironically)

Cost Optimization

Model Pricing (OpenRouter)

Model                 | Input ($/1M tokens) | Output ($/1M tokens) | Quality
gemini-2.0-flash-exp  | Free                | Free                 | Good
gemini-2.5-flash-lite | $0.01               | $0.04                | Good
gemini-2.5-pro        | $1.00               | $4.00                | Excellent
claude-3.5-sonnet     | $3.00               | $15.00               | Excellent
gpt-4o                | $2.50               | $10.00               | Very Good
Prices are approximate. Check OpenRouter Pricing for current rates.
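
To turn per-million-token rates into a budget figure, multiply each rate by the token count divided by one million. The sketch below shows the arithmetic; `estimateCostUsd` is a hypothetical helper, and the per-caption token counts are assumptions chosen only for illustration (image inputs in particular vary by model and resolution).

```typescript
interface ModelPricing {
  inputPerMillion: number;  // USD per 1M input tokens
  outputPerMillion: number; // USD per 1M output tokens
}

// Hypothetical helper: estimates the USD cost for a given token volume.
function estimateCostUsd(
  inputTokens: number,
  outputTokens: number,
  pricing: ModelPricing
): number {
  return (
    (inputTokens / 1_000_000) * pricing.inputPerMillion +
    (outputTokens / 1_000_000) * pricing.outputPerMillion
  );
}

// Approximate gpt-4o rates as listed in the pricing table
const gpt4o: ModelPricing = { inputPerMillion: 2.5, outputPerMillion: 10 };

// 1,000 captions at an assumed 1,200 input and 100 output tokens each
const total = estimateCostUsd(1_000 * 1_200, 1_000 * 100, gpt4o);
console.log(total.toFixed(2)); // "4.00"
```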

Tips to Reduce Costs

  1. Use Flash Models: gemini-2.0-flash-exp is free and high quality
  2. Cache Results: JOIP caches captions in IndexedDB (24hr TTL)
  3. Batch Wisely: Generate all captions at once vs. one-by-one
  4. Optimize Prompts: Shorter prompts = lower input costs
  5. Set Max Tokens: Limit output length to reduce costs
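
Tip 2 relies on expiring cached captions after 24 hours. The expiry logic can be sketched with an in-memory map. This is not JOIP's IndexedDB implementation; `CaptionCache` and its methods are hypothetical names, and the injectable clock exists only to make the behavior easy to test.

```typescript
const DAY_MS = 24 * 60 * 60 * 1000;

interface CacheEntry {
  value: string;
  expiresAt: number; // Epoch milliseconds
}

// Hypothetical in-memory stand-in for a persistent caption cache.
class CaptionCache {
  private entries = new Map<string, CacheEntry>();
  private ttlMs: number;
  private now: () => number;

  constructor(ttlMs: number = DAY_MS, now: () => number = () => Date.now()) {
    this.ttlMs = ttlMs;
    this.now = now;
  }

  set(key: string, value: string): void {
    this.entries.set(key, { value, expiresAt: this.now() + this.ttlMs });
  }

  get(key: string): string | undefined {
    const entry = this.entries.get(key);
    if (!entry) return undefined;
    if (this.now() >= entry.expiresAt) {
      this.entries.delete(key); // Evict expired entries lazily on read
      return undefined;
    }
    return entry.value;
  }
}
```

A cache key would normally combine everything that affects the output, such as the image URL, theme, and custom prompt, so that changing any of them triggers a fresh generation.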

Advanced Configuration

Custom Model Parameters

const response = await fetch('https://openrouter.ai/api/v1/chat/completions', {
  method: 'POST',
  headers: {
    // OpenRouter authenticates with a bearer token
    'Authorization': `Bearer ${process.env.OPENROUTER_API_KEY}`,
    'Content-Type': 'application/json',
  },
  body: JSON.stringify({
    model: modelId,
    messages: [...],

    // Creativity settings
    temperature: 1.15,      // Higher = more creative (0-2)
    top_p: 0.9,             // Nucleus sampling (0-1)

    // Repetition penalties
    frequency_penalty: 0.5, // Reduce word repetition
    presence_penalty: 0.4,  // Encourage new topics

    // Output control
    max_tokens: 500,        // Limit response length
  })
});

Enable Reasoning (Extended Thinking)

For complex contextual captions:
OPENROUTER_REASONING_ENABLED=true

# Optional: Set effort level (low/medium/high)
OPENROUTER_REASONING_EFFORT=medium
Reasoning mode increases costs significantly. Use only for contextual manual captions.
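
A server could read these two variables roughly as follows. This is a sketch under assumptions: `parseReasoningConfig` is a hypothetical name, and falling back to `medium` for a missing or unrecognized effort value is an assumed default, not documented behavior.

```typescript
type ReasoningEffort = 'low' | 'medium' | 'high';

interface ReasoningConfig {
  enabled: boolean;
  effort: ReasoningEffort;
}

const EFFORTS: readonly ReasoningEffort[] = ['low', 'medium', 'high'];

// Hypothetical parser; the 'medium' default is an assumption.
function parseReasoningConfig(
  env: Record<string, string | undefined>
): ReasoningConfig {
  const enabled =
    (env['OPENROUTER_REASONING_ENABLED'] ?? '').trim().toLowerCase() === 'true';
  const raw = (env['OPENROUTER_REASONING_EFFORT'] ?? '').trim().toLowerCase();
  // Accept only the three documented levels
  const effort = EFFORTS.find((e) => e === raw) ?? 'medium';
  return { enabled, effort };
}
```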

Troubleshooting

Debug Logging

// In server/logger.ts, set level to 'debug'
export const logger = createLogger({
  level: 'debug'
});

// Logs will show:
// [openai] Generating caption with OpenRouter model: google/gemini-2.5-pro
// [openai] Caption properly grounded with subtle visual reference

Test Caption Generation

curl -X POST http://localhost:5000/api/captions/generate \
  -H "Content-Type: application/json" \
  -H "Cookie: your-session-cookie" \
  -d '{
    "imageUrl": "https://example.com/test.jpg",
    "customPrompt": "Test caption generation",
    "theme": "joi"
  }'

Security Best Practices

API Key Security

  • Store keys in .env only
  • Never commit to version control
  • Rotate keys periodically
  • Use different keys per environment

Input Validation

  • Validate image URLs before API calls
  • Block localhost/private IPs (SSRF protection)
  • Enforce size limits (20MB max)
  • Check content-type headers
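
A first-pass URL check along these lines might look like the sketch below. `isSafeImageUrl` is a hypothetical helper. It only inspects the URL string, so it does not catch hostnames that resolve to private addresses via DNS; a production check would also need to validate the resolved IP at connection time.

```typescript
// Hypothetical helper: rejects URLs that are obviously unsafe to fetch.
function isSafeImageUrl(rawUrl: string): boolean {
  let url: URL;
  try {
    url = new URL(rawUrl);
  } catch {
    return false; // Not a parseable absolute URL
  }

  if (url.protocol !== 'http:' && url.protocol !== 'https:') return false;

  const host = url.hostname.toLowerCase();
  if (host === 'localhost' || host.endsWith('.localhost')) return false;

  // IPv6 literals keep their brackets in URL.hostname; this sketch rejects
  // them all rather than parsing IPv6 ranges.
  if (host.startsWith('[')) return false;

  const ipv4 = host.match(/^(\d{1,3})\.(\d{1,3})\.(\d{1,3})\.(\d{1,3})$/);
  if (ipv4) {
    const a = Number(ipv4[1]);
    const b = Number(ipv4[2]);
    if (a === 0 || a === 10 || a === 127) return false; // This-network, private, loopback
    if (a === 169 && b === 254) return false;           // Link-local
    if (a === 172 && b >= 16 && b <= 31) return false;  // Private
    if (a === 192 && b === 168) return false;           // Private
  }
  return true;
}
```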

Rate Limiting

  • Implement user-level quotas
  • Track API usage per user
  • Add cooldowns for abuse prevention
  • Monitor costs in real-time
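
User-level quotas can be sketched with a fixed-window counter per user. `UserRateLimiter` is a hypothetical class, state is kept in memory only (so it would not be shared across server instances), and any specific limits are illustrative.

```typescript
interface WindowState {
  windowStart: number; // Epoch milliseconds when the window opened
  count: number;       // Requests recorded in this window
}

// Hypothetical fixed-window limiter: at most `limit` requests per user
// in each `windowMs` window.
class UserRateLimiter {
  private windows = new Map<string, WindowState>();
  private limit: number;
  private windowMs: number;
  private now: () => number;

  constructor(limit: number, windowMs: number, now: () => number = () => Date.now()) {
    this.limit = limit;
    this.windowMs = windowMs;
    this.now = now;
  }

  // Returns true and records the request if the user is under quota.
  tryAcquire(userId: string): boolean {
    if (this.limit <= 0) return false;
    const t = this.now();
    const state = this.windows.get(userId);
    if (!state || t - state.windowStart >= this.windowMs) {
      // First request, or the previous window has elapsed
      this.windows.set(userId, { windowStart: t, count: 1 });
      return true;
    }
    if (state.count >= this.limit) return false;
    state.count += 1;
    return true;
  }
}
```

For example, `new UserRateLimiter(30, 60_000)` would allow each user 30 caption requests per minute.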

Error Handling

  • Don’t expose API errors to users
  • Log detailed errors server-side
  • Provide generic user-facing messages
  • Implement graceful fallbacks
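
One way to keep upstream details out of user-facing responses is to map the provider's HTTP status to a small set of generic messages and log the full error server-side. `toUserFacingError` and its message strings are hypothetical and meant only as a sketch.

```typescript
interface UserFacingError {
  message: string;
  retryable: boolean;
}

// Hypothetical mapping from an upstream HTTP status to a generic message.
function toUserFacingError(status: number): UserFacingError {
  if (status === 429) {
    return { message: 'Too many requests. Please try again shortly.', retryable: true };
  }
  if (status >= 500) {
    return { message: 'The caption service is temporarily unavailable.', retryable: true };
  }
  if (status === 400) {
    return { message: 'This image could not be processed.', retryable: false };
  }
  // 401, 403 and anything else: a configuration problem the user cannot fix
  return { message: 'Caption generation failed. Please try again later.', retryable: false };
}
```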
