AI Image Generation

Overview

PolyChat-AI generates images through multimodal language models. Automatic retry logic, fallback models, and prompt optimization make generation more reliable.

Supported Models

Gemini 2.5 Flash

Free tier available. Fast generation with good quality.

GPT-4o

Premium model. High-quality multimodal capabilities.

Claude 3.5/3.7

Premium model. Excellent at image generation.

Model Selection

From the source code (src/services/openRouter.ts):

// Known image generation models
const imageGenerationModels = [
  'google/gemini-2.5-flash-image-preview:free',  // Free tier
  'google/gemini-2.5-flash-image-preview',       // Premium
  'google/gemini-2.5-flash-exp-03-25',           // Experimental
  'openai/gpt-4o',                               // OpenAI multimodal
  'openai/gpt-4o-mini',                          // Smaller GPT-4o
  'anthropic/claude-3.5-sonnet',                 // Claude 3.5
  'anthropic/claude-3.7-sonnet'                  // Latest Claude
];
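
As a minimal sketch of how a list like this might be used, the helper below checks whether a model ID appears in it. The name `isImageGenerationModel` is hypothetical and chosen for illustration; it is not confirmed to exist in src/services/openRouter.ts.

```typescript
// Model IDs copied from the list above.
const imageGenerationModels: readonly string[] = [
  'google/gemini-2.5-flash-image-preview:free',
  'google/gemini-2.5-flash-image-preview',
  'google/gemini-2.5-flash-exp-03-25',
  'openai/gpt-4o',
  'openai/gpt-4o-mini',
  'anthropic/claude-3.5-sonnet',
  'anthropic/claude-3.7-sonnet',
];

// Hypothetical helper: true when the model ID is in the known list.
function isImageGenerationModel(modelId: string): boolean {
  return imageGenerationModels.includes(modelId);
}
```

A settings screen could use a check like this to decide which models get the image icon.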

Image Sizes

Supported resolution options:

256×256: Quick previews and thumbnails
512×512: Standard social media images
1024×1024: High-resolution outputs

Actual size support depends on the specific model selected. Most modern models support all three sizes.
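
Since size support varies by model, it can help to validate a requested size before sending it. The sketch below assumes sizes are passed as "WxH" strings (the form used in the code examples on this page). `normalizeSize` and its default of 512x512 are hypothetical, not part of the documented API.

```typescript
// The three sizes listed above, in "WxH" form.
const SUPPORTED_SIZES = ['256x256', '512x512', '1024x1024'] as const;
type ImageSize = (typeof SUPPORTED_SIZES)[number];

// Hypothetical helper: narrow an arbitrary string to a supported size,
// falling back to a default when the value is not recognised.
function normalizeSize(
  size: string | undefined,
  fallback: ImageSize = '512x512',
): ImageSize {
  // Accept the typographic "×" as well as a plain "x".
  const cleaned = (size ?? '').trim().toLowerCase().replace('×', 'x');
  const match = SUPPORTED_SIZES.find((s) => s === cleaned);
  return match ?? fallback;
}
```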

Style Presets

Enhance your images with professional style options:

Artistic Styles
Traditional Art

Natural: Photography-like, realistic images
Vivid: Vibrant colors and high contrast
Digital Art: Clean, modern digital illustration
Anime: Japanese animation style

Mood Settings

Set the atmosphere of your generated images:

Mood	Effect
Bright	Cheerful, well-lit, positive atmosphere
Dark	Moody, mysterious, dramatic shadows
Serene	Peaceful, calm, tranquil feeling
Dramatic	Intense, high-contrast, impactful
Playful	Fun, energetic, whimsical
Mysterious	Enigmatic, intriguing, subtle
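
One way to picture a mood setting is as a text fragment appended to the prompt. The sketch below reuses the wording from the Effect column. The `Mood` type, `MOOD_DESCRIPTORS` table, and `applyMood` function are hypothetical, and the phrases PolyChat-AI actually appends may differ.

```typescript
type Mood = 'bright' | 'dark' | 'serene' | 'dramatic' | 'playful' | 'mysterious';

// Descriptor text taken from the "Effect" column above (lowercased).
const MOOD_DESCRIPTORS: Record<Mood, string> = {
  bright: 'cheerful, well-lit, positive atmosphere',
  dark: 'moody, mysterious, dramatic shadows',
  serene: 'peaceful, calm, tranquil feeling',
  dramatic: 'intense, high-contrast, impactful',
  playful: 'fun, energetic, whimsical',
  mysterious: 'enigmatic, intriguing, subtle',
};

// Hypothetical helper: append the mood descriptor to a base prompt.
function applyMood(prompt: string, mood: Mood): string {
  return `${prompt}, ${MOOD_DESCRIPTORS[mood]}`;
}
```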

Lighting Options

Control the lighting in your images:

Natural: Outdoor daylight simulation
Studio: Professional controlled lighting
Dramatic: Strong shadows and highlights
Soft: Diffused, gentle illumination
Neon: Vibrant artificial lighting
Golden Hour: Warm sunset/sunrise glow

Smart Features

Automatic Prompt Optimization

The system automatically enhances your prompts for better results:

// From: src/services/openRouter.ts

export const optimizeImagePrompt = (prompt: string): string => {
  let optimizedPrompt = prompt;
  
  // Add detail if prompt is short
  if (prompt.length < 50) {
    optimizedPrompt += 
      ', highly detailed, professional quality, vibrant colors, ' +
      'sharp focus, high resolution';
  }
  
  // Add style if not specified
  if (!prompt.toLowerCase().includes('style')) {
    optimizedPrompt += ', digital art style';
  }
  
  // Add quality parameters
  if (!prompt.toLowerCase().includes('resolution')) {
    optimizedPrompt += ', high resolution, photorealistic';
  }
  
  return optimizedPrompt;
};

What Gets Added:

Professional quality descriptors
Resolution and detail specifications
Style guidance if missing
Technical quality parameters
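
To make the effect concrete, here is the function above applied to a very short prompt. All three checks look at the original prompt rather than the accumulated one, so "high resolution" ends up in the result twice for short prompts.

```typescript
// Copy of optimizeImagePrompt from above, so the example is self-contained.
const optimizeImagePrompt = (prompt: string): string => {
  let optimizedPrompt = prompt;
  if (prompt.length < 50) {
    optimizedPrompt +=
      ', highly detailed, professional quality, vibrant colors, ' +
      'sharp focus, high resolution';
  }
  if (!prompt.toLowerCase().includes('style')) {
    optimizedPrompt += ', digital art style';
  }
  if (!prompt.toLowerCase().includes('resolution')) {
    optimizedPrompt += ', high resolution, photorealistic';
  }
  return optimizedPrompt;
};

const shortResult = optimizeImagePrompt('A dog');
// "A dog, highly detailed, professional quality, vibrant colors, sharp focus,
//  high resolution, digital art style, high resolution, photorealistic"

// A prompt of 50+ characters that already mentions "style" and "resolution"
// passes through unchanged.
const longPrompt = 'A watercolor style painting of a harbor at dawn, high resolution';
const longResult = optimizeImagePrompt(longPrompt);
```

If you want full control over the wording, mention both a style and a resolution in your prompt and keep it at 50 characters or more.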

Advanced Prompt Builder

For fine-grained control, use the advanced prompt builder:

// From: src/services/openRouter.ts

createAdvancedImagePrompt(
  "A serene mountain landscape",
  {
    style: 'photorealistic',
    mood: 'serene',
    lighting: 'golden_hour',
    composition: 'rule_of_thirds',
    quality: 'ultra_hd'
  }
)

// Generated prompt:
// "A serene mountain landscape, photorealistic, highly detailed,
// serene and peaceful atmosphere, golden hour lighting,
// rule of thirds composition, ultra high definition,
// extremely detailed, professional quality, sharp focus,
// well-composed"

Available Options:

type Style = 
  | 'natural'          // Natural photography
  | 'vivid'            // Vivid and colorful
  | 'digital_art'      // Digital art style
  | 'photorealistic'   // Maximum realism
  | 'anime'            // Anime style
  | 'oil_painting'     // Oil painting
  | 'watercolor';      // Watercolor
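
The following is a rough sketch of how a builder like this could assemble its output. It is not the actual createAdvancedImagePrompt implementation. The fragment tables are hypothetical and contain only the entries needed to reproduce the generated prompt shown above, and how the phrases are split between options is a guess.

```typescript
interface PromptOptions {
  style?: string;
  mood?: string;
  lighting?: string;
  composition?: string;
  quality?: string;
}

// Hypothetical fragment tables (one entry each, matching the example above).
const STYLE: Record<string, string> = { photorealistic: 'photorealistic, highly detailed' };
const MOOD: Record<string, string> = { serene: 'serene and peaceful atmosphere' };
const LIGHTING: Record<string, string> = { golden_hour: 'golden hour lighting' };
const COMPOSITION: Record<string, string> = { rule_of_thirds: 'rule of thirds composition' };
const QUALITY: Record<string, string> = { ultra_hd: 'ultra high definition, extremely detailed' };
const SUFFIX = 'professional quality, sharp focus, well-composed';

// Join the base prompt with whichever fragments the options select.
function buildPromptSketch(base: string, o: PromptOptions = {}): string {
  const parts: Array<string | undefined> = [
    base,
    o.style ? STYLE[o.style] : undefined,
    o.mood ? MOOD[o.mood] : undefined,
    o.lighting ? LIGHTING[o.lighting] : undefined,
    o.composition ? COMPOSITION[o.composition] : undefined,
    o.quality ? QUALITY[o.quality] : undefined,
    SUFFIX,
  ];
  // Drop missing options and unknown keys.
  return parts.filter((p): p is string => Boolean(p)).join(', ');
}
```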

Automatic Retry & Fallback

generateImageReliable retries failed requests and falls back to other models automatically, which makes image generation considerably more reliable.

How It Works

// From: src/services/openRouter.ts

export const generateImageReliable = async (
  prompt: string,
  apiKey: string,
  primaryModel?: string,
  options: {
    maxRetries?: number;        // Default: 3
    fallbackModels?: string[];  // Auto fallback chain
    size?: string;
    style?: string;
    quality?: string;
  } = {}
): Promise<string | MessageContent[]>

Retry Logic:

Primary Model Attempt

Try the selected model with your prompt (up to 3 retries)

Exponential Backoff

Wait between retries: 1s, 2s, 4s (reduces the chance of hitting rate limits)

Fallback Models

If primary fails, try fallback models automatically:

Gemini 2.5 Flash (free)
Gemini 2.5 Flash (premium)
Gemini experimental
GPT-4o
Claude 3.5 Sonnet

Validation

Verify that the response contains a valid image URL

Success or Detailed Error

Return the image, or a detailed error message with suggestions
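
The steps above can be sketched as a generic retry loop. This is an illustration under stated assumptions, not the body of generateImageReliable: the names `withRetryAndFallback`, `RetryOptions`, and the injectable `sleep` are hypothetical, and the sketch assumes a wait follows every failed attempt (giving 1s, 2s, 4s for three attempts per model).

```typescript
type Attempt<T> = (model: string) => Promise<T>;

interface RetryOptions {
  maxRetries?: number;                    // attempts per model (assumed default: 3)
  baseDelayMs?: number;                   // first wait (assumed default: 1000 ms)
  sleep?: (ms: number) => Promise<void>;  // injectable so tests need not wait
}

const defaultSleep = (ms: number): Promise<void> =>
  new Promise<void>((resolve) => setTimeout(resolve, ms));

// Try each model in order; retry each one with exponential backoff.
async function withRetryAndFallback<T>(
  models: string[],
  attempt: Attempt<T>,
  isValid: (result: T) => boolean,
  { maxRetries = 3, baseDelayMs = 1000, sleep = defaultSleep }: RetryOptions = {},
): Promise<T> {
  const errors: string[] = [];
  for (const model of models) {
    for (let i = 0; i < maxRetries; i++) {
      try {
        const result = await attempt(model);
        if (isValid(result)) return result; // validation step
        errors.push(`${model} #${i + 1}: invalid response`);
      } catch (err) {
        const reason = err instanceof Error ? err.message : String(err);
        errors.push(`${model} #${i + 1}: ${reason}`);
      }
      // Exponential backoff after a failed attempt: 1s, 2s, 4s, ...
      await sleep(baseDelayMs * 2 ** i);
    }
  }
  // Every model and retry failed: report what was tried.
  throw new Error(`All models failed:\n${errors.join('\n')}`);
}
```

Passing the primary model first and the fallbacks after it in `models` gives the primary-then-fallback order described above.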

Retry Configuration

// Example: Custom retry configuration

const result = await generateImageReliable(
  "A futuristic cityscape at sunset",
  apiKey,
  'google/gemini-2.5-flash-image-preview',
  {
    maxRetries: 5,  // More attempts
    fallbackModels: [
      'openai/gpt-4o',
      'anthropic/claude-3.7-sonnet'
    ],
    size: '1024x1024',
    quality: 'hd'
  }
);
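
Raising maxRetries multiplies the worst-case number of requests, so it can be worth estimating before changing it. The sketch below is a back-of-the-envelope calculation, not app code. It assumes every model (primary plus fallbacks) gets maxRetries attempts, that the delay keeps doubling from 1s, and that a wait follows each failed attempt. `worstCase` and `RetryPlan` are hypothetical names.

```typescript
interface RetryPlan {
  models: string[];      // primary model followed by fallbacks
  maxRetries: number;    // attempts per model
  baseDelayMs?: number;  // first wait (assumed 1000 ms)
}

// Estimate the upper bound on attempts and backoff time for a plan.
function worstCase({ models, maxRetries, baseDelayMs = 1000 }: RetryPlan) {
  // Backoff schedule per model: baseDelay * 2^i for i = 0 .. maxRetries - 1
  const delaysMs = Array.from({ length: maxRetries }, (_, i) => baseDelayMs * 2 ** i);
  const waitPerModelMs = delaysMs.reduce((sum, d) => sum + d, 0);
  return {
    attempts: models.length * maxRetries,
    delaysMs,
    totalWaitMs: waitPerModelMs * models.length,
  };
}

// The configuration above: one primary model, two fallbacks, five attempts each.
const estimate = worstCase({
  models: [
    'google/gemini-2.5-flash-image-preview',
    'openai/gpt-4o',
    'anthropic/claude-3.7-sonnet',
  ],
  maxRetries: 5,
});
// estimate.attempts === 15, estimate.totalWaitMs === 93000
```

Under these assumptions, a fully failing request with this configuration could spend about a minute and a half just waiting between attempts.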

Using Image Generation

Basic Usage

Select Image Model

Choose an image-capable model in settings (look for 🎨 icon)

Write Image Prompt

Describe the image you want:

Generate an image of a serene mountain lake at sunset,
with pine trees reflected in the water, photorealistic style

Send Message

The system automatically detects image requests and optimizes your prompt

Automatic Generation

Prompt is optimized
Sent to selected model
Automatic retry if needed
Image appears in chat

Advanced Usage

// Direct API call for programmatic usage

import { generateImageReliable } from './services/openRouter';

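const generateCustomImage = async () => {
  // process.env values may be undefined, so check before passing the key on
  const apiKey = process.env.OPENROUTER_API_KEY;
  if (!apiKey) {
    throw new Error('OPENROUTER_API_KEY is not set');
  }

  const result = await generateImageReliable(
    "A cyberpunk street market",
    apiKey,
    'google/gemini-2.5-flash-image-preview:free',
    {
      maxRetries: 3,
      size: '1024x1024',
      style: 'digital_art',
      quality: 'hd'
    }
  );

  // Result is either:
  // - Array of MessageContent with image URLs
  // - String with error message

  if (Array.isArray(result)) {
    const imageContent = result.find(c => c.type === 'image_url');
    // Re-check the discriminant so TypeScript narrows to the image variant
    if (imageContent?.type === 'image_url') {
      console.log('Image URL:', imageContent.image_url.url);
    }
  } else {
    // A string result carries the error message
    console.error(result);
  }
};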

Error Handling

Comprehensive Error Messages

When all retries fail, you get detailed guidance. The app shows this message in French; an English translation follows:

❌ Image generation error: Unable to generate the image after
trying 5 models with 3 attempts each.

Suggestions:
• Check your internet connection
• Check that your OpenRouter API key is valid
• Try a simpler prompt
• Try again in a few moments

The system automatically tried several models and methods
to ensure your image was generated.

Fallback Response

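If generation fails completely, a text-only fallback is returned. Its French text tells the user that automatic generation ran into difficulties, then describes in generic terms the image that would have been created (composition, color palette, resolution, format, style):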

// From: src/services/openRouter.ts

export const createFallbackImage = (prompt: string): MessageContent[] => {
  return [{
    type: 'text',
    text: `🎨 Image demandée: "${prompt}"
    
    Bien que la génération automatique ait rencontré des difficultés,
    voici une description détaillée de l'image qui aurait été créée:
    
    **Description visuelle:**
    • Composition professionnelle avec sujet central bien éclairé
    • Palette de couleurs vibrantes et harmonieuses
    • Détails techniques de haute qualité
    • Style artistique adapté au sujet
    
    **Paramètres techniques:**
    • Résolution: 1024×1024 pixels
    • Format: PNG avec transparence
    • Qualité: Haute définition
    • Style: Numérique moderne`
  }];
};

Best Practices

Writing Effective Prompts

Be Specific:

✅ “A golden retriever puppy playing in autumn leaves, photorealistic”
❌ “A dog”

Include Key Elements:

Subject (what/who)
Setting (where)
Style (how it should look)
Mood (atmosphere)
Lighting (time of day, type)

Use Descriptive Language:

Colors: “vibrant blue”, “muted earth tones”
Textures: “smooth glass”, “rough stone”
Atmosphere: “misty morning”, “dramatic sunset”
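
The key elements listed above map naturally onto a small helper that joins whichever parts you supply. This is a sketch for illustration only. `composePrompt` and `PromptElements` are hypothetical names and not part of PolyChat-AI.

```typescript
interface PromptElements {
  subject: string;    // what/who
  setting?: string;   // where
  style?: string;     // how it should look
  mood?: string;      // atmosphere
  lighting?: string;  // time of day, type
}

// Join the supplied elements into a single comma-separated prompt.
function composePrompt({ subject, setting, style, mood, lighting }: PromptElements): string {
  return [subject, setting, style, mood, lighting]
    .map((part) => part?.trim())
    .filter((part): part is string => Boolean(part))
    .join(', ');
}

const puppyPrompt = composePrompt({
  subject: 'A golden retriever puppy playing in autumn leaves',
  style: 'photorealistic',
});
// "A golden retriever puppy playing in autumn leaves, photorealistic"
```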

Choosing Models

For Speed: Gemini 2.5 Flash (free tier)

Fast generation
Good quality
Free to use

For Quality: GPT-4o or Claude 3.7

Superior detail
Better prompt understanding
Premium pricing

For Experimentation: Try multiple models

Use multi-model chat
Compare outputs
Find your preferred model

Optimizing Results

Iterate on Prompts:

Start with basic description
Add style and mood
Specify lighting and composition
Refine based on results

Use Advanced Options:

Specify style preset
Set mood and lighting
Choose optimal size
Request high quality

Leverage Automatic Retry:

The system retries failed requests automatically
No manual retry needed
Fallback models improve the odds of success

Managing Costs

Start Free: Use Gemini 2.5 Flash free tier

Monitor Usage: Check dashboard with Ctrl+U

Optimize Generation:

Use 512×512 for drafts
1024×1024 only for finals
Batch similar requests

Smart Model Selection:

Free models for experimentation
Premium for important projects

Technical Details

API Integration

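// Image generation uses OpenRouter API

// Assumed value: OpenRouter's chat completions endpoint.
// apiKey, imageModel, and optimizedPrompt come from the surrounding app code.
const API_URL = 'https://openrouter.ai/api/v1/chat/completions';

const response = await fetch(API_URL, {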
  method: 'POST',
  headers: {
    'Authorization': `Bearer ${apiKey}`,
    'Content-Type': 'application/json',
    'HTTP-Referer': window.location.origin,
    'X-Title': 'PolyChat AI'
  },
  body: JSON.stringify({
    model: imageModel,
    messages: [
      {
        role: 'system',
        content: 'You are an expert AI image generator...'
      },
      {
        role: 'user', 
        content: optimizedPrompt
      }
    ]
  })
});
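
To experiment with the request shape without touching the network, the same payload can be built by a pure function. The sketch below is an assumption-laden illustration: `buildImageRequest` is a hypothetical name, the endpoint is OpenRouter's chat completions URL, and the system prompt and referer are passed in as parameters because the real system prompt is elided above and `window` is not available outside the browser.

```typescript
interface ChatMessage {
  role: 'system' | 'user';
  content: string;
}

const OPENROUTER_CHAT_URL = 'https://openrouter.ai/api/v1/chat/completions';

// Hypothetical helper: assemble the URL and fetch options for an image request.
function buildImageRequest(
  apiKey: string,
  model: string,
  prompt: string,
  systemPrompt: string,
  referer: string,
) {
  const messages: ChatMessage[] = [
    { role: 'system', content: systemPrompt },
    { role: 'user', content: prompt },
  ];
  return {
    url: OPENROUTER_CHAT_URL,
    init: {
      method: 'POST',
      headers: {
        'Authorization': `Bearer ${apiKey}`,
        'Content-Type': 'application/json',
        'HTTP-Referer': referer,
        'X-Title': 'PolyChat AI',
      },
      body: JSON.stringify({ model, messages }),
    },
  };
}
```

The result can be handed to fetch as `fetch(request.url, request.init)`, and the body can be inspected in a unit test before any request is sent.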

Response Format

// Images are returned as MessageContent array

type MessageContent = 
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

// Example response:
[
  { type: 'text', text: 'Here is your generated image:' },
  { 
    type: 'image_url', 
    image_url: { 
      url: 'https://example.com/generated-image.png' 
    }
  }
]
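
Because the result can be either a string or a MessageContent array, a small type guard keeps consuming code tidy. The helper names `isImageContent` and `extractImageUrls` below are hypothetical. The MessageContent type is repeated from above so the example stands alone.

```typescript
type MessageContent =
  | { type: 'text'; text: string }
  | { type: 'image_url'; image_url: { url: string } };

type ImageContent = Extract<MessageContent, { type: 'image_url' }>;

// Type guard: narrows a MessageContent item to the image variant.
const isImageContent = (c: MessageContent): c is ImageContent =>
  c.type === 'image_url';

// Collect every image URL from a result; a string result yields none.
function extractImageUrls(result: string | MessageContent[]): string[] {
  if (typeof result === 'string') return [];
  return result.filter(isImageContent).map((c) => c.image_url.url);
}

const exampleResponse: MessageContent[] = [
  { type: 'text', text: 'Here is your generated image:' },
  { type: 'image_url', image_url: { url: 'https://example.com/generated-image.png' } },
];
const urls = extractImageUrls(exampleResponse);
// ["https://example.com/generated-image.png"]
```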

Example Prompts

A majestic mountain range at sunrise, with misty valleys below,
golden hour lighting, photorealistic style, rule of thirds composition,
snow-capped peaks, vibrant orange and pink sky, 8k quality

Next: RAG Context

Learn about RAG-powered context enhancement with local embeddings
