AI-Generated Captions

Hayon integrates Google Gemini (gemini-2.5-flash) to generate social media captions from your images and optional intent text. You can generate a generic caption for all platforms or request a caption tuned to a specific platform’s style and constraints.

How it works

Open the AI caption panel

In the post editor, click Generate with AI. The panel appears alongside the caption field.

Provide context (optional)

Enter a short prompt describing your intent — for example, “promoting a weekend sale” or “behind the scenes at our studio”. If you leave this blank, Gemini analyses the image alone.

Attach images

Upload one or more images. The media array is passed to Gemini as image parts, enabling the model to match the caption to the visual content.

Choose a generation mode

Generic (POST /api/generate/captions): produces one caption intended for use on every platform. The built-in prompt is written in an Instagram style (a short caption followed by hashtags).
Platform-specific (POST /api/generate/captions/:platform): tailors the output to the character limits, tone, and conventions of the chosen platform.

Apply or edit the caption

Review the generated text. Click Use this caption to populate the post editor, or edit the result inline before applying.

Generic caption generation

Call POST /api/generate/captions with your prompt and media.

POST /api/generate/captions

curl -X POST https://api.yourhayon.com/api/generate/captions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Showcasing our new product launch",
    "media": [
      {
        "mimeType": "image/jpeg",
        "data": "<base64-encoded-image>"
      }
    ]
  }'
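The same request can be assembled in TypeScript. The following is a minimal sketch, assuming the base URL and payload shown in the curl example; `buildCaptionRequest`, `CaptionRequest`, and `MediaItem` are hypothetical names used for illustration and are not part of Hayon's codebase. It returns a plain request descriptor rather than sending anything, so it can be handed to any HTTP client.

```typescript
// Hypothetical helper mirroring the curl example; not part of Hayon's codebase.
interface MediaItem {
  mimeType: string; // e.g. "image/jpeg"
  data: string;     // Base64-encoded image bytes
}

interface CaptionRequest {
  url: string;
  method: "POST";
  headers: { Authorization: string; "Content-Type": string };
  body: string; // JSON-encoded payload
}

function buildCaptionRequest(
  baseUrl: string,
  token: string,
  media: MediaItem[],
  prompt?: string
): CaptionRequest {
  // Assumption: a blank prompt is simply omitted, so the caption is
  // generated from the image alone.
  const payload: { prompt?: string; media: MediaItem[] } = { media: media };
  if (prompt !== undefined && prompt.trim() !== "") {
    payload.prompt = prompt.trim();
  }
  return {
    // Drop trailing slashes so the path is not doubled up.
    url: baseUrl.replace(/\/+$/, "") + "/api/generate/captions",
    method: "POST",
    headers: {
      Authorization: "Bearer " + token,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(payload),
  };
}
```

The descriptor maps directly onto `fetch(req.url, { method: req.method, headers: req.headers, body: req.body })` or an equivalent call in another HTTP client.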

The model uses the following built-in system prompt:

Gemini system prompt (generative.controller.ts)

You are a professional social media content creator.

Generate:
- One engaging Instagram caption (2–3 short lines max).
- Use the user intent if provided.
- Match the vibe of the image.
- Sound natural and human.

Then generate:
- 5 relevant niche hashtags based on the image.
- 3 general Instagram hashtags.

Rules:
- No emojis.
- No markdown.
- No explanations.
- Do NOT include labels like "Caption:" or "Hashtags:".
- Return plain text only.

Output format:
<caption text>

#tag1 #tag2 #tag3 #tag4 #tag5 #tag6 #tag7 #tag8

Successful response:

200 response

{
  "data": {
    "candidates": [
      {
        "content": {
          "parts": [
            {
              "text": "Behind every great product is a story worth telling. Here's ours.\n\n#productlaunch #newrelease #innovation #startup #buildinpublic #marketing #launch #brand"
            }
          ]
        }
      }
    ]
  }
}
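Because the caption and hashtags arrive as a single plain-text string nested inside `candidates`, a client usually has to unwrap and split it. The sketch below assumes the response shape shown above and the output format requested by the system prompt (caption lines followed by a line of hashtags); `parseCaptionResponse` is a hypothetical helper, and the model is not guaranteed to follow the format exactly.

```typescript
// Shape of the 200 response shown above; every level is treated as optional.
interface CaptionResponse {
  data?: {
    candidates?: Array<{
      content?: { parts?: Array<{ text?: string }> };
    }>;
  };
}

interface ParsedCaption {
  caption: string;
  hashtags: string[];
}

// Hypothetical helper: returns null when the response carries no text.
function parseCaptionResponse(response: CaptionResponse): ParsedCaption | null {
  const parts = response.data?.candidates?.[0]?.content?.parts ?? [];
  const text = parts.map((part) => part.text ?? "").join("").trim();
  if (text === "") return null;

  const captionLines: string[] = [];
  const hashtags: string[] = [];
  for (const rawLine of text.split("\n")) {
    const line = rawLine.trim();
    if (line === "") continue;
    const tokens = line.split(/\s+/);
    // A line counts as a hashtag line only if every token starts with "#".
    const allTags = tokens.every((token) => /^#\S+$/.test(token));
    if (allTags) {
      for (const token of tokens) hashtags.push(token);
    } else {
      captionLines.push(line);
    }
  }
  return { caption: captionLines.join("\n"), hashtags: hashtags };
}
```

Keeping the hashtags as a separate array makes it straightforward to trim or drop them for platforms where a long hashtag block is unwelcome.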

Platform-specific caption generation

Call POST /api/generate/captions/:platform to generate a caption optimised for a particular platform. The buildPlatformPrompt function constructs a prompt that incorporates the platform’s character limits, style guidelines, and best practices.

POST /api/generate/captions/bluesky

curl -X POST https://api.yourhayon.com/api/generate/captions/bluesky \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Nature walk in the mountains",
    "media": [
      {
        "mimeType": "image/jpeg",
        "data": "<base64-encoded-image>"
      }
    ]
  }'

Supported platform values: bluesky · threads · instagram · facebook · mastodon · tumblr

Use platform-specific generation when your platforms have very different audiences or constraints. For example, Bluesky has a 300-character limit while Tumblr supports long-form content.
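A client can validate the platform value before building the endpoint path. This is a small sketch using the supported values listed above; `captionEndpoint` and `isSupportedPlatform` are hypothetical names, and lowercasing the input is a client-side convenience rather than documented API behaviour.

```typescript
// Platform values taken from the list above.
const SUPPORTED_PLATFORMS = [
  "bluesky",
  "threads",
  "instagram",
  "facebook",
  "mastodon",
  "tumblr",
] as const;

type Platform = (typeof SUPPORTED_PLATFORMS)[number];

function isSupportedPlatform(value: string): value is Platform {
  return (SUPPORTED_PLATFORMS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: returns the generic path when no platform is given,
// otherwise the platform-specific path.
function captionEndpoint(platform?: string): string {
  if (platform === undefined) return "/api/generate/captions";
  const normalised = platform.trim().toLowerCase();
  if (!isSupportedPlatform(normalised)) {
    throw new Error("Unsupported platform: " + platform);
  }
  return "/api/generate/captions/" + normalised;
}
```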

Rate limits and quotas

Two limits apply to AI caption generation:

Per-user plan limit

Hayon tracks usage in user.usage.captionGenerations and compares it against user.limits.maxCaptionGenerations. Once the limit is reached, further requests return a 429 with a plan upgrade prompt.

API rate limit

A sliding-window rate limiter (ai_caption) allows 50 requests per hour per user, counted across both caption endpoints. It is checked before the plan limit.

If the Gemini API itself returns a 429 (quota exhausted), Hayon surfaces the message: “AI service limit reached. Please try again after some time or upgrade your plan.” This is separate from the per-user plan limit.
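The ordering of the two checks can be illustrated with an in-memory sketch. This is not Hayon's implementation: `SlidingWindowLimiter`, `checkCaptionAllowance`, and the `id` field are hypothetical, and a production limiter would normally keep its state in a shared store. Only the `usage` and `limits` field names and the 50-per-hour figure come from the text above.

```typescript
// Illustrative sliding-window log: keeps the timestamps of accepted requests.
class SlidingWindowLimiter {
  private hits: Record<string, number[] | undefined> = {};
  private limit: number;
  private windowMs: number;

  constructor(limit: number, windowMs: number) {
    this.limit = limit;
    this.windowMs = windowMs;
  }

  tryAcquire(key: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Keep only the requests that are still inside the window.
    const recent = (this.hits[key] ?? []).filter((t) => t > cutoff);
    this.hits[key] = recent;
    if (recent.length >= this.limit) return false;
    recent.push(now);
    return true;
  }
}

interface CaptionUser {
  id: string; // hypothetical identifier used as the limiter key
  usage: { captionGenerations: number };
  limits: { maxCaptionGenerations: number };
}

type Allowance = "ok" | "rate_limited" | "plan_limit_reached";

// The rate limiter runs first, then the per-user plan limit.
function checkCaptionAllowance(
  user: CaptionUser,
  limiter: SlidingWindowLimiter,
  now: number = Date.now()
): Allowance {
  if (!limiter.tryAcquire("ai_caption:" + user.id, now)) return "rate_limited";
  if (user.usage.captionGenerations >= user.limits.maxCaptionGenerations) {
    return "plan_limit_reached";
  }
  return "ok";
}
```

With `new SlidingWindowLimiter(50, 60 * 60 * 1000)`, the fifty-first request inside any rolling hour is rejected regardless of how much plan quota remains.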

Sending media to Gemini

The buildImageParts helper converts the media array from the request body into Gemini-compatible Part objects. Each entry should contain:

media array item shape

{
  mimeType: string; // e.g. "image/jpeg"
  data: string;     // Base64-encoded image bytes
}

These are passed alongside the text prompt as a multi-part message to GenAi.models.generateContent() using model gemini-2.5-flash.

Caption generation call (generative.controller.ts)

import { GoogleGenAI } from "@google/genai";

const GenAi = new GoogleGenAI({ apiKey: ENV.GEMINI.API_KEY });

const result = await GenAi.models.generateContent({
  model: "gemini-2.5-flash",
  contents: [
    {
      role: "user",
      parts: imageParts, // image parts + text prompt appended
    },
  ],
});
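The source of buildImageParts is not shown here, so the following is only a sketch of what such a conversion might look like. It assumes the Gemini part shapes `{ inlineData: { mimeType, data } }` for images and `{ text }` for the prompt; `buildImagePartsSketch` and the local `GeminiPart` type are hypothetical stand-ins, not the helper from generative.controller.ts.

```typescript
interface MediaItem {
  mimeType: string; // e.g. "image/jpeg"
  data: string;     // Base64-encoded image bytes
}

// Local stand-in for the SDK's Part type, limited to the two variants used here.
type GeminiPart =
  | { inlineData: { mimeType: string; data: string } }
  | { text: string };

// Hypothetical sketch: one inline-data part per image, with the text prompt last.
function buildImagePartsSketch(media: MediaItem[], promptText: string): GeminiPart[] {
  const parts: GeminiPart[] = [];
  for (const item of media) {
    parts.push({ inlineData: { mimeType: item.mimeType, data: item.data } });
  }
  parts.push({ text: promptText });
  return parts;
}
```

The returned array corresponds to the `parts` value in the call above, where the image parts come first and the text prompt is appended.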

Using generated captions in a post

After generation, the caption text is populated into the content.text field of the post editor. You can:

Accept it as-is and publish or schedule the post.
Edit the text before saving.
Run generation again to get a different result.
Copy the caption to a specific platform’s platformSpecificContent field if you want it to apply only to one platform while keeping a different caption for others.

Instagram-specific tips

Instagram captions support up to 2,200 characters. The generic generator targets 2–3 lines of body text followed by a block of 8 hashtags. For best results, attach a high-resolution image so Gemini can match the visual context.

Bluesky-specific tips

Bluesky posts are capped at 300 characters including hashtags. Use POST /api/generate/captions/bluesky so that the platform-specific prompt instructs the model to stay within the limit. Because the limit is applied through the prompt, check the length of the result before publishing.

Mastodon-specific tips

Mastodon instances have configurable character limits (typically 500). The platform-specific generator targets this range. Content-warning fields and content sensitivity are not currently set automatically.

Tumblr-specific tips

Tumblr supports rich long-form text. The platform-specific prompt allows the model to produce more expansive copy suited to blog-style posts.
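The character limits quoted in the tips above can be checked on the client before a caption is applied. This is a minimal sketch: `CAPTION_LIMITS` and `checkCaptionLength` are hypothetical names, the Mastodon value is only the typical default mentioned above, and platforms with no limit stated in this guide are treated as unlimited.

```typescript
// Limits quoted in this guide; Mastodon's value varies per instance.
const CAPTION_LIMITS: Record<string, number | undefined> = {
  instagram: 2200,
  bluesky: 300,
  mastodon: 500,
};

interface LengthCheck {
  limit: number | null; // null when this guide states no limit
  length: number;
  fits: boolean;
}

// Hypothetical helper. String.length counts UTF-16 code units, which is never
// smaller than the number of visible characters, so this check is conservative.
function checkCaptionLength(platform: string, caption: string): LengthCheck {
  const limit = CAPTION_LIMITS[platform];
  if (limit === undefined) {
    return { limit: null, length: caption.length, fits: true };
  }
  return { limit: limit, length: caption.length, fits: caption.length <= limit };
}
```

When `fits` is false, regenerating the caption or trimming the hashtag block are the obvious options.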

API reference

| Method | Endpoint | Description |
| --- | --- | --- |
| `POST` | `/api/generate/captions` | Generate a generic caption from a prompt and images. |
| `POST` | `/api/generate/captions/:platform` | Generate a caption tailored to the specified platform. |

Both endpoints require a valid Authorization: Bearer <token> header and a JSON body with optional prompt (string) and media (array of base64 image objects).
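The body shape described above can be expressed as a TypeScript type with a runtime guard. This is a sketch only: `CaptionRequestBody` and `isCaptionRequestBody` are hypothetical names, and treating `media` as optional alongside `prompt` is an assumption about how the sentence above should be read.

```typescript
interface MediaItem {
  mimeType: string; // e.g. "image/jpeg"
  data: string;     // Base64-encoded image bytes
}

interface CaptionRequestBody {
  prompt?: string;
  media?: MediaItem[]; // assumed optional; see the note above
}

// Hypothetical runtime guard for untrusted JSON.
function isCaptionRequestBody(value: unknown): value is CaptionRequestBody {
  if (typeof value !== "object" || value === null) return false;
  const body = value as { prompt?: unknown; media?: unknown };
  if (body.prompt !== undefined && typeof body.prompt !== "string") return false;
  if (body.media === undefined) return true;
  if (!Array.isArray(body.media)) return false;
  // Every media entry must carry string mimeType and data fields.
  return body.media.every((item: unknown) => {
    if (typeof item !== "object" || item === null) return false;
    const entry = item as { mimeType?: unknown; data?: unknown };
    return typeof entry.mimeType === "string" && typeof entry.data === "string";
  });
}
```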
