Hayon integrates Google Gemini (gemini-2.5-flash) to generate social media captions from your images and optional intent text. You can generate a generic caption for all platforms or request a caption tuned to a specific platform’s style and constraints.

How it works

1. Open the AI caption panel

In the post editor, click Generate with AI. The panel appears alongside the caption field.

2. Provide context (optional)

Enter a short prompt describing your intent, for example “promoting a weekend sale” or “behind the scenes at our studio”. If you leave this blank, Gemini analyses the image alone.

3. Attach images

Upload one or more images. The media array is passed to Gemini as image parts, enabling the model to match the caption to the visual content.

4. Choose a generation mode

  • Generic (POST /api/generate/captions): produces a caption suitable for any platform.
  • Platform-specific (POST /api/generate/captions/:platform): tailors the output to the character limits, tone, and conventions of the chosen platform.

5. Apply or edit the caption

Review the generated text. Click Use this caption to populate the post editor, or edit the result inline before applying.

Generic caption generation

Call POST /api/generate/captions with your prompt and media.
POST /api/generate/captions
curl -X POST https://api.yourhayon.com/api/generate/captions \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Showcasing our new product launch",
    "media": [
      {
        "mimeType": "image/jpeg",
        "data": "<base64-encoded-image>"
      }
    ]
  }'
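The same request can be issued from TypeScript. The sketch below is illustrative and not taken from the Hayon codebase: `toBase64`, `buildCaptionRequest`, and `generateCaption` are hypothetical helper names, and the base URL is the one used in the curl example. It assumes a runtime with global `fetch` and `btoa` (for example Node 18+ or a browser).

```typescript
// Shape of one media item, as documented for the request body.
interface MediaItem {
  mimeType: string; // e.g. "image/jpeg"
  data: string; // Base64-encoded image bytes
}

interface CaptionRequest {
  prompt?: string;
  media: MediaItem[];
}

// Hypothetical helper: Base64-encode raw image bytes.
function toBase64(bytes: Uint8Array): string {
  let binary = "";
  for (let i = 0; i < bytes.length; i++) {
    binary += String.fromCharCode(bytes[i]);
  }
  return btoa(binary);
}

// Hypothetical helper: build the JSON body. The prompt is optional,
// so it is left out entirely when blank.
function buildCaptionRequest(
  images: { mimeType: string; bytes: Uint8Array }[],
  prompt?: string
): CaptionRequest {
  const body: CaptionRequest = {
    media: images.map((img) => ({
      mimeType: img.mimeType,
      data: toBase64(img.bytes),
    })),
  };
  if (prompt && prompt.trim() !== "") {
    body.prompt = prompt.trim();
  }
  return body;
}

// Hypothetical helper: send the body to the generic caption endpoint.
async function generateCaption(
  token: string,
  body: CaptionRequest
): Promise<unknown> {
  const res = await fetch("https://api.yourhayon.com/api/generate/captions", {
    method: "POST",
    headers: {
      Authorization: `Bearer ${token}`,
      "Content-Type": "application/json",
    },
    body: JSON.stringify(body),
  });
  if (!res.ok) {
    throw new Error(`Caption request failed with status ${res.status}`);
  }
  return res.json();
}
```

Keeping the body construction separate from the network call makes the encoding step easy to check without a token or a live server.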
The model uses the following built-in system prompt:
Gemini system prompt (generative.controller.ts)
You are a professional social media content creator.

Generate:
- One engaging Instagram caption (2–3 short lines max).
- Use the user intent if provided.
- Match the vibe of the image.
- Sound natural and human.

Then generate:
- 5 relevant niche hashtags based on the image.
- 3 general Instagram hashtags.

Rules:
- No emojis.
- No markdown.
- No explanations.
- Do NOT include labels like "Caption:" or "Hashtags:".
- Return plain text only.

Output format:
<caption text>

#tag1 #tag2 #tag3 #tag4 #tag5 #tag6 #tag7 #tag8
Successful response:
200 response
{
  "data": {
    "candidates": [
      {
        "content": {
          "parts": [
            {
              "text": "Behind every great product is a story worth telling. Here's ours.\n\n#productlaunch #newrelease #innovation #startup #buildinpublic #marketing #launch #brand"
            }
          ]
        }
      }
    ]
  }
}
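The caption and hashtags arrive as one plain-text string under `data.candidates[0].content.parts[0].text`. The following is a minimal sketch for splitting that string, assuming the model followed the output format requested in the system prompt (caption text, a blank line, then a line of hashtags). `parseCaption` is a hypothetical helper name, and because model output is not guaranteed to match the format, the function treats any line made up entirely of `#` tokens as hashtags and everything else as caption text.

```typescript
// Response shape, limited to the fields shown in the example response.
interface CaptionResponse {
  data: {
    candidates: { content: { parts: { text?: string }[] } }[];
  };
}

interface ParsedCaption {
  caption: string;
  hashtags: string[];
}

// Hypothetical helper: separate caption text from hashtag lines.
function parseCaption(response: CaptionResponse): ParsedCaption {
  const candidate = response.data.candidates[0];
  const text = candidate
    ? candidate.content.parts.map((p) => p.text || "").join("")
    : "";

  const captionLines: string[] = [];
  const hashtags: string[] = [];

  for (const line of text.split("\n")) {
    const tokens = line.trim().split(/\s+/).filter((t) => t.length > 0);
    // A hashtag line is a non-empty line where every token starts with "#".
    const isHashtagLine =
      tokens.length > 0 && tokens.every((t) => t.charAt(0) === "#");
    if (isHashtagLine) {
      hashtags.push(...tokens);
    } else {
      captionLines.push(line);
    }
  }

  return { caption: captionLines.join("\n").trim(), hashtags };
}
```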

Platform-specific caption generation

Call POST /api/generate/captions/:platform to generate a caption optimised for a particular platform. The buildPlatformPrompt function constructs a prompt that incorporates the platform’s character limits, style guidelines, and best practices.
POST /api/generate/captions/bluesky
curl -X POST https://api.yourhayon.com/api/generate/captions/bluesky \
  -H "Authorization: Bearer <token>" \
  -H "Content-Type: application/json" \
  -d '{
    "prompt": "Nature walk in the mountains",
    "media": [
      {
        "mimeType": "image/jpeg",
        "data": "<base64-encoded-image>"
      }
    ]
  }'
Supported platform values: bluesky · threads · instagram · facebook · mastodon · tumblr
Use platform-specific generation when your platforms have very different audiences or constraints. For example, Bluesky has a 300-character limit while Tumblr supports long-form content.
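A client can validate the platform value before calling the endpoint. The sketch below is an assumption about how a caller might do this, not Hayon's own code: `captionEndpoint` and `isPlatform` are hypothetical names, the platform list is copied from the supported values above, and lowercasing the input is a convenience of this example rather than documented server behaviour.

```typescript
// Supported platform values, as listed in this document.
const SUPPORTED_PLATFORMS = [
  "bluesky",
  "threads",
  "instagram",
  "facebook",
  "mastodon",
  "tumblr",
] as const;

type Platform = (typeof SUPPORTED_PLATFORMS)[number];

// Type guard: narrows an arbitrary string to a supported platform.
function isPlatform(value: string): value is Platform {
  return (SUPPORTED_PLATFORMS as readonly string[]).indexOf(value) !== -1;
}

// Hypothetical helper: pick the generic or platform-specific path.
function captionEndpoint(platform?: string): string {
  const base = "/api/generate/captions";
  if (platform === undefined) {
    return base; // generic caption
  }
  const normalized = platform.trim().toLowerCase();
  if (!isPlatform(normalized)) {
    throw new Error(`Unsupported platform: ${platform}`);
  }
  return `${base}/${normalized}`;
}
```

Calling `captionEndpoint()` with no argument yields the generic path, while an unknown value fails before any request is sent.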

Rate limits and quotas

Two limits apply to AI caption generation:

Per-user plan limit

Hayon tracks usage in user.usage.captionGenerations and compares it against user.limits.maxCaptionGenerations. Exceeding the limit returns a 429 response with a plan upgrade prompt.

API rate limit

A sliding window rate limiter (ai_caption) allows 50 requests per hour per user across both caption endpoints, enforced before the plan limit check.
If the Gemini API itself returns a 429 (quota exhausted), Hayon surfaces the message: “AI service limit reached. Please try again after some time or upgrade your plan.” This is separate from the per-user plan limit.
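To illustrate the sliding-window technique, here is a minimal in-memory sketch. It is not Hayon's limiter, whose implementation is not shown in this document (a production limiter would typically use a shared store rather than process memory). The `SlidingWindowLimiter` class and its method names are hypothetical; only the 50-requests-per-hour figure comes from the text above.

```typescript
// In-memory sliding-window limiter: keeps a log of request timestamps
// per user and counts only those inside the current window.
class SlidingWindowLimiter {
  private readonly hits = new Map<string, number[]>();

  constructor(
    private readonly maxRequests: number,
    private readonly windowMs: number
  ) {}

  // Returns true and records the request if the user is under the limit;
  // returns false (without recording) if the limit is already reached.
  tryAcquire(userId: string, now: number = Date.now()): boolean {
    const cutoff = now - this.windowMs;
    // Drop timestamps that have slid out of the window.
    const recent = (this.hits.get(userId) || []).filter((t) => t > cutoff);
    if (recent.length >= this.maxRequests) {
      this.hits.set(userId, recent);
      return false;
    }
    recent.push(now);
    this.hits.set(userId, recent);
    return true;
  }
}

// 50 requests per hour per user, matching the documented ai_caption limit.
const aiCaptionLimiter = new SlidingWindowLimiter(50, 60 * 60 * 1000);
```

Unlike a fixed hourly counter, the window moves with each request, so a burst at the end of one hour still counts against the start of the next.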

Sending media to Gemini

The buildImageParts helper converts the media array from the request body into Gemini-compatible Part objects. Each entry should contain:
media array item shape
{
  mimeType: string; // e.g. "image/jpeg"
  data: string;     // Base64-encoded image bytes
}
These are passed alongside the text prompt as a multi-part message to GenAi.models.generateContent() using model gemini-2.5-flash.
Caption generation call (generative.controller.ts)
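import { GoogleGenAI } from "@google/genai";

// ENV is the application's configuration object, defined elsewhere in the codebase.
const GenAi = new GoogleGenAI({ apiKey: ENV.GEMINI.API_KEY });

// imageParts comes from buildImageParts: the image parts with the text prompt appended.
const result = await GenAi.models.generateContent({
  model: "gemini-2.5-flash",
  contents: [
    {
      role: "user",
      parts: imageParts,
    },
  ],
});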
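The body of buildImageParts is not reproduced in this document. The sketch below shows one way such a conversion could look, under stated assumptions: `buildPartsSketch` is a hypothetical name, the `inlineData` part shape follows the Gemini API's format for inline Base64 media, and skipping entries whose MIME type is not an image is a choice made for this example rather than confirmed Hayon behaviour.

```typescript
// One entry of the request's media array.
interface MediaItem {
  mimeType: string; // e.g. "image/jpeg"
  data: string; // Base64-encoded image bytes
}

// Minimal local model of the two part kinds used here.
type Part =
  | { inlineData: { mimeType: string; data: string } }
  | { text: string };

// Hypothetical sketch: convert media items to inline-data parts,
// then append the text prompt as the final part.
function buildPartsSketch(media: MediaItem[], promptText: string): Part[] {
  const parts: Part[] = media
    // Assumption: ignore entries that are empty or not images.
    .filter((m) => m.mimeType.indexOf("image/") === 0 && m.data.length > 0)
    .map((m) => ({ inlineData: { mimeType: m.mimeType, data: m.data } }));
  parts.push({ text: promptText });
  return parts;
}
```

The returned array has the same role as `imageParts` in the call above: every image comes first and the text prompt is the last part of the single user message.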

Using generated captions in a post

After generation, the caption text is populated into the content.text field of the post editor. You can:
  • Accept it as-is and publish or schedule the post.
  • Edit the text before saving.
  • Run generation again to get a different result.
  • Copy the caption to a specific platform’s platformSpecificContent field if you want it to apply only to one platform while keeping a different caption for others.
Instagram captions support up to 2,200 characters. The generic generator targets 2–3 lines of body text followed by a block of 8 hashtags. For best results, attach a high-resolution image so Gemini can match the visual context.
Bluesky posts are capped at 300 characters including hashtags. Use POST /api/generate/captions/bluesky to ensure the output fits the constraint. The platform-specific prompt instructs the model to stay within the limit.
Mastodon instances have configurable character limits (typically 500). The platform-specific generator targets this range. Content-warning fields and content sensitivity are not currently set automatically.
Tumblr supports rich long-form text. The platform-specific prompt allows the model to produce more expansive copy suited to blog-style posts.
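Because the platform-specific prompt only instructs the model to respect a limit, a client may want to verify the length before publishing. The following sketch uses the limits quoted above (Bluesky 300, Mastodon typically 500, Instagram 2,200); `CAPTION_LIMITS` and `exceedsLimit` are hypothetical names, and counting Unicode code points is an approximation, since each platform applies its own counting rules.

```typescript
// Character limits quoted in this document. Platforms without a
// stated limit here (for example Tumblr) are not checked.
const CAPTION_LIMITS: Record<string, number> = {
  bluesky: 300,
  mastodon: 500, // typical default; instances can configure this
  instagram: 2200,
};

// Hypothetical helper: true when the caption (including hashtags)
// is longer than the platform's limit.
function exceedsLimit(caption: string, platform: string): boolean {
  const limit = CAPTION_LIMITS[platform];
  if (limit === undefined) {
    return false; // no known limit for this platform
  }
  // Array.from splits by code point, so an emoji outside the BMP counts once.
  return Array.from(caption).length > limit;
}
```

If the check fails, running generation again or trimming hashtags by hand are both reasonable fallbacks.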

API reference

Method  Endpoint                          Description
POST    /api/generate/captions            Generate a generic caption from a prompt and images.
POST    /api/generate/captions/:platform  Generate a caption tailored to the specified platform.
Both endpoints require a valid Authorization: Bearer <token> header and a JSON body with optional prompt (string) and media (array of base64 image objects).
