Configuring OpenRouter AI keys and model fallback

Skillara AI connects to AI language models via OpenRouter using an OpenAI-compatible API client. The backend supports up to five API keys and an arbitrary number of models, automatically cycling through every key–model combination in order until one succeeds. This means a single rate-limited key or an overloaded model will not block CV generation — the backend silently falls back to the next available option.

Getting an OpenRouter key

Sign up for a free account at openrouter.ai, then navigate to Keys in your dashboard and click Create Key. Copy the key — it starts with sk-or-v1-. Free-tier models available on OpenRouter include Llama 3.1 8B Instruct, DeepSeek R1, and Gemini 2.0 Flash Experimental, all of which are configured as defaults in Skillara AI.

Setting keys in your environment

Add up to five API keys to backend/.env. At least one key is required for CV generation to work. The DEV1_NAME through DEV5_NAME variables are optional and cosmetic — they appear in the server console so you can tell at a glance which key a request used:

OPENROUTER_API_KEY_1="sk-or-v1-..."
OPENROUTER_API_KEY_2="sk-or-v1-..."
OPENROUTER_API_KEY_3="sk-or-v1-..."
OPENROUTER_API_KEY_4="sk-or-v1-..."
OPENROUTER_API_KEY_5="sk-or-v1-..."

# Optional: display names shown in server logs
DEV1_NAME="Alice"
DEV2_NAME="Bob"
DEV3_NAME="Carol"
DEV4_NAME="Dave"
DEV5_NAME="Eve"

Configuring models

Set AI_MODELS to a comma-separated list of OpenRouter model IDs. If the variable is not set, Skillara AI uses this default:

AI_MODELS="meta-llama/llama-3.1-8b-instruct:free,deepseek/deepseek-r1-0528:free,google/gemini-2.0-flash-exp:free"

You can mix free and paid models in the same list. The backend tries them left-to-right and stops at the first success.
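
As a rough illustration of how a comma-separated AI_MODELS value could be turned into an ordered list, here is a minimal sketch. The parseModels helper is hypothetical, and trimming whitespace and treating a blank value as unset are assumptions; the actual parsing code in Skillara AI is not shown in this guide.

```typescript
// Hypothetical helper, not the actual Skillara AI implementation.
const DEFAULT_MODELS =
  "meta-llama/llama-3.1-8b-instruct:free,deepseek/deepseek-r1-0528:free,google/gemini-2.0-flash-exp:free";

function parseModels(raw: string | undefined): string[] {
  // Assumption: an unset or blank variable falls back to the documented default.
  const source = raw !== undefined && raw.trim() !== "" ? raw : DEFAULT_MODELS;
  return source
    .split(",")
    .map((id) => id.trim()) // tolerate spaces around commas (assumption)
    .filter((id) => id.length > 0); // ignore empty entries such as a trailing comma
}

// Left-to-right order is preserved, so the first entry is tried first.
const customModels = parseModels(" openai/gpt-4o-mini , meta-llama/llama-3.1-8b-instruct:free ,");
const defaultModels = parseModels(undefined);
```

Here customModels holds the two IDs in the order given, and defaultModels holds the three default IDs.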

How the fallback works

The callWithFallback function in backend/src/services/claude.service.ts implements the multi-key, multi-model retry loop. Here is the exact sequence of events for every CV generation or field suggestion request:

Build the combination list

The backend loads all non-empty API keys from the environment and all model IDs from AI_MODELS. Keys are tried in the order they appear in the API_KEYS array in claude.service.ts, which is not numeric order: KEY_2 → KEY_3 → KEY_4 → KEY_5 → KEY_1, so OPENROUTER_API_KEY_1 is tried last. For each key, every model in AI_MODELS is tried left-to-right, which yields an ordered list of (key, model) pairs to attempt.
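
The ordering described in this step can be sketched as follows. This is an illustration only: buildCombos, Combo, and KEY_ORDER are hypothetical names, and passing the environment in as a plain object is a simplification. Only the key order and the key-then-model nesting come from the description above.

```typescript
type Env = Record<string, string | undefined>;
type Combo = { keyLabel: string; apiKey: string; model: string };

// Documented try order: KEY_2, KEY_3, KEY_4, KEY_5, then KEY_1.
const KEY_ORDER = [2, 3, 4, 5, 1];

// Hypothetical helper, not the actual Skillara AI implementation.
function buildCombos(env: Env, models: string[]): Combo[] {
  const combos: Combo[] = [];
  for (const n of KEY_ORDER) {
    const apiKey = (env[`OPENROUTER_API_KEY_${n}`] ?? "").trim();
    if (apiKey === "") continue; // skip unset or empty keys
    // DEVn_NAME is optional; fall back to a generic label (assumption).
    const keyLabel = env[`DEV${n}_NAME`] ?? `key ${n}`;
    // Outer loop over keys, inner loop over models, both in order.
    for (const model of models) combos.push({ keyLabel, apiKey, model });
  }
  return combos;
}

const combos = buildCombos(
  { OPENROUTER_API_KEY_1: "sk-or-v1-aaa", OPENROUTER_API_KEY_3: "sk-or-v1-ccc", DEV3_NAME: "Carol" },
  ["model-a", "model-b"],
);
// Order: (Carol, model-a), (Carol, model-b), (key 1, model-a), (key 1, model-b)
```

With only keys 1 and 3 set, key 3 is exhausted across both models before key 1 is touched.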

Create an OpenAI-compatible client

For each combination, a new OpenAI client instance is created with baseURL set to https://openrouter.ai/api/v1 and apiKey set to the current key. This means no custom HTTP client is needed — the standard OpenAI Node.js SDK is used with OpenRouter as the endpoint.

Send the request

The backend calls chat.completions.create with the current model and a max_tokens cap of 4000. The messages array contains the full CV generation or edit prompt.
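
To make the client setup and the request concrete, the sketch below builds the two plain objects involved, without importing the SDK. The helper names are hypothetical, and sending the prompt as a single user message is an assumption; the guide only says that the messages array contains the full prompt.

```typescript
type ChatMessage = { role: "system" | "user" | "assistant"; content: string };

// Options that would be passed to `new OpenAI(...)` from the `openai` package.
// Hypothetical helper; only baseURL and apiKey are taken from this guide.
function clientOptions(apiKey: string) {
  return { baseURL: "https://openrouter.ai/api/v1", apiKey };
}

// Body that would be passed to `client.chat.completions.create(...)`.
// Assumption: the whole prompt is sent as one user message.
function completionRequest(model: string, prompt: string) {
  const messages: ChatMessage[] = [{ role: "user", content: prompt }];
  return { model, max_tokens: 4000, messages };
}

const options = clientOptions("sk-or-v1-example");
const request = completionRequest("meta-llama/llama-3.1-8b-instruct:free", "Generate a CV for ...");
```

Because a fresh options object is built per combination, switching keys only means constructing another client with a different apiKey.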

Check for error responses

OpenRouter sometimes returns an HTTP 200 with an error object in the body instead of raising an HTTP error, so the backend inspects the raw response for an error field. If the error code is 429, 402, or 503 (rate limit, insufficient credits, or service unavailable), the combination is skipped and the next one is tried. Any other error in the body also causes a skip, so a response that carries an error field is never returned to the caller.
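
A check along these lines might look like the sketch below. The checkForError function and the ErrorCheck type are hypothetical, and the body shape { error: { code, message } } is an assumption about what the raw response looks like; the actual check in claude.service.ts may differ.

```typescript
type ErrorCheck = { ok: true } | { ok: false; reason: string };

// The three codes named in this guide, with their meanings.
const KNOWN_CODES: Record<number, string> = {
  429: "rate limit",
  402: "insufficient credits",
  503: "service unavailable",
};

// Hypothetical helper, not the actual Skillara AI implementation.
function checkForError(raw: unknown): ErrorCheck {
  // Assumed shape: { error: { code, message } } inside an HTTP 200 body.
  const err = (raw as { error?: { code?: number | string; message?: string } } | null)?.error;
  if (err === undefined || err === null) return { ok: true };
  const label = KNOWN_CODES[Number(err.code)] ?? "other error";
  // Every error leads to a skip; the label only makes the log message clearer.
  return { ok: false, reason: `${label}: ${err.message ?? "no message"}` };
}

const clean = checkForError({ choices: [] });
const limited = checkForError({ error: { code: 429, message: "slow down" } });
const unknown = checkForError({ error: { message: "boom" } });
```

Here clean passes, while limited and unknown are both rejected and differ only in the reason string.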

Validate the response content

If no error is present, the backend checks that the returned content is at least 100 characters long. Anything shorter, including an empty string or a brief refusal from the model, is treated as a failure and the next combination is tried.
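
The length rule is small enough to express as a single predicate. In this sketch the name isUsableContent and the constant are hypothetical, and counting raw characters without trimming whitespace is an assumption.

```typescript
const MIN_CONTENT_LENGTH = 100;

// Hypothetical helper: true only for strings of at least 100 characters.
// Assumption: length is measured on the raw string, without trimming.
function isUsableContent(content: string | null | undefined): content is string {
  return typeof content === "string" && content.length >= MIN_CONTENT_LENGTH;
}

const refusal = "Sorry, I can't help with that.";
const fullCv = "<section>" + "x".repeat(200) + "</section>";
const refusalOk = isUsableContent(refusal);
const fullCvOk = isUsableContent(fullCv);
```

The short refusal is rejected and the longer document is accepted, which is the distinction this step is meant to draw.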

Return the result or throw

The first combination that passes both checks has its content returned immediately — no further combinations are tried. If every key–model pair fails, the backend throws a 500 error that includes the full list of failure messages so you can diagnose which keys and models were attempted.
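
Putting the steps together, the overall loop can be sketched as below. This is not the code from claude.service.ts: callWithFallbackSketch, Sender, and RawResponse are hypothetical names, the network call is injected as a function so the example runs without the SDK, and treating a thrown exception like any other failure is an assumption.

```typescript
type Combo = { apiKey: string; model: string };
type RawResponse = {
  error?: { code?: number; message?: string };
  choices?: { message?: { content?: string | null } }[];
};
// Stand-in for "create a client and call chat.completions.create".
type Sender = (combo: Combo) => Promise<RawResponse>;

async function callWithFallbackSketch(combos: Combo[], send: Sender): Promise<string> {
  const failures: string[] = [];
  for (const combo of combos) {
    try {
      const raw = await send(combo);
      // Check 1: an error object in the body means skip.
      if (raw.error) {
        failures.push(`${combo.model}: ${raw.error.code ?? "?"} ${raw.error.message ?? ""}`.trim());
        continue;
      }
      // Check 2: content must be at least 100 characters long.
      const content = raw.choices?.[0]?.message?.content ?? "";
      if (content.length < 100) {
        failures.push(`${combo.model}: response too short (${content.length} chars)`);
        continue;
      }
      return content; // first success wins; later combinations are never tried
    } catch (e) {
      // Assumption: thrown HTTP or network errors are also recorded and skipped.
      failures.push(`${combo.model}: ${e instanceof Error ? e.message : String(e)}`);
    }
  }
  // Every pair failed: surface a 500 carrying all failure messages.
  throw Object.assign(new Error(`All key-model combinations failed:\n${failures.join("\n")}`), {
    status: 500,
    failures,
  });
}
```

Injecting the sender keeps the retry logic separate from the HTTP layer, so the loop can be exercised with a fake that returns canned responses.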

Recommended free models

These three models are the current defaults and work well for CV generation:

Model ID	Provider	Notes
`meta-llama/llama-3.1-8b-instruct:free`	Meta	Fast responses, good instruction following
`deepseek/deepseek-r1-0528:free`	DeepSeek	Strong reasoning; produces well-structured HTML
`google/gemini-2.0-flash-exp:free`	Google	Low latency; good at following detailed style guides

You can browse all available models at openrouter.ai/models and add any model ID to your AI_MODELS list.

Adding multiple API keys from different OpenRouter accounts significantly increases throughput. Because the fallback loop tries every key before giving up, spreading load across several free-tier accounts makes it much less likely that all keys are rate-limited at the same time.

Free-tier models on OpenRouter have daily request limits that reset at midnight UTC. For production deployments with consistent traffic, add paid models to your AI_MODELS list and add credits to your OpenRouter account to avoid hitting limits during peak usage.
