Required Variables
These variables must be set for the service to start.

GEMINI_API_KEY

Google Gemini API key for AI-powered wrapper generation.

Usage: GEMINI_API_KEY=<your-api-key>

How to obtain:
- Visit Google AI Studio
- Sign in with your Google account
- Create a new API key
- Copy the key to your .env file

Security notes:
- Never commit this key to version control
- Rotate keys periodically
- Use different keys for development and production
- Monitor usage in Google Cloud Console
CORS and Security
Comma-separated list of allowed origins for Cross-Origin Resource Sharing (CORS).

Default: localhost

Implementation: The service splits this value by commas and configures FastAPI CORS middleware.

Production recommendations:
- Specify exact origins (no wildcards)
- Use HTTPS origins only
- Limit to necessary domains
- Validate origins match your frontend deployments
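The splitting step above can be sketched as follows; the variable name `ALLOWED_ORIGINS` is an assumption, and the middleware registration is shown only in comments:

```python
import os

def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origins string into a clean list."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# The service would then register the middleware, roughly:
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware,
#                      allow_origins=parse_origins(os.environ["ALLOWED_ORIGINS"]))

print(parse_origins("https://app.example.com, https://admin.example.com"))
```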
Database Configuration
MONGO_URI

MongoDB connection URI including authentication, host, port, database name, and options.

Default: mongodb://localhost:27017

Format: mongodb://[username:password@]host[:port][/database][?options]

Common options:
- retryWrites=true - Retry write operations on failure
- w=majority - Wait for majority of replica set to acknowledge writes
- maxPoolSize=50 - Maximum connection pool size
- authSource=admin - Authentication database
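To illustrate the URI format, a small helper (illustrative, not part of the service) can assemble a connection string with options:

```python
from urllib.parse import urlencode

def build_mongo_uri(host: str = "localhost", port: int = 27017,
                    database: str = "", **options) -> str:
    """Assemble a MongoDB URI such as mongodb://host:port/db?retryWrites=true."""
    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{host}:{port}/{database}{query}"

print(build_mongo_uri(database="mydb", retryWrites="true", w="majority"))
```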
Message Queue Configuration
Primary RabbitMQ Connection
RABBITMQ_URL

RabbitMQ connection URL for internal service messaging.

Default: amqp://guest:guest@rabbitmq/

Format: amqp://username:password@host:port/vhost

Default credentials:
- Username: guest
- Password: guest
- Virtual host: / (default)
- Port: 5672 (AMQP), 5671 (AMQPS)
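As a sketch of the URL format (not the service's code), the fields and their defaults can be extracted with the standard library:

```python
from urllib.parse import urlparse, unquote

def parse_amqp_url(url: str) -> dict:
    """Extract credentials, host, port, and vhost from an AMQP URL."""
    parts = urlparse(url)
    return {
        "username": unquote(parts.username or "guest"),
        "password": unquote(parts.password or "guest"),
        "host": parts.hostname or "localhost",
        "port": parts.port or 5672,
        "vhost": unquote(parts.path[1:]) or "/",  # empty path means the default vhost "/"
    }

print(parse_amqp_url("amqp://guest:guest@rabbitmq/"))
```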
Queue Names
Queue name for publishing resource creation and update events.

Default: resource_data

Consumers: Other microservices subscribe to this queue to react to resource changes.

Queue name for publishing resource deletion events.

Default: resource_deleted

Queue name for consuming data collected by wrapper processes.

Default: collected_data

Behavior: The service consumes messages from this queue and processes collected data.

Data Service Integration
RabbitMQ connection URL for the external data service where collected data is published.

Default: amqp://user:password@data-mq:5672/

Purpose: Wrappers publish collected data to this separate broker, allowing the data service to be deployed independently. This can point to the same RabbitMQ instance as RABBITMQ_URL or a completely separate broker for distributed deployments.

Queue name on the data service broker where wrappers publish collected data.

Default: data_queue

Implementation: Generated wrapper code includes this queue name for publishing.

Queue name for receiving wrapper creation requests.

Default: wrapper_creation_queue

Data Collection Settings
CHUNK_SIZE_THRESHOLD

Maximum number of records to include in a single message chunk. Data exceeding this threshold is split into multiple chunks.

Default: 1000

Behavior: When a wrapper collects data:
- If records ≤ threshold: Single message
- If records > threshold: Split into multiple messages

Considerations:
- Smaller values: More messages, less memory per message, higher overhead
- Larger values: Fewer messages, more memory per message, lower overhead
- RabbitMQ limits: Default max message size is 128MB
- Network: Smaller chunks better for unreliable networks

Recommended values:
- Development: 1000
- Production (high bandwidth): 5000
- Production (limited bandwidth): 500
- Large datasets: 10000
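The chunking behavior described above can be sketched as follows (illustrative, not the service's actual implementation):

```python
def chunk_records(records: list, threshold: int = 1000) -> list[list]:
    """Split collected records into chunks of at most `threshold` items each."""
    if len(records) <= threshold:
        return [records]  # fits in a single message
    return [records[i:i + threshold] for i in range(0, len(records), threshold)]

chunks = chunk_records(list(range(2500)), threshold=1000)
print([len(c) for c in chunks])  # three chunks: 1000, 1000, 500
```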
AI Model Configuration
Google Gemini model to use for wrapper code generation.

Default: gemini-1.5-flash

Model comparison:

| Model | Speed | Quality | Cost | Use Case |
|---|---|---|---|---|
| gemini-1.5-flash | Fast | Good | Low | Development, high-volume |
| gemini-1.5-pro | Moderate | Excellent | Medium | Production, complex wrappers |
| gemini-1.0-pro | Moderate | Good | Low | Legacy support |

Recommendations:
- Development: gemini-1.5-flash for fast iteration
- Production: gemini-1.5-pro for best quality
- High volume: gemini-1.5-flash to minimize costs
Model availability and pricing may vary by region. Check Google AI pricing for current rates.
Debug and Development
WRAPPER_GENERATION_DEBUG_MODE

Enable verbose logging for wrapper generation and execution.

Default: false

Accepted values:
- True: true, True, TRUE, 1, yes, Yes, YES
- False: false, False, FALSE, 0, no, No, NO

When enabled, logs include:
- Complete generated wrapper code
- AI model prompts and responses
- Detailed execution traces
- Wrapper process stdout/stderr
- Data collection progress
- Error stack traces with full context

Production: Always set to false in production to:
- Reduce log storage costs
- Improve performance
- Prevent sensitive data exposure
- Reduce noise in monitoring systems
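The accepted spellings above suggest case-insensitive boolean parsing along these lines (a sketch, not the service's implementation):

```python
def parse_bool(value: str) -> bool:
    """Map the accepted spellings (true/1/yes vs. false/0/no, any case) to a bool."""
    return value.strip().lower() in {"true", "1", "yes"}

print(parse_bool("TRUE"), parse_bool("no"))  # True False
```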
Environment-Specific Examples
Local Development
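A plausible .env for local development, using only the variables documented above (the API key is a placeholder; queue-name and model variables are omitted because their names are not shown in this document):

```bash
GEMINI_API_KEY=<your-api-key>
MONGO_URI=mongodb://localhost:27017
RABBITMQ_URL=amqp://guest:guest@localhost:5672/
CHUNK_SIZE_THRESHOLD=1000
WRAPPER_GENERATION_DEBUG_MODE=true
```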
Docker Compose Development
Production
Minimal Configuration
Docker Compose Usage
Environment variables can be set in docker-compose.yml or loaded from a .env file:
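For example (the service name and the mongo/rabbitmq hostnames below are illustrative, assuming containers on the same Compose network):

```yaml
services:
  wrapper-service:
    env_file: .env                # load everything from .env, or set values inline:
    environment:
      MONGO_URI: mongodb://mongo:27017
      RABBITMQ_URL: amqp://guest:guest@rabbitmq/
```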
Kubernetes Usage
Store sensitive values in Kubernetes Secrets.

Validation and Troubleshooting
Missing Required Variables
Solution: Set the GEMINI_API_KEY environment variable.
Invalid Type
Solution: Ensure CHUNK_SIZE_THRESHOLD is a number without quotes.
Invalid Boolean
Solution: Use true/false, 1/0, or yes/no.
Connection Failures
Solution: Verify MONGO_URI is correct and MongoDB is running.
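A startup check along these lines (illustrative, not the service's code) catches the problems above before the service runs:

```python
import os

def validate_env() -> list[str]:
    """Return error messages for missing or malformed settings."""
    errors = []
    if not os.environ.get("GEMINI_API_KEY"):
        errors.append("Missing required variable: GEMINI_API_KEY")
    if not os.environ.get("CHUNK_SIZE_THRESHOLD", "1000").isdigit():
        errors.append("CHUNK_SIZE_THRESHOLD must be a number without quotes")
    return errors

for problem in validate_env():
    print(problem)
```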
Security Checklist
Protect API keys
- Never commit .env files to version control
- Add .env to .gitignore
- Use different keys for dev/staging/prod
- Rotate keys periodically
Secure credentials
- Use strong passwords (minimum 16 characters)
- Generate passwords with openssl rand -base64 32
- Store in secrets management (Vault, AWS Secrets Manager, etc.)
- Never use default credentials in production
Restrict CORS
- Specify exact origins (no wildcards)
- Use HTTPS origins only in production
- Validate origins match deployed frontends
Disable debug mode
- Set WRAPPER_GENERATION_DEBUG_MODE=false in production
- Review logs to ensure no sensitive data is logged
- Use structured logging for production