This page provides a comprehensive reference for all environment variables used by the Resource Service. Variables are organized by category with descriptions, defaults, and usage examples.

Required Variables

These variables must be set for the service to start:
GEMINI_API_KEY
string
required
Google Gemini API key for AI-powered wrapper generation.
Usage:
GEMINI_API_KEY=AIzaSyD...
How to obtain:
  1. Visit Google AI Studio
  2. Sign in with your Google account
  3. Create a new API key
  4. Copy the key to your .env file
Security:
  • Never commit this key to version control
  • Rotate keys periodically
  • Use different keys for development and production
  • Monitor usage in Google Cloud Console
The service will fail to start with a validation error if GEMINI_API_KEY is not provided.
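The startup check behaves roughly like the following sketch. This is a simplified stand-in for the service's actual Pydantic settings validation, shown only to illustrate the fail-fast behavior; the load_required helper is not part of the service.

```python
import os

def load_required(name: str) -> str:
    """Return the named environment variable, or fail fast at startup."""
    value = os.environ.get(name)
    if not value:
        raise RuntimeError(f"{name} is required but was not provided")
    return value

# At service startup (illustrative):
# gemini_api_key = load_required("GEMINI_API_KEY")
```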

CORS and Security

ORIGINS
string
default:"localhost"
Comma-separated list of allowed origins for Cross-Origin Resource Sharing (CORS).
Default: localhost
Usage:
# Single origin
ORIGINS=http://localhost:3000

# Multiple origins
ORIGINS=http://localhost:3000,http://localhost:5173,https://app.example.com

# Development setup
ORIGINS=http://localhost:3000,http://localhost:5173,http://localhost
Implementation: The service splits this value by commas and configures FastAPI CORS middleware:
from fastapi import FastAPI
from fastapi.middleware.cors import CORSMiddleware

app = FastAPI()
# `settings` is the service's Settings instance
origins = settings.ORIGINS.split(",")
app.add_middleware(
    CORSMiddleware,
    allow_origins=origins,
    allow_credentials=True,
    allow_methods=["*"],
    allow_headers=["*"],
)
Production recommendations:
  • Specify exact origins (no wildcards)
  • Use HTTPS origins only
  • Limit to necessary domains
  • Validate origins match your frontend deployments
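The recommendations above can be enforced with a small startup check. This is a hedged sketch; the parse_origins and validate_origins helpers are illustrative, not part of the service.

```python
def parse_origins(raw: str) -> list[str]:
    """Split the comma-separated ORIGINS value, dropping stray whitespace."""
    return [o.strip() for o in raw.split(",") if o.strip()]

def validate_origins(origins: list[str], production: bool) -> None:
    """In production, reject wildcards and non-HTTPS origins."""
    for origin in origins:
        if production and (origin == "*" or not origin.startswith("https://")):
            raise ValueError(f"Origin not allowed in production: {origin}")
```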

Database Configuration

MONGO_URI
string
default:"mongodb://localhost:27017"
MongoDB connection URI including authentication, host, port, database name, and options.
Default: mongodb://localhost:27017
Usage:
# Local development (no auth)
MONGO_URI=mongodb://localhost:27017/resources

# Docker Compose (container name)
MONGO_URI=mongodb://resource-mongo:27017/resources

# With authentication
MONGO_URI=mongodb://username:password@localhost:27017/resources

# MongoDB Atlas (cloud)
MONGO_URI=mongodb+srv://user:pass@cluster.mongodb.net/resources?retryWrites=true&w=majority

# Replica set
MONGO_URI=mongodb://mongo1:27017,mongo2:27017,mongo3:27017/resources?replicaSet=rs0
Format:
mongodb://[username:password@]host[:port][,host2[:port2],...]/database[?options]
Common options:
  • retryWrites=true - Retry write operations on failure
  • w=majority - Wait for majority of replica set to acknowledge writes
  • maxPoolSize=50 - Maximum connection pool size
  • authSource=admin - Authentication database
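The URI components can be inspected with the Python standard library, which is handy when debugging connection settings. This is a sketch for illustration; the service itself passes the URI straight to the MongoDB driver.

```python
from urllib.parse import parse_qs, urlsplit

uri = "mongodb://appuser:secret@localhost:27017/resources?retryWrites=true&w=majority"
parts = urlsplit(uri)

print(parts.hostname)          # host
print(parts.port)              # port
print(parts.username)          # username
print(parts.path.lstrip("/"))  # database name
print(parse_qs(parts.query))   # connection options
```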
Docker Compose:
environment:
  - MONGO_URI=mongodb://resource-mongo:27017/resources

Message Queue Configuration

Primary RabbitMQ Connection

RABBITMQ_URL
string
default:"amqp://guest:guest@rabbitmq/"
RabbitMQ connection URL for internal service messaging.
Default: amqp://guest:guest@rabbitmq/
Usage:
# Local development
RABBITMQ_URL=amqp://guest:guest@localhost/

# Docker Compose
RABBITMQ_URL=amqp://guest:guest@rabbitmq/

# With authentication
RABBITMQ_URL=amqp://username:password@rabbitmq.example.com:5672/

# With virtual host
RABBITMQ_URL=amqp://user:pass@rabbitmq:5672/prod

# With SSL/TLS
RABBITMQ_URL=amqps://user:pass@rabbitmq.example.com:5671/
Format:
amqp[s]://[username:password@]host[:port]/[vhost]
Default credentials:
  • Username: guest
  • Password: guest
  • Virtual host: / (default)
  • Port: 5672 (AMQP), 5671 (AMQPS)
Change default credentials in production. The guest user can only connect from localhost.
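Virtual-host handling is the easiest part of this format to get wrong: the path after the host is the vhost, and percent-encoded characters (such as %2F) must be decoded. The sketch below illustrates this; it treats an empty or bare "/" path as the default vhost, which matches common client behavior but is not the service's own code.

```python
from urllib.parse import unquote, urlsplit

def amqp_vhost(url: str) -> str:
    """Extract the virtual host from an AMQP URL; an empty path means the default '/'."""
    path = urlsplit(url).path
    if path in ("", "/"):
        return "/"
    return unquote(path[1:])  # strip the leading slash, decode %2F etc.
```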

Queue Names

RESOURCE_DATA_QUEUE
string
default:"resource_data"
Queue name for publishing resource creation and update events.
Default: resource_data
Usage:
RESOURCE_DATA_QUEUE=resource_data
Message format:
{
  "resource_id": "uuid",
  "resource_name": "My Resource",
  "wrapper_id": "uuid",
  "event_type": "created" | "updated"
}
Consumers: Other microservices subscribe to this queue to react to resource changes.
RESOURCE_DELETED_QUEUE
string
default:"resource_deleted"
Queue name for publishing resource deletion events.
Default: resource_deleted
Usage:
RESOURCE_DELETED_QUEUE=resource_deleted
Message format:
{
  "resource_id": "uuid",
  "resource_name": "My Resource",
  "deleted_at": "2026-03-03T12:00:00Z"
}
COLLECTED_DATA_QUEUE
string
default:"collected_data"
Queue name for consuming data collected by wrapper processes.
Default: collected_data
Usage:
COLLECTED_DATA_QUEUE=collected_data
Message format:
{
  "wrapper_id": "uuid",
  "resource_id": "uuid",
  "data": [...],
  "collected_at": "2026-03-03T12:00:00Z",
  "chunk_index": 0,
  "total_chunks": 1
}
Behavior: The service consumes messages from this queue and processes collected data.
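A consumer-side sanity check for the message shape above might look like the following sketch. The validate_collected_message helper is illustrative; the service's actual handling may differ.

```python
import json

REQUIRED_FIELDS = {"wrapper_id", "resource_id", "data",
                   "collected_at", "chunk_index", "total_chunks"}

def validate_collected_message(body: bytes) -> dict:
    """Decode a collected_data message and check the documented fields are present."""
    message = json.loads(body)
    missing = REQUIRED_FIELDS - message.keys()
    if missing:
        raise ValueError(f"Malformed collected_data message, missing: {sorted(missing)}")
    return message
```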

Data Service Integration

DATA_RABBITMQ_URL
string
default:"amqp://user:password@data-mq:5672/"
RabbitMQ connection URL for the external data service where collected data is published.
Default: amqp://user:password@data-mq:5672/
Usage:
# Same broker as main service
DATA_RABBITMQ_URL=amqp://guest:guest@rabbitmq/

# Separate data service broker
DATA_RABBITMQ_URL=amqp://data-user:data-pass@data-mq:5672/

# External data service
DATA_RABBITMQ_URL=amqp://user:pass@data-service.example.com:5672/data
Purpose: Wrappers publish collected data to this separate broker, allowing the data service to be deployed independently.
This can point to the same RabbitMQ instance as RABBITMQ_URL or a completely separate broker for distributed deployments.
DATA_QUEUE_NAME
string
default:"data_queue"
Queue name on the data service broker where wrappers publish collected data.
Default: data_queue
Usage:
DATA_QUEUE_NAME=data_queue
Implementation: Generated wrapper code includes this queue name for publishing:
import json
from aio_pika import Message

await channel.default_exchange.publish(
    Message(body=json.dumps(data).encode()),
    routing_key=settings.DATA_QUEUE_NAME,
)
WRAPPER_CREATION_QUEUE_NAME
string
default:"wrapper_creation_queue"
Queue name for receiving wrapper creation requests.
Default: wrapper_creation_queue
Usage:
WRAPPER_CREATION_QUEUE_NAME=wrapper_creation_queue
Message format:
{
  "resource_id": "uuid",
  "wrapper_config": {...}
}

Data Collection Settings

CHUNK_SIZE_THRESHOLD
integer
default:"1000"
Maximum number of records to include in a single message chunk. Data exceeding this threshold is split into multiple chunks.
Default: 1000
Usage:
# Default
CHUNK_SIZE_THRESHOLD=1000

# Smaller chunks for limited bandwidth
CHUNK_SIZE_THRESHOLD=500

# Larger chunks for high-performance networks
CHUNK_SIZE_THRESHOLD=5000
Behavior: When a wrapper collects data:
  1. If records ≤ threshold: Single message
  2. If records > threshold: Split into multiple messages
Example:
# 2500 records with threshold=1000
# Results in 3 messages:
# - Chunk 0: records 0-999 (1000 records)
# - Chunk 1: records 1000-1999 (1000 records)
# - Chunk 2: records 2000-2499 (500 records)
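The splitting behavior can be sketched as follows (illustrative code, not the service's implementation):

```python
def split_into_chunks(records: list, threshold: int) -> list[list]:
    """Split collected records into chunks of at most `threshold` records."""
    return [records[i:i + threshold] for i in range(0, len(records), threshold)]

chunks = split_into_chunks(list(range(2500)), threshold=1000)
print([len(c) for c in chunks])  # [1000, 1000, 500]
```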
Considerations:
  • Smaller values: More messages, less memory per message, higher overhead
  • Larger values: Fewer messages, more memory per message, lower overhead
  • RabbitMQ limits: Default max message size is 128MB
  • Network: Smaller chunks better for unreliable networks
Recommended values:
  • Development: 1000
  • Production (high bandwidth): 5000
  • Production (limited bandwidth): 500
  • Large datasets: 10000

AI Model Configuration

GEMINI_MODEL_NAME
string
default:"gemini-1.5-flash"
Google Gemini model to use for wrapper code generation.
Default: gemini-1.5-flash
Usage:
# Fast and cost-effective (recommended)
GEMINI_MODEL_NAME=gemini-1.5-flash

# More capable, higher quality
GEMINI_MODEL_NAME=gemini-1.5-pro

# Legacy model
GEMINI_MODEL_NAME=gemini-1.0-pro
Model comparison:
Model             Speed     Quality    Cost    Use Case
gemini-1.5-flash  Fast      Good       Low     Development, high-volume
gemini-1.5-pro    Moderate  Excellent  Medium  Production, complex wrappers
gemini-1.0-pro    Moderate  Good       Low     Legacy support
Recommendations:
  • Development: gemini-1.5-flash for fast iteration
  • Production: gemini-1.5-pro for best quality
  • High volume: gemini-1.5-flash to minimize costs
Model availability and pricing may vary by region. Check Google AI pricing for current rates.

Debug and Development

WRAPPER_GENERATION_DEBUG_MODE
boolean
default:"false"
Enable verbose logging for wrapper generation and execution.
Default: false
Usage:
# Enable debug mode
WRAPPER_GENERATION_DEBUG_MODE=true

# Disable debug mode (production)
WRAPPER_GENERATION_DEBUG_MODE=false
Accepted values:
  • true, True, TRUE, 1, yes, Yes, YES
  • false, False, FALSE, 0, no, No, NO
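Parsing of these values behaves roughly like the sketch below. Pydantic handles the real coercion; this only illustrates the accepted spellings.

```python
TRUTHY = {"true", "1", "yes"}
FALSY = {"false", "0", "no"}

def parse_bool(value: str) -> bool:
    """Case-insensitively map the accepted spellings to a boolean."""
    normalized = value.strip().lower()
    if normalized in TRUTHY:
        return True
    if normalized in FALSY:
        return False
    raise ValueError(f"Value could not be parsed to a boolean: {value!r}")
```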
When enabled, logs include:
  • Complete generated wrapper code
  • AI model prompts and responses
  • Detailed execution traces
  • Wrapper process stdout/stderr
  • Data collection progress
  • Error stack traces with full context
Example output:
[DEBUG] Generated wrapper code:
[DEBUG] ---
[DEBUG] import aio_pika
[DEBUG] import pandas as pd
[DEBUG] ...
[DEBUG] ---
[DEBUG] Executing wrapper for resource: abc-123
[DEBUG] Wrapper process started: PID 12345
[DEBUG] Collected 1000 records
[DEBUG] Publishing chunk 1/3 to data_queue
Debug mode generates significant log volume (10-100x normal) and may expose sensitive data including API keys in logs. Only enable in secure development environments.
Production: Always set to false in production to:
  • Reduce log storage costs
  • Improve performance
  • Prevent sensitive data exposure
  • Reduce noise in monitoring systems
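Wiring the flag into a logger might look like the following sketch; the service's actual logger setup is not shown in this document, and the logger name here is illustrative.

```python
import logging
import os

def configure_logging() -> logging.Logger:
    """Use DEBUG-level logging only when the debug flag is enabled."""
    raw = os.environ.get("WRAPPER_GENERATION_DEBUG_MODE", "false")
    debug = raw.strip().lower() in {"true", "1", "yes"}
    logger = logging.getLogger("wrapper_generation")
    logger.setLevel(logging.DEBUG if debug else logging.INFO)
    return logger
```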

Environment-Specific Examples

Local Development

# Required
GEMINI_API_KEY=AIzaSyD...

# CORS - allow local frontends
ORIGINS=http://localhost:3000,http://localhost:5173,http://localhost

# Database - local MongoDB
MONGO_URI=mongodb://localhost:27017/resources

# Message Queues - local RabbitMQ
RABBITMQ_URL=amqp://guest:guest@localhost/
DATA_RABBITMQ_URL=amqp://guest:guest@localhost/

# AI Model - fast iteration
GEMINI_MODEL_NAME=gemini-1.5-flash

# Debug - enabled for development
WRAPPER_GENERATION_DEBUG_MODE=true

# Data Collection
CHUNK_SIZE_THRESHOLD=1000

Docker Compose Development

# Required
GEMINI_API_KEY=AIzaSyD...

# CORS - allow local frontends
ORIGINS=http://localhost:3000,http://localhost:5173,http://localhost

# Database - Docker container name
MONGO_URI=mongodb://resource-mongo:27017/resources

# Message Queues - Docker container name
RABBITMQ_URL=amqp://guest:guest@rabbitmq/
DATA_RABBITMQ_URL=amqp://guest:guest@rabbitmq/

# AI Model
GEMINI_MODEL_NAME=gemini-1.5-flash

# Debug - enabled for development
WRAPPER_GENERATION_DEBUG_MODE=true

# Data Collection
CHUNK_SIZE_THRESHOLD=1000

Production

# Required - use separate production key
GEMINI_API_KEY=AIzaSyE...

# CORS - specific production origins only
ORIGINS=https://app.example.com,https://dashboard.example.com

# Database - MongoDB Atlas with auth
MONGO_URI=mongodb+srv://prod-user:SECURE_PASSWORD@cluster.mongodb.net/resources?retryWrites=true&w=majority

# Message Queues - production brokers with auth
RABBITMQ_URL=amqp://prod-user:SECURE_PASSWORD@mq.example.com:5672/prod
DATA_RABBITMQ_URL=amqp://data-user:SECURE_PASSWORD@data-mq.example.com:5672/data

# Queue Names - production queues
RESOURCE_DATA_QUEUE=prod_resource_data
RESOURCE_DELETED_QUEUE=prod_resource_deleted
COLLECTED_DATA_QUEUE=prod_collected_data
DATA_QUEUE_NAME=prod_data_queue
WRAPPER_CREATION_QUEUE_NAME=prod_wrapper_creation_queue

# AI Model - best quality for production
GEMINI_MODEL_NAME=gemini-1.5-pro

# Debug - DISABLED for production
WRAPPER_GENERATION_DEBUG_MODE=false

# Data Collection - optimized for production network
CHUNK_SIZE_THRESHOLD=5000

Minimal Configuration

# Only required variable - all others use defaults
GEMINI_API_KEY=AIzaSyD...

Docker Compose Usage

Environment variables can be set in docker-compose.yml:
services:
  resource-service:
    environment:
      # Direct values
      - MONGO_URI=mongodb://resource-mongo:27017/resources
      - GEMINI_MODEL_NAME=gemini-1.5-flash
      
      # From host environment
      - GEMINI_API_KEY=${GEMINI_API_KEY}
      
      # With default values
      - ORIGINS=${ORIGINS:-localhost}
      - GEMINI_MODEL_NAME=${GEMINI_MODEL_NAME:-gemini-1.5-flash}
Or in a separate .env file:
# .env file is automatically loaded by Docker Compose
GEMINI_API_KEY=AIzaSyD...
ORIGINS=http://localhost:3000

Kubernetes Usage

Store sensitive values in Kubernetes Secrets:
apiVersion: v1
kind: Secret
metadata:
  name: resource-service-secrets
type: Opaque
stringData:
  GEMINI_API_KEY: AIzaSyD...
  MONGO_URI: mongodb+srv://user:pass@cluster.mongodb.net/resources
  RABBITMQ_URL: amqp://user:pass@rabbitmq:5672/

Validation and Troubleshooting

Missing Required Variables

ValidationError: 1 validation error for Settings
GEMINI_API_KEY
  field required (type=value_error.missing)
Solution: Set the GEMINI_API_KEY environment variable.

Invalid Type

ValidationError: 1 validation error for Settings
CHUNK_SIZE_THRESHOLD
  value is not a valid integer (type=type_error.integer)
Solution: Ensure CHUNK_SIZE_THRESHOLD is a number without quotes.

Invalid Boolean

ValidationError: 1 validation error for Settings
WRAPPER_GENERATION_DEBUG_MODE
  value could not be parsed to a boolean (type=type_error.bool)
Solution: Use true/false, 1/0, or yes/no.

Connection Failures

ConnectionError: Could not connect to MongoDB at mongodb://localhost:27017
Solution: Verify MONGO_URI is correct and MongoDB is running.

Security Checklist

1. Protect API keys

  • Never commit .env files to version control
  • Add .env to .gitignore
  • Use different keys for dev/staging/prod
  • Rotate keys periodically
2. Secure credentials

  • Use strong passwords (minimum 16 characters)
  • Generate passwords with openssl rand -base64 32
  • Store in secrets management (Vault, AWS Secrets Manager, etc.)
  • Never use default credentials in production
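Python's secrets module offers an alternative to the openssl command for generating credentials (illustrative):

```python
import secrets

# 32 random bytes, URL-safe base64 encoded -- comparable to `openssl rand -base64 32`
password = secrets.token_urlsafe(32)
print(len(password) >= 16)  # comfortably above the 16-character minimum
```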
3. Restrict CORS

  • Specify exact origins (no wildcards)
  • Use HTTPS origins only in production
  • Validate origins match deployed frontends
4. Disable debug mode

  • Set WRAPPER_GENERATION_DEBUG_MODE=false in production
  • Review logs to ensure no sensitive data is logged
  • Use structured logging for production
5. Use encrypted connections

  • Use mongodb+srv:// or mongodb:// with TLS for MongoDB
  • Use amqps:// for RabbitMQ over SSL/TLS
  • Ensure certificates are valid and not self-signed
