Required Variables
These variables must be set for the service to start.

GEMINI_API_KEY

Google Gemini API key for AI-powered wrapper generation.

Usage: GEMINI_API_KEY=<your-api-key>

How to obtain:
- Visit Google AI Studio
- Sign in with your Google account
- Create a new API key
- Copy the key to your .env file

Security notes:
- Never commit this key to version control
- Rotate keys periodically
- Use different keys for development and production
- Monitor usage in Google Cloud Console
CORS and Security
Comma-separated list of allowed origins for Cross-Origin Resource Sharing (CORS).

Default: localhost

Implementation: The service splits this value by commas and configures FastAPI CORS middleware.

Production recommendations:
- Specify exact origins (no wildcards)
- Use HTTPS origins only
- Limit to necessary domains
- Validate origins match your frontend deployments
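The splitting step above can be sketched as follows; the variable name `ALLOWED_ORIGINS` is an assumption, and the middleware registration is shown only in comments:

```python
import os

def parse_origins(raw: str) -> list[str]:
    """Split a comma-separated origins string into a clean list."""
    return [origin.strip() for origin in raw.split(",") if origin.strip()]

# The service would then register the middleware, roughly:
#   from fastapi.middleware.cors import CORSMiddleware
#   app.add_middleware(CORSMiddleware,
#                      allow_origins=parse_origins(os.environ["ALLOWED_ORIGINS"]))

print(parse_origins("https://app.example.com, https://admin.example.com"))
```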
Database Configuration
MONGO_URI

MongoDB connection URI including authentication, host, port, database name, and options.

Default: mongodb://localhost:27017

Format: mongodb://[username:password@]host[:port][/database][?options]

Common options:
- retryWrites=true - Retry write operations on failure
- w=majority - Wait for majority of replica set to acknowledge writes
- maxPoolSize=50 - Maximum connection pool size
- authSource=admin - Authentication database
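To illustrate the URI format, a small helper (illustrative, not part of the service) can assemble a connection string with options:

```python
from urllib.parse import urlencode

def build_mongo_uri(host: str = "localhost", port: int = 27017,
                    database: str = "", **options) -> str:
    """Assemble a MongoDB URI such as mongodb://host:port/db?retryWrites=true."""
    query = f"?{urlencode(options)}" if options else ""
    return f"mongodb://{host}:{port}/{database}{query}"

print(build_mongo_uri(database="mydb", retryWrites="true", w="majority"))
```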
Message Queue Configuration
Primary RabbitMQ Connection
RABBITMQ_URL

RabbitMQ connection URL for internal service messaging.

Default: amqp://guest:guest@rabbitmq/

Format: amqp://username:password@host:port/vhost

Default credentials:
- Username: guest
- Password: guest
- Virtual host: / (default)
- Port: 5672 (AMQP), 5671 (AMQPS)
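As a sketch of the URL format (not the service's code), the fields and their defaults can be extracted with the standard library:

```python
from urllib.parse import urlparse, unquote

def parse_amqp_url(url: str) -> dict:
    """Extract credentials, host, port, and vhost from an AMQP URL."""
    parts = urlparse(url)
    return {
        "username": unquote(parts.username or "guest"),
        "password": unquote(parts.password or "guest"),
        "host": parts.hostname or "localhost",
        "port": parts.port or 5672,
        "vhost": unquote(parts.path[1:]) or "/",  # empty path means the default vhost "/"
    }

print(parse_amqp_url("amqp://guest:guest@rabbitmq/"))
```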
Queue Names
Queue name for publishing resource creation and update events.

Default: resource_data

Consumers: Other microservices subscribe to this queue to react to resource changes.

Queue name for publishing resource deletion events.

Default: resource_deleted

Queue name for consuming data collected by wrapper processes.

Default: collected_data

Behavior: The service consumes messages from this queue and processes collected data.

Data Service Integration
RabbitMQ connection URL for the external data service where collected data is published.

Default: amqp://user:password@data-mq:5672/

Purpose: Wrappers publish collected data to this separate broker, allowing the data service to be deployed independently. This can point to the same RabbitMQ instance as RABBITMQ_URL or a completely separate broker for distributed deployments.

Queue name on the data service broker where wrappers publish collected data.

Default: data_queue

Implementation: Generated wrapper code includes this queue name for publishing.

Queue name for receiving wrapper creation requests.

Default: wrapper_creation_queue

Data Collection Settings
CHUNK_SIZE_THRESHOLD

Maximum number of records to include in a single message chunk. Data exceeding this threshold is split into multiple chunks.

Default: 1000

Behavior: When a wrapper collects data:
- If records ≤ threshold: Single message
- If records > threshold: Split into multiple messages

Considerations:
- Smaller values: More messages, less memory per message, higher overhead
- Larger values: Fewer messages, more memory per message, lower overhead
- RabbitMQ limits: Default max message size is 128MB
- Network: Smaller chunks better for unreliable networks

Recommended values:
- Development: 1000
- Production (high bandwidth): 5000
- Production (limited bandwidth): 500
- Large datasets: 10000
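The chunking behavior described above can be sketched as follows (illustrative, not the service's actual implementation):

```python
def chunk_records(records: list, threshold: int = 1000) -> list[list]:
    """Split collected records into chunks of at most `threshold` items each."""
    if len(records) <= threshold:
        return [records]  # fits in a single message
    return [records[i:i + threshold] for i in range(0, len(records), threshold)]

chunks = chunk_records(list(range(2500)), threshold=1000)
print([len(c) for c in chunks])  # three chunks: 1000, 1000, 500
```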
AI Model Configuration
Google Gemini model to use for wrapper code generation.

Default: gemini-1.5-flash

Model comparison:

| Model | Speed | Quality | Cost | Use Case |
|---|---|---|---|---|
| gemini-1.5-flash | Fast | Good | Low | Development, high-volume |
| gemini-1.5-pro | Moderate | Excellent | Medium | Production, complex wrappers |
| gemini-1.0-pro | Moderate | Good | Low | Legacy support |

Recommendations:
- Development: gemini-1.5-flash for fast iteration
- Production: gemini-1.5-pro for best quality
- High volume: gemini-1.5-flash to minimize costs
Model availability and pricing may vary by region. Check Google AI pricing for current rates.
Debug and Development
WRAPPER_GENERATION_DEBUG_MODE

Enable verbose logging for wrapper generation and execution.

Default: false

Accepted values:
- True: true, True, TRUE, 1, yes, Yes, YES
- False: false, False, FALSE, 0, no, No, NO

When enabled, logs include:
- Complete generated wrapper code
- AI model prompts and responses
- Detailed execution traces
- Wrapper process stdout/stderr
- Data collection progress
- Error stack traces with full context

Production: Always set to false in production to:
- Reduce log storage costs
- Improve performance
- Prevent sensitive data exposure
- Reduce noise in monitoring systems
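The accepted spellings above suggest case-insensitive boolean parsing along these lines (a sketch, not the service's implementation):

```python
def parse_bool(value: str) -> bool:
    """Map the accepted spellings (true/1/yes vs. false/0/no, any case) to a bool."""
    return value.strip().lower() in {"true", "1", "yes"}

print(parse_bool("TRUE"), parse_bool("no"))  # True False
```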
Environment-Specific Examples
Local Development
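A plausible .env for local development, using only the variables documented above (the API key is a placeholder; queue-name and model variables are omitted because their names are not shown in this document):

```bash
GEMINI_API_KEY=<your-api-key>
MONGO_URI=mongodb://localhost:27017
RABBITMQ_URL=amqp://guest:guest@localhost:5672/
CHUNK_SIZE_THRESHOLD=1000
WRAPPER_GENERATION_DEBUG_MODE=true
```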
Docker Compose Development
Production
Minimal Configuration
Docker Compose Usage
Environment variables can be set in docker-compose.yml or loaded from a .env file:
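For example (the service name and the mongo/rabbitmq hostnames below are illustrative, assuming containers on the same Compose network):

```yaml
services:
  wrapper-service:
    env_file: .env                # load everything from .env, or set values inline:
    environment:
      MONGO_URI: mongodb://mongo:27017
      RABBITMQ_URL: amqp://guest:guest@rabbitmq/
```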
Kubernetes Usage
Store sensitive values in Kubernetes Secrets.

Validation and Troubleshooting
Missing Required Variables
Solution: Set the GEMINI_API_KEY environment variable.
Invalid Type
Solution: Ensure CHUNK_SIZE_THRESHOLD is a number without quotes.
Invalid Boolean
Solution: Use true/false, 1/0, or yes/no.
Connection Failures
Solution: Verify MONGO_URI is correct and MongoDB is running.
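A startup check along these lines (illustrative, not the service's code) catches the problems above before the service runs:

```python
import os

def validate_env() -> list[str]:
    """Return error messages for missing or malformed settings."""
    errors = []
    if not os.environ.get("GEMINI_API_KEY"):
        errors.append("Missing required variable: GEMINI_API_KEY")
    if not os.environ.get("CHUNK_SIZE_THRESHOLD", "1000").isdigit():
        errors.append("CHUNK_SIZE_THRESHOLD must be a number without quotes")
    return errors

for problem in validate_env():
    print(problem)
```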
Security Checklist
Protect API keys
- Never commit .env files to version control
- Add .env to .gitignore
- Use different keys for dev/staging/prod
- Rotate keys periodically
Secure credentials
- Use strong passwords (minimum 16 characters)
- Generate passwords with openssl rand -base64 32
- Store in secrets management (Vault, AWS Secrets Manager, etc.)
- Never use default credentials in production
Restrict CORS
- Specify exact origins (no wildcards)
- Use HTTPS origins only in production
- Validate origins match deployed frontends
Disable debug mode
- Set WRAPPER_GENERATION_DEBUG_MODE=false in production
- Review logs to ensure no sensitive data is logged
- Use structured logging for production