Overview
The Trustworthy Model Registry uses environment variables for all configuration. This approach ensures:
- Security: Sensitive credentials never appear in code
- Flexibility: Different configurations for development, staging, and production
- 12-Factor Compliance: Follows modern application deployment best practices
Required Variables
These variables must be set for the application to function correctly.
AWS_REGION
AWS region where resources are deployed.
Example: us-east-2
Used by:
- src/aws/s3_utils.py - S3 client initialization
- src/services/storage.py - Storage service configuration
Notes:
- Automatically detected in Lambda runtime
- Must be explicitly set for local development
- Should match the region of your S3 bucket
S3_BUCKET
Name of the S3 bucket for artifact storage and registry persistence.
Example: sot4-model-registry-artifacts
Used by:
- src/aws/s3_utils.py - Artifact upload/download operations
- src/services/storage.py - Storage backend selection
- src/api/routers/models.py - Registry initialization
Notes:
- Must be globally unique across all AWS accounts
- Bucket should be in the same region as the Lambda function
- Required unless LOCAL_STORAGE=1 is set
AUTH_TOKEN
Default authentication token for API access.
Example: admin-secret-token-12345
Used by:
- Authentication middleware (if enabled)
- Security track features
Notes:
- Keep this value secret and secure
- Use different tokens for development and production
- Rotate periodically for security
- Required by OpenAPI specification authentication schema
Optional API Tokens
These tokens improve functionality but are not required for basic operation.
HUGGINGFACE_HUB_TOKEN
HuggingFace API token for accessing models and datasets.
Example: hf_xxxxxxxxxxxxxxxxxxxx
Used by:
- src/services/scoring.py - Model metadata fetching
- src/metrics/reproducibility.py - Model card analysis
- src/metrics/dataset_quality.py - Dataset validation
- src/metrics/bus_factor.py - Repository analysis
Benefits:
- Higher API rate limits
- Access to private models (if permissions granted)
- Improved reliability for model ingestion
How to obtain:
- Create a HuggingFace account at https://huggingface.co
- Navigate to Settings > Access Tokens
- Create a new token with read permissions
Notes:
- Optional but recommended for production use
- Without a token, the API is limited to public models and lower rate limits
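As a minimal sketch of how an optional token might be applied, the helper below builds request headers from HUGGINGFACE_HUB_TOKEN and falls back to anonymous access when the variable is unset. The function name `hf_auth_headers` is hypothetical and does not come from the project; the only external fact assumed is that the Hugging Face Hub API accepts an `Authorization: Bearer` header.

```python
import os


def hf_auth_headers(env=None):
    """Return HTTP headers for Hugging Face Hub requests (hypothetical helper)."""
    env = os.environ if env is None else env
    token = (env.get("HUGGINGFACE_HUB_TOKEN") or "").strip()
    if not token:
        # No token: requests are anonymous (public models, lower rate limits).
        return {}
    return {"Authorization": f"Bearer {token}"}
```

Passing a plain dictionary instead of `os.environ` keeps the helper easy to exercise in tests without touching the real environment.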
GITHUB_TOKEN
GitHub personal access token for API access.
Example: ghp_xxxxxxxxxxxxxxxxxxxx
Used by:
- src/metrics/reproducibility.py - Repository metadata fetching
- src/metrics/bus_factor.py - Contributor analysis
- src/api/routers/models.py - License information retrieval
Benefits:
- Higher GitHub API rate limits (5,000 requests/hour vs 60)
- Access to private repositories (if permissions granted)
- More reliable lineage and reviewedness metrics
How to obtain:
- Navigate to GitHub Settings > Developer settings > Personal access tokens
- Generate a new token with the repo scope (for public repositories, public_repo is sufficient)
- Copy the token immediately (it won't be shown again)
Notes:
- Optional but recommended for production use
- Without a token, GitHub API calls are rate-limited to 60 requests/hour
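To make the rate-limit difference concrete, the sketch below picks the hourly limit based on whether GITHUB_TOKEN is set and works out how far apart evenly spaced requests would need to be. The function and constant names are hypothetical; the 60 and 5,000 figures are the ones quoted above, and the `Authorization: Bearer` header is the scheme GitHub's REST API accepts for personal access tokens.

```python
import os

UNAUTHENTICATED_LIMIT = 60    # requests/hour without a token
AUTHENTICATED_LIMIT = 5000    # requests/hour with a token


def github_request_budget(env=None):
    """Return (headers, hourly_limit, min_seconds_between_requests)."""
    env = os.environ if env is None else env
    token = (env.get("GITHUB_TOKEN") or "").strip()
    if token:
        headers = {"Authorization": f"Bearer {token}"}
        limit = AUTHENTICATED_LIMIT
    else:
        headers = {}
        limit = UNAUTHENTICATED_LIMIT
    # Spacing requests evenly over the hour keeps a client under the limit.
    return headers, limit, 3600 / limit
```

Without a token this works out to one request every 60 seconds; with a token, one request every 0.72 seconds.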
Logging Configuration
Control logging behavior for debugging and monitoring.
LOG_LEVEL
Controls logging verbosity.
Possible values:
- 0 - Silent (no logs emitted)
- 1 - INFO level logging
- 2 - DEBUG level logging
Used by:
- src/utils/logging.py - Logger configuration
Notes:
- Set to 0 for production (Lambda logs to CloudWatch automatically)
- Set to 1 or 2 for local development debugging
- Does not affect CloudWatch logging in Lambda deployments
LOG_FILE
Path to log file for local development.
Example: /tmp/tmr.log
Used by:
- src/utils/logging.py - File handler configuration
Notes:
- Only used when LOG_LEVEL > 0
- If not set or invalid, logs default to stderr
- Not used in Lambda deployments (CloudWatch is used instead)
- File is created if it doesn’t exist
- File is overwritten on each application restart
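The interaction between LOG_LEVEL and LOG_FILE can be sketched roughly as follows. This is an illustration built on the standard `logging` module, not the contents of src/utils/logging.py; the function name `configure_logging`, the logger name `tmr`, and the choice to treat unrecognized LOG_LEVEL values as silent are all assumptions.

```python
import logging
import os
import sys

# Mapping described above; 0 means "emit nothing".
_LEVELS = {"1": logging.INFO, "2": logging.DEBUG}


def configure_logging(env=None, name="tmr"):
    """Configure a logger from LOG_LEVEL and LOG_FILE (illustrative sketch)."""
    env = os.environ if env is None else env
    logger = logging.getLogger(name)
    logger.handlers.clear()
    logger.propagate = False

    level = _LEVELS.get(str(env.get("LOG_LEVEL", "0")).strip())
    if level is None:
        # LOG_LEVEL=0 (or unrecognized, by assumption): stay silent, ignore LOG_FILE.
        logger.addHandler(logging.NullHandler())
        logger.setLevel(logging.CRITICAL + 1)
        return logger

    handler = None
    log_file = env.get("LOG_FILE")
    if log_file:
        try:
            # mode="w" creates the file if needed and truncates it on restart.
            handler = logging.FileHandler(log_file, mode="w")
        except OSError:
            handler = None  # invalid path: fall back to stderr below
    if handler is None:
        handler = logging.StreamHandler(sys.stderr)

    logger.addHandler(handler)
    logger.setLevel(level)
    return logger
```

Opening the file in write mode is what produces the "overwritten on each restart" behavior; an append-mode handler would preserve earlier runs instead.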
Development Configuration
LOCAL_STORAGE
Enable local filesystem storage instead of S3.
Possible values:
- 0 - Use S3 storage (production mode)
- 1 - Use local filesystem (development mode)
Used by:
- src/services/storage.py - Storage backend selection
Notes:
- Artifacts are stored in /tmp/local-artifacts/
- Registry is stored in the local filesystem
- Useful for local development without AWS setup
- Presigned URLs are returned in local://download/... format
- Data is not persistent across system reboots (uses /tmp)
- Never use in production deployments
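Backend selection along these lines might look like the sketch below. It is an assumption about the general shape of the logic in src/services/storage.py, not a copy of it; the function name `select_storage_backend` and the exact error message are hypothetical.

```python
import os


def select_storage_backend(env=None):
    """Return "local" or "s3" based on LOCAL_STORAGE and S3_BUCKET (sketch)."""
    env = os.environ if env is None else env
    if env.get("LOCAL_STORAGE", "0") == "1":
        # Development mode: artifacts live under /tmp/local-artifacts/.
        return "local"
    if not env.get("S3_BUCKET"):
        # S3 mode needs a bucket name; the message wording is an assumption.
        raise RuntimeError("S3_BUCKET not set (or set LOCAL_STORAGE=1)")
    return "s3"
```

Failing fast when neither option is configured surfaces the problem at startup instead of at the first upload.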
Internal Configuration
These variables are set automatically by the application and should not be modified.
PYTHONPATH
Python module search path.
Set by: src/run.py during initialization
Value: <repo-root>:<repo-root>/src
Purpose:
- Ensures metrics and utilities can be imported correctly
- Allows CLI and API to share the same codebase
- Maintains compatibility across local, CI, and Lambda environments
Notes:
- Automatically configured at runtime
- No manual configuration needed
HF_HUB_DISABLE_PROGRESS_BARS
Disables HuggingFace Hub progress bars in console output.
Set by: src/run.py during initialization
Notes:
- Keeps output clean for autograder compatibility
- Prevents terminal noise in Lambda logs
TQDM_DISABLE
Disables tqdm progress bars in console output.
Set by: src/run.py during initialization
Notes:
- Prevents progress bar rendering in logs
- Improves CloudWatch log readability
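For orientation, setting these three internal variables at startup could look roughly like this. The sketch is not the code in src/run.py; the helper name `apply_internal_defaults` is hypothetical, and the value "1" for the two progress-bar switches is an assumption about what the project sets.

```python
import os
from pathlib import Path


def apply_internal_defaults(repo_root, env=None):
    """Set the internal variables described above (illustrative sketch)."""
    env = os.environ if env is None else env
    root = Path(repo_root)
    # <repo-root>:<repo-root>/src, joined with the platform path separator.
    env["PYTHONPATH"] = os.pathsep.join([str(root), str(root / "src")])
    # huggingface_hub and tqdm read these variables to suppress progress bars.
    env["HF_HUB_DISABLE_PROGRESS_BARS"] = "1"
    env["TQDM_DISABLE"] = "1"
    return env
```

Note that changing PYTHONPATH in `os.environ` only affects child processes started afterwards; the running interpreter's own search path is `sys.path`.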
Example Configuration Files
Development Environment
.env
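A development .env could plausibly combine the example values documented on this page. The sketch below renders such a file from a dictionary; the particular selection of settings in `DEV_ENV` is an assumption rather than the project's actual file, and `render_env` is a hypothetical helper.

```python
def render_env(values):
    """Render a mapping as KEY=VALUE lines suitable for a .env file."""
    return "".join(f"{key}={value}\n" for key, value in values.items())


# Assumed development settings, assembled from the examples on this page.
DEV_ENV = {
    "AWS_REGION": "us-east-2",
    "AUTH_TOKEN": "admin-secret-token-12345",
    "LOCAL_STORAGE": "1",
    "LOG_LEVEL": "2",
    "LOG_FILE": "/tmp/tmr.log",
}
```

Writing `render_env(DEV_ENV)` to a file named .env in the project root would give a local setup that needs no S3 bucket, since LOCAL_STORAGE=1 is set.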
Production Environment (Lambda)
Set via AWS Lambda environment variables.
Testing Environment
.env.test
Configuration Validation
The application validates required configuration on startup in:
- src/services/storage.py
- src/aws/s3_utils.py
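A startup check consistent with the rules on this page might be sketched as follows. This is not the project's validation code; the function name `missing_required_variables` is hypothetical, and the rule set is assumed from the Required Variables section (AWS_REGION and AUTH_TOKEN always, S3_BUCKET unless LOCAL_STORAGE=1).

```python
import os

REQUIRED = ("AWS_REGION", "AUTH_TOKEN")


def missing_required_variables(env=None):
    """Return the names of required variables that are unset or empty."""
    env = os.environ if env is None else env
    required = list(REQUIRED)
    # S3_BUCKET is required unless local storage is enabled.
    if env.get("LOCAL_STORAGE", "0") != "1":
        required.append("S3_BUCKET")
    return [name for name in required if not env.get(name)]
```

Returning the full list of missing names, rather than raising on the first one, lets a single error message report everything that needs fixing.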
Security Best Practices
Use Secrets Manager
For production, store sensitive tokens in AWS Secrets Manager and reference them in Lambda
Rotate Tokens
Regularly rotate AUTH_TOKEN, GITHUB_TOKEN, and HUGGINGFACE_HUB_TOKEN
Least Privilege
Use IAM roles with minimal required permissions for Lambda execution
Environment Isolation
Use separate credentials and buckets for development, staging, and production
Troubleshooting
S3_BUCKET not set error
Cause: Missing or incorrect S3 bucket configuration
Solution: Set the S3_BUCKET environment variable in your shell, or add it to the .env file.
HuggingFace rate limit errors
Cause: Too many API requests without an authentication token
Solution: Set HUGGINGFACE_HUB_TOKEN to obtain higher rate limits
GitHub API rate limit errors
Cause: Exceeded the limit of 60 requests/hour for unauthenticated requests
Solution: Set GITHUB_TOKEN to increase the limit to 5,000 requests/hour
.env file not loaded
Cause: .env file not in the correct location or not readable
Solution:
- Place .env in the project root directory
- Ensure the file is readable: chmod 644 .env
- Verify the file is loaded: check that src/main.py calls load_dotenv()
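When diagnosing a .env file that appears not to load, it can help to check what the file actually contains. The helper below is a standard-library diagnostic sketch, not a replacement for load_dotenv(); the name `read_env_file` is hypothetical, and it handles only plain KEY=VALUE lines (no `export` prefixes, multi-line values, or variable interpolation).

```python
from pathlib import Path


def read_env_file(path=".env"):
    """Parse simple KEY=VALUE lines; returns {} if the file is unreadable."""
    try:
        text = Path(path).read_text(encoding="utf-8")
    except OSError:
        # Missing file or insufficient permissions.
        return {}
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue  # skip blanks, comments, and malformed lines
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip("'\"")
    return values
```

An empty result for a file that exists usually points to a permissions problem or to the file sitting outside the directory the application runs from.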
Next Steps
Local Setup
Set up your local development environment
AWS Deployment
Deploy to production on AWS Lambda