Overview

The Trustworthy Model Registry is built as a cloud-native, serverless application designed to run on AWS while maintaining compatibility with local development environments. The system follows a modular architecture with clear separation of concerns across API routing, business logic, data persistence, and metric computation.

High-level architecture

AWS components

The system is deployed entirely on AWS using free-tier-compatible services:

AWS Lambda

Stateless execution of the FastAPI backend via the Mangum adapter. Handles all API requests without maintaining server state.

API Gateway

Public REST interface that routes HTTP requests to Lambda. Provides CORS handling and request/response transformation.

Amazon S3

Persistent storage for both the registry metadata (registry.json) and artifact binaries (models, datasets, code).

CloudWatch

Centralized logging and monitoring. All requests, errors, and metrics are captured for observability.
All AWS services are configured to stay within free-tier limits, making the system cost-effective for development and demonstration purposes.

Application structure

The codebase is organized into several key modules:

API layer (src/api/)

Handles HTTP routing, request validation, and response formatting.
Initializes the FastAPI application and configures middleware:
src/main.py
from fastapi import FastAPI
from starlette.middleware.cors import CORSMiddleware
from mangum import Mangum

from src.api.routers.models import router as models_router
from src.api.middleware.log_requests import DeepASGILogger

# Create FastAPI app
app = FastAPI(title="SOTeam4P2 API")

# Add logging middleware
app.add_middleware(DeepASGILogger)

# Configure CORS
app.add_middleware(
    CORSMiddleware,
    allow_origins=["http://sot4-model-registry-dev.s3-website.us-east-2.amazonaws.com"],
    allow_credentials=False,
    allow_methods=["*"],
    allow_headers=["*"],
)

# Mount API routers
app.include_router(models_router, prefix="/api")

# AWS Lambda handler
handler = Mangum(app)
Key decisions:
  • Middleware is added before routers to guarantee full request coverage
  • CORS is handled at the application level for consistency
  • Mangum adapter enables seamless Lambda deployment without code changes
Implements all registry endpoints following the OpenAPI specification:
Core endpoints:
  • POST /artifact/{type} - Create new artifacts
  • GET /artifacts/{type}/{id} - Retrieve artifact by ID
  • PUT /artifacts/{type}/{id} - Update artifact metadata
  • DELETE /artifacts/{type}/{id} - Delete an artifact
  • POST /artifacts - List/enumerate artifacts with pagination
Model-specific endpoints:
  • GET /artifact/model/{id}/rate - Compute trust metrics
  • GET /artifact/model/{id}/lineage - Get dependency graph
  • POST /artifact/model/{id}/license-check - Validate license compatibility
  • GET /artifact/{type}/{id}/cost - Estimate operational costs
Search endpoints:
  • GET /artifact/byName/{name} - Exact name match
  • POST /artifact/byRegEx - Regex-based search
System endpoints:
  • GET /health - System health check
  • DELETE /reset - Reset registry to default state
  • GET /tracks - List planned feature tracks
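As a rough illustration of how a client might address a few of these routes, the sketch below builds request URLs with the standard library only. The base URL is a placeholder, and the assumption that every route sits under the /api prefix (taken from the include_router call in src/main.py) has not been verified for each endpoint.

```python
from urllib.parse import quote

# Hypothetical base URL; substitute the API Gateway invoke URL of a real deployment.
BASE_URL = "https://example.invalid/api"


def artifact_url(artifact_type: str, artifact_id: str) -> str:
    """URL for GET/PUT/DELETE /artifacts/{type}/{id}."""
    return (
        f"{BASE_URL}/artifacts/"
        f"{quote(artifact_type, safe='')}/{quote(artifact_id, safe='')}"
    )


def rate_url(model_id: str) -> str:
    """URL for GET /artifact/model/{id}/rate."""
    return f"{BASE_URL}/artifact/model/{quote(model_id, safe='')}/rate"


def by_name_url(name: str) -> str:
    """URL for GET /artifact/byName/{name}; the name is percent-encoded."""
    return f"{BASE_URL}/artifact/byName/{quote(name, safe='')}"
```

Percent-encoding the path segments keeps names containing spaces or other reserved characters from being misread as extra path components.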
Custom ASGI middleware for comprehensive request/response logging:
src/api/middleware/log_requests.py
import time
import uuid

from starlette.types import ASGIApp, Receive, Scope, Send


class DeepASGILogger:
    def __init__(self, app: ASGIApp):
        self.app = app

    async def __call__(self, scope: Scope, receive: Receive, send: Send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        rid = str(uuid.uuid4())[:8]
        method = scope.get("method")
        path = scope.get("path")

        # Capture request body
        body_bytes = b""
        async def recv_wrapper():
            nonlocal body_bytes
            msg = await receive()
            if msg["type"] == "http.request":
                body_bytes += msg.get("body", b"")
            return msg

        # Capture response
        resp_body = b""
        status_code = None
        async def send_wrapper(message):
            nonlocal resp_body, status_code
            if message["type"] == "http.response.start":
                status_code = message["status"]
            if message["type"] == "http.response.body":
                resp_body += message.get("body", b"")
            await send(message)

        start = time.time()
        await self.app(scope, recv_wrapper, send_wrapper)
        duration_ms = round((time.time() - start) * 1000, 2)

        # Log complete request/response with timing
        print(f"[{rid}] {method} {path} -> {status_code} ({duration_ms}ms)")
Features:
  • Unique request IDs for tracing
  • Full request/response body capture
  • Latency measurement
  • CloudWatch-compatible output
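To see the wrapping pattern in isolation, the following sketch drives a simplified logging middleware with a hand-built ASGI scope, so no web framework is needed. TimingLogger, toy_app, and drive are illustrative names, not part of the codebase, and the middleware records only status and latency rather than full bodies.

```python
import asyncio
import time


async def toy_app(scope, receive, send):
    """Stand-in ASGI app: reads the request and answers 200 with a tiny body."""
    await receive()
    await send({"type": "http.response.start", "status": 200, "headers": []})
    await send({"type": "http.response.body", "body": b"ok"})


class TimingLogger:
    """Simplified cousin of DeepASGILogger: records status and latency only."""

    def __init__(self, app):
        self.app = app
        self.records = []  # (method, path, status, duration_ms) tuples

    async def __call__(self, scope, receive, send):
        if scope["type"] != "http":
            await self.app(scope, receive, send)
            return

        status_code = None

        async def send_wrapper(message):
            nonlocal status_code
            if message["type"] == "http.response.start":
                status_code = message["status"]
            await send(message)

        start = time.perf_counter()
        await self.app(scope, receive, send_wrapper)
        duration_ms = round((time.perf_counter() - start) * 1000, 2)
        self.records.append(
            (scope.get("method"), scope.get("path"), status_code, duration_ms)
        )


async def drive(app, method="GET", path="/health"):
    """Send one fake HTTP request through an ASGI callable; return what it sent."""
    sent = []

    async def receive():
        return {"type": "http.request", "body": b"", "more_body": False}

    async def send(message):
        sent.append(message)

    await app({"type": "http", "method": method, "path": path}, receive, send)
    return sent


logger = TimingLogger(toy_app)
messages = asyncio.run(drive(logger))
```

Because the middleware only wraps the send callable, the response passes through unchanged while the status code is observed on its way out.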

Service layer (src/services/)

Contains core business logic isolated from HTTP concerns.
Manages artifact persistence and metadata in S3:
src/services/registry.py
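import json
import logging
from typing import Any, Dict, List

import boto3

logger = logging.getLogger(__name__)


class RegistryService: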
    def __init__(self, bucket_name: str, key: str = "registry/registry.json"):
        self.s3 = boto3.client("s3")
        self.bucket = bucket_name
        self.key = key
        self._models: List[Dict[str, Any]] = []
        self._id_counter: int = 0
        self._load()

    def _load(self):
        """Load registry.json from S3 safely."""
        try:
            obj = self.s3.get_object(Bucket=self.bucket, Key=self.key)
            content = obj["Body"].read()
            data = json.loads(content)
            self._models = data.get("models", [])
            self._id_counter = data.get("id_counter", 0)
        except Exception as e:
            logger.error(f"Failed to load registry: {e}")
            self._models = []
            self._id_counter = 0

    def create(self, m) -> Dict[str, Any]:
        """Create new artifact entry."""
        self._load()
        self._id_counter += 1
        new_id = str(self._id_counter)
        entry = {
            "id": new_id,
            "name": getattr(m, "name", "Unnamed Model"),
            "version": getattr(m, "version", "1.0.0"),
            "metadata": dict(m.metadata) if hasattr(m, "metadata") else {},
        }
        self._models.append(entry)
        self._save()
        return entry
Key features:
  • Whole-object S3 reads and writes (each get_object/put_object call is atomic; the load-modify-save sequence in create() is not)
  • Auto-incrementing artifact IDs
  • Graceful failure handling
  • Metadata preservation
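The load, increment, save cycle can be tried without AWS by swapping in an in-memory stand-in for the S3 client. This is a sketch under assumptions: FakeS3 and SketchRegistry are hypothetical, and the _save layout simply mirrors the two keys that _load reads, since the original _save is not shown.

```python
import io
import json
from typing import Any, Dict, List, Tuple


class FakeS3:
    """In-memory stand-in exposing only the two calls the sketch needs."""

    def __init__(self):
        self.objects: Dict[Tuple[str, str], bytes] = {}

    def put_object(self, Bucket: str, Key: str, Body: bytes):
        self.objects[(Bucket, Key)] = Body

    def get_object(self, Bucket: str, Key: str):
        return {"Body": io.BytesIO(self.objects[(Bucket, Key)])}


class SketchRegistry:
    def __init__(self, s3, bucket: str, key: str = "registry/registry.json"):
        self.s3, self.bucket, self.key = s3, bucket, key
        self._models: List[Dict[str, Any]] = []
        self._id_counter = 0

    def _load(self):
        try:
            raw = self.s3.get_object(Bucket=self.bucket, Key=self.key)["Body"].read()
            data = json.loads(raw)
        except KeyError:  # FakeS3 raises KeyError for a missing object
            data = {}
        self._models = data.get("models", [])
        self._id_counter = data.get("id_counter", 0)

    def _save(self):
        # Assumed layout: the same two keys that _load reads back.
        body = json.dumps({"models": self._models, "id_counter": self._id_counter})
        self.s3.put_object(Bucket=self.bucket, Key=self.key, Body=body.encode("utf-8"))

    def create(self, name: str, version: str = "1.0.0") -> Dict[str, Any]:
        self._load()  # re-read so IDs continue from the persisted counter
        self._id_counter += 1
        entry = {"id": str(self._id_counter), "name": name,
                 "version": version, "metadata": {}}
        self._models.append(entry)
        self._save()
        return entry


s3 = FakeS3()
first = SketchRegistry(s3, "demo-bucket").create("model-a")
second = SketchRegistry(s3, "demo-bucket").create("model-b")  # fresh instance, same bucket
```

Reloading inside create is what lets a fresh instance, such as a new Lambda invocation, continue the ID sequence instead of restarting at 1.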
Orchestrates all trust metric calculations:
src/services/scoring.py
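import os
from typing import Any, Dict

from huggingface_hub import HfApi

# normalize_hf_id and compute_metrics_for_model are project helpers imported elsewhere in this module


class ScoringService: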
    def __init__(self):
        token = os.getenv("HUGGINGFACE_HUB_TOKEN")
        self.api = HfApi(token=token)

    def _build_resource(self, model_ref: str) -> Dict[str, Any]:
        """Fetch model metadata from HuggingFace."""
        resource = {
            "name": model_ref,
            "url": f"https://huggingface.co/{model_ref}",
        }
        info = self.api.model_info(model_ref)
        resource["license"] = getattr(info, "license", None)
        resource["tags"] = getattr(info, "tags", [])
        resource["downloads"] = getattr(info, "downloads", 0)
        
        # Read model card
        readme_path = self.api.hf_hub_download(model_ref, "README.md")
        with open(readme_path, "r") as f:
            resource["card_text"] = f.read()
        
        return resource

    def rate(self, resource: Any) -> Dict[str, Any]:
        """Compute all metrics for a model."""
        hf_id = normalize_hf_id(resource.get("name"))
        base_resource = {
            "name": hf_id,
            "url": f"https://huggingface.co/{hf_id}",
            "category": "MODEL",
        }
        # Compute metrics via run.py
        metrics = compute_metrics_for_model(base_resource)
        return metrics
Responsibilities:
  • Fetch HuggingFace model metadata
  • Extract GitHub links and dataset references
  • Invoke individual metric modules
  • Aggregate results into rating schema
Abstracts storage operations with S3/local fallback:
src/services/storage.py
class Storage:
    def put_bytes(self, key: str, data: bytes):
        """Store arbitrary bytes (ZIP, binary, text)."""
        if LOCAL_MODE:
            path = os.path.join(LOCAL_DIR, key)
            os.makedirs(os.path.dirname(path), exist_ok=True)
            with open(path, "wb") as f:
                f.write(data)
        else:
            s3_client.put_object(Bucket=BUCKET, Key=key, Body=data)

    def presign(self, key: str, expires: int = 3600) -> str:
        """Generate presigned download URL."""
        if LOCAL_MODE:
            return f"local://download/{key}"
        return s3_client.generate_presigned_url(
            "get_object",
            Params={"Bucket": BUCKET, "Key": key},
            ExpiresIn=expires,
        )
Features:
  • Transparent S3/local storage switching
  • Presigned URL generation for secure downloads
  • Binary-safe operations
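A filesystem-only sketch of the local branch shows the round trip for binary data. LocalStorage, its get_bytes method, and the artifacts/model/1.zip key are illustrative assumptions; only put_bytes and presign appear in the excerpt above.

```python
import os
import tempfile


class LocalStorage:
    """Filesystem-only sketch of the local branch of Storage."""

    def __init__(self, root: str):
        self.root = root

    def put_bytes(self, key: str, data: bytes) -> None:
        path = os.path.join(self.root, key)
        os.makedirs(os.path.dirname(path), exist_ok=True)
        with open(path, "wb") as f:
            f.write(data)

    def get_bytes(self, key: str) -> bytes:
        # Hypothetical read counterpart; not shown in the original excerpt.
        with open(os.path.join(self.root, key), "rb") as f:
            return f.read()

    def presign(self, key: str, expires: int = 3600) -> str:
        # Local mode has nothing to sign, so a placeholder URL is returned.
        return f"local://download/{key}"


# A 22-byte end-of-central-directory record is a valid empty ZIP archive.
EMPTY_ZIP = b"PK\x05\x06" + b"\x00" * 18

with tempfile.TemporaryDirectory() as tmp:
    storage = LocalStorage(tmp)
    storage.put_bytes("artifacts/model/1.zip", EMPTY_ZIP)
    round_trip = storage.get_bytes("artifacts/model/1.zip")
    url = storage.presign("artifacts/model/1.zip")
```

Opening files in binary mode on both sides is what keeps ZIP payloads intact across platforms.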

Metrics layer (src/metrics/)

Individual metric implementations following a common interface.
All metrics follow the same interface:
import time
from typing import Any, Dict, Tuple


def metric(resource: Dict[str, Any]) -> Tuple[float, int]:
    """Compute metric score and latency.
    
    Args:
        resource: Artifact metadata including name, url, local_path, etc.
    
    Returns:
        (score, latency_ms): Score in [0, 1] and computation time in milliseconds
    """
    start = time.perf_counter()
    
    # Metric computation logic
    score = 0.0
    
    latency = int((time.perf_counter() - start) * 1000)
    return round(score, 3), latency
Available metrics:
  • ramp_up_time - Documentation completeness and example quality
  • bus_factor - Team diversity and contributor distribution
  • performance_claims - Benchmark evidence and performance documentation
  • license - License compatibility and suitability
  • dataset_and_code_score - Dataset and code availability
  • dataset_quality - Dataset documentation and metadata quality
  • code_quality - GitHub repository code quality signals
  • reproducibility - Environment files, notebooks, reproduction instructions
  • reviewedness - PR review coverage from GitHub
  • treescore - Aggregate score of parent model lineage
  • size - Hardware compatibility (Raspberry Pi, Jetson, Desktop, Server)
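Given that every metric returns a (score, latency_ms) pair, flattening the results into `<name>` and `<name>_latency` fields could look like the sketch below. run_metrics, constant_metric, and failing_metric are hypothetical helpers, and treating a failing metric as 0.0 is an assumption rather than documented behavior.

```python
import time
from typing import Any, Callable, Dict, Tuple

Metric = Callable[[Dict[str, Any]], Tuple[float, int]]


def constant_metric(value: float) -> Metric:
    """Build a placeholder metric that always returns the given score."""

    def metric(resource: Dict[str, Any]) -> Tuple[float, int]:
        start = time.perf_counter()
        score = value
        latency = int((time.perf_counter() - start) * 1000)
        return round(score, 3), latency

    return metric


def failing_metric(resource: Dict[str, Any]) -> Tuple[float, int]:
    """Placeholder metric that simulates an upstream failure."""
    raise RuntimeError("upstream lookup failed")


def run_metrics(resource: Dict[str, Any], metrics: Dict[str, Metric]) -> Dict[str, Any]:
    """Call every metric and flatten results into <name> / <name>_latency fields."""
    out: Dict[str, Any] = {
        "name": resource.get("name"),
        "category": resource.get("category"),
    }
    for name, fn in metrics.items():
        try:
            score, latency = fn(resource)
        except Exception:
            # Assumption: a failing metric degrades to 0.0 instead of aborting the rating.
            score, latency = 0.0, 0
        out[name] = score
        out[f"{name}_latency"] = latency
    return out


rating = run_metrics(
    {"name": "org/model", "category": "MODEL"},
    {
        "license": constant_metric(1.0),
        "bus_factor": constant_metric(0.25),
        "code_quality": failing_metric,
    },
)
```

The sketch deliberately leaves out a net score, since the weighting used to combine individual metrics is not described here.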
src/metrics/reproducibility.py
import os
import time
from pathlib import Path
from typing import Any, Dict, Tuple


def _score_local_reproducibility(local_dir: str) -> float:
    """Inspect local repository for reproducibility signals."""
    score = 0.0
    p = Path(local_dir)
    
    # requirements.txt → +0.4
    if any(f.name.lower().startswith("requirements") for f in p.iterdir()):
        score += 0.4
    
    # environment.yml → +0.2
    if any(f.name.lower().startswith("environment") for f in p.iterdir()):
        score += 0.2
    
    # Jupyter notebooks → +0.2
    if any(f.suffix.lower() == ".ipynb" for f in p.iterdir()):
        score += 0.2
    
    # README mentions "reproduce" → +0.2
    for readme in p.glob("README*"):
        text = readme.read_text(encoding="utf-8", errors="ignore").lower()
        if "reproduce" in text:
            score += 0.2
            break
    
    return min(score, 1.0)

def metric(resource: Dict[str, Any]) -> Tuple[float, int]:
    start = time.perf_counter()
    local_dir = resource.get("local_dir") or resource.get("local_path")
    
    if local_dir and os.path.isdir(local_dir):
        score = _score_local_reproducibility(local_dir)
    else:
        score = _score_remote_reproducibility(resource)
    
    latency = int((time.perf_counter() - start) * 1000)
    return round(score, 3), latency
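The additive scoring rules can be checked against a throwaway directory. The function below restates the local scoring rules in condensed form so the example is self-contained; the file names and contents are made up for the demonstration.

```python
import tempfile
from pathlib import Path


def score_local_reproducibility(local_dir: str) -> float:
    """Same additive rules as _score_local_reproducibility, condensed."""
    score = 0.0
    files = list(Path(local_dir).iterdir())
    if any(f.name.lower().startswith("requirements") for f in files):
        score += 0.4
    if any(f.name.lower().startswith("environment") for f in files):
        score += 0.2
    if any(f.suffix.lower() == ".ipynb" for f in files):
        score += 0.2
    for readme in Path(local_dir).glob("README*"):
        text = readme.read_text(encoding="utf-8", errors="ignore").lower()
        if "reproduce" in text:
            score += 0.2
            break
    return min(score, 1.0)


with tempfile.TemporaryDirectory() as tmp:
    root = Path(tmp)

    # Two signals: a requirements file (+0.4) and a README mentioning "reproduce" (+0.2).
    (root / "requirements.txt").write_text("numpy\n", encoding="utf-8")
    (root / "README.md").write_text("Steps to reproduce the results.\n", encoding="utf-8")
    partial = round(score_local_reproducibility(tmp), 3)

    # Adding an environment file and a notebook brings all four signals into play.
    (root / "environment.yml").write_text("name: demo\n", encoding="utf-8")
    (root / "demo.ipynb").write_text("{}", encoding="utf-8")
    full = round(score_local_reproducibility(tmp), 3)
```

With two signals present the rounded score is 0.6, and with all four it reaches the 1.0 cap.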

Request flow

A typical model ingestion request follows this path:
  1. Request reception: API Gateway receives the HTTP POST request and invokes the Lambda function with the Mangum-wrapped FastAPI app.
  2. Middleware processing: The DeepASGILogger middleware captures request details, assigns a request ID, and starts timing.
  3. Routing: FastAPI routes the request to the appropriate handler in src/api/routers/models.py based on path and method.
  4. Metric computation: The ScoringService fetches model metadata from HuggingFace and GitHub, then invokes all metric modules to compute trust scores.
  5. Registry persistence: The computed metrics and artifact metadata are persisted to S3 via the RegistryService.
  6. Artifact storage: A minimal ZIP artifact is created and uploaded to S3, and a presigned download URL is generated.
  7. Response: The complete artifact response is serialized and returned through the middleware stack, logging the final status and latency.

Data models

The system uses Pydantic models for type safety and validation:
from typing import Any, Dict, List, Optional

from pydantic import BaseModel, Field


class ModelCreate(BaseModel):
    name: str
    version: str
    card: str = ""
    tags: List[str] = Field(default_factory=list)
    metadata: Optional[Dict[str, Any]] = None
    source_uri: Optional[str] = None


# Defined before ModelRating, which references it in a field annotation
class SizeScore(BaseModel):
    raspberry_pi: float
    jetson_nano: float
    desktop_pc: float
    aws_server: float


class ModelRating(BaseModel):
    name: str
    category: str
    net_score: float
    net_score_latency: float
    ramp_up_time: float
    ramp_up_time_latency: float
    bus_factor: float
    bus_factor_latency: float
    performance_claims: float
    performance_claims_latency: float
    license: float
    license_latency: float
    dataset_and_code_score: float
    dataset_and_code_score_latency: float
    dataset_quality: float
    dataset_quality_latency: float
    code_quality: float
    code_quality_latency: float
    reproducibility: float
    reproducibility_latency: float
    reviewedness: float
    reviewedness_latency: float
    tree_score: float
    tree_score_latency: float
    size_score: SizeScore
    size_score_latency: float

Deployment architecture

For local development, run the app with uvicorn:
# Start with uvicorn for hot reload
uvicorn src.run:app --reload --host 0.0.0.0 --port 8000
Local mode features:
  • Filesystem-based storage instead of S3 (set LOCAL_STORAGE=1)
  • SQLite fallback for registry (optional)
  • Hot reload for rapid development
  • Full debugging capabilities
The application uses Mangum to adapt FastAPI for Lambda:
src/main.py
from mangum import Mangum

app = FastAPI(title="SOTeam4P2 API")
# ... middleware and routes ...

# Lambda handler
handler = Mangum(app)
Deployment process:
  1. Package application with dependencies
  2. Upload to AWS Lambda
  3. Configure API Gateway routes
  4. Set environment variables (S3_BUCKET, AWS_REGION)
  5. Verify /health endpoint
The Mangum adapter handles all ASGI-to-Lambda event translation transparently.
The Dockerfile supports both development and production:
FROM node:18-bullseye
WORKDIR /app

# Install Python 3.11 via pyenv (PYENV_ROOT makes the installer use /opt/pyenv instead of ~/.pyenv)
ENV PYENV_ROOT=/opt/pyenv
RUN curl -fsSL https://pyenv.run | bash
RUN /opt/pyenv/bin/pyenv install 3.11.9
RUN /opt/pyenv/bin/pyenv global 3.11.9

# Install dependencies
COPY requirements.txt .
RUN /opt/pyenv/shims/python -m pip install -r requirements.txt

# Copy application
COPY . .
ENV PYTHONPATH=/app:/app/src

CMD ["/opt/pyenv/shims/python", "run.py"]
Build and run:
docker build -t tmr .
docker run -p 8000:8000 -e S3_BUCKET=my-bucket tmr

Security considerations

The system implements multiple security layers:
  • Input validation: All endpoints use Pydantic schema validation
  • URL allowlisting: External URLs are validated before fetching
  • Presigned URLs: S3 downloads use time-limited, signed URLs
  • CORS policies: Strict origin controls for browser clients
  • Error handling: No sensitive information leaked in error responses
  • Abuse detection: CloudWatch-based monitoring surfaces abusive request patterns

Performance characteristics

Typical operation latencies:
Operation            Average latency
Health check         50-100 ms
Artifact retrieval   100-200 ms
Model ingestion      30-60 s
Metric computation   15-45 s
Lineage graph        200-500 ms
License check        500-1000 ms
Scaling characteristics:
  • Horizontal: Lambda auto-scales to handle concurrent requests
  • Storage: S3 provides virtually unlimited artifact storage
  • Registry: In-memory caching with S3 backing
  • Metrics: Computed once at ingestion, cached in metadata
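The compute-once behavior for metrics can be pictured as a lookup in the artifact's metadata before any scoring runs. In this sketch the "rating" metadata key and both function names are hypothetical, since the actual field used for caching is not shown.

```python
from typing import Any, Callable, Dict

RATING_KEY = "rating"  # hypothetical metadata key; the real field name is not shown


def get_or_compute_rating(
    entry: Dict[str, Any],
    compute: Callable[[str], Dict[str, Any]],
) -> Dict[str, Any]:
    """Return the rating stored in the entry's metadata, computing it at most once."""
    metadata = entry.setdefault("metadata", {})
    if RATING_KEY not in metadata:
        metadata[RATING_KEY] = compute(entry["name"])
    return metadata[RATING_KEY]


calls = []


def slow_compute(name: str) -> Dict[str, Any]:
    """Stand-in for the expensive metric run; records each invocation."""
    calls.append(name)
    return {"name": name, "license": 1.0}


entry = {"id": "1", "name": "org/model", "metadata": {}}
first = get_or_compute_rating(entry, slow_compute)
second = get_or_compute_rating(entry, slow_compute)  # served from metadata
```

Serving later rating requests from metadata avoids repeating a computation that takes tens of seconds according to the latency figures above.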

Error handling

The system follows a consistent error handling strategy:
# 400 Bad Request - Invalid input
raise HTTPException(status_code=400, detail="Invalid artifact_type")

# 404 Not Found - Resource doesn't exist
raise HTTPException(status_code=404, detail="Artifact does not exist.")

# 424 Failed Dependency - Ingestion gate rejection
raise HTTPException(
    status_code=424,
    detail=f"Ingest rejected: reviewedness={score:.2f} < 0.50"
)

# 500 Internal Server Error - Unexpected failures
raise HTTPException(status_code=500, detail="Internal rating error.")
All errors are logged to CloudWatch with full context for debugging.
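The 424 case above implies a gate that compares a metric against a threshold before an artifact is accepted. The following framework-free sketch illustrates that idea. IngestRejected, check_ingest_gate, and try_ingest are hypothetical names, a missing reviewedness value is assumed to count as 0.0, and the 201 success status is likewise an assumption.

```python
class IngestRejected(Exception):
    """Carries the HTTP status and detail an API layer could turn into a response."""

    def __init__(self, status_code: int, detail: str):
        super().__init__(detail)
        self.status_code = status_code
        self.detail = detail


REVIEWEDNESS_THRESHOLD = 0.50  # taken from the 424 example above


def check_ingest_gate(rating: dict) -> None:
    """Raise IngestRejected(424) when reviewedness falls below the threshold."""
    # Assumption: a missing reviewedness value is treated as 0.0.
    score = float(rating.get("reviewedness", 0.0))
    if score < REVIEWEDNESS_THRESHOLD:
        raise IngestRejected(
            424, f"Ingest rejected: reviewedness={score:.2f} < 0.50"
        )


def try_ingest(rating: dict) -> tuple:
    """Return (status, detail) for a candidate rating."""
    try:
        check_ingest_gate(rating)
    except IngestRejected as exc:
        return exc.status_code, exc.detail
    # Assumption: the success status is not stated in the text; 201 is illustrative.
    return 201, "accepted"
```

Keeping the gate in a plain function lets the threshold logic be unit-tested without starting the API, with the router translating the exception into an HTTPException.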

Observability

Request logging

All requests logged with ID, method, path, status, latency, and full request/response bodies

Metric latency

Every metric computation includes precise latency measurement in milliseconds

Health endpoint

/health exposes uptime, artifact count, and system status

CloudWatch integration

All logs automatically captured in CloudWatch for analysis and alerting
Enable detailed logging by setting LOG_LEVEL=2 in your environment for DEBUG-level output.
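One way such a numeric setting could map onto Python's logging levels is sketched below. Only "2 means DEBUG" comes from the text; the meanings assigned to 0 and 1, the default of 1, and the "tmr" logger name are assumptions made for illustration.

```python
import logging
import os

# Only "2" -> DEBUG is stated in the documentation; "0" and "1" are assumed meanings.
_LEVELS = {
    "0": logging.CRITICAL + 1,  # assumed: effectively silent
    "1": logging.INFO,          # assumed: informational output
    "2": logging.DEBUG,
}


def resolve_log_level(env=None) -> int:
    """Translate LOG_LEVEL from the environment into a logging level constant."""
    env = os.environ if env is None else env
    raw = str(env.get("LOG_LEVEL", "1")).strip()
    return _LEVELS.get(raw, logging.INFO)  # unknown values fall back to INFO


def configure_logging(env=None) -> logging.Logger:
    """Apply the resolved level to a named logger and return it."""
    logger = logging.getLogger("tmr")  # hypothetical logger name
    logger.setLevel(resolve_log_level(env))
    return logger
```

Passing a plain dict as env makes the mapping easy to test without touching the real process environment.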
