Innova Backend Serverless deploys six Lambda functions via Serverless Framework 3, all targeting the nodejs20.x runtime in us-east-1. One function hosts the entire NestJS REST API; the remaining five are purpose-built event-driven workers that handle async telemetry persistence, LLM classification, OCR transcription, scheduled alert generation, and OCR-to-attempt reprocessing. Every function is bundled by serverless-esbuild and shares the same IAM role with scoped SQS and S3 permissions.

Function Reference

api — NestJS HTTP Handler

Handler: src/lambda.handler
Trigger: API Gateway HTTP (ANY / and ANY /{proxy+})
Timeout: 30 seconds
Memory: 512 MB

The api function is the front door for all HTTP traffic. It bridges API Gateway events to the full NestJS application using @vendia/serverless-express, meaning every controller, guard, interceptor, and pipe defined in AppModule runs inside a single Lambda invocation. The custom domain api.superprofes.app is mapped via serverless-domain-manager with TLS 1.2 and a regional endpoint.
# serverless.yml
api:
  handler: src/lambda.handler
  timeout: 30
  memorySize: 512
  events:
    - http:
        method: ANY
        path: /
    - http:
        method: ANY
        path: '{proxy+}'

telemetryWorker — Attempt Telemetry Persister

Handler: src/infrastructure/workers/telemetry-persister.handler
Trigger: SQS FIFO, AttemptStreamQueue (attempt-stream.fifo)
Timeout: 60 seconds
Memory: 256 MB

Consumes batches of up to 10 messages from the FIFO telemetry queue, persists raw keystroke events to MongoDB Atlas, and archives older records to S3. It uses ReportBatchItemFailures so that a single bad message does not cause the entire batch to be retried.
# serverless.yml
telemetryWorker:
  handler: src/infrastructure/workers/telemetry-persister.handler
  timeout: 60
  memorySize: 256
  events:
    - sqs:
        arn: {"Fn::GetAtt": ["AttemptStreamQueue", "Arn"]}
        batchSize: 10
        functionResponseType: ReportBatchItemFailures
Key behaviors:
  • Receives ordered, deduplicated messages because the source queue is FIFO with ContentBasedDeduplication: true.
  • Writes to the attempt_events MongoDB collection (see Data Model).
  • Sets archived_to_s3_at once records are offloaded to the cold-storage S3 prefix.
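
Because the source queue is FIFO, partial-batch reporting interacts with message ordering. AWS's guidance for FIFO event sources is to stop at the first failure and report that message plus every later message in the batch, so that later messages are not processed ahead of a failed earlier one. The following is a minimal sketch of that pattern, not the project's actual handler: `processFifoBatch` and the `persist` callback are hypothetical names, and the two interfaces are local stand-ins for the `SQSRecord` and `SQSBatchResponse` types from `aws-lambda`.

```typescript
// Local stand-ins for the SQSRecord / SQSBatchResponse types from "aws-lambda".
interface QueueRecord {
  messageId: string;
  body: string;
}

interface BatchResponse {
  batchItemFailures: { itemIdentifier: string }[];
}

// Hypothetical helper: processes records in order and, after the first
// failure, reports that record and every later record as failed.
async function processFifoBatch(
  records: QueueRecord[],
  persist: (record: QueueRecord) => Promise<void>,
): Promise<BatchResponse> {
  const batchItemFailures: BatchResponse["batchItemFailures"] = [];
  let stopped = false;

  for (const record of records) {
    if (!stopped) {
      try {
        await persist(record);
        continue; // persisted successfully, nothing to report
      } catch {
        stopped = true; // stop processing to preserve FIFO ordering
      }
    }
    // The failed record and everything after it go back to the queue.
    batchItemFailures.push({ itemIdentifier: record.messageId });
  }

  return { batchItemFailures };
}
```

With a three-message batch whose second message fails, this sketch persists only the first message and reports the second and third as failed.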

llmClassifierWorker — LLM Classification Worker

Handler: src/infrastructure/workers/llm-classifier.handler
Trigger: SQS Standard, LlmClassifyQueue
Timeout: 120 seconds
Memory: 512 MB

Polls up to 20 messages per invocation with a 60-second batching window to maximize prompt-cache hit rates when calling Anthropic Claude. It classifies attempts that the synchronous rule engine marked as UNCLASSIFIED and writes the results back to Postgres via Prisma. Messages that still fail after 3 receive attempts are moved to LlmClassifyDLQ.
# serverless.yml
llmClassifierWorker:
  handler: src/infrastructure/workers/llm-classifier.handler
  timeout: 120
  memorySize: 512
  events:
    - sqs:
        arn: {"Fn::GetAtt": ["LlmClassifyQueue", "Arn"]}
        batchSize: 20
        maximumBatchingWindow: 60
        functionResponseType: ReportBatchItemFailures
Key behaviors:
  • maximumBatchingWindow: 60 lets Lambda wait up to 60 seconds for a batch to fill before invoking the function (it invokes sooner if 20 messages arrive first), keeping batches dense for better prompt-cache performance.
  • Each classification audit is written to llm_classification_jobs in MongoDB (request tokens, cost, confidence per attempt).
  • 512 MB memory is required because Claude SDK client initialization and JSON marshalling of large batch responses are memory-intensive.
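
To make the interaction between batchSize and maximumBatchingWindow concrete, here is a small illustrative model of when Lambda hands a batch to the function: when the batch is full, when the window has elapsed, or when the payload reaches Lambda's 6 MB invocation limit. This is a simplified sketch for reasoning about the configuration only. `PendingBatch` and `shouldInvoke` are hypothetical names, and the byte value of the payload limit is an approximation.

```typescript
// Values taken from the llmClassifierWorker configuration above.
const BATCH_SIZE = 20;
const MAX_BATCHING_WINDOW_SECONDS = 60;
// Lambda's 6 MB invocation payload limit, approximated in bytes (assumption).
const MAX_PAYLOAD_BYTES = 6 * 1024 * 1024;

// Hypothetical description of the messages the poller has gathered so far.
interface PendingBatch {
  messageCount: number;
  secondsSinceFirstMessage: number;
  payloadBytes: number;
}

// Simplified model: invoke when any one of the three limits is reached.
function shouldInvoke(batch: PendingBatch): boolean {
  if (batch.messageCount === 0) return false; // nothing to deliver yet
  return (
    batch.messageCount >= BATCH_SIZE ||
    batch.secondsSinceFirstMessage >= MAX_BATCHING_WINDOW_SECONDS ||
    batch.payloadBytes >= MAX_PAYLOAD_BYTES
  );
}
```

Under this model a quiet queue yields small batches after the window elapses, while a busy queue yields full batches of 20 well before then.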

ocrWorker — Handwritten Math OCR Worker

Handler: src/infrastructure/workers/ocr-worker.handler
Trigger: S3 ObjectCreated:* on innova-backend-serverless-{stage}-ocr-uploads
Timeout: 60 seconds
Memory: 512 MB

Triggered whenever a student photo lands in the OCR upload bucket. The handler reads the S3 event record to obtain the bucket name and object key, then calls the Gemini Vision provider (with Anthropic as fallback) to transcribe handwritten math into structured LaTeX steps. Results are written to the ocr_jobs MongoDB collection and then enqueued on AttemptReprocessQueue.
# serverless.yml
ocrWorker:
  handler: src/infrastructure/workers/ocr-worker.handler
  timeout: 60
  memorySize: 512
  events:
    - s3:
        bucket: ${self:service}-${self:provider.stage}-ocr-uploads
        event: s3:ObjectCreated:*
        existing: true
Key behaviors:
  • existing: true tells Serverless Framework not to create the bucket — it is managed separately as OcrUploadsBucket in the resources block.
  • Each OCR job tracks primary_provider, used_fallback, cost_estimated_usd, and overall_confidence.
  • Photos with status: low_confidence_review are flagged for human review rather than auto-discarded.
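
Reading the bucket name and object key from the S3 event has one common pitfall: S3 URL-encodes the key in the notification, with spaces sent as "+". A minimal sketch of the extraction step follows, assuming a hypothetical `extractUploads` helper and a local `S3EventLike` interface that mirrors only the fields used here (the full type is `S3Event` in `aws-lambda`). The project's real handler may differ.

```typescript
// Minimal local shape of the S3 notification fields used below.
interface S3EventLike {
  Records: {
    s3: {
      bucket: { name: string };
      object: { key: string };
    };
  }[];
}

interface UploadLocation {
  bucket: string;
  key: string;
}

// Hypothetical helper: returns one bucket/key pair per event record.
function extractUploads(event: S3EventLike): UploadLocation[] {
  return event.Records.map((record) => ({
    bucket: record.s3.bucket.name,
    // Keys arrive URL-encoded: "+" stands for a space, %XX for other characters.
    key: decodeURIComponent(record.s3.object.key.replace(/\+/g, " ")),
  }));
}
```

Decoding before calling GetObject avoids "key not found" errors for uploads whose filenames contain spaces or parentheses.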

alertGenerator — Scheduled Teacher Alert Generator

Handler: src/infrastructure/workers/alert-generator.handler
Trigger: EventBridge schedule, rate(1 hour)
Timeout: 60 seconds
Memory: 256 MB

Runs every hour, scanning all non-archived courses to generate TeacherAlert records from mastery data. It uses @prisma/adapter-pg directly (no NestJS DI) to keep cold-start latency minimal for a cron function. Alerts are deduplicated per teacher/type/topic/day so the teacher dashboard is not flooded with repeats.
# serverless.yml
alertGenerator:
  handler: src/infrastructure/workers/alert-generator.handler
  timeout: 60
  memorySize: 256
  events:
    - schedule: rate(1 hour)
Alert types generated:
| Alert Type | Trigger | Severity |
| --- | --- | --- |
| AT_RISK_STUDENT | pKnown < 0.4 in ≥ 2 active topics | HIGH |
| STUDENT_DROP | No attempts in the last 3 days | MED |
| UNIT_OFF_TRACK | Course-wide average pKnown < 0.3 in a unit | HIGH |
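
The thresholds in the table, together with the per teacher/type/topic/day deduplication, can be sketched as pure functions. This is an illustration under stated assumptions rather than the worker's actual code: the `StudentSnapshot` shape, all function names, the reading of "last 3 days" as more than 72 hours, and the use of the UTC calendar day in the deduplication key are hypothetical.

```typescript
type AlertType = "AT_RISK_STUDENT" | "STUDENT_DROP" | "UNIT_OFF_TRACK";
type Severity = "HIGH" | "MED";

interface Alert {
  type: AlertType;
  severity: Severity;
}

// Hypothetical input shape; the real mastery schema is not shown on this page.
interface StudentSnapshot {
  activeTopicPKnown: number[]; // one pKnown value per active topic
  lastAttemptAt: Date;
}

const DAY_MS = 24 * 60 * 60 * 1000;

// Per-student rules from the table.
function studentAlerts(student: StudentSnapshot, now: Date): Alert[] {
  const alerts: Alert[] = [];
  const weakTopics = student.activeTopicPKnown.filter((p) => p < 0.4).length;
  if (weakTopics >= 2) {
    alerts.push({ type: "AT_RISK_STUDENT", severity: "HIGH" });
  }
  // Assumption: "no attempts in the last 3 days" means the last attempt
  // is more than 72 hours old.
  if (now.getTime() - student.lastAttemptAt.getTime() > 3 * DAY_MS) {
    alerts.push({ type: "STUDENT_DROP", severity: "MED" });
  }
  return alerts;
}

// Unit rule: course-wide average pKnown below 0.3.
function isUnitOffTrack(unitPKnown: number[]): boolean {
  if (unitPKnown.length === 0) return false;
  const mean = unitPKnown.reduce((sum, p) => sum + p, 0) / unitPKnown.length;
  return mean < 0.3;
}

// One key per teacher/type/topic/day; the UTC day is an assumption.
function dedupKey(teacherId: string, type: AlertType, topicId: string, day: Date): string {
  return [teacherId, type, topicId, day.toISOString().slice(0, 10)].join(":");
}
```

Storing such a key under a unique constraint, or checking for it before inserting, is one way an hourly job could avoid creating the same alert 24 times a day.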

attemptReprocessWorker — OCR-to-Attempt Reprocess Worker

Handler: src/infrastructure/workers/attempt-reprocess.handler
Trigger: SQS Standard, AttemptReprocessQueue
Timeout: 60 seconds
Memory: 256 MB

Converts the structured OCR/grading output produced by innova-ai-engine back into a full Attempt record in Postgres. It boots a cached NestJS application context (via NestFactory.createApplicationContext) so the full AttemptReprocessWorker service with its DI dependencies is available across warm invocations. Per-message failures are reported individually via ReportBatchItemFailures, with a DLQ after 5 failed receive attempts.
# serverless.yml
attemptReprocessWorker:
  handler: src/infrastructure/workers/attempt-reprocess.handler
  timeout: 60
  memorySize: 256
  events:
    - sqs:
        arn: {"Fn::GetAtt": ["AttemptReprocessQueue", "Arn"]}
        batchSize: 10
        functionResponseType: ReportBatchItemFailures
Key behaviors:
  • Parses AttemptReprocessMessage from each SQS record body.
  • If guide_submission_id is present, creates a new Attempt with inputMode='PHOTO_GUIDE' (ADR-120/121).
  • Legacy OCR-loop messages carry only attempt_id and omit guide fields — both shapes are handled.
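
The two message shapes described above can be told apart by the presence of guide_submission_id. The sketch below shows one way to branch on them. It is an assumption-laden illustration: `planReprocess` and `ReprocessPlan` are hypothetical names, only the two fields named in this section are inspected, and the real AttemptReprocessMessage type likely carries more fields.

```typescript
// Hypothetical result type describing which path a message should take.
type ReprocessPlan =
  | { kind: "guide"; guideSubmissionId: string; inputMode: "PHOTO_GUIDE" }
  | { kind: "legacy"; attemptId: string };

// Parses an SQS record body and decides between the guide and legacy paths.
function planReprocess(body: string): ReprocessPlan {
  const parsed: unknown = JSON.parse(body);
  if (typeof parsed !== "object" || parsed === null) {
    throw new Error("message body is not a JSON object");
  }
  const message = parsed as Record<string, unknown>;

  const guideSubmissionId = message["guide_submission_id"];
  if (typeof guideSubmissionId === "string") {
    // Guide flow: a new Attempt is created with inputMode 'PHOTO_GUIDE'.
    return { kind: "guide", guideSubmissionId, inputMode: "PHOTO_GUIDE" };
  }

  const attemptId = message["attempt_id"];
  if (typeof attemptId === "string") {
    // Legacy OCR-loop flow: only attempt_id is present.
    return { kind: "legacy", attemptId };
  }

  throw new Error("message has neither guide_submission_id nor attempt_id");
}
```

Throwing on an unrecognized shape lets the surrounding handler report the record in batchItemFailures, so malformed messages eventually reach the DLQ instead of being silently dropped.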

Cold Start Considerations

NestJS dependency injection initialization adds approximately 200–400 ms to cold starts on top of the standard Lambda container boot time. The following mitigations are in place:
1. Cached application context

attemptReprocessWorker (the heaviest worker) caches the result of NestFactory.createApplicationContext in a module-level variable. Subsequent warm invocations skip the NestJS boot entirely.

2. serverless-express bridge

The api function uses @vendia/serverless-express to adapt API Gateway events to the NestJS HTTP adapter. The Express app is initialized once per container lifetime.

3. Prisma serverless adapter

All functions that use Prisma instantiate the client with @prisma/adapter-pg pointing to Supabase’s transaction pooler (:6543) with ?pgbouncer=true&connection_limit=1 to avoid exhausting connection pool slots across concurrent Lambda containers.

4. Lightweight cron handler

alertGenerator intentionally avoids NestJS DI and creates a bare PrismaClient directly, reducing cold-start overhead for a function that runs on a schedule and has no HTTP latency budget.
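
The module-level caching used by attemptReprocessWorker and the api function can be sketched generically. In the example below, `cacheOnce`, `getContext`, and the fake boot function are hypothetical stand-ins: in the real worker the boot step would be the NestFactory.createApplicationContext call, which is not reproduced here so that the sketch stays self-contained.

```typescript
// Wraps an async boot function so it runs at most once per container.
function cacheOnce<T>(boot: () => Promise<T>): () => Promise<T> {
  let cached: Promise<T> | undefined;
  return () => {
    if (cached === undefined) {
      cached = boot().catch((err) => {
        cached = undefined; // a failed boot is not cached, so the next call retries
        throw err;
      });
    }
    return cached;
  };
}

let bootCount = 0;

// Module-level state survives between warm invocations of the same container.
const getContext = cacheOnce(async () => {
  bootCount += 1; // stands in for the expensive NestJS application boot
  return { bootedAt: Date.now() };
});

async function handler(): Promise<number> {
  const context = await getContext(); // boots on cold start, reused when warm
  return context.bootedAt;
}
```

Caching the promise rather than the resolved value means that two invocations racing during a cold start share a single boot instead of starting two.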

Error Handling

All SQS-triggered workers use functionResponseType: ReportBatchItemFailures. Each worker's handler returns an SQSBatchResponse with a batchItemFailures array that lists only the failed message IDs. Lambda deletes the other messages in the batch from the queue, while the failed ones become visible again after the visibility timeout and are retried. Successful messages in the same batch are not retried.
// Pattern used in attemptReprocessWorker
const batchItemFailures: SQSBatchResponse['batchItemFailures'] = [];

for (const record of event.Records) {
  try {
    const message = JSON.parse(record.body) as AttemptReprocessMessage;
    await worker.processMessage(message);
  } catch (err) {
    logger.error(`Failed to process record ${record.messageId}: ${err instanceof Error ? err.message : String(err)}`);
    batchItemFailures.push({ itemIdentifier: record.messageId });
  }
}

return { batchItemFailures };
Dead letter queue configuration:
| Queue | maxReceiveCount | DLQ Retention |
| --- | --- | --- |
| LlmClassifyQueue | 3 | 14 days |
| AttemptReprocessQueue | 5 | 14 days |

esbuild decorator bundling: Serverless Framework bundles all functions with serverless-esbuild. NestJS relies on TypeScript decorator metadata (design:paramtypes) for its dependency injection container, but esbuild does not emit this metadata on its own. A custom plugin, esbuild-decorators.cjs, is wired in via custom.esbuild.plugins to preserve it. Without this plugin, NestJS DI would resolve constructor parameters as undefined at runtime. The keepNames: true flag is also set so that esbuild preserves the class names NestJS uses for token resolution.
custom:
  esbuild:
    bundle: true
    keepNames: true
    plugins: ./esbuild-decorators.cjs
