InfoJobs DevBoard integrates a local AI feature that lets users generate a concise summary of any job listing on demand. When a user opens a job detail page and clicks the ✨ Generar resumen con IA button, the React frontend calls the Express backend, which proxies the request to a locally running Ollama server. Ollama streams the response back chunk by chunk so the summary appears in real time, with no cloud API, no usage billing, and no internet connection required.
How it works
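The request travels through several layers before text appears on screen: the React frontend calls the Express backend, the backend forwards the request to Ollama, and the generated text streams back along the same path.
On the frontend, the useAISummary hook manages loading, error, and accumulated summary state. Each chunk received from the stream is appended to the previous value, so React re-renders progressively as the model generates text rather than waiting for the full response.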
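The accumulation step described above can be sketched without React. This is a minimal illustration, not the project's actual hook: readSummaryStream and onUpdate are hypothetical names, and the response is assumed to be a standard Fetch Response whose body is a ReadableStream of bytes.

```javascript
// Hypothetical helper (not from the project): reads a streamed response and
// reports the accumulated text after every chunk.
async function readSummaryStream(response, onUpdate) {
  const reader = response.body.getReader();
  const decoder = new TextDecoder();
  let summary = "";

  while (true) {
    const { done, value } = await reader.read();
    if (done) break;
    // stream: true keeps multi-byte characters intact when they are
    // split across two chunks.
    summary += decoder.decode(value, { stream: true });
    onUpdate(summary);
  }

  // Flush any bytes the decoder was still holding back.
  const tail = decoder.decode();
  if (tail) {
    summary += tail;
    onUpdate(summary);
  }
  return summary;
}
```

Inside a hook, onUpdate would typically be a state setter, so each call triggers one progressive re-render with the text received so far.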
Key design decisions
- Local model: No API keys, no per-request costs, works fully offline. Earlier iterations explored Vercel AI Gateway (requires a credit card), Google Gemini (free quota exhausted), and Puter.js (requires user authentication); Ollama was chosen as the practical, cost-free alternative.
- Streaming: The backend sends chunks via chunked transfer encoding as Ollama produces them, and the frontend reads each chunk with the Fetch ReadableStream API. This gives users immediate visual feedback instead of a long blank wait.
- Rate limiting: The /ai router applies express-rate-limit at 5 requests per minute per IP to prevent abuse of the local compute resource.
- Model used: qwen2.5:3b (approximately 2 GB on disk, around 4 GB of RAM required). Larger or smaller models can be swapped in by changing a single line in backend/routes/ai.js.
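To make the streaming and model-swap points concrete, here is a rough sketch of how a backend could consume Ollama's output. It rests on assumptions the page does not confirm: that the backend calls Ollama's HTTP endpoint /api/generate directly (rather than through a client library), that it runs on Node 18+ with global fetch, and that the helper names parseOllamaLines and streamSummary are purely illustrative. With stream enabled, that endpoint returns newline-delimited JSON objects that each carry a response text fragment.

```javascript
// Assumption: direct HTTP access to Ollama; the project may use a client library.
const OLLAMA_URL = "http://localhost:11434/api/generate";
const MODEL = "qwen2.5:3b"; // the single line to change when swapping models

// Splits buffered NDJSON into text fragments and returns the trailing
// partial line (if any) so it can be completed by the next chunk.
function parseOllamaLines(buffer) {
  const lines = buffer.split("\n");
  const rest = lines.pop(); // last element may be an incomplete line
  const texts = [];
  for (const line of lines) {
    if (!line.trim()) continue;
    const message = JSON.parse(line);
    if (message.response) texts.push(message.response);
  }
  return { texts, rest };
}

// Async generator yielding text fragments as Ollama produces them.
async function* streamSummary(systemPrompt, userPrompt) {
  const res = await fetch(OLLAMA_URL, {
    method: "POST",
    headers: { "Content-Type": "application/json" },
    body: JSON.stringify({
      model: MODEL,
      system: systemPrompt,
      prompt: userPrompt,
      stream: true,
    }),
  });
  if (!res.ok) throw new Error(`Ollama responded with ${res.status}`);

  const decoder = new TextDecoder();
  let buffer = "";
  for await (const chunk of res.body) {
    buffer += decoder.decode(chunk, { stream: true });
    const { texts, rest } = parseOllamaLines(buffer);
    buffer = rest;
    yield* texts;
  }
}
```

An Express handler could then write each yielded fragment with res.write and finish with res.end, which is what produces chunked transfer encoding when no Content-Length header is set.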
Prompt template
The backend constructs a user prompt from the job’s stored fields and sends it to Ollama. A separate systemPrompt variable constrains the model to produce only a job summary in Spanish:
System prompt:
User prompt (fields separated by \n, populated from the job record):
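As a purely illustrative sketch of this two-part prompt, the snippet below builds a newline-separated user prompt from a job record. Everything specific in it is an assumption: the field names (title, company, location, description), the Spanish labels, the buildUserPrompt helper, and the wording of systemPrompt are hypothetical and may differ from what backend/routes/ai.js actually contains.

```javascript
// Hypothetical wording; only the variable name systemPrompt and the
// "job summary in Spanish" constraint come from the text.
const systemPrompt =
  "Eres un asistente que resume ofertas de trabajo. " +
  "Responde únicamente con un resumen breve en español.";

// Hypothetical field names; the real job record may use different ones.
function buildUserPrompt(job) {
  return [
    `Título: ${job.title}`,
    `Empresa: ${job.company}`,
    `Ubicación: ${job.location}`,
    `Descripción: ${job.description}`,
  ].join("\n");
}
```

Keeping the constraint in a separate system prompt means the user prompt carries only job data, so the instructions stay fixed regardless of what a listing contains.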
Ollama must be running at localhost:11434 before this feature will work. If it is not running, the backend will return a 500 error. See the Ollama Setup guide to get started.
Ollama Setup
Install Ollama, pull the qwen2.5:3b model, and start the local AI server.
Streaming
How the backend and frontend handle chunked transfer encoding and the Fetch ReadableStream API.