System Control API: Start, Stop, and Self-Test

The system control endpoints manage the lifecycle of local TrinaxAI services — Ollama and the RAG API process — and provide an automated self-test for CI/CD and diagnostic workflows. All endpoints on this page are protected.

These endpoints are destructive — they can shut down the AI backend entirely or trigger a full factory reset. Only call them from trusted clients (the TrinaxAI PWA, CLI, or your own automation). Never expose port 3333 to the public internet without an admin token.

All /system/* endpoints require authorization: either a call from localhost / private LAN (when TRINAXAI_ALLOW_LAN_SYSTEM is not disabled) or an X-Admin-Token header matching the TRINAXAI_ADMIN_TOKEN environment variable. If a token is provided but incorrect, the request is rejected with HTTP 403 regardless of the client IP.
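The rules above can be sketched as a small decision function. This is an illustrative reconstruction, not the actual TrinaxAI code: the function name `authorize`, its parameters, and the 403 returned for tokenless requests from public addresses are assumptions. It also assumes that disabling TRINAXAI_ALLOW_LAN_SYSTEM turns off every tokenless path, including localhost.

```python
import hmac
import ipaddress
from typing import Optional


def authorize(client_ip: str,
              token: Optional[str],
              admin_token: Optional[str],
              allow_lan: bool = True) -> int:
    """Return 200 if the request may proceed, 403 otherwise (hypothetical helper)."""
    if token is not None:
        # A supplied token must match, regardless of where the call comes from.
        if admin_token and hmac.compare_digest(token.encode(), admin_token.encode()):
            return 200
        return 403

    # No token: fall back to the network-based rule.
    addr = ipaddress.ip_address(client_ip)
    if allow_lan and (addr.is_loopback or addr.is_private):
        return 200

    # Assumption: tokenless calls from public addresses are rejected with 403.
    return 403
```

Note the ordering: the token check comes first, so a wrong token from 127.0.0.1 is still rejected, which matches the behavior described above.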

Authentication

# Localhost — no token needed by default
curl -X POST http://localhost:3333/system/reload

# Remote LAN or production — use X-Admin-Token header
curl -X POST http://192.168.1.100:3333/system/reload \
  -H "X-Admin-Token: your-admin-token"
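For automation written in Python rather than shell, the same calls can be made with the standard library. The sketch below is a minimal client, assuming the endpoints return JSON bodies like the examples on this page; the helper names `build_system_request` and `call_system` and the 90-second default timeout are illustrative choices, not part of TrinaxAI.

```python
import json
import urllib.request
from typing import Optional


def build_system_request(base_url: str, path: str,
                         admin_token: Optional[str] = None) -> urllib.request.Request:
    """Build a POST request for a /system/* endpoint (hypothetical helper)."""
    headers = {"Accept": "application/json"}
    if admin_token:
        # Only needed for remote or production calls.
        headers["X-Admin-Token"] = admin_token
    url = base_url.rstrip("/") + path
    return urllib.request.Request(url, headers=headers, method="POST")


def call_system(base_url: str, path: str,
                admin_token: Optional[str] = None,
                timeout: float = 90.0) -> dict:
    """Send the request and decode the JSON response body."""
    req = build_system_request(base_url, path, admin_token)
    # urlopen raises urllib.error.HTTPError on 4xx/5xx, e.g. 403 for a bad token.
    with urllib.request.urlopen(req, timeout=timeout) as resp:
        return json.loads(resp.read().decode("utf-8"))
```

A call such as `call_system("http://192.168.1.100:3333", "/system/reload", "your-admin-token")` mirrors the second curl command above.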

POST /system/shutdown

Initiate a graceful shutdown of the AI backend (Ollama + RAG API server). The shutdown is delegated to service_manager.py and runs in a new detached process so the API can return a response before it exits.

After calling this endpoint the RAG API will be unavailable. The PWA static frontend remains accessible for restart. Use POST /system/startup to bring services back online.

Response

ok

boolean

Always true when the shutdown has been initiated.

output

string

Human-readable status message. Example: "AI shutdown initiated. The PWA remains available for restart."

Example

curl -X POST http://localhost:3333/system/shutdown \
  -H "X-Admin-Token: your-admin-token"

Response

{
  "ok": true,
  "output": "AI shutdown initiated. The PWA remains available for restart."
}

POST /system/startup

Start Ollama and the RAG API. The endpoint invokes service_manager.py start-ai synchronously, waits up to 60 seconds for it to finish, and returns the result.

Response

ok

boolean

true if service_manager.py exited with code 0. false if startup failed.

output

string

Combined stdout from the service manager script.

error

string

Combined stderr from the service manager script. Empty on success.

Example

curl -X POST http://localhost:3333/system/startup \
  -H "X-Admin-Token: your-admin-token"

Response

{
  "ok": true,
  "output": "Starting Ollama... OK\nStarting RAG API... OK\n",
  "error": ""
}
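The mapping from the service manager's exit status to the response fields can be sketched as follows. This is an assumed reconstruction of the behavior described for /system/startup, not the server's actual code: the function name `run_service_command` is hypothetical, and the shape of the response on timeout is a guess, since this page does not document it.

```python
import subprocess
import sys
from typing import Sequence


def run_service_command(cmd: Sequence[str], timeout: float = 60.0) -> dict:
    """Run a command synchronously and map its result to ok/output/error."""
    try:
        proc = subprocess.run(list(cmd), capture_output=True, text=True,
                              timeout=timeout)
    except subprocess.TimeoutExpired:
        # Assumption: a timeout is reported as a failure with an error message.
        return {"ok": False, "output": "",
                "error": f"timed out after {timeout:.0f} seconds"}
    return {
        "ok": proc.returncode == 0,  # true only for exit code 0
        "output": proc.stdout,
        "error": proc.stderr,        # empty on success
    }


if __name__ == "__main__":
    # Assumes service_manager.py is in the current working directory.
    print(run_service_command([sys.executable, "service_manager.py", "start-ai"]))
```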

POST /system/stop-all

Stop all TrinaxAI services immediately — both the AI backend and any supporting processes. Equivalent to a hard stop, not a graceful drain. The request is dispatched to service_manager.py stop-all in a detached process.

Unlike /system/shutdown, this terminates every managed process immediately without waiting for in-flight requests to complete.
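Both /system/shutdown and /system/stop-all hand their work to a detached process so the HTTP response can be sent before the API itself goes down. One common way to do this in Python is shown below; it is a sketch of the general technique, not the project's implementation, and the helper name `spawn_detached` is hypothetical. `start_new_session=True` only takes effect on POSIX systems.

```python
import subprocess
import sys
from typing import Sequence


def spawn_detached(cmd: Sequence[str]) -> int:
    """Start a command without waiting for it and return its PID."""
    proc = subprocess.Popen(
        list(cmd),
        stdin=subprocess.DEVNULL,
        stdout=subprocess.DEVNULL,
        stderr=subprocess.DEVNULL,
        # POSIX: the child gets its own session, so it is not tied to the
        # parent's process group when the parent is stopped.
        start_new_session=True,
    )
    return proc.pid


if __name__ == "__main__":
    # Assumes service_manager.py is in the current working directory.
    pid = spawn_detached([sys.executable, "service_manager.py", "stop-all"])
    print({"ok": True, "output": f"stop-all dispatched (pid {pid})"})
```

Because the caller never waits on the child, it cannot report whether the stop succeeded, which is consistent with ok always being true for these two endpoints.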

Response

ok

boolean

Always true when stop-all has been initiated.

output

string

Status message.

Example

curl -X POST http://localhost:3333/system/stop-all \
  -H "X-Admin-Token: your-admin-token"

Response

{
  "ok": true,
  "output": "Full TrinaxAI shutdown initiated."
}

POST /system/self-test

Run an automated health check suite covering Ollama connectivity, embedding model functionality, and RAG query execution. Useful for diagnostics from the PWA settings panel, post-deploy CI/CD validation, or troubleshooting a broken installation. The test suite performs four checks in order:

ollama — Calls GET /api/tags on the configured Ollama instance and checks that at least one model is loaded.
embedding — Calls get_text_embedding("TrinaxAI system test") on the active embedding model and verifies a non-empty vector is returned.
rag_indexed — Checks whether the in-memory hybrid retriever (_fusion_retriever) is loaded.
rag_query — If both Ollama and the index are available, runs a minimal retrieval query ("test") and checks that at least one node is returned.
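The ordering and the dependency of rag_query on the earlier checks can be sketched like this. The individual checks are passed in as callables because their real implementations live inside the API; `run_self_test` and `_safe` are hypothetical names, and treating an exception as a failed check is an assumption. The profile field is left out for brevity.

```python
from typing import Callable, Dict

Check = Callable[[], bool]


def _safe(check: Check) -> bool:
    """Run one check; any exception counts as a failure (assumption)."""
    try:
        return bool(check())
    except Exception:
        return False


def run_self_test(ollama: Check, embedding: Check,
                  rag_indexed: Check, rag_query: Check) -> dict:
    """Run the four checks in order and aggregate them into ok/results."""
    results: Dict[str, bool] = {}
    results["ollama"] = _safe(ollama)
    results["embedding"] = _safe(embedding)
    results["rag_indexed"] = _safe(rag_indexed)

    # rag_query only runs when Ollama and the index are both available;
    # otherwise it is reported as false without being attempted.
    if results["ollama"] and results["rag_indexed"]:
        results["rag_query"] = _safe(rag_query)
    else:
        results["rag_query"] = False

    return {"ok": all(results.values()), "results": results}
```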

Response

ok

boolean

true only if all four checks passed.

results

object

Individual check outcomes.

ollama

boolean

true if Ollama is reachable and has at least one model loaded.

embedding

boolean

true if the embedding model returned a valid vector.

rag_indexed

boolean

true if the in-memory index is loaded and the hybrid retriever is ready.

rag_query

boolean

true if a test retrieval query returned at least one result node.

profile

string

The active TRINAXAI_PROFILE value (e.g. "16gb", "max", "8gb").

Example

curl -X POST http://localhost:3333/system/self-test \
  -H "X-Admin-Token: your-admin-token"

All checks passing

{
  "ok": true,
  "results": {
    "ollama": true,
    "embedding": true,
    "rag_indexed": true,
    "rag_query": true
  },
  "profile": "16gb"
}

Partial failure (index not yet built)

{
  "ok": false,
  "results": {
    "ollama": true,
    "embedding": true,
    "rag_indexed": false,
    "rag_query": false
  },
  "profile": "16gb"
}

If rag_indexed is false, run python index.py on the host (or use POST /system/index-upload) to build the initial index, then call POST /system/reload to load it into the running API.
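In a CI/CD job, the self-test response can be turned into a pass/fail exit code. The sketch below works on an already-decoded response body and assumes only the fields documented above; the helper names `failed_checks` and `exit_code` are illustrative.

```python
from typing import List


def failed_checks(report: dict) -> List[str]:
    """Return the names of checks that did not pass, in response order."""
    return [name for name, passed in report.get("results", {}).items()
            if not passed]


def exit_code(report: dict) -> int:
    """0 when the self-test passed, 1 otherwise (suitable for sys.exit)."""
    return 0 if report.get("ok") is True and not failed_checks(report) else 1


if __name__ == "__main__":
    # Sample payload copied from the partial-failure example above.
    report = {
        "ok": False,
        "results": {"ollama": True, "embedding": True,
                    "rag_indexed": False, "rag_query": False},
        "profile": "16gb",
    }
    print("failed:", failed_checks(report))
    raise SystemExit(exit_code(report))
```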
