All of Sherpa’s runtime behaviour is controlled through environment variables. There are no config files other than the .env family — every tunable is a single flat variable. This page lists every variable grouped by concern, together with its default value and purpose.

Where configuration lives

| Environment | File | Loaded by |
| --- | --- | --- |
| Local development | .env (repository root) | scripts/run-api.sh via set -a; . "$ENV_FILE"; set +a |
| Production | /etc/sherpa/sherpa.env | scripts/run-api.sh when SHERPA_ENV_FILE=/etc/sherpa/sherpa.env is set |

scripts/run-api.sh reads the file pointed to by SHERPA_ENV_FILE. If that variable is unset, it falls back to .env in the repository root.
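
The fallback rule above can be sketched in Python. This is an illustration only, not the logic of scripts/run-api.sh itself (which is a shell script); the function name `resolve_env_file` is hypothetical, and treating an empty SHERPA_ENV_FILE the same as an unset one is an assumption.

```python
from pathlib import Path


def resolve_env_file(env, repo_root):
    """Return the env file path that would be loaded.

    Hypothetical helper: SHERPA_ENV_FILE wins when set, otherwise fall
    back to .env in the repository root.
    """
    explicit = env.get("SHERPA_ENV_FILE", "")
    if explicit:  # assumption: an empty value counts as unset
        return Path(explicit)
    return Path(repo_root) / ".env"
```

With this sketch, `resolve_env_file(os.environ, "/opt/sherpa/current")` would return /etc/sherpa/sherpa.env under the systemd unit and the repository-root .env on a developer machine.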
Use .env.production.example as a starting point for your production config file:
sudo cp /opt/sherpa/current/.env.production.example /etc/sherpa/sherpa.env
sudo editor /etc/sherpa/sherpa.env
Replace every CHANGE_ME_… placeholder before starting the service. Never commit the real production file — it contains secrets.

Store connections

These variables configure the three Docker-hosted data stores. Sherpa resolves the PostgreSQL connection through a priority chain: SHERPA_PG_DSN → DATABASE_URL → the PGHOST / PGPORT / PGDATABASE / PGUSER / PGPASSWORD environment variables.

| Variable | Default | Description |
| --- | --- | --- |
| DATABASE_URL | postgresql://sherpa:sherpa_dev@localhost:5432/sherpa | PostgreSQL DSN. Used if SHERPA_PG_DSN is not set. |
| SHERPA_PG_DSN | | Preferred PostgreSQL DSN. Takes priority over DATABASE_URL when set. |
| POSTGRES_PASSWORD | sherpa_dev | Password used by the Docker Compose PostgreSQL container when first initialised. Must match the password in DATABASE_URL / SHERPA_PG_DSN. |
| NEO4J_URI | bolt://localhost:7687 | Neo4j bolt connection URI. |
| NEO4J_USER | neo4j | Neo4j username. |
| NEO4J_PASSWORD | sherpa_dev | Neo4j password. |
| ES_URL | http://localhost:9200 | Elasticsearch base URL. The bundled Elasticsearch image has security disabled; keep this on localhost and do not expose port 9200 externally. |
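
The PostgreSQL priority chain described above might look roughly like the following. This is a minimal sketch, not Sherpa's actual resolver: the function name `resolve_pg_dsn` is hypothetical, the way the PG* variables are assembled into a URL is an assumption, and the `localhost` / `5432` fallbacks are assumed (they mirror common PostgreSQL client defaults, not documented Sherpa behaviour).

```python
from urllib.parse import quote


def resolve_pg_dsn(env):
    """Pick a PostgreSQL DSN following SHERPA_PG_DSN > DATABASE_URL > PG*."""
    for name in ("SHERPA_PG_DSN", "DATABASE_URL"):
        value = env.get(name, "").strip()
        if value:  # assumption: blank values are treated as unset
            return value

    # Last resort: build a DSN from the individual PG* variables.
    host = env.get("PGHOST", "localhost")  # assumed default
    port = env.get("PGPORT", "5432")       # assumed default
    database = env.get("PGDATABASE", "")
    user = quote(env.get("PGUSER", ""), safe="")
    password = quote(env.get("PGPASSWORD", ""), safe="")

    auth = user
    if password:
        auth = f"{user}:{password}"
    if auth:
        auth += "@"
    return f"postgresql://{auth}{host}:{port}/{database}"
```

Percent-encoding the user and password keeps characters such as `@` or `:` in a secret from corrupting the URL.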

AI providers

AI provider settings are defaults; individual users can override the model and provider from the settings UI. Only plain text is ever sent to external providers — files are never uploaded.

| Variable | Default | Description |
| --- | --- | --- |
| OPENAI_API_KEY | | OpenAI API key. Required for OpenAI and Codex agents. Use a project-scoped key with a budget limit rather than an organisation root key. |
| OLLAMA_URL | http://localhost:11434 | Base URL for a local Ollama server. Used when a user selects a local LLM model. |
| GEMINI_API_KEY | | Google Gemini API key. Required for Gemini agents. |
| SHERPA_AGENT | heuristic | Default agent used when no per-user agent preference is stored. Accepted values: heuristic, openai, gemini, ollama, codex. User-level settings (set in the UI) take priority over this variable. |
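
To illustrate how the per-user preference and SHERPA_AGENT interact, here is a small sketch. The function name `resolve_agent` is hypothetical, and raising an error on an unrecognised value is an assumption; the page does not say how Sherpa handles invalid values.

```python
ACCEPTED_AGENTS = ("heuristic", "openai", "gemini", "ollama", "codex")


def resolve_agent(user_preference, env):
    """Return the agent to use: stored user preference first, then SHERPA_AGENT."""
    if user_preference:
        candidate = user_preference
    else:
        # Documented default when the variable is not set.
        candidate = env.get("SHERPA_AGENT") or "heuristic"

    if candidate not in ACCEPTED_AGENTS:
        # Assumption: invalid values are rejected rather than silently replaced.
        raise ValueError(f"unknown agent: {candidate!r}")
    return candidate
```

For example, `resolve_agent(None, {"SHERPA_AGENT": "ollama"})` yields `"ollama"`, while a user who picked `"gemini"` in the settings UI gets `"gemini"` regardless of the environment.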

Authentication

SHERPA_ADMIN_PASSWORD is used only to bootstrap the initial admin account when the database has no password hash for that user. Set it before first startup, then change the admin password through the user management UI. The variable can be left unset (or removed) after the first login; subsequent startups will use the hashed password stored in PostgreSQL.

| Variable | Default | Description |
| --- | --- | --- |
| SHERPA_AUTH_ENABLED | | Set to 1 to require login for all API endpoints and the web UI. When unset, Sherpa runs in compatibility mode (a synthetic admin user is assumed). Always set to 1 in production. |
| SHERPA_AUTH_DISABLED | | Set to 1 to force compatibility mode (no login), overriding SHERPA_AUTH_ENABLED. Useful for local experiments. Do not set in production. |
| SHERPA_ADMIN_PASSWORD | | Initial password for the admin account. Used only during bootstrap when no password hash exists in the database. |
| SHERPA_SESSION_DAYS | 7 | Number of days a session cookie (sherpa_session) remains valid. The database stores only a hash of the session token, not the token itself. |
| SHERPA_AUDIT_IP_SALT | | A random string used to hash IP addresses before they are written to the audit log. Set to a long random value to make IP addresses in logs non-reversible. |
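
The interaction between the two auth switches can be summarised in a few lines. This is a sketch of the rule as stated in the table, not Sherpa's source; the function name `login_required` is hypothetical, and comparing against the exact string `"1"` is an assumption about how the flags are parsed.

```python
def login_required(env):
    """Return True when login is enforced, False in compatibility mode."""
    # SHERPA_AUTH_DISABLED=1 always wins and forces compatibility mode.
    if env.get("SHERPA_AUTH_DISABLED") == "1":
        return False
    # Otherwise login is required only when explicitly enabled.
    return env.get("SHERPA_AUTH_ENABLED") == "1"
```

A production deployment would therefore want `SHERPA_AUTH_ENABLED=1` present and `SHERPA_AUTH_DISABLED` absent from /etc/sherpa/sherpa.env.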

Workspace and file uploads

Personal workspace files are stored per-user under SHERPA_USERS_DIR/{uid}/workspace/. They are searched only by grep; they are never indexed in Elasticsearch or Neo4j and never appear as RAG citations.

| Variable | Default | Description |
| --- | --- | --- |
| SHERPA_WORKSPACE_MAX_BYTES | 10485760 (10 MiB) | Maximum size of a single uploaded file in bytes. Uploads exceeding this limit are rejected with HTTP 413. |
| SHERPA_WORKSPACE_TTL_DAYS | 90 | Number of days before a workspace file expires and is eligible for cleanup. Set to 0 for no expiry. |
| SHERPA_USERS_DIR | ./data/users | Root directory for personal workspace files. In production this should be an absolute path (e.g. /srv/sherpa/users). |
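
The size limit and TTL rules above can be expressed as two small checks. This is an illustrative sketch under stated assumptions: the function names are hypothetical, and whether a file expires exactly at the TTL boundary or only after it is not specified on this page (the sketch treats "at or past the TTL" as expired).

```python
from datetime import datetime, timedelta, timezone

DEFAULT_MAX_BYTES = 10485760  # documented default
DEFAULT_TTL_DAYS = 90         # documented default


def upload_rejected(size_bytes, env):
    """True when an upload exceeds the limit (the API answers HTTP 413)."""
    limit = int(env.get("SHERPA_WORKSPACE_MAX_BYTES", DEFAULT_MAX_BYTES))
    return size_bytes > limit


def is_expired(uploaded_at, now, env):
    """True when a workspace file is eligible for cleanup."""
    ttl_days = int(env.get("SHERPA_WORKSPACE_TTL_DAYS", DEFAULT_TTL_DAYS))
    if ttl_days == 0:
        return False  # 0 means no expiry
    # Assumption: a file expires once its age reaches the TTL.
    return now - uploaded_at >= timedelta(days=ttl_days)
```

Note that a file of exactly 10485760 bytes passes the default limit, since only uploads exceeding it are rejected.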

Paths and runtime

Sherpa refuses to start when SHERPA_USE_FIXTURES=1 is set together with SHERPA_ENV=production. This is intentional fail-closed behaviour: fixture data (synthetic test corpora) must never be served to real users. The make dist tarball does not include the fixtures/ directory at all.
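
A fail-closed startup guard of this kind could be sketched as follows. The function name `check_fixtures_guard` and the exception type are hypothetical; this shows the shape of the check, not Sherpa's actual startup code.

```python
def check_fixtures_guard(env):
    """Abort startup if fixtures are enabled in a production environment."""
    uses_fixtures = env.get("SHERPA_USE_FIXTURES") == "1"
    is_production = env.get("SHERPA_ENV") == "production"
    if uses_fixtures and is_production:
        # Fail closed: refuse to run rather than serve synthetic data.
        raise RuntimeError(
            "SHERPA_USE_FIXTURES=1 is not allowed when SHERPA_ENV=production"
        )
```

Calling such a guard before the application object is created means a misconfigured host fails immediately at boot instead of serving fixture data.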

| Variable | Default | Description |
| --- | --- | --- |
| SHERPA_KB_DIR | ./data/kb | Shared KB root for backward compatibility. Registered worlds are stored here when using the single-world local setup. In production, world paths are stored in the PostgreSQL registry. |
| SHERPA_DERIVED_DIR | ./data/derived | Root directory for Markdown derivatives of Office files. Must be an absolute path in production. Derived content is fully regeneratable by re-running ingest. |
| SHERPA_VERSION | v1 | Default world ID. Used as the fallback when a world is not specified in API calls. |
| SHERPA_HOST | 127.0.0.1 | Host address Uvicorn binds to. Keep as 127.0.0.1 and put a reverse proxy in front for external access. |
| SHERPA_PORT | 8000 | TCP port Uvicorn listens on. |
| SHERPA_UVICORN_WORKERS | 1 | Number of Uvicorn worker processes. Increase for higher concurrency. |
| SHERPA_ENV | | Set to production to enable the fixtures fail-closed guard. make serve and the systemd unit set this automatically via run-api.sh serve. |
| SHERPA_ENV_FILE | | Path to the env file that scripts/run-api.sh and scripts/check-production.sh should load. The systemd unit sets this to /etc/sherpa/sherpa.env. |
| SHERPA_POLL_SECONDS | 0 (disabled) | When greater than 0, Sherpa periodically re-ingests all registered worlds at this interval (seconds). With the default of 0, all ingest is manual (triggered from the admin UI or API). |
| SHERPA_DISABLE_EMBED | | Set to 1 to disable vector embedding generation. Elasticsearch will use BM25 only. Useful on hosts without GPU access or when embedding latency is unacceptable. |
| SHERPA_USE_FIXTURES | | Dev and test only. Set to 1 to use the fixture corpus instead of a real registered world. Never set this in a production environment. |
| SHERPA_BROWSE_ROOTS | | Colon-separated list of allowed root paths for world registration (e.g. /mnt:/srv/sherpa/kb). When set, the admin UI’s folder picker and the /worlds registration endpoint reject paths that fall outside these roots. Unset means no restriction (admin is trusted). Set in production to limit which directories can be registered as worlds. |
| SHERPA_CODEX_SANDBOX | 1 | Controls the Codex agent sandbox mode. The default (1) uses the full multi-layer sandbox (systemd + Landlock + seccomp + bwrap). Set to 0 to fall back to the legacy -s workspace-write flag, which relies solely on OS-user isolation. Emergency escape hatch only; do not set to 0 in production. |
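
As an illustration of the SHERPA_BROWSE_ROOTS rule, the sketch below checks whether a candidate world path falls under one of the allowed roots. The function names are hypothetical, and the matching logic is an assumption: it works on normalised POSIX path strings only and does not resolve symlinks, which a real implementation would likely need to do.

```python
import posixpath


def parse_browse_roots(value):
    """Split a colon-separated SHERPA_BROWSE_ROOTS value into absolute roots."""
    roots = [posixpath.normpath(p) for p in value.split(":") if p.strip()]
    return [r for r in roots if posixpath.isabs(r)]


def is_path_allowed(path, roots_value):
    """True when path may be registered as a world."""
    roots = parse_browse_roots(roots_value or "")
    if not roots:
        return True  # unset: no restriction (admin is trusted)

    candidate = posixpath.normpath(path)  # collapses ".." segments
    if not posixpath.isabs(candidate):
        return False
    # A path is inside a root when their common prefix is the root itself.
    return any(posixpath.commonpath([candidate, root]) == root for root in roots)
```

Comparing on whole path components (via `commonpath`) rather than raw string prefixes prevents /mntx/docs from being accepted just because it starts with /mnt.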

Service ports (reference)

The following ports are used by Sherpa’s components. All are bound to 127.0.0.1 in the default Docker Compose configuration.

| Service | Default port | Notes |
| --- | --- | --- |
| FastAPI | 8000 | Configurable via SHERPA_PORT |
| PostgreSQL | 5432 | Docker Compose; configurable in docker-compose.yml |
| Neo4j (Bolt) | 7687 | Used by the application |
| Neo4j (Browser UI) | 7474 | Admin UI only; not used by the app |
| Elasticsearch | 9200 | No authentication; never expose externally |
