Environment Variables

Configuration is done via environment variables in the `.env` file created by the installer.

Starting from v1.8.0

Several variables that previously drove runtime behaviour are now first-boot seeds only. After the first successful startup, the `system_settings` and `auth_settings` Postgres tables are authoritative, and changes made there apply without a restart. The Description column in the tables below marks these variables as "First-boot seed"; variables without that label are always-live.
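
The seed-versus-authoritative behaviour can be pictured with a small sketch. This is an illustration only, not the project's actual code: the function `resolve_setting` and the plain dictionaries standing in for the process environment and the `system_settings` table are assumptions made for the example.

```python
def resolve_setting(name, env, db_settings, default):
    """Return the effective value of a seeded setting.

    `env` stands in for the process environment and `db_settings` for the
    `system_settings` table (both are hypothetical stand-ins).
    """
    if name not in db_settings:
        # First boot: copy the env value (or the default) into the table.
        db_settings[name] = env.get(name, default)
    # Once seeded, the stored value always wins over the environment.
    return db_settings[name]


db = {}
first = resolve_setting("LOG_LEVEL", {"LOG_LEVEL": "DEBUG"}, db, "INFO")

# An admin later changes the stored value at runtime.
db["LOG_LEVEL"] = "WARNING"
later = resolve_setting("LOG_LEVEL", {"LOG_LEVEL": "DEBUG"}, db, "INFO")

print(first, later)
```

In this sketch, editing `.env` after the first boot has no effect on a seeded variable, because the stored value is returned whenever one exists.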

Authentication

Variable	Default	Description
`JWT_SECRET_KEY`	(must be set)	Secret key for signing JWTs. Change this before deployment. Generate with: `openssl rand -hex 32`
`JWT_ALGORITHM`	`HS256`	JWT signing algorithm
`ACCESS_TOKEN_EXPIRE_SECONDS`	`900`	First-boot seed. Access token lifetime in seconds (default: 15 min). Managed at runtime via `PATCH /api/admin/auth/settings` after first boot.
`REFRESH_TOKEN_EXPIRE_SECONDS`	`604800`	First-boot seed. Refresh token lifetime in seconds (default: 7 days). Managed at runtime after first boot.
`DEFAULT_ADMIN_USER`	`admin`	Username for the default admin account created on first boot
`DEFAULT_ADMIN_PASSWORD`	`changeme`	Password for the default admin account. Change this before deployment.
`MFA_ENCRYPTION_KEY`	(must be set — v1.8.0+)	Fernet key used to encrypt TOTP secrets and OIDC `client_secret` values at rest. Supply via the `mfa_encryption_key` Docker secret. Required — the stack will not start without it.
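
For scripted deployments, both secrets can be generated without external tools. The sketch below uses only the Python standard library; the helper names are illustrative. It relies on two facts: `secrets.token_hex(32)` yields 64 hex characters, the same shape as `openssl rand -hex 32`, and a Fernet key is 32 random bytes encoded as URL-safe base64. How the resulting values are written into `.env` or the `mfa_encryption_key` Docker secret is left to your deployment tooling.

```python
import base64
import os
import secrets


def generate_jwt_secret() -> str:
    # 32 random bytes rendered as 64 hex characters.
    return secrets.token_hex(32)


def generate_fernet_key() -> str:
    # A Fernet key is 32 random bytes, URL-safe base64-encoded.
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


jwt_secret = generate_jwt_secret()
fernet_key = generate_fernet_key()
print(len(jwt_secret), len(fernet_key))
```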

OIDC / SSO (v1.8.0+)

OIDC providers are primarily managed through the admin UI and stored in the `oidc_providers` Postgres table. The env vars below serve as a fallback when that table is empty, which makes them useful for bootstrapping or scripted deployments.

Variable	Description
`OIDC_PROVIDERS`	Comma-separated list of provider names, e.g. `entra,google`
`OIDC_<NAME>_CLIENT_ID`	Client ID for the named provider
`OIDC_<NAME>_DISCOVERY_URL`	OIDC discovery document URL (`.well-known/openid-configuration`)
`OIDC_<NAME>_REDIRECT_URI`	Redirect URI registered with the IdP
`OIDC_<NAME>_SCOPES`	Space-separated scopes (default: `openid email profile`)
`OIDC_<NAME>_ADMIN_GROUPS`	Comma-separated IdP group names whose members are auto-promoted to `admin`

Client secrets must be supplied as Docker secrets (`oidc_<name>_client_secret`), not as plain env vars.
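
To make the naming scheme concrete, here is a rough sketch of how the fallback variables could be collected into per-provider settings. It is not the application's real loader. The function `load_oidc_fallback`, the upper-casing of `<NAME>` in variable names, and reading secrets from `/run/secrets` (the default mount point for Docker secrets) are assumptions for the example.

```python
from pathlib import Path


def split_csv(value):
    """Split a comma-separated string, dropping blanks and whitespace."""
    return [item.strip() for item in value.split(",") if item.strip()]


def load_oidc_fallback(env, secrets_dir="/run/secrets"):
    """Build provider configs from OIDC_* variables (illustrative only)."""
    providers = {}
    for name in split_csv(env.get("OIDC_PROVIDERS", "")):
        # Assumption: <NAME> in the variable name is the upper-cased provider name.
        prefix = f"OIDC_{name.upper()}_"
        secret_file = Path(secrets_dir) / f"oidc_{name.lower()}_client_secret"
        providers[name] = {
            "client_id": env.get(prefix + "CLIENT_ID"),
            "discovery_url": env.get(prefix + "DISCOVERY_URL"),
            "redirect_uri": env.get(prefix + "REDIRECT_URI"),
            # Scopes are space-separated; fall back to the documented default.
            "scopes": env.get(prefix + "SCOPES", "openid email profile").split(),
            "admin_groups": split_csv(env.get(prefix + "ADMIN_GROUPS", "")),
            # The secret comes from a file, never from a plain env var.
            "client_secret": (
                secret_file.read_text().strip() if secret_file.is_file() else None
            ),
        }
    return providers


example_env = {
    "OIDC_PROVIDERS": "entra,google",
    "OIDC_ENTRA_CLIENT_ID": "entra-client-id",
    "OIDC_ENTRA_ADMIN_GROUPS": "rag-admins, platform",
}
config = load_oidc_fallback(example_env, secrets_dir="/nonexistent")
print(sorted(config))
```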

See the SSO / OIDC guide for full configuration instructions.

TLS / HTTPS (v1.8.0+)

Variable	Default	Description
`TLS_ENABLED`	`0`	Set to `1` to enable HTTPS termination in the bundled nginx. See the TLS / HTTPS guide.
`SERVER_NAME`	(must be set when TLS enabled)	Fully-qualified domain name for the server (used in nginx `server_name` and HSTS headers)
`TLS_CERT_PATH`	`/etc/nginx/certs/fullchain.pem`	Path to the TLS certificate (inside the nginx container)
`TLS_KEY_PATH`	`/etc/nginx/certs/privkey.pem`	Path to the private key (inside the nginx container)
`HSTS_MAX_AGE`	`31536000`	HSTS `max-age` in seconds (1 year). Set to `0` to disable HSTS.
`OCSP_STAPLING`	`on`	Set to `off` for air-gapped deployments that cannot reach OCSP responders.

Docker Compose env-file gotcha

Docker Compose only auto-loads a file literally named `.env`. If your TLS variables are in `.env.local` or another file, you must pass `--env-file .env.local` explicitly; otherwise `TLS_ENABLED` falls back to `0`.
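
A quick pre-flight check can catch this before the stack starts. The sketch below is a hypothetical helper, not part of the product: `parse_env_text` handles only simple `KEY=VALUE` lines (it does not replicate Compose's full env-file parsing), and `tls_problems` encodes only the two rules stated on this page.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines; ignores blank lines and # comments."""
    values = {}
    for line in text.splitlines():
        line = line.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        values[key.strip()] = value.strip().strip('"').strip("'")
    return values


def tls_problems(values):
    """Return human-readable problems with the TLS settings, if any."""
    problems = []
    if values.get("TLS_ENABLED", "0") != "1":
        problems.append("TLS_ENABLED is not 1, so HTTPS termination stays off")
    elif not values.get("SERVER_NAME"):
        problems.append("SERVER_NAME must be set when TLS is enabled")
    return problems


sample = "TLS_ENABLED=1\n# SERVER_NAME=rag.example.com\n"
report = tls_problems(parse_env_text(sample))
print(report)
```

Run a check like this against the same file you pass to `--env-file`, so that it sees the values Compose will see.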

Database & Cache

Variable	Default	Description
`DATABASE_URL`	(set by installer)	PostgreSQL connection string
`REDIS_URL`	(set by installer)	Redis connection string
`QDRANT_URL`	`http://rag-db:6333`	Qdrant vector database connection URL.
`QDRANT_API_KEY`	`CHANGEME`	API key for Qdrant authentication. Change this before deployment.

Privacy

Variable	Default	Description
`RAG_LOG_ANONYMISE`	`true`	First-boot seed. Anonymise user identifiers in log output. Managed at runtime via `system_settings`.
`RAG_LOG_REDACT_QUERIES`	`true`	First-boot seed. Redact query text from log output. Managed at runtime via `system_settings`.
`RAG_CONVERSATION_MAX_AGE_DAYS`	`90`	First-boot seed. Automatically purge conversations older than this many days. Managed at runtime via `system_settings`.
`RAG_CONVERSATION_MAX_TURNS`	`100`	First-boot seed. Maximum number of turns retained per conversation. Managed at runtime via `system_settings`.

Runtime

Variable	Default	Description
`LOG_LEVEL`	`INFO`	First-boot seed. Logging verbosity (`DEBUG`, `INFO`, `WARNING`, `ERROR`). Managed at runtime via `system_settings`.
`RAG_AUDIT_RETENTION_DAYS`	`365`	First-boot seed. Audit log retention period in days. Managed at runtime via `system_settings`.
`CONNECTOR_ALLOWED_PATHS`	`/app/docs`	Colon-separated list of container-side paths that connectors are permitted to access. Always include `/app/docs`.
`INDEX_EXTRACTION_WORKERS`	(set by installer)	Number of parallel workers for document text extraction
`BACKEND_WORKERS`	`1`	Number of uvicorn worker processes. Increase for higher throughput on multi-core systems. Added in v1.2.1.
`DOCKER_PORT`	`3000`	Host port mapped to the UI container
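
As an illustration of the colon-separated format of `CONNECTOR_ALLOWED_PATHS`, the following sketch shows one way a path could be checked against such an allowlist. The function `is_path_allowed` is hypothetical and the application's actual enforcement may differ. Note that this version normalises `..` segments but does not resolve symlinks.

```python
import posixpath


def is_path_allowed(candidate, allowed_paths="/app/docs"):
    """Check whether `candidate` lies inside one of the colon-separated roots."""
    target = posixpath.normpath(candidate)
    for root in filter(None, allowed_paths.split(":")):
        root = posixpath.normpath(root)
        # Match the root itself or anything below it, but not sibling
        # directories that merely share a prefix (e.g. /app/docs-private).
        if target == root or target.startswith(root.rstrip("/") + "/"):
            return True
    return False


print(is_path_allowed("/app/docs/guide.pdf"))
print(is_path_allowed("/app/docs/../secrets/key"))
print(is_path_allowed("/mnt/share/report.pdf", "/app/docs:/mnt/share"))
```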

Inference & Embedding

Variable	Default	Description
`N_CTX`	`8192`	Context window size in tokens requested from the inference runtime. Lower values reduce memory usage. Jetson default: `4096`. A startup warning is logged if the configured value exceeds the model's actual capacity — see `EFFECTIVE_N_CTX` below.
`EFFECTIVE_N_CTX`	(read-only)	Derived at startup from the inference runtime — the actual context window size in effect. This is the single source of truth used by the token budget manager. When `N_CTX` exceeds model capacity, `EFFECTIVE_N_CTX` reflects the model's real limit and a warning is emitted in the startup logs.
`N_THREADS`	`8`	Number of CPU threads for inference. Jetson default: `6`.
`N_GPU_LAYERS`	`0`	Number of transformer layers offloaded to GPU. Set to `-1` to offload all layers. Only applies when a CUDA-capable GPU is detected.
`EMBED_DEVICE`	`auto`	Device for embedding model inference: `auto` (use CUDA if available, else CPU), `cpu`, or `cuda`. Added in v1.3.0.
`INFERENCE_URL`	`http://inference:8001`	URL of the llama-cpp-python inference server.
`INFERENCE_TIMEOUT_SECONDS`	`120`	First-boot seed. Timeout for inference requests in seconds. Managed at runtime via `system_settings`.
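
The relationship between `N_CTX` and `EFFECTIVE_N_CTX` amounts to a clamp. The sketch below assumes that the effective value is the smaller of the requested size and the model's capacity. The function name, the logger name, and the log wording are illustrative and are not taken from the product.

```python
import logging

logger = logging.getLogger("startup")


def effective_n_ctx(requested, model_capacity):
    """Clamp the requested context window to what the model supports."""
    if requested > model_capacity:
        # Mirrors the documented behaviour: warn and fall back to the real limit.
        logger.warning(
            "N_CTX=%d exceeds model capacity %d; using %d",
            requested, model_capacity, model_capacity,
        )
        return model_capacity
    return requested


clamped = effective_n_ctx(8192, 4096)
unchanged = effective_n_ctx(4096, 8192)
print(clamped, unchanged)
```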

Query Pipeline

Variable	Default	Description
`RAG_CLASSIFIER_LLM_ENABLED`	`false`	Enable LLM-assisted hybrid query classifier. When `true`, ambiguous queries are disambiguated via the local inference server. When `false` (default), classification is fully deterministic.
`RAG_CLASSIFIER_LLM_MAX_TOKENS`	`64`	Maximum tokens for the LLM classifier response. 64 is sufficient for the small JSON output.
`RAG_LOG_TOKEN_USAGE`	`false`	First-boot seed. When `true`, logs per-request token budget usage. Managed at runtime via `system_settings`.
`RAG_QUERY_REWRITE_ENABLED`	`false`	First-boot seed. Enable query rewriting for multi-turn conversations. Managed at runtime via `system_settings`.
`RAG_QUERY_REWRITE_MAX_TURNS`	`3`	First-boot seed. Maximum prior turns used for query rewriting context. Managed at runtime via `system_settings`.
`RAG_MIN_CHUNK_LENGTH`	`50`	First-boot seed. Minimum character length for indexed chunks. Managed at runtime via `system_settings`.
