Environment Variables
Configuration is done via environment variables in the .env file created by the installer.
Several variables that previously drove runtime behaviour are now first-boot seeds only. After the first successful startup, the system_settings and auth_settings Postgres tables are authoritative, and changes made there apply without a restart. The Description column of each table below marks seed variables with "First-boot seed."; unmarked variables are always-live.
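The seed behaviour described above can be sketched as follows. This is a minimal illustration, not the project's actual implementation: a plain dictionary stands in for the system_settings table, and the function name seed_settings is hypothetical.

```python
from typing import Mapping, MutableMapping


def seed_settings(
    env: Mapping[str, str],
    store: MutableMapping[str, str],
    seed_defaults: Mapping[str, str],
) -> MutableMapping[str, str]:
    """Copy seed values into the store only for keys it does not hold yet.

    `store` is a stand-in for the system_settings table (an assumption).
    Once a key exists in the store, the environment value is ignored.
    """
    for key, default in seed_defaults.items():
        if key not in store:
            store[key] = env.get(key, default)
    return store


# First boot: the store is empty, so the env value seeds it.
settings = seed_settings({"LOG_LEVEL": "DEBUG"}, {}, {"LOG_LEVEL": "INFO"})

# Later boot: the env value changed, but the stored value stays authoritative.
seed_settings({"LOG_LEVEL": "ERROR"}, settings, {"LOG_LEVEL": "INFO"})
```

In this sketch, editing the .env file after first boot has no effect on a seeded setting; only a change to the store itself does.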
Authentication
| Variable | Default | Description |
|---|---|---|
| JWT_SECRET_KEY | (must be set) | Secret key for signing JWTs. Change this before deployment. Generate with: openssl rand -hex 32 |
| JWT_ALGORITHM | HS256 | JWT signing algorithm |
| ACCESS_TOKEN_EXPIRE_SECONDS | 900 | First-boot seed. Access token lifetime in seconds (default: 15 min). Managed at runtime via PATCH /api/admin/auth/settings after first boot. |
| REFRESH_TOKEN_EXPIRE_SECONDS | 604800 | First-boot seed. Refresh token lifetime in seconds (default: 7 days). Managed at runtime after first boot. |
| DEFAULT_ADMIN_USER | admin | Username for the default admin account created on first boot |
| DEFAULT_ADMIN_PASSWORD | changeme | Password for the default admin account. Change this before deployment. |
| MFA_ENCRYPTION_KEY | (must be set; v1.8.0+) | Fernet key used to encrypt TOTP secrets and OIDC client_secret values at rest. Supply via the mfa_encryption_key Docker secret. Required: the stack will not start without it. |
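If openssl is not available, both secrets can be generated with the Python standard library. The sketch below relies on two facts: openssl rand -hex 32 produces 32 random bytes as 64 hex characters, and a Fernet key is 32 random bytes encoded as URL-safe base64. The function names are illustrative only.

```python
import base64
import os
import secrets


def generate_jwt_secret() -> str:
    # 32 random bytes rendered as 64 hex characters,
    # the same shape as the output of `openssl rand -hex 32`.
    return secrets.token_hex(32)


def generate_fernet_key() -> str:
    # A Fernet key is 32 random bytes in URL-safe base64
    # (44 characters including padding).
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    print("JWT_SECRET_KEY=" + generate_jwt_secret())
    print("mfa_encryption_key secret value: " + generate_fernet_key())
```

Store the Fernet key in the mfa_encryption_key Docker secret rather than in the .env file.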
OIDC / SSO (v1.8.0+)
OIDC providers are primarily managed through the admin UI (stored in the oidc_providers Postgres table). The env vars below serve as a fallback when the table is empty and are useful for bootstrapping or scripted deployments.
| Variable | Description |
|---|---|
| OIDC_PROVIDERS | Comma-separated list of provider names, e.g. entra,google |
| OIDC_<NAME>_CLIENT_ID | Client ID for the named provider |
| OIDC_<NAME>_DISCOVERY_URL | OIDC discovery document URL (.well-known/openid-configuration) |
| OIDC_<NAME>_REDIRECT_URI | Redirect URI registered with the IdP |
| OIDC_<NAME>_SCOPES | Space-separated scopes (default: openid email profile) |
| OIDC_<NAME>_ADMIN_GROUPS | Comma-separated IdP group names whose members are auto-promoted to admin |
Client secrets must be supplied as Docker secrets (oidc_<name>_client_secret), not as plain env vars.
See the SSO / OIDC guide for full configuration instructions.
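To make the naming scheme concrete, here is a rough sketch of how the fallback variables could be collected into per-provider settings. It assumes that <NAME> is the upper-cased provider name from OIDC_PROVIDERS, which the table implies but does not state; load_oidc_providers is a hypothetical helper, not part of the product.

```python
from typing import Dict, Mapping

DEFAULT_SCOPES = "openid email profile"


def load_oidc_providers(env: Mapping[str, str]) -> Dict[str, dict]:
    """Build one settings dict per provider listed in OIDC_PROVIDERS."""
    names = [n.strip() for n in env.get("OIDC_PROVIDERS", "").split(",") if n.strip()]
    providers: Dict[str, dict] = {}
    for name in names:
        # Assumption: variable names use the upper-cased provider name.
        prefix = f"OIDC_{name.upper()}_"
        groups = env.get(prefix + "ADMIN_GROUPS", "")
        providers[name] = {
            "client_id": env.get(prefix + "CLIENT_ID"),
            "discovery_url": env.get(prefix + "DISCOVERY_URL"),
            "redirect_uri": env.get(prefix + "REDIRECT_URI"),
            # Scopes are space-separated; groups are comma-separated.
            "scopes": env.get(prefix + "SCOPES", DEFAULT_SCOPES).split(),
            "admin_groups": [g.strip() for g in groups.split(",") if g.strip()],
        }
    return providers
```

The client secret is deliberately absent from this sketch, since it is supplied as a Docker secret and not as an environment variable.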
TLS / HTTPS (v1.8.0+)
| Variable | Default | Description |
|---|---|---|
| TLS_ENABLED | 0 | Set to 1 to enable HTTPS termination in the bundled nginx. See the TLS / HTTPS guide. |
| SERVER_NAME | (must be set when TLS enabled) | Fully-qualified domain name for the server (used in nginx server_name and HSTS headers) |
| TLS_CERT_PATH | /etc/nginx/certs/fullchain.pem | Path to the TLS certificate (inside the nginx container) |
| TLS_KEY_PATH | /etc/nginx/certs/privkey.pem | Path to the private key (inside the nginx container) |
| HSTS_MAX_AGE | 31536000 | HSTS max-age in seconds (1 year). Set to 0 to disable HSTS. |
| OCSP_STAPLING | on | Set to off for air-gapped deployments that cannot reach OCSP responders. |
Docker Compose only auto-loads a file literally named .env. If your TLS variables are in .env.local or another file, you must pass --env-file .env.local explicitly — otherwise TLS_ENABLED will fall back to 0.
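A small pre-flight check can catch the most common TLS misconfigurations before the stack starts. The sketch below encodes only the rules stated in the table above; check_tls_env is a hypothetical helper and the message texts are illustrative.

```python
from typing import List, Mapping


def check_tls_env(env: Mapping[str, str]) -> List[str]:
    """Return a list of problems found in the TLS-related variables."""
    problems: List[str] = []
    if env.get("TLS_ENABLED", "0") != "1":
        # TLS is off, so the remaining variables are not checked.
        return problems
    if not env.get("SERVER_NAME"):
        problems.append("SERVER_NAME must be set when TLS_ENABLED=1")
    if not env.get("HSTS_MAX_AGE", "31536000").isdigit():
        problems.append("HSTS_MAX_AGE must be a non-negative integer")
    if env.get("OCSP_STAPLING", "on") not in ("on", "off"):
        problems.append("OCSP_STAPLING must be 'on' or 'off'")
    return problems
```

Calling check_tls_env(dict(os.environ)) inside the environment that Docker Compose actually receives also reveals the case where TLS_ENABLED silently fell back to 0 because the variables live in a file other than .env.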
Database & Cache
| Variable | Default | Description |
|---|---|---|
| DATABASE_URL | (set by installer) | PostgreSQL connection string |
| REDIS_URL | (set by installer) | Redis connection string |
| QDRANT_URL | http://rag-db:6333 | Qdrant vector database connection URL. |
| QDRANT_API_KEY | CHANGEME | API key for Qdrant authentication. Change this before deployment. |
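Because several secrets ship with placeholder defaults, it can help to scan the environment for them before deployment. The following sketch checks only the placeholders documented on this page (changeme, CHANGEME, and an unset JWT_SECRET_KEY); find_placeholder_secrets is a hypothetical helper.

```python
from typing import List, Mapping

# Placeholder defaults documented in the tables on this page.
PLACEHOLDERS = {
    "DEFAULT_ADMIN_PASSWORD": "changeme",
    "QDRANT_API_KEY": "CHANGEME",
}


def find_placeholder_secrets(env: Mapping[str, str]) -> List[str]:
    """Return the names of secrets that are unset or still at their default."""
    flagged = [
        name
        for name, placeholder in PLACEHOLDERS.items()
        if env.get(name, placeholder) == placeholder
    ]
    if not env.get("JWT_SECRET_KEY"):
        flagged.append("JWT_SECRET_KEY")
    return sorted(flagged)
```

An empty result means none of the documented placeholders remain; it does not prove the chosen values are strong.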
Privacy
| Variable | Default | Description |
|---|---|---|
| RAG_LOG_ANONYMISE | true | First-boot seed. Anonymise user identifiers in log output. Managed at runtime via system_settings. |
| RAG_LOG_REDACT_QUERIES | true | First-boot seed. Redact query text from log output. Managed at runtime via system_settings. |
| RAG_CONVERSATION_MAX_AGE_DAYS | 90 | First-boot seed. Automatically purge conversations older than this many days. Managed at runtime via system_settings. |
| RAG_CONVERSATION_MAX_TURNS | 100 | First-boot seed. Maximum number of turns retained per conversation. Managed at runtime via system_settings. |
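The two retention settings can be pictured with the sketch below. It is only an illustration of the arithmetic: the function names are hypothetical, and it assumes that age is measured from a conversation's last activity and that the most recent turns are the ones kept, neither of which is specified above.

```python
from datetime import datetime, timedelta, timezone
from typing import List


def is_expired(last_activity: datetime, max_age_days: int, now: datetime) -> bool:
    """True when the conversation is older than the configured maximum age."""
    return now - last_activity > timedelta(days=max_age_days)


def trim_turns(turns: List[str], max_turns: int) -> List[str]:
    """Keep at most max_turns turns, dropping the oldest first (assumption)."""
    if max_turns <= 0:
        return []
    return turns[-max_turns:]


now = datetime(2025, 1, 1, tzinfo=timezone.utc)
old = now - timedelta(days=91)  # exceeds the 90-day default
recent = now - timedelta(days=10)
```

With the defaults, is_expired(old, 90, now) is true and is_expired(recent, 90, now) is false.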
Runtime
| Variable | Default | Description |
|---|---|---|
| LOG_LEVEL | INFO | First-boot seed. Logging verbosity (DEBUG, INFO, WARNING, ERROR). Managed at runtime via system_settings. |
| RAG_AUDIT_RETENTION_DAYS | 365 | First-boot seed. Audit log retention period in days. Managed at runtime via system_settings. |
| CONNECTOR_ALLOWED_PATHS | /app/docs | Colon-separated list of container-side paths that connectors are permitted to access. Always include /app/docs. |
| INDEX_EXTRACTION_WORKERS | (set by installer) | Number of parallel workers for document text extraction |
| BACKEND_WORKERS | 1 | Number of uvicorn worker processes. Increase for higher throughput on multi-core systems. Added in v1.2.1. |
| DOCKER_PORT | 3000 | Host port mapped to the UI container |
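As an illustration of how a colon-separated allow-list such as CONNECTOR_ALLOWED_PATHS can be evaluated, the sketch below treats a path as allowed when it equals a listed root or sits beneath one. The helper names are hypothetical, and the real connector check may differ (for example in how it handles symlinks).

```python
from pathlib import PurePosixPath
from typing import List


def parse_allowed_paths(value: str) -> List[PurePosixPath]:
    """Split a colon-separated list into container-side paths."""
    return [PurePosixPath(p) for p in value.split(":") if p]


def is_path_allowed(candidate: str, allowed: List[PurePosixPath]) -> bool:
    """True when candidate equals an allowed root or lies beneath one."""
    path = PurePosixPath(candidate)
    # PurePosixPath does not collapse "..", so reject it outright
    # rather than risk escaping an allowed root.
    if ".." in path.parts:
        return False
    return any(path == root or root in path.parents for root in allowed)


allowed_roots = parse_allowed_paths("/app/docs:/mnt/share")
```

Comparing whole path components, not string prefixes, is what keeps /app/docs2 from matching /app/docs.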
Inference & Embedding
| Variable | Default | Description |
|---|---|---|
| N_CTX | 8192 | Context window size in tokens requested from the inference runtime. Lower values reduce memory usage. Jetson default: 4096. A startup warning is logged if the configured value exceeds the model's actual capacity; see EFFECTIVE_N_CTX below. |
| EFFECTIVE_N_CTX | (read-only) | Derived at startup from the inference runtime: the actual context window size in effect. This is the single source of truth used by the token budget manager. When N_CTX exceeds model capacity, EFFECTIVE_N_CTX reflects the model's real limit and a warning is emitted in the startup logs. |
| N_THREADS | 8 | Number of CPU threads for inference. Jetson default: 6. |
| N_GPU_LAYERS | 0 | Number of transformer layers offloaded to GPU. Set to -1 to offload all layers. Only applies when a CUDA-capable GPU is detected. |
| EMBED_DEVICE | auto | Device for embedding model inference: auto (use CUDA if available, else CPU), cpu, or cuda. Added in v1.3.0. |
| INFERENCE_URL | http://inference:8001 | URL of the llama-cpp-python inference server. |
| INFERENCE_TIMEOUT_SECONDS | 120 | First-boot seed. Timeout for inference requests in seconds. Managed at runtime via system_settings. |
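The relationship between N_CTX and EFFECTIVE_N_CTX can be sketched as a clamp with a warning. This is an interpretation of the descriptions above, not the actual startup code: effective_n_ctx and its model_capacity argument are hypothetical, and the real value is reported by the inference runtime.

```python
import logging

logger = logging.getLogger("config")


def effective_n_ctx(requested: int, model_capacity: int) -> int:
    """Clamp the requested context window to what the model supports."""
    if requested > model_capacity:
        logger.warning(
            "N_CTX=%d exceeds model capacity %d; using %d",
            requested, model_capacity, model_capacity,
        )
        return model_capacity
    return requested
```

Under this reading, N_CTX=8192 with a model limited to 4096 tokens yields an effective window of 4096 plus a warning, while N_CTX=4096 with a larger model is used as configured.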
Query Pipeline
| Variable | Default | Description |
|---|---|---|
| RAG_CLASSIFIER_LLM_ENABLED | false | Enable LLM-assisted hybrid query classifier. When true, ambiguous queries are disambiguated via the local inference server. When false (default), classification is fully deterministic. |
| RAG_CLASSIFIER_LLM_MAX_TOKENS | 64 | Maximum tokens for the LLM classifier response. 64 is sufficient for the small JSON output. |
| RAG_LOG_TOKEN_USAGE | false | First-boot seed. When true, logs per-request token budget usage. Managed at runtime via system_settings. |
| RAG_QUERY_REWRITE_ENABLED | false | First-boot seed. Enable query rewriting for multi-turn conversations. Managed at runtime via system_settings. |
| RAG_QUERY_REWRITE_MAX_TURNS | 3 | First-boot seed. Maximum prior turns used for query rewriting context. Managed at runtime via system_settings. |
| RAG_MIN_CHUNK_LENGTH | 50 | First-boot seed. Minimum character length for indexed chunks. Managed at runtime via system_settings. |
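The effect of RAG_MIN_CHUNK_LENGTH and RAG_QUERY_REWRITE_MAX_TURNS can be illustrated with two small helpers. Both are hypothetical sketches: they assume that length is counted in raw characters and that the rewrite context is taken from the most recent prior turns.

```python
from typing import List


def filter_chunks(chunks: List[str], min_length: int = 50) -> List[str]:
    """Drop chunks shorter than the minimum character length."""
    return [chunk for chunk in chunks if len(chunk) >= min_length]


def rewrite_context(prior_turns: List[str], max_turns: int = 3) -> List[str]:
    """Return at most max_turns of the most recent prior turns (assumption)."""
    if max_turns <= 0:
        return []
    return prior_turns[-max_turns:]
```

With the defaults, a 49-character chunk is skipped at indexing time, and a query in the sixth turn of a conversation is rewritten using turns three to five.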