Environment Variables

Configuration is done via environment variables in the .env file created by the installer.

Starting from v1.8.0

Several variables that previously drove runtime behaviour are now first-boot seeds only. After the first successful startup, the system_settings (and auth_settings) Postgres tables are authoritative, and changes made there apply without a restart. The Description column in the tables below marks which variables are first-boot seeds; the rest are always live.
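
The seed behaviour described above can be pictured with a small sketch. This is not the product's implementation: the function name first_boot_seed is hypothetical, and a plain dict stands in for the system_settings table. It only illustrates the rule that an env value is copied once and ignored afterwards.

```python
def first_boot_seed(keys, env, db_settings):
    """Copy env values into the settings store, but only for keys it lacks.

    `db_settings` is a dict standing in for the system_settings table
    (an assumption made for illustration only).
    """
    for key in keys:
        if key not in db_settings and key in env:
            db_settings[key] = env[key]
    return db_settings


def effective_value(key, env, db_settings):
    """After first boot the stored value wins over the env value."""
    if key in db_settings:
        return db_settings[key]
    return env.get(key)


if __name__ == "__main__":
    env = {"LOG_LEVEL": "DEBUG"}
    db = {}
    first_boot_seed(["LOG_LEVEL"], env, db)   # first boot: the seed is stored
    env["LOG_LEVEL"] = "ERROR"                # later edit to .env
    first_boot_seed(["LOG_LEVEL"], env, db)   # later boot: the seed is ignored
    print(effective_value("LOG_LEVEL", env, db))
```

Under this model, editing a seed variable in .env after the first boot has no effect; the value has to be changed in the settings store instead.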


Authentication

| Variable | Default | Description |
| --- | --- | --- |
| JWT_SECRET_KEY | (must be set) | Secret key for signing JWTs. Change this before deployment. Generate with: openssl rand -hex 32 |
| JWT_ALGORITHM | HS256 | JWT signing algorithm |
| ACCESS_TOKEN_EXPIRE_SECONDS | 900 | First-boot seed. Access token lifetime in seconds (default: 15 min). Managed at runtime via PATCH /api/admin/auth/settings after first boot. |
| REFRESH_TOKEN_EXPIRE_SECONDS | 604800 | First-boot seed. Refresh token lifetime in seconds (default: 7 days). Managed at runtime after first boot. |
| DEFAULT_ADMIN_USER | admin | Username for the default admin account created on first boot |
| DEFAULT_ADMIN_PASSWORD | changeme | Password for the default admin account. Change this before deployment. |
| MFA_ENCRYPTION_KEY | (must be set; v1.8.0+) | Fernet key used to encrypt TOTP secrets and OIDC client_secret values at rest. Supply via the mfa_encryption_key Docker secret. Required: the stack will not start without it. |
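
Where openssl is not available, both secrets can be produced with the Python standard library. The sketch below relies on the Fernet key format (32 random bytes, URL-safe base64 encoded); the function names are hypothetical and chosen for illustration.

```python
import base64
import os
import secrets


def generate_jwt_secret() -> str:
    # 32 random bytes rendered as 64 hex characters,
    # the same shape as the output of `openssl rand -hex 32`.
    return secrets.token_hex(32)


def generate_fernet_key() -> str:
    # A Fernet key is 32 random bytes, URL-safe base64 encoded (44 characters).
    return base64.urlsafe_b64encode(os.urandom(32)).decode("ascii")


if __name__ == "__main__":
    print("JWT_SECRET_KEY=" + generate_jwt_secret())
    # Store this value in the mfa_encryption_key Docker secret, not in .env.
    print(generate_fernet_key())
```

Generate each value once and keep it stable: rotating JWT_SECRET_KEY invalidates tokens signed with the old key, and data encrypted with a Fernet key cannot be decrypted with a different one.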

OIDC / SSO (v1.8.0+)

OIDC providers are primarily managed through the admin UI (stored in the oidc_providers Postgres table). The env vars below serve as a fallback when the table is empty and are useful for bootstrapping or scripted deployments.

| Variable | Description |
| --- | --- |
| OIDC_PROVIDERS | Comma-separated list of provider names, e.g. entra,google |
| OIDC_<NAME>_CLIENT_ID | Client ID for the named provider |
| OIDC_<NAME>_DISCOVERY_URL | OIDC discovery document URL (.well-known/openid-configuration) |
| OIDC_<NAME>_REDIRECT_URI | Redirect URI registered with the IdP |
| OIDC_<NAME>_SCOPES | Space-separated scopes (default: openid email profile) |
| OIDC_<NAME>_ADMIN_GROUPS | Comma-separated IdP group names whose members are auto-promoted to admin |

Client secrets must be supplied as Docker secrets (oidc_<name>_client_secret), not as plain env vars.

See the SSO / OIDC guide for full configuration instructions.
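
To make the OIDC_<NAME>_* naming scheme concrete, here is a rough sketch of how a deployment script might read the fallback variables. It assumes the provider name is upper-cased when it appears in a variable name and lower-cased in the Docker secret name; the function load_oidc_providers and the example URLs are hypothetical and not part of the product.

```python
def split_list(value, sep):
    """Split a delimited string, dropping blanks and surrounding whitespace."""
    return [item.strip() for item in value.split(sep) if item.strip()]


def load_oidc_providers(env):
    """Collect per-provider settings from OIDC_* variables in `env` (a dict)."""
    providers = {}
    for name in split_list(env.get("OIDC_PROVIDERS", ""), ","):
        prefix = f"OIDC_{name.upper()}_"  # assumption: NAME is upper-cased
        providers[name] = {
            "client_id": env.get(prefix + "CLIENT_ID"),
            "discovery_url": env.get(prefix + "DISCOVERY_URL"),
            "redirect_uri": env.get(prefix + "REDIRECT_URI"),
            # Scopes are space-separated; fall back to the documented default.
            "scopes": env.get(prefix + "SCOPES", "openid email profile").split(),
            "admin_groups": split_list(env.get(prefix + "ADMIN_GROUPS", ""), ","),
            # The client secret is never read from env; only its secret name is derived.
            "secret_name": f"oidc_{name.lower()}_client_secret",
        }
    return providers


if __name__ == "__main__":
    sample = {
        "OIDC_PROVIDERS": "entra,google",
        "OIDC_ENTRA_CLIENT_ID": "example-client-id",
        "OIDC_ENTRA_DISCOVERY_URL": "https://idp.example.com/.well-known/openid-configuration",
        "OIDC_ENTRA_REDIRECT_URI": "https://rag.example.com/callback",
        "OIDC_ENTRA_ADMIN_GROUPS": "rag-admins, platform",
    }
    for name, config in load_oidc_providers(sample).items():
        print(name, config)
```

A check like this can run before deployment to catch a provider listed in OIDC_PROVIDERS whose client ID or discovery URL is missing.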


TLS / HTTPS (v1.8.0+)

| Variable | Default | Description |
| --- | --- | --- |
| TLS_ENABLED | 0 | Set to 1 to enable HTTPS termination in the bundled nginx. See the TLS / HTTPS guide. |
| SERVER_NAME | (must be set when TLS enabled) | Fully-qualified domain name for the server (used in nginx server_name and HSTS headers) |
| TLS_CERT_PATH | /etc/nginx/certs/fullchain.pem | Path to the TLS certificate (inside the nginx container) |
| TLS_KEY_PATH | /etc/nginx/certs/privkey.pem | Path to the private key (inside the nginx container) |
| HSTS_MAX_AGE | 31536000 | HSTS max-age in seconds (1 year). Set to 0 to disable HSTS. |
| OCSP_STAPLING | on | Set to off for air-gapped deployments that cannot reach OCSP responders. |

Docker Compose env-file gotcha

Docker Compose only auto-loads a file literally named .env. If your TLS variables are in .env.local or another file, you must pass --env-file .env.local explicitly; otherwise TLS_ENABLED falls back to its default of 0 and HTTPS stays off.
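
A quick pre-flight check can catch this before the stack starts. The sketch below reads KEY=VALUE lines from the text of an env file and reports TLS problems. The parser is deliberately simplified and does not reproduce Compose's own parsing rules; parse_env_text and check_tls are hypothetical helper names.

```python
def parse_env_text(text):
    """Parse simple KEY=VALUE lines; comments and blank lines are skipped."""
    env = {}
    for raw in text.splitlines():
        line = raw.strip()
        if not line or line.startswith("#") or "=" not in line:
            continue
        key, _, value = line.partition("=")
        env[key.strip()] = value.strip().strip('"').strip("'")
    return env


def check_tls(env):
    """Return a list of human-readable problems; an empty list means OK."""
    problems = []
    if env.get("TLS_ENABLED", "0") != "1":
        problems.append("TLS_ENABLED is not 1; HTTPS termination stays off")
    elif not env.get("SERVER_NAME"):
        problems.append("SERVER_NAME must be set when TLS is enabled")
    return problems


if __name__ == "__main__":
    # Contents of the file that will actually be passed to Compose.
    text = "TLS_ENABLED=1\nSERVER_NAME=rag.example.com\n"
    print(check_tls(parse_env_text(text)) or "TLS settings look consistent")
```

Run the check against the same file you pass with --env-file, so that it sees exactly what Compose will see.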


Database & Cache

| Variable | Default | Description |
| --- | --- | --- |
| DATABASE_URL | (set by installer) | PostgreSQL connection string |
| REDIS_URL | (set by installer) | Redis connection string |
| QDRANT_URL | http://rag-db:6333 | Qdrant vector database connection URL. |
| QDRANT_API_KEY | CHANGEME | API key for Qdrant authentication. Change this before deployment. |

Privacy

| Variable | Default | Description |
| --- | --- | --- |
| RAG_LOG_ANONYMISE | true | First-boot seed. Anonymise user identifiers in log output. Managed at runtime via system_settings. |
| RAG_LOG_REDACT_QUERIES | true | First-boot seed. Redact query text from log output. Managed at runtime via system_settings. |
| RAG_CONVERSATION_MAX_AGE_DAYS | 90 | First-boot seed. Automatically purge conversations older than this many days. Managed at runtime via system_settings. |
| RAG_CONVERSATION_MAX_TURNS | 100 | First-boot seed. Maximum number of turns retained per conversation. Managed at runtime via system_settings. |

Runtime

| Variable | Default | Description |
| --- | --- | --- |
| LOG_LEVEL | INFO | First-boot seed. Logging verbosity (DEBUG, INFO, WARNING, ERROR). Managed at runtime via system_settings. |
| RAG_AUDIT_RETENTION_DAYS | 365 | First-boot seed. Audit log retention period in days. Managed at runtime via system_settings. |
| CONNECTOR_ALLOWED_PATHS | /app/docs | Colon-separated list of container-side paths that connectors are permitted to access. Always include /app/docs. |
| INDEX_EXTRACTION_WORKERS | (set by installer) | Number of parallel workers for document text extraction |
| BACKEND_WORKERS | 1 | Number of uvicorn worker processes. Increase for higher throughput on multi-core systems. Added in v1.2.1. |
| DOCKER_PORT | 3000 | Host port mapped to the UI container |
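
The colon-separated format of CONNECTOR_ALLOWED_PATHS can be illustrated with a short sketch of an allow-list check. How the connectors actually enforce the list is not documented here, so treat the logic below (exact match or descendant of a listed root, with ".." rejected) as an assumption; the function names are hypothetical.

```python
from pathlib import PurePosixPath


def allowed_roots(value):
    """Split a colon-separated list into container-side root paths."""
    return [PurePosixPath(part) for part in value.split(":") if part]


def is_path_allowed(candidate, value):
    """True if `candidate` equals a listed root or lies beneath one."""
    path = PurePosixPath(candidate)
    if ".." in path.parts:
        # Reject traversal segments instead of trying to resolve them.
        return False
    return any(path == root or root in path.parents for root in allowed_roots(value))


if __name__ == "__main__":
    setting = "/app/docs:/mnt/shared"
    print(is_path_allowed("/app/docs/manuals/a.pdf", setting))
    print(is_path_allowed("/etc/passwd", setting))
```

Comparing whole path components rather than string prefixes avoids treating /app/docs2 as if it were inside /app/docs.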

Inference & Embedding

| Variable | Default | Description |
| --- | --- | --- |
| N_CTX | 8192 | Context window size in tokens requested from the inference runtime. Lower values reduce memory usage. Jetson default: 4096. A startup warning is logged if the configured value exceeds the model's actual capacity; see EFFECTIVE_N_CTX below. |
| EFFECTIVE_N_CTX | (read-only) | Derived at startup from the inference runtime: the actual context window size in effect. This is the single source of truth used by the token budget manager. When N_CTX exceeds model capacity, EFFECTIVE_N_CTX reflects the model's real limit and a warning is emitted in the startup logs. |
| N_THREADS | 8 | Number of CPU threads for inference. Jetson default: 6. |
| N_GPU_LAYERS | 0 | Number of transformer layers offloaded to GPU. Set to -1 to offload all layers. Only applies when a CUDA-capable GPU is detected. |
| EMBED_DEVICE | auto | Device for embedding model inference: auto (use CUDA if available, else CPU), cpu, or cuda. Added in v1.3.0. |
| INFERENCE_URL | http://inference:8001 | URL of the llama-cpp-python inference server. |
| INFERENCE_TIMEOUT_SECONDS | 120 | First-boot seed. Timeout for inference requests in seconds. Managed at runtime via system_settings. |
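
The relationship between N_CTX and EFFECTIVE_N_CTX amounts to a clamp. The sketch below is a minimal illustration, assuming the model's capacity is already known as an integer; the function name and log wording are hypothetical and do not reflect the actual startup code.

```python
import logging

logger = logging.getLogger("startup")


def effective_n_ctx(requested, model_capacity):
    """Clamp the requested context window to what the model supports."""
    if requested > model_capacity:
        logger.warning(
            "N_CTX=%d exceeds model capacity %d; using %d",
            requested, model_capacity, model_capacity,
        )
        return model_capacity
    return requested


if __name__ == "__main__":
    logging.basicConfig(level=logging.WARNING)
    print(effective_n_ctx(8192, 4096))  # clamped, with a warning
    print(effective_n_ctx(2048, 4096))  # unchanged
```

Anything that budgets tokens should read the clamped value rather than N_CTX, since N_CTX only records what was requested.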

Query Pipeline

| Variable | Default | Description |
| --- | --- | --- |
| RAG_CLASSIFIER_LLM_ENABLED | false | Enable LLM-assisted hybrid query classifier. When true, ambiguous queries are disambiguated via the local inference server. When false (default), classification is fully deterministic. |
| RAG_CLASSIFIER_LLM_MAX_TOKENS | 64 | Maximum tokens for the LLM classifier response. 64 is sufficient for the small JSON output. |
| RAG_LOG_TOKEN_USAGE | false | First-boot seed. When true, logs per-request token budget usage. Managed at runtime via system_settings. |
| RAG_QUERY_REWRITE_ENABLED | false | First-boot seed. Enable query rewriting for multi-turn conversations. Managed at runtime via system_settings. |
| RAG_QUERY_REWRITE_MAX_TURNS | 3 | First-boot seed. Maximum prior turns used for query rewriting context. Managed at runtime via system_settings. |
| RAG_MIN_CHUNK_LENGTH | 50 | First-boot seed. Minimum character length for indexed chunks. Managed at runtime via system_settings. |