What is RAG-DocBot?

RAG-DocBot is a self-hosted, on-premise AI platform for companies that want to run an internal, privacy-preserving AI assistant on their own infrastructure, with no data leaving their network.

RAG-DocBot itself consists of the backend service and its tooling; a separate UI project provides the frontend. Together they form a complete document-grounded AI assistant that runs entirely on your own servers.

[Figure: RAG Architecture]

Vision

RAG-DocBot is built around the principle that your data should never leave your infrastructure:

  • Full data ownership — no SaaS dependency, no third-party data processing
  • On-prem deployment — including air-gapped environments with no outbound internet
  • Configurable for CPU and GPU — runs on commodity hardware as well as GPU-accelerated servers
  • Transparent pipeline — every step from document ingestion to answer generation is visible and controllable
  • Scales from small teams to enterprise — license-gated tiers to match your workload

Key Capabilities

| Capability | Description |
| --- | --- |
| On-premise by design | All data stays on your infrastructure. No telemetry, no external API calls. |
| Source connectors | Ingest documents via file upload, GitHub, Slack, Google Drive, or local directories. |
| Metadata extraction rulesets | Define per-source regex rules (connector or integration) to extract structured fields (patient IDs, article numbers, dates, document types) from your documents. Enables filtered, sorted, and grouped retrieval. |
| Analytics dashboard | Chunk distribution, metadata coverage, rule effectiveness, and priority audit metrics, per connector or integration. Requires Pro plan. |
| Streaming chat (SSE) | POST /api/chat supports Server-Sent Events streaming via Accept: text/event-stream for real-time answer delivery. |
| Scheduled syncs | Cron-based scheduler for automatic connector and integration syncs. Requires Pro plan or higher. |
| Audit logging | Append-only Postgres audit log covering chat, sync, and config lifecycle events, with admin query APIs and retention policies. Enterprise only. |
| Operational backups | Runbook and automation for backup and restore of Postgres, Qdrant, branding assets, and local models. |
| Intelligent search & retrieval | Five retrieval modes (semantic, hybrid metadata filtering, metadata-only, comparison grouping, and BM25 keyword fusion) with automatic intent detection that picks the right strategy for each query. |
| Async job system | Document ingestion and indexing run as background jobs, so they do not block the API. |
| Persistent storage | PostgreSQL for application data, Redis for live job state, Qdrant for the vector index. |
| JWT auth & RBAC | Role-based access control with viewer, editor, and admin roles. |
| TOTP MFA | Time-based one-time password second factor for any user account. Enrollment returns a QR code and 10 single-use recovery codes. |
| Federated login (OIDC / SSO) | Multi-provider OpenID Connect with PKCE. Pre-built support for Microsoft Entra ID, Google Workspace, Keycloak, and any OIDC-compliant IdP. Managed from the admin UI at runtime. |
| Groups & Resource ACL | Per-connector and per-integration access control lists. Retrieval is filtered at query time, so no re-embedding is required when an ACL changes. Enterprise only. |
| TLS / HTTPS termination | Opt-in HTTPS in the bundled nginx (TLS_ENABLED=1). Supports self-signed, internal CA, and Let's Encrypt certificates. |
| Runtime system settings | Key operational settings (log level, conversation limits, RAG tunables, JWT lifetimes) are managed at runtime via the admin API, with no restart required. |
| Sign-out everywhere | Admins can revoke all active sessions globally or per-user. Available on all license tiers. |
| License-gated plan limits | FREE, PRO, and ENTERPRISE tiers with configurable document and storage limits. |
| CPU & GPU support | Works on standard CPU servers and CUDA-capable GPU servers, for both inference and embedding. |
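
To make the metadata extraction rulesets capability more concrete, here is a minimal Python sketch of regex rules pulling structured fields out of a document chunk. The rule format, field names, and patterns are hypothetical illustrations; RAG-DocBot's actual ruleset schema is not described on this page.

```python
import re

# Hypothetical ruleset: field name -> regex with one capturing group.
# Illustrative only; not RAG-DocBot's actual rule format.
RULES = {
    "patient_id": re.compile(r"\bPatient ID:\s*(P-\d{6})\b"),
    "article_number": re.compile(r"\bArt\.\s*(\d+)\b"),
    "date": re.compile(r"\b(\d{4}-\d{2}-\d{2})\b"),
}


def extract_metadata(text: str) -> dict[str, str]:
    """Return the first match of each rule; fields with no match are omitted."""
    fields = {}
    for name, pattern in RULES.items():
        match = pattern.search(text)
        if match:
            fields[name] = match.group(1)
    return fields


chunk = "Patient ID: P-004217, admitted 2024-03-15. See Art. 12 of the intake policy."
print(extract_metadata(chunk))
# {'patient_id': 'P-004217', 'article_number': '12', 'date': '2024-03-15'}
```

Fields extracted this way could then be stored alongside each chunk, which is what would make filtered, sorted, and grouped retrieval possible.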

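As a rough illustration of the streaming chat capability, the sketch below shows how a Python client might consume the Server-Sent Events stream from POST /api/chat. Only the endpoint and the Accept: text/event-stream header come from this page; the JSON body shape, the bearer-token header, and the function names are assumptions made for the example.

```python
import json
import urllib.request
from typing import Iterable, Iterator


def iter_sse_data(lines: Iterable[str]) -> Iterator[str]:
    """Yield the data payload of each complete Server-Sent Event."""
    buffer = []
    for raw in lines:
        line = raw.rstrip("\r\n")
        if line == "":
            # A blank line terminates the current event.
            if buffer:
                yield "\n".join(buffer)
                buffer = []
        elif line.startswith("data:"):
            value = line[5:]
            # One optional space after the colon is not part of the payload.
            buffer.append(value[1:] if value.startswith(" ") else value)
        # Comment lines (":...") and other fields (event, id, retry) are ignored.
    # An event not terminated by a blank line is discarded.


def stream_chat(base_url: str, token: str, question: str) -> Iterator[str]:
    """POST to /api/chat and yield streamed payloads as they arrive."""
    request = urllib.request.Request(
        f"{base_url}/api/chat",
        # Hypothetical request body; the real schema is not given here.
        data=json.dumps({"message": question}).encode("utf-8"),
        headers={
            "Accept": "text/event-stream",
            "Content-Type": "application/json",
            "Authorization": f"Bearer {token}",  # assumed JWT bearer scheme
        },
        method="POST",
    )
    with urllib.request.urlopen(request) as response:
        yield from iter_sse_data(line.decode("utf-8") for line in response)
```

The parser is kept separate from the HTTP call so it can be exercised against recorded lines without a running server.
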
Current Status

RAG-DocBot is at v1.8.0 and under active development. See the Changelog for the full release history.