Hybrid LLM Classifier

By default, query classification is fully rule-based (regex patterns). In v1.5.0, an optional LLM sidecar can be enabled for ambiguous queries that match multiple intents.


How It Works

When RAG_CLASSIFIER_LLM_ENABLED=true, ambiguous queries (for example, matching both an article number and an entity ID) are sent to the local inference server for disambiguation.

Unambiguous queries stay on the fast deterministic regex path. The LLM is invoked only when the regex pass matches more than one intent.
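
The routing described above can be sketched roughly as follows. This is a minimal illustration, not the project's actual classifier: the intent names, the two regex patterns, the `general` fallback, and the `disambiguate` callback (standing in for the call to the local inference server) are all assumptions made for the example. Only the `RAG_CLASSIFIER_LLM_ENABLED` variable comes from this page.

```python
import os
import re
from typing import Callable, Dict, List

# Hypothetical intent patterns; the real classifier's patterns are not shown here.
INTENT_PATTERNS: Dict[str, "re.Pattern[str]"] = {
    "article_lookup": re.compile(r"\barticle\s+\d+\b", re.IGNORECASE),
    "entity_lookup": re.compile(r"\b[A-Z]{2,5}-\d+\b"),
}


def matching_intents(query: str) -> List[str]:
    """Return every intent whose regex matches the query, in declaration order."""
    return [name for name, pattern in INTENT_PATTERNS.items() if pattern.search(query)]


def llm_enabled() -> bool:
    """Read the documented flag; any value other than 'true' is treated as off (assumed)."""
    return os.environ.get("RAG_CLASSIFIER_LLM_ENABLED", "false").strip().lower() == "true"


def classify(query: str, disambiguate: Callable[[str, List[str]], str]) -> str:
    """Use the regex result when it is unambiguous; ask the LLM only on multi-intent matches."""
    intents = matching_intents(query)
    if not intents:
        return "general"  # assumed fallback intent name
    if len(intents) == 1 or not llm_enabled():
        return intents[0]  # deterministic path: first matching pattern wins
    choice = disambiguate(query, intents)
    # Guard against the model answering with something outside the candidate set.
    return choice if choice in intents else intents[0]
```

With the flag set to `true`, a query such as `article 12 for ENT-4471` matches both patterns and is passed to `disambiguate`, while `show article 12` never leaves the regex path.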


Extraction Signal Pipeline

All regex patterns run simultaneously against the query and produce ranked candidate signals. The hybrid classifier compares and merges complementary signals before deciding whether LLM disambiguation is required.

This improves classification quality when metadata patterns overlap across domains.
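
One way to picture the signal pipeline is the sketch below. Everything in it is an assumption made for illustration: the `Signal` type, the patterns, the ranking key (share of the query covered by the match), and the `COMPLEMENTARY` table of intents that may coexist. The page does not document how the real classifier scores or merges signals.

```python
import re
from dataclasses import dataclass
from typing import List


@dataclass(frozen=True)
class Signal:
    intent: str
    matched_text: str
    score: float  # assumed ranking key: fraction of the query covered by the match


# Hypothetical patterns, one per intent.
PATTERNS = {
    "article_lookup": re.compile(r"\barticle\s+\d+\b", re.IGNORECASE),
    "entity_lookup": re.compile(r"\b[A-Z]{2,5}-\d+\b"),
    "date_filter": re.compile(r"\b\d{4}-\d{2}-\d{2}\b"),
}

# Assumed pairs of intents that can be merged instead of treated as a conflict.
COMPLEMENTARY = {
    frozenset({"article_lookup", "date_filter"}),
    frozenset({"entity_lookup", "date_filter"}),
}


def extract_signals(query: str) -> List[Signal]:
    """Run every pattern against the query and return signals, highest score first."""
    signals = []
    for intent, pattern in PATTERNS.items():
        match = pattern.search(query)
        if match:
            text = match.group(0)
            signals.append(Signal(intent, text, len(text) / len(query)))
    return sorted(signals, key=lambda s: s.score, reverse=True)


def needs_llm(signals: List[Signal]) -> bool:
    """True when at least one pair of signals is not complementary."""
    for i, first in enumerate(signals):
        for second in signals[i + 1:]:
            if frozenset({first.intent, second.intent}) not in COMPLEMENTARY:
                return True
    return False
```

In this sketch, `article 12 from 2024-01-05` yields two complementary signals and stays deterministic, whereas `article 12 for ENT-4471` yields two competing signals and would be escalated.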


When to Enable

Enable hybrid mode when your dataset has overlapping metadata patterns and users frequently trigger incorrect intent selection.

Leave it disabled (default) for simple or single-domain datasets where deterministic regex classification is already reliable.


Configuration

| Variable | Default | Description |
| --- | --- | --- |
| RAG_CLASSIFIER_LLM_ENABLED | false | Enables LLM-assisted disambiguation for ambiguous classifier results. |
| RAG_CLASSIFIER_LLM_MAX_TOKENS | 64 | Maximum token budget for the classifier LLM response payload. |
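
As a rough illustration of how these two variables and their defaults might be consumed, the sketch below loads them into a small settings object. The `ClassifierLLMConfig` class and `load_config` function are hypothetical, and the parsing rules (only the literal `true` enables the feature, the token budget must be positive) are assumptions rather than documented behaviour.

```python
import os
from dataclasses import dataclass
from typing import Mapping


@dataclass(frozen=True)
class ClassifierLLMConfig:
    enabled: bool = False   # default for RAG_CLASSIFIER_LLM_ENABLED
    max_tokens: int = 64    # default for RAG_CLASSIFIER_LLM_MAX_TOKENS


def load_config(env: Mapping[str, str] = os.environ) -> ClassifierLLMConfig:
    """Build the config from environment-style key/value pairs, applying the documented defaults."""
    enabled = env.get("RAG_CLASSIFIER_LLM_ENABLED", "false").strip().lower() == "true"
    max_tokens = int(env.get("RAG_CLASSIFIER_LLM_MAX_TOKENS", "64"))
    if max_tokens <= 0:
        # Assumed validation: a non-positive budget cannot hold any response.
        raise ValueError("RAG_CLASSIFIER_LLM_MAX_TOKENS must be a positive integer")
    return ClassifierLLMConfig(enabled=enabled, max_tokens=max_tokens)
```

Passing a plain dictionary instead of `os.environ` makes the loader easy to exercise in tests, for example `load_config({"RAG_CLASSIFIER_LLM_ENABLED": "true"})`.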

Full-Text Index Auto-Creation

In v1.5.0, Slack, GitHub, and Google Drive indexers automatically ensure required Qdrant full-text indexes exist after sync.

This means hybrid_bm25 retrieval mode works immediately after synchronization without manual index creation steps.
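
For readers who want a sense of what "ensuring" an index involves, here is a hedged sketch of the idempotent check. It assumes the indexer reads the collection's `payload_schema` (as reported by Qdrant's collection info endpoint) and prepares one request body per missing field for Qdrant's `PUT /collections/{collection_name}/index` endpoint. The function name, the field names, and the tokenizer settings are illustrative assumptions; the page does not state which payload fields the indexers actually index.

```python
from typing import Any, Dict, Iterable, List


def missing_text_indexes(
    payload_schema: Dict[str, Any],
    required_fields: Iterable[str],
) -> List[Dict[str, Any]]:
    """Return index-creation request bodies for fields that lack a full-text index."""
    requests: List[Dict[str, Any]] = []
    for field in required_fields:
        existing = payload_schema.get(field, {})
        if existing.get("data_type") == "text":
            continue  # a full-text index already exists, nothing to do
        requests.append(
            {
                "field_name": field,
                # Assumed tokenizer settings; adjust to the dataset.
                "field_schema": {"type": "text", "tokenizer": "word", "lowercase": True},
            }
        )
    return requests
```

Running the check after every sync is safe because fields that already carry a text index produce no request, so repeated syncs do not attempt to recreate existing indexes.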