Hybrid Search Explained

What hybrid search is, how the alpha parameter works, and when to adjust it

Question

What is hybrid search and why use it instead of just vector search?

Explanation

Hybrid search = running BM25 (keyword) and vector (semantic) search at the same time, then merging the results.

How it works in Weaviate

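  1. You send a query: the query text, plus a query vector (supplied by you or computed from the text by the configured vectorizer)
  2. Weaviate runs two searches in parallel: BM25 scores chunks by keyword matching, and vector search scores chunks by how close their embedding is to the query embedding (cosine distance by default)
  3. The two result lists are merged into a single ranking, weighted by a parameter called alpha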
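
To make step 2 concrete, here is a minimal sketch of the two scoring signals in plain Python. It uses a textbook BM25 formula with common default values for k1 and b, and toy 2-d vectors invented for illustration. It is not Weaviate's implementation, and the function names are hypothetical.

```python
import math
from collections import Counter


def bm25_scores(query, docs, k1=1.2, b=0.75):
    """Score each document against the query with a textbook BM25 formula."""
    tokenized = [doc.lower().split() for doc in docs]
    n_docs = len(tokenized)
    avg_len = sum(len(tokens) for tokens in tokenized) / n_docs
    # Document frequency: in how many documents each term appears.
    doc_freq = Counter(term for tokens in tokenized for term in set(tokens))

    scores = []
    for tokens in tokenized:
        term_freq = Counter(tokens)
        score = 0.0
        for term in query.lower().split():
            tf = term_freq[term]
            if tf == 0:
                continue  # term absent from this document: contributes nothing
            idf = math.log(1 + (n_docs - doc_freq[term] + 0.5) / (doc_freq[term] + 0.5))
            length_norm = 1 - b + b * len(tokens) / avg_len
            score += idf * tf * (k1 + 1) / (tf + k1 * length_norm)
        scores.append(score)
    return scores


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


docs = [
    "bm25 is a keyword ranking algorithm",
    "embeddings capture the meaning of text",
]
keyword_scores = bm25_scores("bm25 algorithm", docs)

# Toy 2-d embeddings, invented for illustration only.
query_vec = [0.9, 0.1]
doc_vecs = [[0.8, 0.2], [0.1, 0.9]]
vector_scores = [cosine_similarity(query_vec, v) for v in doc_vecs]
```

The second document shares no words with the query, so its BM25 score is zero, while both documents still get a cosine score. That gap is what the merge step has to reconcile.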

The alpha parameter

Controls the balance between the two:

  • alpha = 0.0 - 100% BM25 (pure keyword)
  • alpha = 0.5 - 50/50 (balanced - what we use)
  • alpha = 1.0 - 100% vector (pure semantic)
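
One way to picture the weighting is a score-based blend: rescale each result set to [0, 1], then take alpha times the vector score plus (1 - alpha) times the BM25 score. The sketch below assumes that simplified scheme. Weaviate's actual fusion algorithms may differ in detail, and the helper names and scores here are made up for illustration.

```python
def min_max_normalize(scores):
    """Rescale a {doc_id: score} mapping to the range [0, 1]."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def fuse(bm25_scores, vector_scores, alpha=0.5):
    """Blend two score mappings; alpha is the weight given to the vector side."""
    bm25_norm = min_max_normalize(bm25_scores)
    vector_norm = min_max_normalize(vector_scores)
    fused = {}
    for doc_id in set(bm25_norm) | set(vector_norm):
        fused[doc_id] = (
            alpha * vector_norm.get(doc_id, 0.0)
            + (1 - alpha) * bm25_norm.get(doc_id, 0.0)
        )
    # Return document ids, best first.
    return sorted(fused, key=fused.get, reverse=True)


# Hypothetical raw scores for three chunks.
bm25 = {"doc_a": 7.2, "doc_b": 1.1, "doc_c": 0.4}
vector = {"doc_a": 0.70, "doc_b": 0.62, "doc_c": 0.91}

keyword_only = fuse(bm25, vector, alpha=0.0)
vector_only = fuse(bm25, vector, alpha=1.0)
balanced = fuse(bm25, vector, alpha=0.5)
```

With these toy numbers, alpha=0.0 reproduces the BM25 order (doc_a, doc_b, doc_c), alpha=1.0 reproduces the vector order (doc_c, doc_a, doc_b), and alpha=0.5 puts doc_a first because it scores reasonably on both sides.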

Why 0.5 works well

  • Exact term queries ("BM25 algorithm") - BM25 matches the exact terms, while vector search alone might return only vaguely related results
  • Vague queries ("how to find relevant text") - vector search captures the meaning, while BM25 can miss relevant chunks that share no words with the query
  • Acronyms ("RAG") - BM25 finds the literal acronym, and vector search can also surface chunks that say "retrieval-augmented generation"

With alpha=0.5, both signals carry equal weight, so a chunk can rank well through an exact keyword match, through semantic similarity, or both.

When to change alpha

  • Your users search by exact names/codes - lower alpha (more BM25)
  • Your users ask natural language questions - higher alpha (more vector)
  • You're not sure - 0.5 is a safe default
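
If you have a handful of queries with known relevant chunks, you can compare alpha values empirically instead of guessing. The sketch below is one possible approach, not a prescribed method: `search_fn` is a hypothetical callable that takes (query, alpha, k) and returns ranked ids, and `fake_search` is a hard-coded stand-in for a real search call.

```python
def recall_at_k(ranked_ids, relevant_ids, k):
    """Fraction of the relevant ids that appear in the top k results."""
    if not relevant_ids:
        return 0.0
    hits = len(set(ranked_ids[:k]) & set(relevant_ids))
    return hits / len(relevant_ids)


def sweep_alpha(search_fn, labeled_queries, alphas, k=20):
    """Return {alpha: mean recall@k} over a small labeled query set."""
    results = {}
    for alpha in alphas:
        recalls = [
            recall_at_k(search_fn(query, alpha, k), relevant, k)
            for query, relevant in labeled_queries
        ]
        results[alpha] = sum(recalls) / len(recalls)
    return results


# Stand-in for a real hybrid search call, hard-coded for illustration only.
def fake_search(query, alpha, k):
    if alpha <= 0.25:
        ranked = ["sku-1", "misc-3"]   # keyword-heavy: finds the exact code
    elif alpha >= 0.75:
        ranked = ["faq-7", "misc-3"]   # vector-heavy: finds the paraphrase
    else:
        ranked = ["sku-1", "faq-7"]    # balanced: finds both
    return ranked[:k]


labeled = [
    ("SKU-1", {"sku-1"}),
    ("how do I reset my password", {"faq-7"}),
]
scores = sweep_alpha(fake_search, labeled, alphas=[0.0, 0.5, 1.0], k=2)
best_alpha = max(scores, key=scores.get)
```

In practice you would swap `fake_search` for a wrapper around your own store and use queries taken from real usage.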

Example

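# Assumes `store` is an already-initialized LangChain Weaviate vector store
# and `query` is the user's search string.
results = store.similarity_search(
    query,
    k=20,          # return top 20 candidates
    alpha=0.5,     # balanced hybrid (0.0 = pure BM25, 1.0 = pure vector)
)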

LangChain + Weaviate handle the merge automatically; you only choose alpha.