In progress · Python stack

Torah Study AI

Production RAG pipeline on 3.5M sacred texts. Hybrid search, Cohere reranking, strict anti-hallucination guardrails. Built with FastAPI, Weaviate, and Gemini.

The corpus is the Sefaria library of 3.5M Jewish sacred texts. Every answer is grounded in verifiable sources, and when retrieval is weak the pipeline declines to answer rather than fall back on the model's general knowledge.

Key numbers

  • 94,635 English texts indexed with Gemini Embedding 001
  • 93.9% recall on Sefaria's own embedding benchmark (18 models tested)
  • 0.3 relevance threshold, below which the pipeline reports that it found no matching text and suggests where to look
  • 420 texts/sec local embedding throughput (Qwen3 on M1 Max)
  • 16 documented build steps with BDD scenarios and TDD

What makes this different

Strict RAG on sensitive data. On sacred texts, a fabricated source is not an error - it is an offense. The pipeline never falls back to LLM general knowledge. If the retrieval score is below 0.3, it says "I didn't find this text" and suggests where to look.
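
The refusal rule can be sketched in a few lines of Python. This is a minimal illustration, not the project's actual code: the `Hit` record, `select_grounded_sources`, `answer_or_refuse`, and the `generate` callback are hypothetical names, and it assumes a relevance score where higher means more relevant. Only the 0.3 threshold and the refusal message come from the description above, and the "where to look" suggestion is left out.

```python
from dataclasses import dataclass
from typing import Callable

RELEVANCE_THRESHOLD = 0.3  # threshold stated in the text


@dataclass
class Hit:
    ref: str      # citation for the passage (hypothetical field)
    text: str     # passage text (hypothetical field)
    score: float  # relevance score, assumed: higher is better


def select_grounded_sources(hits: list[Hit],
                            threshold: float = RELEVANCE_THRESHOLD) -> list[Hit]:
    """Keep only hits at or above the threshold, best first."""
    kept = [h for h in hits if h.score >= threshold]
    return sorted(kept, key=lambda h: h.score, reverse=True)


def answer_or_refuse(question: str,
                     hits: list[Hit],
                     generate: Callable[[str, list[Hit]], str]) -> dict:
    """Call the LLM only when at least one source clears the threshold."""
    sources = select_grounded_sources(hits)
    if not sources:
        # No fallback to the model's general knowledge: refuse instead.
        return {"answer": "I didn't find this text.",
                "sources": [],
                "grounded": False}
    return {"answer": generate(question, sources),
            "sources": [h.ref for h in sources],
            "grounded": True}
```

The point of the design is that `generate` is never called with an empty source list, so the model has no chance to answer from memory.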

Data-driven model selection. I benchmarked 18 embedding models using Sefaria's own Rabbinic Embedding Leaderboard before choosing. Gemini Embedding 001 wins at 93.9% recall. OpenAI scores 69.9%. The best open-source model (Qwen3-Embedding-8B) scores 89.4%.
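
For readers unfamiliar with recall as a retrieval metric, here is a minimal sketch of recall@k over embedding vectors. The leaderboard's exact protocol is not described here, so this assumes the simplest setup: each query has exactly one correct passage, and a query counts as a hit if that passage appears among the top k by cosine similarity. The function names and the toy vectors are illustrative only.

```python
import math


def cosine(a: list[float], b: list[float]) -> float:
    """Cosine similarity between two vectors (0.0 if either is all zeros)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def recall_at_k(query_vecs: list[list[float]],
                passage_vecs: list[list[float]],
                gold: list[int],
                k: int = 1) -> float:
    """Fraction of queries whose gold passage index is in the top k results."""
    hits = 0
    for qi, query in enumerate(query_vecs):
        ranked = sorted(range(len(passage_vecs)),
                        key=lambda pi: cosine(query, passage_vecs[pi]),
                        reverse=True)
        if gold[qi] in ranked[:k]:
            hits += 1
    return hits / len(query_vecs)


# Toy data: three passages, two queries.
passages = [[1.0, 0.0], [0.0, 1.0], [1.0, 1.0]]
queries = [[0.9, 0.1], [0.1, 0.9]]
print(recall_at_k(queries, passages, gold=[0, 1], k=1))
```

Running the same function over each candidate model's embeddings for one fixed query set is what makes the scores comparable across models.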

Domain-aware chunking. The Sefaria dataset comes pre-chunked by verse, mishnah, and sugya, which are natural scholarly units. I index those units as they are instead of running a generic RecursiveCharacterTextSplitter, which could cut a Talmud passage in the middle of an argument.
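
A minimal sketch of what indexing the units as they are could look like: each pre-chunked record becomes exactly one index document, and nothing is split by character count. The record fields `ref`, `text`, and `unit`, and the function name `to_documents`, are assumptions made for illustration, not the dataset's confirmed schema.

```python
def to_documents(records: list[dict]) -> list[dict]:
    """Map each pre-chunked record to one index document, one-to-one.

    Assumed (hypothetical) record shape:
        {"ref": "<citation>", "text": "<passage>", "unit": "verse|mishnah|sugya"}
    """
    documents = []
    for record in records:
        text = record["text"].strip()
        if not text:
            continue  # skip empty units rather than index blank passages
        documents.append({
            "id": record["ref"],       # the citation doubles as a stable ID
            "content": text,           # the whole unit, never split further
            "metadata": {
                "ref": record["ref"],
                "unit": record.get("unit", "unknown"),
            },
        })
    return documents
```

Because the citation travels with every document, each retrieved passage can be quoted with its exact source.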

Cost transparency. I estimated embedding costs at $2.60. The real cost was 25x higher, because the price is $0.15 per 1M tokens, not the $0.006 I had assumed. I documented the mistake, pivoted to English-only indexing, and planned a migration to a free local model. The full cost analysis is in the build log.
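
The arithmetic behind the 25x figure can be reproduced as follows. This sketch assumes the $2.60 estimate covered a fixed token volume priced at $0.006 per 1M tokens, and that the same volume is repriced at $0.15 per 1M tokens. Under that assumption the implied volume is roughly 433M tokens and the repriced total is about $65. The build log's own breakdown may differ, and `reprice` is a hypothetical helper name.

```python
def reprice(estimated_total_usd: float,
            assumed_price_per_m: float,
            actual_price_per_m: float) -> tuple[float, float, float]:
    """Return (implied tokens in millions, repriced total in USD, price ratio)."""
    tokens_m = estimated_total_usd / assumed_price_per_m  # volume implied by the estimate
    actual_total = tokens_m * actual_price_per_m          # same volume at the real price
    ratio = actual_price_per_m / assumed_price_per_m      # how far off the price was
    return tokens_m, actual_total, ratio


# Figures from the text: $2.60 estimate, $0.006 assumed, $0.15 actual (per 1M tokens).
tokens_m, actual_total, ratio = reprice(2.60, 0.006, 0.15)
print(f"implied volume: {tokens_m:.0f}M tokens")
print(f"repriced total: ${actual_total:.2f}")
print(f"price ratio:    {ratio:.0f}x")
```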

Stack

8 tools
Python · FastAPI · Weaviate · Gemini 2.5 Flash · Cohere Rerank · Next.js · shadcn/ui · Docker

Read more

  • Why: the market gap, target users, and strategy
  • Tech Choices: architecture, stack decisions, embedding benchmarks
  • Instructions: 16 build steps with code, tests, and lessons learned
  • Roadmap: timeline, V2 features, and what is next