Skillia

Torah Study AI

in progress

Production RAG pipeline on 3.5M sacred texts. Hybrid search, Cohere reranking, strict anti-hallucination guardrails. Built with FastAPI, Weaviate, and Gemini.

Python, FastAPI, Weaviate, Gemini 2.5 Flash, Cohere Rerank, Next.js, shadcn/ui, Docker

Block 1 - Chat with Gemini

Build a working Torah chatbot that talks to Gemini directly.


Step 1: First LLM call [DONE]

Python script that sends a Torah question to Gemini and displays the answer. Google GenAI SDK, system prompts, input validation. 6 tests green.
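A minimal sketch of what such a script could look like, assuming the `google-genai` package is installed and an API key is set in the environment. The length limit, the placeholder system prompt, and the function names are hypothetical, not taken from the project.

```python
MAX_QUESTION_CHARS = 2000  # hypothetical limit, not from the project

# Placeholder system prompt; the project's real prompt is not shown here.
SYSTEM_PROMPT = "You are a Torah study partner. Never invent sources."


def validate_question(raw: str) -> str:
    """Return the trimmed question, or raise ValueError if it is unusable."""
    if not isinstance(raw, str):
        raise ValueError("question must be a string")
    question = raw.strip()
    if not question:
        raise ValueError("question must not be empty")
    if len(question) > MAX_QUESTION_CHARS:
        raise ValueError(f"question exceeds {MAX_QUESTION_CHARS} characters")
    return question


def ask_gemini(raw_question: str, model: str = "gemini-2.5-flash") -> str:
    """Validate the question, then send it to Gemini with the system prompt."""
    question = validate_question(raw_question)

    # Imported lazily so the validation logic can be tested without the SDK.
    from google import genai
    from google.genai import types

    client = genai.Client()  # reads the API key from the environment
    response = client.models.generate_content(
        model=model,
        contents=question,
        config=types.GenerateContentConfig(system_instruction=SYSTEM_PROMPT),
    )
    return response.text
```

Validating before the network call keeps the tests fast: `validate_question` can be exercised without touching the API.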

Read the full walkthrough ->


Step 2: Next.js frontend [DONE]

Chat interface built with Next.js and shadcn/ui. Cal AI-inspired design (Bricolage Grotesque + Inter fonts, monochrome with an orange accent). Responses stream over SSE and render word by word.


Step 3: Prompt engineering [DONE]

Chavruta system prompt with structured output format (TL;DR, Sources, Explanation). Guardrails: never invent sources, halakhic disclaimer rendered as amber warning box. SSE chunks encoded as JSON to preserve Markdown newlines. 5 prompt-specific tests green.
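Why JSON-encode the chunks: in SSE a blank line ends an event, so a raw Markdown chunk containing newlines would be split or mangled in transit. A rough sketch of the idea, using only the standard library; the `text` key and the function names are assumptions, not the project's actual wire format.

```python
import json


def encode_sse_chunk(text: str) -> str:
    """Wrap one text chunk in a single SSE event.

    json.dumps escapes real newlines as the two characters backslash + n,
    so the payload always fits on one "data:" line.
    """
    payload = json.dumps({"text": text}, ensure_ascii=False)
    return f"data: {payload}\n\n"


def decode_sse_event(event: str) -> str:
    """Recover the original chunk from a single 'data: ...' event."""
    line = event.strip()
    if not line.startswith("data: "):
        raise ValueError("not an SSE data event")
    return json.loads(line[len("data: "):])["text"]
```

On the frontend side the equivalent of `decode_sse_event` is a `JSON.parse` on each event's data, which restores the Markdown newlines exactly.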

Read the full walkthrough ->


Block 2 - Make it a real app

Add a backend, auth, persistence, and ship it live.


Step 4: FastAPI backend [DONE]

REST API with POST /chat (sync) and POST /chat/stream (SSE). Health check endpoint. Error handling. CORS configured for Next.js. 4 API tests green.

Read the full walkthrough ->


Step 5: Auth + Save conversations [DONE]

JWT authentication (bcrypt password hashing + signed token). SQLite database with users, sessions, and messages tables. CRUD endpoints for sessions. 7 auth tests green. MFA deferred to production deploy.
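One possible shape for the three tables, sketched with the standard-library `sqlite3` module. Only the table names come from the step above; every column name, constraint, and helper function here is an assumption for illustration.

```python
import sqlite3

# Hypothetical schema: table names from the step above, columns assumed.
SCHEMA = """
CREATE TABLE IF NOT EXISTS users (
    id INTEGER PRIMARY KEY,
    email TEXT NOT NULL UNIQUE,
    password_hash TEXT NOT NULL
);
CREATE TABLE IF NOT EXISTS sessions (
    id INTEGER PRIMARY KEY,
    user_id INTEGER NOT NULL REFERENCES users(id) ON DELETE CASCADE,
    title TEXT NOT NULL DEFAULT 'New chat',
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
CREATE TABLE IF NOT EXISTS messages (
    id INTEGER PRIMARY KEY,
    session_id INTEGER NOT NULL REFERENCES sessions(id) ON DELETE CASCADE,
    role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
    content TEXT NOT NULL,
    created_at TEXT NOT NULL DEFAULT CURRENT_TIMESTAMP
);
"""


def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Open the database and create the tables if they do not exist."""
    conn = sqlite3.connect(path)
    conn.execute("PRAGMA foreign_keys = ON")  # SQLite leaves this off by default
    conn.executescript(SCHEMA)
    return conn


def list_messages(conn: sqlite3.Connection, user_id: int, session_id: int):
    """Return (role, content) rows, but only if the session belongs to the user."""
    rows = conn.execute(
        "SELECT m.role, m.content FROM messages m "
        "JOIN sessions s ON s.id = m.session_id "
        "WHERE s.id = ? AND s.user_id = ? ORDER BY m.id",
        (session_id, user_id),
    )
    return rows.fetchall()
```

Joining through `sessions` on every read means a valid token for one user cannot be used to read another user's conversation by guessing a session id.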

Read the full walkthrough ->


Step 6: History in frontend [DONE]

Login/register screen. Sidebar with past sessions, new chat button, sign out. Messages saved automatically. Click to resume a conversation. Token persisted in localStorage.

Read the full walkthrough ->


Step 7: Docker [DONE]

Dockerfiles for FastAPI (Python 3.14-slim) and Next.js (Node 22-alpine). docker-compose.yml with frontend + api. Weaviate already runs on Elestio, so it is not part of the compose file.

Read the full walkthrough ->


Step 8: Deploy to Elestio [SKIPPED]

Skipped for now. Will deploy after Block 4 is complete.


Block 3 - Add Sefaria knowledge

Build the RAG pipeline. Answers based on real texts.


Step 9: Load Sefaria datasets [DONE]

Downloaded 886K English texts + 3.55M Hebrew texts from HuggingFace. Pre-chunked by Sefaria (verse, mishna, sugya). 17 categories: Commentary (191K), Tanakh (153K), Talmud (117K), Liturgy (100K), Halakhah (74K), and more.

Read the full walkthrough ->


Step 10: NER enrichment [SKIPPED]

Skipped for MVP. Sefaria metadata already includes a ref field for each text. NER would detect inline citations (when one text references another), which is useful for V2 Graph RAG but not blocking.


Step 11: Embeddings + Weaviate [DONE - partial]

94,635 English texts indexed with Gemini Embedding 001 (3072d) in Weaviate. Indexing stopped when the spending cap was hit at ~20 shekels. Plan: migrate to Qwen3-Embedding-0.6B running locally (free, 89.4% recall on Sefaria benchmark).

Read the full walkthrough (includes the cost disaster) ->


Step 12: RAG pipeline [DONE]

Refactored to LangChain LCEL: ChatPromptTemplate | ChatGoogleGenerativeAI | StrOutputParser. RAGPipeline class with .invoke() and .stream(). Hybrid search + Cohere rerank kept as raw SDK calls (LangChain limitation). Relevance threshold 0.3 with smart fallback.
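The threshold step can be illustrated in a few lines. This is a sketch under assumptions: the fallback behavior is not described above, so the version here (keep the best passage and flag it as weak) is a guess, and `Passage`, `select_context`, and `fallback_k` are hypothetical names.

```python
from dataclasses import dataclass

RELEVANCE_THRESHOLD = 0.3  # value stated in the step above


@dataclass
class Passage:
    ref: str      # e.g. a Sefaria reference
    text: str
    score: float  # reranker relevance score, assumed to lie in [0, 1]


def select_context(passages, threshold=RELEVANCE_THRESHOLD, fallback_k=1):
    """Return (kept_passages, used_fallback).

    Passages at or above the threshold are kept, best first. If none
    qualify, the assumed fallback keeps the top `fallback_k` passages and
    sets the flag so the caller can answer cautiously or decline.
    """
    ranked = sorted(passages, key=lambda p: p.score, reverse=True)
    kept = [p for p in ranked if p.score >= threshold]
    if kept:
        return kept, False
    return ranked[:fallback_k], True
```

Returning the flag separately lets the prompt layer decide what to do with weak context, for example telling the model to say that no reliable source was found.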


Step 13: Hybrid search + Parashah [TODO]

Add BM25 keyword search alongside vector search (alpha=0.5 in Weaviate). Metadata filtering by book/category. Parashah of the week via Sefaria API /calendars.
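What alpha does, roughly: in Weaviate's hybrid search, alpha=1 is pure vector search and alpha=0 is pure keyword search, so 0.5 weighs both equally. Weaviate performs the fusion internally; the sketch below only mimics the idea with min-max normalized scores, and the function names and sample scores are made up for illustration.

```python
def min_max(scores: dict) -> dict:
    """Rescale raw scores to [0, 1] so vector and BM25 scores are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc_id: 1.0 for doc_id in scores}
    return {doc_id: (s - lo) / (hi - lo) for doc_id, s in scores.items()}


def hybrid_scores(vector: dict, keyword: dict, alpha: float = 0.5) -> list:
    """Fuse two result sets: alpha weights vector scores, (1 - alpha) keyword."""
    v, k = min_max(vector), min_max(keyword)
    fused = {
        doc_id: alpha * v.get(doc_id, 0.0) + (1 - alpha) * k.get(doc_id, 0.0)
        for doc_id in set(v) | set(k)
    }
    # Best fused score first.
    return sorted(fused.items(), key=lambda item: item[1], reverse=True)
```

A document found by both searches (even at mid rank in one of them) tends to beat a document found by only one, which is the behavior you want for queries that mix exact terms like a book name with a conceptual question.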


Block 4 - Make it reliable

Track quality, measure performance, ship the final product.


Step 14: LangFuse observability [TODO]

Track every RAG request: search results, reranking scores, generation. See exactly why an answer is good or bad. LangFuse deployed on Elestio.


Step 15: Ragas evaluation [TODO]

Measure RAG quality on 50 test questions. Metrics: faithfulness, relevancy, context precision. LLM-as-Judge with Torah-specific rubric. On sacred texts, a hallucination is not just an error - it is an offense.
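To make two of these metrics concrete, here is a simplified sketch of how they reduce to arithmetic once a judge has produced verdicts. It follows the Ragas definitions as I understand them (faithfulness as the share of supported claims, context precision as the mean of precision@k over relevant ranks); the real library also handles claim extraction and the LLM judging, which are left out here.

```python
def faithfulness(claim_verdicts: list[bool]) -> float:
    """Share of answer claims the judge marked as supported by the context."""
    if not claim_verdicts:
        return 0.0
    return sum(claim_verdicts) / len(claim_verdicts)


def context_precision(relevance: list[bool]) -> float:
    """Mean of precision@k, taken over the ranks k that hold a relevant chunk.

    `relevance[i]` says whether the chunk at rank i + 1 was judged relevant.
    Relevant chunks ranked early push the score toward 1.
    """
    hits = 0
    total = 0.0
    for k, relevant in enumerate(relevance, start=1):
        if relevant:
            hits += 1
            total += hits / k  # precision@k at this relevant rank
    return total / hits if hits else 0.0
```

For example, an answer with four claims of which one is unsupported scores 0.75 on faithfulness; for a Torah chatbot that one unsupported claim is exactly the case the evaluation needs to surface.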


Step 16: Integration + deploy [TODO]

Deploy the full stack on Elestio (Frontend + Backend + Weaviate + LangFuse). Every answer has verifiable Sefaria sources with clickable links.