
Torah Study AI

in progress

A production RAG pipeline over 3.5M sacred texts: hybrid search, Cohere reranking, and strict anti-hallucination guardrails. Built with FastAPI, Weaviate, and Gemini.

Python · FastAPI · Weaviate · Gemini 2.5 Flash · Cohere Rerank · Next.js · shadcn/ui · Docker

Timeline

Block 1 - Chat with Gemini (1 week)

  • Step 1: First LLM call (Python + Google GenAI SDK)
  • Step 2: Next.js + shadcn/ui frontend (Cal AI design)
  • Step 3: Prompt engineering (chavruta system prompt)

Result: A chat that answers Torah questions using Gemini's general knowledge.
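Steps 1 and 3 can be sketched in a few lines. The prompt text and helper names below are illustrative, not the project's actual prompt; the SDK call follows the Google GenAI Python SDK, and the client reads its API key from the environment.

```python
# Illustrative sketch of Steps 1 + 3: one Gemini call with a
# chavruta-style system prompt. Prompt wording is hypothetical.

CHAVRUTA_SYSTEM_PROMPT = (
    "You are a chavruta (Torah study partner). Answer questions about "
    "Torah clearly, cite sources when you know them, and say so plainly "
    "when you are unsure."
)

def build_contents(question: str) -> str:
    """Combine the system instruction and the user question into one prompt."""
    return f"{CHAVRUTA_SYSTEM_PROMPT}\n\nQuestion: {question}"

def ask_gemini(question: str) -> str:
    # Imported lazily so the prompt helper works without the SDK installed.
    from google import genai  # pip install google-genai

    client = genai.Client()  # reads the API key from the environment
    response = client.models.generate_content(
        model="gemini-2.5-flash",
        contents=build_contents(question),
    )
    return response.text
```

The SDK also supports passing the system prompt via a generation config instead of concatenating it into the contents.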


Block 2 - Make it a real app (2 weeks)

  • Step 4: FastAPI backend (REST API + SSE streaming)
  • Step 5: Auth (JWT + bcrypt) + Save conversations (SQLite)
  • Step 6: Conversation history in frontend (sidebar + sessions)
  • Step 7: Docker (Dockerfiles + docker-compose)
  • Step 8: Deploy to Elestio (deferred to after Block 3)

Result: Working app with auth, persistence, Docker-ready. Deploy after RAG is added.
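Step 5's persistence layer can be sketched with the standard library alone. The schema and function names below are illustrative, not the project's actual code; only SQLite itself is taken from the step description.

```python
# Minimal sketch of conversation persistence in SQLite (Step 5).
import sqlite3

def init_db(path: str = ":memory:") -> sqlite3.Connection:
    """Create the messages table if it does not exist yet."""
    conn = sqlite3.connect(path)
    conn.execute(
        """CREATE TABLE IF NOT EXISTS messages (
               id INTEGER PRIMARY KEY AUTOINCREMENT,
               session_id TEXT NOT NULL,
               role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
               content TEXT NOT NULL,
               created_at TEXT DEFAULT CURRENT_TIMESTAMP
           )"""
    )
    return conn

def save_message(conn: sqlite3.Connection, session_id: str, role: str, content: str) -> None:
    conn.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )
    conn.commit()

def load_history(conn: sqlite3.Connection, session_id: str) -> list[tuple[str, str]]:
    """Return (role, content) pairs for one session, in insertion order."""
    rows = conn.execute(
        "SELECT role, content FROM messages WHERE session_id = ? ORDER BY id",
        (session_id,),
    )
    return [(role, content) for role, content in rows]
```

Keying messages by `session_id` is what lets the frontend sidebar (Step 6) list and reopen past conversations.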


Block 3 - Add Sefaria knowledge (3 weeks)

  • Step 9: Load Sefaria HuggingFace datasets (already chunked with metadata)
  • Step 10: NER enrichment (skipped for MVP)
  • Step 11: Embed with Gemini Embedding 001 + index in Weaviate (94K texts, English only)
  • Step 12: RAG pipeline with LangChain LCEL (retrieval + rerank + generation)
  • Step 13: Hybrid search (vector + BM25, alpha=0.5)

Result: Answers grounded in real texts from the 3.5M-text Sefaria corpus, with verifiable sources. Weekly Parashah feature.
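The alpha blending in Step 13 can be illustrated with a toy score-fusion function. In the real pipeline the fusion happens inside Weaviate's hybrid query; the code below only mimics the idea (alpha = 1.0 is pure vector search, alpha = 0.0 is pure BM25), and all names are hypothetical.

```python
# Toy illustration of hybrid search score fusion (Step 13).

def normalize(scores: dict[str, float]) -> dict[str, float]:
    """Min-max normalize scores to [0, 1] so both signals are comparable."""
    if not scores:
        return {}
    lo, hi = min(scores.values()), max(scores.values())
    if hi == lo:
        return {doc: 1.0 for doc in scores}
    return {doc: (s - lo) / (hi - lo) for doc, s in scores.items()}

def hybrid_rank(
    vector_scores: dict[str, float],
    bm25_scores: dict[str, float],
    alpha: float = 0.5,
) -> list[str]:
    """Rank documents by a weighted blend of vector and keyword scores."""
    vec, kw = normalize(vector_scores), normalize(bm25_scores)
    docs = set(vec) | set(kw)
    fused = {d: alpha * vec.get(d, 0.0) + (1 - alpha) * kw.get(d, 0.0) for d in docs}
    return sorted(fused, key=fused.get, reverse=True)
```

With alpha=0.5, a document that scores well on both semantic similarity and exact keyword match (e.g. a verse reference) outranks one that wins on only a single signal.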


Block 4 - Make it reliable (2 weeks)

  • Step 14: LangFuse observability
  • Step 15: Ragas evaluation (50 test questions)
  • Step 16: Full integration + deploy

Result: Production-ready with quality tracking and guardrails against hallucination.
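One guardrail from Block 4 can be sketched as a citation check: reject or flag any answer that cites a source the retriever never returned. The citation format (`[Genesis 1:1]`) and function name below are hypothetical, not the project's actual convention.

```python
# Illustrative anti-hallucination check: every citation in the answer
# must correspond to a retrieved Sefaria source.
import re

CITATION_RE = re.compile(r"\[([^\]]+)\]")  # assumed [Book chap:verse] format

def unsupported_citations(answer: str, retrieved_refs: set[str]) -> list[str]:
    """Return citations in the answer that were NOT among retrieved sources."""
    cited = CITATION_RE.findall(answer)
    return [ref for ref in cited if ref not in retrieved_refs]
```

A non-empty return value would trigger a retry or a warning in the UI; the same signal can feed the Ragas evaluation run as a faithfulness proxy.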


V2 - RAG improvements

  • CRAG (Corrective RAG) - reformulate query and retry when retrieval fails. Handles transliteration (Bereshit/Bereishit/Genesis)
  • HyDE (Hypothetical Document Embeddings) - generate a fake answer first, embed it, then search. Improves cross-language retrieval (Hebrew question -> English sources)
  • Hierarchical chunking - add book/chapter level vectors alongside verse-level chunks for better retrieval on broad questions
  • Few-shot examples in system prompt - add 2-3 ideal response examples to align LLM output format
  • LLM-as-Judge evaluation - automated rubric-based scoring instead of manual testing only

V2 - Product features

  • Level system (beginner / intermediate / advanced)
  • User progress memory
  • AI chavruta mode (Socratic method - the AI asks YOU questions)
  • Graph RAG using Sefaria's 3.74M links