Skillia
Back to Projects

Torah Study AI

in progress

Production RAG pipeline on 3.5M sacred texts. Hybrid search, Cohere reranking, strict anti-hallucination guardrails. Built with FastAPI, Weaviate, and Gemini.

PythonFastAPIWeaviateGemini 2.5 FlashCohere RerankNext.jsshadcn/uiDocker
Back to Instructions

Step 5: Save Conversations - How I Build

Adding SQLite persistence, JWT authentication, and conversation history to the Torah chatbot.

6 min read

The Goal

A chatbot without memory is useless. The user asks a question, gets an answer, refreshes the page, and everything is gone. This step adds:

  1. User accounts (register/login)
  2. Persistent conversations (sessions with messages)
  3. JWT authentication on all endpoints

SQLite Schema

Three tables. Simple relational design.

import sqlite3
from datetime import datetime

def init_db(db_path: str = "torah_study.db"):
    """Initialize the database with required tables."""
    conn = sqlite3.connect(db_path)
    cursor = conn.cursor()

    cursor.executescript("""
        CREATE TABLE IF NOT EXISTS users (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            email TEXT UNIQUE NOT NULL,
            password_hash TEXT NOT NULL,
            created_at TEXT DEFAULT (datetime('now'))
        );

        CREATE TABLE IF NOT EXISTS sessions (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            user_id INTEGER NOT NULL,
            title TEXT DEFAULT 'New conversation',
            created_at TEXT DEFAULT (datetime('now')),
            updated_at TEXT DEFAULT (datetime('now')),
            FOREIGN KEY (user_id) REFERENCES users(id)
        );

        CREATE TABLE IF NOT EXISTS messages (
            id INTEGER PRIMARY KEY AUTOINCREMENT,
            session_id INTEGER NOT NULL,
            role TEXT NOT NULL CHECK (role IN ('user', 'assistant')),
            content TEXT NOT NULL,
            created_at TEXT DEFAULT (datetime('now')),
            FOREIGN KEY (session_id) REFERENCES sessions(id)
        );
    """)

    conn.commit()
    conn.close()

Why these three tables:

  • users: email + hashed password. No username, no profile picture. Minimum viable auth.
  • sessions: a conversation thread. Has a title (auto-generated from first message) and timestamps.
  • messages: individual messages within a session. Role is either "user" or "assistant."

JWT Auth with bcrypt

from passlib.context import CryptContext
from jose import jwt
from datetime import datetime, timedelta
import os

pwd_context = CryptContext(schemes=["bcrypt"], deprecated="auto")

SECRET_KEY = os.getenv("JWT_SECRET", "dev-secret-change-in-production")
ALGORITHM = "HS256"
ACCESS_TOKEN_EXPIRE_HOURS = 24


def hash_password(password: str) -> str:
    """Hash a password with bcrypt."""
    return pwd_context.hash(password)


def verify_password(plain: str, hashed: str) -> bool:
    """Verify a password against its hash."""
    return pwd_context.verify(plain, hashed)


def create_token(user_id: int) -> str:
    """Create a JWT token with user_id and expiration."""
    expire = datetime.utcnow() + timedelta(hours=ACCESS_TOKEN_EXPIRE_HOURS)
    payload = {"sub": str(user_id), "exp": expire}
    return jwt.encode(payload, SECRET_KEY, algorithm=ALGORITHM)


def decode_token(token: str) -> int:
    """Decode a JWT token and return user_id."""
    payload = jwt.decode(token, SECRET_KEY, algorithms=[ALGORITHM])
    return int(payload["sub"])

Auth Endpoints

from fastapi import Depends, HTTPException, Header


class RegisterRequest(BaseModel):
    email: str = Field(..., pattern=r"^[\w\.-]+@[\w\.-]+\.\w+$")
    password: str = Field(..., min_length=8)


class LoginRequest(BaseModel):
    email: str
    password: str


class TokenResponse(BaseModel):
    token: str
    user_id: int


@app.post("/auth/register", response_model=TokenResponse)
async def register(request: RegisterRequest):
    """Register a new user."""
    conn = get_db()
    cursor = conn.cursor()

    # Check if email already exists
    cursor.execute("SELECT id FROM users WHERE email = ?", (request.email,))
    if cursor.fetchone():
        raise HTTPException(status_code=409, detail="Email already registered")

    # Create user
    password_hash = hash_password(request.password)
    cursor.execute(
        "INSERT INTO users (email, password_hash) VALUES (?, ?)",
        (request.email, password_hash),
    )
    conn.commit()
    user_id = cursor.lastrowid

    token = create_token(user_id)
    return TokenResponse(token=token, user_id=user_id)


@app.post("/auth/login", response_model=TokenResponse)
async def login(request: LoginRequest):
    """Login and get a JWT token."""
    conn = get_db()
    cursor = conn.cursor()

    cursor.execute(
        "SELECT id, password_hash FROM users WHERE email = ?", (request.email,)
    )
    row = cursor.fetchone()

    if not row or not verify_password(request.password, row[1]):
        raise HTTPException(status_code=401, detail="Invalid credentials")

    token = create_token(row[0])
    return TokenResponse(token=token, user_id=row[0])

Auto-Title from First Message

When a user starts a new conversation, the session title is "New conversation." After the first user message, I update the title to the first 50 characters of that message. Simple, effective, no LLM call needed.

def save_message(session_id: int, role: str, content: str):
    """Save a message and auto-title the session if it is the first user message."""
    conn = get_db()
    cursor = conn.cursor()

    cursor.execute(
        "INSERT INTO messages (session_id, role, content) VALUES (?, ?, ?)",
        (session_id, role, content),
    )

    # Auto-title: if this is the first user message, use it as the title
    if role == "user":
        cursor.execute(
            "SELECT COUNT(*) FROM messages WHERE session_id = ? AND role = 'user'",
            (session_id,),
        )
        count = cursor.fetchone()[0]
        if count == 1:  # This is the first user message
            title = content[:50] + ("..." if len(content) > 50 else "")
            cursor.execute(
                "UPDATE sessions SET title = ?, updated_at = datetime('now') WHERE id = ?",
                (title, session_id),
            )

    conn.commit()

The 7 Auth Tests

from fastapi.testclient import TestClient
from main import app

client = TestClient(app)


def test_register_success():
    """Should create user and return token."""
    response = client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    assert response.status_code == 200
    assert "token" in response.json()


def test_register_duplicate_email():
    """Should reject duplicate email with 409."""
    client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    response = client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "otherpass456",
    })
    assert response.status_code == 409


def test_register_weak_password():
    """Should reject password shorter than 8 chars."""
    response = client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "short",
    })
    assert response.status_code == 422


def test_login_success():
    """Should return token for valid credentials."""
    client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    response = client.post("/auth/login", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    assert response.status_code == 200
    assert "token" in response.json()


def test_login_wrong_password():
    """Should reject wrong password with 401."""
    client.post("/auth/register", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    response = client.post("/auth/login", json={
        "email": "[email protected]",
        "password": "wrongpassword",
    })
    assert response.status_code == 401


def test_login_nonexistent_user():
    """Should reject unknown email with 401."""
    response = client.post("/auth/login", json={
        "email": "[email protected]",
        "password": "securepass123",
    })
    assert response.status_code == 401


def test_protected_endpoint_no_token():
    """Should reject request without Authorization header."""
    response = client.get("/sessions")
    assert response.status_code == 401

Why SQLite for POC

I considered PostgreSQL, MongoDB, and even Supabase. I chose SQLite because:

  1. Zero infrastructure. No Docker container, no connection string, no migrations tool. Just a file.
  2. Fast enough. For a single-user POC with fewer than 1000 sessions, SQLite handles 100K+ reads/second.
  3. Easy to inspect. sqlite3 torah_study.db ".tables" shows me everything. No pgAdmin needed.
  4. Trivial to replace. SQLAlchemy can swap the backend later. The schema is standard SQL.

The tradeoff: no concurrent writes. If two users write at the exact same millisecond, one gets a "database is locked" error. For a POC, this is acceptable. For production, I will migrate to PostgreSQL on Elestio.

Lessons Learned

  • Auto-title is a UX win. Users never name their conversations. Using the first message as the title makes the sidebar instantly useful.
  • bcrypt + JWT is the minimum auth stack. No need for OAuth, no need for session cookies. One hashed password, one signed token.
  • Test with a fresh DB per test. Each test file creates a temporary database, runs tests, then deletes it. No state leaks between tests.
  • Pydantic email validation saves you from writing regex. The pattern field in Pydantic handles email format, but for production, use email-validator library.

What is Next

Step 6 adds the conversation history to the frontend: login screen, sidebar with sessions, and message persistence during streaming.