Case Study

Year: 2025-2026

AskMyDoc

A retrieval-augmented document assistant that evolved from a RAG prototype into a reliable multi-user AI product.

Overview

AskMyDoc began as a deliberate RAG learning project: upload a PDF, retrieve relevant passages, and answer questions with grounded context.

Over multiple iterations it evolved from a single-document demo into a multi-user product with authentication, persistent workspaces, contract-first APIs, lifecycle-aware document handling, and evidence-aware answer validation.

How It Evolved

Learn the RAG pipeline

Built the core document workflow end to end: extraction, chunking, embeddings, retrieval, and grounded answer generation.
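The chunking and retrieval steps can be illustrated with a minimal sketch. It assumes fixed-size character windows with overlap and uses a toy word-count similarity in place of a real embedding model and vector store. It is not the project's actual code, and every function name and parameter here (`chunk_text`, `embed`, `cosine`, `retrieve`, `size`, `overlap`, `k`) is hypothetical.

```python
import math
from collections import Counter


def chunk_text(text: str, size: int = 200, overlap: int = 50) -> list[str]:
    """Split text into fixed-size character windows that overlap."""
    if size <= 0 or not 0 <= overlap < size:
        raise ValueError("need size > 0 and 0 <= overlap < size")
    step = size - overlap
    chunks = []
    for start in range(0, len(text), step):
        chunks.append(text[start:start + size])
        if start + size >= len(text):
            break  # the last window already reached the end of the text
    return chunks


def embed(text: str) -> Counter:
    """Toy stand-in for an embedding model: lowercase word counts."""
    return Counter(text.lower().split())


def cosine(a: Counter, b: Counter) -> float:
    """Cosine similarity between two sparse word-count vectors."""
    dot = sum(a[t] * b[t] for t in a.keys() & b.keys())
    norm_a = math.sqrt(sum(v * v for v in a.values()))
    norm_b = math.sqrt(sum(v * v for v in b.values()))
    return dot / (norm_a * norm_b) if norm_a and norm_b else 0.0


def retrieve(question: str, chunks: list[str], k: int = 2) -> list[str]:
    """Return the k chunks most similar to the question."""
    q = embed(question)
    ranked = sorted(chunks, key=lambda c: cosine(q, embed(c)), reverse=True)
    return ranked[:k]
```

The overlap keeps a sentence that straddles a window boundary visible in both neighbouring chunks, at the cost of storing some text twice.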

Productize the experience

Added authentication, user-owned documents, conversation persistence, search, history, and a workspace model that felt closer to a real AI product.

Improve trust and reliability

Focused on contracts, ownership enforcement, lifecycle safety, structured answers, evidence-aware validation, and stronger test coverage.
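Lifecycle safety for deletion can be pictured with a small sketch. It assumes a document's cleanup is a sequence of named steps (for example vectors, stored file, database row) and that the record remembers which steps finished, so a retry resumes after a partial failure instead of starting over. The state names, record fields, and `delete_document` function are hypothetical and not taken from the project.

```python
from enum import Enum
from typing import Callable


class DeleteState(Enum):
    ACTIVE = "active"
    DELETING = "deleting"
    DELETE_FAILED = "delete_failed"
    DELETED = "deleted"


def delete_document(record: dict, steps: list[tuple[str, Callable[[], None]]]) -> dict:
    """Run cleanup steps in order, remembering which ones already finished."""
    record["state"] = DeleteState.DELETING
    done = record.setdefault("completed_steps", [])
    for name, step in steps:
        if name in done:
            continue  # finished on an earlier attempt; do not repeat it
        try:
            step()
        except Exception as exc:
            # Record the partial failure explicitly instead of pretending success.
            record["state"] = DeleteState.DELETE_FAILED
            record["error"] = f"{name}: {exc}"
            return record
        done.append(name)
    record["state"] = DeleteState.DELETED
    record.pop("error", None)
    return record
```

Calling `delete_document` again on a record in the `DELETE_FAILED` state skips the completed steps and retries only what is left.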

Why I Built This

I didn't build AskMyDoc to chase novelty. I built it to understand what it actually takes to ship an AI application people can trust.

Each version answered a different engineering question:

V1: Can I build a complete RAG pipeline?
↓
V2: Can that pipeline behave like a real product?
↓
V3: Can the product stay reliable under real-world failure modes?

That progression turned the project from a PDF chatbot into a hands-on study of contracts, trust boundaries, persistence, and trustworthy AI behavior.

Architecture

Authenticate User
↓
Upload + Validate PDF
↓
Extract + Chunk
↓
Embed + Index
↓
Retrieve by Intent
↓
Generate Structured Answer
↓
Validate Evidence + Citations
↓
Persist Workspace State

Key Features

Authentication and ownership-aware document access

Persistent conversations, document history, and workspace navigation

Contract-first frontend/backend integration using OpenAPI-generated TypeScript types

Structured JSON answers with citations, answer status, and retrieval metadata

Evidence-aware fallback behavior when context is weak or validation fails

Lifecycle-aware upload and deletion flows with partial-failure handling

Integration and UI testing for auth, retrieval, recovery, and async state transitions
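The structured-answer and fallback features above can be sketched roughly as follows. The sketch assumes an answer carries a status, text, and citations, and that a citation counts as supported only when its quote appears verbatim in the cited chunk. The field names, status values, fallback message, and `validate_answer` function are illustrative assumptions, not the project's real schema.

```python
from dataclasses import dataclass, field


@dataclass
class Citation:
    chunk_id: str
    quote: str


@dataclass
class Answer:
    status: str  # hypothetical values: "answered" or "insufficient_evidence"
    text: str
    citations: list[Citation] = field(default_factory=list)


FALLBACK_TEXT = "I could not find enough support in the document to answer that."


def validate_answer(draft: Answer, retrieved: dict[str, str], min_citations: int = 1) -> Answer:
    """Keep only citations backed by a retrieved chunk; fall back if too few survive."""
    supported = [
        c for c in draft.citations
        if c.quote and c.quote in retrieved.get(c.chunk_id, "")
    ]
    if draft.status != "answered" or len(supported) < min_citations:
        # Evidence is weak or missing: return an explicit refusal, not the draft text.
        return Answer(status="insufficient_evidence", text=FALLBACK_TEXT)
    return Answer(status="answered", text=draft.text, citations=supported)
```

Verbatim matching is the simplest possible check; a real validator would likely need something more tolerant of paraphrase and whitespace differences.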

Technical Decisions

FastAPI

Next.js

LangChain

ChromaDB

OpenAI

Supabase

Google OAuth

OpenAPI-generated TypeScript

What I Learned

RAG quality is only one part of the problem; product reliability becomes the harder challenge as the system grows.

Trusted, server-derived identity is safer than relying on client-supplied user identifiers for authorization.

Contract-first APIs reduce frontend/backend drift as request and response shapes get more complex.

Uploads, deletions, and cleanup need explicit lifecycle handling instead of simple success-or-failure assumptions.

A trustworthy AI product should surface uncertainty and decline to answer when the retrieved evidence is too weak to support one.

Testing becomes essential once auth, ownership, persistence, and async recovery behavior enter the system.
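The point about server-derived identity can be made concrete with a minimal sketch. It assumes an HMAC-signed token as a stand-in for a real session or OAuth token, and an in-memory owner table as a stand-in for the database. The secret, token format, `DOCUMENT_OWNERS` table, and function names are all hypothetical.

```python
import hashlib
import hmac
from typing import Optional

SECRET = b"demo-secret"  # hypothetical; a real deployment would load this from configuration

DOCUMENT_OWNERS = {"doc-1": "alice"}  # hypothetical document -> owner table


def sign(user_id: str) -> str:
    """Issue a token of the form '<user_id>.<hex signature>'."""
    sig = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    return f"{user_id}.{sig}"


def user_from_token(token: str) -> Optional[str]:
    """Return the user id only if the signature verifies on the server."""
    user_id, _, sig = token.rpartition(".")
    expected = hmac.new(SECRET, user_id.encode(), hashlib.sha256).hexdigest()
    if user_id and hmac.compare_digest(sig.encode(), expected.encode()):
        return user_id
    return None


def can_access(token: str, document_id: str, claimed_user_id: Optional[str] = None) -> bool:
    """Authorize from the verified token; the client-claimed id is deliberately ignored."""
    user_id = user_from_token(token)
    return user_id is not None and DOCUMENT_OWNERS.get(document_id) == user_id
```

Because `claimed_user_id` never reaches the ownership check, a client cannot gain access to another user's document by sending a different identifier in the request.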
