CAPABILITY · Client under NDA

Multi-Model AI Document Search & Citation Tool

AI document search that returns the exact passage with a citation back to the source page — not a summarised guess. Works across PDFs, books, scanned documents, and long technical reports.

Legal Tech · Document AI · Corporate Research · Academic Search · Compliance & Discovery · Knowledge Management · RAG Platforms · Enterprise Search
See it work

From upload to cited answer.

Researchers upload PDFs, books, and scanned documents. The pipeline OCRs the scanned ones, picks a chunking strategy per document type, and stores the embeddings in pgvector. Queries route through a multi-model layer — GPT-4, LLaMA, or Gemini — and every answer carries a page-level citation back to the source.

docs.example/search
viewing as · Researcher
Researcher · Uploading documents
📚 doc search
Upload
Library
Recent
Settings

Upload documents to your corpus

Researcher · Corpus #14 · 3 documents in queue

OCR enabled
Drag PDFs, books, or scanned documents
up to 500 MB · PDF · EPUB · scanned image PDFs welcome
merger-agreement.pdf
87 pages
Native PDF
parsing PDF…100%
engineering-handbook.pdf
412 pages
Native PDF
parsing PDF…82%
research-paper-scanned.pdf
24 pages (scanned)
OCR
running OCR…64%
3 documents queued · entering analyser
Demo only

This is an animated mockup of the document-search capability — not a live product. Document titles, page numbers, and answer text are illustrative.

01

OCR + PDF parser

Born-digital PDFs are parsed natively; scanned PDFs and image-based pages flow through OCR first. The same downstream pipeline sees clean text either way.
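The branch described above can be sketched in a few lines. This is an illustrative stub, not the production parser: `extract_text_layer` and `run_ocr` stand in for a real PDF library and OCR engine, and the page dicts are hypothetical.

```python
def extract_text_layer(page: dict) -> str:
    """Return the embedded text layer, or '' for image-only pages."""
    return page.get("text", "")

def run_ocr(page: dict) -> str:
    """Stand-in for an OCR pass over the page image."""
    return page.get("ocr_text", "")

def parse_page(page: dict) -> str:
    """Born-digital pages keep their text layer; scanned pages go
    through OCR. Either way, downstream stages see plain text."""
    text = extract_text_layer(page)
    return text if text.strip() else run_ocr(page)
```

The point of the single `parse_page` entry point is that the chunking and embedding stages never need to know which path a page took.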

02

Adaptive chunking engine

Paragraph-based for legal contracts, word-count for technical manuals, page-based for academic papers. One strategy across every document type would lose meaning at the boundaries.
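The three strategies reduce to three small functions. This is a minimal sketch; the function names and the 200-word window size are assumptions, not the production values.

```python
def chunk_by_paragraph(text: str) -> list[str]:
    """Legal contracts: split on blank lines so clauses stay intact."""
    return [p.strip() for p in text.split("\n\n") if p.strip()]

def chunk_by_words(text: str, size: int = 200) -> list[str]:
    """Technical manuals: fixed word-count windows."""
    words = text.split()
    return [" ".join(words[i:i + size]) for i in range(0, len(words), size)]

def chunk_by_page(pages: list[str]) -> list[str]:
    """Academic papers: one chunk per page keeps citations page-aligned."""
    return [p for p in pages if p.strip()]
```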

03

Vector store (pgvector)

Embeddings land in pgvector for fast cosine-similarity retrieval. Postgres is the same database the rest of the app uses — one less moving part to operate.
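For intuition, here is what pgvector's cosine-distance operator (`<=>`) computes, written out in plain Python. In production this runs inside Postgres over an indexed column; the table and column names in the comment are assumptions.

```python
import math

def cosine_distance(a: list[float], b: list[float]) -> float:
    """1 - cosine similarity, matching pgvector's `<=>` semantics."""
    dot = sum(x * y for x, y in zip(a, b))
    na = math.sqrt(sum(x * x for x in a))
    nb = math.sqrt(sum(x * x for x in b))
    return 1.0 - dot / (na * nb)

def top_k(query: list[float], rows: list[tuple[str, list[float]]], k: int = 3):
    """Roughly: SELECT id FROM chunks ORDER BY embedding <=> %(q)s LIMIT k"""
    return sorted(rows, key=lambda r: cosine_distance(query, r[1]))[:k]
```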

04

Multi-model RAG pipeline

GPT-4 for legal precision, LLaMA 3 for technical manuals, Gemini Pro for academic synthesis — routed through a common interface so the pipeline picks the right model per query.
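The routing idea is one call signature with the model chosen per document type. A hedged sketch: the route table mirrors the text above, but the default fallback and the `clients` shape are assumptions; in production the callables would be real SDK clients behind the common interface.

```python
ROUTES = {
    "legal": "gpt-4",        # legal precision
    "technical": "llama-3",  # technical manuals
    "academic": "gemini-pro" # academic synthesis
}

def route_query(doc_type: str, question: str, clients: dict) -> str:
    """Pick the model for this document type and dispatch the query."""
    model = ROUTES.get(doc_type, "gpt-4")  # default model is an assumption
    return clients[model](question)
```

Because callers only ever see `route_query`, swapping a model in or out is a one-line change to the route table, not an app-level change.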

05

Citation engine (page-level)

Every answer carries a citation back to the source page. Legal, compliance, and research workflows can verify each passage instead of trusting a summary.
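Concretely, a page-level citation just means every chunk carries (document, page) metadata from parsing onward, and the answer formats it back out. The field names and citation format below are illustrative assumptions.

```python
from dataclasses import dataclass

@dataclass
class Chunk:
    text: str       # the retrieved passage
    document: str   # source filename
    page: int       # 1-based page number, preserved through chunking

def cite(chunk: Chunk) -> str:
    """Render the passage with its page-level citation."""
    return f'"{chunk.text}" ({chunk.document}, p. {chunk.page})'
```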

06

Document structure analyser

A pre-processing analyser inspects each document, identifies its type, and picks the chunking strategy — instead of forcing one strategy across the whole corpus.
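At its simplest the analyser is a mapping from a document profile to a strategy name. The heuristics below are assumptions for illustration; a real analyser would also look at layout, headings, and text density.

```python
def pick_strategy(doc: dict) -> str:
    """Map a document profile to one of the three chunking strategies."""
    kind = doc.get("kind")
    if kind == "legal":
        return "paragraph"   # clauses are the natural unit
    if kind == "technical":
        return "word-count"  # long unbroken prose, fixed windows
    if kind == "academic":
        return "page"        # citations are page-aligned
    return "paragraph"       # fallback choice is an assumption
```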

What we built

AI document search that returns the exact passage with a citation back to the source page — not a summarised guess. Works across PDFs, books, scanned documents, and long technical reports.

How we built it

Documents are parsed (OCR if scanned), analysed for structure, and chunked using an adaptive strategy chosen per document. Embeddings land in a vector database; queries route through a multi-model RAG pipeline that picks the best LLM for the document type. Every answer carries a page-level citation back to the source.

Users upload PDFs, books, or scanned documents. The system inspects each document and picks the right chunking strategy automatically — paragraph-based for legal contracts, word-count for technical manuals, page-based for academic papers — instead of forcing one strategy on every document. Embeddings land in a vector store; when a user asks a question, retrieval finds the relevant chunks, and a multi-model layer routes the query to GPT-4, LLaMA, or Gemini depending on the document type. Answers include a citation pointing back to the exact source page, so legal and research workflows can verify every claim.

Architecture

How a request flows through it

Each request enters at the top of the diagram, flows through every box, and lands at the bottom — exactly the way the production system behaves. The scan-line traces where a live request would be right now.

tracing request flow
Document upload (PDF / DOCX / image)
OCR + PDF parser
Adaptive chunking analyser
(paragraph / word / page)
Embeddings PostgreSQL + pgvector
Query routed via LangChain
OpenAI GPT-4
LLaMA
Gemini
Answer + citation back to source page
↓ flow direction · ┌─┐ component
Stack

What it's built with

Capabilities
Multi-Model RAG Pipeline (GPT-4 · LLaMA · Gemini) · Adaptive Chunking Engine · Vector Store (pgvector) · OCR + PDF Parser · Citation Engine (page-level) · Document Structure Analyser · Semantic Search Layer
Engineering notes

The interesting parts

Adaptive chunking per document

A pre-processing analyser picks paragraph-, word-, or page-based chunking per document — instead of forcing one strategy across legal contracts, technical books, and corporate reports.

Multi-model behind a common interface

GPT-4, LLaMA, and Gemini routed through a common interface so the pipeline picks the best model per query without app-level changes.

Citation back to source page

Every answer carries a page-level citation back to the original document. That is what makes the tool usable in legal, compliance, and academic workflows, where a passage without a reference is just opinion.

OCR for scanned documents

Scanned PDFs and images flow through OCR before chunking, so the same search experience works on born-digital and scanned content alike.

Decisions

The calls that did most of the work

A handful of engineering choices shape how a system feels. Here are the ones we'd still defend — alongside what each one cost.

01

Adaptive chunking per document

Legal contracts, technical books, and corporate reports each have a different 'natural unit' for a query to land on — one fixed chunk size lands badly on at least one of them.

Tradeoff: A pre-processing analyser adds latency before the first query can run on a new document.

02

Multi-model behind a common interface

Different LLMs perform differently on legal vs technical vs academic text. A common interface lets the system pick per query without app-level changes.

Tradeoff: Three model contracts to test, three rate-limit budgets to manage, three places to chase up regressions.

03

Citation back to source page, not just text

A passage without a page reference is opinion; citing the exact source page is what makes the tool usable in legal and research workflows.

Tradeoff: The chunking and embedding layer has to carry page metadata through every transformation.

Want something like this?

Tell us what you're building.

Free 30-minute call. Real humans, real timelines, no follow-up emails forever.

See more capabilities