AI Document Chat

Chat with your PDFs using the Llama 3.2 3B model. Offline, private, and secure.

Initialize AI Model

Load an AI model optimized for your device to start chatting with your documents. The model runs entirely in your browser.

First load: 1-3 minutes. Model will be cached for future use.

How Offline Browser AI Works

1. First-Time Download (Internet Required)

When you click "Load AI Model", your browser downloads the compressed Llama 3.2 3B model from a CDN, about 3GB in total (see the file sizes below). On a fast connection this takes 1-3 minutes; slower connections take longer. The model is downloaded in chunks for reliability.

Download: model.wasm (1.2GB) + weights.bin (1.8GB)

Storage: Browser IndexedDB (persistent cache)
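
This page does not document how the chunked download is implemented. One common approach is to split the file into byte ranges and fetch each with an HTTP Range header, so a failed chunk can be retried on its own. The sketch below shows only the range planning; `planChunks` and `rangeHeader` are hypothetical names, not part of this tool.

```typescript
// Hypothetical helpers: the actual chunking scheme used by this page is not documented.
interface ByteRange {
  start: number; // first byte, inclusive
  end: number;   // last byte, inclusive (as in the HTTP Range header)
}

// Split a file of totalBytes into consecutive ranges of at most chunkBytes each.
function planChunks(totalBytes: number, chunkBytes: number): ByteRange[] {
  if (!Number.isInteger(totalBytes) || totalBytes < 0) {
    throw new RangeError("totalBytes must be a non-negative integer");
  }
  if (!Number.isInteger(chunkBytes) || chunkBytes <= 0) {
    throw new RangeError("chunkBytes must be a positive integer");
  }
  const ranges: ByteRange[] = [];
  for (let start = 0; start < totalBytes; start += chunkBytes) {
    ranges.push({ start, end: Math.min(start + chunkBytes, totalBytes) - 1 });
  }
  return ranges;
}

// Format a range as the value of an HTTP Range request header.
function rangeHeader(r: ByteRange): string {
  return `bytes=${r.start}-${r.end}`;
}
```

Each range could then be requested with `fetch(url, { headers: { Range: rangeHeader(r) } })`, assuming the CDN supports range requests.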

2. Model Caching (Permanent Storage)

The downloaded model is stored in your browser's IndexedDB. This storage survives browser restarts, system reboots, and browser updates, so you normally download only once. The cache is removed if you clear site data, and the browser may evict it when disk space runs low.

After this step, internet is NO LONGER REQUIRED

Next loads: 10-30 seconds from cache
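
Browsers generally treat IndexedDB as best-effort storage unless the site is granted persistence. A page can ask for persistence and check free quota through the standard `navigator.storage` API. The sketch below is an illustration under that assumption, not this tool's actual code; `checkCacheBudget` and `StorageLike` are hypothetical names, and the storage object is passed in so the function can be exercised outside a browser.

```typescript
// Minimal shape of the parts of navigator.storage used here.
interface StorageLike {
  persist(): Promise<boolean>;
  estimate(): Promise<{ usage?: number; quota?: number }>;
}

interface CacheBudget {
  persistent: boolean; // true if the browser agreed not to evict this origin's data
  fits: boolean;       // true if the remaining quota covers neededBytes
}

// Hypothetical helper: request persistence and check that the model fits in the quota.
async function checkCacheBudget(
  storage: StorageLike,
  neededBytes: number
): Promise<CacheBudget> {
  const { usage = 0, quota = 0 } = await storage.estimate();
  const persistent = await storage.persist();
  return { persistent, fits: quota - usage >= neededBytes };
}
```

In a browser this could be called as `checkCacheBudget(navigator.storage, 3e9)` before starting the download.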

3. Loading into RAM & GPU Memory

The cached model loads into your device's RAM and GPU memory using WebLLM. WebGPU runs inference on the GPU, which can be 10-100x faster than running on the CPU, depending on your hardware. The model stays in memory while the page is open.

RAM Usage: ~3-4GB for model + inference

GPU: WebGPU acceleration (or WebGL fallback)

Close tab to free memory
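
Whether WebGPU is available depends on the browser and device. A page can detect it by checking for `navigator.gpu` and requesting an adapter, which resolves to `null` when no suitable GPU is found. The sketch below shows that check; `pickBackend` and the `"fallback"` label are hypothetical, and how this tool actually selects its backend is not documented here.

```typescript
type Backend = "webgpu" | "fallback";

// Minimal shape of the parts of the Navigator object used here.
interface GpuLike {
  requestAdapter(): Promise<object | null>;
}
interface NavigatorLike {
  gpu?: GpuLike;
}

// Hypothetical helper: choose WebGPU when an adapter is available, otherwise fall back.
async function pickBackend(nav: NavigatorLike): Promise<Backend> {
  if (!nav.gpu) return "fallback"; // browser does not expose WebGPU at all
  try {
    const adapter = await nav.gpu.requestAdapter();
    return adapter ? "webgpu" : "fallback"; // null means no usable GPU
  } catch {
    return "fallback"; // treat any failure as "WebGPU unavailable"
  }
}
```

In a browser this could be called as `pickBackend(navigator as NavigatorLike)`.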

4. PDF Upload & Text Extraction (100% Local)

When you select a PDF, its text is extracted in the browser using pdf.js. The document is parsed, the text is extracted, and the content is split into chunks sized for efficient AI processing, all in browser memory.

Your PDF NEVER leaves your device

No uploads, no network requests, no servers
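
The page does not describe its chunking strategy. A simple and common one is fixed-size word windows with some overlap, so that a sentence cut at a boundary still appears whole in a neighboring chunk. The sketch below assumes that strategy; `chunkText` and its parameters are hypothetical.

```typescript
// Hypothetical helper: split text into chunks of `size` words, each sharing
// `overlap` words with the previous chunk.
function chunkText(text: string, size: number, overlap: number): string[] {
  if (!Number.isInteger(size) || size <= 0) {
    throw new RangeError("size must be a positive integer");
  }
  if (!Number.isInteger(overlap) || overlap < 0 || overlap >= size) {
    throw new RangeError("overlap must be an integer in [0, size)");
  }
  const words = text.split(/\s+/).filter((w) => w.length > 0);
  const chunks: string[] = [];
  const step = size - overlap; // how far the window advances each time
  for (let i = 0; i < words.length; i += step) {
    chunks.push(words.slice(i, i + size).join(" "));
    if (i + size >= words.length) break; // this window reached the end of the text
  }
  return chunks;
}
```

For example, `chunkText(pageText, 200, 40)` would produce 200-word chunks that each repeat the last 40 words of the previous one.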

5. AI Inference (Zero Network Calls)

When you ask a question, the AI performs a semantic search to find the relevant document sections, then generates a contextual answer token by token using your GPU. All processing happens in the browser's JavaScript runtime.

Processing: Browser JavaScript + WebGPU

Speed: 10-50 tokens/second (depends on GPU)

100% offline - disconnect internet if you want
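
How the semantic search is implemented is not stated on this page. A typical approach is to embed the question and each chunk as vectors and rank chunks by cosine similarity. The sketch below assumes the embeddings already exist as number arrays; `cosine` and `topK` are hypothetical names.

```typescript
// Cosine similarity between two equal-length vectors; 0 if either has zero length.
function cosine(a: number[], b: number[]): number {
  if (a.length !== b.length) {
    throw new RangeError("vectors must have the same length");
  }
  let dot = 0;
  let normA = 0;
  let normB = 0;
  for (let i = 0; i < a.length; i++) {
    dot += a[i] * b[i];
    normA += a[i] * a[i];
    normB += b[i] * b[i];
  }
  const denom = Math.sqrt(normA) * Math.sqrt(normB);
  return denom === 0 ? 0 : dot / denom;
}

interface ScoredChunk {
  index: number; // position of the chunk in the input array
  score: number; // cosine similarity to the query
}

// Hypothetical helper: return the k chunks most similar to the query, best first.
function topK(query: number[], chunks: number[][], k: number): ScoredChunk[] {
  return chunks
    .map((vector, index) => ({ index, score: cosine(query, vector) }))
    .sort((x, y) => y.score - x.score)
    .slice(0, Math.max(0, k));
}
```

The text of the top-ranked chunks would then be placed in the model's prompt alongside the question.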

