AI Document Chat
Chat with your PDFs using Llama 3.2 3B AI. Offline, private, and secure.
Initialize AI Model
Load an AI model optimized for your device to start chatting with your documents. The model runs entirely in your browser.
First load: 1-3 minutes. Model will be cached for future use.
How Offline Browser AI Works
First-Time Download (Internet Required)
When you click "Load AI Model", your browser downloads the compressed Llama 3.2 3B model (1-3GB) from a CDN. This takes 1-3 minutes depending on your connection speed. The model is downloaded in chunks for reliability.
Download: model.wasm (1.2GB) + weights.bin (1.8GB)
Storage: Browser IndexedDB (permanent cache)
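A minimal sketch of what this loading step could look like with the WebLLM library; the model ID, progress handling, and function names here are illustrative assumptions, not this app's exact code.

```typescript
import { CreateMLCEngine, type InitProgressReport } from "@mlc-ai/web-llm";

// Hypothetical model ID; the quantization/build this app actually ships may differ.
const MODEL_ID = "Llama-3.2-3B-Instruct-q4f16_1-MLC";

// Surface chunked download progress while the weights stream from the CDN.
function onProgress(report: InitProgressReport): void {
  console.log(`${Math.round(report.progress * 100)}% - ${report.text}`);
}

// First run downloads and caches the model; later runs load it from the browser cache.
export async function loadModel() {
  return CreateMLCEngine(MODEL_ID, { initProgressCallback: onProgress });
}
```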
Model Caching (Permanent Storage)
The downloaded model is stored in your browser's IndexedDB. This is persistent storage that survives browser restarts, system reboots, and even browser updates. You only download it once.
After this step, internet is NO LONGER REQUIRED
Next loads: 10-30 seconds from cache
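Browsers can evict cached data under storage pressure; below is a sketch, using the standard Storage API, of how an app can request persistent storage so the cached weights stay put. Whether this app makes that request is an assumption.

```typescript
// Ask the browser to mark this origin's storage (IndexedDB / Cache API, where the
// model weights are cached) as persistent so it is not evicted automatically.
export async function ensurePersistentCache(): Promise<void> {
  if (navigator.storage?.persist) {
    const persisted =
      (await navigator.storage.persisted()) || (await navigator.storage.persist());
    const { usage, quota } = await navigator.storage.estimate();
    console.log(`Persistent: ${persisted}, using ${usage} of ${quota} bytes`);
  }
}
```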
Loading into RAM & GPU Memory
The cached model loads into your device's RAM and GPU memory using WebLLM. WebGPU provides substantial GPU acceleration, typically 10-100x faster than CPU inference. The model stays in memory while the page is open.
RAM Usage: ~3-4GB for model + inference
GPU: WebGPU acceleration (or WebGL fallback)
Close the tab to free memory
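A sketch of a WebGPU availability check with the standard navigator.gpu API before the model is loaded; the exact fallback behaviour (WebGL or CPU) is an assumption here.

```typescript
// Returns "webgpu" when GPU acceleration is available; otherwise the caller can
// choose a slower fallback (the fallback path used by this app is an assumption).
export async function detectBackend(): Promise<"webgpu" | "fallback"> {
  const gpu = (navigator as any).gpu; // WebGPU typings may require @webgpu/types
  if (gpu) {
    const adapter = await gpu.requestAdapter();
    if (adapter) {
      return "webgpu"; // Model weights and the KV cache will live in GPU memory.
    }
  }
  return "fallback"; // No WebGPU adapter; inference would be much slower.
}
```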
PDF Upload & Text Extraction (100% Local)
Upload your PDF - text extraction happens in the browser using pdf.js. Your document is parsed, its text is extracted, and the content is chunked for efficient AI processing. Everything stays in browser memory.
Your PDF NEVER leaves your device
No uploads, no network requests, no servers
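A minimal sketch of local extraction and chunking with pdf.js (pdfjs-dist); the chunk size and overlap values are illustrative assumptions, not this app's actual settings.

```typescript
import * as pdfjsLib from "pdfjs-dist";

// pdf.js needs its worker script configured once, e.g. from a bundled asset:
// pdfjsLib.GlobalWorkerOptions.workerSrc = "/pdf.worker.min.mjs";

// Extract text from every page of the PDF, entirely in browser memory.
export async function extractText(data: ArrayBuffer): Promise<string> {
  const pdf = await pdfjsLib.getDocument({ data }).promise;
  const pages: string[] = [];
  for (let i = 1; i <= pdf.numPages; i++) {
    const page = await pdf.getPage(i);
    const content = await page.getTextContent();
    pages.push(
      content.items.map((item: any) => ("str" in item ? item.str : "")).join(" ")
    );
  }
  return pages.join("\n\n");
}

// Split extracted text into overlapping chunks for retrieval.
// 1000 characters with 200 overlap are illustrative values only.
export function chunkText(text: string, size = 1000, overlap = 200): string[] {
  const chunks: string[] = [];
  for (let start = 0; start < text.length; start += size - overlap) {
    chunks.push(text.slice(start, start + size));
  }
  return chunks;
}
```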
AI Inference (Zero Network Calls)
When you ask a question, the AI performs semantic search to find the relevant document sections, then generates a contextual answer token by token on your GPU. All processing happens in the browser's JavaScript runtime.
Processing: Browser JavaScript + WebGPU
Speed: 10-50 tokens/second (depends on GPU)
100% offline - disconnect internet if you want
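A sketch of the question-answering step: a naive keyword-overlap ranking stands in here for whatever semantic search this app actually uses (an assumption), and the answer is streamed token by token through WebLLM's OpenAI-style chat API.

```typescript
import type { MLCEngine } from "@mlc-ai/web-llm";

// Naive keyword-overlap ranking, standing in for the app's semantic search.
function topChunks(question: string, chunks: string[], k = 3): string[] {
  const terms = question.toLowerCase().split(/\W+/).filter(Boolean);
  return chunks
    .map((chunk) => ({
      chunk,
      score: terms.filter((t) => chunk.toLowerCase().includes(t)).length,
    }))
    .sort((a, b) => b.score - a.score)
    .slice(0, k)
    .map((entry) => entry.chunk);
}

// Stream an answer token by token from the locally loaded model; no network calls.
export async function ask(
  engine: MLCEngine,
  question: string,
  chunks: string[]
): Promise<string> {
  const context = topChunks(question, chunks).join("\n---\n");
  const stream = await engine.chat.completions.create({
    stream: true,
    messages: [
      { role: "system", content: `Answer using only this document context:\n${context}` },
      { role: "user", content: question },
    ],
  });
  let answer = "";
  for await (const part of stream) {
    answer += part.choices[0]?.delta?.content ?? ""; // Append each streamed token.
  }
  return answer;
}
```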
Current Status
No model loaded yet. Click "Load AI Model" to download and cache the AI model.