Ternary (1.58-bit) language models compiled to WebAssembly. No server, no API key, nothing leaves your browser. Pick a model below; it downloads only when you choose it.
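Why "1.58-bit"? A ternary weight takes one of three values (-1, 0, +1), so it carries log2(3) ≈ 1.585 bits of information. The sketch below shows one common way to get close to that on disk: packing five ternary weights into a single byte (3^5 = 243 fits in 256), which costs 1.6 bits per weight. The function names and the packing layout are illustrative assumptions, not the format these models actually ship in.

```python
import math

# Information content of one ternary weight, in bits (about 1.585).
BITS_PER_TRIT = math.log2(3)


def pack_trits(weights):
    """Pack ternary weights (-1, 0, +1) into bytes, five weights per byte.

    Hypothetical layout: base-3 digits, first weight in the least
    significant digit. 3**5 = 243 <= 256, so five digits fit in one byte.
    """
    out = bytearray()
    for i in range(0, len(weights), 5):
        group = weights[i:i + 5]
        value = 0
        for w in reversed(group):
            if w not in (-1, 0, 1):
                raise ValueError(f"not a ternary weight: {w}")
            value = value * 3 + (w + 1)  # map -1/0/+1 to digit 0/1/2
        out.append(value)
    return bytes(out)


def unpack_trits(data, count):
    """Inverse of pack_trits; `count` trims the padding in the last byte."""
    weights = []
    for byte in data:
        for _ in range(5):
            weights.append(byte % 3 - 1)  # digit 0/1/2 back to -1/0/+1
            byte //= 3
    return weights[:count]


weights = [1, 0, -1, -1, 1, 0, 1]
packed = pack_trits(weights)
restored = unpack_trits(packed, len(weights))
```

Here seven weights occupy two bytes, and `restored` equals the original list. Real runtimes may prefer a 2-bit-per-weight layout instead, trading a little size for simpler decoding.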
Choose a model to download
General chat & Q&A, grounded in a local search index (RAG: cites its sources or abstains), with structured tool-calling. The all-rounder.
Writes original short stories from any prompt — fully generated on your device. Tiny, fast, kid-friendly. Trained from scratch on TinyStories.
A 1-bit (binary {-1,+1}) story model that beats TinyStories-1M on bits per byte (BPB 0.575 vs 0.707, lower is better). Every projection weight is a single bit, distilled from a strong teacher. Runs offline in this tab.
Best on-browser story model: 18.7M ternary params, 16k vocab (zero garbled words), 4096-token context (~3700 words). Distilled from a 300M fp teacher via top-64 KL + Muon + WSD. BPB 0.528 on GPT-4 TinyStories. Fully offline — nothing leaves your device.
The best tiny storyteller: 6.2M params, ternary, 6k-vocab — distilled from a stronger teacher. BPB 0.519 (beats eeny 0.625 & TinyStories-1M 0.707). Cleaner words, on-device.
Strongest chat model — Qwen3-0.6B distilled to 1.58-bit ternary, runs fully offline. ChatML, instruction-following. The flagship on-device assistant.
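A note on the BPB numbers quoted in the story-model cards above: bits per byte measures how many bits the model needs, on average, to encode each UTF-8 byte of held-out text, and lower is better. Because it normalizes by bytes rather than tokens, it lets models with different vocabularies (6k vs 16k, for example) be compared directly. A minimal sketch of the conversion, assuming you already have per-token negative log-likelihoods in nats; the function name and inputs are hypothetical, and the exact evaluation setup behind the figures above is not specified here.

```python
import math


def bits_per_byte(token_nll_nats, text):
    """Convert per-token negative log-likelihoods (nats) to bits per UTF-8 byte.

    token_nll_nats: one NLL value per token of `text`, as a model would report.
    text: the evaluated string; its UTF-8 length is the normalizer.
    """
    n_bytes = len(text.encode("utf-8"))
    if n_bytes == 0:
        raise ValueError("text must be non-empty")
    total_bits = sum(token_nll_nats) / math.log(2)  # nats to bits
    return total_bits / n_bytes


# Made-up numbers for illustration: 8 tokens, each costing exactly 1 bit,
# spread over a 16-byte string.
example_text = "Once upon a time"
example_nll = [math.log(2)] * 8
example_bpb = bits_per_byte(example_nll, example_text)
```

In this toy case the model spends 8 bits on 16 bytes, so `example_bpb` comes out at 0.5.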
Downloads once, then stays cached in your browser (IndexedDB), so it keeps working with the network off.