AI that runs entirely on your device

Ternary (1.58-bit) language models compiled to WebAssembly. No server, no API key, nothing leaves your browser. Pick a model below; it downloads only when you choose it.

0 servers · 100% offline after load · CPU/SIMD, no GPU · 1.58-bit ternary weights
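Where the "1.58-bit" figure comes from: a ternary weight takes one of three values, so it carries log2(3) ≈ 1.585 bits. The sketch below shows one common way to get there, assuming absmean rounding (as in BitNet b1.58-style models) and a base-3 packing of 5 weights per byte. The page doesn't state which quantizer or storage layout these models use, so both are assumptions, and the function names are illustrative.

```python
import math

def quantize_ternary(weights):
    """Absmean quantization (assumed scheme): scale by mean |w|, round, clip to {-1, 0, +1}."""
    scale = sum(abs(w) for w in weights) / len(weights) or 1.0
    trits = [max(-1, min(1, round(w / scale))) for w in weights]
    return trits, scale

def pack_trits(trits):
    """Pack 5 ternary values per byte as a base-3 number (3**5 = 243 fits in a byte)."""
    out = bytearray()
    for i in range(0, len(trits), 5):
        byte = 0
        for t in reversed(trits[i:i + 5]):
            byte = byte * 3 + (t + 1)  # map {-1, 0, +1} to digits {0, 1, 2}
        out.append(byte)
    return bytes(out)

def unpack_trits(data, count):
    """Inverse of pack_trits; `count` drops the padding in the last byte."""
    trits = []
    for byte in data:
        for _ in range(5):
            trits.append(byte % 3 - 1)
            byte //= 3
    return trits[:count]

weights = [0.9, -0.05, -1.2, 0.4, 0.0, 2.0, -0.7]
trits, scale = quantize_ternary(weights)
packed = pack_trits(trits)
print(trits, len(packed), "bytes;", 8 / 5, "bits per weight when packed")
print("information per ternary weight:", math.log2(3))
```

Packing 5 weights per byte costs 1.6 bits each, slightly above the 1.585-bit limit. Layers kept at higher precision, such as embeddings, would add to the download size on top of that.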

Choose a model to download

🧠

Assistant

Sprapp 0.5B · ternary · ~120 MB

General chat and Q&A, grounded in a local search index (RAG: it cites its sources or abstains), with structured tool calling. The all-rounder.

📖

Storyteller

TinyStories 120M · ternary · 8k vocab · ~47 MB

Writes original short stories from any prompt — fully generated on your device. Tiny, fast, kid-friendly. Trained from scratch on TinyStories.

🔬

1-bit TinyStories

1-bit · 7.7M · 6k vocab · BPB 0.575 · ~7.5 MB

A 1-bit story model: every projection weight is a single bit (binary, {-1, +1}), distilled from a strong teacher. At 0.575 bits per byte (BPB, lower is better) it beats TinyStories-1M (0.707). Runs offline in this tab.
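BPB (bits per byte) divides the model's total prediction cost by the size of the text in bytes rather than by its token count. That is why models with different vocabularies (6k, 8k, 16k) can be compared on one number. A minimal sketch, assuming per-token negative log-likelihoods in nats and UTF-8 byte counts; the exact evaluation setup behind the scores on this page isn't stated, and `bits_per_byte` is an illustrative name.

```python
import math

def bits_per_byte(token_nlls_nats, text):
    """Sum the per-token negative log-likelihoods, convert nats to bits,
    and divide by the UTF-8 length of the text those tokens cover."""
    n_bytes = len(text.encode("utf-8"))
    total_bits = sum(token_nlls_nats) / math.log(2)
    return total_bits / n_bytes

# Toy example: 8 tokens, each predicted with probability 1/2 (1 bit each),
# covering a 16-byte string.
text = "Once upon a time"
nlls = [math.log(2)] * 8
print(bits_per_byte(nlls, text))
```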

🌟

meeny v3

ternary · 18.7M · 16k vocab · BPB 0.528 · ~22 MB · 4096 ctx

Best in-browser story model: 18.7M ternary params, 16k vocab (zero garbled words), 4096-token context (~3700 words). Distilled from a 300M floating-point teacher using a top-64 KL loss, the Muon optimizer, and a warmup-stable-decay (WSD) learning-rate schedule. BPB 0.528 on GPT-4 TinyStories. Fully offline: nothing leaves your device.
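A top-64 KL loss compares the student to the teacher only on the teacher's 64 most likely next tokens, which avoids storing the full vocabulary distribution for every position. The sketch below is one possible form, assuming both distributions are renormalized over that top-k set. The page doesn't say how the remaining probability mass was handled, so treat that choice and the function names as assumptions.

```python
import math

def softmax(logits):
    m = max(logits)  # subtract the max for numerical stability
    exps = [math.exp(x - m) for x in logits]
    z = sum(exps)
    return [e / z for e in exps]

def topk_kl(teacher_logits, student_logits, k=64):
    """KL(teacher || student) restricted to the teacher's k most likely tokens.

    Assumption: both distributions are renormalized over the top-k set.
    """
    p = softmax(teacher_logits)
    q = softmax(student_logits)
    top = sorted(range(len(p)), key=lambda i: p[i], reverse=True)[:k]
    p_mass = sum(p[i] for i in top)
    q_mass = sum(q[i] for i in top)
    return sum(
        (p[i] / p_mass) * math.log((p[i] / p_mass) / (q[i] / q_mass))
        for i in top
    )

teacher = [2.0, 1.0, 0.0, -1.0]
student = [0.0, 1.0, 2.0, -1.0]
print(topk_kl(teacher, teacher, k=2))  # identical distributions give zero loss
print(topk_kl(teacher, student, k=2))  # a mismatch gives a positive loss
```

In training, this value would be averaged over positions and minimized with respect to the student's weights.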

✨

meeny v2

ternary · 6.2M · 6k vocab · BPB 0.519 · ~7 MB

The best tiny storyteller: 6.2M ternary params with a 6k vocab, distilled from a stronger teacher. BPB 0.519 (beats eeny at 0.625 and TinyStories-1M at 0.707). Cleaner words, on-device.

⚡

Qwen3 0.6B

Qwen3-0.6B · 1.58-bit ternary (QAT) · ~550 MB

The strongest chat model: Qwen3-0.6B distilled to 1.58-bit ternary weights, running fully offline. ChatML prompt format, instruction-following. The flagship on-device assistant.

Each model downloads once and is then cached in your browser (IndexedDB), so it works with the network off afterward.

LoRA adapter · 1 thread · MTP speculative · Constrained decode

temp · top-k 40 · CPU/SIMD · ⌘/Ctrl+Enter
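The temperature and top-k settings control how the next token is drawn. A minimal sketch of temperature plus top-k sampling, assuming the usual order (keep the k highest logits, scale by temperature, sample from the softmax). The default temperature of 1.0 is a placeholder because the page shows no value, and `sample_top_k` is an illustrative name, not this app's API.

```python
import math
import random

def sample_top_k(logits, temperature=1.0, k=40, rng=random):
    """Sample a token id from the k highest logits after temperature scaling."""
    if temperature <= 0:
        # Treat non-positive temperature as greedy decoding.
        return max(range(len(logits)), key=lambda i: logits[i])
    top = sorted(range(len(logits)), key=lambda i: logits[i], reverse=True)[:k]
    m = logits[top[0]]  # subtract the max for numerical stability
    weights = [math.exp((logits[i] - m) / temperature) for i in top]
    return rng.choices(top, weights=weights, k=1)[0]

logits = [0.1, 2.5, -1.0, 1.7, 0.3]
rng = random.Random(0)  # seeded so repeated runs draw the same tokens
print([sample_top_k(logits, temperature=0.8, k=3, rng=rng) for _ in range(5)])
```

Lower temperatures concentrate probability on the top candidates, and a smaller k removes unlikely tokens from consideration entirely.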