
ReadFlow
A native macOS reading desk for PDFs, built around reading-time capture.
Why this exists
Existing PDF readers treat highlighting as an afterthought: a coloured stripe, no provenance, lost the moment the layout reflows. But the act of reading is the work — what you notice, where you push back, what reminds you of something else. The brief was a reading desk where capture is the product: structured, resilient, on-device, and good enough that you don’t mind doing the slow thing.
The proof point
Capture-first reading plus an on-device LLM is a clean local-first AI shape. The interesting technical idea isn’t the model — it’s that resilient text anchors (quote + prefix + suffix + offsets) let captures survive any reflow, on any window width, and the AI sits behind that as enrichment, not as the product. The same pattern holds for any enterprise corpus where documents reflow and annotations have to outlive the layout.
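The anchor idea can be sketched in a few lines. This is a minimal illustration, not ReadFlow's implementation: the `TextAnchor` field names, the 16-character context window, and the scoring rule are all assumptions, and it presumes anchors are resolved against the document's extracted text rather than against page coordinates. Whitespace normalization and fuzzy matching are left out.

```typescript
// Hypothetical anchor shape; field names are assumptions, not ReadFlow's schema.
interface TextAnchor {
  quote: string;  // the captured text itself
  prefix: string; // characters immediately before the quote at capture time
  suffix: string; // characters immediately after the quote at capture time
  start: number;  // offset of the quote in the text at capture time
  end: number;    // offset just past the quote at capture time
}

// Build an anchor from a range of the extracted text.
function createAnchor(text: string, start: number, end: number, context = 16): TextAnchor {
  return {
    quote: text.slice(start, end),
    prefix: text.slice(Math.max(0, start - context), start),
    suffix: text.slice(end, end + context),
    start,
    end,
  };
}

// Number of characters on which a and b agree, counted from the end.
function commonSuffixLength(a: string, b: string): number {
  let n = 0;
  while (n < a.length && n < b.length && a[a.length - 1 - n] === b[b.length - 1 - n]) n++;
  return n;
}

// Number of characters on which a and b agree, counted from the start.
function commonPrefixLength(a: string, b: string): number {
  let n = 0;
  while (n < a.length && n < b.length && a[n] === b[n]) n++;
  return n;
}

// Re-locate an anchor in text that may have shifted. Returns the new start offset, or null.
function resolveAnchor(text: string, anchor: TextAnchor): number | null {
  if (anchor.quote.length === 0) return null;

  // Fast path: the stored offsets still point at the quote.
  if (text.slice(anchor.start, anchor.end) === anchor.quote) return anchor.start;

  // Slow path: score every occurrence of the quote by how well its surroundings
  // match the stored prefix and suffix; break ties by distance from the old offset.
  let best: number | null = null;
  let bestScore = -1;
  let bestDistance = Infinity;
  let from = 0;
  while (true) {
    const at = text.indexOf(anchor.quote, from);
    if (at === -1) break;
    const quoteEnd = at + anchor.quote.length;
    const before = text.slice(Math.max(0, at - anchor.prefix.length), at);
    const after = text.slice(quoteEnd, quoteEnd + anchor.suffix.length);
    const score = commonSuffixLength(before, anchor.prefix) + commonPrefixLength(after, anchor.suffix);
    const distance = Math.abs(at - anchor.start);
    if (score > bestScore || (score === bestScore && distance < bestDistance)) {
      best = at;
      bestScore = score;
      bestDistance = distance;
    }
    from = at + 1;
  }
  return best;
}
```

The offsets act as a cheap first guess, and the prefix and suffix disambiguate when the same quote appears more than once or the offsets have drifted.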
How it works
ReadFlow is a Tauri v2 desktop app: a React frontend hosted in the system webview, talking to a Rust backend across a single `invoke(...)` boundary. About sixty namespaced commands cover the surface area — books, journeys, context, preview notes, enrichment, voice. Cross-stack types live in a single `shared/schema.ts`, with Rust structs marked `camelCase` so the wire format matches the TypeScript 1:1.
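One way to picture a typed `invoke` boundary is a command map that ties each command name to its argument and result types. The sketch below is illustrative only: `Book`, `Commands`, `createClient`, and the command names `books_list` and `books_get` are invented, and the transport is injected so the example runs without Tauri. In the app, the transport would presumably be Tauri's `invoke` from `@tauri-apps/api/core`.

```typescript
// Hypothetical slice of a shared schema; ReadFlow's real types are not shown in the text.
interface Book {
  id: number;
  title: string;
  pageCount: number; // camelCase on the wire, matching the renamed Rust fields
}

// Command map: command name -> argument and result types.
// The command names here are invented for illustration.
interface Commands {
  books_list: { args: Record<string, never>; result: Book[] };
  books_get: { args: { id: number }; result: Book | null };
}

// Anything shaped like an invoke function can serve as the transport.
type Transport = (cmd: string, args: Record<string, unknown>) => Promise<unknown>;

// Wrap a transport so that every call is checked against the command map.
function createClient(transport: Transport) {
  return function call<K extends keyof Commands>(
    cmd: K,
    args: Commands[K]["args"],
  ): Promise<Commands[K]["result"]> {
    // The cast is where the contract is trusted: the backend must really
    // return the shape that the shared schema promises.
    return transport(cmd, args as Record<string, unknown>) as Promise<Commands[K]["result"]>;
  };
}
```

With this shape, `call("books_get", { id: 7 })` type-checks while `call("books_get", {})` or a misspelled command name fails at compile time, which is the practical benefit of routing everything through one boundary.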
All state lives on disk: a local SQLite database (WAL mode, foreign keys on) in the Tauri app-data directory, plus an uploads folder for PDFs and capture PNGs, an audio folder for voice-note WAVs, and a user-configured journey root with To Read / Reading / Completed subfolders. Migrations are idempotent and swallow “column already exists” errors, so the app boots cleanly on fresh installs and on years-old databases alike.
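The migration behaviour can be sketched as a loop that tolerates exactly one kind of failure. This is an assumption-laden illustration rather than the app's code (the real migrations run in the Rust backend): `Db`, `runMigrations`, and the sample statements in the usage note are hypothetical. The one grounded detail is that SQLite reports a repeated `ADD COLUMN` with a message beginning "duplicate column name".

```typescript
// Minimal database interface for the sketch; any SQLite binding with a
// "run this SQL, throw on error" call could be adapted to it.
interface Db {
  exec(sql: string): void;
}

// SQLite's message for re-adding an existing column starts with this text.
const DUPLICATE_COLUMN = /duplicate column name/i;

// Run every statement on every boot. Returns how many statements changed the schema.
function runMigrations(db: Db, statements: readonly string[]): number {
  let applied = 0;
  for (const sql of statements) {
    try {
      db.exec(sql);
      applied++;
    } catch (err) {
      const message = err instanceof Error ? err.message : String(err);
      // Only the "already migrated" case is swallowed; anything else is a real failure.
      if (!DUPLICATE_COLUMN.test(message)) throw err;
    }
  }
  return applied;
}
```

A list such as `["ALTER TABLE books ADD COLUMN isbn TEXT"]` can then be replayed on every launch: a fresh database applies it, an older one that already has the column skips it, and a genuine error (a typo, a locked file) still surfaces.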
AI is pluggable and on-device by default. A bundled `llama_cpp` runs Llama 3.2 3B Instruct (Q4_K_M), with the model downloaded on first launch; users can swap to Ollama, OpenAI, Gemini, or Claude. Enrichment is a deliberate cascade (Google Books, then Open Library, then the LLM), and every auto-filled field carries a provenance tag (`pdf | epub | google_books | open_library | llm | edited`) so the user can see where each value came from and decide how far to trust it.
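The cascade and its provenance tags could look roughly like the following. Only the tag values come from the text above; the field names, the `Source` interface, and the rules (first source to supply a field wins, existing values are never overwritten, a failing source is skipped, and later sources are not called once everything is filled) are assumptions made for the sketch.

```typescript
// Provenance tags as listed in the text.
type Provenance = "pdf" | "epub" | "google_books" | "open_library" | "llm" | "edited";

// Hypothetical metadata fields, all kept as strings for simplicity.
type FieldName = "title" | "author" | "year";
const FIELDS: readonly FieldName[] = ["title", "author", "year"];

interface Tagged {
  value: string;
  source: Provenance; // where the value came from
}
type Enriched = Partial<Record<FieldName, Tagged>>;

// One step of the cascade: a metadata API or the LLM behind the same interface.
interface Source {
  tag: Provenance;
  lookup(query: string): Promise<Partial<Record<FieldName, string>>>;
}

// Try sources in order; each one may only fill fields that are still empty.
async function enrich(
  query: string,
  sources: readonly Source[],
  known: Enriched = {},
): Promise<Enriched> {
  const result: Enriched = { ...known };
  for (const source of sources) {
    // Stop early once nothing is missing, so later (costlier) sources are not called.
    if (FIELDS.every((field) => result[field] !== undefined)) break;

    // A failing source contributes nothing and the cascade moves on.
    const found: Partial<Record<FieldName, string>> = await source
      .lookup(query)
      .catch(() => ({}));

    for (const field of FIELDS) {
      const value = found[field];
      if (value !== undefined && value !== "" && result[field] === undefined) {
        result[field] = { value, source: source.tag };
      }
    }
  }
  return result;
}
```

Because the tag travels with the value, the UI can badge or filter fields by source, and a user edit can simply overwrite the entry with the `edited` tag.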
Architecture
What an architect can take from it
- 01 Resilient text anchors (quote + prefix + suffix + offsets) are how annotations outlive reflow. The same approach applies to any document corpus where layout isn’t stable.
- 02 Provenance per field is how AI-filled forms earn trust. Tag the source on the way in; the UI can show or filter by it later.
- 03 A single `invoke` boundary plus shared typed schemas is a clean contract for desktop apps, far cleaner than ad-hoc IPC.