Deterministic AI starts with deterministic data. Same question, same answer.
Your agent can be brilliant and still be wrong in a new way every run — if the data layer underneath it guesses. Determinism isn't a nice-to-have for AI systems that act on numbers; it's the difference between automation and gambling.
Why "chat with your data" drifts
The common pattern — natural language in, generated SQL out — has three failure points, and they compound:
NL→SQL, run twice: Same question, slightly different phrasing → different generated query → different JOIN, different NULL handling → a different number. Which one goes in the report?
Typed query, run twice: The agent submits the same declarative spec → one engine, one predicate semantics → the same rows, the same aggregate, stamped with the same data generation.
When a human reads the answer, drift is an annoyance. When an agent reads the answer and then writes, schedules, or spends based on it, drift is a correctness bug with a blast radius.
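To make the drift concrete, here is a minimal sketch using Python's built-in sqlite3 module. The tables, rows, and both queries are invented for illustration; they stand in for two plausible translations of "what is the average order amount?" and differ only in JOIN type and NULL handling.

```python
import sqlite3

# Illustrative data only: three customers, three orders, one NULL amount.
conn = sqlite3.connect(":memory:")
conn.executescript("""
    CREATE TABLE customers (id INTEGER PRIMARY KEY, name TEXT);
    CREATE TABLE orders (id INTEGER PRIMARY KEY, customer_id INTEGER, amount REAL);
    INSERT INTO customers VALUES (1, 'Ada'), (2, 'Bo'), (3, 'Cy');
    INSERT INTO orders VALUES (1, 1, 100.0), (2, 1, NULL), (3, 2, 50.0);
""")

# Translation A: inner join; AVG silently skips the NULL amount.
query_a = """
    SELECT AVG(o.amount)
    FROM customers c JOIN orders o ON o.customer_id = c.id
"""

# Translation B: left join keeps customers without orders; NULL counts as 0.
query_b = """
    SELECT AVG(COALESCE(o.amount, 0))
    FROM customers c LEFT JOIN orders o ON o.customer_id = c.id
"""

answer_a = conn.execute(query_a).fetchone()[0]
answer_b = conn.execute(query_b).fetchone()[0]

print(answer_a)  # 75.0
print(answer_b)  # 37.5
```

Both queries are valid SQL and both are defensible readings of the question, yet one answer is double the other. Nothing in the output tells the agent which reading it got.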
What determinism means in tabledi
The agent doesn't guess SQL — it drives a typed surface. Queries are declarative specs (select / filter / group / join) over typed columns, evaluated by one engine with a single predicate IR. There is nothing probabilistic between the question and the rows.
Every answer is generation-stamped. The result carries the exact data version it was computed against — ask a hundred times, get the same number a hundred times, and know which data it came from.
Writes are governed, not vibes. Type validation against the schema, optimistic locking (racing writers fail loudly instead of clobbering), per-tenant rate limits, and an audit trail.
Formulas are spreadsheet-exact. VLOOKUP, SUMIF and 60+ functions with defined semantics, recomputed incrementally, not re-derived by a model.
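The governed-write idea above can be sketched in a few lines of Python. This is a toy model, not tabledi's API: ToyTable, StaleWrite, TypeMismatch, and the append signature are all hypothetical names, and rate limiting is left out. It only shows how schema validation and optimistic locking make a racing or malformed write fail loudly.

```python
from dataclasses import dataclass, field


class TypeMismatch(Exception):
    """Raised when a row does not match the table schema."""


class StaleWrite(Exception):
    """Raised when a writer's view of the table is out of date."""


@dataclass
class ToyTable:  # hypothetical, for illustration only
    schema: dict                      # column name -> required Python type
    rows: list = field(default_factory=list)
    generation: int = 0               # bumped on every accepted write
    audit: list = field(default_factory=list)

    def append(self, row, expected_generation, actor):
        # Optimistic locking: the writer states which generation it read.
        if expected_generation != self.generation:
            raise StaleWrite(
                f"expected gen {expected_generation}, table is at gen {self.generation}"
            )
        # Type validation: same column set, each value an instance of its declared type.
        if set(row) != set(self.schema):
            raise TypeMismatch(f"columns {sorted(row)} do not match the schema")
        for column, required in self.schema.items():
            if not isinstance(row[column], required):
                raise TypeMismatch(f"{column!r} must be {required.__name__}")
        self.rows.append(dict(row))
        self.generation += 1
        self.audit.append((self.generation, actor, "append"))
        return self.generation


table = ToyTable(schema={"model": str, "score": float})
seen = table.generation  # two writers both read generation 0

table.append({"model": "a", "score": 0.5}, expected_generation=seen, actor="writer-1")

try:
    # writer-2 still believes the table is at generation 0
    table.append({"model": "b", "score": 0.25}, expected_generation=seen, actor="writer-2")
    race_outcome = "clobbered"
except StaleWrite:
    race_outcome = "rejected"

try:
    # correct generation, wrong type for "score"
    table.append({"model": "c", "score": "high"},
                 expected_generation=table.generation, actor="writer-3")
    type_outcome = "accepted"
except TypeMismatch:
    type_outcome = "rejected"
```

In this sketch the second writer gets an exception instead of silently overwriting, and the audit list records only the write that was actually accepted.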
# the shape of a deterministic answer
tabledi query '{ "source": "eval_runs", "group": [...] }'
→ 6 rows · gen #204 · identical on every re-run against gen #204
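As a rough illustration of why a declarative spec re-runs identically, here is a small Python sketch. The spec shape, the run_query helper, and the sample rows are assumptions made up for this example, not tabledi's actual format or engine. The point is that one evaluator with one fixed NULL rule and one fixed ordering leaves no room for variation, and the result carries the generation it was computed against.

```python
from collections import defaultdict

# One fixed set of comparison operators: the only predicate semantics there is.
OPS = {
    "==": lambda a, b: a == b,
    ">=": lambda a, b: a >= b,
}


def run_query(rows, generation, spec):
    """Evaluate a toy spec: filter, group, then count and average per group."""
    column, op, value = spec["filter"]
    # Single NULL rule: a missing value never satisfies a predicate.
    kept = [r for r in rows if r[column] is not None and OPS[op](r[column], value)]

    groups = defaultdict(list)
    for r in kept:
        groups[r[spec["group"]]].append(r[spec["avg"]])

    # Sorted group keys give a stable row order on every run.
    result = [
        {spec["group"]: key, "count": len(vals), "avg": sum(vals) / len(vals)}
        for key, vals in sorted(groups.items())
    ]
    return {"generation": generation, "rows": result}


# Illustrative data; 204 is just a sample generation number.
eval_runs = [
    {"model": "a", "score": 1.0},
    {"model": "a", "score": 0.5},
    {"model": "b", "score": None},
    {"model": "b", "score": 0.75},
    {"model": "b", "score": 0.25},
]
spec = {"filter": ("score", ">=", 0.5), "group": "model", "avg": "score"}

first = run_query(eval_runs, generation=204, spec=spec)
second = run_query(eval_runs, generation=204, spec=spec)
print(first == second)  # True
```

The same spec against the same generation yields an identical result object, so an agent can compare the generation stamp to tell whether a changed number comes from changed data rather than from a changed query.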
Determinism ≠ no AI
Your agent still does all the thinking: which question to ask, what to build from the answer. tabledi's job is narrower and harder to fake — guarantee the answering. That split is the architecture: your agent + your model key on top, a deterministic, auditable data seam underneath, connected over CLI and MCP. No LLM key needed on our side, ever.
Honest status: the engine's own benchmark numbers (1M-row rollups in fractions of a millisecond, sub-second bulk imports) are what we publish today. A reproducible public benchmark and replay demonstrations are on the roadmap — determinism should be something you can verify, not just read about.
Give your agent a table it can trust.
Deterministic, self-hostable, built for millions of rows.