domain

Engineering Doctrine

The principles by which we design AI in production: the model as a primitive, orchestration, event-driven, per-step model choice, observability, economics, human control.

Not a manifesto — engineering constraints. We follow them because the opposite has already broken in production, for us and for others.

The model is a primitive, not a system

A language model is a compute primitive with a fuzzy contract: similar inputs can produce different outputs. You design the system around that uncertainty — with checks, branching and boundaries — not against it, hoping “the model will figure it out”. Hope is not an engineering property.

Value is in orchestration, not the model call

The model call itself is the easy part. The difficulty is around it: state between steps, retries, branching, failures, handoff to a human. A system that has these explicitly survives a change of requirements; a wrapper around one call does not.

Events, not polling

Workflows are triggered by events. State, idempotency and retries are platform properties, not code rewritten differently in every workflow. This removes a whole class of bugs (duplicates, races, lost state) that otherwise surfaces in production and is not reproducible.

The model is chosen per step

There is no single right model “for everything”. A cheap fast model handles classification and drafts; a strong one is needed where the cost of error is high. The choice is a function of the step: risk, required quality, acceptable latency. A global “strong model everywhere” constant is overpayment with no quality gain where quality wasn’t required.

Without observability the system is undebuggable

If you can’t reconstruct at which step, on which context and why a decision was made, degradation and overspend are noticed from a complaint or an invoice. Observability is not logs just in case — it is the condition for the system being operable at all.

Cost is an engineering metric

Token economics is designed like latency and reliability: context budgets, hybrid routing, caching, cutting silent loops. Done up front it costs days; done after the invoice it costs weeks and usually a temporary feature shutdown.

Humans in the loop where error is expensive

Autonomy is added by necessity, not by default. Where the cost of error is high, a human works by an explicit rule, and escalation is the terminal handler, not an emergency exit. This is built into the architecture, not added after an incident.

Production AI is not a model wrapper

A wrapper without state and contracts does not survive the second iteration of requirements. This is not a question of model scale — it is a question of architecture. We build orchestration from the start, because you cannot retrofit it into a wrapper.