Carbonfay

Engineering Notes

Carbonfay Engineering Notes are dense engineering write-ups on AI-system architecture, context engineering, economics and organization: not SEO filler, but what we learned in practice.

note 6 min

RED-driven development: an AI agent's maturity lives in its red tests

Why the "feature → test → green" cycle isn't enough for AI agents, and how RED-driven development measures maturity by the quality of the red tests an agent has survived, not the count of green ones.

note 5 min

GREEN bias: how development environments falsify AI-agent quality

Why testing AI agents yields green reports while the agent fails in production: the built-in GREEN bias of development environments, and how to bypass it with honest evaluation.

note 6 min

The bot that passes every test and doesn't sell

Why an AI bot with 95% green tests barely moves sales: tests check knowledge of facts, not the ability to handle doubt, hold the dialog, and bring the customer back.

note 7 min

Why every demo bot is smarter than the real one

An AI bot's demo is always smarter than production: tests are written by people who know the system and unconsciously help the agent. That's a class of engineering error, not something a prompt tweak fixes.

note 7 min

Context contamination in tests

If the tester — human or LLM — knows how the system is built, the test is spoiled: a hint leaks into the check. The fix is full isolation of the test loop from knowledge of the implementation.

note 9 min

Intents don't exist

Why the classic "user → intent → slots → result" scheme only works in slide decks, while a real person holds several intentions at once and shifts them as the dialog unfolds.

note 8 min

Chaos as the primary form of human dialog

Why teams treat a chaotic dialog as the user's mistake, when in fact chaos is the norm of live conversation, and why a dialog system must be designed for it rather than "training" the human.

note 8 min

Why the user doesn't know what they want

A conversational AI agent treats the first message as a fully formed need, but more often the need takes shape inside the conversation. How to design an agent that leads the user to a formulation instead of guessing.

note 7 min

User lying as a normal operating mode

Users routinely give an AI agent wrong data — budget, dates, goal — and not out of malice. A mirror of LLM hallucination: designing for unreliable input as the foundation of agent reliability.

note 7 min

A model of cognitive noise

An AI agent's quality is defined not by the right answer but by resilience to dialog noise: topic switches, contradictions, emotions, returns to old questions. How to redefine the quality metric.

note 10 min

A red team for conversational AI

Why a developer can't honestly test their own conversational agent, and why you need an AI red team — an independent opponent agent whose job is to prove the agent doesn't work.

note 9 min

Agent versus agent: a new model of QA

Why the tester of a conversational AI is another agent, not a human. The "Customer Simulator → Target Agent → Judge" architecture as multi-agent engineering applied to QA.

note 8 min

Testing an agent against real dialogs

Why synthetic scenarios are useless for chatbot and AI-agent testing, and how a corpus of real dialogs becomes the source of truth and a company asset.

note 7 min

Why LLMs play the customer role badly

LLM-based customer simulators behave too reasonably and help the agent instead of breaking it. Why honest AI-agent testing needs explicit models of difficult behavior.

note 5 min

Adopting AI in a company: where to start and what it costs

AI adoption pays off on specific repeated processes: where to start, how to compute cost and effect, what not to do.

note 4 min

Automating business processes with AI: what actually works

Which business processes AI actually automates — classification, routing, drafts, status reconciliation — and where it doesn't pay off.

note 4 min

Best approaches to AI agents for business: how to measure "best"

How to measure the "best" AI agent for business: reliability, cost of ownership, human control and embeddability — not the model.

note 8 min

Building multi-agent systems: architecture that doesn't fall apart

How to design multi-agent systems that work in production: roles, contracts, coordination, fault tolerance and predictable cost.

note 5 min

Context as the main resource of an AI system

Why an AI system's quality is set by context management, not model size, and how to manage it as an engineering discipline.

note 4 min

Context entropy and the degradation of answer quality

How noise accumulating in context lowers an AI system's answer quality, and which engineering techniques hold it back.

note 4 min

Cost-aware architecture for AI systems

How to design AI systems where cost is an engineering metric alongside latency and reliability, not a surprise at month's end.

note 4 min

Do machines need their own languages to coordinate?

Why agents need compact machine representations of meaning instead of natural language, and what that changes in cost and reliability.

note 4 min

Event-driven AI systems instead of simple scenarios

Why a linear scenario breaks on exceptions while an event-driven architecture makes an AI system robust and observable.

note 4 min

Hidden hardcode in AI automation

How hardwired rules and prompt chains turn AI automation into technical debt, and why that drives up the cost of changes.

note 4 min

How AI compresses operational processes

How AI removes intermediate steps, approvals and waits in operational processes, and what that gains in cycle time.

note 6 min

How to build a RAG system that doesn't lie in production

A practical breakdown of building a RAG system: sources, event-based indexing, hybrid search with reranking, and grounding evaluation.

note 4 min

How to compute the payback of AI agents

A model for computing AI-agent payback: what to count as benefit, how to account for token and operation cost, which assumptions are dangerous.

note 5 min

Multi-agent system architecture: roles, contracts, coordination

What a multi-agent system is made of: the agent as an element, input/output contracts, coordination and message exchange between agents.

note 5 min

On-prem RAG: when it's justified and when it's not

When an on-prem RAG system is really needed: data security, the perimeter, cost of ownership — and when the cloud wins.

note 5 min

Orchestrating AI agents in business processes

What AI-agent orchestration is: how to connect agents, tools and people into a managed business process with cost control.

note 5 min

Problems of multi-agent systems and how to avoid them

A breakdown of typical multi-agent failures — looping, context drift, cost growth — and engineering ways to avoid them.

note 5 min

RAG system architecture: sources, indexing, reranking

How a RAG (retrieval-augmented generation) system is built: sources, indexing, hybrid search, reranking and delivering the minimally sufficient context.

note 5 min

RAG: where it helps and where it creates an illusion of knowledge

When RAG actually raises accuracy and when it merely errs confidently, and how to tell search-over-a-base from understanding the business.

note 4 min

Reducing coordination costs with AI

Why a large company's main hidden cost is coordination, and how an AI layer lowers it without cutting people.

note 5 min

What AI-agent development costs and what drives the price

What makes up the cost of AI-agent development: process, integrations, human control, operation and token cost.

note 4 min

What AI-native engineering is and why it's not "coding with ChatGPT"

What AI-native engineering means: runtime thinking, architecture around models and cost control — and why it's not code generation in a chat.

note 5 min

Why a chatbot is not an AI architecture

How a corporate chatbot differs from an AI system: state, contracts, human control — and why this decides money and risk.

note 4 min

Why AI automation can suddenly become expensive

Where uncontrolled cost growth in AI automation comes from (context length, retries, bad routing) and how to stay within budget.

note 5 min

Why companies need an operational AI environment, not a chatbot

Why a point chatbot doesn't scale, while an operational AI environment lowers coordination costs and gives leaders visibility into processes.

note 4 min

Why natural language is inconvenient for machine coordination

Where natural language creates cost and errors in exchanges between agents, and how compact representations of meaning address them.