Engineering teams answer the wrong question all the time. Asked whether a system can be built, they say yes, and they are usually right. Foundation models are competent, retrieval works, fine-tuning is accessible, and the tooling around evals and orchestration has matured. Almost any well-specified AI capability can be assembled in a quarter or two by a strong team. That is precisely why "we can probably build this" should not end a conversation. It should start one.
The decision a buyer actually needs is not technical possibility. It is whether the capability is worth building, at what total cost, with what residual risk, and against what alternative. Feasibility, treated honestly, is a budget question. It belongs to the same conversation as procurement, change management, and the operating plan — not to a side channel between the CTO and a vendor.
Why AI feasibility is a budget question
Every AI system has a price tag with two columns. The first column is the build: data work, model selection, integration, security review, the first round of evals, and the launch. Most organizations estimate this column reasonably well, because it looks like the software projects they have shipped before.
The second column is the one that breaks budgets. It is the cost of keeping the system honest after launch: labeled data refreshes, eval suites that grow with each new failure mode, monitoring for drift, an on-call rotation that understands the model and not just the infrastructure, escalation pathways when the system is wrong in a way that hurts a customer, and the legal and compliance work that follows the first incident. The total cost of ownership (TCO) of an AI system is dominated by this second column. Pretending otherwise produces systems that ship on schedule and degrade on schedule.
A feasibility answer that ignores the second column is not a feasibility answer. It is a build estimate dressed up as a recommendation.
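The two-column view can be written down as simple arithmetic. The sketch below is illustrative only: the class name, field names, and every figure are hypothetical placeholders, not numbers from any real project.

```python
from dataclasses import dataclass


@dataclass
class CostEstimate:
    """Two-column cost view of an AI system. All figures are hypothetical."""

    build_cost: float              # column one: data work, integration, review, launch
    monthly_operating_cost: float  # column two: refreshes, evals, monitoring, on-call

    def total_cost(self, months: int) -> float:
        """Total cost of ownership over a planning horizon, in months."""
        return self.build_cost + self.monthly_operating_cost * months

    def operating_share(self, months: int) -> float:
        """Fraction of the total that comes from the second column."""
        return (self.monthly_operating_cost * months) / self.total_cost(months)


# Assumed inputs: a 400k build and 30k per month to operate, over three years.
estimate = CostEstimate(build_cost=400_000, monthly_operating_cost=30_000)
print(estimate.total_cost(36))                  # 1480000
print(round(estimate.operating_share(36), 2))   # 0.73
```

With these assumed inputs, roughly three quarters of the three-year cost sits in the second column, which is the pattern a build-only estimate hides.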
What residual risk actually means in production
Residual risk is what is left after the system is doing its job correctly most of the time. It is not a hypothetical. It is the floor — the set of failures the team has accepted will happen, with a frequency they can quantify, in exchange for the value the system delivers. The question is whether the organization can absorb that floor without compounding it through inattention.
For an internal productivity tool, the floor is usually tolerable: a wrong summary wastes time. For a system that touches customer money, clinical decisions, regulated communications, or hiring outcomes, the floor is a governance problem before it is a model problem. The vendor or internal team that quotes a benchmark without quoting an incident plan has answered half the question.
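One way to make the floor concrete is to multiply volume, failure frequency, and cost per failure. This is a minimal sketch under stated assumptions: the function name and all inputs are hypothetical, and a real estimate would need failure costs that vary by failure type.

```python
def expected_annual_residual_cost(
    requests_per_year: int,
    failure_rate: float,
    cost_per_failure: float,
) -> float:
    """Expected yearly cost of the failures the team has accepted will happen."""
    return requests_per_year * failure_rate * cost_per_failure


# Same volume and failure rate, very different floors (all numbers assumed).
internal_tool = expected_annual_residual_cost(200_000, 0.02, 5.0)     # wasted time
customer_money = expected_annual_residual_cost(200_000, 0.02, 400.0)  # refunds, remediation

print(f"internal tool:  about {internal_tool:,.0f} per year")
print(f"customer money: about {customer_money:,.0f} per year")
```

The model behaves identically in both rows. Only the cost of a single failure changes, which is why the second case is a governance problem before it is a model problem.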
How to price an AI investment decision honestly
An honest feasibility study answers five questions before it answers "can we build this". The order matters. If the first four answers are weak, the fifth is irrelevant.
- 01. What decision or workflow does this system change, and what is the measurable cost of getting it wrong once?
- 02. What is the marginal value over the current process: not over zero, but over what humans, scripts, or existing software already do?
- 03. Who owns the system on day 90, day 180, and day 540, and what is their on-call burden?
- 04. How will we know the system has degraded: what signals, what threshold, what action?
- 05. What does exit look like: if the model provider raises prices, deprecates a feature, or changes terms, what is the migration cost?
These are not philosophical questions. They are line items. A team that cannot put a number, a name, or a date next to each of them does not yet have a feasibility answer — they have an enthusiasm. Enthusiasm is useful at the prototype stage. It is dangerous at the procurement stage.
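The "number, name, or date" test can be applied mechanically. The following sketch is one possible shape for it, with hypothetical names (`LineItem`, `unanswered`) and made-up entries; it only checks that something has been filled in, not that the answer is any good.

```python
from dataclasses import dataclass
from typing import Optional


@dataclass
class LineItem:
    """One feasibility question and whatever has been committed against it."""

    question: str
    number: Optional[float] = None  # a cost, a threshold, a value estimate
    owner: Optional[str] = None     # a named person or team
    date: Optional[str] = None      # an ISO date, e.g. "2025-09-30"

    def is_answered(self) -> bool:
        """True if at least one of number, owner, or date is filled in."""
        return any(v is not None for v in (self.number, self.owner, self.date))


def unanswered(items: list[LineItem]) -> list[str]:
    """Return the questions that still have nothing next to them."""
    return [item.question for item in items if not item.is_answered()]


# Hypothetical state of a feasibility page partway through drafting.
items = [
    LineItem("Cost of getting the decision wrong once", number=12_000.0),
    LineItem("Marginal value over the current process"),
    LineItem("Owner on day 90, 180, and 540", owner="support-platform team"),
    LineItem("Degradation signal, threshold, and action", number=0.05),
    LineItem("Migration cost on exit"),
]
missing = unanswered(items)
print(missing)
```

In this example two of the five questions come back empty, so by the standard above the team has an enthusiasm rather than a feasibility answer.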
AI build vs buy when the math is uncomfortable
Build-vs-buy in AI is rarely a clean choice between two options. It is a choice between three: build it, buy it, or wait. The third option is underused. Many capabilities that were heroic to build twelve months ago are now commoditized in a vendor's product, and many that are heroic today will be commoditized in the next release cycle. Building something that will be a checkbox in someone else's roadmap is a way to convert engineering payroll into eventual technical debt.
Buying is not free either. A vendor relationship has its own TCO — integration, data sharing review, the rate at which their roadmap diverges from yours, and the exit cost when it does. The right way to read a vendor pitch is to ask which parts of the system you would still own if the vendor disappeared tomorrow. If the answer is none, the system is rented, not bought, and that has implications for accountability and price negotiation.
Building has its own quiet costs. Internal AI systems accumulate institutional knowledge in the heads of the two engineers who built them. When those engineers leave, the system becomes a liability without an owner. That risk shows up nowhere in the original feasibility deck.
What a good feasibility answer looks like
A defensible feasibility recommendation is short and specific. It states the problem in business terms, names the alternative being replaced, quantifies the expected value and the residual risk, identifies the owner and the operating budget, and proposes a kill criterion — a measurable condition under which the team will stop, not just iterate. Without a kill criterion, projects do not end. They drift into permanent maintenance, which is the most expensive state an AI system can occupy.
The answer should also be honest about what is not known. Retraining cadence is a guess until the first drift event. Eval coverage is a hypothesis until production traffic tests it. Saying so in the recommendation is not a weakness — it is what separates a feasibility study from a sales document.
- A one-paragraph statement of the decision the system changes and the cost of getting it wrong
- A named owner for years one and two, with budget allocated, not promised
- An eval and monitoring plan that survives the first six months of operation
- A migration and exit plan that does not assume the current vendor or model is permanent
- A kill criterion that is measurable, dated, and agreed before the build begins
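A kill criterion that is measurable and dated is small enough to write as a single check. The sketch below assumes a "higher is better" metric; the class, the metric name, the threshold, and the dates are all hypothetical.

```python
from dataclasses import dataclass
from datetime import date


@dataclass(frozen=True)
class KillCriterion:
    """A stop condition agreed before the build begins (illustrative only)."""

    metric: str
    minimum_value: float  # the bar the metric must reach
    deadline: date        # the date by which it must reach it

    def should_stop(self, observed_value: float, today: date) -> bool:
        """Stop if the deadline has arrived and the metric is still below the bar."""
        return today >= self.deadline and observed_value < self.minimum_value


criterion = KillCriterion(
    metric="tickets_resolved_without_escalation_rate",
    minimum_value=0.60,
    deadline=date(2025, 6, 30),
)

print(criterion.should_stop(0.48, date(2025, 3, 1)))  # False: deadline not reached
print(criterion.should_stop(0.48, date(2025, 7, 1)))  # True: past deadline, below bar
print(criterion.should_stop(0.65, date(2025, 7, 1)))  # False: bar was met
```

The class is frozen so the threshold and deadline cannot be quietly edited after the fact, which mirrors the requirement that the criterion be agreed before the build begins.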
Closing the gap between possible and worth doing
The teams that get the most out of applied AI are not the ones with the broadest yes list. They are the ones with a disciplined no — a habit of declining projects that are technically possible but operationally unsupportable, financially marginal, or strategically rented. That discipline pays off twice: once in the projects that are not undertaken, and once in the projects that are, because the surviving work gets the attention it needs to actually land.
Before the next AI initiative gets the green light, ask the team to produce a one-page answer to the five questions above. If the page is hard to write, the project is not ready. If it is easy, the work has already begun.
