What if the genius isn't?
Most teams building agentic AI are trying to make LLMs do something they're fundamentally bad at: making consistent, defensible business decisions. Decision platforms were built for this exact problem - and most teams don't know they exist.
I keep seeing the same pattern: someone builds an impressive demo where an AI agent approves loans or routes customer requests, then discovers they can't explain why it made any particular choice, can't guarantee it won't make a different choice tomorrow for the same inputs, and definitely can't satisfy a regulator asking for documentation.
The problem isn't that the team did something wrong. The problem is they're using the wrong tool for the job.
LLMs are built for creativity and flexibility. That's what makes them powerful for understanding language and generating responses. But those exact same qualities make them disasters for governed decision-making, where you need the opposite - ruthless consistency, complete transparency, and the ability to prove exactly why you did what you did.
What most people don't realize is there's an established technology that was purpose-built to solve this exact problem. Decision platforms (also called business rules management systems) have been handling high-stakes automated decisions in regulated industries for years. They're not sexy, they're not new, but they work. And for teams building agentic AI, understanding when to use a decision platform instead of an LLM is the difference between a system you can actually deploy and one that stays in perpetual pilot.
Why LLMs fail at governed decisions
Understanding why LLMs don't work for formal decision logic isn't about criticizing the technology - it's about recognizing what it was designed to do. When you need a system to make business decisions that might get challenged by regulators or customers, three things matter above everything else: consistency, transparency, and the ability to adapt without breaking things. LLMs fail on all three.
Start with consistency. In any regulated environment, similar cases must be treated identically over time. That's not a preference - it's a legal requirement for fairness. LLMs work on probabilities and controlled randomness. That variability is a feature when you're generating creative content. It's a liability when you're deciding whether to approve someone's mortgage.
An LLM might give you the right answer ninety-nine times in a row, then generate a completely different result for identical inputs on attempt one hundred. There's no way to guarantee it will treat every customer the same way. For product teams, that's an operational nightmare. For legal teams, it's indefensible.
Then there's the black box problem. When you deny someone's loan application or job application, you need to explain why. Not with a confabulated justification the LLM generates after the fact - with the actual logic that drove the decision. LLMs can't do this. Even when you prompt them to explain their reasoning, what you get back is another probabilistic output that might sound plausible but has no verifiable connection to what actually happened inside the model.
Presenting one black box's explanation of another black box's decision doesn't hold up when a customer complains or a regulator audits your process. You need a transparent, loggable decision trail that you can walk through step by step. LLMs don't create those.
The agility problem is more subtle but just as important. Business logic needs to change - new regulations get passed, company policies evolve, market conditions shift. With traditional software, you update the specific rule that needs to change. With LLMs, attempting to modify one behavior often produces unexpected side effects somewhere else. You end up needing specialized AI expertise just to make routine business logic updates, instead of letting domain experts who understand the business make changes directly.
And here's something most people miss: many business decisions depend on analytical insights from structured historical data. Payment histories, transaction patterns, risk scores calculated from months of activity. LLMs weren't designed to process structured data this way. They work with text, not relational databases or time-series analysis. So even if you solved all the other problems, you'd still be missing the ability to incorporate the quantitative context that makes decisions accurate.
These aren't minor limitations you can work around. They're fundamental mismatches between what LLMs do and what governed decision-making requires.
How decision platforms actually work
Decision platforms take a completely different approach. Instead of asking a model to figure out the answer, you externalize the business logic into a system designed specifically to manage rules. The architecture has three main components: definition and management, validation and testing, and deployment.
For definition and management, decision platforms use specialized editors that let both technical and business people work with decision logic. Low-code editors let domain experts - the people who actually understand the business policies - directly author and modify rules without writing code. Technical editors give developers deeper integration capabilities when needed. All the decision logic lives in a central repository with version control, branching, and audit trails. The rules aren't buried inside a model somewhere - they're managed like any other enterprise software asset.
Before anything goes live, validation tools automatically check the logic for errors. Missing conditions, overlapping rules, circular dependencies - the system catches these before they cause problems in production. More importantly, you can simulate proposed changes against historical data. Want to know what would have happened last quarter if you'd used the new approval criteria? Run the simulation. You get concrete numbers showing how many applications would have been affected and what the financial impact would have been. This means business stakeholders can assess changes before they touch a single live transaction.
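To make the simulation idea concrete, here's a minimal sketch of that kind of what-if run. Everything in it is illustrative - the field names, the thresholds, the average loan value - and a real decision platform would run this against its own repository and data sources, but the shape of the comparison is the same.

```python
# Minimal sketch: compare the current rule set against a proposed one
# over last quarter's applications. Field names and thresholds are
# invented for the example, not taken from any specific platform.

def current_rules(app):
    return app["credit_score"] > 680 and app["debt_to_income"] < 0.40

def proposed_rules(app):
    return app["credit_score"] > 700 and app["debt_to_income"] < 0.35

def simulate(applications, avg_loan_value=15_000):
    approved_now = sum(current_rules(a) for a in applications)
    approved_new = sum(proposed_rules(a) for a in applications)
    delta = approved_new - approved_now
    return {
        "approved_under_current": approved_now,
        "approved_under_proposed": approved_new,
        "applications_affected": abs(delta),
        "estimated_volume_impact": delta * avg_loan_value,
    }

# historical_applications would come from the platform's data store:
# report = simulate(historical_applications)
```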
Deployment happens through a decision service - a clean, stateless component that receives data, executes the defined rules, and returns a decision. No side effects, no retained state, just input and output. This service becomes your decision agent within a larger agentic AI system. When a workflow agent needs to know whether to approve a request or route a case, it calls the decision service, gets a definitive answer with a complete log of the reasoning, and proceeds.
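As a rough sketch of that contract - data in, decision and reasoning out, nothing else - here's what a stateless decision service might look like. The rule names and payload fields are invented for the example.

```python
# Sketch of a stateless, side-effect-free decision service that an
# orchestrating agent can call. It applies routing rules to the
# current case data and returns the decision plus a reasoning trace.

def route_case(case: dict) -> dict:
    trace = []

    def check(rule_name, passed):
        trace.append({"rule": rule_name, "passed": passed})
        return passed

    if check("vip_customer", case.get("tier") == "vip"):
        decision = "priority_queue"
    elif check("dispute_over_threshold", case.get("disputed_amount", 0) > 500):
        decision = "specialist_review"
    else:
        decision = "standard_queue"

    # No database writes, no retained state - just a result the
    # calling workflow agent can act on and log.
    return {"decision": decision, "trace": trace}

print(route_case({"tier": "standard", "disputed_amount": 750}))
```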
The whole system is built around the idea that business rules are valuable assets that need proper governance, not logic buried inside a model that only AI specialists can touch.
Architecture that product teams can actually govern
For product leaders building agentic systems, the key insight is architectural separation. Different components should specialize in what they do best. LLMs handle language understanding and generation. Machine learning models generate probabilistic predictions. Decision platforms handle prescriptive logic. When you mix these responsibilities, you lose the ability to govern any of them effectively.
The pattern that works is stateless, side-effect-free decision agents. Stateless means the decision agent doesn't remember previous interactions. A separate workflow agent manages overall process state - tracking where you are in a multi-step process, what's happened so far, what comes next. The decision agent just receives the current data, applies the rules, and returns the result. This separation makes the decision logic simpler, more testable, and easier to verify.
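A minimal sketch of that split, with a workflow agent that keeps the process state and a decision agent that keeps none (all names and rules here are illustrative):

```python
# The workflow agent tracks where the process is; the decision agent
# only ever sees the current data.

def decision_agent(current_data: dict) -> str:
    # No memory of previous calls - just rules over the input.
    return "approve" if current_data["credit_score"] > 700 else "refer"

class WorkflowAgent:
    def __init__(self):
        self.state = {"step": "application_received", "history": []}

    def advance(self, current_data):
        outcome = decision_agent(current_data)
        self.state["history"].append(outcome)
        self.state["step"] = "underwriting" if outcome == "approve" else "manual_review"
        return self.state

wf = WorkflowAgent()
print(wf.advance({"credit_score": 640}))
```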
Side-effect-free means the agent only makes decisions - it doesn't take actions. It doesn't send emails, update databases, or trigger workflows. This matters for reusability. A single "eligibility" decision agent can be used by loan origination, marketing campaigns, and customer service workflows. Each workflow takes different actions based on the same eligibility decision, but the core logic stays consistent and centralized.
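Here's a small sketch of that reuse, with one hypothetical eligibility rule consumed by two different workflows that take different actions on the same answer:

```python
# One side-effect-free eligibility decision, reused by several
# workflows. The eligibility logic never sends emails or writes
# records; each caller decides what to do with the answer.
# Criteria and field names are invented for the example.

def is_eligible(customer: dict) -> bool:
    return customer["tenure_months"] >= 6 and not customer["delinquent"]

def loan_origination_workflow(customer):
    if is_eligible(customer):
        return "proceed_to_underwriting"
    return "send_adverse_action_notice"

def marketing_workflow(customer):
    if is_eligible(customer):
        return "include_in_preapproval_campaign"
    return "exclude_from_campaign"

customer = {"tenure_months": 14, "delinquent": False}
print(loan_origination_workflow(customer), marketing_workflow(customer))
```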
The same principle applies to probabilistic inputs. Many business decisions benefit from machine learning predictions - fraud scores, credit risk, churn probability. The correct architectural pattern isn't to embed that logic in your decision agent. Instead, separate analytic agents (built on ML platforms) generate scores, and the decision agent consumes those scores as structured inputs alongside other data.
So your decision logic might say: "IF credit_score > 700 AND fraud_risk < 5% AND debt_to_income < 0.35, THEN approve." The fraud_risk comes from a machine learning model, credit_score comes from a bureau, debt_to_income comes from the application. The decision platform blends all these inputs with your business rules in a completely transparent way. You can see exactly how each factor contributed to each decision.
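In code, that blending might look something like the sketch below. The thresholds mirror the example above; the comments note where each input would come from, and everything else is illustrative.

```python
# Sketch of the approval rule from the text, with each input's
# provenance made explicit. Thresholds and names are illustrative.

def approve_loan(credit_score, fraud_risk, debt_to_income):
    # credit_score: supplied by a credit bureau
    # fraud_risk: probability produced by a separate ML analytic agent
    # debt_to_income: computed from the application itself
    conditions = {
        "credit_score > 700": credit_score > 700,
        "fraud_risk < 5%": fraud_risk < 0.05,
        "debt_to_income < 0.35": debt_to_income < 0.35,
    }
    return all(conditions.values()), conditions

approved, reasons = approve_loan(credit_score=715, fraud_risk=0.03, debt_to_income=0.28)
print(approved, reasons)
```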
This architecture gives you control. When fraud patterns shift and your ML model needs retraining, that happens independently. When regulations change and you need to adjust approval thresholds, you update the business rules without touching the ML infrastructure. Each component evolves at its own pace, but the integration points stay clean and auditable.
What this means for legal and compliance
From a legal perspective, decision platforms solve the documentation problem that makes agentic AI so hard to deploy in regulated environments. Every decision generates a complete, structured log showing which rules executed, what data was used, and how the conclusion was reached. This isn't a post-hoc explanation - it's the actual record of what happened, captured in real time.
When a regulator asks why you denied an application, you can show them the exact rule set that was active at that moment, the specific data points that were evaluated, and which conditions were met or failed. When a customer disputes a decision, you have a time-stamped audit trail. This is the evidentiary record you need for both regulatory compliance and civil litigation defense.
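A single decision record in such a trail might look something like this - the exact schema is hypothetical, but the point is that it's written at decision time, not reconstructed afterwards:

```python
# Illustrative decision record, captured when the decision is made.

decision_record = {
    "decision_id": "d-000123",
    "timestamp": "2025-06-01T14:32:07Z",
    "rule_set_version": "loan-approval v4.2",
    "inputs": {"credit_score": 662, "fraud_risk": 0.04, "debt_to_income": 0.41},
    "evaluated_conditions": [
        {"rule": "credit_score > 700", "passed": False},
        {"rule": "fraud_risk < 5%", "passed": True},
        {"rule": "debt_to_income < 0.35", "passed": False},
    ],
    "outcome": "decline",
}
```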
The transparency also enables a useful hybrid pattern with LLMs. Upstream, you can use an LLM to ingest unstructured content - parsing product brochures, conversation transcripts, or application documents to extract the structured data your decision agent needs. Downstream, you can feed the structured decision log into an LLM to generate plain-language explanations for customers. The LLM makes the explanation readable, but the actual decision was made by transparent business rules.
This approach combines the strengths of both technologies without compromising the integrity of the decision itself. You get the benefits of natural language processing where it helps, but the actual choice - the thing you might have to defend - happened in a system you can explain.
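One possible wiring of that hybrid, sketched below: `call_llm` is a stand-in for whatever model provider you use (not a real API), and the only component that actually makes the decision is the rule-based `decide` function.

```python
import json

# Sketch of the hybrid pattern: LLM upstream for extraction, explicit
# rules for the decision, LLM downstream for a readable explanation.

def call_llm(prompt: str) -> str:
    # Placeholder - wire this to your LLM provider.
    raise NotImplementedError

def extract_fields(application_text: str) -> dict:
    # Upstream: turn unstructured content into the structured inputs
    # the decision agent needs.
    prompt = ("Extract credit_score, fraud_risk and debt_to_income as JSON "
              "from the following application:\n" + application_text)
    return json.loads(call_llm(prompt))

def decide(fields: dict) -> dict:
    # The actual decision: explicit, deterministic rules.
    approved = (fields["credit_score"] > 700
                and fields["fraud_risk"] < 0.05
                and fields["debt_to_income"] < 0.35)
    return {"decision": "approve" if approved else "decline", "inputs": fields}

def explain_decision(record: dict) -> str:
    # Downstream: the LLM rewords the logged decision; it does not make it.
    return call_llm("Explain this decision record in plain language: "
                    + json.dumps(record))
```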
The agility of decision platforms also matters for regulatory change management. When a new law takes effect or a policy gets updated, legal and business teams can modify the relevant rules in the central repository, simulate the impact, and deploy the change in a controlled rollout. You're not trying to retrain a model or hoping that new prompts will consistently change behavior. You're updating explicit logic that you can test before it affects anyone.
The same mechanism works for controlled business experimentation. Instead of letting an agent evolve its own behavior unpredictably, you can explicitly create multiple versions of a rule set for A/B testing or champion/challenger scenarios. The business learns and adapts, but every variation is documented and produces a complete audit trail. You know exactly which customers saw which version of the logic and why.
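A minimal version of that champion/challenger setup might look like the sketch below. The split percentage, version labels, and assignment method are all illustrative; the point is that each customer is deterministically assigned a documented version of the logic, and every decision records which version produced it.

```python
import hashlib

# Champion/challenger rule sets with deterministic assignment, so the
# same customer always sees the same version and the log shows which
# version made each decision. Names and thresholds are invented.

def champion_rules(app):
    return app["credit_score"] > 700

def challenger_rules(app):
    return app["credit_score"] > 680 and app["debt_to_income"] < 0.30

def assign_version(customer_id, challenger_share=0.10):
    digest = hashlib.sha256(customer_id.encode()).hexdigest()
    bucket = int(digest, 16) % 100
    return "challenger_v2" if bucket < challenger_share * 100 else "champion_v1"

def decide(customer_id, application):
    version = assign_version(customer_id)
    rules = challenger_rules if version == "challenger_v2" else champion_rules
    outcome = "approve" if rules(application) else "decline"
    return {"customer_id": customer_id, "rule_set_version": version, "outcome": outcome}

print(decide("cust-42", {"credit_score": 690, "debt_to_income": 0.28}))
```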
What to do about it
The immediate practical step is inventory. Look at your agentic AI systems - both existing and planned - and identify where they're making consequential business decisions. Not where they're understanding or generating language, but where they're making choices that affect customers, money, or regulatory compliance.
For each of those decision points, ask: Do we need to guarantee consistency? Do we need to explain this decision if challenged? Will this logic need to change when regulations or policies change? If the answer to any of those is yes, you're looking at a candidate for a decision platform instead of an LLM.
The longer-term pattern is architectural discipline. As agentic AI systems become more complex, the temptation is to add more responsibility to the LLM - make it smarter, give it more context, let it figure out more on its own. Sometimes that works. But for the specific function of making governed business decisions, that approach leads to systems you can't deploy or can't defend.
The teams that will succeed with agentic AI are the ones that recognize when to separate concerns - using LLMs for what they're good at, using ML models for probabilistic predictions, and using decision platforms for prescriptive logic that needs to be consistent, transparent, and defensible. The technology for governed decision-making already exists. It's been handling this problem in regulated industries for years. The challenge now is recognizing when you need it.
For the full story, and to learn more about AI, innovation, and the law, click on the website in my bio.