Smaller models aren't a compromise — they're a governance feature
Enterprise AI doesn't need models that can do everything. It needs models scoped to the problem. Constraint isn't a limitation — it's a governance feature.
Confluent's Sean Falconer makes a point that deserves more attention from product and legal teams: enterprise AI inherited the consumer paradigm, and that inheritance is creating problems we keep misdiagnosing.
The consumer AI paradigm makes sense in its own context. A single interface that writes poems, debugs code, and plans vacations works when the range of possible inputs is infinite and success is subjective. Nobody's getting sued because ChatGPT gave them a mediocre haiku.
Enterprise AI operates in a fundamentally different environment. An invoice is parsed correctly or it's not. A support ticket routes to the right team or it doesn't. A contract clause gets classified accurately or it creates downstream liability. These aren't conversational problems — they're operational ones where the cost of being wrong is measurable.
The case for constraint over capacity
Small language models win in these settings because they're built around constraint, not breadth. Microsoft's Phi-3 research demonstrates this clearly: on benchmarks like MMLU and MT-Bench, compact models approach or match much larger ones once the task space is well-defined. The pattern holds across healthcare, finance, and legal workflows.
This makes intuitive sense once you think about it from the other direction. Additional parameters don't improve accuracy in bounded environments — they add more ways to be wrong. A model trained to understand everything has to decide which "everything" applies to your specific contract clause. A model trained on your contract types just classifies it.
Real-world evidence across regulated industries
The data from production deployments backs this up. Healthcare companies deploying SLMs trained on clinical data rather than the open web are seeing higher accuracy on domain-specific queries, materially fewer hallucinations, and summaries that actually map to downstream care-management systems.
Finance and legal teams see the same pattern. Contracts, risk reports, and regulatory filings use natural language, but they operate within rigid semantic boundaries. Terms like "net asset value" or jurisdiction-specific legal clauses have precise meanings that general models frequently blur. Firms deploying smaller models trained directly on internal documents get more consistent clause classification, fewer false positives in compliance checks, and response times fast enough to sit directly in transaction pipelines.
The governance case is the real story
For product and legal teams, this creates a decision point that most organizations are getting backwards. The question isn't "should we use the biggest model available?" It's "does this model's capacity match the problem's shape?"
Boundary awareness matters more than parameter count. In closed-world settings, excess generality works against both accuracy and governance. You can't validate what you can't predict. You can't audit what you can't replay. You can't assign accountability when the model's decision space is unbounded.
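The validate-and-audit point is concrete enough to sketch. Assuming a hypothetical clause classifier with a closed label set (the labels, the `model` callable, and the audit-log shape here are all illustrative, not any specific product's API), a bounded decision space is exactly what makes validation and replay possible:

```python
# Illustrative sketch: a closed decision space you can validate and replay.
# Labels, model interface, and log fields are assumptions for this example.
import hashlib
from datetime import datetime, timezone

ALLOWED_LABELS = {"indemnification", "termination", "confidentiality", "other"}

audit_log = []  # in production this would be durable storage

def classify_clause(text: str, model) -> str:
    """Run the model, reject anything outside the closed label set,
    and record enough context to replay the decision later."""
    raw = model(text)  # assumed: model returns a single label string
    label = raw.strip().lower()
    if label not in ALLOWED_LABELS:
        label = "other"  # unpredicted output gets routed to review, not trusted
    audit_log.append({
        "input_sha256": hashlib.sha256(text.encode()).hexdigest(),
        "raw_output": raw,
        "label": label,
        "timestamp": datetime.now(timezone.utc).isoformat(),
    })
    return label

# Stand-in "models": one well-behaved, one not.
print(classify_clause("Either party may terminate...", lambda t: "termination"))  # prints: termination
print(classify_clause("Some unusual clause.", lambda t: "a long rambling answer"))  # prints: other
```

Because every acceptable output is enumerated up front, a bad model response is caught at the boundary rather than propagating downstream, and the log captures what you need to replay and audit each decision. With an unbounded output space, neither check is possible.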
The upside of focused models: consistency you can test, predictable behavior you can validate, audit trails that actually work, and the control that production systems require. When your AI is making operational decisions — not creative suggestions — you need models that know their limits. That's not a capability constraint. It's a governance feature.
What this means for product counsel
The governance implication is practical: deployment decisions should start with boundary definition, not model selection. Map your input space. Define your output constraints. Identify your failure modes. Then pick the smallest model that covers that territory reliably. Anything bigger creates governance overhead without operational benefit.
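The sequence above can be sketched as a toy selection procedure. Everything here is a placeholder: the `BoundarySpec` fields, the candidate model names, and the coverage numbers stand in for measurements you would make on your own evaluation set.

```python
# Hypothetical sketch of "boundary first, model second".
# All names and numbers below are illustrative, not a real framework.
from dataclasses import dataclass

@dataclass
class BoundarySpec:
    input_types: set      # step 1: map the input space
    allowed_outputs: set  # step 2: define output constraints
    failure_modes: dict   # step 3: name failure modes and their handling

spec = BoundarySpec(
    input_types={"invoice_pdf", "invoice_scan"},
    allowed_outputs={"approved", "needs_review", "rejected"},
    failure_modes={
        "low_confidence": "route_to_human",
        "unrecognized_field": "route_to_human",
    },
)

# Candidates ordered smallest first; coverage = measured reliability
# on an evaluation set drawn from the bounded input space.
candidates = [
    ("small-domain-model", 0.97),
    ("mid-general-model", 0.97),
    ("large-frontier-model", 0.98),
]

RELIABILITY_BAR = 0.95

def smallest_sufficient(candidates, bar):
    """Step 4: pick the first (smallest) model that clears the bar."""
    for name, coverage in candidates:
        if coverage >= bar:
            return name
    return None  # no candidate covers the territory reliably

print(smallest_sufficient(candidates, RELIABILITY_BAR))  # prints: small-domain-model
```

The point of the ordering is that once a small model clears the reliability bar for the defined boundary, the larger candidates add governance surface without adding operational value.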
Most enterprise AI governance frameworks focus on what the model can do. The more productive question is what the model should do — and whether its architecture is scoped to match. In regulated environments, a model that knows its limits is easier to validate, easier to audit, and easier to defend than one that can do everything but does nothing predictably.
Source: SLMs vs. LLMs: Why smaller AI models win in business — Sean Falconer, The New Stack
