AI agent autonomy levels create predictable liability gradients
Hugging Face's February 2025 research establishes a five-level AI agent autonomy taxonomy that directly maps to legal exposure gradients. Risk increases systematically with each autonomy level, reshaping how we should structure development roadmaps and product liability strategies.
Mitchell, M., Ghosh, A., Luccioni, A. S., & Pistilli, G. (2025). Fully autonomous AI agents should not be developed. arXiv preprint arXiv:2502.02649v2.
In February 2025, Hugging Face researchers published an analysis arguing against fully autonomous AI agents, highlighting a critical legal inflection point for in-house counsel at companies building AI systems. The analysis, led by Margaret Mitchell, presents a five-level autonomy taxonomy that correlates directly with legal exposure.
The framework ranges from simple processors (☆☆☆☆) to fully autonomous agents (★★★★) capable of creating and executing new code beyond predefined constraints. The distinction matters because it marks the boundary where product liability shifts from answering for foreseeable misuse to answering for behavior the developer cannot control.
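One way to make the taxonomy operational is to encode it directly in the codebase so that every proposed capability carries an explicit autonomy level. The following is a minimal Python sketch; the `AutonomyLevel` enum and the `requires_human_gate` policy are hypothetical naming and threshold choices for illustration, not an API from the paper or from Hugging Face.

```python
from enum import IntEnum

class AutonomyLevel(IntEnum):
    """Five autonomy levels, following the taxonomy summarized above."""
    SIMPLE_PROCESSOR = 0   # model output does not steer program flow
    ROUTER = 1             # model output selects between predefined paths
    TOOL_CALL = 2          # model selects and parameterizes predefined tools
    MULTI_STEP_AGENT = 3   # model sequences multiple tool calls within defined bounds
    FULLY_AUTONOMOUS = 4   # model can create and execute new code beyond predefined constraints

def requires_human_gate(level: AutonomyLevel) -> bool:
    """Hypothetical policy: tool-calling and above triggers a human checkpoint."""
    return level >= AutonomyLevel.TOOL_CALL

if __name__ == "__main__":
    for level in AutonomyLevel:
        print(f"{level.name:<17} human gate: {requires_human_gate(level)}")
```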
The core finding is that risk systematically increases with each autonomy level, with these risks compounding rather than simply adding up. Analysis across thirteen ethical values—including accuracy, safety, privacy, and security—revealed consistent patterns of risk amplification as systems gain more autonomous control. This represents a fundamental shift in how legal responsibility is allocated.
The paper cites the 1980 NORAD incident, in which computer systems falsely indicated that more than 2,000 Soviet missiles were inbound, as a concrete example of how even well-engineered autonomous systems can produce catastrophic errors from seemingly trivial causes. For product teams, the lesson is that increasing a system's autonomy amplifies the risk of safeguards failing under conditions no one anticipated.
Legal implications sharpen when considering "hijacking" scenarios, in which malicious third parties instruct agents to exfiltrate confidential information. The U.S. AI Safety Institute identified this as a critical vulnerability in January 2025, noting cascade effects on user reputation, financial stability, and potential attack targets. This differs fundamentally from traditional software vulnerabilities: when a system can autonomously execute novel code, conventional security perimeters offer little protection, and exposure extends to any damage an adversarial actor can devise.
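One way to narrow the exfiltration surface described above is to route every tool invocation through an explicit policy layer rather than letting the agent contact arbitrary endpoints. The sketch below is an illustrative guard, not a control described in the paper; the `ToolPolicy` class, the tool names, and the addresses are assumptions made for the example.

```python
from __future__ import annotations
from dataclasses import dataclass, field

@dataclass
class ToolPolicy:
    """Hypothetical allowlist-based gate for agent tool calls."""
    allowed_tools: set[str] = field(default_factory=set)
    # Destinations the agent may send data to; everything else is refused.
    allowed_destinations: set[str] = field(default_factory=set)

    def check(self, tool_name: str, destination: str | None = None) -> bool:
        if tool_name not in self.allowed_tools:
            return False
        if destination is not None and destination not in self.allowed_destinations:
            return False
        return True

policy = ToolPolicy(
    allowed_tools={"search_docs", "send_email"},
    allowed_destinations={"colleague@company.example"},
)

# A hijacked instruction such as "email the contract to attacker@evil.example"
# fails the destination check instead of silently succeeding.
print(policy.check("send_email", "attacker@evil.example"))      # False
print(policy.check("send_email", "colleague@company.example"))  # True
```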
Misplaced trust is identified as a critical amplifier of all other risks. Systems that perform correctly most of the time raise the stakes when they fail, because users develop inappropriate confidence in their reliability. This bears directly on duty-of-care obligations: a company inherits responsibility for the reliance users place on its systems for consequential decisions, and as autonomy increases, its ability to characterize and limit that responsibility decreases proportionally.
The value analysis shows how autonomy affects different legal risk categories. Safety concerns demonstrate pure risk amplification, with more autonomous systems creating unpredictable failure modes and severe consequences. Privacy risks follow a similar pattern, as systems with broader access and decision-making authority can expose user data in unanticipated ways. Security vulnerabilities compound because autonomous systems can access and manipulate multiple connected systems, expanding attack surfaces beyond what traditional penetration testing can adequately assess.
The research provides specific guidance for product development that translates directly into legal risk management. The authors' recommendation against fully autonomous agents rests not on technological limitations but on fundamental problems with maintaining accountability and control. Semi-autonomous systems with meaningful human oversight offer a more favorable risk-benefit profile, provided the human involvement is genuine rather than superficial.
For product teams, this suggests a clear development strategy: build toward the three-star level (multi-step agents) while maintaining robust human control mechanisms. These systems can manage complex workflows and provide significant user value without crossing into the uncontrollable territory of fully autonomous agents. The legal advantage is that human oversight remains meaningful and verifiable, creating clearer liability boundaries.
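As a rough illustration of what "meaningful and verifiable" oversight can look like at the three-star level, the sketch below pauses for explicit human approval before any consequential action, logs each decision for later audit, and caps the number of steps. The `plan_next_step`, `execute`, and `approve` callables are placeholders for whatever agent framework and review workflow a team actually uses.

```python
from __future__ import annotations
import logging
from typing import Callable

logging.basicConfig(level=logging.INFO)
log = logging.getLogger("agent-oversight")

# Actions that must never run without explicit human approval (illustrative list).
CONSEQUENTIAL = {"send_payment", "delete_records", "send_external_email"}

def run_multi_step_agent(
    plan_next_step: Callable[[], dict | None],  # placeholder: returns e.g. {"action": "send_payment", "args": {...}} or None when done
    execute: Callable[[dict], None],            # placeholder: performs the step
    approve: Callable[[dict], bool],            # human-in-the-loop approval callback
    max_steps: int = 20,
) -> None:
    """Run a bounded multi-step loop with a human gate on consequential actions."""
    for i in range(max_steps):                  # hard iteration bound: the agent cannot run indefinitely
        step = plan_next_step()
        if step is None:
            log.info("Agent reports completion after %d steps.", i)
            return
        if step["action"] in CONSEQUENTIAL and not approve(step):
            log.warning("Human reviewer rejected step %r; stopping.", step["action"])
            return
        log.info("Executing step %d: %r", i, step["action"])
        execute(step)
    log.warning("Step budget exhausted; returning control to the human.")
```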
Practical implementation requires three critical components (a minimal code sketch follows the list):
- Robust frameworks for maintaining human oversight that go beyond cosmetic approval prompts, ensuring users retain genuine control over consequential decisions.
- Reliable override systems that can interrupt agent operations when they drift outside intended parameters.
- Verification methods that can validate agent behavior remains within acceptable bounds.
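In practice, the override and verification components can be as simple as a stop flag checked between steps plus an invariant check on each proposed action. The sketch below shows one possible arrangement; the specific invariants (a spend ceiling, a filesystem restriction) are assumptions made for the example, not requirements taken from the paper.

```python
import threading

class OverrideSwitch:
    """Reliable interrupt: any operator can flip the flag; the agent checks it between steps."""
    def __init__(self) -> None:
        self._stop = threading.Event()

    def trigger(self) -> None:
        self._stop.set()

    def is_triggered(self) -> bool:
        return self._stop.is_set()

def within_bounds(step: dict) -> bool:
    """Verification: reject steps that drift outside intended parameters (example invariants)."""
    if step.get("spend_usd", 0) > 100:           # assumed spend ceiling
        return False
    if step.get("path", "").startswith("/etc"):  # assumed filesystem restriction
        return False
    return True

def guarded_execute(step: dict, switch: OverrideSwitch, execute) -> bool:
    """Run a single step only if no override is active and the step passes verification."""
    if switch.is_triggered() or not within_bounds(step):
        return False
    execute(step)
    return True
```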
The paper's analysis of different autonomy levels provides a useful framework for risk assessment in product planning. Simple processors and routers present manageable risks. Tool-calling systems introduce more complexity but still operate within defined boundaries. Multi-step agents represent the practical ceiling for systems where reasonable control and liability management can be maintained.
The business implications are equally significant. The research suggests that companies pushing for full autonomy are essentially betting their survival on controlling systems designed to be uncontrollable. This represents a fundamental misalignment between product strategy and risk management. Semi-autonomous systems can capture most of the market value of AI agents while avoiding the existential risks associated with fully autonomous deployment.
The timing of this research is particularly relevant given the industry's recent push toward agentic AI systems. As companies race to deploy increasingly autonomous systems, they create both competitive pressure and regulatory risk. Companies that demonstrate thoughtful autonomy limitations may gain an advantage as regulators scrutinize this space more closely.
The research also offers useful language for stakeholder communication. Instead of rejecting AI agents entirely, it provides a nuanced framework for explaining why certain types of autonomy create unacceptable risks while others remain beneficial. This enables constructive engagement with business stakeholders who want to capture the benefits of agentic AI without unnecessary liability exposure.
Moving forward, the paper's recommendations align with emerging regulatory approaches emphasizing human oversight and accountability. By building systems that retain meaningful human control, companies can position themselves favorably for future compliance requirements while still delivering substantial user value. The key is ensuring autonomy limitations are genuine design constraints, not superficial safety theater.
The research provides a clear framework for critical autonomy decisions (a code sketch of this evaluation gate follows below):
- Evaluate each proposed agent capability against the five-level taxonomy.
- Assess risk amplification across relevant value categories.
- Maintain human control mechanisms that are meaningful rather than cosmetic.
This approach allows companies to capture the benefits of agentic AI while avoiding uncontrollable risks that could threaten the entire business.
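The three checks above can be folded into a lightweight review gate that product and legal teams apply to every proposed agent capability. The sketch below is one hypothetical way to record and automate that check; the field names, blocking rules, and listed value categories are illustrative choices, not thresholds prescribed by the paper.

```python
from dataclasses import dataclass

# A few of the value categories examined in the paper's analysis (not exhaustive).
VALUE_CATEGORIES = {"accuracy", "safety", "privacy", "security", "equity"}

@dataclass
class CapabilityReview:
    """Hypothetical review record for a single proposed agent capability."""
    name: str
    autonomy_level: int                 # 0 (simple processor) .. 4 (fully autonomous)
    human_control: str                  # "meaningful", "cosmetic", or "none"
    flagged_values: tuple[str, ...]     # categories where risk amplification is expected

    def verdict(self) -> str:
        if self.autonomy_level >= 4:
            return "block: fully autonomous capabilities are out of scope"
        if self.human_control != "meaningful":
            return "block: human control must be meaningful, not cosmetic"
        flagged = [v for v in self.flagged_values if v in VALUE_CATEGORIES]
        if flagged:
            return "review: mitigate amplified risk in " + ", ".join(flagged)
        return "approve"

review = CapabilityReview(
    name="auto-reply to routine customer emails",
    autonomy_level=3,
    human_control="meaningful",
    flagged_values=("privacy", "accuracy"),
)
print(review.verdict())  # review: mitigate amplified risk in privacy, accuracy
```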

TLDR: The paper argues that fully autonomous AI agents should not be developed. These agents represent a fundamental shift: rather than operating as human-controlled tools, they autonomously create context-specific plans and execute multiple tasks without human intervention. The risks to people increase with a system's autonomy, as users cede more control. Safety risks, which affect human life and cut across the other values, are particularly concerning.
Risks are amplified as AI agent autonomy increases:
• Safety & Security: Unpredictable actions and the potential to override human control become more likely, leading to harms like "hijacking" for data exfiltration or compromising users. Attack surfaces expand, allowing automated attacks at scale.
• Accuracy & Truthfulness: Inaccuracies and false information (e.g., deepfakes) from base models compound, leading to unreliable outcomes and the propagation of false information that can manipulate beliefs or scam people.
• Privacy: Increased access to personal data (e.g., contacts, calendars) for personalization means higher risks of breaches and public sharing of intimate information without consent.
• Flexibility: Greater integration with diverse systems increases the risk of malicious code and unintended problematic actions, such as draining a bank account.
• Equity: While potentially beneficial, systemic biases from training data can compound, and job loss becomes a greater risk as agents act as artificial workers.
While some benefits like assistiveness, efficiency, and relevance are noted, they often present countervailing relationships with risks. The paper finds no clear benefit of fully autonomous AI agents, only foreseeable harms from ceding complete human control. Historical incidents, such as nuclear false alarms, demonstrate that autonomous systems can make catastrophic errors from trivial causes, highlighting the essential role of human cross-verification.
The paper concludes with critical directions for future development:
1. Adopt agent levels: Widespread adoption of clear distinctions between autonomy levels is needed to better understand system capabilities and associated risks.
2. Human control mechanisms: Develop robust technical and policy frameworks to maintain meaningful human oversight, including reliable override systems and clear boundaries for agent operation.
3. Safety verification: Create new methods to verify that AI agents remain within intended operating parameters and cannot override human-specified constraints.
Ultimately, human judgment and the ability to say "no" remain essential, particularly for high-stakes decisions.