The Klarna case proves agents work, but legal frameworks lag behind
"Their ability to execute multi-step plans autonomously heightens the potential for abuse by lowering barriers to entry and costs involved in these activities."
Kraprayoon, Jam, Zoe Williams, and Rida Fayyaz. "AI Agent Governance: A Field Guide." Institute for AI Policy and Strategy, April 2025.
The Klarna case study buried in this IAPS field guide should be required reading for every product team building agent capabilities. Klarna, a fintech company, deployed agents that handled the customer service work of 700 full-time employees with no reduction in satisfaction metrics, demonstrating that we're past the proof-of-concept phase and into real economic deployment. But the legal frameworks governing these systems remain woefully underdeveloped, creating significant exposure for companies rushing to capture market advantages.
This April 2025 report from the Institute for AI Policy and Strategy provides the most comprehensive analysis I've seen of the governance challenges emerging from AI agent deployment. The authors—Jam Kraprayoon, Zoe Williams, and Rida Fayyaz—document a technology landscape where current agents show both promising capabilities and concerning limitations that directly impact legal risk assessment.
The performance data reveals a clear pattern relevant to liability planning. Agents perform comparably to humans on tasks that take around 30 minutes but struggle significantly with longer-duration work. METR's evaluation suite found that agents complete less than 20% of tasks requiring an hour or more of human time. This performance cliff creates a predictable zone of heightened legal exposure: agents are sophisticated enough to begin complex tasks but lack the reliability to complete them safely.
What makes this particularly relevant for product strategy is the pace of capability improvement. Research cited in the report shows the length of tasks agents can complete doubling roughly every seven months, and recent advances like OpenAI's o3 reach 71.7% on SWE-bench Verified compared to 48.9% for the next best system. Legal frameworks therefore need to anticipate agent capabilities that significantly exceed current performance within typical product development timelines.
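To make that trajectory concrete: taking the roughly 30-minute horizon above and the seven-month doubling time at face value, a back-of-envelope extrapolation (my arithmetic, not a projection from the report) looks like this.

```python
# Back-of-envelope extrapolation of agent task horizons, assuming the
# ~30-minute reliable horizon and 7-month doubling time cited in the report.
# The projected values are simple compounding, not figures from the report.
horizon_minutes = 30.0
doubling_period_months = 7

for months in (0, 14, 28, 42):
    projected = horizon_minutes * 2 ** (months / doubling_period_months)
    print(f"+{months:2d} months: ~{projected / 60:.1f}-hour task horizon")
```

On that curve, an agent feature scoped today around half-hour tasks would be facing roughly a full workday of autonomous work before a typical two-to-three-year product roadmap plays out.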
The governance challenge extends beyond traditional AI oversight because agents "can take actions in the world through interacting with tools and other external systems." This fundamental difference from chatbot systems creates new categories of legal complexity around authority, delegation, and accountability. When an agent executes a financial transaction, modifies system configurations, or interacts with other autonomous systems, traditional notions of foreseeability and control become strained.
Industry deployment timelines add urgency to these considerations. Salesforce CEO Marc Benioff predicts one billion AI agents by the end of fiscal year 2026, while Meta's Zuckerberg announced plans to "introduce AI agents to billions of people." These commitments represent massive engineering and capital investments that will drive rapid market adoption regardless of governance readiness.
The report organizes potential interventions into five categories that correspond to specific product development decisions. Alignment measures ensure agents behave consistently with intended values and goals, but evidence suggests current approaches may be insufficient for agent systems. Research demonstrates that training methods effective for chatbots become less reliable when applied to agents, with jailbreaking attacks proving more successful against browser agents than base models.
Control interventions provide external constraints on agent behavior through technical and procedural mechanisms. Rollback infrastructure allows agent actions to be voided or undone, similar to how banks handle fraudulent transactions. Shutdown mechanisms enable controlled cessation of agent operations, while restrictions on specific actions and tools limit potential damage from agent malfunctions. For product teams, these translate to architectural requirements that must be integrated during initial development rather than retrofitted after deployment.
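The report describes these control mechanisms at the policy level, not as code, but a minimal sketch clarifies what they imply for architecture. The class names and the undo convention below are illustrative assumptions, not an interface from the report or any particular framework.

```python
from dataclasses import dataclass, field
from typing import Callable

@dataclass
class ReversibleAction:
    """An agent action paired with a compensating undo step."""
    name: str
    execute: Callable[[], None]
    undo: Callable[[], None]

@dataclass
class ActionGate:
    """Runs agent actions only when allowlisted and keeps an undo log."""
    allowed_actions: set
    log: list = field(default_factory=list)

    def run(self, action: ReversibleAction) -> bool:
        if action.name not in self.allowed_actions:
            return False  # restriction: this agent may not take this action
        action.execute()
        self.log.append(action)
        return True

    def rollback(self) -> None:
        """Void executed actions in reverse order, like a bank reversing transactions."""
        while self.log:
            self.log.pop().undo()

# Illustrative usage: a refund issued by the agent can later be voided.
ledger = []
gate = ActionGate(allowed_actions={"issue_refund"})
gate.run(ReversibleAction(
    name="issue_refund",
    execute=lambda: ledger.append("refund:42.00"),
    undo=lambda: ledger.remove("refund:42.00"),
))
gate.rollback()  # the refund is voided; ledger is empty again
```

The design point is that every externally visible action carries its compensating step from the start, which is far cheaper than retrofitting reversibility after deployment.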
Visibility measures address the information asymmetries that emerge when agents operate autonomously across extended time periods. Agent ID systems provide unique identifiers with information about function, developer, behavior patterns, and incident history. Activity logging captures inputs and outputs from users, tools, and interactions with other agents. These capabilities become critical for incident investigation and establishing accountability chains when agent behavior causes harm.
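As with the control measures, the report specifies what visibility should achieve rather than a format. The record fields and the JSON-lines log below are assumptions about what a minimal implementation might capture, not a published Agent ID standard.

```python
import json
import time
import uuid
from dataclasses import dataclass, asdict

@dataclass
class AgentID:
    """Minimal identity record: who built the agent, what it is for, and its incident history."""
    agent_id: str
    developer: str
    declared_function: str
    incident_count: int = 0

@dataclass
class ActivityRecord:
    """One logged interaction: inputs and outputs across users, tools, and other agents."""
    agent_id: str
    timestamp: float
    channel: str    # "user", "tool", or "agent"
    direction: str  # "input" or "output"
    payload: str

def log_activity(log_path: str, record: ActivityRecord) -> None:
    """Append one JSON line so the log supports later incident investigation."""
    with open(log_path, "a", encoding="utf-8") as f:
        f.write(json.dumps(asdict(record)) + "\n")

# Illustrative usage
support_agent = AgentID(
    agent_id=str(uuid.uuid4()),
    developer="ExampleCorp",
    declared_function="customer-service triage",
)
log_activity("agent_activity.jsonl", ActivityRecord(
    agent_id=support_agent.agent_id,
    timestamp=time.time(),
    channel="tool",
    direction="output",
    payload="issued refund of 42.00 for order 1234",
))
```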
Security and robustness interventions acknowledge that agents present larger attack surfaces than traditional AI systems due to their integration with external tools and services. Adversarial robustness testing systematically evaluates agent performance under specially crafted inputs designed to exploit vulnerabilities. Sandboxing creates secure, isolated environments for agent operation with restricted permissions and monitored boundaries. Access controls manage authorization for agent instructions, potentially including time-based differential access that prioritizes defensive applications.
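A similar sketch, again with assumed names rather than anything prescribed by the report, shows how sandboxing and access controls tend to reduce in practice to a wrapper that exposes only explicitly granted tools under a per-session policy.

```python
from dataclasses import dataclass
from typing import Any, Callable, Dict, FrozenSet

class ToolAccessError(Exception):
    """Raised when an agent requests a tool outside its granted scope."""

@dataclass(frozen=True)
class AccessPolicy:
    """Tools this agent instance may invoke, plus a per-session call budget."""
    granted_tools: FrozenSet[str]
    max_calls: int

class Sandbox:
    """Isolated tool surface: agent calls pass through here instead of hitting systems directly."""

    def __init__(self, tools: Dict[str, Callable[..., Any]], policy: AccessPolicy):
        self._tools = tools
        self._policy = policy
        self._calls = 0

    def call(self, tool_name: str, *args: Any, **kwargs: Any) -> Any:
        if tool_name not in self._policy.granted_tools:
            raise ToolAccessError(f"tool '{tool_name}' not granted to this agent")
        if self._calls >= self._policy.max_calls:
            raise ToolAccessError("per-session call budget exhausted")
        self._calls += 1
        return self._tools[tool_name](*args, **kwargs)

# Illustrative usage: the agent may check order status but not move money.
tools = {
    "get_order_status": lambda order_id: f"order {order_id}: shipped",
    "send_payout": lambda amount: f"paid {amount}",
}
box = Sandbox(tools, AccessPolicy(granted_tools=frozenset({"get_order_status"}), max_calls=10))
print(box.call("get_order_status", 1234))  # allowed
# box.call("send_payout", 500)             # raises ToolAccessError
```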
Societal integration measures focus on long-term compatibility with existing legal and economic systems. Liability regimes must determine responsibility allocation among developers, deployers, and users when agent actions cause harm. The report notes that traditional legal frameworks depend on action foreseeability, which becomes problematic when agents behave contrary to user or developer intent. Unlike humans, agents lack inherent deterrence from personal liability consequences, potentially making them more inclined toward risky actions.
The current governance landscape reveals significant gaps between technological capability and regulatory preparedness. The report documents that only a small number of researchers work on agent governance, primarily in civil society organizations and frontier AI companies, and that funding is limited relative to the investment flowing into agent development. This creates opportunities for early-moving companies to establish governance best practices that may become industry standards.
From a business perspective, proactive governance implementation offers competitive advantages beyond risk mitigation. Companies demonstrating responsible agent development may gain preferential treatment from regulators and customers concerned about AI safety. Early investment in technical safeguards and clear liability frameworks positions organizations favorably for inevitable policy developments rather than scrambling to achieve compliance retroactively.
The report's contrasting scenarios illustrate potential outcomes depending on governance choices made during current development cycles. Positive scenarios feature agents that augment human capabilities while maintaining meaningful oversight and equitable benefit distribution. Negative scenarios involve agents operating beyond effective human control, creating market instability, security vulnerabilities, and erosion of democratic accountability. The critical difference lies in proactive governance measures implemented during development rather than reactive responses after problematic deployments.
For product organizations, this creates specific implementation requirements that intersect with current technical architecture decisions. Agent ID systems require coordination with emerging industry standards. Rollback capabilities need integration with existing transaction and audit systems. Liability frameworks demand collaboration between legal, product, and engineering teams to establish clear boundaries around agent authority and human oversight mechanisms.
The combination of rapid capability advancement and governance underdevelopment creates both urgency and opportunity for organizations willing to invest in responsible development practices. The window for proactive governance appears limited given aggressive commercial deployment timelines, but early action provides significant advantages for companies that prioritize sustainable market positioning over short-term competitive gains.

TLDR: This paper highlights that AI agents—AI systems capable of autonomously achieving goals with minimal human instruction—are rapidly advancing and poised for widespread societal deployment, with projections of billions in service soon. These agents leverage foundation models with scaffolding for memory, planning, and tool use, already demonstrating economic value in areas like customer service, AI R&D, and cybersecurity at reduced costs. Despite rapid improvements in capability, current agents still face significant limitations in reliability, reasoning, and tool use, performing considerably worse than humans on complex, open-ended, and longer-duration tasks.
The paper asserts that this proliferation introduces profound and novel risks:
• Malicious use: Agents can amplify disinformation, cyberattacks, and dual-use scientific research (e.g., bioweapon development) by lowering barriers and costs for bad actors, and can be "jailbroken".
• Accidents and loss of control: Risks range from mundane malfunctions to severe scenarios like "rogue replication" (agents self-proliferating and evading shutdown), "scheming and deception" (pursuing misaligned goals covertly), and "specification gaming" (exploiting loopholes contrary to human intent).
• Security risks: Agents' expanded tool access creates larger attack surfaces, making them vulnerable to memory manipulation, exploitation via weak integrations, and "infectious jailbreaks" in multi-agent environments.
• Other systemic risks: Potential for mass labor displacement, extreme power concentration among elites, and erosion of democratic accountability.
To address these challenges, the paper introduces "agent governance" as a nascent field. This field focuses on preparing for a world with proficient AI agents, distinguishing itself from broader AI governance due to agents' unique characteristics like their autonomy, direct world impact, inter-agent communication, and even their potential role in governance itself.
A key contribution is an "agent interventions taxonomy," outlining five categories of measures:
• Alignment: Interventions to ensure agent behavior is consistent with human values and intentions, though current methods like RLHF may be less effective for more capable or deceptive agents.
• Control: External constraints to keep agents within predefined boundaries, acting as a "safety net" via mechanisms like rollback infrastructure, shutdown protocols, and restricting specific actions or tool access.
• Visibility: Measures to make agent behavior, capabilities, and actions observable and understandable to humans, including Agent IDs and activity logging.
• Security and Robustness: Interventions to secure agents from external threats and ensure reliable performance, such as access controls, adversarial robustness testing, and sandboxing.
• Societal Integration: Measures supporting long-term integration into social, political, and economic systems, addressing issues like inequality and accountability through legal frameworks (e.g., liability regimes, equitable access schemes, developing law-following AI agents).
The paper concludes that society is largely unprepared for these developments, as the pace of agent capability advancement is rapidly outstripping governance solutions. The field of agent governance is in its infancy, requiring urgent, coordinated efforts to develop and test robust interventions to ensure safe and beneficial deployment.