What a Lisbon cafe told me about Anthropic's marketplace experiment

Three weeks ago I was in Lisbon, working through a stack of meetings: academics, practitioners, a guest lecture at NOVA Law. I had just arrived in Lisbon and was fighting off jet lag at my usual cafe. Someone walked in and ordered an espresso “sem princípio!” My Portuguese is rough, so I asked my husband: did he just order a coffee without principles?

He laughed and explained: no, some people ask the barista to throw away the first drops out of the espresso machine. Those drops are the most concentrated, the most bitter, the part the machine has to push hardest to make. The coffee that remains is smoother. Easier to drink. Less anchored.

The Portuguese word princípio means both beginning and principle. Same Latin root. To remove the start is, in the same breath, to remove the foundation. This has been living rent-free in my head ever since. Then, on Friday, Anthropic published a paper that turned it into something more useful.

The project is called Project Deal. Anthropic ran a one-week experiment inside its San Francisco office. Sixty-nine of its employees handed Claude agents $100 each and let those agents transact on their behalf in a Slack-based marketplace. The agents listed items, fielded counteroffers, haggled over prices, and closed deals — all in natural language, with no human stepping back into the loop after the setup interview. By the end of the week, 186 deals had closed. More than $4,000 had moved.

The findings will get debated for months. The sentence in the paper that mattered most to me will probably get less notice: "The policy and legal frameworks around AI models that transact on our behalf simply don't exist yet."

The reason the Lisbon phrase came back to me is that Project Deal is, at the operational level, a perfect example of sem princípio: something that runs smoothly, but without grounding principles.

The market cleared. The deals closed. The participants reported satisfaction. What was missing was the principle that grounds the cup — the moment of authority that maps to a specific transaction, the comparison that lets a participant know if they were treated fairly, the record that lets someone reconstruct what happened weeks later. The cup arrived smoother. The principle was poured down the drain.

OK, I'm being dramatic. But for teams looking at agentic engagement (which seems like everyone), maybe not.

The frameworks lawyers grew up interpreting were built for moments. A decision. A signature. A defined purpose. Agents work in loops. They make hundreds of micro-decisions to produce one outcome. They run for hours or days from a single briefing. They are negotiating, or soon will be, with other agents in natural language, without a protocol. The legal architecture has not caught up, and Anthropic just said so out loud.

Waiting for the architecture to arrive is not the move. The operating standard is not compliance. It is defensibility — a record you can explain, document, and stand behind. The kind of record that survives an audit, a customer inquiry, a courtroom, or a regulator who shows up two years later asking what an agent did and why.

Defensibility breaks down into three concepts. None of them is optional.

Observability is what you see while the agent is running. Can you watch what it is doing in the moment, what tools it is calling, what data it is touching, what decisions it is making? This is the dashboard view — the equivalent of looking at the instruments while the car is moving. Most enterprise AI deployments ship with some version of this because the system cannot run without it.

Traceability is what you can reconstruct after the fact. Given an outcome, can you walk backward to the inputs, the authority granted, the chain of decisions, the moment the agent decided to do this rather than that? Observability without traceability is a dashboard with no history. You see the action; you cannot tell anyone where it has been.

Auditability is what a third party can verify. Tamper-evident. Portable. Defensible. Given a record produced by your system, can a regulator, a counterparty, or an opposing counsel confirm the agent acted within its authority and that the record has not been edited after the fact? Most enterprise systems fall short here, because they were built for operations and investigation, not adversarial verification.

The three are layered, and they belong to different audiences. Observability is the operator's view. Traceability is the investigator's. Auditability is the regulator's, and the courtroom's. Most companies have the first today. Some have the second. Almost none have the third.
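To make "tamper-evident" less abstract, here is a minimal Python sketch of one common technique: a hash-chained, signed log. It is an illustration under my own assumptions, not the Attest format and not anything Anthropic describes. The names (append_entry, verify_log, the event fields) are hypothetical, and it uses a shared-secret HMAC only because that ships with the standard library.

```python
import hashlib
import hmac
import json

GENESIS = "0" * 64  # placeholder "previous hash" for the first entry


def _digest(body: dict) -> str:
    """Hash a canonical JSON encoding so identical content always hashes the same."""
    encoded = json.dumps(body, sort_keys=True, separators=(",", ":")).encode("utf-8")
    return hashlib.sha256(encoded).hexdigest()


def append_entry(log: list, key: bytes, event: dict) -> dict:
    """Append an event, chaining it to the previous entry and signing the hash."""
    prev_hash = log[-1]["hash"] if log else GENESIS
    entry_hash = _digest({"prev": prev_hash, "event": event})
    signature = hmac.new(key, entry_hash.encode("utf-8"), hashlib.sha256).hexdigest()
    entry = {"prev": prev_hash, "event": event, "hash": entry_hash, "sig": signature}
    log.append(entry)
    return entry


def verify_log(log: list, key: bytes) -> bool:
    """Recompute every hash and signature; any edit to an entry breaks the check."""
    prev_hash = GENESIS
    for entry in log:
        if entry["prev"] != prev_hash:
            return False  # chain link broken: an entry was removed or reordered
        expected_hash = _digest({"prev": entry["prev"], "event": entry["event"]})
        if entry["hash"] != expected_hash:
            return False  # event content no longer matches its recorded hash
        expected_sig = hmac.new(
            key, expected_hash.encode("utf-8"), hashlib.sha256
        ).hexdigest()
        if not hmac.compare_digest(entry["sig"], expected_sig):
            return False  # signature was not produced with this key
        prev_hash = entry["hash"]
    return True


# Hypothetical agent run: authority granted, an offer made, a deal closed.
key = b"demo-key-not-for-production"
log = []
append_entry(log, key, {"step": "authority", "detail": "budget of $100 granted"})
append_entry(log, key, {"step": "offer", "detail": "offered $40 for item"})
append_entry(log, key, {"step": "deal", "detail": "closed at $45"})
print(verify_log(log, key))  # True: the record is intact

log[1]["event"]["detail"] = "offered $20 for item"  # someone edits history
print(verify_log(log, key))  # False: the edit is detected
```

The chain supplies traceability, because each entry points back to the one before it. The signature is the start of auditability. A real third-party artifact would likely need an asymmetric signature, so that a regulator or counterparty can verify the record without holding the signing key. It would also need some outside anchoring, so that the holder of the log cannot simply drop its most recent entries or rebuild the whole chain.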

That last layer, the artifact a third party can verify, does not yet exist as a standard. I have been exploring one through my work on Attest, under the name Certificate of Action: a tamper-evident, signed record tied to the inputs the agent saw, the authority it acted under, and the decisions it made along the way. The explanation is at mindthechain.kenpriore.ai, and here is a demo built for the Legal Quants:

Attest

Project Deal is the experiment that shows we are still pre-artifact.

For builders sitting at the front of this, you will not be able to wait for the regulatory architecture to clarify. The agents will ship. The transactions will run. The deals will close. What you can shape — and what your role now requires you to shape — is the record that gets produced as those things happen. The disclosure of which agent represented your company. The trace that lets you reconstruct what the agent did. The audit artifact that holds up when someone outside your company asks.

I keep coming back to the café in Lisbon. The cup that arrived was smoother. But how do we know which principles were in that cup, and which were poured out?

That is also true of the markets coming next. The smoothness is here, and it will compound. We will not be putting the principle back into the cup.
