AI Agent Payment Controls: Governance for Payment Operators
When an AI agent holds the card, the operator's job is controls: scoped mandates, spending limits, kill-switches, and audit. A governance guide.
The winning agent payment spec is unknowable; the durable operator job is controls. Scoped revocable mandates, spending and velocity limits, agent identity, human-in-the-loop, instant kill-switch, and immutable audit — plus a control reference table and pre-enablement checklist.
The operator's job in agentic payments is not to pick the winning agent spec — it is to put bounded, revocable, auditable controls around what an agent can do with money. The durable controls are: a scoped mandate that defines what an agent may buy, for whom, how much, and for how long; spending and velocity limits layered as per-transaction, period, count-per-window, and cumulative ceilings; merchant, category, and payment-method restrictions; agent identity bound to a specific principal and authenticated on every action; human-in-the-loop approval above defined thresholds; an instant kill-switch that revokes a mandate or an agent and propagates fast; and immutable, queryable audit logs of every agent action. Treat liability between user, agent platform, PSP, wallet, and merchant as something you define contractually, not something you assume.
Agentic payments are arriving. AI agents that hold a payment credential and complete a purchase without a human at the keyboard have moved from demo to live transaction, and the schemes, PSPs, and wallets are shipping the infrastructure to make it routine. The reflex for many operators is to ask which standard to bet on — Google's AP2, Visa's Trusted Agent Protocol, Mastercard's Agentic Tokens, Stripe's Agent Toolkit, one of the model vendors' agent protocols. That is the wrong first question.
The standards will churn. Some of what looks foundational in 2026 will be deprecated, merged, or quietly abandoned by 2028. What does not churn is the operator's actual job: bounding what a non-human actor is allowed to do with money. A mandate has to be scoped, limits have to be enforced, an agent has to be identifiable, a human has to be in the loop above a threshold, and you have to be able to stop everything instantly and prove afterward exactly what happened. Those control objectives outlast any spec. Pick controls, not specs.
This article is the governance leg of a three-part set. For the market context — what is live, how 3DS breaks for agents, the MCC coding problem, and the open liability question — see agentic commerce. For the protocol stack — MCP, the Stripe Agent Toolkit, the network-layer credentials — see the agent payment API stack. This piece is deliberately narrower and more durable: it is the controls-and-governance framework you wrap around whatever stack and market you end up operating in.

Why agent payments need their own controls
A non-human actor holding a payment credential changes four things that existing controls were not designed for.
Delegated authority. A human cardholder is present and exercising judgment at the moment of purchase. An agent is acting on authority delegated earlier, for a scope the human may have described loosely ("book my travel") and may not fully remember. The control problem shifts from "is this the cardholder?" to "is this within the authority the cardholder actually delegated, and is that delegation still valid?"
Machine speed and velocity. A human makes a handful of purchases an hour at most. An agent in a loop — or a compromised agent — can attempt hundreds of transactions a minute. A control that depends on a human noticing something is wrong has already lost. Limits have to be enforced mechanically and at machine speed because the failure mode is mechanical and at machine speed.
Attribution and identity. When something goes wrong with a human transaction, you know which human. With agents you have to answer: which agent, running which version, acting for which principal, under which mandate. If you cannot attribute an action to a specific agent and a specific delegation, you cannot dispute it, debug it, or revoke it cleanly.
Blast radius. A compromised card credential harms one account at human speed. A compromised agent — or a buggy one — can act across every principal it serves, at machine speed, until something stops it. The scale of what can go wrong before a human intervenes is categorically larger, which is why an instant kill-switch is a first-class control rather than a nice-to-have.
Existing card-on-file controls — per-card limits, MCC blocks, velocity rules — are necessary but not sufficient here. They assume a human principal and a single stored credential. They do not express delegated authority, they do not bind to an agent identity, and they do not give you a fast revocation path for "this specific agent, acting for this specific user, stop now." The framework below is what you add on top.
Scoped mandates and delegated authority
The foundational control is the mandate: a bounded, revocable, time-limited grant of authority that says what an agent may buy, for whom, up to how much, and for how long. Everything else in this framework is an enforcement mechanism for, or an exception to, the mandate. The mandate is the unit of control.
A well-formed mandate is least-privilege by construction. It does not grant "spend on this card"; it grants "spend up to this amount, at these kinds of merchants, on behalf of this user, until this expiry." This is the same discipline that OAuth 2.0 scopes encode for API access: access is limited to exactly the permissions requested, and requesting broad access without scoping is treated as an anti-pattern. The durable idea predates payment agents by well over a decade: delegate the narrowest authority that still lets the job get done, and make the grant expire.
Several emerging specs encode a mandate concept, and they are useful as illustrations of where the industry is converging — but treat each as evolving and provider-specific. As of 2026, Google's AP2 models authority as cryptographically signed Mandates (an Intent Mandate capturing the user's upfront conditions such as price and timing, and a Cart Mandate fixing the exact items and price before purchase), signed by verifiable credentials. That is one design; it will evolve, and your providers may implement something different. What you should hold onto is not AP2 specifically but the property it illustrates: a mandate that is explicit, signed, scoped, and checkable after the fact. When you grant a mandate you are granting authorization in the literal sense — and like any authorization, it should be the minimum that works and it should expire.
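To make the "what, for whom, how much, how long" shape concrete, here is a minimal sketch of a mandate record with a deny-by-default check. It is not AP2 or any provider's schema: the `Mandate` class, its field names, and the category labels are all hypothetical, and a real mandate would also be signed and stored by a service the agent cannot modify.

```python
from dataclasses import dataclass
from datetime import datetime, timedelta, timezone
from decimal import Decimal


@dataclass(frozen=True)
class Mandate:
    """Hypothetical mandate record: what, for whom, how much, how long."""
    mandate_id: str
    agent_id: str
    principal_id: str               # the user the agent acts for
    allowed_categories: frozenset   # illustrative merchant category labels
    max_amount: Decimal             # ceiling for a single purchase
    expires_at: datetime
    revoked: bool = False

    def permits(self, agent_id, principal_id, category, amount, now):
        """Return (allowed, reason). Any mismatch is a denial."""
        if self.revoked:
            return False, "mandate revoked"
        if now >= self.expires_at:
            return False, "mandate expired"
        if agent_id != self.agent_id or principal_id != self.principal_id:
            return False, "agent or principal not covered by this mandate"
        if category not in self.allowed_categories:
            return False, "category out of scope"
        if amount > self.max_amount:
            return False, "amount exceeds mandate"
        return True, "ok"


now = datetime(2026, 6, 1, tzinfo=timezone.utc)
travel = Mandate(
    mandate_id="m-001", agent_id="agent-travel", principal_id="user-42",
    allowed_categories=frozenset({"airline", "hotel"}),
    max_amount=Decimal("800.00"),
    expires_at=now + timedelta(days=7),
)

# In scope: allowed. Out-of-scope category or past expiry: denied.
print(travel.permits("agent-travel", "user-42", "hotel", Decimal("250.00"), now))
print(travel.permits("agent-travel", "user-42", "casino", Decimal("50.00"), now))
print(travel.permits("agent-travel", "user-42", "hotel", Decimal("250.00"),
                     now + timedelta(days=8)))
```

The record is frozen so that changing scope means issuing a new mandate rather than editing one in place, which keeps the grant checkable after the fact.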
Spending and velocity limits
Limits are the numeric guardrails that enforce a mandate's "how much." They are most robust when layered, because each layer catches a failure the others miss:
- Per-transaction cap. The maximum any single agent-initiated transaction may be. Catches the fat-finger and the single large erroneous purchase.
- Period budget. A cumulative cap over a window — per day, per week, per billing period. Catches slow bleed that stays under the per-transaction cap.
- Velocity limit. A maximum count of transactions per window (per minute, per hour). This is the control specifically aimed at machine speed: a looping or compromised agent hits the velocity wall long before it hits the period budget in dollar terms.
- Cumulative ceiling. A hard lifetime-of-mandate ceiling that no rolling window resets. The backstop that bounds total exposure regardless of how the other limits are tuned.
Layer them rather than choosing one. A per-transaction cap alone does nothing against a thousand small transactions; a period budget alone lets a compromised agent burn the whole budget in seconds. The combination of count-per-window and amount-per-window is what actually bounds a machine-speed actor. Enforce them at a layer the agent cannot edit — the credential, token, or orchestration layer — not in the agent's own logic, which a compromised or buggy agent will not honor.
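The four layers can be sketched as a single check that runs outside the agent. This is an illustrative in-memory version, assuming timestamps in seconds and made-up limit values; the `LayeredLimits` class and its parameter names are hypothetical, and a production enforcer would live in the token or orchestration layer with durable state.

```python
from collections import deque
from decimal import Decimal


class LayeredLimits:
    """Hypothetical enforcer for the four limit layers on one mandate."""

    def __init__(self, per_txn, period_budget, period_seconds,
                 max_count, count_window_seconds, lifetime_ceiling):
        self.per_txn = per_txn
        self.period_budget = period_budget
        self.period_seconds = period_seconds
        self.max_count = max_count
        self.count_window_seconds = count_window_seconds
        self.lifetime_ceiling = lifetime_ceiling
        self.lifetime_spent = Decimal("0")
        self.history = deque()  # (timestamp, amount) of approved transactions

    def check_and_record(self, amount, now):
        """Return (allowed, reason); record the spend only when allowed."""
        # Drop history older than the longest rolling window.
        horizon = now - max(self.period_seconds, self.count_window_seconds)
        while self.history and self.history[0][0] <= horizon:
            self.history.popleft()

        if amount > self.per_txn:
            return False, "per-transaction cap"
        recent = sum(1 for t, _ in self.history
                     if t > now - self.count_window_seconds)
        if recent >= self.max_count:
            return False, "velocity limit"
        spent = sum(a for t, a in self.history if t > now - self.period_seconds)
        if spent + amount > self.period_budget:
            return False, "period budget"
        if self.lifetime_spent + amount > self.lifetime_ceiling:
            return False, "cumulative ceiling"

        self.history.append((now, amount))
        self.lifetime_spent += amount
        return True, "ok"


limits = LayeredLimits(
    per_txn=Decimal("100"), period_budget=Decimal("300"), period_seconds=86400,
    max_count=3, count_window_seconds=60, lifetime_ceiling=Decimal("1000"),
)
# A looping agent firing one small purchase per second is stopped by the
# velocity layer after three approvals, long before the budget matters.
for second in range(5):
    print(second, limits.check_and_record(Decimal("10"), now=second))
```

In this run the fourth and fifth attempts are refused on count alone, even though only $30 of a $300 budget has been spent, which is the case a dollar-only limit would miss.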
Merchant, category, and payment-method restrictions
Beyond "how much," a mandate should constrain "where" and "how." Three restriction types:
- Merchant allow/block lists. Restrict an agent to a known set of merchants, or block specific ones. Tightest for narrow agents ("renew these three SaaS subscriptions"), looser for open-ended shopping agents where an allow-list is impractical.
- Category restrictions. Limit by merchant category — the agent that books travel should not be able to spend at a casino or a crypto exchange. There is a real nuance here: when an agent routes a purchase through its own platform's merchant account rather than the underlying merchant, the category coding can shift away from the true merchant type, which both breaks category restrictions and creates reconciliation and dispute problems. The MCC-coding mechanics of agent-routed transactions are covered in agentic commerce; the governance point is that a category restriction is only as trustworthy as the category coding underneath it, so verify how your providers code agent-routed transactions before relying on a category rule.
- Payment-method and rail restrictions. Constrain which method or rail an agent may use — a specific card, a scoped token, a particular wallet — and exclude rails you do not want an agent touching. Scoped tokens (issued per agent, with controls embedded at the token level) are a cleaner enforcement point than a raw stored credential because the constraints travel with the credential.
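The three restriction types, plus the category-coding caveat, can be expressed as one evaluation function. This is a sketch under stated assumptions: the `Restrictions` class, the `evaluate` function, the idea of comparing a merchant of record with an underlying merchant, and the "review" outcome are all hypothetical, and how routed transactions are actually identified depends on your providers.

```python
from dataclasses import dataclass, field


@dataclass
class Restrictions:
    """Hypothetical where-and-how rules attached to a mandate."""
    allowed_merchants: set = field(default_factory=set)  # empty means no allow-list
    blocked_merchants: set = field(default_factory=set)
    blocked_categories: set = field(default_factory=set)
    allowed_methods: set = field(default_factory=set)    # e.g. {"scoped_token"}


def evaluate(r, merchant_of_record, underlying_merchant, category, method):
    """Return (decision, reason); decision is 'allow', 'deny' or 'review'."""
    if method not in r.allowed_methods:
        return "deny", "payment method not permitted"
    if underlying_merchant in r.blocked_merchants:
        return "deny", "merchant blocked"
    if r.allowed_merchants and underlying_merchant not in r.allowed_merchants:
        return "deny", "merchant not on allow-list"
    if merchant_of_record != underlying_merchant:
        # Agent-routed purchase: the category describes the platform,
        # not the true seller, so the category rule cannot be trusted.
        return "review", "category coding unverified for routed transaction"
    if category in r.blocked_categories:
        return "deny", "category blocked"
    return "allow", "ok"


rules = Restrictions(
    blocked_merchants={"casino-example"},
    blocked_categories={"gambling", "crypto"},
    allowed_methods={"scoped_token"},
)
print(evaluate(rules, "hotel-example", "hotel-example", "lodging", "scoped_token"))
print(evaluate(rules, "agent-platform", "exchange-example", "software", "scoped_token"))
print(evaluate(rules, "hotel-example", "hotel-example", "lodging", "raw_card"))
```

Sending routed transactions to review rather than trusting the reported category is one possible policy; denying them outright is the stricter alternative.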
Agent identity and user binding
A limit is meaningless if you cannot tell which agent you are limiting. Identity answers four questions on every action: which agent is this, what version, which principal (user) is it bound to, and how was that binding authenticated. Without confident answers you cannot enforce a per-agent mandate, you cannot revoke one agent without revoking all, and you cannot distinguish a legitimate agent from one that has been spoofed or compromised.
The threat to design against is an agent acting beyond its principal — either a compromised agent reaching past its mandate, or an impostor presenting as a trusted agent. The defenses are authentication of the agent on every action and a verifiable binding between the agent and the principal it claims to act for. Agent-identity standards are emerging and not yet settled: as illustrative 2026 examples, Visa's Trusted Agent Protocol has agents sign their requests cryptographically so a merchant can verify identity and bind the signature to a specific operation, and Mastercard pairs per-agent tokens with a Know Your Agent onboarding step. These are evolving and provider-specific — do not assume any one of them is universal or final. The durable requirement is that every agent action carries a verifiable identity bound to a specific principal, however your providers implement it.
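As a simplified illustration of "authenticated on every action and bound to a principal," the sketch below signs each action with a per-agent secret and checks the binding on verification. It is not the Trusted Agent Protocol or any scheme's mechanism: it uses a shared-secret HMAC from the Python standard library for brevity, where real deployments would more likely use asymmetric signatures, and the registry, agent identifier format, and field names are hypothetical.

```python
import hashlib
import hmac
import json

# Hypothetical registry: agent id (with version) -> (secret, bound principal).
AGENT_REGISTRY = {
    "agent-travel:v1.4": (b"demo-secret-not-for-production", "user-42"),
}


def canonical(action):
    """Serialize deterministically so both sides sign the same bytes."""
    return json.dumps(action, sort_keys=True, separators=(",", ":")).encode()


def sign(action, secret):
    return hmac.new(secret, canonical(action), hashlib.sha256).hexdigest()


def verify(agent_id, action, signature):
    """Check the agent is known, the signature matches this exact action,
    and the action names the principal the agent is bound to."""
    entry = AGENT_REGISTRY.get(agent_id)
    if entry is None:
        return False, "unknown agent"
    secret, bound_principal = entry
    if not hmac.compare_digest(sign(action, secret), signature):
        return False, "bad signature"
    if action.get("principal_id") != bound_principal:
        return False, "agent not bound to this principal"
    return True, "ok"


secret = AGENT_REGISTRY["agent-travel:v1.4"][0]
action = {"principal_id": "user-42", "operation": "purchase",
          "merchant": "hotel-example", "amount": "250.00"}
signature = sign(action, secret)

print(verify("agent-travel:v1.4", action, signature))
# Changing the amount after signing invalidates the signature.
print(verify("agent-travel:v1.4", dict(action, amount="2500.00"), signature))
```

Because the signature covers the whole action, it is tied to one specific operation and cannot be replayed against a different amount or merchant. A real design would also include a nonce or timestamp to prevent replay of the identical action.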
Human-in-the-loop approvals
Some actions should not be fully autonomous. The control is a threshold above which a human must approve before the agent proceeds — by amount, by merchant risk, by deviation from the stated intent, or by any combination. Below the threshold the agent acts; above it, it requests a step-up and waits.
The hard part is not adding the approval gate — it is designing the approval surface so it is a real decision and not a reflex tap. An approval that shows a human "Approve $1,420 purchase?" with no context, dozens of times a day, becomes a rubber stamp, and a rubber stamp is worse than no gate because it manufactures the appearance of oversight. Make the prompt carry enough to decide on — what, where, why, against which stated intent — set the threshold high enough that approvals stay rare and meaningful, and watch the approval rate: if humans approve essentially everything instantly, the threshold is wrong or the surface is uninformative. The AP2-style pattern of checking a cart against a previously declared intent is one way to make the gate substantive rather than cosmetic, by surfacing the mismatch a human should actually catch.
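Both halves of this control, the threshold gate and the check that approvals are still real decisions, can be sketched in a few lines. The function names, the $500 threshold, the risk labels, and the 98 percent and three-second cutoffs are placeholder assumptions chosen for illustration, not recommended values.

```python
from decimal import Decimal


def route(amount, merchant_risk, matches_intent, amount_threshold=Decimal("500")):
    """Return 'auto' or 'needs_approval' for one proposed purchase."""
    if amount > amount_threshold:
        return "needs_approval"
    if merchant_risk == "high":
        return "needs_approval"
    if not matches_intent:
        # The cart deviates from what the user originally asked for.
        return "needs_approval"
    return "auto"


def rubber_stamp_warning(decisions, max_approval_rate=0.98, max_median_seconds=3.0):
    """decisions: list of (approved, seconds_to_decide) tuples.
    Warn when nearly everything is approved and approved almost instantly."""
    if not decisions:
        return False
    rate = sum(1 for approved, _ in decisions if approved) / len(decisions)
    times = sorted(seconds for _, seconds in decisions)
    median = times[len(times) // 2]  # upper median for even-length lists
    return rate >= max_approval_rate and median <= max_median_seconds


print(route(Decimal("1420"), "low", True))    # above the amount threshold
print(route(Decimal("40"), "low", False))     # small, but off-intent
print(route(Decimal("40"), "low", True))      # routine purchase
print(rubber_stamp_warning([(True, 1.0)] * 50))
```

A warning from the second function does not say which fix is needed; it only signals that either the threshold or the approval surface deserves a look.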
Kill-switch and revocation
When something goes wrong, the most important control is the ability to stop it instantly. That means revoking a single mandate, revoking a specific agent across all its mandates, and — in the worst case — a global "stop all agent payments" switch. Per-agent tokens that can be revoked independently make the targeted version possible without taking down every agent at once.
A kill-switch is only as good as its propagation speed. Revocation that takes minutes to reach the enforcement point is a window in which a machine-speed actor keeps transacting. Treat revocation latency as a measured number, not an assumption: how long from "revoke" to "this credential can no longer authorize," and where in the stack is that enforced. This is the same incident discipline as failing over quickly from a degraded processor. The PSP and acquirer outage failover runbook makes the same point about kill-fast controls: a control you have not exercised is a hypothesis, so test that revocation actually works and clock how fast it propagates before you need it in a real incident.
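The three revocation scopes and the habit of timing them can be sketched as follows. The `RevocationRegistry` class and `measure_revocation_latency` helper are hypothetical; in a real drill the `probe` callable would be a sandbox authorization attempt against your provider rather than an in-memory lookup, and the measured number would include network and cache propagation.

```python
import time


class RevocationRegistry:
    """Hypothetical revocation state consulted at the enforcement point."""

    def __init__(self):
        self.revoked_mandates = set()
        self.revoked_agents = set()
        self.global_stop = False

    def revoke_mandate(self, mandate_id):
        self.revoked_mandates.add(mandate_id)

    def revoke_agent(self, agent_id):
        self.revoked_agents.add(agent_id)

    def stop_all(self):
        self.global_stop = True

    def may_authorize(self, agent_id, mandate_id):
        return not (self.global_stop
                    or agent_id in self.revoked_agents
                    or mandate_id in self.revoked_mandates)


def measure_revocation_latency(revoke, probe, timeout_seconds=5.0, poll_seconds=0.01):
    """Call revoke(), then poll probe() until it returns False.
    Returns seconds until the enforcement point refused, or None on timeout."""
    start = time.monotonic()
    revoke()
    while time.monotonic() - start < timeout_seconds:
        if not probe():
            return time.monotonic() - start
        time.sleep(poll_seconds)
    return None


registry = RevocationRegistry()
latency = measure_revocation_latency(
    revoke=lambda: registry.revoke_agent("agent-travel"),
    probe=lambda: registry.may_authorize("agent-travel", "m-001"),
)
print("revocation latency (s):", latency)
print("other agent unaffected:", registry.may_authorize("agent-other", "m-002"))
```

A `None` result is the important failure case: it means the stop decision never reached the enforcement point within the timeout, which is exactly what a drill exists to find.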
Audit logs and dispute evidence
Every agent action should produce an immutable, queryable log entry: which agent, which version, which mandate, which principal, when, what was attempted, and what the outcome was. This is not optional record-keeping — it is the evidence base for four different jobs.
- Disputes and chargebacks. When an agent-initiated transaction is disputed, you need to show the mandate that authorized it and the action that executed it. The same evidence discipline that powers chargeback representment applies, with the added burden of proving the delegation chain.
- Debugging. When an agent does something unexpected, the log is how you reconstruct what it actually did versus what it was authorized to do.
- Revocation forensics. After a kill-switch event, the log tells you exactly what the agent did before it was stopped and how much exposure to clean up.
- Future regulatory ask. The rules for agent payments are unsettled; a complete, immutable audit trail is the cheapest insurance against whatever evidentiary standard eventually lands.
Make the log immutable (append-only, tamper-evident) and queryable along the dimensions you will actually need under pressure — by agent, by principal, by mandate, by time window. A log you cannot query fast during an incident is documentation, not a control.
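One common way to get "append-only, tamper-evident" is a hash chain, where each entry commits to the one before it. The sketch below is an in-memory illustration with a hypothetical `AuditLog` class and integer timestamps; a real log would sit in a write-once store, and the chain head would need to be anchored somewhere external so that truncating the tail is also detectable.

```python
import hashlib
import json


class AuditLog:
    """Hypothetical append-only log; each entry commits to the previous one."""

    def __init__(self):
        self._entries = []

    @staticmethod
    def _digest(prev_hash, record):
        body = json.dumps(record, sort_keys=True).encode()
        return hashlib.sha256(prev_hash.encode() + body).hexdigest()

    def append(self, agent_id, agent_version, mandate_id, principal_id,
               timestamp, action, outcome):
        record = {
            "agent_id": agent_id, "agent_version": agent_version,
            "mandate_id": mandate_id, "principal_id": principal_id,
            "timestamp": timestamp, "action": action, "outcome": outcome,
        }
        prev_hash = self._entries[-1]["hash"] if self._entries else "genesis"
        self._entries.append({"record": record, "prev": prev_hash,
                              "hash": self._digest(prev_hash, record)})

    def verify(self):
        """Recompute the chain; an edited entry, or one removed from the
        middle, makes this return False."""
        prev_hash = "genesis"
        for entry in self._entries:
            expected = self._digest(prev_hash, entry["record"])
            if entry["prev"] != prev_hash or entry["hash"] != expected:
                return False
            prev_hash = entry["hash"]
        return True

    def query(self, start=None, end=None, **filters):
        """Filter by any record field and an optional [start, end) window."""
        return [e["record"] for e in self._entries
                if all(e["record"].get(k) == v for k, v in filters.items())
                and (start is None or e["record"]["timestamp"] >= start)
                and (end is None or e["record"]["timestamp"] < end)]


log = AuditLog()
log.append("agent-travel", "v1.4", "m-001", "user-42", 1000, "purchase hotel 250.00", "approved")
log.append("agent-travel", "v1.4", "m-001", "user-42", 1060, "purchase casino 50.00", "denied")
log.append("agent-saas", "v2.0", "m-007", "user-42", 1100, "renew subscription 12.00", "approved")
print(log.verify())
print(log.query(agent_id="agent-travel", start=1000, end=1100))
```

The `query` method is a linear scan, which is fine for a sketch; the requirement in practice is that the same by-agent, by-principal, by-mandate, and by-time lookups are indexed and fast during an incident.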
Fraud and risk monitoring for agent traffic
Agent traffic has a different shape from human traffic, and your existing monitoring is tuned for the human shape. Agents generate none of the human signals that fraud models lean on (mouse movement, typing cadence, a consumer device fingerprint), and they transact with machine timing and velocity. That means two things: your human-tuned models will misread legitimate agent traffic, and you need agent-specific signals to catch the abuse.
Monitor agent traffic as its own population. Velocity and timing patterns, deviation from a mandate's normal behavior, sudden changes in merchant or category mix, and clustering that suggests a compromised agent are the kinds of anomalies worth alerting on. The goal is to distinguish three things that look similar from the outside: a legitimate agent doing its job, an abusive bot wearing agent clothing, and a legitimate agent that has been compromised. Fold these into the same operational discipline you already run — the metrics framing in fraud operations KPIs and the modeling approach in account takeover detection both transfer, applied to an agent population rather than a human one.
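Two of the anomalies named above, a velocity spike and a sudden change in category mix, can be scored against each agent's own baseline. This is a deliberately simple statistical sketch, not a fraud model: the function names, the three-sigma multiplier, and the floor of five transactions are placeholder assumptions.

```python
from collections import Counter
from statistics import mean, pstdev


def velocity_anomaly(history_counts, current_count, k=3.0, min_floor=5):
    """Flag when the current window's transaction count exceeds
    mean + k * stdev of the agent's own past windows."""
    if len(history_counts) < 2:
        # Too little history for a baseline: fall back to the fixed floor.
        return current_count > min_floor
    threshold = max(mean(history_counts) + k * pstdev(history_counts), min_floor)
    return current_count > threshold


def category_shift(baseline_categories, recent_categories):
    """Share of recent transactions in categories never seen in the baseline."""
    if not recent_categories:
        return 0.0
    seen = set(baseline_categories)
    unseen = sum(n for category, n in Counter(recent_categories).items()
                 if category not in seen)
    return unseen / len(recent_categories)


hourly_counts = [2, 3, 2, 4, 3]          # this agent's usual hourly volume
print(velocity_anomaly(hourly_counts, 40))
print(velocity_anomaly(hourly_counts, 4))
print(category_shift(["airline", "hotel"],
                     ["hotel", "crypto", "crypto", "airline"]))
```

Neither signal alone separates a compromised agent from a legitimate one with a new task; they are inputs to the alerting and review process, evaluated per agent rather than against a human-traffic baseline.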
Liability and operational handoff boundaries
When an agent-initiated payment goes wrong, responsibility could sit with the user who set the mandate, the agent platform that executed it, the PSP or wallet that issued the credential, or the merchant that accepted it. Here is the honest position: liability for agent-initiated payments is unsettled and varies by scheme, contract, and jurisdiction. It is not yet resolved in law, and the scheme rules were not written for a cardholder who authorized an agent rather than a specific purchase. This article does not tell you where liability lands, because no one can tell you that with confidence today, and nothing here is legal advice.
What you can do is treat liability as a boundary to resolve contractually rather than a rule to assume. Map the handoff points — user to agent platform, platform to PSP or wallet, PSP to merchant — and for each, get the responsibility split written into the contract: who bears the loss when an agent exceeds its mandate, when an agent is compromised, when a mandate was revoked but a transaction slipped through, when delegation provenance cannot be proven. The deeper market discussion of the open liability question lives in agentic commerce; the governance instruction here is narrow and durable: do not operate on an assumed liability position, contract for it explicitly, and price the residual risk you cannot contract away.
Control reference table
The citable core of this framework. Each row is a control objective, what it bounds, where it is typically enforced, and what breaks if you skip it. Enforcement points and mechanics vary by provider — verify against your own stack.
| Control | What it limits | Where typically enforced | Failure mode if missing |
|---|---|---|---|
| Scoped mandate | The total authority an agent has: what, for whom, how much, how long | Credential/token issuance; mandate service | Agent operates with open-ended authority; no basis to bound or revoke |
| Spending and velocity limits | Per-transaction, period, count-per-window, and cumulative spend | Token controls; orchestration layer | Looping or compromised agent burns budget at machine speed |
| Merchant/category/method restrictions | Where and how an agent may spend | Token controls; network/scheme rules; orchestration | Agent spends at out-of-scope merchants or on disallowed rails |
| Agent identity and user binding | Which agent, which version, which principal, authenticated | Agent-identity layer; signed requests; token registry | Cannot attribute, cannot revoke one agent, cannot detect spoofing |
| Human-in-the-loop approval | Autonomy above a risk/amount threshold | Agent platform; approval surface | High-risk purchases execute with no human checkpoint |
| Kill-switch and revocation | Continued action by a mandate or agent after a stop decision | Token revocation; mandate service; orchestration | No way to stop a misbehaving agent fast; exposure keeps growing |
| Immutable audit log | Loss of a defensible record of every agent action | Logging/event pipeline (append-only store) | No dispute evidence, no forensics, no answer to a regulatory ask |
| Agent-traffic monitoring | Undetected abuse or compromise in agent traffic | Fraud/risk platform tuned for agents | Compromised or abusive agents blend into normal traffic |
| Idempotency on payment calls | Duplicate charges from agent retries or loops | API client; PSP idempotency keys | A retrying or looping agent double-charges |
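The idempotency row is the one control in the table with a purely mechanical implementation, so a short sketch may help. It assumes a PSP that replays the first response for a repeated key, as described in the sources for this article; the `FakePaymentApi` class is a stand-in written for this example, not a real client library, and deriving the key from a mandate id and a purchase intent id is one hypothetical convention.

```python
import hashlib


def idempotency_key(mandate_id, purchase_intent_id):
    """Derive a stable key so every retry of the same intent reuses it."""
    raw = f"{mandate_id}:{purchase_intent_id}".encode()
    return hashlib.sha256(raw).hexdigest()


class FakePaymentApi:
    """Stand-in for a PSP that replays the first response for a repeated key."""

    def __init__(self):
        self._responses = {}
        self.charges_created = 0

    def charge(self, amount_cents, key):
        if key in self._responses:
            # Same key seen before: return the saved result, create nothing.
            return self._responses[key]
        self.charges_created += 1
        response = {"charge_id": f"ch_{self.charges_created}",
                    "amount_cents": amount_cents}
        self._responses[key] = response
        return response


api = FakePaymentApi()
key = idempotency_key("m-001", "intent-789")

first = api.charge(2500, key)
retry = api.charge(2500, key)   # e.g. the agent retries after a timeout
print(first == retry, api.charges_created)
```

The key must be derived from the purchase intent rather than generated fresh per call; an agent that mints a new random key on every retry gets no protection from this control.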
What to ask before enabling agent payments
A due-diligence checklist for the conversation with any PSP, wallet, or agent platform before you switch agent payments on. Treat a vague or missing answer as a finding, not a formality.
- [ ] Mandate model. How is a mandate expressed, scoped, signed, and expired? Can you inspect and revoke an individual mandate?
- [ ] Limit granularity. Which limit types are supported — per-transaction, period, velocity (count/window), cumulative — and at which layer are they enforced (can the agent bypass them)?
- [ ] Merchant/category/method controls. Can you allow/block merchants, restrict categories, and constrain methods or rails? How are agent-routed transactions coded, and does that coding undermine category rules?
- [ ] Agent identity and authentication. How is an agent identified and authenticated on each action, and how is it bound to a specific principal? How do you detect a spoofed or compromised agent?
- [ ] Human-in-the-loop. Can you set approval thresholds, and what context does the approval surface show the human? Can you tune it to avoid rubber-stamping?
- [ ] Revocation latency. How fast does a kill-switch propagate from "revoke" to "cannot transact," and where is it enforced? Has it been tested?
- [ ] Audit export. Are agent actions logged immutably, and can you export and query them by agent, principal, mandate, and time?
- [ ] Liability terms. What does the contract say about responsibility when an agent exceeds its mandate, is compromised, or transacts after revocation? What is left to you?
- [ ] Monitoring hooks. Can you feed agent-traffic events into your own fraud and risk monitoring, with agent-specific signals?
- [ ] Sandbox and testing. Is there a sandbox to exercise mandates, limits, revocation, and failure modes before production — and can you run a revocation drill end to end?
Scope note
Agent payment specs and standards are evolving as of June 2026. Google AP2, Visa Intelligent Commerce and the Trusted Agent Protocol, Mastercard Agent Pay and Agentic Tokens, the Stripe Agent Toolkit, and model-vendor agent protocols are cited here as illustrative examples of where the industry is converging, not as endorsements, and each will change — verify the current behavior with the provider. This is operational governance guidance, not legal, regulatory, or scheme-rule advice. Liability for agent-initiated payments is unsettled and varies by scheme, contract, and jurisdiction; nothing here states what the rule is. The control framework, layered-limit design, control reference table, and checklist are PaymentBrief operator synthesis — verify mandate models, limit granularity, revocation latency, liability terms, and monitoring hooks with your specific PSPs, wallets, and agent platforms before relying on them.
Related references
- Agentic Commerce: When AI Agents Become Cardholders — the market context: what is live, why 3DS breaks for agents, MCC coding, and the open liability question this governance piece treats as a contractual boundary.
- AI Agents and Payment APIs: MCP, Stripe Agent Toolkit, and Mastercard — the protocol stack these controls wrap around.
- PSP and Acquirer Outage Failover Runbook — the same kill-fast and test-the-control discipline applied to processor incidents.
- Fraud Operations KPIs — the metrics framing to extend to an agent-traffic population.
- Account Takeover Detection with ML — the modeling approach for spotting compromised or impostor agents.
- AI Chargeback Representment Automation — the dispute-evidence discipline your agent audit logs feed.
For term definitions — authorization, MCC, PSP, and payment orchestration — see the Payments Glossary.
Sources & methodology (7)
Google's Agent Payments Protocol (AP2) establishes trust through Mandates — tamper-proof, cryptographically-signed digital contracts that serve as verifiable proof of a user's instructions, using an Intent Mandate (price, timing, and other conditions set upfront) and a Cart Mandate (an unchangeable record of exact items and price), signed by verifiable credentials
Cited as an illustrative example of a scoped-mandate model; AP2 is one evolving spec among several and the article does not depend on it.
Visa announced Intelligent Commerce (April 30, 2025) and launched the Trusted Agent Protocol (October 2025), an open framework in which agent requests carry cryptographically signed HTTP messages so merchants can verify agent identity and bind the signature to a specific domain and operation; users set spend parameters (merchant categories, caps, time windows) the agent must operate within
Used to illustrate agent-identity and scoped-credential controls; cited as an evolving example, not as a scheme rule or endorsement.
Mastercard Agent Pay (announced April 2025) issues Agentic Tokens scoped per AI service — one card can carry separate tokens for different agents, each scoped at issuance to merchant categories, spending caps, and time windows, independently revocable, with the agent never receiving the card number; agents are onboarded via a Know Your Agent registration process
Illustrative of per-agent scoped tokens and independent revocation; cited as an evolving example, not a rule operators must follow.
Stripe's Agent Toolkit and MCP server let an agent be granted a Stripe restricted API key scoped to exactly the operations it needs — a refund-only agent gets a key that can only call the refunds API — placing scope enforcement at the key-management layer rather than the network
Cited as a durable, available least-privilege control mechanic; specific tool coverage evolves.
Idempotency keys let a client safely retry a request after a connection error without performing the operation twice; Stripe saves the status code and body of the first request for a given key and returns the same result on subsequent requests with that key, which bounds the blast radius of an agent that retries or loops
Cited as a durable control mechanic against duplicate agent-initiated charges.
OAuth 2.0 uses scopes to limit an application's access to exactly the permissions it requests; the principle of least privilege is enforced by always including a scope parameter and having APIs check for the expected scopes, with requesting access without a scope treated as a security anti-pattern
Cited for the durable least-privilege / scoped-delegation principle underneath agent mandates, independent of any payment spec.
The control framework, layered-limit design, control reference table, and pre-enablement checklist in this article are PaymentBrief operator synthesis — illustrative governance guidance, not scheme rules, regulatory requirements, or legal advice; mandate models, limit granularity, revocation latency, and liability terms must be confirmed with your specific providers