What predictable agent pricing looks like, and why nobody is shipping it
In April 2026, a Cursor customer posted on X:
"my background agent burned through $100+ literally thinking itself in a loop with 0 output using codex. what are background agents for if we can't trust them to actually execute work for the token it spends"
That tweet is one of dozens we ran into while collating competitor pricing complaints this month. The same shape keeps repeating across Devin, Cursor, Lindy, Zapier, Viktor, and Claude Code. A customer signs up expecting one number, gets a different one, and can't figure out what they actually bought.
This post walks through the pattern and what we think a predictable plan has to look like for an AI agent in 2026.
The pattern, six ways
Devin (Cognition). Devin charges in "Agent Compute Units" at $2.25 each. A user on X put the question plainly:
"for the $20 plan it says you start with $20 of usage, with each ACU (Agent Compute Unit) being $2.25. as a user who's never tried Devin and wants to try, this is so confusing because does that mean i get ~10 prompts? 50 prompts? 2 prompts? who knows"
Cognition's own answer on Hacker News: "a typical frontend task is about 1-2 ACUs… really depends on the complexity." The vendor itself admits the unit is fuzzy.
Cursor. In July 2025, Cursor switched the flat $20 plan to token-based metering, plus silent "On-Demand" overages when you ran out. A getlago breakdown tracked one developer who paid "$131 total in July on Cursor, a 700% increase from the previous month" after the switch. Our Nairi vs Cursor agents page goes into what this looks like for teams.
Lindy. Per-seat metering at $49.99/user/month on the entry plan. From r/AI_Agents:
"lindy is good. it's also $49.99 a month for a single user. against the rest of your stack (claude, an email tool, a scheduler, whatever else) that adds up fast if you're running lean."
A prospeo.io aggregate review counted "expensive" as the single most common Lindy complaint at 42 mentions.
Zapier. Tasks, not Zaps. From a r/automation thread with 149 upvotes:
"The thing most people miss with Zapier is that it charges per task, not per Zap. A 3-step Zap running 500 times = 1,500 tasks billed. So you hit the plan limit 3x faster than expected."
bestautomationtools.ai catalogues "surprise charges in the $400 to $1,200 range after a misconfigured Zap looped overnight." And eesel.ai notes that every MCP tool call inside Zapier counts as two tasks, which is the gotcha for anyone trying to route an AI agent through Zapier MCP at volume.
Viktor. A Viktor co-founder wrote this on X in February (@frydwia, preserved verbatim):
"reduce how often your scheduled tasks run. more than 5 times per day is very expensive (just ask Viktor)… custom API keys wouldn't save much, we have almost no margin. we might introduce cheaper models soon."
A vendor publicly telling its own customers to run the product less is not a metering bug. It is the metering model working as designed. The Nairi vs Viktor page covers more.
Claude Code. Anthropic announced weekly rate limits in July 2025. The r/ClaudeAI thread that followed has 956 upvotes:
"After my usage limits reset, I sent one prompt. Within 12 minutes, it ate 21% of my 5-hour limit. I am on the 5x ($100) plan."
Different vendors, different units, same outcome: a buyer who can't predict the bill.
What's actually broken
Three things, all of them more engineering than business.
The metering unit is set by what's measurable, not by what a buyer budgets for. Tokens are easy for a model vendor to count. ACUs are easy for a cloud-VM operator to bill. Tasks are easy for an automation platform to log. None of these map cleanly to "how much will my team spend this month if we use this product the way we plan to." The unit is convenient for the seller, opaque to the buyer.
There is no per-agent cost dashboard anywhere. When a customer overspends, they almost never find out from the product. They find out from a credit card statement, an admin email, or a sudden rate-limit screen. From a top comment on the Cursor background-agents thread on r/cursor:
"The background agent stuff in Cursor has zero cost attribution by default so you have no idea what burned what until you're already out of quota."
Across all six products in this list, cost is a postmortem.
Metering changes mid-year. Cursor's July 2025 token shift. Anthropic's weekly rate limits. Cognition retiring the self-hosted Devin tier into the Windsurf rebrand. People who signed up for one plan woke up to a different one. From a Hacker News reaction to the Cursor change:
"Cursor raised 900M, are losing market share to claude code… AND they're decreasing the value of their product? Huge red flag… the timing (midnight on a US holiday) is not ideal."
The vendor can rewrite the contract. The customer cannot.
What predictable pricing has to look like
Usage-based pricing is not the problem. We use it ourselves. The problem starts when the usage unit, the budget cap, and the upgrade path are all opaque at once.
Three things have to be true for a usage-based plan to be predictable.
The unit has to be intuitive. A buyer should be able to read the pricing page and form a real estimate of monthly spend. Tokens fail this test. Compute units fail it. Tasks pass it on the simple end (one Zap fires N times a month) but fail it on the agent end, where one conversation can be one task or twenty depending on metering rules. The closest unit that holds up across agent products is one conversation with an agent, billed as one unit. That maps to the thing a buyer is actually buying: a useful exchange with a teammate.
Per-agent cost has to be observable, not postmortem. At minimum, every assistant turn should expose what it cost and what model it used. Today, on Nairi, every assistant message returned by the /docs/api/conversations/message-reference endpoint includes cost_usd, input_tokens, output_tokens, cache_read_tokens, cache_write_tokens, and model. Honest caveat: those fields are populated when the harness reports usage. Claude Code and OpenCode do today. Cursor's harness does not, and we tell customers that up front. A buyer-facing cost dashboard is on the roadmap; the data is queryable today via the API.
Mid-year metering changes have to be off the table. This one is mostly a covenant. The promise has to be: the unit you sign up for is the unit you renew on.
Nairi's current plans (see nairi.ai/pricing) are $20/month Hobby (3 agents, 100 tasks, 10 automation executions), $80/month Team (10 agents, 500 tasks), and custom Scale. A "task" is one conversation with an agent. Teammates are not metered, so adding the 11th or the 50th person in your Slack workspace does not change your bill. Self-hosted agents are unlimited on every tier, which gives any team an opt-out: if usage projects past the plan limits, the nairid runtime is open source and free to run on your own hardware.
It is not a flat per-org price, and we don't pretend otherwise. It is usage-based metering on a unit a buyer can actually estimate, with per-agent cost data exposed via API today, and no per-seat fees layered on top.
What this means in practice
If you are evaluating an AI agent for your team in 2026, three questions worth asking before signing.
- What is the metering unit, in plain English, and can I estimate this month's bill from your pricing page alone? If the answer involves the phrase "depends on complexity," you do not have a predictable plan.
- Can I see, per agent, how much my team spent last month? API or dashboard, either is fine. If the answer is "we are working on it," you are paying for visibility you do not have.
- What is the vendor's policy on changing the metering unit mid-contract? If there is no policy, assume the answer is "we will change it when our margin needs us to."
These are reasonable questions to ask any vendor selling you a Slack or Discord agent, including us. Our Nairi vs Devin page covers the ACU side of this in more detail, and Nairi vs Cursor agents covers the token-budget side.
Honest closing
Flat per-seat pricing is not the only honest model. Pure usage-based pricing, done well, can be just as predictable. We picked task-based metering with no per-seat fees because it is the simplest shape that lets a customer reason about a monthly bill without doing math we should be doing for them. Other shapes work too, as long as those three covenants hold: an estimable unit, observable per-agent cost, no mid-year rewrites.
What does not work is what most of the industry is shipping right now: convenient-for-the-vendor units, postmortem-only attribution, and metering models that can change at midnight on a US holiday.
That is the actual product gap. We are trying to close it.