Ask anyone selling agentic commerce services what the ROI is. You’ll get a framework. A slide deck. A directional metric. What you won’t get is a clean number tying their work to your revenue.

This is the thing nobody building in this space wants to say out loud: you can see the channel growing. You can’t prove your optimization is why your brand is in it.

And the distinction matters, because there’s a version of this argument that’s wrong. The measurement infrastructure isn’t absent. It’s partial.

Amazon’s Rufus Sponsored Prompts, which went billable in late March, gives you a report showing which shopper questions triggered your products, along with the impressions, clicks, and attributed sales, down to the ASIN. Shopify supports agentic storefronts across ChatGPT, Copilot, Google AI Mode, and Gemini, with orders flowing into the admin tagged by referral source. Shopify’s own docs note, though, that Google Analytics and custom pixels won’t fire in all agentic checkout cases; only server-to-server events track reliably. Tools like Profound are building GA4 integrations for AI visibility, though they frame it as correlation and ecosystem visibility, not causal attribution. Pieces of the system exist.

So what’s missing?

The causal link. The part where you can say: “We did X to optimize for AI agents, and it caused Y in revenue.”

Shopify has reported that AI-referred traffic is up 7x since January 2025, with AI-attributed orders up 11x. But those are channel-level aggregates. A brand could have done zero optimization work and still seen its AI-referred orders jump — because the channel itself grew that fast. When everything’s rising, every boat looks like it has a great captain.
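Here is the arithmetic as a minimal sketch. The 11x figure is the Shopify aggregate quoted above; the brand's order counts and the function name are invented for illustration. Dividing a brand's growth by the channel's growth tells you whether the brand outpaced the tide. It still tells you nothing about why.

```python
# Channel-level growth in AI-attributed orders (the Shopify aggregate quoted above).
CHANNEL_ORDER_GROWTH = 11.0


def relative_growth_index(orders_before: int, orders_after: int,
                          channel_growth: float = CHANNEL_ORDER_GROWTH) -> float:
    """Brand growth divided by channel growth.

    1.0 means the brand merely rode the channel; above 1.0 means it
    outpaced the channel. Neither value says what caused the difference.
    """
    if orders_before <= 0:
        raise ValueError("orders_before must be positive")
    brand_growth = orders_after / orders_before
    return brand_growth / channel_growth


# Hypothetical brand: 40 AI-referred orders a month back then, 440 now.
index = relative_growth_index(40, 440)
print(f"{index:.2f}")  # 1.00: an 11x jump fully explained by channel growth
```

An 11x increase looks spectacular on a slide. Against an 11x channel, it is par.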

That’s the gap. Not “can we see AI traffic?” but “can we prove our agency’s work is why we’re getting it?”

Think about what web analytics gave early SEO agencies. You could show a client: you were ranking #47 for this keyword, now you’re #12. Here’s the traffic increase. Here’s the revenue. Clean, attributable, undeniable. That feedback loop let SEO agencies prove their value fast enough to keep clients paying.

Agentic commerce doesn’t have that loop yet. “Share of model” — how often ChatGPT or Perplexity or Rufus recommends your product for a given query — is the closest thing. It’s useful. It’s directional. It’s also disconnected from the purchase. You can track that ChatGPT mentions your running shoe more often this month. You cannot track how many people bought that shoe because ChatGPT mentioned it.
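For concreteness, a rough sketch of how a share-of-model number might be computed from answers you have sampled yourself. The brand, queries, and answers are invented, and the substring match is a deliberate simplification; real tools presumably do something more careful. Note what the output is: a mention rate per query, with no purchase data anywhere in it.

```python
from collections import defaultdict


def share_of_model(samples: list[tuple[str, str]], brand: str) -> dict[str, float]:
    """Fraction of sampled answers per query that mention the brand.

    `samples` holds (query, answer_text) pairs collected by hand or script.
    Matching is a naive case-insensitive substring check.
    """
    totals: dict[str, int] = defaultdict(int)
    hits: dict[str, int] = defaultdict(int)
    needle = brand.lower()
    for query, answer in samples:
        totals[query] += 1
        if needle in answer.lower():
            hits[query] += 1
    return {query: hits[query] / totals[query] for query in totals}


# Hypothetical sampled answers; "Stride" is an invented brand.
samples = [
    ("best trail running shoe", "The Stride Ridge 2 is a solid pick for rocky trails."),
    ("best trail running shoe", "Consider the Stride Ridge 2 or the Peakline Vapor."),
    ("best trail running shoe", "Most runners like the Peakline Vapor for grip."),
    ("best trail running shoe", "STRIDE Ridge 2 gets good reviews for durability."),
    ("waterproof running shoe", "The Peakline Storm is fully waterproof."),
    ("waterproof running shoe", "Look at the Northfell Gore model."),
]

for query, share in share_of_model(samples, "Stride").items():
    print(f"{query}: {share:.0%}")
```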

There are case studies floating around. A B2B marketing firm called Broworks says 10% of its organic traffic now comes from generative engines, with 27% of that converting to sales-qualified leads. An AEO agency called The Optimist claims a 4,900% revenue increase from LLM-referred sources for a retail tech client. Alhena, which sells AI shopping assistant software, published a study of 329 brands showing that accurate AI product descriptions drive 2–4x conversion lifts.

I’ll be honest about what these are. The Broworks numbers are self-reported by the agency that did the work. The Optimist figure is aggressive and the methodology isn’t public. Alhena is measuring the impact of Alhena’s own product on Alhena’s own customers. When tool vendors publish case studies showing their tool worked, that’s marketing, not proof. A skeptical CMO will notice.

None of which means the underlying signal is fake. Some vendor datasets show LLM-referred traffic converting well — Alhena reports 2.47%, ahead of paid search and Google Shopping in its sample. But at least one other study across 973 ecommerce sites found LLM referrals converting worse than Google Search. The numbers depend on who’s counting, how, and which brands are in the dataset. The channel is small and growing fast. The conversion story is not settled.

So the signal is real, if unsettled. The part that isn’t solid is proving that your $10K/month agency is why your brand shows up in the results.

So when a brand hires someone to “optimize for AI agents,” the CMO will ask for results within 90 days. And the honest answer is: we can show you what AI agents are saying about your products, where you’re showing up, and where your competitors are beating you. We can fix the data problems that make agents get your products wrong. What we can’t do is hand you a dashboard that says “our work generated $X in AI-attributed revenue this month.”

That dashboard doesn’t exist yet. For anyone.

Here’s why I think you should invest anyway.

The Alhena data — vendor-funded or not — points at something real: the accuracy of what AI says about your product is the lever. When an agent describes your product correctly, people buy. When it gets it wrong, they bounce. And every wrong description compounds. Bad data in AI feeds is stickier than bad data in search results, because agents don’t show ten options to evaluate. They show one. If it’s not yours, you never know you lost.

The parallel isn’t 2003 SEO. It’s 2003 product data.

The brands that invested in clean, structured product feeds for Google Shopping and Amazon back then didn’t do it because they could prove the ROI in Q1. They did it because they understood the channel was going to matter, and the cost of having bad data when the channel scaled was higher than the cost of being early.

Structured product data that agents can parse. Accurate attributes. Clean reviews. Schema that surfaces the right information. Protocol registration so agents can find you. None of this is wasted even if agentic commerce grows slower than the hype suggests. It makes your products better everywhere — on Amazon, on your DTC site, in Google Shopping, in email campaigns.
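To make "structured product data that agents can parse" less abstract, here is a minimal sketch that turns a catalog record into schema.org `Product` markup as JSON-LD. The schema.org types and properties (`Product`, `Brand`, `Offer`, `PropertyValue`, `additionalProperty`) are real; the product, the input field names, and my choice of which fields count as required are assumptions for illustration, not a standard any agent has published.

```python
import json

# Assumed minimum for a usable record; adjust to your own catalog.
REQUIRED_FIELDS = ("name", "sku", "brand", "description", "price", "currency")


def product_jsonld(product: dict) -> str:
    """Render a catalog record as schema.org Product JSON-LD."""
    missing = [field for field in REQUIRED_FIELDS if not product.get(field)]
    if missing:
        raise ValueError(f"incomplete product record, missing: {missing}")
    doc = {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": product["name"],
        "sku": product["sku"],
        "brand": {"@type": "Brand", "name": product["brand"]},
        "description": product["description"],
        "offers": {
            "@type": "Offer",
            "price": product["price"],
            "priceCurrency": product["currency"],
        },
        # Free-form specs go in as name/value pairs.
        "additionalProperty": [
            {"@type": "PropertyValue", "name": name, "value": value}
            for name, value in product.get("attributes", {}).items()
        ],
    }
    return json.dumps(doc, indent=2)


# Hypothetical product record.
example = {
    "name": "Stride Ridge 2",
    "sku": "SR2-BLK-10",
    "brand": "Stride",
    "description": "Trail running shoe with an 8 mm drop.",
    "price": "129.00",
    "currency": "USD",
    "attributes": {"drop_mm": "8", "weight_g": "260", "waterproof": "no"},
}
print(product_jsonld(example))
```

The validation step is the point as much as the output: a record that fails it is a record an agent will have to guess about.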

The brands that wait for perfect measurement will wait until their competitors have already built the infrastructure. Then they’ll hire someone to catch up at three times the cost.

Three things you can do now that don’t require faith in a specific timeline:

  1. Audit what AI agents actually say about your top 20 products. Ask ChatGPT, Perplexity, and Rufus to recommend products in your category. If they’re getting your products wrong — or ignoring them entirely — you have a data problem, not a strategy problem. That part is provable today.
  2. Fix your product data for agents, not just humans. Structured attributes, complete specs, clean reviews. This helps everywhere, not just AI channels. And it’s the one investment where the before-and-after is visible — you can re-query the same AI surface and see the description change.
  3. Use the measurement that does exist. Shopify’s AI order tagging, Rufus Sponsored Prompts reporting, server-side referral data. It’s not GA4. But it’s enough to know if the channel is real for your category, and enough to stop someone from telling you “we can’t measure anything” when that’s not quite true either.
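The audit and the before-and-after check from the first two steps can be sketched in a few lines. This assumes you have read an agent's answer and transcribed the attributes it claimed by hand; the function name, the product, and the values are all hypothetical. It compares those claims with your catalog and reports what is wrong and what is missing.

```python
def audit_description(catalog: dict[str, str], claimed: dict[str, str]) -> dict[str, list[str]]:
    """Compare what an agent said about a product with the catalog record.

    Returns the attributes the agent got wrong and the ones it left out.
    Values are compared as trimmed, case-insensitive strings.
    """
    def norm(value: str) -> str:
        return value.strip().lower()

    wrong = sorted(
        key for key in catalog
        if key in claimed and norm(claimed[key]) != norm(catalog[key])
    )
    missing = sorted(key for key in catalog if key not in claimed)
    return {"wrong": wrong, "missing": missing}


# Hypothetical catalog facts for one product.
catalog = {"drop_mm": "8", "weight_g": "260", "waterproof": "no"}

# What the agent claimed before and after the product data was fixed,
# transcribed by hand from its answers.
before = {"drop_mm": "10", "waterproof": "yes"}
after = {"drop_mm": "8", "weight_g": "260", "waterproof": "No"}

print(audit_description(catalog, before))
# {'wrong': ['drop_mm', 'waterproof'], 'missing': ['weight_g']}
print(audit_description(catalog, after))
# {'wrong': [], 'missing': []}
```

This proves the description changed. It does not prove the change sold anything, which is the gap the rest of this piece is about.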

Nobody can prove that optimization drives revenue in this channel. Not yet. The brands that will win are the ones honest enough to say that, and disciplined enough to build the foundation before the proof arrives.