3x Worse, 1.5x Better: The Only Two Numbers That Matter in AI Shopping Right Now

Two numbers came out in the last few weeks that, put together, explain pretty much everything happening in agentic commerce architecture.

First one: Walmart’s checkout data from ChatGPT showed that native AI checkout converted 3x worse than Walmart.com. Three times worse. This was CNBC reporting in March, citing actual performance data from one of the largest retailers on earth.

Second one: Criteo, the first ad tech company to partner formally with ChatGPT, found that LLM referral traffic converts 1.5x better than other referral channels.

So AI checkout is a disaster. But AI referral is excellent. Read those two numbers together and the whole architecture question basically answers itself.

Agents are extraordinary shopping assistants. They’re terrible cashiers.

OpenAI launched Instant Checkout in late 2025 with a lot of fanfare. The idea was simple: find something in ChatGPT and buy it without leaving. The problem is it never got going. Only about 30 Shopify merchants had activated it in six months, which is not a great sign for a product with OpenAI’s distribution. They killed it.

I don’t think they killed it because the technology was bad. I think they killed it because the numbers were bad. The Walmart figures are a pretty good explanation. When you hand the transaction layer to the AI platform, conversion tanks, and not by a little: by enough that the math on the whole model stops working.

Think about what an AI conversation actually does well. It gathers context. It narrows options. It builds purchase confidence. By the time someone finishes a good product research conversation with ChatGPT or Claude, they know what they want. That’s exactly when a familiar, trusted checkout flow should close the deal — not a novel in-chat transaction experience they’ve never used before.

You don’t hand a warm lead to a confused stranger at the door.

Walmart clearly learned this fast. Instead of letting OpenAI own the checkout, they built Sparky — their own retailer-branded app running inside ChatGPT. The agent handles discovery. The transaction happens in Walmart’s environment: Walmart’s checkout flow, Walmart’s trust signals, saved payment info the customer already has on file. It runs at 70% of direct Walmart.com conversion. That’s the gap between “this works” and “this gets killed.”
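To make that gap concrete, here is a rough back-of-the-envelope sketch in Python. The "3x worse" and "70%" ratios are the figures quoted above; the 3% baseline conversion rate and the 10,000-session volume are made-up placeholders for illustration, not Walmart data.

```python
# Back-of-the-envelope comparison of checkout models.
# ASSUMPTIONS (hypothetical, illustration only): the baseline rate and session count.
BASELINE_CONVERSION = 0.03  # hypothetical direct-site conversion rate
SESSIONS = 10_000           # hypothetical number of shopping sessions

# Conversion relative to the direct site, using the ratios quoted in the article.
RELATIVE_CONVERSION = {
    "direct site": 1.0,
    "native AI checkout (3x worse)": 1 / 3,
    "merchant-owned checkout inside the agent (70%)": 0.70,
}


def expected_orders(relative: float,
                    baseline: float = BASELINE_CONVERSION,
                    sessions: int = SESSIONS) -> float:
    """Orders expected from `sessions` visits at `relative` times the baseline rate."""
    return sessions * baseline * relative


if __name__ == "__main__":
    for label, relative in RELATIVE_CONVERSION.items():
        print(f"{label}: {expected_orders(relative):.0f} orders")
```

With these placeholder inputs the three cases come out to roughly 300, 100, and 210 orders. Whatever baseline you plug in, the merchant-owned flow lands at a bit more than twice the native checkout number, which is the whole argument in one ratio.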

Gap did essentially the same thing inside Gemini. The merchant owns the transaction layer. Google built the discovery surface. Gap captures the sale.

Every retailer figuring this out is doing the same thing: let the agent be the assistant, stay in charge of the close.

The Criteo data is the other half of the picture. A 1.5x conversion lift from LLM referral traffic isn’t surprising once you think about how someone actually arrives at your site from a ChatGPT conversation. They’ve already been through a research process. The agent helped them figure out what they need, compare options, ask follow-ups. By the time they click through to a product page, the decision is mostly made. They’re not browsing. They’re executing.

That’s a very different customer than someone who clicked a display ad or followed a social media link on a whim. The intent quality is higher. Purchase confidence is higher. Which means your existing site and your existing checkout work better on this traffic, because the agent already did the hard qualifying work upstream.

Shopify grew AI-assisted orders 15x without building a consumer-facing agent at all. What they did was make their merchant catalogs readable by agents — structured product data, clean descriptions, good schema. The agent ecosystem did the rest. That’s a company that understood which layer to own.

Constructor saw a 52% add-to-cart lift from brands that cleaned up their product data for agent readability. Same products, better structured data, and roughly one and a half times the add-to-cart rate from AI traffic. The agent can only recommend what it can clearly read.
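For a sense of what "structured product data" can look like in practice, here is a minimal sketch that emits a schema.org Product record as JSON-LD. The article doesn't say which formats Shopify or Constructor customers actually use, so schema.org is an assumption on my part, and the product, SKU, brand, and price below are hypothetical.

```python
import json


def product_jsonld(name: str, description: str, sku: str, brand: str,
                   price: float, currency: str, in_stock: bool) -> dict:
    """Build a schema.org Product description as a JSON-LD dictionary."""
    availability = "InStock" if in_stock else "OutOfStock"
    return {
        "@context": "https://schema.org",
        "@type": "Product",
        "name": name,
        "description": description,
        "sku": sku,
        "brand": {"@type": "Brand", "name": brand},
        "offers": {
            "@type": "Offer",
            "price": f"{price:.2f}",  # price as a plain decimal string
            "priceCurrency": currency,
            "availability": f"https://schema.org/{availability}",
        },
    }


if __name__ == "__main__":
    # Hypothetical product, for illustration only.
    record = product_jsonld(
        name="Example Trail Boot",
        description="Waterproof leather hiking boot with a lugged rubber outsole.",
        sku="EX-001",
        brand="ExampleBrand",
        price=129.0,
        currency="USD",
        in_stock=True,
    )
    print(json.dumps(record, indent=2))
```

The point is not this particular schema. It is that every field an agent might need (what it is, who makes it, what it costs, whether it is available) is stated explicitly rather than buried in marketing copy.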

I talk to brand teams a lot about where AI fits into their channel strategy. The question I keep coming back to is the same one this data raises: who owns the transaction layer in a world where AI owns the discovery layer?

On Amazon, Amazon owns both. That’s always been the deal — you’re renting shelf space and they control the customer relationship. Agentic commerce could actually change that dynamic. But only if brands move deliberately. If you let ChatGPT or Gemini own both discovery and checkout, you’ve recreated the Amazon problem somewhere new. If you use the AI discovery surface to drive traffic to your own checkout, you’ve got a better deal than you’ve had on Amazon in years. Better-qualified traffic, higher purchase intent, merchant-owned relationship.

That’s the strategic lever here. The agent sends. You close.

The architecture is settling into something that’s actually good for brands willing to invest in two things: structured product data so agents can accurately represent what you sell, and a checkout experience good enough to close warm traffic. Both are achievable. Neither requires building an AI agent, and neither requires a platform partnership you don’t control.

The trap is trying to compete on the discovery layer — building your own AI shopping experience, hoping customers find you there. Most brand-built AI shopping experiences are bad, because brands don’t have the breadth of data or training to do what ChatGPT or Perplexity does across millions of products. You’re not going to out-agent the agents.

The Walmart data and the Criteo data point to the same conclusion: the best position for most brands is to be extraordinarily findable by agents and very good at closing when the traffic arrives. That’s different from the old SEO playbook, but not entirely different. You still win by being the right answer. You still close by earning trust at the transaction.

The agent just changed what “being found” looks like.

Todd Piechowski

The Signal
