Essay · Design philosophy

Agents Don't Need a Catalogue. They Need Attribution.

Why a commerce MCP shouldn't try to be a product database, and why the best design call we made was cutting scope by half.

April 15, 2026 · Ge Jiaqi · ~5 min read

When we first sketched an MCP server for commerce, the obvious shape was: build a catalogue. Load hundreds of thousands of product SKUs, expose a full-text search tool, ship it. That was the natural instinct — agent asks for a product, we return products, classic.

It was also wrong. Here's why, and what we built instead.

1. The agent already has the catalogue

Look at what a modern agent — Claude, ChatGPT, Perplexity, Cursor, Gemini — can do without a commerce MCP installed. It can search catalogues and read marketplace listings on the open web, check prices, aggregate reviews, and compare competing products in the user's own language.

The one thing it cannot do is make the click trackable. It has no attribution infrastructure. It doesn't own a relationship with the merchant. It can show the user a Shopee listing, but when the user clicks, the click is a raw HTTPS request — no downstream revenue signal, no accounting that the sale originated from an AI recommendation.

If you are building an MCP for commerce and you rebuild the product-discovery surface the agent already has, you are spending 95% of your effort on something the user already gets for free. The remaining 5% — the attribution layer — is where you are actually needed.

2. Smaller surface, sharper moat

This realization changed our scope. Instead of building a catalogue, we built an attribution layer: a small MCP server that exposes six tools, all centered on the question "given this shopping intent, how do you make the click count?"

That's it. No catalogue search. No price history. No reviews aggregator. No competitor comparison. The agent does all of those using capabilities it already has; we make the final click work.
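To make "make the final click work" concrete, here is a minimal sketch of the core resolution step: given a shopping intent and a region, return a merchant plus a trackable link instead of a raw storefront URL. The merchant records, `go.example.com` domain, and function names are invented for illustration; they are not the actual server's API.

```javascript
// Illustrative merchant directory — not real data.
const MERCHANTS = [
  { id: "acme-th", name: "Acme Thailand", regions: ["TH"], homepage: "https://acme.example/th" },
  { id: "acme-sg", name: "Acme Singapore", regions: ["SG"], homepage: "https://acme.example/sg" },
];

// Wrap the storefront in a redirect link so the click carries attribution:
// the agent shows this URL, and the session tag rides along as aff_sub.
function trackableLink(merchantId, sessionTag) {
  const u = new URL(`https://go.example.com/go/${merchantId}`);
  u.searchParams.set("aff_sub", sessionTag);
  return u.toString();
}

// The resolution step: intent in, attributable offer out.
function resolveOffer(intent, region, sessionTag) {
  const merchant = MERCHANTS.find((m) => m.regions.includes(region));
  if (!merchant) return null;
  return { merchant: merchant.name, click_url: trackableLink(merchant.id, sessionTag) };
}
```

Everything upstream of this — understanding the intent, weighing alternatives — the agent already does; the server's job is only the last hop.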

This is a smaller surface area than most commerce infrastructure. It's also a sharper moat, because the thing we do well — neutral, region-aware, multilingual attribution plumbing — is structurally hard for a marketplace or a single merchant to build. They want their own attribution, not a neutral layer. We're the only party incentivized to be neutral.

3. What we don't build

Being explicit about what we don't build is as important as describing what we do. Here's our list of non-goals:

We don't scrape. Ever. Our data comes from partner catalogues that are explicitly redistributable. Every merchant in our directory is there because of a clean, sanctioned relationship — never through HTML parsing of a marketplace listing.

We don't compete on catalogue depth. Marketplaces like Shopee, Lazada, Taobao, JD, and Amazon already have millions of SKUs. We don't try to match that. We surface merchants, not SKUs; the agent goes to the merchant for the long tail.

We don't insert ourselves into the user's session state. We don't set cookies. We don't fingerprint. We don't build a user model. Our observability is click-level, anonymized, and bounded.

We don't try to be a UI. The agent's client (Claude Desktop, Cursor, whoever) is the UI. We return machine-readable data plus a nicely-formatted markdown card; the client decides how to render it.

We don't try to be a marketplace. We never hold inventory, never process payments, never own the customer relationship. Every sale happens on the merchant's own storefront under their own terms.

4. What this frees us to do well

Subtraction is what enables the things we do invest in:

Multilingual indexing. With a small, curated merchant set, we can maintain hand-crafted multilingual aliases (see Nine Languages in One Prompt) in a way that a catalogue-at-scale approach never could.
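A hand-curated alias table can be as simple as a map from surface forms to a canonical merchant key. The entries below (a well-known brand name in a few scripts) and the `matchMerchant` helper are illustrative, not our real index:

```javascript
// Curated multilingual aliases — small enough that each entry can be
// checked by a human speaker. Entries here are examples only.
const ALIASES = {
  "uniqlo": "uniqlo",
  "ユニクロ": "uniqlo", // Japanese
  "유니클로": "uniqlo", // Korean
  "优衣库": "uniqlo",   // Simplified Chinese
};

// Normalize, then do an exact lookup; no fuzzy matching needed at this scale.
function matchMerchant(query) {
  const key = query.trim().toLowerCase();
  return ALIASES[key] ?? null;
}
```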

Region-matching. Every merchant record carries explicit regions metadata. An agent querying from Thailand gets Thailand-shipping options; one querying from Singapore gets Singapore options. This is only tractable because the surface is small enough to label honestly.
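Because every record carries a `regions` array, region-matching reduces to an honest filter — no "probably ships there" guesses. A minimal sketch, with invented directory data:

```javascript
// Illustrative directory; each record's regions field is explicit metadata.
const DIRECTORY = [
  { name: "Shop A", regions: ["TH", "SG", "MY"] },
  { name: "Shop B", regions: ["SG"] },
  { name: "Shop C", regions: ["TH"] },
];

// Return only merchants explicitly labeled for the caller's region.
function merchantsFor(region, directory = DIRECTORY) {
  return directory.filter((m) => m.regions.includes(region));
}
```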

Clean observability. Every click through our /go/ redirect is logged with UA, ASN, country, and the agent session's aff_sub tag. We see patterns in how agents route users; we see latency issues; we see the beginnings of attribution data. This only works because we're the only thing in the middle of the click.
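The redirect itself can be sketched as a plain function. The logged fields (UA, ASN, country, aff_sub) follow the description above, but the handler shape, destination table, and log format below are assumptions for illustration, not the real worker code:

```javascript
// Illustrative destination table — merchant id to storefront URL.
const DESTINATIONS = { "acme-th": "https://acme.example/th" };

// /go/ redirect: log one click-level record, then 302 to the merchant.
// Click-level, anonymized, bounded: no cookies, no fingerprint, no user model.
function handleGo(requestUrl, headers, geo, log = []) {
  const url = new URL(requestUrl);
  const merchantId = url.pathname.replace(/^\/go\//, "");
  const dest = DESTINATIONS[merchantId];
  if (!dest) return { status: 404 };

  log.push({
    merchant: merchantId,
    ua: headers["user-agent"] ?? "",
    asn: geo.asn,
    country: geo.country,
    aff_sub: url.searchParams.get("aff_sub"), // the agent session's tag
    ts: Date.now(),
  });

  return { status: 302, location: dest };
}
```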

A story merchants understand. When we approach a merchant about direct onboarding, the pitch is easy: "we're a neutral agent-facing layer; you get an attribution-tracked share of agent-originated sales; you don't have to build an MCP server." Simple.

5. The broader lesson

If there's a pattern here, it's: the right scope for infrastructure in the agent era is the one the agent doesn't already have.

For commerce, that's attribution + region + multilingual access — not catalogue, not reviews, not comparison tables. For travel, it's probably live availability + booking attribution, not itinerary building. For research, it's citation tracking + canonical URL resolution, not summarization. In each case, the sharp question is "what does the agent need that it can't do itself, that has real stakes outside its context window?"

Infrastructure built that way is small, defensible, and composes well with everything else the agent can do. Infrastructure built to duplicate what agents already do well ends up overbuilt and under-used.

"Agent-native" doesn't mean "what would an agent want to do, built as an API." It means "what does an agent not yet have, that it needs to do its job for its user." Those are different questions. The second one is where the interesting work is.

If you want to see how small this actually ends up being: our entire MCP server is under 600 lines of hand-written JavaScript, running on a single edge worker, serving queries in under 100ms. The docs are at github.com/Nimo1987/xurprise-mcp-docs. Everything else, we leave to the agent.


— Ge Jiaqi · April 15, 2026 · more posts