Mercora

Live
Next.js 15 · Cloudflare Workers · D1 · R2 · Vectorize · Cloudflare AI · Stripe · Drizzle ORM · shadcn/ui

Overview

Mercora is a production-ready, AI-enhanced e-commerce platform specializing in outdoor gear, built entirely on Cloudflare’s edge infrastructure. It started as an exploration of whether a full-featured online store — AI shopping assistant, payment processing, admin dashboard, inventory management — could run at the edge without traditional cloud servers. The answer is yes, and the result is a platform that serves pages in under 100ms globally with zero origin servers.

What makes Mercora technically interesting isn’t just the e-commerce platform itself. It’s what sits on top of it: a 17-tool Model Context Protocol (MCP) server that turns the entire store into a set of capabilities any AI agent can use. An AI assistant in Claude Desktop, Cursor, or VS Code can search products, manage a cart, place orders, and coordinate fulfillment — all through standardized tool calls without a browser. This is the agentic commerce pattern: shopping as an API that AI agents consume, not a website humans visit.

The platform also features Volt, a conversational AI shopping assistant with semantic product search, anti-hallucination guardrails, and a personality system. A comprehensive admin dashboard provides product management, order processing, AI-powered analytics, content management, and promotion management with a MACH Alliance-compliant data model underneath.

MCP Server: Agentic Commerce

The MCP integration is the most forward-looking piece of Mercora. Instead of treating e-commerce as a browser-based experience, the MCP server exposes the entire commerce workflow as tools that any MCP-compatible AI client can call.

17 Production Tools

The server implements four categories of tools across 17 endpoints:

Commerce tools handle product discovery: search_products runs semantic vector search with agent context, assess_request evaluates what the store can fulfill from a multi-item request (and flags what needs to be sourced elsewhere), and get_recommendations provides context-aware suggestions based on the agent’s understanding of the user.

Cart management covers the full lifecycle: add_to_cart, bulk_add_to_cart, update_cart, remove_from_cart, clear_cart, and get_cart with real-time total estimation. Cart state persists in D1-backed sessions, so an agent can build a cart across multiple conversation turns.

Order processing handles get_shipping_options, validate_payment, place_order, and get_order_status — the complete purchase flow from shipping calculation through order tracking.

Agent administration enables create_agent, list_agents, get_agent_details, and update_agent_status for managing which AI agents have access and what rate limits apply.

Multi-Agent Architecture

The MCP server is designed for a future where personal AI shopping agents coordinate across multiple retailers. Each agent authenticates with an API key, carries user context (budget, preferred brands, activities, experience level), and maintains persistent sessions. The assess_request tool is specifically built for multi-site coordination — it returns what Mercora can fulfill, what it can’t, and suggests alternatives, so an orchestrating agent can route unfulfilled items to other retailers.
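The fulfillment-partitioning step behind assess_request can be sketched as a pure function. The types and names below are assumptions for illustration, not Mercora's actual schema:

```typescript
interface RequestedItem { name: string; quantity: number }

interface Assessment {
  fulfillable: RequestedItem[];
  unfulfillable: RequestedItem[];
}

// Partition a multi-item request against the local catalog so an
// orchestrating agent can route the remainder to other retailers.
function assessRequest(items: RequestedItem[], catalog: Set<string>): Assessment {
  const fulfillable: RequestedItem[] = [];
  const unfulfillable: RequestedItem[] = [];
  for (const item of items) {
    (catalog.has(item.name) ? fulfillable : unfulfillable).push(item);
  }
  return { fulfillable, unfulfillable };
}
```

The real tool also suggests alternatives for unfulfillable items; the sketch keeps only the routing decision.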

Agent context flows through every tool call via the X-Agent-Context header, validated and size-limited (max 1024 bytes). Rate limiting is per-agent with configurable requests-per-minute and operations-per-hour, tracked in D1 with upsert counters.
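A minimal sketch of the header validation, assuming the context is JSON (function and error messages are illustrative, not the production code):

```typescript
const MAX_CONTEXT_BYTES = 1024;

// Reject oversized or malformed agent context before it reaches any tool handler.
function parseAgentContext(header: string | null): Record<string, unknown> | null {
  if (header === null) return null;
  if (new TextEncoder().encode(header).length > MAX_CONTEXT_BYTES) {
    throw new Error("X-Agent-Context exceeds 1024 bytes");
  }
  try {
    return JSON.parse(header) as Record<string, unknown>;
  } catch {
    throw new Error("X-Agent-Context is not valid JSON");
  }
}
```

Measuring encoded bytes rather than string length matters because multi-byte characters would otherwise slip past the limit.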

HTTP REST Transport

MCP typically uses stdio or WebSocket transport. Workers' stateless, request-scoped execution model makes long-lived connections impractical without extra machinery, so the server uses HTTP REST instead — each tool call is a POST to /api/mcp with a tool field routing to the appropriate handler. This trades the streaming benefits of WebSocket for compatibility with how Workers execute. Session state that would normally live in a WebSocket connection persists in D1 instead.
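The dispatch pattern can be sketched like this. In production each registry entry would be a dynamic import() of the handler module so only exercised code paths load; the inline handlers here are placeholders, and all names are assumptions:

```typescript
type ToolHandler = (args: Record<string, unknown>) => Promise<unknown>;

// Each entry lazily produces a handler — stand-in for dynamic import() per tool.
const toolRegistry: Record<string, () => Promise<ToolHandler>> = {
  get_cart: async () => async () => ({ items: [], total: 0 }),
  search_products: async () => async (args) => ({ query: args.query, results: [] }),
};

// POST /api/mcp body: { tool, arguments } — route by the tool field.
async function handleMcpPost(body: { tool: string; arguments?: Record<string, unknown> }) {
  const load = toolRegistry[body.tool];
  if (!load) return { error: `Unknown tool: ${body.tool}` };
  const handler = await load();
  return { result: await handler(body.arguments ?? {}) };
}
```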

Volt: AI Shopping Assistant

Volt is the customer-facing AI assistant — a conversational interface for product discovery and outdoor gear advice, powered by semantic search and a carefully constrained generation pipeline.

Semantic Search Pipeline

When a user asks Volt a question, the pipeline works in phases. First, the question is embedded into a 768-dimensional vector using BGE-base-en-v1.5. That vector queries a Cloudflare Vectorize index containing 38 items (30 products + 8 knowledge articles), returning the top 7 matches with metadata. Product IDs and text snippets are extracted from the results to provide context for the language model.

The language model (GPT-OSS-20B, running on Workers AI) generates a response with the vector search context injected into the system prompt. The model is configured with temperature: 0.1 for chat to prioritize consistency over creativity — you don’t want a shopping assistant inventing products.
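The retrieval phase can be sketched with the bindings reduced to interfaces (the interface shapes are assumptions; the real code talks to Workers AI and Vectorize directly):

```typescript
interface EmbeddingModel {
  embed(text: string): Promise<number[]>; // 768-dim BGE vector in production
}

interface VectorIndex {
  query(vector: number[], opts: { topK: number }): Promise<{ id: string; score: number }[]>;
}

// Embed the question, pull the top 7 matches, and hand their IDs to the
// prompt-building step that injects context into the system prompt.
async function retrieveContext(
  question: string,
  model: EmbeddingModel,
  index: VectorIndex
): Promise<string[]> {
  const vector = await model.embed(question);
  const matches = await index.query(vector, { topK: 7 });
  return matches.map((m) => m.id);
}
```

Injecting the model and index as interfaces keeps the pipeline testable without live Cloudflare bindings.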

Anti-Hallucination System

The core problem with AI shopping assistants is hallucination: the model confidently recommends products that don’t exist, invents specifications, or claims features the actual product doesn’t have. Mercora addresses this at multiple levels.

The system prompt includes strict rules: only recommend products from the provided context, never mention products not in the vector search results, and if no products match the query, provide general advice instead of forcing irrelevant recommendations. The model is instructed to be a “selective product curator, not a product catalog” — recommending 1-4 relevant items rather than listing everything.

On the output side, the system parses the AI response for bold-formatted product names, maps them back to actual product IDs from the vector results, and only returns products the AI explicitly recommended. If the AI mentions products in bold but they can’t be mapped to real products, the system returns zero products rather than falling back to the raw vector results. This prevents the scenario where the AI hallucinates a product name and the system shows unrelated products as “recommendations.”
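The output-side guardrail reduces to a small mapping function. This is a sketch under the assumption that the vector results provide a name-to-ID lookup:

```typescript
// Extract **bold** product names from the model output and keep only those
// that map back to products returned by the vector search. Unverifiable
// names yield nothing — never a fallback to raw vector results.
function mapRecommendations(
  aiResponse: string,
  vectorProducts: Map<string, string> // product name -> product ID
): string[] {
  const bold = [...aiResponse.matchAll(/\*\*(.+?)\*\*/g)].map((m) => m[1]);
  return bold.flatMap((name) => {
    const id = vectorProducts.get(name);
    return id === undefined ? [] : [id];
  });
}
```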

Personality and Context

Volt has a defined personality — a gruff but good-hearted outdoor gear expert with dry humor and genuine enthusiasm for the wilderness. A flair system adds random wisdom/quips to 30% of responses for character depth. Easter eggs handle specific queries (s’mores recipe, unicorn mentions) with canned personality-forward responses.
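The flair mechanism is simple probability gating. A sketch with an injectable RNG so the 30% behavior is testable (names and quip text are illustrative):

```typescript
// Append a random quip to roughly 30% of responses for character depth.
function maybeAddFlair(
  response: string,
  quips: string[],
  rand: () => number = Math.random
): string {
  if (quips.length === 0 || rand() >= 0.3) return response;
  const quip = quips[Math.floor(rand() * quips.length)];
  return `${response}\n\n${quip}`;
}
```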

User context enriches every interaction: Cloudflare geo headers provide location data, purchase history from previous orders prevents recommending already-owned products, and customer tier (identified via Clerk authentication) adjusts recommendation sophistication.

MACH-Compliant Data Model

The database layer follows MACH Alliance principles (Microservices, API-first, Cloud-native, Headless). The data access layer in lib/models/mach/ mirrors commercetools-style schemas: products with typed variants, structured pricing with currency objects, hierarchical categories, promotion rules with stacking logic, inventory tracking, and media management.

This isn’t just academic adherence to a spec. The MACH structure means the data model cleanly separates from the presentation layer, which is why the same product data serves the Next.js storefront, the admin dashboard, the Volt AI assistant, and the MCP server without adapter logic. A product is a product regardless of which interface is consuming it.

The promotion system supports percentage discounts, fixed amount discounts, free shipping, and category-scoped promotions with minimum order thresholds and discount code stacking — all validated server-side with the same logic whether the order comes from the web checkout or an MCP tool call.
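The core discount math can be sketched as one server-side function shared by every channel. The types below are assumptions, not Mercora's schema, and stacking is omitted for brevity:

```typescript
interface Promotion {
  kind: "percentage" | "fixed" | "free_shipping";
  value: number;          // percent (0-100) or a fixed amount in cents
  minOrderCents?: number; // optional minimum order threshold
}

// Same logic whether the order arrives from web checkout or an MCP tool call.
function applyPromotion(subtotalCents: number, shippingCents: number, promo: Promotion) {
  if (promo.minOrderCents !== undefined && subtotalCents < promo.minOrderCents) {
    return { discountCents: 0, shippingCents }; // threshold not met
  }
  switch (promo.kind) {
    case "percentage":
      return { discountCents: Math.round((subtotalCents * promo.value) / 100), shippingCents };
    case "fixed":
      return { discountCents: Math.min(promo.value, subtotalCents), shippingCents };
    case "free_shipping":
      return { discountCents: 0, shippingCents: 0 };
  }
}
```

Working in integer cents avoids floating-point drift in money math.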

Admin Dashboard

The admin interface is a full back-office system: product CRUD with bulk editing, order management with status workflow, category management with hierarchical organization, promotion management, knowledge base management for Volt’s support content, CMS page management, admin user management with role-based access, and settings configuration.

AI-powered analytics generate natural language business intelligence summaries from order and product data. The admin can generate product descriptions and knowledge articles using AI content generation tools, configured with higher temperature (0.8) for marketing creativity versus the low temperature (0.1-0.3) used for chat and analytics.

Authentication is multi-layered: Clerk handles customer authentication, while admin access uses a separate database-driven authentication system with role-based security and production/development mode switching.

Edge Infrastructure

Next.js on Workers

Running Next.js 15 on Cloudflare Workers via OpenNext is non-trivial. Workers have execution time limits, memory constraints, and a 128KB response body limit for chunked responses. The platform handles this through careful response chunking, lazy module imports (each MCP tool handler is dynamically imported only when called), and session state externalized to D1 rather than held in memory.

Storage Architecture

D1 stores all structured data: products with variants, orders, users, admin users, MCP sessions, MCP agent registrations, rate limit counters, promotions, pages, and settings. Drizzle ORM provides type-safe queries with migration management.

R2 stores product images, category images, product descriptions as markdown files (vectorized for AI search), and knowledge base articles. Cloudflare image transforms optimize serving.

Vectorize maintains a 38-item index with 768-dimension BGE embeddings covering all products and knowledge articles. The index is rebuilt via admin API endpoints that read markdown content from R2, generate embeddings, and upsert vectors.
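The rebuild loop can be sketched with the R2, embedding, and Vectorize dependencies abstracted to interfaces (all names here are assumptions for illustration):

```typescript
interface MarkdownStore { list(): Promise<string[]>; read(key: string): Promise<string> }
interface Embedder { embed(text: string): Promise<number[]> }
interface SearchIndex { upsert(vectors: { id: string; values: number[] }[]): Promise<void> }

// Read every markdown document, embed it, and upsert the vectors in one batch.
async function rebuildIndex(
  store: MarkdownStore,
  embedder: Embedder,
  index: SearchIndex
): Promise<number> {
  const keys = await store.list();
  const vectors: { id: string; values: number[] }[] = [];
  for (const key of keys) {
    vectors.push({ id: key, values: await embedder.embed(await store.read(key)) });
  }
  await index.upsert(vectors);
  return vectors.length;
}
```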

KV is not used in Mercora (unlike RecompAI) — the product catalog is small enough that direct D1 queries through Drizzle's query builder stay fast without a caching layer.

Payments

Stripe handles the complete payment flow: payment intents, real-time tax calculation via Stripe Tax, webhook processing for order fulfillment, and test card support for development. All Stripe communication happens server-side on Workers. The checkout flow supports discount code validation with the same promotion engine used across all sales channels.

Tech Stack

  • Next.js 15 — App Router with server and client components. Server components for storefront rendering, client components for cart, chat interface, and admin dashboard interactivity.
  • Cloudflare Workers — Hosts the entire application via OpenNext adapter. API routes, MCP server, AI inference, and static asset serving all run at the edge.
  • D1 — Relational data store for products, orders, users, MCP sessions/agents, promotions, pages, and admin configuration.
  • R2 — Object storage for product images, knowledge base markdown, and marketing assets.
  • Vectorize — 38-item semantic search index with BGE-base-en-v1.5 embeddings (768 dimensions) for product discovery and knowledge retrieval.
  • Cloudflare AI — GPT-OSS-20B for text generation (chat, analytics, content), BGE for embeddings. Per-use-case temperature and token configuration.
  • Stripe — Payment processing with Stripe Tax for real-time tax calculation, webhook-driven order fulfillment, and discount code validation.
  • Clerk — Customer authentication with secure session management.
  • Drizzle ORM — Type-safe database queries with MACH-compliant schema definitions and automatic migration management.
  • shadcn/ui — Component library for consistent UI across storefront and admin dashboard.

Outcomes

  • 17-tool MCP server enabling agentic commerce — AI agents can search, cart, and purchase through standardized tool calls without a browser
  • Multi-agent architecture with API key authentication, per-agent rate limiting, persistent sessions, and cross-site fulfillment assessment
  • Anti-hallucination pipeline ensuring the AI only recommends real products from verified vector search results
  • MACH-compliant data model serving the same product data to four interfaces (storefront, admin, Volt, MCP) without adapter logic
  • Full admin dashboard with AI-powered analytics, content generation, order management, and promotion management
  • Sub-100ms page loads globally via edge-rendered pages with zero origin servers
  • Production payment processing with Stripe integration, real-time tax calculation, and multi-channel discount validation
  • Complete e-commerce platform running entirely on Cloudflare Workers, D1, R2, Vectorize, and Workers AI

See the LinkedIn article I wrote:

Mercora Project: Building a Serverless AI-Powered eCommerce Platform in 7 days (linkedin.com)