REST API

Complete endpoint reference for the ShopGraph extraction API.

Base URL: https://shopgraph.dev

Authentication

Three authentication methods are supported:

| Method | Header | Notes |
| --- | --- | --- |
| API Key | Authorization: Bearer sg_live_... | For Starter, Growth, and Enterprise tiers |
| Stripe MPP | X-Payment-Method: pm_... | Machine-to-machine payments for agents |
| Free tier | None | 50 full-pipeline calls/month, no signup needed |
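
As a minimal sketch, a client might build its request headers like this. The helper name `build_headers` and its parameters are hypothetical; only the header names and the `sg_live_` and `pm_` prefixes come from the table above.

```python
from typing import Optional


def build_headers(api_key: Optional[str] = None,
                  payment_method: Optional[str] = None) -> dict:
    """Return HTTP headers for a ShopGraph request (hypothetical helper).

    Pass an API key, a Stripe payment-method ID, or neither.
    With neither, no authentication header is sent (free tier).
    """
    headers = {"Content-Type": "application/json"}
    if api_key is not None:
        # API key auth: Authorization: Bearer sg_live_...
        headers["Authorization"] = f"Bearer {api_key}"
    elif payment_method is not None:
        # Stripe MPP auth: X-Payment-Method: pm_...
        headers["X-Payment-Method"] = payment_method
    return headers
```

Calling `build_headers()` with no arguments returns only the content-type header, which corresponds to the free tier row.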

Rate Limits

| Tier | Monthly Limit | Rate |
| --- | --- | --- |
| Free | 50 | 10/min |
| Starter ($99) | 10,000 | 60/min |
| Growth ($299) | 50,000 | 200/min |
| Enterprise | Custom | Custom |

Rate limit headers are included in every response: X-RateLimit-Limit, X-RateLimit-Remaining, X-RateLimit-Reset.
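
The following sketch shows one way a client could read these headers. It assumes all three values are integers; this reference does not state the unit of X-RateLimit-Reset, so the value is kept raw. The names `RateLimit`, `parse_rate_limit`, and `should_pause` are hypothetical.

```python
from typing import Mapping, NamedTuple


class RateLimit(NamedTuple):
    limit: int
    remaining: int
    reset: int  # raw header value; its unit is not specified in this reference


def parse_rate_limit(headers: Mapping[str, str]) -> RateLimit:
    """Extract the three rate limit headers (assumed to hold integers)."""
    # HTTP header names are case-insensitive, so normalize before lookup.
    lowered = {name.lower(): value for name, value in headers.items()}
    return RateLimit(
        limit=int(lowered["x-ratelimit-limit"]),
        remaining=int(lowered["x-ratelimit-remaining"]),
        reset=int(lowered["x-ratelimit-reset"]),
    )


def should_pause(rate: RateLimit) -> bool:
    """True when no requests remain in the current window."""
    return rate.remaining <= 0
```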

Endpoints

POST /api/enrich/basic

Free-tier extraction using Schema.org structured data only. No authentication required.

Note: For most URLs, the Playground or the full /api/enrich endpoint provides better results.

Request

curl
$ curl -X POST https://shopgraph.dev/api/enrich/basic \
  -H "Content-Type: application/json" \
  -d '{"url": "https://www.allbirds.com/products/mens-tree-runners"}'

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Product page URL |
| format | string | No | Response format. json (default) or ucp. When set to ucp, returns data in UCP line_item format. |

Response

200 OK
{
  "product": {
    "product_name": "Men's Tree Runners",
    "brand": "Allbirds",
    "price": {
      "amount": 98.00,
      "currency": "USD"
    },
    "availability": "in_stock",
    "primary_image_url": "https://cdn.allbirds.com/image/fetch/q_auto,f_auto/w_1000/https://www.allbirds.com/cdn/shop/files/TR2MNCW_SHOE_ANGLE_GLOBAL_MENS_TREE_RUNNER_CHARCOAL.png",
    "_shopgraph": {
      "extraction_method": "schema_org",
      "field_confidence": {
        "product_name": 0.98,
        "brand": 0.95,
        "price": 0.97,
        "availability": 0.91,
        "primary_image_url": 0.93
      }
    }
  },
  "cached": false,
  "free_tier": { "used": 12, "limit": 50 }
}
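
For callers not using curl, the same request can be sketched with Python's standard library. The function names and the 30-second timeout are assumptions made for illustration; the URL, method, and body fields follow the example above.

```python
import json
import urllib.request

BASE_URL = "https://shopgraph.dev"


def build_basic_request(product_url: str, fmt: str = "json") -> urllib.request.Request:
    """Build the POST request for /api/enrich/basic without sending it."""
    body = json.dumps({"url": product_url, "format": fmt}).encode("utf-8")
    return urllib.request.Request(
        f"{BASE_URL}/api/enrich/basic",
        data=body,
        headers={"Content-Type": "application/json"},
        method="POST",
    )


def enrich_basic(product_url: str, timeout: float = 30.0) -> dict:
    """Send the request and decode the JSON response (timeout is an assumed value)."""
    request = build_basic_request(product_url)
    with urllib.request.urlopen(request, timeout=timeout) as response:
        return json.load(response)
```

Given a response shaped like the one above, `enrich_basic(url)["product"]["_shopgraph"]["field_confidence"]` would hold the per-field confidence scores.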

POST /api/enrich

Full extraction with LLM, hybrid merge, and Playwright fallback. Requires authentication.

Request

curl
$ curl -X POST https://shopgraph.dev/api/enrich \
  -H "Content-Type: application/json" \
  -H "Authorization: Bearer sg_live_abc123" \
  -d '{
    "url": "https://www.uline.com/Product/Detail/S-19318/Kraft-Paper/30-lb-Kraft-Paper-Sheets-8-1-2-x-11",
    "format": "ucp",
    "strict_confidence_threshold": 0.8,
    "include_score": true
  }'

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| url | string | Yes | Product page URL |
| format | string | No | Response format. json (default) or ucp. When set to ucp, returns data in UCP line_item format. |
| strict_confidence_threshold | number | No | Float between 0 and 1. Fields with confidence scores below this threshold are removed from the response. Intended for autonomous pipelines that need only high-confidence data. |
| include_score | boolean | No | Include the AgentReady score in the response |
| method | string | No | Force an extraction method: schema_org, llm, or playwright |
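
To illustrate what strict_confidence_threshold does, here is a client-side sketch of the same filtering rule applied to a product object. It is not the server's implementation. In particular, keeping fields that have no confidence score is an assumption, and `apply_threshold` is a hypothetical name.

```python
def apply_threshold(product: dict, threshold: float) -> dict:
    """Drop product fields whose confidence score is below the threshold."""
    if not 0.0 <= threshold <= 1.0:
        raise ValueError("threshold must be between 0 and 1")
    confidence = product.get("_shopgraph", {}).get("field_confidence", {})
    kept = {}
    for field, value in product.items():
        if field == "_shopgraph":
            kept[field] = value  # metadata block is always kept
            continue
        score = confidence.get(field)
        # Assumption: fields without a confidence score are left in place.
        if score is None or score >= threshold:
            kept[field] = value
    return kept
```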

Response

200 OK — with include_score: true
{
  "product": {
    "product_name": "30 lb Kraft Paper Sheets - 8 1/2 x 11",
    "brand": "Uline",
    "price": {
      "amount": 92.00,
      "currency": "USD"
    },
    "availability": "in_stock",
    "sku": "S-19318",
    "primary_image_url": "https://www.uline.com/images/product/large/S-19318.jpg",
    "categories": ["Kraft Paper", "Paper Sheets", "Packaging"],
    "_shopgraph": {
      "extraction_method": "schema_org",
      "data_source": "live",
      "field_confidence": {
        "product_name": 0.98,
        "brand": 0.93,
        "price": 0.93,
        "availability": 0.83,
        "categories": 0.93,
        "primary_image_url": 0.93
      },
      "confidence_method": "tier_baseline"
    }
  },
  "agent_readiness_score": 76,
  "scoring_breakdown": {
    "structured_data_completeness": { "score": 82, "weight": 0.30, "weighted_contribution": 24.6 },
    "semantic_richness":            { "score": 68, "weight": 0.20, "weighted_contribution": 13.6 },
    "ucp_compatibility":            { "score": 71, "weight": 0.20, "weighted_contribution": 14.2 },
    "pricing_clarity":              { "score": 80, "weight": 0.15, "weighted_contribution": 12.0 },
    "inventory_signal_quality":     { "score": 75, "weight": 0.15, "weighted_contribution": 11.25 },
    "access_readiness":             { "score": 100, "weight": 0.00, "weighted_contribution": 0 }
  },
  "scoring_version": "2026-04-08-v1",
  "cached": false,
  "credit_mode": "standard"
}
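
In the example above, the weighted contributions sum to 75.65 and the reported agent_readiness_score is 76. The sketch below recomputes that total from scoring_breakdown. Rounding to the nearest integer is an assumption that happens to match this example, and `weighted_total` is a hypothetical helper.

```python
def weighted_total(breakdown: dict) -> float:
    """Sum score * weight over every component of a scoring_breakdown object."""
    return sum(part["score"] * part["weight"] for part in breakdown.values())


# Values copied from the example response above.
breakdown = {
    "structured_data_completeness": {"score": 82, "weight": 0.30},
    "semantic_richness": {"score": 68, "weight": 0.20},
    "ucp_compatibility": {"score": 71, "weight": 0.20},
    "pricing_clarity": {"score": 80, "weight": 0.15},
    "inventory_signal_quality": {"score": 75, "weight": 0.15},
    "access_readiness": {"score": 100, "weight": 0.00},
}

total = weighted_total(breakdown)
score = round(total)  # assumed rounding rule; gives 76 for this example
```

Note that access_readiness carries a weight of 0.00 here, so it does not affect the total.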

POST /api/enrich/html

Extract from provided HTML. Useful when you already have the page content.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| html | string | Yes | Raw HTML content |
| url | string | No | Source URL for context and caching |
| format | string | No | standard or ucp |
| strict_confidence_threshold | number | No | Confidence threshold |
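
A request body for this endpoint could be assembled as in the sketch below, which omits optional fields that are not set. The helper name `build_html_payload` is hypothetical, and authentication headers are left out because this section does not say whether the endpoint requires them.

```python
from typing import Optional


def build_html_payload(html: str,
                       url: Optional[str] = None,
                       fmt: Optional[str] = None,
                       strict_confidence_threshold: Optional[float] = None) -> dict:
    """Build the JSON body for POST /api/enrich/html (hypothetical helper)."""
    if not html:
        raise ValueError("html is required")
    payload = {"html": html}
    # Only include optional fields the caller actually provided.
    if url is not None:
        payload["url"] = url
    if fmt is not None:
        payload["format"] = fmt
    if strict_confidence_threshold is not None:
        payload["strict_confidence_threshold"] = strict_confidence_threshold
    return payload
```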

POST /api/score

Score existing product data for agent readiness without extracting from a URL.

Request Body

| Field | Type | Required | Description |
| --- | --- | --- | --- |
| product | ProductData | Yes | Product data object to score |

Response

Returns an AgentReady Score object.


GET /api/stats

Get extraction statistics and test corpus results. No authentication required.


GET /api/health-check

Health check endpoint. Returns system status and success rates.


Availability Values

The availability field is normalized via a structured-signal parser (LAU-330). Each value emits a per-pattern confidence baseline that flows through the freshness decay model.

| Value | Signal Detected | Confidence Baseline |
| --- | --- | --- |
| in_stock | "In Stock", "Available now", "Add to Cart" button present, "Ready to ship" | 0.85 |
| out_of_stock | "Out of Stock", "Sold Out", "Currently Unavailable", "Notify me when available" | 0.85 |
| low_stock | "Only N left", "N remaining", "Low stock", "Selling fast" (N ≤ 20). Includes quantity_remaining when N is extractable. | 0.80 |
| backordered | "Backordered", "On backorder", "Ships in N weeks", "Coming soon" | 0.75 |
| preorder | Schema.org PreOrder term, "Pre-order today" (back-compat with Schema.org availability vocabulary) | 0.80 |
| quote_only | "Contact for Pricing", "Quote on Request", "Call for price", "POA" | 0.70 |
| unknown | No structured signal detected. Calibration baseline lowered from mid-range to 0.30 in LAU-330 so uncertainty is honestly reflected in confidence. | 0.30 |

Availability decays faster than most fields. By default it uses the real_time volatility class (30-min half-life). Promo-heavy domains (Etsy, eBay, AliExpress, Temu, Shein) route to hyper_volatile (10-min half-life). See Self-healing freshness for the decay model.
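
To make the half-life figures concrete, the sketch below assumes plain exponential decay, where confidence halves once per half-life. The actual decay model is defined in Self-healing freshness and may differ. The hostnames used for domain routing and the function names are also assumptions; the baselines and half-lives come from the text above.

```python
AVAILABILITY_BASELINE = {
    "in_stock": 0.85, "out_of_stock": 0.85, "low_stock": 0.80,
    "backordered": 0.75, "preorder": 0.80, "quote_only": 0.70, "unknown": 0.30,
}
HALF_LIFE_MINUTES = {"real_time": 30.0, "hyper_volatile": 10.0}
# Assumed hostnames for the promo-heavy domains named in the text.
HYPER_VOLATILE_DOMAINS = ("etsy.com", "ebay.com", "aliexpress.com", "temu.com", "shein.com")


def volatility_class(hostname: str) -> str:
    """Pick the volatility class for a hostname (suffix match on known domains)."""
    host = hostname.lower()
    for domain in HYPER_VOLATILE_DOMAINS:
        if host == domain or host.endswith("." + domain):
            return "hyper_volatile"
    return "real_time"


def decayed_confidence(baseline: float, age_minutes: float,
                       volatility: str = "real_time") -> float:
    """Exponential half-life decay (an assumed model, not the documented one)."""
    return baseline * 0.5 ** (age_minutes / HALF_LIFE_MINUTES[volatility])
```

Under these assumptions, an in_stock reading with baseline 0.85 falls to 0.425 after 30 minutes on a real_time domain, and to 0.2125 after 20 minutes on a hyper_volatile one.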

Error Responses

| Status | Meaning |
| --- | --- |
| 400 | Invalid request (missing URL, bad format) |
| 401 | Authentication required |
| 402 | Payment required (free tier exhausted, no payment method) |
| 429 | Rate limited |
| 500 | Internal server error |
| 504 | Extraction timeout (page too slow or complex) |
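
A client might map these statuses to a retry decision as sketched below. Treating 429, 500, and 504 as retryable is a common client-side convention and an assumption here, not documented ShopGraph behavior; `classify_error` is a hypothetical helper.

```python
from typing import Tuple

# Meanings copied from the table above.
ERROR_MEANINGS = {
    400: "Invalid request (missing URL, bad format)",
    401: "Authentication required",
    402: "Payment required (free tier exhausted, no payment method)",
    429: "Rate limited",
    500: "Internal server error",
    504: "Extraction timeout (page too slow or complex)",
}

# Assumption: only transient failures are worth retrying.
RETRYABLE_STATUSES = frozenset({429, 500, 504})


def classify_error(status: int) -> Tuple[str, bool]:
    """Return (meaning, retryable) for an HTTP error status."""
    meaning = ERROR_MEANINGS.get(status, "Unknown error")
    return meaning, status in RETRYABLE_STATUSES
```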