Scoring Methodology

Versioned methodology for AgentReady scoring. Current version: 2026-04-08-v1.

UCP Schema Compatible — AgentReady scoring is validated against ucp-schema v1.1.0 field requirements.

Version

FieldValue
Scoring version2026-04-08-v1
Methodology URLhttps://shopgraph.dev/methodology
UCP schema versionv1.1.0

Composite Score Formula

The overall AgentReady score is a weighted sum of six dimension scores (five currently active):

Formula
agent_readiness_score = (structured_data_completeness * 0.30)
                     + (semantic_richness             * 0.20)
                     + (ucp_compatibility             * 0.20)
                     + (pricing_clarity               * 0.15)
                     + (inventory_signal_quality       * 0.15)

access_readiness is included in the response at weight 0.00. When activated, weights will be redistributed across all six dimensions.

AgentReady dimension scores use a 0-100 scale. Per-field confidence scores (from the extraction pipeline) use a separate 0-1 scale. These are independent scoring systems.

Dimension Details

1. Structured Data Completeness (weight: 0.30)

Measures how many expected fields are present in the extracted data. Scores on a 0-100 scale.

Core fields

  • title — Product name
  • price — Numeric price
  • description — Product description
  • availability — Stock status
  • images — Primary product image
  • brand — Brand or manufacturer
  • categories — Product categories
  • specs — Product dimensions/specifications

Base score = (fields_present / fields_expected) * 100. B2B bonus (up to 10 points) for part numbers, MOQ, lead time, and bulk pricing signals.

2. Semantic Richness (weight: 0.20)

Evaluates category depth, attribute count, variant coverage, and description quality. Scores on a 0-100 scale.

  • Category depth (max 25 pts) — Deeper category trees score higher
  • Attribute count (max 25 pts) — Colors, materials, dimensions
  • Variant coverage (max 25 pts) — Multiple color/material options
  • Description quality (max 25 pts) — Length and detail of description

3. UCP Compatibility (weight: 0.20)

How well the extracted data maps to the Universal Commerce Protocol schema. Scores on a 0-100 scale.

  • Required UCP fields (60%) — item.id, item.title, item.price
  • Optional UCP fields (25%) — image, brand, description, availability, categories, currency, color, material
  • UCP mapping validation (15%) — Actual mapping to UCP schema succeeds

4. Pricing Clarity (weight: 0.15)

Evaluates price data completeness and B2B pricing signals. Scores on a 0-100 scale.

  • Base price present (40 pts)
  • Currency present (20 pts)
  • Sale price info (15 pts)
  • Bulk/tier pricing detected (15 pts)
  • MOQ detected (10 pts)

5. Inventory Signal Quality (weight: 0.15)

Measures specificity of stock and fulfillment data. Scores on a 0-100 scale.

  • Stock status present (40 pts)
  • Quantity information (25 pts)
  • Lead time detected (20 pts)
  • Backorder info (15 pts)

6. Access Readiness (weight: 0.00)

Evaluates whether the target URL requires agent-specific authentication or RFC 9421 signed identity verification. Scores on a 0-100 scale. Currently inactive (weight 0.00). All URLs in the test corpus are openly accessible. This dimension activates when Web Bot Auth adoption reaches detection threshold (>10% of test URLs returning Web Bot Auth headers). When active, weights will be redistributed across all six dimensions.

Calibration Approach

Scores are calibrated against a 208-URL ground-truth dataset:

  1. Ground truth collection — Human-verified product data for 208 URLs across 22 verticals.
  2. Automated testing — Extraction tests run every 30 minutes via cron.
  3. Accuracy measurement — Extracted values compared against ground truth field-by-field.
  4. Baseline adjustment — Extraction method baselines updated when measured accuracy diverges from predicted confidence by more than 5%.
  5. Version bumping — When baselines change, the scoring version is incremented.
This methodology is versioned. The scoring_version field in every AgentReady score response references a specific revision of this document.

Version History

2026-04-08-v1 — Initial methodology release. Five active dimensions, access_readiness stubbed at weight 0.00. Calibrated against 208-URL ground-truth dataset across 22 verticals.