# Scoring Methodology
Versioned methodology for AgentReady scoring. Current version: 2026-04-08-v1.
## Version
| Field | Value |
|---|---|
| Scoring version | 2026-04-08-v1 |
| Methodology URL | https://shopgraph.dev/methodology |
| UCP schema version | v1.1.0 |
## Composite Score Formula
The overall AgentReady score is a weighted sum of five dimension scores:
```
overall = (completeness  * 0.30)
        + (confidence    * 0.25)
        + (structure     * 0.20)
        + (actionability * 0.15)
        + (freshness     * 0.10)
```
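The weighted sum above can be sketched directly in Python. The dimension values passed in below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Dimension weights from the composite score formula.
WEIGHTS = {
    "completeness": 0.30,
    "confidence": 0.25,
    "structure": 0.20,
    "actionability": 0.15,
    "freshness": 0.10,
}

def composite_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of the five dimension scores (each in [0, 1])."""
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)

# Illustrative input, not a real score response:
overall = composite_score({
    "completeness": 1.0,
    "confidence": 0.8,
    "structure": 1.0,
    "actionability": 0.85,
    "freshness": 0.9,
})
# 0.30*1.0 + 0.25*0.8 + 0.20*1.0 + 0.15*0.85 + 0.10*0.9 = 0.9175
```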
## Dimension Details
### 1. Completeness (weight: 0.30)
Measures how many expected fields are present in the extracted data.
#### Core fields (weighted 2x)
- `title` — Product name
- `price` — Numeric price
- `currency` — ISO 4217 code
- `brand` — Brand or manufacturer
- `image` — Primary product image
- `availability` — Stock status
- `url` — Canonical product URL
#### Optional fields (weighted 1x)
`description`, `sku`, `gtin`, `mpn`, `category`, `rating`, `review_count`, `variants`
`Score = (core_present * 2 + optional_present) / (core_expected * 2 + optional_total)`
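A minimal sketch of this ratio, using the field lists above (treating a field as present when its extracted value is not `None`; the "present" test is an assumption, not specified by the formula):

```python
CORE_FIELDS = {"title", "price", "currency", "brand", "image", "availability", "url"}
OPTIONAL_FIELDS = {"description", "sku", "gtin", "mpn", "category",
                   "rating", "review_count", "variants"}

def completeness(extracted: dict) -> float:
    """Core fields count double; score is present-over-expected."""
    core = sum(1 for f in CORE_FIELDS if extracted.get(f) is not None)
    optional = sum(1 for f in OPTIONAL_FIELDS if extracted.get(f) is not None)
    return (core * 2 + optional) / (len(CORE_FIELDS) * 2 + len(OPTIONAL_FIELDS))
```

With all 7 core fields but no optional fields present, the score is 14 / 22 ≈ 0.64, reflecting the 2x core weighting.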
### 2. Confidence (weight: 0.25)
Based on per-field confidence scores from the extraction pipeline.
`Score = mean(field_confidences)`, minus a floor penalty of 0.1 if any field's confidence falls below 0.5.
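A sketch of the mean-with-floor-penalty rule. Clamping the result at 0 is an assumption; the methodology does not say what happens if the penalty would push the score negative:

```python
def confidence_score(field_confidences: list[float]) -> float:
    """Mean per-field confidence, with a 0.1 penalty if any field dips below 0.5."""
    mean = sum(field_confidences) / len(field_confidences)
    penalty = 0.1 if any(c < 0.5 for c in field_confidences) else 0.0
    return max(0.0, mean - penalty)  # clamping at 0 is an assumption
```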
### 3. Structure (weight: 0.20)
Validates that extracted data conforms to expected formats:
- Price is a valid positive number
- Currency is a valid ISO 4217 code
- URL is well-formed (starts with http/https)
- Availability matches Schema.org ItemAvailability enum
- Image URLs are valid and accessible
`Score = valid_checks / total_checks`
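The checks-passed ratio can be sketched as below. The exact check implementations and the `ItemAvailability` subset are illustrative assumptions, and the image-accessibility check is omitted here because it requires a network fetch:

```python
import re

# Illustrative subset of Schema.org ItemAvailability values.
VALID_AVAILABILITY = {
    "InStock", "OutOfStock", "PreOrder", "BackOrder",
    "SoldOut", "Discontinued", "LimitedAvailability",
}

def structure_score(product: dict) -> float:
    """Fraction of format checks that pass (image check omitted in this sketch)."""
    checks = [
        # Price is a valid positive number.
        isinstance(product.get("price"), (int, float)) and product["price"] > 0,
        # Currency looks like an ISO 4217 code (three uppercase letters).
        bool(re.fullmatch(r"[A-Z]{3}", product.get("currency") or "")),
        # URL is well-formed enough to start with http/https.
        str(product.get("url", "")).startswith(("http://", "https://")),
        # Availability matches the enum.
        product.get("availability") in VALID_AVAILABILITY,
    ]
    return sum(checks) / len(checks)
```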
### 4. Actionability (weight: 0.15)
Can an AI agent take action (e.g. purchase, compare, recommend) with this data?
- has_price (0.35) — Price is present and valid
- has_availability (0.30) — Stock status is known
- has_product_url (0.20) — Product URL for linking/visiting
- has_add_to_cart (0.15) — Add-to-cart mechanism detected
Score = weighted sum of boolean checks
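The weighted boolean sum can be sketched as:

```python
# Check names and weights from the actionability list above.
ACTION_WEIGHTS = {
    "has_price": 0.35,
    "has_availability": 0.30,
    "has_product_url": 0.20,
    "has_add_to_cart": 0.15,
}

def actionability(checks: dict[str, bool]) -> float:
    """Sum the weights of the checks that pass; missing checks count as False."""
    return sum(w for name, w in ACTION_WEIGHTS.items() if checks.get(name))
```

A product with price, availability, and a URL but no detected add-to-cart mechanism scores 0.35 + 0.30 + 0.20 = 0.85.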
### 5. Freshness (weight: 0.10)
Penalizes stale or cached data:
| Data Age | Score |
|---|---|
| < 1 hour | 1.0 |
| 1-6 hours | 0.9 |
| 6-24 hours | 0.7 |
| 1-7 days | 0.4 |
| > 7 days | 0.1 |
## Calibration Approach
Scores are calibrated against a 208-URL ground-truth dataset:
- Ground truth collection — Human-verified product data for 208 URLs across 22 verticals.
- Automated testing — Extraction tests run every 30 minutes via cron.
- Accuracy measurement — Extracted values compared against ground truth field-by-field.
- Baseline adjustment — Extraction method baselines updated when measured accuracy diverges from predicted confidence by more than 5%.
- Version bumping — When baselines change, the scoring version is incremented.
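The baseline-adjustment trigger in the steps above can be sketched as a simple divergence check. Reading "more than 5%" as an absolute difference between predicted confidence and measured accuracy is an assumption; it could also mean a relative difference:

```python
def needs_baseline_update(predicted_confidence: float,
                          measured_accuracy: float,
                          threshold: float = 0.05) -> bool:
    """True when measured accuracy diverges from predicted confidence by > 5%.

    Absolute-difference interpretation of the 5% threshold (an assumption).
    """
    return abs(measured_accuracy - predicted_confidence) > threshold
```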
The `scoring_version` field in every AgentReady score response references a specific revision of this document.