# Scoring Methodology
Versioned methodology for AgentReady scoring. Current version: 2026-04-08-v1.
## Version
| Field | Value |
|---|---|
| Scoring version | 2026-04-08-v1 |
| Methodology URL | https://shopgraph.dev/methodology |
| UCP schema version | v1.1.0 |
## Composite Score Formula
The overall AgentReady score is a weighted sum of five dimension scores:
```
overall = (completeness  * 0.30)
        + (confidence    * 0.25)
        + (structure     * 0.20)
        + (actionability * 0.15)
        + (freshness     * 0.10)
```
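The weighted sum above can be sketched directly in Python. The dimension values passed in below are hypothetical, chosen only to illustrate the arithmetic:

```python
# Dimension weights from the composite score formula.
WEIGHTS = {
    "completeness": 0.30,
    "confidence": 0.25,
    "structure": 0.20,
    "actionability": 0.15,
    "freshness": 0.10,
}

def composite_score(dimensions: dict[str, float]) -> float:
    """Weighted sum of the five dimension scores (each in [0, 1])."""
    return sum(WEIGHTS[name] * dimensions[name] for name in WEIGHTS)

# Illustrative input, not a real score response:
overall = composite_score({
    "completeness": 1.0,
    "confidence": 0.8,
    "structure": 1.0,
    "actionability": 0.85,
    "freshness": 0.9,
})
# 0.30*1.0 + 0.25*0.8 + 0.20*1.0 + 0.15*0.85 + 0.10*0.9 = 0.9175
```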
## Dimension Details
### 1. Completeness (weight: 0.30)
Measures how many expected fields are present in the extracted data.
#### Core fields (weighted 2x)
- `title` — Product name
- `price` — Numeric price
- `currency` — ISO 4217 code
- `brand` — Brand or manufacturer
- `image` — Primary product image
- `availability` — Stock status
- `url` — Canonical product URL
#### Optional fields (weighted 1x)
`description`, `sku`, `gtin`, `mpn`, `category`, `rating`, `review_count`, `variants`
`Score = (core_present * 2 + optional_present) / (core_expected * 2 + optional_total)`
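A minimal sketch of this ratio, using the field lists above (treating a field as present when its extracted value is not `None`; the "present" test is an assumption, not specified by the formula):

```python
CORE_FIELDS = {"title", "price", "currency", "brand", "image", "availability", "url"}
OPTIONAL_FIELDS = {"description", "sku", "gtin", "mpn", "category",
                   "rating", "review_count", "variants"}

def completeness(extracted: dict) -> float:
    """Core fields count double; score is present-over-expected."""
    core = sum(1 for f in CORE_FIELDS if extracted.get(f) is not None)
    optional = sum(1 for f in OPTIONAL_FIELDS if extracted.get(f) is not None)
    return (core * 2 + optional) / (len(CORE_FIELDS) * 2 + len(OPTIONAL_FIELDS))
```

With all 7 core fields but no optional fields present, the score is 14 / 22 ≈ 0.64, reflecting the 2x core weighting.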
### 2. Confidence (weight: 0.25)
Based on per-field confidence scores from the extraction pipeline.
`Score = mean(field_confidences)`, minus a floor penalty of 0.1 if any field's confidence falls below 0.5.
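A sketch of the mean-with-floor-penalty rule. Clamping the result at 0 is an assumption; the methodology does not say what happens if the penalty would push the score negative:

```python
def confidence_score(field_confidences: list[float]) -> float:
    """Mean per-field confidence, with a 0.1 penalty if any field dips below 0.5."""
    mean = sum(field_confidences) / len(field_confidences)
    penalty = 0.1 if any(c < 0.5 for c in field_confidences) else 0.0
    return max(0.0, mean - penalty)  # clamping at 0 is an assumption
```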
### 3. Structure (weight: 0.20)
Validates that extracted data conforms to expected formats:
- Price is a valid positive number
- Currency is a valid ISO 4217 code
- URL is well-formed (starts with http/https)
- Availability matches Schema.org ItemAvailability enum
- Image URLs are valid and accessible
`Score = valid_checks / total_checks`
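The checks-passed ratio can be sketched as below. The exact check implementations and the `ItemAvailability` subset are illustrative assumptions, and the image-accessibility check is omitted here because it requires a network fetch:

```python
import re

# Illustrative subset of Schema.org ItemAvailability values.
VALID_AVAILABILITY = {
    "InStock", "OutOfStock", "PreOrder", "BackOrder",
    "SoldOut", "Discontinued", "LimitedAvailability",
}

def structure_score(product: dict) -> float:
    """Fraction of format checks that pass (image check omitted in this sketch)."""
    checks = [
        # Price is a valid positive number.
        isinstance(product.get("price"), (int, float)) and product["price"] > 0,
        # Currency looks like an ISO 4217 code (three uppercase letters).
        bool(re.fullmatch(r"[A-Z]{3}", product.get("currency") or "")),
        # URL is well-formed enough to start with http/https.
        str(product.get("url", "")).startswith(("http://", "https://")),
        # Availability matches the enum.
        product.get("availability") in VALID_AVAILABILITY,
    ]
    return sum(checks) / len(checks)
```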
### 4. Actionability (weight: 0.15)
Can an AI agent take action (e.g. purchase, compare, recommend) with this data?
- has_price (0.35) — Price is present and valid
- has_availability (0.30) — Stock status is known
- has_product_url (0.20) — Product URL for linking/visiting
- has_add_to_cart (0.15) — Add-to-cart mechanism detected
Score = weighted sum of boolean checks
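The weighted boolean sum can be sketched as:

```python
# Check names and weights from the actionability list above.
ACTION_WEIGHTS = {
    "has_price": 0.35,
    "has_availability": 0.30,
    "has_product_url": 0.20,
    "has_add_to_cart": 0.15,
}

def actionability(checks: dict[str, bool]) -> float:
    """Sum the weights of the checks that pass; missing checks count as False."""
    return sum(w for name, w in ACTION_WEIGHTS.items() if checks.get(name))
```

A product with price, availability, and a URL but no detected add-to-cart mechanism scores 0.35 + 0.30 + 0.20 = 0.85.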
### 5. Freshness (weight: 0.10)
Penalizes stale or cached data:
| Data Age | Score |
|---|---|
| < 1 hour | 1.0 |
| 1-6 hours | 0.9 |
| 6-24 hours | 0.7 |
| 1-7 days | 0.4 |
| > 7 days | 0.1 |
## Calibration Approach
Scores are calibrated against a 208-URL ground-truth dataset:
- Ground truth collection — Human-verified product data for 208 URLs across 22 verticals.
- Automated testing — Extraction tests run every 30 minutes via cron.
- Accuracy measurement — Extracted values compared against ground truth field-by-field.
- Baseline adjustment — Extraction method baselines updated when measured accuracy diverges from predicted confidence by more than 5%.
- Version bumping — When baselines change, the scoring version is incremented.
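The baseline-adjustment trigger in the steps above can be sketched as a simple divergence check. Reading "more than 5%" as an absolute difference between predicted confidence and measured accuracy is an assumption; it could also mean a relative difference:

```python
def needs_baseline_update(predicted_confidence: float,
                          measured_accuracy: float,
                          threshold: float = 0.05) -> bool:
    """True when measured accuracy diverges from predicted confidence by > 5%.

    Absolute-difference interpretation of the 5% threshold (an assumption).
    """
    return abs(measured_accuracy - predicted_confidence) > threshold
```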
The `scoring_version` field in every AgentReady score response references a specific revision of this document.