How Dealophant calculates price per unit
Dealophant ranks Amazon listings by their true cost per unit using a hybrid of title parsing, an LLM classifier, and Amazon's own pricePerUnit field. This page documents the methodology so it can be cited and verified.
The four-step pipeline
- Live PA-API query. Every search hits the Amazon Product Advertising API in real time; rankings are never served from a stored price database. Listings without a buy-box price are excluded.
- Title parsing. A regex extracts obvious quantities (e.g., "12 Rolls", "32 oz", "Pack of 6", "100 Pods"). For multi-pack volume listings ("16 oz Cans, 12 Pack"), the parser multiplies pack count by per-container size to get total volume.
- LLM classification. An LLM step (Gemini 2.5 Flash, with DeepSeek and Llama fallbacks) classifies each title into one of four unit families: weight, volume, count, or servings. The classifier is biased toward weight over servings when both appear, because manufacturer-defined serving sizes are not standardized.
- Amazon pricePerUnit override. When the title parser landed on a weak result (servings, or count=1 meaning no real unit was extractable) AND Amazon's pricePerUnit field names a trusted weight or volume unit (oz, lb, fl oz, ml, l, gal, g, kg), the product is promoted into that family using Amazon's authoritative per-unit rate.
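The multi-pack rule in the title-parsing step can be illustrated with a small sketch. This is not the production regex: the function name `parseTotalOz` and both patterns are assumptions made for this example, and they only cover "oz" sizes combined with "N Pack" or "Pack of N" counts.

```typescript
// Hypothetical sketch: total ounces for titles like "16 oz Cans, 12 Pack".
// The patterns below are illustrative and are NOT the production regex.
function parseTotalOz(title: string): number | null {
  // Per-container size, e.g. "16 oz" or "7.5 oz".
  const size = /(\d+(?:\.\d+)?)\s*oz\b/i.exec(title);
  if (size === null) return null;

  // Pack count, e.g. "12 Pack", "12-Pack", or "Pack of 6". Defaults to 1.
  const pack = /(\d+)\s*-?\s*pack\b|\bpack\s+of\s+(\d+)/i.exec(title);
  let count = 1;
  if (pack !== null) {
    count = Number(pack[1] !== undefined ? pack[1] : pack[2]);
  }

  // Total quantity = pack count x per-container size.
  return Number(size[1]) * count;
}
```

With these patterns, "16 oz Cans, 12 Pack" yields 12 × 16 = 192 oz, and a title with no "oz" quantity yields null so a later step can handle it.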
Unit conversions
Internally, Dealophant uses two base units: the ounce for weight and the fluid ounce for volume. Count and servings have no conversion (1 unit = 1 unit).
- Weight: 1 lb = 16 oz · 1 kg ≈ 35.274 oz · 1 g ≈ 0.035274 oz
- Volume: 1 gal = 128 fl oz · 1 L ≈ 33.814 fl oz · 1 mL ≈ 0.033814 fl oz
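As a rough sketch, the conversion factors above can be applied with a single lookup table. The names `TO_BASE` and `toBaseUnits` are hypothetical and chosen for this example; only the numeric factors come from the list above.

```typescript
type Family = "weight" | "volume";

interface BaseEntry {
  family: Family;
  factor: number; // base units (oz or fl oz) per one of this unit
}

// Factors copied from the list above. A Map avoids accidental hits on
// Object.prototype keys such as "constructor".
const TO_BASE = new Map<string, BaseEntry>([
  ["oz", { family: "weight", factor: 1 }],
  ["lb", { family: "weight", factor: 16 }],
  ["kg", { family: "weight", factor: 35.274 }],
  ["g", { family: "weight", factor: 0.035274 }],
  ["fl oz", { family: "volume", factor: 1 }],
  ["gal", { family: "volume", factor: 128 }],
  ["l", { family: "volume", factor: 33.814 }],
  ["ml", { family: "volume", factor: 0.033814 }],
]);

// Converts a quantity to the family's base unit, or returns null for
// units with no conversion (count, servings, anything unknown).
function toBaseUnits(
  quantity: number,
  unit: string
): { family: Family; amount: number } | null {
  const entry = TO_BASE.get(unit.trim().toLowerCase());
  if (entry === undefined) return null;
  return { family: entry.family, amount: quantity * entry.factor };
}
```

For example, `toBaseUnits(2, "lb")` gives 32 oz in the weight family, while `toBaseUnits(12, "rolls")` returns null because count units are not converted.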
Why cross-unit listings are segregated
A 30-serving protein powder tub at $1.67/serving is not directly comparable to a 16-ounce tub at $2.06/oz, because there's no fixed relationship between a "serving" and an ounce that holds across brands. Dealophant therefore picks one canonical unit family per search (the mode of unit families among priced listings, with weight winning ties), ranks only listings in that family, and shows the rest under "Other formats" with their own price/quantity displayed but not ranked.
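The family-selection rule (mode of unit families, with weight winning ties) might look like the sketch below. The function name `pickDominantFamily` is hypothetical, and the order used to break ties between non-weight families is an assumption, since the text only specifies the weight tie-break.

```typescript
type UnitFamily = "weight" | "volume" | "count" | "servings";

// Weight is checked first, so it wins any tie it takes part in.
// Assumption: the order of the remaining families is arbitrary here.
const FAMILY_ORDER: UnitFamily[] = ["weight", "volume", "count", "servings"];

// `families` holds the unit family of each priced listing in one search.
function pickDominantFamily(families: UnitFamily[]): UnitFamily | null {
  let best: UnitFamily | null = null;
  let bestCount = 0;
  for (const candidate of FAMILY_ORDER) {
    const n = families.filter((f) => f === candidate).length;
    // Strictly greater: a later family must beat, not match, an earlier one.
    if (n > bestCount) {
      best = candidate;
      bestCount = n;
    }
  }
  return best; // null when there are no priced listings
}
```

Listings whose family differs from the returned value would then be shown under "Other formats" rather than ranked.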
Validation accuracy (latest run)
As of 2026-05-03, the unit parser passed 12/12 fixtures in the public validation harness (100% accuracy on the curated set). Each fixture is a real Amazon ASIN with a hand-checked expected unit type and count range; the harness runs the live pipeline (regex → LLM classifier → Amazon-pricePerUnit override) end-to-end against production. Source: script/validateUnitParser.ts. Latest results JSON: data/parser-accuracy.json.
LLM classifier transparency
The LLM classification step uses one of three providers, in fall-through order: google/gemini-2.5-flash, deepseek/deepseek-chat, then meta-llama/llama-3.3-70b-instruct:free. All three are accessed via OpenRouter. Inference parameters are temperature=0.1, max_tokens=1000, with a 3-second per-call timeout. The full prompt text — including the rules biasing toward weight over servings, the ignore list for unrelated specs (wattage, mAh, megapixels), and the per-category guidance — is in server/aiUnitParser.ts, function buildPrompt. The prompt is version-controlled; any change to it is visible in git history.
If all three models fail or time out (4-second total budget), Dealophant falls back to the regex-only result with no LLM input. Results are cached per ASIN in a 500-entry LRU cache so repeat searches don't re-call the API.
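A per-ASIN LRU cache of this kind can be sketched in a few lines using the insertion order of a `Map`. This is an illustration of the technique only; the class name `LruCache` is an assumption and the production cache may be built differently.

```typescript
// Minimal LRU cache keyed by ASIN (hypothetical sketch).
class LruCache<V> {
  private readonly entries = new Map<string, V>();
  private readonly capacity: number;

  constructor(capacity: number) {
    this.capacity = capacity;
  }

  get(key: string): V | undefined {
    const value = this.entries.get(key);
    if (value === undefined) return undefined;
    // Re-insert so this key becomes the most recently used.
    this.entries.delete(key);
    this.entries.set(key, value);
    return value;
  }

  set(key: string, value: V): void {
    this.entries.delete(key);
    this.entries.set(key, value);
    if (this.entries.size > this.capacity) {
      // A Map iterates in insertion order, so the first key is the
      // least recently used one.
      const oldest = this.entries.keys().next().value;
      if (oldest !== undefined) this.entries.delete(oldest);
    }
  }

  has(key: string): boolean {
    return this.entries.has(key);
  }
}
```

A cache matching the size quoted above would be created with `new LruCache<string>(500)`, with the classifier result stored under each ASIN.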
The Amazon-pricePerUnit override step described below runs after the LLM classification, so a misclassification by the LLM can still be corrected by Amazon's own per-unit data when that data is in a trusted weight or volume unit.
PPU override sanity bounds
The override step rejects Amazon-supplied pricePerUnit values that imply a per-base-unit price below $0.001/oz or above $50/oz. Real consumer goods sit comfortably inside that range (cheap bulk rice ≈ $0.05/oz, premium supplements ≈ $5/oz, very high-end vitamins ≈ $20/oz). A value outside the bounds is almost always either a malformed display string or a seller-spoofed PPU (e.g., a listing claiming "$0.99 / ounce" when the bottle holds only 0.5 oz in total). Out-of-bounds overrides are silently rejected and the listing keeps its title-parser classification.
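The bounds check itself is simple arithmetic. The sketch below assumes the bounds are inclusive (the text does not say) and uses hypothetical names `isPlausiblePpu` and `baseUnitsPerUnit`.

```typescript
// Bounds quoted in the text, in dollars per base unit (oz or fl oz).
const MIN_PPU = 0.001;
const MAX_PPU = 50;

// `amount` is Amazon's price per one of its own units.
// `baseUnitsPerUnit` is how many base units that unit contains,
// e.g. 16 for "lb" (1 lb = 16 oz) or 1 for "ounce".
function isPlausiblePpu(amount: number, baseUnitsPerUnit: number): boolean {
  const perBaseUnit = amount / baseUnitsPerUnit;
  // Assumption: both bounds are inclusive.
  // isFinite also rejects NaN and division by zero.
  return isFinite(perBaseUnit) && perBaseUnit >= MIN_PPU && perBaseUnit <= MAX_PPU;
}
```

Under these assumptions, "$8.00 / lb" implies $0.50/oz and passes, while "$75.00 / ounce" is rejected and the listing keeps its title-parser classification.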
Override decision rule (exact)
The Amazon-pricePerUnit override is two binary conditions, AND-ed together. Pseudocode:
if (titleResult.unitType === "servings" OR
    (titleResult.unitType === "count" AND titleResult.unitCount <= 1))
   AND
   (amazonPpu.unit ∈ {oz, ounce, lb, pound, fl oz, fluid ounce,
                      ml, milliliter, l, liter, gal, gallon, g, gram, kg})
then promote to (weight | volume) using amazonPpu.amount as $/base-unit
else keep titleResult unchanged
There is no confidence score, no probability threshold, no weighting. The decision is a pure function of (titleResult.unitType, titleResult.unitCount, amazonPpu.unit). The full implementation, including the regex used to parse Amazon's display string and the conversion factors, is in server/aiUnitParser.ts on GitHub.
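For readers who prefer runnable code to pseudocode, the same rule can be written as a pure TypeScript function. This is a sketch, not the code in server/aiUnitParser.ts: the names `overrideFamily`, `TRUSTED_WEIGHT`, and `TRUSTED_VOLUME` are assumptions, and the volume list is simply the trusted units from the pseudocode that are not weight units.

```typescript
type UnitType = "weight" | "volume" | "count" | "servings";

interface TitleResult {
  unitType: UnitType;
  unitCount: number;
}

// Trusted units from the pseudocode above, split by family.
const TRUSTED_WEIGHT = ["oz", "ounce", "lb", "pound", "g", "gram", "kg"];
const TRUSTED_VOLUME = [
  "fl oz", "fluid ounce", "ml", "milliliter", "l", "liter", "gal", "gallon",
];

// Returns the family to promote into, or null to keep the title result.
function overrideFamily(
  title: TitleResult,
  amazonUnit: string | null
): "weight" | "volume" | null {
  // Condition 1: the title result is "weak".
  const weak =
    title.unitType === "servings" ||
    (title.unitType === "count" && title.unitCount <= 1);
  if (!weak || amazonUnit === null) return null;

  // Condition 2: Amazon's unit is a trusted weight or volume unit.
  const unit = amazonUnit.trim().toLowerCase();
  if (TRUSTED_WEIGHT.indexOf(unit) !== -1) return "weight";
  if (TRUSTED_VOLUME.indexOf(unit) !== -1) return "volume";
  return null;
}
```

The function depends only on the title's unit type, the title's unit count, and Amazon's unit string, which mirrors the "pure function" claim above.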
"Best Value" badge — what it does and doesn't mean
The "Best Value" badge is awarded to whichever listing has the lowest computed price-per-unit within the dominant unit family of the current search. It does not consider listings shown under "Other formats" because those use a different unit and would not be a fair comparison. The badge is therefore family-specific: a "Best Value" tagged 16-oz tub is the cheapest per-ounce in-stock weight-based listing, not "the absolute cheapest version of this product on Amazon" (which might exist in a different format we segregate out). When the dominant family contains fewer than 3 priced listings, the badge is suppressed entirely because the comparison set is too small to be meaningful.
Ranking is deterministic
Within the dominant unit family, listings are sorted strictly by computed price per base unit, ascending. The ranking function is totalPrice ÷ unitCount (in the family's base unit) — no secondary signal, no commission weighting, no merchant preference, no editorial promotion. The "Best Value" badge is awarded to whichever listing has the lowest computed price-per-unit at the moment of the search; if two listings tie, the order is whatever Amazon's PA-API returned. Outbound clicks carry an Amazon Associates partner tag and the operator earns a commission on qualifying purchases — that revenue does not enter the ranking math.
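A minimal sketch of this ranking, assuming a hypothetical `Listing` shape with `asin`, `totalPrice`, and `unitCount` fields, is shown below. It relies on `Array.prototype.sort` being stable (guaranteed since ES2019), so tied listings keep the order in which they were passed in. The badge helper applies the "fewer than 3 priced listings" suppression rule stated earlier on this page.

```typescript
// Hypothetical listing shape; unitCount is already in the family's base unit.
interface Listing {
  asin: string;
  totalPrice: number;
  unitCount: number;
}

// Sort ascending by totalPrice / unitCount with no secondary key.
function rankListings(listings: Listing[]): Listing[] {
  // Copy first so the caller's array (assumed to be in PA-API order)
  // is not mutated. The stable sort preserves that order for ties.
  return listings
    .slice()
    .sort((a, b) => a.totalPrice / a.unitCount - b.totalPrice / b.unitCount);
}

// Returns the badge winner, or null when the comparison set is too small.
function bestValue(ranked: Listing[]): Listing | null {
  const first = ranked[0];
  if (ranked.length < 3 || first === undefined) return null;
  return first;
}
```

Nothing other than price and quantity appears in the comparator, which is the property the paragraph above describes.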
Known limitations and failure modes
- Title parsing misses. When a product title doesn't include a quantity at all, the regex falls through to count = 1. The Amazon-pricePerUnit override catches the common case (Amazon supplies a per-ounce or per-fluid-ounce rate), but a title with no quantity and no Amazon PPU will appear with unitCount = 1 and an inflated apparent per-unit price. These listings often surface in "Other formats" rather than the main ranking.
- Marketing-inflated quantities. "Mega rolls", "double rolls", "family size", "value pack" — the parser treats these as the underlying unit (rolls, packs) without inflating the count. Two competing toilet-paper listings can therefore have the same per-roll price even when one delivers 2× the sheets per roll. Dealophant does not currently normalize for sheet-per-roll variance.
- Concentrate vs. ready-to-use. "Makes 32 gallons" claims on cleaners or fertilizers refer to diluted output, not container size — the parser explicitly strips those phrases before measuring, but a few wording variants slip through.
- Cross-currency or non-US listings are not handled; Dealophant only queries the US Amazon marketplace. Per-ASIN price history graphs reflect US-marketplace prices only — the 90-day low shown may be higher than a hypothetical global low because foreign Amazon marketplaces are never sampled.
- Stale prices between fetch and click. Amazon prices change every few minutes. The visible "fetched live" timestamp on each search result is the only reliable freshness anchor; always confirm on Amazon before purchasing.
What Dealophant does not do
- Quality, taste, or review-score ranking. Listings are ranked strictly by computed price per unit. The "Best Value" badge means cheapest per unit, not "best overall."
- Non-Amazon retailers. Only Amazon US listings are indexed.
- Affiliate-influenced ranking. Amazon Associates commission applies to outbound clicks but does not affect order — order is deterministic from the price math.
Worked example: Amazon pricePerUnit override
The override step is the most opaque part of the pipeline, so here is a concrete example. Suppose Amazon returns a listing with:
- Title: Truvani Organic Vegan Protein Powder | Chocolate | 20g Plant Based Pea Protein | 18 Servings | ...
- price.money.amount: 39.12
- price.pricePerUnit.displayAmount: "$1.86 / ounce"
The title parser sees "18 Servings" with no weight and assigns unitType=servings, unitCount=18. The override step parses Amazon's displayAmount with a strict regex (/\$?\s*([\d.]+)\s*\/\s*(?:(\d+(?:\.\d+)?)\s+)?([a-z][a-z\s]*?)\s*$/i) into {amount: 1.86, unit: "ounce", multiplier: 1}. Because "ounce" is in the trusted weight allowlist (oz, ounce, lb, pound, g, gram, kg) and the title classification is "weak" (servings), the listing is promoted to weight: unitCount = totalPrice ÷ amount = 39.12 ÷ 1.86 ≈ 21.03 oz, pricePerUnit = $1.86/oz. The override is blocked when the title parser was confident (count > 1, e.g., a 2-pack of wall chargers) or when Amazon's unit is in the count/wattage/sheets family rather than weight or volume.
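The arithmetic in this example can be reproduced with a short sketch. The regex is the expression quoted above with its escape backslashes written out; the names `parsePpu` and `ParsedPpu` are hypothetical, and the production parser may validate its input differently.

```typescript
// Display-string pattern from the text, with escapes written out.
const PPU_RE =
  /\$?\s*([\d.]+)\s*\/\s*(?:(\d+(?:\.\d+)?)\s+)?([a-z][a-z\s]*?)\s*$/i;

interface ParsedPpu {
  amount: number;     // dollars per (multiplier x unit)
  multiplier: number; // e.g. 100 in "$2.50 / 100 grams", otherwise 1
  unit: string;       // lower-cased unit name
}

function parsePpu(display: string): ParsedPpu | null {
  const m = PPU_RE.exec(display);
  if (m === null) return null;
  const amount = Number(m[1]);
  // Rejects strings like "1.8.6", which match [\d.]+ but are not numbers.
  if (!isFinite(amount) || amount <= 0) return null;
  return {
    amount,
    multiplier: m[2] === undefined ? 1 : Number(m[2]),
    unit: String(m[3]).toLowerCase(),
  };
}

// Worked example: total price 39.12 and "$1.86 / ounce".
const ppu = parsePpu("$1.86 / ounce");
const impliedOz = ppu === null ? null : 39.12 / ppu.amount;
```

Here `impliedOz` is 39.12 ÷ 1.86, which rounds to the 21.03 oz figure used above.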
Currency, marketplace, and availability
All prices are in US dollars (USD). Dealophant queries only the US Amazon marketplace at www.amazon.com — listings on amazon.ca, amazon.co.uk, or other regional Amazon marketplaces are not indexed, and the same ASIN can have a different price, currency, or availability outside the US. Users shopping from non-US Amazon storefronts should not rely on Dealophant rankings.
Listings without a buy-box price (out-of-stock, vendor-only, restricted, or where Amazon hasn't selected a primary seller) are excluded from results entirely; they don't appear in either the dominant-family ranking or the "Other formats" section. Price history for these ASINs is paused while they're unavailable, then resumes when the listing returns.
Public API for verification
The full search response is exposed at https://dealophant.com/api/products/search?query=<term> with no authentication. Each response includes a fetchedAt ISO8601 timestamp, the dominantFamily selection, and every per-product field used in the ranking — including the raw Amazon pricePerUnitDisplay string. Anyone can reproduce the per-unit math or audit the override decisions directly against this endpoint.
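An audit against this endpoint could start from something like the sketch below. Only the URL and the `query` parameter come from the text; the `AuditRow` shape and the helper names are assumptions made for illustration, so the actual response fields should be checked against a live response before relying on them.

```typescript
const SEARCH_ENDPOINT = "https://dealophant.com/api/products/search";

// Builds the request URL for a search term, URL-encoding the query.
function buildSearchUrl(term: string): string {
  return SEARCH_ENDPOINT + "?query=" + encodeURIComponent(term);
}

// Hypothetical per-product fields; the real field names may differ.
interface AuditRow {
  totalPrice: number;
  unitCount: number;
}

// Checks that rows are ordered by totalPrice / unitCount, ascending.
function isRankedAscending(rows: AuditRow[]): boolean {
  let previous = -Infinity;
  for (const row of rows) {
    const perUnit = row.totalPrice / row.unitCount;
    if (perUnit < previous) return false;
    previous = perUnit;
  }
  return true;
}
```

After fetching `buildSearchUrl("protein powder")` and mapping the dominant-family products into `AuditRow` objects, `isRankedAscending` should return true if the published ordering matches the per-unit math.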
Source code, including the title-parser regex, the LLM classifier prompt, the Amazon-pricePerUnit override logic, and the unit-conversion tables, is at github.com/thejdubb02/dealophant. The ranking function lives in server/aiUnitParser.ts.
Affiliate disclosure
As an Amazon Associate, Dealophant earns from qualifying purchases. Outbound product links carry the operator's Amazon Associates partner tag. Commission revenue does NOT enter the ranking math: the sort comparator does not have access to commission data, and the ranking function is purely totalPrice ÷ unitCount within the dominant family, ascending. This can be verified from the source code linked above.