Project frozen · 2026-05-26 snapshot

Hong Kong grocery price intelligence

Two million grocery prices. One honest wall.

My wife noticed no single Hong Kong shop carries everything she wants. I noticed foodpanda often charges more than the shop's own website. So I read the publicly listed price of nearly everything, eleven retailers deep, and let the numbers talk.

Then I parked it, on purpose. This whole dashboard runs inside your browser tab. No server. No database humming in a closet. Just SQL on Parquet, right here. Scroll down and poke it.

2,027,565 price observations
1,147 stores
11 retail sources
156,433 product families
110,508 barcodes

The convenience tax

Walk, or let foodpanda walk for you?

Same product. Same shop. Two prices. One for your feet, one for your couch. For every product we could match between a chain's own website and its foodpanda delivery page, we measured the gap. That gap is the channel premium: how much more the same item costs on foodpanda delivery than when bought direct from the retailer's own website, measured per matched product and then summarised per chain.

Bars = the average markup and the painful 90th-percentile markup, per chain. The couch is rarely free.
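A minimal sketch of how those two bars could be computed, assuming each matched product gives one direct price and one foodpanda price. The shop names, the prices, and the `summarise` helper are all made up for illustration; this is not the dashboard's actual query.

```python
from collections import defaultdict
from statistics import mean, quantiles

# Hypothetical matched rows: (chain, direct_price, foodpanda_price).
matched = [
    ("Shop A", 10.0, 12.0),
    ("Shop A", 20.0, 24.0),
    ("Shop A", 50.0, 60.0),
    ("Shop B", 10.0, 10.0),
    ("Shop B", 30.0, 30.0),
    ("Shop B", 8.0, 8.0),
]


def premium(direct, delivered):
    """Markup of the delivery price over the direct price, as a fraction."""
    return (delivered - direct) / direct


def summarise(rows):
    """Average and 90th-percentile premium per chain."""
    by_chain = defaultdict(list)
    for chain, direct, delivered in rows:
        if direct > 0:  # skip rows with no usable direct price
            by_chain[chain].append(premium(direct, delivered))
    summary = {}
    for chain, gaps in by_chain.items():
        # quantiles(n=10) returns nine cut points; the last is the 90th percentile.
        # It needs at least two matched products per chain.
        p90 = quantiles(gaps, n=10, method="inclusive")[-1]
        summary[chain] = {"avg": mean(gaps), "p90": p90, "items": len(gaps)}
    return summary


summary = summarise(matched)
```

Computing the premium per product first and summarising second keeps one expensive item from dominating a chain's number, which is what "measured per matched product, then summarised per chain" implies.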

Delivery markup by chain

foodpanda price vs chain-direct price, per matched product family

+20%
city'super adds a flat fifth on delivery. Same trolley, +20%, across ~3,007 matched items.
+0%
Marks & Spencer barely blinks. 97% of 2,660 items sit at exact parity online and delivered.
+13.2%
Wellcome's average. But the top tenth of items? +50%. Read the small print.

The shop scoreboard

Start here. Which chains lean on the couch tax, ranked by average markup. The shop is the story; the products are the evidence.

Shop | Avg markup | Worst 10% | Items dearer on delivery | Products compared

city'super and M&S bridge by exact product id (not name), so their rows are dead reliable: city'super tacks a flat fifth onto almost everything; M&S charges you the same whether you walk or wait.

…and the products that prove it

Biggest single-item gaps we found (direct shelf price vs foodpanda median, same retailer, matched products)

Product | Chain | Category | Direct | foodpanda | Markup

The hunt

Why you can't win the basket in one shop

It started with my wife. It's not that she can't find things in one place. It's that the cheapest version of each thing keeps landing in a different shop. I wanted to prove or kill that hunch. So I took six real items and checked the walk-in shelf price at two shops. Not delivery, not promotions. Just the price you'd pay standing in the aisle.

Watch the cheaper shop flip on every single line. And these two shops are owned by the same company.

One basket, two shops, the winner flips every row

real walk-in shelf prices · 2026-05-26 snapshot · Wellcome and Market Place by Jasons (both DFI)

Item | Wellcome | Market Place | Cheaper at
555 Fried Sardines 155g | $9.0 | $17.5 | Wellcome
Cobs Sea Salt Popcorn 80g | $45.0 | $25.0 | Market Place
Nestlé All-Purpose Cream 250ml | $20.0 | $37.5 | Wellcome
Walkers Shortbread 150g | $55.0 | $33.0 | Market Place
Oishi Crackers 90g | $12.0 | $21.0 | Wellcome
French Cellars Cabernet 250ml | $34.0 | $17.0 | Market Place
$175 whole basket at Wellcome
$151 whole basket at Market Place
$116 cheapest, if you split across both

Splitting the basket beats the cheaper single shop by 23%. And these are sister shops, same owner. Now picture doing it across real rivals — except they won't publish a barcode to let anyone compare (that's the wall at the bottom). So you can't optimise. You hunt by memory and luck. That isn't chaos in the data. That's the strategy working exactly as designed.
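The arithmetic behind those three totals, as a short sketch. The prices are the six rows of the basket table; the variable names are mine.

```python
# (Wellcome price, Market Place price) for each item in the basket table.
basket = {
    "555 Fried Sardines 155g": (9.0, 17.5),
    "Cobs Sea Salt Popcorn 80g": (45.0, 25.0),
    "Nestlé All-Purpose Cream 250ml": (20.0, 37.5),
    "Walkers Shortbread 150g": (55.0, 33.0),
    "Oishi Crackers 90g": (12.0, 21.0),
    "French Cellars Cabernet 250ml": (34.0, 17.0),
}

wellcome_total = sum(w for w, _ in basket.values())          # 175.0
market_place_total = sum(m for _, m in basket.values())      # 151.0
split_total = sum(min(w, m) for w, m in basket.values())     # 116.0

# Saving of the split basket against the cheaper single shop.
best_single = min(wellcome_total, market_place_total)
saving = (best_single - split_total) / best_single           # about 0.23

# Which shop wins each row.
winners = ["Wellcome" if w < m else "Market Place" for w, m in basket.values()]
```

The 23% is measured against Market Place, the cheaper of the two single-shop totals, not against Wellcome.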

Today's bait

real markdowns pulled from the snapshot — the hooks that get you through the door

Black Thunder Mini Choc 139g: $63 → $16
Campbell's Trolley Cart: $399 → $50
Wah Yuen Butter Egg Rolls 18pc: $129 → $35
Bonaqua Water 24×770ml: $192 → $59
Black Thunder Almond & Hazelnut: $63 → $16
Headache capsules 60pc (Mannings): $199 → $50

The evidence: every chain has a personality

x = price level vs market · y = how much prices wobble (CV, the coefficient of variation: standard deviation ÷ mean; low = steady, predictable prices, high = constant promotions and reversals, i.e. Hi-Lo) · bubble = SKUs compared

CV 2.30
Mannings, the wildest swings in town. You genuinely cannot guess if today's price is a gift or a gouge. So you check. Every time.
CV 0.47
Wellcome keeps you guessing too — steady on the surface, spiky underneath.
flat
A handful of shops barely move. Boring is a feature: you can trust the price without a spreadsheet.
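For the curious, CV is a one-liner. A minimal sketch for a single product's price history follows. The two price series are invented, and using the population standard deviation is my assumption; the page does not say which variant the charts use, or how per-product values roll up into a chain's number.

```python
from statistics import mean, pstdev


def coefficient_of_variation(prices):
    """Standard deviation divided by mean, for one product's observed prices."""
    m = mean(prices)
    if m == 0:
        raise ValueError("CV is undefined when the mean price is zero")
    return pstdev(prices) / m


# Hypothetical price histories.
steady = [9.9, 9.9, 9.9, 9.9]            # never moves
hi_lo = [20.0, 20.0, 10.0, 20.0, 10.0]   # full price, then half price, repeat

cv_steady = coefficient_of_variation(steady)
cv_hi_lo = coefficient_of_variation(hi_lo)
```

Dividing by the mean is what makes a $5 snack and a $500 bottle comparable on the same axis.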

Promo intensity

Who is always “on sale”?

A big red SALE sticker on half the shelf is a strategy, not a coincidence. Share of each chain's catalogue carrying a discount on snapshot day.

Share of catalogue on promotion

percent of priced SKUs marked down · snapshot 2026-05-26
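A sketch of that share, assuming each SKU carries a current price and an optional pre-discount list price. The field names `price` and `list_price` are hypothetical, not the snapshot's real column names.

```python
def promo_share(skus):
    """Fraction of priced SKUs whose current price sits below their list price."""
    priced = [s for s in skus if s.get("price") is not None]
    if not priced:
        return 0.0
    marked_down = sum(
        1
        for s in priced
        if s.get("list_price") is not None and s["price"] < s["list_price"]
    )
    return marked_down / len(priced)


# Hypothetical catalogue slice for one chain.
catalogue = [
    {"price": 16.0, "list_price": 63.0},   # on promotion
    {"price": 50.0, "list_price": 50.0},   # not discounted
    {"price": 12.0, "list_price": None},   # no list price recorded
    {"price": None, "list_price": 30.0},   # unpriced, excluded from the base
]

share = promo_share(catalogue)
```

SKUs with no price are left out of the denominator, matching "percent of priced SKUs" in the chart caption.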

The tell

Read the last digit, read the shop

A price ending in .90 is a small trick played on your eye: you read “$9 and change,” not “$10.” (Hermann Simon: the further right a digit sits, the less your brain weighs it.) So how a shop ends its prices is a quiet confession of who it's pretending to be. This is charm pricing: setting a price just under a round number ($9.90, not $10) so it feels cheaper. A round ending ($10.00) signals the opposite: quality, confidence, no gimmick. Hong Kong skips the Western .99 and splits between .90 and fully round.

Two camps fall out of the data, and they're exactly who you'd guess.

Playing cheap
Everything ends in .90 — Donki 87%, AEON 71%. The whole shop whispers “look how affordable.”
Pretending premium
Clean round numbers — city'super 92%, Wellcome 74%. The flex is refusing the trick at all.

How prices end, by chain

share of prices ending in .90 (deal-feel), round .00 (premium), and the rare .99
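Sorting prices into those three endings takes only a few lines. A sketch with made-up prices; the `ending` and `ending_shares` helpers are illustrative names of my own.

```python
from collections import Counter
from decimal import Decimal


def ending(price):
    """Classify a price by its cents: '.90', '.00', '.99', or 'other'."""
    # Decimal avoids float surprises such as 9.9 * 100 == 990.0000000000001.
    cents = int(Decimal(str(price)) * 100) % 100
    return {90: ".90", 0: ".00", 99: ".99"}.get(cents, "other")


def ending_shares(prices):
    """Share of prices falling in each ending bucket."""
    if not prices:
        return {}
    counts = Counter(ending(p) for p in prices)
    return {label: n / len(prices) for label, n in counts.items()}


# Hypothetical shelf.
shares = ending_shares(["9.90", "19.90", "10.00", "12.99", "7.50"])
```

Run per chain, the `.90` and `.00` shares are the two numbers the camps above are built on.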


Ghosts in the data

When a 2-litre bottle of water costs $9,999

Sift two million prices and you meet the ghosts. 26 listings priced at exactly HK$9,999. Nobody is paying ten thousand dollars for two litres of Kirin water. It's a placeholder — a price typed by a human who didn't want anyone to click buy.

HK$9,999
Kirin Japan Soft Water 2L · AEON
The lazy out-of-stock. Marking an item unavailable on these platforms is fiddly clicks; typing a number no human will ever pay is one. So the water “exists,” technically, at a price designed to be ignored. AEON does this 17 times; JHC Japan Home, 9.
HK$9,999
Georgia The Black coffee 500ml · AEON
Or it's a plain fat-finger: someone meant $19.90 and a zero ran away. Either way, a naive “average price” would swallow these whole and lie to you. It's exactly why the charts above lean on robust statistics instead of a plain mean: methods (median, modified z-score, Qn scale) that ignore a handful of absurd values instead of letting them drag the average. A single $9,999 ghost can't move a median.
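Here is the ghost problem in miniature: a sketch comparing the mean with the median and flagging outliers by modified z-score. The six prices are invented, and the 3.5 cutoff is the commonly quoted rule of thumb, not necessarily the threshold used for the charts.

```python
from statistics import mean, median


def modified_z_scores(values):
    """Modified z-score: 0.6745 * (x - median) / MAD, for each value."""
    med = median(values)
    mad = median(abs(v - med) for v in values)  # median absolute deviation
    if mad == 0:
        # More than half the values are identical; the score is undefined here.
        return [0.0 for _ in values]
    return [0.6745 * (v - med) / mad for v in values]


def ghosts(values, threshold=3.5):
    """Values whose modified z-score exceeds the threshold in absolute terms."""
    scores = modified_z_scores(values)
    return [v for v, z in zip(values, scores) if abs(z) > threshold]


# Hypothetical listings for one bottle of water, with one placeholder price.
prices = [9.9, 10.0, 10.5, 11.0, 9.5, 9999.0]

naive_average = mean(prices)    # dragged into the thousands by one row
robust_centre = median(prices)  # stays near $10
flagged = ghosts(prices)        # [9999.0]
```

The median and the MAD are both built from ranks, so one absurd row cannot shift either of them, which is why the ghost stands out instead of hiding inside an inflated standard deviation.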

Volatility by aisle

Where prices swing the most

Toilet paper: boring, everyone charges about the same. Hot pot and fresh seafood: a casino. The price spread within a category tells you where it's worth shopping around. (Price spread: how far apart the cheapest and most expensive store are for the same product, as a percentage. Big spread = shopping around actually pays.)

Median price spread by category

how much the same item varies store-to-store · higher = shop around
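A sketch of the spread calculation. Expressing the gap as a percentage of the cheapest price is my assumption, since the definition only says "as a percentage". The categories are borrowed from the text, but the products and prices are invented.

```python
from collections import defaultdict
from statistics import median


def spread_pct(store_prices):
    """Gap between dearest and cheapest store, as a percent of the cheapest."""
    lo, hi = min(store_prices), max(store_prices)
    return (hi - lo) / lo * 100


def median_spread_by_category(rows):
    """rows: (category, product, [price at each store]) -> {category: median spread}."""
    spreads = defaultdict(list)
    for category, _product, store_prices in rows:
        if len(store_prices) >= 2 and min(store_prices) > 0:
            spreads[category].append(spread_pct(store_prices))
    return {category: median(values) for category, values in spreads.items()}


# Hypothetical observations.
rows = [
    ("toilet paper", "product-1", [30.0, 30.0, 31.5]),
    ("toilet paper", "product-2", [40.0, 42.0]),
    ("fresh seafood", "product-3", [50.0, 100.0]),
    ("fresh seafood", "product-4", [80.0, 120.0]),
]

by_category = median_spread_by_category(rows)
```

Taking the median across products, rather than the mean, keeps one wildly mispriced item from making a whole aisle look like a casino.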

The whole haystack

Two million rows. In this tab. Right now.

The charts above ran on tidy little summaries. This runs on everything: 2,027,565 raw price observations, a single 33 MB Parquet file sitting on object storage. (Parquet is a columnar file format: it stores data by column instead of by row, which compresses far better and lets a reader grab just the columns and chunks it needs. Our 10 GB of raw listings became one 290 MB Parquet, then 33 MB once trimmed.) Your browser reaches in and grabs only the bytes it needs, using HTTP range requests: asking a server for just bytes 1,000–2,000 of a file instead of the whole thing. DuckDB uses the Parquet file's index to fetch only the chunks a query touches, so a query over 2M rows might download a few MB, not 33.

This is DuckDB-WASM doing real analytical SQL with nothing behind it: the DuckDB analytical database compiled to WebAssembly (~3.5 MB), running entirely in your browser. Real SQL, no backend, no API. This is the 2026 way to ship a data product with zero servers. Edit the query. Hit run. Watch a laptop-grade database chew through two million rows from a static file.

First run downloads the DuckDB engine and the file's index, then streams only what each query needs.
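To make "the file's index" concrete: a Parquet file ends with its metadata footer, then a 4-byte little-endian footer length, then the magic bytes `PAR1`. So a reader can find the index with two small range requests. The sketch below only does the byte arithmetic on an in-memory stand-in; it is not DuckDB's code, and the helper names are mine.

```python
import struct

MAGIC = b"PAR1"


def range_header(start, end):
    """Value of an HTTP Range header for bytes start..end, inclusive."""
    return f"bytes={start}-{end}"


def tail_range(file_size):
    """Byte range of the last 8 bytes: footer length (4) plus magic (4)."""
    return file_size - 8, file_size - 1


def footer_range(file_size, tail):
    """Given the 8 tail bytes, return the byte range holding the footer metadata."""
    if tail[4:] != MAGIC:
        raise ValueError("not a Parquet file: trailing magic bytes missing")
    (footer_len,) = struct.unpack("<I", tail[:4])
    start = file_size - 8 - footer_len
    return start, file_size - 9


# Stand-in for a remote file: header magic, fake data, fake 20-byte footer, tail.
fake_footer = b"m" * 20
fake_file = MAGIC + b"x" * 100 + fake_footer + struct.pack("<I", 20) + MAGIC

size = len(fake_file)
t_start, t_end = tail_range(size)
first_request = range_header(t_start, t_end)             # fetch the 8-byte tail
f_start, f_end = footer_range(size, fake_file[t_start:t_end + 1])
second_request = range_header(f_start, f_end)            # fetch the footer itself
```

Once the footer is in hand, it lists where each column chunk lives, so every later query becomes a handful of further range requests instead of a full download.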

So what do you actually do

There is no cheapest shop. There are cheapest jobs.

We went looking for the one shop to rule them all. It isn't there — the basket flips even between sister shops, and the barcodes that would let an app sort it out are hidden. So here's the only move that survives the data: stop hunting for the shop, and match the shop to the job.

The one that just plays fair: Don Don Donki

37,431 products
CV 0.11: prices barely move (not Hi-Lo)
87% end in .90 (cheap, openly)
$0 delivery gouge

You can't buy your whole list at Donki. But the household stuff, the snacks, the Japanese odds and ends — that's exactly its lane, and you can just order it on foodpanda. It has no separate website, so foodpanda is its shelf: there's no walk-in price to mark up against. Prices barely move, nearly everything ends in .90, and as far as the data can see it keeps its word. Loud about being cheap, and honest about it. (You still have to survive the store.)

Order without thinking

delivery price ≈ the shelf price, so tapping “deliver” costs you little to nothing

Shop | Why it's safe
Marks & Spencer | 97% of items at exact parity on delivery
PARKnSHOP | delivery runs about level with the shelf
AEON | prices barely move, so trust it without checking

Walk, or wait for the deal

these reward your feet and your patience, and punish autopilot

Shop | The catch
city'super | flat ~+20% on delivery; walk if you can
Wellcome | Hi-Lo; only buy what's on its rotation
Mannings | wildest swings in town; never pay full price

So my wife was right all along. No single shop wins, no app will fix it, and the reason it can't is sitting one section down: the shops hide the one number that would let anyone compare. Until that changes, the hunt is the job. At least now you've got the map.

Why it's frozen

The honest part: where I stopped

Comparing the same product across different companies needs a shared barcode (EAN / SKU: the 13-digit GS1 number on a product), the only bulletproof way to know two listings are literally the same item across different retailers. Most chains don't hand that out. So I matched by normalised name instead, using a family_id: a fingerprint built from a cleaned-up product name, so ‘Yakult 5x100ml’ and ‘Yakult LT 500ml [random delivery]’ collapse into one comparable family even without a barcode. That works, until you notice almost every cross-chain match is the same parent company wearing a different hat.
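A rough sketch of what a name fingerprint like that can look like. The cleaning rules here (lowercase, drop bracketed notes, fold multipacks into a total size, hash the result) are my guesses for illustration, not the project's actual rules. In particular, this version would not strip a token like "LT", so it does not reproduce the Yakult example exactly.

```python
import hashlib
import re


def normalise(name):
    """Reduce a listing name to a comparable core (illustrative rules only)."""
    s = name.lower()
    s = re.sub(r"\[[^\]]*\]", " ", s)  # drop bracketed notes like [random delivery]

    def total_size(match):
        # Fold "5x100ml" or "5 x 100 ml" into the total, "500ml".
        count, size, unit = int(match.group(1)), int(match.group(2)), match.group(3)
        return f"{count * size}{unit}"

    s = re.sub(r"(\d+)\s*x\s*(\d+)\s*(ml|g)\b", total_size, s)
    s = re.sub(r"(\d+)\s+(ml|g)\b", r"\1\2", s)  # "500 ml" -> "500ml"
    s = re.sub(r"[^a-z0-9]+", " ", s)             # punctuation to spaces
    return " ".join(s.split())


def family_id(name):
    """Short, stable fingerprint of the normalised name."""
    return hashlib.sha1(normalise(name).encode("utf-8")).hexdigest()[:12]


same = family_id("Yakult 5x100ml") == family_id("YAKULT 500 ML [random delivery]")
```

Every extra rule merges more listings and risks merging the wrong ones, which is part of why name matching is a fallback and not a substitute for a barcode.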

The match funnel

families that survive each honesty filter

All product families | 156,433
Appear at 2+ chains (by name) | 14,325
Genuinely cross-company (3+ rivals) | 131

Wellcome, Mannings and Market Place by Jasons are all DFI. PARKnSHOP and Watsons are both AS Watson. Strip the same-parent banners out and a 14,325-family mountain becomes a 131-family hill. To climb past it you need the loyalty-app price feeds (yuu, MoneyBack). That's a different, much bigger project. I chose to stop. Knowing where the diminishing returns start is the actual skill.
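The same-parent filter can be sketched in a few lines. The parent mapping comes from the paragraph above. Reading "3+ rivals" as "at least three distinct parent companies" is my interpretation, and the example families are invented.

```python
# Banner -> parent company, as stated in the text.
PARENT = {
    "Wellcome": "DFI",
    "Mannings": "DFI",
    "Market Place by Jasons": "DFI",
    "PARKnSHOP": "AS Watson",
    "Watsons": "AS Watson",
}


def parent_companies(chains):
    """Distinct owners behind a set of banners; unmapped chains count as their own parent."""
    return {PARENT.get(chain, chain) for chain in chains}


def is_cross_company(chains, min_rivals=3):
    """True when the family is sold by at least `min_rivals` distinct owners (assumed reading)."""
    return len(parent_companies(chains)) >= min_rivals


# Hypothetical families and the chains where each one was seen.
families = {
    "family-a": ["Wellcome", "Mannings", "Market Place by Jasons"],  # one owner
    "family-b": ["Wellcome", "PARKnSHOP"],                           # two owners
    "family-c": ["Wellcome", "PARKnSHOP", "AEON"],                   # three owners
}

survivors = [fid for fid, chains in families.items() if is_cross_company(chains)]
```

A family seen at three banners can still have a single owner, which is how a match count shrinks so sharply once ownership is taken into account.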

The local model earned its keep

messy merchant categories, mapped once, on my own laptop

Thousands of junk category strings (“$10 Flash Sale”, “3:15 PM Tea Break”) refused to map. Instead of paying an API per row, qwen2.5:7b on Ollama classified each distinct string exactly once into a durable map (632 of them), then never ran again. A second local model, bge-m3, matched Chinese names to their English twins where barcodes were missing. Two million rows, normalised privately, for the price of electricity.
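The "classify once, never again" pattern is worth showing, because it is what keeps the model cheap. A sketch, with a stub standing in for the local model call: `stub_classifier`, the JSON file name, and the category labels are all placeholders, and no real Ollama API is shown here.

```python
import json
import tempfile
from pathlib import Path


def classify_once(strings, classify, map_path):
    """Classify each distinct string at most once, persisting results to a JSON map."""
    path = Path(map_path)
    mapping = json.loads(path.read_text(encoding="utf-8")) if path.exists() else {}
    for raw in sorted(set(strings)):
        if raw not in mapping:          # only new strings reach the model
            mapping[raw] = classify(raw)
    path.write_text(json.dumps(mapping, ensure_ascii=False, indent=2), encoding="utf-8")
    return mapping


def stub_classifier(raw):
    """Placeholder for the local model; a real version would prompt it here."""
    return "promotion" if "sale" in raw.lower() else "unknown"


with tempfile.TemporaryDirectory() as tmp:
    map_file = Path(tmp) / "category_map.json"
    rows = ["$10 Flash Sale", "3:15 PM Tea Break", "$10 Flash Sale"]
    first = classify_once(rows, stub_classifier, map_file)
    # A second pass finds everything in the map and calls the classifier zero times.
    second = classify_once(rows, stub_classifier, map_file)
```

Deduplicating first is the whole trick: the model's cost scales with the number of distinct strings, not with the number of rows that carry them.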
