Back in 2023 I rebuilt the 2000s “Visitors:” counter with a Cloudflare Worker and KV. That was a deliberate play test, a fun way to kick the tyres on the Workers + KV read/write pattern, and I was happy with it. I knew it was approximate; accuracy was never the point. What I hadn’t done until this week was measure how approximate, and build something that actually settles to the truth.
This is that rewrite: same modal bottom-left, but the number behind it now comes from edge analytics and converges overnight.
Try it: the count is bottom-left of every page on this site.
What the original counted, and what it skipped
The 2023 version is a client-side beacon. 111.js fires a fetch('/stats' + pathname) on DOMContentLoaded, and the Worker does get → +1 → put in Workers KV. A view only lands if the visitor runs JS, the request isn’t blocked, and the page reaches load.
I always knew that under-counted — a beacon needs the visitor to cooperate, and my readers are exactly the ad-blocking, no-JS-by-default crowd. For a toy that’s fine. But I finally pulled the real edge numbers to see the size of the gap:
| Page | Old beacon recorded | Real edge page loads |
|---|---|---|
| / (homepage) | ~22/day | 151/day |
The beacon was catching roughly 15% of actual loads — off by about 7×. Expected, but bigger than I’d have guessed.
There’s a second limit baked into the design: get → +1 → put isn’t atomic, and KV caps writes near one per second per key. Two hits on the same page in the same second race, and one increment disappears. Fine for a counter that’s just for fun; a dead end if you ever want it to be correct.
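A minimal sketch of why that read-modify-write loses increments, using an in-memory stand-in for KV. FakeKV, hit and raceDemo are hypothetical names for illustration, not the Worker's actual code. Two overlapping hits both read the old value before either one writes:

```typescript
// In-memory stand-in for a KV namespace (hypothetical; real KV is remote).
class FakeKV {
  private store = new Map<string, string>();
  async get(key: string): Promise<string | null> {
    return this.store.get(key) ?? null;
  }
  async put(key: string, value: string): Promise<void> {
    this.store.set(key, value);
  }
}

// The 2023 pattern: get -> +1 -> put, with nothing making the three steps atomic.
async function hit(kv: FakeKV, key: string): Promise<void> {
  const current = Number((await kv.get(key)) ?? "0");
  await kv.put(key, String(current + 1));
}

// Two hits in flight at once: both read 0, both write 1.
async function raceDemo(): Promise<number> {
  const kv = new FakeKV();
  const key = "pageviews:/stats/posts/foo/";
  await Promise.all([hit(kv, key), hit(kv, key)]);
  return Number(await kv.get(key)); // 1, not 2: one increment is lost
}
```

Running the two hits one after the other (awaiting the first before starting the second) gives 2; the loss only appears when they overlap.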
Most of the “traffic” is bots
Grouping the edge data by Cloudflare’s botManagementDecision reframed the whole thing. One day of blog HTML traffic:
| Verdict | Hits/day |
|---|---|
| likely_automated | 195 |
| automated | 110 |
| likely_human | 93 |
| verified_bot (Googlebot, GPTBot…) | 90 |
| other (no score) | 30 |
About 76% bots. So the honest “real readers” figure is ~120/day, not the ~550 raw HTML loads, and nowhere near what the beacon saw. Ironically, the beacon was an accidental bot filter (bots rarely run JS); it just threw out most humans to get there.
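The arithmetic behind those figures, as a small sketch over the rows of the table. Treating likely_human and other as readers and the remaining three verdicts as bots is an assumption that matches the ~76% and ~120/day figures; the names are illustrative.

```typescript
// One day of blog HTML traffic, copied from the table.
const hitsPerDay: Record<string, number> = {
  likely_automated: 195,
  automated: 110,
  likely_human: 93,
  verified_bot: 90,
  other: 30,
};

// Verdicts counted as bots in this sketch (assumption, see lead-in).
const BOT_VERDICTS = new Set(["likely_automated", "automated", "verified_bot"]);

function summarize(hits: Record<string, number>) {
  let total = 0;
  let bots = 0;
  for (const verdict of Object.keys(hits)) {
    total += hits[verdict];
    if (BOT_VERDICTS.has(verdict)) bots += hits[verdict];
  }
  return { total, bots, humans: total - bots, botShare: bots / total };
}

const trafficSummary = summarize(hitsPerDay);
// total 518, bots 395, humans 123, botShare about 0.76
```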
The blog also shares its Cloudflare zone with other services I run, so step one for any honest count is a hostname filter: keep the blog’s apex host, ignore everything else on the zone.
The model: two numbers, eventually consistent
I didn’t want to lose the part I like — the count ticking up the moment a page loads. But I wanted it to be true. Those goals pull against each other: client-side counting is instant but lossy; edge counting is complete but only available after the fact.
So I keep both. Each page stores two values in KV:
{ "reconciled": 10219, "live_today": 7 }
- live_today: the optimistic tick. POST /stats/<path> on load bumps it. Instant, satisfying, slightly low.
- reconciled: edge truth through end of yesterday. Only the nightly cron writes it.
Displayed count is reconciled + live_today.
The detail that makes this safe: the edge count is a superset of the beacon count. Every visitor whose JS fired also made an HTML request the edge already logged. So I never add the two for the same day: live_today is a preview, and each night the cron replaces it with the real number for that day. No double counting, because the day’s optimistic tick is discarded and rebuilt from edge truth.
That’s eventual consistency. During the day the number is a fast, slightly-low estimate; overnight it converges to the correct value — which is higher, because it recovers every blocked and no-JS reader the beacon missed. It only ever jumps up, so a visitor never watches it count backwards.
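The model can be sketched as three small pure functions. The function names and the edge figure of 19 are made up for illustration; only the two field names and the reconciled + live_today rule come from the text.

```typescript
// The per-page value stored in KV.
interface PageCount {
  reconciled: number; // edge truth through end of yesterday
  live_today: number; // optimistic beacon ticks since the last reconcile
}

// What the modal shows.
function displayed(c: PageCount): number {
  return c.reconciled + c.live_today;
}

// Live path: one beacon hit.
function tick(c: PageCount): PageCount {
  return { reconciled: c.reconciled, live_today: c.live_today + 1 };
}

// Nightly path: discard the preview and fold in the edge count for that day.
function reconcile(c: PageCount, edgeHumanViews: number): PageCount {
  return { reconciled: c.reconciled + edgeHumanViews, live_today: 0 };
}

let pageCount: PageCount = { reconciled: 10219, live_today: 0 };
for (let i = 0; i < 7; i++) pageCount = tick(pageCount); // 7 beacon hits during the day
const shownBefore = displayed(pageCount); // 10226
pageCount = reconcile(pageCount, 19); // edge saw 19 human views (made-up figure)
const shownAfter = displayed(pageCount); // 10238, not 10245: the 7 ticks are not added again
```

Because the edge count for a day is at least the beacon count for that day, the reconciled figure is never lower than the preview it replaces.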
How it fits together
Two write paths into the same KV value — one live, one nightly:
The live path serves every request. The scheduled path runs once a night and corrects the record.
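A rough skeleton of one Worker with both paths, as a sketch only. The binding name COUNTS, the KVLike stand-in type, the JSON response shape and the edgeHumanViews placeholder are all assumptions for illustration; the real Worker may differ.

```typescript
// Minimal stand-in for the KV binding type, so the sketch type-checks on its own.
interface KVLike {
  get(key: string): Promise<string | null>;
  put(key: string, value: string): Promise<void>;
}
interface Env {
  COUNTS: KVLike; // binding name is an assumption
}
interface PageCount {
  reconciled: number;
  live_today: number;
}

async function load(kv: KVLike, key: string): Promise<PageCount> {
  const raw = await kv.get(key);
  return raw ? (JSON.parse(raw) as PageCount) : { reconciled: 0, live_today: 0 };
}

// Placeholder for the analytics query: KV key -> human views for the day being reconciled.
async function edgeHumanViews(): Promise<Map<string, number>> {
  return new Map();
}

const worker = {
  // Live path: every request reads the count; a POST also bumps live_today.
  async fetch(request: Request, env: Env): Promise<Response> {
    const key = "pageviews:" + new URL(request.url).pathname;
    const count = await load(env.COUNTS, key);
    if (request.method === "POST") {
      count.live_today += 1;
      await env.COUNTS.put(key, JSON.stringify(count));
    }
    const body = JSON.stringify({ count: count.reconciled + count.live_today });
    return new Response(body, { headers: { "content-type": "application/json" } });
  },

  // Scheduled path: fold edge truth into reconciled and reset the preview.
  async scheduled(_event: unknown, env: Env): Promise<void> {
    for (const [key, views] of Array.from(await edgeHumanViews())) {
      const count = await load(env.COUNTS, key);
      const next: PageCount = { reconciled: count.reconciled + views, live_today: 0 };
      await env.COUNTS.put(key, JSON.stringify(next));
    }
  },
};
```

In a deployed Worker this object would be the module's default export, with the cron trigger invoking scheduled.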
A day in the life of the counter
- 00:30Z: cron reconciles yesterday. reconciled steps up to edge truth; live_today resets to 0.
- Through the day: each human view bumps live_today. The display tracks it live, sitting a little below reality (blocked/no-JS readers aren’t in live_today).
- Next 00:30Z: the cron folds that day’s real human count in from the edge, and the display snaps up to the true figure.
The reconcile, step by step
A cron trigger (30 0 * * * UTC) runs a scheduled() handler in the same Worker. It queries the Cloudflare GraphQL Analytics API: the same data the dashboard draws from, counted at the edge, immune to ad blockers and JS.
The query does the heavy lifting:
httpRequestsAdaptiveGroups(
filter: {
datetime_geq: $start, datetime_lt: $end,
clientRequestHTTPHost: "jamieede.com", # blog apex only
edgeResponseStatus: 200,
edgeResponseContentTypeName: "html", # a page, not a .js/.css/.json
requestSource: "eyeball" # not a Worker subrequest
}
limit: 5000
orderBy: [count_DESC]
) {
count
avg { sampleInterval }
dimensions { clientRequestPath botManagementDecision }
}
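For context, a hedged sketch of how that selection might be sent. It assumes the usual viewer → zones wrapper, the variable types string and Time, and the endpoint https://api.cloudflare.com/client/v4/graphql with a bearer token; buildAnalyticsRequest, apiToken and zoneTag are hypothetical names, so check them against the API's schema before relying on this.

```typescript
// The selection from the post, wrapped in a full query (wrapper and variable types are assumptions).
const PAGE_VIEWS_QUERY = `
query PageViews($zoneTag: string, $start: Time, $end: Time) {
  viewer {
    zones(filter: { zoneTag: $zoneTag }) {
      httpRequestsAdaptiveGroups(
        filter: {
          datetime_geq: $start, datetime_lt: $end,
          clientRequestHTTPHost: "jamieede.com",
          edgeResponseStatus: 200,
          edgeResponseContentTypeName: "html",
          requestSource: "eyeball"
        }
        limit: 5000
        orderBy: [count_DESC]
      ) {
        count
        avg { sampleInterval }
        dimensions { clientRequestPath botManagementDecision }
      }
    }
  }
}`;

interface AnalyticsRequest {
  url: string;
  method: "POST";
  headers: Record<string, string>;
  body: string;
}

// Builds the request without sending it, so it can be inspected or passed to fetch().
function buildAnalyticsRequest(
  apiToken: string,
  zoneTag: string,
  start: string, // ISO datetime, inclusive
  end: string, // ISO datetime, exclusive
): AnalyticsRequest {
  return {
    url: "https://api.cloudflare.com/client/v4/graphql",
    method: "POST",
    headers: {
      Authorization: "Bearer " + apiToken,
      "Content-Type": "application/json",
    },
    body: JSON.stringify({ query: PAGE_VIEWS_QUERY, variables: { zoneTag, start, end } }),
  };
}
```

Sending it would be something like fetch(r.url, { method: r.method, headers: r.headers, body: r.body }) inside the scheduled handler.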
Then the handler:
- Reads a single meta key, _meta:lastReconciledDate.
- Computes the range (lastReconciledDate, yesterday] and clamps the start to the 32-day retention window.
- Runs the query, and per path sums count × sampleInterval over the human verdicts only (human, likely_human, other), skipping /cdn-cgi/.
- For each path: reconciled += humanViews, live_today = 0.
- Advances _meta:lastReconciledDate to yesterday. This happens last, so a failure mid-run leaves it un-advanced.
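The summing step can be sketched as follows. The Row type mirrors the query's selection set; humanViewsByPath is a hypothetical name, and rounding each estimate to a whole number is an assumption of this sketch rather than something the post specifies.

```typescript
// One row of the query result, shaped like the selection set in the query.
interface Row {
  count: number;
  avg: { sampleInterval: number };
  dimensions: { clientRequestPath: string; botManagementDecision: string };
}

const HUMAN_VERDICTS = new Set(["human", "likely_human", "other"]);

// Estimated human views per path: count × sampleInterval, with bots and /cdn-cgi/ dropped.
function humanViewsByPath(rows: Row[]): Map<string, number> {
  const totals = new Map<string, number>();
  for (const row of rows) {
    const path = row.dimensions.clientRequestPath;
    const verdict = row.dimensions.botManagementDecision;
    if (!HUMAN_VERDICTS.has(verdict)) continue;
    if (path.indexOf("/cdn-cgi/") === 0) continue;
    const estimated = Math.round(row.count * row.avg.sampleInterval); // rounding is an assumption
    totals.set(path, (totals.get(path) ?? 0) + estimated);
  }
  return totals;
}

// Made-up rows: only the first two count towards /posts/foo/.
const sampleRows: Row[] = [
  { count: 10, avg: { sampleInterval: 1 }, dimensions: { clientRequestPath: "/posts/foo/", botManagementDecision: "likely_human" } },
  { count: 2, avg: { sampleInterval: 1 }, dimensions: { clientRequestPath: "/posts/foo/", botManagementDecision: "other" } },
  { count: 50, avg: { sampleInterval: 1 }, dimensions: { clientRequestPath: "/posts/foo/", botManagementDecision: "automated" } },
  { count: 5, avg: { sampleInterval: 1 }, dimensions: { clientRequestPath: "/cdn-cgi/trace", botManagementDecision: "human" } },
];
const humanTotals = humanViewsByPath(sampleRows); // Map { "/posts/foo/" => 12 }
```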
That sampleInterval multiply matters in theory and not in practice. The adaptive dataset is sampled, but at my volume it reports ~1.00–1.05 — effectively exact. It only earns its keep if I ever get popular.
Why the key mapping matters
The live path keys by the request path (which includes the /stats route prefix). Analytics returns the bare page path. They have to land on the same KV key or the counts split. So the cron prepends /stats before writing:
Both paths converge on pageviews:/stats/posts/foo/, which is also the existing key format — so the switch preserved every count instead of resetting to zero.
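A minimal sketch of that mapping. The two function names are hypothetical; the pageviews: prefix and the /stats route prefix are taken from the key shown above.

```typescript
const KEY_PREFIX = "pageviews:";
const ROUTE_PREFIX = "/stats";

// Live path: the Worker sees the full request path, /stats prefix included.
function keyFromRequestPath(requestPath: string): string {
  return KEY_PREFIX + requestPath;
}

// Nightly path: analytics reports the bare page path, so /stats is prepended first.
function keyFromAnalyticsPath(pagePath: string): string {
  return KEY_PREFIX + ROUTE_PREFIX + pagePath;
}

const liveKey = keyFromRequestPath("/stats/posts/foo/");
const cronKey = keyFromAnalyticsPath("/posts/foo/");
// Both are "pageviews:/stats/posts/foo/"
```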
Idempotency and self-healing
The meta key makes re-runs harmless. If the cron fires when it’s already caught up, it sees last == yesterday and no-ops with zero writes. If a night is missed, the next run reconciles the whole range since the last success, so it heals itself (within the 32-day window). Triggered twice locally, the second run proves it:
{"msg":"reconcile first-run init","lastReconciledDate":"2026-06-13"}
{"msg":"reconcile up-to-date","last":"2026-06-13"}
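The no-op and catch-up behaviour can be sketched as a small planning function. planReconcile and addDays are hypothetical names, dates are UTC calendar days, and the exact retention boundary (today minus 32 days) is an assumption; a real handler would turn start and end into the datetimes the query expects.

```typescript
const RETENTION_DAYS = 32;
const DAY_MS = 24 * 60 * 60 * 1000;

// Shift a "YYYY-MM-DD" UTC day by n days.
function addDays(day: string, n: number): string {
  return new Date(Date.parse(day + "T00:00:00Z") + n * DAY_MS).toISOString().slice(0, 10);
}

type Plan =
  | { kind: "up-to-date" }
  | { kind: "reconcile"; start: string; end: string; gap: boolean };

// Half-open window [start, end) covering the days in (last, yesterday].
function planReconcile(last: string, today: string): Plan {
  const yesterday = addDays(today, -1);
  if (last >= yesterday) return { kind: "up-to-date" }; // ISO days compare correctly as strings
  const wanted = addDays(last, 1); // first day not yet reconciled
  const oldest = addDays(today, -RETENTION_DAYS); // oldest day assumed still retained
  const gap = wanted < oldest; // some days fell out of retention
  return { kind: "reconcile", start: gap ? oldest : wanted, end: today, gap };
}

const caughtUp = planReconcile("2026-06-13", "2026-06-14"); // { kind: "up-to-date" }
const missedNight = planReconcile("2026-06-11", "2026-06-14"); // start 2026-06-12, end 2026-06-14, gap false
const deadForMonths = planReconcile("2026-04-01", "2026-06-14"); // start 2026-05-13, gap true
```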
Honest callouts
Retention is 32 days. Per-path data only lives in the sampled *AdaptiveGroups dataset, retained 32 days on my Business plan. The exact daily rollup goes back a year but carries no path dimension. The cron runs nightly and only reads “yesterday”, so there’s plenty of slack — unless it’s dead for a month, in which case it logs a gap warning and accepts the small undercount.
Bot scoring is partial. The numeric botScore and jsDetectionPassed are Enterprise-only (blocked on my plan). The categorical botManagementDecision is populated on Business, and that’s enough to bin the crawlers.
One rollover artifact. A visitor in the ~30 minutes between midnight UTC and the cron may see their live_today reset before that day is reconciled. It returns the next night from edge truth. Fine for a counter in the corner of a page.
No reset on switchover. reconciled was seeded from the old counts, so nothing dropped to zero — the homepage kept its 10,219 and grows accurately from here.
Why bother upgrading a toy?
The 2023 build did exactly what I wanted at the time: a nostalgic number that goes up, and a clean excuse to play with Workers and KV. Nothing wrong with it. The only thing it couldn’t be was accurate — a beacon can’t see the visitors who never run it.
Making it honest meant moving the source of truth off the client and onto the edge, where every request is already counted whether the visitor cooperates or not. Same modal, same corner of the page — the number just means something now.
Read the original: Using Cloudflare Workers for a page views modal