About — Guardian Angles

The idea

Earlier this year someone sent me a thread by Ted Alcorn, a US journalist who'd built a tool called Below the Fold. It lets you rummage through 25 years of New York Times coverage — reporters, topics, sections, headlines — and watch different strands of the paper's journalism rise and fall over time. It's the kind of tool that sharpens what you instinctively feel about a publication into something you can actually look at, and it spread fairly quickly.

I've been playing with more expansive vibe coding recently and wanted specifically to play with the Guardian's API, so decided to scratch that itch by putting a spin on Ted's fantastic work. The version of this I had in my head was a bit different — no bylines, more weight given to tags and sections, probably a bit slower-paced. Guardian Angles is the result.

How it's made

A note that probably matters: I'm not a programmer. I haven't written a Python script from scratch, I find the command line weirdly intimidating, and although I can now just about explain what a gzipped JSON shard is, I couldn't have a month ago.

Everything here was built through the Claude Code desktop app with me describing what I wanted and letting Claude do the rest. Conversations that went roughly: "could we show what the Guardian was publishing about this exact week, in every other year?" or "the month pill at the top of the chart is getting clipped on desktop — can we look at that?"

How it was built, more technically

An hourly GitHub Action fetches headlines from the Guardian's Open Platform API into compact gzipped JSON shards, one per month. A second step aggregates those into search-ready indexes at monthly, weekly and daily granularity, plus a 3,000-entry tag catalogue. The frontend is a set of vanilla JS modules — no framework, no database, no server — with canvas-rendered charts, the occasional Pretext typography effect, and the Guardian's colour palette throughout. The whole thing is a few megabytes of static files served from GitHub Pages.
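
For the curious, here is a minimal Python sketch of that two-step shape: pack a month of headlines into a gzipped JSON shard, then aggregate shards into a monthly tag index. The field names (`date`, `headline`, `tags`) and the function names are assumptions made for illustration, not the repo's actual schema or code.

```python
import gzip
import json
from collections import Counter


def encode_shard(articles):
    """Pack one month's articles as gzipped JSON bytes (hypothetical schema)."""
    return gzip.compress(json.dumps(articles).encode("utf-8"))


def decode_shard(blob):
    """Inverse of encode_shard."""
    return json.loads(gzip.decompress(blob).decode("utf-8"))


def monthly_tag_index(shards):
    """Aggregate shards into {"YYYY-MM": Counter({tag: article_count})}."""
    index = {}
    for blob in shards:
        for article in decode_shard(blob):
            month = article["date"][:7]  # "2024-03-15" -> "2024-03"
            index.setdefault(month, Counter()).update(article["tags"])
    return index
```

Because the index is derived purely from the shards, it can be thrown away and rebuilt at any time.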

It's open source. Everything that isn't an API key is in the repo.

How it developed

A rough chronological log of what went in when. The newest changes are at the top.

Every view is now a link

Trends, Deep dive and This Week have always remembered themselves in the address bar — share the URL and the other person sees what you see. Topics and Newsroom didn't, and quietly reset on every visit. Now they hold their state too: the Topics table keeps your name filter, section chip, year range and sort; Newsroom keeps the absolute-or-proportional toggle, the year range, any section you've solo'd or hidden, and even an open month drilldown — send someone newsroom.html?solo=film&month=2018-06 and they land on June 2018's film desk, list of biggest tags and all. Defaults stay out of the URL, so nothing changes unless you've actually touched something.
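
The "defaults stay out of the URL" rule can be sketched in a few lines. This is a Python illustration of the idea rather than the site's own code (the frontend is JavaScript), and the parameter names and default values in `DEFAULTS` are assumptions, apart from `solo` and `month`, which appear in the example link above.

```python
from urllib.parse import parse_qsl, urlencode

# Hypothetical defaults for the Newsroom view; only non-default values are serialised.
DEFAULTS = {"view": "absolute", "from": "2012", "to": "2026", "solo": "", "month": ""}


def state_to_query(state, defaults=DEFAULTS):
    """Serialise only the settings that differ from their defaults."""
    changed = [
        (key, state[key])
        for key in defaults
        if state.get(key, defaults[key]) != defaults[key]
    ]
    return urlencode(changed)


def query_to_state(query, defaults=DEFAULTS):
    """Rebuild the full state from a query string, filling gaps with defaults."""
    state = dict(defaults)
    state.update((key, value) for key, value in parse_qsl(query) if key in defaults)
    return state
```

An untouched view serialises to an empty string, so the address bar only changes once something has actually been adjusted.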

The views start talking to each other

An odd thing about the site until today: five views, and the only route between them was the navigation bar. You could be looking at Keir Starmer's decade on the Trends chart with no way to say "now show me everything about him" short of retyping the search on another tab. Fixed with a family of small, quiet links. Every compared term on Trends now carries a "Deep dive →" under its stats; every row of the Topics table gets one on its slug line; the story of the week's hero card has one; and when you click a chart peak and the dispatches open, a link takes the same topic into Deep dive scoped to that year. Nothing new to learn — just dotted-underline escape hatches where your eye already is, each carrying your year range along with it.

The repo goes on a diet

A few months ago I learned what a gzipped JSON shard is; this week I learned what happens if you commit nine megabytes of them to git every hour for months. Each hourly build had been committing not just any new headlines but the entire set of search indexes rebuilt from them — files you can regenerate at any moment from what's already there — and git, which never forgets anything, had quietly grown the repository to 4.7 gigabytes, close to the point where GitHub starts sending concerned emails. Two changes. The indexes are now built fresh on every run and shipped straight to the site without ever touching the repository's history. And the current month's data — the one file that genuinely changes every hour — travels the same way, only joining the permanent archive once the month closes. We also caught a quiet accomplice: every refetch stamped a new timestamp into files whose actual content hadn't changed, so even nothing-happened runs looked like changes worth recording. The timestamps are gone, identical content now produces identical bytes, and history should grow by about a megabyte a month rather than ten an hour. The site looks exactly the same and updates exactly as often.

Postscript, later the same day: with the bleeding stopped, we went after the existing 4.7 gigabytes too. Nobody but me had ever really cloned the repository, so there was nothing to break. A tool called git-filter-repo stripped the data files out of every old commit — at which point the 473 hourly bot commits, now changing nothing, simply ceased to exist — and the headline archive was re-landed on top as a single commit. The 115 commits of actual development survive untouched, the files at the end are verified byte-for-byte identical to what was there before, and the repository is 128 megabytes. The code history, it turns out, was 700 kilobytes of the 4.7 gigabytes all along.
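
On "identical content now produces identical bytes": one common way a timestamp sneaks into a gzipped file is the modification-time field in the gzip header itself. Whether that was the culprit here is an assumption; the sketch below just shows how to get byte-stable output in Python by pinning that field and serialising the JSON in a fixed key order. The function name is hypothetical.

```python
import gzip
import json


def stable_shard_bytes(articles):
    """Serialise articles so that equal content always yields equal bytes."""
    payload = json.dumps(
        articles,
        sort_keys=True,          # key order no longer depends on insertion order
        separators=(",", ":"),   # no incidental whitespace
        ensure_ascii=False,
    )
    # mtime=0 pins the timestamp field in the gzip header (Python 3.8+).
    return gzip.compress(payload.encode("utf-8"), mtime=0)
```

With output like this, a refetch that finds nothing new produces a file git sees as unchanged.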

Chasing the mobile snap

For a few days Trends on Chrome mobile had been crashing the tab on me when I compared words rather than tags — a long pause after hitting Compare, the page reloading with blank fields, then an immediate snap. Tag comparisons were fine. Took a couple of passes to find the actual culprit. Tags always hit a pre-built index so they're a direct array lookup. Words fall back to scanning every monthly shard if the term isn't in the top 5,000 — and our scan was loading all 163 monthly shards simultaneously, around 500MB of parsed JSON on a phone with roughly 400MB of tab memory. The rising-tags panel in the reading pane was also eagerly loading an 80MB daily tag index just to pick out the biggest tag from the last full day, which didn't help. Fixed both: the scan now loads four shards at a time and drops each from memory the moment it's finished counting matches; the trending-tag insert now reads from the weekly index that's already loaded for rising-tags, which costs nothing extra. And since the "Searching…" message used to live only on the reading panel — which on mobile is off-screen below the chart — it's now also shown where you're actually looking, as a small pulsing Guardian-blue dot in the chart's empty-state line.
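
The shape of the fix, as a hedged sketch: process shards in small batches, keep only the per-month counts, and let each batch go before loading the next. This is sequential Python rather than the browser JavaScript the site actually runs, and `load_shard` and `matches` are hypothetical callables supplied by the caller.

```python
def batched(items, size):
    """Yield consecutive slices of `items`, each at most `size` long."""
    for start in range(0, len(items), size):
        yield items[start:start + size]


def scan_shards(shard_names, load_shard, matches, batch_size=4):
    """Count matching headlines per shard, holding at most one batch in memory."""
    counts = {}
    for batch in batched(shard_names, batch_size):
        loaded = [(name, load_shard(name)) for name in batch]
        for name, headlines in loaded:
            counts[name] = sum(1 for headline in headlines if matches(headline))
        del loaded  # drop the parsed shards before fetching the next batch
    return counts
```

Peak memory is then bounded by the batch size rather than by the length of the archive.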

Deep dive, deeper

The middle of the Deep dive page looked a bit hollow once the four summary stats landed, so I added four more blocks that fill the space with something the stats can't show you on their own. A weekly heatmap — one cell per ISO week, shaded by volume — that surfaces cluster shapes the monthly sparkline only hints at. First / Peak / Most recent dispatch cards showing the actual articles at those points, rather than just the dates. A peak-month drilldown that expands inline when you click the peak stat, listing up to twelve headlines from that month. And a sidebar block surfacing the content words that keep appearing in matched headlines, which turns out to be a nice read on framing — "emergency" and "warming" rising to the top on the climate crisis, "pledge" and "leader" on Keir Starmer. The yellow-bar highlight from Trends carries through too. Later: the sidebar blocks and the section-mix bars all became clickable filters. Click "Labour" in Travels-with or any of the section rows and the headline list narrows to that facet; click again to clear. A blue-ruled chip sits above the list to make the active filter obvious.
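
The weekly heatmap boils down to bucketing article dates by ISO week. A minimal Python sketch, assuming dates arrive as `YYYY-MM-DD` strings (the function name is made up for illustration):

```python
from collections import Counter
from datetime import date


def weekly_volume(dates):
    """Count articles per (ISO year, ISO week) cell."""
    cells = Counter()
    for text in dates:
        iso_year, iso_week, _weekday = date.fromisoformat(text).isocalendar()
        cells[(iso_year, iso_week)] += 1
    return cells
```

Note that ISO weeks do not respect calendar years: 30 December 2024 falls in week 1 of 2025, so the heatmap's rows need to use the ISO year, not the calendar one.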

Every tag gets its real name

A reader spotted that the site was showing "Brexit Party" for a tag the Guardian has since renamed to "Reform UK". Tag slugs are immutable in the Guardian's CMS but the webTitle on a tag can change, and ours was derived from the slug, so renames were invisible to us. Wired up a small script that pulls every catalogued tag's current webTitle from the Guardian's /tags API and caches it; runs monthly on its own. Side effect: the hundred-odd smooshed-name overrides I'd been hand-maintaining are now mostly redundant, since the Content API (CAPI) returns "Margaret Thatcher" for a slug that's simply "politics/margaretthatcher".

"Subjects" is now "Topics"

Minor but niggling: "Subjects" is the Guardian CMS's internal word, not a reader's. Renamed the tab to "Topics", which reads more naturally for what the view actually does. While I was in the nav, I reshuffled it a bit — the three search surfaces (Trends, Topics, Deep dive) now cluster together on the left, with This Week and Newsroom as the editorial-frame tabs to the right. URL paths all stayed the same so shared links still work.

Deep dive

A fifth tab for when the question is about a single topic rather than a comparison. Pick a tag or a word, pick a date range, and you get the shape of coverage — total, peak month, first and last appearance, section mix, a volume sparkline — followed by every headline in that span streaming in newest-first, with the tags it tends to travel with in a sidebar. A filter box and a CSV export sit above the list. Compare-up-to-four is a great way in but the wrong shape when you want the whole story on one thing.

The design audit

I'd been staring at this thing long enough that I couldn't see it properly any more, so I asked Claude Design to go through the whole site and come back with things that weren't working — not functionality, but proportion, rhythm, contrast, the sins you can't see once you've built something. It returned a prioritised audit of thirteen fixes, which we then worked through with Claude Code one at a time. The masthead dek slimmed to a single sentence. The chart title split into a muted empty-state prompt and a proper serif headline, which are genuinely different jobs. Toggle controls unified on ink instead of drifting between ink and blue. Real paper grain via an SVG noise tile, replacing the dot lattice I'd been shipping that turned out to be effectively invisible on retina displays. Focus rings, ARIA labels on the chart canvases, a global respect for prefers-reduced-motion. Most usefully a proper mobile pass, which caught a horizontal-overflow bug I'd never noticed on desktop and a tap-target problem on almost every segmented control. While we were in there I noticed the About page read more like product marketing than anything I'd actually write, so rewrote it. What you're reading now is the result.

guardian-angles.com

Bought the domain, pointed the DNS at GitHub Pages, rewrote the OG image URLs on every page. The old github.io URL now redirects, so anything already shared still works.

"I feel lucky", expanded

The curated recipe list had come out quite heavily UK-weighted, which felt odd for a button partly meant to surface unexpected comparisons. Expanded to 250 word recipes and 225 tag recipes — more US politicians, more Latin American leaders, more of Asia, plus world cities, tourist wonders, rivers, cars, pop stars, actors, directors, and a couple of personal easter eggs I won't spoil here.

Data ingestion, now resilient to me

Twice on the same day the hourly backfill job fetched a batch of old 2015 shards, rebuilt the indexes, committed them, and then failed to push because I'd landed a design commit during its three-minute build window. Each failure threw away what it had fetched. Added a rebase-and-retry loop in the workflow so the data commit rebases on top of whatever else has landed and tries again. Also fixed This Week, which had been showing the previous week's "story of the week" one step behind on Mondays because the default-week logic assumed the last bucket in the index was always the partial one in progress.
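
A rebase-and-retry loop is usually a few lines of shell inside the workflow; the sketch below expresses the same logic in Python so it can be read (and tested) on its own. The function name, the attempt limit and the injectable `run` callable are all assumptions, and the real workflow may differ.

```python
import subprocess


def push_with_rebase(run=None, attempts=5):
    """Try to push; if the remote has moved on, rebase onto it and try again."""
    if run is None:
        # Default runner: execute the command and return its exit code.
        def run(*cmd):
            return subprocess.run(cmd).returncode
    for _ in range(attempts):
        if run("git", "push") == 0:
            return True
        if run("git", "pull", "--rebase") != 0:
            return False  # a real conflict needs a human, so stop retrying
    return False
```

The key property is that a failed push no longer discards the fetched data: the data commit is replayed on top of whatever landed in the meantime.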

Extending back to 2012

I'd been looking at the Newsroom chart and wondering what we'd see if we could look further back. Specifically, I remembered the Guardian went through a content-reduction moment around 2015 and I wanted it visible. Extended the backfill start date from 2016 to 2012, which covers the start of the modern digital Guardian, the 2012 Olympics, Leveson, the Scottish referendum, Gamergate, and that 2015 editorial step-change. 48 extra months. The hourly CI job started chipping away.

"Last updated" indicator

A small pulsing green dot and italic "updated 17m ago" under the stat counter on every page. Tells you at a glance how fresh the data is. Hover for the exact timestamp. The relative format walks from minutes → hours → yesterday → days.
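
The minutes → hours → yesterday → days walk looks roughly like this in Python. The exact thresholds, the "just now" case and the wording beyond "updated 17m ago" are assumptions for the sake of a runnable sketch.

```python
def relative_age(seconds):
    """Format an age in seconds as a short 'updated ... ago' label."""
    minutes = int(seconds // 60)
    if minutes < 1:
        return "updated just now"  # assumed wording for very fresh data
    if minutes < 60:
        return f"updated {minutes}m ago"
    hours = minutes // 60
    if hours < 24:
        return f"updated {hours}h ago"
    days = hours // 24
    if days == 1:
        return "updated yesterday"
    return f"updated {days} days ago"
```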

Mobile polish pass

I was worried the whole thing might not hold up on a phone, so we did a proper sweep. Eight specific fixes: chart axis labels no longer clip at the right edge; four-tag chart titles line-clamp to three lines instead of overflowing; the "Share as image" button goes full-width so the label doesn't truncate; Newsroom year labels switch to short format ('18, '20, '22) when cramped; Subjects article counts no longer lose their "k"; subnav now has a fade-out cue so you can tell there's more to scroll to; the hint copy says "hover or tap" instead of hover-only; the reading panel's CTA shrinks on narrow screens. The chart itself remains readable at 10+ years of monthly data on a phone, dense but not broken.

"I feel lucky", proper randomness

The curated recipe list had been firing too often, so I asked for it to feel more like an easter egg. Now 80% of picks are pure random from the top of the index, 20% are pre-baked combinations (including some deliberate odd-one-out comedy — "bread, butter, cheese, crumpet" / "pandemic, outbreak, epidemic, manflu"). The button itself moved to be the first thing you see in the sidebar, in the same styling family as the Compare button rather than a separate novelty card.
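
The 80/20 split is a single weighted coin flip. A minimal Python sketch, with hypothetical names and a caller-supplied random generator so the behaviour can be checked:

```python
import random


def lucky_pick(top_terms, recipes, rng=random, curated_share=0.2, size=4):
    """Return a curated recipe about 20% of the time, else a random sample."""
    if recipes and rng.random() < curated_share:
        return list(rng.choice(recipes))
    return rng.sample(top_terms, size)
```

Lowering `curated_share` makes the pre-baked combinations rarer, which is what gives them the easter-egg feel.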

Name fixes, accent fixes, grammar fixes

A whole category of small embarrassments got fixed: words like "orb", "rgen", "nchez" were appearing in the random pool because the old tokenizer couldn't cross an accent boundary — rebuilt the index so "Orbán" indexes cleanly as "orban" (and a search for "orban" still finds "Orbán" in headlines). Possessives now fold together — "America" and "America's" count as one thing. UK and US section labels finally capitalise correctly. The Newsroom y-axis snaps to round values instead of the slightly awkward 5.1k / 7.6k / 10.2k steps it had before.
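
Accent folding and possessive folding are both standard text-normalisation steps. Here is a Python sketch of one way to do them, using Unicode decomposition to strip accents; it illustrates the technique and is not the site's actual tokenizer, and the function names are made up.

```python
import re
import unicodedata


def fold(word):
    """Lowercase a word, strip accents and drop a trailing possessive 's."""
    decomposed = unicodedata.normalize("NFKD", word)
    bare = "".join(ch for ch in decomposed if not unicodedata.combining(ch))
    bare = bare.lower().replace("’", "'")  # treat curly and straight apostrophes alike
    return re.sub(r"'s$", "", bare)


def tokenize(headline):
    """Split a headline into folded word tokens without breaking at accents."""
    # Compose first so an accented letter is one character, not letter + mark.
    composed = unicodedata.normalize("NFC", headline)
    words = re.findall(r"[^\W\d_]+(?:['’]s)?", composed)
    return [fold(word) for word in words]
```

Running the same `fold` over the search term as over the headlines is what lets a search for "orban" still find "Orbán".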

About page + masthead polish

This page. Cleaner favicon, OG share card redesigned to look like the real masthead rather than a dark blue block. Extra breathing room between the title and the blue/yellow rule on desktop.

Every headline becomes a link

Final API refetch across the decade captured each article's canonical URL. Clicking any headline anywhere in the site now opens the original Guardian article. Also shipped: the "I feel lucky" button redesigned to match the Compare button's editorial style; "Time machine" button replaces the random-week emoji.

The full decade lands

After a rate-limit saga lasting several days — the Guardian free tier was tighter than documented — a one-off quota bump from an engineer at the Guardian let us backfill the whole 124-month range from January 2016 to April 2026 in a single 25-minute burst.

Pre-baked lucky recipes

The "I feel lucky" button became curated. 131 word-based comparisons (from "happy, sad, joy, misery" to "bread, butter, cheese, crumpet") and 69 tag-based ones. Clicking it produces an instantly charted story rather than four random tags.

PNG export

Every Trends chart has a Share-as-image button that composites the current view into a branded PNG with the masthead, title, legend, and chart baked in — ready for Slack or social.

Year-on-year overlay

New toggle on the Trends chart: Timeline or Year-on-year. The latter overlays each year of a term's data on a shared January-to-December axis, making seasonal patterns and year-on-year shifts immediately visible. The "inflation" YoY view shows 2022 towering above every other year.
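
Under the hood a year-on-year view is a pivot: the same monthly series, regrouped into one twelve-slot row per year. A Python sketch under the assumption that the series is keyed by `YYYY-MM` strings:

```python
def year_on_year(monthly):
    """Pivot {"YYYY-MM": count} into {year: [Jan..Dec counts]}, None where missing."""
    overlay = {}
    for key, count in monthly.items():
        year, month = int(key[:4]), int(key[5:7])
        overlay.setdefault(year, [None] * 12)[month - 1] = count
    return overlay
```

Each row can then be drawn as its own line on a shared January-to-December axis.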

This Week

A fourth tab with an auto-generated dashboard: biggest Guardian tag this week, all-time peak, sample headline, weekly sparkline, fastest risers and fallers, and "On this week in …", the biggest tag in the same week of every year back to 2016. Later additions: arrow buttons to step through history; clickable year cards; a "Time machine" button that jumps to a random week.

Newsroom

A third tab: a stacked area chart of the Guardian's publishing output, broken down by section across the whole decade. Toggle sections on and off, switch to proportional view to see share-of-output, click any month to drill into tags. Reveals that Guardian output dropped from ~10k articles/month in 2016 to ~5–6k by 2024, and that World news holds a remarkably stable ~25–30% of output regardless of the news cycle.

Subjects

NYT-inspired table of every tag we index, ranked by volume, searchable and filterable. Tick up to four to launch them onto the Trends chart. Uses a dual-thumb year-range slider and section chips for filtering.

Word-boundary search

Fixed a subtle bug where the chart and the headline list disagreed. Searching "AI" used to return headlines containing "rain", "again", "Spain" in the explorer even though the chart only counted real "AI" mentions. Both now agree on word-boundary matching.
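
The difference between the two behaviours is substring matching versus word-boundary matching. A small Python illustration (case-insensitive matching is an assumption here, and the function name is hypothetical):

```python
import re


def matches_word(term, headline):
    """True only if `term` appears in `headline` as a whole word."""
    pattern = rf"\b{re.escape(term)}\b"
    return re.search(pattern, headline, re.IGNORECASE) is not None
```

A plain substring test finds "ai" inside "Rain again in Spain"; the `\b` anchors require a non-word character (or the string edge) on both sides, so only a standalone "AI" counts.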

Section filter, rising panel, trending

Added a section filter so you can see "Starmer in Opinion only". The idle reading panel started showing "rising fastest" tags and words based on a 4-week vs 12-week baseline — a newsroom lens for what's spiking.
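
One plausible reading of "4-week vs 12-week baseline" is: compare the average of the most recent four weeks with the average of the twelve weeks before them. The sketch below implements that reading in Python; the non-overlapping windows, the +1 smoothing and the function names are all assumptions, not a description of the site's actual scoring.

```python
def rise_score(weekly_counts, recent=4, baseline=12):
    """Ratio of the recent weekly average to the preceding baseline average."""
    if len(weekly_counts) < recent + baseline:
        return None  # not enough history to judge
    recent_avg = sum(weekly_counts[-recent:]) / recent
    base_avg = sum(weekly_counts[-(recent + baseline):-recent]) / baseline
    # +1 on both sides stops tiny tags from producing huge or undefined ratios.
    return (recent_avg + 1) / (base_avg + 1)


def rising_fastest(series_by_tag, top=5):
    """Return the tags with the highest rise score, best first."""
    scored = [(rise_score(series), tag) for tag, series in series_by_tag.items()]
    ranked = sorted((pair for pair in scored if pair[0] is not None), reverse=True)
    return [tag for _score, tag in ranked[:top]]
```

A score near 1 means business as usual; well above 1 means the tag is spiking relative to its own recent past.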

Tags mode

Words were never going to be enough. Switched the default search mode from free-text to Guardian's canonical editorial tags. Donald Trump, climate crisis, cost of living — all resolve to tag IDs that catch every article regardless of headline phrasing. 3,000 top tags indexed with autocomplete.

The first shape

Three views: masthead, search box, line chart, reading panel, headline explorer. Word search only. Monthly granularity. 2022–2024 data. Deployed to GitHub Pages. The whole thing was a single HTML file with three JS modules.

The first prompt

Before any of that, there was just a question, typed into Claude Code:

I'm interested in building something like the tool shown in this screen capture of an X thread, but for the Guardian and using the Guardian's API and Pretext. The website is here: tedalcorn.github.io/nyt. How hard would this be, and would I be able to build it in a way that was quick to load for a user? The one thing I want to avoid is the emphasis on individual writers: I prefer tags and sections to avoid individuals being exposed for various reasons. I'm kind of picturing a Google Trends / Guardian Trends kind of idea. I don't want a perfect copy of the NYT one. I want it to feel distinct and interesting. But I love the headlines thing and being able to see individual coverage of people. All ideas welcome. Maybe we start with a solid foundation that's achievable and we can build on?

Everything you've scrolled through above is what that question turned into.

Thank you