Lighthouse is a compass, not a destination
Synthetic benchmarks are useful right up to the moment they start setting the priority. A real example from CyclingHero — a 69 on mobile Lighthouse, well into the yellow zone, blazing fast in production, zero user complaints — and why the score-chasers were wrong.
I’ve lost count of how many times this conversation has happened.
A client opens PageSpeed Insights. Sees a 60. Panics. Files a ticket: “performance is bad.” I reply with “have you used the page?” The answer is almost always no. The page got pasted into a tool, and the score did the thinking.
That’s the whole problem.
Lighthouse is a compass. It points you toward common pitfalls — render-blocking resources, oversized images, layout shift — and that’s genuinely useful. It does not know whether your users are happy. It does not know what trade-offs the team made on purpose. It does not know what the page is for.
The CyclingHero case
CyclingHero is a long-running engagement of mine — a content-rich landing page for a tour operator, still on Next’s pages router. The synthetic mobile Lighthouse score is 69 — well into the yellow zone, the kind of number that makes a stakeholder open a ticket on sight. Desktop sits around 99. Customer complaints about page speed: zero.
The reasons the desktop score is high — and the reasons the page is in fact blazing fast in real-world use — are not accidents:
- All pages are static, with thoughtfully designed ISR. The marketing surface is built once, served from the edge, revalidated on a schedule the editorial team controls. No per-request rendering work for content that doesn’t need it.
- i18n bundles trimmed to only the keys each page actually needs. No 500KB locale dictionary shipped to a page that uses 30 strings. Page props stay minimal.
- Properly-sized, properly-encoded images at build time. No next/image runtime layer, because the source content was already polished and the optimisation pass would just have added a hop and slowed delivery. Sometimes the right “perf optimisation” is removing one.
- A clever mix of SSG and React Query. Static for everything that can be static; React Query takes over after hydration on the surfaces that need live data, with a local user cache and smart hydration to make sure the initial render isn’t redundantly refetching what the server just sent. No double-calls, no flash of stale content.
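To make the i18n point concrete, here is a minimal sketch of per-page key trimming. It is not CyclingHero’s actual code: `pickMessages`, the dictionary contents, the key list, and the one-hour `revalidate` value are all assumptions for illustration. The only real API referenced is the `revalidate` field that a pages-router `getStaticProps` can return to enable ISR.

```typescript
// Hypothetical names and data for illustration only.
type Messages = Record<string, string>;

/** Return only the requested keys from a full locale dictionary. */
function pickMessages(all: Messages, keys: readonly string[]): Messages {
  const picked: Messages = {};
  for (const key of keys) {
    const value = all[key];
    if (value === undefined) {
      // Fail at build time rather than ship a page with a missing string.
      throw new Error(`Missing translation key: ${key}`);
    }
    picked[key] = value;
  }
  return picked;
}

// Stand-in for a large locale file; a real one would be loaded from disk.
const fullDictionary: Messages = {
  "hero.title": "Find your tour",
  "hero.cta": "See itineraries",
  "footer.legal": "Terms and conditions",
  "checkout.error": "Something went wrong",
};

// The keys this one page renders (assumed list).
const HOME_KEYS = ["hero.title", "hero.cta"] as const;

const pageMessages = pickMessages(fullDictionary, HOME_KEYS);

// Shape of what getStaticProps could return for this page.
// `revalidate` is in seconds; 3600 is an assumed schedule.
const staticPropsResult = {
  props: { messages: pageMessages },
  revalidate: 3600,
};
```

The page props now carry two strings instead of the whole dictionary, and a missing key breaks the build instead of surfacing in production.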
The reasons the mobile score is low are also not accidents — they’re known trade-offs:
- Lighthouse simulates 4× CPU throttling on a “Moto G Power” that doesn’t really exist in CyclingHero’s traffic profile.
- The pages router serves a JS bundle ahead of route resolution. Known weakness vs the app router; chosen deliberately, because pages router has been stable for years on top of an editorial workflow that depends on that stability.
- The hero is image-rich on purpose. The brand sells itineraries: emotional, place-driven, full-bleed photography. Stripping it to chase TBT would be a regression on the part of the site that actually converts.
Real users on real phones load the page, find a tour, fill in the form. Nobody’s blocked. The score is a number on a dashboard nobody uses.
What “good performance” actually means
Two questions, in order:
- What does the user need to do on this page? (Read content, find a tour, submit a contact form, watch a 3D scene.)
- Are they doing it without friction on the devices they actually use?
If the answer to the second question is yes, you’re done. Run RUM (real-user monitoring) for the long tail. Watch for regressions. Move on.
If the answer is “we don’t know”, that’s the problem to fix — not the synthetic score.
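One way to turn “we don’t know” into an answer is to summarise field samples the way Core Web Vitals are usually reported: take the 75th percentile and rate it against the published thresholds. The sketch below assumes you already collect per-visit metric values somewhere. `percentile`, `rateP75`, and the sample numbers are illustrative names and made-up data, not part of any library.

```typescript
type Rating = "good" | "needs-improvement" | "poor";

// Published Core Web Vitals thresholds: [upper bound of "good", upper bound of "needs improvement"].
const THRESHOLDS = {
  LCP: [2500, 4000], // milliseconds
  INP: [200, 500],   // milliseconds
  CLS: [0.1, 0.25],  // unitless
} as const;

type MetricName = keyof typeof THRESHOLDS;

/** Nearest-rank percentile of a non-empty list of samples. */
function percentile(samples: readonly number[], p: number): number {
  if (samples.length === 0) throw new Error("no samples");
  const sorted = [...samples].sort((a, b) => a - b);
  const rank = Math.ceil((p / 100) * sorted.length);
  const value = sorted[Math.max(0, rank - 1)];
  if (value === undefined) throw new Error("percentile out of range");
  return value;
}

/** Rate the 75th percentile of field samples for one metric. */
function rateP75(
  metric: MetricName,
  samples: readonly number[],
): { p75: number; rating: Rating } {
  const p75 = percentile(samples, 75);
  const [good, poor] = THRESHOLDS[metric];
  const rating: Rating =
    p75 <= good ? "good" : p75 <= poor ? "needs-improvement" : "poor";
  return { p75, rating };
}

// Made-up LCP samples in milliseconds, one per page view.
const lcpSamplesMs = [1200, 1800, 2100, 2400, 3900, 1500, 1700, 2000];
const lcpReport = rateP75("LCP", lcpSamplesMs); // p75 = 2100, rating "good"
```

Note the one slow outlier (3900ms) does not move the rating. That is the point of looking at a percentile of real visits rather than a single throttled lab run.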
When Lighthouse is genuinely useful
I use Lighthouse on every project I touch. Always. It’s a genuine compass, and a compass is useful, but a compass tells you which way is north, not where you have to sail.
Three things it does well:
- Catches regressions in CI. Budget files, PR-level diffs, alerts when an LCP image gets uploaded uncompressed.
- Surfaces mechanical wins. Missing width/height on images, render-blocking CSS, a rogue third-party script someone added without asking, a rogue font weight nobody actually uses. Real, fixable problems, most with no UX cost.
- Provides a vocabulary for talking about perf with non-engineers. “INP regressed from 180ms to 320ms” lands better than “feels laggy”.
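The CI case can be as small as a comparison between measured values and a budget. The sketch below uses a hypothetical budget shape (`MetricBudget`) and made-up metric names and numbers; it is not Lighthouse’s own budget file schema, and it assumes some earlier step has already extracted the measurements from a lab run.

```typescript
// Hypothetical shapes for illustration only.
interface MetricBudget {
  metric: string;
  maxValue: number;
}

interface Violation {
  metric: string;
  measured: number;
  maxValue: number;
  overBy: number;
}

/** Return every budgeted metric whose measured value exceeds its limit. */
function checkBudgets(
  measured: Record<string, number>,
  budgets: readonly MetricBudget[],
): Violation[] {
  const violations: Violation[] = [];
  for (const { metric, maxValue } of budgets) {
    const value = measured[metric];
    if (value === undefined) continue; // metric not collected in this run
    if (value > maxValue) {
      violations.push({ metric, measured: value, maxValue, overBy: value - maxValue });
    }
  }
  return violations;
}

// Assumed budget: LCP under 2.5s, LCP image under 200KB.
const budgets: MetricBudget[] = [
  { metric: "lcpMs", maxValue: 2500 },
  { metric: "lcpImageBytes", maxValue: 200_000 },
];

// Made-up measurements from a PR build where someone uploaded an uncompressed hero.
const prRun = { lcpMs: 2300, lcpImageBytes: 1_400_000 };

const violations = checkBudgets(prRun, budgets);
// One violation: lcpImageBytes is over budget by 1,200,000 bytes.
```

In a pipeline, a non-empty `violations` list would fail the job or post a comment on the PR. The budget numbers themselves are a team decision, which is why they sit in data rather than in the checker.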
The font-weight audit
The font case is worth dwelling on, because it’s where Lighthouse most often pays for itself in a single ticket.
A page imports a font in 300, 400, 500, 600, 700. The layout actually only uses 400 and 600. That’s three extra font files (or a fatter variable axis) shipped for zero rendered glyphs. Lighthouse flags it. The right response is rarely “strip them all”; it’s a five-minute conversation with design:
- Do we genuinely need 300 on that one tiny label, or did it land there because someone pasted a Figma component last quarter and nobody else uses that weight?
- Can we get the same look with font-weight: 400 and tighter letter-spacing? Synthetic boldness via CSS is bad practice on type, but synthetic lightness via tracking is often invisible to a real user.
- If we do need a wide weight range, are we on a variable font where the cost is one file and an axis, not five separate files?
Most of the time you keep two weights. The page loads 60–120KB lighter, design loses nothing visible, and Lighthouse stops complaining. That’s the loop the tool was built for.
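The audit itself is a set difference, which makes it easy to script before the conversation with design. A minimal sketch, assuming you can list the weights the page loads and the weights the layout actually renders; both function names are hypothetical, and the per-file size is an input you supply, not something the code measures.

```typescript
/** Weights that are loaded but never rendered, in ascending order. */
function unusedFontWeights(
  loaded: readonly number[],
  used: readonly number[],
): number[] {
  const usedSet = new Set(used);
  return loaded.filter((weight) => !usedSet.has(weight)).sort((a, b) => a - b);
}

/** Rough saving if each unused weight is a separate file of about perFileKb. */
function estimateSavingsKb(unused: readonly number[], perFileKb: number): number {
  return unused.length * perFileKb;
}

// The example from the text: five weights imported, two used.
const unused = unusedFontWeights([300, 400, 500, 600, 700], [400, 600]);
// unused is [300, 500, 700]

// Assuming roughly 20KB to 40KB per static font file:
const lowEstimateKb = estimateSavingsKb(unused, 20);  // 60
const highEstimateKb = estimateSavingsKb(unused, 40); // 120
```

The estimate only holds for separate static files. On a variable font the saving comes from narrowing the axis range instead, and the arithmetic is different.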
But: is this 80ms worth chasing?
The other half of using the compass well is knowing when not to follow it.
Lighthouse will happily flag an 80ms reflow on font-swap and tell you to optimise. Sometimes that’s correct. Sometimes it isn’t. If the page’s whole visual identity is anchored to a particular typeface — and most marketing sites are — the right answer is “this 80ms stays.” Unless you’re prepared to ship a system stack and make peace with what that does to the brand, the flag is informational, not a TODO.
Fonts are part of the product. They’re the part most users consciously notice without knowing they’re noticing. Fonts are sexy. A portfolio that ships in system-ui to save 80ms is the same kind of unforced error as a bank shipping in Comic Sans.
The compass tells you a typographic choice has a perf cost. The architect decides whether that cost is buying anything visible. Most of the time, it is.
What Lighthouse does not do well: tell you whether your users are happy, whether the trade-offs you made were the right ones, or which 30% of fixes are worth the hours they’d take.
Clarity on priority before chasing points
Before chasing the last 10 points on any score, you have to know what you’re optimising for. A landing page that converts on emotion is a different priority than a dashboard a user spends four hours a day in. Same Lighthouse score; very different good.
The clearest signal that a team has lost the priority: someone is grinding on a 92→97 jump while the page’s actual conversion rate hasn’t moved in a year.
The baking metaphor
Performance optimisation is like baking. You can’t refine a recipe in Excel. You write it down, bake it, taste it, adjust, bake it again. The cake either works or it doesn’t, and the score on the recipe card was always downstream of eating the cake.
The temptation, in both, is to keep refining the formula because it feels productive. The right move is to ship, use the page, watch how it performs in the wild, and then go eat ramen instead of dedicating your evening to chasing the last five points on a dashboard nobody but the procurement team will ever look at.
What I’d tell my younger self
- Lighthouse is one signal, not the signal. RUM is the signal that pays the bills.
- The synthetic benchmark’s job is to flag common problems. Your users are uncommon.
- If clearing a Lighthouse flag would visibly hurt the experience for real users, the flag stays. The score is downstream of the product, not the other way around.
- Deploy. Use the thing. Eat ramen. Optimise what was actually slow.
The score is the recipe card. The customer eats the cake.