What a real performance audit actually contains
A performance audit is a written diagnosis of why a site is slower or less reliable than it should be, ranked by the severity of the impact and the cost to fix. The output is not a score; the output is a prioritized list of changes with confidence levels, expected metric movement, and an estimate of the revenue impact of leaving each one alone.
The work has five orthogonal layers. Network covers TLS, HTTP version, server response time, CDN configuration, and time-to-first-byte under real conditions. Render covers the critical rendering path, render-blocking resources, font loading strategy, and how quickly the largest visible element actually paints. Interactivity covers JavaScript execution, main-thread blocking, third-party scripts, and the pattern of visual feedback when the user clicks or types. Stability covers cumulative layout shift, unexpected content reflow, and the late-loading surprise patterns that make a page feel cheap. Backlog covers everything that technically falls outside the other four but quietly costs you, including image formats, accessibility blockers that double as performance problems, and the long tail of scripts your team forgot they installed.
A complete audit measures all five layers in both lab conditions (controlled environment, repeatable, easy to debug) and field conditions (real users on real devices, reported at the 75th percentile, the threshold Google itself uses to grade Core Web Vitals) (Google web.dev, 2024). Auditing only one half of that pairing is how plausible-looking Lighthouse reports cover for sites that are actually broken for most visitors.
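Four of the five layers map onto a field-measurable Core Web Vitals metric. Below is a minimal collection sketch, assuming the open-source web-vitals package; the /api/rum endpoint is a placeholder for whatever analytics sink the site already has.

```ts
// Sketch: one field metric per layer, collected with the open-source
// `web-vitals` package. TTFB covers the network layer, LCP the render layer,
// INP interactivity, and CLS stability. `/api/rum` is a placeholder endpoint.
import { onTTFB, onLCP, onINP, onCLS, type Metric } from 'web-vitals';

function report(metric: Metric): void {
  // sendBeacon survives page unload, which is when CLS and INP finalize.
  const body = JSON.stringify({ name: metric.name, value: metric.value, id: metric.id });
  navigator.sendBeacon('/api/rum', body);
}

onTTFB(report); // network
onLCP(report);  // render
onINP(report);  // interactivity
onCLS(report);  // stability
```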
What free tools quietly miss
PageSpeed Insights, Lighthouse, and the Chrome DevTools performance panel are real tools. I use all three in every audit. They each have known gaps, and in 2026 the gaps matter more than they used to.
Lab vs field divergence. A Lighthouse score run on a fast desktop with simulated throttling is not what your visitors experience. Real users run on a mid-tier Android with a flaky LTE connection at the corner of an intersection. The 75th percentile field measurement, drawn from the Chrome User Experience Report, frequently disagrees with the lab score by twenty to forty points (HTTP Archive Web Almanac, 2024). When the two diverge, the field number is what Google ranks against. The lab number is the decoration.
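Pulling the field side yourself is straightforward: the Chrome UX Report exposes a public API keyed to the same 75th percentile. A hedged sketch follows; you supply your own API key from the Google Cloud console.

```ts
// Sketch: query the public Chrome UX Report API for a URL's 75th-percentile
// field metrics, to compare against the lab score. The API key is your own.
const CRUX_ENDPOINT = 'https://chromeuxreport.googleapis.com/v1/records:queryRecord';

async function fieldP75(url: string, apiKey: string): Promise<void> {
  const res = await fetch(`${CRUX_ENDPOINT}?key=${apiKey}`, {
    method: 'POST',
    headers: { 'Content-Type': 'application/json' },
    body: JSON.stringify({ url, formFactor: 'PHONE' }),
  });
  if (!res.ok) throw new Error(`CrUX query failed: ${res.status}`);
  const { record } = await res.json();
  for (const [name, metric] of Object.entries<any>(record.metrics)) {
    // Not every metric entry carries percentiles, hence the optional chain.
    console.log(`${name}: p75 = ${metric.percentiles?.p75}`);
  }
}

// usage: fieldP75('https://example.com/', '<your-api-key>');
```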
INP under stress. Lighthouse approximates interaction responsiveness in lab conditions at a single moment. INP, the metric that replaced FID in March 2024, captures the worst (or near-worst) interaction observed across the entire session (Sullivan and Viscomi, 2024). A Lighthouse responsiveness number that looks fine can hide a single brutal interaction during a form submission or modal open that destroys the actual user experience.
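One way to catch those brutal interactions is to watch the Event Timing API directly in the field; a sketch, with the 200 ms threshold taken from the published "good" INP boundary.

```ts
// Sketch: log any interaction whose total duration crosses 200 ms, the upper
// bound of the "good" INP range, so a slow form submit or modal open shows up
// even when the lab score looks clean.
const slowInteractions = new PerformanceObserver((list) => {
  for (const entry of list.getEntries()) {
    console.warn(`Slow interaction: ${entry.name} took ${Math.round(entry.duration)} ms`);
  }
});

slowInteractions.observe({
  type: 'event',
  buffered: true,
  durationThreshold: 200, // part of the Event Timing spec; cast for older TS DOM typings
} as PerformanceObserverInit);
```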
Third-party flakiness. Chat widgets, A/B test scripts, analytics SDKs, ad tags, and consent managers are often loaded asynchronously, which means they do not appear in a single lab snapshot the same way every run. Their cost is measurable only across many real sessions. Most audits ignore this. A real audit catalogs every third-party script, measures its observed cost, and recommends a defer or remove decision per script.
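A first pass at that catalog can come straight from the browser's Resource Timing data; a sketch that treats any cross-origin script as third party, which is an approximation rather than a verdict.

```ts
// Sketch: list every cross-origin script the page loaded, with observed size
// and duration, as raw material for the defer-or-remove decision. Cross-origin
// entries report transferSize as 0 unless the third party sends
// Timing-Allow-Origin, so treat the size column as a lower bound.
function thirdPartyScripts() {
  const own = location.origin;
  const rows = performance
    .getEntriesByType('resource')
    .filter((e): e is PerformanceResourceTiming => e instanceof PerformanceResourceTiming)
    .filter((e) => e.initiatorType === 'script' && !e.name.startsWith(own))
    .map((e) => ({
      url: e.name,
      transferKB: Math.round(e.transferSize / 1024),
      durationMs: Math.round(e.duration),
    }));
  console.table(rows);
  return rows;
}
```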
What the score does not score. Lighthouse accessibility, SEO, and best-practice categories are shallow. A site can score 100 on Lighthouse accessibility and fail WCAG 2.2 in real audit conditions, because the tool runs roughly thirty automated checks and WCAG 2.2 defines eighty-six success criteria (W3C, 2023). A site can score 100 on SEO and have no schema, no canonicalization, and no internal-link strategy. The score is a starting point, not a verdict.
What a real audit actually buys you in dollars
The revenue case for performance is not subtle. Akamai's retail performance research found that a one-hundred-millisecond delay in load time reduced conversion rates by seven percent in their dataset of e-commerce sites (Akamai, 2017). Deloitte's 2020 study of mobile site speed and retail outcomes, working with Google, found that a tenth-of-a-second improvement in mobile site speed delivered an eight to ten percent lift in conversion across categories, including service businesses (Deloitte and Google, 2020). Vodafone Italy's rebuild of one of their pages on a modern stack improved largest contentful paint by thirty-one percent and lifted sales by eight percent in the measurement window (Google, 2022).
For a service business, the math is similar but the mechanism is slightly different. Slow service sites lose high-intent visitors who arrived from a Google search with a specific question. The visitor watches two seconds of loading, hits the back button, and the next-listed competitor gets the call. The rate at which this happens rises sharply between two and five seconds of perceived load time (Akamai, 2017; Deloitte and Google, 2020). For a shop generating fifty inbound leads a month, a ten percent conversion lift from performance work is five additional customers, every month, forever. At an average ticket of three to seven hundred dollars, that is roughly fifteen hundred to thirty-five hundred dollars of new monthly revenue, which justifies the audit and the fixes inside the first quarter.
The reverse is also true: an audit that returns a clean bill of health is also valuable, because it forces the conversation to where the actual revenue leak lives. Most sites I scan have at least one performance issue costing measurable revenue. A small share have none, in which case the audit redirects attention to conversion design, copy, or trust signals, which are diagnosed by the same tooling.
The three-tier path from audit to shipped fixes
I publish three productized engagement tiers that map cleanly to the audit-and-fix arc. The choice depends on what your scan turns up, your timeline, and how much you want to ship in one sweep.
Tier one, free first read. Pathlight runs a scored scan against any URL in roughly ninety seconds. The output is a written report covering performance, conversion, trust, and revenue-impact estimates with a prioritized fix list. The scan is free, the report is yours, and there is no follow-up obligation. Most audits I run start here, because Pathlight is the diagnostic tool I built for exactly this purpose.
Tier two, productized fix. Fix Sprint is the post-scan engagement when the report identifies a handful of clear, high-impact issues that can be shipped without a full rebuild. Two-week fixed-price engagement. Three top-priority issues from your scan, ranked by revenue impact, deployed to production. Includes a Lighthouse before-and-after and a fresh Pathlight re-scan so the movement is verifiable rather than asserted. The fee is credited toward a larger engagement if you decide to keep going.
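Verifying that movement does not require anything exotic; a sketch that diffs two saved Lighthouse JSON reports (the file paths and audit list are illustrative).

```ts
// Sketch: compare before/after runs saved with
// `lighthouse <url> --output=json --output-path=<file>`. The audit ids below
// are standard Lighthouse report keys; the file names are placeholders.
import { readFileSync } from 'node:fs';

const load = (path: string) => JSON.parse(readFileSync(path, 'utf8'));
const before = load('./lighthouse-before.json');
const after = load('./lighthouse-after.json');

for (const id of ['largest-contentful-paint', 'total-blocking-time', 'cumulative-layout-shift']) {
  console.log(`${id}: ${before.audits[id].numericValue} -> ${after.audits[id].numericValue}`);
}
console.log(
  `performance score: ${before.categories.performance.score * 100} -> ${after.categories.performance.score * 100}`
);
```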
Tier three, full rebuild. When the audit surfaces structural problems (legacy stack, deep technical debt, no schema layer, brittle templating, poor mobile rendering across the board), the right answer is a rebuild rather than another round of patches. The full Next.js development engagement is covered on the Next.js development page, with engagement scope, deliverables, and the honest case for when a rebuild is the right call versus when it is overkill.
The honest framing is that not every site needs all three tiers, or even any of them. If your scan returns a clean report and you are happy with the conversion outcomes, the right next step is no engagement at all. I will tell you that on the discovery call, and I have said it to enough buyers that it is part of the posture I sell.
How I run an audit, end to end
Five phases, in order. The phases scale with the engagement tier: a Pathlight scan is automated and runs phases 1 to 3 in ninety seconds. A Fix Sprint runs all five with me in the loop. A full rebuild folds these phases into the Discovery and Architecture stages of the larger engagement.
Triage
15 to 30 minutes. I read the URL the way a buyer would. What loads on first paint, what loads after, what feels off in the first ten seconds, where my eye goes, what I cannot find. The triage notes go straight into the audit before any tool runs, because the human-perception read is the layer the tools cannot replicate.
Lab measurement
1 to 2 hours. Lighthouse runs across the canonical pages on mobile and desktop. WebPageTest runs against multiple connection profiles and three geographic regions. Chrome DevTools performance panel for one detailed flame-graph capture per representative page. Output is a structured table of lab metrics with one column per page, one row per metric, color-coded by severity.
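For readers who want to reproduce the lab pass themselves, a minimal sketch with the official lighthouse and chrome-launcher packages; the page list is a placeholder, and Lighthouse's default mobile emulation is left in place.

```ts
// Sketch: one scripted Lighthouse run per canonical page, using the same
// packages the CLI wraps. Run as an ES module under Node 18+.
import lighthouse from 'lighthouse';
import * as chromeLauncher from 'chrome-launcher';

const pages = ['https://example.com/', 'https://example.com/services']; // placeholder URLs

const chrome = await chromeLauncher.launch({ chromeFlags: ['--headless'] });
for (const url of pages) {
  const result = await lighthouse(url, {
    port: chrome.port,
    output: 'json',
    onlyCategories: ['performance'],
  });
  console.log(url, 'performance score:', result?.lhr.categories.performance.score);
}
await chrome.kill();
```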
Field measurement
30 to 60 minutes. Chrome User Experience Report 75th-percentile field metrics for every URL with sufficient traffic. Real-user monitoring from the site's own analytics if it is wired. The field numbers are what Google ranks against; the lab numbers are what I debug against. Both go into the report, side by side, because divergence between them is itself a finding.
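When the site's own monitoring is wired, its raw samples can be read at the same percentile CrUX uses; a small self-contained sketch.

```ts
// Sketch: compute the 75th percentile from raw RUM samples, so the site's own
// analytics can be compared directly with the CrUX field numbers.
function p75(samples: number[]): number | undefined {
  if (samples.length === 0) return undefined;
  const sorted = [...samples].sort((a, b) => a - b);
  return sorted[Math.ceil(0.75 * sorted.length) - 1];
}

// Example: LCP samples in milliseconds from one week of traffic.
console.log(p75([1800, 2400, 2100, 5200, 1900, 3100, 2600])); // -> 3100
```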
Prioritization
1 to 2 hours. Every finding gets three numbers: severity (1 to 5), implementation cost (S, M, L), and confidence in the revenue-impact estimate (low, medium, high). The list sorts by severity divided by cost. Top of the list is what ships first. Bottom of the list goes into the backlog or gets cut entirely if the cost-to-impact ratio is wrong.
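The sort is simple enough to make explicit; a sketch with assumed numeric weights for the S, M, L cost buckets.

```ts
// Sketch: rank findings by severity divided by implementation cost. The cost
// weights (S=1, M=2, L=4) are assumptions; the point is a reproducible order.
type Finding = {
  title: string;
  severity: 1 | 2 | 3 | 4 | 5;
  cost: 'S' | 'M' | 'L';
  confidence: 'low' | 'medium' | 'high';
};

const costWeight = { S: 1, M: 2, L: 4 } as const;

function prioritize(findings: Finding[]): Finding[] {
  return [...findings].sort(
    (a, b) => b.severity / costWeight[b.cost] - a.severity / costWeight[a.cost]
  );
}

// Hypothetical findings, for illustration only.
console.log(
  prioritize([
    { title: 'Defer unused chat widget', severity: 4, cost: 'S', confidence: 'high' },
    { title: 'Convert hero images to AVIF', severity: 3, cost: 'M', confidence: 'medium' },
    { title: 'Replace templating layer', severity: 5, cost: 'L', confidence: 'low' },
  ]).map((f) => f.title)
);
```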
Written plan
1 to 2 hours. The report is a single PDF or Notion page covering executive summary, lab table, field table, prioritized findings, recommended action plan, and a small appendix on what was checked and why specific tactics were skipped. The plan is the deliverable. The fixes happen in the engagement tier you choose.
Next step
The fastest first read is a free Pathlight scan against your live URL.
The scan produces the same lab and field measurements I open every paid engagement with, in roughly ninety seconds. If the report surfaces fixable issues, Fix Sprint ships the top three in two weeks at a fixed price. If it surfaces structural problems, the conversation moves to a full rebuild. Either way, the diagnostic is yours and the next move is yours.
Sources
- Google web.dev. (2024). Core Web Vitals: thresholds and 75th-percentile measurement. https://web.dev/articles/vitals
- HTTP Archive. (2024). Web Almanac 2024: Performance chapter. https://almanac.httparchive.org/en/2024/performance
- Sullivan, B., and Viscomi, R. (2024). INP becomes a stable Core Web Vital on March 12. https://web.dev/blog/inp-cwv-march-12
- Akamai. (2017). Akamai Online Retail Performance Report: Milliseconds are critical. https://www.akamai.com/newsroom/press-release/akamai-releases-spring-2017-state-of-online-retail-performance-report
- Deloitte and Google. (2020). Milliseconds Make Millions. https://web.dev/case-studies/milliseconds-make-millions
- Google. (2022). Vodafone Italy: a 31% improvement in LCP increased sales by 8%. https://web.dev/case-studies/vodafone
- W3C. (2023). Web Content Accessibility Guidelines (WCAG) 2.2. https://www.w3.org/TR/WCAG22/
Author
Joshua Jones is the principal architect of DBJ Technologies, a solo digital engineering studio in Royse City, Texas, working with service businesses across the Dallas-Fort Worth metro. Last reviewed May 5, 2026.