Analytics event sampling: why your data vanishes above 500K sessions

If you’re running a content site that pulls 500,000+ sessions a month, you’ve probably noticed something strange: your GA4 reports don’t always match. Run the same custom report twice, and the numbers shift. Filter by a specific page or traffic source, and suddenly your totals don’t add up.

You’re not imagining it. You’ve hit GA4’s sampling threshold.

Sampling means Google is analysing a subset of your data—not all of it—then extrapolating the results. It’s faster for Google to process, but it quietly erodes the accuracy of every decision you make based on those numbers.
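To make the idea concrete, here is a minimal sketch of sample-then-extrapolate on synthetic data. It is a simplified model, not Google’s actual algorithm: the function name, the 2% "newsletter" share and the fixed random seed are all assumptions made for illustration.

```python
import random

def extrapolate_count(events, predicate, sample_rate, seed=0):
    """Estimate how many events match `predicate` from a random sample."""
    rng = random.Random(seed)
    # Keep each event with probability `sample_rate`.
    sample = [e for e in events if rng.random() < sample_rate]
    matches = sum(1 for e in sample if predicate(e))
    # Scale the sampled count back up to the full population.
    return matches / sample_rate

# Synthetic traffic: 100,000 sessions, 2% of them from a niche source.
sessions = ["newsletter" if i % 50 == 0 else "organic" for i in range(100_000)]
true_count = sessions.count("newsletter")

for rate in (1.0, 0.5, 0.1):
    estimate = extrapolate_count(sessions, lambda s: s == "newsletter", rate)
    print(f"sample rate {rate:.0%}: estimated {estimate:.0f} (true {true_count})")
```

At a 100% rate the estimate equals the true count. At lower rates it drifts, and the drift is larger the rarer the thing you are counting.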

When sampling kicks in (earlier than you think)

Google’s documentation says standard GA4 properties can process up to 10 million events per query before sampling applies. That sounds like plenty of headroom.

But that limit counts events, not sessions, and it covers every event in the date range you query. In practice, sampling often starts at around 500,000 sessions a month, once you build custom reports, apply filters, or query long date ranges.

Standard reports (the pre-built ones GA4 ships with) use aggregated data tables and rarely sample. But the moment you:

  • Add a secondary dimension
  • Apply a segment or filter
  • Build an exploration report
  • Request data older than 14 months

…GA4 switches to on-the-fly querying, and sampling kicks in if your property is processing enough events.

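Check the data quality icon in the top-right corner of your report. A green tick means the report uses 100% of the available data. If the icon changes colour and shows a message along the lines of “This report is based on X% of available data,” with X below 100, your data is sampled.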

What gets lost in sampled data

Sampling doesn’t affect every metric equally. High-level numbers—total sessions, pageviews, users—tend to hold up reasonably well even at 20–30% sampling rates.

But the more specific your question, the worse sampling performs. If you’re trying to:

  • Identify which blog post drove the most newsletter signups from organic search last quarter
  • Compare conversion rates across traffic sources for a specific product page
  • Measure how a site-speed optimisation affected bounce rates on mobile

…sampling can distort the answer enough to reverse your conclusions.

I’ve seen sampled reports show a 15% conversion rate on a landing page when the unsampled data (pulled via BigQuery export) showed 11%. That’s not a rounding error—that’s a different business decision.
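A rough way to see why narrow questions suffer most is the standard margin-of-error formula for a proportion. The sketch below uses the normal approximation with an 11% rate. The sampled-session counts are hypothetical, and real GA4 sampling may not behave like a simple random sample.

```python
import math

def margin_of_error(rate, sampled_sessions, z=1.96):
    """Approximate 95% margin of error for a conversion rate (normal approximation)."""
    return z * math.sqrt(rate * (1 - rate) / sampled_sessions)

# Hypothetical landing page with an 11% conversion rate.
for n in (5_000, 1_000, 300):
    moe = margin_of_error(0.11, n)
    print(f"{n:>5} sampled sessions: 11% +/- {moe * 100:.1f} points")
```

The fewer sampled sessions land in the segment you care about, the wider the band around the reported rate.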

How to reduce or avoid sampling

If you’re consistently hitting sampling in GA4, you have four options.

Option one: Narrow your date range. Instead of analysing the last six months in one query, break it into monthly or two-week chunks. Smaller queries are less likely to sample. You’ll need to stitch the data together manually, but at least it’s accurate.
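Splitting a long range into calendar months can be scripted. This is a minimal sketch using only the standard library. The helper name `monthly_ranges` and the example dates are illustrative.

```python
from datetime import date, timedelta

def monthly_ranges(start, end):
    """Yield (first_day, last_day) pairs covering start..end, one per calendar month."""
    current = start
    while current <= end:
        # Work out the first day of the following month.
        if current.month == 12:
            next_month = date(current.year + 1, 1, 1)
        else:
            next_month = date(current.year, current.month + 1, 1)
        # Clip the final chunk so it never runs past `end`.
        last_day = min(next_month - timedelta(days=1), end)
        yield current, last_day
        current = next_month

for first, last in monthly_ranges(date(2024, 1, 15), date(2024, 4, 10)):
    print(first.isoformat(), "to", last.isoformat())
```

Additive metrics such as sessions and event counts can be summed across chunks. User counts cannot, because the same visitor can appear in more than one chunk.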

Option two: Use the Data API. The GA4 Data API gives you access to unsampled data if your query stays within certain limits. Tools like Google Sheets (via the GA4 add-on) or Looker Studio can pull data this way. It’s slower, and there are still daily quotas, but it bypasses the sampling you see in the GA4 interface.
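Whether an API response was sampled can be checked rather than assumed. The sketch below inspects a report response that has already been parsed from JSON. The field names (`metadata`, `samplingMetadatas`, `samplesReadCount`, `samplingSpaceSize`) are my reading of the Data API response format and should be checked against the current reference. The payloads are hand-written examples, not real API output.

```python
def sampling_percentages(report):
    """Return the percent of events read for each sampled date range.

    `report` is a report response parsed from JSON. An empty list
    means the response carried no sampling metadata.
    """
    entries = report.get("metadata", {}).get("samplingMetadatas", [])
    percentages = []
    for entry in entries:
        # 64-bit integers are assumed to arrive as strings in the JSON payload.
        read = int(entry["samplesReadCount"])
        space = int(entry["samplingSpaceSize"])
        percentages.append(100.0 * read / space)
    return percentages

# Hand-written example payloads (assumed shape).
unsampled = {"rows": [], "metadata": {}}
sampled = {"rows": [], "metadata": {"samplingMetadatas": [
    {"samplesReadCount": "2500000", "samplingSpaceSize": "10000000"}]}}

print(sampling_percentages(unsampled))
print(sampling_percentages(sampled))
```

Logging this value next to every pull makes it obvious when a query has crossed into sampled territory.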

Option three: Export to BigQuery. GA4 offers a free BigQuery export (up to 1 million events per day). Once your data is in BigQuery, you can query it without sampling. The trade-off: you need to learn SQL, and you’re managing your own data warehouse. But if you’re making six-figure decisions based on this data, it’s worth it.
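As a taste of what the SQL looks like, this sketch builds a query string that counts one event per page across the daily export tables. It assumes the standard export layout (`events_YYYYMMDD` tables with an `event_params` array). The project and dataset names and the `newsletter_signup` event are hypothetical placeholders. The function only builds the text and does not run it.

```python
import re

def signup_query(table_prefix, start, end, event_name="newsletter_signup"):
    """Build a BigQuery SQL string counting one event per page between two dates.

    start/end are YYYYMMDD strings matching the daily export table suffixes.
    """
    for value in (start, end):
        if not re.fullmatch(r"\d{8}", value):
            raise ValueError(f"expected YYYYMMDD, got {value!r}")
    # Restrict identifiers to safe characters before interpolating them.
    if not re.fullmatch(r"[\w.-]+", table_prefix):
        raise ValueError("unexpected characters in table_prefix")
    if not re.fullmatch(r"\w+", event_name):
        raise ValueError("event_name must be letters, digits or underscores")
    return f"""
SELECT
  (SELECT value.string_value FROM UNNEST(event_params)
   WHERE key = 'page_location') AS page,
  COUNT(*) AS signups
FROM `{table_prefix}.events_*`
WHERE _TABLE_SUFFIX BETWEEN '{start}' AND '{end}'
  AND event_name = '{event_name}'
GROUP BY page
ORDER BY signups DESC
""".strip()

print(signup_query("my-project.analytics_123456", "20240101", "20240331"))
```

Because the query reads every exported row in the date range, the counts it returns are not extrapolated.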

Option four: Upgrade to GA4 360. The enterprise tier starts at around $50,000/year (and can reach $150,000 or more depending on scale) and raises sampling thresholds significantly. Unless you’re running a seven-figure media operation, this isn’t realistic.

When sampled data is good enough

Not every report needs to be unsampled. If you’re checking whether traffic went up or down this week, or whether your top-performing post is still your top-performing post, a 70% sample is fine.

But if you’re deciding whether to double down on a traffic source, kill a product, or restructure your content strategy, don’t trust a sampled report. Pull the data via API or BigQuery, or narrow your query until the sampling badge disappears.

The costliest analytics mistake isn’t picking the wrong tool—it’s trusting incomplete data and not knowing it.

Using GA4 for attribution or conversion tracking? Subscribe to One Two Three Send and get one operator-focused article like this every day.
