Data Differences: GA4 Reports and Explorations
Standard reports and explorations are two of the most useful places in GA4 to find actionable insights into your web and app data, or a lead worth investigating further.
You’d generally expect them to show the same data, but quite often the numbers don’t match.
Before you blame your tracking, your website, or even your marketing team, it helps to understand what can cause these discrepancies.
Several factors can cause this data mismatch.
Let’s look at them individually to understand them better.
1. Unsupported Dimensions and Metrics
Reports and explorations are designed to offer different perspectives on your data, each operating at different granular levels.
For instance, certain dimensions and metrics available in standard reports are unsupported in explorations.
So, when you open a standard report in explorations and it contains unsupported fields, those fields are omitted from the exploration.
Any visualisation that was based on those fields changes as a result, which is why you might find yourself wondering why the line graph looks different.
2. Filtering Conditions
Filtering works differently between reports and explorations. In explorations, filters allow you to specify the match type (e.g. contains, exactly matches, or begins with), with case sensitivity playing a role.
In contrast, the table search filter in reports always uses a "contains" match type, and it's not case-sensitive.
Just to clarify, we are talking about the search box between the visualisation and data table, not the one at the top of the Analytics interface or the Add filter button.
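To see how the two filter behaviours can return different rows for the same search term, here is a minimal Python sketch. It is not GA4 code: the page paths are made up, the function names are hypothetical, and it assumes the exploration filter is treated as case-sensitive, as described above.

```python
# Hypothetical page paths; not taken from any real property.
rows = ["/Blog/ga4-guide", "/blog/ga4-guide", "/blog", "/pricing"]


def table_search(values, term):
    """Mimic the report table search: 'contains', ignoring case."""
    needle = term.lower()
    return [v for v in values if needle in v.lower()]


def exploration_filter(values, term, match_type="exactly matches"):
    """Mimic an exploration filter: explicit match type, case-sensitive (assumed)."""
    if match_type == "contains":
        return [v for v in values if term in v]
    if match_type == "begins with":
        return [v for v in values if v.startswith(term)]
    if match_type == "exactly matches":
        return [v for v in values if v == term]
    raise ValueError(f"unsupported match type: {match_type}")


report_rows = table_search(rows, "/blog")
exploration_rows = exploration_filter(rows, "/blog", "exactly matches")
print(len(report_rows), "rows in the report search")
print(len(exploration_rows), "rows in the exploration filter")
```

The same term keeps three rows in the report-style search and only one in the exploration-style filter, so any totals built on top of them will differ too.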
3. Comparisons vs. Segments
Comparisons act like segments, though with limitations, and reports allow comparisons to use fields that are not supported in explorations.
When you open a report with comparisons applied in explorations, the comparisons are converted into segments.
Any unsupported metrics or dimensions in the comparison are left out of the resulting segment, which can change what data is included or excluded, so the two parts of the GA4 interface show different results.
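The effect of a dropped condition is easier to see with a toy example. The sketch below is only an illustration: the sessions, the field names, and the idea that "audience" is the unsupported field are all assumptions made up for this example, not a list of what GA4 actually supports.

```python
# Hypothetical sessions and field names, for illustration only.
sessions = [
    {"country": "UK", "device": "mobile", "audience": "buyers"},
    {"country": "UK", "device": "desktop", "audience": "visitors"},
    {"country": "US", "device": "mobile", "audience": "buyers"},
    {"country": "UK", "device": "mobile", "audience": "visitors"},
]

# Assumption: pretend "audience" is a field explorations cannot use.
UNSUPPORTED_IN_EXPLORATIONS = {"audience"}


def matches(row, conditions):
    """True when the row satisfies every field == value condition."""
    return all(row.get(field) == value for field, value in conditions.items())


def to_segment(comparison):
    """Drop the conditions whose field is unsupported in explorations."""
    return {
        field: value
        for field, value in comparison.items()
        if field not in UNSUPPORTED_IN_EXPLORATIONS
    }


comparison = {"country": "UK", "audience": "buyers"}
segment = to_segment(comparison)

in_report = [s for s in sessions if matches(s, comparison)]
in_exploration = [s for s in sessions if matches(s, segment)]
print(len(in_report), "session(s) in the report comparison")
print(len(in_exploration), "session(s) in the exploration segment")
```

Losing one condition widens the segment from one session to three, which is exactly the kind of gap you would then see between the report and the exploration.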
4. Data Retention
Date ranges in explorations adhere to your property's data retention setting, which can be either 2 months or 14 months.
So, if you set a date range in an exploration that goes back further than this limit, data from before the retention window will not be included. It’s definitely a good idea to set retention to the maximum available 14 months.
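If you want to check in advance whether a requested date range will be cut short, a rough calculation like the one below can help. It is a sketch under an assumption: it approximates the retention window in whole calendar months counted back from today, which may not match exactly how GA4 computes the cut-off. The function names are hypothetical.

```python
import calendar
from datetime import date


def subtract_months(day, months):
    """Return the date `months` calendar months before `day`, clamping the day of month."""
    total = day.year * 12 + (day.month - 1) - months
    year, month_index = divmod(total, 12)
    month = month_index + 1
    last_day = calendar.monthrange(year, month)[1]
    return date(year, month, min(day.day, last_day))


def clamp_start(requested_start, today, retention_months):
    """Return the earliest start date an exploration could honour (approximation)."""
    earliest = subtract_months(today, retention_months)
    return max(requested_start, earliest)


today = date(2024, 6, 15)
requested = date(2023, 1, 1)
print(clamp_start(requested, today, retention_months=14))
print(clamp_start(requested, today, retention_months=2))
```

With 14 months of retention the request is trimmed to mid-April 2023, while with 2 months it is trimmed all the way to mid-April 2024, so the same exploration would cover very different periods.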
5. Cardinality and Sampling
When your property accumulates a substantial volume of data, Google Analytics 4 employs different methodologies in reports and explorations to balance cost and performance. These varying methods can lead to disparities between the two.
5a. Reports and Cardinality
Reports rely on daily aggregated tables with system limits. Data beyond these limits is aggregated under an "(other)" category.
How much high cardinality affects a report depends on the report type and the dimensions used, and it can change how data is grouped and what results you see.
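The "(other)" roll-up can be pictured with a small Python sketch. The row limit of 3 and the page view counts are invented for the example, and keeping the largest rows is an assumption: the real limits, and which rows survive them, are decided by Google's systems.

```python
def apply_row_limit(counts, row_limit):
    """Keep the `row_limit` largest values and roll the rest into '(other)'."""
    ranked = sorted(counts.items(), key=lambda kv: (-kv[1], kv[0]))
    kept = dict(ranked[:row_limit])
    overflow = sum(value for _, value in ranked[row_limit:])
    if overflow:
        kept["(other)"] = overflow
    return kept


# Hypothetical page view counts per page path.
page_views = {"/home": 500, "/pricing": 120, "/blog/a": 40, "/blog/b": 35, "/blog/c": 5}
limited = apply_row_limit(page_views, row_limit=3)
print(limited)
```

The total stays the same, but the two smallest pages are no longer visible on their own, so a query that looks for them individually would come back empty in the report.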
5b. Explorations and Sampling
Instead of relying on the daily aggregated tables, explorations query raw event and user-level data. When the quota limit (10M events for standard properties, 1B events for GA4 360 properties) is exceeded, Google Analytics 4 uses a sample of the data.
The larger the sample is relative to the full data set, the more accurate the query results generally are.
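To get a feel for how sample size affects accuracy, here is a small simulation. It is a sketch, not GA4's method: it assumes plain uniform random sampling, uses made-up event data, and the function names are hypothetical. The 10M figure is the standard quota mentioned above.

```python
import random


def sampling_rate(total_events, quota):
    """Fraction of events that would be read if the query is capped at `quota`."""
    return 1.0 if total_events <= quota else quota / total_events


def estimate_from_sample(events, rate, rng):
    """Count matching events in a uniform random sample, then scale up to the full set."""
    sample_size = max(1, round(len(events) * rate))
    sample = rng.sample(events, sample_size)
    return sum(sample) * len(events) / sample_size


print(sampling_rate(40_000_000, 10_000_000))  # share of events read for a 40M-event query

rng = random.Random(42)
# Hypothetical data: 1 = purchase event, 0 = any other event.
events = [1] * 2_000 + [0] * 98_000
true_total = sum(events)

for rate in (0.01, 0.1, 0.5):
    estimate = estimate_from_sample(events, rate, rng)
    print(f"rate={rate:.0%} estimate={estimate:.0f} true={true_total}")
```

Running it a few times with different seeds shows the estimates at the 1% rate wandering much further from the true total than those at 50%.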
6. Thresholding
In both reports and explorations, data may be withheld when Google Signals is enabled and the user count within the selected date range is low. If you see the dreaded triangle symbol at the top of a report, it means thresholding has been applied to your data.
This measure is there to safeguard user privacy, because a low user count can make it possible to infer an individual user’s identity. One way around it is to change the reporting identity setting to ‘Device-based’. Click on the admin cog → Reporting Identity → Show all.
Next, select the Device-based option and click on the blue Save button.
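Conceptually, thresholding behaves like a minimum user count applied to each row. The sketch below only illustrates that idea: the real threshold is system-defined and not stated here, so the value of 10, the city rows, and the function name are all made up.

```python
# Assumption: the real threshold is system-defined; 10 is an invented value
# used only to make the example concrete.
HYPOTHETICAL_MIN_USERS = 10


def apply_thresholding(rows, min_users=HYPOTHETICAL_MIN_USERS):
    """Split rows into those shown and those withheld for low user counts."""
    shown = [r for r in rows if r["users"] >= min_users]
    withheld = [r for r in rows if r["users"] < min_users]
    return shown, withheld


# Hypothetical report rows.
rows = [
    {"city": "London", "users": 240},
    {"city": "Leeds", "users": 12},
    {"city": "Bath", "users": 3},
]
shown, withheld = apply_thresholding(rows)
visible = sum(r["users"] for r in shown)
total = sum(r["users"] for r in rows)
print(f"{visible} of {total} users visible, {len(withheld)} row(s) withheld")
```

Narrow date ranges and detailed dimensions produce more small rows, which is why thresholding tends to bite harder on granular explorations than on broad reports.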
7. Processing Times
Google Analytics data is sourced from various systems and may undergo processing at different times. As a result, when querying data for the past 48 hours, slight variations in results might be observed due to processing time differences.
In fact, it’s best to leave the last 3 days of data out of your analysis, as it can take that long for GA4 to process everything.
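If you build date ranges programmatically, you can bake this buffer in. The helper below is a hypothetical sketch that treats "the last 3 days" as today plus the two days before it; adjust `settle_days` if you read the guidance differently.

```python
from datetime import date, timedelta


def stable_date_range(today, days_back, settle_days=3):
    """Return (start, end) for a `days_back`-day window that skips the last `settle_days` days."""
    end = today - timedelta(days=settle_days)
    start = end - timedelta(days=days_back - 1)
    return start, end


start, end = stable_date_range(date(2024, 6, 15), days_back=28)
print(start, "to", end)
```

Using the same helper for both the report and the exploration at least guarantees that neither is looking at data that is still being processed.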
8. Behavioural Modelling
When behavioural modelling for consent mode is enabled, slight data differences may arise between standard reports and explorations.
Behavioural modelling employs machine learning on two different data sets - aggregated tables for reports and raw event and user-level data for explorations.
Structural differences between these sets can result in minor variations in modelled data, especially when users decline consent for analytics cookies.
Conclusion
That brings us to the finish line, having covered the major reasons you can see differences between the standard GA4 reports and explorations.
Learning about these reasons helps us understand what’s happening under the hood, so we can mitigate what we can.
But the important bit is to know that you cannot eliminate these differences completely, so expect some disparity between the standard reports and explorations in GA4.
As you might have already learned, doing analysis within the GA4 interface can be a painful experience.
Whether or not it is Google’s strategy to push more and more people towards BigQuery, it can genuinely be more helpful in providing insights that are not marred by the interface.
Lastly, many businesses are under the impression that using BigQuery will cost them a lot, but with the free allowance of 10 GB of storage and 1 TB of query processing per month, it won’t be an issue for most businesses out there.
Not to mention you get to own that data and store it for as long as you want, compared with the 14 months maximum available for explorations in GA4.
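To sanity-check whether your own usage is likely to fit the free allowance mentioned above, a back-of-the-envelope calculation is enough. The sketch below uses the 10 GB and 1 TB figures from this article, assumes decimal units (1 TB = 1,000 GB), and the usage numbers and function name are hypothetical.

```python
FREE_STORAGE_GB = 10  # free storage figure quoted in this article
FREE_QUERY_TB = 1     # free monthly query processing figure quoted in this article


def within_free_tier(stored_gb, scanned_gb_per_query, queries_per_month):
    """Compare a month's estimated usage with the free allowance (decimal units assumed)."""
    scanned_tb = scanned_gb_per_query * queries_per_month / 1000
    return {
        "storage_ok": stored_gb <= FREE_STORAGE_GB,
        "query_ok": scanned_tb <= FREE_QUERY_TB,
        "scanned_tb": scanned_tb,
    }


# Hypothetical usage: 4 GB stored, 300 queries a month scanning about 2 GB each.
usage = within_free_tier(stored_gb=4, scanned_gb_per_query=2, queries_per_month=300)
print(usage)
```

Swap in your own numbers to see how much headroom you have before any charges would apply.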
So, how do you deal with the frustration of these differences, and how do you explain them to your clients? Let us know in the comments below.