In July 2019, Google announced its new “App + Web” property, which would later be renamed Google Analytics 4. Immediately after the announcement, most attention went to the new event-based data model and the ability to track apps and websites within the same property.
As of early 2023, with the sunset date for Universal Analytics only months away, many marketers and web developers are noticing another key difference between Universal Analytics and Google Analytics 4 that may have previously been overlooked: why is the data retention period for Google Analytics 4 so short?
In this article, let’s explore what has happened around data retention in marketing in general, how it’s likely to impact the day-to-day work of marketers, and what you can do to future-proof your data stack.
The end of endless data
The Google Analytics “App + Web” announcement came shortly after GDPR legislation went into effect in Europe. The new property offered only two data retention options: 2 months and 14 months.
These options are significantly shorter than the maximum of 50 months of data retention offered by both standard Universal Analytics and the premium 360 version.
It remains unclear whether this initial data-retention decision was directly related to the GDPR legislation, and Google Analytics 4 has since added support for up to 50 months of data retention in its premium 360 offering. Either way, it marked the start of a trend.
When Amazon Ads launched its reporting API, data retention was capped at a mere 60 days. In October 2020, Facebook announced that it was introducing a data retention limit of 37 months.
While 37 months might seem like a long enough window for comparing current results to historical data, remember that 37 months ago marks the start of the COVID-19 pandemic and its business impact. If you want to know how your results today compare against pre-pandemic levels, or if you want to build reliable statistical models on a long history of data, you might no longer be able to do so with Facebook Ads data (or Amazon Ads data, for that matter).
Rate limits, tokens, and quotas
While access to historical data has become more limited, many marketers have also, unwillingly, had to learn about data concepts such as rate limits, tokens, and quotas. A key catalyst was Looker Studio’s announcement that Google Analytics 4 would start enforcing its documented quota limits.
In short, these quotas act as a data throttle. If you request too much data, or request it too frequently, the request is rejected and an error is returned. Very quickly, marketers whose Looker Studio reports fetched data through connectors querying the Google Analytics 4 API directly started seeing error messages that quota limits had been reached.
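To give a concrete picture of what dealing with quotas looks like in practice, here is a minimal sketch using the official Google Analytics Data API Python client and a placeholder property ID. It simply backs off and retries when the API reports that a quota has been exhausted; treat it as an illustration rather than a production-ready pattern.

```python
import time

from google.analytics.data_v1beta import BetaAnalyticsDataClient
from google.analytics.data_v1beta.types import (
    DateRange, Dimension, Metric, RunReportRequest,
)
from google.api_core.exceptions import ResourceExhausted

# Placeholder property ID, used for illustration only.
PROPERTY_ID = "123456789"


def run_report_with_backoff(client, max_retries=5):
    """Request daily sessions, backing off when GA4 quota limits are hit."""
    request = RunReportRequest(
        property=f"properties/{PROPERTY_ID}",
        dimensions=[Dimension(name="date")],
        metrics=[Metric(name="sessions")],
        date_ranges=[DateRange(start_date="28daysAgo", end_date="yesterday")],
    )
    for attempt in range(max_retries):
        try:
            return client.run_report(request)
        except ResourceExhausted:
            # Quota exhausted: wait progressively longer before retrying.
            time.sleep(2 ** attempt)
    raise RuntimeError("GA4 quota still exhausted after retries")


if __name__ == "__main__":
    report = run_report_with_backoff(BetaAnalyticsDataClient())
    for row in report.rows:
        print(row.dimension_values[0].value, row.metric_values[0].value)
```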
Why would Google Analytics start enforcing this when it makes for such a poor experience for end users who want to report on their data? It could be performance-related.
If there were no limits on how much data you could request or how frequently you could refresh it, heavy usage could be detrimental to other users of the API. It could also be a nudge toward storing data in an intermediary such as BigQuery (which happens to be free for Google Analytics 4 users) instead of querying the API directly.
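If you do go the intermediary route, the reporting side can read from the warehouse instead of the API. A minimal sketch, assuming the GA4 BigQuery export is enabled and using placeholder project and dataset names (the export lands in daily events_YYYYMMDD tables):

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder project and dataset; adjust to where your GA4 export lands.
query = """
    SELECT
      event_date,
      COUNT(DISTINCT user_pseudo_id) AS users
    FROM `my-project.analytics_123456789.events_*`
    WHERE _TABLE_SUFFIX BETWEEN '20230101' AND '20230131'
    GROUP BY event_date
    ORDER BY event_date
"""

# Each query hits your own BigQuery dataset, not the GA4 reporting API.
for row in client.query(query).result():
    print(row.event_date, row.users)
```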
No matter the exact reason, the question remains for marketers: what do I have to do now?
What now? Time to build a solid data foundation
Now that we know the era of freewheeling, ad-hoc API reporting is coming to a close, let’s look at what you need to ensure continued success.
Storage
When you can’t rely on directly querying APIs anymore, intermediary storage before visualizing the data becomes a must-have. The good news is that data storage has become cheap and the options are plentiful. However, don’t make the selection based purely on the lowest cost per gigabyte. Instead, treat storage as the prerequisite for the rest of the items on this list.
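As one illustration of the pattern, assuming BigQuery as the warehouse and a made-up destination table, a daily pull from an ads API can simply be appended to a table you control. Reports then query your storage rather than the source API, and your history outlives the platform’s retention window.

```python
import pandas as pd
from google.cloud import bigquery

# Placeholder destination; use whichever warehouse and table you control.
TABLE_ID = "my-project.marketing.facebook_ads_daily"


def store_daily_pull(rows: list[dict]) -> None:
    """Append one day's worth of API results to the warehouse table."""
    df = pd.DataFrame(rows)
    client = bigquery.Client()
    job = client.load_table_from_dataframe(
        df,
        TABLE_ID,
        job_config=bigquery.LoadJobConfig(write_disposition="WRITE_APPEND"),
    )
    job.result()  # Wait for the load job to finish.


# Example usage with stand-in numbers from an imaginary ads API response.
store_daily_pull([
    {"date": "2023-03-01", "campaign": "spring_sale", "spend": 120.5, "clicks": 830},
])
```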
Increasing connections
The marketing landscape is rapidly expanding. As of 2022, marketing teams are working with an average of 15 data sources, up from 10 in 2021, and that number is projected to increase to 18 by the end of this year.
While connections to only Google Analytics, Google Ads, and Facebook Ads might have been sufficient a couple of years ago, marketers today need data from a wealth of different sources. Make sure your tools can keep up with this growing number of connections.
Organizing the data
As the number of data sources increases, so too does the interest in escaping reporting silos. To achieve this, blending data across sources has become a necessity.
However, the data transformation needed to enable this blending has historically been reserved for data teams working with SQL.
That leaves the marketing team out of the loop, slowing down its pace and iteration cycles. Make sure that the solution you choose is built with end users (marketers) and their use case (improving marketing performance) in mind.
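As a rough sketch of what such a blend can look like, the example below joins hypothetical daily spend from two ad platforms to sessions by date and computes a blended cost per session. The column names and figures are made up for illustration.

```python
import pandas as pd

# Hypothetical daily extracts from two ad platforms and from analytics.
google_ads = pd.DataFrame({
    "date": ["2023-03-01", "2023-03-02"],
    "spend": [250.0, 310.0],
})
facebook_ads = pd.DataFrame({
    "date": ["2023-03-01", "2023-03-02"],
    "spend": [180.0, 160.0],
})
sessions = pd.DataFrame({
    "date": ["2023-03-01", "2023-03-02"],
    "sessions": [4200, 4750],
})

# Blend: total spend across platforms joined to sessions per day.
spend = (
    pd.concat([google_ads, facebook_ads])
    .groupby("date", as_index=False)["spend"].sum()
)
blended = spend.merge(sessions, on="date")
blended["cost_per_session"] = blended["spend"] / blended["sessions"]
print(blended)
```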
"Shareability"
Once the data is prepared, you need to make it available to the reporting tool. This might seem straightforward, but make sure you can share data between the tools of your choice seamlessly, and that you won’t end up with vendor lock-in if you ever change your reporting tool.
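One hedge against lock-in, sketched below with a placeholder table name and file paths, is to keep the prepared data in an open format such as Parquet or CSV alongside your warehouse, so any current or future reporting tool can read it.

```python
from google.cloud import bigquery

client = bigquery.Client()

# Placeholder source table holding the prepared, blended marketing data.
query = "SELECT * FROM `my-project.marketing.blended_daily`"
df = client.query(query).to_dataframe()

# Parquet and CSV are readable by virtually every BI and reporting tool,
# so switching dashboards later does not mean rebuilding the pipeline.
df.to_parquet("blended_daily.parquet", index=False)
df.to_csv("blended_daily.csv", index=False)
```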