In my first marketing job, there was a running joke. When performance was up, it was thanks to our hard work. But if performance dipped, it was always due to “seasonality.” The performance of our marketing campaigns is affected by many external factors outside of our control. When they have a positive impact, we tend to take credit without investigating further. When there’s a decrease in performance, we blame it on “the economy,” “the weather,” or “the pandemic.”
Source
What if we could remove human bias and automate these types of analyses, so we could see more clearly what performance would have been without these macro trends? That would help us with commentary on our weekly, monthly, and quarterly reports – as well as attributing credit where it’s due. We would be rewarded based on performance achieved despite economic headwinds, and we would stop tricking ourselves into thinking we’re geniuses in periods of high market demand.
In this tutorial, we’ll cover what external factors may be affecting your performance and walk through an example of how to use linear regression to estimate the impact of a world event (like the COVID-19 pandemic) on a business. You’ll be well equipped to combine Funnel, Google Sheets, and Google Data Studio to replicate and automate this analysis for your business, on any external factor you deem important to performance.
What external factors affect marketing performance?
There are several major factors that can directly affect marketing performance, and depending on the year, there could be more than one of them affecting you at once! They’re called external factors because they are outside of your control. They can significantly impact your marketing campaigns and can cause you to miss or overshoot performance goals if you haven't taken these factors into account.
- Military conflicts like the war in Ukraine can lead to supply chain issues, making products go out of stock.
- Pandemics like COVID-19 can cause lockdowns, forcing people to stay home while shopping more online or purchasing products they wouldn’t normally buy.
- Industry giants like Apple can change privacy settings (i.e. iOS14), allowing users to opt-out of tracking and impacting the accuracy of ad targeting.
- Important holidays like Black Friday in the USA can cause deal-seeking behavior in consumers and pull forward planned purchases to sale periods.
- Some products are seasonal, like ice cream during summer or chicken wings during American football season, leading to periods of peak demand.
- A competitor might raise millions in venture capital funding and significantly increase their marketing spend, stealing your customers and inflating your advertising costs.
Often, these external factors have more impact on the performance of your marketing campaigns than the daily decisions you make to optimize your campaigns. They tend to get significant executive attention because leaders need to explain to company stakeholders what’s happening. Yet they’re completely absent from most marketing reports. The culprit is a lack of statistical training in marketing teams, as most marketers aren’t comfortable doing a linear regression analysis – which is what we’ll learn to do in this post.
Major world events
Serious world events can range from the largely predictable to the wildly unexpected.
The COVID-19 pandemic began in January 2020 when images appeared of a lockdown in Wuhan, China. By April, the rest of the world started to shut down in different stages and of varying degrees of strictness. In the United States, a national emergency was declared on March 13, 2020, though the measures taken differed by state. Measures were adjusted in real-time, with stricter measures enforced to maintain health care capacity.
The pandemic had a significant impact on consumer behavior. Some industries (such as travel, fashion, restaurants) were heavily impacted, with consumers' lives changing so significantly that they radically changed their habits. Other industries thrived (including e-commerce, technology, construction) as demand surged with more time spent at home.
Trying to calculate the impact of a serious world event like COVID on your industry is difficult. The pandemic was not felt equally everywhere at the same time, as government policies differed across locations. Additionally, consumer behavior changes weren’t consistent at every stage of the pandemic, because the virus had various spikes and dips in infections along with different health outcomes for each of the multiple variants.
Building a macro report with linear regression
A simple before-and-after analysis isn’t enough in these scenarios. You need to do a linear regression analysis based on an index of how high or low the impact of the pandemic was at various points in time. Thankfully, Google released mobility data from Google Maps (the market leader in most countries), which shows an index of how much people were staying at home, traveling to work, or visiting retail spaces over the course of the pandemic.
1) Downloading and formatting mobility data
To first obtain the data, visit Google’s COVID-19 Community Mobility Reports here. You should download the Global CSV (since that’s a smaller file size) unless you need something specific from the Regional CSVs. Note that this is a massive file that eats up almost 1GB of data at the time of writing, and it’s unlikely you’ll be able to open it without Excel or Google Sheets crashing. You could extract the data you need with this handy Python script if you know how to code, but you can also just open the file in a plain text editor. I use VS Code, but Notepad or any other option will work. Next, delete everything apart from the lines you need.
Search for “United States,,,,,,” to quickly find where the US national data begins. Then, delete everything that comes before and after apart from the first line containing column headers.
Now, you’ve gone from about 10,000,000 rows to around 1,000, which is much easier to manage. You can keep the data in Excel or upload it to Google Sheets – either way, we’ll connect it to our other data later.
2) Connect your Funnel account
To do this type of analysis, you need all of your data in one place. Of course, you can do this manually or code a custom solution, but (for speed and ease) this is something Funnel can do for you. You can start the process of getting a Funnel account here, then under ‘Connect', select "Data Sources".
From there, you can click on each marketing platform you want to import data from, and it will walk you through authentication, then pull the data you need. Once you click ‘Allow’ to connect your account (in this case we’re connecting Google Ads) you’ll be prompted to choose which individual ad account you want to import.
For our purposes, we can go with the ‘default’ configuration, since we’ll just be pulling in basic spend and performance data. However, this is flexible and customizable if you have a more advanced use case. You can connect as many accounts as you would like (and your plan allows), and there are hundreds to choose from. For this analysis, I’ve added just one more: Facebook Ads. The data takes some time to import depending on the size of the account. For me, it took 10 minutes.
Now, let’s also import our Google Mobility data. This is the only part that’s manual and would need to be done each week, month, or however often you want to refresh the report. This could also be automated with some custom coding to regularly pull, clean, and transform the data if you would like. Next, append it to a Google Sheet for Funnel to pull from. Connecting a Google Sheet or uploading a CSV/Excel file works the same way as our other connectors, too. Just scroll down to the bottom of the ‘Connect new Data Sources’ modal.
When you click to connect your Google account, make sure you tick the boxes to give Funnel full permission to access your spreadsheets. If you miss this step, the connector won’t work, because it won’t be able to locate and read your Google Sheet.
The next step is to copy and paste the URL into the Sheet you want to pull the data from. Then, click ‘Load Sheets’.
For me this took no time at all, loading almost immediately. You will know it worked when you can see a preview of the sheet in front of you. Click “connect” to finalize the process. If you made a mistake, you can replace the URL and click “Load Sheets” to try again.
Presuming the file looks the same as mine and you retained the column headings, you should find that Funnel automatically detects which column is which. Plus, it will detect the datatype of each column. This is your chance to review and correct anything before proceeding. If it all looks good, hit “Save & Import.”
3) Export data from Funnel to Google Sheets
We now have all of our data in one place. The next step is to send it to our Google Sheet for modeling. Once we’ve built the model, we’ll then pull it back into Funnel to display it elsewhere (the modeling work itself can’t be done in Funnel).
If you click on “Google Sheets” under “Export,” you can then click the button “New Export.” That will lead you to a page where you can connect your Google account and either create a new sheet or choose an existing one.
Once you’ve done this and hit “Save & Run,” the automated job of updating that spreadsheet will begin. You can of course edit what you want to export, but I’ve just gone with the default of exporting all metrics. It’s important to note that, in some cases, this can slow Google Sheets down, though.
If the job doesn’t automatically run immediately, you can click “Run Now” to make the data appear.
4) Getting the data in the right format with a pivot table
The next step in front of us is to properly format the data for modeling. Right now we have cost, clicks, and impressions, but it’s broken down by data source from column B. That means there are multiple rows with the same date – one for each data source. Instead, we need one row per date with cost, impressions, and clicks aggregated across channels.
To do this, we just need to create a pivot table. The table should capture data in all rows of the column, not just what is currently visible. That's because you'll want to automatically collect any new data that Funnel pushes into the Google Sheet to update your model.
Click “Insert,” then “Pivot Table,” and choose to create it in a new sheet. Then, choose data as the rows, cost, clicks, and impressions as the values. Make sure your selection is correct (top right, “A:G” rather than “A1:G100”) to ensure the pivot table will continue updating. It also makes sense to filter out blank dates for all those empty rows you haven’t yet filled.
5) Joining COVID-19 data to performance data with VLOOKUP
Our next step in the process is to “lookup” the COVID-19 data in order to bring it into this sheet before we can create the model. To achieve this, we’ll use VLOOKUP, which lets us reference the date column which is common between the two sheets.
The first parameter is the lookup key: the date. Then, we select the full range from the date column to all of the mobility columns at the end of that table. Next, the second to last parameter is the column in that selection range, which you’ll want to pull into that column. To do that, start at 2 to pull the second column in while proceeding next to the following columns (3, 4, 5, etc.) until you have all the columns pulled in. Finally, put FALSE as the last parameter to make sure it does an exact lookup.
6) Calculating performance metrics: CPM, CPC, CTR
We want to know the impact that COVID-19 had on our marketing performance, so now is the time to calculate our performance metrics. We’re using CPM, CPC, and CTR, because they’re the three biggest indicators of market health and how well you’re bidding. It would also make sense to calculate conversion, CPA, or ROI metrics – but we’ll leave that for later. The formulas are as follows:
- CPM = cost / impressions * 1000
- CTR = clicks / impressions
- CPC = cost / clicks
7) Automating the analysis in Google Sheets with LINEST
In order to understand how COVID-19 affects the cost of running ads, we will try to predict CPM. As a result, that will be the first parameter in the LINEST function. For the second parameter, we need to ensure that the columns we’re using to predict performance are lined up sequentially. In the image below, the yellow columns reflect that we’ve purely referenced the cost and workplaces mobility data. Then, for the third and fourth parameters, we just entered the number 1 to get an intercept and verbose output.
Don’t worry. We’ll explain what that all means.
The output of the model can be a bit difficult to understand. The first important item to point out is that the top row (the coefficients) comes out in reverse order, with the intercept as the last one. The other important number is the r-squared, which is the first number in row 3. This is an indicator of how well the model fits the data, from 0 to 1. Given that our model is at 0.4, that’s an indication we may be missing some important variables that drive CPM. In the future, it might be useful to do work to improve the accuracy of this model by adding more explanatory variables to build a full marketing mix model. Ideally, the R2 should be over 0.6, but we’re okay to proceed with the tutorial for now.
8) Calculating the past contribution of each variable
The final spreadsheet work we need to perform is calculating the past performance of each variable and how they contributed to overall performance. This can be done with our model coefficients by multiplying them by the input values for that row.
This works since the model coefficient is essentially the rate at which the model believes the performance metric goes up for every unit of the input value. If the coefficient for workplaces COVID mobility is 1.34, that means CPM increased by $1.34 for every unit increase of people visiting work.
The likely explanation for this is that people not going to work meant more time on the internet and social media, thereby increasing ad inventory and decreasing the price.
Our goal in this exercise is to calculate CPM if the pandemic didn’t exist and compare that to the actual CPM during the pandemic. By analyzing this difference, we can understand what parts of the reporting period were most severely impacted by the lockdowns during the pandemic.
The formula for a linear regression model is simply to multiply each variable by its coefficient, then add them all together with the intercept. So, if we want to leave a variable out (i.e. predict the effect of no COVID-19), we just need to leave it out of the equation. For our model, that means multiplying the coefficient for media spend by the coefficient, then adding the intercept, and ignoring the COVID-19 variable.
9) Pulling the data back into Google Data Studio via Funnel
Finally, after all of that setup work, we’re ready to send the result back into Funnel. Since we now know what our CPM would have been without COVID, we can isolate the impact of COVID-19 on our performance metrics when building a report.
To import the data back into Funnel, we do what we did before with the Google Sheet import. This time, though, we navigate to the predictions tab before copying the URL to ensure it has the right GID on the end of the URL. You should be able to see the data come through correctly when we import it, which will show in the pane below.
It’s important that you pick the right metrics. For example, the metrics in the Google Sheet around the no-COVID prediction didn’t show up for me by default. I had to select them. Also, be sure to select the date so you can blend this data together for the chart eventually.
If you’re having issues with parsing the dates, it is likely because you’re accidentally copying across the grand total row or you have blank rows. You can remove these errors by deleting these rows or using an IF statement to show a blank (two quotation marks) instead. Then, go to edit the configuration in the warning, and click to import again.
10) Building a macro marketing report in Google Data Studio
The final thing to do is build the actual report in Google Data Studio. You could be a bit complex here, but I’m simply going to overlay the actual CPM with what the CPM would have been without COVID in order to visualize the impact over time.
As we update our report and COVID wanes or resurfaces, we’ll be prepared for both scenarios. If another pandemic or global crisis happens, we’ll know more about how that might affect our performance and can adjust accordingly. We can also explain these effects to executives in real-time.
Sending the data to Google Data Studio with Funnel couldn’t be easier. Simply click into Google Data Studio in the Funnel interface and click “New View.” By default, I’m pulling all my data into there so that I can have maximum flexibility. Then, click to authorize Google Data Studio and authenticate your Google account. That will create a Funnel data source in Data Studio. You can also complete this step manually by copying across your token and ID.
From this screen, you can click “CREATE REPORT,” and the source will appear within the report. Funnel does create a basic report template for you to make life easier, but we’re going to ignore that for this tutorial and start with a fresh page. If we click to create a chart and select a time series, we’re almost done.
Drag both the prediction data and the CPM data into the metrics section of the chart, and put the date as the dimension. The finishing touches are to choose the right date range (I selected the last 90 days) and then add a relevant title.
There we have it!
This is the type of insight that executives crave, and it only took us a little bit of setup time to automate it. Of course, this is just an illustrative example, but we can use this combination of Funnel data pipelines, linear regression, and predictions to really understand the impact of any variable on any other.
It should be relatively easy to repurpose what you’ve learned here to build something more complex or suited to your needs. Linear regression is the workhorse of statistics, and it works surprisingly well in a wide range of scenarios. It’s almost always worth seeing what you can identify with it first before moving onto more complex algorithms and methods.
Controlling for more variables with marketing mix modeling
As we’ve hinted at, this was a simple and not particularly accurate model, because we were ignoring a host of factors that could have also impacted CPM. In addition, it’s usually a no-no to include a variable on both sides of the equation (cost in our case). However, this example was simply meant to show you how linear regression and data pipelines can be combined to analyze anything you like!
In another example, we could model conversions instead of CPM. In that case, we’d know what contribution COVID had towards sales (likely negative). We could add more variables for the various factors that affect sales, like our spend on each ad channel, to understand more about our marketing attribution independent of what ad platforms or analytics are telling us.
When you add more variables to a linear regression in an effort to account for everything that might drive sales, it’s called marketing mixed modeling (MMM), and it’s a time-honored attribution technique from the 1960s.
It’s worth taking note of developments in the space, ever since Facebook released Robyn, their open-source MMM library, in response to Apple’s decimation of the tracking industry post-iOS14. As well as home-grown solutions which might not make sense for many brands who don’t want to take on the burden of building and maintaining a statistical tool, there are also several vendors reinventing the space. One great option is Recast, which came from Harry’s marketing science team, meaning their solution works great for eCommerce and retail.
Whether you’re using a homegrown solution or a 3rd-party vendor, don’t reinvent the wheel on getting all of your data in one place: Funnel can handle that part for you so you can focus on the rest.
About the author
Mike was a co-founder at Ladder, a 50-person growth marketing agency, leading operations, data science, and product. He is currently making simulator-based courses for data-driven marketers at Vexpower.