Kissmetrics Blog

How to Avoid Corrupting Your Google Analytics Data

Do you want tasty data that helps you grow your business? Of course! Tasty data shows you how your customers behave so you can take action on what you learn.

But we need to keep our data clean.

Unless you’re careful, you can corrupt your Google Analytics data. Your clean tasty data will become dirty not-so-tasty data. Which means you won’t be able to learn anything about your customers.

Here are 5 rules you want to follow to keep your data tasty and delicious:

  1. Always test your filters before applying them to your main profile
  2. Use virtual pageviews only when you have to
  3. Never use campaign URLs for internal marketing
  4. Keep yourself from being hacked
  5. Only use the goals that you need

You’re about to learn how to follow each of them.

Testing Filters

Filters are powerful. They’re so powerful that they can completely nuke your data if you’re not careful.

Let’s step back and review how Google Analytics collects data:

  1. All day long, the Google Analytics servers collect raw data from your site.
  2. Once a day, Google Analytics compiles the data.
  3. Then it runs your data through your Google Analytics settings. This includes your different profiles, goals, and filters. Using your settings, Google Analytics permanently changes the data to match your instructions.
  4. The altered data is what you see in your reports.

Once Google Analytics runs the raw data through your settings, there’s no going back. The raw data is gone forever.

So if you have a filter that tells Google Analytics to take a hike and delete everything, your data goes poof. Adiós data!

Here’s the deal: the difference between a benign filter and a nuke filter is very small.

For example, most sites have a filter that removes company traffic from the reports. This is a great filter to have. After all, your employees behave very differently than your customers. To keep your data as accurate as possible (so you can get the best actionable insights), removing them from your Google Analytics reports makes a lot of sense. Usually, this filter is set up to exclude the range of IP addresses that your company uses.

But what if, instead of telling the filter to exclude traffic, you accidentally tell it to include traffic from your IP addresses? Now your Google Analytics reports will throw away everything EXCEPT your company traffic. All it takes is one little mis-click and your data is hosed. Since these options are in the same dropdown menu, this mistake is easy to make.

Seriously, there are about 10 pixels between you and certain death. Take a look:

[Screenshot: Google Analytics Filter]
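To see how small the difference is, here’s a minimal JavaScript sketch that simulates the two filter types on a handful of made-up hits. The `applyIpFilter` function, the sample IP addresses, and the `203.0.113.x` company range are all assumptions for illustration. This is not how Google Analytics implements filters internally.

```javascript
// Hypothetical hits; 203.0.113.x stands in for the company's IP range.
const sampleHits = [
  { ip: "203.0.113.10", page: "/admin" },   // employee
  { ip: "198.51.100.7", page: "/pricing" }, // customer
  { ip: "192.0.2.44", page: "/signup" },    // customer
];

const companyIps = /^203\.0\.113\.\d{1,3}$/;

// mode "exclude" drops matching hits; mode "include" keeps ONLY matching hits.
function applyIpFilter(allHits, pattern, mode) {
  return allHits.filter(function (hit) {
    const matches = pattern.test(hit.ip);
    return mode === "exclude" ? !matches : matches;
  });
}

const intended = applyIpFilter(sampleHits, companyIps, "exclude");
const nuked = applyIpFilter(sampleHits, companyIps, "include");

console.log(intended.length); // 2 customer hits survive
console.log(nuked.length);    // 1 hit survives, and it's the employee
```

Same pattern, same hits, one different word in the settings. The only thing that changes is which side of the match gets thrown away.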

So how do you avoid the filter nuke? Set up Safety Net Profiles. In addition to the main profile that you use for analysis, you need two more Google Analytics profiles:

  1. Test Profile: Set up your filters here first. If you nuke your data, it’s no big deal. Make sure it works, then apply the same filter to your main profile.
  2. Raw Data Profile: Don’t apply any goals, filters, or anything else. Just let this profile collect data in case of a critical failure with your other profiles.

These safety net profiles are so important that they’re one of the 8 Google Analytics features that every site MUST have enabled.

Overuse of Virtual Pageviews

Using a snippet of JavaScript, we can force Google Analytics to record a pageview whenever we want. This is ideal for tracking PDF downloads. Since the Google Analytics Tracking Code isn’t embedded in a PDF, there’s no way to tell if someone downloaded it.

Instead, we can tell Google Analytics to record a pageview when someone clicks the link to the PDF. We call it a virtual pageview because the page doesn’t actually exist.
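As a rough sketch, assuming your site uses the classic asynchronous tracking snippet (the one that defines the `_gaq` queue), a virtual pageview for a PDF link could look like this. The `trackPdfDownload` helper and the `/downloads/pricing-guide.pdf` path are made-up names for illustration, and other versions of the tracking code use a different call.

```javascript
// The standard async snippet defines this queue; the fallback keeps the sketch self-contained.
var _gaq = _gaq || [];

// Hypothetical helper: records a pageview for a file that carries no tracking code.
function trackPdfDownload(pdfPath) {
  _gaq.push(["_trackPageview", pdfPath]);
}

// In the page, you would call this from the link's click handler, for example:
// <a href="/downloads/pricing-guide.pdf"
//    onclick="trackPdfDownload('/downloads/pricing-guide.pdf');">Download</a>
trackPdfDownload("/downloads/pricing-guide.pdf");
```

The path you pass in is what shows up in your content reports, so pick something you’ll recognize later.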

But Google Analytics can’t tell the difference between a real pageview and a virtual one. Data from virtual pageviews gets merged into the rest of your metrics. If you track a few PDF downloads, this isn’t a big deal. The extra pageviews aren’t enough to throw anything off.

Let’s say we want to track the amount of time someone watches a product demo video. We’ve set up everything so that every 5 seconds, a virtual pageview gets recorded. We’ve named all the virtual pageviews differently so we can see how far people get into the video before they abandon it.

If the video gets a sizable amount of traffic, you could receive thousands upon thousands of extra pageviews. This will skew a huge portion of your data.
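To put rough numbers on that (all of them made up), here’s a quick back-of-the-envelope calculation in JavaScript. The `extraPageviews` function is a hypothetical helper, and it assumes every viewer watches to the end, so treat the result as an upper bound.

```javascript
// All inputs are hypothetical.
function extraPageviews(videoSeconds, intervalSeconds, viewers) {
  // One virtual pageview per completed interval, per viewer who watches to the end.
  const perViewer = Math.floor(videoSeconds / intervalSeconds);
  return perViewer * viewers;
}

// A 3-minute demo, a virtual pageview every 5 seconds, 1,000 viewers:
console.log(extraPageviews(180, 5, 1000)); // 36000
```

Even if most viewers bail halfway through, that’s still thousands of pageviews that never corresponded to a real page.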

So be careful with virtual pageviews and make sure you don’t use them excessively.

I recommend using them in two instances:

  1. To track downloads or any other action that represents a “pageview” that Google Analytics can’t track.
  2. To use an action as a step in a goal funnel. You can set up events as goals, but you can’t use them with goal funnels. So if you have an event you’d like to use in a funnel, use a virtual pageview instead. Check out our guide on the 4 Critical Goal Types That Are Critical To Your Business for all the details.

Using Internal Campaign URLs

So you have a banner or call to action on your site. You don’t sell advertising space; you’re trying to market your own products to your visitors. Many people assume that they can use a campaign URL with UTM parameters to track their internal marketing.

This is a very bad idea.

Every time someone clicks a campaign URL, Google Analytics starts them off on a new visit.

Normally, if I come to your site from an organic Google search, everything I do on your site is tracked under the same visit. But as soon as I click on your internal campaign, my visit gets split into two different ones.

Traffic data will then become completely corrupted. Your visits, pages/visit, time on site, bounce rates, exit rates, and just about everything else gets thrown out of whack. You won’t be able to trust your data at all.

Conversion rates for traffic sources also get corrupted. Google Analytics attributes conversions to the most recent traffic source. The only exception is direct visits: those conversions get passed to the traffic source before the last visit (if there is one).

But when you use internal campaigns, Google Analytics will give them the credit for the conversion. It will be much more difficult for you to figure out where your profitable traffic is coming from.
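Here’s a simplified JavaScript model of that attribution rule. The `creditedSource` function and the source labels are hypothetical, and the sketch assumes the fallback for direct visits skips back over every direct visit until it finds a non-direct source. It’s only meant to show who gets the credit, not to mirror Google Analytics exactly.

```javascript
// Simplified model: credit the most recent source, but let direct visits
// fall back to the nearest earlier non-direct source.
function creditedSource(sources) {
  for (let i = sources.length - 1; i >= 0; i--) {
    if (sources[i] !== "direct") {
      return sources[i];
    }
  }
  return "direct"; // nothing but direct visits
}

// Organic search, then a direct return visit that converts.
console.log(creditedSource(["google / organic", "direct"]));
// -> "google / organic"

// Organic search, then a click on an internal banner tagged with campaign parameters.
console.log(creditedSource(["google / organic", "homepage-banner / internal"]));
// -> "homepage-banner / internal"
```

In the second case the organic search that actually brought the visitor in gets no credit at all.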

Instead, choose one of these options to track your internal campaigns.

Event Tracking

With events, you can track just about any action that you want. But you’ll need to pass some data to Google Analytics with a snippet of JavaScript.

This data gets passed into your event reports where you see the conversion rates for all your events.

So if you have a bunch of different calls to action throughout your site, give them events, then see which ones encourage your visitors to convert.
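A minimal sketch of what that snippet might look like, again assuming the classic asynchronous tracking code and its `_gaq` queue. The `trackCallToAction` helper, the "Internal Campaign" category, and the label are names invented for this example.

```javascript
// The standard async snippet defines this queue; the fallback keeps the sketch self-contained.
var _gaq = _gaq || [];

// Hypothetical helper: one event per click on an internal call to action.
// The arguments after "_trackEvent" are category, action, and an optional label.
function trackCallToAction(label) {
  _gaq.push(["_trackEvent", "Internal Campaign", "Click", label]);
}

// Call this from the banner's click handler.
trackCallToAction("sidebar-free-trial-banner");
```

Using one category for all internal campaigns and a different label for each call to action keeps them grouped together in the event reports.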

For in-depth instructions on how to set all this up, go to the Google Analytics Event Tracking Guide.

Virtual Pageviews

As we’ve already covered, we can force a pageview into Google Analytics whenever we want. But be careful with these. If you start tracking too many virtual pageviews, the data can get out of control and you’ll have a terrible time figuring out what’s going on with your site.

If your internal marketing points to another page on your site that’s already being tracked by Google Analytics, you should use events instead of virtual pageviews.

But feel free to use them if you’re going to set up a goal funnel. To set up a funnel report that includes your internal marketing, you’ll need to use a virtual pageview. Events can’t be added to funnels.
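If you do go the funnel route, a naming scheme helps. The sketch below assumes the classic `_gaq` queue; the `/virtual/internal-campaign/` prefix and both helper functions are hypothetical choices, not anything Google Analytics requires.

```javascript
// The standard async snippet defines this queue; the fallback keeps the sketch self-contained.
var _gaq = _gaq || [];

// Hypothetical naming scheme: keep every internal-campaign step under one prefix
// so the virtual URLs are easy to spot in reports and to enter as funnel steps.
function internalCampaignPath(campaignName) {
  return "/virtual/internal-campaign/" + encodeURIComponent(campaignName);
}

function trackInternalCampaignClick(campaignName) {
  _gaq.push(["_trackPageview", internalCampaignPath(campaignName)]);
}

trackInternalCampaignClick("sidebar-banner");
// The funnel step URL would then be /virtual/internal-campaign/sidebar-banner
```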

Site Search

With a little ingenuity, we can hijack the Google Analytics site search and force it to track our internal campaigns.

It involves 3 basic steps:

  1. Create 1 or 2 custom utm parameters (this isn’t nearly as hard as it sounds)
  2. Add the custom utm parameters to your internal links
  3. Tell site search to track these utm parameters instead of the site search parameters (it’ll never know the difference)

All of your data will then pop up in your site search reports. It’s that easy.

Since this will completely monopolize the site search, you’ll want to set this up on a separate profile.
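Step 2 is the only part that touches your site, and it can be as small as a helper that tags your internal links. In this sketch, `ic` and `ic_cat` are made-up parameter names, `tagInternalLink` is a hypothetical helper, and `www.example.com` is a placeholder domain.

```javascript
// "ic" and "ic_cat" are invented names; pick any that your site doesn't already use.
function tagInternalLink(href, campaign, category) {
  // The base URL is only there so relative links can be parsed.
  const url = new URL(href, "https://www.example.com");
  url.searchParams.set("ic", campaign);
  if (category) {
    url.searchParams.set("ic_cat", category);
  }
  return url.pathname + url.search;
}

console.log(tagInternalLink("/pricing", "sidebar-banner", "free-trial"));
// -> "/pricing?ic=sidebar-banner&ic_cat=free-trial"
```

In the separate profile’s Site Search settings, you would then enter `ic` as the query parameter and `ic_cat` as the category parameter.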

Here’s what the Site Search Overview report looks like:

[Screenshot: Google Analytics Site Search Overview report]

After you hijack it, mentally replace every instance of “Site Search” with “Internal Campaign.” You’ll have all sorts of delicious data to play with.

Justin Cutroni has a complete guide on how this works. If you want to go this route, definitely give it a read.

Custom Variables

Custom variables are not for the meek and tidy. This is big dog territory. I would only go this route if you’ve already played around with custom variables and have a firm grasp of how they work. Events, virtual pageviews, and hijacking the site search are all much easier to implement.

If you want to dive down this rabbit hole, start with Google’s custom variables guide.

Getting Hacked

Yes, your Google Analytics account can be hacked. You see, you have complete control over where your site data goes. As long as you know the Property ID of an account, you can send your data to any Google Analytics account out there.

And it’s very easy to find the Property ID of any site. All you have to do is view the page source of the website. Once you have someone’s Property ID, you can send all of your site data to their Google Analytics reports. This will corrupt everything.

It can also happen to your reports. If someone does it to you, you won’t be able to tell what’s going on with your site. Every drop of value from your analytics will go whooshing out the door.

Does this actually happen? It sure does. I know this is shocking but there are evil people on the internet.

But there’s good news.

You can set up a filter to protect yourself from being hacked. It will completely protect you. To get step-by-step instructions for how to set it up, check out our blog post on How to Protect Your Google Analytics From Being Hacked.
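The linked post has the actual steps. As an assumption on our part, one common approach is an include filter on the hostname, so that only hits recorded on your own domain make it into your reports. The JavaScript below simulates that idea on made-up hits; `keepOwnTraffic`, `example.com`, and the sample hostnames are all hypothetical.

```javascript
// Hypothetical hits recorded under your Property ID.
const recordedHits = [
  { hostname: "www.example.com", page: "/pricing" },
  { hostname: "example.com", page: "/signup" },
  { hostname: "spammy-site.test", page: "/junk" }, // someone else using your ID
];

// Keep only hits whose hostname is yours (example.com is a placeholder).
const ownHostname = /^(www\.)?example\.com$/;

function keepOwnTraffic(allHits) {
  return allHits.filter(function (hit) {
    return ownHostname.test(hit.hostname);
  });
}

console.log(keepOwnTraffic(recordedHits).length); // 2
```

Since this is an include filter, rule #1 applies: try it on your test profile before it goes anywhere near your main one.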

Excessive Use of Goals

I’ve left this one for last because it’s not nearly as nefarious as the others.

Throughout your reports, you will see aggregate conversion rates that include ALL of your goals. If you’re diligent, you can segment your reports and make sure you see the conversion rate for critical goals like purchases, account signups, and form submissions.

But this adds another step. When it comes to analytics, you want to make it as easy as possible for yourself to pull out valuable insights that help you take action. The further you bury your numbers, the harder it will be to do this. If you keep your goals clean and only focus on tracking items that directly help your bottom line, you’ll instantly see what’s working and what’s not.

There are two goals in particular that the vast majority of sites should avoid activating:

  • Visit Duration
  • Pages/Visit

These goals won’t tell you anything useful; they’ll just inflate your conversion metrics. You’ll either waste your time trying to get to the real numbers or you’ll come to the wrong conclusions about how your site performs. Neither option is good.
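A quick calculation shows how much inflation we’re talking about. The numbers below are invented, `conversionRate` is a hypothetical helper, and the sketch assumes the aggregate rate simply adds up completions across all goals.

```javascript
// Returns a percentage. Multiplying first avoids floating-point noise in this example.
function conversionRate(completions, visits) {
  return (completions * 100) / visits;
}

// Hypothetical month: 1,000 visits, 20 purchases, and a visit duration goal
// that fires on 300 of those visits.
const monthlyVisits = 1000;
const purchases = 20;
const durationGoalCompletions = 300;

console.log(conversionRate(purchases, monthlyVisits)); // 2
console.log(conversionRate(purchases + durationGoalCompletions, monthlyVisits)); // 32
```

A 2% purchase rate is buried inside a 32% headline number, and nothing about the business changed.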

You should definitely read this post on why it’s important to avoid setting up visit duration and pages/visit goals.

Bottom Line

Now that you know the pitfalls that will corrupt your data, take care when dealing with them. You must be vigilant!

Specifically, make sure you:

  1. Avoid applying filters without testing them first
  2. Don’t use too many virtual pageviews
  3. Avoid using campaign URLs to track internal campaigns
  4. Protect yourself so you can’t get hacked
  5. Be careful with goals that you don’t need

If you follow these steps, your Google Analytics data will be rock solid. You’ll be able to dive into your reports with reckless abandon, gaining valuable insights at every turn. And that’s what it’s all about.

Keep your data clean and it’ll be much easier to figure out how to grow your business.

How do you keep your Google Analytics data clean? Tell us in the comments!

About the Author: Lars Lofgren is the Kissmetrics Marketing Analyst and has his Google Analytics Individual Qualification (he’s certified). Learn how to grow your business at his marketing blog or follow him on Twitter @larslofgren.

  1. Thank you very much for sharing this detailed and very useful article. Well, I think the title of this article should be How to correctly use Google Analytics.

  2. “Conversion rates for traffic course also get corrupted. Google Analytics attributes conversions to the most recent traffic source. The only exception is direct visits, those conversions get passed to the traffic source before the last visit (if there is one).”

    Is that really the case? What if the source before also was Direct and the source before that was Organic?

  3. I always get incorrect data on my Google Analytics account. Now I know what the problem was. Thanks for the post.

  4. Chris Countey Dec 31, 2012 at 10:26 am is throwing a 404. Do you have another resource you can recommend?

  5. “You can set up events as goals. But you can’t use them with goal funnels. So if you have an event you’d like to use in a funnel, use a virtual pageview instead”

    I thought that events could now be used in goal funnel?

  6. I’ll correct my comment.
    Would I be correct in saying that although goals can now be created based on events, currently they cannot be used in funnel visualization?

    • Yes Nick, that’s correct. Events can be used as goals but they can’t be used in a funnel visualization. The workaround is to trigger virtual pageviews instead of events. Then you’ll be able to use those URLs as steps in your funnel.

  7. I sure enjoyed a bunch of post where no one bashed someone and when another gave good advice it was appreciated.
    Well done all!

  8. Excellent article. Just what I needed, many thanks!

  9. Great article. So if I did the wrong setting with the filter with include/excluding data, then it stopped tracking all visits to the site – is it possible to get the data back?

