Linus Larsson

GTM variable that removes PII from query string

This is the first GTM template I've added to the Template Gallery. The variable lets you modify the query string of a URL so that you don't send PII to third-party services by mistake. You can find the template at https://tagmanager.google.com/gallery/#/owners/qvalento-gtm-templates/templates/query-string-modifier or in the Template Gallery within the GTM interface. Below, I explain how it works and how you should use it.

First of all, the template needs permission to read from the URL and the Data Layer. From the URL it reads the protocol, host, path and query. From the Data Layer it only reads gtm.elementUrl. The template also needs permission to log to the console, which only happens when debugging or previewing.

Creating a variable from the template

Start off by adding the template to your container: head to Templates in the left menu and search for "Query String Modifier" in the Gallery. Once you've added it, you can create a new variable with the template as its type.

As you can see in the image above, you can choose where the URL is read from. By default it uses the Page URL, but you can also pick Element URL to capture the URL of a clicked link or a submitted form. If neither of these built-in choices meets your needs, you can add a custom variable instead.

The checkboxes are where the magic happens. If you check the first one, the query string is returned with a leading question mark. This comes in handy when you want to append the query string to a custom-built URL. More on this later.

The second checkbox allows you to whitelist specific parameters and block everything else. This is the solution for removing PII. The reason you can't blacklist instead is that you would lose a lot of control over filtering out PII that way: you might not know when a new query parameter is added to the website.
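To make the difference concrete, here is a minimal Python sketch. It is not the template's actual code (GTM templates run as sandboxed JavaScript); the helper names, the "phone" parameter and the sample values are all made up for illustration.

```python
def drop_blocked(query, blocked):
    """Blacklist approach: remove only the parameters we already know about."""
    pairs = [p for p in query.split("&") if p]
    return "&".join(p for p in pairs if p.split("=", 1)[0] not in blocked)


def keep_allowed(query, allowed):
    """Whitelist approach: keep only the parameters we explicitly trust."""
    pairs = [p for p in query.split("&") if p]
    return "&".join(p for p in pairs if p.split("=", 1)[0] in allowed)


# Hypothetical situation: the site starts adding a "phone" parameter
# that nobody remembered to put on the blacklist.
query = "utm_source=self&email=name@lynuhs.com&phone=0701234567"

leaky = drop_blocked(query, blocked={"email"})
safe = keep_allowed(query, allowed={"utm_source", "utm_medium"})

print(leaky)  # the phone number slips through the blacklist
print(safe)   # only the trusted parameter survives
```

With a blacklist, every new parameter is let through until someone notices it. With a whitelist, every new parameter is blocked until someone approves it.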

When you check the box to activate whitelisting, a new drop-down appears. By default, the variable removes non-whitelisted parameters, but you can also choose to redact their values instead. In that case, you can choose the text that replaces the value. Leave this field empty to remove the value but keep the parameter name followed by an equals sign.

At the bottom of the variable settings box, you can add parameters to whitelist. Simply add one parameter per row. The ones in the image below are important to whitelist, since they are used by Google Analytics. Make sure to also add the parameter that is used for internal search.

Examples

Let's say we have the following URL: https://lynuhs.com/?utm_source=self&utm_medium=referral&email=name@lynuhs.com. The variable would then return one of the following query strings depending on setup:

  • ?utm_source=self&utm_medium=referral&email=[REDACTED]
  • ?utm_source=self&utm_medium=referral&email=
  • utm_source=self&utm_medium=referral&email=[REDACTED]
  • utm_source=self&utm_medium=referral&email=
  • ?utm_source=self&utm_medium=referral
  • utm_source=self&utm_medium=referral
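The six results above can be reproduced with a short Python sketch. This only illustrates the behaviour described in this post and is not the template's source code. The function name and its arguments are hypothetical, and returning an empty string when nothing is left to return (even with the question mark enabled) is my assumption here.

```python
from urllib.parse import urlsplit


def modify_query(url, whitelist, redact=False, replacement="", add_question_mark=False):
    """Return the query string of `url` with non-whitelisted parameters handled.

    redact=False removes non-whitelisted parameters entirely.
    redact=True keeps their names and swaps the value for `replacement`.
    """
    query = urlsplit(url).query
    kept = []
    for pair in query.split("&"):
        if not pair:
            continue  # skip empty segments such as a trailing "&"
        name = pair.split("=", 1)[0]
        if name in whitelist:
            kept.append(pair)  # whitelisted: keep exactly as written
        elif redact:
            kept.append(f"{name}={replacement}")
    result = "&".join(kept)
    # Assumption: no question mark is added when the result is empty.
    if result and add_question_mark:
        result = "?" + result
    return result


url = "https://lynuhs.com/?utm_source=self&utm_medium=referral&email=name@lynuhs.com"
whitelist = {"utm_source", "utm_medium"}

print(modify_query(url, whitelist, redact=True, replacement="[REDACTED]", add_question_mark=True))
print(modify_query(url, whitelist, add_question_mark=True))
```

The first call prints the first item in the list and the second call prints the fifth; the other four follow from toggling add_question_mark and leaving replacement empty.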

Where to use it

So where should we add this to avoid sending PII to third-party services? We have to make sure that those services receive the modified query parameters and not the original ones. I will show you how to do this for Google Analytics, but it's basically the same for any third-party service.

To be absolutely certain that the URL never includes the original query string, we have to overwrite the location parameter. My recommendation is to use two variables: one that removes all non-whitelisted query parameters, which goes in the location field, and another that redacts the values instead, which you send as a custom dimension to Google Analytics. With this method, you avoid seeing unnecessary query parameters in your reports, but you will still be able to see the parameter names if you add the custom dimension as a secondary dimension (in case you need to troubleshoot).

  • Location: {{URL - Protocol}}://{{Page Hostname}}{{Page Path}}{{VARIABLE 1 - REMOVING PII}}
  • Custom Dimension: {{VARIABLE 2 - REDACTING PII}}

For the first variable, it's important that you have checked the box for adding a question mark to the query string. Otherwise the parameters end up glued directly onto the page path, Google Analytics won't understand that the string is supposed to be query parameters, and your reports will look weird.

Note: If your website uses fragments in the URL, you might have to handle these separately and add them to the location field yourself.
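The location value in the list above is plain string concatenation, which can be pictured with the Python sketch below. It is not GTM code: build_location, the keep_fragment flag and the "#section" fragment are hypothetical, and cleaned_query stands in for the output of the first variable (with the leading question mark enabled).

```python
from urllib.parse import urlsplit


def build_location(page_url, cleaned_query, keep_fragment=False):
    """Rebuild protocol://hostname/path and append the already cleaned query."""
    parts = urlsplit(page_url)
    # Mirrors {{URL - Protocol}}://{{Page Hostname}}{{Page Path}}{{VARIABLE 1}}
    location = f"{parts.scheme}://{parts.hostname}{parts.path}{cleaned_query}"
    # The fragment is not part of the pieces above, so it is lost
    # unless it is added back explicitly.
    if keep_fragment and parts.fragment:
        location += "#" + parts.fragment
    return location


page_url = "https://lynuhs.com/?utm_source=self&utm_medium=referral&email=name@lynuhs.com#section"
cleaned_query = "?utm_source=self&utm_medium=referral"

print(build_location(page_url, cleaned_query))
print(build_location(page_url, cleaned_query, keep_fragment=True))
```

Keep in mind that a fragment can carry sensitive values too, so it may deserve the same scrutiny as the query string before you add it back.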

© Copyright - Lynuhs.com - 2018-2021