Linus Larsson

Conversion rate per page in Google Analytics with R

I often get asked how to evaluate whether a page has been useful in the customer journey on a website. There are many KPIs you could use for this, but a lot of them aren't very good. For example, I think page value is a poor metric for this purpose, since Google Analytics splits the transaction value across all pages visited during the session. So if a user browses multiple pages and all of them are somewhat relevant, the specific page we want to follow up on gets little credit, when in reality it was a great page for that user.

One metric that I like to look at is the conversion rate for a page when it has been visited during the session. It is calculated as the number of sessions in which the visitor both viewed the page and bought something, divided by the unique pageviews for that same page.
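As a quick illustration of that definition, the snippet below works through the arithmetic in R. The page paths and counts are made up for the example and do not come from any real view.

```r
# Hypothetical numbers, for illustration only
pages <- data.frame(
  pagePath = c("/page1/", "/page2/"),
  uniquePageviews_all = c(200, 50),       # sessions that viewed the page
  uniquePageviews_customers = c(10, 5)    # of those, sessions with a transaction
)

# Conversion rate per page = converting sessions / all sessions that saw the page
pages$conversionRate <- round(
  pages$uniquePageviews_customers / pages$uniquePageviews_all, 4
)

print(pages)
```

In this toy data, /page2/ has far fewer views than /page1/ but converts at twice the rate, which is the kind of difference this metric is meant to surface.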

Previously I've used segments in Google Analytics to compare the unique pageviews between all users and sessions with transactions. This is, however, a very slow process, especially if you have a lot of pages to analyze.

So how can we do this in a better way and scale the solution? With R we can do it quickly and efficiently!

First we load the googleAnalyticsR library and specify which view to import data from. You can select the view either by name or by typing in the view id directly. We also create variables for the time period we want to collect data from.

# Created by Linus Larsson
# 2019-01-07
# http://34.78.24.69
#install.packages("googleAuthR")
#install.packages("googleAnalyticsR")
library(googleAnalyticsR)
library(googleAuthR)
# Create connection to Google Analytics
ga_auth()
# Get all views and select the id you want to use
account_list <- ga_account_list()
ga_id <- account_list[which(account_list$webPropertyName=="PROPERTY NAME" & account_list$viewName == "VIEW NAME"),'viewId']
# If you know the view id you could just type it in instead of the above code
#ga_id <-9387XXXX
# Select which dates the report should include
start <- "2019-01-06"
end <- "2019-01-06"

In the next part of the script we create the function that extracts data from Google Analytics based on the chosen time period, dimensions, metrics, filters and segments. Note that you can leave out both the filter and the segment. The function will then apply a filter that matches the regular expression ".*" on the first dimension and use the standard segment "All Users".

gaGetData <- function(id, start, end, dimensions, metrics,  dimF="not set", segment = segment_ga4("All Users", segment_id = "gaid::-1")){
  # No filter supplied: fall back to a catch-all regex on the first dimension
  if(!(is.list(dimF))){
    dimF <- dim_filter(dimensions[1], "REGEXP",".*")
    dimF <- filter_clause_ga4(list(dimF), operator = "OR")
  }
  
  df <- google_analytics(viewId = id, 
                         date_range = c(start,end),
                         metrics = metrics,
                         dimensions = dimensions,
                         dim_filters = dimF,
                         segments = segment,
                         anti_sample = TRUE,
                         max = -1)
  
  return (df)
}

The last part of the script is a function that compares unique pageviews for all visitors with those of the customers. The function takes an argument called 'pages'. Leave it out to export data for all pages, or pass a vector of the pages you want data for. Pretty simple, right? You can now start comparing your pages in terms of how many of the visitors who viewed a page also decided to buy something.

gaCalculatePageCR <- function(id, start, end, pages = "ALL"){
  # COLLECT ALL PAGEVIEWS
  if (pages[1] == "ALL"){
    pages <- ".*"
  } else {
    pages <- paste0("^", paste(pages, collapse = "$|^"), "$")
  }
  
  dimf <- dim_filter("pagePath","REGEXP",pages)
  fc <- filter_clause_ga4(list(dimf), operator = "OR")
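  # Unique pageviews per page for all users
  all <- gaGetData(id, start, end, c("date","pagePath"), c("uniquePageviews"), fc)
  # Same query for segment gaid::-10 (assumed to be the built-in "Sessions with Transactions" segment)
  con <- gaGetData(id, start, end, c("date","pagePath"), c("uniquePageviews"), fc, segment_ga4("Segment",segment_id="gaid::-10"))
  
  # An empty result comes back as NULL, so stop with a clear message instead of failing on colnames()
  if (is.null(all)) stop("No pageview data returned. Check the view id, the dates and the page filter.")
  if (is.null(con)) stop("No data returned for the transactions segment (gaid::-10).")
  
  # Drop the third column (the segment name) and rename the metric column
  all <- all[-3]
  con <- con[-3]
  colnames(all) <- c("date","pagePath","uniquePageviews_all")
  colnames(con) <- c("date","pagePath","uniquePageviews_customers")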
  
  data <- merge(all, con, by = c("date","pagePath"), all.x = TRUE)
  data$conversionRate <- round(data$uniquePageviews_customers / data$uniquePageviews_all, 4)
  
  return (data)
}
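One thing to be aware of is that the page paths are pasted into the regular expression as they are, so a path containing characters such as "." or "?" is interpreted as a pattern rather than as literal text. The sketch below shows one way to escape those characters first. The helper name buildPagePattern is my own and is not part of the script or of googleAnalyticsR.

```r
# Hypothetical helper: build an exact-match pattern from a vector of page paths
buildPagePattern <- function(pages = "ALL") {
  if (pages[1] == "ALL") {
    return(".*")
  }
  # Put a backslash in front of regex metacharacters so they match literally
  escaped <- gsub("([.\\\\|()\\[\\]{}^$*+?])", "\\\\\\1", pages, perl = TRUE)
  # Anchor every path so that only exact matches are returned
  paste0("^", paste(escaped, collapse = "$|^"), "$")
}

pattern <- buildPagePattern(c("/page1/", "/index.html"))
print(pattern)

# Quick local check with base R before sending the pattern to the API
print(grepl(pattern, c("/page1/", "/index.html", "/indexXhtml")))
```

The resulting string could be passed to dim_filter in place of the pattern built inside gaCalculatePageCR, assuming the API treats the escaped characters the same way base R does.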

To run the analysis, simply call gaCalculatePageCR!

gaCalculatePageCR(ga_id, start, end, c("/page1/", "/page2/"))
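To get a feel for the output without connecting to Google Analytics, the merge step can be tried on two small hand-made data frames. The values below are hypothetical stand-ins for the two API results. Because the merge keeps every row from the all-users data, a page with no converting sessions ends up with NA, and this sketch shows one optional way to turn that into a conversion rate of 0.

```r
# Hypothetical stand-ins for the two API results
all_views <- data.frame(
  date = as.Date(c("2019-01-06", "2019-01-06")),
  pagePath = c("/page1/", "/page2/"),
  uniquePageviews_all = c(120, 80)
)
customer_views <- data.frame(
  date = as.Date("2019-01-06"),
  pagePath = "/page1/",
  uniquePageviews_customers = 6
)

# all.x = TRUE keeps every page from all_views, even without a match
result <- merge(all_views, customer_views, by = c("date", "pagePath"), all.x = TRUE)

# Optional: treat pages without converting sessions as 0 instead of NA
result$uniquePageviews_customers[is.na(result$uniquePageviews_customers)] <- 0

result$conversionRate <- round(
  result$uniquePageviews_customers / result$uniquePageviews_all, 4
)

print(result)
```

Whether NA or 0 is the better value depends on how you plan to use the table afterwards; NA makes it obvious that the page had no rows in the transactions segment, while 0 is easier to sort and plot.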

The full script is available on GitHub. I also wrote a query for Google BigQuery if you would prefer to do your analysis there. I apologize in advance if the query is too complicated; I'm not an expert in writing SQL... yet.

Comments

  1. Jørn Reidel

    2019-03-12 10:43

    Hi, thanks for the blog post. Just a small comment; since you query the v3 API, you have to change “df <- google_analytics(id, ….." to "df <- google_analytics_3(id,…..", if you're not using a very old version of googleAnalyticsR. Or even better, update everything to v4 syntax!

    • Linus

      2019-03-12 11:01

      Hi, thanks for the input Jørn! I had updated the GitHub code but had forgotten about the post. It should now work with the v4 syntax. Please feel free to get back to me if I missed something in the code.

      • Jørn Reidel

        2019-03-12 14:09

        Nice approach to page conversion, I tested it and it’s working as expected. I changed date dimension to isoYearIsoWeek, which I think is more useful. I like Page Value in GA, since it’s easily available in the interface (and api). But be careful to compare page value on different pages. I tend to look at the most important pages on the site, and compare page value week over week or month over month for the page.

  2. Josh

    2020-01-08 20:45

    Thank you so much for putting this together!

    The gaCalculatePageCR keeps throwing this error at me, it seems to not be able to pull the data,(i have run the 1st function on its own and it is grabbing data from GA ). I am a NOOB when it comes to R, but i have some programming background.

    any ideas?
    thank you for your time 🙂

    “2020-01-08 14:41:23> anti_sample set to TRUE. Mitigating sampling via multiple API calls.
    2020-01-08 14:41:23> Finding how much sampling in data request…
    2020-01-08 14:41:24> Downloaded [0] rows from a total of [].
    2020-01-08 14:41:24> No sampling found, returning call
    2020-01-08 14:41:24> Downloaded [0] rows from a total of [].
    2020-01-08 14:41:24> anti_sample set to TRUE. Mitigating sampling via multiple API calls.
    2020-01-08 14:41:24> Finding how much sampling in data request…
    2020-01-08 14:41:25> Downloaded [0] rows from a total of [].
    2020-01-08 14:41:25> No sampling found, returning call
    2020-01-08 14:41:25> Downloaded [0] rows from a total of [].
    Error in `colnames<-`(`*tmp*`, value = c("date", "pagePath", "uniquePageviews_all")) :
    attempt to set 'colnames' on an object with less than two dimensions"

    • Linus Larsson

      2020-03-17 11:17

      Hi, sorry for a late reply. It seems like you’re not downloading any data. Make sure you’re using the correct view id and that the segment for converting customer has the same ID.

© Copyright - Lynuhs.com - 2018-2024