The GA4 update: matching big query data with Google analytics
UPDATE: to get the right number of sessions you need to contact user psudeo id and ga session id
After my series of matching big query data with Universal Analytics being helpful. I thought with the forced update to GA4, and the ability for all to get raw data I would start the series again with GA4 data.
Google provides access to their sample GA4 data from their merchandise store: https://www.googlemerchandisestore.com/. The data set is bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*
This post will look at:
Users
New Users
Sessions
Number of Session per User
Page Views
Pages/Sessions
Avg. Session Duration
Bounce Rate
Users
Though users should be fairly easy to calculate because GA4 is all user-based, there could be a discrepancy between BQ and GA4 if you use Google Signals. See this article: https://support.google.com/analytics/answer/9445345?hl=en#zippy=%2Cin-this-article
As Google states:
Google Analytics 4 exports event data associated with anonymous cookies to BigQuery. Theoretically, this can result in the same user being counted multiple times across different devices. In contrast, Google signals measures users across devices. Therefore, data exported to BigQuery might show more users when compared with reports based on Google signals data.
But let’s keep things simple and assume that you didn’t activate Google Signals. Here we are getting the users for the month of January 2021
New Users
When people say that GA4 is event based, you better believe it. Everything is an event. Event if it is new users. So if you want to could how many new users you count the user ids that trigger that event:
Sessions
Also your sessions are events, but just make sure you count the timestamp and the user id or you would just get the number of users.
Pageviews
Though I find this metric useless without context (like which pages) some people like it as it usually a big number you can shout about. Again your page views are events so you can do the same as sessions.
Bounces and Bounce Rate
The bounce rate has been removed from GA4 in favour of the engagement rate (See below). Some people will still want bounce rate though. Here we are assuming that a bounce is if they only viewed 1 page, as there isn’t interactive/non-interactive events so we don’t want to count events.
Engagement rate
So what we really should be using is engagement rate. Which is easier to get than bounce rate.
Average Session Duration
Again this has changed slightly to engaged session duration. This will be lower than your Universal Analytics session duration as it only counts when the tab is in focus. Below I show how to do both.
Number of Sessions per User
This one is pretty easy after a couple of hard ones before. Just dived session by user.
Pages per Session
Again fairly similar to the previous - divid page views by sessions.
Putting it all together
So now if you want them all in one query, this time I’m using with clauses for the nested queries, here you go:
And there we are. I will keep on updating the queries from the UA series. If there is anything in particular you would like let me know.