Matching Big Query Data with Google Analytics - Acquisition Channel report
Continuing to recreate Google Analytics reports in Google BigQuery, this time looking at the Acquisition Channel report.
See my previous article: Matching Big Query Data with Google Analytics - Audience Overview. This covers the basics with:
Users
New Users
Sessions
Bounce Rate
Pages/session
Avg. Session Duration
So I won’t cover those again. In this article I’ll look at breaking these down by channel grouping. I will also cover the remaining metrics:
Transactions
Transactions work a bit differently to goals, goals are counted once per session where as transactions are multiple times in a session. There is also seperate metrics in the BigQuery table for transactions and lucky for us, transactionID, so no matter what you unnest the tables by you can get the number of transactions.
SELECT
channelGrouping,
count(distinct hits.transaction.transactionId) as transactions,
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
group by
channelGrouping
eCommerce Conversion Rate
Converstion rate is just transactions divided by sessions so then we have:
SELECT
channelGrouping,
round(transactions/sessions*100,2) as eCom_conversion_rate,
FROM (
SELECT
COUNT(DISTINCT CONCAT(fullvisitorId, CAST(visitId AS string), date)) AS sessions,
count(distinct hits.transaction.transactionId) as transactions
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
group by
channelGrouping)
Revenue
There are as many ways to get revenue as to cook an egg. You could use totals.transactionRevenue
if you aren’t unnesting anything. You could sum up hits.product.productRevenue
and add shipping. For this example I am going to use hits.transaction.transactionRevenue
. All revenue needs to be divided by 1,000,000 to get it to be in dollars, this is to allow for conversion to different currencies and what not.
SELECT
channelGrouping,
sum( hits.transaction.transactionRevenue/1000000) AS total_revenue
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
group by
channelGrouping
Goal completions
Your goals should be based on a page or event. Do not use smart goals, they suck. Duration and number of page view goals are also not well thought out. Goals are triggered once per session so to get the number of goals we just need to count all the distinct ids that saw a certain page or triggered a certain event. For this example we are counting people who saw the registration page.
SELECT
channelGrouping,
count(distinct case when hits.page.pagePath = "/registersuccess.html" then CONCAT(fullvisitorId, CAST(visitId AS string), date) end) as registration_goal
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
group by
channelGrouping
Goal Conversion Rate
Just like ecommerce conversion rate, goal conversion is total goals divided by sessions.
SELECT
channelGrouping,
round(registration_goal/sessions*100,2) as registration_conversion_rate
FROM (
SELECT
channelGrouping,
count(distinct case when hits.page.pagePath = "/registersuccess.html" then CONCAT(fullvisitorId, CAST(visitId AS string), date) end) as registration_goal
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
group by
channelGrouping)
Put it all together
So all the metrics together are below, channel groupings could be replaced with trafficSource.campaign or trafficSource.source or trafficSource.medium
SELECT
channelGrouping,
users,
new_users,
sessions,
ROUND(sessions/users,2) AS sessions_per_users,
ROUND(pageviews/sessions,2) AS pageviews_per_session,
ROUND(total_session_duration/sessions,2) AS avg_session_duration_secs,
ROUND(bounces/sessions*100,2) AS bounceRate,
transactions,
ROUND(transactions/sessions*100,2) AS eCom_conversion_rate,
ROUND(total_revenue,2) AS total_revenue,
registration_goal,
ROUND(registration_goal/sessions*100,2) AS registration_conversion_rate
FROM (
SELECT
channelGrouping,
COUNT(DISTINCT fullvisitorId) AS users,
COUNT(DISTINCT
CASE
WHEN visitNumber = 1 THEN fullvisitorId
END
) AS new_users,
COUNT(DISTINCT CONCAT(fullvisitorId, CAST(visitId AS string), date)) AS sessions,
COUNT(DISTINCT
CASE
WHEN hits.type = "PAGE" THEN CONCAT(fullvisitorID, CAST(visitId AS STRING), date, hits.hitNumber)
END
) AS pageviews,
SUM(totals.timeOnSite/ totals.hits ) AS total_session_duration,
COUNT(DISTINCT
CASE
WHEN totals.bounces = 1 THEN CONCAT(fullvisitorId, CAST(visitId AS string), date)
END
) AS bounces,
COUNT(DISTINCT hits.transaction.transactionId) AS transactions,
SUM( hits.transaction.transactionRevenue/1000000) AS total_revenue,
COUNT(DISTINCT
CASE
WHEN hits.page.pagePath = "/registersuccess.html" THEN CONCAT(fullvisitorId, CAST(visitId AS string), date)
END
) AS registration_goal
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
AND totals.visits = 1
GROUP BY
channelGrouping)
Next week I will move onto behaviour reports. I won’t do the pages report as this article outlines it just great: