Matching Big Query Data with Google Analytics - Product Performance
Still in lockdown in Melbourne so still recreating Google Analytics reports in BigQuery (The fun never stops at my house). If you haven’t read the others I have covered:
This week I am looking at the product performance report. I always found that the shopping behaviour report was more useful than the summary report, but here I will combine them so you can get the revenue as well.
Getting the product names
In previous blog articles, we have been unnesting hits to get things like page paths or events. Now we need to unnest products underneath hits.
A note of warning, using a comma between your unnests is a CROSS JOIN so it will only output items in both areas, e.g. if we unnest hits and products the result is a table that only contains the hits with products in them. If you want all hits and all product you replace the comma with a LEFT JOIN.
So here we go with getting a list of all the products:
SELECT
product.v2ProductName,
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits,
UNNEST (hits.product) product
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
GROUP BY
v2ProductName
Shopping behaviour
Google actually makes it easy for us to get out which stage of the funnel each hit is (as long as you have set it up as enhanced eCommerce). We use hits.eCommerceAction.action_type
which is a series of numbers (though stored as a string) that correlates to a shopping stage.
The action type. Click through of product lists = 1, Product detail views = 2, Add product(s) to cart = 3, Remove product(s) from cart = 4, Check out = 5, Completed purchase = 6, Refund of purchase = 7, Checkout options = 8, Unknown = 0.
Using these stages you can count up each event. I’m just going up to step 6.
Also, note here we aren’t using total.visits = 1
, if you have a refund hit going from another platform they may not trigger a visit.
SELECT
product.v2ProductName,
sum(case when hits.eCommerceAction.action_type = "1" then 1 end) as productListView,
sum(case when hits.eCommerceAction.action_type = "2" then 1 end) as productDetailView,
sum(case when hits.eCommerceAction.action_type = "3" then 1 end) as addToCart,
sum(case when hits.eCommerceAction.action_type = "4" then 1 end) as removeToCart,
sum(case when hits.eCommerceAction.action_type = "5" then 1 end) as checkout,
sum(case when hits.eCommerceAction.action_type = "6" then 1 end) as transaction,
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits,
unnest (hits.product) product
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
group by v2ProductName
Transaction data
Attached to the product array, that has the product name, you can also get details on what was bought, productQuantity
and productRevenue
.
If you sum up product revenue it may not be the same as your total revenue as it won’t include shipping or any discounts across the entire cart.
You will need to only get the quantity and revenue from the transaction event, otherwise you could be getting quantity from add to cart or other events.
Also remember that revenue always needs to be divided by 1,000,000.
SELECT
product.v2ProductName,
sum(case when hits.eCommerceAction.action_type = "6" then product.productQuantity end) as qty,
round(sum(case when hits.eCommerceAction.action_type = "6" then product.productRevenue end)/1000000,2) as revenue,
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits,
UNNEST (hits.product) product
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
GROUP BY
v2ProductName
Putting it all together
Now we can add these all together and add in the percentages and averages. Such as the awkwardly named basket to detail rate which is addToCart/productDetailView
.
SELECT
*,
round(addToCart/productDetailView,2) AS basketToDetailRate,
round(transaction/productDetailView,2) AS buyToDetailRate,
round(qty/transaction,2) AS avgQuantity,
FROM (
SELECT
product.v2ProductName,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "1" THEN 1
END
) AS productListView,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "2" THEN 1
END
) AS productDetailView,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "3" THEN 1
END
) AS addToCart,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "4" THEN 1
END
) AS removeToCart,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "5" THEN 1
END
) AS checkout,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "6" THEN 1
END
) AS transaction,
SUM(CASE
WHEN hits.eCommerceAction.action_type = "6" THEN 1
END
) AS qty,
ROUND(SUM(CASE
WHEN hits.eCommerceAction.action_type = "6" THEN product.productRevenue
END
)/1000000,2) AS revenue,
FROM
`bigquery-public-data.google_analytics_sample.ga_sessions_*`,
UNNEST (hits) hits,
UNNEST (hits.product) product
WHERE
_table_suffix BETWEEN "20170701"
AND "20170731"
GROUP BY
v2ProductName )
I have to say that the GA data gives some funky results but I am going to put that down to their funky data. As usual, let me know if there is a report you would like to see or if you have any questions.