Matching Big Query Data with Google Analytics - Page value in GA4 with BigQuery

Thanks to Marquin Smith for this great post on how you get page value from your GA4 data.

What is page value?

The page value metric in Google Analytics is a metric that tells you how important each page is in driving revenue for a website.

If a page is often viewed in sessions during which conversions or transactions happen, it would have a high page value. The converse is also true. If a page is hardly ever viewed in sessions on the way to a conversion or transaction, it would have a low page value.

Compared to the overall page views metric, the page value metric measures the importance of a page to the business.

If a page never appears in a converting session, that page will have a $0 page value.

To be able to calculate the eCommerce page value for a particular website, there needs to be eCommerce tracking set up (with revenue being recorded) set up in Google Analytics:

Sounds great, how do I get the Ecommerce page value

Ecommerce Page value was a metric that could be obtained from the Google Analytics interface for Universal Analytics, but this is no longer the case for GA4.

So, as the title suggests, this post will show how to get Ecommerce page value by using BigQuery.

Calculating Ecommerce page value in detail

The official Google resource for understanding page value is beneficial.

Ecommerce page value and page value are subtly different, I will explain some details below. An SQL query will be provided later on that will calculate the Ecommerce page value for all pages in a given period (Assuming that you are collecting your GA4 data into BigQuery).

The formula for calculating page value for each page is as follows.

(Ecommerce Revenue + Goal Value) / unique pageviews

The Ecommerce page value definition is below. (Goal values are ignored for Ecommerce page values):

Ecommerce Revenue / unique pageviews

One way to get the page value metric in a GA4 world is to use the query here which will get the page value for each page in March 2024.

To make the query applicable to you, simply make the following changes:

  • Update to the desired date range

The dates on lines 7 and 8 of the SQL script to the desired date range (Dates are represented in the 'YYYYMMDD' format). For example, to get the page values for pages in November 2023, the first date variables would look like the following.

-- set variables
DECLARE my_start_date STRING DEFAULT "20231101";
DECLARE end_date STRING DEFAULT "20231130";
  • Update the query to point at your tables that have your GA data

Lines 25 and 42 will need to be updated to point at your dataset where Google Analytics raw data is being loaded.

-- CURRENT
FROM `<project>.<database>.events_*`

-- EXAMPLE 
FROM `bigquery-public-data.ga4_obfuscated_sample_ecommerce.events_*`

The query will output a table with the following columns.

  • page_location :: The URL or page that the following metrics relate to
  • total_unique_pageviews :: The total number of sessions that page appeared in
  • page_value :: eCommerce revenue / unique_pageviews_in_converting_sessions

Details on how the query works are below.

Query Part 1

The query begins by setting the start and end dates that the page value metric will cover. These values are used multiple times throughout the script.

The page_table common table expression gets the unique session_id and user_pseudo_id combinations that have been seen on each page. (The session_id and user_pseudo_id are used in combination to identify unique sessions).

DECLARE my_start_date STRING DEFAULT "20240301";
DECLARE end_date STRING DEFAULT "20240331";

-- Get unique user IDs and session IDs for each page in the date range
WITH page_table AS (
    SELECT DISTINCT
        user_pseudo_id,
        (
            SELECT 
                value.int_value 
            FROM UNNEST(event_params) 
            WHERE key = 'ga_session_id'
        ) AS session_id,
        REGEXP_REPLACE(
            (SELECT value.string_value FROM UNNEST(event_params) WHERE  key = 'page_location'), 
            r'\?.*', 
            ''
        ) AS page_location,
    FROM `<project>.<database>.events_*`
    WHERE  
        _table_suffix BETWEEN my_start_date AND end_date
    GROUP BY all
),

Query Part 2

This part of the query retrieves the total eCommerce revenue that has been recorded for each session and user combination.

-- get revenue for each session and user combination
purchase_table as(
    SELECT
        user_pseudo_id,
        (
            SELECT 
                value.int_value 
            FROM UNNEST(event_params) 
            WHERE key = 'ga_session_id'
        ) AS session_id,
        SUM(ecommerce.purchase_revenue) as revenue,
    FROM `<project>.<database>.events_*`
    WHERE  
        _table_suffix BETWEEN my_start_date AND end_date
        AND event_name = "purchase"
    GROUP BY all
)

Query Part 3

The final part of the query puts all the pieces together. It calculates the eCommerce page value by calculating the eCommerce revenue for each page and dividing it by the number of unique page views for that page in sessions with a transaction.

SELECT
    page_location,
    COUNT(DISTINCT CONCAT(user_pseudo_id,session_id )) AS unique_page_views,
    -- Ecommerce page value = Ecommerce Revenue / unique pageviews
    SUM(revenue)
    /
    COUNT(DISTINCT 
        CASE 
            -- only count unique pageviews in sessions that generated revenue
            WHEN revenue IS NOT NULL THEN concat(user_pseudo_id,session_id ) 
        END
    ) AS page_value,
FROM page_table
LEFT JOIN purchase_table USING (user_pseudo_id, session_id)
GROUP BY all
ORDER BY 
    unique_page_views DESC

Putting it all together

The full query is here

DECLARE my_start_date STRING DEFAULT "20240301";
DECLARE end_date STRING DEFAULT "20240331";




-- get unique user ids and session ids for each page in date range
WITH page_table AS (
   SELECT DISTINCT
       user_pseudo_id,
       (
           SELECT
               value.int_value
           FROM UNNEST(event_params)
           WHERE key = 'ga_session_id'
       ) AS session_id,
       REGEXP_REPLACE(
           (SELECT value.string_value FROM UNNEST(event_params) WHERE  key = 'page_location'),
           r'\?.*',
           ''
       ) AS page_location,
   FROM `<project>.<database>.events_*`
   WHERE 
       _table_suffix BETWEEN my_start_date AND end_date
   GROUP BY all),


   -- get revenue for each session and user combination
purchase_table as(
   SELECT
       user_pseudo_id,
       (
           SELECT
               value.int_value
           FROM UNNEST(event_params)
           WHERE key = 'ga_session_id'
       ) AS session_id,
       SUM(ecommerce.purchase_revenue) as revenue,
   FROM `<project>.<database>.events_*`
   WHERE 
       _table_suffix BETWEEN my_start_date AND end_date
       AND event_name = "purchase"
   GROUP BY all
)


SELECT
   page_location,
   COUNT(DISTINCT CONCAT(user_pseudo_id,session_id )) AS unique_page_views,
   -- Ecommerce page value = Ecommerce Revenue / unique pageviews
   SUM(revenue)
   /
   COUNT(DISTINCT
       CASE
           -- only count unique pageviews in sessions that generated revenue
           WHEN revenue IS NOT NULL THEN concat(user_pseudo_id,session_id )
       END
   ) AS page_value,
FROM page_table
LEFT JOIN purchase_table USING (user_pseudo_id, session_id)
GROUP BY all
ORDER BY
   unique_page_views DESC

Conclusion

Now we can calculate Ecommerce page value in a post Universal Analytics world.

An added benefit of calculating the Ecommerce page value from BigQuery is that this data can now be plugged into almost every data visualisation tool, from LookerStudio to PowerBI and beyond. Stakeholders can have the super valuable page value metric in their familiar reporting tool, raising the profile of digital analytics (and yourself) within the business.

Previous
Previous

14 cool things you can do in Google Cloud

Next
Next

GA4 at Scale