Advanced queries

The advanced queries in this page apply to the BigQuery event export data for Google Analytics. See BigQuery cookbook for Universal Analytics if you are looking for the same resource for Universal Analytics. Try the basic queries first before trying out the advanced ones.

Products purchased by customers who purchased a certain product

The following query shows what other products were purchased by customers who purchased a specific product. This example does not assume that the products were purchased in the same order.

The optimized example relies on BigQuery scripting features to define a variable that declares which items to filter on. While this does not improve performance, this is a more readable approach for defining variables compared creating a single value table using a WITH clause. The simplified query uses the latter approach using the WITH clause.

The simplified query creats a separate list of "Product A buyers" and does a join with that data. The optimized query, instead, creates a list of all items a user has purchased across orders using the ARRAY_AGG function. Then using the outer WHERE clause, purchase lists across all users are filtered for the target_item and only relevant items are shown.

Simplified

  -- Example: Products purchased by customers who purchased a specific product. 
 -- 
 -- `Params` is used to hold the value of the selected product and is referenced 
 -- throughout the query. 
 WITH 
  
 Params 
  
 AS 
  
 ( 
  
 -- Replace with selected item_name or item_id. 
  
 SELECT 
  
 'Google Navy Speckled Tee' 
  
 AS 
  
 selected_product 
  
 ), 
  
 PurchaseEvents 
  
 AS 
  
 ( 
  
 SELECT 
  
 user_pseudo_id 
 , 
  
 items 
  
 FROM 
  
 -- Replace table name. 
  
 ` 
 bigquery 
 - 
 public 
 - 
 data 
 . 
 ga4_obfuscated_sample_ecommerce 
 . 
 events_ 
 *` 
  
 WHERE 
  
 -- Replace date range. 
  
 _TABLE_SUFFIX 
  
 BETWEEN 
  
 '20201101' 
  
 AND 
  
 '20210131' 
  
 AND 
  
 event_name 
  
 = 
  
 'purchase' 
  
 ), 
  
 ProductABuyers 
  
 AS 
  
 ( 
  
 SELECT 
  
 DISTINCT 
  
 user_pseudo_id 
  
 FROM 
  
 Params 
 , 
  
 PurchaseEvents 
 , 
  
 UNNEST 
 ( 
 items 
 ) 
  
 AS 
  
 items 
  
 WHERE 
  
 -- item.item_id can be used instead of items.item_name. 
  
 items 
 . 
 item_name 
  
 = 
  
 selected_product 
  
 ) 
 SELECT 
  
 items 
 . 
 item_name 
  
 AS 
  
 item_name 
 , 
  
 SUM 
 ( 
 items 
 . 
 quantity 
 ) 
  
 AS 
  
 item_quantity 
 FROM 
  
 Params 
 , 
  
 PurchaseEvents 
 , 
  
 UNNEST 
 ( 
 items 
 ) 
  
 AS 
  
 items 
 WHERE 
  
 user_pseudo_id 
  
 IN 
  
 ( 
 SELECT 
  
 user_pseudo_id 
  
 FROM 
  
 ProductABuyers 
 ) 
  
 -- item.item_id can be used instead of items.item_name 
  
 AND 
  
 items 
 . 
 item_name 
  
 != 
  
 selected_product 
 GROUP 
  
 BY 
  
 1 
 ORDER 
  
 BY 
  
 item_quantity 
  
 DESC 
 ; 
 

Optimized

  -- Optimized Example: Products purchased by customers who purchased a specific product. 
 -- Replace item name 
 DECLARE 
  
 target_item 
  
 STRING 
  
 DEFAULT 
  
 'Google Navy Speckled Tee' 
 ; 
 SELECT 
  
 IL 
 . 
 item_name 
  
 AS 
  
 item_name 
 , 
  
 SUM 
 ( 
 IL 
 . 
 quantity 
 ) 
  
 AS 
  
 quantity 
 FROM 
  
 ( 
  
 SELECT 
  
 user_pseudo_id 
 , 
  
 ARRAY_AGG 
 ( 
 STRUCT 
 ( 
 item_name 
 , 
  
 quantity 
 )) 
  
 AS 
  
 item_list 
  
 FROM 
  
 -- Replace table 
  
 ` 
 bigquery 
 - 
 public 
 - 
 data 
 . 
 ga4_obfuscated_sample_ecommerce 
 . 
 events_ 
 *` 
 , 
  
 UNNEST 
 ( 
 items 
 ) 
  
 WHERE 
  
 -- Replace date range 
  
 _TABLE_SUFFIX 
  
 BETWEEN 
  
 '20201201' 
  
 AND 
  
 '20201210' 
  
 AND 
  
 event_name 
  
 = 
  
 'purchase' 
  
 GROUP 
  
 BY 
  
 1 
  
 ), 
  
 UNNEST 
 ( 
 item_list 
 ) 
  
 AS 
  
 IL 
 WHERE 
  
 target_item 
  
 IN 
  
 ( 
 SELECT 
  
 item_name 
  
 FROM 
  
 UNNEST 
 ( 
 item_list 
 )) 
  
 -- Remove the following line if you want the target_item to appear in the results 
  
 AND 
  
 target_item 
  
 != 
  
 IL 
 . 
 item_name 
 GROUP 
  
 BY 
  
 item_name 
 ORDER 
  
 BY 
  
 quantity 
  
 DESC 
 ; 
 

Average amount of money spent per purchase session by user

The following query shows the average amount of money spent per session by each user. This takes into account only the sessions where the user made a purchase.

  -- Example: Average amount of money spent per purchase session by user. 
 WITH 
  
 events 
  
 AS 
  
 ( 
  
 SELECT 
  
 session 
 . 
 value 
 . 
 int_value 
  
 AS 
  
 session_id 
 , 
  
 COALESCE 
 ( 
 spend 
 . 
 value 
 . 
 int_value 
 , 
  
 spend 
 . 
 value 
 . 
 float_value 
 , 
  
 spend 
 . 
 value 
 . 
 double_value 
 , 
  
 0 
 . 
 0 
 ) 
  
 AS 
  
 spend_value 
 , 
  
 event 
 . 
 * 
  
 -- Replace table name 
  
 FROM 
  
 ` 
 bigquery 
 - 
 public 
 - 
 data 
 . 
 ga4_obfuscated_sample_ecommerce 
 . 
 events_ 
 *` 
  
 AS 
  
 event 
  
 LEFT 
  
 JOIN 
  
 UNNEST 
 ( 
 event 
 . 
 event_params 
 ) 
  
 AS 
  
 session 
  
 ON 
  
 session 
 . 
 key 
  
 = 
  
 'ga_session_id' 
  
 LEFT 
  
 JOIN 
  
 UNNEST 
 ( 
 event 
 . 
 event_params 
 ) 
  
 AS 
  
 spend 
  
 ON 
  
 spend 
 . 
 key 
  
 = 
  
 'value' 
  
 -- Replace date range 
  
 WHERE 
  
 _TABLE_SUFFIX 
  
 BETWEEN 
  
 '20201101' 
  
 AND 
  
 '20210131' 
  
 ) 
 SELECT 
  
 user_pseudo_id 
 , 
  
 COUNT 
 ( 
 DISTINCT 
  
 session_id 
 ) 
  
 AS 
  
 session_count 
 , 
  
 SUM 
 ( 
 spend_value 
 ) 
  
 / 
  
 COUNT 
 ( 
 DISTINCT 
  
 session_id 
 ) 
  
 AS 
  
 avg_spend_per_session_by_user 
 FROM 
  
 events 
 WHERE 
  
 event_name 
  
 = 
  
 'purchase' 
  
 and 
  
 session_id 
  
 IS 
  
 NOT 
  
 NULL 
 GROUP 
  
 BY 
  
 user_pseudo_id 
 

Latest Session Id and Session Number for users

The following query provides the list of the latest ga_session_id and ga_session_number from last 4 days for a list of users. You can provide either a user_pseudo_id list or a user_id list.

user_pseudo_id

  -- Get the latest ga_session_id and ga_session_number for specific users during last 4 days. 
 -- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones. 
 DECLARE 
  
 REPORTING_TIMEZONE 
  
 STRING 
  
 DEFAULT 
  
 'America/Los_Angeles' 
 ; 
 -- Replace list of user_pseudo_id's with ones you want to query. 
 DECLARE 
  
 USER_PSEUDO_ID_LIST 
  
 ARRAY<STRING> 
  
 DEFAULT 
  
 [ 
  
 '1005355938.1632145814' 
 , 
  
 '979622592.1632496588' 
 , 
  
 '1101478530.1632831095' 
 ]; 
 CREATE 
  
 TEMP 
  
 FUNCTION 
  
 GetParamValue 
 ( 
 params 
  
 ANY 
  
 TYPE 
 , 
  
 target_key 
  
 STRING 
 ) 
 AS 
  
 ( 
  
 ( 
 SELECT 
  
 ` 
 value 
 ` 
  
 FROM 
  
 UNNEST 
 ( 
 params 
 ) 
  
 WHERE 
  
 key 
  
 = 
  
 target_key 
  
 LIMIT 
  
 1 
 ) 
 ); 
 CREATE 
  
 TEMP 
  
 FUNCTION 
  
 GetDateSuffix 
 ( 
 date_shift 
  
 INT64 
 , 
  
 timezone 
  
 STRING 
 ) 
 AS 
  
 ( 
  
 ( 
 SELECT 
  
 FORMAT_DATE 
 ( 
 '%Y%m%d' 
 , 
  
 DATE_ADD 
 ( 
 CURRENT_DATE 
 ( 
 timezone 
 ), 
  
 INTERVAL 
  
 date_shift 
  
 DAY 
 ))) 
 ); 
 SELECT 
  
 DISTINCT 
  
 user_pseudo_id 
 , 
  
 FIRST_VALUE 
 ( 
 GetParamValue 
 ( 
 event_params 
 , 
  
 'ga_session_id' 
 ). 
 int_value 
 ) 
  
 OVER 
  
 ( 
 UserWindow 
 ) 
  
 AS 
  
 ga_session_id 
 , 
  
 FIRST_VALUE 
 ( 
 GetParamValue 
 ( 
 event_params 
 , 
  
 'ga_session_number' 
 ). 
 int_value 
 ) 
  
 OVER 
  
 ( 
 UserWindow 
 ) 
  
 AS 
  
 ga_session_number 
 FROM 
  
 -- Replace table name. 
  
 ` 
 bigquery 
 - 
 public 
 - 
 data 
 . 
 ga4_obfuscated_sample_ecommerce 
 . 
 events_ 
 *` 
 WHERE 
  
 user_pseudo_id 
  
 IN 
  
 UNNEST 
 ( 
 USER_PSEUDO_ID_LIST 
 ) 
  
 AND 
  
 RIGHT 
 ( 
 _TABLE_SUFFIX 
 , 
  
 8 
 ) 
  
 BETWEEN 
  
 GetDateSuffix 
 ( 
 - 
 3 
 , 
  
 REPORTING_TIMEZONE 
 ) 
  
 AND 
  
 GetDateSuffix 
 ( 
 0 
 , 
  
 REPORTING_TIMEZONE 
 ) 
 WINDOW 
  
 UserWindow 
  
 AS 
  
 ( 
 PARTITION 
  
 BY 
  
 user_pseudo_id 
  
 ORDER 
  
 BY 
  
 event_timestamp 
  
 DESC 
 ); 
 

user_id

  -- Get the latest ga_session_id and ga_session_number for specific users during last 4 days. 
 -- Replace timezone. List at https://en.wikipedia.org/wiki/List_of_tz_database_time_zones. 
 DECLARE 
  
 REPORTING_TIMEZONE 
  
 STRING 
  
 DEFAULT 
  
 'America/Los_Angeles' 
 ; 
 -- Replace list of user_id's with ones you want to query. 
 DECLARE 
  
 USER_ID_LIST 
  
 ARRAY<STRING> 
  
 DEFAULT 
  
 [ 
 '<user_id_1>' 
 , 
  
 '<user_id_2>' 
 , 
  
 '<user_id_n>' 
 ]; 
 CREATE 
  
 TEMP 
  
 FUNCTION 
  
 GetParamValue 
 ( 
 params 
  
 ANY 
  
 TYPE 
 , 
  
 target_key 
  
 STRING 
 ) 
 AS 
  
 ( 
  
 ( 
 SELECT 
  
 ` 
 value 
 ` 
  
 FROM 
  
 UNNEST 
 ( 
 params 
 ) 
  
 WHERE 
  
 key 
  
 = 
  
 target_key 
  
 LIMIT 
  
 1 
 ) 
 ); 
 CREATE 
  
 TEMP 
  
 FUNCTION 
  
 GetDateSuffix 
 ( 
 date_shift 
  
 INT64 
 , 
  
 timezone 
  
 STRING 
 ) 
 AS 
  
 ( 
  
 ( 
 SELECT 
  
 FORMAT_DATE 
 ( 
 '%Y%m%d' 
 , 
  
 DATE_ADD 
 ( 
 CURRENT_DATE 
 ( 
 timezone 
 ), 
  
 INTERVAL 
  
 date_shift 
  
 DAY 
 ))) 
 ); 
 SELECT 
  
 DISTINCT 
  
 user_pseudo_id 
 , 
  
 FIRST_VALUE 
 ( 
 GetParamValue 
 ( 
 event_params 
 , 
  
 'ga_session_id' 
 ). 
 int_value 
 ) 
  
 OVER 
  
 ( 
 UserWindow 
 ) 
  
 AS 
  
 ga_session_id 
 , 
  
 FIRST_VALUE 
 ( 
 GetParamValue 
 ( 
 event_params 
 , 
  
 'ga_session_number' 
 ). 
 int_value 
 ) 
  
 OVER 
  
 ( 
 UserWindow 
 ) 
  
 AS 
  
 ga_session_number 
 FROM 
  
 -- Replace table name. 
  
 ` 
 bigquery 
 - 
 public 
 - 
 data 
 . 
 ga4_obfuscated_sample_ecommerce 
 . 
 events_ 
 *` 
 WHERE 
  
 user_id 
  
 IN 
  
 UNNEST 
 ( 
 USER_ID_LIST 
 ) 
  
 AND 
  
 RIGHT 
 ( 
 _TABLE_SUFFIX 
 , 
  
 8 
 ) 
  
 BETWEEN 
  
 GetDateSuffix 
 ( 
 - 
 3 
 , 
  
 REPORTING_TIMEZONE 
 ) 
  
 AND 
  
 GetDateSuffix 
 ( 
 0 
 , 
  
 REPORTING_TIMEZONE 
 ) 
 WINDOW 
  
 UserWindow 
  
 AS 
  
 ( 
 PARTITION 
  
 BY 
  
 user_pseudo_id 
  
 ORDER 
  
 BY 
  
 event_timestamp 
  
 DESC 
 ); 
 
Design a Mobile Site
View Site in Mobile | Classic
Share by: