Query in BigQuery

This guide explains how to query data in BigQuery for typical Manufacturing Data Engine (MDE) use-cases.

If cloud metadata materialization is disabled, you can access cloud metadata instances by joining relevant record table with the metadata-store on the metadata instance_id with the following SQL query:

  SELECT 
  
 dnr 
 . 
 * 
 , 
  
 ms 
 . 
 instance 
 FROM 
  
 mde_data 
 . 
 ` 
  RECORD_TABLE_NAME 
 
 ` 
  
 AS 
  
 dnr 
 LEFT 
  
 JOIN 
  
 mde_dimension 
 . 
 ` 
 metadata 
 - 
 store 
 ` 
  
 AS 
  
 ms 
 ON 
  
 ms 
 . 
 instance_id 
  
 = 
  
 JSON_VALUE 
 ( 
 cloud_metadata_ref 
 , 
  
 "$. BUCKET_NAME 
.instance_id" 
 ) 
 WHERE 
  
 DATE 
 ( 
 event_timestamp 
 ) 
  
 = 
  
 ' EVENT_TIMESTAMP 
' 
 LIMIT 
  
 100 
 

Replace the following:

  • RECORD_TABLE_NAME : Name of the record table.
  • BUCKET_NAME : Name of the cloud metadata bucket.
  • EVENT_TIMESTAMP : Timestamp of the event.

To improve query performance and since the metadata-store is partitioned on bucket number, you can optionally specify the bucket number in the ON clause, as the following SQL query:

  SELECT 
  
 dnr 
 . 
 * 
 , 
  
 ms 
 . 
 instance 
 FROM 
  
 mde_data 
 . 
 ` 
< RECORD_TABLE_NAME 
> ` 
  
 AS 
  
 dnr 
 LEFT 
  
 JOIN 
  
 mde_dimension 
 . 
 ` 
 metadata 
 - 
 store 
 ` 
  
 AS 
  
 ms 
 ON 
  
 ms 
 . 
 instance_id 
  
 = 
  
 JSON_VALUE 
 ( 
 cloud_metadata_ref 
 , 
  
 "$. BUCKET_NAME 
.instance_id" 
 ) 
  
 AND 
  
 ms 
 . 
 bucket_number 
  
 = 
  
< BUCKET_NUMBER 
> WHERE 
  
 DATE 
 ( 
 event_timestamp 
 ) 
  
 = 
  
 ' EVENT_TIMESTAMP 
' 
 LIMIT 
  
 100 
 

Replace the following:

  • BUCKET_NAME : Name of the cloud metadata bucket.
  • EVENT_TIMESTAMP : Timestamp of the event.

You can access metadata instance attributes using the JSON dot notation which always returns a JSON object, or using one of the BigQuery JSON functions , such as JSON_VALUE to extract strings or other data types. See the following example:

  SELECT 
  
 dnr 
 . 
 * 
 , 
  
 ms 
 . 
 instance 
 . 
 deviceName 
  
 -- this returns a double quoted JSON string 
  
 JSON_VALUE 
 ( 
 ms 
 . 
 instance 
 , 
  
 '$.deviceName' 
 ) 
  
 -- this returns a string 
 FROM 
  
 mde_data 
 . 
 ` 
 example 
 - 
 record 
 - 
 tbl 
 ` 
  
 AS 
  
 dnr 
 LEFT 
  
 JOIN 
  
 mde_dimension 
 . 
 ` 
 metadata 
 - 
 store 
 ` 
  
 AS 
  
 ms 
 ON 
  
 ms 
 . 
 instance_id 
  
 = 
  
 JSON_VALUE 
 ( 
 cloud_metadata_ref 
 , 
  
 "$.bucket.instance_id" 
 ) 
 WHERE 
  
 DATE 
 ( 
 event_timestamp 
 ) 
  
 = 
  
 '2023-01-01' 
 LIMIT 
  
 100 
 

Similarly, if cloud metadata materialization is enabled, you can access metadata instance attributes directly from the record. See the following example:

  SELECT 
  
 * 
  
 ( 
 EXCEPT 
  
 materialized_cloud_metadata 
 ), 
  
 materialized_cloud_metadata 
 . 
 device 
 . 
 deviceName 
  
 -- this returns a double quoted JSON string 
  
 JSON_VALUE 
 ( 
 materialized_cloud_metadata 
 ., 
  
 '$.device.deviceName' 
 ) 
  
 -- this returns a string 
 FROM 
  
 mde_data 
 . 
 ` 
 example 
 - 
 record 
 - 
 tbl 
 ` 
 WHERE 
  
 DATE 
 ( 
 event_timestamp 
 ) 
  
 = 
  
 '2023-01-01' 
 LIMIT 
  
 100 
 

To obtain an array of all metadata instance IDs contained in the cloud_metadata_ref field of a record follow these steps:

  1. Create the user defined function (UDF) with the following SQL query:

      CREATE 
      
     OR 
      
     REPLACE 
      
     FUNCTION 
      
     ` 
     mde_data 
     . 
     get_instance_ids 
     ` 
     ( 
     input 
      
     JSON 
     ) 
      
     RETURNS 
      
     ARRAY<STRING> 
      
     LANGUAGE 
      
     js 
      
     AS 
      
     R 
     """ 
     return input ? Object.keys(input).map(bucketName => input[bucketName].instance_id).filter(instance_id => instance_id != null) : []; 
     """ 
     ; 
     
    
  2. Execute the function in a query:

      SELECT 
      
     mde_data 
     . 
     get_instance_ids 
     ( 
     cloud_metadata_ref 
     ) 
      
     as 
      
     metadata_instance_ids 
     , 
      
     * 
     , 
     FROM 
      
     mde_data 
     . 
     ` 
      RECORD_TABLE_NAME 
     
     ` 
     WHERE 
     DATE 
     ( 
     event_timestamp 
     ) 
      
     = 
      
     ' EVENT_TIMESTAMP 
    ' 
     LIMIT 
      
     100 
     
    

    Replace the following:

  • RECORD_TABLE_NAME : Name of the record table.
  • EVENT_TIMESTAMP : Timestamp of the event.
Create a Mobile Website
View Site in Mobile | Classic
Share by: