Simulated data is provided as an example for each data type and format in the following sections.
CSV
To load the simulated CSV data using CsvDataLoader:
- Map the column names to the variable types. The required variable types are time, controls, kpi, revenue_per_kpi, media, and media_spend. For the definition of each variable, see Collect and organize your data.

    from meridian.data import load

    coord_to_columns = load.CoordToColumns(
        time='time',
        controls=['GQV', 'Discount', 'Competitor_Sales'],
        kpi='conversions',
        revenue_per_kpi='revenue_per_conversion',
        media=[
            'Channel0_impression',
            'Channel1_impression',
            'Channel2_impression',
            'Channel3_impression',
            'Channel4_impression',
            'Channel5_impression',
        ],
        media_spend=[
            'Channel0_spend',
            'Channel1_spend',
            'Channel2_spend',
            'Channel3_spend',
            'Channel4_spend',
            'Channel5_spend',
        ],
    )

- Map the media variables and the media spends to the channel names that you want to display in the two-page output. In the following example, Channel0_impression and Channel0_spend are connected to the same channel, Channel0.

    correct_media_to_channel = {
        'Channel0_impression': 'Channel0',
        'Channel1_impression': 'Channel1',
        'Channel2_impression': 'Channel2',
        'Channel3_impression': 'Channel3',
        'Channel4_impression': 'Channel4',
        'Channel5_impression': 'Channel5',
    }
    correct_media_spend_to_channel = {
        'Channel0_spend': 'Channel0',
        'Channel1_spend': 'Channel1',
        'Channel2_spend': 'Channel2',
        'Channel3_spend': 'Channel3',
        'Channel4_spend': 'Channel4',
        'Channel5_spend': 'Channel5',
    }

- Load the data using CsvDataLoader:

    loader = load.CsvDataLoader(
        csv_path=f'/{PATH}/{FILENAME}.csv',
        kpi_type='non_revenue',
        coord_to_columns=coord_to_columns,
        media_to_channel=correct_media_to_channel,
        media_spend_to_channel=correct_media_spend_to_channel,
    )
    data = loader.load()

  Where:
    - kpi_type is either 'revenue' or 'non_revenue'.
    - PATH is the path to the data file location.
    - FILENAME is the name of your data file.
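The two channel mappings in the CSV steps follow a regular naming pattern, so they can also be generated from a list of channel names, and the mapped column names can be checked against the CSV header before loading. The following is a minimal sketch under assumptions: the helpers `build_channel_mappings` and `missing_columns` are hypothetical and not part of Meridian, and the in-memory header stands in for a real file.

```python
import csv
import io


def build_channel_mappings(channels):
    """Builds both mappings for the '<channel>_impression' / '<channel>_spend' pattern."""
    media_to_channel = {f'{c}_impression': c for c in channels}
    media_spend_to_channel = {f'{c}_spend': c for c in channels}
    return media_to_channel, media_spend_to_channel


def missing_columns(csv_file, required):
    """Returns the required column names that are absent from the CSV header row."""
    header = next(csv.reader(csv_file))
    return sorted(set(required) - set(header))


channels = [f'Channel{i}' for i in range(6)]
media_to_channel, media_spend_to_channel = build_channel_mappings(channels)

# Every column name referenced by the CoordToColumns example.
required = (
    ['time', 'GQV', 'Discount', 'Competitor_Sales',
     'conversions', 'revenue_per_conversion']
    + list(media_to_channel)
    + list(media_spend_to_channel)
)

# Hypothetical header that lacks the 'Discount' column.
sample = io.StringIO(','.join(c for c in required if c != 'Discount') + '\n')
print(missing_columns(sample, required))  # ['Discount']
```

With a real file, the same check could be run on `open(csv_path, newline='')` before constructing the loader, so a misspelled column name surfaces early.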
Xarray Dataset
To load the simulated Xarray Dataset using XrDatasetDataLoader:
- Load the data using pickle:

    import pickle

    # Pickle files are binary, so open the file in 'rb' mode.
    with open(f'/{PATH}/{FILENAME}.pkl', 'rb') as fh:
        XrDataset = pickle.load(fh)

  Where:
    - PATH is the path to the data file location.
    - FILENAME is the name of your data file.

- Pass the dataset to XrDatasetDataLoader. Use the name_mapping argument to map the coordinates and arrays. Provide a mapping if the names in the input dataset differ from the required names. The required coordinate names are time, control_variable, and media_channel. The required data variable names are kpi, revenue_per_kpi, controls, media, and media_spend.

    from meridian.data import load

    loader = load.XrDatasetDataLoader(
        XrDataset,
        kpi_type='non_revenue',
        name_mapping={
            'channel': 'media_channel',
            'control': 'control_variable',
            'conversions': 'kpi',
            'revenue_per_conversion': 'revenue_per_kpi',
            'control_value': 'controls',
            'spend': 'media_spend',
        },
    )
    data = loader.load()

  Where:
    - kpi_type is either 'revenue' or 'non_revenue'.
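Before passing a dataset to the loader, it can help to confirm that the name mapping actually yields every required name. The following is a minimal sketch in plain Python: the helper `missing_after_mapping` is hypothetical and not part of Meridian, and `dataset_names` is an assumed set of names chosen to match the mapping in the example, not the verified contents of the simulated file.

```python
# Required coordinate and data variable names listed in this section.
REQUIRED_NAMES = {
    'time', 'control_variable', 'media_channel',
    'kpi', 'revenue_per_kpi', 'controls', 'media', 'media_spend',
}


def missing_after_mapping(dataset_names, name_mapping):
    """Returns required names still absent once name_mapping is applied."""
    renamed = {name_mapping.get(name, name) for name in dataset_names}
    return sorted(REQUIRED_NAMES - renamed)


name_mapping = {
    'channel': 'media_channel',
    'control': 'control_variable',
    'conversions': 'kpi',
    'revenue_per_conversion': 'revenue_per_kpi',
    'control_value': 'controls',
    'spend': 'media_spend',
}

# Assumed names in the input dataset (hypothetical).
dataset_names = {
    'time', 'channel', 'control', 'conversions',
    'revenue_per_conversion', 'control_value', 'media', 'spend',
}

print(missing_after_mapping(dataset_names, name_mapping))              # []
print(missing_after_mapping(dataset_names - {'spend'}, name_mapping))  # ['media_spend']
```

For a real xarray Dataset `ds`, the names could be gathered with `set(ds.coords) | set(ds.data_vars)`.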
Numpy ndarray
To load numpy ndarrays directly, use NDArrayInputDataBuilder:
- Create the data as separate numpy ndarrays.

    import numpy as np

    kpi_nd = np.array([[1, 2, 3]])
    controls_nd = np.array([[[1, 2], [3, 4], [5, 6]]])
    revenue_per_kpi_nd = np.array([[1, 2, 3]])
    media_nd = np.array([[[1, 2], [3, 4], [5, 6]]])
    media_spend_nd = np.array([[[1, 2], [3, 4], [5, 6]]])

- Use an NDArrayInputDataBuilder to set the time coordinates and to provide the channel or dimension names that Meridian input data requires. For the definition of each variable, see Collect and organize your data.

    from meridian.data import nd_array_input_data_builder as data_builder

    builder = data_builder.NDArrayInputDataBuilder(kpi_type='non_revenue')
    builder.time_coords = ['2024-01-02', '2024-01-03', '2024-01-01']
    builder.media_time_coords = ['2024-01-02', '2024-01-03', '2024-01-01']
    builder = (
        builder.with_kpi(kpi_nd)
        .with_revenue_per_kpi(revenue_per_kpi_nd)
        .with_controls(controls_nd, control_names=["control0", "control1"])
        .with_media(
            m_nd=media_nd,
            ms_nd=media_spend_nd,
            media_channels=["channel0", "channel1"],
        )
    )
    data = builder.build()

  Where:
    - kpi_type is either 'revenue' or 'non_revenue'.
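The arrays in the example line up with the coordinates given to the builder: three time coordinates, two control names, and two channel names. The following is a minimal sketch of a shape check that could be run before building. The axis layout (geo, time, then name) is inferred from the example rather than stated in this section, and the helper `shape_mismatches` is hypothetical and not part of Meridian.

```python
import numpy as np

# Arrays and names copied from the example above.
kpi_nd = np.array([[1, 2, 3]])
controls_nd = np.array([[[1, 2], [3, 4], [5, 6]]])
revenue_per_kpi_nd = np.array([[1, 2, 3]])
media_nd = np.array([[[1, 2], [3, 4], [5, 6]]])
media_spend_nd = np.array([[[1, 2], [3, 4], [5, 6]]])
time_coords = ['2024-01-02', '2024-01-03', '2024-01-01']
control_names = ['control0', 'control1']
media_channels = ['channel0', 'channel1']


def shape_mismatches(arrays, expected_shapes):
    """Returns the names of arrays whose shape differs from the expected one."""
    return sorted(
        name for name, arr in arrays.items()
        if arr.shape != expected_shapes[name]
    )


# Assumed layout: (n_geos, n_times) or (n_geos, n_times, n_names).
n_geos = kpi_nd.shape[0]
n_times = len(time_coords)
expected_shapes = {
    'kpi': (n_geos, n_times),
    'revenue_per_kpi': (n_geos, n_times),
    'controls': (n_geos, n_times, len(control_names)),
    'media': (n_geos, n_times, len(media_channels)),
    'media_spend': (n_geos, n_times, len(media_channels)),
}
arrays = {
    'kpi': kpi_nd,
    'revenue_per_kpi': revenue_per_kpi_nd,
    'controls': controls_nd,
    'media': media_nd,
    'media_spend': media_spend_nd,
}

print(shape_mismatches(arrays, expected_shapes))  # []
```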
Pandas DataFrames or other data formats
To load the simulated data from another format (such as Excel) using DataFrameInputDataBuilder:
- Read the data (such as an Excel spreadsheet) into one or more Pandas DataFrames.

    import pandas as pd

    df = pd.read_excel(
        'https://github.com/google/meridian/raw/main/meridian/data/simulated_data/xlsx/national_media.xlsx',
        engine='openpyxl',
    )

- Use a DataFrameInputDataBuilder to map column names to the variable types that Meridian input data requires. For the definition of each variable, see Collect and organize your data.

    from meridian.data import data_frame_input_data_builder as data_builder

    builder = data_builder.DataFrameInputDataBuilder(
        kpi_type='non_revenue',
        default_kpi_column="conversions",
        default_revenue_per_kpi_column="revenue_per_conversion",
    )
    builder = (
        builder.with_kpi(df)
        .with_revenue_per_kpi(df)
        .with_controls(df, control_cols=["GQV", "Discount", "Competitor_Sales"])
    )
    channels = ["Channel0", "Channel1", "Channel2", "Channel3", "Channel4", "Channel5"]
    builder = builder.with_media(
        df,
        media_cols=[f"{channel}_impression" for channel in channels],
        media_spend_cols=[f"{channel}_spend" for channel in channels],
        media_channels=channels,
    )
    data = builder.build()

  Where:
    - kpi_type is either 'revenue' or 'non_revenue'.
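Because the builder works from a DataFrame, any source that pandas can read follows the same route. The following is a minimal sketch that reads a small in-memory table with `pd.read_csv` and confirms that the columns referenced in the builder calls are present. The two data rows are made-up illustrative values, only one channel is included for brevity, and treating `time` as a needed column is an assumption based on the simulated data's column names.

```python
import io

import pandas as pd

# Hypothetical two-row table using the simulated data's column naming.
text = (
    "time,conversions,revenue_per_conversion,GQV,Discount,"
    "Competitor_Sales,Channel0_impression,Channel0_spend\n"
    "2024-01-01,10,2.5,0.1,0.0,0.3,1000,50.0\n"
    "2024-01-08,12,2.5,0.2,0.1,0.4,1200,60.0\n"
)
df = pd.read_csv(io.StringIO(text))

channels = ["Channel0"]
needed = (
    ["time", "conversions", "revenue_per_conversion"]
    + ["GQV", "Discount", "Competitor_Sales"]
    + [f"{channel}_impression" for channel in channels]
    + [f"{channel}_spend" for channel in channels]
)

# Columns the builder calls would reference but the DataFrame lacks.
missing = sorted(set(needed) - set(df.columns))
print(missing)  # []
```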
Next, you can create your model.


