meridian.data.load.DataFrameDataLoader

Reads data from a Pandas DataFrame.

Inherits From: InputDataLoader

 meridian.data.load.DataFrameDataLoader(
     df: pd.DataFrame,
     coord_to_columns: CoordToColumns,
     kpi_type: str,
     media_to_channel: (Mapping[str, str] | None) = None,
     media_spend_to_channel: (Mapping[str, str] | None) = None,
     reach_to_channel: (Mapping[str, str] | None) = None,
     frequency_to_channel: (Mapping[str, str] | None) = None,
     rf_spend_to_channel: (Mapping[str, str] | None) = None,
     organic_reach_to_channel: (Mapping[str, str] | None) = None,
     organic_frequency_to_channel: (Mapping[str, str] | None) = None
 )

This class reads input data from a Pandas DataFrame. The coord_to_columns attribute stores a mapping from target InputData coordinates and array names to the DataFrame column names if they are different. The fields are:

geo, time, kpi, revenue_per_kpi, population (single column)
controls (multiple columns, optional)
(1) media, media_spend (multiple columns)
(2) reach, frequency, rf_spend (multiple columns)
non_media_treatments (multiple columns, optional)
organic_media (multiple columns, optional)
organic_reach, organic_frequency (multiple columns, optional)

The DataFrame must include (1) or (2), but doesn't need to include both. Also, each media channel must appear in (1) or (2), but not both.
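
The rule above can be sketched as a small pre-flight check on channel names. This is an illustrative helper, not part of Meridian: the function name check_channel_groups and its arguments are hypothetical, and it only looks at the channel names you intend to pass, not at the DataFrame itself.

```python
def check_channel_groups(media_channels, rf_channels):
    """Check the (1)/(2) rule for paid media channels.

    media_channels: channel names supplied through media / media_spend (1).
    rf_channels: channel names supplied through reach / frequency / rf_spend (2).
    Returns the combined set of channel names when the rule holds.
    """
    media, rf = set(media_channels), set(rf_channels)
    # At least one of the two groups must be present.
    if not media and not rf:
        raise ValueError("Provide media data (1), reach/frequency data (2), or both.")
    # A channel may belong to one group only.
    overlap = media & rf
    if overlap:
        raise ValueError(f"Channels listed in both (1) and (2): {sorted(overlap)}")
    return media | rf


# Channel names taken from the example on this page.
channels = check_channel_groups(["tv", "fb", "search"], ["yt"])
```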

Note the following:

Time column values must be formatted in yyyy-mm-dd date format.
In a national model, geo and population are optional. If the population is provided, it is reset to a default value of 1.0.
If media data is provided, then media_to_channel and media_spend_to_channel are required. If reach and frequency data is provided, then reach_to_channel, frequency_to_channel, and rf_spend_to_channel are required.
If organic_reach and organic_frequency data is provided, then organic_reach_to_channel and organic_frequency_to_channel are required.
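
As a minimal sketch of the time-format note, the helper below checks a string against yyyy-mm-dd using only the standard library. The name is_yyyy_mm_dd is hypothetical, and insisting on zero-padded months and days is an assumption about how strictly the format should be read, not a statement about what Meridian itself accepts.

```python
from datetime import datetime


def is_yyyy_mm_dd(value: str) -> bool:
    """Return True if value is a valid, zero-padded yyyy-mm-dd date string."""
    try:
        parsed = datetime.strptime(value, "%Y-%m-%d")
    except ValueError:
        # Wrong layout or an impossible calendar date.
        return False
    # strptime also accepts unpadded fields such as '2024-1-5';
    # the round trip rejects them (assumed stricter reading).
    return parsed.strftime("%Y-%m-%d") == value


dates = ["2024-01-05", "01/05/2024", "2024-1-5", "2024-02-30"]
valid = [is_yyyy_mm_dd(d) for d in dates]
```

Running a check like this over the time column before constructing the loader can surface formatting problems early.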

Example:

 # Import path taken from this page's module name.
 from meridian.data.load import CoordToColumns, DataFrameDataLoader

 # df = [...]
 coord_to_columns = CoordToColumns(
     geo='dmas',
     time='dates',
     kpi='conversions',
     revenue_per_kpi='revenue_per_conversions',
     controls=['control_income'],
     population='populations',
     media=['impressions_tv', 'impressions_fb', 'impressions_search'],
     media_spend=['spend_tv', 'spend_fb', 'spend_search'],
     reach=['reach_yt'],
     frequency=['frequency_yt'],
     rf_spend=['rf_spend_yt'],
     non_media_treatments=['price', 'discount'],
     organic_media=['organic_impressions_blog'],
     organic_reach=['organic_reach_newsletter'],
     organic_frequency=['organic_frequency_newsletter'],
 )
 # Map each DataFrame column name to the channel name it belongs to.
 media_to_channel = {
     'impressions_tv': 'tv',
     'impressions_fb': 'fb',
     'impressions_search': 'search',
 }
 media_spend_to_channel = {
     'spend_tv': 'tv',
     'spend_fb': 'fb',
     'spend_search': 'search',
 }
 reach_to_channel = {'reach_yt': 'yt'}
 frequency_to_channel = {'frequency_yt': 'yt'}
 rf_spend_to_channel = {'rf_spend_yt': 'yt'}
 organic_reach_to_channel = {'organic_reach_newsletter': 'newsletter'}
 organic_frequency_to_channel = {'organic_frequency_newsletter': 'newsletter'}

 data_loader = DataFrameDataLoader(
     df=df,
     coord_to_columns=coord_to_columns,
     kpi_type='non-revenue',
     media_to_channel=media_to_channel,
     media_spend_to_channel=media_spend_to_channel,
     reach_to_channel=reach_to_channel,
     frequency_to_channel=frequency_to_channel,
     rf_spend_to_channel=rf_spend_to_channel,
     organic_reach_to_channel=organic_reach_to_channel,
     organic_frequency_to_channel=organic_frequency_to_channel,
 )
 data = data_loader.load()

Attributes

df

The pd.DataFrame object to read from. One of the following conditions is required:

There are no NAs in the dataframe.
For any number of initial periods there is only media data and NAs in all of the non-media data columns (kpi, revenue_per_kpi, media_spend, controls, and population).
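
One way to picture the two conditions is the check below, a rough sketch rather than Meridian's own validation. It assumes each non-media column is given as a time-ordered list for a single geo, with plain Python lists standing in for DataFrame columns; the helper names (is_missing, leading_na_count, satisfies_na_rule) are hypothetical.

```python
import math


def is_missing(value) -> bool:
    """Treat None and float NaN as missing."""
    return value is None or (isinstance(value, float) and math.isnan(value))


def leading_na_count(column) -> int:
    """Count missing values at the start of a time-ordered column."""
    count = 0
    for value in column:
        if not is_missing(value):
            break
        count += 1
    return count


def satisfies_na_rule(non_media_columns: dict) -> bool:
    """True if there are no NAs, or NAs cover the same initial periods everywhere."""
    counts = set()
    for values in non_media_columns.values():
        n = leading_na_count(values)
        # Any missing value after the leading run breaks the rule.
        if any(is_missing(v) for v in values[n:]):
            return False
        counts.add(n)
    # All columns must share the same number of leading missing periods.
    return len(counts) <= 1


ok = satisfies_na_rule({"kpi": [None, None, 5.0], "controls": [float("nan"), None, 1.0]})
bad = satisfies_na_rule({"kpi": [1.0, None, 2.0], "controls": [0.5, 0.6, 0.7]})
```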

coord_to_columns

A CoordToColumns object whose fields are the desired coordinates of the InputData and the values are the current names of columns (or lists of columns) in the DataFrame. Example:

 coord_to_columns = CoordToColumns(
     geo='dmas',
     time='dates',
     kpi='conversions',
     revenue_per_kpi='revenue_per_conversions',
     media=['impressions_tv', 'impressions_yt', 'impressions_search'],
     media_spend=['spend_tv', 'spend_yt', 'spend_search'],
     controls=['control_income'],
     population='populations',
 )

kpi_type

A string denoting whether the KPI is of a 'revenue' or 'non-revenue' type. When kpi_type is 'non-revenue' and revenue_per_kpi exists, ROI calibration is used and the analysis is run on revenue. When kpi_type is 'non-revenue' and revenue_per_kpi doesn't exist, custom ROI calibration is used and the analysis is run on KPI.
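
The kpi_type rule can be read as a small decision function. The sketch below only restates the documented behavior and is not Meridian code: analysis_outcome and has_revenue_per_kpi are hypothetical names, and the 'revenue' branch assumes a revenue KPI is analyzed as revenue, which the description implies but doesn't state outright.

```python
def analysis_outcome(kpi_type: str, has_revenue_per_kpi: bool) -> str:
    """Return which outcome the analysis is run on: 'revenue' or 'kpi'."""
    if kpi_type not in ("revenue", "non-revenue"):
        raise ValueError(f"Unexpected kpi_type: {kpi_type!r}")
    if kpi_type == "non-revenue" and not has_revenue_per_kpi:
        # No revenue_per_kpi: custom ROI calibration, analysis on KPI.
        return "kpi"
    # 'non-revenue' with revenue_per_kpi is documented as revenue;
    # 'revenue' is assumed to behave the same way.
    return "revenue"


with_rpk = analysis_outcome("non-revenue", has_revenue_per_kpi=True)
without_rpk = analysis_outcome("non-revenue", has_revenue_per_kpi=False)
```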

media_to_channel

A dictionary whose keys are the actual column names for media data in the dataframe, and the values are the desired channel names. These are the same as for the media_spend data. Example:

 media_to_channel = {'media_tv': 'tv', 'media_yt': 'yt', 'media_fb': 'fb'}

media_spend_to_channel

A dictionary whose keys are the actual column names for media_spend data in the dataframe, and the values are the desired channel names. These are the same as for the media data. Example:

 media_spend_to_channel = {
    'spend_tv': 'tv', 'spend_yt': 'yt', 'spend_fb': 'fb'
}

reach_to_channel

A dictionary whose keys are the actual column names for reach data in the dataframe, and the values are the desired channel names. These are the same as for the rf_spend data. Example:

 reach_to_channel = {'reach_tv': 'tv', 'reach_yt': 'yt', 'reach_fb': 'fb'}

frequency_to_channel

A dictionary whose keys are the actual column names for frequency data in the dataframe, and the values are the desired channel names. These are the same as for the rf_spend data. Example:

 frequency_to_channel = {
    'frequency_tv': 'tv', 'frequency_yt': 'yt', 'frequency_fb': 'fb'
}

rf_spend_to_channel

A dictionary whose keys are the actual column names for rf_spend data in the dataframe, and the values are the desired channel names. These are the same as for the reach and frequency data. Example:

 rf_spend_to_channel = {
    'rf_spend_tv': 'tv', 'rf_spend_yt': 'yt', 'rf_spend_fb': 'fb'
}
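
Because the reach, frequency, and rf_spend mappings are described as sharing the same channel names, a quick consistency check before constructing the loader can catch typos. This is a minimal sketch with a hypothetical helper name (channels_match); it compares only the channel names on the value side of each mapping.

```python
def channels_match(*mappings) -> bool:
    """True if every mapping points at exactly the same set of channel names."""
    channel_sets = [set(mapping.values()) for mapping in mappings]
    return all(channels == channel_sets[0] for channels in channel_sets)


# Mappings copied from the examples in this section.
reach_to_channel = {'reach_tv': 'tv', 'reach_yt': 'yt', 'reach_fb': 'fb'}
frequency_to_channel = {
    'frequency_tv': 'tv', 'frequency_yt': 'yt', 'frequency_fb': 'fb'
}
rf_spend_to_channel = {
    'rf_spend_tv': 'tv', 'rf_spend_yt': 'yt', 'rf_spend_fb': 'fb'
}

consistent = channels_match(
    reach_to_channel, frequency_to_channel, rf_spend_to_channel
)
```

The same helper applies to the media and media_spend pair, and to the organic_reach and organic_frequency pair.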

organic_reach_to_channel

A dictionary whose keys are the actual column names for organic_reach data in the dataframe, and the values are the desired channel names. These are the same as for the organic_frequency data. Example:

 organic_reach_to_channel = {
    'organic_reach_newsletter': 'newsletter',
}

organic_frequency_to_channel

A dictionary whose keys are the actual column names for organic_frequency data in the dataframe, and the values are the desired channel names. These are the same as for the organic_reach data. Example:

 organic_frequency_to_channel = {
    'organic_frequency_newsletter': 'newsletter',
}

Methods

`load`

 load() -> meridian.data.input_data.InputData

Reads data from a dataframe and returns an InputData object.

`__eq__`

 __eq__(other)

Return self==value.

Class Variables

frequency_to_channel

None

media_spend_to_channel

None

media_to_channel

None

organic_frequency_to_channel

None

organic_reach_to_channel

None

reach_to_channel

None

rf_spend_to_channel

None
