meridian.data.input_data.InputData

A data container for advertising data in a format supported by Meridian.

kpi
A DataArray of dimensions (n_geos, n_times) containing the non-negative dependent variable. Typically this is the number of units sold, but it can be any metric, such as revenue or conversions.
kpi_type
A string denoting whether the KPI is of a 'revenue' or 'non-revenue' type. When kpi_type is 'non-revenue' and revenue_per_kpi exists, ROI calibration is used and the analysis is run on revenue. When revenue_per_kpi doesn't exist for the same kpi_type, custom ROI calibration is used and the analysis is run on KPI.
population
A DataArray of dimensions (n_geos,) containing the population of each geo. This variable is used to scale the KPI and media for modeling.
controls
An optional DataArray of dimensions (n_geos, n_times, n_controls) containing control variable values.
revenue_per_kpi
An optional DataArray of dimensions (n_geos, n_times) containing the average revenue amount per KPI unit. Although modeling is done on kpi, model analysis and optimization are done on kpi * revenue_per_kpi (revenue) if this value is available. If kpi corresponds to revenue, then an array of ones is passed automatically.
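The interaction between kpi_type and revenue_per_kpi can be sketched in plain Python. This only illustrates the rule described above and is not Meridian's implementation: the helper name analysis_outcome is hypothetical, and nested lists shaped [geo][time] stand in for the DataArray values.

```python
def analysis_outcome(kpi, kpi_type, revenue_per_kpi=None):
    """Return (label, values) for the outcome that analysis would be run on.

    kpi and revenue_per_kpi are nested lists shaped [geo][time].
    This helper is hypothetical and only mirrors the documented rule.
    """
    if kpi_type == "revenue":
        # KPI is already revenue: equivalent to multiplying by an array of ones.
        return "revenue", [row[:] for row in kpi]
    if revenue_per_kpi is None:
        # Non-revenue KPI without a conversion rate: analysis stays on KPI.
        return "kpi", [row[:] for row in kpi]
    # Non-revenue KPI with a conversion rate: revenue = kpi * revenue_per_kpi.
    revenue = [
        [k * r for k, r in zip(kpi_row, rate_row)]
        for kpi_row, rate_row in zip(kpi, revenue_per_kpi)
    ]
    return "revenue", revenue


kpi = [[10.0, 20.0], [5.0, 0.0]]     # 2 geos x 2 time periods
rate = [[2.0, 2.5], [3.0, 3.0]]      # revenue per KPI unit
label, values = analysis_outcome(kpi, "non-revenue", rate)
```

Here label is 'revenue' and values holds the element-wise product; omitting rate would leave the analysis on the KPI itself.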
media
An optional DataArray of dimensions (n_geos, n_media_times, n_media_channels) containing non-negative media execution values. Typically these are impressions, but it can be any metric, such as cost or clicks. n_media_times >= n_times is required, and the final n_times time periods must align with the time window of kpi and controls. Due to lagged effects, we recommend that the time window for media includes up to max_lag additional periods prior to this window. If n_media_times < n_times + max_lag, the model effectively imputes media history as zero (no media execution). If n_media_times > n_times + max_lag, then only the final n_times + max_lag periods are used to fit the model. media and media_spend must contain the same number of media channels in the same order. If either of these arguments is passed, then the other is not optional.
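The windowing rule for media history can be illustrated for a single geo and channel. The function below is a hypothetical sketch of the documented behavior (zero imputation when history is short, truncation when it is long), not the library's code.

```python
def align_media_window(series, n_times, max_lag):
    """Return the n_times + max_lag values a model would see for one series.

    series holds one geo/channel's media values ordered oldest to newest.
    Hypothetical helper; it only mirrors the documented windowing rule.
    """
    if len(series) < n_times:
        raise ValueError("n_media_times must be at least n_times")
    needed = n_times + max_lag
    if len(series) >= needed:
        # Extra history beyond max_lag periods is dropped.
        return list(series[-needed:])
    # Missing history is treated as zero media execution.
    return [0.0] * (needed - len(series)) + list(series)


series = [1.0, 2.0, 3.0, 4.0, 5.0]                       # n_media_times = 5
padded = align_media_window(series, n_times=4, max_lag=3)
trimmed = align_media_window(series, n_times=4, max_lag=0)
```

With max_lag=3 the series is two periods short, so two zeros are prepended; with max_lag=0 the oldest period is discarded.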
media_spend
An optional DataArray containing the cost of each media channel. This is used as the denominator for ROI calculations. It is also used to calculate an assumed cost per media unit for post-modeling analysis such as response curves and budget optimization. Only the aggregate spend (across geos and time periods) is required for these calculations. However, a spend breakdown by geo and time period is required if roi_calibration_period is specified or if conducting post-modeling analysis on a specific subset of geos and/or time periods. The DataArray shape can be (n_geos, n_times, n_media_channels) or (n_media_channels,) if the data is aggregated over geo and time dimensions. We recommend that the spend total aligns with the time window of the kpi and controls data, which is the time window over which incremental outcome of the ROI numerator is calculated. However, note that incremental outcome is influenced by media execution prior to this time window, through lagged effects, and excludes lagged effects beyond the time window of media executed during the time window. media and media_spend must contain the same number of media channels in the same order. If either of these arguments is passed, then the other is not optional. If a tensor of shape (n_media_channels,) is passed as media_spend, then it will be automatically allocated across geos and times proportionally to media.
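Proportional allocation of an aggregated spend figure can be sketched for one channel as follows. The function name, the restriction of media to the KPI time window, and the error raised for a channel with no media are assumptions made for this example; they are not taken from the library.

```python
def allocate_spend(total_spend, media):
    """Split one channel's total spend across [geo][time] cells by media share.

    Hypothetical helper; media is assumed to cover the KPI time window only.
    """
    total_media = sum(sum(row) for row in media)
    if total_media == 0:
        # Choice made for this sketch: refuse to allocate without any media.
        raise ValueError("cannot allocate spend for a channel with no media")
    return [[total_spend * m / total_media for m in row] for row in media]


media = [[100.0, 300.0], [0.0, 600.0]]   # one channel, 2 geos x 2 periods
allocated = allocate_spend(500.0, media)
```

Each cell receives spend in proportion to its share of the channel's media units, so the allocated values sum back to the aggregate.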
reach
An optional DataArray of dimensions (n_geos, n_media_times, n_rf_channels) containing non-negative reach values. It is required that n_media_times >= n_times, and the final n_times time periods must align with the time window of kpi and controls. The time window must include the time window of the kpi and controls data, but it is optional to include lagged time periods prior to the time window of the kpi and controls data. If lagged reach is not included, or if the lagged reach includes fewer than max_lag time periods, then the model calculates Adstock assuming that reach execution is zero prior to the first observed time period. We recommend including n_times + max_lag time periods, unless the value of max_lag is prohibitively large. If only media data is used, then reach will be None. reach, frequency, and rf_spend must contain the same number of media channels in the same order. If any of these arguments is passed, then the others are not optional.
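The effect of missing lagged history can be seen with a generic geometric Adstock. The normalized geometric weights and the decay rate alpha are assumptions chosen for illustration; this sketch makes no claim about how Meridian computes Adstock internally.

```python
def geometric_adstock(series, max_lag, alpha):
    """Weighted average of the current and previous max_lag values.

    Values before the start of series are treated as zero. The geometric
    weights alpha**lag are an assumption made for this illustration.
    """
    weights = [alpha ** lag for lag in range(max_lag + 1)]
    norm = sum(weights)
    out = []
    for t in range(len(series)):
        total = 0.0
        for lag, w in enumerate(weights):
            if t - lag >= 0:          # no observed value: contributes zero
                total += w * series[t - lag]
        out.append(total / norm)
    return out


reach = [8.0, 8.0, 8.0]               # constant execution in the KPI window
# Two lagged periods supplied, then dropped from the output.
with_history = geometric_adstock([8.0, 8.0] + reach, max_lag=2, alpha=0.5)[2:]
# No lagged periods supplied: earlier execution is assumed to be zero.
without_history = geometric_adstock(reach, max_lag=2, alpha=0.5)
```

With full history the transformed series stays flat, while without it the first max_lag periods are understated, which is why including lagged periods is recommended.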
frequency
An optional DataArray of dimensions (n_geos, n_media_times, n_rf_channels) containing non-negative frequency values. It is required that n_media_times >= n_times, and the final n_times time periods must align with the time window of kpi and controls. The time window must include the time window of the kpi and controls data, but it is optional to include lagged time periods prior to the time window of the kpi and controls data. If lagged frequency is not included, or if the lagged frequency includes fewer than max_lag time periods, then the model calculates Adstock assuming that frequency execution is zero prior to the first observed time period. We recommend including n_times + max_lag time periods, unless the value of max_lag is prohibitively large. If only media data is used, then frequency will be None. reach, frequency, and rf_spend must contain the same number of media channels in the same order. If any of these arguments is passed, then the others are not optional.
rf_spend
An optional DataArray containing the cost of each reach and frequency channel. This is used as the denominator for ROI calculations. It is also used to calculate an assumed cost per media unit for post-modeling analysis such as response curves and budget optimization. Only the aggregate spend (across geos and time periods) is required for these calculations. However, a spend breakdown by geo and time period is required if rf_roi_calibration_period is specified or if conducting post-modeling analysis on a specific subset of geos and/or time periods. The DataArray shape can be (n_rf_channels,) or (n_geos, n_times, n_rf_channels). The spend should be aggregated over geo and/or time dimensions that are not represented. We recommend that the spend total aligns with the time window of the kpi and controls data, which is the time window over which incremental outcome of the ROI numerator is calculated. However, note that incremental outcome is influenced by media execution prior to this time window, through lagged effects, and excludes lagged effects beyond the time window of media executed during the time window. If only media data is used, rf_spend will be None. reach, frequency, and rf_spend must contain the same number of media channels in the same order. If any of these arguments is passed, then the others are not optional. If a tensor of shape (n_rf_channels,) is passed as rf_spend, then it will be automatically allocated across geos and times proportionally to (reach * frequency).
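The all-or-none requirement on reach, frequency, and rf_spend can be expressed as a small check. The function below is hypothetical: it works on lists of channel names standing in for each array's channel coordinate, and the error messages are invented for this example.

```python
def check_rf_arguments(reach=None, frequency=None, rf_spend=None):
    """Validate the all-or-none and same-channels rule on channel name lists.

    Each argument is a list of channel names (standing in for the channel
    coordinate of the corresponding array), or None when not provided.
    """
    provided = {
        name: channels
        for name, channels in
        [("reach", reach), ("frequency", frequency), ("rf_spend", rf_spend)]
        if channels is not None
    }
    if not provided:
        return  # media-only data: all three stay None
    if len(provided) < 3:
        missing = sorted({"reach", "frequency", "rf_spend"} - set(provided))
        raise ValueError(f"missing RF arguments: {missing}")
    if not (list(reach) == list(frequency) == list(rf_spend)):
        raise ValueError("RF channels must match in number and order")


check_rf_arguments()                                        # nothing passed
check_rf_arguments(["video"], ["video"], ["video"])         # all three passed
```

Passing only one or two of the three, or passing the same channels in a different order, raises ValueError in this sketch.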
organic_media
An optional DataArray of dimensions (n_geos, n_media_times, n_organic_media_channels) containing non-negative organic media values. Organic media variables are media activities that have no direct cost. These may include impressions from newsletters, blog posts, social media activity, or email campaigns, but any metric, such as clicks, can be used. n_media_times >= n_times is required, and the final n_times time periods must align with the time window of kpi and controls. Due to lagged effects, we recommend that the time window for organic media includes up to max_lag additional periods prior to this window. If n_organic_media_times < n_times + max_lag, the model effectively imputes organic media history as zero (no organic media execution). If n_organic_media_times > n_times + max_lag, then only the final n_times + max_lag periods are used to fit the model.
organic_reach
An optional DataArray of dimensions (n_geos, n_media_times, n_organic_rf_channels) containing non-negative organic reach values. It is required that n_media_times >= n_times, and the final n_times time periods must align with the time window of kpi and controls. The time window must include the time window of the kpi and controls data, but it is optional to include lagged time periods prior to the time window of the kpi and controls data. If lagged reach is not included, or if the lagged reach includes fewer than max_lag time periods, then the model calculates Adstock assuming that reach execution is zero prior to the first observed time period. We recommend including n_times + max_lag time periods, unless the value of max_lag is prohibitively large. If no organic reach and frequency data is used, then organic_reach and organic_frequency will be None. organic_reach and organic_frequency must contain the same number of channels in the same order. If either of these arguments is passed, then the other is not optional.
organic_frequency
An optional DataArray of dimensions (n_geos, n_media_times, n_organic_rf_channels) containing non-negative organic frequency values. It is required that n_media_times >= n_times, and the final n_times time periods must align with the time window of kpi and controls. The time window must include the time window of the kpi and controls data, but it is optional to include lagged time periods prior to the time window of the kpi and controls data. If lagged frequency is not included, or if the lagged frequency includes fewer than max_lag time periods, then the model calculates Adstock assuming that frequency execution is zero prior to the first observed time period. We recommend including n_times + max_lag time periods, unless the value of max_lag is prohibitively large. If no organic reach and frequency data is used, then organic_frequency will be None. organic_reach and organic_frequency must contain the same number of channels in the same order. If either of these arguments is passed, then the other is not optional.
non_media_treatments
An optional DataArray of dimensions (n_geos, n_times, n_non_media_channels) containing non-media treatment variable values. Non-media treatment variables are marketing activities taken by the advertiser that are not directly related to media. Like organic media variables, they have no direct marketing cost associated with them, but unlike organic media variables, they have no Adstock and Hill effects. They differ from control variables in that they are considered intervenable and hence are treatment variables under the causal model. Examples include running a promotion, the price of a product, and a change in a product's packaging and/or design.
allocated_media_spend
Returns the allocated media spend for each geo and time.
allocated_rf_spend
Returns the allocated RF spend for each geo and time.
control_variable
Returns the control variable dimension.
geo
Returns the geo dimension.
media_channel
Returns the media channel dimension.
media_spend_has_geo_dimension
Checks whether the media_spend array has a geo dimension.
media_spend_has_time_dimension
Checks whether the media_spend array has a time dimension.
media_time
Returns the media time dimension coordinates.
media_time_coordinates
Returns the media time dimension in a TimeCoordinates wrapper.
non_media_channel
Returns the non-media treatments channel dimension.
organic_media_channel
Returns the organic media channel dimension.
organic_rf_channel
Returns the organic RF channel dimension.
rf_channel
Returns the RF channel dimension.
rf_spend_has_geo_dimension
Checks whether the rf_spend array has a geo dimension.
rf_spend_has_time_dimension
Checks whether the rf_spend array has a time dimension.
scaled_centered_kpi
Calculates scaled and centered KPI values. The values are mean-centered by geo.
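One plausible reading of "scaled and mean-centered by geo" is sketched below: each geo's KPI is divided by its population and then shifted by that geo's mean over time. The exact computation is an assumption for this example (the text here only states that population is used for scaling and that centering is per geo), and the helper name is hypothetical.

```python
def scale_and_center(kpi, population):
    """Divide each geo's KPI by its population, then subtract the geo's mean.

    kpi is shaped [geo][time]; population has one entry per geo.
    Hypothetical helper illustrating one reading of the description.
    """
    result = []
    for kpi_row, pop in zip(kpi, population):
        scaled = [value / pop for value in kpi_row]
        mean = sum(scaled) / len(scaled)
        result.append([value - mean for value in scaled])
    return result


kpi = [[100.0, 300.0], [10.0, 30.0]]     # 2 geos x 2 time periods
population = [100.0, 10.0]
centered = scale_and_center(kpi, population)
```

After this transformation the two geos look identical despite a tenfold difference in size, which is the point of scaling by population before centering.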

time
Returns the time dimension coordinates.
time_coordinates
Returns the (KPI) time dimension in a TimeCoordinates wrapper.

Methods

aggregate_media_spend

Aggregates media spend by channel over the calibration period.

aggregate_rf_spend

Aggregates RF spend by channel over the calibration period.

as_dataset

Returns data as a single xarray.Dataset object.

copy

Returns a copy of the InputData instance.

Args

deep
If True, a deep copy is made, meaning all xarray.DataArray objects are also deep-copied. If False, a shallow copy is made.

Returns
A new InputData instance.
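The difference between the two modes follows general Python copy semantics, shown here with the standard copy module. The Container class is a hypothetical stand-in, and this sketch does not show how InputData.copy is implemented.

```python
import copy
from dataclasses import dataclass, field


@dataclass
class Container:
    """Hypothetical stand-in for a container holding array-like members."""
    kpi: list = field(default_factory=list)


original = Container(kpi=[[1.0, 2.0]])
shallow = copy.copy(original)       # new container, shared kpi object
deep = copy.deepcopy(original)      # new container, independent kpi object

# Mutating the original's data is visible through the shallow copy only.
original.kpi[0][0] = 99.0
```

A shallow copy is cheaper but shares the underlying data, so in-place changes to that data are visible through both objects; a deep copy is fully independent.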

get_all_adstock_hill_channels

Returns all channel dimensions that the Adstock and Hill transformations are applied to.

RF, organic media and organic RF channels are concatenated to the end of the media channels if they are present.

get_all_channels

Returns all the channel dimensions.

This method returns media, RF, organic media, organic RF and non-media channel names, concatenated into a single array in that order.
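The concatenation order can be pictured with plain lists of channel names. The function and argument names are hypothetical, and skipping groups that were not provided is an assumption made for this example.

```python
def all_channels(media=None, rf=None, organic_media=None,
                 organic_rf=None, non_media=None):
    """Concatenate channel names in a fixed order, skipping absent groups.

    Hypothetical helper; order is media, RF, organic media, organic RF,
    then non-media.
    """
    groups = [media, rf, organic_media, organic_rf, non_media]
    return [name for group in groups if group is not None for name in group]


names = all_channels(media=["tv", "search"], rf=["video"], non_media=["promo"])
```

Because the order is fixed, the position of a channel in the combined array depends only on which groups are present and how many channels each contains.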

get_all_media_and_rf

Returns all of the media execution values, including both media and RF.

If media, reach, and frequency were used for modeling, reach * frequency is concatenated to the end of media.

Returns
np.ndarray with dimensions (n_geos, n_media_times, n_channels) containing media or reach * frequency for each media_channel or rf_channel.
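The combination of media with reach * frequency along the channel axis can be sketched with nested lists shaped [geo][time][channel]. The helper below is hypothetical and dependency-free; the actual method returns an np.ndarray.

```python
def all_media_and_rf(media=None, reach=None, frequency=None):
    """Build [geo][time][channel] values: media, then reach * frequency.

    Hypothetical helper using nested lists in place of arrays.
    """
    rf = None
    if reach is not None and frequency is not None:
        rf = [
            [[r * f for r, f in zip(r_t, f_t)] for r_t, f_t in zip(r_g, f_g)]
            for r_g, f_g in zip(reach, frequency)
        ]
    if media is None:
        return rf
    if rf is None:
        return media
    # Append the RF channels after the media channels for every geo and time.
    return [
        [m_t + rf_t for m_t, rf_t in zip(m_g, rf_g)]
        for m_g, rf_g in zip(media, rf)
    ]


media = [[[100.0], [200.0]]]          # 1 geo x 2 media periods x 1 channel
reach = [[[10.0], [20.0]]]            # 1 geo x 2 media periods x 1 RF channel
frequency = [[[2.0], [1.5]]]
combined = all_media_and_rf(media, reach, frequency)
```

The result has two channels per geo and period: the media value first, followed by the reach * frequency product for the RF channel.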

get_all_paid_channels

Returns all the paid channel dimensions, including both media and RF.

If both media and RF channels are present, then the RF channels are concatenated to the end of the media channels.

get_n_top_largest_geos

Finds the specified number of the largest geos by population.

Args

num_geos
The number of top largest geos to return based on population.

Returns
A list of the specified number of top largest geos.
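Selecting the largest geos amounts to ranking by population and keeping the first num_geos entries. In this hypothetical sketch a dict maps geo names to populations, and returning the geos largest first is an assumption.

```python
def top_geos_by_population(population, num_geos):
    """Return the names of the num_geos most populous geos, largest first.

    population maps geo name to population. Hypothetical helper.
    """
    ranked = sorted(population, key=population.get, reverse=True)
    return ranked[:num_geos]


population = {"geo_a": 500, "geo_b": 1200, "geo_c": 800}
largest = top_geos_by_population(population, 2)
```

Here largest contains geo_b and geo_c, the two geos with the highest population.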

get_organic_media_channels_argument_builder

Returns an argument builder for organic media channels only.

get_organic_rf_channels_argument_builder

Returns an argument builder for organic RF channels only.

get_paid_channels_argument_builder

Returns an argument builder for all paid channels.

get_paid_media_channels_argument_builder

Returns an argument builder for paid media channels only.

get_paid_rf_channels_argument_builder

Returns an argument builder for paid RF channels only.

get_total_outcome

Returns total outcome, aggregated over geos and times.

get_total_spend

Returns total spend, including media_spend and rf_spend.

__eq__

Return self==value.

Class Variables

controls
None
frequency
None
media
None
media_spend
None
non_media_treatments
None
organic_frequency
None
organic_media
None
organic_reach
None
reach
None
revenue_per_kpi
None
rf_spend
None
