meridian.analysis.analyzer.Analyzer

Runs calculations to analyze the raw data after fitting the model.

meridian.analysis.analyzer.Analyzer(
    meridian: meridian.model.model.Meridian
)

Methods

`adstock_decay`

adstock_decay(
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> pd.DataFrame

Calculates adstock decay for paid media, RF, and organic media channels.

Args

confidence_level

Confidence level for prior and posterior credible intervals, represented as a value between zero and one.

Returns

Pandas DataFrame containing the channel, time_units, distribution, ci_hi, ci_lo, and mean for the Adstock function.
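To give a feel for what the mean, ci_lo, and ci_hi columns summarize, here is a minimal sketch. It assumes a geometric decay weight of alpha**lag and an equal-tailed interval; neither is stated on this page, and the draws of alpha are made up for illustration.

```python
import numpy as np

rng = np.random.default_rng(0)
# Hypothetical draws of a decay rate in (0, 1) for one channel (illustrative only).
alpha_draws = rng.beta(2.0, 2.0, size=1000)
time_units = np.arange(0, 5)  # lags 0..4

# Assumed geometric decay weight alpha**lag per draw: shape (n_draws, n_lags).
decay = alpha_draws[:, None] ** time_units[None, :]

confidence_level = 0.9
tail = (1.0 - confidence_level) / 2.0

# Point-wise summaries across draws, one value per lag.
mean = decay.mean(axis=0)
ci_lo = np.quantile(decay, tail, axis=0)
ci_hi = np.quantile(decay, 1.0 - tail, axis=0)
```

At lag 0 every draw has weight 1, so the mean and both interval bounds equal 1 there; the interval widens at later lags.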

`baseline_summary_metrics`

baseline_summary_metrics(
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    non_media_baseline_values: (Sequence[float] | None) = None,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Returns baseline summary metrics.

Args

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing a subset of times to include. By default, all time periods are included.

aggregate_geos

Boolean. If True, the expected outcome is summed over all of the regions.

aggregate_times

Boolean. If True, the expected outcome is summed over all of the time periods.

non_media_baseline_values

Optional list of shape (n_non_media_channels,). Each element is a float giving the fixed value to use as the baseline for the corresponding channel. The values are expected to be scaled by population for the channels where model_spec.non_media_population_scaling_id is True. If None, model_spec.non_media_baseline_values is used, which defaults to the minimum value for each non-media treatment channel.

confidence_level

Confidence level for media summary metrics credible intervals, represented as a value between zero and one.

batch_size

Integer representing the maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing batch_size. The calculation will generally be faster with larger batch_size values.

Returns

An xr.Dataset with the coordinates metric (mean, median, ci_low, ci_high) and distribution (prior, posterior), containing the data variables baseline_outcome and pct_of_contribution.
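The memory and speed trade-off described for batch_size can be pictured with a small sketch. This is not Meridian's implementation; the helper name batched and the per-draw computation are hypothetical and only show how draws can be processed in slices.

```python
def batched(n_draws, batch_size):
    """Yields (start, stop) index pairs that cover n_draws in order."""
    for start in range(0, n_draws, batch_size):
        yield start, min(start + batch_size, n_draws)

draws = [float(i) for i in range(10)]  # made-up per-draw values
batch_size = 4

results = []
for start, stop in batched(len(draws), batch_size):
    # Only this slice needs to be in memory for the expensive step.
    results.extend(2.0 * d for d in draws[start:stop])
```

A smaller batch_size means more, smaller slices (less peak memory, more loop overhead); the final result is the same either way.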

`compute_incremental_outcome_aggregate`

compute_incremental_outcome_aggregate(
    use_posterior: bool,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    use_kpi: (bool | None) = None,
    include_non_paid_channels: bool = True,
    non_media_baseline_values: (Sequence[float] | None) = None,
    **kwargs
) -> meridian.backend.Tensor

Aggregates the incremental outcome of the media channels.

Args

use_posterior

Boolean. If True, then the incremental outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

Optional DataTensors container with optional tensors: media, reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi. If None, the incremental outcome is calculated using the InputData provided to the Meridian object. If new_data is provided, the incremental outcome is calculated using the new tensors in new_data and the original values of the remaining tensors. For example, compute_incremental_outcome_aggregate(new_data=DataTensors(media=new_media)) computes the incremental outcome using new_media and the original values of reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi. If any of the tensors in new_data is provided with a different number of time periods than in InputData, then all tensors must be provided with the same number of time periods.

use_kpi

Boolean. If True, the summary metrics are calculated using KPI. If False, the metrics are calculated using revenue.

include_non_paid_channels

Boolean. If True, then non-media treatments and organic effects are included in the calculation. If False, then only the paid media and RF effects are included.

non_media_baseline_values

**kwargs

Keyword arguments to pass to incremental_outcome, which may include selected_geos, selected_times, aggregate_geos, aggregate_times, and batch_size.

Returns

A Tensor with the same dimensions as incremental_outcome except the size of the channel dimension is incremented by one, with the new component at the end containing the total incremental outcome of all channels.
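The shape change described in Returns can be sketched with NumPy. This is an illustration on a toy array, not the library's code; the array values and the dimension sizes are made up.

```python
import numpy as np

# Toy incremental outcome with shape (n_chains, n_draws, n_channels).
n_chains, n_draws, n_channels = 2, 3, 4
incremental = np.arange(n_chains * n_draws * n_channels, dtype=float).reshape(
    n_chains, n_draws, n_channels
)

# Sum across channels, keeping the channel axis so it can be concatenated.
total = incremental.sum(axis=-1, keepdims=True)

# The channel dimension grows by one; the last entry is the all-channel total.
with_total = np.concatenate([incremental, total], axis=-1)
```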

`cpik`

cpik(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates the cost per incremental KPI distribution for each channel.

The CPIK numerator is the total spend on the channel. The CPIK denominator is the change in expected KPI when one channel's spend is set to zero, leaving all other channels' spend unchanged.

If new_data=None, this method calculates CPIK conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the CPIK numerator is the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Note that CPIK is simply 1/ROI, where ROI is obtained from a call to the roi method with use_kpi=True.
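The reciprocal relationship between CPIK and KPI-based ROI can be checked by hand. The sketch below uses made-up spend and incremental KPI values for a single channel and three parameter draws; it does not call the library.

```python
# Made-up numbers for one channel and three parameter draws.
spend = 1000.0
incremental_kpi_draws = [50.0, 40.0, 80.0]

# ROI on the KPI scale: incremental KPI per unit of spend.
roi_kpi = [kpi / spend for kpi in incremental_kpi_draws]

# CPIK: spend per unit of incremental KPI, the reciprocal of ROI per draw.
cpik = [spend / kpi for kpi in incremental_kpi_draws]
```

The reciprocal holds draw by draw. Because 1/x is nonlinear, the mean of the CPIK draws generally differs from one over the mean of the ROI draws, so summaries are best computed on the CPIK draws themselves.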

Args

use_posterior

Boolean. If True, then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

Optional. DataTensors containing media, media_spend, reach, frequency, rf_spend and revenue_per_kpi data. If provided, the CPIK is calculated using the values of the tensors passed in new_data and the original values of all the remaining tensors. If None, the CPIK is calculated using the original values of all the tensors. If any of the tensors in new_data is provided with a different number of time periods than in InputData, then all tensors must be provided with the same number of time periods.

selected_geos

Optional. Contains a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the new_data args, if provided. By default, all time periods are included.

aggregate_geos

Boolean. If True, the expected KPI is summed over all of the regions.

batch_size

Returns

Tensor of CPIK values with dimensions (n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels)). The n_geos dimension is dropped if aggregate_geos=True.

`expected_outcome`

expected_outcome(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    inverse_transform_outcome: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates either prior or posterior expected outcome.

This calculates E(Outcome|Media, RF, Organic media, Organic RF, Non-media treatments, Controls) for each posterior (or prior) parameter draw, where Outcome refers to either revenue if use_kpi=False, or kpi if use_kpi=True. When revenue_per_kpi is not defined, use_kpi cannot be False.

If new_data=None, this method calculates expected outcome conditional on the values of the independent variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument, as long as the new tensors' dimensions match. For example,

new_data = DataTensors(reach=new_reach, frequency=new_frequency)
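The override behavior (new tensors where given, original values elsewhere) can be pictured with a small stand-in. ToyTensors and fill_missing below are hypothetical names written for this sketch; they are not part of the Meridian API and say nothing about how DataTensors is implemented.

```python
from dataclasses import dataclass, fields
from typing import Optional, Sequence


@dataclass
class ToyTensors:
    """Hypothetical stand-in for a container whose fields are all optional."""
    media: Optional[Sequence[float]] = None
    reach: Optional[Sequence[float]] = None
    frequency: Optional[Sequence[float]] = None


def fill_missing(new_data: ToyTensors, original: ToyTensors) -> ToyTensors:
    """Uses values from new_data where set, otherwise falls back to original."""
    merged = {}
    for f in fields(ToyTensors):
        value = getattr(new_data, f.name)
        merged[f.name] = getattr(original, f.name) if value is None else value
    return ToyTensors(**merged)


original = ToyTensors(media=[1.0, 2.0], reach=[10.0, 20.0], frequency=[1.5, 2.5])
new_data = ToyTensors(reach=[30.0, 40.0], frequency=[2.0, 3.0])
merged = fill_missing(new_data, original)
```

Here merged keeps the original media values and takes reach and frequency from new_data.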

In principle, expected outcome could be calculated with other time dimensions (for future predictions, for instance). However, this is not allowed with this method because of the additional complexities this introduces:

Corresponding price (revenue per KPI) data would also be needed.
If the model contains weekly effect parameters, then some method is needed to estimate or predict these effects for time periods outside of the training data window.

Args

use_posterior

Boolean. If True, then the expected outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

An optional DataTensors container with optional new tensors: media, reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments, controls. If None, expected outcome is calculated conditional on the original values of the data tensors that the Meridian object was initialized with. If the new_data argument is used, expected outcome is calculated conditional on the values of the tensors passed in new_data and on the original values of the remaining unset tensors. For example, expected_outcome(new_data=DataTensors(reach=new_reach, frequency=new_frequency)) calculates expected outcome conditional on the original media, organic_media, organic_reach, organic_frequency, non_media_treatments and controls tensors and on the new given values for the reach and frequency tensors. The new tensors' dimensions must match the dimensions of the corresponding original tensors from input_data.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing a subset of dates to include. The values accepted here must match time dimension coordinates from InputData.time. By default, all time periods are included.

aggregate_geos

Boolean. If True, the expected outcome is summed over all regions.

aggregate_times

Boolean. If True, the expected outcome is summed over all time periods.

inverse_transform_outcome

Boolean. If True, returns the expected outcome in the original KPI or revenue (depending on what is passed to use_kpi), as it was passed to InputData. If False, returns the outcome after transformation by KpiTransformer, reflecting how it is represented within the model.

use_kpi

Boolean. If use_kpi = True, the expected KPI is calculated; otherwise the expected revenue (kpi * revenue_per_kpi) is calculated. It is required that use_kpi = True if revenue_per_kpi is not defined or if inverse_transform_outcome = False.

batch_size

Returns

Tensor of expected outcome (either KPI or revenue, depending on the use_kpi argument) with dimensions (n_chains, n_draws, n_geos, n_times). The n_geos and n_times dimensions are dropped if aggregate_geos=True or aggregate_times=True, respectively.

Raises

NotFittedModelError

If sample_posterior() (for use_posterior=True) or sample_prior() (for use_posterior=False) has not been called prior to calling this method.

`expected_vs_actual_data`

expected_vs_actual_data(
    aggregate_geos: bool = False,
    aggregate_times: bool = False,
    split_by_holdout_id: bool = False,
    non_media_baseline_values: (Sequence[float] | None) = None,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> xr.Dataset

Calculates the data for the expected versus actual outcome over time.

Args

aggregate_geos

Boolean. If True, the expected, baseline, and actual are summed over all of the regions.

aggregate_times

Boolean. If True, the expected, baseline, and actual are summed over all of the time periods.

split_by_holdout_id

Boolean. If True and holdout_id exists, the data is split into 'Train', 'Test', and 'All Data' subsections.

non_media_baseline_values

confidence_level

Confidence level for expected outcome credible intervals, represented as a value between zero and one. Default: 0.9.

Returns

A dataset with the expected, baseline, and actual outcome metrics.

`filter_and_aggregate_geos_and_times`

filter_and_aggregate_geos_and_times(
    tensor: meridian.backend.Tensor,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    flexible_time_dim: bool = False,
    has_media_dim: bool = True
) -> meridian.backend.Tensor

Filters and/or aggregates geo and time dimensions of a tensor.

Args

tensor

Tensor with dimensions [..., n_geos, n_times] or [..., n_geos, n_times, n_channels], where n_channels is the number of either media channels, RF channels, all paid channels (media and RF), or all channels (media, RF, non-media, organic media, organic RF).

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included. The selected geos should match those in InputData.geo.

selected_times

Optional list of times to include. This can either be a string list containing a subset of time dimension coordinates from InputData.time or a boolean list with length equal to the time dimension of the tensor. By default, all time periods are included.

aggregate_geos

Boolean. If True, the tensor is summed over all geos.

aggregate_times

Boolean. If True, the tensor is summed over all time periods.

flexible_time_dim

Boolean. If True, the time dimension of the tensor is not required to match the number of time periods in InputData.time. In this case, if using selected_times, it must be a boolean list with length equal to the time dimension of the tensor.

has_media_dim

Boolean. Only used if flexible_time_dim=True. Otherwise, this is assumed based on the tensor dimensions. If True, the tensor is assumed to have a media dimension following the time dimension. If False, the last dimension of the tensor is assumed to be the time dimension.

Returns

A tensor with filtered and/or aggregated geo and time dimensions.
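The filter-then-aggregate behavior can be sketched with NumPy on a toy array. This is an illustration under assumed geo and time labels, not the method's implementation; it uses a string list for geos and a boolean mask for times, the two selection styles described above.

```python
import numpy as np

geos = ["geo_a", "geo_b", "geo_c"]  # assumed geo coordinates
times = ["2024-01-01", "2024-01-08", "2024-01-15", "2024-01-22"]

# Toy tensor with dimensions (n_draws, n_geos, n_times).
tensor = np.arange(2 * 3 * 4, dtype=float).reshape(2, 3, 4)

selected_geos = ["geo_a", "geo_c"]
selected_times = [True, False, True, False]  # boolean mask over the time axis

# Filter the geo axis by label and the time axis by mask.
geo_idx = [geos.index(g) for g in selected_geos]
filtered = tensor[:, geo_idx, :][:, :, np.array(selected_times)]

# With both aggregate flags set, the geo and time axes are summed out.
aggregated = filtered.sum(axis=(1, 2))
```

With only one aggregate flag set, the sum would run over that single axis and the other dimension would remain.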

`get_aggregated_impressions`

get_aggregated_impressions(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    optimal_frequency: (Sequence[float] | None) = None,
    include_non_paid_channels: bool = True
) -> meridian.backend.Tensor

Computes aggregated impressions values in the data across all channels.

Args

new_data

An optional DataTensors object containing the new media, reach, frequency, organic_media, organic_reach, organic_frequency, and non_media_treatments tensors. If the new_data argument is used, then the aggregated impressions are computed using the values of the tensors passed in the new_data argument and the original values of all the remaining tensors. If None, the existing tensors from the Meridian object are used.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in the tensors in the new_data argument, if provided. By default, all time periods are included.

aggregate_geos

Boolean. If True, the impressions are summed over all of the regions.

aggregate_times

Boolean. If True, the impressions are summed over all of the time periods.

optimal_frequency

An optional list with dimension n_rf_channels, containing the optimal frequency per channel that maximizes posterior mean ROI. Default value is None, in which case historical frequency is used for the metrics calculation.

include_non_paid_channels

Boolean. If True, the organic media, organic RF, and non-media channels are included in the aggregation.

Returns

A tensor with the shape (n_selected_geos, n_selected_times, n_channels) (or (n_channels,) if geos and times are aggregated) with aggregate impression values per channel.

`get_aggregated_spend`

get_aggregated_spend(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    include_media: bool = True,
    include_rf: bool = True
) -> xr.DataArray

Gets the aggregated spend based on the selected time.

Args

new_data

An optional DataTensors object containing the new media, media_spend, reach, frequency, rf_spend tensors. If None, the existing tensors from the Meridian object are used. If the new_data argument is used, then the aggregated spend is computed using the values of the tensors passed in the new_data argument and the original values of all the remaining tensors. If any of the tensors in new_data is provided with a different number of time periods than in InputData, then all tensors must be provided with the same number of time periods.

selected_times

Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in KPI data. By default, all time periods are included.

include_media

Whether to include spends for paid media channels that do not have R&F data.

include_rf

Whether to include spends for paid media channels with R&F data.

Returns

An xr.DataArray with the coordinate channel that contains the data variable spend.

Raises

ValueError

A ValueError is raised when include_media and include_rf are both False.

`get_historical_spend`

get_historical_spend(
    selected_times: (Sequence[str] | None) = None,
    include_media: bool = True,
    include_rf: bool = True
) -> xr.DataArray

Deprecated. Gets the aggregated historical spend based on the time.

Args

selected_times

The time period to get the historical spends. If None, the historical spends will be aggregated over all time points.

include_media

Whether to include spends for paid media channels that do not have R&F data.

include_rf

Whether to include spends for paid media channels with R&F data.

Returns

An xr.DataArray with the coordinate channel that contains the data variable spend.

Raises

ValueError

A ValueError is raised when include_media and include_rf are both False.

`get_rhat`

get_rhat() -> Mapping[str, meridian.backend.Tensor]

Computes the R-hat values for each parameter in the model.

Returns

A dictionary of r-hat values where each parameter is a key and values are r-hats corresponding to the parameter.

Raises

NotFittedModelError

If self.sample_posterior() is not called before calling this method.
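For intuition about the values get_rhat returns, the sketch below computes the classic (non-split) potential scale reduction for a single scalar parameter from simulated chains. It is a textbook formulation written for illustration; the estimator Meridian actually uses may differ, and basic_rhat is a hypothetical helper name.

```python
import numpy as np


def basic_rhat(draws: np.ndarray) -> float:
    """Classic potential scale reduction for draws shaped (n_chains, n_draws)."""
    n_draws = draws.shape[1]
    # Mean of the within-chain sample variances.
    within = draws.var(axis=1, ddof=1).mean()
    # Sample variance of the chain means (between-chain variance divided by n).
    between_over_n = draws.mean(axis=1).var(ddof=1)
    pooled = (n_draws - 1) / n_draws * within + between_over_n
    return float(np.sqrt(pooled / within))


rng = np.random.default_rng(0)
mixed = rng.normal(size=(4, 500))            # chains sampling the same target
stuck = mixed + np.arange(4)[:, None] * 3.0  # chains centered far apart
```

basic_rhat(mixed) lands close to 1, while basic_rhat(stuck) is far above 1, which is the pattern that signals chains have not converged to a common distribution.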

`hill_curves`

hill_curves(
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    n_bins: int = 25
) -> pd.DataFrame

Estimates Hill curve tables used for plotting each channel's curves.

Args

confidence_level

Confidence level for prior and posterior credible intervals, represented as a value between zero and one. Default is 0.9.

n_bins

Number of equal-width bins to include in the histogram for the plotting. Default is 25.

Returns

Hill curves pd.DataFrame with columns:

channel: media or rf channel name.
media_units: Media (for media channels) or average frequency (for rf channels) units.
distribution: Indication of posterior or prior draw.
ci_hi: Upper bound of the credible interval of the value of the Hill function.
ci_lo: Lower bound of the credible interval of the value of the Hill function.
mean: Point-wise mean of the value of the Hill function per draw.
channel_type: Indication of a media or rf channel.
scaled_count_histogram: Scaled count of media units or average frequencies within the bin.
count_histogram: Count value of media units or average frequencies within the bin.
start_interval_histogram: Media unit or average frequency starting point for a histogram bin.
end_interval_histogram: Media unit or average frequency ending point for a histogram bin.
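The histogram columns can be related to ordinary equal-width binning. The sketch below bins made-up media unit values with NumPy; it only illustrates how counts and bin start and end points relate to n_bins, and it does not reproduce how scaled_count_histogram is scaled.

```python
import numpy as np

rng = np.random.default_rng(0)
media_units = rng.gamma(shape=2.0, scale=1.0, size=200)  # made-up values

n_bins = 25
# Equal-width bins spanning the observed range: n_bins counts, n_bins + 1 edges.
counts, edges = np.histogram(media_units, bins=n_bins)

start_interval = edges[:-1]  # left edge of each bin
end_interval = edges[1:]     # right edge of each bin
```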

`incremental_outcome`

incremental_outcome(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    non_media_baseline_values: (Sequence[float] | None) = None,
    scaling_factor0: float = 0.0,
    scaling_factor1: float = 1.0,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    media_selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    inverse_transform_outcome: bool = True,
    use_kpi: bool = False,
    by_reach: bool = True,
    include_non_paid_channels: bool = True,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates either the posterior or prior incremental outcome.

This calculates the media outcome of each media channel for each posterior or prior parameter draw. Incremental outcome is defined as:

E(Outcome|Treatment_1, Controls) minus E(Outcome|Treatment_0, Controls)

For paid & organic channels (without reach and frequency data), Treatment_1 means that media execution for a given channel is multiplied by scaling_factor1 (1.0 by default) for the set of time periods specified by media_selected_times. Similarly, Treatment_0 means that media execution is multiplied by scaling_factor0 (0.0 by default) for these time periods.
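The Treatment_1 versus Treatment_0 contrast can be worked through on a toy example. The response function below is hypothetical (a simple saturating curve with no carryover between periods) and is not Meridian's model; only the scaling logic mirrors the definition above.

```python
from typing import List, Sequence


def expected_outcome_toy(media: Sequence[float]) -> float:
    """Hypothetical saturating response, summed over periods (no carryover)."""
    baseline, beta, half_sat = 100.0, 50.0, 10.0
    return baseline + sum(beta * x / (x + half_sat) for x in media)


def scale(media: Sequence[float], factor: float,
          selected: Sequence[bool]) -> List[float]:
    """Multiplies media by factor in the selected periods only."""
    return [x * factor if s else x for x, s in zip(media, selected)]


media = [10.0, 30.0, 0.0, 10.0]
media_selected_times = [True, True, False, False]
scaling_factor0, scaling_factor1 = 0.0, 1.0

treatment_1 = scale(media, scaling_factor1, media_selected_times)
treatment_0 = scale(media, scaling_factor0, media_selected_times)
incremental = expected_outcome_toy(treatment_1) - expected_outcome_toy(treatment_0)
```

With the default factors, the media in the unselected periods is identical in both scenarios, so it cancels out of the difference.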

For paid & organic channels with reach and frequency data, either reach or frequency is held fixed while the other is scaled, depending on the by_reach argument.

For non-media treatments, Treatment_1 means that the variable is set to historical values. Treatment_0 means that the variable is set to its baseline value for all geos and time periods. Note that the scaling factors (scaling_factor0 and scaling_factor1) are not applicable to non-media treatments.

"Outcome" refers to either revenue if use_kpi=False, or kpi if use_kpi=True. When revenue_per_kpi is not defined, use_kpi cannot be False.

If new_data=None, this method computes incremental outcome using the media, reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi tensors that the Meridian object was initialized with. This behavior can be overridden with the new_data argument. For example, new_data=DataTensors(media=new_media) calculates incremental outcome using the new_media tensor and the original values of the reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi tensors.

The calculation in this method depends on two key assumptions made in the Meridian implementation:

Additivity of media effects (no interactions).
Additive changes on the model KPI scale correspond to additive changes on the original KPI scale. In other words, the intercept and control effects do not influence the media effects. This assumption currently holds because the outcome transformation only involves centering and scaling, for example, no log transformations.
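The second assumption can be checked numerically. The sketch below uses hypothetical centering and scaling constants: under an affine inverse transform, the same media effect on the model scale yields the same difference on the original scale regardless of the intercept and control level, whereas an exponential inverse (as a log transformation would require) does not.

```python
import math

center, spread = 200.0, 40.0  # hypothetical centering and scaling constants


def inverse_transform(z: float) -> float:
    """Maps a model-scale value back to the original scale (affine)."""
    return z * spread + center


media_effect = 0.5  # same additive effect on the model scale

# Two different intercept-plus-control levels on the model scale.
diff_low = inverse_transform(-1.0 + media_effect) - inverse_transform(-1.0)
diff_high = inverse_transform(2.0 + media_effect) - inverse_transform(2.0)

# Contrast: with an exponential inverse, the difference depends on the level.
log_low = math.exp(-1.0 + media_effect) - math.exp(-1.0)
log_high = math.exp(2.0 + media_effect) - math.exp(2.0)
```

Both affine differences equal media_effect * spread, while the two exponential differences disagree.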

Args

use_posterior

Boolean. If True, then the incremental outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

Optional DataTensors container with optional tensors: media, reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi. If None, the incremental outcome is calculated using the InputData provided to the Meridian object. If new_data is provided, the incremental outcome is calculated using the new tensors in new_data and the original values of the remaining tensors. For example, incremental_outcome(new_data=DataTensors(media=new_media)) computes the incremental outcome using new_media and the original values of reach, frequency, organic_media, organic_reach, organic_frequency, non_media_treatments and revenue_per_kpi. If any of the tensors in new_data is provided with a different number of time periods than in InputData, then all tensors must be provided with the same number of time periods.

non_media_baseline_values

scaling_factor0

Float. The factor by which to scale the counterfactual scenario "Media_0" during the time periods specified in media_selected_times. Must be non-negative and less than scaling_factor1.

scaling_factor1

Float. The factor by which to scale "Media_1" during the selected time periods specified in media_selected_times. Must be non-negative and greater than scaling_factor0.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in new_data if time is modified in new_data, or input_data.n_times otherwise. The incremental outcome corresponds to incremental KPI generated during the selected_times arg by media executed during the media_selected_times arg. Note that if use_kpi=False, then selected_times can only include the time periods that have revenue_per_kpi input data. By default, all time periods are included where revenue_per_kpi data is available.

media_selected_times

Optional list containing either a subset of dates to include or booleans with length equal to the number of time periods in KPI data or number of time periods in the new_data args, if provided. If new_data is provided, media_selected_times can select any subset of time periods in new_data. If new_data is not provided, media_selected_times selects from InputData.time. The incremental outcome corresponds to incremental KPI generated during the selected_times arg by treatment variables executed during the media_selected_times arg. For each channel, the incremental outcome is defined as the difference between expected KPI when treatment variables execution is scaled by scaling_factor1 and scaling_factor0 during these specified time periods. By default, the difference is between treatment variables at historical execution levels, or as provided in new_data, versus zero execution. Defaults to include all time periods.

aggregate_geos

Boolean. If True, then incremental outcome is summed over all regions.

aggregate_times

Boolean. If True, then incremental outcome is summed over all time periods.

inverse_transform_outcome

use_kpi

Boolean. If use_kpi = True, the expected KPI is calculated; otherwise the expected revenue (kpi * revenue_per_kpi) is calculated. It is required that use_kpi = True if revenue_per_kpi data is not available or if inverse_transform_outcome = False.

by_reach

Boolean. If True, then the incremental outcome is calculated by scaling the reach and holding the frequency constant. If False, then the incremental outcome is calculated by scaling the frequency and holding the reach constant. Only used for channels with RF data.

include_non_paid_channels

Boolean. If True, then non-media treatments and organic effects are included in the calculation. If False, then only the paid media and RF effects are included.

batch_size

Returns

Tensor of incremental outcome (either KPI or revenue, depending on the use_kpi argument) with dimensions (n_chains, n_draws, n_geos, n_times, n_channels). If include_non_paid_channels=True, then n_channels is the total number of media, RF, organic media, organic RF, and non-media channels. If include_non_paid_channels=False, then n_channels is the total number of media and RF channels. The n_geos and n_times dimensions are dropped if aggregate_geos=True or aggregate_times=True, respectively.

Raises

NotFittedModelError

If sample_posterior() (for use_posterior=True) or sample_prior() (for use_posterior=False) has not been called prior to calling this method.

ValueError

If new_data argument contains tensors with modified time dimension and not all treatment variables are provided in new_data with matching time dimensions.

`marginal_roi`

marginal_roi(
    incremental_increase: float = 0.01,
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    by_reach: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates the marginal ROI prior or posterior distribution.

The marginal ROI (mROI) numerator is the change in expected outcome (kpi or kpi * revenue_per_kpi) when one channel's spend is increased by a small fraction. The mROI denominator is the corresponding small fraction of the channel's total spend.
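The numerator and denominator can be traced on a toy response curve. The function below is a hypothetical concave response to spend, chosen only for illustration; it is not Meridian's model, and the spend figure is made up.

```python
def expected_outcome_toy(spend: float) -> float:
    """Hypothetical concave response to spend; not Meridian's model."""
    return 500.0 * spend / (spend + 1000.0)


total_spend = 1000.0
incremental_increase = 0.01

# Numerator: change in expected outcome when spend grows by a small fraction.
numerator = (
    expected_outcome_toy(total_spend * (1.0 + incremental_increase))
    - expected_outcome_toy(total_spend)
)
# Denominator: that same small fraction of the channel's total spend.
denominator = incremental_increase * total_spend
mroi = numerator / denominator

# Average ROI on the same toy curve, for comparison.
roi = (expected_outcome_toy(total_spend) - expected_outcome_toy(0.0)) / total_spend
```

On a concave curve like this one, mROI comes out below the average ROI, reflecting diminishing returns at the current spend level.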

If new_data=None, this method calculates marginal ROI conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the mROI denominator is based on the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Args

incremental_increase

Small fraction by which each channel's spend is increased when calculating its mROI numerator. The mROI denominator is this fraction of the channel's total spend.

use_posterior

If True then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

Optional. DataTensors containing media , media_spend , reach , frequency , rf_spend and revenue_per_kpi data. If provided, the marginal ROI is calculated using the values of the tensors passed in new_data and the original values of all the remaining tensors. If None , the marginal ROI is calculated using the original values of all the tensors. If any of the tensors in new_data is provided with a different number of time periods than in InputData , then all tensors must be provided with the same number of time periods.

selected_geos

Optional. Contains a subset of geos to include. By default, all geos are included.

selected_times

aggregate_geos

If True , the expected revenue is summed over all of the regions.

by_reach

Used for a channel with reach and frequency. If True , returns the mROI by reach for a given fixed frequency. If False , returns the mROI by frequency for a given fixed reach.

use_kpi

If False , then revenue is used to calculate the mROI numerator. Otherwise, uses KPI to calculate the mROI numerator.

batch_size

Maximum draws per chain in each batch. The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing batch_size . The calculation will generally be faster with larger batch_size values.

Returns

Tensor of mROI values with dimensions (n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels)). The n_geos dimension is dropped if aggregate_geos=True .

`negative_baseline_probability`

View source

negative_baseline_probability(
    non_media_baseline_values: (Sequence[float] | None) = None,
    use_posterior: bool = True,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> np.floating

Calculates either prior or posterior negative baseline probability.

This calculates either the prior or posterior probability that the baseline, aggregated over the supplied time window, is negative.

The baseline is calculated by computing expected_outcome with the following assumptions: 1) media is set to all zeros, 2) reach is set to all zeros, 3) organic_media is set to all zeros, 4) organic_reach is set to all zeros, 5) non_media_treatments is set to the counterfactual values according to the non_media_baseline_values argument, 6) controls are set to historical values.
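As a rough illustration of the quantity described above, the sketch below takes hypothetical baseline draws (one row per draw, one value per time period), sums each draw over the time window, and reports the fraction of draws whose total is negative. The data and the function body are assumptions for illustration; the real method derives the baseline draws from the fitted model.

```python
def negative_baseline_probability(baseline_draws: list[list[float]]) -> float:
    """Fraction of draws whose baseline, summed over the time window, is negative.

    baseline_draws[d][t] is a hypothetical baseline outcome for draw d at time t.
    """
    totals = [sum(draw) for draw in baseline_draws]
    return sum(1 for total in totals if total < 0) / len(totals)


draws = [
    [5.0, -1.0, 2.0],   # total 6.0
    [-4.0, 1.0, 1.0],   # total -2.0
    [0.5, 0.5, -0.5],   # total 0.5
    [-3.0, -1.0, 2.0],  # total -2.0
]
prob = negative_baseline_probability(draws)  # 2 of the 4 totals are negative
```

Note that a draw can contain negative values in individual periods and still count as non-negative, because the check is applied to the aggregate over the window.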

Args

non_media_baseline_values

Optional list of shape (n_non_media_channels,) . Each element is a float denoting a fixed value that will be used as the baseline for the given channel. It is expected that they are scaled by population for the channels where model_spec.non_media_population_scaling_id is True . If None , the model_spec.non_media_baseline_values is used, which defaults to the minimum value for each non_media treatment channel.

use_posterior

Boolean. If True , then the expected outcome posterior distribution is calculated. Otherwise, the prior distribution is calculated.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing a subset of dates to include. The values accepted here must match time dimension coordinates from InputData.time . By default, all time periods are included.

use_kpi

batch_size

Returns

A float representing the prior or posterior negative baseline probability over the supplied time window.

Raises

NotFittedModelError

If sample_posterior() (for use_posterior=True ) or sample_prior() (for use_posterior=False ) has not been called prior to calling this method.

`optimal_freq`

View source

optimal_freq(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    max_frequency: (float | None) = None,
    freq_grid: (Sequence[float] | None) = None,
    use_posterior: bool = True,
    use_kpi: bool = False,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL
) -> xr.Dataset

Calculates the optimal frequency that maximizes posterior mean ROI.

For this optimization, historical spend is used and fixed, and frequency is restricted to be constant across all geographic regions and time periods. Reach is calculated for each geographic area and time period such that the number of impressions remains unchanged as frequency varies. Meridian solves for the frequency at which posterior mean ROI is optimized.
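The search described above can be sketched as a grid search in plain Python. Everything model-specific here is an assumption: the Hill-shaped frequency effect, its parameters, the effect_per_reach scale, and the impression and spend totals are hypothetical. What the sketch does take from the description is that spend and impressions stay fixed, reach is recomputed as impressions divided by frequency, and the frequency with the highest ROI on a grid from 1.0 in steps of 0.1 is selected.

```python
def hill(frequency: float, ec: float = 2.0, slope: float = 3.0) -> float:
    """Hypothetical saturating effect of average frequency."""
    return frequency ** slope / (frequency ** slope + ec ** slope)


def roi_at_frequency(
    frequency: float,
    impressions: float,
    spend: float,
    effect_per_reach: float = 0.05,
) -> float:
    # Reach is adjusted so that reach * frequency stays equal to impressions.
    reach = impressions / frequency
    return effect_per_reach * reach * hill(frequency) / spend


max_frequency = 5.0
n_steps = int(round((max_frequency - 1.0) / 0.1))
freq_grid = [round(1.0 + 0.1 * i, 1) for i in range(n_steps + 1)]

rois = [roi_at_frequency(f, impressions=1_000_000.0, spend=10_000.0) for f in freq_grid]
optimal_frequency = freq_grid[rois.index(max(rois))]
```

In this toy setup, very low frequencies waste impressions below the saturation threshold and very high frequencies sacrifice reach, so the maximum falls in the interior of the grid.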

If new_data=None , this method calculates the optimal frequency based on the values of the paid RF variables that the Meridian object was initialized with. The user can override this historical data through the new_data argument. For example,

new_data = DataTensors(reach=new_reach, frequency=new_frequency)

Args

new_data

Optional DataTensors object containing rf_impressions , rf_spend , and revenue_per_kpi . If provided, the optimal frequency is calculated using the values of the tensors passed in new_data and the original values of all the remaining tensors. If None , the historical data used to initialize the Meridian object is used. If any of the tensors in new_data is provided with a different number of time periods than in InputData , then all tensors must be provided with the same number of time periods.

max_frequency

Maximum frequency value used to calculate the frequency grid. If None , the maximum frequency value is calculated from the historic frequency (maximum value of Meridian.input_data, not new_data ). If freq_grid is provided, this argument has no effect.

freq_grid

List of frequency values. The ROI of each channel is calculated for each frequency value in the list. By default, the list includes numbers from 1.0 to the maximum frequency in increments of 0.1 .

use_posterior

Boolean. If True , posterior optimal frequencies are generated. If False , prior optimal frequencies are generated.

use_kpi

Boolean. If True , the counterfactual metrics are calculated using KPI. If False , the counterfactual metrics are calculated using revenue.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

confidence_level

Confidence level for prior and posterior credible intervals, represented as a value between zero and one.

Returns

An xarray Dataset which contains:

Coordinates: frequency , rf_channel , metric ( mean , median , ci_lo , ci_hi ).
Data variables:
- optimal_frequency : The frequency that optimizes the posterior mean of ROI.
- roi : The ROI for each frequency value in freq_grid .
- optimized_incremental_outcome : The incremental outcome based on the optimal frequency.
- optimized_effectiveness : The effectiveness based on the optimal frequency.
- optimized_roi : The ROI based on the optimal frequency.
- optimized_mroi_by_reach : The marginal ROI with a small change in reach and fixed frequency at the optimal frequency.
- optimized_mroi_by_frequency : The marginal ROI with a small change around the optimal frequency and fixed reach.
- optimized_cpik : The CPIK based on the optimal frequency.

Raises

NotFittedModelError

If sample_posterior() (for use_posterior=True ) or sample_prior() (for use_posterior=False ) has not been called prior to calling this method.

ValueError

If there are no channels with reach and frequency data.

`predictive_accuracy`

View source

predictive_accuracy(
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Calculates R-Squared , MAPE , and wMAPE goodness of fit metrics.

R-Squared , MAPE (mean absolute percentage error), and wMAPE (weighted mean absolute percentage error) are calculated on the revenue scale ( KPI * revenue_per_kpi ) when revenue_per_kpi is specified, or the KPI scale when revenue_per_kpi = None . This is the same scale as what is used in the ROI numerator (incremental outcome).

Prediction errors in wMAPE are weighted by the actual revenue ( KPI * revenue_per_kpi ) when revenue_per_kpi is specified, or weighted by the KPI scale when revenue_per_kpi = None . This means that percentage errors when revenue is high are weighted more heavily than errors when revenue is low.

R-Squared , MAPE and wMAPE are calculated both at the model-level (one observation per geo and time period) and at the national-level (aggregating KPI or revenue outcome across geos so there is one observation per time period).

R-Squared , MAPE , and wMAPE are calculated for the full sample. If the model object has any holdout observations, then R-squared , MAPE , and wMAPE are also calculated for the Train and Test subsets.
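For readers who want to see the three metrics side by side, here is a minimal sketch using their standard textbook definitions on hypothetical actual and predicted values. It is an illustration under that assumption and is not taken from Meridian's source; in particular, wMAPE is written as the sum of absolute errors divided by the sum of absolute actuals, which is equivalent to weighting each percentage error by the actual value.

```python
def r_squared(actual: list[float], predicted: list[float]) -> float:
    """Coefficient of determination: 1 - SS_residual / SS_total."""
    mean_actual = sum(actual) / len(actual)
    ss_res = sum((a - p) ** 2 for a, p in zip(actual, predicted))
    ss_tot = sum((a - mean_actual) ** 2 for a in actual)
    return 1.0 - ss_res / ss_tot


def mape(actual: list[float], predicted: list[float]) -> float:
    """Unweighted mean of the absolute percentage errors."""
    return sum(abs(a - p) / abs(a) for a, p in zip(actual, predicted)) / len(actual)


def wmape(actual: list[float], predicted: list[float]) -> float:
    """Absolute percentage errors weighted by the actual values."""
    total_error = sum(abs(a - p) for a, p in zip(actual, predicted))
    return total_error / sum(abs(a) for a in actual)


# Hypothetical outcome values for three observations.
actual = [100.0, 200.0, 400.0]
predicted = [110.0, 190.0, 400.0]

r2 = r_squared(actual, predicted)
mape_value = mape(actual, predicted)
wmape_value = wmape(actual, predicted)
```

Here the largest observation is predicted exactly, so wMAPE comes out lower than MAPE: errors on small observations count for less once they are weighted by the actual values.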

Args

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing a subset of dates to include. By default, all time periods are included.

batch_size

Integer representing the maximum draws per chain in each batch. By default, batch_size is 100 . The calculation is run in batches to avoid memory exhaustion. If a memory error occurs, try reducing batch_size . The calculation will generally be faster with larger batch_size values.

Returns

An xarray Dataset containing the computed R_Squared , MAPE , and wMAPE values, with coordinates metric , geo_granularity , evaluation_set , and accompanying data variable value . If holdout_id exists, the data is split into 'Train' , 'Test' , and 'All Data' subsections, and the three metrics are computed for each.

`response_curves`

View source

response_curves(
    spend_multipliers: (list[float] | None) = None,
    use_posterior: bool = True,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | None) = None,
    by_reach: bool = True,
    use_optimal_frequency: bool = False,
    use_kpi: bool = False,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> xr.Dataset

Generates a response curves xarray.Dataset.

Response curves are calculated in aggregate across geos and time periods, assuming the historical flighting pattern across geos and time periods for each media channel.

A list of multipliers is applied to each media channel's total historical spend within selected_geos and selected_times to obtain the x-axis values. The y-axis values are the incremental outcome generated by each channel within selected_geos and selected_times under the counterfactual where media units in each geo and time period are scaled by the corresponding multiplier. (Media units for time periods prior to selected_times are also scaled by the multiplier.)
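The construction of the x-axis and y-axis values can be sketched for a single channel as follows. The saturating `incremental_outcome` function, its parameters, and the historical media units and spend are hypothetical placeholders for the fitted model and input data; only the scaling by multipliers mirrors the description above.

```python
def incremental_outcome(media_units: list[float], ec: float = 50.0, beta: float = 1_000.0) -> float:
    """Hypothetical saturating response, summed over time periods."""
    return sum(beta * units / (units + ec) for units in media_units)


historical_units = [20.0, 40.0, 60.0]  # media units per time period (flighting pattern)
historical_spend = 3_000.0             # total spend over the same periods
spend_multipliers = [0.0, 0.5, 1.0, 1.5, 2.0]

# Each point pairs the scaled total spend (x) with the incremental outcome
# obtained when every period's media units are scaled by the same multiplier (y).
curve = [
    (m * historical_spend, incremental_outcome([m * u for u in historical_units]))
    for m in spend_multipliers
]
```

Scaling every period by the same multiplier keeps the historical flighting pattern intact, which is why the curve reflects a change in budget level and not a change in timing.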

Args

spend_multipliers

List of multipliers. Each channel's total spend is multiplied by these factors to obtain the values at which the curve is calculated for that channel.

use_posterior

Boolean. If True , posterior response curves are generated. If False , prior response curves are generated.

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

Optional list containing a subset of dates to include. By default, all time periods are included.

by_reach

Boolean. For channels with reach and frequency. If True , plots the response curve by reach. If False , plots the response curve by frequency.

use_optimal_frequency

If True , uses the optimal frequency to plot the response curves. Defaults to False .

use_kpi

A boolean flag indicating whether to use KPI instead of revenue to generate the response curves. Defaults to False .

confidence_level

Confidence level for prior and posterior credible intervals, represented as a value between zero and one.

batch_size

Returns

An xarray.Dataset containing the data needed to visualize response curves.

`rhat_summary`

View source

rhat_summary(bad_rhat_threshold: float = 1.2) -> pd.DataFrame

Computes a summary of the R-hat values for each parameter in the model.

Summarizes the Gelman & Rubin (1992) potential scale reduction for chain convergence, commonly referred to as R-hat. It is a convergence diagnostic measure that measures the degree to which variance (of the means) between chains exceeds what you would expect if the chains were identically distributed. Values close to 1.0 indicate convergence. R-hat < 1.2 indicates approximate convergence and is a reasonable threshold for many problems (Brooks & Gelman, 1998).
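To make the diagnostic concrete, the following sketch computes the basic (non-split) Gelman and Rubin statistic for one scalar parameter from simulated chains and compares it with the default threshold of 1.2. It is a textbook formulation written in plain Python, offered as an assumption-laden illustration; the library's own R-hat computation may differ in details such as chain splitting.

```python
import math
import random
import statistics


def rhat(chains: list[list[float]]) -> float:
    """Basic potential scale reduction for chains of equal length."""
    n_draws = len(chains[0])
    chain_means = [statistics.fmean(chain) for chain in chains]
    within = statistics.fmean([statistics.variance(chain) for chain in chains])
    between = n_draws * statistics.variance(chain_means)
    pooled = (n_draws - 1) / n_draws * within + between / n_draws
    return math.sqrt(pooled / within)


rng = random.Random(0)
# Four chains sampling the same distribution.
mixed = [[rng.gauss(0.0, 1.0) for _ in range(1000)] for _ in range(4)]
# Four chains where one is stuck around a different mean.
stuck = [[rng.gauss(offset, 1.0) for _ in range(1000)] for offset in (0.0, 0.0, 0.0, 3.0)]

bad_rhat_threshold = 1.2
mixed_is_bad = rhat(mixed) > bad_rhat_threshold
stuck_is_bad = rhat(stuck) > bad_rhat_threshold
```

The chains that share a distribution give a value near 1.0, while the set with one displaced chain exceeds the threshold because the variance between chain means is large relative to the variance within chains.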

References

Andrew Gelman and Donald B. Rubin. Inference from Iterative Simulation Using Multiple Sequences. Statistical Science, 7(4):457-472, 1992.

Stephen P. Brooks and Andrew Gelman. General Methods for Monitoring Convergence of Iterative Simulations. Journal of Computational and Graphical Statistics, 7(4), 1998.

Args

bad_rhat_threshold

The threshold for determining which R-hat values are considered bad.

Returns

A DataFrame with the following columns:

n_params : The number of respective parameters in the model.
avg_rhat : The average R-hat value for the respective parameter.
max_rhat : The maximum R-hat value for the respective parameter.
percent_bad_rhat : The percentage of R-hat values for the respective parameter that are greater than bad_rhat_threshold .
row_idx_bad_rhat : The row indices of the R-hat values that are greater than bad_rhat_threshold .
col_idx_bad_rhat : The column indices of the R-hat values that are greater than bad_rhat_threshold .

Raises

NotFittedModelError

If self.sample_posterior() is not called before calling this method.

ValueError

If the number of dimensions of the R-hat array for a parameter is not 1 or 2 .

`roi`

View source

roi(
    use_posterior: bool = True,
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    use_kpi: bool = False,
    batch_size: int = constants.DEFAULT_BATCH_SIZE
) -> meridian.backend.Tensor

Calculates ROI prior or posterior distribution for each media channel.

The ROI numerator is the change in expected outcome ( kpi or kpi * revenue_per_kpi ) when one channel's spend is set to zero, leaving all other channels' spend unchanged. The ROI denominator is the total spend of the channel.
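A small worked example may help with the counterfactual in this definition. The additive square-root response, the baseline, and the channel names and spend values below are hypothetical; the sketch only demonstrates setting one channel's spend to zero while leaving the others unchanged, then dividing the change in outcome by that channel's spend.

```python
def expected_outcome(spend_by_channel: dict[str, float]) -> float:
    """Hypothetical model: a fixed baseline plus a concave effect per channel."""
    baseline = 5_000.0
    return baseline + sum(300.0 * spend ** 0.5 for spend in spend_by_channel.values())


def roi(spend_by_channel: dict[str, float], channel: str) -> float:
    # Counterfactual: the selected channel's spend is zero, all others unchanged.
    counterfactual = {**spend_by_channel, channel: 0.0}
    numerator = expected_outcome(spend_by_channel) - expected_outcome(counterfactual)
    return numerator / spend_by_channel[channel]


spend = {"tv": 10_000.0, "search": 2_500.0}
tv_roi = roi(spend, "tv")          # (300 * 100) / 10,000
search_roi = roi(spend, "search")  # (300 * 50) / 2,500
```

With these numbers the smaller channel has the higher ROI, since the hypothetical response saturates as spend grows.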

If new_data=None , this method calculates ROI conditional on the values of the paid media variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example,

new_data = DataTensors(media=new_media, frequency=new_frequency)

If selected_geos or selected_times is specified, then the ROI denominator is the total spend during the selected geos and time periods. An exception will be thrown if the spend of the InputData used to train the model does not have geo and time dimensions. (If the new_data.media_spend and new_data.rf_spend arguments are used with different dimensions than the InputData spend, then an exception will be thrown since this is a likely user error.)

Args

use_posterior

Boolean. If True , then the posterior distribution is calculated. Otherwise, the prior distribution is calculated.

new_data

Optional. DataTensors containing media , media_spend , reach , frequency , rf_spend , and revenue_per_kpi data. If provided, the ROI is calculated using the values of the tensors passed in new_data and the original values of all the remaining tensors. If None , the ROI is calculated using the original values of all the tensors. If any of the tensors in new_data is provided with a different number of time periods than in InputData , then all tensors must be provided with the same number of time periods.

selected_geos

Optional. Contains a subset of geos to include. By default, all geos are included.

selected_times

aggregate_geos

Boolean. If True , the expected revenue is summed over all of the regions.

use_kpi

If False , then revenue is used to calculate the ROI numerator. Otherwise, uses KPI to calculate the ROI numerator.

batch_size

Returns

Tensor of ROI values with dimensions (n_chains, n_draws, n_geos, (n_media_channels + n_rf_channels)). The n_geos dimension is dropped if aggregate_geos=True .

`summary_metrics`

View source

summary_metrics(
    new_data: (meridian.analysis.analyzer.DataTensors | None) = None,
    marginal_roi_by_reach: bool = True,
    marginal_roi_incremental_increase: float = 0.01,
    selected_geos: (Sequence[str] | None) = None,
    selected_times: (Sequence[str] | Sequence[bool] | None) = None,
    aggregate_geos: bool = True,
    aggregate_times: bool = True,
    optimal_frequency: (Sequence[float] | None) = None,
    use_kpi: bool = False,
    confidence_level: float = constants.DEFAULT_CONFIDENCE_LEVEL,
    batch_size: int = constants.DEFAULT_BATCH_SIZE,
    include_non_paid_channels: bool = False,
    non_media_baseline_values: (Sequence[float] | None) = None
) -> xr.Dataset

Returns summary metrics.

If new_data=None , this method calculates all the metrics conditional on the values of the data variables that the Meridian object was initialized with. The user can also override this historical data through the new_data argument. For example, to override the media, frequency, and non-media treatments data variables, the user can pass the following new_data argument:

new_data = DataTensors(media=new_media, frequency=new_frequency, non_media_treatments=new_non_media_treatments)

Note that if new_data is provided with a different number of time periods than in InputData , pct_of_contribution is not defined because expected_outcome() is not defined for new time periods.

Note that mroi and effectiveness metrics are not defined ( math.nan ) for the aggregate "All Paid Channels" channel dimension.

Args

new_data

Optional DataTensors object with optional new tensors: media , media_spend , reach , frequency , rf_spend , organic_media , organic_reach , organic_frequency , non_media_treatments , controls , revenue_per_kpi . If provided, the summary metrics are calculated using the values of the tensors passed in new_data and the original values of all the remaining tensors. If None , the summary metrics are calculated using the original values of all the tensors. If new_data is provided with a different number of time periods than in InputData , then all tensors, except controls , must have the same number of time periods.

marginal_roi_by_reach

Boolean. Marginal ROI (mROI) is defined as the return on the next dollar spent. If this argument is True , the assumption is that the next dollar spent only impacts reach, holding frequency constant. If this argument is False , the assumption is that the next dollar spent only impacts frequency, holding reach constant. Used only when include_non_paid_channels is False .

marginal_roi_incremental_increase

Small fraction by which each channel's spend is increased when calculating its mROI numerator. The mROI denominator is this fraction of the channel's total spend. Used only when include_non_paid_channels is False .

selected_geos

Optional list containing a subset of geos to include. By default, all geos are included.

selected_times

aggregate_geos

Boolean. If True , the expected outcome is summed over all of the regions.

aggregate_times

Boolean. If True , the expected outcome is summed over all of the time periods. Note that if False , ROI, mROI, Effectiveness, and CPIK are not reported because they do not have a clear interpretation by time period.

optimal_frequency

use_kpi

Boolean. If True , the summary metrics are calculated using KPI. If False , the metrics are calculated using revenue.

confidence_level

Confidence level for summary metrics credible intervals, represented as a value between zero and one.

batch_size

include_non_paid_channels

Boolean. If True , non-paid channels (organic media, organic reach and frequency, and non-media treatments) are included in the summary but only the metrics independent of spend are reported. If False , only the paid channels (media, reach and frequency) are included but the summary contains also the metrics dependent on spend. Default: False .

non_media_baseline_values

Returns

An xr.Dataset with coordinates: channel , metric ( mean , median , ci_low , ci_high ), distribution (prior, posterior) and contains the following non-paid data variables: incremental_outcome , pct_of_contribution , effectiveness , and the following paid data variables: impressions , pct_of_impressions , spend , pct_of_spend , CPM , roi , mroi , cpik . The paid data variables are only included when include_non_paid_channels is False . Note that roi , mroi , cpik , and effectiveness metrics are not reported when aggregate_times=False because they do not have a clear interpretation by time period.
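The metric coordinate can be pictured as a reduction over draws. The sketch below summarizes a list of hypothetical draws into a mean, a median, and an equal-tailed interval for a given confidence level. The equal-tailed interval, the linear-interpolation quantile, the 0.9 level, and the draw values are all assumptions made for illustration and are not confirmed by this page.

```python
import statistics


def summarize(draws: list[float], confidence_level: float = 0.9) -> dict[str, float]:
    """Mean, median, and equal-tailed interval bounds for a list of draws."""
    ordered = sorted(draws)

    def quantile(q: float) -> float:
        # Linear interpolation between neighboring order statistics.
        position = q * (len(ordered) - 1)
        lower = int(position)
        upper = min(lower + 1, len(ordered) - 1)
        return ordered[lower] + (position - lower) * (ordered[upper] - ordered[lower])

    tail = (1.0 - confidence_level) / 2.0
    return {
        "mean": statistics.fmean(ordered),
        "median": statistics.median(ordered),
        "ci_low": quantile(tail),
        "ci_high": quantile(1.0 - tail),
    }


# Hypothetical draws of a metric: the values 0.0, 1.0, ..., 100.0.
metric_draws = [float(i) for i in range(101)]
summary = summarize(metric_draws, confidence_level=0.9)
```

For these evenly spaced draws, a 0.9 level leaves roughly 5% of the draws below the lower bound and 5% above the upper bound.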
