meridian.model.eda.eda_engine.EDAEngine

Meridian EDA Engine.

all_freq_da
A DataArray containing all frequency data.

This includes both paid and organic frequency, concatenated along the RF_CHANNEL dimension.

all_reach_scaled_da
A DataArray containing all scaled reach data.

This includes both paid and organic reach, concatenated along the RF_CHANNEL dimension.

all_spend_ds
A Dataset containing all spend data.

This includes media spend and RF spend.

controls_and_non_media_scaled_ds
A Dataset of scaled controls and non-media treatments.
controls_scaled_da
The scaled controls data array.
frequency_da
The frequency data array.
geo_population_da
The geo population data array.
kpi_has_variability
Whether the KPI has variability across geos and times.
kpi_scaled_da
The scaled KPI data array.
media_raw_da
The raw media data array.
media_scaled_da
The scaled media data array.
media_spend_da
The media spend data.

If the input spend is aggregated, it is allocated across geo and time proportionally to media units.
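
The proportional allocation described above can be illustrated with a minimal NumPy sketch. This is not the engine's implementation: the function name `allocate_spend`, the array shapes, and the handling of channels with zero media units are assumptions made for illustration.

```python
import numpy as np


def allocate_spend(total_spend, media_units):
    """Split each channel's total spend across (geo, time) cells.

    total_spend: array of shape (n_channels,), one aggregated value per channel.
    media_units: array of shape (n_geos, n_times, n_channels).
    Returns an array shaped like media_units.
    """
    total_spend = np.asarray(total_spend, dtype=float)
    media_units = np.asarray(media_units, dtype=float)
    # Total media units per channel, summed over geo and time.
    units_per_channel = media_units.sum(axis=(0, 1))
    # Each cell's share of its channel's units; channels with no units get 0
    # (an assumption of this sketch).
    share = np.divide(
        media_units,
        units_per_channel,
        out=np.zeros_like(media_units),
        where=units_per_channel > 0,
    )
    return share * total_spend


# Hypothetical data: 2 geos x 2 time periods x 2 channels.
units = np.array([[[10.0, 1.0], [30.0, 1.0]],
                  [[20.0, 1.0], [40.0, 1.0]]])
spend = allocate_spend([100.0, 50.0], units)
print(spend.sum(axis=(0, 1)))  # per-channel totals match the input spend
```

Because each channel's shares sum to one, the allocated cells add back up to the aggregated input for every channel that has at least one media unit.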

national_all_freq_da
A DataArray containing all national-level frequency data.

This includes both paid and organic frequency, concatenated along the RF_CHANNEL dimension.

national_all_reach_scaled_da
A DataArray containing all national-level scaled reach data.

This includes both paid and organic reach, concatenated along the RF_CHANNEL dimension.

national_all_spend_ds
A Dataset containing all national spend data.

This includes media spend and RF spend.

national_controls_and_non_media_scaled_ds
A Dataset of national scaled controls and non-media treatments.
national_controls_scaled_da
The national scaled controls data array.
national_frequency_da
The national frequency data array.
national_kpi_scaled_da
The national scaled KPI data array.
national_media_raw_da
The national raw media data array.
national_media_scaled_da
The national scaled media data array.
national_media_spend_da
The national media spend data array.
national_non_media_scaled_da
The national scaled non-media treatment data array.
national_organic_frequency_da
The national organic frequency data array.
national_organic_media_raw_da
The national raw organic media data array.
national_organic_media_scaled_da
The national scaled organic media data array.
national_organic_reach_raw_da
The national raw organic reach data array.
national_organic_reach_scaled_da
The national scaled organic reach data array.
national_organic_rf_impressions_raw_da
The national raw organic RF impressions data array.
national_organic_rf_impressions_scaled_da
The national scaled organic RF impressions data array.
national_paid_raw_media_units_ds

national_reach_raw_da
The national raw reach data array.
national_reach_scaled_da
The national scaled reach data array.
national_rf_impressions_raw_da
The national raw RF impressions data array.
national_rf_impressions_scaled_da
The national scaled RF impressions data array.
national_rf_spend_da
The national RF spend data array.
national_treatment_control_scaled_ds
A Dataset containing all scaled treatments and controls.

This includes media, RF impressions, organic media, organic RF impressions, non-media treatments, and control variables, all at the national level.

national_treatments_without_non_media_scaled_ds
A Dataset of national scaled treatments excluding non-media.
non_media_scaled_da
The scaled non-media treatments data array.
organic_frequency_da
The organic frequency data array.
organic_media_raw_da
The raw organic media data array.
organic_media_scaled_da
The scaled organic media data array.
organic_reach_raw_da
The raw organic reach data array.
organic_reach_scaled_da
The scaled organic reach data array.
organic_rf_impressions_raw_da
The raw organic RF impressions data array.
organic_rf_impressions_scaled_da
The scaled organic RF impressions data array.
paid_raw_media_units_ds

reach_raw_da
The raw reach data array.
reach_scaled_da
The scaled reach data array.
rf_impressions_raw_da
The raw RF impressions data array.
rf_impressions_scaled_da
The scaled RF impressions data array.
rf_spend_da
The RF spend data.

If the input spend is aggregated, it is allocated across geo and time proportionally to RF impressions (reach * frequency).
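
As a small worked example of the rule above, the following sketch allocates one aggregated RF spend value across geo and time cells in proportion to reach * frequency. The numbers and variable names are hypothetical, and the real property may handle edge cases (such as zero impressions) differently.

```python
import numpy as np

# Hypothetical inputs for a single RF channel: 2 geos x 2 time periods.
reach = np.array([[100.0, 200.0], [300.0, 400.0]])
frequency = np.array([[2.0, 1.5], [1.0, 0.5]])
total_rf_spend = 500.0

# RF impressions are reach multiplied by average frequency.
impressions = reach * frequency
# Each cell receives spend in proportion to its share of impressions.
rf_spend = total_rf_spend * impressions / impressions.sum()
print(rf_spend)
```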

spec
The EDA specification.
treatment_control_scaled_ds
A Dataset containing all scaled treatments and controls.

This includes media, RF impressions, organic media, organic RF impressions, non-media treatments, and control variables, all at the geo level.

treatments_without_non_media_scaled_ds
A Dataset of scaled treatments excluding non-media.

Methods

check_cost_per_media_unit

Checks if the cost per media unit is valid.

This function checks for the following conditions:

1. cost == 0 and media unit > 0.
2. cost > 0 and media unit == 0.
3. cost_per_media_unit has outliers.

Returns
An EDAOutcome object with findings and result values.
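
The three conditions checked by this method can be sketched on plain arrays as follows. This is an illustration only: the function name is hypothetical, and the outlier rule used here (values outside 1.5 times the interquartile range) is an assumption, since the reference does not state how outliers are identified.

```python
import numpy as np


def cost_per_media_unit_flags(cost, units, iqr_multiplier=1.5):
    """Return boolean masks for the three conditions, one entry per cell."""
    cost = np.asarray(cost, dtype=float)
    units = np.asarray(units, dtype=float)

    # Condition 1: media was delivered but no cost was recorded.
    zero_cost_positive_units = (cost == 0) & (units > 0)
    # Condition 2: cost was recorded but no media was delivered.
    positive_cost_zero_units = (cost > 0) & (units == 0)

    # Cost per media unit is only defined where units are positive.
    cpmu = np.full(cost.shape, np.nan)
    valid = units > 0
    cpmu[valid] = cost[valid] / units[valid]

    # Condition 3 (assumed rule): flag values outside the IQR fences.
    q1, q3 = np.nanpercentile(cpmu, [25, 75])
    iqr = q3 - q1
    with np.errstate(invalid="ignore"):
        outliers = (cpmu < q1 - iqr_multiplier * iqr) | (
            cpmu > q3 + iqr_multiplier * iqr
        )
    return zero_cost_positive_units, positive_cost_zero_units, outliers


# Hypothetical flattened (geo, time) cells for one channel.
cost = [10.0, 22.0, 27.0, 0.0, 5.0, 100.0, 12.0, 8.0]
units = [10.0, 20.0, 30.0, 4.0, 0.0, 10.0, 10.0, 10.0]
zero_cost, zero_units, outliers = cost_per_media_unit_flags(cost, units)
print(np.flatnonzero(zero_cost), np.flatnonzero(zero_units), np.flatnonzero(outliers))
```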

check_geo_cost_per_media_unit

Checks if the cost per media unit is valid for geo data.

Returns
An EDAOutcome object with findings and result values.

Raises

GeoLevelCheckOnNationalModelError
If the check is called for a national model.

check_geo_pairwise_corr

Checks pairwise correlation among treatments and controls for geo data.

Returns
An EDAOutcome object with findings and result values.

Raises

GeoLevelCheckOnNationalModelError
If the model is national.

check_geo_std

Checks the standard deviation of the geo-level KPI, treatments, R&F, and controls.

check_geo_vif

Checks geo variance inflation factor among treatments and controls.

The VIF calculation only focuses on multicollinearity among non-constant variables. Any variable with constant values will result in a NaN VIF value.

Returns
An EDAOutcome object with findings and result values.
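
To make the NaN behavior concrete, here is a minimal sketch of a variance inflation factor computation that skips constant variables. It uses the textbook definition VIF_j = 1 / (1 - R_j^2), where R_j^2 comes from regressing variable j on the other non-constant variables. The function name and input layout are assumptions, and the engine's own computation may differ in detail.

```python
import numpy as np


def variance_inflation_factors(data):
    """data: array of shape (n_obs, n_vars). Returns one VIF per variable."""
    data = np.asarray(data, dtype=float)
    n_obs, n_vars = data.shape
    vifs = np.full(n_vars, np.nan)  # constant variables keep NaN
    varying = np.flatnonzero(data.std(axis=0) > 0)
    for j in varying:
        others = [k for k in varying if k != j]
        y = data[:, j]
        # Design matrix: intercept plus the other non-constant variables.
        design = np.column_stack([np.ones(n_obs)] + [data[:, k] for k in others])
        coef, *_ = np.linalg.lstsq(design, y, rcond=None)
        residual = y - design @ coef
        r_squared = 1.0 - (residual @ residual) / ((y - y.mean()) ** 2).sum()
        vifs[j] = 1.0 / (1.0 - r_squared) if r_squared < 1.0 else np.inf
    return vifs


# Hypothetical data: x2 nearly duplicates x1, x3 is independent, x4 is constant.
rng = np.random.default_rng(0)
x1 = rng.normal(size=200)
x2 = x1 + 0.1 * rng.normal(size=200)
x3 = rng.normal(size=200)
x4 = np.ones(200)
vifs = variance_inflation_factors(np.column_stack([x1, x2, x3, x4]))
print(vifs)
```

In this example the two near-duplicate variables receive large VIF values, the independent variable stays close to 1, and the constant variable is reported as NaN.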

check_national_cost_per_media_unit

Checks if the cost per media unit is valid for national data.

Returns
An EDAOutcome object with findings and result values.

check_national_pairwise_corr

Checks pairwise correlation among treatments and controls for national data.

Returns
An EDAOutcome object with findings and result values.

check_national_std

Checks the standard deviation of the national-level KPI, treatments, R&F, and controls.

check_national_vif

Checks national variance inflation factor among treatments and controls.

The VIF calculation only focuses on multicollinearity among non-constant variables. Any variable with constant values will result in a NaN VIF value.

Returns
An EDAOutcome object with findings and result values.

check_overall_kpi_invariability

Checks if the KPI is constant across all geos and times.

check_pairwise_corr

Checks pairwise correlation among treatments and controls.

Returns
An EDAOutcome object with findings and result values.
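
The idea behind a pairwise correlation check can be sketched as follows: compute the Pearson correlation matrix of the variables and list the pairs whose absolute correlation exceeds a threshold. The function name, the variable names, and the 0.9 threshold are assumptions for illustration, not values taken from the engine.

```python
import numpy as np


def high_correlation_pairs(data, names, threshold=0.9):
    """data: array of shape (n_obs, n_vars); names: one label per column."""
    corr = np.corrcoef(np.asarray(data, dtype=float), rowvar=False)
    pairs = []
    for i in range(len(names)):
        for j in range(i + 1, len(names)):
            # Report each unordered pair once, using absolute correlation.
            if abs(corr[i, j]) > threshold:
                pairs.append((names[i], names[j], float(corr[i, j])))
    return pairs


# Hypothetical observations for two media channels and one control.
data = np.array([
    [1.0, 2.0, 5.0],
    [2.0, 4.0, 1.0],
    [3.0, 6.0, 4.0],
    [4.0, 8.0, 2.0],
    [5.0, 11.0, 3.0],
])
pairs = high_correlation_pairs(data, ["tv", "search", "control"])
print(pairs)
```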

check_population_corr_raw_media

Checks Spearman correlation between population and raw media executions.

Calculates the correlation between population and time-averaged raw media executions (paid and organic impressions and reach). These are expected to correlate reasonably highly with population.

Returns
An EDAOutcome object with findings and result values.

Raises

GeoLevelCheckOnNationalModelError
If the model is national or geo population data is missing.
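
The calculation described for this check can be sketched by averaging media over time for each geo and then computing the Spearman rank correlation against population. The helper names and the data below are hypothetical. Spearman correlation is computed here as the Pearson correlation of average ranks.

```python
import numpy as np


def average_ranks(values):
    """Rank values from 1..n, assigning tied values their mean rank."""
    values = np.asarray(values, dtype=float)
    order = np.argsort(values, kind="stable")
    ranks = np.empty(len(values))
    ranks[order] = np.arange(1, len(values) + 1)
    for value in np.unique(values):
        tied = values == value
        ranks[tied] = ranks[tied].mean()
    return ranks


def spearman(x, y):
    """Spearman correlation as the Pearson correlation of the ranks."""
    return float(np.corrcoef(average_ranks(x), average_ranks(y))[0, 1])


# Hypothetical data: 4 geos, 2 time periods of raw impressions per geo.
population = np.array([1000.0, 5000.0, 20000.0, 800.0])
raw_media = np.array([[10.0, 12.0], [40.0, 60.0], [300.0, 100.0], [5.0, 9.0]])

time_averaged = raw_media.mean(axis=1)  # one value per geo
rho = spearman(population, time_averaged)
print(rho)
```

Here larger geos always receive more media on average, so the rank orderings agree exactly and the correlation is at its maximum.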

check_population_corr_scaled_treatment_control

Checks Spearman correlation between population and treatments/controls.

Calculates correlation between population and time-averaged treatments and controls. High correlation for controls or non-media channels may indicate a need for population-scaling. High correlation for other media channels may indicate double-scaling.

Returns
An EDAOutcome object with findings and result values.

Raises

GeoLevelCheckOnNationalModelError
If the model is national or geo population data is missing.

check_std

Checks standard deviation for treatments and controls.

Returns
An EDAOutcome object with findings and result values.

check_variable_geo_time_collinearity

Computes the adjusted R-squared for treatments and controls against geo and time.

These checks apply to the geo-level dataset only.

Returns
An EDAOutcome object containing a VariableGeoTimeCollinearityArtifact. The artifact includes a Dataset with 'rsquared_geo' and 'rsquared_time', showing the adjusted R-squared values for each treatment/control variable when regressed against 'geo' and 'time', respectively. If a variable is constant across geos or times, the corresponding 'rsquared_geo' or 'rsquared_time' value will be NaN.
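
One way to read these values is as the adjusted R-squared from regressing a variable on a single categorical factor (geo or time). The sketch below assumes that interpretation, with the fitted values being group means and the adjustment 1 - (1 - R^2)(n - 1)/(n - n_groups). The function name is hypothetical, and this sketch returns NaN only when the variable has no variance at all, which is a simplification of the NaN rule described above.

```python
import numpy as np


def adjusted_r_squared_by_factor(values, axis):
    """values: array of shape (n_geos, n_times).

    axis=0 treats geo as the factor; axis=1 treats time as the factor.
    """
    values = np.asarray(values, dtype=float)
    n = values.size
    n_groups = values.shape[axis]
    ss_total = ((values - values.mean()) ** 2).sum()
    if ss_total == 0 or n <= n_groups:
        return float("nan")
    # Regressing on one categorical factor yields group means as fitted values.
    group_means = values.mean(axis=1 - axis, keepdims=True)
    ss_residual = ((values - group_means) ** 2).sum()
    r_squared = 1.0 - ss_residual / ss_total
    return 1.0 - (1.0 - r_squared) * (n - 1) / (n - n_groups)


# Hypothetical variable that differs mostly by geo and only slightly by time.
values = np.array([[1.0, 2.0, 3.0], [11.0, 12.0, 13.0]])
rsquared_geo = adjusted_r_squared_by_factor(values, axis=0)
rsquared_time = adjusted_r_squared_by_factor(values, axis=1)
print(rsquared_geo, rsquared_time)
```

A value of `rsquared_geo` near 1 means the variable is almost fully explained by geo alone, which is the kind of collinearity this check is meant to surface.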

check_vif

Computes variance inflation factor among treatments and controls.

The VIF calculation only focuses on multicollinearity among non-constant variables. Any variable with constant values will result in a NaN VIF value.

Returns
An EDAOutcome object with findings and result values.

run_all_critical_checks

Runs all critical EDA checks.

Critical checks are those that can result in EDASeverity.ERROR findings.

Returns
A CriticalCheckEDAOutcomes object containing the results of all critical checks.
