meridian.data.load.CsvDataLoader

Reads data from a CSV file.

Inherits From: InputDataLoader

This class reads input data from a CSV file. The coord_to_columns attribute stores a mapping from target InputData coordinates and array names to the CSV column names, if they are different. The fields are:

  • geo, time, kpi, revenue_per_kpi, population (single column)
  • controls (multiple columns, optional)
  • (1) media, media_spend (multiple columns)
  • (2) reach, frequency, rf_spend (multiple columns)
  • non_media_treatments (multiple columns, optional)
  • organic_media (multiple columns, optional)
  • organic_reach, organic_frequency (multiple columns, optional)

The DataFrame must include either (1) or (2), but doesn't need to include both.
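
The either-or requirement can be checked before any data is loaded. The following is a minimal sketch and not part of Meridian: the helper name `has_required_treatments` and the plain-dict stand-in for the column mapping are assumptions made for illustration, as is the reading that a group counts only when all of its fields are present.

```python
def has_required_treatments(mapping: dict) -> bool:
    """Return True if the mapping fully provides group (1) or group (2)."""
    media_group = ("media", "media_spend")         # group (1)
    rf_group = ("reach", "frequency", "rf_spend")  # group (2)

    def is_complete(group) -> bool:
        # A field counts as present only if it maps to at least one column.
        return all(mapping.get(field) for field in group)

    return is_complete(media_group) or is_complete(rf_group)
```

A mapping that supplies only `media` without `media_spend` is rejected under this reading, while a mapping that supplies both groups is accepted.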

Internally, this class reads the CSV file into a Pandas DataFrame and then loads the data using DataFrameDataLoader.

csv_path
The path to the CSV file to read from. One of the following conditions is required:

  • There are no gaps in the data.
  • For up to max_lag initial periods, only media data is present: the cells are empty in every data column other than media, reach, frequency, organic_media, organic_reach, and organic_frequency (that is, in kpi, revenue_per_kpi, media_spend, rf_spend, controls, population, and non_media_treatments).
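
The two conditions above can be sketched as a pre-flight check on the raw CSV text. This is an illustrative sketch using only the standard library, not Meridian's own validation: the function name `gaps_are_allowed`, its parameters, and the simplifying assumption of a single geo with rows already sorted by time are all hypothetical.

```python
import csv
import io


def gaps_are_allowed(csv_text, exposure_columns, max_lag, key_columns=("geo", "time")):
    """Check the gap conditions for a single-geo CSV whose rows are sorted by time.

    exposure_columns: the media-like columns (media, reach, frequency, organic_*).
    Every column that is neither an exposure column nor a key column is treated
    as one of the remaining data columns (kpi, spend, controls, and so on).
    """
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    exposure = set(exposure_columns)
    keys = set(key_columns)

    def classify(row):
        exposure_cells = [v for c, v in row.items() if c in exposure]
        other_cells = [v for c, v in row.items() if c not in exposure and c not in keys]
        if all(v != "" for v in exposure_cells + other_cells):
            return "full"
        if all(v != "" for v in exposure_cells) and all(v == "" for v in other_cells):
            return "media_only"
        return "invalid"

    kinds = [classify(row) for row in rows]
    if "invalid" in kinds:
        return False

    # Count the leading periods that carry media data only.
    n_leading = 0
    for kind in kinds:
        if kind != "media_only":
            break
        n_leading += 1

    # Allowed: at most max_lag leading media-only periods, then no gaps at all.
    return n_leading <= max_lag and all(kind == "full" for kind in kinds[n_leading:])
```

With zero leading media-only periods the check reduces to the first condition (no gaps anywhere).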
coord_to_columns
A CoordToColumns object whose fields are the desired coordinates of the InputData and whose values are the current names of columns (or lists of columns) in the CSV file. Example:
 coord_to_columns = CoordToColumns(
    geo='dmas',
    time='dates',
    kpi='revenue',
    revenue_per_kpi='revenue_per_conversions',
    media=['impressions_tv', 'impressions_yt', 'impressions_search'],
    media_spend=['spend_tv', 'spend_yt', 'spend_search'],
    reach=['reach_fb'],
    frequency=['frequency_fb'],
    rf_spend=['rf_spend_fb'],
    controls=['control_income'],
    population='population',
    non_media_treatments=['price', 'discount'],
    organic_media=['organic_impressions_blog'],
    organic_reach=['organic_reach_newsletter'],
    organic_frequency=['organic_frequency_newsletter'],
) 

kpi_type
A string denoting whether the KPI is of a 'revenue' or 'non-revenue' type. When kpi_type is 'non-revenue' and revenue_per_kpi exists, ROI calibration is used and the analysis is run on revenue. When kpi_type is 'non-revenue' and revenue_per_kpi doesn't exist, custom ROI calibration is used and the analysis is run on KPI.
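
The two 'non-revenue' cases can be summarized as a small decision helper. This is only a restatement of the description above in code; the function name `describe_non_revenue_analysis` is hypothetical and nothing here calls into Meridian.

```python
def describe_non_revenue_analysis(has_revenue_per_kpi: bool) -> tuple:
    """Return (calibration, outcome) for a 'non-revenue' KPI, per the text above."""
    if has_revenue_per_kpi:
        # revenue_per_kpi converts KPI units to revenue, so results are in revenue.
        return ("ROI calibration", "revenue")
    # Without revenue_per_kpi the analysis stays in KPI units.
    return ("custom ROI calibration", "KPI")
```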
media_to_channel
A dictionary whose keys are the actual column names for media data in the CSV file and values are the desired channel names, the same as for the media_spend data. Example:

 media_to_channel = {
    'media_tv': 'tv', 'media_yt': 'yt', 'media_fb': 'fb'
} 

media_spend_to_channel
A dictionary whose keys are the actual column names for media_spend data in the CSV file and values are the desired channel names, the same as for the media data. Example:

 media_spend_to_channel = {
    'spend_tv': 'tv', 'spend_yt': 'yt', 'spend_fb': 'fb'
} 
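
Because the media and media_spend dictionaries are expected to resolve to the same channel names, a quick consistency check can catch typos before loading. A minimal sketch, assuming plain Python dicts; the helper name `channels_match` is hypothetical and not part of Meridian.

```python
def channels_match(*mappings: dict) -> bool:
    """Return True if every mapping resolves to the same set of channel names."""
    channel_sets = [set(mapping.values()) for mapping in mappings]
    return all(channels == channel_sets[0] for channels in channel_sets)


media_to_channel = {'media_tv': 'tv', 'media_yt': 'yt', 'media_fb': 'fb'}
media_spend_to_channel = {'spend_tv': 'tv', 'spend_yt': 'yt', 'spend_fb': 'fb'}
consistent = channels_match(media_to_channel, media_spend_to_channel)
```

The helper compares sets of channel names, so the order of the dictionary entries does not matter.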

reach_to_channel
A dictionary whose keys are the actual column names for reach data in the dataframe and values are the desired channel names, the same as for the rf_spend data. Example:

 reach_to_channel = {
    'reach_tv': 'tv', 'reach_yt': 'yt', 'reach_fb': 'fb'
} 

frequency_to_channel
A dictionary whose keys are the actual column names for frequency data in the dataframe and values are the desired channel names, the same as for the rf_spend data. Example:

 frequency_to_channel = {
    'frequency_tv': 'tv', 'frequency_yt': 'yt', 'frequency_fb': 'fb'
} 

rf_spend_to_channel
A dictionary whose keys are the actual column names for rf_spend data in the dataframe and values are the desired channel names, the same as for the reach and frequency data. Example:

 rf_spend_to_channel = {
    'rf_spend_tv': 'tv', 'rf_spend_yt': 'yt', 'rf_spend_fb': 'fb'
} 
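
To see how the reach, frequency, and rf_spend dictionaries tie separate columns to one channel, the mappings can be inverted and grouped by channel name. This is an illustrative sketch, not how Meridian processes the dictionaries internally; `columns_by_channel` is a hypothetical helper.

```python
def columns_by_channel(reach_to_channel, frequency_to_channel, rf_spend_to_channel):
    """Group source column names under the channel they map to."""
    grouped = {}
    named_mappings = (
        ("reach", reach_to_channel),
        ("frequency", frequency_to_channel),
        ("rf_spend", rf_spend_to_channel),
    )
    for kind, mapping in named_mappings:
        for column, channel in mapping.items():
            # One entry per channel, holding the column used for each kind of data.
            grouped.setdefault(channel, {})[kind] = column
    return grouped
```

A channel whose entry lacks one of the three keys points to a column that is missing from the corresponding dictionary.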

organic_reach_to_channel
A dictionary whose keys are the actual column names for organic_reach data in the dataframe and values are the desired channel names, the same as for the organic_frequency data. Example:

 organic_reach_to_channel = {
    'organic_reach_newsletter': 'newsletter',
} 

organic_frequency_to_channel
A dictionary whose keys are the actual column names for organic_frequency data in the dataframe and values are the desired channel names, the same as for the organic_reach data. Example:

 organic_frequency_to_channel = {
    'organic_frequency_newsletter': 'newsletter',
} 

Methods

load


Reads data from a CSV file and returns an InputData object.
