Reads data from a CSV file.
Inherits From: InputDataLoader
meridian.data.load.CsvDataLoader(
    csv_path: str,
    coord_to_columns: meridian.data.load.CoordToColumns,
    kpi_type: str,
    media_to_channel: (Mapping[str, str] | None) = None,
    media_spend_to_channel: (Mapping[str, str] | None) = None,
    reach_to_channel: (Mapping[str, str] | None) = None,
    frequency_to_channel: (Mapping[str, str] | None) = None,
    rf_spend_to_channel: (Mapping[str, str] | None) = None,
    organic_reach_to_channel: (Mapping[str, str] | None) = None,
    organic_frequency_to_channel: (Mapping[str, str] | None) = None
)
This class reads input data from a CSV file. The coord_to_columns attribute stores a mapping from the target InputData coordinates and array names to the CSV column names, if they are different. The fields are:
- geo, time, kpi, revenue_per_kpi, population (single column)
- controls (multiple columns, optional)
- (1) media, media_spend (multiple columns)
- (2) reach, frequency, rf_spend (multiple columns)
- non_media_treatments (multiple columns, optional)
- organic_media (multiple columns, optional)
- organic_reach, organic_frequency (multiple columns, optional)
The CSV file must include either (1) or (2), but doesn't need to include both.
Internally, this class reads the CSV file into a Pandas DataFrame and then loads the data using DataFrameDataLoader.
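The requirement that the data include group (1) or group (2) can be checked before building the loader. The following is a minimal sketch in plain Python. The helper name has_required_media_fields and the use of a plain dict in place of a CoordToColumns object are assumptions made for illustration and are not part of the Meridian API.

```python
# Hypothetical helper, not part of Meridian: checks the "(1) or (2)" rule
# on a plain dict whose keys mirror the CoordToColumns field names.
MEDIA_GROUP = ("media", "media_spend")          # group (1)
RF_GROUP = ("reach", "frequency", "rf_spend")   # group (2)


def has_required_media_fields(columns: dict) -> bool:
    """Returns True if every field of group (1) or of group (2) is non-empty."""
    def complete(group):
        return all(columns.get(name) for name in group)

    return complete(MEDIA_GROUP) or complete(RF_GROUP)


media_only = {
    "media": ["impressions_tv"],
    "media_spend": ["spend_tv"],
}
rf_incomplete = {
    "reach": ["reach_fb"],
    "frequency": ["frequency_fb"],
    # rf_spend is missing, so group (2) is incomplete.
}

print(has_required_media_fields(media_only))     # True
print(has_required_media_fields(rf_incomplete))  # False
```

The sketch treats a group as present only when all of its fields are supplied, which is an assumption about how the groups are meant to be read.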
Args
csv_path: The path to the CSV file to read. The data must satisfy one of the following conditions:
- There are no gaps in the data.
- For up to max_lag initial periods, there is only media data, and the cells are empty in all data columns other than media, reach, frequency, organic_media, organic_reach, and organic_frequency (that is, kpi, revenue_per_kpi, media_spend, rf_spend, controls, population, and non_media_treatments).
coord_to_columns: A CoordToColumns object that maps the InputData coordinates and array names to the names of the columns (or lists of columns) in the CSV file. Example:
coord_to_columns = CoordToColumns(
    geo='dmas',
    time='dates',
    kpi='revenue',
    revenue_per_kpi='revenue_per_conversions',
    media=['impressions_tv', 'impressions_yt', 'impressions_search'],
    media_spend=['spend_tv', 'spend_yt', 'spend_search'],
    reach=['reach_fb'],
    frequency=['frequency_fb'],
    rf_spend=['rf_spend_fb'],
    controls=['control_income'],
    population='population',
    non_media_treatments=['price', 'discount'],
    organic_media=['organic_impressions_blog'],
    organic_reach=['organic_reach_newsletter'],
    organic_frequency=['organic_frequency_newsletter'],
)
kpi_type: Whether the KPI is of a 'revenue' or 'non-revenue' type. When kpi_type is 'non-revenue' and revenue_per_kpi exists, ROI calibration is used and the analysis is run on revenue. When revenue_per_kpi doesn't exist for the same kpi_type, custom ROI calibration is used and the analysis is run on KPI.
media_to_channel: A mapping from the column names of the media data in the CSV file to the desired channel names, which are the same as for the media_spend data. Example:
media_to_channel = {
    'media_tv': 'tv', 'media_yt': 'yt', 'media_fb': 'fb'
}
media_spend_to_channel: A mapping from the column names of the media_spend data in the CSV file to the desired channel names, which are the same as for the media data. Example:
media_spend_to_channel = {
    'spend_tv': 'tv', 'spend_yt': 'yt', 'spend_fb': 'fb'
}
reach_to_channel: A mapping from the column names of the reach data in the CSV file to the desired channel names, which are the same as for the rf_spend data. Example:
reach_to_channel = {
    'reach_tv': 'tv', 'reach_yt': 'yt', 'reach_fb': 'fb'
}
frequency_to_channel: A mapping from the column names of the frequency data in the CSV file to the desired channel names, which are the same as for the rf_spend data. Example:
frequency_to_channel = {
    'frequency_tv': 'tv', 'frequency_yt': 'yt', 'frequency_fb': 'fb'
}
rf_spend_to_channel: A mapping from the column names of the rf_spend data in the CSV file to the desired channel names, which are the same as for the reach and frequency data. Example:
rf_spend_to_channel = {
    'rf_spend_tv': 'tv', 'rf_spend_yt': 'yt', 'rf_spend_fb': 'fb'
}
organic_reach_to_channel: A mapping from the column names of the organic_reach data in the CSV file to the desired channel names, which are the same as for the organic_frequency data. Example:
organic_reach_to_channel = {
    'organic_reach_newsletter': 'newsletter',
}
organic_frequency_to_channel: A mapping from the column names of the organic_frequency data in the CSV file to the desired channel names, which are the same as for the organic_reach data. Example:
organic_frequency_to_channel = {
    'organic_frequency_newsletter': 'newsletter',
}
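Each of the *_to_channel arguments above is expected to resolve to the same channel names as its counterpart (for example, media_to_channel and media_spend_to_channel). A minimal sketch of a consistency check follows. The helper channels_match is hypothetical and not part of the Meridian API; it only compares the sets of channel names that two mappings produce.

```python
# Hypothetical helper, not part of Meridian: verifies that two
# *_to_channel mappings resolve to the same set of channel names.
def channels_match(first: dict, second: dict) -> bool:
    return set(first.values()) == set(second.values())


media_to_channel = {"media_tv": "tv", "media_yt": "yt", "media_fb": "fb"}
media_spend_to_channel = {"spend_tv": "tv", "spend_yt": "yt", "spend_fb": "fb"}
# "youtube" does not match the "yt" channel used by the media mapping.
mismatched_spend = {"spend_tv": "tv", "spend_yt": "youtube", "spend_fb": "fb"}

print(channels_match(media_to_channel, media_spend_to_channel))  # True
print(channels_match(media_to_channel, mismatched_spend))        # False
```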
Methods
load
load() -> meridian.data.input_data.InputData
Reads data from a CSV file and returns an InputData object.
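Before calling load, it can help to confirm that the CSV file satisfies the second csv_path condition (a limited number of initial media-only periods). The sketch below uses only the Python standard library. The function count_media_only_periods, the sample column names, and the max_lag value are assumptions for illustration; the rows are assumed to belong to a single geo and to be in time order.

```python
import csv
import io

# Columns that may be empty during the initial media-only periods.
# These names are assumptions about the CSV layout.
NON_MEDIA_COLUMNS = ["revenue", "spend_tv", "population"]


def count_media_only_periods(rows, non_media_columns):
    """Counts leading rows in which every non-media cell is empty.

    Raises ValueError if a leading row is only partially empty or if a
    later row has an empty non-media cell.
    """
    leading = 0
    in_prefix = True
    for index, row in enumerate(rows):
        empty = [row[col] == "" for col in non_media_columns]
        if in_prefix and all(empty):
            leading += 1
            continue
        in_prefix = False
        if any(empty):
            raise ValueError(f"Gap in non-media data in row {index}")
    return leading


sample_csv = """dates,impressions_tv,revenue,spend_tv,population
2024-01-01,100,,,
2024-01-08,120,,,
2024-01-15,90,50.0,10.0,1000
2024-01-22,110,55.0,12.0,1000
"""

rows = list(csv.DictReader(io.StringIO(sample_csv)))
media_only_periods = count_media_only_periods(rows, NON_MEDIA_COLUMNS)
max_lag = 8  # Assumed value; use the max_lag of your model specification.
print(media_only_periods, media_only_periods <= max_lag)  # 2 True
```

A file with several geos would need the same check applied to each geo separately.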



