Transformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
This message has oneof fields (mutually exclusive fields). For each oneof, at most one member field can be set at the same time. Setting any member of the oneof automatically clears all other members.
See https://proto-plus-python.readthedocs.io/en/stable/fields.html#oneofs-mutually-exclusive-fields
Attributes
Inheritance
builtins.object > proto.message.Message > Transformation
Classes
AutoTransformation
AutoTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline infers the appropriate transformation from the statistics of the dataset.
CategoricalArrayTransformation
CategoricalArrayTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Treats the column as a categorical array and performs the following transformation functions.
- For each element in the array, convert the category name to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
- Empty arrays are treated as an embedding of zeroes.
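The mean-combination step can be sketched in plain Python. This is only an illustration under stated assumptions: the names `mean_embedding`, `vocab`, and `table` are hypothetical, and the real pipeline learns its embeddings during training rather than reading them from a fixed table.

```python
from typing import Dict, List, Sequence


def mean_embedding(
    categories: Sequence[str],
    vocab: Dict[str, int],
    table: List[List[float]],
) -> List[float]:
    """Average the embeddings of all categories in one array cell."""
    dim = len(table[0])
    if not categories:
        # Empty arrays map to an all-zero embedding.
        return [0.0] * dim
    total = [0.0] * dim
    for name in categories:
        row = table[vocab[name]]  # dictionary lookup index -> embedding row
        total = [t + r for t, r in zip(total, row)]
    return [t / len(categories) for t in total]


# Hypothetical two-category vocabulary with 2-dimensional embeddings.
vocab = {"red": 0, "green": 1}
table = [[1.0, 0.0], [0.0, 1.0]]
print(mean_embedding(["red", "green"], vocab, table))
print(mean_embedding([], vocab, table))
```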
CategoricalTransformation
CategoricalTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline performs the following transformation functions.
- The categorical string is used as is, with no change to case, punctuation, spelling, tense, and so on.
- Convert the category name to a dictionary lookup index and generate an embedding for each index.
- Categories that appear fewer than 5 times in the training dataset are treated as the "unknown" category. The "unknown" category gets its own special lookup index and resulting embedding.
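A minimal sketch of the vocabulary-building rule described above, assuming a simple frequency count over the training column. The helper names, the `<unknown>` token, and the choice of index 0 for it are hypothetical and are not part of the library.

```python
from collections import Counter
from typing import Dict, Iterable

UNKNOWN = "<unknown>"  # hypothetical name for the "unknown" category
MIN_COUNT = 5          # threshold stated in the description above


def build_vocab(values: Iterable[str], min_count: int = MIN_COUNT) -> Dict[str, int]:
    """Assign a lookup index to every category seen at least min_count times."""
    counts = Counter(values)
    vocab = {UNKNOWN: 0}  # the unknown category gets its own index
    for name in sorted(counts):
        if counts[name] >= min_count:
            vocab[name] = len(vocab)
    return vocab


def lookup(name: str, vocab: Dict[str, int]) -> int:
    """Strings are matched as is; anything not in the vocabulary is unknown."""
    return vocab.get(name, vocab[UNKNOWN])


train = ["cat"] * 6 + ["dog"] * 5 + ["emu"] * 4
vocab = build_vocab(train)
print(vocab)
print(lookup("emu", vocab), lookup("Cat", vocab))
```

Because strings are compared without normalization, `"Cat"` and `"cat"` count as different categories in this sketch.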
NumericArrayTransformation
NumericArrayTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Treats the column as a numerical array and performs the following transformation functions.
- All transformations for Numerical types are applied to the average of all the elements.
- The average of empty arrays is treated as zero.
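The reduction from an array to a single number can be sketched as follows. The function name `array_average` is hypothetical; the result would then go through the same steps as a scalar Numerical value.

```python
from typing import Sequence


def array_average(values: Sequence[float]) -> float:
    """Reduce a numeric array cell to its mean; empty arrays average to zero."""
    return sum(values) / len(values) if values else 0.0


print(array_average([1.0, 2.0, 3.0]))
print(array_average([]))
```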
NumericTransformation
NumericTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline performs the following transformation functions.
- The value converted to float32.
- The z_score of the value.
- log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
- z_score of log(value+1) when the value is greater than or equal to 0. Otherwise, this transformation is not applied and the value is considered a missing value.
- A boolean value that indicates whether the value is valid.
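The five derived features listed above can be approximated with the standard library. This is a sketch under several assumptions that the description does not confirm: statistics come from a non-empty list of training values, the standard deviation is the population one, and "valid" means a finite, non-missing number. All function and key names are hypothetical.

```python
import math
import struct
from statistics import mean, pstdev
from typing import Dict, List, Optional


def to_float32(x: float) -> float:
    """Round-trip through a 4-byte float to mimic float32 precision."""
    return struct.unpack("f", struct.pack("f", x))[0]


def z_score(x: float, mu: float, sigma: float) -> float:
    # Assumption: a zero standard deviation yields a z-score of 0.
    return (x - mu) / sigma if sigma else 0.0


def numeric_features(value: Optional[float], train: List[float]) -> Dict[str, object]:
    """Derive the five features for one value, given training values."""
    valid = value is not None and math.isfinite(value)
    if not valid:
        return {"float32": None, "z": None, "log1p": None,
                "z_log1p": None, "valid": False}
    # The log transforms only apply to values >= 0.
    logs = [math.log1p(v) for v in train if v >= 0]
    log_val = math.log1p(value) if value >= 0 else None
    z_log = None
    if log_val is not None and logs:
        z_log = z_score(log_val, mean(logs), pstdev(logs))
    return {
        "float32": to_float32(value),
        "z": z_score(value, mean(train), pstdev(train)),
        "log1p": log_val,    # None stands for "missing"
        "z_log1p": z_log,
        "valid": True,
    }


train = [0.0, 1.0, 2.0, 3.0]
print(numeric_features(3.0, train))
print(numeric_features(-1.0, train))
```

A negative input such as `-1.0` still gets a float32 value and a z-score in this sketch, while both log-based features are reported as missing.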
TextArrayTransformation
TextArrayTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
Treats the column as a text array and performs the following transformation functions.
- Concatenate all text values in the array into a single text value using a space (" ") as a delimiter, and then treat the result as a single text value. Apply the transformations for Text columns.
- Empty arrays are treated as an empty text.
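The concatenation step amounts to a space-delimited join. The helper name `join_text_array` below is hypothetical.

```python
from typing import Sequence


def join_text_array(values: Sequence[str]) -> str:
    """Join all text values with a single space; an empty array gives ""."""
    return " ".join(values)


print(join_text_array(["red shoes", "size 9"]))
print(repr(join_text_array([])))
```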
TextTransformation
TextTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline performs the following transformation functions.
- The text is used as is, with no change to case, punctuation, spelling, tense, and so on.
- Tokenize the text into words. Convert each word to a dictionary lookup index and generate an embedding for each index. Combine the embeddings of all elements into a single embedding using the mean.
- Tokenization is based on Unicode script boundaries.
- Missing values get their own lookup index and resulting embedding.
- Stop-words receive no special treatment and are not removed.
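The token-to-index step can be illustrated roughly as below. Note the stand-in: the sketch splits on runs of word characters with a regular expression, which is not the Unicode-script-boundary tokenizer the description refers to. The reserved `MISSING_INDEX` and the helper names are hypothetical.

```python
import re
from typing import Dict, List, Optional

MISSING_INDEX = 0  # hypothetical reserved index for missing values


def tokenize(text: str) -> List[str]:
    # Stand-in tokenizer: runs of word characters, NOT script boundaries.
    return re.findall(r"\w+", text)


def to_indices(text: Optional[str], vocab: Dict[str, int]) -> List[int]:
    """Map each token to a lookup index, growing the vocabulary as needed."""
    if text is None:
        return [MISSING_INDEX]
    # Case is preserved and stop-words are kept, so "The" and "the" differ.
    return [vocab.setdefault(tok, len(vocab) + 1) for tok in tokenize(text)]


vocab: Dict[str, int] = {}
print(to_indices("The cat and the cat", vocab))
print(to_indices(None, vocab))
```

The resulting indices would then be embedded and averaged into a single vector, as the second bullet describes.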
TimestampTransformation
TimestampTransformation(mapping=None, *, ignore_unknown_fields=False, **kwargs)
The training pipeline performs the following transformation functions.
- Apply the transformation functions for Numerical columns.
- Determine the year, month, day, and weekday. Treat each value from the timestamp as a Categorical column.
- Invalid numerical values (for example, values that fall outside of a typical timestamp range, or are extreme values) receive no special treatment and are not removed.
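Extracting the calendar parts can be sketched with the standard `datetime` module. The assumptions here are that the input is a Unix timestamp in seconds, that it is interpreted in UTC, and that the weekday is encoded with Monday as 0. The description does not specify any of these, and the function name is hypothetical.

```python
from datetime import datetime, timezone
from typing import Dict


def timestamp_parts(unix_seconds: float) -> Dict[str, str]:
    """Split a timestamp into parts that can be fed to a categorical encoder."""
    dt = datetime.fromtimestamp(unix_seconds, tz=timezone.utc)
    return {
        "year": str(dt.year),
        "month": str(dt.month),
        "day": str(dt.day),
        "weekday": str(dt.weekday()),  # Monday is 0, Sunday is 6
    }


print(timestamp_parts(0))
```

Each returned string would be treated as a separate categorical value, while the raw timestamp itself still goes through the Numerical transformations.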