Anomaly detection overview
Anomaly detection is a data mining technique that you can use to identify data deviations in a given dataset. For example, if the return rate for a given product increases substantially from the baseline for that product, that might indicate a product defect or potential fraud. You can use anomaly detection to detect critical incidents, such as technical issues, or opportunities, such as changes in consumer behavior.
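To make the return-rate example concrete, the following minimal Python sketch flags a value that deviates substantially from a product's baseline. It assumes a simple z-score rule with a cutoff of three standard deviations. The function name, the sample rates, and the cutoff are illustrative assumptions and are not part of any BigQuery ML API.

```python
from statistics import mean, stdev

def is_anomalous(baseline, value, threshold=3.0):
    """Return True if `value` lies more than `threshold` sample standard
    deviations away from the mean of the baseline observations."""
    mu = mean(baseline)
    sigma = stdev(baseline)  # sample standard deviation; needs at least 2 points
    if sigma == 0:
        # A perfectly flat baseline: any different value counts as a deviation.
        return value != mu
    return abs(value - mu) / sigma > threshold

# Hypothetical weekly return rates for one product (fraction of units returned).
baseline_rates = [0.020, 0.022, 0.019, 0.021, 0.020, 0.023]

print(is_anomalous(baseline_rates, 0.021))  # close to the usual range
print(is_anomalous(baseline_rates, 0.060))  # far above the baseline
```

A fixed cutoff like this works only when you already know what a normal range looks like, which is the difficulty that the rest of this overview addresses.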
One challenge when you use anomaly detection is determining what counts as
anomalous data. If you have labeled data that identifies anomalies, you can
perform anomaly detection by using the ML.PREDICT function with one of the
following supervised machine learning models:
- Linear and logistic regression models
- Boosted trees models
- Random forest models
- Deep neural network (DNN) models
- Wide & Deep models
- AutoML models
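As a rough illustration of the supervised path, the following Python sketch builds a GoogleSQL query that scores new rows with ML.PREDICT. It assumes that a supervised model has already been trained on labeled data. The helper name and the model and table names are hypothetical placeholders.

```python
def build_predict_query(model, table):
    """Build a GoogleSQL query that scores the rows of `table` with `model`."""
    return (
        "SELECT *\n"
        f"FROM ML.PREDICT(MODEL `{model}`, TABLE `{table}`)"
    )

# Hypothetical model and table names; replace them with your own.
sql = build_predict_query("mydataset.returns_classifier", "mydataset.new_returns")
print(sql)
```

The resulting string can be run in the BigQuery console or passed to a BigQuery client library. For a model trained with a label column, the output includes a column named predicted_<label_column_name>, which you can inspect to find the rows that the model classifies as anomalous.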
If you aren't certain what counts as anomalous data, or you don't have labeled
data to train a model on, you can use unsupervised machine learning to perform
anomaly detection. Use the ML.DETECT_ANOMALIES function with a supported model
to detect anomalies in training data or new serving data.
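As a hedged sketch of the unsupervised path, the following Python snippet builds a GoogleSQL query that calls ML.DETECT_ANOMALIES on new serving data and keeps only the flagged rows. It assumes a model type that accepts a contamination setting (for example, a k-means model). The helper name, the model and table names, and the 0.02 value are hypothetical choices for illustration.

```python
def build_detect_anomalies_query(model, table, contamination=0.02):
    """Build a GoogleSQL query that returns the rows of `table` that
    `model` flags as anomalous."""
    return (
        "SELECT *\n"
        "FROM ML.DETECT_ANOMALIES(\n"
        f"  MODEL `{model}`,\n"
        # Assumed setting: the expected share of anomalous rows in the data.
        f"  STRUCT({contamination} AS contamination),\n"
        f"  TABLE `{table}`)\n"
        "WHERE is_anomaly"
    )

# Hypothetical model and table names; replace them with your own.
sql = build_detect_anomalies_query("mydataset.returns_kmeans", "mydataset.new_returns")
print(sql)
```

Time series models take a different threshold setting than the contamination value assumed here, so check the function reference for the model type that you use.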
For more information, see the ML.DETECT_ANOMALIES function and the
ML.RECONSTRUCTION_LOSS function. The ML.RECONSTRUCTION_LOSS function can
retrieve all types of reconstruction loss.
Recommended knowledge
By using the default settings in the CREATE MODEL statements and the inference
functions, you can create and use an anomaly detection
model even without much ML knowledge. However, having basic knowledge about
ML development helps you optimize both your data and your model to
deliver better results. We recommend using the following resources to develop
familiarity with ML techniques and processes:

