Use the Health Hub

Supported in:

Google secops SIEM

This document describes the Health Hub, which is the central location in Google Security Operations for you to monitor the status and health of all configured data sources. The Health Hubprovides crucial information on data sources and log types, offering the context needed to diagnose and remediate data pipeline issues.

The Health Hubincludes information about the following:

Ingestion volumes and ingestion health.
Parsing volumes from raw logs to Unified Data Model (UDM) events .
Context and links to interfaces with additional relevant information and functionality.
Irregular and failed sources and log types.

The Health Hubdetects irregularities and failures on a per-customer basis. It uses statistical methods with a 15-day lookback period to analyze ingestion data. Items that are marked irregular identify surges or drops in data being ingested and processed by Google SecOps. An irregularity can indicate a parser problem, vendor schema change, or a change in your data pipeline.

Key benefits

You can use the Health Hubto do the following:

Monitor overall data health at a glance. View the core health status and associated metrics for each feed, data source, log type, and source (that is, the feed ID).
Monitor aggregated data-health metrics for:
- Ingestion and parsing over time with highlighted events (not necessarily irregularities) that link to filtered dashboards.
- Irregularities—current and over time.
Access related dashboards, filtered by timeframe, log type, or feed.
Access the feed configuration to edit and fix or remediate a problem.
Access the parser configuration to edit and fix or remediate a problem.
Click the Set Up Alertslink to open the Cloud Monitoring interface, and from there, configure custom API-based alerts using Statusand log-volume metrics.

Key questions

This section refers to Health Hubcomponents and parameters, which are described in the Interface section.

You can use the Health Hubto answer the following questions about your data pipeline:

What were the last runs of my feed or last errors from my parser?

The Health Hubhas a deep-dive dashboard that shows the last 200 feed runs and last 200 parser errors for the specific row you click into.
Are my logs reaching Google SecOps?

You can verify whether logs are reaching Google SecOps by using the Last Ingestedand Last Normalizedmetrics. These metrics confirm the last time data was successfully delivered. Additionally, the ingestion-volume metrics (per source and per log type) show you the amount of data being ingested.
Are my logs being parsed correctly?

To confirm correct parsing, view the Last Normalizedmetric. This metric indicates when the last successful transformation from raw log into a UDM event occurred.
Why is ingestion or parsing not happening?

The text in the Latest Issue Detailscolumn identifies specific problems, which helps you pinpoint whether the action is actionable (you fix it) or non-actionable (requires support). The text Forbidden 403: Permission deniedis an example of an actionable error, where the auth account provided in the feed configuration lacks required permissions. The text Internal_erroris an example of a non-actionable error, where the recommended action is to open a support case with Google SecOps.
Are there significant changes in the number of ingested logs and parsed logs?

The Statusfield shows your data's health ( Healthyor Failed), based on data volume. You can also identify sudden or sustained surges or drops by viewing the Total Ingested Logsgraph.
How can I get alerted if my sources are failing?

The Health Hubfeeds the Statusand log-volume metrics into Cloud Monitoring. In one of the Health Hubtables, click the relevant Alertslink to open the Cloud Monitoring interface. There, you can configure custom API-based alerts using Statusand log-volume metrics.
How do I infer a delay in a log-type ingestion?

A delay is indicated when the Last Event Timeis significantly behind the Last Ingestedtimestamp. The Health Hubexposes the 95 ^th percentile of the Last Ingested– Last Event Timedelta—per log type. A high value suggests a latency problem within the Google SecOps pipeline, whereas a normal value might indicate that the source is pushing old data.
Have any recent changes in my configuration caused feed failures?

If the Config Last Updatedtimestamp is close to the Last Ingestedtimestamp, it suggests that a recent configuration update may be the cause of a failure. This correlation helps in root-cause analysis.
How has the health of ingestion and parsing been trending over time?

The Data Source Health Overview, Parsing Health Overview, and Total Ingested Logsgraphs show the historical trend of your data's health, letting you observe long-term patterns and irregularities.

Interface

To open the Health Hub, in the side navigation menu, click Health Hub.

The Health Hubis a read-only default dashboard and can't be modified directly. To customize, create a copy of the Health Hub, and then modify the duplicated dashboard for your specific use case.

The Health Hubdisplays the following widgets:

Big number widgets:
- Healthy Sources: The number of data sources performing with no failures and no irregularities.
- Failed Sources: The number of data sources that need immediate attention.
- Irregular Data Sources: The number of irregular data sources and parsers.
- Healthy Parsers: The number of parsers performing with no failures.
- Failed Parsers: The number of parsers that need immediate attention.
- Irregular Parsers: The number of parsers that are exhibiting irregular behavior.
Data Source Health Overview: A line graph showing the Healthy, Irregular, and Faileddata-sources-per-day curves over time.
Parsing Health Overview: A line graph showing the Healthy, Irregular, and Failedparsers-per-day curves over time.
Total Ingested Logs: A line graph showing the Ingested Logslogs-per-day curve over time.

Note: Some ingested logs are mapped to more than one normalized log. The total parsed log count can be higher than the total ingested log count.
Failing Parser by Log Type: A line graph showing a curve for each parser with a failed health status, per day over time. In this context, the failed health status is due to a very low parsing-success rate.
Health Status by Data Sourcetable—includes the following columns:
- Status: The cumulative status of the feed ( Healthy, Irregular, or Failed), derived from data volume, configuration errors, and API errors.
- Source Type: The source type (ingestion mechanism)—for example, Ingestion API, Feeds, Native Workspace Ingestion, or Azure Event Hub Feeds.
- Name: The feed name.
- Log Type: The log type—for example, CS_EDR, UDM, GCP_CLOUDAUDIT, or WINEVTLOG.
- Latest Issue Details: The details about the latest issue in the specified timeframe—for example, Failed parsing logs, Config credential issue, or Normalization issue. The stated issue can be actionable (for example, Incorrect Auth) or non-actionable (for example, Internal_error). If the issue is non-actionable, the recommended action is to open a support case with Google SecOps. When there has been no issue in the specified timeframe, the value is empty or displays OK.
- Issue Duration: The number of days that the data source has been in an irregular or failed state. When the Statusis Healthy, the value is empty or displays N/A.
- Last Collected: The timestamp of the last data collection.
- Last Ingested: The timestamp of the last successful ingestion. Use this metric to identify whether your logs are reaching Google SecOps.
- Config Last Updated: The timestamp of the last change to the metric. Use this value to correlate configuration updates with observed irregularities or failures, helping you determine the root cause of ingestion problems or parsing problems.
- View Ingestion Details: A link that opens a new tab with another dashboard, which contains additional, historical information—for deeper analysis.
- Edit Data Source: A link that opens a new tab with the corresponding feed configuration —where you can fix configuration-related irregularities or failures.
- Set Up Alerts: A link, which opens a new tab with the corresponding Cloud Monitoring interface.
Health Status by Parsertable—includes the following columns:
- Status: The cumulative status of the log type ( Healthy, Irregular, or Failed).
- Latest Issue Details: The details about the latest parsing problem in the specified timeframe—for example, Failed parsing logs, Config credential issue, or Normalization issue. The stated issue can be actionable (for example, Incorrect Auth) or non-actionable (for example, Internal_error). If the issue is non-actionable, the recommended action is to open a support case with Google SecOps. When there has been no issue in the specified timeframe, the value is empty or displays OK.
- Last Ingested: The timestamp of the last successful ingestion. You can use this metric to determine whether logs are reaching Google SecOps.
- Last Event Time: The event timestamp of the last normalized log.
  
  Note: The value is always the latest timestamp—even if an older event is ingested later.
- Last Normalized: The timestamp of the last parsing and normalization action for the log type. You can use this metric to determine whether raw logs are successfully transformed into UDM events .
- Config Last Updated: The timestamp of the last change to the metric. Use this value to correlate configuration updates with observed irregularities or failures, helping you determine the root cause of ingestion problems or parsing problems.
- View Parsing Details: A link, which opens a new tab with another dashboard, which contains additional, historical information—for deeper analysis.
- Edit Parser: A link, which opens a new tab with the corresponding parser configuration —where you can fix configuration-related irregularities or failures.
- Set Up Alert: A link, which opens a new tab with the corresponding Cloud Monitoring interface.

Irregularity-detection engine

The Health Hubuses the Google SecOps irregularity-detection engine to automatically identify significant changes in your data, letting you quickly detect and address potential problems.

The irregularity-detection engine looks back at the last 15 days of data, calculates based on the last 24 hours compared to the 15-day period, and recalculates every hour.

Data ingestion irregularity-detection

Google SecOps analyzes daily volume changes, while considering normal weekly patterns.

The irregularity-detection engine uses the following calculations to detect unusual surges or drops in your data ingestion:

Daily and weekly comparisons: Google SecOps calculates the difference in ingestion volume between the current day and the previous day, and also the difference between the current day and the average volume over the past week.
Standardization: To understand the significance of these changes, Google SecOps standardizes them using the following z-score formula:

z = (x _i − x_bar) / stdev

where
- z is the standardized score (or z-score) for an individual difference
- x _i is an individual difference value
- x_bar is the mean of the differences
- stdev is the standard deviation of the differences
Irregularity flagging: Google SecOps flags an irregularity if both the daily and weekly standardized changes are statistically significant. Specifically, Google SecOps searches for:
- Drops: Both the daily and weekly standardized differences are less than -1.645.
- Surges: Both the daily and weekly standardized differences are greater than 1.645.

Normalization ratio

When calculating the ratio of ingested events to normalized events, the irregularity-detection engine uses a combined approach to ensure that only significant drops in normalization rates are flagged. The irregularity-detection engine generates an alert only when the following two conditions are met:

There is a statistically significant drop in the normalization ratio compared to the previous day.
The drop is also significant in absolute terms, with a magnitude of 0.05 or greater.

Parsing error irregularity detection

For errors that occur during data parsing, the irregularity-detection engine uses a ratio-based method. The irregularity-detection engine triggers an alert if the proportion of parser errors relative to the total number of ingested events increases by 5 percentage points or more compared to the previous day.

What's next

Need more help? Get answers from Community members and Google SecOps professionals.