Optimize detection and reporting performance

This document describes how to optimize detection and reporting performance.

Total detection latency

For a security operations center (SOC), the total mean time to detect (MTTD) is the sum of time delays across the security pipeline. To accurately measure and reduce MTTD, you need to track three primary components:
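As a sketch of this decomposition (the timestamps below are hypothetical; the stage names mirror the components described in this document), the total latency is the sum of the per-stage deltas:

```python
from datetime import datetime

# Hypothetical timestamps for one alert's journey through the pipeline.
event_time = datetime(2024, 5, 1, 12, 0, 0)       # activity on the source system
ingested_time = datetime(2024, 5, 1, 12, 4, 0)    # log ingested and parsed
detection_time = datetime(2024, 5, 1, 12, 14, 0)  # detection engine created the alert
ack_time = datetime(2024, 5, 1, 12, 29, 0)        # analyst acknowledged the case

log_ingestion = ingested_time - event_time        # 4 minutes
rule_processing = detection_time - ingested_time  # 10 minutes
case_ack = ack_time - detection_time              # 15 minutes

# The sum of the stage deltas equals the end-to-end gap.
total = log_ingestion + rule_processing + case_ack
assert total == ack_time - event_time
print(total)  # 0:29:00
```

Reducing any one stage reduces the total, which is why each component is measured separately below.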

Log-ingestion latency (log creation to data ingestion)

Log-ingestion latency is the time elapsed between when the security event occurred on the source system (metadata.event_timestamp) and when the log was successfully ingested and parsed in Google Security Operations (metadata.ingested_time).

Contributing factors:

  • Collector or forwarder issues (for example, backlogs or network throttling).
  • Log source parsing issues (for example, delays in UDM normalization).

To reduce log-ingestion latency, do the following:

  • Monitor log-source health and optimize collector or forwarder configurations.
  • To monitor the delta, compare the UDM timestamps metadata.ingested_timestamp and metadata.event_timestamp in YARA-L or Data Lake.

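As an illustrative sketch of this comparison (the rows below are hypothetical exported UDM records, not output of a Google SecOps API; timestamps are epoch seconds), the delta can be computed per log and the worst case tracked by log type:

```python
# Hypothetical exported UDM rows: (log_type, event_timestamp, ingested_time),
# both expressed as epoch seconds.
rows = [
    ("WINEVTLOG", 1_700_000_000, 1_700_000_045),
    ("WINEVTLOG", 1_700_000_100, 1_700_000_130),
    ("PAN_FIREWALL", 1_700_000_000, 1_700_000_900),
]

# Track the worst-case ingestion latency observed per log type.
worst = {}
for log_type, event_ts, ingest_ts in rows:
    delta = ingest_ts - event_ts
    worst[log_type] = max(worst.get(log_type, 0), delta)

# Flag sources whose worst-case ingestion latency exceeds 5 minutes.
slow = {lt: d for lt, d in worst.items() if d > 300}
print(slow)  # {'PAN_FIREWALL': 900}
```

A per-log-type baseline like this makes it easier to tell a misbehaving collector from normal source-side delay.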
Rule-processing latency (data ingestion to detection creation)

Rule-processing latency is the time elapsed between data ingestion and when the detection engine successfully creates an alert (detection.creation_time). This component is heavily affected by your YARA-L rule configuration.

Contributing factors:

  • Rule run frequency: near real-time (best), 10 minutes, 1 hour, or 24 hours. The lower the run frequency, the higher the minimum processing latency. For more information, see Set the run frequency.
  • Rule type and complexity: multi-event rules require a match window to fully process, which imposes inherent latency. Composite rules that rely on other non-real-time detections also introduce delays. For more information, see Composite detections.

To reduce rule-processing latency, do the following:

  • Use single-event rules running in near real-time where possible.
  • For multi-event rules, set the smallest possible window size.
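To see why both recommendations matter, consider a rough worst-case bound (an illustrative model, not an exact description of the detection engine's scheduler): an event may land just after a run starts, so a detection cannot fire until the match window has filled and the next scheduled run executes:

```python
def worst_case_added_latency_minutes(run_interval_min, match_window_min=0):
    """Rough upper bound on latency added by rule configuration:
    wait for the match window to fill, then for the next run."""
    return run_interval_min + match_window_min

# Single-event rule running in near real-time (~1-minute interval, no window):
print(worst_case_added_latency_minutes(1))       # 1
# Multi-event rule with hourly runs over a 60-minute match window:
print(worst_case_added_latency_minutes(60, 60))  # 120
```

Under this model, shrinking either the run interval or the window size directly lowers the bound.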

For more information, see Sample YARA-L 2.0 queries for Dashboards.

YARA-L rule to monitor rule-processing latency

The following YARA-L rule identifies instances where the delta between the time a log was ingested and the time the detection was created exceeds a specific threshold. Use the rule to identify performance bottlenecks in your detection pipeline.

Deploy this rule in your test environment to baseline your log sources.

You can export these outcomes to a dashboard to visualize latency trends across different log types.

The rule compares the metadata.event_timestamp (when the activity happened) against the metadata.ingested_time (when Google SecOps received the log).

  rule rule_processing_latency_monitor {

    meta:
      author = "SecOps Engineering"
      description = "Alerts when the gap between ingestion and detection creation is greater than 15 minutes."
      severity = "Low"

    events:
      $event.metadata.event_timestamp.seconds = $event_ts
      $event.metadata.ingested_time.seconds = $ingest_ts
      $event.metadata.log_type = $log_type

      // Calculate the delta in seconds
      $latency_delta = $ingest_ts - $event_ts

      // Threshold: 900 seconds (15 minutes)
      $latency_delta > 900

    match:
      $log_type over 1h

    outcome:
      $max_latency = max($latency_delta)
      $log_source = array_distinct($event.metadata.log_type)

    condition:
      $event
  }

Case-acknowledgement latency (detection creation to analyst assignment)

This section is not relevant for customers who use the Google SecOps SIEM standalone platform.

Case-acknowledgement latency is the time elapsed between when a detection creates an alert and when an analyst acknowledges the alert for triage in the SOAR component.

The mean time to acknowledge (MTTA) metric specifically tracks the efficiency of the SOC team in responding to a generated alert.

  • To reduce case-acknowledgement latency, optimize alert routing, tuning, and automation (for example, using playbooks for auto-assignment or enrichment) to quickly move the alert to the triage stage.
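As a sketch of the MTTA calculation (the case records below are hypothetical, expressed as epoch seconds), the metric is the mean of the per-case acknowledgement deltas:

```python
# Hypothetical cases: (alert_created_ts, analyst_ack_ts) in epoch seconds.
cases = [
    (1_700_000_000, 1_700_000_300),  # acknowledged after 5 minutes
    (1_700_001_000, 1_700_001_900),  # acknowledged after 15 minutes
    (1_700_002_000, 1_700_002_600),  # acknowledged after 10 minutes
]

deltas = [ack - created for created, ack in cases]
mtta_seconds = sum(deltas) / len(deltas)
print(mtta_seconds / 60)  # 10.0 (minutes)
```

Tracking this mean over time shows whether routing and automation changes are actually moving alerts to triage faster.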

What's next

  • To learn about how rule replays (also called cleanup runs) manage late-arriving data and context updates, and how this affects the MTTD metrics, see Understand rule replays and MTTD.
  • To learn more about rule detection delays in Google SecOps, contributing factors, troubleshooting, and techniques to reduce delays, see Understand rule detection delays .

Need more help? Get answers from Community members and Google SecOps professionals.
