User-defined metrics from the agent

This guide explains how you can configure the Monitoring agent to recognize and export your application metrics to Cloud Monitoring.

The Monitoring agent is a collectd daemon. In addition to exporting many predefined system and third-party metrics to Cloud Monitoring, the agent can export your own collectd application metrics to Monitoring as user-defined metrics. Your collectd plugins can also export to Monitoring.

An alternative way to export application metrics to Monitoring is to use StatsD. Cloud Monitoring provides a default configuration that maps StatsD metrics to user-defined metrics. If you are satisfied with that mapping, then you don't need the customization steps described below. For more information, see the StatsD plugin.

For more information about metrics, see Metrics, time series, and resources; Structure of time series; and User-defined metrics overview.

This functionality is only available for agents running on Linux. It is not available on Windows.

Before you begin

Install the most recent Monitoring agent on a VM instance and verify it is working. To update your agent, see Updating the agent.
Configure collectd to get monitoring data from your application. Collectd supports many application frameworks and standard monitoring endpoints through its read plugins. Find a read plugin that works for you.
(Optional) As a convenience, add the agent's collectd reference documentation to your system's man pages by updating the MANPATH variable and then running mandb:
```
export MANPATH="$MANPATH:/opt/stackdriver/collectd/share/man"
sudo mandb
```
The man pages are for stackdriver-collectd.

Important files and directories

The following files and directories, created by installing the agent, are relevant to using the Monitoring agent (collectd):

/etc/stackdriver/collectd.conf: The collectd configuration file used by the agent. Edit this file to change general configuration.

Note: The system-default collectd configuration, /etc/collectd.conf, isn't used by the Monitoring agent.
/etc/stackdriver/collectd.d/: The directory for user-added configuration files. To send user-defined metrics from the agent, you place the required configuration files, discussed below, in this directory. For backward compatibility, the agent also looks for files in /opt/stackdriver/collectd/etc/collectd.d/.
/opt/stackdriver/collectd/share/man/*: The documentation for the agent's version of collectd. You can add these pages to your system's set of man pages; see Before you begin for details.
/etc/init.d/stackdriver-agent: The init script for the agent.

How Monitoring handles collectd metrics

As background, the Monitoring agent processes collectd metrics and sends them to Monitoring, which treats each metric as a member of one of the following categories:

User-defined metrics. Collectd metrics that have the metadata key stackdriver_metric_type and a single data source are handled as user-defined metrics and sent to Monitoring using the projects.timeSeries.create method in the Monitoring API.
Curated metrics. All other collectd metrics are sent to Monitoring using an internal API. Only the metrics in the list of curated metrics are accepted and processed.
Discarded metrics. Collectd metrics that aren't in the curated metrics list and aren't user-defined metrics are silently discarded by Monitoring. The agent itself isn't aware of which metrics are accepted or discarded.
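
The three categories above can be read as a simple decision procedure. The following Python sketch illustrates that logic only; it is not the agent's or Monitoring's implementation. The `CollectdMetric` class, the `classify` function, and the one-entry `CURATED_PLACEHOLDER` set are hypothetical names introduced for illustration, and the real curated list is maintained by Monitoring, not by your code.

```python
from dataclasses import dataclass, field

# Placeholder for the curated metrics list (assumption: the real list is
# kept on the Monitoring side and is much larger than this one entry).
CURATED_PLACEHOLDER = {("df", "df_complex")}


@dataclass
class CollectdMetric:
    """Minimal stand-in for a collectd metric (hypothetical type)."""
    plugin: str
    type: str
    data_sources: list
    metadata: dict = field(default_factory=dict)


def classify(metric: CollectdMetric) -> str:
    """Return the category described in the text for this metric."""
    has_type_key = "stackdriver_metric_type" in metric.metadata
    if has_type_key and len(metric.data_sources) == 1:
        return "user-defined"
    if (metric.plugin, metric.type) in CURATED_PLACEHOLDER:
        return "curated"
    return "discarded"
```

Note that a metric carrying the metadata key but more than one data source does not qualify as user-defined in this sketch, which mirrors the single-data-source requirement stated above.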

Write user-defined metrics with the agent

You configure the agent to send metric data points to Monitoring. Each point must be associated with a user-defined metric, which you define with a metric descriptor. These concepts are introduced in Metrics, time series, and resources and described in detail at Structure of time series and User-defined metrics overview.

You can have a collectd metric treated as a user-defined metric by adding the proper metadata to the metric:

stackdriver_metric_type: (required) the name of the exported metric. Example: custom.googleapis.com/my_custom_metric.
label:[LABEL]: (optional) additional labels for the exported metric. For example, if you want a Monitoring STRING label named color, then your metadata key would be label:color and the value of the key could be "blue". You can have up to 10 labels per metric type.

You can use a collectd filter chain to modify the metadata for your metrics. Because filter chains can't modify the list of data sources and user-defined metrics only support a single data source, any collectd metrics that you want to use with this facility must have a single data source.
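
To make the metadata convention concrete, here is a minimal Python sketch that splits a metadata dictionary into the metric type and its labels, using only the two key forms described above. The function name `split_metadata` and the `MAX_LABELS` constant are hypothetical; the sketch illustrates the convention and is not code from the agent.

```python
MAX_LABELS = 10  # the per-metric-type label limit stated in the text


def split_metadata(metadata: dict) -> tuple:
    """Return (metric_type, labels) extracted from collectd metadata.

    Keys of the form "label:NAME" become labels named NAME.
    """
    metric_type = metadata.get("stackdriver_metric_type")
    if metric_type is None:
        raise ValueError("missing required key: stackdriver_metric_type")

    prefix = "label:"
    labels = {
        key[len(prefix):]: value
        for key, value in metadata.items()
        if key.startswith(prefix)
    }
    if len(labels) > MAX_LABELS:
        raise ValueError(f"{len(labels)} labels exceeds the limit of {MAX_LABELS}")
    return metric_type, labels
```

For example, metadata with `stackdriver_metric_type` set to `custom.googleapis.com/my_custom_metric` and `label:color` set to `blue` yields that metric type and a single label `color` with the value `blue`.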

Example

In this example we will monitor active Nginx connections from two Nginx services, my_service_a and my_service_b. We will send these to Monitoring using a user-defined metric. We will take the following steps:

Identify the collectd metrics for each Nginx service.
Define a Monitoring metric descriptor.
Configure a collectd filter chain to add metadata to the collectd metrics, to meet the expectations of the Monitoring agent.

Incoming collectd metrics

Collectd expects metrics to consist of the following components. The first five components make up the collectd identifier for the metric:

Host, Plugin, Plugin-instance, Type, Type-instance, [value]

In this example, the metrics you want to send as a user-defined metric have the following values:

Component	Expected value(s)
Host	any
Plugin	`curl_json`
Plugin instance	`nginx_my_service_a` or `nginx_my_service_b` ¹
Type	`gauge`
Type instance	`active-connections`
`[value]`	any value ²

Notes:
¹ In the example, this value encodes both the application (Nginx) and the connected service name.
² The value is typically a timestamp and double-precision number. Monitoring handles the details of interpreting the various kinds of values. Compound values aren't supported by the Monitoring agent.
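
As an illustration of how the five identifier components fit together, the following Python sketch parses an identifier written in collectd's conventional string form, `host/plugin-plugin_instance/type-type_instance`. That string form is an assumption here, since this page lists the components but doesn't show the serialized identifier. The `Identifier` type, the `parse_identifier` function, and the host name `myhost` are hypothetical.

```python
from typing import NamedTuple


class Identifier(NamedTuple):
    host: str
    plugin: str
    plugin_instance: str
    type: str
    type_instance: str


def parse_identifier(text: str) -> Identifier:
    """Split "host/plugin-instance/type-instance" into its five parts.

    The instance parts are optional; they come back as empty strings
    when the identifier omits them.
    """
    host, plugin_part, type_part = text.split("/")
    # Only the first hyphen separates a name from its instance, so an
    # instance such as "active-connections" stays intact.
    plugin, _, plugin_instance = plugin_part.partition("-")
    type_, _, type_instance = type_part.partition("-")
    return Identifier(host, plugin, plugin_instance, type_, type_instance)


example = parse_identifier(
    "myhost/curl_json-nginx_my_service_a/gauge-active-connections"
)
```

In this sketch, `example.plugin_instance` is `nginx_my_service_a` and `example.type_instance` is `active-connections`, matching the table above.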

Monitoring metric descriptor and time series

On the Monitoring side, design a metric descriptor for your user-defined metric. The following descriptor is a reasonable choice for the data in this example:

Name: custom.googleapis.com/nginx/active_connections
Labels:
- service_name (STRING): The name of the service connected to Nginx.
Kind: GAUGE
Type: DOUBLE

After you've designed the metric descriptor, you can create it by using projects.metricDescriptors.create, or you can let it be created for you from the time series metadata, discussed below. For more information, see Creating metric descriptors on this page.
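
If you choose to create the descriptor explicitly, the design above maps onto a request body along the following lines. This Python sketch only builds and prints the JSON; it does not call the API. The field names follow the MetricDescriptor resource of the Monitoring API, while the function name `build_descriptor` and the top-level description text are illustrative assumptions.

```python
import json


def build_descriptor() -> dict:
    """Build a MetricDescriptor body for the example metric."""
    return {
        "type": "custom.googleapis.com/nginx/active_connections",
        "metricKind": "GAUGE",
        "valueType": "DOUBLE",
        # Illustrative wording; write whatever documents your metric best.
        "description": "Active Nginx connections, per connected service.",
        "labels": [
            {
                "key": "service_name",
                "valueType": "STRING",
                "description": "The name of the service connected to Nginx.",
            }
        ],
    }


if __name__ == "__main__":
    print(json.dumps(build_descriptor(), indent=2))
```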

The time series data for this metric descriptor must contain the following information, because of the way the metric descriptor is defined:

Metric type: custom.googleapis.com/nginx/active_connections
Metric label values:
- service_name: either "my_service_a" or "my_service_b"

Other time series information, including the associated monitored resource (the VM instance sending the data) and the metric's data point, is automatically obtained by the agent for all metrics. You don't have to do anything special.

Your filter chain

Create a file, /opt/stackdriver/collectd/etc/collectd.d/nginx_curl_json.conf, containing the following code:

    LoadPlugin match_regex
    LoadPlugin target_set
    LoadPlugin target_replace

    # Insert a new rule in the default "PreCache" chain, to divert your metrics.
    PreCacheChain "PreCache"
    <Chain "PreCache">
      <Rule "jump_to_custom_metrics_from_curl_json">
        # If the plugin name and instance match, this is PROBABLY a metric we're looking for:
        <Match regex>
          Plugin "^curl_json$"
          PluginInstance "^nginx_"
        </Match>
        <Target "jump">
          # Go execute the following chain; then come back.
          Chain "PreCache_curl_json"
        </Target>
      </Rule>
      # Continue processing metrics in the default "PreCache" chain.
    </Chain>

    # Following is a NEW filter chain, just for your metric.
    # It is only executed if the default chain "jumps" here.
    <Chain "PreCache_curl_json">

      # The following rule does all the work for your metric:
      <Rule "rewrite_curl_json_my_special_metric">
        # Do a careful match for just your metrics; if it fails, drop down
        # to the next rule:
        <Match regex>
          Plugin "^curl_json$"                   # Match on plugin.
          PluginInstance "^nginx_my_service_.*$" # Match on plugin instance.
          Type "^gauge$"                         # Match on type.
          TypeInstance "^active-connections$"    # Match on type instance.
        </Match>

        <Target "set">
          # Specify the metric descriptor type:
          MetaData "stackdriver_metric_type" "custom.googleapis.com/nginx/active_connections"
          # Specify a value for the "service_name" label; clean it up in the next Target:
          MetaData "label:service_name" "%{plugin_instance}"
        </Target>

        <Target "replace">
          # Remove the "nginx_" prefix in the service_name to get the real service name:
          MetaData "label:service_name" "nginx_" ""
        </Target>
      </Rule>

      # The following rule is run after rewriting your metric, or
      # if the metric wasn't one of your user-defined metrics. The rule returns
      # to the default "PreCache" chain. The default processing
      # will write all metrics to Cloud Monitoring,
      # which will drop any unrecognized metrics: ones that aren't
      # in the list of curated metrics and don't have
      # the user-defined metric metadata.
      <Rule "go_back">
        Target "return"
      </Rule>
    </Chain>
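
Before restarting the agent, it can help to check the regular expressions and the prefix removal against sample identifiers. The following Python sketch mimics what the "rewrite_curl_json_my_special_metric" rule is meant to do. It is an approximation, not collectd itself: collectd uses POSIX regular expressions while this uses Python's `re` module, which behaves the same for these simple patterns. The `rewrite` function and the dictionary layout are hypothetical.

```python
import re

# The same patterns as in the <Match regex> block of the rewrite rule.
MATCH = {
    "plugin": r"^curl_json$",
    "plugin_instance": r"^nginx_my_service_.*$",
    "type": r"^gauge$",
    "type_instance": r"^active-connections$",
}


def rewrite(metric: dict) -> dict:
    """Return the metadata the rule would attach, or {} if it doesn't match."""
    if not all(re.search(pattern, metric[name]) for name, pattern in MATCH.items()):
        return {}
    # "set" target: copy the plugin instance into the label.
    service_name = metric["plugin_instance"]
    # "replace" target: strip the "nginx_" prefix from the label value.
    service_name = re.sub(r"nginx_", "", service_name)
    return {
        "stackdriver_metric_type": "custom.googleapis.com/nginx/active_connections",
        "label:service_name": service_name,
    }
```

Calling `rewrite` with plugin `curl_json`, plugin instance `nginx_my_service_a`, type `gauge`, and type instance `active-connections` produces a `label:service_name` of `my_service_a`; any metric that fails one of the four patterns comes back with no metadata, which corresponds to falling through to the "go_back" rule unchanged.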

Load the new configuration

Restart your agent to pick up the new configuration by executing the following command on your VM instance:

 sudo service stackdriver-agent restart

Your user-defined metric information begins to flow into Monitoring.

Reference and best practices

Metric descriptors and time series

For an introduction to Cloud Monitoring metrics, see Metrics, time series, and resources. More details are available in User-defined metrics overview and Structure of time series.

Metric descriptors. A metric descriptor has the following significant pieces:

A type of the form custom.googleapis.com/[NAME1]/.../[NAME0]. For example:
```
custom.googleapis.com/my_measurement
custom.googleapis.com/instance/network/received_packets_count
custom.googleapis.com/instance/network/sent_packets_count
```
The recommended naming is hierarchical to make the metrics easier for people to keep track of. Metric types can't contain hyphens; for the exact naming rules, see Naming metric types and labels.
Up to 10 labels to annotate the metric data, such as device_name, fault_type, or response_code. The values of the labels aren't specified in the metric descriptor.
The kind and value type of the data points, such as "a gauge value of type double". For more information, see MetricKind and ValueType.
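
Because a misspelled or malformed metric type silently creates an unwanted descriptor, a small pre-flight check can be useful. The following Python sketch checks only the two rules stated above (the custom.googleapis.com/ prefix and the absence of hyphens); it does not implement the full rules in Naming metric types and labels. The function name `check_metric_type` is hypothetical.

```python
PREFIX = "custom.googleapis.com/"


def check_metric_type(metric_type: str) -> list:
    """Return a list of problems found; an empty list means none found."""
    problems = []
    if not metric_type.startswith(PREFIX):
        problems.append(f"does not start with {PREFIX}")
    elif not metric_type[len(PREFIX):]:
        problems.append("no name after the prefix")
    if "-" in metric_type:
        problems.append("contains a hyphen")
    return problems
```

For example, `custom.googleapis.com/nginx/active_connections` passes, while `custom.googleapis.com/nginx/active-connections` is reported because of the hyphen.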

Time series. A metric data point has the following significant pieces:

The type of the associated metric descriptor.
Values for all of the metric descriptor's labels.
A timestamped value consistent with the metric descriptor's value type and kind.
The monitored resource the data came from, typically a VM instance. Space for the resource is built in, so the descriptor doesn't need a separate label for it.

Creating metric descriptors

You don't have to create a metric descriptor ahead of time. When a data point arrives in Monitoring, the point's metric type, labels, and the point's value can be used to automatically create a gauge or cumulative metric descriptor. For more information, see Auto-creation of metric descriptors.

However, there are advantages to creating your own metric descriptor:

You can include some thoughtful documentation for the metric and its labels.
You can specify additional kinds and types of metrics. The only (kind, type) combinations supported by the agent are (GAUGE, DOUBLE) and (CUMULATIVE, INT64). For more information, see Metric kinds and value types.
You can specify label types other than STRING.

If you write a data point to Monitoring that uses a metric type that isn't defined, then a new metric descriptor is created for the data point. This behavior can be a problem when you are debugging the code that writes metric data: misspelling the metric type results in spurious metric descriptors.

After you create a metric descriptor, or after it is created for you, it cannot be changed. For example, you can't add or remove labels. You can only delete the metric descriptor, which deletes all its data, and then recreate the descriptor the way you want.

For more details about creating metric descriptors, see Creating your metric.

Pricing

To learn about pricing for Cloud Monitoring, see the Google Cloud Observability pricing page.

Limits

Cloud Monitoring has limits on the number of metric time series and the number of user-defined metric descriptors in each project. For details, see Quotas and limits.

If you discover that you have created metric descriptors you no longer want, you can find and delete the descriptors using the Monitoring API. For more information, see projects.metricDescriptors.

Troubleshooting

This section explains how to configure the Monitoring agent's write_log plugin to dump out the full set of metric points, including metadata. This can be used to determine what points need to be transformed, as well as to ensure your transformations behave as expected.

Enabling write_log

The write_log plugin is included in the stackdriver-agent package. To enable the plugin:

As root, edit the following configuration file:
```
 /etc/stackdriver/collectd.conf 
```
Right after LoadPlugin write_gcm, add:
```
 LoadPlugin write_log 
```

Right after <Plugin "write_gcm">…</Plugin>, add:

    <Plugin "write_log">
      Format JSON
    </Plugin>

Search for <Target "write">…</Target> and after every Plugin "write_gcm", add:
```
 Plugin "write_log" 
```

Save your changes and restart the agent:

 sudo service stackdriver-agent restart

These changes will print one log line per metric value reported, including the full collectd identifier, the metadata entries, and the value.

Output of write_log

If you were successful in the previous step, you should see the output of write_log in the system logs:

Debian-based Linux: /var/log/syslog
Red Hat-based Linux: /var/log/messages

The sample lines below have been formatted to make them easier to read in this document.

    Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
      "values":[1933524992], "dstypes":["gauge"], "dsnames":["value"],
      "time":1481210025.252, "interval":60.000,
      "host":"test-write-log.c.test-write-log.internal",
      "plugin":"df", "plugin_instance":"udev", "type":"df_complex",
      "type_instance":"free"}]

    Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
      "values":[0], "dstypes":["gauge"], "dsnames":["value"],
      "time":1481210025.252, "interval":60.000,
      "host":"test-write-log.c.test-write-log.internal",
      "plugin":"df", "plugin_instance":"udev", "type":"df_complex",
      "type_instance":"reserved"}]
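
When you need to inspect many of these lines, it can be easier to parse them than to read them. The following Python sketch extracts the JSON payload from a single, unwrapped write_log line like the samples above. It assumes each log entry is on one line and that the payload starts at the first "[" after the "write_log values:" marker, as in the samples; the function name `parse_write_log_line` and the `SAMPLE` constant are hypothetical.

```python
import json

MARKER = "write_log values:"


def parse_write_log_line(line: str) -> list:
    """Return the list of metric dictionaries logged on one syslog line."""
    _, found, payload = line.partition(MARKER)
    if not found:
        raise ValueError("not a write_log line")
    # Skip the escaped newline ("#012") that precedes the JSON array.
    return json.loads(payload[payload.index("["):])


# The first sample line from above, joined back into a single line.
SAMPLE = (
    "Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012"
    '[{"values":[1933524992],"dstypes":["gauge"],"dsnames":["value"],'
    '"time":1481210025.252,"interval":60.000,'
    '"host":"test-write-log.c.test-write-log.internal",'
    '"plugin":"df","plugin_instance":"udev",'
    '"type":"df_complex","type_instance":"free"}]'
)

entries = parse_write_log_line(SAMPLE)
```

Each element of `entries` is a dictionary, so a check such as `entries[0]["plugin"]` or `entries[0]["type_instance"]` tells you which collectd identifier a given log line belongs to.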