This guide explains how you can configure the Monitoring agent to recognize and export your application metrics to Cloud Monitoring.
The Monitoring agent is a collectd daemon. In addition to exporting many predefined system and third-party metrics to Cloud Monitoring, the agent can export your own collectd application metrics to Monitoring as user-defined metrics. Your collectd plugins can also export to Monitoring.
An alternative way to export application metrics to Monitoring is to use StatsD. Cloud Monitoring provides a default configuration that maps StatsD metrics to user-defined metrics. If you are satisfied with that mapping, then you don't need the customization steps described below. For more information, see the StatsD plugin.
For more information about metrics, see Metrics, time series, and resources, Structure of time series, and User-defined metrics overview.
This functionality is only available for agents running on Linux. It is not available on Windows.
Before you begin
- Install the most recent Monitoring agent on a VM instance and verify it is working. To update your agent, see Updating the agent.
- Configure collectd to get monitoring data from your application. Collectd supports many application frameworks and standard monitoring endpoints through its read plugins. Find a read plugin that works for you.
- (Optional) As a convenience, add the agent's collectd reference documentation to your system's man pages by updating the MANPATH variable and then running mandb:
  export MANPATH="$MANPATH:/opt/stackdriver/collectd/share/man"
  sudo mandb
  The man pages are for stackdriver-collectd.
Important files and directories
The following files and directories, created by installing the agent, are relevant to using the Monitoring agent (collectd):
- /etc/stackdriver/collectd.conf
  The collectd configuration file used by the agent. Edit this file to change general configuration.
- /etc/stackdriver/collectd.d/
  The directory for user-added configuration files. To send user-defined metrics from the agent, you place the required configuration files, discussed below, in this directory. For backward compatibility, the agent also looks for files in /opt/stackdriver/collectd/etc/collectd.d/.
- /opt/stackdriver/collectd/share/man/*
  The documentation for the agent's version of collectd. You can add these pages to your system's set of man pages; see Before you begin for details.
- /etc/init.d/stackdriver-agent
  The init script for the agent.
How Monitoring handles collectd metrics
As background, the Monitoring agent processes collectd metrics and sends them to Monitoring, which treats each metric as a member of one of the following categories:
- User-defined metrics. Collectd metrics that have the metadata key stackdriver_metric_type and a single data source are handled as user-defined metrics and sent to Monitoring using the projects.timeSeries.create method in the Monitoring API.
- Curated metrics. All other collectd metrics are sent to Monitoring using an internal API. Only the metrics in the list of curated metrics are accepted and processed.
- Discarded metrics. Collectd metrics that aren't in the curated metrics list and aren't user-defined metrics are silently discarded by Monitoring. The agent itself isn't aware of which metrics are accepted or discarded.
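The three categories above can be read as a small decision procedure. The following Python sketch is only an illustration of that logic, not the agent's or Monitoring's actual code: the CollectdMetric class, the classify function, and the contents of CURATED are hypothetical names invented for this example, and the real curated list lives on the Monitoring side.

```python
from dataclasses import dataclass, field

# Hypothetical stand-in for the curated metrics list. The real list is
# maintained by Cloud Monitoring and is not reproduced here.
CURATED = {("df", "df_complex"), ("cpu", "cpu")}


@dataclass
class CollectdMetric:
    """Simplified, illustrative view of one collectd metric."""
    plugin: str
    type: str
    data_sources: list
    metadata: dict = field(default_factory=dict)


def classify(metric: CollectdMetric) -> str:
    """Return the category that the text above assigns to a metric."""
    has_type_key = "stackdriver_metric_type" in metric.metadata
    if has_type_key and len(metric.data_sources) == 1:
        return "user-defined"
    if (metric.plugin, metric.type) in CURATED:
        return "curated"
    return "discarded"


nginx = CollectdMetric(
    plugin="curl_json",
    type="gauge",
    data_sources=["value"],
    metadata={"stackdriver_metric_type":
              "custom.googleapis.com/nginx/active_connections"},
)
print(classify(nginx))
```

Note that a metric carrying the metadata key but more than one data source does not qualify as user-defined in this sketch, which mirrors the single-data-source requirement stated above.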
Write user-defined metrics with the agent
You configure the agent to send metric data points to Monitoring. Each point must be associated with a user-defined metric, which you define with a metric descriptor. These concepts are introduced in Metrics, time series, and resources and described in detail at Structure of time series and User-defined metrics overview.
You can have a collectd metric treated as a user-defined metric by adding the proper metadata to the metric:
- stackdriver_metric_type: (required) the name of the exported metric. Example: custom.googleapis.com/my_custom_metric.
- label:[LABEL]: (optional) additional labels for the exported metric. For example, if you want a Monitoring STRING label named color, then your metadata key would be label:color and the value of the key could be "blue". You can have up to 10 labels per metric type.
You can use a collectd filter chain to modify the metadata for your metrics. Because filter chains can't modify the list of data sources and user-defined metrics only support a single data source, any collectd metrics that you want to use with this facility must have a single data source.
Example
In this example we will monitor active Nginx connections from two Nginx services, my_service_a and my_service_b. We will send these to Monitoring using a user-defined metric.
We will take the following steps:
- Identify the collectd metrics for each Nginx service.
- Define a Monitoring metric descriptor.
- Configure a collectd filter chain to add metadata to the collectd metrics, to meet the expectations of the Monitoring agent.
Incoming collectd metrics
Collectd expects metrics to consist of the following components. The first five components make up the collectd identifier for the metric:
Host, Plugin, Plugin-instance, Type, Type-instance, [value]
In this example, the metrics you want to send as a user-defined metric have the following values:
Component | Expected value(s) |
---|---|
Host | any |
Plugin | curl_json |
Plugin instance | nginx_my_service_a or nginx_my_service_b [1] |
Type | gauge |
Type instance | active-connections |
[value] | any value [2] |
Notes:
[1] In the example, this value encodes both the application (Nginx) and the connected service name.
[2] The value is typically a timestamp and double-precision number. Monitoring handles the details of interpreting the various kinds of values. Compound values aren't supported by the Monitoring agent.
Monitoring metric descriptor and time series
On the Monitoring side, design a metric descriptor for your user-defined metric. The following descriptor is a reasonable choice for the data in this example:
- Name: custom.googleapis.com/nginx/active_connections
- Labels:
  - service_name (STRING): The name of the service connected to Nginx.
- Kind: GAUGE
- Type: DOUBLE
After you've designed the metric descriptor, you can create it by using projects.metricDescriptors.create, or you can let it be created for you from the time series metadata, discussed below. For more information, see Creating metric descriptors on this page.
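If you choose to create the descriptor yourself, the design above maps onto a small JSON request body. The following Python sketch builds such a body using the MetricDescriptor field names from the Monitoring API v3 REST resource (type, metricKind, valueType, labels); the function name nginx_descriptor and the description text are assumptions made for this example, and the sketch only prints the body rather than calling the API.

```python
import json


def nginx_descriptor() -> dict:
    """Build an illustrative metric descriptor body for the example metric."""
    return {
        "type": "custom.googleapis.com/nginx/active_connections",
        "metricKind": "GAUGE",
        "valueType": "DOUBLE",
        "labels": [
            {
                "key": "service_name",
                "valueType": "STRING",
                "description": "The name of the service connected to Nginx.",
            }
        ],
        # Illustrative wording; choose your own documentation text.
        "description": "Active Nginx connections, per connected service.",
    }


print(json.dumps(nginx_descriptor(), indent=2))
```

Writing the descriptor out like this makes it easy to review the label set before the first data point is sent, since labels can't be changed after the descriptor exists.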
The time series data for this metric descriptor must contain the following information, because of the way the metric descriptor is defined:
- Metric type: custom.googleapis.com/nginx/active_connections
- Metric label values:
  - service_name: either "my_service_a" or "my_service_b"
Other time series information, including the associated monitored resource (the VM instance sending the data) and the metric's data point, is automatically obtained by the agent for all metrics. You don't have to do anything special.
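To make the shape of that data concrete, the following Python sketch assembles one time series entry roughly as it would appear in a projects.timeSeries.create request body. The agent builds this for you; the sketch is only for orientation. The field names follow the TimeSeries REST resource, while the function name build_time_series, the gce_instance resource type, and the default project, instance, and zone values are hypothetical placeholders chosen for this example.

```python
from datetime import datetime, timezone


def build_time_series(service_name: str, value: float, when: datetime,
                      project_id: str = "my-project",      # hypothetical
                      instance_id: str = "1234567890",     # hypothetical
                      zone: str = "us-central1-a") -> dict:  # hypothetical
    """Return one illustrative time series entry for the example metric."""
    return {
        "metric": {
            "type": "custom.googleapis.com/nginx/active_connections",
            "labels": {"service_name": service_name},
        },
        # Assumed resource type for a Compute Engine VM; the agent fills
        # this in automatically from the instance it runs on.
        "resource": {
            "type": "gce_instance",
            "labels": {
                "project_id": project_id,
                "instance_id": instance_id,
                "zone": zone,
            },
        },
        # A GAUGE point carries only an end time and a value.
        "points": [
            {
                "interval": {"endTime": when.isoformat()},
                "value": {"doubleValue": float(value)},
            }
        ],
    }


sample = build_time_series(
    "my_service_a", 12, datetime(2016, 12, 8, 15, 13, 45, tzinfo=timezone.utc)
)
print(sample["metric"])
```

Only the metric type and the service_name label come from your filter chain; everything else in the entry is derived by the agent.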
Your filter chain
Create a file, /opt/stackdriver/collectd/etc/collectd.d/nginx_curl_json.conf, containing the following code:
LoadPlugin match_regex
LoadPlugin target_set
LoadPlugin target_replace

# Insert a new rule in the default "PreCache" chain, to divert your metrics.
PreCacheChain "PreCache"
<Chain "PreCache">
  <Rule "jump_to_custom_metrics_from_curl_json">
    # If the plugin name and instance match, this is PROBABLY a metric we're looking for:
    <Match regex>
      Plugin "^curl_json$"
      PluginInstance "^nginx_"
    </Match>
    <Target "jump">
      # Go execute the following chain; then come back.
      Chain "PreCache_curl_json"
    </Target>
  </Rule>
  # Continue processing metrics in the default "PreCache" chain.
</Chain>

# Following is a NEW filter chain, just for your metric.
# It is only executed if the default chain "jumps" here.
<Chain "PreCache_curl_json">

  # The following rule does all the work for your metric:
  <Rule "rewrite_curl_json_my_special_metric">
    # Do a careful match for just your metrics; if it fails, drop down
    # to the next rule:
    <Match regex>
      Plugin "^curl_json$"                   # Match on plugin.
      PluginInstance "^nginx_my_service_.*$" # Match on plugin instance.
      Type "^gauge$"                         # Match on type.
      TypeInstance "^active-connections$"    # Match on type instance.
    </Match>

    <Target "set">
      # Specify the metric descriptor type:
      MetaData "stackdriver_metric_type" "custom.googleapis.com/nginx/active_connections"
      # Specify a value for the "service_name" label; clean it up in the next Target:
      MetaData "label:service_name" "%{plugin_instance}"
    </Target>

    <Target "replace">
      # Remove the "nginx_" prefix in the service_name to get the real service name:
      MetaData "label:service_name" "nginx_" ""
    </Target>
  </Rule>

  # The following rule is run after rewriting your metric, or
  # if the metric wasn't one of your user-defined metrics. The rule returns
  # to the default "PreCache" chain. The default processing
  # will write all metrics to Cloud Monitoring,
  # which will drop any unrecognized metrics: ones that aren't
  # in the list of curated metrics and don't have
  # the user-defined metric metadata.
  <Rule "go_back">
    Target "return"
  </Rule>
</Chain>
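Before deploying the chain, it can help to check the regular expressions and the prefix removal against sample identifiers. The following Python sketch approximates what the "rewrite" rule does to a single metric; it is not collectd code, the function name apply_rewrite_rule is invented for this example, and Python's re module is assumed to behave like collectd's regex matching for these simple patterns.

```python
import re

# The same patterns as the <Match regex> block in the rewrite rule.
MATCH = {
    "plugin": re.compile(r"^curl_json$"),
    "plugin_instance": re.compile(r"^nginx_my_service_.*$"),
    "type": re.compile(r"^gauge$"),
    "type_instance": re.compile(r"^active-connections$"),
}


def apply_rewrite_rule(identifier: dict) -> dict:
    """Return the metadata the rule would attach, or {} if it doesn't match."""
    for key, pattern in MATCH.items():
        if not pattern.search(identifier.get(key, "")):
            return {}
    # Equivalent of <Target "set">: copy the plugin instance into the label.
    metadata = {
        "stackdriver_metric_type":
            "custom.googleapis.com/nginx/active_connections",
        "label:service_name": identifier["plugin_instance"],
    }
    # Equivalent of <Target "replace">: strip the "nginx_" prefix.
    metadata["label:service_name"] = re.sub(
        "nginx_", "", metadata["label:service_name"], count=1
    )
    return metadata


result = apply_rewrite_rule({
    "plugin": "curl_json",
    "plugin_instance": "nginx_my_service_a",
    "type": "gauge",
    "type_instance": "active-connections",
})
print(result["label:service_name"])
```

A metric that fails any of the four matches comes back with no metadata, which corresponds to falling through to the "go_back" rule unchanged.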
Load the new configuration
Restart your agent to pick up the new configuration by executing the following command on your VM instance:
sudo service stackdriver-agent restart
Your user-defined metric information begins to flow into Monitoring.
Reference and best practices
Metric descriptors and time series
For an introduction to Cloud Monitoring metrics, see Metrics, time series, and resources. More details are available in User-defined metrics overview and Structure of time series.
Metric descriptors. A metric descriptor has the following significant pieces:
- A type of the form custom.googleapis.com/[NAME1]/.../[NAME0]. For example:
  custom.googleapis.com/my_measurement
  custom.googleapis.com/instance/network/received_packets_count
  custom.googleapis.com/instance/network/sent_packets_count
  The recommended naming is hierarchical to make the metrics easier for people to keep track of. Metric types can't contain hyphens; for the exact naming rules, see Naming metric types and labels.
- Up to 10 labels to annotate the metric data, such as device_name, fault_type, or response_code. The values of the labels aren't specified in the metric descriptor.
- The kind and value type of the data points, such as "a gauge value of type double". For more information, see MetricKind and ValueType.
Time series. A metric data point has the following significant pieces:
- The type of the associated metric descriptor.
- Values for all of the metric descriptor's labels.
- A timestamped value consistent with the metric descriptor's value type and kind.
- The monitored resource the data came from, typically a VM instance. Space for the resource is built in, so the descriptor doesn't need a separate label for it.
Creating metric descriptors
You don't have to create a metric descriptor ahead of time. When a data point arrives in Monitoring, the point's metric type, labels, and value can be used to automatically create a gauge or cumulative metric descriptor. For more information, see Auto-creation of metric descriptors.
However, there are advantages to creating your own metric descriptor:
- You can include some thoughtful documentation for the metric and its labels.
- You can specify additional kinds and types of metrics. The only (kind, type) combinations supported by the agent are (GAUGE, DOUBLE) and (CUMULATIVE, INT64). For more information, see Metric kinds and value types.
- You can specify label types other than STRING.
If you write a data point to Monitoring that uses a metric type that isn't defined, then a new metric descriptor is created for the data point. This behavior can be a problem when you are debugging the code that writes metric data—misspelling the metric type results in spurious metric descriptors.
After you create a metric descriptor, or after it is created for you, it cannot be changed. For example, you can't add or remove labels. You can only delete the metric descriptor—which deletes all its data—and then recreate the descriptor the way you want.
For more details about creating metric descriptors, see Creating your metric.
Pricing
In general, Cloud Monitoring system metrics are free, and metrics from external systems, agents, or applications are not. Billable metrics are billed by either the number of bytes or the number of samples ingested.
For more information, see the Cloud Monitoring sections of the Google Cloud Observability pricing page.
Limits
Cloud Monitoring has limits on the number of metric time series and the number of user-defined metric descriptors in each project. For details, see Quotas and limits.
If you discover that you have created metric descriptors you no longer want, you can find and delete the descriptors using the Monitoring API. For more information, see projects.metricDescriptors.
Troubleshooting
This section explains how to configure the Monitoring agent's write_log plugin to dump the full set of metric points, including metadata. You can use this output to determine which points need to be transformed and to verify that your transformations behave as expected.
Enabling write_log
The write_log plugin is included in the stackdriver-agent package. To enable the plugin:
- As root, edit the following configuration file:
  /etc/stackdriver/collectd.conf
- Right after LoadPlugin write_gcm, add:
  LoadPlugin write_log
- Right after <Plugin "write_gcm">…</Plugin>, add:
  <Plugin "write_log">
    Format JSON
  </Plugin>
- Search for <Target "write">…</Target> and after every Plugin "write_gcm", add:
  Plugin "write_log"
- Save your changes and restart the agent:
  sudo service stackdriver-agent restart
These changes will print one log line per metric value reported, including the full collectd identifier, the metadata entries, and the value.
Output of write_log
If you were successful in the previous step, you should see the output of write_log in the system logs:
- Debian-based Linux: /var/log/syslog
- Red Hat-based Linux: /var/log/messages
The sample lines below have been formatted to make them easier to read in this document.
Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
  "values":[1933524992], "dstypes":["gauge"], "dsnames":["value"],
  "time":1481210025.252, "interval":60.000,
  "host":"test-write-log.c.test-write-log.internal",
  "plugin":"df", "plugin_instance":"udev", "type":"df_complex",
  "type_instance":"free"}]

Dec 8 15:13:45 test-write-log collectd[1061]: write_log values:#012[{
  "values":[0], "dstypes":["gauge"], "dsnames":["value"],
  "time":1481210025.252, "interval":60.000,
  "host":"test-write-log.c.test-write-log.internal",
  "plugin":"df", "plugin_instance":"udev", "type":"df_complex",
  "type_instance":"reserved"}]
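When many points are logged, it can be easier to inspect them programmatically. The following Python sketch extracts the JSON payload from a single unformatted write_log line and rebuilds the collectd identifier. It assumes the line layout shown in the samples above, with "#012" being the syslog escape for a newline; the function names parse_write_log and identifier are invented for this example.

```python
import json

MARKER = "write_log values:"


def parse_write_log(line: str) -> list:
    """Return the list of metric records contained in one syslog line."""
    _, found, payload = line.partition(MARKER)
    if not found:
        raise ValueError("not a write_log line")
    # syslog escapes the newline before the JSON payload as "#012".
    return json.loads(payload.replace("#012", "\n").strip())


def identifier(record: dict) -> str:
    """Rebuild host/plugin-instance/type-instance from a record."""
    plugin = record["plugin"]
    if record.get("plugin_instance"):
        plugin += "-" + record["plugin_instance"]
    type_ = record["type"]
    if record.get("type_instance"):
        type_ += "-" + record["type_instance"]
    return "/".join([record["host"], plugin, type_])


line = (
    'Dec  8 15:13:45 test-write-log collectd[1061]: write_log values:#012'
    '[{"values":[1933524992],"dstypes":["gauge"],"dsnames":["value"],'
    '"time":1481210025.252,"interval":60.000,'
    '"host":"test-write-log.c.test-write-log.internal",'
    '"plugin":"df","plugin_instance":"udev","type":"df_complex",'
    '"type_instance":"free"}]'
)
records = parse_write_log(line)
print(identifier(records[0]), records[0]["values"])
```

Filtering the parsed records by plugin and plugin instance is a quick way to confirm that the points you expect your filter chain to rewrite are actually being produced.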