AIP-193
Errors
Effective error communication is an important part of designing simple and intuitive APIs. Services returning standardized error responses enable API clients to construct centralized common error handling logic. This common logic simplifies API client applications and eliminates the need for cumbersome custom error handling code.
Guidance
Services mustreturn a google.rpc.Status
message when an
API error occurs, and mustuse the canonical error codes defined in google.rpc.Code
. More information about the particular codes
is available in the gRPC status code documentation
.
Error messages shouldhelp a reasonably technical user understand and resolve the issue, and should notassume that the user is an expert in your particular API. Additionally, error messages must notassume that the user will know anything about its underlying implementation.
Error messages shouldbe brief but actionable. Any extra information shouldbe provided in the details
field. If even more information
is necessary, you shouldprovide a link where a reader can get more
information or ask questions to help resolve the issue. It is also
important to set the right tone
when writing messages.
The following sections describe the fields of google.rpc.Status
.
Status.message
The message
field is a developer-facing, human-readable "debug message"
which shouldbe in English. (Localized messages are expressed using
a LocalizedMessage
within the details
field. See LocalizedMessage
for more details.) Any dynamic aspects of
the message mustbe included as metadata within the ErrorInfo
that appears
in details
.
The message is considered a problem description. It is intended for
developers to understand the problem and is more detailed than ErrorInfo.reason
, discussed later
.
Messages shoulduse simple descriptive language that is easy to understand (without technical jargon) to clearly state the problem that results in an error, and offer an actionable resolution to it.
For pre-existing (brownfield) APIs which have previously returned errors
without machine-readable identifiers, the value of message
mustremain the same for any given error. For more information, see Changing Error Messages
.
Status.code
The code
field is the status code, which mustbe the numeric value of
one of the elements of the google.rpc.Code
enum.
For example, the value 5
is the numeric value of the NOT_FOUND
enum element.
Status.details
The details
field allows messages with additional error information to
be included in the error response, each packed in a google.protobuf.Any
message.
Google defines a set of standard detail payloads for error details, which cover most common needs for API errors. Services shoulduse these standard detail payloads when feasible.
Each type of detail payload mustbe included at most once. For
example, there must notbe more than one BadRequest
message in the details
, but there maybe a BadRequest
and a PreconditionFailure
.
All error responses mustinclude an ErrorInfo
within details
. This
provides machine-readable identifiers so that users can write code against
specific aspects of the error.
The following sections describe the most common standard detail payloads.
ErrorInfo
The ErrorInfo
message is the primary way to send a
machine-readable identifier. Contextual information shouldbe
included in metadata
in ErrorInfo
and mustbe included if it
appears within an error message.
The reason
field is a short snake_case description of the cause of the
error. Error reasons are unique within a particular domain of errors.
The reason mustbe at most 63 characters and match a regular expression of [A-Z][A-Z0-9_]+[A-Z0-9]
. (This is UPPER_SNAKE_CASE, without leading
or trailing underscores, and without leading digits.)
The reason shouldbe terse, but meaningful enough for a human reader to understand what the reason refers to.
Good examples:
-
CPU_AVAILABILITY
-
NO_STOCK
-
CHECKED_OUT
-
AVAILABILITY_ERROR
Bad examples:
-
THE_BOOK_YOU_WANT_IS_NOT_AVAILABLE
(overly verbose) -
ERROR
(too general)
The domain
field is the logical grouping to which the reason
belongs.
The domain mustbe a globally unique value, and is typically the name of the service
that generated the error, e.g. pubsub.googleapis.com
.
The (reason, domain) pair form a machine-readable way of identifying a particular error. Services mustuse the same (reason, domain) pair for the same error, and must notuse the same (reason, domain) pair for logically different errors. The decision about whether two errors are "the same" or not is not always clear, but shouldgenerally be considered in terms of the expected action a client might take to resolve them.
The metadata
field is a map of key/value pairs providing additional
dynamic information as context. Each key within metadata
mustbe at most
64 characters long, and conform to the regular expression [a-z][a-zA-Z0-9-_]+
.
Any request-specific information which contributes to the Status.message
or LocalizedMessage.message
messages mustbe represented within metadata
.
This practice is critical so that machine actors do not need to parse error
messages to extract information.
For example consider the following message:
An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone. Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation.
The ErrorInfo.metadata
map for the same error could be:
-
"zone": "us-east1-a"
-
"vmType": "e2-medium"
-
"attachment": "local-ssd=3,nvidia-t4=2"
-
"zonesWithCapacity": "us-central1-f,us-central1-c"
Additional contextual information that does not appear in an error message mayalso be included in metadata
to allow programmatic use by the client.
The metadata included for any given (reason,domain) pair can evolve over time:
- New keys maybe included
- All keys that have been included mustcontinue to be included (but may have empty values)
In other words, once a user has observed a given key for a (reason, domain) pair, the service mustallow them to rely on it continuing to be present in the future.
The set of keys provided in each (reason, domain) pair is independent from other pairs,
but services shouldaim for consistent key naming. For example, two error reasons
within the same domain should not use metadata keys of vmType
and virtualMachineType
.
LocalizedMessage
google.rpc.LocalizedMessage
is used to provide an error
message which shouldbe localized to a user-specified locale where
possible.
If the Status.message
field has a sub-optimal value
which cannot be changed due to the constraints in the Changing Error Messages
section, LocalizedMessage
maybe used to provide a better error message even when no user-specified
locale is available.
Regardless of how the locale for the message was determined, both the locale
and message
fields mustbe populated.
The locale
field specifies the locale of the message,
following IETF bcp47
(Tags for
Identifying Languages). Example values: "en-US"
, "fr-CH"
, "es-MX"
.
The message
field contains the localized text itself. This shouldinclude a brief description of the error and a call to action
to resolve the error. The message shouldinclude contextual information
to make the message as specific as possible. Any contextual information
in the message mustbe included in ErrorInfo.metadata
. See ErrorInfo
for more details of how contextual information
may be included in a message and the corresponding metadata.
The LocalizedMessage
payload shouldcontain the complete resolution
to the error. If more information is needed than can reasonably fit in this
payload, then additional resolution information mustbe provided in
a Help
payload. See the Help
section for guidance.
Help
When other textual error messages (in Status.message
or LocalizedMessage.message
) don't provide the user sufficient
context or actionable next steps, or if there are multiple points of
failure that need to be considered in troubleshooting, a link to
supplemental troubleshooting documentation mustbe provided in the Help
payload.
Provide this information in addition to a clear problem definition and
actionable resolution, not as an alternative to them. The linked
documentation mustclearly relate to the error. If a single page
contains information about multiple errors, the ErrorInfo.reason
value mustbe used to narrow down
the relevant information.
The description
field is a textual description of the linked information.
This mustbe suitable to display to a user as text for a hyperlink.
This mustbe plain text (not HTML, Markdown etc).
Example description
value: "Troubleshooting documentation for STOCKOUT errors"
The url
field is the URL to link to. This mustbe an absolute URL,
including scheme.
Example url
value: "https://cloud.google.com/compute/docs/resource-error"
For publicly-documented services, even those with access controls on actual usage, the linked content mustbe accessible without authentication.
For privately-documented services, the linked content mayrequire authentication.
Error messages
Textual error messages can be present in both Status.message
and LocalizedMessage.message
fields. Messages shouldbe succinct but
actionable, with request-specific information (such as a resource name
or region) providing precise details where appropriate. Any request-specific
details mustbe present in ErrorInfo.metadata
.
Changing error messages
Changing the content of Status.message
over time must be done carefully,
to avoid breaking clients who have previously had to rely on the message
for all information. See the rationale section
for more details.
For a given RPC:
- If the RPC has always
returned
ErrorInfo
with machine-readable information, the content ofStatus.message
maychange over time. (For example, the API producer may provide a clearer explanation, or more request-specific information.) - Otherwise, the content of
Status.message
mustbe stable, providing the same text with the same request-specific information. Instead of changingStatus.message
, the API shouldinclude aLocalizedMessage
withinStatus.details
.
Even if an RPC has always returned ErrorInfo
, the API maykeep
the existing Status.message
stable and add a LocalizedMessage
within Status.details
.
The content of LocalizedMessage.details
maychange over time.
Partial errors
APIs should notsupport partial errors. Partial errors add significant complexity for users, because they usually sidestep the use of error codes, or move those error codes into the response message, where the user mustwrite specialized error handling logic to address the problem.
However, occasionally partial errors are necessary, particularly in bulk operations where it would be hostile to users to fail an entire large request because of a problem with a single entry.
Methods that require partial errors shoulduse long-running
operations
, and the method shouldput partial failure information
in the metadata message. The errors themselves muststill be
represented with a google.rpc.Status
object.
Permission Denied
If the user does not have permission to access the resource or parent,
regardless of whether or not it exists, the service musterror with PERMISSION_DENIED
(HTTP 403). Permission mustbe checked prior to
checking if the resource or parent exists.
If the user does have proper permission, but the requested resource or
parent does not exist, the service musterror with NOT_FOUND
(HTTP
404).
HTTP/1.1+JSON representation
When clients use HTTP/1.1 as per AIP-127
, the error information
is returned in the body of the response, as a JSON object. For backward
compatibility reasons, this does not map precisely to google.rpc.Status
,
but contains the same core information. The schema is defined in the following proto:
message
Error
{
message
Status
{
// The HTTP status code that corresponds to `google.rpc.Status.code`.
int32
code
=
1
;
// This corresponds to `google.rpc.Status.message`.
string
message
=
2
;
// This is the enum version for `google.rpc.Status.code`.
google.rpc.Code
status
=
4
;
// This corresponds to `google.rpc.Status.details`.
repeated
google.protobuf.Any
details
=
5
;
}
Status
error
=
1
;
}
The most important difference is that the code
field in the JSON is an HTTP status code, not
the direct value of google.rpc.Status.code
. For example, a google.rpc.Status
message with a code
value of 5 would be mapped to an object including the following
code-related fields (as well as the message, details etc):
{
"error"
:
{
"code"
:
404
,
//
The
HTTP
s
tatus
code
f
or
"not found"
"status"
:
"NOT_FOUND"
//
The
na
me
i
n
google.rpc.Code
f
or
value
5
}
}
The following JSON shows a fully populated HTTP/1.1+JSON representation of an error response.
{
"error"
:
{
"code"
:
429
,
"message"
:
"The zone 'us-east1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later."
,
"status"
:
"RESOURCE_EXHAUSTED"
,
"details"
:
[
{
"@type"
:
"type.googleapis.com/google.rpc.ErrorInfo"
,
"reason"
:
"RESOURCE_AVAILABILITY"
,
"domain"
:
"compute.googleapis.com"
,
"metadata"
:
{
"zone"
:
"us-east1-a"
,
"vmType"
:
"e2-medium"
,
"attachment"
:
"local-ssd=3,nvidia-t4=2"
,
"zonesWithCapacity"
:
"us-central1-f,us-central1-c"
}
},
{
"@type"
:
"type.googleapis.com/google.rpc.LocalizedMessage"
,
"locale"
:
"en-US"
,
"message"
:
"An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone. Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."
},
{
"@type"
:
"type.googleapis.com/google.rpc.Help"
,
"links"
:
[
{
"description"
:
"Additional information on this error"
,
"url"
:
"https://cloud.google.com/compute/docs/resource-error"
}
]
}
]
}
}
Rationale
Requiring ErrorInfo
ErrorInfo
is required because it further identifies an error. With
only approximately twenty available values
for Status.status
,
it is difficult to disambiguate one error from another across an entire API Service
.
Also, error messages often contain dynamic segments that express variable information, so there needs to be machine-readable component of every error response that enables clients to use such information programmatically.
Including LocalizedMessage
LocalizedMessage
was selected as the location to present alternate
error messages. While LocalizedMessage
mayuse a locale specified
in the request, a service mayprovide a LocalizedMessage
even without
a user-specified locale, typically to provide a better error message in situations where Status.message
cannot be changed
.
Where the locale is not specified by the user, it shouldbe en-US
(US English).
A service mayinclude LocalizedMessage
even when the same message is
provided in Status.message
and when localization into a user-specified locale
is not supported. Reasons for this include:
- An intention to support user-specified localization in the near future, allowing
clients to consistently use
LocalizedMessage
and not change their error-reporting code when the functionality is introduced. - Consistency across all RPCs within a service: if some RPCs include
LocalizedMessage
and some only useStatus.message
for error messages, clients have to be aware of which RPCs will do what, or implement a fall-back mechanism. ProvidingLocalizedMessage
on all RPCs allows simple and consistent client code to be written.
Updating Status.message
If a client has ever observed an error with Status.message
populated
(which it always will be) but without ErrorInfo
, the developer of that client
may well have had to resort to parsing Status.message
in order to find out
information beyond just what Status.code
conveys. That information may be
found by matching specific text (e.g. "Connection closed with unknown cause")
or by parsing the message to find out metadata values (e.g. a region with
insufficient resources). At that point, Status.message
is implicitly part
of the API contract, so must notbe updated - that would be a breaking
change. This is one reason for introducing LocalizedMessage
into the Status.details
.
RPCs which have alwaysincluded ErrorInfo
are in a better position:
the contract is then more about the stability of ErrorInfo
for any given
error. The reason and domain need to be consistent over time, and the
metadata provided for any given (reason,domain) can only be expanded.
It's still possible that clients could be parsing Status.message
instead of
using ErrorInfo
, but they will always have had a more robust option
available to them.
Further reading
- For which error codes to retry, see AIP-194 .
- For how to retry errors in client libraries, see AIP-4221 .
Changelog
- 2024-10-18: Rewrite/restructure for clarity.
- 2024-01-10: Incorporate guidance for writing effective messages.
- 2023-05-17: Change the recommended language for
Status.message
to be the service's native language rather than English. - 2023-05-17: Specify requirements for changing error messages.
- 2023-05-10: Require
ErrorInfo
for all error responses. - 2023-05-04: Require uniqueness by message type for error details.
- 2022-11-04: Added guidance around PERMISSION_DENIED errors previously found in other AIPs.
- 2022-08-12: Reworded/Simplified intro to add clarity to the intent.
- 2020-01-22: Added a reference to the
ErrorInfo
message. - 2019-10-14: Added guidance restricting error message mutability to if there is a machine-readable identifier present.
- 2019-09-23: Added guidance about error message strings being able to change.