AIP-193

Errors

Effective error communication is an important part of designing simple and intuitive APIs. Services returning standardized error responses enable API clients to construct centralized common error handling logic. This common logic simplifies API client applications and eliminates the need for cumbersome custom error handling code.

Guidance

Services must return a google.rpc.Status message when an API error occurs, and must use the canonical error codes defined in google.rpc.Code. More information about the particular codes is available in the gRPC status code documentation.

Error messages should help a reasonably technical user understand and resolve the issue, and should not assume that the user is an expert in your particular API. Additionally, error messages must not assume that the user will know anything about the API's underlying implementation.

Error messages should be brief but actionable. Any extra information should be provided in the details field. If even more information is necessary, you should provide a link where a reader can get more information or ask questions to help resolve the issue. It is also important to set the right tone when writing messages.

The following sections describe the fields of google.rpc.Status.

Status.message

The message field is a developer-facing, human-readable "debug message" which should be in English. (Localized messages are expressed using a LocalizedMessage within the details field. See LocalizedMessage for more details.) Any dynamic aspects of the message must be included as metadata within the ErrorInfo that appears in details.

The message is considered a problem description. It is intended for developers to understand the problem and is more detailed than ErrorInfo.reason, discussed later.

Messages should use simple, descriptive language that is easy to understand (without technical jargon) to clearly state the problem that caused the error, and should offer an actionable resolution to it.

For pre-existing (brownfield) APIs which have previously returned errors without machine-readable identifiers, the value of message must remain the same for any given error. For more information, see Changing Error Messages.

Status.code

The code field is the status code, which must be the numeric value of one of the elements of the google.rpc.Code enum.

For example, the value 5 is the numeric value of the NOT_FOUND enum element.
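As an illustration, the numeric values can be mirrored in client code with an enum. The sketch below is a hand-written Python IntEnum; the class name Code and the helper code_name are our own choices for this example, and the values are copied from the google.rpc.Code definition. In practice, generated protobuf classes would normally be preferred over a hand-maintained copy.

```python
from enum import IntEnum


class Code(IntEnum):
    """Hand-written mirror of google.rpc.Code (illustrative only)."""

    OK = 0
    CANCELLED = 1
    UNKNOWN = 2
    INVALID_ARGUMENT = 3
    DEADLINE_EXCEEDED = 4
    NOT_FOUND = 5
    ALREADY_EXISTS = 6
    PERMISSION_DENIED = 7
    RESOURCE_EXHAUSTED = 8
    FAILED_PRECONDITION = 9
    ABORTED = 10
    OUT_OF_RANGE = 11
    UNIMPLEMENTED = 12
    INTERNAL = 13
    UNAVAILABLE = 14
    DATA_LOSS = 15
    UNAUTHENTICATED = 16


def code_name(value: int) -> str:
    """Return the enum element name for a numeric Status.code value."""
    return Code(value).name
```

With this sketch, code_name(5) yields the element name for the value used in the example above.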

Status.details

The details field allows messages with additional error information to be included in the error response, each packed in a google.protobuf.Any message.

Google defines a set of standard detail payloads for error details, which cover most common needs for API errors. Services should use these standard detail payloads when feasible.

Each type of detail payload must be included at most once. For example, there must not be more than one BadRequest message in the details, but there may be a BadRequest and a PreconditionFailure.
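The "at most once" rule is easy to check mechanically. The following is a minimal sketch that assumes the details are available in their JSON form, where each payload carries an "@type" key (as in the HTTP/1.1+JSON representation later in this document). The function name duplicated_detail_types is hypothetical.

```python
from collections import Counter
from typing import Any, Dict, List


def duplicated_detail_types(details: List[Dict[str, Any]]) -> List[str]:
    """Return the "@type" values that occur more than once in `details`.

    An empty result means the list satisfies the at-most-once rule.
    """
    counts = Counter(detail.get("@type", "") for detail in details)
    return sorted(type_url for type_url, n in counts.items() if n > 1)
```

A service could run a check like this in a test or a response interceptor before the error leaves the server.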

All error responses must include an ErrorInfo within details. This provides machine-readable identifiers so that users can write code against specific aspects of the error.

The following sections describe the most common standard detail payloads.

ErrorInfo

The ErrorInfo message is the primary way to send a machine-readable identifier. Contextual information should be included in metadata in ErrorInfo and must be included if it appears within an error message.

The reason field is a short UPPER_SNAKE_CASE description of the cause of the error. Error reasons are unique within a particular domain of errors. The reason must be at most 63 characters and match the regular expression [A-Z][A-Z0-9_]+[A-Z0-9]. (This is UPPER_SNAKE_CASE, without leading or trailing underscores, and without leading digits.)

The reason should be terse, but meaningful enough for a human reader to understand what the reason refers to.

Good examples:

CPU_AVAILABILITY
NO_STOCK
CHECKED_OUT
AVAILABILITY_ERROR

Bad examples:

THE_BOOK_YOU_WANT_IS_NOT_AVAILABLE (overly verbose)
ERROR (too general)
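The syntactic constraints on reason (length and pattern) can be checked in code. This is a minimal sketch using the regular expression and the 63-character limit stated above; the function name is_valid_reason is our own. Note that a syntax check cannot judge quality: both bad examples above are syntactically valid.

```python
import re

# Pattern and length limit as stated in the guidance above.
_REASON_RE = re.compile(r"[A-Z][A-Z0-9_]+[A-Z0-9]")
_MAX_REASON_LENGTH = 63


def is_valid_reason(reason: str) -> bool:
    """Check only the syntax of an ErrorInfo.reason value."""
    if len(reason) > _MAX_REASON_LENGTH:
        return False
    # fullmatch requires the whole string to match, not just a prefix.
    return _REASON_RE.fullmatch(reason) is not None
```

One consequence of the pattern as written is that a reason needs at least three characters.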

The domain field is the logical grouping to which the reason belongs. The domain must be a globally unique value, and is typically the name of the service that generated the error, e.g. pubsub.googleapis.com.

The (reason, domain) pair forms a machine-readable way of identifying a particular error. Services must use the same (reason, domain) pair for the same error, and must not use the same (reason, domain) pair for logically different errors. The decision about whether two errors are "the same" or not is not always clear, but should generally be considered in terms of the expected action a client might take to resolve them.

The metadata field is a map of key/value pairs providing additional dynamic information as context. Each key within metadata must be at most 64 characters long, and conform to the regular expression [a-z][a-zA-Z0-9-_]+.
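A check for metadata keys could look like the sketch below. The function name invalid_metadata_keys is hypothetical. The character class is written with the hyphen last, which accepts the same characters as the expression above but avoids any ambiguity about the hyphen being read as a range.

```python
import re
from typing import Dict, List

# Same character set as [a-z][a-zA-Z0-9-_]+, with the hyphen moved last.
_KEY_RE = re.compile(r"[a-z][a-zA-Z0-9_-]+")
_MAX_KEY_LENGTH = 64


def invalid_metadata_keys(metadata: Dict[str, str]) -> List[str]:
    """Return the keys that break the length or pattern constraint."""
    return [
        key
        for key in metadata
        if len(key) > _MAX_KEY_LENGTH or _KEY_RE.fullmatch(key) is None
    ]
```

As with reason, the pattern as written requires at least two characters, so a single-letter key is rejected.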

Any request-specific information which contributes to the Status.message or LocalizedMessage.message messages must be represented within metadata. This practice is critical so that machine actors do not need to parse error messages to extract information.

For example, consider the following message:

An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone. Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation.

The ErrorInfo.metadata map for the same error could be:

"zone": "us-east1-a"
"vmType": "e2-medium"
"attachment": "local-ssd=3,nvidia-t4=2"
"zonesWithCapacity": "us-central1-f,us-central1-c"

Additional contextual information that does not appear in an error message may also be included in metadata to allow programmatic use by the client.
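One way to guarantee that every dynamic segment of a message also appears in metadata is to generate the message from the metadata. The following is a sketch of that idea, not a prescribed implementation: TEMPLATE is a shortened, hypothetical version of the message above, render_message is our own helper, and the placeholder names double as metadata keys.

```python
from string import Formatter
from typing import Dict

# Hypothetical template; each placeholder name is also a metadata key.
TEMPLATE = (
    "An {vmType} VM instance with {attachment} is currently unavailable "
    "in the {zone} zone. Consider trying your request in the "
    "{zonesWithCapacity} zone(s)."
)


def render_message(template: str, metadata: Dict[str, str]) -> str:
    """Fill the template from metadata, failing if a placeholder has no key."""
    needed = {name for _, name, _, _ in Formatter().parse(template) if name}
    missing = needed - set(metadata)
    if missing:
        raise KeyError(f"metadata is missing keys: {sorted(missing)}")
    # Extra metadata keys that the template does not use are allowed.
    return template.format(**metadata)
```

Because the text is derived from the map, a dynamic value cannot appear in the message without also being available to machine clients.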

The metadata included for any given (reason, domain) pair can evolve over time:

New keys may be included
All keys that have been included must continue to be included (but may have empty values)

In other words, once a user has observed a given key for a (reason, domain) pair, the service must allow them to rely on it continuing to be present in the future.
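The evolution rule above amounts to a subset check: the keys previously emitted for a (reason, domain) pair must all still be present. This is a minimal sketch, assuming the service keeps a record of previously emitted keys (for example in a golden file used by tests); the function name removed_metadata_keys is hypothetical.

```python
from typing import Iterable, List


def removed_metadata_keys(previous: Iterable[str], current: Iterable[str]) -> List[str]:
    """Return keys that were emitted before but are missing now.

    A non-empty result indicates a breaking change for the given
    (reason, domain) pair; newly added keys are always acceptable.
    """
    return sorted(set(previous) - set(current))
```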

The set of keys provided in each (reason, domain) pair is independent from other pairs, but services should aim for consistent key naming. For example, two error reasons within the same domain should not use metadata keys of vmType and virtualMachineType.

LocalizedMessage

google.rpc.LocalizedMessage is used to provide an error message which should be localized to a user-specified locale where possible.

If the Status.message field has a sub-optimal value which cannot be changed due to the constraints in the Changing Error Messages section, LocalizedMessage may be used to provide a better error message even when no user-specified locale is available.

Regardless of how the locale for the message was determined, both the locale and message fields must be populated.

The locale field specifies the locale of the message, following IETF BCP 47 (Tags for Identifying Languages). Example values: "en-US", "fr-CH", "es-MX".

The message field contains the localized text itself. This should include a brief description of the error and a call to action to resolve the error. The message should include contextual information to make the message as specific as possible. Any contextual information in the message must be included in ErrorInfo.metadata. See ErrorInfo for more details of how contextual information may be included in a message and the corresponding metadata.

The LocalizedMessage payload should contain the complete resolution to the error. If more information is needed than can reasonably fit in this payload, then additional resolution information must be provided in a Help payload. See the Help section for guidance.

Help

When other textual error messages (in Status.message or LocalizedMessage.message) don't provide the user sufficient context or actionable next steps, or if there are multiple points of failure that need to be considered in troubleshooting, a link to supplemental troubleshooting documentation must be provided in the Help payload.

Provide this information in addition to a clear problem definition and actionable resolution, not as an alternative to them. The linked documentation must clearly relate to the error. If a single page contains information about multiple errors, the ErrorInfo.reason value must be used to narrow down the relevant information.

The description field is a textual description of the linked information. This must be suitable to display to a user as text for a hyperlink. This must be plain text (not HTML, Markdown, etc.).

Example description value: "Troubleshooting documentation for STOCKOUT errors"

The url field is the URL to link to. This must be an absolute URL, including scheme.

Example url value: "https://cloud.google.com/compute/docs/resource-error"
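The "absolute URL, including scheme" requirement can be approximated with the Python standard library. This is a sketch rather than a full URL validator, and the function name is_absolute_url is our own: it only checks that both a scheme and a host part are present.

```python
from urllib.parse import urlsplit


def is_absolute_url(url: str) -> bool:
    """Return True if `url` has both a scheme and a network location."""
    parts = urlsplit(url)
    return bool(parts.scheme) and bool(parts.netloc)
```

A path-only link, or a host name without a scheme, fails this check and would not be suitable for the url field.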

For publicly-documented services, even those with access controls on actual usage, the linked content must be accessible without authentication.

For privately-documented services, the linked content may require authentication.

Error messages

Textual error messages can be present in both Status.message and LocalizedMessage.message fields. Messages should be succinct but actionable, with request-specific information (such as a resource name or region) providing precise details where appropriate. Any request-specific details must be present in ErrorInfo.metadata.

Changing error messages

Changing the content of Status.message over time must be done carefully, to avoid breaking clients who have previously had to rely on the message for all information. See the rationale section for more details.

For a given RPC:

If the RPC has always returned ErrorInfo with machine-readable information, the content of Status.message may change over time. (For example, the API producer may provide a clearer explanation, or more request-specific information.)
Otherwise, the content of Status.message must be stable, providing the same text with the same request-specific information. Instead of changing Status.message, the API should include a LocalizedMessage within Status.details.

Even if an RPC has always returned ErrorInfo, the API may keep the existing Status.message stable and add a LocalizedMessage within Status.details.

The content of LocalizedMessage.message may change over time.

Partial errors

APIs should not support partial errors. Partial errors add significant complexity for users, because they usually sidestep the use of error codes, or move those error codes into the response message, where the user must write specialized error handling logic to address the problem.

However, occasionally partial errors are necessary, particularly in bulk operations where it would be hostile to users to fail an entire large request because of a problem with a single entry.

Methods that require partial errors should use long-running operations, and the method should put partial failure information in the metadata message. The errors themselves must still be represented with a google.rpc.Status object.

Permission Denied

If the user does not have permission to access the resource or parent, regardless of whether or not it exists, the service must return a PERMISSION_DENIED error (HTTP 403). Permission must be checked prior to checking if the resource or parent exists.

If the user does have proper permission, but the requested resource or parent does not exist, the service must return a NOT_FOUND error (HTTP 404).
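The ordering of the two checks is the important part: existence must not be revealed to a caller who lacks permission. The sketch below uses a deliberately simplified, hypothetical in-memory model (a map of user to permitted resource names, and a set of existing resource names) purely to illustrate that ordering; real services would consult their authorization system instead.

```python
from typing import Dict, Set

# Numeric values from google.rpc.Code.
OK = 0
NOT_FOUND = 5
PERMISSION_DENIED = 7


def access_check(
    user: str,
    resource: str,
    permissions: Dict[str, Set[str]],
    existing: Set[str],
) -> int:
    """Return the status code for an access attempt."""
    # Permission is checked first, whether or not the resource exists.
    if resource not in permissions.get(user, set()):
        return PERMISSION_DENIED
    # Only a permitted caller learns that the resource is missing.
    if resource not in existing:
        return NOT_FOUND
    return OK
```

In this model a caller without permission receives PERMISSION_DENIED for both existing and missing resources, so the response leaks nothing about existence.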

HTTP/1.1+JSON representation

When clients use HTTP/1.1 as per AIP-127, the error information is returned in the body of the response, as a JSON object. For backward compatibility reasons, this does not map precisely to google.rpc.Status, but contains the same core information. The schema is defined in the following proto:

    message Error {
      message Status {
        // The HTTP status code that corresponds to `google.rpc.Status.code`.
        int32 code = 1;
        // This corresponds to `google.rpc.Status.message`.
        string message = 2;
        // This is the enum version for `google.rpc.Status.code`.
        google.rpc.Code status = 4;
        // This corresponds to `google.rpc.Status.details`.
        repeated google.protobuf.Any details = 5;
      }

      Status error = 1;
    }

The most important difference is that the code field in the JSON is an HTTP status code, not the direct value of google.rpc.Status.code. For example, a google.rpc.Status message with a code value of 5 would be mapped to an object including the following code-related fields (as well as the message, details, etc.):

    {
      "error": {
        "code": 404,            // The HTTP status code for "not found"
        "status": "NOT_FOUND"   // The name in google.rpc.Code for value 5
      }
    }
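The mapping described above can be sketched as a small conversion function. The names to_http_error and _HTTP_MAPPING are hypothetical, and the table is deliberately a subset: the 404, 403 and 429 entries match values used elsewhere in this document, while the remaining entries are taken from the HTTP mapping comments in the google.rpc.Code definition and should be checked against it.

```python
from typing import Any, Dict, List, Optional

# Subset of google.rpc.Code value -> (enum name, HTTP status code).
_HTTP_MAPPING = {
    3: ("INVALID_ARGUMENT", 400),
    5: ("NOT_FOUND", 404),
    7: ("PERMISSION_DENIED", 403),
    8: ("RESOURCE_EXHAUSTED", 429),
    13: ("INTERNAL", 500),
    14: ("UNAVAILABLE", 503),
}


def to_http_error(
    code: int,
    message: str,
    details: Optional[List[Dict[str, Any]]] = None,
) -> Dict[str, Any]:
    """Build the HTTP/1.1+JSON error body for a google.rpc.Status."""
    name, http_status = _HTTP_MAPPING[code]
    return {
        "error": {
            "code": http_status,  # HTTP status code, not Status.code
            "message": message,
            "status": name,  # enum element name for Status.code
            "details": list(details or []),
        }
    }
```

A Status with code 5 therefore produces a body whose code is 404 and whose status is "NOT_FOUND".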

The following JSON shows a fully populated HTTP/1.1+JSON representation of an error response.

    {
      "error": {
        "code": 429,
        "message": "The zone 'us-east1-a' does not have enough resources available to fulfill the request. Try a different zone, or try again later.",
        "status": "RESOURCE_EXHAUSTED",
        "details": [
          {
            "@type": "type.googleapis.com/google.rpc.ErrorInfo",
            "reason": "RESOURCE_AVAILABILITY",
            "domain": "compute.googleapis.com",
            "metadata": {
              "zone": "us-east1-a",
              "vmType": "e2-medium",
              "attachment": "local-ssd=3,nvidia-t4=2",
              "zonesWithCapacity": "us-central1-f,us-central1-c"
            }
          },
          {
            "@type": "type.googleapis.com/google.rpc.LocalizedMessage",
            "locale": "en-US",
            "message": "An <e2-medium> VM instance with <local-ssd=3,nvidia-t4=2> is currently unavailable in the <us-east1-a> zone. Consider trying your request in the <us-central1-f,us-central1-c> zone(s), which currently has/have capacity to accommodate your request. Alternatively, you can try your request again with a different VM hardware configuration or at a later time. For more information, see the troubleshooting documentation."
          },
          {
            "@type": "type.googleapis.com/google.rpc.Help",
            "links": [
              {
                "description": "Additional information on this error",
                "url": "https://cloud.google.com/compute/docs/resource-error"
              }
            ]
          }
        ]
      }
    }
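On the client side, a response like the one above lets code branch on the (reason, domain) pair and read metadata instead of parsing message text. The following is a minimal sketch using only the standard library; find_error_info and suggested_zones are hypothetical helper names, and the comma-separated format of zonesWithCapacity is assumed from the example above.

```python
import json
from typing import Any, Dict, List, Optional

ERROR_INFO_TYPE = "type.googleapis.com/google.rpc.ErrorInfo"


def find_error_info(body: str) -> Optional[Dict[str, Any]]:
    """Return the ErrorInfo payload from a JSON error body, if present."""
    error = json.loads(body).get("error", {})
    for detail in error.get("details", []):
        if detail.get("@type") == ERROR_INFO_TYPE:
            return detail
    return None


def suggested_zones(body: str) -> List[str]:
    """Return alternative zones for a RESOURCE_AVAILABILITY error."""
    info = find_error_info(body)
    if info is None:
        return []
    # Identify the error by its (domain, reason) pair, never by message text.
    pair = (info.get("domain"), info.get("reason"))
    if pair != ("compute.googleapis.com", "RESOURCE_AVAILABILITY"):
        return []
    zones = info.get("metadata", {}).get("zonesWithCapacity", "")
    return [zone for zone in zones.split(",") if zone]
```

A client could use the returned zones to retry the request elsewhere, and fall back to generic handling based on the status when the list is empty.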

Rationale

Requiring ErrorInfo

ErrorInfo is required because it further identifies an error. With only approximately twenty available values for Status.code, it is difficult to disambiguate one error from another across an entire API Service.

Also, error messages often contain dynamic segments that express variable information, so there needs to be a machine-readable component of every error response that enables clients to use such information programmatically.

Including LocalizedMessage

LocalizedMessage was selected as the location to present alternate error messages. While LocalizedMessage may use a locale specified in the request, a service may provide a LocalizedMessage even without a user-specified locale, typically to provide a better error message in situations where Status.message cannot be changed. Where the locale is not specified by the user, it should be en-US (US English).

A service may include LocalizedMessage even when the same message is provided in Status.message and when localization into a user-specified locale is not supported. Reasons for this include:

An intention to support user-specified localization in the near future, allowing clients to consistently use LocalizedMessage and not change their error-reporting code when the functionality is introduced.
Consistency across all RPCs within a service: if some RPCs include LocalizedMessage and some only use Status.message for error messages, clients have to be aware of which RPCs will do what, or implement a fall-back mechanism. Providing LocalizedMessage on all RPCs allows simple and consistent client code to be written.

Updating Status.message

If a client has ever observed an error with Status.message populated (which it always will be) but without ErrorInfo, the developer of that client may well have had to resort to parsing Status.message in order to find out information beyond just what Status.code conveys. That information may be found by matching specific text (e.g. "Connection closed with unknown cause") or by parsing the message to find out metadata values (e.g. a region with insufficient resources). At that point, Status.message is implicitly part of the API contract, so it must not be updated: that would be a breaking change. This is one reason for introducing LocalizedMessage into the Status.details.

RPCs which have always included ErrorInfo are in a better position: the contract is then more about the stability of ErrorInfo for any given error. The reason and domain need to be consistent over time, and the metadata provided for any given (reason, domain) can only be expanded. It's still possible that clients could be parsing Status.message instead of using ErrorInfo, but they will always have had a more robust option available to them.

Changelog

2024-10-18: Rewrite/restructure for clarity.
2024-01-10: Incorporate guidance for writing effective messages.
2023-05-17: Change the recommended language for Status.message to be the service's native language rather than English.
2023-05-17: Specify requirements for changing error messages.
2023-05-10: Require ErrorInfo for all error responses.
2023-05-04: Require uniqueness by message type for error details.
2022-11-04: Added guidance around PERMISSION_DENIED errors previously found in other AIPs.
2022-08-12: Reworded/Simplified intro to add clarity to the intent.
2020-01-22: Added a reference to the ErrorInfo message.
2019-10-14: Added guidance restricting error message mutability to if there is a machine-readable identifier present.
2019-09-23: Added guidance about error message strings being able to change.