GoogleSQL for Bigtable supports collation. Collation defines rules to sort and compare strings in certain operations , such as conditional expressions, joins, and groupings.
By default, GoogleSQL sorts strings case-sensitively. This means that a
and A
are treated as different letters, and Z
would come before a
.
Example default sorting:Apple, Zebra, apple
By contrast, collation lets you sort and compare strings case-insensitively or according to specific language rules.
Example case-insensitive collation:Apple, apple, Zebra
Collation is useful when you need fine-tuned control over how values are sorted, joined, or grouped in tables.
Operations that propagate collation
Collation can pass through some query operations to other parts of a query. When collation passes through an operation in a query, this is known as propagation . During propagation:
- If an input contains no collation specification or an empty collation specification and another input contains an explicitly defined collation, the explicitly defined collation is used for all of the inputs.
- All inputs with a non-empty explicitly defined collation specification must have the same type of collation specification, otherwise an error is thrown.
GoogleSQL has several functions , operators , and expressions that can propagate collation.
Functions
The following example functions propagate collation.
| Function | Notes |
|---|---|
ANY_VALUE
|
|
ARRAY_AGG
|
Collation on input arguments are propagated as collation on the array element. |
ARRAY_FIRST
|
|
ARRAY_LAST
|
|
ARRAY_OFFSET
|
|
ARRAY_OFFSETS
|
|
ARRAY_SLICE
|
|
ARRAY_TO_STRING
|
Collation on array elements are propagated to output. |
CONCAT
|
|
FORMAT
|
Collation from format_string
to the returned string is propagated. |
FORMAT_DATE
|
Collation from format_string
to the returned string is propagated. |
FORMAT_DATETIME
|
Collation from format_string
to the returned string is propagated. |
FORMAT_TIME
|
Collation from format_string
to the returned string is propagated. |
FORMAT_TIMESTAMP
|
Collation from format_string
to the returned string is propagated. |
GREATEST
|
|
LEAST
|
|
LEFT
|
|
LOWER
|
|
LPAD
|
|
MAX
|
|
MIN
|
|
NORMALIZE
|
|
NORMALIZE_AND_CASEFOLD
|
|
NULLIFERROR
|
|
REPEAT
|
|
REPLACE
|
|
REVERSE
|
|
RIGHT
|
|
RPAD
|
|
SOUNDEX
|
|
SPLIT
|
Collation on input arguments are propagated as collation on the array element. |
STRING_AGG
|
|
SUBSTR
|
|
UPPER
|
Operators
The following example operators propagate collation.
| Operator | Notes |
|---|---|
||
concatenation operator
|
|
| Array subscript operator | Propagated to output. |
STRUCT
field access operator
|
When getting a STRUCT
, collation on the STRUCT
field is propagated as the output collation. |
UNNEST
|
Collation on the input array element is propagated to output. |
Expressions
The following example expressions propagate collation.
| Expression | Notes |
|---|---|
ARRAY
|
When you construct an ARRAY
, collation on input arguments is propagated on the elements in the ARRAY
. |
CASE
|
|
CASE
expr
|
|
IF
|
|
IFNULL
|
|
NULLIF
|
|
STRUCT
|
When you construct a STRUCT
, collation on input arguments is propagated on the fields in the STRUCT
. |
Collation specification details
A collation specification determines how strings are sorted and compared in collation-supported operations . Binary collation is used by default and indicates that the operation should return data in Unicode code point order . You can't set binary collation explicitly.

