Supported protocol buffer and Arrow data types

This document describes the supported protocol buffer and Arrow data types for each respective BigQuery data type. Before reading this document, read Overview of the BigQuery Storage Write API .

Supported protocol buffer data types

The following table shows the supported data types in protocol buffers and the corresponding input format in BigQuery:

BigQuery data type
Supported protocol buffer types
BOOL
bool , int32 , int64 , uint32 , uint64 , google.protobuf.BoolValue
BYTES
bytes , string , google.protobuf.BytesValue
DATE
int32 (preferred), int64 , string

The value is the number of days since the Unix epoch (1970-01-01). The valid range is -719162 (0001-01-01) to 2932896 (9999-12-31).

DATETIME , TIME
string

The value must be a DATETIME or TIME literal.

int64

Use the CivilTimeEncoder class to perform the conversion.

FLOAT
double , float , google.protobuf.DoubleValue , google.protobuf.FloatValue
GEOGRAPHY
string

The value is a geometry in either WKT or GeoJson format.

INTEGER
int32 , int64 , uint32 , enum , google.protobuf.Int32Value , google.protobuf.Int64Value , google.protobuf.UInt32Value
JSON
string
NUMERIC , BIGNUMERIC
int32 , int64 , uint32 , uint64 , double , float , string
bytes , google.protobuf.BytesValue

Use the BigDecimalByteStringEncoder class to perform the conversion.

STRING
string , enum , google.protobuf.StringValue
TIME
string

The value must be a TIME literal .

TIMESTAMP
int64 (preferred), int32 , uint32 , google.protobuf.Timestamp

The value is given in microseconds since the Unix epoch (1970-01-01).

INTERVAL
string , google.protobuf.Duration

The string value must be an INTERVAL literal .

RANGE<T>
message

A nested message type in the proto with two fields, start and end , where both fields must be of the same supported protocol buffer type that corresponds to a BigQuery data type T . T must be one of DATE , DATETIME , or TIMESTAMP . If a field ( start or end ) is not set in the proto message, it represents an unbounded boundary. In the following example, f_range_date represents a RANGE column in a table. Since the end field is not set in the proto message, the end boundary of this range is unbounded.

 {
  f_range_date: {
    start: 1
  }
} 
REPEATED FIELD
array

An array type in the proto corresponds to a repeated field in BigQuery.

RECORD
message

A nested message type in the proto corresponds to a record field in BigQuery.

Supported Apache Arrow data types

The following table shows the supported data types in Apache Arrow and the corresponding input format in BigQuery.

BigQuery data type
Supported Apache Arrow types
Supported type parameters
BOOL
Boolean
BYTES
Binary
DATE
Date
unit = Day
String , int32
DATETIME
Timestamp
unit = MICROSECONDS

timezone is empty

FLOAT
FloatingPoint
Precision in {SINGLE, DOUBLE}
GEOGRAPHY
Utf8

The value is a geometry in either WKT or GeoJson format.

INTEGER
int
bitWidth in {8, 16, 32, 64}

is_signed = false

JSON
Utf8
NUMERIC
Decimal128
You can provide a NUMERIC that has any precision or scale that's smaller than the BigQuery supported range .
BIGNUMERIC
Decimal256
You can provide a BIGNUMERIC that has any precision or scale that's smaller than the BigQuery supported range .
STRING
Utf8
TIMESTAMP
Timestamp
unit= MICROSECONDS

timezone = UTC

INTERVAL
Interval
unit in {YEAR_MONTH, DAY_TIME, MONTH_DAY_NANO}
Utf8
RANGE<T>
Struct

The Arrow Struct must have two subfields named start and end .

For the RANGE<DATE> column, the fields must be Arrow type Date with unit=Day .

For the RANGE<DATETIME> column, the fields must be the Arrow type Timestamp with unit=MICROSECONDS , without the timezone.

For the RANGE<TIMESTAMP> , the fields must be the Arrow type Timestamp with unit=MICROSECONDS , timezone=UTC .

A NULL value in any of the start and end fields will be treated as UNBOUNDED .

REPEATED FIELD
List
A NULL value must be represented by an empty list.
RECORD
Struct
Design a Mobile Site
View Site in Mobile | Classic
Share by: