Google BigQuery Storage: Node.js Client
Client for the BigQuery Storage API
A comprehensive list of changes in each version may be found in the CHANGELOG.
- Google BigQuery Storage Node.js Client API Reference
- Google BigQuery Storage Documentation
- github.com/googleapis/nodejs-bigquery-storage
Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.
Quickstart
Before you begin
- Select or create a Cloud Platform project.
- Enable billing for your project.
- Enable the Google BigQuery Storage API.
- Set up authentication with a service account so you can access the API from your local workstation.
Installing the client library
npm install @google-cloud/bigquery-storage
Using the client library
// The read stream contains blocks of Avro-encoded bytes. We use the
// 'avsc' library to decode these blocks. Install avsc with the following
// command: npm install avsc
const avro = require('avsc');
// See reference documentation at
// https://cloud.google.com/bigquery/docs/reference/storage
const {BigQueryReadClient} = require('@google-cloud/bigquery-storage');
const client = new BigQueryReadClient();
async function bigqueryStorageQuickstart() {
// Get current project ID. The read session is created in this project.
// This project can be different from that which contains the table.
const myProjectId = await client.getProjectId();
// This example reads baby name data from the public datasets.
const projectId = 'bigquery-public-data';
const datasetId = 'usa_names';
const tableId = 'usa_1910_current';
const tableReference = `projects/${projectId}/datasets/${datasetId}/tables/${tableId}`;
const parent = `projects/${myProjectId}`;
/* We limit the output columns to a subset of those allowed in the table,
* and set a simple filter to only report names from the state of
* Washington (WA).
*/
const readOptions = {
selectedFields: ['name', 'number', 'state'],
rowRestriction: 'state = "WA"',
};
let tableModifiers = null;
const snapshotSeconds = 0;
// Set a snapshot time if it's been specified.
if (snapshotSeconds > 0) {
tableModifiers = {snapshotTime: {seconds: snapshotSeconds}};
}
// API request.
const request = {
parent,
readSession: {
table: tableReference,
// This API can also deliver data serialized in Apache Arrow format.
// This example leverages Apache Avro.
dataFormat: 'AVRO',
readOptions,
tableModifiers,
},
};
const [session] = await client.createReadSession(request);
const schema = JSON.parse(session.avroSchema.schema);
const avroType = avro.Type.forSchema(schema);
/* The offset requested must be less than the last
* row read from ReadRows. Requesting a larger offset is
* undefined.
*/
let offset = 0;
const readRowsRequest = {
// Required stream name and optional offset. Offset requested must be less than
// the last row read from readRows(). Requesting a larger offset is undefined.
readStream: session.streams[0].name,
offset,
};
const names = new Set();
const states = [];
/* We'll use only a single stream for reading data from the table. Because
* of dynamic sharding, this will yield all the rows in the table. However,
* if you wanted to fan out multiple readers you could do so by having a
* reader process each individual stream.
*/
client
.readRows(readRowsRequest)
.on('error', console.error)
.on('data', data => {
offset = data.avroRows.serializedBinaryRows.offset;
try {
// Decode all rows in buffer
let pos;
do {
const decodedData = avroType.decode(
data.avroRows.serializedBinaryRows,
pos
);
if (decodedData.value) {
names.add(decodedData.value.name);
if (!states.includes(decodedData.value.state)) {
states.push(decodedData.value.state);
}
}
pos = decodedData.offset;
} while (pos > 0);
} catch (error) {
console.log(error);
}
})
.on('end', () => {
console.log(`Got ${names.size} unique names in states: ${states}`);
console.log(`Last offset: ${offset}`);
});
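}
// Run the quickstart and report any failure from session creation.
bigqueryStorageQuickstart().catch(console.error);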
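The quickstart comments note that multiple readers could fan out by having each one process an individual stream. A minimal sketch of that idea follows, assuming a `client` and `session` shaped like the ones in the quickstart. The helper name `readAllStreams` and its `onData` callback are hypothetical and not part of the library. Only the `readRows` call and its `error`, `data`, and `end` events are taken from the quickstart.

```javascript
// Hypothetical helper (not part of @google-cloud/bigquery-storage):
// starts one reader per stream in the session and resolves once every
// stream has emitted 'end'. Rejects on the first stream error.
function readAllStreams(client, session, onData) {
  const readers = session.streams.map(
    stream =>
      new Promise((resolve, reject) => {
        client
          // Same request shape as the quickstart: stream name plus offset.
          .readRows({readStream: stream.name, offset: 0})
          .on('error', reject)
          // Pass the stream name along so the caller can tell readers apart.
          .on('data', data => onData(stream.name, data))
          .on('end', resolve);
      })
  );
  return Promise.all(readers);
}
```

Inside the quickstart this could be called as `await readAllStreams(client, session, (name, data) => { /* decode data.avroRows as shown above */ })`. Each callback invocation receives one response block, so the Avro decoding loop from the quickstart would move into the callback.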
Samples
Samples are in the samples/ directory. Each sample's README.md has instructions for running its sample.
Sample | Source Code |
---|---|
Append_rows_pending | source code |
Append_rows_proto2 | source code |
Customer_record_pb | source code |
BigQuery Storage Quickstart | source code |
Sample_data_pb | source code |
The Google BigQuery Storage Node.js Client API Reference documentation also contains samples.
Supported Node.js Versions
Our client libraries follow the Node.js release schedule. Libraries are compatible with all current active and maintenance versions of Node.js. If you are using an end-of-life version of Node.js, we recommend that you update as soon as possible to an actively supported LTS version.
Google's client libraries support legacy versions of Node.js runtimes on a best-efforts basis with the following warnings:
- Legacy versions are not tested in continuous integration.
- Some security patches and features cannot be backported.
- Dependencies cannot be kept up-to-date.
Client libraries targeting some end-of-life versions of Node.js are available, and can be installed through npm dist-tags. The dist-tags follow the naming convention legacy-(version). For example, npm install @google-cloud/bigquery-storage@legacy-8 installs client libraries for versions compatible with Node.js 8.
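The legacy-(version) convention can be applied mechanically to a Node.js version string. The sketch below uses a hypothetical helper, `legacyInstallSpec`, that only builds the package spec. It does not check whether that dist-tag has actually been published for a given version.

```javascript
// Hypothetical helper: derives the legacy dist-tag install spec from a
// Node.js version string such as process.versions.node ("8.17.0") or
// process.version ("v8.17.0"). It does not verify that the tag exists.
function legacyInstallSpec(
  nodeVersion,
  pkg = '@google-cloud/bigquery-storage'
) {
  const major = Number.parseInt(
    String(nodeVersion).replace(/^v/, '').split('.')[0],
    10
  );
  if (!Number.isInteger(major) || major < 0) {
    throw new Error(`Unrecognized Node.js version: ${nodeVersion}`);
  }
  return `${pkg}@legacy-${major}`;
}

// Prints: npm install @google-cloud/bigquery-storage@legacy-8
console.log(`npm install ${legacyInstallSpec('8.17.0')}`);
```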
Versioning
This library follows Semantic Versioning.
This library is considered to be stable. The code surface will not change in backwards-incompatible ways unless absolutely necessary (e.g. because of critical security issues) or with an extensive deprecation period. Issues and requests against stable libraries are addressed with the highest priority.
More Information: Google Cloud Platform Launch Stages
Contributing
Contributions welcome! See the Contributing Guide.
Please note that this README.md, the samples/README.md, and a variety of configuration files in this repository (including .nycrc and tsconfig.json) are generated from a central template. To edit one of these files, make an edit to its templates in directory.
License
Apache Version 2.0
See LICENSE