Google Cloud Dataproc: Node.js Client
Google Cloud Dataproc API client for Node.js
A comprehensive list of changes in each version may be found in the CHANGELOG.
- Google Cloud Dataproc Node.js Client API Reference
- Google Cloud Dataproc Documentation
- github.com/googleapis/nodejs-dataproc
Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.
Quickstart
Before you begin
- Select or create a Cloud Platform project.
- Enable billing for your project.
- Enable the Google Cloud Dataproc API.
- Set up authentication with a service account so you can access the API from your local workstation.
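Before running the quickstart, it can help to confirm that service account credentials are visible to the process. The sketch below is a hypothetical pre-flight check, not part of the client library. It assumes the key file is supplied through the `GOOGLE_APPLICATION_CREDENTIALS` environment variable; the helper name `checkCredentialsEnv` and its return shape are illustrative only.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of @google-cloud/dataproc).
// Assumption: credentials are provided via the GOOGLE_APPLICATION_CREDENTIALS
// environment variable pointing at a service account key file.
// `env` and `fileExists` are injectable so the check can be exercised
// without touching the real environment or file system.
function checkCredentialsEnv(env = process.env, fileExists = fs.existsSync) {
  const keyPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!keyPath) {
    return {ok: false, reason: 'GOOGLE_APPLICATION_CREDENTIALS is not set'};
  }
  if (!fileExists(keyPath)) {
    return {ok: false, reason: `Key file not found: ${keyPath}`};
  }
  return {ok: true, keyPath};
}
```

Other authentication mechanisms may also be in effect, so a failing check here does not necessarily mean the client cannot authenticate; it only reports on this one environment variable.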
Installing the client library
npm install @google-cloud/dataproc
Using the client library
// This quickstart sample walks a user through creating a Dataproc
// cluster, submitting a PySpark job from Google Cloud Storage to the
// cluster, reading the output of the job and deleting the cluster, all
// using the Node.js client library.
'use strict';

function main(projectId, region, clusterName, jobFilePath) {
  const dataproc = require('@google-cloud/dataproc');
  const {Storage} = require('@google-cloud/storage');

  // Create a cluster client with the endpoint set to the desired cluster region
  const clusterClient = new dataproc.v1.ClusterControllerClient({
    apiEndpoint: `${region}-dataproc.googleapis.com`,
    projectId: projectId,
  });

  // Create a job client with the endpoint set to the desired cluster region
  const jobClient = new dataproc.v1.JobControllerClient({
    apiEndpoint: `${region}-dataproc.googleapis.com`,
    projectId: projectId,
  });

  async function quickstart() {
    // Create the cluster config
    const cluster = {
      projectId: projectId,
      region: region,
      cluster: {
        clusterName: clusterName,
        config: {
          masterConfig: {
            numInstances: 1,
            machineTypeUri: 'n1-standard-2',
          },
          workerConfig: {
            numInstances: 2,
            machineTypeUri: 'n1-standard-2',
          },
        },
      },
    };

    // Create the cluster
    const [operation] = await clusterClient.createCluster(cluster);
    const [response] = await operation.promise();

    // Output a success message
    console.log(`Cluster created successfully: ${response.clusterName}`);

    // Create the job config
    const job = {
      projectId: projectId,
      region: region,
      job: {
        placement: {
          clusterName: clusterName,
        },
        pysparkJob: {
          mainPythonFileUri: jobFilePath,
        },
      },
    };

    // Submit the job and wait for it to finish
    const [jobOperation] = await jobClient.submitJobAsOperation(job);
    const [jobResponse] = await jobOperation.promise();

    // Split the driver output URI into a bucket name and an object prefix
    const matches =
      jobResponse.driverOutputResourceUri.match('gs://(.*?)/(.*)');

    // Download the job output from Cloud Storage
    const storage = new Storage();
    const output = await storage
      .bucket(matches[1])
      .file(`${matches[2]}.000000000`)
      .download();

    // Output a success message.
    console.log(`Job finished successfully: ${output}`);

    // Delete the cluster once the job has terminated.
    const deleteClusterReq = {
      projectId: projectId,
      region: region,
      clusterName: clusterName,
    };
    const [deleteOperation] = await clusterClient.deleteCluster(
      deleteClusterReq
    );
    await deleteOperation.promise();

    // Output a success message
    console.log(`Cluster ${clusterName} successfully deleted.`);
  }

  // Report failures instead of leaving the promise rejection unhandled
  quickstart().catch(err => {
    console.error(err);
    process.exitCode = 1;
  });
}

const args = process.argv.slice(2);
if (args.length !== 4) {
  console.error(
    'Incorrect number of parameters provided. Please make sure a ' +
      'PROJECT_ID, REGION, CLUSTER_NAME and JOB_FILE_PATH are provided, in this order.'
  );
  // Stop here rather than calling main() with missing arguments
  process.exit(1);
}
main(...args);
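The quickstart builds the regional endpoint string twice and indexes the result of a regular-expression match on `driverOutputResourceUri` without checking it. As a minimal sketch, those two steps could be factored into small pure helpers. The names `regionalEndpoint` and `parseGcsUri` are hypothetical and are not part of `@google-cloud/dataproc` or `@google-cloud/storage`.

```javascript
// Hypothetical helper: builds the regional endpoint string that the
// quickstart passes as `apiEndpoint` to both clients.
function regionalEndpoint(region) {
  if (typeof region !== 'string' || region.length === 0) {
    throw new TypeError('region must be a non-empty string');
  }
  return `${region}-dataproc.googleapis.com`;
}

// Hypothetical helper: splits a gs:// URI into a bucket name and an object
// name, throwing a descriptive error instead of yielding null when the URI
// does not have the expected shape.
function parseGcsUri(uri) {
  const match = /^gs:\/\/([^/]+)\/(.+)$/.exec(uri);
  if (!match) {
    throw new Error(`Not a valid Cloud Storage URI: ${uri}`);
  }
  return {bucket: match[1], object: match[2]};
}
```

With helpers like these, the download step could use `parseGcsUri(jobResponse.driverOutputResourceUri)` and fail with a clear message if the job response carries an unexpected URI.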
Samples
Samples are in the samples/ directory. Each sample's README.md has instructions for running its sample.
Sample | Source Code |
---|---|
Create Cluster | source code |
Instantiate an inline workflow template | source code |
Quickstart | source code |
Submit Job | source code |
The Google Cloud Dataproc Node.js Client API Reference documentation also contains samples.
Supported Node.js Versions
Our client libraries follow the Node.js release schedule. Libraries are compatible with all current active and maintenance versions of Node.js.
Client libraries targeting some end-of-life versions of Node.js are available, and can be installed via npm dist-tags. The dist-tags follow the naming convention legacy-(version).
Legacy Node.js versions are supported as a best effort:
- Legacy versions will not be tested in continuous integration.
- Some security patches may not be able to be backported.
- Dependencies will not be kept up-to-date, and features will not be backported.
Legacy tags available
- legacy-8: install client libraries from this dist-tag for versions compatible with Node.js 8.
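To illustrate the legacy-(version) naming convention, the sketch below derives a dist-tag name from a Node.js version string and combines it with the package name. Both helper names are hypothetical. The code does not check that a tag is actually published; only legacy-8 is listed above.

```javascript
// Hypothetical helper: derives a dist-tag name from a Node.js version string
// such as '8.17.0' or 'v8.17.0', following the legacy-(version) convention.
// Defaults to the version of the running Node.js process.
function legacyDistTag(nodeVersion = process.versions.node) {
  const major = Number.parseInt(
    String(nodeVersion).replace(/^v/, '').split('.')[0],
    10
  );
  if (!Number.isInteger(major) || major < 0) {
    throw new Error(`Cannot determine Node.js major version from: ${nodeVersion}`);
  }
  return `legacy-${major}`;
}

// Hypothetical helper: builds the package spec to pass to `npm install`.
// It does not verify that the dist-tag exists in the registry.
function legacyInstallSpec(nodeVersion) {
  return `@google-cloud/dataproc@${legacyDistTag(nodeVersion)}`;
}
```

For Node.js 8, the resulting spec is `@google-cloud/dataproc@legacy-8`, which can be passed to `npm install`.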
Versioning
This library follows Semantic Versioning.
This library is considered to be General Availability (GA). This means it is stable; the code surface will not change in backwards-incompatible ways unless absolutely necessary (e.g. because of critical security issues) or with an extensive deprecation period. Issues and requests against GA libraries are addressed with the highest priority.
More Information: Google Cloud Platform Launch Stages
Contributing
Contributions welcome! See the Contributing Guide.
Please note that this README.md, the samples/README.md, and a variety of configuration files in this repository (including .nycrc and tsconfig.json) are generated from a central template. To edit one of these files, make an edit to its templates in directory.
License
Apache Version 2.0
See LICENSE