Document AI: Node.js Client
Document AI client for Node.js
A comprehensive list of changes in each version may be found in the CHANGELOG.
- Document AI Node.js Client API Reference
- Document AI Documentation
- github.com/googleapis/nodejs-document-ai
Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.
Quickstart
Before you begin
- Select or create a Cloud Platform project.
- Enable billing for your project.
- Enable the Document AI API.
- Set up authentication with a service account so you can access the API from your local workstation.
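Before running the quickstart, it can help to confirm that a service account key is actually visible to your process. The sketch below assumes the key file is supplied through the GOOGLE_APPLICATION_CREDENTIALS environment variable, which Google's Application Default Credentials read. The helper name checkCredentials is hypothetical and is not part of this library.

```javascript
const fs = require('fs');

// Hypothetical helper (not part of @google-cloud/documentai): reports whether
// a service account key file is configured for local development.
// Assumes credentials are supplied via GOOGLE_APPLICATION_CREDENTIALS.
function checkCredentials(env = process.env) {
  const keyPath = env.GOOGLE_APPLICATION_CREDENTIALS;
  if (!keyPath) {
    return 'unset'; // variable not defined in this environment
  }
  // The variable is set; verify that it points at a file that exists.
  return fs.existsSync(keyPath) ? 'ok' : 'missing-file';
}

console.log(`Credentials status: ${checkCredentials()}`);
```

A result of 'unset' or 'missing-file' suggests fixing the environment before calling the API. The check does not validate the key's contents or permissions.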
Installing the client library
npm install @google-cloud/documentai
Using the client library
/**
* TODO(developer): Uncomment these variables before running the sample.
*/
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';
const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;
// Instantiates a client
// apiEndpoint regions available: eu-documentai.googleapis.com, us-documentai.googleapis.com (Required if using eu based processor)
// const client = new DocumentProcessorServiceClient({apiEndpoint: 'eu-documentai.googleapis.com'});
const client = new DocumentProcessorServiceClient();
async function quickstart() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processors/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // Convert the image data to a Buffer and base64 encode it.
  const encodedImage = Buffer.from(imageFile).toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }
}

// Run the sample and report any error.
quickstart().catch(err => {
  console.error(err);
  process.exitCode = 1;
});
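The quickstart reads only the first text segment of each anchor, and it leaves it to you to match the location value with the regional endpoint. The sketch below shows one way to handle both. It assumes that an anchor may carry several segments and that indices may arrive as numbers or numeric strings. The helper names endpointFor, processorName and anchorText are hypothetical, not part of this library, and the endpoint map lists only the two regions named in the sample comments.

```javascript
// Hypothetical helpers; the names are illustrative and not part of
// @google-cloud/documentai.

// Regional endpoints mentioned in the quickstart comments.
const ENDPOINTS = {
  us: 'us-documentai.googleapis.com',
  eu: 'eu-documentai.googleapis.com',
};

// Returns the API endpoint for a processor location ('us' or 'eu').
function endpointFor(location) {
  const endpoint = ENDPOINTS[location];
  if (!endpoint) {
    throw new Error(`No known endpoint for location "${location}"`);
  }
  return endpoint;
}

// Builds the full processor resource name used as `name` in the request.
function processorName(projectId, location, processorId) {
  return `projects/${projectId}/locations/${location}/processors/${processorId}`;
}

// Concatenates the text covered by every segment of a text anchor.
// Assumption: a missing startIndex means the segment starts at offset 0.
function anchorText(text, textAnchor) {
  const segments = (textAnchor && textAnchor.textSegments) || [];
  return segments
    .map(segment => {
      const start = Number(segment.startIndex || 0);
      const end = Number(segment.endIndex);
      return text.substring(start, end);
    })
    .join('');
}

// Usage with made-up data (no API call is made here).
const sampleText = 'Invoice\nTotal: 42\n';
const sampleAnchor = {
  textSegments: [{endIndex: 7}, {startIndex: '8', endIndex: '13'}],
};
console.log(endpointFor('eu'));
console.log(processorName('my-project', 'eu', 'my-processor'));
console.log(anchorText(sampleText, sampleAnchor));
```

The value returned by endpointFor could be passed as the apiEndpoint option shown in the commented-out client constructor, so that the endpoint and the resource name always use the same location.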
Samples
Samples are in the samples/ directory. Each sample's README.md has instructions for running its sample.
Sample | Source Code |
---|---|
Batch-parse-form.v1beta2 | source code |
Batch-parse-table.v1beta2 | source code |
Batch-process-document | source code |
Parse-form.v1beta2 | source code |
Parse-table.v1beta2 | source code |
Parse-with-model.v1beta2 | source code |
Process-document | source code |
Quickstart | source code |
Set-endpoint.v1beta2 | source code |
The Document AI Node.js Client API Reference documentation also contains samples.
Supported Node.js Versions
Our client libraries follow the Node.js release schedule. Libraries are compatible with all current active and maintenance versions of Node.js.
Client libraries targeting some end-of-life versions of Node.js are available, and can be installed via npm dist-tags. The dist-tags follow the naming convention legacy-(version).
Legacy Node.js versions are supported as a best effort:
- Legacy versions will not be tested in continuous integration.
- It may not be possible to backport some security patches.
- Dependencies will not be kept up-to-date, and features will not be backported.
Legacy tags available
- legacy-8: install client libraries from this dist-tag for versions compatible with Node.js 8.
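As an illustration of the legacy-(version) naming convention, the sketch below builds the package specifier that npm accepts for a dist-tag (package@tag). The helper name legacyInstallSpec is hypothetical, and the example assumes the legacy-8 tag listed above is published for this package.

```javascript
// Hypothetical helper: builds an npm package specifier for a legacy dist-tag,
// following the legacy-(version) naming convention.
function legacyInstallSpec(packageName, nodeMajorVersion) {
  return `${packageName}@legacy-${nodeMajorVersion}`;
}

// Prints the install command for the Node.js 8 compatible release line.
console.log(`npm install ${legacyInstallSpec('@google-cloud/documentai', 8)}`);
```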
Versioning
This library follows Semantic Versioning.
This library is considered to be in beta. This means it is expected to be mostly stable while we work toward a general availability release; however, complete stability is not guaranteed. We will address issues and requests against beta libraries with a high priority.
More Information: Google Cloud Platform Launch Stages
Contributing
Contributions welcome! See the Contributing Guide.
Please note that this README.md, the samples/README.md, and a variety of configuration files in this repository (including .nycrc and tsconfig.json) are generated from a central template. To edit one of these files, make an edit to its templates in directory.
License
Apache Version 2.0
See LICENSE