Document AI: Node.js Client
Document AI client for Node.js
A comprehensive list of changes in each version may be found in the CHANGELOG.
- Document AI Node.js Client API Reference
- Document AI Documentation
- github.com/googleapis/google-cloud-node/packages/google-cloud-documentai
Read more about the client libraries for Cloud APIs, including the older Google APIs Client Libraries, in Client Libraries Explained.
Quickstart
Before you begin
- Select or create a Cloud Platform project.
- Enable billing for your project.
- Enable the Document AI API.
- Set up authentication so you can access the API from your local workstation.
Installing the client library
npm install @google-cloud/documentai
Using the client library
/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'YOUR_PROJECT_LOCATION'; // Format is 'us' or 'eu'
// const processorId = 'YOUR_PROCESSOR_ID'; // Create processor in Cloud Console
// const filePath = '/path/to/local/pdf';

const {DocumentProcessorServiceClient} =
  require('@google-cloud/documentai').v1;

// Instantiates a client
// apiEndpoint regions available: eu-documentai.googleapis.com, us-documentai.googleapis.com (Required if using eu based processor)
// const client = new DocumentProcessorServiceClient({apiEndpoint: 'eu-documentai.googleapis.com'});
const client = new DocumentProcessorServiceClient();

async function quickstart() {
  // The full resource name of the processor, e.g.:
  // projects/project-id/locations/location/processors/processor-id
  // You must create new processors in the Cloud Console first
  const name = `projects/${projectId}/locations/${location}/processors/${processorId}`;

  // Read the file into memory.
  const fs = require('fs').promises;
  const imageFile = await fs.readFile(filePath);

  // readFile returns a Buffer; base64 encode it for the request.
  const encodedImage = imageFile.toString('base64');

  const request = {
    name,
    rawDocument: {
      content: encodedImage,
      mimeType: 'application/pdf',
    },
  };

  // Recognizes text entities in the PDF document
  const [result] = await client.processDocument(request);
  const {document} = result;

  // Get all of the document text as one big string
  const {text} = document;

  // Extract shards from the text field
  const getText = textAnchor => {
    if (!textAnchor.textSegments || textAnchor.textSegments.length === 0) {
      return '';
    }

    // First shard in document doesn't have startIndex property
    const startIndex = textAnchor.textSegments[0].startIndex || 0;
    const endIndex = textAnchor.textSegments[0].endIndex;

    return text.substring(startIndex, endIndex);
  };

  // Read the text recognition output from the processor
  console.log('The document contains the following paragraphs:');
  const [page1] = document.pages;
  const {paragraphs} = page1;

  for (const paragraph of paragraphs) {
    const paragraphText = getText(paragraph.layout.textAnchor);
    console.log(`Paragraph text:\n${paragraphText}`);
  }
}

// Run the sample and report any failure.
quickstart().catch(err => {
  console.error(err);
  process.exitCode = 1;
});
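Two details of the sample above can be isolated into small helpers. The sketch below is illustrative only: `clientOptionsFor` and `textFromAnchor` are hypothetical names, not part of the library. The first picks the regional `apiEndpoint` from the two hosts named in the sample's comment and otherwise falls back to the client's default, which is an assumption about how you may want to treat other locations. The second joins every entry of `textSegments` rather than only the first, and assumes index fields may arrive as numbers or numeric strings.

```javascript
// Hypothetical helpers; neither is provided by @google-cloud/documentai.

// Regional hosts named in the quickstart comment. Other locations are
// deliberately left out because the sample does not list them.
const REGIONAL_ENDPOINTS = new Map([
  ['us', 'us-documentai.googleapis.com'],
  ['eu', 'eu-documentai.googleapis.com'],
]);

// Returns an options object for the client constructor. For an unlisted
// location it returns {} so the client keeps its default endpoint
// (an assumption made for this sketch).
function clientOptionsFor(location) {
  const apiEndpoint = REGIONAL_ENDPOINTS.get(location);
  return apiEndpoint ? {apiEndpoint} : {};
}

// Joins the text covered by every segment of a text anchor.
// A missing startIndex is treated as 0, as in the quickstart.
function textFromAnchor(fullText, textAnchor) {
  if (!textAnchor || !Array.isArray(textAnchor.textSegments)) {
    return '';
  }
  return textAnchor.textSegments
    .map(segment => {
      // Number() accepts both numbers and numeric strings.
      const start = Number(segment.startIndex || 0);
      const end = Number(segment.endIndex || 0);
      return fullText.slice(start, end);
    })
    .join('');
}

// Usage with plain objects shaped like the fields used in the quickstart.
const sampleText = 'Invoice\nTotal: 42\n';
const sampleAnchor = {
  textSegments: [{endIndex: 8}, {startIndex: '8', endIndex: '17'}],
};
console.log(clientOptionsFor('eu'));
console.log(textFromAnchor(sampleText, sampleAnchor));
```

In the quickstart, the options object would be passed to the constructor, for example `new DocumentProcessorServiceClient(clientOptionsFor(location))`, and `textFromAnchor(text, paragraph.layout.textAnchor)` would take the place of `getText`.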
Samples
Samples are in the samples/ directory. Each sample's README.md has instructions for running its sample.
Sample | Source Code |
---|---|
Document_processor_service.batch_process_documents | source code |
Document_processor_service.create_processor | source code |
Document_processor_service.delete_processor | source code |
Document_processor_service.delete_processor_version | source code |
Document_processor_service.deploy_processor_version | source code |
Document_processor_service.disable_processor | source code |
Document_processor_service.enable_processor | source code |
Document_processor_service.evaluate_processor_version | source code |
Document_processor_service.fetch_processor_types | source code |
Document_processor_service.get_evaluation | source code |
Document_processor_service.get_processor | source code |
Document_processor_service.get_processor_type | source code |
Document_processor_service.get_processor_version | source code |
Document_processor_service.list_evaluations | source code |
Document_processor_service.list_processor_types | source code |
Document_processor_service.list_processor_versions | source code |
Document_processor_service.list_processors | source code |
Document_processor_service.process_document | source code |
Document_processor_service.review_document | source code |
Document_processor_service.set_default_processor_version | source code |
Document_processor_service.train_processor_version | source code |
Document_processor_service.undeploy_processor_version | source code |
Document_understanding_service.batch_process_documents | source code |
Document_understanding_service.batch_process_documents | source code |
Document_understanding_service.process_document | source code |
Document_processor_service.batch_process_documents | source code |
Document_processor_service.create_processor | source code |
Document_processor_service.delete_processor | source code |
Document_processor_service.delete_processor_version | source code |
Document_processor_service.deploy_processor_version | source code |
Document_processor_service.disable_processor | source code |
Document_processor_service.enable_processor | source code |
Document_processor_service.evaluate_processor_version | source code |
Document_processor_service.fetch_processor_types | source code |
Document_processor_service.get_evaluation | source code |
Document_processor_service.get_processor | source code |
Document_processor_service.get_processor_type | source code |
Document_processor_service.get_processor_version | source code |
Document_processor_service.import_processor_version | source code |
Document_processor_service.list_evaluations | source code |
Document_processor_service.list_processor_types | source code |
Document_processor_service.list_processor_versions | source code |
Document_processor_service.list_processors | source code |
Document_processor_service.process_document | source code |
Document_processor_service.review_document | source code |
Document_processor_service.set_default_processor_version | source code |
Document_processor_service.train_processor_version | source code |
Document_processor_service.undeploy_processor_version | source code |
Document_service.batch_delete_documents | source code |
Document_service.get_dataset_schema | source code |
Document_service.get_document | source code |
Document_service.import_documents | source code |
Document_service.list_documents | source code |
Document_service.update_dataset | source code |
Document_service.update_dataset_schema | source code |
Quickstart | source code |
The Document AI Node.js Client API Reference documentation also contains samples.
Supported Node.js Versions
Our client libraries follow the Node.js release schedule. Libraries are compatible with all current active and maintenance versions of Node.js. If you are using an end-of-life version of Node.js, we recommend that you update as soon as possible to an actively supported LTS version.
Google's client libraries support legacy versions of Node.js runtimes on a best-efforts basis with the following warnings:
- Legacy versions are not tested in continuous integration.
- Some security patches and features cannot be backported.
- Dependencies cannot be kept up-to-date.
Client libraries targeting some end-of-life versions of Node.js are available, and can be installed through npm dist-tags. The dist-tags follow the naming convention legacy-(version). For example, npm install @google-cloud/documentai@legacy-8 installs client libraries for versions compatible with Node.js 8.
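The naming convention above can be expressed as a small helper. This is a minimal sketch: `legacyInstallSpec` is a hypothetical function, not part of npm or this library, and it only builds the string. Whether a legacy dist-tag has actually been published for a given Node.js major version is not something it checks.

```javascript
// Hypothetical helper: builds the package spec for a legacy dist-tag,
// following the legacy-(version) convention described above.
function legacyInstallSpec(packageName, nodeMajor) {
  if (!Number.isInteger(nodeMajor) || nodeMajor <= 0) {
    throw new TypeError('nodeMajor must be a positive integer');
  }
  return `${packageName}@legacy-${nodeMajor}`;
}

// Major version of the Node.js runtime executing this script.
const runningMajor = Number(process.versions.node.split('.')[0]);

// The spec for Node.js 8 matches the example in the text above.
console.log(`npm install ${legacyInstallSpec('@google-cloud/documentai', 8)}`);
console.log(`This script is running on Node.js major version ${runningMajor}`);
```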
Versioning
This library follows Semantic Versioning.
This library is considered to be stable. The code surface will not change in backwards-incompatible ways unless absolutely necessary (e.g. because of critical security issues) or with an extensive deprecation period. Issues and requests against stable libraries are addressed with the highest priority.
More Information: Google Cloud Platform Launch Stages
Contributing
Contributions welcome! See the Contributing Guide.
Please note that this README.md, the samples/README.md, and a variety of configuration files in this repository (including .nycrc and tsconfig.json) are generated from a central template. To edit one of these files, make an edit to its templates in directory.
License
Apache Version 2.0
See LICENSE