Build a document summarizer in the Google Cloud console

You can use Document AI to create a summarizer processor that summarizes the content of documents. You can customize the length and format of the summary.

Here is some sample JSON output from the resulting entity:

  {
    "type": "summary",
    "mentionText": " Superconductivity is a phenomenon in which a material conducts electricity with no resistance. It was discovered in 1911 by Dutch physicist Heike Kamerlingh Onnes. In 1986, a new class of materials was discovered that can superconduct at much higher temperatures. These materials are called high-temperature superconductors. They have the potential to revolutionize the way we use electricity. However, high-temperature superconductors are still very expensive to produce. Scientists are working on ways to make them more affordable.",
    "normalizedValue": {
      "text": " Superconductivity is a phenomenon in which a material conducts electricity with no resistance. It was discovered in 1911 by Dutch physicist Heike Kamerlingh Onnes. In 1986, a new class of materials was discovered that can superconduct at much higher temperatures. These materials are called high-temperature superconductors. They have the potential to revolutionize the way we use electricity. However, high-temperature superconductors are still very expensive to produce. Scientists are working on ways to make them more affordable."
    }
  }
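
If you save the processed document as JSON, you can read the summary from the entity shown above. The following Python sketch is one way to do that. It assumes that the entity appears in the document's `entities` list and that field names are camelCase as in the sample. The helper name `extract_summaries` and the abbreviated sample text are illustrative only.

```python
import json


def extract_summaries(document: dict) -> list[str]:
    """Return the text of every entity whose type is "summary".

    Assumes `document` is a processed document parsed from JSON, with
    an "entities" list shaped like the sample entity above.
    """
    summaries = []
    for entity in document.get("entities", []):
        if entity.get("type") != "summary":
            continue
        # Prefer normalizedValue.text and fall back to mentionText.
        text = entity.get("normalizedValue", {}).get("text") or entity.get(
            "mentionText", ""
        )
        # The sample text starts with a space, so trim whitespace.
        summaries.append(text.strip())
    return summaries


# Abbreviated stand-in for a processed document (not real API output).
sample = json.loads(
    """
    {
      "entities": [
        {
          "type": "summary",
          "mentionText": " Superconductivity is a phenomenon in which a material conducts electricity with no resistance.",
          "normalizedValue": {
            "text": " Superconductivity is a phenomenon in which a material conducts electricity with no resistance."
          }
        }
      ]
    }
    """
)

for summary in extract_summaries(sample):
    print(summary)
```

The fallback to `mentionText` is a defensive choice for this sketch. In the sample output both fields carry the same text.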
 

Procedure

In this quickstart, you create a document summarizer processor, upload a sample document for processing, and create a custom processor version to adjust the summary structure.


To follow step-by-step guidance for this task directly in the Google Cloud console, click Guide me.


Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Document AI and Cloud Storage APIs.

    Enable the APIs


Create a summarizer processor

Use the Google Cloud console to create a summarizer processor. See creating and managing processors for more information.

  1. In the Google Cloud console, in the Document AI section, go to the Workbench page.

    Workbench

  2. For Summarizer, select Create processor.

  3. In the Create processor menu, enter a name for your processor, such as quickstart-summarizer.

  4. Select the region closest to you.

  5. Select Create.

Your processor has now been created.

Test Processor

You are now on the Processor overview page of the processor you just created.


  1. Select the Customize & build tab to experiment with the processor.


  2. Download a sample document.

    The sample is a PDF file containing the Wikipedia page for Superconductivity.

  3. Select Upload Test Document and select the document you just downloaded.

  4. You are now on the Summary page. You can view the text detected by OCR and the document summary.


  5. Set Length to Moderate and Format to Bulleted, then select Rewrite and observe the results.

  6. Go back to the Customize & build page.

Deploy processor version

If you want to use specific summarization settings when processing documents with the API, create a processor version for those settings.

  1. The Summarization settings are set to the last values you used on the previous page.

  2. Select Create New Version to create a processor version with the specified Summarization settings.

  3. Enter a name for the processor version, such as quickstart-moderate-bulleted, and select Create Version.

  4. Go to the Deploy & Use tab to view the deployment status. Deployment takes a few minutes.

  5. When the version is deployed, you can set it as the Default version, or you can provide the version ID when processing documents with the API.

  6. To use the Document AI API:
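
As a rough illustration of providing the version ID, the following Python sketch builds the URL and JSON body for a Document AI REST `process` request. It uses only the standard library and does not send the request or handle authentication. The helper name `build_process_request` and the project, processor, and version IDs are hypothetical placeholders. Treat the endpoint and body layout as a sketch to check against the Document AI API reference.

```python
import base64
import json


def build_process_request(project_id, location, processor_id, pdf_bytes,
                          version_id=None):
    """Build the URL and JSON body for a Document AI process request.

    If version_id is given, the request targets that processor version.
    Otherwise it targets the processor, which uses its default version.
    """
    name = f"projects/{project_id}/locations/{location}/processors/{processor_id}"
    if version_id:
        name += f"/processorVersions/{version_id}"
    url = f"https://{location}-documentai.googleapis.com/v1/{name}:process"

    # The document bytes are sent inline, base64-encoded.
    body = {
        "rawDocument": {
            "content": base64.b64encode(pdf_bytes).decode("ascii"),
            "mimeType": "application/pdf",
        }
    }
    return url, json.dumps(body)


# Placeholder values for illustration only.
url, body = build_process_request(
    project_id="my-project",
    location="us",
    processor_id="abc123",
    pdf_bytes=b"%PDF-1.4 placeholder bytes",
    version_id="def456",
)
print(url)
```

To send the request, you would POST the body to the URL with an OAuth 2.0 access token in the `Authorization` header. That step is omitted here.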

You have successfully used Document AI to extract text from a document and summarize it.

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, use the Google Cloud console to delete your processor and project if you no longer need them.

What's next
