Run a batch translation using the Cloud Translation connector


This tutorial shows you how to create a workflow that uses the Cloud Translation API connector to translate files into other languages in asynchronous batch mode. Batch translation provides output in near real time as the inputs are processed.

Objectives

In this tutorial you will:

  1. Create an input Cloud Storage bucket.
  2. Create two files in English and upload them to the input bucket.
  3. Create a workflow that uses the Cloud Translation API connector to translate the two files to French and Spanish and saves the results in an output bucket.
  4. Deploy and execute the workflow to orchestrate the entire process.

Costs

In this document, you use the following billable components of Google Cloud: Cloud Storage, Cloud Translation, and Workflows.

To generate a cost estimate based on your projected usage, use the pricing calculator.

New Google Cloud users might be eligible for a free trial.

Before you begin

Security constraints defined by your organization might prevent you from completing the following steps. For troubleshooting information, see Develop applications in a constrained Google Cloud environment.

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. Install the Google Cloud CLI.

  3. If you're using an external identity provider (IdP), you must first sign in to the gcloud CLI with your federated identity.

  4. To initialize the gcloud CLI, run the following command:

    gcloud init
  5. Create or select a Google Cloud project.

    • Create a Google Cloud project:

      gcloud projects create PROJECT_ID 
      

      Replace PROJECT_ID with a name for the Google Cloud project you are creating.

    • Select the Google Cloud project that you created:

      gcloud config set project PROJECT_ID 
      

      Replace PROJECT_ID with your Google Cloud project name.

  6. Verify that billing is enabled for your Google Cloud project.

  7. Enable the Cloud Storage, Translation, and Workflows APIs:

    gcloud services enable storage.googleapis.com translate.googleapis.com workflows.googleapis.com
  8. Update gcloud components:

    gcloud components update
  9. Log in using your account:

    gcloud auth login
  10. Set the default location used in this tutorial:

    gcloud config set workflows/location us-central1

    Since this tutorial uses the default AutoML Translation model, which resides in us-central1, you must set the location to us-central1.

    If using an AutoML Translation model or glossary other than the default, ensure that it resides in the same location as the call to the connector; otherwise, an INVALID_ARGUMENT (400) error is returned. For details, see the batchTranslateText method.
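As a rough illustration of the constraint above (the helper name and error message are my own, not part of any Google API), the co-location check the connector enforces can be sketched as:

```python
def check_connector_location(call_location: str, resource_location: str) -> None:
    # The Cloud Translation connector rejects a custom model or glossary
    # that lives in a different location than the call itself, returning
    # an INVALID_ARGUMENT (400) error.
    if call_location != resource_location:
        raise ValueError(
            f"INVALID_ARGUMENT (400): resource is in {resource_location!r} "
            f"but the connector call is made in {call_location!r}"
        )
```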

Create an input Cloud Storage bucket and files

You can use Cloud Storage to store objects. Objects are immutable pieces of data consisting of a file of any format, and are stored in containers called buckets.

  1. Create a Cloud Storage bucket to hold the files to translate:

    BUCKET_INPUT=${GOOGLE_CLOUD_PROJECT}-input-files
    gcloud storage buckets create gs://${BUCKET_INPUT}
  2. Create two files in English and upload them to the input bucket:

    echo "Hello World!" > file1.txt
    gcloud storage cp file1.txt gs://${BUCKET_INPUT}
    echo "Workflows connectors simplify calling services." > file2.txt
    gcloud storage cp file2.txt gs://${BUCKET_INPUT}
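The bucket and object names in the steps above follow a simple convention that the workflow later relies on. A minimal sketch (the function names are illustrative, not part of the gcloud CLI) of how the Cloud Storage URIs are derived:

```python
def input_bucket_name(project_id: str) -> str:
    # The tutorial names the input bucket "<project>-input-files".
    return f"{project_id}-input-files"

def gcs_uri(bucket_name: str, object_name: str = "") -> str:
    # Cloud Storage URIs take the form gs://BUCKET/OBJECT.
    return f"gs://{bucket_name}/{object_name}"

# For a hypothetical project "my-project", the uploaded files live at:
for name in ("file1.txt", "file2.txt"):
    print(gcs_uri(input_bucket_name("my-project"), name))
```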

Deploy and execute the workflow

A workflow is made up of a series of steps described using the Workflows syntax, which can be written in either YAML or JSON format. This is the workflow's definition. After creating a workflow, you deploy it to make it available for execution.

  1. Create a text file named workflow.yaml with the following content:

    main:
      steps:
      - init:
          assign:
          - projectId: ${sys.get_env("GOOGLE_CLOUD_PROJECT_ID")}
          - location: ${sys.get_env("GOOGLE_CLOUD_LOCATION")}
          - inputBucketName: ${projectId + "-input-files"}
          - outputBucketName: ${projectId + "-output-files-" + string(int(sys.now()))}
      - createOutputBucket:
          call: googleapis.storage.v1.buckets.insert
          args:
            project: ${projectId}
            body:
              name: ${outputBucketName}
      - batchTranslateText:
          call: googleapis.translate.v3beta1.projects.locations.batchTranslateText
          args:
            parent: ${"projects/" + projectId + "/locations/" + location}
            body:
              inputConfigs:
                gcsSource:
                  inputUri: ${"gs://" + inputBucketName + "/*"}
              outputConfig:
                gcsDestination:
                  outputUriPrefix: ${"gs://" + outputBucketName + "/"}
              sourceLanguageCode: "en"
              targetLanguageCodes: ["es", "fr"]
          result: batchTranslateTextResult

    The workflow assigns variables, creates an output bucket, and initiates the translation of the files, saving the results to the output bucket.
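For readers more comfortable with client code, the arguments the batchTranslateText step assembles can be sketched as a plain Python dictionary. This is a rough illustration of the connector call's shape, not an executable API request; note that inputConfigs is a repeated field in the underlying REST API:

```python
import time

def build_batch_translate_args(project_id: str, location: str) -> dict:
    """Mirror the variables and arguments assembled by the workflow."""
    input_bucket = f"{project_id}-input-files"
    # A timestamp suffix gives each run its own output bucket,
    # matching the workflow's use of sys.now().
    output_bucket = f"{project_id}-output-files-{int(time.time())}"
    return {
        "parent": f"projects/{project_id}/locations/{location}",
        "body": {
            "inputConfigs": [
                {"gcsSource": {"inputUri": f"gs://{input_bucket}/*"}}
            ],
            "outputConfig": {
                "gcsDestination": {"outputUriPrefix": f"gs://{output_bucket}/"}
            },
            "sourceLanguageCode": "en",
            "targetLanguageCodes": ["es", "fr"],
        },
    }
```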

  2. After creating the workflow, deploy it:

    gcloud workflows deploy batch-translation --source=workflow.yaml
  3. Execute the workflow:

    gcloud workflows execute batch-translation
  4. To view the workflow status, you can run the returned command. For example:

    gcloud workflows executions describe eb4a6239-cffa-4672-81d8-d4caef7d8424 \
      --workflow batch-translation \
      --location us-central1

    The execution state should be ACTIVE. After a few minutes, the translated files (in French and Spanish) are uploaded to the output bucket.
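An execution's state moves from ACTIVE to a terminal value once it finishes. A minimal helper (my own naming, assuming the state strings reported by the describe command) for deciding when to stop re-running it:

```python
# Terminal states reported in the `state` field of
# `gcloud workflows executions describe` output.
TERMINAL_STATES = {"SUCCEEDED", "FAILED", "CANCELLED"}

def is_finished(state: str) -> bool:
    # An execution still in progress reports ACTIVE.
    return state in TERMINAL_STATES
```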

List objects in the output bucket

You can confirm that the workflow worked as expected by listing the objects in your output bucket.

  1. Retrieve your output bucket name:

    gcloud storage ls

    The output is similar to the following:

    gs://PROJECT_ID-input-files/
    gs://PROJECT_ID-output-files-TIMESTAMP/
  2. List the objects in your output bucket:

    gcloud storage ls gs://PROJECT_ID-output-files-TIMESTAMP/** --recursive

    After a few minutes, the four translated files are listed: two in French and two in Spanish.

Clean up

If you created a new project for this tutorial, delete the project. If you used an existing project and wish to keep it without the changes added in this tutorial, delete the resources created for the tutorial.

Delete the project

The easiest way to eliminate billing is to delete the project that you created for the tutorial.

To delete the project:

  1. In the Google Cloud console, go to the Manage resources page.

    Go to Manage resources

  2. In the project list, select the project that you want to delete, and then click Delete.
  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete tutorial resources

  1. Remove the gcloud default configuration you added during the tutorial setup:

    gcloud config unset workflows/location
  2. Delete the workflow created in this tutorial:

    gcloud workflows delete WORKFLOW_NAME
  3. Delete a bucket and its objects created in this tutorial:

    gcloud storage rm gs://BUCKET_NAME --recursive

    Where BUCKET_NAME is the name of the bucket to delete. For example, my-bucket.

    The response is similar to the following:

    Removing gs://my-bucket/...
