Creating and managing datasets

A dataset contains representative samples of the type of content you want to translate, as matching sentence pairs in the source and target languages. The dataset serves as the input for training a model.

The main steps for building a dataset are:

  1. Create a dataset and identify the source and target languages.
  2. Import sentence pairs into the dataset.

A project can have multiple datasets, each used to train a separate model. You can list the available datasets and delete datasets that you no longer need.

Creating a dataset

The first step in creating a custom model is to create an empty dataset that will eventually hold the training data for the model. When you create a dataset, you identify the source and target languages for the model. For more information about the supported languages and variants, see Language support for custom models.

Web UI

The AutoML Translation UI enables you to create a new dataset and import items into it from the same page.

  1. Visit the AutoML Translation UI.

  2. Select the project for which you enabled AutoML Translation from the drop-down list in the upper right of the title bar.

  3. On the Datasets tab, click Create Dataset.

    Datasets page with one dataset

  4. In the Create dataset dialog, do the following:

    • Enter a name for the dataset.
    • Select the source and target languages from the drop-down lists. When you select a Translate from language, the available Translate to languages appear.
    • Click Create. The Import tab opens.

REST

Send the create dataset request

The following shows how to send a POST request to the projects.locations.datasets.create method. The example uses the access token for a service account set up for the project using the Google Cloud CLI.

Before using any of the request data, make the following replacements:

  • project-id : your Google Cloud Platform project ID
  • dataset-name : the name of your new dataset
  • source-language-code : the language you want to translate from, as an ISO 639-1 code such as 'en'
  • target-language-code : the language you want to translate to, as an ISO 639-1 code such as 'es'

HTTP method and URL:

POST https://automl.googleapis.com/v1/projects/project-id/locations/us-central1/datasets

Request JSON body:

{
  "displayName": "dataset-name",
  "translationDatasetMetadata": {
    "sourceLanguageCode": "source-language-code",
    "targetLanguageCode": "target-language-code"
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-10-01T22:13:48.155710Z",
    "updateTime": "2019-10-01T22:13:48.155710Z",
    "createDatasetDetails": {}
  }
}
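
The create request described above can also be assembled programmatically. The following is a minimal Python sketch, using only the standard library, that builds the URL and JSON body shown in this section. The helper name build_create_dataset_request and the sample values are hypothetical, and actually sending the request (with an access token in the Authorization header) is left out.

```python
import json

API_ROOT = "https://automl.googleapis.com/v1"


def build_create_dataset_request(project_id, display_name, source_lang, target_lang,
                                 location="us-central1"):
    """Return (url, body) for the create dataset request (hypothetical helper)."""
    url = f"{API_ROOT}/projects/{project_id}/locations/{location}/datasets"
    body = {
        "displayName": display_name,
        "translationDatasetMetadata": {
            "sourceLanguageCode": source_lang,
            "targetLanguageCode": target_lang,
        },
    }
    return url, json.dumps(body)


# Sample values are placeholders, not real resources.
url, body = build_create_dataset_request("my-project-id", "my_dataset", "en", "es")
print("POST", url)
print(body)
```

Keeping the URL and body construction in one function makes it easier to reuse the same language pair and location across requests.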

Get the results

To get the results of your request, you must send a GET request to the operations resource. The following shows how to send such a request.

Before using any of the request data, make the following replacements:

  • operation-name : the name of the operation as returned in the response to the original call to the API
  • project-id : your Google Cloud Platform project ID

HTTP method and URL:

GET https://automl.googleapis.com/v1/operation-name

You should receive a JSON response similar to the following:

{
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-10-01T22:13:48.155710Z",
    "updateTime": "2019-10-01T22:13:52.321072Z",
    ...
  },
  "done": true,
  "response": {
    "@type": "resource-type",
    "name": "resource-name"
  }
}
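
An operation response like the one above can be inspected as follows. This is a rough Python sketch, assuming the response has already been parsed into a dictionary and that the resource in "response" is the created dataset, whose name has the form projects/{project_id}/locations/{location_id}/datasets/{dataset_id}. The helper name and the sample IDs are hypothetical.

```python
def dataset_id_from_operation(operation):
    """Return the dataset ID from a finished operation, or None if it is still running."""
    if not operation.get("done"):
        return None  # not finished yet; send the GET request again later
    if "error" in operation:
        raise RuntimeError(f"operation failed: {operation['error']}")
    resource_name = operation["response"]["name"]
    # The dataset ID is the last path segment of the resource name.
    return resource_name.rsplit("/", 1)[-1]


# Hypothetical finished operation; the IDs are made up for illustration.
finished = {
    "done": True,
    "response": {"name": "projects/123/locations/us-central1/datasets/TRL0000"},
}
print(dataset_id_from_operation(finished))
print(dataset_id_from_operation({"metadata": {}}))
```

The dataset ID is needed by other methods, such as importing data, so it is worth extracting once the operation reports "done": true.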

Go

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Go API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// translateCreateDataset creates a dataset for translate.
func translateCreateDataset(w io.Writer, projectID string, location string, datasetName string, sourceLanguageCode string, targetLanguageCode string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetName := "dataset_display_name"
	// Supported languages:
	//   https://cloud.google.com/translate/automl/docs/languages
	// sourceLanguageCode := "en"
	// targetLanguageCode := "ja"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.CreateDatasetRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
		Dataset: &automlpb.Dataset{
			DisplayName: datasetName,
			DatasetMetadata: &automlpb.Dataset_TranslationDatasetMetadata{
				TranslationDatasetMetadata: &automlpb.TranslationDatasetMetadata{
					SourceLanguageCode: sourceLanguageCode,
					TargetLanguageCode: targetLanguageCode,
				},
			},
		},
	}

	op, err := client.CreateDataset(ctx, req)
	if err != nil {
		return fmt.Errorf("CreateDataset: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())

	dataset, err := op.Wait(ctx)
	if err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	fmt.Fprintf(w, "Dataset name: %v\n", dataset.GetName())

	return nil
}
 

Java

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Java API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.Dataset;
import com.google.cloud.automl.v1.LocationName;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.cloud.automl.v1.TranslationDatasetMetadata;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class TranslateCreateDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String displayName = "YOUR_DATASET_NAME";
    createDataset(projectId, displayName);
  }

  // Create a dataset
  static void createDataset(String projectId, String displayName)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");

      // Specify the source and target language.
      TranslationDatasetMetadata translationDatasetMetadata =
          TranslationDatasetMetadata.newBuilder()
              .setSourceLanguageCode("en")
              .setTargetLanguageCode("ja")
              .build();
      Dataset dataset =
          Dataset.newBuilder()
              .setDisplayName(displayName)
              .setTranslationDatasetMetadata(translationDatasetMetadata)
              .build();
      OperationFuture<Dataset, OperationMetadata> future =
          client.createDatasetAsync(projectLocation, dataset);

      Dataset createdDataset = future.get();

      // Display the dataset information.
      System.out.format("Dataset name: %s\n", createdDataset.getName());
      // To get the dataset id, you have to parse it out of the `name` field. As dataset Ids are
      // required for other methods.
      // Name Form: `projects/{project_id}/locations/{location_id}/datasets/{dataset_id}`
      String[] names = createdDataset.getName().split("/");
      String datasetId = names[names.length - 1];
      System.out.format("Dataset id: %s\n", datasetId);
    }
  }
}
 

Node.js

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Node.js API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const displayName = 'YOUR_DISPLAY_NAME';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function createDataset() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    dataset: {
      displayName: displayName,
      translationDatasetMetadata: {
        sourceLanguageCode: 'en',
        targetLanguageCode: 'ja',
      },
    },
  };

  // Create dataset
  const [operation] = await client.createDataset(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();

  console.log(`Dataset name: ${response.name}`);
  console.log(`
    Dataset id: ${
      response.name
        .split('/')
        [response.name.split('/').length - 1].split('\n')[0]
    }`);
}

createDataset();
 

Python

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Python API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# display_name = "YOUR_DATASET_NAME"

client = automl.AutoMlClient()

# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"
# For a list of supported languages, see:
# https://cloud.google.com/translate/automl/docs/languages
dataset_metadata = automl.TranslationDatasetMetadata(
    source_language_code="en", target_language_code="ja"
)
dataset = automl.Dataset(
    display_name=display_name,
    translation_dataset_metadata=dataset_metadata,
)

# Create a dataset with the dataset metadata in the region.
response = client.create_dataset(parent=project_location, dataset=dataset)

created_dataset = response.result()

# Display the dataset information
print(f"Dataset name: {created_dataset.name}")
print("Dataset id: {}".format(created_dataset.name.split("/")[-1]))
 

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for Ruby.

Importing items into a dataset

After you have created a dataset, you can import training sentence pairs into it. For details on preparing your training data, see Preparing training data.

Web UI

The AutoML Translation UI enables you to create a new dataset and import items into it from the same page (see Creating a dataset). The steps below import items into an existing dataset.

After you create the dataset, upload your data:

  1. Upload the sentence pairs to use for training the model.

    On the Import tab, you can upload TSV or TMX files from your local computer or from Cloud Storage. For files uploaded from your local computer, select your file and then click Browse. A list of folders appears. Select the Cloud Storage folder to upload your file to. This Cloud Storage location is required to guarantee data residency.

    Select the Use separate files for training, validation, and testing (advanced) checkbox if you want to upload separate files containing the sentence pairs. This option is recommended if your dataset has more than 100,000 sentence pairs. You must allocate at most 10,000 sentence pairs to the validation and test sets; otherwise, AutoML Translation returns an error.

    Import tab

  2. Click Continue.

    You're returned to the Datasets page. Your dataset shows an in-progress animation while your documents are being imported. When your dataset is successfully uploaded, you receive a message at the email address that you used to sign up for the program.

  3. Review the dataset.

    After your data has been successfully imported, select the dataset from the Datasets tab to see the dataset details. The Sentence tab is enabled, and shows the name of the dataset. The sentence pairs are listed. Each pair is assigned "training," "validation," or "testing," indicating at which stage of processing the pair will be used.
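
If you choose to upload separate files, the split can be prepared ahead of time. The following Python sketch shows one way to divide sentence pairs into training, validation, and test sets while respecting the 10,000-pair limit mentioned in the steps above. It assumes the limit applies to each of the validation and test sets and that a TSV line holds the source sentence, a tab, and the target sentence; confirm both against Preparing training data. The function names and the 10% held-out fraction are hypothetical choices.

```python
import random


def split_sentence_pairs(pairs, held_out_fraction=0.1, max_held_out=10_000, seed=0):
    """Split (source, target) pairs into training, validation, and test lists."""
    shuffled = list(pairs)
    random.Random(seed).shuffle(shuffled)  # fixed seed keeps the split reproducible
    # Size of each held-out set, capped at the assumed per-set limit.
    held_out = min(int(len(shuffled) * held_out_fraction), max_held_out)
    validation = shuffled[:held_out]
    test = shuffled[held_out:2 * held_out]
    training = shuffled[2 * held_out:]
    return training, validation, test


def to_tsv(pairs):
    """Render pairs as tab-separated lines, one sentence pair per line (assumed layout)."""
    return "\n".join(f"{source}\t{target}" for source, target in pairs)


# Made-up sample data for illustration.
sample = [(f"source sentence {i}", f"target sentence {i}") for i in range(50)]
training, validation, test = split_sentence_pairs(sample)
print(len(training), len(validation), len(test))
print(to_tsv(validation[:2]))
```

Shuffling before splitting avoids held-out sets that only contain sentences from the end of the original file.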

REST

Use the projects.locations.datasets.importData method to import items into a dataset.

Before using any of the request data, make the following replacements:

  • dataset-name : the name of your dataset, as returned by the API when you created the dataset
  • bucket-name : the Cloud Storage bucket that contains the input CSV that describes your dataset
  • csv-file-name : the name of the input CSV file that describes your dataset
  • project-id : your Google Cloud Platform project ID

HTTP method and URL:

POST https://automl.googleapis.com/v1/dataset-name:importData

Request JSON body:

{
  "inputConfig": {
    "gcsSource": {
      "inputUris": "gs://bucket-name/csv-file-name"
    }
  }
}

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1beta1.OperationMetadata",
    "createTime": "2018-04-27T01:28:36.128120Z",
    "updateTime": "2018-04-27T01:28:36.128150Z",
    "cancellable": true
  }
}
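
The import request above can be put together in code as well. This is a minimal Python sketch, using only the standard library, that builds the importData URL and body from the replacement values listed in this section. The helper name and sample values are hypothetical. Note that the sketch passes inputUris as a list, which is an assumption based on how the client library samples on this page supply one or more URIs.

```python
import json

API_ROOT = "https://automl.googleapis.com/v1"


def build_import_data_request(dataset_name, bucket_name, csv_file_name):
    """Return (url, body) for an importData request (hypothetical helper).

    dataset_name is the full resource name returned when the dataset was created.
    """
    url = f"{API_ROOT}/{dataset_name}:importData"
    body = {
        "inputConfig": {
            "gcsSource": {
                "inputUris": [f"gs://{bucket_name}/{csv_file_name}"],
            }
        }
    }
    return url, json.dumps(body)


# Placeholder values, not real resources.
url, body = build_import_data_request(
    "projects/123/locations/us-central1/datasets/TRL0000", "my-bucket", "pairs.csv"
)
print("POST", url)
print(body)
```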

Go

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Go API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// importDataIntoDataset imports data into a dataset.
func importDataIntoDataset(w io.Writer, projectID string, location string, datasetID string, inputURI string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "TRL123456789..."
	// inputURI := "gs://BUCKET_ID/path_to_training_data.csv"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.ImportDataRequest{
		Name: fmt.Sprintf("projects/%s/locations/%s/datasets/%s", projectID, location, datasetID),
		InputConfig: &automlpb.InputConfig{
			Source: &automlpb.InputConfig_GcsSource{
				GcsSource: &automlpb.GcsSource{
					InputUris: []string{inputURI},
				},
			},
		},
	}

	op, err := client.ImportData(ctx, req)
	if err != nil {
		return fmt.Errorf("ImportData: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())

	if err := op.Wait(ctx); err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	fmt.Fprintf(w, "Data imported.\n")

	return nil
}
 

Java

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Java API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.api.gax.longrunning.OperationFuture;
import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DatasetName;
import com.google.cloud.automl.v1.GcsSource;
import com.google.cloud.automl.v1.InputConfig;
import com.google.cloud.automl.v1.OperationMetadata;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.Arrays;
import java.util.concurrent.ExecutionException;
import java.util.concurrent.TimeUnit;
import java.util.concurrent.TimeoutException;

class ImportDataset {

  public static void main(String[] args)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    String path = "gs://BUCKET_ID/path_to_training_data.csv";
    importDataset(projectId, datasetId, path);
  }

  // Import a dataset
  static void importDataset(String projectId, String datasetId, String path)
      throws IOException, ExecutionException, InterruptedException, TimeoutException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the complete path of the dataset.
      DatasetName datasetFullId = DatasetName.of(projectId, "us-central1", datasetId);

      // Get multiple Google Cloud Storage URIs to import data from
      GcsSource gcsSource =
          GcsSource.newBuilder().addAllInputUris(Arrays.asList(path.split(","))).build();

      // Import data from the input URI
      InputConfig inputConfig = InputConfig.newBuilder().setGcsSource(gcsSource).build();
      System.out.println("Processing import...");

      // Start the import job
      OperationFuture<Empty, OperationMetadata> operation =
          client.importDataAsync(datasetFullId, inputConfig);

      System.out.format("Operation name: %s%n", operation.getName());

      // If you want to wait for the operation to finish, adjust the timeout appropriately. The
      // operation will still run if you choose not to wait for it to complete. You can check the
      // status of your operation using the operation's name.
      Empty response = operation.get(45, TimeUnit.MINUTES);
      System.out.format("Dataset imported. %s%n", response);
    } catch (TimeoutException e) {
      System.out.println("The operation's polling period was not long enough.");
      System.out.println("You can use the Operation's name to get the current status.");
      System.out.println("The import job is still running and will complete as expected.");
      throw e;
    }
  }
}
 

Node.js

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Node.js API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const datasetId = 'YOUR_DATASET_ID';
// const path = 'gs://BUCKET_ID/path_to_training_data.csv';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function importDataset() {
  // Construct request
  const request = {
    name: client.datasetPath(projectId, location, datasetId),
    inputConfig: {
      gcsSource: {
        inputUris: path.split(','),
      },
    },
  };

  // Import dataset
  console.log('Processing import');
  const [operation] = await client.importData(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Dataset imported: ${response}`);
}

importDataset();
 

Python

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Python API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"
# path = "gs://YOUR_BUCKET_ID/path/to/data.csv"

client = automl.AutoMlClient()
# Get the full path of the dataset.
dataset_full_id = client.dataset_path(project_id, "us-central1", dataset_id)
# Get the multiple Google Cloud Storage URIs
input_uris = path.split(",")
gcs_source = automl.GcsSource(input_uris=input_uris)
input_config = automl.InputConfig(gcs_source=gcs_source)
# Import data from the input URI
response = client.import_data(name=dataset_full_id, input_config=input_config)

print("Processing import...")
print(f"Data imported. {response.result()}")
 

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for Ruby.

After you create and populate the dataset, you are ready to train the model (see Creating and managing models).

Managing datasets

Listing datasets

A project can include numerous datasets. This section describes how to retrieve a list of the available datasets for a project.

Web UI

To see a list of the available datasets using the AutoML Translation UI, click the Datasets link at the top of the left navigation menu.

Datasets page with one dataset

To see the datasets for a different project, select the project from the drop-down list in the upper right of the title bar.

REST

Before using any of the request data, make the following replacements:

  • project-id : your Google Cloud Platform project ID

HTTP method and URL:

GET https://automl.googleapis.com/v1/projects/project-id/locations/us-central1/datasets

You should receive a JSON response similar to the following:

{
  "datasets": [
    {
      "name": "projects/project-number/locations/us-central1/datasets/dataset-id",
      "displayName": "dataset-display-name",
      "createTime": "2019-10-01T22:47:38.347689Z",
      "etag": "AB3BwFpPWn6klFqJ867nz98aXr_JHcfYFQBMYTf7rcO-JMi8Ez4iDSNrRW4Vv501i488",
      "translationDatasetMetadata": {
        "sourceLanguageCode": "source-language",
        "targetLanguageCode": "target-language"
      }
    },
    ...
  ]
}
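
A list response like the one above is often reduced to the fields needed by later calls. The following Python sketch assumes the response has already been parsed into a dictionary with the field names shown above. The function name and sample values are hypothetical.

```python
def summarize_datasets(list_response):
    """Return one small dict per dataset: ID, display name, and language pair."""
    summaries = []
    for dataset in list_response.get("datasets", []):
        metadata = dataset.get("translationDatasetMetadata", {})
        summaries.append({
            # The dataset ID is the last path segment of the resource name.
            "id": dataset["name"].rsplit("/", 1)[-1],
            "display_name": dataset.get("displayName", ""),
            "source": metadata.get("sourceLanguageCode"),
            "target": metadata.get("targetLanguageCode"),
        })
    return summaries


# Made-up response for illustration.
sample_response = {
    "datasets": [
        {
            "name": "projects/123/locations/us-central1/datasets/TRL0000",
            "displayName": "my_dataset",
            "translationDatasetMetadata": {
                "sourceLanguageCode": "en",
                "targetLanguageCode": "es",
            },
        }
    ]
}
for summary in summarize_datasets(sample_response):
    print(summary)
```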

Go

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Go API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
	"google.golang.org/api/iterator"
)

// listDatasets lists existing datasets.
func listDatasets(w io.Writer, projectID string, location string) error {
	// projectID := "my-project-id"
	// location := "us-central1"

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.ListDatasetsRequest{
		Parent: fmt.Sprintf("projects/%s/locations/%s", projectID, location),
	}

	it := client.ListDatasets(ctx, req)

	// Iterate over all results
	for {
		dataset, err := it.Next()
		if err == iterator.Done {
			break
		}
		if err != nil {
			return fmt.Errorf("ListDatasets.Next: %w", err)
		}

		fmt.Fprintf(w, "Dataset name: %v\n", dataset.GetName())
		fmt.Fprintf(w, "Dataset display name: %v\n", dataset.GetDisplayName())
		fmt.Fprintf(w, "Dataset create time:\n")
		fmt.Fprintf(w, "\tseconds: %v\n", dataset.GetCreateTime().GetSeconds())
		fmt.Fprintf(w, "\tnanos: %v\n", dataset.GetCreateTime().GetNanos())

		// Translate
		if metadata := dataset.GetTranslationDatasetMetadata(); metadata != nil {
			fmt.Fprintf(w, "Translation dataset metadata:\n")
			fmt.Fprintf(w, "\tsource_language_code: %v\n", metadata.GetSourceLanguageCode())
			fmt.Fprintf(w, "\ttarget_language_code: %v\n", metadata.GetTargetLanguageCode())
		}
	}

	return nil
}
 

Java

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Java API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.Dataset;
import com.google.cloud.automl.v1.ListDatasetsRequest;
import com.google.cloud.automl.v1.LocationName;
import java.io.IOException;

class ListDatasets {

  static void listDatasets() throws IOException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    listDatasets(projectId);
  }

  // List the datasets
  static void listDatasets(String projectId) throws IOException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // A resource that represents Google Cloud Platform location.
      LocationName projectLocation = LocationName.of(projectId, "us-central1");
      ListDatasetsRequest request =
          ListDatasetsRequest.newBuilder().setParent(projectLocation.toString()).build();

      // List all the datasets available in the region by applying filter.
      System.out.println("List of datasets:");
      for (Dataset dataset : client.listDatasets(request).iterateAll()) {
        // Display the dataset information
        System.out.format("\nDataset name: %s\n", dataset.getName());
        // To get the dataset id, you have to parse it out of the `name` field. As dataset Ids are
        // required for other methods.
        // Name Form: `projects/{project_id}/locations/{location_id}/datasets/{dataset_id}`
        String[] names = dataset.getName().split("/");
        String retrievedDatasetId = names[names.length - 1];
        System.out.format("Dataset id: %s\n", retrievedDatasetId);
        System.out.format("Dataset display name: %s\n", dataset.getDisplayName());
        System.out.println("Dataset create time:");
  
 System 
 . 
 out 
 . 
 format 
 ( 
 "\tseconds: %s\n" 
 , 
  
 dataset 
 . 
 getCreateTime 
 (). 
 getSeconds 
 ()); 
  
 System 
 . 
 out 
 . 
 format 
 ( 
 "\tnanos: %s\n" 
 , 
  
 dataset 
 . 
 getCreateTime 
 (). 
 getNanos 
 ()); 
  
 System 
 . 
 out 
 . 
 println 
 ( 
 "Translation dataset metadata:" 
 ); 
  
 System 
 . 
 out 
 . 
 format 
 ( 
  
 "\tSource language code: %s\n" 
 , 
  
 dataset 
 . 
 getTranslationDatasetMetadata 
 (). 
 getSourceLanguageCode 
 ()); 
  
 System 
 . 
 out 
 . 
 format 
 ( 
  
 "\tTarget language code: %s\n" 
 , 
  
 dataset 
 . 
 getTranslationDatasetMetadata 
 (). 
 getTargetLanguageCode 
 ()); 
  
 } 
  
 } 
  
 } 
 } 
 

Node.js

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Node.js API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function listDatasets() {
  // Construct request
  const request = {
    parent: client.locationPath(projectId, location),
    filter: 'translation_dataset_metadata:*',
  };

  const [response] = await client.listDatasets(request);

  console.log('List of datasets:');
  for (const dataset of response) {
    console.log(`Dataset name: ${dataset.name}`);
    console.log(
      `Dataset id: ${
        dataset.name.split('/')[dataset.name.split('/').length - 1]
      }`
    );
    console.log(`Dataset display name: ${dataset.displayName}`);
    console.log('Dataset create time');
    console.log(`\tseconds ${dataset.createTime.seconds}`);
    console.log(`\tnanos ${dataset.createTime.nanos / 1e9}`);
    if (dataset.translationDatasetMetadata !== undefined) {
      console.log('Translation dataset metadata:');
      console.log(
        `\tSource language code: ${dataset.translationDatasetMetadata.sourceLanguageCode}`
      );
      console.log(
        `\tTarget language code: ${dataset.translationDatasetMetadata.targetLanguageCode}`
      );
    }
  }
}

listDatasets();
 

Python

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Python API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"

client = automl.AutoMlClient()
# A resource that represents Google Cloud Platform location.
project_location = f"projects/{project_id}/locations/us-central1"

# List all the datasets available in the region.
request = automl.ListDatasetsRequest(parent=project_location, filter="")
response = client.list_datasets(request=request)

print("List of datasets:")
for dataset in response:
    print(f"Dataset name: {dataset.name}")
    print("Dataset id: {}".format(dataset.name.split("/")[-1]))
    print(f"Dataset display name: {dataset.display_name}")
    print(f"Dataset create time: {dataset.create_time}")
    print("Translation dataset metadata:")
    print(
        "\tsource_language_code: {}".format(
            dataset.translation_dataset_metadata.source_language_code
        )
    )
    print(
        "\ttarget_language_code: {}".format(
            dataset.translation_dataset_metadata.target_language_code
        )
    )
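
The list samples take the last path segment of the dataset's name as its ID. A minimal Python sketch of that step as a reusable helper follows. The function name dataset_id_from_name and the example resource name are hypothetical, and the sketch assumes the name follows the form projects/{project_id}/locations/{location_id}/datasets/{dataset_id} noted in the Java sample.

```python
def dataset_id_from_name(name: str) -> str:
    """Return the dataset ID from a full dataset resource name.

    Assumes the form
    projects/{project_id}/locations/{location_id}/datasets/{dataset_id}.
    """
    parts = name.split("/")
    # Validate the shape instead of blindly taking the last segment.
    if (
        len(parts) != 6
        or parts[0] != "projects"
        or parts[2] != "locations"
        or parts[4] != "datasets"
        or not parts[5]
    ):
        raise ValueError(f"Unexpected dataset name: {name!r}")
    return parts[5]


# Hypothetical resource name, for illustration only.
example = "projects/my-project/locations/us-central1/datasets/TRL1234567890"
print(dataset_id_from_name(example))
```

Validating the segments is a design choice of this sketch; the samples simply take the last element of the split.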
 

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for Ruby.

Deleting a dataset

Web UI

  1. In the AutoML Translation UI, click the Datasets link at the top of the left navigation menu to display the list of available datasets.

  2. Click the three-dot menu at the far right of the row you want to delete and select Delete.

  3. Click Confirm in the confirmation dialog box.

REST

Before using any of the request data, make the following replacements:

  • dataset-name : the full name of the dataset that you want to delete, as returned in the response when you created the dataset. The full name has the format projects/{project-id}/locations/us-central1/datasets/{dataset-id}

HTTP method and URL:

DELETE https://automl.googleapis.com/v1/dataset-name
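
As a rough illustration of this request, the following Python sketch builds the DELETE request with the standard library and does not send it. The helper name build_delete_request, the example dataset name, and the ACCESS_TOKEN placeholder are hypothetical. Passing the token in an Authorization: Bearer header is an assumption based on the access-token approach described for the create request.

```python
import urllib.request

API_ROOT = "https://automl.googleapis.com/v1/"


def build_delete_request(dataset_name: str, access_token: str) -> urllib.request.Request:
    """Build (but do not send) the DELETE request for a dataset.

    dataset_name is the full resource name:
    projects/{project-id}/locations/us-central1/datasets/{dataset-id}
    """
    return urllib.request.Request(
        API_ROOT + dataset_name,
        method="DELETE",
        # Assumption: the access token is sent as a bearer token.
        headers={"Authorization": f"Bearer {access_token}"},
    )


# Placeholder values; replace them before sending anything.
request = build_delete_request(
    "projects/my-project/locations/us-central1/datasets/TRL1234567890",
    "ACCESS_TOKEN",
)
print(request.get_method(), request.full_url)
# To actually send the request: urllib.request.urlopen(request)
```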

You should receive a JSON response similar to the following:

{
  "name": "projects/project-number/locations/us-central1/operations/operation-id",
  "metadata": {
    "@type": "type.googleapis.com/google.cloud.automl.v1.OperationMetadata",
    "createTime": "2019-10-02T16:43:03.923442Z",
    "updateTime": "2019-10-02T16:43:03.923442Z",
    "deleteDetails": {}
  },
  "done": true,
  "response": {
    "@type": "type.googleapis.com/google.protobuf.Empty"
  }
}
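
A response like the one above can be checked programmatically. The Python sketch below treats the delete as successful when the operation is done, carries no error, and returns an Empty response. The helper name delete_succeeded is hypothetical, and the check for an error field assumes the general shape of long-running operations, where a failed operation reports error instead of response.

```python
import json

EMPTY_TYPE = "type.googleapis.com/google.protobuf.Empty"


def delete_succeeded(operation: dict) -> bool:
    """Return True if the operation is done, has no error, and returned Empty."""
    if not operation.get("done", False):
        return False  # Still running; check the operation again later.
    if "error" in operation:
        return False  # Assumption: failures are reported in an "error" field.
    return operation.get("response", {}).get("@type") == EMPTY_TYPE


# Trimmed copy of the sample response, with placeholder identifiers.
sample = json.loads("""
{
  "name": "projects/project-number/locations/us-central1/operations/operation-id",
  "metadata": {"deleteDetails": {}},
  "done": true,
  "response": {"@type": "type.googleapis.com/google.protobuf.Empty"}
}
""")
print(delete_succeeded(sample))
```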

Go

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Go API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import (
	"context"
	"fmt"
	"io"

	automl "cloud.google.com/go/automl/apiv1"
	"cloud.google.com/go/automl/apiv1/automlpb"
)

// deleteDataset deletes a dataset.
func deleteDataset(w io.Writer, projectID string, location string, datasetID string) error {
	// projectID := "my-project-id"
	// location := "us-central1"
	// datasetID := "TRL123456789..."

	ctx := context.Background()
	client, err := automl.NewClient(ctx)
	if err != nil {
		return fmt.Errorf("NewClient: %w", err)
	}
	defer client.Close()

	req := &automlpb.DeleteDatasetRequest{
		Name: fmt.Sprintf("projects/%s/locations/%s/datasets/%s", projectID, location, datasetID),
	}

	op, err := client.DeleteDataset(ctx, req)
	if err != nil {
		return fmt.Errorf("DeleteDataset: %w", err)
	}
	fmt.Fprintf(w, "Processing operation name: %q\n", op.Name())

	if err := op.Wait(ctx); err != nil {
		return fmt.Errorf("Wait: %w", err)
	}

	fmt.Fprintf(w, "Dataset deleted.\n")

	return nil
}
 

Java

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Java API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

import com.google.cloud.automl.v1.AutoMlClient;
import com.google.cloud.automl.v1.DatasetName;
import com.google.protobuf.Empty;
import java.io.IOException;
import java.util.concurrent.ExecutionException;

class DeleteDataset {

  static void deleteDataset() throws IOException, ExecutionException, InterruptedException {
    // TODO(developer): Replace these variables before running the sample.
    String projectId = "YOUR_PROJECT_ID";
    String datasetId = "YOUR_DATASET_ID";
    deleteDataset(projectId, datasetId);
  }

  // Delete a dataset
  static void deleteDataset(String projectId, String datasetId)
      throws IOException, ExecutionException, InterruptedException {
    // Initialize client that will be used to send requests. This client only needs to be created
    // once, and can be reused for multiple requests. After completing all of your requests, call
    // the "close" method on the client to safely clean up any remaining background resources.
    try (AutoMlClient client = AutoMlClient.create()) {
      // Get the full path of the dataset.
      DatasetName datasetFullId = DatasetName.of(projectId, "us-central1", datasetId);
      Empty response = client.deleteDatasetAsync(datasetFullId).get();
      System.out.format("Dataset deleted. %s\n", response);
    }
  }
}
 

Node.js

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Node.js API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

/**
 * TODO(developer): Uncomment these variables before running the sample.
 */
// const projectId = 'YOUR_PROJECT_ID';
// const location = 'us-central1';
// const datasetId = 'YOUR_DATASET_ID';

// Imports the Google Cloud AutoML library
const {AutoMlClient} = require('@google-cloud/automl').v1;

// Instantiates a client
const client = new AutoMlClient();

async function deleteDataset() {
  // Construct request
  const request = {
    name: client.datasetPath(projectId, location, datasetId),
  };

  const [operation] = await client.deleteDataset(request);

  // Wait for operation to complete.
  const [response] = await operation.promise();
  console.log(`Dataset deleted: ${response}`);
}

deleteDataset();
 

Python

To learn how to install and use the client library for AutoML Translation, see AutoML Translation client libraries. For more information, see the AutoML Translation Python API reference documentation.

To authenticate to AutoML Translation, set up Application Default Credentials. For more information, see Set up authentication for a local development environment.

from google.cloud import automl

# TODO(developer): Uncomment and set the following variables
# project_id = "YOUR_PROJECT_ID"
# dataset_id = "YOUR_DATASET_ID"

client = automl.AutoMlClient()
# Get the full path of the dataset
dataset_full_id = client.dataset_path(project_id, "us-central1", dataset_id)
response = client.delete_dataset(name=dataset_full_id)

print(f"Dataset deleted. {response.result()}")
 

Additional languages

C#: Please follow the C# setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for .NET.

PHP: Please follow the PHP setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for PHP.

Ruby: Please follow the Ruby setup instructions on the client libraries page and then visit the AutoML Translation reference documentation for Ruby.

Import issues

When you create a dataset, AutoML Translation might drop sentence pairs if they are too long or if the pairs are exactly the same in the source and target languages.

For sentence pairs that are too long, we recommend that you break up sentences to roughly 200 words or fewer, and then recreate the dataset to include the dropped pairs. While processing your data, AutoML Translation tokenizes your input with an internal process, which can increase the size of your sentences. AutoML Translation measures data size on this tokenized data, so the 200-word limit is only an estimate of the maximum length.
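
One way to find candidates for splitting before you import is a simple word-count check. The Python sketch below flags pairs where either side exceeds roughly 200 words. The names word_count and too_long and the sample pairs are hypothetical. Counting whitespace-separated words is only an approximation, because AutoML Translation measures length on its own internal tokenization.

```python
MAX_WORDS = 200  # Rough guideline from the text above, not an exact limit.


def word_count(sentence: str) -> int:
    """Count whitespace-separated words (an approximation of the real tokenizer)."""
    return len(sentence.split())


def too_long(source: str, target: str, max_words: int = MAX_WORDS) -> bool:
    """Return True if either side of the pair exceeds max_words."""
    return word_count(source) > max_words or word_count(target) > max_words


# Illustrative data only.
pairs = [
    ("Hello world.", "Bonjour le monde."),
    ("word " * 250, "mot " * 250),
]
flagged = [pair for pair in pairs if too_long(*pair)]
print(f"{len(flagged)} of {len(pairs)} pairs exceed {MAX_WORDS} words")
```

Because the real limit applies to tokenized data, a lower threshold than 200 may be a safer choice for this kind of check.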

For sentence pairs that are the same in the source and target languages, you can remove them from your dataset. If you want to keep these sentences untranslated, use a glossary resource to build a custom dictionary that defines how AutoML Translation handles specific terms.
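
Identical pairs can be separated out before import with a small filter. The Python sketch below splits a list of pairs into those to keep and those whose source and target match. The function name split_identical and the sample pairs are hypothetical, and trimming surrounding whitespace before comparing is an assumption of this sketch.

```python
def split_identical(pairs):
    """Separate pairs whose source and target text are the same."""
    kept, identical = [], []
    for source, target in pairs:
        # Assumption: ignore leading and trailing whitespace when comparing.
        if source.strip() == target.strip():
            identical.append((source, target))
        else:
            kept.append((source, target))
    return kept, identical


# Illustrative data only.
pairs = [
    ("Hello", "Bonjour"),
    ("Google Cloud", "Google Cloud"),
]
kept, identical = split_identical(pairs)
print(f"kept {len(kept)}, identical {len(identical)}")
```

The identical list is a natural starting point for the terms you might put in a glossary.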
