This page shows you the steps to create and manage
AML AI datasets. A dataset is used as an input for the engine
configuration, training, backtest, and prediction pipelines. An
AML AI dataset contains references to BigQuery
tables matching the AML AI input data model in a
Google Cloud project.
Prerequisites
To get the permissions that
you need to create and manage datasets,
ask your administrator to grant you the
Financial Services Admin (financialservices.admin)
IAM role on your project.
For more information about granting roles, see Manage access to projects, folders, and organizations.
Some API methods return a long-running operation (LRO).
These methods are asynchronous and return an Operation object; for details, see the REST Reference. The
operation might not be completed when the method returns a response. For these methods, send the
request and then check for the result. In general, all POST, PUT, UPDATE, and DELETE operations are
long-running.
Create a dataset
To create a dataset, send the create request and then check for the result of the LRO.
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.create
Before using any of the request data,
make the following replacements:
PROJECT_ID: your Google Cloud project ID listed
in the IAM Settings
LOCATION: the location of the instance; use one of the supported regions:
us-central1
us-east1
asia-south1
europe-west1
europe-west2
europe-west4
northamerica-northeast1
southamerica-east1
australia-southeast1
INSTANCE_ID: the user-defined identifier for the instance
DATASET_ID: a user-defined identifier for the
AML AI dataset; use only lowercase letters, numbers, dashes, and underscores (for
example, train_jan2018_apr2020)
BQ_INPUT_DATASET_NAME: the
BigQuery input dataset name
PARTY_TABLE: the Party table in the
BigQuery input dataset
ACCOUNT_PARTY_LINK_TABLE: the AccountPartyLink table in the BigQuery input dataset
TRANSACTION_TABLE: the Transaction table in the BigQuery input dataset
RISK_CASE_EVENT_TABLE: the RiskCaseEvent table in the BigQuery input dataset
PARTY_SUPPLEMENTARY_DATA: the PartySupplementaryData table in the BigQuery input dataset; this
table is optional and can be removed from the request JSON
DATA_START_DATE: the start date and time of the data to
use in the dataset; use RFC3339 UTC "Zulu" format (for example, 2014-10-02T15:01:23Z)
DATA_END_DATE: the end date and time of the data to
use in the dataset; use RFC3339 UTC "Zulu" format (for example, 2014-10-02T15:01:23Z)
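Before sending a request, it can help to check the replacement values locally. The following is a minimal Python sketch, not part of the AML AI tooling: the helper names are hypothetical, and the regular expression encodes only the character rule stated above (lowercase letters, numbers, dashes, and underscores). Any additional length or first-character rule that the service enforces is not captured here.

```python
import re
from datetime import datetime, timezone

# Assumption: only the character set described in the text is checked.
DATASET_ID_PATTERN = re.compile(r"[a-z0-9_-]+")


def is_valid_dataset_id(dataset_id: str) -> bool:
    """Return True if the ID uses only lowercase letters, digits, '-' and '_'."""
    return DATASET_ID_PATTERN.fullmatch(dataset_id) is not None


def to_rfc3339_zulu(moment: datetime) -> str:
    """Format a timezone-aware datetime as RFC 3339 UTC with a trailing 'Z'."""
    if moment.tzinfo is None:
        raise ValueError("expected a timezone-aware datetime")
    # Convert to UTC first so that the 'Z' suffix is accurate.
    return moment.astimezone(timezone.utc).strftime("%Y-%m-%dT%H:%M:%SZ")
```

For example, `is_valid_dataset_id("train_jan2018_apr2020")` accepts the sample ID from the list above, and `to_rfc3339_zulu` can produce values for DATA_START_DATE and DATA_END_DATE.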
To send your request, choose one of these options:
curl
Save the request body in a file named request.json.
Run the following command in the terminal to create or overwrite
this file in the current directory:
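As an illustration of how the replacement values fit together, the following Python sketch assembles a create request without sending it. Several details are assumptions that do not appear on this page and should be checked against the REST Reference: the `financialservices.googleapis.com/v1` endpoint, the `dataset_id` query parameter, the `tableSpecs` and `dateRange` field names, the table keys, and the `bq://` URI form. The service may also require fields that are not shown here. The function name is hypothetical.

```python
import json
from urllib.parse import quote

# Assumed API root; confirm against the REST Reference.
API_ROOT = "https://financialservices.googleapis.com/v1"


def build_create_request(project_id, location, instance_id, dataset_id,
                         bq_dataset, tables, start_time, end_time):
    """Return (url, body) for a dataset create call; nothing is sent."""
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/{location}"
        f"/instances/{instance_id}/datasets?dataset_id={quote(dataset_id)}"
    )
    body = {
        # Assumed field names. Each value points at one BigQuery table.
        "tableSpecs": {
            key: f"bq://{project_id}.{bq_dataset}.{table}"
            for key, table in tables.items()
        },
        "dateRange": {"startTime": start_time, "endTime": end_time},
    }
    return url, body


if __name__ == "__main__":
    # Placeholder values; the optional supplementary table is left out.
    url, body = build_create_request(
        "PROJECT_ID", "us-central1", "INSTANCE_ID", "train_jan2018_apr2020",
        "BQ_INPUT_DATASET_NAME",
        {
            "party": "PARTY_TABLE",
            "account_party_link": "ACCOUNT_PARTY_LINK_TABLE",
            "transaction": "TRANSACTION_TABLE",
            "risk_case_event": "RISK_CASE_EVENT_TABLE",
        },
        "2014-10-02T15:01:23Z", "2014-10-02T15:01:23Z",
    )
    print(url)
    print(json.dumps(body, indent=2))
```

Printing the body with `json.dumps` gives text that could be saved as request.json, once the field names have been confirmed.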
Copy the returned OPERATION_ID to use in the next section.
Check for the result
Use the projects.locations.operations.get method to check if the dataset has been created. If the response contains "done": false, repeat the command until the response contains "done": true.
These operations can take a few minutes to several hours to complete.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.operations.get
Before using any of the request data,
make the following replacements:
PROJECT_ID: your Google Cloud project ID listed
in the IAM Settings
LOCATION: the location of the instance; use one of
the supported regions:
us-central1
us-east1
asia-south1
europe-west1
europe-west2
europe-west4
northamerica-northeast1
southamerica-east1
australia-southeast1
OPERATION_ID: the identifier for the operation
To send your request, choose one of these options:
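The repeat-until-done check described above can also be scripted. The following Python sketch polls until the response contains `"done": true`. It is a sketch under stated assumptions: the endpoint in `operation_url` is assumed rather than taken from this page, both function names are hypothetical, and the caller supplies `fetch`, a function that performs the authenticated GET and returns the parsed JSON response.

```python
import time
from typing import Any, Callable, Dict, Optional


def operation_url(project_id: str, location: str, operation_id: str) -> str:
    """Build the URL for projects.locations.operations.get (assumed endpoint)."""
    return (
        "https://financialservices.googleapis.com/v1"
        f"/projects/{project_id}/locations/{location}/operations/{operation_id}"
    )


def wait_for_operation(
    fetch: Callable[[], Dict[str, Any]],
    poll_seconds: float = 60.0,
    max_polls: Optional[int] = None,
    sleep: Callable[[float], None] = time.sleep,
) -> Dict[str, Any]:
    """Call fetch() repeatedly until the operation reports "done": true."""
    polls = 0
    while True:
        operation = fetch()
        polls += 1
        if operation.get("done"):
            # A finished operation can still have failed; inspect it afterwards.
            return operation
        if max_polls is not None and polls >= max_polls:
            raise TimeoutError(f"operation not done after {polls} polls")
        sleep(poll_seconds)
```

Because these operations can take from a few minutes to several hours, a long polling interval such as the 60-second default is reasonable. `max_polls` is optional and exists only to bound the wait.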
Update a dataset
The only fields that can be updated in AML AI are label fields.
The following example updates the key-value pair user labels associated with the dataset.
Permissions required for this task
To perform this task, you must have been granted the following permissions:
Permissions
financialservices.v1datasets.update
Before using any of the request data,
make the following replacements:
PROJECT_ID: your Google Cloud project ID listed
in the IAM Settings
LOCATION: the location of the instance; use one of
the supported regions:
us-central1
us-east1
asia-south1
europe-west1
europe-west2
europe-west4
northamerica-northeast1
southamerica-east1
australia-southeast1
INSTANCE_ID: a user-defined identifier for the instance
DATASET_ID: the user-defined identifier for the
dataset
KEY: the key in a key-value pair used to organize
datasets. See labels for more information.
VALUE: the value in a key-value pair used to organize
datasets. See labels for more information.
Request JSON body:
{
  "labels": {
    "KEY": "VALUE"
  }
}
To send your request, choose one of these options:
curl
Save the request body in a file named request.json.
Run the following command in the terminal to create or overwrite
this file in the current directory:
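For illustration, the following Python sketch assembles the label update request without sending it. The request body mirrors the JSON shown above. The endpoint, the use of HTTP PATCH, and the `updateMask=labels` query parameter are assumptions that this page does not state, so confirm them in the REST Reference. The function name is hypothetical.

```python
import json
from urllib.parse import urlencode

# Assumed API root; confirm against the REST Reference.
API_ROOT = "https://financialservices.googleapis.com/v1"


def build_update_labels_request(project_id, location, instance_id,
                                dataset_id, labels):
    """Return (method, url, body) for a label update; nothing is sent."""
    # Assumption: the update mask limits the change to the labels field.
    query = urlencode({"updateMask": "labels"})
    url = (
        f"{API_ROOT}/projects/{project_id}/locations/{location}"
        f"/instances/{instance_id}/datasets/{dataset_id}?{query}"
    )
    body = {"labels": dict(labels)}
    return "PATCH", url, body


if __name__ == "__main__":
    method, url, body = build_update_labels_request(
        "PROJECT_ID", "us-central1", "INSTANCE_ID", "DATASET_ID",
        {"KEY": "VALUE"},
    )
    print(method, url)
    print(json.dumps(body, indent=2))
```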
Last updated 2025-09-04 UTC.