This page describes how to load data from Salesforce sObjects into Google Cloud with Cloud Data Fusion. The Salesforce batch source plugin reads sObjects from Salesforce. The sObjects are the Salesforce tables that you plan to pull from. Some examples of sObjects include opportunities, contacts, accounts, leads, and custom objects.
To reuse an existing connection, follow these steps:

1. Turn on **Use connection**.
2. Click **Browse connections**.
3. Click the connection name.
4. Optional: If a connection doesn't exist and you want to create a new reusable connection, click **Add connection** and refer to the steps in the **New connection** tab.

After the connection is set up, continue configuring the source:

1. Enter the **SObject name** to load all the columns from the object.
2. Optional: If you select the sObject name, you can filter the data with the following fields:
    - **Last modified after**: Only include records that were last modified after a given time.
    - **Last modified before**: Only include records that were last modified earlier than a given time.
    - **Duration**: Only include records that were last modified within a time window of the specified size.
    - **Offset**: Only include records where the last modified date is less than the logical start time of the pipeline, minus the given offset.
3. Optional: For supported sObjects, to improve the performance of a pipeline, turn on **Enable PK chunking**. For more information, see Improve performance with PK chunking.
4. Optional: Instead of specifying the SObject name, you can enter a SOQL query, such as `SELECT LastName from Contact`. For more information, see SOQL queries for the Salesforce source.
5. To test connectivity, click **Get schema**. Cloud Data Fusion connects to Salesforce and pulls the schema for the listed table (technically, an sObject).
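If you want to sanity-check your credentials and SOQL outside of Cloud Data Fusion before clicking **Get schema**, you can run the same kind of query with a Salesforce client library. The following sketch is illustrative only: it assumes the third-party `simple-salesforce` Python package, and the credential values shown are hypothetical placeholders.

```python
# Illustrative only: verify Salesforce credentials and a SOQL query outside
# Cloud Data Fusion. Assumes the third-party `simple-salesforce` package
# (pip install simple-salesforce). Replace the placeholder values with your own.
from simple_salesforce import Salesforce

sf = Salesforce(
    username="user@example.com",            # hypothetical username
    password="your-password",               # hypothetical password
    security_token="your-security-token",   # hypothetical security token
)

# The same style of SOQL query that the batch source runs:
result = sf.query("SELECT LastName FROM Contact LIMIT 5")
for record in result["records"]:
    print(record["LastName"])
```

If this query succeeds, the same username, password, security token, and SOQL should work in the plugin properties described in the next section.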
### Properties

| Property | Macro enabled | Required property | Description |
| --- | --- | --- | --- |
| Reference name | No | Yes | Used to uniquely identify this source for tasks such as lineage and annotating metadata. |
| Use connection | No | No | Use an existing connection. If a connection is used, you don't need to provide the credentials. |
| Browse connections | Yes | No | Name of the connection to use. |
| Username | Yes | Yes | Salesforce username. |
| Password | Yes | Yes | Salesforce password. |
| Security token | Yes | No | Salesforce security token. If the password doesn't contain the security token, Cloud Data Fusion appends the token before authenticating with Salesforce. |
| Login URL | | | Salesforce OAuth2 login URL. The default is `https://login.salesforce.com/services/oauth2/token`. |
| Connection timeout | Yes | No | Maximum time, in milliseconds, to wait for connection initialization before it times out. The default is 30000 milliseconds. |
| Proxy URL | Yes | No | Proxy URL, which contains a protocol, address, and port. |
| SOQL | Yes | No | A SOQL query to fetch data into the source. Examples: `SELECT Id, Name, BillingCity FROM Account` and `SELECT Id FROM Contact WHERE Name LIKE 'A%' AND MailingCity = 'California'`. |
| SObject name | Yes | No | Salesforce object name to read. If a value is provided, the connector gets all fields for this object from Salesforce and generates a SOQL query, such as `select FIELD_1, FIELD_2 from ${sObjectName}`. Ignored if a SOQL query is provided. Some sObjects aren't supported in the Salesforce Bulk API. Creating a job with an unsupported object causes the following error: `Entity is not supported by the Bulk API`. These objects also aren't supported by Einstein Analytics, which uses the Bulk API for querying data. Cases when the Bulk API isn't used: when the connector sends the query to Salesforce to receive the array of batch information, it checks the query length. If the query is within the limit, the original query is executed. Otherwise, the connector switches to wide object logic: it generates an ID query that only retrieves batch information for IDs, which are used later for retrieving data through the SOAP API. |
| Last modified after | Yes | No | Filter the data to only include records where the system field `LastModifiedDate` is greater than or equal to the specified date. Use the Salesforce date format (see examples). If no value is provided, no lower-bound date is applied. |
| Last modified before | Yes | No | Filter the data to only include records where the system field `LastModifiedDate` is less than the specified date. Use the Salesforce date format (see examples). Specifying this value with the **Last modified after** property lets you read data that was modified within a window of time. If no value is provided, no upper-bound date is applied. |
| Duration | Yes | No | Filter the data to only read records that were last modified within a time window of the specified size. For example, if the duration is `6 hours` and the pipeline runs at 9 AM, it reads data that was last updated from 3 AM (inclusive) to 9 AM (exclusive). Specify the duration with numbers and the following time units: seconds, minutes, hours, days, months, years. Several units can be specified, but each unit can only be used once. For example, `2 days, 1 hours, 30 minutes`. If a value is already specified for **Last modified after** or **Last modified before**, the duration is ignored. |
| Offset | Yes | No | Filter the data to only read records where the system field `LastModifiedDate` is less than the logical start time of the pipeline, minus the given offset. For example, if the duration is `6 hours`, the offset is `1 hours`, and the pipeline runs at 9 AM, data that was last modified between 2 AM (inclusive) and 8 AM (exclusive) is read. Specify the offset with numbers and the following time units: seconds, minutes, hours, days, months, years. Several units can be specified, but each unit can only be used once. For example, `2 days, 1 hours, 30 minutes`. If a value is already specified for **Last modified after** or **Last modified before**, the offset is ignored. |
| SOQL operation type | No | No | Specify the query operation to run on the table. If `query` is selected, only current records are returned. Selecting `queryAll` returns all current and deleted records. The default operation is `query`. |
| Enable PK chunking | Yes | No | Primary key (PK) chunking splits a query on large tables into pieces, or chunks, based on the record IDs, or primary keys, of the queried records. Salesforce recommends that you enable PK chunking when querying tables with more than 10 million records, or when a bulk query constantly times out. For more information, see PK chunking. PK chunking only works with queries that don't include `SELECT` clauses or conditions other than `WHERE`. Chunking is supported for custom objects and any Sharing and History tables that support standard objects. |
| Chunk size | Yes | No | Specify the size of a chunk. The maximum size is 250,000. The default size is 100,000. |
| SObject parent name | Yes | No | Parent of the Salesforce object. This is used to enable chunking for history tables or shared objects. |
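To make the **Duration** and **Offset** semantics in the table above concrete, the following sketch computes the effective `LastModifiedDate` window from a pipeline's logical start time. It only illustrates the filter behavior described in the table; it is not plugin code, and the function and variable names are my own.

```python
# Illustrative only: how Duration and Offset map to a LastModifiedDate window,
# based on the filter behavior described in the Properties table. Not plugin code.
from datetime import datetime, timedelta

def last_modified_window(logical_start: datetime,
                         duration: timedelta,
                         offset: timedelta = timedelta(0)):
    """Return the (inclusive start, exclusive end) LastModifiedDate window."""
    end = logical_start - offset   # upper bound (exclusive)
    start = end - duration         # lower bound (inclusive)
    return start, end

# Example from the table: duration of 6 hours, offset of 1 hour, pipeline runs
# at 9 AM -> records modified from 2 AM (inclusive) to 8 AM (exclusive).
start, end = last_modified_window(
    logical_start=datetime(2024, 1, 1, 9, 0),   # hypothetical run date
    duration=timedelta(hours=6),
    offset=timedelta(hours=1),
)
print(start, end)  # 2024-01-01 02:00:00  2024-01-01 08:00:00
```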
### Salesforce date format examples

| Format syntax | Example |
| --- | --- |
| `YYYY-MM-DDThh:mm:ss+hh:mm` | `1999-01-01T23:01:01+01:00` |
| `YYYY-MM-DDThh:mm:ss-hh:mm` | `1999-01-01T23:01:01-08:00` |
| `YYYY-MM-DDThh:mm:ssZ` | `1999-01-01T23:01:01Z` |
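If you build these date strings programmatically (for example, to pass as macro values for **Last modified after** or **Last modified before**), something like the following produces the UTC form shown in the last row of the table. This is a generic illustration, not part of the plugin.

```python
# Illustrative only: format a UTC timestamp in the Salesforce date format
# shown above (YYYY-MM-DDThh:mm:ssZ).
from datetime import datetime, timezone

last_modified_after = datetime(1999, 1, 1, 23, 1, 1, tzinfo=timezone.utc)
print(last_modified_after.strftime("%Y-%m-%dT%H:%M:%SZ"))  # 1999-01-01T23:01:01Z
```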
### Data type mappings

The following table lists Salesforce data types with their corresponding CDAP types.
[[["Easy to understand","easyToUnderstand","thumb-up"],["Solved my problem","solvedMyProblem","thumb-up"],["Other","otherUp","thumb-up"]],[["Hard to understand","hardToUnderstand","thumb-down"],["Incorrect information or sample code","incorrectInformationOrSampleCode","thumb-down"],["Missing the information/samples I need","missingTheInformationSamplesINeed","thumb-down"],["Other","otherDown","thumb-down"]],["Last updated 2025-09-04 UTC."],[[["\u003cp\u003eThis page outlines the process of loading data from Salesforce sObjects into Google Cloud using Cloud Data Fusion's Salesforce batch source plugin.\u003c/p\u003e\n"],["\u003cp\u003eThe plugin allows data retrieval through Salesforce Object Query Language (SOQL) and supports the use of incremental or range date filters for selecting specific data subsets.\u003c/p\u003e\n"],["\u003cp\u003eConfiguration of the plugin requires either setting up a new connection with Salesforce credentials or using a previously configured, reusable connection.\u003c/p\u003e\n"],["\u003cp\u003eUsers can load all columns from a specified sObject or apply filters such as 'Last modified after/before,' 'Duration,' and 'Offset' to narrow down the data pulled.\u003c/p\u003e\n"],["\u003cp\u003eThe Salesforce batch source plugin supports PK chunking for enhanced performance when querying large datasets, and provides data mapping information from Salesforce data types to CDAP schema data types.\u003c/p\u003e\n"]]],[],null,["# Salesforce batch source\n\nThis page describes how to load data from [Salesforce\nsObjects](https://trailhead.salesforce.com/content/learn/modules/apex_database/apex_database_sobjects) into Google Cloud\nwith Cloud Data Fusion. The Salesforce batch source plugin reads sObjects from\nSalesforce. The sObjects are the Salesforce tables that you plan to pull from.\nSome examples of sObjects include opportunities, contacts, accounts, leads, and\ncustom objects.\n\nThe Salesforce batch source plugin supports reading data with [Salesforce Object\nQuery Language (SOQL)](/data-fusion/docs/use-case/salesforce-soql-queries)\nqueries and incremental or range date filters.\n\nBefore you begin\n----------------\n\n- The Cloud Data Fusion Salesforce batch source uses the Salesforce Bulk API V1.\n- Set up Salesforce before deploying and using Cloud Data Fusion Salesforce plugin. For more information, see [Create a Salesforce connected app for Cloud Data Fusion](/data-fusion/docs/how-to/create-salesforce-app-for-plugin).\n\nConfigure the plugin\n--------------------\n\n1. [Go to the Cloud Data Fusion web interface](/data-fusion/docs/create-data-pipeline#navigate-web-interface) and click **Studio**.\n2. Check that **Data Pipeline - Batch** is selected (not **Realtime**).\n3. In the **Source** menu, click **Salesforce** . The Salesforce node appears in your pipeline. If you don't see the Salesforce source on the **Studio** page, [deploy the Salesforce plugins from the Cloud Data Fusion Hub](/data-fusion/docs/how-to/deploy-a-plugin).\n4. To configure the source, go to the Salesforce node and click **Properties**.\n5. Enter the following properties. For a complete list, see\n [Properties](#properties).\n\n 1. Enter a **Label** for the Salesforce node---for example, `Salesforce\n tables`.\n 2. Enter a **Reference name** for the Salesforce source for lineage---for example, `Salesforce tables`.\n 3. Enter the connection details. 
You can set up a new, one-time connection,\n or an existing, reusable connection.\n\n ### New connection\n\n\n To add a one-time connection to Salesforce, follow these steps:\n 1. Keep **Use connection** turned off.\n 2. In the **Connection** section, enter the following information\n from the Salesforce account in these fields:\n\n - **Username**\n - **Password**\n - **Security token**\n - **Consumer key**\n - **Consumer secret**\n\n To get the credentials from Salesforce, see\n [Get properties from Salesforce](/data-fusion/docs/how-to/create-salesforce-app-for-plugin#get-salesforce-information).\n\n ### Reusable connection\n\n\n To reuse an existing connection, follow these steps:\n 1. Turn on **Use connection**.\n 2. Click **Browse connections**.\n 3. Click the connection name.\n\n | **Note:** For more information about adding, importing, and editing the connections that appear when you browse connections, see [Manage connections](/data-fusion/docs/how-to/managing-connections).\n 4. Optional: If a connection doesn't exist and you want to create a\n new reusable connection, click **Add connection** and refer to the\n steps in the [**New connection**](#new-connection) tab.\n\n 4. Enter the **SObject name** to load all the columns from the object.\n\n 5. Optional: If you select the sObject name, you can filter the data with\n the following fields:\n\n - **Last modified after**: Only include records that were last modified after a given time.\n - **Last modified before**: Only include records that were last modified earlier than a given time.\n - **Duration**: Only include records that were last modified within a time window of the specified size.\n - **Offset**: Only include records where the Last modified date is less than the logical start time of the pipeline, minus the given offset.\n 6. Optional: For supported sObjects, to improve the performance of a\n pipeline, turn on **Enable PK chunking** . For more information, see\n [Improve performance with PK chunking](/data-fusion/docs/concepts/best-practice-salesforce#pk-chunking).\n\n 7. Optional: Instead of specifying the sObject Name, you can enter a SOQL\n query, such as `SELECT LastName from Contact`. For more information, see\n [SOQL queries for the Salesforce source](/data-fusion/docs/use-case/salesforce-soql-queries).\n\n 8. To test connectivity, click **Get schema**. Cloud Data Fusion\n connects to Salesforce and pulls the schema for the listed table\n (technically, an sObject).\n\n### Properties\n\n### Salesforce date format examples\n\n### Data type mappings\n\nThe following table is a list of Salesforce data types with corresponding\nCDAP types.\n\nUse cases\n---------\n\nSee the following use cases for the Salesforce batch source:\n\n- [Use the Salesforce batch source plugin to analyze leads data in BigQuery](/data-fusion/docs/tutorials/connect-salesforce-to-bq)\n\n- [Use SOQL queries for the Salesforce source](/use-case/salesforce-soql-queries)\n\nBest practices\n--------------\n\nFor more information about improving performance in the Salesforce batch source, see the [best practices](/data-fusion/docs/concepts/best-practice-salesforce). 
\n\nRelease notes\n-------------\n\n- [March 14, 2024](/data-fusion/docs/release-notes#March_14_2024)\n- [August 7, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#August-7%2C-2023)\n- [August 4, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#August-4%2C-2023)\n- [July 7, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#July-7%2C-2023)\n- [June 7, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#June-7%2C-2023)\n- [May 25, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#May-25%2C-2023)\n- [April 13, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#April-13%2C-2023)\n- [March 20, 2023](https://cdap.atlassian.net/wiki/spaces/DOCS/pages/1280901131/CDAP+Hub+Release+Log#March-20%2C-2023)\n\nWhat's next\n-----------\n\n- Work through a [Salesforce plugin tutorial](/data-fusion/docs/tutorials/connect-salesforce-to-bq)."]]