Set up a Microsoft SharePoint data store

This page describes how to create a data store and connect Microsoft SharePoint to Gemini Enterprise.

Before you begin

Ensure the following before you set up your Microsoft SharePoint federated connection:

  1. Grant the Discovery Engine Editor role (roles/discoveryengine.editor). This role is required for the user to create the data store. To grant this role, do the following:

    1. In the Google Cloud console, go to the IAM page.

      Go to IAM

    2. Locate the user account and click the Edit icon.
    3. Grant the Discovery Engine Editor role to the user. For more information, see IAM roles and permissions.
  2. Register Gemini Enterprise as an OAuth 2.0 application in Microsoft Entra ID and obtain the following credentials:

    • Client ID

    • Client secret

    • Tenant ID

    • Instance URI

  3. Configure the Microsoft API permissions with the consent of a Microsoft SharePoint admin.
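
As an alternative to the console steps for granting the role, the same IAM binding can be sketched with the gcloud CLI. The project ID and user email below are placeholder assumptions. The snippet only prints the command so that you can review it before running it yourself.

```shell
# Placeholder values (assumptions); replace with your own project and user.
PROJECT_ID="my-project"
USER_EMAIL="user@example.com"

# Build the gcloud command that grants the Discovery Engine Editor role.
# It is printed, not executed, so that it can be reviewed first.
grant_cmd="gcloud projects add-iam-policy-binding ${PROJECT_ID} --member=user:${USER_EMAIL} --role=roles/discoveryengine.editor"
echo "${grant_cmd}"
```

To apply the binding, run the printed command in a shell where the gcloud CLI is authenticated with an account that is allowed to change IAM policy on the project.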

Create Microsoft SharePoint data store

To create a Microsoft SharePoint data store, do the following:

  1. In the Google Cloud console, go to the Gemini Enterprise page.

    Gemini Enterprise

  2. Select or create a Google Cloud project.

  3. In the navigation menu, click Data stores.

  4. Click Create data store.

  5. In the Source section, search for Microsoft SharePoint, and click Select.

  6. In the Data section:

    1. In the Connector mode section, select Federated search or Data ingestion as the connection mode.

    2. Click Continue.

    3. In the Authentication settings section, configure authentication based on your chosen connection mode.

      1. Provide the following authentication details as necessary:
        • Client ID: The unique identifier of the application registered in your organization's Microsoft Entra ID.
        • Client secret: The secret key generated during the OAuth 2.0 application registration process in Microsoft Entra ID.
        • Tenant ID: The unique identifier for your organization in the Microsoft Entra admin center.
        • Instance URI: The base URL for your Microsoft SharePoint instance (for example, https://{your-domain}.sharepoint.com).

        For more information, see Obtain client credentials.

      • If you selected Data ingestion with the Federated credentials authentication method, copy the Subject identifier and use it to add a federated credential for data ingestion.

        • Subject identifier: The unique ID of the service account you plan to use.
      • If you selected Federated search, click Login and complete the third-party sign-in.

    4. Click Continue.

    5. If you selected Federated search, the Destinations section appears. Enter the base URL for the site.

    6. Click Continue.

    7. If you selected Federated search, in the Advanced Options section, you can refine your search scope by specifying which Microsoft SharePoint sites or specific paths to include in or exclude from your results.

      You can add filters to the data store using the Google Cloud console or API:

      Console

      Note: When entering multiple sites, separate each URL with a comma. You can also copy and paste a list of URLs directly from a CSV file.

      1. In the Site filter section:
        • Choose the filter type: Include in search or Exclude from search.
        • In the Site URL filter field, enter the exact Microsoft SharePoint site URLs to filter, in the format https://{your-tenant}.sharepoint.com/sites/{your-site-name}.

          Note: Enter the exact site URLs. Subpaths, folders, or document libraries are not supported in this specific field.

      2. In the Path filter section:
        • Choose the filter type: Include in search or Exclude from search.
        • In the Path filter field, enter the full SharePoint paths.

          Note: This field supports subpaths, folders, and document libraries, in the format https://{your-tenant}.sharepoint.com/sites/{your-site-name}/{your-folder-name}/. Any content under the specified path will be filtered.

      REST

      To add filters when creating a data store, call the setUpDataConnector method.

      curl -X POST \
        -H "Authorization: Bearer $(gcloud auth print-access-token)" \
        -H "Content-Type: application/json" \
        -H "X-Goog-User-Project: PROJECT_ID" \
        "https://ENDPOINT_LOCATION-discoveryengine.googleapis.com/v1alpha/projects/PROJECT_ID/locations/LOCATION:setUpDataConnector" \
        -d '{
          "collectionId": "COLLECTION_ID",
          "collectionDisplayName": "COLLECTION_DISPLAY_NAME",
          "dataConnector": {
            "dataSource": "sharepoint_federated_search",
            "params": {
              "client_id": "CLIENT_ID",
              "client_secret": "CLIENT_SECRET",
              "instance_uri": "INSTANCE_URI",
              "tenant_id": "TENANT_ID",
              "FILTER_TYPE": {
                "FILTER_KEY": [
                  "FILTER_VALUE1",
                  "FILTER_VALUE2"
                ]
              }
            },
            "entities": [
              {
                "entityName": "file"
              }
            ],
            "refreshInterval": "7200s",
            "connectorType": "THIRD_PARTY_FEDERATED",
            "connectorModes": [
              "FEDERATED"
            ]
          }
        }'

      Replace the following:

      • PROJECT_ID: the ID of your project.
      • ENDPOINT_LOCATION: the multi-region for your API request. Specify one of the following values:
        • us for the US multi-region
        • eu for the EU multi-region
        • global for the Global location
        For more information, see Specify a multi-region for your data store.
      • LOCATION: the multi-region of your data store: global, us, or eu.
      • COLLECTION_ID: the unique ID of the data store.
      • COLLECTION_DISPLAY_NAME: the display name of the data store.
      • CLIENT_ID: the client ID for SharePoint authentication.
      • CLIENT_SECRET: the client secret for SharePoint authentication.
      • INSTANCE_URI: the instance URI for SharePoint.
      • TENANT_ID: the tenant ID for SharePoint.
      • FILTER_TYPE: the type of filter added to your Microsoft SharePoint data stores. For more information, see Filter types.
      • FILTER_KEY: the key for the filter, corresponding to a field in your data. For more information, see Filter keys.
      • FILTER_VALUE1, FILTER_VALUE2: the value or values to filter on for the specified FILTER_KEY.

    For more details on filter types and adding filters, see Add filters to Microsoft SharePoint data store.

  7. Click Continue.

    1. In the Entities to search (if you selected Federated search) or Entities to sync (if you selected Data ingestion) section:
      1. Select all the required entities.
      2. If you selected Federated search, proceed to the next step.
      3. If you selected Data ingestion, continue with the following steps:
        1. Optional: To sync specific projects, do the following:
          1. Click Filter.
          2. To filter entities out of the index, select the Exclude from the index checkbox, or to ensure that they are included in the index, select the Include to the index checkbox.
          3. Enter the keys. Press Enter after each key.
          4. Click Save.
        2. To configure the sync schedule, do the following:
          1. In the Sync frequency list, select the sync frequency.
            • To schedule separate full syncs of entity and identity data, expand the menu in the Full sync section and then select Custom options.
          2. In the Incremental sync frequency list, select the incremental sync frequency. For more information, see Sync schedules.
  8. Click Continue.

  9. In the Actions section:

    1. If you selected Federated search:
      1. From Select Microsoft SharePoint actions to enable, select the actions from the category to enable them for the connector.
    2. If you selected Data ingestion:

      1. In the Authentication settings section, configure authentication based on your chosen connection mode.

        1. Provide the following authentication details as necessary:

          • Client ID: The unique identifier of the application registered in your organization's Microsoft Entra ID.
          • Client secret: The secret key generated during the OAuth 2.0 application registration process in Microsoft Entra ID.
          • Tenant ID: The unique identifier for your organization in the Microsoft Entra admin center.
          • Instance URI: The base URL for your Microsoft SharePoint instance (for example, https://{your-domain}.sharepoint.com).

          For more information, see Obtain client credentials.

      2. Click Continue.

      3. In the Destinations section, enter the base URL for the site.

      4. Click Continue.

      5. From Select Microsoft SharePoint actions to enable, select the actions from the category to enable them for the connector.

  10. Click Continue.

    To manage the list of actions, see Manage actions.

  11. In the Configuration section:

    1. From the Multi-region list, select the location for your data connector.
    2. In the Your data connector name field, enter a name for your connector.
    3. If you selected us or eu as the location, configure the Encryption settings:
      • Optional: If you haven't configured single-region keys, click Go to settings page to do so. For more information, see Register a single-region key for third-party connectors.
      • Select Google-managed encryption key or Cloud KMS key.
      • If you selected Cloud KMS key:
        • In the Key management type list, select the appropriate type.
        • In the Cloud KMS key list, select the key.
      For more information, see Customer-managed encryption keys.
  12. Click Continue.

  13. In the Billing section, select General pricing or Configurable pricing. For more information, see Verify the billing status of your projects and Licenses.

  14. Click Create. Gemini Enterprise creates your data store and displays your data stores on the Data Stores page.

On the Data Stores page, click your data store name to see the status. After the data store state changes from Creating to Active, the Microsoft SharePoint connector is ready to be used.

For an ingestion connector created with Microsoft SharePoint, the data store state changes from Creating to Running when synchronization starts. It then changes to Active once ingestion is complete, which indicates that the data store is fully configured. Depending on data volume, ingestion may require several hours.

After creating the data store, create an app, connect it to the Microsoft SharePoint data store, and authorize Gemini Enterprise to access Microsoft SharePoint before executing any queries.
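
The Site URL filter and Path filter fields described in the Advanced Options step expect different URL shapes: the site field takes exact site URLs only, while the path field accepts subpaths, folders, and document libraries. The following shell sketch shows one way to sort a list of URLs before pasting them into the two fields. The function name classify_filter_url and the contoso tenant are hypothetical, and the check is a loose pattern match based only on the formats shown on this page, not a validation that Gemini Enterprise or Microsoft SharePoint performs.

```shell
# classify_filter_url URL
# Prints "site" for an exact site URL, "path" for a URL with a subpath
# under a site, and "invalid" for anything that does not match the
# https://{tenant}.sharepoint.com/sites/{site} shape.
classify_filter_url() {
  url="${1%/}"   # ignore a single trailing slash
  case "$url" in
    https://*.sharepoint.com/sites/*) ;;
    *) echo "invalid"; return ;;
  esac
  # Keep only what follows ".../sites/".
  rest="${url#https://*.sharepoint.com/sites/}"
  case "$rest" in
    "")  echo "invalid" ;;   # no site name given
    */*) echo "path" ;;      # site name plus folder or library
    *)   echo "site" ;;      # site name only
  esac
}

classify_filter_url "https://contoso.sharepoint.com/sites/hr"            # prints: site
classify_filter_url "https://contoso.sharepoint.com/sites/hr/policies/"  # prints: path
classify_filter_url "https://example.com/sites/hr"                       # prints: invalid
```

URLs reported as site belong in the Site URL filter field, and URLs reported as path belong in the Path filter field.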

Enable real-time sync for data ingestion

Real-time sync only syncs document entities and doesn't sync data related to identity entities. The following table shows which document events are supported with real-time sync.

Microsoft SharePoint entities Create Update Delete Permission changes
Attachments
Comments
Events
Files
Pages

To enable real-time sync for your data store, follow these steps.

  1. In the Google Cloud console, go to the Gemini Enterprise page.

    Gemini Enterprise

  2. In the navigation menu, click Data Stores.

  3. Click the name of the Microsoft SharePoint data store for which you want to enable real-time sync.

  4. On the data store Data page, wait until the Connector state changes to Active.

  5. In the Real-time sync field, click View/edit.

  6. To enable real-time sync, click the Enable real-time sync toggle.

  7. In the Client secret field, enter a value. This value is used to verify Microsoft SharePoint webhook events. We recommend using a string of 20 characters.

  8. Click Save.

    Wait for the Real-time sync field to change to Running.
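
For the Client secret value in step 7, a random string is generally safer than a hand-typed one. The following is a minimal sketch that generates a 20-character value from the system random source. It assumes that an alphanumeric string is acceptable for this field, which this page does not state explicitly, and the variable name webhook_secret is illustrative only.

```shell
# Generate a random 20-character alphanumeric string from /dev/urandom.
# tr drops every byte that is not a letter or digit; head keeps the first 20.
webhook_secret="$(LC_ALL=C tr -dc 'A-Za-z0-9' < /dev/urandom | head -c 20)"

# Print only the length as a sanity check, so the value itself is not
# written to the terminal or to logs.
echo "${#webhook_secret}"
```

Copy the value of webhook_secret into the Client secret field, and store it the same way you store other credentials.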

Data handling and query execution

This section describes how Gemini Enterprise manages your query and the privacy implications of using the federated data store.

Query execution

After you authorize Microsoft SharePoint and send a search query to Gemini Enterprise:

  • Gemini Enterprise sends your search query directly to the Microsoft API.
  • Gemini Enterprise blends the results with those from other connected data sources and displays a comprehensive search result.

Data handling

When using third-party federated search, the following data handling rules apply:

  • Your query string is sent to the third-party search backend (Microsoft API).
  • These third parties may associate queries with your identity.
  • If multiple federated search data sources are enabled, the query might be sent to all of them.
  • After the data reaches the third-party system, it is governed by that system's terms of service and privacy policies.
