Stream changes to data in near real-time with Datastream

Before you begin

  1. Sign in to your Google Cloud account. If you're new to Google Cloud, create an account to evaluate how our products perform in real-world scenarios. New customers also get $300 in free credits to run, test, and deploy workloads.
  2. In the Google Cloud console, on the project selector page, select or create a Google Cloud project.

    Go to project selector

  3. Verify that billing is enabled for your Google Cloud project.

  4. Enable the Datastream API.

    Enable the API

  5. Make sure you have the Datastream Admin role assigned to your user account.

    Go to the IAM page

If you want to create a private connectivity configuration for a standard or shared VPC network, you need to complete additional prerequisites. For more information, see Create a private connectivity configuration.

Requirements

Datastream offers a variety of source options, destination options, and networking connectivity methods.

In this quickstart, we assume that you're using a standalone Oracle database as the source and a Cloud Storage bucket as the destination. For the source database, you should be able to configure your network to add an inbound firewall rule. The source database can be on-premises or hosted by a cloud provider. The destination, being Cloud Storage, is in Google Cloud.

Because we can't know the specifics of your environment, we can't provide detailed steps when it comes to your networking configuration.

For this quickstart, you'll select IP allowlisting as the connectivity method. IP allowlisting is a security feature often used for limiting and controlling access to the data in your source database to trusted users. You can use IP allowlists to create lists of trusted IP addresses or IP ranges from which your users and other Cloud services such as Datastream can access this data. To use IP allowlists, you must open the source database or firewall to incoming connections from Datastream.
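The allowlist idea described above can be sketched in a few lines of Python. This is only an illustration of how an allowlist decides whether a connecting address is trusted; it is not how your firewall or Datastream implements the check. The addresses are documentation-only ranges standing in for the Datastream public IP addresses that the console shows for your region, and `ALLOWLIST` and `is_allowed` are hypothetical names.

```python
import ipaddress

# Hypothetical allowlist: documentation-only ranges standing in for the
# Datastream public IP addresses listed in the console for your region.
ALLOWLIST = ["203.0.113.0/28", "198.51.100.17/32"]


def is_allowed(client_ip: str, allowlist: list[str]) -> bool:
    """Return True if client_ip falls inside any allowlisted network."""
    address = ipaddress.ip_address(client_ip)
    return any(address in ipaddress.ip_network(entry) for entry in allowlist)


if __name__ == "__main__":
    for ip in ("203.0.113.5", "198.51.100.17", "192.0.2.44"):
        print(ip, "allowed" if is_allowed(ip, ALLOWLIST) else "blocked")
```

In practice the equivalent rule lives in your firewall or database listener configuration; the sketch only shows the matching logic.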

Create connection profiles

By creating connection profiles for a source database and a destination, you're creating records that contain information about the source and the destination.

In this quickstart, you'll select Oracle as the profile type for your source connection profile, and Cloud Storage as the profile type for your destination connection profile. Datastream uses the information in the connection profiles to transfer data from the source Oracle database into a destination bucket in Cloud Storage.

Create a source connection profile for Oracle database

  1. Go to the Connection profiles page for Datastream in the Google Cloud console.

    Go to the Connection profiles page

  2. Click CREATE PROFILE.

  3. In the Create a connection profile page, click the Oracle profile type (because you want to create a source connection profile for Oracle database).

  4. Supply the following information in the Define connection settings section of the Create Oracle profile page:

    • Enter My Source Connection Profile as the Connection profile name for your source database.
    • Keep the auto-generated Connection profile ID.
    • Select the Region where the connection profile will be stored.

    • Enter Connection details:

      • In the Hostname or IP field, enter a hostname or public IP address that Datastream can use to connect to the source Oracle database. You're providing a public IP address because IP allowlisting will be used as the network connectivity method for this quickstart.
      • In the Port field, enter the port number that's reserved for the source database. For an Oracle database, the default port is typically 1521.
      • Enter a Username and Password to authenticate to your source database.
      • In the System identifier (SID) field, enter the SID or service name that identifies the database instance. For Oracle databases, this is typically ORCL.
  5. In the Define connection settings section, click CONTINUE. The Define connectivity method section of the Create Oracle profile page becomes active.

  6. Choose the networking method that you'd like to use to establish connectivity between Datastream and the source database. For this quickstart, use the Connectivity method drop-down menu to select IP allowlisting as the networking method.

  7. Configure your source database to allow incoming connections from the Datastream public IP addresses that appear.

  8. In the Define connectivity method section, click CONTINUE. The Test connection profile section of the Create Oracle profile page becomes active.

  9. Click RUN TEST to verify that the source Oracle database and Datastream can communicate with each other.

  10. Verify that the "Test passed" status appears.

  11. If the test fails, you can address the problem in the appropriate part of the flow, and then return to re-test. Refer to the Diagnose issues page for troubleshooting steps.

  12. Click CREATE.
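If the connection test fails, a quick first check is whether the database listener port accepts TCP connections at all. The following is a minimal sketch using only the Python standard library; `port_is_reachable` is a hypothetical helper, and the host shown is a documentation-only placeholder. Note the limitation: it tests reachability from the machine that runs the script, not from the Datastream public IP addresses, so a success here doesn't prove that your allowlist rule is correct.

```python
import socket


def port_is_reachable(host: str, port: int, timeout: float = 5.0) -> bool:
    """Return True if a TCP connection to host:port opens within timeout."""
    try:
        with socket.create_connection((host, port), timeout=timeout):
            return True
    except OSError:
        # Covers refused connections, timeouts, and unreachable networks.
        return False


if __name__ == "__main__":
    # Hypothetical values: replace with your database's public address.
    # 1521 is the typical Oracle listener port mentioned in the steps above.
    print(port_is_reachable("203.0.113.10", 1521, timeout=2.0))
```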

Create a destination connection profile for Cloud Storage

  1. Go to the Connection profiles page for Datastream in the Google Cloud console.

    Go to the Connection profiles page

  2. Click CREATE PROFILE.

  3. In the Create a connection profile page, click the Cloud Storage profile type (because you want to create a destination connection profile for Cloud Storage).

  4. Supply the following information in the Create Cloud Storage profile page:

    • Enter My Destination Connection Profile as the Connection profile name for your destination Cloud Storage service.
    • Keep the auto-generated Connection profile ID.
    • Select the Region where the connection profile will be stored.
    • In the Connection details pane, click BROWSE.
    • In the Select bucket pane, select the destination bucket in Cloud Storage into which Datastream will transfer data from the source database, and then click SELECT.

      Your bucket appears in the Bucket name field of the Create Cloud Storage profile page.

    • Optionally, in the Connection profile path prefix field, you can provide a prefix for the path that will be appended to the bucket name when Datastream transfers data to the destination.

  5. Click CREATE.

After creating a source connection profile for Oracle database and a destination connection profile for Cloud Storage, you can use them to create a stream.

Create a stream

In this section, you create a stream. Datastream uses this stream to transfer data from a source Oracle database to a destination bucket in Cloud Storage.

Creating a stream includes:

  • Defining settings for the stream.
  • Selecting the connection profile that you created for your source database (the source connection profile). For this quickstart, this is My Source Connection Profile.
  • Configuring information about the source database for the stream by specifying the tables and schemas in the source database that Datastream:
    • Can transfer into the destination.
    • Is restricted from transferring into the destination.
  • Determining whether Datastream will backfill historical data, as well as stream ongoing changes into the destination, or stream only changes to the data.
  • Selecting the connection profile that you created for Cloud Storage (the destination connection profile). For this quickstart, this is My Destination Connection Profile.
  • Configuring information about the destination bucket for the stream. This information includes:
    • The folder of the destination bucket into which Datastream will transfer schemas, tables, and data from a source Oracle database.
    • The output format of files written to Cloud Storage. Datastream supports two output formats: Avro and JSON. For this quickstart, Avro is the file format.
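The idea of tables that a stream can transfer versus tables that it is restricted from transferring can be pictured as simple set logic. The sketch below is illustrative only and isn't Datastream's implementation: `select_tables` is a hypothetical helper, the schema and table names are made up, and it assumes that an excluded table is never transferred even if it also matches the include list.

```python
from typing import Iterable, Optional

SchemaTable = tuple[str, str]  # (schema name, table name)


def select_tables(
    available: Iterable[SchemaTable],
    include: Optional[Iterable[SchemaTable]] = None,
    exclude: Iterable[SchemaTable] = (),
) -> list[SchemaTable]:
    """Return the tables to transfer: the include list minus the exclude list.

    include=None stands for "all tables from all schemas".
    """
    available_set = set(available)
    candidates = available_set if include is None else available_set & set(include)
    return sorted(candidates - set(exclude))


if __name__ == "__main__":
    # Hypothetical source objects.
    source = [("HR", "EMPLOYEES"), ("HR", "SALARIES"), ("SALES", "ORDERS")]
    print(select_tables(source))  # everything, as in this quickstart
    print(select_tables(source, exclude=[("HR", "SALARIES")]))
```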

Define settings for the stream

  1. Go to the Streams page for Datastream in the Google Cloud console.

    Go to the Streams page

  2. Click CREATE STREAM.

  3. Supply the following information in the Define stream details panel of the Create stream page:

    • Enter My Stream as the Stream name.
    • Keep the auto-generated Stream ID.
    • From the Region menu, select the region where you created your source connection profile.
    • From the Source type menu, select the Oracle profile type.
    • From the Destination type menu, select the Cloud Storage profile type.
  4. Review the required prerequisites that are generated automatically to reflect how your environment must be prepared for a stream. These prerequisites can include how to configure the source database and how to connect Datastream to the destination bucket in Cloud Storage.

  5. Click CONTINUE. The Define Oracle connection profile panel of the Create stream page appears.

Specify information about the source connection profile

  1. From the Source connection profile menu, select your source connection profile for Oracle database.

  2. Click RUN TEST to verify that the source database and Datastream can communicate with each other.

    If the test fails, then the issue associated with the connection profile appears. Refer to the Diagnose issues page for troubleshooting steps. Make the necessary changes to correct the issue, and then retest.

  3. Click CONTINUE. The Configure stream source panel of the Create stream page appears.

Configure information about the source database for the stream

  1. Use the Objects to include menu to specify the tables and schemas in your source database that Datastream can transfer into a folder in the destination bucket in Cloud Storage. The menu only loads if your database has up to 5,000 objects.

    For this quickstart, you want Datastream to transfer all tables and schemas. Therefore, select All tables from all schemas from the menu.

  2. Specify the CDC method. For this quickstart, select LogMiner.

  3. Click CONTINUE. The Define Cloud Storage connection profile panel of the Create stream page appears.

Select a destination connection profile

  1. From the Destination connection profile menu, select your destination connection profile for Cloud Storage.

  2. Click CONTINUE. The Configure stream destination panel of the Create stream page appears.

Configure information about the destination for the stream

  1. In the Stream path prefix field, enter the folder of the destination bucket into which Datastream will transfer schemas, tables, and data from a source Oracle database.

    For this quickstart, you want Datastream to transfer data from the source database into the /root/tutorial folder in the destination bucket of Cloud Storage. Therefore, enter /root/tutorial in the Stream path prefix field.

  2. In the Output format field, select the format of files written to Cloud Storage. For this quickstart, Avro is the file format.

  3. Click CONTINUE. The Review stream details and create panel of the Create stream page appears.
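To picture where the data lands, the following sketch joins a bucket name and a stream path prefix into a single location. It assumes that the write path is simply the bucket name followed by the prefix; the bucket name `my-destination-bucket` and the helper `destination_uri` are hypothetical, and the actual path that Datastream reports may contain additional components.

```python
def destination_uri(bucket: str, path_prefix: str) -> str:
    """Join a bucket name and a path prefix into a gs:// style location."""
    # Drop empty segments so "/root/tutorial", "root/tutorial/" and
    # "root//tutorial" all produce the same result.
    segments = [segment for segment in path_prefix.split("/") if segment]
    return "gs://" + "/".join([bucket, *segments])


if __name__ == "__main__":
    # Hypothetical bucket name; /root/tutorial is the prefix used in this quickstart.
    print(destination_uri("my-destination-bucket", "/root/tutorial"))
```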

Create the stream

  1. Verify details about the stream as well as the source and destination connection profiles that the stream will use to transfer data from a source Oracle database to a destination bucket in Cloud Storage.

  2. Click RUN VALIDATION to validate the stream. By validating a stream, Datastream checks that the source is configured properly, validates that the stream can connect to both the source and the destination, and verifies the end-to-end configuration of the stream.

  3. After all validation checks pass, click CREATE.

  4. In the Create stream? dialog, click CREATE.

After creating a stream, you can start it.

Start the stream

In the previous section of the quickstart, you created a stream, but you didn't start it. You can do this now.

For this quickstart, you create and start the stream separately because starting a stream can put additional load on your source database. Creating the stream without starting it lets you defer that load until the database can absorb it.

After you start the stream, Datastream can transfer data, schemas, and tables from the source database to the destination.

  1. Go to the Streams page for Datastream in the Google Cloud console.

    Go to the Streams page

  2. Select the checkbox to the left of the stream that you want to start. For this quickstart, this is My Stream.

  3. Click START.

  4. In the dialog, click START. The status of the stream changes from Not started to Starting to Running.
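The status progression in the last step can be expressed as a small polling loop. This is a generic sketch, not a Datastream client: `wait_for_state` is a hypothetical helper, and `get_state` stands in for whatever you use to read the stream status (for example, a wrapper around an API call that you write yourself). The demo feeds it a canned sequence of states instead of a real stream.

```python
import time
from typing import Callable


def wait_for_state(
    get_state: Callable[[], str],
    target: str = "Running",
    poll_seconds: float = 5.0,
    max_polls: int = 60,
) -> list[str]:
    """Poll get_state until it returns target.

    Returns the distinct states observed, in order. Raises TimeoutError
    if the target state isn't reached within max_polls attempts.
    """
    seen: list[str] = []
    for _ in range(max_polls):
        state = get_state()
        if not seen or seen[-1] != state:
            seen.append(state)
        if state == target:
            return seen
        time.sleep(poll_seconds)
    last = seen[-1] if seen else None
    raise TimeoutError(f"stream did not reach {target!r}; last state: {last!r}")


if __name__ == "__main__":
    # Canned states standing in for a real status lookup.
    fake_states = iter(["Not started", "Starting", "Starting", "Running"])
    print(wait_for_state(lambda: next(fake_states), poll_seconds=0))
```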

After starting a stream, you can verify that Datastream transferred data from the source database to the destination.

Verify the stream

In this section, you confirm that Datastream transfers the data from all tables of your source Oracle database into the /root/tutorial folder of your Cloud Storage destination bucket.

  1. Go to the Streams page for Datastream in the Google Cloud console.

    Go to the Streams page

  2. Click the stream that you created. For this quickstart, this is My Stream.

  3. In the Stream details page, click the link that appears below the Destination write path field. The Bucket details page of Cloud Storage opens in a separate tab.

  4. Verify that you see folders that represent tables of your source Oracle database.

  5. Click one of the table folders and drill down until you see data that's associated with the table.
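If you prefer to check the result from a list of object names (for example, one that you exported from the bucket) rather than by clicking through the console, the following sketch extracts the top-level folders under the stream path prefix. `table_folders` is a hypothetical helper, and the object names, folder naming, and date-based layout in the demo are invented for illustration; the names in your bucket will differ.

```python
def table_folders(object_names: list[str], path_prefix: str) -> list[str]:
    """Return the sorted first-level folder names found under path_prefix."""
    stripped = path_prefix.strip("/")
    prefix = stripped + "/" if stripped else ""
    folders = set()
    for name in object_names:
        if not name.startswith(prefix):
            continue
        head, separator, _ = name[len(prefix):].partition("/")
        if separator:  # skip objects that sit directly under the prefix
            folders.add(head)
    return sorted(folders)


if __name__ == "__main__":
    # Hypothetical object names; real names depend on your source tables.
    names = [
        "root/tutorial/HR_EMPLOYEES/2024/01/15/part-0001.avro",
        "root/tutorial/HR_EMPLOYEES/2024/01/15/part-0002.avro",
        "root/tutorial/SALES_ORDERS/2024/01/15/part-0001.avro",
        "other/data.avro",
    ]
    print(table_folders(names, "/root/tutorial"))
```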

Clean up

To avoid incurring charges to your Google Cloud account for the resources used on this page, follow these steps.

  1. Use the Google Cloud console to delete your project, Datastream stream and connection profiles, and Cloud Storage destination bucket.

Cleaning up the resources that you created on Datastream ensures that they don't take up quota and that you aren't billed for them in the future. The following sections describe how to delete these resources.

Delete your project

The easiest way to eliminate billing is to delete the project that you created for this quickstart.

  1. In the Google Cloud console, go to the Manage resources page.

    Go to the Manage resources page

  2. In the project list, select the project that you want to delete, and then click Delete.

  3. In the dialog, type the project ID, and then click Shut down to delete the project.

Delete the stream

  1. Go to the Streams page for Datastream in the Google Cloud console.

    Go to the Streams page

  2. Click the stream that you want to delete. For this quickstart, this is My Stream.

  3. Click PAUSE.

  4. In the dialog, click PAUSE.

  5. In the Stream status pane of the Stream details page, verify that the status of the stream is Paused.

  6. Click DELETE.

  7. In the dialog, enter Delete in the text field, and then click DELETE.

Delete the connection profiles

  1. Go to the Connection profiles page for Datastream in the Google Cloud console.

    Go to the Connection profiles page

  2. Select the checkbox for each connection profile that you want to delete. For this quickstart, select the checkboxes for My Source Connection Profile and My Destination Connection Profile.

  3. Click DELETE.

  4. In the dialog, click DELETE.

Delete your Cloud Storage destination bucket

  1. Go to the Browser page for Cloud Storage in the Google Cloud Console.

    Go to the Browser page

  2. Select the checkbox to the left of your bucket, and then click DELETE.

  3. In the dialog, enter DELETE in the text field, and then click DELETE.

What's next

  • Learn more about Datastream.
  • Try out other Google Cloud features for yourself. Have a look at our quickstarts.