Pub/Sub Lite is a real-time messaging service built for low cost; the
trade-off is lower reliability than Pub/Sub. Pub/Sub Lite offers zonal and
regional topics for storage.
The Pub/Sub Lite Spark Connector supports Pub/Sub Lite as an input source to
Apache Spark Structured Streaming in both the default micro-batch processing
mode and the experimental continuous processing mode.
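As a rough illustration of reading from Pub/Sub Lite in Structured Streaming, the following Python sketch builds a subscription path and passes it to a streaming reader. The format name `pubsublite` and the option key `pubsublite.subscription` are recalled from the connector's README and should be checked against the connector version you use. The helper names, trigger intervals, and the `spark` session argument are hypothetical and exist only for this example.

```python
from typing import Dict


def subscription_path(project_number: str, location: str, subscription_id: str) -> str:
    """Build the full resource path of a Pub/Sub Lite subscription."""
    return (
        f"projects/{project_number}/locations/{location}"
        f"/subscriptions/{subscription_id}"
    )


def read_pubsublite_stream(spark, path: str):
    """Return a streaming DataFrame backed by a Pub/Sub Lite subscription.

    `spark` is an existing SparkSession that already has the connector on
    its classpath. The format name and option key are assumptions taken
    from the connector's README; verify them before relying on this.
    """
    return (
        spark.readStream
        .format("pubsublite")
        .option("pubsublite.subscription", path)
        .load()
    )


def trigger_kwargs(continuous: bool = False) -> Dict[str, str]:
    """Keyword arguments for DataStreamWriter.trigger().

    Micro-batch is Spark's default mode; continuous processing is
    experimental. The intervals are illustrative values only.
    """
    if continuous:
        return {"continuous": "1 second"}
    return {"processingTime": "5 seconds"}
```

A query could then be started with something like `df.writeStream.format("console").trigger(**trigger_kwargs()).start()`, where `df` is the DataFrame returned by `read_pubsublite_stream`.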
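Use Pub/Sub Lite with Dataproc
Java
The samples directory in the java-pubsublite-spark repository on GitHub
contains a Spark example in Java that uses Pub/Sub Lite with Dataproc. To run
the example, follow the directions in the Spark example
(https://github.com/googleapis/java-pubsublite-spark/tree/master/samples).
To get started, clone the java-pubsublite-spark GitHub repository: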
git clone https://github.com/googleapis/java-pubsublite-spark
cd java-pubsublite-spark/samples
Python / Scala
The connector is available from the Maven Central repository.
You can download it and provide it to Spark with the --packages option of the
spark-submit command, or by setting the spark.jars.packages configuration property.
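The two ways of supplying the connector can be sketched as follows. This Python snippet only assembles the `spark-submit` argument lists; it does not run Spark. The group and artifact IDs come from the connector's Maven Central link, while `CONNECTOR_VERSION` and `my_job.py` are placeholders, not real values, and the helper names are hypothetical.

```python
import shlex
from typing import List

# Group and artifact IDs as they appear in the connector's Maven Central URL.
GROUP_ID = "com.google.cloud"
ARTIFACT_ID = "pubsublite-spark-sql-streaming"


def maven_coordinate(version: str) -> str:
    """Return the groupId:artifactId:version coordinate that Spark expects."""
    if not version:
        raise ValueError("a connector version is required")
    return f"{GROUP_ID}:{ARTIFACT_ID}:{version}"


def submit_with_packages(script: str, version: str) -> List[str]:
    """spark-submit arguments that use the --packages option."""
    return ["spark-submit", "--packages", maven_coordinate(version), script]


def submit_with_conf(script: str, version: str) -> List[str]:
    """spark-submit arguments that set spark.jars.packages instead."""
    conf = f"spark.jars.packages={maven_coordinate(version)}"
    return ["spark-submit", "--conf", conf, script]


if __name__ == "__main__":
    # Placeholder values: substitute a real connector version and your script.
    version = "CONNECTOR_VERSION"
    print(shlex.join(submit_with_packages("my_job.py", version)))
    print(shlex.join(submit_with_conf("my_job.py", version)))
```

Both forms ask Spark to resolve the same Maven coordinate. The configuration property is convenient when the setting lives in a cluster or job configuration rather than on the command line.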
What's next
See Using Pub/Sub Lite with Apache Spark,
a quickstart that runs a Python script on a Dataproc cluster to
read data from and write data to Pub/Sub Lite.
Select the version of the Pub/Sub Lite Spark Connector
(https://search.maven.org/artifact/com.google.cloud/pubsublite-spark-sql-streaming),
and then download its JAR from that page.
Last updated 2025-09-04 UTC.