Lightning Engine delivers up to 4.9x faster performance than open source Spark and up to 2x the price-performance over the leading high-speed Spark alternative.

Managed Service for Apache Spark (formerly Dataproc)

The new way to Spark: Easier, smarter, faster

Run Apache Spark workloads with zero-ops serverless Spark or managed clusters. Accelerate development with agentic AI workflows and boost performance with Lightning Engine.

New customers get $300 in free credits to try Managed Service for Apache Spark and other Google Cloud products.

Apache Spark is a trademark of the Apache Software Foundation.

Features

Industry-leading performance with Lightning Engine

Accelerate large-scale ETL and SQL workloads up to 4.9x faster than open source Apache Spark with zero code changes. Lightning Engine utilizes a native C++ vectorized execution engine, intelligent caching, and optimized columnar shuffling. Combine this with intelligent Spark autotuning to eliminate the manual tuning tax, optimizing memory and preventing OOM errors automatically.

*The queries are derived from the TPC-DS standard and TPC-H standard

Learn technical details & hear Lowe's experience with Lightning Engine

Flexible lakehouse interoperability

Build an open lakehouse architecture that guarantees engine independence. Process data in open formats like Apache Iceberg directly from Google Cloud Storage. Integrate seamlessly with BigQuery and Knowledge Catalog (formerly Dataplex) for unified analytics and governance, ensuring true multi-engine interoperability without translation layers.

117% ROI Unlocked and Accelerated AI Innovation with a Data Lakehouse on Google Cloud

Download the study

Unified AI powered developer experience

Clear your backlog with data agents that take action, not just answer questions. Accelerate Spark workloads from development to production with Gemini built into the VSCode agentic extension, or use the IDE of your choice. Leverage the Data Engineering and Data Science Agents to automate data wrangling, build pipelines from natural language, and generate PySpark code. Automatically troubleshoot broken Spark jobs with Gemini Cloud Assist. Combine SQL and Spark in a single, unified AI-first notebook.

Autonomous agents: Your data's next evolution

Read the guidebook

Enterprise AI/ML ready

Build and operationalize your entire machine learning lifecycle. Accelerate model training and inference with GPU support, powered by NVIDIA RAPIDS, and pre-configured ML Runtimes for PyTorch and XGBoost. Integrate with the Google Cloud AI ecosystem to orchestrate end-to-end MLOps and manage assets with Gemini Enterprise Agent Platform Model Registry integration.

A practical guide to data science with Google Cloud

Get the ebook

Secure, scalable, and seamless migrations

Integrate seamlessly with your security posture using IAM, VPC Service Controls, and Kerberos. Easily migrate cloud and legacy Spark workloads using Managed Service for Apache Spark templates and tooling. Lift-and-shift workloads with support for Spark 2.x up to Spark 4.0 without immediate code refactoring.

A practical guide to data lake migration

Modernizing your data lake starts here

Multi-tenant efficiency and FinOps controls

Maximize resource utilization and reduce idle costs. Deploy multi-tenant Spark clusters that allow up to 800 users to share compute resources while maintaining strict data and environment isolation. Control your bill with scale-to-zero capabilities, per-second billing, and Spot VM support for flexible workloads.

Open and flexible ecosystem

Avoid vendor lock-in. While optimized for Apache Spark, our managed clusters support 30+ open source tools like Apache Hadoop, Flink, and Trino. Integrate seamlessly with orchestrators like Managed Service for Apache Airflow and extend with Kubernetes and Docker for maximum flexibility.

Deployment options

Deployment options	Choose between the fine-grained control of managed clusters or the zero-ops simplicity of a serverless experience for the best option for your workload.
Deployment Mode:	What it is:	Ideal for:	Pay For:
Serverless	Spark jobs as a service. Managed Spark, managed infrastructure.	New pipelines, interactive analysis, and spiky workloads where a zero-ops, pay-per-job model is preferred.	Job run time
Clusters	Spark clusters as a service. Managed Spark, your infrastructure.	Migrating legacy Spark or OSS workloads, running persistent clusters, or requiring deep open-source customization.	Cluster uptime

See a detailed comparison

How It Works

Make Spark easier with zero-ops serverless or managed clusters. Work smarter with Gemini in your IDE of choice, using agentic AI to accelerate PySpark development. Run jobs faster with Lightning Engine, all while maintaining unified governance across your open lakehouse with Knowledge Catalog.

Common Uses

Data engineering at scale

Automated ETL pipelines

Build robust, event-driven Spark ETL pipelines that automatically scale on demand. Leverage serverless execution for spiky workloads or managed clusters for persistent jobs. Use workflow templates to automate your most critical, production-level data processing jobs from end to end.
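As a rough illustration of the serverless path, the sketch below assembles a command line for submitting a PySpark batch workload. It assumes the `gcloud dataproc batches submit pyspark` command from the service's earlier Dataproc naming is still the entry point; the bucket, script path, region, and the helper name `build_batch_submit_command` are hypothetical.

```python
import shlex


def build_batch_submit_command(main_file, region, deps_bucket=None, properties=None):
    """Return the argv list for submitting a PySpark batch workload.

    Assumption: the gcloud surface is `gcloud dataproc batches submit pyspark`.
    """
    cmd = [
        "gcloud", "dataproc", "batches", "submit", "pyspark", main_file,
        f"--region={region}",
    ]
    if deps_bucket:
        # Staging bucket used to upload local dependencies.
        cmd.append(f"--deps-bucket={deps_bucket}")
    if properties:
        # Spark properties are passed as one comma-separated key=value list.
        joined = ",".join(f"{key}={value}" for key, value in sorted(properties.items()))
        cmd.append(f"--properties={joined}")
    return cmd


# Hypothetical script location and region, for illustration only.
command = build_batch_submit_command(
    "gs://example-bucket/jobs/etl_job.py",
    region="us-central1",
    properties={"spark.executor.cores": "4"},
)
print(shlex.join(command))
```

Building the command as a list keeps it safe to hand to `subprocess.run` from a scheduler or orchestration task without shell quoting issues.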

Logical design for a data lake pipeline

Data science and machine learning

Interactive data science

Empower data scientists to explore data and iterate on Spark ML models. Unify SQL and Spark using Gemini with the VSCode agentic extension or your IDE of choice, moving seamlessly from data exploration to model building with PySpark using serverless execution. Attach GPUs with a single command.

Lakehouse modernization

Open data lakehouse

Use Managed Service for Apache Spark as the processing engine for your modern data lakehouse. Process data in open formats like Apache Iceberg directly from your data lake, eliminating data silos. Integrate with BigQuery and Lakehouse for Apache Iceberg for a unified, multi-engine analytics platform.
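To make the Iceberg-on-object-storage idea concrete, here is a minimal sketch of the Spark settings that register an Apache Iceberg catalog backed by a Cloud Storage path. It uses the open source Iceberg Hadoop-type catalog as one possible choice; a managed catalog on Google Cloud would likely use different settings. The catalog name `lake`, the bucket, and the table name are hypothetical.

```python
def iceberg_catalog_conf(catalog_name, warehouse_uri):
    """Build Spark settings for an Iceberg catalog stored under warehouse_uri.

    Assumption: an open source Iceberg Hadoop-type catalog is acceptable here.
    """
    prefix = f"spark.sql.catalog.{catalog_name}"
    return {
        # Enables Iceberg SQL extensions such as MERGE INTO and CALL procedures.
        "spark.sql.extensions":
            "org.apache.iceberg.spark.extensions.IcebergSparkSessionExtensions",
        prefix: "org.apache.iceberg.spark.SparkCatalog",
        f"{prefix}.type": "hadoop",
        f"{prefix}.warehouse": warehouse_uri,
    }


# Hypothetical bucket and catalog name, for illustration only.
conf = iceberg_catalog_conf("lake", "gs://example-bucket/warehouse")
for key, value in conf.items():
    print(f"{key}={value}")

# With PySpark and the Iceberg runtime available, the settings could be applied as:
#   builder = SparkSession.builder.appName("iceberg-demo")
#   for key, value in conf.items():
#       builder = builder.config(key, value)
#   spark = builder.getOrCreate()
#   spark.sql("SELECT * FROM lake.db.events LIMIT 10").show()
```

Because the tables live in an open format under a storage path, other Iceberg-aware engines can read the same data without a copy.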

Pricing

How Managed Service for Apache Spark pricing works

Pricing depends on your chosen deployment model. Serverless bills per job execution, while clusters bill for underlying compute and uptime.

Deployment mode:	What you pay for:	What you pay:
Serverless	Pay only for what you use. Billed per-second for compute, GPUs, and shuffle storage. Scale-to-zero ensures you never pay for idle capacity.	Starting at $0.06 per DCU hour
Premium tier and accelerators:	Access Lightning Engine for up to 4.9x faster performance or attach NVIDIA GPUs for AI/ML workloads.	Starting at $0.089 per DCU hour (Serverless premium tier)
Clusters	Pay for cluster uptime. Billed for underlying Compute Engine resources plus a flat management fee. Leverage Spot VMs and zero-scale to optimize costs.	Starting at $0.01 per vCPU hour (Management fee)
Lightning Engine add-on:	Bring breakthrough performance to your clusters. Experience up to 4.9x faster execution than open source Spark.	Starting at $0.0025 per vCPU hour

Learn more about Managed Service for Apache Spark pricing. View all pricing details.
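The "starting at" rates above lend themselves to a quick back-of-the-envelope estimate. The sketch below is an illustration only: it uses the list prices quoted on this page, ignores regional differences, GPU and shuffle storage charges, and (for clusters) the underlying Compute Engine cost, and the function names and example workload sizes are hypothetical.

```python
from decimal import Decimal

# "Starting at" list prices quoted on this page (USD); actual rates may differ.
SERVERLESS_STANDARD_PER_DCU_HOUR = Decimal("0.06")
SERVERLESS_PREMIUM_PER_DCU_HOUR = Decimal("0.089")
CLUSTER_MANAGEMENT_FEE_PER_VCPU_HOUR = Decimal("0.01")
LIGHTNING_ADDON_PER_VCPU_HOUR = Decimal("0.0025")


def serverless_cost(dcu_hours, premium=False):
    """Compute charge for a serverless job that consumed dcu_hours DCU-hours."""
    rate = SERVERLESS_PREMIUM_PER_DCU_HOUR if premium else SERVERLESS_STANDARD_PER_DCU_HOUR
    return Decimal(str(dcu_hours)) * rate


def cluster_fees(vcpus, hours, lightning_engine=False):
    """Management fee (plus optional add-on) only; excludes Compute Engine charges."""
    rate = CLUSTER_MANAGEMENT_FEE_PER_VCPU_HOUR
    if lightning_engine:
        rate += LIGHTNING_ADDON_PER_VCPU_HOUR
    return Decimal(vcpus) * Decimal(str(hours)) * rate


# Hypothetical workloads: a 100 DCU-hour job and a 32-vCPU cluster up for 24 hours.
print(serverless_cost(100))                      # standard tier
print(serverless_cost(100, premium=True))        # premium tier
print(cluster_fees(32, 24))                      # management fee only
print(cluster_fees(32, 24, lightning_engine=True))
```

At these list prices the premium serverless rate is roughly 1.48 times the standard rate, so, assuming a job's DCU-hours shrink in proportion to its runtime, the premium tier would cost less overall whenever Lightning Engine speeds the job up by more than that factor.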

Pricing calculator

Estimate your monthly costs, including region-specific pricing and fees.

Custom quote

Connect with our sales team to get a custom quote for your organization.

Start your proof of concept

$300 in credit for new customers

Have a large project?

Create a cluster

Run a serverless batch job

Choose the right deployment

Business Case

Customer success stories

“We saw some of our quality checks go from 11 hours down to minutes.”

Michael Manos, Chief Technology Officer of Dun & Bradstreet

Migrating to Google Cloud has helped Dun & Bradstreet significantly increase the speed of data flows, reducing quality check processes from hours to minutes and cutting the time it takes to publish new data in half. This strong data foundation also enables Dun & Bradstreet to leverage the full power of Google Cloud’s ecosystem, including cutting-edge data and AI technologies.

The Managed Service for Apache Spark difference

Zero-ops productivity with flexible deployment options. Choose serverless execution or fully managed clusters to eliminate infrastructure overhead and the manual tuning tax.

Agentic AI development. Accelerate your workflow with Gemini baked into the VSCode agentic extension or with your IDE of choice along with Data Agents that automate PySpark coding, data wrangling, and job troubleshooting in a unified notebook.

Industry-leading performance powered by Lightning Engine. Accelerate your most demanding ETL and data science workloads by up to 4.9x, significantly reducing your total cost of ownership.

FAQ

What happened to Dataproc and Serverless Spark?

To simplify your experience, we have unified Dataproc and Google Cloud Serverless for Apache Spark under a single product: Managed Service for Apache Spark. You get the exact same powerful capabilities, but now you simply choose your preferred deployment model—zero-ops serverless or fully managed clusters—from a single, unified interface. Compare both deployment modes in greater detail.

When should I choose serverless versus managed clusters?

Choose serverless when you want to focus purely on code with zero infrastructure management, ideal for new pipelines and ad-hoc analysis. Choose managed clusters when you need fine-grained control, are migrating legacy or cloud Spark or other OSS workloads, or require persistent clusters with diverse open-source tools.

What is Lightning Engine?

Lightning Engine is Google Cloud’s native, highly optimized execution engine. Built with C++ libraries, it optimizes every layer, from high-throughput storage connectors to intelligent caching. It delivers up to 4.9x better performance than open source Spark and up to 2x the price-performance over the leading high-speed Spark alternative, integrating seamlessly into your serverless or cluster deployments with zero code changes.

Do I need to install my own ML libraries like PyTorch?

No. If you are running AI/ML workloads, you can use our pre-configured ML Runtimes. These environments come with common libraries like PyTorch, XGBoost, and scikit-learn built-in, along with optimized NVIDIA GPU drivers, eliminating complex setup.

Is Managed Service for Apache Spark fully open-source compatible?

Yes. We provide a 100% open-source compatible Apache Spark environment. You can run your existing Spark code without modifications, ensuring complete workload portability and avoiding vendor lock-in.

How does Gemini AI help with Spark development?

Gemini AI can be brought directly into your IDE of choice to act as your AI co-pilot. It helps you write and debug PySpark code faster, while Gemini Cloud Assist provides automated root-cause analysis and troubleshooting recommendations for failed jobs.

Can I use this service to build an open data lakehouse?

Absolutely. Managed Service for Apache Spark is a core processing engine for Google Cloud's open lakehouse. It allows you to process data in open formats like Apache Iceberg directly from Cloud Storage, integrating seamlessly with BigQuery and Knowledge Catalog for Apache Iceberg.

How do the standard and premium pricing tiers work?

The standard and premium tiers currently apply only to serverless deployments. Standard is ideal for cost-effective, general-purpose batch processing and ETL. The premium tier is designed for your most demanding workloads, unlocking Lightning Engine's performance boost of up to 4.9x over open source Apache Spark and providing access to GPU-accelerated AI/ML capabilities.


Virtual Private Cloud

Single VPC for an entire organization, isolated within projects.

Private Service Connect

Secure connection between your VPC and services.

Not seeing what you're looking for?
See all networking products

Operations

Cloud Logging

Google Cloud audit, platform, and application logs management.

Cloud Monitoring

Infrastructure and application health with rich metrics.

Error Reporting

Application error identification and analysis.

Managed Service for Prometheus

Fully-managed Prometheus on Google Cloud.

Cloud Trace

Tracing system collecting latency data from applications.

Cloud Profiler

CPU and heap profiler for analyzing application performance.

Cloud Quotas

Manage quotas for all Google Cloud services.

Productivity and Collaboration

AppSheet

No-code development platform to build and extend applications.

AppSheet Automation

Build automations and applications on a unified platform.

Gemini Enterprise app

Secure platform to discover, create, run, and govern AI agents for employees.

Google Workspace

Collaboration and productivity tools for individuals and organizations.

Google Workspace Essentials

Secure video meetings and modern collaboration for teams.

Cloud Identity

Unified platform for IT admins to manage user devices and apps.

Chrome Enterprise

ChromeOS, Chrome browser, and Chrome devices built for business.

Security and Identity

Cloud IAM

Permissions management system for Google Cloud resources.

Sensitive Data Protection

Discover, classify, and protect your valuable data assets.

Mandiant Managed Defense

Find and eliminate threats with confidence 24x7.

Google Threat Intelligence

Know who’s targeting you.

Security Command Center

Platform for defending against threats to your Google Cloud assets.

Cloud Key Management

Manage encryption keys on Google Cloud.

Mandiant Incident Response

Minimize the impact of a breach.

Chrome Enterprise Premium

Get secure enterprise browsing with extensive endpoint visibility.

Assured Workloads

Compliance and security controls for sensitive workloads.

Google Security Operations

Detect, investigate, and respond to cyber threats.

Mandiant Consulting

Get expert guidance before, during, and after an incident.

Not seeing what you're looking for?
See all security and identity products

Serverless

Cloud Run

Fully managed environment for running containerized apps.

Cloud Functions

Platform for creating functions that respond to cloud events.

App Engine

Serverless application platform for apps and back ends.

Workflows

Workflow orchestration for serverless products and API services.

API Gateway

Develop, deploy, secure, and manage APIs with a fully managed gateway.

Storage

Cloud Storage

Object storage that’s secure, durable, and scalable.

Block Storage

High-performance storage for AI, analytics, databases, and enterprise applications.

Filestore

File storage that is highly scalable and secure.

Persistent Disk

Block storage for virtual machine instances running on Google Cloud.

Cloud Storage for Firebase

Object storage for storing and serving user-generated content.

Local SSD

Block storage that is locally attached for high-performance needs.

Storage Transfer Service

Data transfers from online and on-premises sources to Cloud Storage.

Google Cloud Managed Lustre

High performance managed parallel file service.

Google Cloud NetApp Volumes

File storage service for NFS, SMB, and multi-protocol environments.

Backup and DR Service

Service for centralized, application-consistent data protection.

Web3

Blockchain Node Engine

Fully managed node hosting for developing on the blockchain.

Blockchain RPC

Enterprise-grade RPC for building on the blockchain.

Product highlights

Industry-leading performance with Lightning Engine

Flexible lakehouse interoperability

Unified AI-powered developer experience

Enterprise AI/ML ready

Secure, scalable, and seamless migrations

Multi-tenant efficiency and FinOps controls

Open and flexible ecosystem

Make Spark easier with zero-ops serverless or managed clusters. Work smarter with Gemini in your IDE of choice, using agentic AI to accelerate PySpark development. Run jobs faster with Lightning Engine, all while maintaining unified governance across your open lakehouse with Knowledge Catalog.

Data engineering at scale

Automated ETL pipelines

Data science and machine learning

Interactive data science

Lakehouse modernization

Open data lakehouse

Pricing calculator

Custom quote

Start your proof of concept

$300 in credit for new customers

Have a large project?

Create a cluster

Run a serverless batch job

Choose the right deployment

Other success stories:

Additional resources:

What happened to Dataproc and Serverless Spark?

When should I choose serverless versus managed clusters?

What is Lightning Engine?

Do I need to install my own ML libraries like PyTorch?

Is Managed Service for Apache Spark fully open-source compatible?

How does Gemini AI help with Spark development?

Can I use this service to build an open data lakehouse?

How do the standard and premium pricing tiers work?