"Managed Service for Apache Spark" is the new name for the product formerly known as "Dataproc on Compute Engine" (cluster deployment) and "Google Cloud Serverless for Apache Spark" (serverless deployment).
Manages Hive table metadata. By default, uses the local mariadb (image versions < 1.5) or mysql (image versions 1.5+) database
on the master node as the Hive table metadata store.
Using the default database is not recommended because these databases
are tied to the cluster's lifecycle. Instead, use either of the following as
the Hive metastore database (in recommendation order):
In Managed Service for Apache Spark High Availability (HA) clusters,
different services run on different master nodes, as shown below. HA cluster worker
node services are the same as those listed for standard clusters.
A quorum of journal nodes maintains an edit log of HDFS namespace modifications.
If a failover occurs, the Standby NameNode reads the edit log
and takes control from the Active NameNode.
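The quorum behavior described above can be illustrated with a little arithmetic. The following is a minimal sketch, assuming the majority rule used by the HDFS Quorum Journal Manager: an edit is durable once more than half of the journal nodes acknowledge it. The function names are hypothetical, and the journal node counts (3 and 5) are example values rather than figures stated on this page.

```python
def quorum_size(journal_nodes: int) -> int:
    """Smallest number of journal nodes that forms a majority."""
    if journal_nodes < 1:
        raise ValueError("need at least one journal node")
    return journal_nodes // 2 + 1


def tolerated_failures(journal_nodes: int) -> int:
    """Journal nodes that can be down while a majority is still reachable."""
    return journal_nodes - quorum_size(journal_nodes)


def edit_is_durable(acks: int, journal_nodes: int) -> bool:
    """Assumption: an edit is committed once a majority has acknowledged it."""
    return acks >= quorum_size(journal_nodes)


if __name__ == "__main__":
    # Example counts only; the page does not state how many journal nodes run.
    for n in (3, 5):
        print(n, quorum_size(n), tolerated_failures(n))
```

Under this rule, an odd number of journal nodes is the usual choice, since adding a fourth node to a group of three raises the quorum size without raising the number of failures that can be tolerated.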
Manages Hive table metadata. By default, uses the local mariadb (image versions < 1.5) or mysql (image versions 1.5+) database
on the master node as the Hive table metadata store.
Using the default database is not recommended because these databases
are tied to the cluster's lifecycle. Instead, use either of the following as
the Hive metastore database (in recommendation order):
ZKFC is the ZKFailoverController process, which runs
with the HDFS NameNode. It monitors the health of the NameNode, and manages leader
election via ZooKeeper in the event of a failover.
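The monitor-and-elect loop described above can be sketched as a toy model. This is not the Hadoop implementation: `ElectionLock` is an in-memory stand-in for the lock that the real process holds in ZooKeeper, fencing of the old active node is omitted, and every class, method, and node name here is hypothetical.

```python
from dataclasses import dataclass
from typing import Dict, List, Optional


@dataclass
class NameNode:
    name: str
    healthy: bool = True


class ElectionLock:
    """In-memory stand-in for the ZooKeeper lock; not a real ZooKeeper client."""

    def __init__(self) -> None:
        self.holder: Optional[str] = None

    def try_acquire(self, name: str) -> bool:
        # Take the lock if it is free; report whether `name` now holds it.
        if self.holder is None:
            self.holder = name
        return self.holder == name

    def release(self, name: str) -> None:
        # Only the current holder can give up the lock.
        if self.holder == name:
            self.holder = None


class FailoverController:
    """Toy controller paired with exactly one NameNode."""

    def __init__(self, node: NameNode, lock: ElectionLock) -> None:
        self.node = node
        self.lock = lock

    def tick(self) -> str:
        """Run one health check and return the node's resulting role."""
        if not self.node.healthy:
            self.lock.release(self.node.name)
            return "unavailable"
        return "active" if self.lock.try_acquire(self.node.name) else "standby"


def run_round(controllers: List[FailoverController]) -> Dict[str, str]:
    """Tick every controller once, in order, and collect the roles."""
    return {c.node.name: c.tick() for c in controllers}


if __name__ == "__main__":
    lock = ElectionLock()
    nn_a, nn_b = NameNode("nn-a"), NameNode("nn-b")
    controllers = [FailoverController(nn_a, lock), FailoverController(nn_b, lock)]
    print(run_round(controllers))  # nn-a takes the lock first
    nn_a.healthy = False           # simulate a failed health check
    print(run_round(controllers))  # nn-b acquires the released lock
```

The point of the sketch is the division of labor: each controller only watches its own NameNode, and the shared lock decides which healthy node is active.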
Last updated 2026-04-10 UTC.