You can install additional components like Trino when you create a Dataproc cluster using the Optional components feature. This page describes how you can optionally install the Trino component on a Dataproc cluster.
Trino is an open source distributed SQL query engine. The Trino server and Web UI are by default available on port 8060 (or port 7778 if Kerberos is enabled) on the cluster's first master node.
By default, Trino on Dataproc is configured to work with the Hive, BigQuery, Memory, TPCH and TPCDS connectors.
After creating a cluster with the Trino component, you can run queries:
- from a local terminal with the gcloud dataproc jobs submit trino command
- from a terminal window on the cluster's first master node using the trino CLI (Command Line Interface); see Use Trino with Dataproc.
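As a rough illustration of the first option, the sketch below wraps `gcloud dataproc jobs submit trino` in a small helper that submits one SQL statement. The helper name `submit_trino_query`, the `DRY_RUN` switch, the cluster name and the region are hypothetical and used only for illustration. By default the command is printed rather than run, so the sketch does not touch a real cluster.

```shell
#!/bin/sh
# Hypothetical helper: submit one SQL statement as a Dataproc Trino job.
# By default (DRY_RUN=1) the command is only printed; set DRY_RUN=0 to run it.
submit_trino_query() {
  cluster="$1"; region="$2"; query="$3"
  # Rebuild the positional parameters as the full gcloud command line,
  # keeping the query as a single argument.
  set -- gcloud dataproc jobs submit trino \
    --cluster="$cluster" --region="$region" --execute="$query"
  if [ "${DRY_RUN:-1}" = "1" ]; then
    echo "$*"
  else
    "$@"
  fi
}

# Placeholder cluster and region; SHOW CATALOGS lists the configured connectors.
submit_trino_query "example-cluster" "us-central1" "SHOW CATALOGS"
```

With a real cluster name and region and `DRY_RUN=0`, the same call would submit the query as a Dataproc job, and the listed catalogs should correspond to the connectors configured on the cluster.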
Install the component
Install the component when you create a Dataproc cluster.
See Supported Dataproc versions for the component version included in each Dataproc image release.
Console
- In the Google Cloud console, go to the Dataproc Create a cluster page. The Set up cluster panel is selected.
- In the Components section:
  - In Optional components, select Trino and other optional components to install on your cluster.
  - Under Component Gateway, select Enable component gateway (see Viewing and Accessing Component Gateway URLs).
 
gcloud CLI
To create a Dataproc cluster that includes the Trino component, use the gcloud dataproc clusters create command with the --optional-components flag.
gcloud dataproc clusters create CLUSTER_NAME \
    --optional-components=TRINO \
    --region=REGION \
    --enable-component-gateway \
    ... other flags
- CLUSTER_NAME: The name of the cluster.
- REGION: A Compute Engine region where the cluster will be located.
Configuring properties
Add the --properties flag to the gcloud dataproc clusters create command to set trino, trino-jvm and trino-catalog config properties.
- Application properties: Use cluster properties with the trino: prefix to configure Trino application properties, for example, --properties="trino:join-distribution-type=AUTOMATIC".
- JVM configuration properties: Use cluster properties with the trino-jvm: prefix to configure JVM properties for Trino coordinator and worker Java processes, for example, --properties="trino-jvm:XX:+HeapDumpOnOutOfMemoryError".
- Creating new catalogs and adding catalog properties: Use trino-catalog:catalog-name.property-name to configure Trino catalogs. Example: The following `properties` flag can be used with the `gcloud dataproc clusters create` command to create a Trino cluster with a "prodhive" Hive catalog. A prodhive.properties file will be created under /usr/lib/trino/etc/catalog/ to enable the prodhive catalog.
  --properties="trino-catalog:prodhive.connector.name=hive,trino-catalog:prodhive.hive.metastore.uri=thrift://localhost:9000"
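Properties with different prefixes can share one --properties value, separated by commas as in the prodhive example. The sketch below joins the example properties from this section into a single flag and prints the resulting create command. The `join_properties` helper, the cluster name and the region are hypothetical placeholders, and nothing is executed against Google Cloud.

```shell
#!/bin/sh
# Hypothetical helper: join property assignments with commas, the separator
# used in the prodhive example.
join_properties() {
  joined=""
  for prop in "$@"; do
    # Prepend a comma only when something has already been collected.
    joined="${joined:+${joined},}${prop}"
  done
  echo "$joined"
}

# Property values copied from the examples in this section.
PROPERTIES="$(join_properties \
  "trino:join-distribution-type=AUTOMATIC" \
  "trino-jvm:XX:+HeapDumpOnOutOfMemoryError" \
  "trino-catalog:prodhive.connector.name=hive" \
  "trino-catalog:prodhive.hive.metastore.uri=thrift://localhost:9000")"

# Print the command for review (placeholder cluster name and region).
printf 'gcloud dataproc clusters create %s --optional-components=TRINO --region=%s --enable-component-gateway --properties="%s"\n' \
  "example-cluster" "us-central1" "$PROPERTIES"
```

Keeping each property as its own argument makes it easier to add or drop a single setting without hand-editing one long comma-separated string.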
REST API
The Trino component can be specified through the Dataproc API using SoftwareConfig.Component as part of a clusters.create request.
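For orientation, a clusters.create request that enables Trino might look like the sketch below. It assumes the Dataproc v1 REST endpoint. The project ID, region and cluster name are placeholders, and the `endpointConfig` block is included on the assumption that you also want the Component Gateway enabled, as in the console and gcloud instructions. The script only prints the request; it does not send it.

```shell
#!/bin/sh
# Placeholder identifiers (assumptions for illustration only).
PROJECT_ID="example-project"
REGION="us-central1"
CLUSTER_NAME="example-cluster"

# Request body: TRINO listed under softwareConfig.optionalComponents,
# with the Component Gateway enabled through endpointConfig.
REQUEST_BODY=$(cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "optionalComponents": ["TRINO"]
    },
    "endpointConfig": {
      "enableHttpPortAccess": true
    }
  }
}
EOF
)

URL="https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"

# Print the request. To send it, POST the body to the URL with an OAuth token, e.g.
#   curl -X POST -H "Authorization: Bearer $(gcloud auth print-access-token)" \
#        -H "Content-Type: application/json" -d "${REQUEST_BODY}" "${URL}"
printf 'POST %s\n%s\n' "${URL}" "${REQUEST_BODY}"
```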

