You can install additional components like [Apache Pig](https://pig.apache.org/) when you create a Dataproc cluster using the [Optional components](/dataproc/docs/concepts/components/overview#available_optional_components) feature. This page describes the Pig component, an open source platform for analyzing large data sets.
Install the component
Install the component when you create a Dataproc cluster.
Apache Pig is an optional component in Dataproc `2.3` and later image versions.
gcloud

To create a Dataproc cluster that includes the Pig component, use the [`gcloud dataproc clusters create CLUSTER_NAME`](/sdk/gcloud/reference/dataproc/clusters/create) command with the `--optional-components` flag (using image version 2.3 or later).
```
gcloud dataproc clusters create CLUSTER_NAME \
    --region=REGION \
    --optional-components=PIG \
    --image-version=2.3 \
    ... other flags
```

Note: Apache Pig is automatically installed on Dataproc `2.2` and earlier image versions.

See [Supported Dataproc versions](/dataproc/docs/concepts/versioning/dataproc-versions#supported_cloud_dataproc_versions) for component versions included in the latest Dataproc image releases.

REST API

The Pig component can be specified through the Dataproc API using [SoftwareConfig.Component](/dataproc/docs/reference/rest/v1/ClusterConfig#Component) as part of a [clusters.create](/dataproc/docs/reference/rest/v1/projects.regions.clusters/create) request.

Console

Enable the component:

1. In the Google Cloud console, open the Dataproc [Create a cluster](https://console.cloud.google.com/dataproc/clustersAdd) page. The Set up cluster panel is selected.
2. In the Components section, under Optional components, select Pig and other optional components to install on your cluster.

Last updated 2025-09-04 UTC.
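As a rough illustration of the REST API option described above, the following sketch builds a `clusters.create` request body that lists Pig under `softwareConfig.optionalComponents`. It is not taken from this page: the field names (`clusterName`, `config.softwareConfig.imageVersion`, `optionalComponents`) and the `PIG` enum value are assumed from the v1 REST reference and the `gcloud` flag value, and the default project, region, and cluster names are hypothetical placeholders. The request is only sent if you explicitly set `SEND_REQUEST=1`.

```shell
#!/usr/bin/env bash
# Sketch only: the default values below are hypothetical placeholders.
PROJECT_ID="${PROJECT_ID:-my-project}"
REGION="${REGION:-us-central1}"
CLUSTER_NAME="${CLUSTER_NAME:-example-cluster}"

# Print a JSON request body for clusters.create that enables the Pig component.
# Field names are assumed to match SoftwareConfig in the v1 REST reference.
build_request_body() {
  cat <<EOF
{
  "projectId": "${PROJECT_ID}",
  "clusterName": "${CLUSTER_NAME}",
  "config": {
    "softwareConfig": {
      "imageVersion": "2.3",
      "optionalComponents": ["PIG"]
    }
  }
}
EOF
}

# Show the body so it can be reviewed before anything is sent.
build_request_body

# Send the request only when explicitly asked; requires gcloud credentials.
if [ "${SEND_REQUEST:-0}" = "1" ]; then
  build_request_body | curl -sS -X POST \
    -H "Authorization: Bearer $(gcloud auth print-access-token)" \
    -H "Content-Type: application/json" \
    --data-binary @- \
    "https://dataproc.googleapis.com/v1/projects/${PROJECT_ID}/regions/${REGION}/clusters"
fi
```

Running the script without `SEND_REQUEST=1` only prints the body, which makes it easy to compare against the REST reference before creating a cluster.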