Overview of Google crawlers and fetchers (user agents)
Google uses crawlers and fetchers to perform actions for its products, either automatically or triggered by user request.
"Crawler" (sometimes also called a "robot" or "spider") is a generic term for any program that is used to automatically discover and scan websites by following links from one web page to another. Google's main crawler used for Google Search is called Googlebot.
Fetchers, like a browser, are tools that request a single URL when prompted by a user.
The following tables show the Google crawlers and fetchers used by various products and services, how you may see them in your referrer logs, and how to specify them in robots.txt. The lists are not exhaustive; they cover only the most common requestors that may show up in log files.
- The user agent token is used in the User-agent: line in robots.txt to match a crawler type when writing crawl rules for your site. Some crawlers have more than one token, as shown in the table; you need to match only one crawler token for a rule to apply. This list is not complete, but covers most crawlers you might see on your website.
- The full user agent string is a full description of the crawler, and appears in the HTTP request and your web logs.
Common crawlers
Google's common crawlers are used to find information for building Google's search indexes, to perform other product-specific crawls, and for analysis. They always obey robots.txt rules and generally crawl from the IP ranges published in the googlebot.json object.
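To illustrate how the published IP ranges might be used, the following Python sketch checks whether a request's source address falls inside a set of CIDR ranges. It assumes the googlebot.json object has a top-level `prefixes` list whose entries carry an `ipv4Prefix` or `ipv6Prefix` value; that layout, the sample ranges (reserved documentation ranges, not real Google ranges), and the helper names are assumptions for illustration, so check them against the file you actually download.

```python
import ipaddress
import json

# Hypothetical excerpt in the ASSUMED layout of googlebot.json.
# The prefixes are reserved documentation ranges, not real Google ranges.
SAMPLE_RANGES_JSON = """
{
  "prefixes": [
    {"ipv4Prefix": "192.0.2.0/24"},
    {"ipv6Prefix": "2001:db8::/32"}
  ]
}
"""


def load_networks(raw_json):
    """Parse the assumed JSON layout into a list of ip_network objects."""
    networks = []
    for entry in json.loads(raw_json).get("prefixes", []):
        cidr = entry.get("ipv4Prefix") or entry.get("ipv6Prefix")
        if cidr:
            networks.append(ipaddress.ip_network(cidr))
    return networks


def ip_in_published_ranges(ip, networks):
    """Return True if the address falls inside any of the given networks."""
    address = ipaddress.ip_address(ip)
    # An IPv4 address tested against an IPv6 network is simply not contained.
    return any(address in network for network in networks)
```

Because the common crawlers only "generally" crawl from these ranges, a miss is a signal to investigate rather than proof that a request did not come from Google.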
Googlebot Smartphone
User agent token | Googlebot
Full user agent string | Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
Googlebot Desktop
User agent token | Googlebot
Full user agent strings |
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; Googlebot/2.1; +http://www.google.com/bot.html) Chrome/W.X.Y.Z Safari/537.36
- Rarely:
  - Mozilla/5.0 (compatible; Googlebot/2.1; +http://www.google.com/bot.html)
  - Googlebot/2.1 (+http://www.google.com/bot.html)
Googlebot Image
Used for crawling image URLs for Google Images and products dependent on images.
User agent tokens |
- Googlebot-Image
- Googlebot
Full user agent string | Googlebot-Image/1.0
Googlebot News
Googlebot News uses Googlebot for crawling news articles; however, it respects its historic user agent token Googlebot-News.
User agent tokens |
- Googlebot-News
- Googlebot
Googlebot Video
Used for crawling video URLs for Google Video and products dependent on videos.
User agent tokens |
- Googlebot-Video
- Googlebot
Full user agent string | Googlebot-Video/1.0
Google StoreBot
Google StoreBot crawls through certain types of pages, including, but not limited to, product details pages, cart pages, and checkout pages.
User agent token | Storebot-Google
Full user agent strings |
- Desktop agent: Mozilla/5.0 (X11; Linux x86_64; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Safari/537.36
- Mobile agent: Mozilla/5.0 (Linux; Android 8.0; Pixel 2 Build/OPD3.170816.012; Storebot-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36
Google-InspectionTool
Google-InspectionTool is the crawler used by Search testing tools such as the Rich Results Test and URL inspection in Search Console. Apart from the user agent and user agent token, it mimics Googlebot.
User agent tokens |
- Google-InspectionTool
- Googlebot
Full user agent strings |
- Mobile: Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; Google-InspectionTool/1.0;)
- Desktop: Mozilla/5.0 (compatible; Google-InspectionTool/1.0;)
GoogleOther
GoogleOther is the generic crawler that may be used by various product teams for fetching publicly accessible content from sites. For example, it may be used for one-off crawls for internal research and development.
User agent token | GoogleOther
Full user agent strings |
- Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; GoogleOther)
- Mozilla/5.0 AppleWebKit/537.36 (KHTML, like Gecko; compatible; GoogleOther) Chrome/W.X.Y.Z Safari/537.36
- GoogleOther
GoogleOther-Image
GoogleOther-Image is the version of GoogleOther optimized for fetching publicly accessible image URLs.
User agent tokens |
- GoogleOther-Image
- GoogleOther
Full user agent string | GoogleOther-Image/1.0
GoogleOther-Video
GoogleOther-Video is the version of GoogleOther optimized for fetching publicly accessible video URLs.
User agent tokens |
- GoogleOther-Video
- GoogleOther
Full user agent string | GoogleOther-Video/1.0
Google-Extended
Google-Extended is a standalone product token that web publishers can use to manage whether their sites help improve Gemini Apps and Vertex AI generative APIs, including future generations of models that power those products. Google-Extended does not impact a site's inclusion or ranking in Google Search.
User agent token | Google-Extended
Full user agent string | Google-Extended doesn't have a separate HTTP request user agent string. Crawling is done with existing Google user agent strings; the robots.txt user-agent token is used in a control capacity.
Special-case crawlers
The special-case crawlers are used by specific products where there's an agreement between the crawled site and the product about the crawl process. For example, AdsBot ignores the global robots.txt user agent (*) with the ad publisher's permission. The special-case crawlers may ignore robots.txt rules and so they operate from a different IP range than the common crawlers. The IP ranges are published in the special-crawlers.json object.
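The effect of ignoring the global user agent can be sketched in Python as follows. This is a simplified model, not Google's implementation: it assumes robots.txt has already been parsed into a dictionary of groups, and it assumes that a group naming the crawler explicitly is still honored. The function name, the `ignores_global_wildcard` flag, and the sample groups are hypothetical.

```python
def effective_disallow_rules(token, groups, ignores_global_wildcard):
    """Return the Disallow prefixes that apply to one crawler token.

    groups maps a User-agent value from robots.txt to its Disallow prefixes.
    ignores_global_wildcard is True for crawlers documented as ignoring "*".
    """
    normalized = {agent.lower(): rules for agent, rules in groups.items()}
    if token.lower() in normalized:
        # ASSUMPTION: a group that names the crawler explicitly is used.
        return normalized[token.lower()]
    if ignores_global_wildcard:
        # The "*" group is skipped, so no rules apply.
        return []
    return normalized.get("*", [])


# Hypothetical robots.txt content, already parsed into groups.
site_groups = {"*": ["/"], "AdsBot-Google": ["/checkout"]}
```

In this model, a site that only has a `*` group blocking everything would still be visited by AdsBot; to restrict it, the site would add a group that names `AdsBot-Google` directly.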
APIs-Google
Used by Google APIs to deliver push notification messages. Ignores the global user agent (*) in robots.txt.
User agent token | APIs-Google
Full user agent string | APIs-Google (+https://developers.google.com/webmasters/APIs-Google.html)
AdsBot Mobile Web
Checks mobile web page ad quality. Ignores the global user agent (*) in robots.txt.
User agent token | AdsBot-Google-Mobile
Full user agent string | Mozilla/5.0 (Linux; Android 6.0.1; Nexus 5X Build/MMB29P) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/W.X.Y.Z Mobile Safari/537.36 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
AdsBot
Checks desktop web page ad quality. Ignores the global user agent (*) in robots.txt.
User agent token | AdsBot-Google
Full user agent string | AdsBot-Google (+http://www.google.com/adsbot.html)
AdSense
The AdSense crawler visits your site to determine its content in order to provide relevant ads. Ignores the global user agent (*) in robots.txt.
User agent token | Mediapartners-Google
Full user agent string | Mediapartners-Google
Mobile AdSense
The Mobile AdSense crawler visits your site to determine its content in order to provide relevant ads. Ignores the global user agent (*) in robots.txt.
User agent token | Mediapartners-Google
Full user agent string | (Various mobile device types) (compatible; Mediapartners-Google/2.1; +http://www.google.com/bot.html)
Google-Safety
The Google-Safety user agent handles abuse-specific crawling, such as malware discovery for publicly posted links on Google properties. This user agent ignores robots.txt rules.
Full user agent string | Google-Safety
User-triggered fetchers
User-triggered fetchers are initiated by users to perform a product-specific fetching function. For example, Google Site Verifier acts on a user's request, or a site hosted on Google Cloud (GCP) has a feature that allows the site's users to retrieve an external RSS feed. Because the fetch was requested by a user, these fetchers generally ignore robots.txt rules. The IP ranges the user-triggered fetchers use are published in the user-triggered-fetchers.json and user-triggered-fetchers-google.json objects.
Feedfetcher
Feedfetcher is used for crawling RSS or Atom feeds for Google Podcasts, Google News, and PubSubHubbub.
User agent token | FeedFetcher-Google
Full user agent string | FeedFetcher-Google; (+http://www.google.com/feedfetcher.html)
Google Publisher Center
Fetches and processes feeds that publishers explicitly supplied through the Google Publisher Center to be used in Google News landing pages.
Full user agent string | GoogleProducer; (+http://goo.gl/7y4SX)
Google Read Aloud
Upon user request, Google Read Aloud fetches and reads out web pages using text-to-speech (TTS).
Current agents:
- Desktop agent: Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/41.0.2272.118 Safari/537.36 (compatible; Google-Read-Aloud; +https://support.google.com/webmasters/answer/1061943)
- Mobile agent: Mozilla/5.0 (Linux; Android 7.0; SM-G930V Build/NRD90M) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/59.0.3071.125 Mobile Safari/537.36 (compatible; Google-Read-Aloud; +https://support.google.com/webmasters/answer/1061943)
Former agent (deprecated):
- google-speakr
Google Site Verifier
Google Site Verifier fetches Search Console verification tokens upon user request.
Full user agent string | Mozilla/5.0 (compatible; Google-Site-Verification/1.0)
A note about Chrome/W.X.Y.Z in user agents
Wherever you see the string Chrome/W.X.Y.Z in the user agent strings in the table, W.X.Y.Z is actually a placeholder that represents the version of the Chrome browser used by that user agent: for example, 41.0.2272.96. This version number will increase over time to match the latest Chromium release version used by Googlebot.
If you are searching your logs or filtering your server for a user agent with this pattern, use wildcards for the version number rather than specifying an exact version number.
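As a minimal sketch of that advice, the Python pattern below matches the Googlebot Smartphone user agent string while treating W.X.Y.Z as four dot-separated numbers. The pattern name and helper function are illustrative, not part of any Google tooling, and the version numbers used with it are only examples.

```python
import re

# W.X.Y.Z is matched as four dot-separated numbers instead of a fixed version.
GOOGLEBOT_SMARTPHONE = re.compile(
    r"Chrome/\d+\.\d+\.\d+\.\d+ Mobile Safari/537\.36 "
    r"\(compatible; Googlebot/2\.1; \+http://www\.google\.com/bot\.html\)"
)


def is_googlebot_smartphone(user_agent):
    """Return True if the string has the Googlebot Smartphone shape."""
    return GOOGLEBOT_SMARTPHONE.search(user_agent) is not None
```

A filter written this way keeps working when the Chrome version in the user agent string changes, whereas a literal comparison against one version would silently stop matching.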
User agents in robots.txt
Where several user agents are recognized in the robots.txt file, Google will follow the most specific. If you want all of Google to be able to crawl your pages, you don't need a robots.txt file at all. If you want to block or allow all of Google's crawlers from accessing some of your content, you can do this by specifying Googlebot as the user agent. For example, if you want all your pages to appear in Google Search, and if you want AdSense ads to appear on your pages, you don't need a robots.txt file. Similarly, if you want to block some pages from Google altogether, blocking the Googlebot user agent will also block all of Google's other user agents.
But if you want more fine-grained control, you can get more specific. For example, you might want all your pages to appear in Google Search, but you don't want images in your personal directory to be crawled. In this case, use robots.txt to disallow the Googlebot-Image user agent from crawling the files in your personal directory (while allowing Googlebot to crawl all files), like this:
User-agent: Googlebot
Disallow:

User-agent: Googlebot-Image
Disallow: /personal
To take another example, say that you want ads on all your pages, but you don't want those pages to appear in Google Search. Here, you'd block Googlebot, but allow the Mediapartners-Google user agent, like this:
User-agent: Googlebot
Disallow: /

User-agent: Mediapartners-Google
Disallow:
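The "most specific user agent wins" behavior in these examples can be modeled with a short Python sketch. It is a deliberate simplification under stated assumptions: robots.txt is already parsed into a dictionary, only Disallow prefixes are considered (no Allow rules or wildcards), and each crawler is described by its tokens listed from most to least specific. The function names and the `example_groups` data are hypothetical.

```python
def select_group(crawler_tokens, groups):
    """Pick the rule group for a crawler.

    crawler_tokens: the crawler's tokens, most specific first,
                    e.g. ["Googlebot-Image", "Googlebot"].
    groups: maps a User-agent value to its list of Disallow path prefixes.
    """
    normalized = {agent.lower(): rules for agent, rules in groups.items()}
    for token in crawler_tokens:
        if token.lower() in normalized:
            return normalized[token.lower()]
    # No token matched, so fall back to the global group if there is one.
    return normalized.get("*", [])


def is_allowed(path, disallow_prefixes):
    """A path is blocked if it starts with any non-empty Disallow prefix."""
    return not any(p and path.startswith(p) for p in disallow_prefixes)


# The first robots.txt example above: Googlebot may crawl everything,
# Googlebot-Image is kept out of /personal.
example_groups = {
    "Googlebot": [""],
    "Googlebot-Image": ["/personal"],
}
```

With these groups, a crawler whose tokens are `["Googlebot-Image", "Googlebot"]` gets the `/personal` restriction, while a crawler such as Googlebot Video, which has no group of its own here, falls back to the Googlebot group.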
Controlling crawl speed
Each Google crawler accesses sites for a specific purpose and at different rates. Google uses algorithms to determine the optimal crawl rate for each site. If a Google crawler is crawling your site too often, you can reduce the crawl rate.
Retired Google crawlers
The following Google crawlers are no longer in use, and are only noted here for historical reference.
Duplex on the web
Supported the Duplex on the web service.
User agent token | DuplexWeb-Google
Full user agent string | Mozilla/5.0 (Linux; Android 11; Pixel 2; DuplexWeb-Google/1.0) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/86.0.4240.193 Mobile Safari/537.36
Web Light
Checked for the presence of the no-transform header whenever a user clicked your page in search under appropriate conditions. The Web Light user agent was used only for explicit browse requests of a human visitor, and so it ignored robots.txt rules, which are used to block automated crawling requests.
User agent token | googleweblight
Full user agent string | Mozilla/5.0 (Linux; Android 4.2.1; en-us; Nexus 5 Build/JOP40D) AppleWebKit/535.19 (KHTML, like Gecko; googleweblight) Chrome/38.0.1025.166 Mobile Safari/535.19
AdsBot Mobile Web
Checked iPhone web page ad quality. Ignored the global user agent (*) in robots.txt.
User agent token | AdsBot-Google-Mobile
Full user agent string | Mozilla/5.0 (iPhone; CPU iPhone OS 14_7_1 like Mac OS X) AppleWebKit/605.1.15 (KHTML, like Gecko) Version/14.1.2 Mobile/15E148 Safari/604.1 (compatible; AdsBot-Google-Mobile; +http://www.google.com/mobile/adsbot.html)
Mobile Apps Android
Checked Android app page ad quality. Obeyed AdsBot-Google robots rules, but ignored the global user agent (*) in robots.txt.
User agent token | AdsBot-Google-Mobile-Apps
Full user agent string | AdsBot-Google-Mobile-Apps
Google Favicon
User agent tokens |
- Googlebot-Image
- Googlebot
Full user agent string | Mozilla/5.0 (X11; Linux x86_64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/49.0.2623.75 Safari/537.36 Google Favicon