For example, Googlebot gets blocked by the following robots.txt (you can check it in Google's robots.txt testing tool):
# Slow down bots
User-agent: *
Crawl-delay: 10
# Disallow: Badbot
User-agent: badbot
Disallow: /
# allow explicitly all other bots
User-agent: *
Disallow:
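For comparison, here is a minimal sketch that feeds the same rules to Python's standard-library urllib.robotparser. This is of course not the parser Googlebot uses, so it only shows how another implementation reads the file; the example.com URL is just a placeholder:

from urllib.robotparser import RobotFileParser

rules = """\
# Slow down bots
User-agent: *
Crawl-delay: 10
# Disallow: Badbot
User-agent: badbot
Disallow: /
# allow explicitly all other bots
User-agent: *
Disallow:
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

# The stdlib parser falls back to the wildcard group for Googlebot, which
# contains no Disallow rule, so Googlebot is reported as allowed here,
# unlike what the Google testing tool reports.
print(parser.can_fetch("Googlebot", "https://example.com/"))  # True
print(parser.can_fetch("badbot", "https://example.com/"))     # False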
If you remove the Crawl-delay directive, Googlebot passes. This works:
# Disallow: Badbot
User-agent: badbot
Disallow: /
# allow explicitly all other bots
User-agent: *
Disallow:
And this too:
# Disallow: Badbot
User-agent: badbot
Disallow: /
If you would like to use the Crawl-delay directive without blocking Googlebot, you have to add an Allow directive:
# Slow down bots
User-agent: *
Crawl-delay: 10
# Disallow: Badbot
User-agent: badbot
Disallow: /
# allow explicitly all other bots
User-agent: *
Disallow:
# allow explicitly all other bots (supported only by google and bing)
User-agent: *
Allow: /
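Once the amended file is deployed, you can sanity-check it in a similar way; a rough sketch, again with Python's stdlib parser rather than Googlebot itself, and with example.com standing in for the real host:

from urllib.robotparser import RobotFileParser

parser = RobotFileParser("https://example.com/robots.txt")
parser.read()  # fetch and parse the live robots.txt

for agent in ("Googlebot", "badbot"):
    print(agent, "allowed:", parser.can_fetch(agent, "https://example.com/"))
    print(agent, "crawl-delay:", parser.crawl_delay(agent))

Keep in mind that the stdlib parser honours only the first User-agent: * group, while the spec Google follows combines all groups matching the same agent, so the two implementations can disagree on a file like this one.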
Both Crawl-delay and Allow are unofficial directives. Crawl-delay is widely supported (except by Googlebot), while Allow is, as far as I know, supported only by Googlebot and Bingbot. Normally, Googlebot should be allowed by every robots.txt shown above. For example, if you choose AdsBot-Google in the mentioned Google tool, it passes for all of them, while all other Google bots fail in the same way. We first noticed this unexpected behaviour at the end of 2021.
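Parser-library support for these extensions varies as well; as an aside, Python's stdlib parser happens to understand both directives. A small sketch with a made-up file, purely for illustration:

from urllib.robotparser import RobotFileParser

rules = """\
User-agent: *
Crawl-delay: 10
Allow: /public/
Disallow: /
"""

parser = RobotFileParser()
parser.parse(rules.splitlines())

print(parser.crawl_delay("Googlebot"))                                # 10
print(parser.can_fetch("Googlebot", "https://example.com/public/x"))  # True  (Allow matches first)
print(parser.can_fetch("Googlebot", "https://example.com/private"))   # False (Disallow: /)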
Is this a mistake in Googlebot's parsing of robots.txt, or am I just missing something?