Cloudflare’s “block AI bots” and Bot Fight Mode can block Google & other search engines from crawling your site

cpvr

If you enable Cloudflare’s one-click “block AI bots, scrapers and crawlers” feature, you risk blocking the search engine spiders as well, which can get your site deindexed from the search engines. The same thing can happen if you enable “Bot Fight Mode” without changing the settings to exclude the main search engine spiders.

Cloudflare’s Bot Fight Mode is the cause. Essentially, this tool works at the JavaScript level and blocks anything it considers suspicious, including Google’s crawler. Cloudflare will tell you that you can exclude the crawler so that it is permitted access. However, they omit one critical element: you need a paid plan to do so.

With Bot Fight Mode enabled, the website cannot be crawled.

So instead, you should block the bots that you don’t want crawling your site with a robots.txt file: https://help.raptive.com/hc/en-us/articles/25756415800987-How-to-manually-block-common-AI-crawlers

How it works

You can block AI crawlers by adding them to your site’s robots.txt file as disallowed user agents (according to each AI company’s instructions).

User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: PiplBot
Disallow: /
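
A quick way to sanity-check rules like these before uploading them is to run them through Python's standard-library robots.txt parser. This is only a minimal sketch: the `is_allowed` helper and the example.com URL are made up for illustration, and Python's matching is a simplification of how each real crawler reads the file.

```python
from urllib.robotparser import RobotFileParser

# The rules from the post above, pasted in as a string.
ROBOTS_TXT = """\
User-agent: anthropic-ai
Disallow: /
User-agent: Claude-Web
Disallow: /
User-agent: CCBot
Disallow: /
User-agent: FacebookBot
Disallow: /
User-agent: Google-Extended
Disallow: /
User-agent: GPTBot
Disallow: /
User-agent: PiplBot
Disallow: /
"""


def is_allowed(robots_txt: str, user_agent: str,
               url: str = "https://example.com/") -> bool:
    """Hypothetical helper: True if user_agent may fetch url under these rules."""
    parser = RobotFileParser()
    parser.parse(robots_txt.splitlines())
    return parser.can_fetch(user_agent, url)


if __name__ == "__main__":
    for agent in ("GPTBot", "CCBot", "Googlebot", "Bingbot"):
        print(agent, is_allowed(ROBOTS_TXT, agent))
```

With the rules above, `is_allowed(ROBOTS_TXT, "GPTBot")` comes back False while `is_allowed(ROBOTS_TXT, "Googlebot")` comes back True, so the search engine spiders are not caught by the AI-crawler entries.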
 
You only need to enable Bot Fight Mode if you see one of your sites under attack; most admins keep the setting off. 🙂
 
So basically, their handy little feature is a double-edged sword... surely there are other ways to fight off spambots and keep your site high in the rankings. I think Cloudflare should fix that issue soon...
 
I'm not exactly on board with using Cloudflare or other third-party apps because of that.

It reminds me of how sketchy Tapatalk was/is.

I know a lot of people use CF, but things like that make me wonder, even if you can disable it.
 
I don't use Cloudflare for a variety of reasons, although I do appreciate that it has its uses. Is it really still the case that you need a paid plan to effectively whitelist desirable crawlers?

I do recall reading something about this being a potential issue a while ago. I'm a little surprised they have now chosen to implement it like that if it's still the case.
 
I don't think I have it enabled on my end anyway, but I need to double-check XD
 
All the recommendations from people I trust say that you should NOT enable Bot Fight Mode unless you are having serious problems.
It is recommended that you block by user agent with CF rules, which is MUCH better than robots.txt, since many bad bots ignore it. If a bot ignores robots.txt, you STILL get hit by it unless you block it at the CF level.
Yes, it takes some extra work, as the bad bots frequently change their user agent... but the AI bots that many want to block will not.
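
For anyone going the CF rules route, the rule expression is basically a chain of user-agent checks. Here is a rough Python sketch that builds one from a list of bot names. `http.user_agent` and `contains` come from Cloudflare's rules language; the `build_block_expression` helper and the agent list are hypothetical, and the matching is case-sensitive as far as I know, so check the exact strings against your own logs. Google-Extended is left out on purpose, since it is a robots.txt token and not a user agent that shows up in requests.

```python
from typing import Iterable

# Assumed example list, taken from the robots.txt entries earlier in the thread.
BLOCKED_AGENTS = ["GPTBot", "CCBot", "Claude-Web", "anthropic-ai"]


def build_block_expression(agents: Iterable[str]) -> str:
    """Hypothetical helper: join user-agent checks into one rule expression."""
    clauses = []
    for agent in agents:
        # Refuse characters that would break out of the quoted string.
        if not agent or '"' in agent or "\\" in agent:
            raise ValueError(f"unsafe user agent token: {agent!r}")
        clauses.append(f'(http.user_agent contains "{agent}")')
    if not clauses:
        raise ValueError("need at least one user agent")
    return " or ".join(clauses)


if __name__ == "__main__":
    print(build_block_expression(BLOCKED_AGENTS))
```

The printed string is meant to be pasted into the expression editor of a custom rule with the action set to Block. Treat it as a starting point and verify it in your own dashboard before relying on it.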

[Attachment: Screen Shot 2024-09-26 at 7.51.07 AM.png]
 