Ignore Google Bots on Fail2Ban

This post is related to How to block 404 attacks using fail2ban.

If you have a fail2ban rule that is jailing Google bots, there is a way to keep legitimate Google bots from being jailed.

Google Tidbits

An obvious thought might be to whitelist Google’s IPs in fail2ban so that Google bots can safely crawl your site. The problem here is that Google doesn’t share its IP ranges and has stated that they can change at any time. However, Google does recommend verifying its bots with a reverse DNS lookup, as you can see in this article https://developers.google.com/search/blog/2006/09/how-to-verify-googlebot.

Knowing this, we can leverage fail2ban’s ignorecommand found in the jail.local file.

This option lets you point to a script that runs some checks on the provided IP. fail2ban uses the script’s exit status to decide whether the IP should be ignored (exit status 0) or jailed as usual (any other status).
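To make that contract concrete, here is a minimal sketch of the kind of check such a script can perform. The helper name is_google_ptr, the hostname pattern, and the sample PTR line are assumptions for illustration; the helper only inspects text and performs no DNS lookup of its own.

```shell
# Sketch of the decision fail2ban expects from an ignorecommand:
# exit status 0 = ignore the IP, anything else = jail it as usual.
# is_google_ptr is a hypothetical helper; it inspects lookup text only.
is_google_ptr() {
  # Match a PTR name ending in .googlebot.com. or .google.com. (assumed pattern)
  printf '%s\n' "$1" | grep -Eq '\.(googlebot|google)\.com\.$'
}

# Sample PTR line in the format printed by the host command (illustrative value)
sample='1.66.249.66.in-addr.arpa domain name pointer crawl-66-249-66-1.googlebot.com.'
if is_google_ptr "$sample"; then echo "ignore"; else echo "jail"; fi
```

The leading dot in the pattern is deliberate: without it, a hostname such as host.evilgoogle.com. would also match.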

For the steps below, I’ll SSH into the server and make the updates from the command line.

Step 1: Add script

Navigate to local bin, where we’ll add the script:

cd /usr/local/bin

Here, we’ll create a new script file and make it executable.

touch ignore_ip_check.sh && chmod +x ./ignore_ip_check.sh

Edit the file and add the following contents:

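#!/bin/bash
IP="$1"; REGEX='\.(googlebot|google)\.com\.$'  # IP is passed in by fail2ban; REGEX is an assumed pattern for Google's crawler hostnames
HOSTRESULT="$(host -W 1 "${IP}")"
if [[ "$HOSTRESULT" =~ $REGEX ]]; then exit 0; else exit 1; fi  # 0 = ignore the IP, 1 = jail as usual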

To edit, I typically use vim. To use vim, just run vim ignore_ip_check.sh.

Once vim launches, press i to enter insert mode, paste in the above contents, press Esc, then type :wq! and press Enter. This will save the new file.

And that’s how you’d use vim 80% of the time :rofl:
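The script in Step 1 checks only the reverse (PTR) record, which whoever controls the reverse DNS for an IP can set to any name. Google’s verification guidance also calls for a forward lookup on the returned name to confirm that it resolves back to the same IP. Below is a hedged sketch of that stricter variant. The function name verify_googlebot, the DNS_LOOKUP variable, and the hostname pattern are my own illustrative choices, not part of fail2ban.

```shell
#!/bin/bash
# Forward-confirmed reverse DNS sketch. Exit 0 = ignore the IP, exit 1 = let fail2ban jail it.
# verify_googlebot and DNS_LOOKUP are hypothetical names; "host -W 1" waits at most 1 second.
DNS_LOOKUP="${DNS_LOOKUP:-host -W 1}"

verify_googlebot() {
  local ip name
  ip="$1"
  # Step 1: reverse lookup; keep a PTR name only if it ends in .googlebot.com. or .google.com.
  name="$($DNS_LOOKUP "$ip" 2>/dev/null \
    | awk '/domain name pointer/ {print $NF}' \
    | grep -E '\.(googlebot|google)\.com\.$' | head -n 1)"
  [ -n "$name" ] || return 1
  # Step 2: forward lookup; the name must resolve back to the original IP
  $DNS_LOOKUP "$name" 2>/dev/null \
    | awk '/has (IPv6 )?address/ {print $NF}' \
    | grep -Fxq "$ip"
}

# When called with an IP argument (as fail2ban does), exit with the verdict
if [ -n "$1" ]; then verify_googlebot "$1"; exit $?; fi
```

If you try this in place of the Step 1 script, keep the two lookups fast: fail2ban waits for the command to finish before deciding, so a slow resolver delays every ban decision.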

Step 2: Update jail.local

Now, in /etc/fail2ban, edit the jail.local file.

Look for the line that begins with ignorecommand =

This will need to be updated as follows:

ignorecommand = /usr/local/bin/ignore_ip_check.sh <ip>
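After saving, it can be worth confirming that exactly one uncommented ignorecommand line is present and that it points at the script. A small sketch, assuming the default path used above; show_ignorecommand is a hypothetical helper name.

```shell
# Print the active (uncommented) ignorecommand lines of a fail2ban config file,
# prefixed with their line numbers. show_ignorecommand is a hypothetical helper;
# the default path is the jail.local edited in this step.
show_ignorecommand() {
  grep -nE '^[[:space:]]*ignorecommand[[:space:]]*=' "${1:-/etc/fail2ban/jail.local}"
}
```

Running show_ignorecommand with no arguments should print a single line ending in /usr/local/bin/ignore_ip_check.sh <ip>. Reading the file may need sudo, depending on its permissions.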

Step 3: Restart fail2ban and test

Lastly, in Cleavr, go to server > services and select the option to restart fail2ban.

Testing is probably easiest in Cleavr 2.0 (app.cleavr.io): go to the server > logs section and view the fail2ban logs. I’d also recommend testing from a different device or via a VPN where you can change IPs, since you’ll be getting the IP you’re using banned.

Now, open up a browser, go to a site on the server, and generate enough 404 errors to get jailed. Check the logs and make sure there are no fail2ban errors. If there is an error with the script, you’ll see it in the logs.
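If you’d rather run the same test from a terminal, the two steps above can be sketched as shell helpers. The function names, the default count, and the URL path are placeholders; /var/log/fail2ban.log is the usual default log location, but your server may log elsewhere.

```shell
# Hypothetical helpers for the test described above; names and defaults are placeholders.

# Request COUNT nonexistent paths under BASE_URL and print each HTTP status code.
trigger_404s() {
  local base_url="$1" count="${2:-10}" i=1
  while [ "$i" -le "$count" ]; do
    curl -s -o /dev/null -w '%{http_code}\n' "$base_url/no-such-page-$i"
    i=$((i + 1))
  done
}

# Print fail2ban log lines that mention IP as a whole token.
# The default log path is an assumption; pass another file as the second argument.
f2b_lines_for_ip() {
  grep -wF -- "$1" "${2:-/var/log/fail2ban.log}"
}
```

For example, run trigger_404s https://your-site.example 20 from the test device, then run f2b_lines_for_ip with that device’s public IP on the server to see what fail2ban logged for it.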

To give credit where credit is due, I based this approach on this article https://deeb.me/20180320/how-not-to-ban-googlebot, but I made a few updates to iron out the kinks.