Referer Whitelist Bypassing Bad Bot Block: A Fix
Hey folks, have you ever encountered a situation where your carefully crafted security measures seem to be playing hide-and-seek with bad actors? Well, I recently stumbled upon an interesting quirk with the apache-ultimate-bad-bot-blocker, specifically related to how it handles referers and user agents. Let's dive into it, shall we?
The Mystery: When Whitelists Meet Blacklists
The heart of the matter lies in a scenario where a whitelisted domain, as defined in custom.d/whitelist-domains.conf, manages to bypass a blacklisted user agent listed in custom.d/blacklist-user-agents.conf. It's like having a VIP pass that allows a known troublemaker to sneak past security. Not ideal, right? My initial reaction was a mix of confusion and mild panic. I mean, the whole point of these configurations is to keep unwanted requests at bay. But, as we'll see, there's a logical explanation and, more importantly, a fix!
To understand this, we need to quickly recap how these configurations typically work. The whitelist-domains.conf file allows you to specify referers you trust. This is super useful for allowing traffic from your own sites, or from partners, to bypass certain restrictions. On the other hand, the blacklist-user-agents.conf file is designed to block requests from user agents you know are malicious, such as bots scraping your site or other automated nastiness. The problem arises when these two settings interact in a way that doesn't quite align with expectations.
Imagine this: you've set up a whitelist for yourdomain.com. You want to make sure traffic from this domain gets a special pass. At the same time, you've blacklisted a particularly nasty user agent, MyVeryBadUserAgentName, which you want to block. Now, here's where the plot thickens: if a request comes in with the blacklisted user agent, but also includes a referer from your whitelisted domain, the blacklisting seems to be ignored. The request goes through, and you're left scratching your head.
This isn't necessarily a bug in the code. It's more of an interaction between the logic and the way the configurations are structured. The default configuration might be unintentionally prioritizing the referer whitelist over the user agent blacklist in certain scenarios. This leads to the undesirable outcome where a request from a bad actor slips through.
Let's break down how this happens and how to fix it.
Reproducing the Issue: The Steps
To make sure we're all on the same page, let's go through the steps to reproduce this behavior. This way, you can test it yourself and verify that you're seeing the same issue.
Step 1: Whitelist a Domain. First, you'll need to add a domain to your custom.d/whitelist-domains.conf file. This tells Apache to trust requests that come from this domain. Here's what it might look like:
SetEnvIfNoCase Referer ~*yourdomain\.com good_ref
This line sets an environment variable, good_ref, whenever the Referer header matches yourdomain.com. The SetEnvIfNoCase directive performs the regular-expression match against the referer case-insensitively, so the variable gets set no matter how the domain is capitalized.
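If you trust more than one referer, the file takes one entry per line, each setting the same good_ref variable. Here's a minimal sketch following the same pattern; partnerdomain.com is a purely hypothetical second domain standing in for a partner site:
SetEnvIfNoCase Referer ~*yourdomain\.com good_ref
SetEnvIfNoCase Referer ~*partnerdomain\.com good_ref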
Step 2: Blacklist a User Agent. Next, add the offending user agent to your custom.d/blacklist-user-agents.conf file. This tells Apache to block requests from this user agent. Here's an example:
BrowserMatchNoCase "^(.*?)\bMyVeryBadUserAgentName\b(.*?)$" bad_bot
This line uses BrowserMatchNoCase to check the User-Agent header, ignoring case, and if it matches the regular expression (which looks for our bad user agent name), it sets the environment variable bad_bot.
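The blacklist file follows the same one-entry-per-line convention, with every match setting bad_bot. A quick sketch, where AnotherBadScraper is just a made-up second name for illustration:
BrowserMatchNoCase "^(.*?)\bMyVeryBadUserAgentName\b(.*?)$" bad_bot
BrowserMatchNoCase "^(.*?)\bAnotherBadScraper\b(.*?)$" bad_bot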
Step 3: Restart Apache. After making these changes, you need to restart Apache for them to take effect. This is crucial; otherwise, your changes won't do anything. The command to restart Apache will depend on your system, but it's usually something like:
sudo systemctl restart apache2 # or
sudo service apache2 restart
Make sure to use the correct command for your system.
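One extra tip: before you actually restart, you can ask Apache to validate the configuration so a typo in the conf files doesn't take the server down. Depending on your distribution, one of these should work:
sudo apachectl configtest # or
sudo apache2ctl configtest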
Step 4: Test with cURL. Now, the moment of truth! Use curl to test if the blacklisting is working. You'll send a request that includes both the bad user agent and the referer from the whitelisted domain. Here's the curl command:
curl -I -A "MyVeryBadUserAgentName" http://yourdomain.com -e "http://yourdomain.com"
This command sends a HEAD request (-I) to your domain. The -A option sets the user agent to our blacklisted value, and the -e option sets the referer to your whitelisted domain. If everything is working as expected (that is, if the blacklist is being honored), you should get a 403 Forbidden response. However, if you get a 200 OK response, you've confirmed the issue.
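As a control, it's worth running the same request without the whitelisted referer. If the blocker is installed correctly, this one should come back 403 Forbidden, which confirms the blacklist itself works and that the referer is what opens the hole:
curl -I -A "MyVeryBadUserAgentName" http://yourdomain.com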
The Expected Outcome vs. Reality
The big question: what should happen versus what actually happens? If you’ve followed the steps and tested the setup, you'll likely notice something's amiss. Ideally, you'd get a 403 Forbidden error, because we explicitly blacklisted the user agent, and any request using it should be blocked, regardless of the referer.
However, in this case, the reality is often different. You'll likely see a 200 OK response. The server is happily serving the request, completely ignoring the blacklist we set up. This is where the head-scratching begins. Why is the blacklist being bypassed? The answer, as we hinted at earlier, lies in how the configuration files are processed and how the conditions are evaluated.
The core of the problem is that the whitelist is taking precedence over the blacklist. The server sees the referer as valid and, by default, might be configured to prioritize this over the user agent check. This means that a request from a potentially malicious user agent can sneak through, as long as it originates from a whitelisted referer. This is a potential security vulnerability because it allows unwanted traffic to bypass your defenses. In essence, the VIP pass (referer whitelist) is overriding the bouncer's instructions (user agent blacklist).
The Fix: A Workaround to the Rescue
Luckily, there's a relatively simple workaround that can fix this issue. It involves modifying the globalblacklist.conf file to ensure the blacklist is properly enforced, even when a referer is whitelisted. This modification ensures that both conditions—the whitelist and the blacklist—are evaluated correctly.
The fix involves tightening the logic around how these environment variables are evaluated, so that the referer whitelist can no longer override the user agent blacklist. Here is how to apply the workaround.
Locate the Relevant Section. First, you need to find the section in globalblacklist.conf that handles the good_ref and good_bot environment variables; it sits around line 8441 in the stock file and usually looks like this:
<RequireAny>
    Require env good_ref
    Require env good_bot
</RequireAny>
This code checks if either good_ref or good_bot is set. If either condition is met, the request is allowed. This is where the problem lies: a request with a good_ref (from a whitelisted domain) will be allowed, even if it also has a bad_bot (a blacklisted user agent).
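It helps to put this in boolean terms before touching the config. What the stock block does and what we actually want are:
allow if: good_ref OR good_bot                     (stock behaviour)
allow if: (good_ref AND NOT bad_bot) OR good_bot   (what we want)
The workaround below restructures the directives to express the second rule.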
Modify the Configuration. To fix this, we'll nest a <RequireAll> block inside the existing <RequireAny>, so that the whitelisted-referer branch only succeeds when no blacklisted user agent is present. Replace the code above with the following:
<RequireAny>
    <RequireAll>
        Require env good_ref
        Require not env bad_bot
    </RequireAll>
    Require env good_bot
</RequireAny>
Here’s what the changes do. The nested <RequireAll> only passes when the request carries good_ref and the bad_bot variable has not been set, so even a request arriving with a whitelisted referer is still blocked if it uses a blacklisted user agent. The Require not env bad_bot line is the key: it is the documented Apache 2.4 way of saying the bad_bot variable must not exist for that branch to succeed.