Because I’m a giant loser who thinks that analyzing Apache logs is an awesome way to spend a Friday night, I’ve noticed a huge upswing in traffic coming from IP addresses in China, to the degree that it’s actually eating up huge amounts of bandwidth.
Identifying That You Have a Problem
These are otherwise very low-traffic websites that are suddenly pulling millions of visits per month, with the vast majority of these requests consisting of hack attempts and egregious abuse of forum registrations and other forms. I already have two or three text, image and logic CAPTCHAs running, in addition to RBL lookups on the visitor’s IP. Forum registration spam is a whole other battle on its own.
Here’s the stats report from January 2013. As you can see, China alone is responsible for 30GB of bandwidth and over 1 million pageviews in that one month. In addition, all of the bogus forum registrations caused about 30,000 email bounces, because the “Please confirm your email address” registration messages were being sent to bogus addresses.
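If you don’t have a stats package handy, a rough per-IP tally straight from the access log will show the same kind of pattern. The sketch below assumes the standard Apache common/combined log format; the function name `tally_by_ip` and the sample lines (using documentation-reserved IP ranges) are made up for illustration, and it doesn’t map IPs to countries, which would need a separate GeoIP lookup.

```python
import re
from collections import Counter

# Matches the start of an Apache common/combined log line:
# client IP, identd, user, [timestamp], "request", status, bytes.
# Assumption: the request string contains no escaped double quotes.
LOG_LINE = re.compile(
    r'^(?P<ip>\S+) \S+ \S+ \[[^\]]+\] "[^"]*" (?P<status>\d{3}) (?P<size>\d+|-)'
)


def tally_by_ip(lines):
    """Return (hits, bytes_sent) Counters keyed by client IP."""
    hits, bytes_sent = Counter(), Counter()
    for line in lines:
        match = LOG_LINE.match(line)
        if match is None:
            continue  # skip lines that are not in the expected format
        ip = match.group("ip")
        size = match.group("size")
        hits[ip] += 1
        # Apache logs "-" when no response body was sent.
        bytes_sent[ip] += 0 if size == "-" else int(size)
    return hits, bytes_sent


# Made-up sample lines for illustration only.
sample = [
    '203.0.113.7 - - [29/Jan/2013:10:00:00 +0000] "POST /forum/register HTTP/1.1" 200 5120',
    '203.0.113.7 - - [29/Jan/2013:10:00:02 +0000] "GET /wp-login.php HTTP/1.1" 403 -',
    '198.51.100.23 - - [29/Jan/2013:10:00:05 +0000] "GET / HTTP/1.1" 200 2048',
    'this line is not a log entry',
]

hits, bytes_sent = tally_by_ip(sample)
top_talkers = hits.most_common(10)  # heaviest requesters first
```

On a real server you would pass an open file instead of `sample`, since the function only needs something it can iterate over line by line.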
Thirty thousand bounced emails are not just annoying. They put the reputation of your IP address at risk. Even if you yourself aren’t sending out spam from the server and you don’t have any vulnerable scripts that are being used to send out spam to others, your IP can get blacklisted on extortion-based blacklists like Backscatterer, which blacklists you for a month unless you pay them for “expedited” delisting. Fuckers.
Blacklisted mail server IPs mean you end up with serious deliverability issues with your legitimate emails (and those of your company, your clients, etc.).
If you have any programming chops, email validation services such as BriteVerify offer APIs that, for a penny a pop, can help stop those registrations (and thus the bouncing confirmation emails) from going through. (Full disclosure: I know the guys at BriteVerify. That said, if I didn’t like their product, I wouldn’t link to them.)
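The general shape of that check is simple: validate the address before the confirmation email ever goes out. Here is a minimal sketch of the idea, not BriteVerify’s actual API. `verify_remote` is a hypothetical callable standing in for whatever validation service you wire up, and the loose regex only exists to reject obvious garbage without spending a paid lookup on it.

```python
import re

# Deliberately loose shape check: something@something.tld, no whitespace.
# It is a cheap pre-filter, not a full RFC-compliant validator.
EMAIL_SHAPE = re.compile(r"^[^@\s]+@[^@\s]+\.[^@\s]+$")


def should_send_confirmation(email, verify_remote):
    """Decide whether a registration should get a confirmation email.

    verify_remote is a hypothetical callable (an assumption, not a real
    library function) that takes an address and returns True when the
    external validation service considers it deliverable.
    """
    if not EMAIL_SHAPE.match(email):
        return False  # obviously malformed, skip the paid lookup
    return bool(verify_remote(email))


# Stub verifier for illustration: pretend only one address is deliverable.
def fake_verifier(address):
    return address == "real.person@example.com"


ok = should_send_confirmation("real.person@example.com", fake_verifier)
rejected = should_send_confirmation("bogus@@example", fake_verifier)
```

Keeping the remote check behind a plain callable means the registration code doesn’t care which service sits behind it, and you can swap in a stub for testing.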
While I am loath to block an entire country from accessing a website, as you can see from the diagram below, something had to be done. On a few of these servers, I do not have root access and am a mere mortal Linux user.
The easiest way to handle this blocking was with .htaccess. I added the .htaccess described in this article on January 29. Below is the traffic stat report for February 2013. Page requests from China dropped from over 1 million to 17k, and bandwidth used fell from almost 30GB to 510MB.
The .htaccess I used is a combination of IPs from Wizcraft’s Chinese Blocklist and my own stock .htaccess that I use on all of my sites. You can grab a copy of my .htaccess from the gist on GitHub.
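If you’d rather build the deny rules from your own list of ranges, something like the following would do it. This is a sketch and not the contents of the gist: the function name `build_deny_block` and the sample ranges (documentation-reserved addresses) are made up, and it assumes the Apache 2.2-style `Order`/`Allow`/`Deny` directives. Apache 2.4 only honors those through mod_access_compat and otherwise expects `Require` rules instead.

```python
import ipaddress


def build_deny_block(cidrs):
    """Turn an iterable of IPs/CIDR ranges into Apache 2.2-style deny rules.

    Raises ValueError if an entry is not a valid IP address or network.
    """
    # With "Order Allow,Deny", Allow rules are evaluated first and any
    # matching Deny rule overrides them, so everyone is let in except
    # the listed ranges.
    lines = ["Order Allow,Deny", "Allow from all"]
    for cidr in cidrs:
        # strict=False normalizes entries like 192.0.2.5/24 to 192.0.2.0/24.
        network = ipaddress.ip_network(cidr.strip(), strict=False)
        lines.append(f"Deny from {network}")
    return "\n".join(lines) + "\n"


# Made-up ranges for illustration; a real blocklist would go here.
rules = build_deny_block(["192.0.2.0/24", "198.51.100.7"])
```

Running each entry through `ipaddress.ip_network` catches typos before they land in a live .htaccess, where a malformed directive can take the whole site down with a 500 error.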
Certainly, blocking an entire country isn’t always going to be an option, but when you’re on a server you don’t have admin access on, you may not have a ton of options open to you. You either take it in the can with bandwidth costs and IP reputation, or you do what you have to do.
As an exercise, I’m checking out CloudFlare. You make a few DNS tweaks, and they proxy, cache and CDN your entire website. They also throw an interstitial CAPTCHA up to users who are browsing your site from IP addresses that are repeatedly tied to bad behavior.
They never charge for bandwidth, and have varying levels of protection based on tolerance and whether you’re on a free or pro tier. I’ve only just started with them today, on the recommendation of a friend on Twitter, but so far I’m pretty impressed. Paid packages start at $20/month, and the Enterprise level boasts full DDoS attack protection. At the absolute worst, it’s made my site faster.
Once I’ve spent a little time with CloudFlare, I’ll write up a review and post it here.
In the meantime, how are you handling the deluge of country-specific spam and bad actors?