Integrate fail2ban with WordPress: Spam Log Plugin

Spam Log is a WordPress plugin that writes a log entry for every comment marked as spam. The log file is suitable for processing by fail2ban.

Recently, I’ve encountered some very aggressive WordPress spam bots. These bots post a new spam comment almost every minute for hours on end. Needless to say, my spam queue is a mess. I wrote the following plugin to solve this problem.

What is Spam Log?

Spam Log is a simple WordPress plugin that logs a message every time a comment is marked as spam. Each log message includes the IP address of the poster and the comment’s ID. The log can easily be processed by fail2ban. fail2ban is a daemon that scans log files for misbehaving clients and bans them by IP address. Here is sample output generated by Spam Log:

2009-04-20 04:15:03 comment id=527 from host= marked as spam
2009-04-20 04:18:15 comment id=528 from host= marked as spam
2009-04-20 04:20:36 comment id=529 from host= marked as spam
2009-04-20 04:21:46 comment id=530 from host= marked as spam
2009-04-20 04:22:49 comment id=531 from host= marked as spam

Why use Spam Log and fail2ban if Akismet/wp-recaptcha/etc. is already catching all the spam?

  • Many spammers post 50+ comments a day from a single IP address. Even if every comment is correctly marked as spam, the volume alone means that you can’t easily monitor the spam queue for false positives. Spam Log and fail2ban should considerably reduce the total amount of spam.
  • Even if spam comments never appear on your blog, they still waste valuable resources on your server. Low-memory virtual servers need all available resources for serving legitimate users. Banning spammers at the firewall before they ever connect to your web server is very efficient.


Spam Log

  1. Upload the spam-log folder to the wp-content/plugins directory.
  2. Activate the plugin through the WordPress Admin menu.
  3. Set the location of the spam log through Spam Log’s Options page in the WordPress Admin menu. By default, the location is set to wp-content/spam.log. The file or containing directory needs to be writeable by the user that the web server runs as. On Debian or Ubuntu systems, you can do the following:

$ sudo touch /path/to/spam.log
$ sudo chown www-data:www-data /path/to/spam.log

fail2ban Configuration

Create /etc/fail2ban/filter.d/spam-log.conf with the following contents:

[Definition]
failregex = ^\s*comment id=\d+ from host=<HOST> marked as spam$
ignoreregex =

Add the following lines to /etc/fail2ban/jail.local:

[spam-log]
enabled  = true
port     = http,https
filter   = spam-log
logpath  = /path/to/spam.log
maxretry = 5
findtime = 3600
bantime  = 86400

Change logpath to the path you set on Spam Log’s Options page. This configuration will ban an IP address for a day if it’s used to post 5 comments within an hour that are marked as spam. Warning: Some captcha plugins mark comments as spam when a user fails a captcha. Be careful decreasing maxretry if you’re using such a plugin as there’s a risk that you will ban legitimate users.
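Before enabling the jail, it can help to see which addresses would already be over the limit. The following is a minimal sketch, not part of Spam Log or fail2ban: spam_hosts_over is a hypothetical helper that tallies entries per host in the log format shown earlier and prints the hosts at or above a threshold. It ignores findtime, so it only approximates what the jail would do.

```shell
#!/bin/sh
# Print each host that has at least THRESHOLD spam entries in a Spam Log file,
# together with its entry count. Hypothetical helper, not part of the plugin.
# Usage: spam_hosts_over /path/to/spam.log 5
spam_hosts_over() {
    awk -v threshold="$2" '
        /marked as spam$/ {
            # Find the "host=ADDRESS" field and tally the address.
            for (i = 1; i <= NF; i++) {
                if ($i ~ /^host=./) {
                    count[substr($i, 6)]++
                }
            }
        }
        END {
            for (host in count) {
                if (count[host] >= threshold) {
                    print host, count[host]
                }
            }
        }
    ' "$1"
}
```

If a legitimate address shows up in the output, that is a hint that maxretry is too low for your setup.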



Log iptables Messages to a Separate File with rsyslog

Learn how to filter iptables log messages to a separate file. Two methods are presented: one using traditional syslog and one using rsyslog.

Firewall logging is very important, both to detect break-in attempts and to ensure that firewall rules are working properly. Unfortunately, it’s often difficult to predict in advance which rules and what information should be logged. Consequently, it’s common practice to err on the side of verbosity. Given the amount of traffic that any machine connected to the Internet is exposed to, it’s critical that firewall logs be separated from normal logs in order to ease monitoring. What follows are two methods to accomplish this using iptables on Linux. The first method uses traditional syslog facility/priority filtering. The second, more robust method filters based on message content with rsyslog.

The Old Way: Use a Fixed Priority for iptables

The traditional UNIX syslog service only has two ways to categorize, and consequently route, messages: facility and priority. Facilities include kernel, mail, daemon, etc. Priorities include emergency, alert, warning, debug, etc. The Linux iptables firewall runs in the kernel and therefore always has the facility set to kern. Using traditional syslog software, the only way you can separate iptables messages from other kernel messages is to set the priority on all iptables messages to something specific that hopefully isn’t used for other kernel logging.

For example, you could add something like the following to /etc/syslog.conf:

kern.=debug -/var/log/iptables.log

and specifically remove the kernel debugging messages from all other logs like so:

kern.*;kern.!=debug -/var/log/kern.log

and in each iptables logging rule use the command line option --log-level debug.

There are two distinct disadvantages to this approach. First, there’s no guarantee that other kernel components won’t use the priority you’ve set iptables to log at. There’s a real possibility that useful messages will be lost in the deluge of firewall logging. Second, this approach prevents you from actually setting meaningful priorities in your firewall logs. You might not care about random machines hammering Windows networking ports, but you definitely want to know about malformed packets reaching your server.

The New Way: Filter Based on Message Content with rsyslog

rsyslog is mostly a drop-in replacement for a traditional syslog daemon (on Linux, klogd and sysklogd). In fact, on Debian and Ubuntu, you can simply:

$ sudo apt-get install rsyslog

and if you haven’t customized /etc/syslog.conf, logging should continue to work in precisely the same way. rsyslog has been the default syslog on Red Hat/Fedora based systems for a number of versions now, but if it’s not installed:

$ sudo yum install rsyslog

Configure iptables to Use a Unique Prefix

We’ll set up rsyslog to filter based on the beginning of a message from iptables. So, for each logging rule in your firewall script, add --log-prefix "iptables: ". Most firewall builder applications can be easily configured to add a prefix to every logging rule. For example, if you’re using firehol as I am, you could add:


to /etc/firehol/firehol.conf.

Configure rsyslog to Filter Based on Prefix

Create /etc/rsyslog.d/iptables.conf with the following contents:

:msg, startswith, "iptables: " -/var/log/iptables.log
& ~

The first line means send all messages that start with “iptables: ” to /var/log/iptables.log. The second line means discard the messages that were matched in the previous line. The second line is of course optional, but it saves the trouble of explicitly filtering out firewall logs from subsequent syslog rules. Note that newer rsyslog releases deprecate the ~ discard action; there, write & stop instead of & ~.
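One rough way to confirm the filter is doing its job is to count the lines carrying the prefix in each log file. This is only a sketch: count_prefixed is a hypothetical helper, and it matches the prefix anywhere in a line rather than strictly at the start of the message the way rsyslog does.

```shell
#!/bin/sh
# Count the lines in a log file that contain the given prefix string.
# Hypothetical helper for sanity-checking the rsyslog filter.
# Usage: count_prefixed "iptables: " /var/log/iptables.log
count_prefixed() {
    # -F treats the prefix as a fixed string, -c prints the number of matching
    # lines. grep exits non-zero when nothing matches, so force success.
    grep -c -F -- "$1" "$2" || true
}
```

After restarting rsyslog, the count for /var/log/iptables.log should grow as packets are logged, while the count for /var/log/kern.log should stay where it was.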

When I configured this on my own machines, I did notice one issue that may be a peculiarity of firehol, but it’s probably worth mentioning anyway. It seems that firehol adds an extra single quote at the beginning of log messages that needs to be matched in the rsyslog rule. For example, here’s a log message from firehol:

Apr 17 12:41:07 tick kernel: 'firehol: 'IN-internet':'IN=eth0 OUT= MAC=fe:fd:cf:c0:47:b5:00:0e:39:6f:48:00:08:00 SRC= DST= LEN=64 TOS=0x00 PREC=0x00 TTL=32 ID=5671 DF PROTO=TCP SPT=3549 DPT=5555 WINDOW=65535 RES=0x00 SYN URGP=0

Notice the extra quote after “kernel: ” and before “firehol: ”. So, on my machine I configured the rsyslog filter like so:

:msg, startswith, "'firehol: " -/var/log/iptables.log
& ~

Configure iptables Log Rotation

Finally, since we’re logging to a new file, it’s useful to create a log rotation rule. Create a file /etc/logrotate.d/iptables with the following contents:

/var/log/iptables.log {
	daily
	rotate 7
	postrotate
		invoke-rc.d rsyslog rotate > /dev/null
	endscript
}

The preceding script tells logrotate to rotate the firewall log daily and keep logs from the past seven days.

Firefox 3 Native Form Widgets Look Terrible

Firefox 3 added native form widgets on Linux. Most of the time they look great, but on some sites including mine, they look awful. Here’s how I styled my forms to avoid native widgets.

Actually, the native widgets only look terrible if you’re on Linux with certain gtk+ themes and you’re viewing a web site with a dark background. Unfortunately, all of those conditions apply to me. See below:

Firefox 3 native widgets on Linux

As you can see, there’s an ugly white box around the otherwise rounded widgets. Originally, I thought my CSS was lacking, but after hours–literally–of googling I determined that Firefox’s implementation of gtk+ widgets is just shoddy. Of course, on a site with a white background, they look great. I’m reminded of that Henry Ford quote: “Any customer can have a car painted any colour that he wants as long as it is black.”

Applying any styling to the input elements will completely disable the native widgets. I ended up writing some basic CSS:

input, textarea { border: 2px solid #888888; }
input:focus, textarea:focus {
	border-color: #D9D27C;
	background-color: #FFFBC4;
}

They don’t look nearly as good as native widgets on a white background, but they look a whole lot better than native widgets on my site. See below:

Basic styled widgets

I also added some input focus bling to the form:

Text input with keyboard focus

Even though they look alright now, I’ve decided that my next major project on my blog should be a complete theme redesign. Traditional color schemes are easier to work with and I’ll also get familiar with a lot of WordPress PHP code.

Safely Removing External Drives in Linux

Simply unmounting a filesystem is not the ideal way to remove an external USB/firewire/SATA drive in Linux. This tutorial explains why and gives a solution.


About a year ago I bought an external SATA drive for backups. My normal usage consisted of:

  1. Power on and connect the drive
  2. mount /media/backup
  3. Run my backup script
  4. umount /media/backup
  5. Power off and unplug the drive

This seemed to work pretty well–at the very least, I wasn’t losing data–except the drive made a strange sound when I powered it off. It wasn’t a normal drive spin down sound; it was louder and shorter. So, I googled for authoritative instructions on using external drives with Linux. While most sources suggest doing exactly what I did, it’s not ideal.

It turns out that most cheap external USB/SATA/firewire enclosures don’t properly issue a stop command to the drive when you flick the power switch. Instead, the power switch simply cuts power to the drive, which forces the drive to do an emergency head retract. If you think that sounds bad, you’re right. Emergency retracts aren’t going to brick your drive immediately, but if they occur regularly they’re putting a lot of unnecessary wear and tear on the drive. In fact, some drives monitor how often this happens with S.M.A.R.T. attribute 192 (power-off retract count). (Check Wikipedia’s S.M.A.R.T. page for a comprehensive list of attributes.)
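If smartmontools is installed, you can check whether your drive tracks this counter. Here is a small sketch: retract_count is a hypothetical helper that reads the attribute table printed by smartctl -A and pulls out the raw value of attribute 192. It assumes the usual ten-column table layout and prints nothing if the drive doesn’t report the attribute.

```shell
#!/bin/sh
# Read `smartctl -A` output on stdin and print the raw value of attribute 192.
# Usage (device name is an example): sudo smartctl -A /dev/sdb | retract_count
retract_count() {
    # Column 1 is the attribute ID, column 10 is RAW_VALUE.
    awk '$1 == 192 { print $10 }'
}
```

Running it before and after a power-off shows whether flicking the switch bumps the count.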


The solution is to spin down the drive via software before turning it off and unplugging it. The best way to do this is with a utility called scsiadd. This program can add drives to and remove them from Linux’s SCSI subsystem. Additionally, with fairly modern kernels, removing a device will issue a stop command, which is exactly what we’re looking for. Run:

$ sudo scsiadd -p

which should print something like:

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: SAMSUNG HD300LJ  Rev: ZT10
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi4 Channel: 00 Id: 00 Lun: 00
  Vendor: LITE-ON  Model: DVDRW LH-20A1L   Rev: BL05
  Type:   CD-ROM                           ANSI  SCSI revision: 05
Host: scsi5 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: WDC WD10EACS-00Z Rev: 01.0
  Type:   Direct-Access                    ANSI  SCSI revision: 05

Identify the drive you want to remove and then issue:

$ sudo scsiadd -r host channel id lun

substituting the corresponding values from the scsiadd -p output. For example, if I wanted to remove “WDC WD10EACS-00Z”, I would run:

$ sudo scsiadd -r 5 0 0 0

If everything works, scsiadd should print:

Attached devices:
Host: scsi0 Channel: 00 Id: 00 Lun: 00
  Vendor: ATA      Model: SAMSUNG HD300LJ  Rev: ZT10
  Type:   Direct-Access                    ANSI  SCSI revision: 05
Host: scsi4 Channel: 00 Id: 00 Lun: 00
  Vendor: LITE-ON  Model: DVDRW LH-20A1L   Rev: BL05
  Type:   CD-ROM                           ANSI  SCSI revision: 05

You can double-check the end of dmesg. You should see:

[608188.235216] sd 5:0:0:0: [sdb] Synchronizing SCSI cache
[608188.235362] sd 5:0:0:0: [sdb] Stopping disk
[608188.794296] ata6.00: disabled

At this point, the drive is removed from Linux’s SCSI subsystem and it should not be spinning. It’s safe to unplug and turn off.

Using scsiadd directly can be inconvenient because it requires looking up the host, channel, id, and lun of the drive. I wrote a short script that will take a normal Linux device file like /dev/sdb, figure out the correct arguments to scsiadd, and run scsiadd -r. I use this script in my larger backup script.


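#!/bin/sh
# Remove a drive from the SCSI subsystem given its device file (e.g. /dev/sdb).

if [ $# -ne 1 ]; then
    echo "Usage: $0 <device>"
    exit 1
fi

if ! which lsscsi >/dev/null 2>&1; then
    echo "Error: lsscsi not installed"
    exit 1
fi

if ! which scsiadd >/dev/null 2>&1; then
    echo "Error: scsiadd not installed"
    exit 1
fi

# Find the lsscsi line for the device, e.g. "[5:0:0:0] disk ATA ... /dev/sdb"
device=$(lsscsi | grep -- "$1")
if [ -z "$device" ]; then
    echo "Error: could not find device: $1"
    exit 1
fi

# Pull host, channel, id and lun out of the leading "[h:c:i:l]" field
hcil=$(echo "$device" | awk \
    '{split(substr($0, 2, 7),a,":"); print a[1], a[2], a[3], a[4]}')

# $hcil is intentionally unquoted so it expands to four separate arguments
scsiadd -r $hcil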

It does require the lsscsi command to be present on the system.
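One caveat: the substr call in the script assumes the host, channel, id, and lun are each a single digit, and grep will also match any other line containing the device name. If that bites you, the lookup could be written along these lines. This is a sketch rather than what I run in production, and device_to_hcil is a name I made up for it:

```shell
#!/bin/sh
# Read `lsscsi` output on stdin and print "host channel id lun" for the line
# whose last column is exactly the given device file.
# Usage: scsiadd -r $(lsscsi | device_to_hcil /dev/sdb)
device_to_hcil() {
    awk -v dev="$1" '$NF == dev {
        # The first column looks like [5:0:0:0]; drop the brackets, then split.
        gsub(/\[|\]/, "", $1)
        split($1, a, ":")
        print a[1], a[2], a[3], a[4]
    }'
}
```

Because it splits on the colons instead of counting characters, a host number like 10 comes through intact.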

Monitoring Hard Drive Health on Linux with smartmontools

S.M.A.R.T. is a system in modern hard drives designed to report conditions that may indicate impending failure. smartmontools is a free software package that can monitor S.M.A.R.T. attributes and run hard drive self-tests. Although smartmontools runs on a number of platforms, I will only cover installing and configuring it on Linux.


Updated awstats for Debian

The awstats package in Debian is pretty outdated. Etch has version 6.5. Sid has 6.7. Version 6.9 was released on December 28, 2008. I’m a statistics junkie and the new version has better robot detection, so I built an updated package on my lenny/sid machine: awstats_69-1_all.deb. This package also works perfectly on my etch server without changes.

I did apply all the Debian patches that were still relevant so it should be the equivalent of the official Debian package.

Fixing uTorrent File Associations in Linux

Running uTorrent under wine works pretty well in Linux. One problem with installing under wine is that file associations are not set correctly. This means that double-clicking a .torrent file or opening a .torrent link in your browser won’t automatically start uTorrent. Here’s how to solve that.


Making KDE Play Nice on a GNOME Desktop

I prefer GNOME for the most part, but some KDE applications are simply better than their GNOME counterparts. That said, there are some problems getting them to work well together. I recently discovered that installing k3b caused my GNOME menus to, for lack of a better word, explode. Here’s how I went about fixing them.
