Initial situation

These are my needs:

  • After an IP address has repeatedly failed to log in (2 or 3 times), it must be kept from using the SSH service for a long time; I do not want the attacker to only have to wait a few minutes or hours before being able to use the service again.
  • The software therefore has to keep track of login failures using counters. Yet it cannot keep counters forever for every IP address that ever connects to the server, so there must be a time when a counter is discarded. I want counters to be kept for a long time, so that an attacker cannot simply slow down their attacks to avoid getting caught.
  • However, if an IP address successfully connects to the server, the counter for this IP address must be discarded. I often mistype a password (smartphone keyboard, complicated password…), and I do not want alternating good and bad passwords to get my IP address locked out of the server.
  • If the software restarts, the access restrictions must all be restored, even if the restart is caused by a reboot due to a power failure.
  • If at all possible, I would like to use the current restrictions from DenyHosts as a starting point for Fail2ban.

The default action in Fail2ban regarding SSH is to block access for a short time after an IP address has failed to authenticate a number of times in a short time; both of these durations can be changed. Besides, no means are provided to take into account the successful authentication attempts. So I have to customize Fail2ban.

Durations

Let me begin with the easiest part: I am going to raise the two aforementioned durations. In the context of this blog post, I will assume that I wish to search for authentication failures in the last 48 hours (172800 seconds), and have IP addresses be banned for 5 days (432000 seconds). Thus I create a first version of my /etc/fail2ban/jail.local configuration file, where all settings have precedence over the default values from /etc/fail2ban/jail.conf:

[DEFAULT]
ignoreip = 127.0.0.1/8 192.168.1.0/24
findtime = 172800
bantime = 432000

[ssh]
enabled = true
port = 22,443
maxretry = 3

When I try Fail2ban with these settings, I can see that it mostly works. However, my server sees enough attacks for me to notice immediately that Fail2ban parses the current log files when it starts, but not the files that were archived by logrotate. This first attempt at using Fail2ban shows that all access restrictions older than the last 24 hours are lost whenever the software restarts.

Persistent rules upon reboot

In order to have persistent restrictions, there must be some sort of database, which Fail2ban does not have before version 0.9. Firstly, Debian currently does not ship such a version; secondly, I am not sure that the way I take successful logins into account is compatible with the built-in database feature. Thus I will maintain my own database, actually a simple text file. Fail2ban does indeed allow such leeway in defining custom actions.

The principle is simple: each newly banned IP address gets appended to the file; when an IP address is allowed to connect to the server again, it is removed from the file. When Fail2ban starts, I enforce the ban of IP addresses in the file myself. Since Fail2ban has no knowledge of this process, I also plan for the timely release of IP addresses thus banned. Lastly, as this custom process somewhat overlaps the standard initialisation process, additional checks keep the commands from failing because of a double-banishment or a double-release.
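
The "append on ban" half of this principle can be sketched outside of Fail2ban with a plain shell function. This is only an illustration under stated assumptions: record_ban is a hypothetical helper, the file is a temporary one, and the addresses come from documentation ranges. The file format is the one used throughout this post: a Unix time-stamp, a space, then the IP address.

```shell
# record_ban FILE TIME IP (hypothetical helper)
# Append "timestamp ip" to FILE unless the address is already listed.
record_ban() {
    # Like a plain grep ' ip$', the dots of the address act as regex wildcards.
    if ! grep -q " $3\$" "$1"; then
        # Drop the fractional part of the time-stamp, if any.
        printf '%d %s\n' "${2%%.*}" "$3" >> "$1"
    fi
}

journal=$(mktemp)
record_ban "$journal" 1413706765.42 192.0.2.10
record_ban "$journal" 1413706800.00 192.0.2.10   # ignored: already listed
record_ban "$journal" 1413706900 198.51.100.7
cat "$journal"
```

Because an address is never written twice, the time-stamp kept in the file is always that of the first ban, which is what the release date is computed from.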

Now, that was the theory. In practice, I have to take into account one particular behaviour of Fail2ban that becomes a nuisance for my purpose: when Fail2ban stops cleanly (for server maintenance for example), Fail2ban unbans every IP address that was previously banned. These IP addresses ought to be banned again when Fail2ban restarts, but we already saw that this was not completely true for long-running banishments; so I have to keep this flush from happening. My need is for all banishments to remain in the file while their ban delays have not come to an end. Since there is no way to tell the difference between the end of a banishment caused by its delay coming to an end and the end of a banishment caused by Fail2ban stopping, I need to delete the lines in the file only when their delays have come to an end, not every time Fail2ban issues the request.

Therefore, with all the requirements in mind, I create a first draft of a custom action in a file named /etc/fail2ban/action.d/blacklist-iptables.local. To keep the post short, I do not copy over the standard comments (those present in every action file); I advise taking an existing file as a template and following its general layout. On the other hand, I add some comments, beginning with the “#” character, to explain a bit of what the scripts do; these comments should be removed from the real files, since I am not sure that they are valid in the file’s syntax (I could find no documentation for the syntax of action files!):

[Definition]

              # Standard configuration of the firewall.
actionstart = iptables -N f2b-blwl-<name>
              iptables -A f2b-blwl-<name> -j RETURN
              [ -e "/var/lib/fail2ban/iptables-<name>.bl" ] || touch /var/lib/fail2ban/iptables-<name>.bl
              iptables -I <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
              # Deletion of outdated lines.
              awk -vnow=`date +%%s` -vunban=<unban> '($1+unban+60)>now{print}' \
                /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
              mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl
              # Banishment of IP from the file, and plan for their release.
              while read TS IP; do
                # - banish:
                iptables -I f2b-blwl-<name> -s $IP -j DROP
                # - plan for release:
                UNBAN=`expr $TS + <unban>`
                echo "if iptables -n -L f2b-blwl-<name> | grep -q 'DROP.* $IP '; then
                        iptables -D f2b-blwl-<name> -s $IP -j DROP
                      fi" | at -t `date +%%Y%%m%%d%%H%%M.%%S -d "1970-01-01 00:00:00 UTC +$UNBAN second"`
              done < /var/lib/fail2ban/iptables-<name>.bl

             # Deletion of planned tasks.
actionstop = for atid in `at -l | cut -f1`; do
               at -c $atid | grep -qF ' f2b-blwl-<name> ' && at -r $atid
             done
             # Firewall reset.
             iptables -D <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
             iptables -F f2b-blwl-<name>
             iptables -X f2b-blwl-<name>

actioncheck = iptables -n -L <chain> | grep -q f2b-blwl-<name>

            # Banishment of an IP, unless it already is in the file.
actionban = if ! grep -q ' <ip>$' /var/lib/fail2ban/iptables-<name>.bl; then
              TS=`echo <time> | sed 's/\..*//'`
              printf '%%d %%s\n' "$TS" '<ip>' >> /var/lib/fail2ban/iptables-<name>.bl
              iptables -I f2b-blwl-<name> -s <ip> -j DROP
            fi

              # Release of an IP in the firewall, unless it is not banned to start with.
actionunban = if iptables -n -L f2b-blwl-<name> | grep -q 'DROP.* <ip> '; then
                iptables -D f2b-blwl-<name> -s <ip> -j DROP
              fi
              # Deletion of outdated lines in the file.
              awk -vnow=`date +%%s` -vunban=<unban> '($1+unban)>now{print}' \
                /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
              mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl

[Init]
name = default
port = ssh
protocol = tcp
chain = INPUT
unban = 300

The actionstart script configures the iptables firewall the same way as is done by Fail2ban by default; then lines that became outdated between the server’s stop and its start (plus a security delay of 60 seconds to avoid IP addresses getting freed, and the file changed, while the latter is still being read) are flushed from the /var/lib/fail2ban/iptables-….bl file; jobs are then planned through the atd service with the purpose of freeing IP addresses still in the file, when their banishment delays come to an end. Note that I choose not to alter the database file from the atd jobs even though I could do so (those tasks are only triggered by the end of a delay, never by Fail2ban stopping), or even should do so, maybe; I make this choice to reduce the risk of concurrent modifications to the file by two or more tasks; the rate of attacks on my server is high enough that the above scripts should make the file right again in one hour’s time :-p
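
The pruning filter and the computation of the release date can be tried on their own, outside of Fail2ban. The sketch below makes a few assumptions: prune and release_time are hypothetical helper names, the margin is passed as a parameter instead of being hard-coded to 60, GNU date is available, and its “@seconds” syntax is swapped in for the relative-date expression of the action file. Note that inside a Fail2ban action file every “%” must be doubled, which is not the case in a plain shell.

```shell
# prune FILE DELAY MARGIN NOW (hypothetical helper)
# Print only the lines whose ban, plus a safety margin, is still running at NOW.
prune() {
    awk -v now="$4" -v unban="$2" -v margin="$3" \
        '($1 + unban + margin) > now { print }' "$1"
}

# release_time TIMESTAMP DELAY (hypothetical helper)
# End of the ban in the format expected by "at -t". at reads this as local
# time; TZ=UTC is forced here only to make the demonstration reproducible.
release_time() {
    TZ=UTC date -d "@$(($1 + $2))" +%Y%m%d%H%M.%S
}

bl=$(mktemp)
printf '%s\n' '1000 192.0.2.1' '1500 192.0.2.2' > "$bl"

prune "$bl" 300 60 1400     # the first ban ended at 1300 and is dropped
release_time 1500 300       # the second ban ends 1800 seconds after the epoch
```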

The actionstop script deletes the jobs responsible for cancelling the banishments set from the database file, and then resets the firewall to its initial state, the same way as Fail2ban does.

The actioncheck script checks that the firewall is correctly configured before allowing the actionban script to run. It is a pity that the Fail2ban variable <ip> is not available to this script: it would have avoided a number of if tests, and would consequently have made Fail2ban logs more relevant (I will return to this later).

The actionban script bans an IP received through the Fail2ban variable <ip>. The actionunban script frees a previously banned IP address in the firewall, and also in the file if the banishment delay has come to an end (which is always the case except when Fail2ban is stopping).

Notification by e-mail

Stemming from the fact that most action scripts are enclosed in if tests, and that Fail2ban has no knowledge of what is done at its start (direct banishments based on the file, then release of these through atd jobs), Fail2ban logs are bound to not quite reflect reality. To compensate for this, I add an action of notification by e-mail, that I write so that the generated e-mails look like those from DenyHosts; in this way, I can keep using the sorting rules already set in Thunderbird, and I can parse and analyse these e-mails like I have done so far.

Unfortunately, if I declare two action files in the main configuration file /etc/fail2ban/jail.local, the second action file is triggered even if the first one returns with an error status (I tried both return 1 and exit 1). As a consequence, I have to manage the sending of e-mails in the same action file as before: /etc/fail2ban/action.d/blacklist-iptables.local; here are this file’s new contents (the changes are the mail-sending printf command at the end of actionban, and the dest and sender parameters in [Init]):

[Definition]

actionstart = iptables -N f2b-blwl-<name>
              iptables -A f2b-blwl-<name> -j RETURN
              [ -e "/var/lib/fail2ban/iptables-<name>.bl" ] || touch /var/lib/fail2ban/iptables-<name>.bl
              iptables -I <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
              awk -vnow=`date +%%s` -vunban=<unban> '($1+unban+60)>now{print}' \
                /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
              mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl
              while read TS IP; do
                iptables -I f2b-blwl-<name> -s $IP -j DROP
                UNBAN=`expr $TS + <unban>`
                echo "if iptables -n -L f2b-blwl-<name> | grep -q 'DROP.* $IP '; then
                        iptables -D f2b-blwl-<name> -s $IP -j DROP
                      fi" | at -t `date +%%Y%%m%%d%%H%%M.%%S -d "1970-01-01 00:00:00 UTC +$UNBAN second"`
              done < /var/lib/fail2ban/iptables-<name>.bl

actionstop = for atid in `at -l | cut -f1`; do
               at -c $atid | grep -qF ' f2b-blwl-<name> ' && at -r $atid
             done
             iptables -D <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
             iptables -F f2b-blwl-<name>
             iptables -X f2b-blwl-<name>

actioncheck = iptables -n -L <chain> | grep -q f2b-blwl-<name>

actionban = if ! grep -q ' <ip>$' /var/lib/fail2ban/iptables-<name>.bl; then
              TS=`echo <time> | sed 's/\..*//'`
              printf '%%d %%s\n' "$TS" '<ip>' >> /var/lib/fail2ban/iptables-<name>.bl
              iptables -I f2b-blwl-<name> -s <ip> -j DROP
              printf %%b "Subject: DenyHosts Report
              Date: `env LANG=C date +"%%a, %%d %%h %%Y %%T %%z"`
              From: DenyHosts <<sender>>
              To: <dest>\n
              Added the following hosts to /var/lib/fail2ban/iptables-<name>.bl:\n
              <ip> (unknown)\n
              ----------------------------------------------------------------------" | /usr/sbin/sendmail -f <sender> <dest>
            fi

actionunban = if iptables -n -L f2b-blwl-<name> | grep -q 'DROP.* <ip> '; then
                iptables -D f2b-blwl-<name> -s <ip> -j DROP
              fi
              awk -vnow=`date +%%s` -vunban=<unban> '($1+unban)>now{print}' \
                /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
              mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl

[Init]
name = default
port = ssh
protocol = tcp
chain = INPUT
unban = 300
dest = root@localhost
sender = nobody@localhost

Finally, I change the main configuration (/etc/fail2ban/jail.local) file to use this newly defined action:

[DEFAULT]
ignoreip = 127.0.0.1/8 192.168.1.0/24
findtime = 172800
bantime = 432000
action = blacklist-iptables[name=%(__name__)s, port="%(port)s", protocol="%(protocol)s", chain="%(chain)s", unban="%(bantime)s"]

[ssh]
enabled = true
port = 22,443
maxretry = 3

I choose to put this action under [DEFAULT] because I will use this same action later when I use Fail2ban to also protect the web server. In Fail2ban’s syntax, the action line calls the custom action file that was defined before, and provides it with parameters between square brackets. Note that many parameters are defined using existing variables, either declared just above in the same file, or declared by Debian in the /etc/fail2ban/jail.conf file.

With this configuration, things are already much better: nothing gets lost upon restart, even in the case of a power failure, since the state of active banishments is always saved to the file, as it would to a journal, and gets restored on start. Besides, since the initial state is read from a file, it becomes trivial to get this state from DenyHosts; the following command, run before running Fail2ban for the first time, is enough to initialize Fail2ban:

grep '^# DenyHosts' /etc/hosts.deny | sed 's/[a-z]:/|/g' | while IFS='|' read x d y i; do printf '%d %s\n' $(date +%s -d "$d") $i; done >/var/lib/fail2ban/iptables-ssh.bl

This command extracts from the /etc/hosts.deny file the lines that DenyHosts inserted there (for example: « # DenyHosts: Sun Oct 19 08:19:25 2014 | sshd: 222.186.56.49 ») and replaces each “:” character that appears just after a letter, together with that letter, with a “|” character, in order to produce 4-field lines. The 1st and 3rd fields hold no interest; the 2nd and 4th fields are respectively the banishment date and time of an IP address, and this IP address. This text then gets read line by line, and formatted into my database file’s format: first the Unix time-stamp, then a space, and finally the IP address.
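
The field splitting can be checked on the sample line quoted above without touching any system file. In this sketch, parse_denyhosts is a hypothetical helper; it stops before the call to date, so its output does not depend on the local time zone, and it prints the two useful fields separated by a “|”:

```shell
# parse_denyhosts (hypothetical helper): read DenyHosts comment lines on
# standard input and print "date|ip" for each of them.
parse_denyhosts() {
    sed 's/[a-z]:/|/g' | while IFS='|' read -r x d y i; do
        # The unquoted expansions trim the blanks around both fields.
        printf '%s|%s\n' "$(echo $d)" "$(echo $i)"
    done
}

line='# DenyHosts: Sun Oct 19 08:19:25 2014 | sshd: 222.186.56.49'
printf '%s\n' "$line" | parse_denyhosts
```

The colons inside the time of day are left alone because they follow digits, not letters.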

From my initially stated needs, the only one still not satisfied at this stage is to forgive an IP address its past failures as soon as it issues a correct authentication; in other words, reset an IP address’ counter.

Forgiveness

This part is really more complicated: Fail2ban offers no means to reset a counter. So I follow the given advice: manage my own list of authorized IP addresses, just like I managed my own list of forbidden IP addresses. But how long should a safe-conduct be granted? After I have successfully logged in, the counter for my IP address will not actually get reset since Fail2ban is not able to do that… Considering the lifetime of a counter, given by the findtime variable, I choose to use this same variable for the reprieve.

But imagine this scenario: I successfully log in and my IP address is granted a grace time of findtime seconds; during this time, I sometimes fail to log in but it does not matter because I am protected (the counter keeps going up, though); just before the end of the protection time, I log in successfully again but Fail2ban does not react because I am already protected; a few seconds later, just after my protection has gone away, I fail to log in again and Fail2ban bans my IP address, because I am now one failure above the limit and my IP address is not protected anymore. Conclusion: even though the protection must actually last findtime seconds, Fail2ban must believe that this protection is for a few seconds only, so that the protection can be refreshed at each new successful login.
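
The scenario can be replayed with a small sketch of the protection test. Everything here is illustrative: is_protected is a hypothetical helper, the whitelist file is a temporary one, and the time-stamps are arbitrary.

```shell
# is_protected FILE IP NOW WINDOW (hypothetical helper)
# Succeeds if IP has an entry in FILE that is less than WINDOW seconds old at NOW.
is_protected() {
    awk -v ip="$2" -v now="$3" -v window="$4" \
        '$2 == ip && ($1 + window) > now { found = 1 } END { exit !found }' "$1"
}

wl=$(mktemp)
findtime=172800
echo '1000 192.0.2.10' > "$wl"     # first successful login at t=1000

is_protected "$wl" 192.0.2.10 $((1000 + findtime - 10)) "$findtime" \
    && echo "still protected"

# A second successful login just before expiry replaces the entry...
echo "$((1000 + findtime - 5)) 192.0.2.10" > "$wl"

# ...so a failure just after the original window is still forgiven.
is_protected "$wl" 192.0.2.10 $((1000 + findtime + 5)) "$findtime" \
    && echo "protection refreshed"
```

The second test only succeeds because the second login was recorded, which is exactly what a long protection delay on the Fail2ban side would have prevented.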

Concurrent accesses

Here is an interlude, dedicated to concurrent accesses… While testing the solution, I encountered several instances of iptables failing to execute, because of simultaneous runs; this is because many failed connections occur on my server, some of which are very close to each other [1].

In order to avoid this error, all the while avoiding any concurrent access to my small database, I put an exclusive lock on the usage of iptables. For this method to work, the same lock must be used for all uses of an iptables command, which means that I also had to alter Debian Wheezy’s init script for iptables (it is attached for reference, as well as the corresponding patch file).

Note that this locking mechanism around iptables uses is not specific to my solution, as it stems from a limitation of the Linux kernel. The same error may happen with the standard iptables module from Fail2ban, and the same fix applies.

Newer versions of iptables recognize and fix this problem, with the introduction of the xtables lock. Debian Jessie has this newer iptables version, which accepts the -w parameter in order to wait for the xtables lock to be available. Users of such a version of iptables should always use this -w parameter, firstly because it is a safety net in case some other tool issues iptables commands, secondly because this way it becomes unnecessary to meddle with the distribution’s files, such as iptables’s initialisation script. Still, I keep the lock I described a few paragraphs above because it also protects my database-file.
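
The idea of serializing commands behind one shared lock can be illustrated with a small wrapper. This is not the mechanism of the action files: as a stand-in that needs no extra package, the sketch relies on mkdir, which either creates the directory or fails, atomically. The with_lock name and the lock path are hypothetical.

```shell
# with_lock LOCKDIR COMMAND [ARGS...] (hypothetical helper)
# Run COMMAND while holding an exclusive lock, then report its exit status.
with_lock() {
    lockdir=$1
    shift
    # Only one process at a time can create the directory; the others wait.
    until mkdir "$lockdir" 2>/dev/null; do
        sleep 1
    done
    status=0
    "$@" || status=$?
    rmdir "$lockdir"
    return "$status"
}

lock=${TMPDIR:-/tmp}/f2b-demo.$$.lock
with_lock "$lock" echo "running alone"
```

Saving the exit status before releasing the lock lets the caller still see whether the protected command succeeded.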

The final solution

Now that the interlude is past, let me describe the final solution. The corresponding files are attached to the article.

The first step is to create a custom filter that spots successful logins; a quick glance through my /var/log/auth.log file lets me recognize those. Here are the contents of the new filter file (/etc/fail2ban/filter.d/whitelist-ssh.local), once again without the standard comments:

[INCLUDES]

before = common.conf

[Definition]

_daemon = sshd

failregex = ^%(__prefix_line)sAccepted \S+ for \S+ from <HOST>\s.*$

ignoreregex =
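
To get a feel for what this filter matches, here is a rough shell approximation. It is only a sketch: the log lines are made up, the sed expression handles IPv4 addresses only, and it replaces Fail2ban's <HOST> and __prefix_line with much simpler patterns. extract_ok_host is a hypothetical helper.

```shell
# extract_ok_host (hypothetical helper): print the source address of each
# successful sshd login read on standard input.
extract_ok_host() {
    sed -n -E 's/^.*sshd\[[0-9]+\]: Accepted [^ ]+ for [^ ]+ from ([0-9.]+) .*$/\1/p'
}

ok='Oct 19 08:19:25 myhost sshd[1234]: Accepted password for alice from 192.0.2.7 port 51234 ssh2'
ko='Oct 19 08:20:01 myhost sshd[1240]: Failed password for root from 203.0.113.5 port 40022 ssh2'

printf '%s\n' "$ok" "$ko" | extract_ok_host
```

Only the first line yields an address. For the real filter, the fail2ban-regex tool shipped with Fail2ban can test a filter file against a log file.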

The second step is to write a new action, in a file named /etc/fail2ban/action.d/whitelist-iptables.local, the contents of which follow (again without the usual comments):

[Definition]

actionstart =

actionstop =

actioncheck =

actionban = if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
              if grep -q ' <ip>$' /var/lib/fail2ban/iptables-<name>.wl; then
                sed -i.old '/ <ip>$/d' /var/lib/fail2ban/iptables-<name>.wl
              elif iptables -w -n -L f2b-blwl-<name> | grep -q 'DROP.* <ip> '; then
                iptables -w -D f2b-blwl-<name> -s <ip> -j DROP
                sed -i.old '/ <ip>$/d' /var/lib/fail2ban/iptables-<name>.bl
              fi
              TS=`echo <time> | sed 's/\..*//'`
              printf '%%d %%s\n' "$TS" '<ip>' >> /var/lib/fail2ban/iptables-<name>.wl
              lockfile-remove --lock-name /tmp/f2b-blwl.lock
            fi

actionunban =

[Init]
name = default

In this action file, I do something whenever Fail2ban notifies me of an IP address (here, actionban receives addresses that made a successful login), but I do nothing when Fail2ban notifies me of the end of the banishment (here: protection) delay. Indeed, I will put a fake, short delay in the Fail2ban configuration. This raises the question of when the database of authorized IP addresses should be up-to-date. The answer is: when we need to know whether an IP address about to be banned will really be banned or not, which happens in the actionban script of the /etc/fail2ban/action.d/blacklist-iptables.local action file. So I change this file yet again, to take into account the real protection delay, and to test IP addresses against this other database file. The changes are the exclusive lock around every script, the -w option of iptables, the whitelist file created in actionstart, the pruning and lookup of that file in actionban, and the whitelist parameter in [Init]:

[Definition]

actionstart = if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
                iptables -w -N f2b-blwl-<name>
                iptables -w -A f2b-blwl-<name> -j RETURN
                iptables -w -I <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
                [ -e "/var/lib/fail2ban/iptables-<name>.bl" ] || touch /var/lib/fail2ban/iptables-<name>.bl
                [ -e "/var/lib/fail2ban/iptables-<name>.wl" ] || touch /var/lib/fail2ban/iptables-<name>.wl
                awk -vnow=`date +%%s` -vunban=<unban> '($1+unban+60)>now{print}' \
                  /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
                mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl
                while read TS IP; do
                  iptables -w -I f2b-blwl-<name> -s $IP -j DROP
                  UNBAN=`expr $TS + <unban>`
                  echo "if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
                          if iptables -w -n -L f2b-blwl-<name> | grep -q 'DROP.* $IP '; then
                            iptables -w -D f2b-blwl-<name> -s $IP -j DROP
                          fi
                          lockfile-remove --lock-name /tmp/f2b-blwl.lock
                        fi" | at -t `date +%%Y%%m%%d%%H%%M.%%S -d "1970-01-01 00:00:00 UTC +$UNBAN second"`
                done < /var/lib/fail2ban/iptables-<name>.bl
                lockfile-remove --lock-name /tmp/f2b-blwl.lock
              fi

actionstop = for atid in `at -l | cut -f1`; do
               at -c $atid | grep -qF ' f2b-blwl-<name> ' && at -r $atid
             done
             if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
               iptables -w -D <chain> -p <protocol> -m multiport --dports <port> -j f2b-blwl-<name>
               iptables -w -F f2b-blwl-<name>
               iptables -w -X f2b-blwl-<name>
               lockfile-remove --lock-name /tmp/f2b-blwl.lock
             fi

actioncheck = if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
                iptables -w -n -L <chain> | grep -q f2b-blwl-<name>
                lockfile-remove --lock-name /tmp/f2b-blwl.lock
              fi

actionban = if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
              awk -vnow=`date +%%s` -vunban=<whitelist> '($1+unban)>now{print}' \
                /var/lib/fail2ban/iptables-<name>.wl > /var/lib/fail2ban/iptables-<name>.wl.new
              mv -f /var/lib/fail2ban/iptables-<name>.wl.new /var/lib/fail2ban/iptables-<name>.wl
              if ! grep -q ' <ip>$' /var/lib/fail2ban/iptables-<name>.wl /var/lib/fail2ban/iptables-<name>.bl; then
                TS=`echo <time> | sed 's/\..*//'`
                printf '%%d %%s\n' "$TS" '<ip>' >> /var/lib/fail2ban/iptables-<name>.bl
                iptables -w -I f2b-blwl-<name> -s <ip> -j DROP
                printf %%b "Subject: DenyHosts Report
                Date: `env LANG=C date +"%%a, %%d %%h %%Y %%T %%z"`
                From: DenyHosts <<sender>>
                To: <dest>\n
                Added the following hosts to /var/lib/fail2ban/iptables-<name>.bl:\n
                <ip> (unknown)\n
                ----------------------------------------------------------------------" | /usr/sbin/sendmail -f <sender> <dest>
              fi
              lockfile-remove --lock-name /tmp/f2b-blwl.lock
            fi

actionunban = if lockfile-create --lock-name /tmp/f2b-blwl.lock; then
                if iptables -w -n -L f2b-blwl-<name> | grep -q 'DROP.* <ip> '; then
                  iptables -w -D f2b-blwl-<name> -s <ip> -j DROP
                fi
                awk -vnow=`date +%%s` -vunban=<unban> '($1+unban)>now{print}' \
                  /var/lib/fail2ban/iptables-<name>.bl > /var/lib/fail2ban/iptables-<name>.bl.new
                mv -f /var/lib/fail2ban/iptables-<name>.bl.new /var/lib/fail2ban/iptables-<name>.bl
                lockfile-remove --lock-name /tmp/f2b-blwl.lock
              fi

[Init]
name = default
port = ssh
protocol = tcp
chain = INPUT
unban = 300
dest = root@localhost
sender = nobody@localhost
whitelist = 300

The final step happens in the main configuration file /etc/fail2ban/jail.local, where recent enhancements must be integrated:

[DEFAULT]
ignoreip = 127.0.0.1/8 192.168.1.0/24
findtime = 172800
bantime = 432000
action = blacklist-iptables[name=%(__name__)s, port="%(port)s", protocol="%(protocol)s", chain="%(chain)s", unban="%(bantime)s", whitelist="%(findtime)s"]

[ssh]
enabled = true
port = 22,443
maxretry = 3

[ssh-ok]
enabled = true
maxretry = 1
bantime = 1
logpath = /var/log/auth.log
filter = whitelist-ssh
action = whitelist-iptables[name="ssh"]

In the new “jail” above, named “[ssh-ok]”, the name parameter given to whitelist-iptables is the name of the “jail” whose blacklist-iptables action is expected to work alongside it (here, “ssh”), so that both actions use the same files and the same firewall chain. Similarly, I could define a “jail” named “[html]”, and another “jail” named “[html-ok]”, the name parameter of which would have the value “html”.

Done. This configuration is without a doubt unusual, but it is the one I need.


Notes:

[1] In fact, I don’t have a single fraudulent access to feed to Fail2ban anymore. But once a problem is found, it would be a shame not to fix it anyway ;-)

Changelog:

  • 2015-01-13 — Exclusive lock on iptables, to avoid errors regarding concurrent accesses to the firewall; this also gets issues regarding concurrent accesses to the “database” out of the way.
  • 2015-03-30 — A “; then” was missing in the actioncheck code of the /etc/fail2ban/action.d/blacklist-iptables.local file; also in the downloadable file. My apologies to anyone who may have been stuck because of this.
  • 2015-06-03 — Debian Jessie brought a new iptables version with built-in locking. I also fixed some small mistakes.