A few months ago, I configured a Mageia 2.0 box with a static, public IP address. I was not sure of its purpose except perhaps as a way to access large files (pictures and videos of friends and family) that I did not want to keep on my domain web host machine (for space reasons). So the base install on this system consisted of an SSH and an HTTP server. Incidentally, this machine is behind a firewall appliance, so I did not configure any additional security beyond making sure that the machine's own firewall was running too. Only ports 22 (SSH) and 80 (WWW) were allowed by the machine's firewall.
Fast forward two months. I was debugging a PHP script when I noticed in the system logs that there were attempts to access web pages that did not exist. I then decided to check the SSH logs and I noticed failed login attempts as well. So I decided I would take a closer look at some later point. Well, today I got that chance and what follows is an analysis of the logs of the SSH service. This is not a very detailed analysis. Just something to satisfy my curiosity.
Background for the analysis
We begin with the failed attempts to log into my machine remotely over SSH. We will look at three things:
- The usernames that are commonly attacked
- The number of attempts made for each username
- Identity of the attackers (i.e. their IP addresses) and their persistence.
# grep "Failed password" /var/log/auth.log* > ~ak/ssh-failed-password-logins.txt
Since the observations should be presented in the context of the time period during which these attempts were made, we checked the logs to see the date and time of the first failed attempt and the last failed attempt.
[ak@bobcat ~]$ cut -d':' -f2 ssh-failed-password-logins.txt | cut -d' ' -f1,2 | sort -k1M -k2n | uniq
The results showed that the period under consideration is about 6 weeks. The first failed attempt in this dataset occurred on 15 Sep 2013 and the latest was today, i.e. 23 Oct 2013. Incidentally, these attempts were made on only 19 separate days (out of 39), i.e. on approximately 50% of the days. As I mentioned before, the targeted machine is behind a firewall controlled by another entity, so it is possible that more attempts were rebuffed by the firewall. This 50% is therefore a lower bound on the share of days on which attempts were made: it counts only the days on which an attempt actually reached the machine and was logged.
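The day count and the percentage can be reproduced with a couple of small shell functions. This is only a sketch: the function names distinct_days and percent_of are mine, the input is assumed to carry the "filename:" prefix that grep adds when it searches several files, and the 39-day window is taken from the dates quoted in the text rather than computed.

```shell
#!/bin/sh
# distinct_days: read grep output such as
#   /var/log/auth.log:Sep 15 10:01:02 host sshd[1]: Failed password for ...
# on stdin and print how many different month/day pairs occur.
# Assumption: every line starts with a "filename:" prefix, and the log
# covers less than a year (syslog timestamps here carry no year).
distinct_days() {
    sed 's/^[^:]*://' | awk '{ print $1, $2 }' | sort -u | wc -l | tr -d ' '
}

# percent_of PART WHOLE: print PART as a percentage of WHOLE,
# rounded to the nearest integer.
percent_of() {
    awk -v p="$1" -v w="$2" 'BEGIN { printf "%.0f\n", 100 * p / w }'
}
```

Running distinct_days < ssh-failed-password-logins.txt should give the number of days with at least one logged attempt, and percent_of 19 39 prints 49, which is the "approximately 50%" quoted above.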
Analysis of the logs
We begin with our first assessment: the usernames that are commonly attacked. Consider the output below.
[ak@bobcat ~]$ cut -d' ' -f9 ssh-failed-password-logins.txt | sort | uniq -c
      2 apache
     98 bin
      1 daemon
    990 invalid
      2 mysql
      1 news
      1 openvpn
      1 operator
   2156 root
Straightaway we observe that almost two-thirds of the attempts (2156 of 3252) were made for the username root. The second largest group is invalid, with 990 entries, and bin comes a distant third with 98. It turned out that when a user does not exist on the system, the log writes invalid user in place of the username and then lists the attempted username, so invalid is a placeholder rather than an account name. I show the results of these invalid username attempts next. Interestingly, there were two attempts as apache. Is that a coincidence, or did the person or script behind the attempts notice that I have a web server running on my system? Another interesting point is that the attempts assume a Unix/BSD system and not a Windows machine. Now what about those invalid usernames?
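Because sshd inserts the words invalid user before an unknown username, a fixed field number picks up different things on different lines. Fixed fields with cut -d' ' also shift if the log pads single-digit days with an extra space (as in "Oct  3"). A minimal sketch that avoids both problems looks for the word "for" instead. The function name tally_users and the trailing "*" marker are my own conventions, not anything sshd produces.

```shell
#!/bin/sh
# tally_users: read "Failed password" lines on stdin and print
# "count username" pairs, sorted by increasing count.
# Usernames that sshd reported as "invalid user NAME" are printed as "NAME*".
tally_users() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "for") {
                # "for invalid user NAME from ..." versus "for NAME from ..."
                if ($(i+1) == "invalid" && $(i+2) == "user") { name = $(i+3) "*" }
                else { name = $(i+1) }
                count[name]++
                break
            }
        }
    }
    END { for (n in count) print count[n], n }' | sort -k1,1n -k2,2
}
```

With this, tally_users < ssh-failed-password-logins.txt would produce a single list covering both existing and non-existing accounts, with the non-existing ones marked.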
Since the attacker does not know which usernames exist on the system, they naturally try the most common ones. We already saw that root and bin were the most common, and both exist on a default Linux installation. Since my Linux box is not really used for any practical purpose, it has just one user account, mine. This is probably also true for most Linux installs that face the public Internet directly (to limit the attack surface). However, many system administrators are rather uninformed when it comes to securing systems. Therefore, an attacker with no inside knowledge will likely try the most commonly used usernames. What are they? Consider the output below.
[ak@bobcat ~]$ grep invalid ssh-failed-password-logins.txt| cut -d' ' -f11 | sort | uniq -c | sort -nk1
...
      4 postgres
      5 administrator
      5 adrian
      5 ivan
      5 shoutcast
      5 support
      6 backup
      6 hadoop
      6 user0
      6 zimbra
      8 tomcat
      8 user
      9 guest
      9 minecraft
      9 webmaster
     10 admin
     17 test
     26 gateway
     32 nagios
     37 www
     38 userftp
     41 oracle
     58 deploy
     77 ftptest
In this list I have shown only the most frequently attempted usernames, with the number of attempts for each, ranked in increasing order of frequency. None of these users exist on my machine, so they were logged as invalid user. This list is probably culled from the /etc/passwd dumps of public machines. Nevertheless, it is surprising that a user called ftptest is expected to exist when my system clearly does not have FTP installed or enabled. Is this a shot in the dark? Or merely a less intelligent script? It is not clear why usernames such as deploy and test would exist. The user minecraft is a complete surprise to me. There were attempts on some other product-based usernames as well, such as D-Link, asterisk, plesk, centos, honda, mysql and huawei, among others. I did not list them above. Since this list has a long tail, I suspect that the attackers tried most of these usernames with their default passwords and gave up soon after. I also suspect that determined attackers are quite unlikely to follow this strategy. They must use a more focused approach rather than this spray-and-hope-it-sticks approach.
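One way to put a number on the long tail is to count how many usernames were tried only a handful of times. The sketch below reads "count name" lines, which is the shape of the uniq -c output above. The function name tail_summary and the threshold argument are hypothetical choices of mine.

```shell
#!/bin/sh
# tail_summary THRESHOLD: read "count name" lines on stdin (for example
# the output of "uniq -c") and report how many names there are in total
# and how many of them were tried fewer than THRESHOLD times.
tail_summary() {
    awk -v t="$1" '
        { total++; if ($1 < t) rare++ }
        END { printf "%d names, %d tried fewer than %d times\n", total, rare, t }'
}
```

For instance, piping the full sorted listing into tail_summary 5 would show what fraction of the attempted usernames never got more than four tries.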
The third item on my list was to understand who was trying to access my machine and with what level of persistence. I isolated the IP addresses from the logs and ranked them by the number of times each of these IP addresses attempted a login. Here's what got listed.
[ak@bobcat ~]$ grep -v invalid ssh-failed-password-logins.txt | cut -d' ' -f11 | sort | uniq -c | sort -nk1
...
      3 126.96.36.199
      3 188.8.131.52
      3 184.108.40.206
      3 220.127.116.11
      5 18.104.22.168
      6 22.214.171.124
      6 126.96.36.199
      6 188.8.131.52
      7 184.108.40.206
      8 220.127.116.11
      9 18.104.22.168
     10 22.214.171.124
     12 126.96.36.199
     12 188.8.131.52
     12 184.108.40.206
     13 220.127.116.11
     13 18.104.22.168
     13 22.214.171.124
     13 126.96.36.199
     16 188.8.131.52
     18 184.108.40.206
     18 220.127.116.11
     18 18.104.22.168
     22 22.214.171.124
     28 126.96.36.199
     28 188.8.131.52
     28 184.108.40.206
     35 220.127.116.11
     36 18.104.22.168
     39 22.214.171.124
     39 126.96.36.199
     44 188.8.131.52
     50 184.108.40.206
     50 220.127.116.11
     51 18.104.22.168
     52 22.214.171.124
     56 126.96.36.199
     60 188.8.131.52
     68 184.108.40.206
     71 220.127.116.11
     77 18.104.22.168
     90 22.214.171.124
    141 126.96.36.199
    221 188.8.131.52
    321 184.108.40.206
    397 220.127.116.11
As before, the first column lists the number of attempts made from a particular IP address and the second column lists the IP address. I assume that there is a one-to-one mapping between an IP address and a physical machine. Observe that although only a small number of machines on the Internet appear to be attacking my machine, the top three attackers are relentless. Clearly, these persistent machines are being used to try out a large number of potential usernames (and potential passwords for each of those usernames).
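Note that the command above uses grep -v invalid, so it counts only the attempts made against accounts that exist on the machine. A sketch that counts source addresses across all failed attempts, whatever the username, can key on the "from ADDRESS port" pattern instead of a field number. The function name tally_sources is my own.

```shell
#!/bin/sh
# tally_sources: read "Failed password" lines on stdin and print
# "count address" pairs, sorted by increasing count.
# The address is taken as the word between "from" and "port", so the
# result does not depend on whether the line says "invalid user".
tally_sources() {
    awk '{
        for (i = 1; i < NF; i++) {
            if ($i == "from" && $(i+2) == "port") {
                count[$(i+1)]++
                break
            }
        }
    }
    END { for (ip in count) print count[ip], ip }' | sort -k1,1n -k2,2
}
```

Running tally_sources < ssh-failed-password-logins.txt | tail -n 3 would then show the three most persistent addresses over the whole dataset.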
I did not attempt to understand where these machines are located but perhaps I will do that another time. I think it is quite likely that these machines are controlled by some bot network and the owner of the machine is unaware of what her/his machine is doing. That begs investigation as well.
Next week, I will present a similar type of analysis for the attacks on the web server, namely the non-existent web pages being accessed and the strange URLs being requested from the web server.