I admit, I’ve been experimenting more with awk lately. My opinion has always been, “If it’s not simple enough for #!/bin/bash, I’d rather use Python/Perl/Ruby.” I figured I’d simplify my life by having one fewer flavor of syntax and regexps to worry about.

What a silly idea! While Python may be great for “enterprise-class”[1] log analysis, nothing beats awk for one-liners. Take a few examples off the top of my head…

Who’s trying to hack me?

$ zcat /var/log/auth.log.*.gz  | awk '$6 == "Invalid" { print $8 }' | sort | uniq -c | sort -n -r | head -n 30
     79 admin
     71 test
     52 user
     43 michael
     40 alex
     39 guest
     32 oracle
     30 www
     30 dave
     28 info
     26 sales
     25 web
     25 ben
     23 victoria
     23 paul
     23 httpd
     23 adam
     22 john
     21 shop
     21 mike
     21 ftp
     21 david
     21 caroline
     21 amanda
     20 toor
     20 server
     20 samba
     20 linux
     20 danny
     20 claire

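Most interesting… root doesn’t show up, but that says more about my filter than about the attackers: sshd normally logs “Invalid user” only for accounts that don’t exist, so attempts against a real account like root land on “Failed password” lines instead and never reach this count. Apparently someone’s used toor before, though. I also see a mix of common first names and well-known Linux service names (httpd, ftp, etc.). My question is… are there that many sysadmins named caroline?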
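
For anyone wondering where $6, $8 and $10 come from, here is a minimal sketch run against a few made-up lines in the classic syslog layout. The function names (sample_log, count_invalid_users), the host name, the PIDs and the addresses (taken from the ranges reserved for documentation) are all placeholders of mine, not real log data. It also does the counting inside awk with an associative array, as an alternative to piping through sort | uniq -c.

```shell
# Hypothetical sample lines; nothing here comes from a real auth.log.
sample_log() {
cat <<'EOF'
Mar  1 12:00:01 myhost sshd[1001]: Invalid user admin from 192.0.2.10
Mar  1 12:00:04 myhost sshd[1002]: Invalid user test from 192.0.2.10
Mar  1 12:00:09 myhost sshd[1003]: Invalid user admin from 198.51.100.7
Mar  1 12:00:12 myhost sshd[1004]: Failed password for root from 192.0.2.10 port 4242 ssh2
EOF
}

# awk splits on whitespace: $6 = "Invalid", $8 = user name, $10 = source address.
# The last sample line has $6 = "Failed", so the filter skips it.
# count[] is an associative array keyed by user name.
count_invalid_users() {
    awk '$6 == "Invalid" { count[$8]++ }
         END { for (user in count) print count[user], user }' |
        sort -k1,1nr -k2,2
}

sample_log | count_invalid_users
```

On the sample this prints "2 admin" followed by "1 test". The trailing sort is still needed because awk makes no promise about the order in which "for (user in count)" visits the keys.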

Where are the Bastards Coming From?

$ zcat /var/log/auth.log.*.gz | awk '$6 == "Invalid" { print $10 }' | sort | uniq -c | sort -n -r
   5170 80.237.205.72
   1243 212.112.227.139
   1040 216.190.237.68
    336 193.137.179.181
    220 200.168.28.21
    132 222.128.249.253
     94 196.200.90.99
     64 200.105.16.242
     60 211.104.85.236
     44 61.192.163.188
     13 200.11.76.170
      6 210.100.157.9
      6 124.135.192.2
      5 222.69.93.27
      5 222.189.238.179
      3 70.97.158.195
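
A per-address total says nothing about when the attempts happened. As a rough sketch, keying the count on date plus address shows whether a noisy source was active on one day or on many. The sample lines and the names sample_two_days and attempts_per_day are hypothetical, and the syslog timestamp carries no year, so this assumes the logs cover less than twelve months.

```shell
# Hypothetical sample lines (placeholder host, PIDs and addresses).
sample_two_days() {
cat <<'EOF'
Mar  1 12:00:01 myhost sshd[1001]: Invalid user admin from 192.0.2.10
Mar  1 12:00:04 myhost sshd[1002]: Invalid user test from 192.0.2.10
Mar  2 08:15:30 myhost sshd[1003]: Invalid user guest from 198.51.100.7
EOF
}

# Key the count on "month day address" ($1, $2, $10) instead of the address alone.
attempts_per_day() {
    awk '$6 == "Invalid" { n[$1 " " $2 " " $10]++ }
         END { for (key in n) print n[key], key }' |
        sort -k1,1nr
}

sample_two_days | attempts_per_day
```

On the sample this prints "2 Mar 1 192.0.2.10" and then "1 Mar 2 198.51.100.7".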

Wow, 80.237.205.72 is a really persistent little bugger. Upon looking closer, I see all of the attempts were on a single day. Let’s see the interval, in seconds, between consecutive attempts:

$ zcat /var/log/auth.log.*.gz  | awk '
$6 == "Invalid" && $10 == "80.237.205.72" {
    oldsec = sec;
    split($3, time, ":");
    sec = time[3] + 60 * (time[2] + 60 * time[1]);
    if (oldsec > 0) {
        print sec - oldsec;
    }
}' | sort -n | uniq -c
    267 2
   3366 3
   1457 4
     23 5
     19 6
     15 7
      1 8
      4 9
      1 10
      5 11
      1 13
      3 14
      2 15
      2 16
      2 24
      1 44

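So basically, someone was trying to log in every 3 to 4 seconds (that covers 4,823 of the 5,169 gaps). The gaps add up to 17,112 seconds, so the whole run lasted about four and three-quarter hours.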
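
The interval arithmetic is easier to check on a few hand-made timestamps. This is only a sketch: the function name intervals and the sample times are invented, and it reads bare HH:MM:SS values from $1 rather than from $3 of a log line. It uses the same conversion to seconds, but tracks a "seen" flag instead of testing oldsec > 0 (that test would drop the gap following an attempt at exactly 00:00:00), and it assumes at most one midnight rollover between consecutive lines.

```shell
# Print the gap, in seconds, between consecutive HH:MM:SS timestamps on stdin.
intervals() {
    awk '{
        split($1, t, ":")
        sec = t[3] + 60 * (t[2] + 60 * t[1])   # seconds since midnight
        if (seen) {
            gap = sec - prev
            if (gap < 0) gap += 86400          # clock wrapped past midnight
            print gap
        }
        prev = sec
        seen = 1
    }'
}

printf '%s\n' 23:59:58 00:00:00 00:00:03 00:00:07 | intervals
```

The four sample timestamps give gaps of 2, 3 and 4 seconds, one per line. Piping that through sort -n | uniq -c produces the same kind of histogram as above.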

Anyway, I thought I would get something more interesting out of awk, but this’ll have to suffice.

Footnote [1]: Whatever that means.
