Grokking audit

I've been working with Logstash lately, and one of the tasks I was given was to improve the parsing of audit.log entries. Turning things like this:

type=SYSCALL msg=audit(1445878971.457:6169): arch=c000003e syscall=59 success=yes exit=0 a0=c2c3a8 a1=c64bc8 a2=c34408 a3=7fff44e370f0 items=2 ppid=16974 pid=18771 auid=1004 uid=0 gid=0 euid=0 suid=0 fsuid=0 egid=0 sgid=0 fsgid=0 tty=pts0 ses=5 comm="compiled_evil" exe="/home/justsomeuser/bin/compiled_evil" key="hinkystuff"

Into nice, indexed entries, so we can make Kibana graphs of every command caught by the hinkystuff audit ruleset.

The problem with audit.log entries is that they're not very regexible. Oh, they can be, but optional sometimes-there-sometimes-not fields suck a lot. Take, for example, the SYSCALL above. Items a0 through a3 are the first four arguments of the syscall, and there may be anywhere from one to four of them. Expressing that in regex/grok is trying.
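To show the shape of the problem outside of grok, here is a rough Python sketch of one way around it: regex only the fixed header, then treat everything after it as key=value pairs, so a missing a2 or a3 simply doesn't show up. This is not the pattern set from Logstash-auditlog; the names HEADER, PAIR and parse_audit_line are made up for illustration, and both records are trimmed-down stand-ins for real entries.

```python
import re

# Fixed part that starts the record: type=..., then msg=audit(epoch.millis:serial):
HEADER = re.compile(
    r"type=(?P<type>\S+) msg=audit\((?P<epoch>\d+\.\d+):(?P<serial>\d+)\):\s*"
)

# One key=value pair; the value is either a double-quoted string or a bare token.
PAIR = re.compile(r'(?P<key>\w+)=(?:"(?P<quoted>[^"]*)"|(?P<bare>\S+))')


def parse_audit_line(line):
    """Return a dict of fields, or None if the line has no audit header."""
    header = HEADER.match(line)
    if header is None:
        return None
    fields = header.groupdict()
    # Scan only the text after the header; field order and presence don't matter.
    for pair in PAIR.finditer(line, header.end()):
        value = pair.group("quoted")
        if value is None:
            value = pair.group("bare")
        fields[pair.group("key")] = value
    return fields


# Shortened copy of the SYSCALL record above (some fields dropped for width).
sample = (
    'type=SYSCALL msg=audit(1445878971.457:6169): arch=c000003e syscall=59 '
    'success=yes exit=0 a0=c2c3a8 a1=c64bc8 a2=c34408 a3=7fff44e370f0 items=2 '
    'comm="compiled_evil" exe="/home/justsomeuser/bin/compiled_evil" key="hinkystuff"'
)

# Hypothetical record carrying only two arguments, to exercise the optional case.
short = 'type=SYSCALL msg=audit(1445878971.457:6169): syscall=59 a0=c2c3a8 a1=c64bc8 key="hinkystuff"'

if __name__ == "__main__":
    print(parse_audit_line(sample))
    print(parse_audit_line(short))
```

The trade-off is that nothing here checks which fields a given record type ought to have; it just collects whatever is present.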

So I made a thing:

Logstash-auditlog: Grok patterns and examples for parsing Audit settings with Logstash.

May it be useful.