Log Management 101
If you are in any way involved with technology, you have at one time or another dealt with logging. You either didn’t have what you needed, spent hours looking a needle in the haystack, or had to go find someone to help you get what you needed.
In networking and security circles, logging is usually a hot-button topic. I’ve spent many man-months working with logs, investigating logging solutions, praying to God for a solution to logging, tweaking open-source logging programs, and discussing logging technology with various vendors. I’ve collected some of the more relevant articles on the topic and have some additional commentary / insight.
First and foremost, the primary problem of Log Management is often not the log(ging) portion – most systems spew limitless log data – the problem is the management of the log(ged) data. You’ve heard 1000 times, management styles vary by manager. Management of network, servers, and security differ wildly. Networks and servers are managed by exception, they get attention when they need it otherwise they just purr and no one bothers them (until Patch Tuesday or until Cisco has a bad day). Managing security is altogether different, because security for a given organization is the net sum of all the parts of said organization, and in order to properly manage security, information from all the parts must be reviewed on a routine basis.
This routine review of information is where the problem comes in and it is why nearly every regulation, guideline, or contract stipulation (read PCI DSS) has mandatory language about reviewing logs. Again there is seldom anything about requiring systems to output logs, because that is rarely the problem. Technology is often times misapplied in attempts solve human-created problems, log management is an ideal instance for technology to come to our rescue. How else would a sane person cull through gigabytes or terabytes of log data from ten’s or hundred’s of systems or network elements? <Don’t answer that question, I’ve done it myself .>
- Anton Chuvakin’s Security Warrior Blog
- Network Systems Design Line: Security Event Correlation Article
- GEEK Speek Blog: Dealing with Logs
- TaoSecurity: Log Mgmt Summit
- SANS Reading Room: Logging Technology / Techniques
Dealing with the actual logs and delving into whether you want to collect log data, consolidate the log data, correlate log data, and or perform event management usher in the next set of issues / challenges when dealing with logging. Originally logs might have been aggregated onto one server or they might have been aggregated by log severity using syslog-ng or syslogd. Network or systems management systems (CiscoWorks or HP OpenView) might have received logs and performed parsing or filtering on key events that drove alerts and alarms but these systems likely discarded the log data itself, so investigating the root cause of an alert or incident became next to impossible.
Compliance came to the scene and brought innovation in log collection and log management; software became available to manage logs from multiple sources but you had to pick your poison: Windows with MS SQL back-end, Solaris or HP-UX with Oracle, or Linux and MySQL. Aggregation and consolidation were solved by tiered deployments and agents were usually installed to siphon off Windows Event Data. Security hit the mainstream and event correlation was the big buzz word.
The problem was how the intensive resources necessary to store the log data competed with the intensive resources necessary to correlate dissimilar log formats to produce alerts or suppress logged events. Gartner and others coined the the term SIEM, which combined log collection and event correlation, and a new market arrived. Most SIEM (SIM, then SEM, finally SIEM) are really good at the collection piece (historical) or really good at the correlation (real-time) piece, while few are good at both. Go, go Magic Quad. rant! If you hadn’t already noticed, I’m not a fan of combining these technologies (I don’t mix my food while I’m eating either). I like my logging solution to have plenty of evidence preservation goodness and I don’t want it muddied because a correlator had to normalize the data before it could parse, alarm on, or display the log data.
Some of the options for solving log management challenges
Just scratching the surface, more please?