One of the strongest features of OSSEC is its log analysis engine. The log collector process monitors all of the specified logs and passes each log entry off to the analysis engine for decoding and rule matching. For agents, the log entries are transmitted to the server for processing.
There are multiple steps to identifying actionable matches starting with the decoding of each log entry. First, pre-decoding breaks apart a single log entry into manageable pieces. Decoders are then used to identify specific programs and log entry types. Decoders can build on one another allowing the operator to target specific patterns within a log entry with ease.
Let’s take a look at a simple example first. The following is a sample log entry from a typical /var/log/secure log. This log uses the syslog format for log entries.
Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2
Pre-decoding breaks the syslog format apart into three pieces: hostname, program_name, and log. Using the ossec-logtest program provided with OSSEC, we can view the process OSSEC goes through for decoding and then rule matching. Pre-decoding this log entry produces the following:
[root@dev bin]# ./ossec-logtest
2010/10/21 00:01:00 ossec-testrule: INFO: Reading local decoder file.
2010/10/21 00:01:00 ossec-testrule: INFO: Started (pid: 1106).
ossec-testrule: Type one log per line.
Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2
**Phase 1: Completed pre-decoding.
full event: 'Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2'
hostname: 'dev'
program_name: 'sshd'
log: 'Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2'
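To make the split concrete, here is a minimal Python sketch of the same idea. It is only an illustration, not how OSSEC implements pre-decoding: the regular expression and the predecode helper are assumptions made up for this example, and they only handle the classic "Mon DD HH:MM:SS host program[pid]: message" syslog layout.

```python
import re

# Rough approximation of a syslog line: "Mon DD HH:MM:SS host program[pid]: message".
# This pattern is an illustrative assumption, not the one OSSEC uses internally.
SYSLOG_LINE = re.compile(
    r"^(?P<timestamp>\w{3}\s+\d{1,2} \d{2}:\d{2}:\d{2}) "
    r"(?P<hostname>\S+) "
    r"(?P<program_name>[^\s\[:]+)(?:\[\d+\])?: "
    r"(?P<log>.*)$"
)


def predecode(line):
    """Split one syslog line into hostname, program_name, and log (hypothetical helper)."""
    match = SYSLOG_LINE.match(line)
    if match is None:
        return None  # not in the syslog layout this sketch understands
    return {
        "hostname": match.group("hostname"),
        "program_name": match.group("program_name"),
        "log": match.group("log"),
    }


entry = ("Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user "
         "postfix from 189.126.97.181 port 57608 ssh2")
for field_name, value in predecode(entry).items():
    print(f"{field_name}: '{value}'")
```

Run against the sample entry, the sketch yields the same three fields that ossec-logtest reports in Phase 1.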
As you can see, the log entry ends up with three parts. Decoders then use pattern matching on these parts to identify and categorize log entries. Consider the following decoder:
<decoder name="sshd">
  <program_name>^sshd</program_name>
</decoder>
This is about as simple as a decoder can get. The decoder block starts with an attribute identifying the name of the decoder. In this case, that name is sshd. Within the decoder block is a program_name tag that contains a regular expression used to match against the program_name from pre-decoding. If the regex matches, OSSEC will use that decoder to match against any defined rules.
As I mentioned before, however, decoders can build on each other. Because the first decoder above matches only sshd entries, it reduces the number of subsequent decoders that need to be checked before decoding is complete. For example, look at the following decoder definition:
<decoder name="ssh-invfailed">
  <parent>sshd</parent>
  <prematch>^Failed \S+ for invalid user|^Failed \S+ for illegal user</prematch>
  <regex offset="after_prematch">from (\S+) port \d+ \w+$</regex>
  <order>srcip</order>
</decoder>
This decoder builds on the first, as the parent tag shows. Decoders work on a first-match basis. In other words, the first top-level decoder to match is used to find secondary decoders (its children), the first matching secondary decoder is used to find tertiary decoders (its children), and so on. If the matching decoder has no children, then that decoder is the final decoder and the decoded information is passed on to rules processing.
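The first-match walk described above can be sketched in a few lines of Python. This is a simplified model under stated assumptions, not OSSEC's code: the Decoder class and the decode and first_match helpers are hypothetical names, each decoder carries a single pattern, and Python's re module stands in for OSSEC's own regex syntax.

```python
import re
from dataclasses import dataclass, field
from typing import List, Optional


@dataclass
class Decoder:
    """Hypothetical stand-in for a <decoder> block with one pattern."""
    name: str
    pattern: str  # tested against program_name (top level) or log (children)
    children: List["Decoder"] = field(default_factory=list)


def first_match(decoders: List[Decoder], text: str) -> Optional[Decoder]:
    """Return the first decoder whose pattern matches, or None."""
    for decoder in decoders:
        if re.search(decoder.pattern, text):
            return decoder
    return None


def decode(top_level: List[Decoder], program_name: str, log: str) -> List[str]:
    """Return the chain of matching decoder names, parent first."""
    chain = []
    current = first_match(top_level, program_name)
    while current is not None:
        chain.append(current.name)
        # Only the children of the decoder that just matched are considered next.
        current = first_match(current.children, log)
    return chain


sshd = Decoder(
    name="sshd",
    pattern=r"^sshd",
    children=[
        Decoder(
            name="ssh-invfailed",
            pattern=r"^Failed \S+ for invalid user|^Failed \S+ for illegal user",
        )
    ],
)

log = "Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2"
print(decode([sshd], "sshd", log))
```

With the two decoders from this article, the chain comes out as sshd followed by ssh-invfailed; a program name that no top-level decoder matches produces an empty chain.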
There are three other tags within this decoder block worth looking at. First, the prematch tag. Prematch is used as a quick way to determine if the rest of the decoder should be run. Prematches should be written so that the portion of the entry they match can be skipped by the rest of the decoder. For instance, in the decoder example above, the prematch will match the phrase “Failed password for invalid user” in the log entry. This portion of the log contains enough information to identify the type of log entry without requiring us to parse it again to extract information. The remaining part of the log entry has the information we want to capture.
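Which brings us to the regex. The regex, or regular expression, is a pattern used to match and pull apart a log entry. Because its offset attribute is set to after_prematch, it is applied only to the part of the entry that follows the prematch. The regex in the example extracts the source IP address from the log so we can use it in an active response later. The order tag names the field each captured group is stored in; here, the single captured group becomes srcip.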
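Here is a rough Python sketch of how prematch, the after_prematch offset, regex, and order fit together for the ssh-invfailed decoder. It assumes Python's re module behaves closely enough to OSSEC's regex syntax for these particular patterns, and the apply_decoder helper is a hypothetical name used only for illustration.

```python
import re

# Patterns copied from the ssh-invfailed decoder; Python's re stands in for
# OSSEC's regex engine, which is an assumption that holds for these patterns.
PREMATCH = re.compile(r"^Failed \S+ for invalid user|^Failed \S+ for illegal user")
REGEX = re.compile(r"from (\S+) port \d+ \w+$")
ORDER = ["srcip"]  # one field name per captured group, in order


def apply_decoder(log):
    """Return the extracted fields, or None if the prematch fails (hypothetical helper)."""
    prematch = PREMATCH.search(log)
    if prematch is None:
        return None  # cheap rejection: the regex is never run
    # offset="after_prematch": only look at what follows the prematch.
    remainder = log[prematch.end():]
    match = REGEX.search(remainder)
    if match is None:
        return {}
    return dict(zip(ORDER, match.groups()))


log = "Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2"
print(apply_decoder(log))
```

For the sample entry the sketch returns a single srcip field holding 189.126.97.181, while an entry that does not start with one of the prematch phrases is rejected before the regex is ever tried.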
Now, using these two decoders, let’s run ossec-logtest again:
2010/10/21 00:01:00 ossec-testrule: INFO: Reading local decoder file.
2010/10/21 00:01:00 ossec-testrule: INFO: Started (pid: 28358).
ossec-testrule: Type one log per line.
Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2
**Phase 1: Completed pre-decoding.
full event: 'Oct 21 00:01:00 dev sshd[31409]: Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2'
hostname: 'dev'
program_name: 'sshd'
log: 'Failed password for invalid user postfix from 189.126.97.181 port 57608 ssh2'
**Phase 2: Completed decoding.
decoder: 'sshd'
srcip: '189.126.97.181'
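As you can see, the decoding phase has identified the decoder as sshd and extracted the srcip. The logtest program reports the name of the parent decoder used, rather than the ultimate matching decoder, which is why the output shows sshd even though the srcip value was extracted by the ssh-invfailed child decoder.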
Hopefully this has given you enough information to start writing your own decoders. The decoder.xml file that comes with OSSEC is a great place to look at examples of how to craft decoders. This is a more advanced task, however, and the existing decoders cover most of the standard log entries you’ll see on a Linux or Windows machine.
For more information on decoders, please see the OSSEC manual. You might also check out chapter 4 of the OSSEC book. The book is a little outdated now, but the information within is still quite accurate. Syngress released a few chapters of the book that you can download here.