Dan Gunter

Simplifying Bro IDS Log Parsing with ParseBroLogs

This week I pushed a Python package to PyPI to simplify parsing logs from the Bro Intrusion Detection System (IDS). The package works on both Python 2 and Python 3. You can install it in your environment with pip:

pip install parsebrologs

Additional examples and the source code are available on GitHub.

Motivation:

Recently I’ve been working on a few Python projects with friends that required parsing and automated analysis of Bro IDS logs. After writing and rewriting file parsers a few times, I figured the next logical step was to develop a lightweight Python package instead of continuing to reimplement the same code. I also wanted a library that could replicate the filtering features of bro-cut and quickly present the data in formats friendly to both Python and end users.

The Result:

Parsebrologs is a Python package with no dependencies outside the base Python installation. Support for both Python 2 and Python 3 works right out of the box. Passing a list of fields to filter down to emulates the functionality of the bro-cut utility. Data can be output in CSV or JSON format. Overall, these features cover the requirements I defined initially. More features will be added as requested, and anyone is welcome to send a pull request with a great idea.

How to Use:

There are a few more nuanced features worth covering as we explore a few examples. The first example reads in the entire connection log named conn.log and writes the data out to a file named out.json in JSON format.
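A minimal sketch of that first example follows. It assumes the parser class is importable as ParseBroLogs from the parsebrologs package and takes the log path as its first argument; check the examples on GitHub if the import fails. The helper names write_json and export_conn_log are hypothetical and exist only for this illustration.

```python
def write_json(parser, out_path):
    """Write the JSON string returned by parser.to_json() to out_path."""
    with open(out_path, "w") as outfile:
        outfile.write(parser.to_json())


def export_conn_log(log_path="conn.log", out_path="out.json"):
    """Parse a Bro connection log and save every record as JSON."""
    # Assumption: the package exposes a ParseBroLogs class that takes the
    # path of the log file as its first argument.
    from parsebrologs import ParseBroLogs

    write_json(ParseBroLogs(log_path), out_path)
```

Calling export_conn_log() from the directory that holds conn.log should leave out.json next to it.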

The to_json() method returns a string containing the JSON representation of the data. The returned data is a list containing the individual JSON records. The format of the returned JSON data is important to note if you want to load the data into a pandas data frame or load the log data into a database like Elasticsearch. To create a pandas data frame, you should convert the JSON string back to a Python object using the loads() function of the json module as shown in the example below.
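The round trip from JSON string to data frame can be sketched as follows. The sample string stands in for the output of to_json(): its field names are standard conn.log columns, but the values are made up. The helper names are hypothetical, and pandas is a third-party package that has to be installed separately.

```python
import json

# Stand-in for the string returned by to_json(): a JSON list of records.
# The values are invented for illustration.
SAMPLE_JSON = (
    '[{"ts": "1500000000.000000", "id.orig_h": "10.0.0.5", "proto": "tcp"},'
    ' {"ts": "1500000001.000000", "id.orig_h": "10.0.0.9", "proto": "udp"}]'
)


def records_from_json(json_text):
    """Convert the JSON string back into a list of dicts, one per log record."""
    records = json.loads(json_text)
    if not isinstance(records, list):
        raise ValueError("expected a JSON list of records")
    return records


def to_data_frame(json_text):
    """Build a pandas DataFrame with one row per record."""
    import pandas as pd  # third-party; imported lazily so the rest runs without it

    return pd.DataFrame(records_from_json(json_text))
```

With the real library, the argument would be the string returned by to_json() rather than SAMPLE_JSON. The same list of dicts is also a convenient starting point for bulk-loading records into a database.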

In addition to JSON, two methods are available to output data in CSV format. The to_csv() method returns the data as an unescaped CSV-formatted string, while the to_escaped_csv() method escapes all fields within the CSV. You should use the to_escaped_csv() method if you plan on opening the CSV file with Microsoft Excel or OpenOffice Calc. Escaping a CSV eliminates any issues with commas or other special CSV characters inside field values. The CSV example also shows using the fields variable to filter returned fields. The fields variable takes a list of strings naming the fields to return. Fields are specified in the ParseBroLogs class constructor.
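A sketch of the CSV workflow described above, under the same assumptions as before: the class is importable as ParseBroLogs, and the field filter is passed to the constructor as a fields keyword argument. The helper names and the chosen field list are illustrative only.

```python
def write_csv(parser, out_path, escaped=True):
    """Write the parser's CSV output to out_path.

    escaped=True uses to_escaped_csv(), the safer choice for spreadsheets;
    escaped=False uses the plain to_csv() output.
    """
    csv_text = parser.to_escaped_csv() if escaped else parser.to_csv()
    with open(out_path, "w") as outfile:
        outfile.write(csv_text)


def export_filtered_conn_log(log_path="conn.log", out_path="out.csv"):
    """Keep only a few conn.log columns and save them as escaped CSV."""
    # Assumption: ParseBroLogs accepts the field filter as a `fields` keyword.
    from parsebrologs import ParseBroLogs

    wanted = ["ts", "id.orig_h", "id.resp_h", "proto"]  # standard conn.log columns
    write_csv(ParseBroLogs(log_path, fields=wanted), out_path)
```

Defaulting to the escaped variant follows the advice above for files that will be opened in a spreadsheet; pass escaped=False when a downstream tool expects the raw output.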

Moving Forward:

Hopefully, this library can serve as a successful building block for future projects. As mentioned earlier, if you have a great idea, send it my way or feel free to submit a pull request. If you have any feedback or success stories to share, you can also reach me on Twitter (@dan_gunter) or via Gmail (dangunter).