Pablo is presenting his work on protocol classification. As you may not have guessed, nfgrep does not use regular expressions but a descriptive language.
The basic architecture is the following:
- a layer-7 filter is developed in userspace
- the filter is passed to a tool that generates byte-code
- the byte-code is loaded into the kernel via nfnetlink
- the kernel does the classification
- the nfgrep match can then be used to select or mark the flow
In userspace, nfgrep and libnfgrep can be used to interact with the system. There is also an nfgrep-test tool to validate filters before sending them.
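To make the compile-then-classify split concrete, here is a minimal sketch in Python. It is only an illustration of the general idea: the opcodes, the rule format and the function names are all hypothetical and are not nfgrep's real language, byte-code or API. The "kernel" side is simulated by a plain function.

```python
# Toy opcodes: hypothetical, NOT nfgrep's real instruction set.
OP_MATCH_AT = 0   # compare a literal at a fixed payload offset
OP_ACCEPT = 1     # every previous check passed


def compile_filter(rules):
    """Userspace step: turn (offset, literal) rules into a flat program."""
    program = [(OP_MATCH_AT, offset, literal) for offset, literal in rules]
    program.append((OP_ACCEPT,))
    return program


def run(program, payload):
    """Stand-in for the kernel-side interpreter: True if all checks pass."""
    for insn in program:
        if insn[0] == OP_MATCH_AT:
            _, offset, literal = insn
            if payload[offset:offset + len(literal)] != literal:
                return False
        elif insn[0] == OP_ACCEPT:
            return True
    return False


# Hypothetical filter: an SSH banner starts with "SSH-".
ssh_filter = compile_filter([(0, b"SSH-")])
```

With this sketch, `run(ssh_filter, payload)` plays the role of the classification step, while `compile_filter` plays the role of the userspace tool that produces the byte-code.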
Pablo started out working with BPF, but developing filters with it was hard: getting a simple field could take something like 10 lines. He looked at existing languages such as Lua, but they offer too many features and are not dedicated to this task.
Because the data is linked to the connection tracking entry, it is possible to store stateful information. Several pieces of information can be attached to an entry, so it is possible to have multiple matches.
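The idea of attaching several pieces of information to one tracked connection can be sketched as follows. This is a userspace toy in Python, assuming a dictionary as a stand-in for the connection tracking table; the names `flow_key`, `attach` and `matches` are hypothetical and do not come from nfgrep.

```python
from collections import defaultdict

# Hypothetical stand-in for the conntrack table: flow key -> set of labels.
flow_labels = defaultdict(set)


def flow_key(src, sport, dst, dport, proto):
    """Build a direction-independent key so both directions share one entry."""
    a, b = (src, sport), (dst, dport)
    return (proto,) + (a + b if a <= b else b + a)


def attach(key, label):
    """Attach one more piece of classification state to the flow."""
    flow_labels[key].add(label)


def matches(key, label):
    """True if the flow already carries the given label."""
    return label in flow_labels.get(key, set())
```

Because the state lives with the flow rather than with a single packet, a later packet in either direction can be matched against any label attached earlier.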
The language is simple and contains only a few keywords. One of its benefits is that a filter can use multiple steps to make sure the matching is accurate. The image below is a description of the HTTP protocol:
Filters can be chained. It would thus be possible to detect HTTP and then to detect an HTTP subprotocol.
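A rough sketch of multi-step matching and chaining, written in Python rather than in the nfgrep language (whose syntax is not reproduced here). The detector functions, the `CHAIN` table and the choice of WebSocket as the subprotocol are assumptions made purely for illustration.

```python
def detect_http(payload: bytes) -> bool:
    """Two-step check: a known method first, then an HTTP version token."""
    methods = (b"GET ", b"POST ", b"HEAD ", b"PUT ", b"DELETE ")
    if not payload.startswith(methods):
        return False
    request_line = payload.split(b"\r\n", 1)[0]
    return request_line.endswith((b"HTTP/1.0", b"HTTP/1.1"))


def detect_websocket_upgrade(payload: bytes) -> bool:
    """Subprotocol check, only meaningful once HTTP has been detected."""
    return b"upgrade: websocket" in payload.lower()


# (label, detector, required parent label); hypothetical chain layout.
CHAIN = [
    ("http", detect_http, None),
    ("websocket", detect_websocket_upgrade, "http"),
]


def classify(payload: bytes) -> list:
    """Run each filter, skipping those whose parent did not match."""
    labels = []
    for name, detector, parent in CHAIN:
        if parent is not None and parent not in labels:
            continue
        if detector(payload):
            labels.append(name)
    return labels
```

The second filter never runs on traffic that failed the first one, which is what keeps a loose subprotocol pattern from matching unrelated flows.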
It is not currently possible to pass the information about the detected protocol through something like nfnetlink_queue, but this could be added and would provide very interesting classification information to an IPS like Suricata.
TCP segmentation is still an open issue: a pattern that is split across several segments could defeat the matching.
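The segmentation problem is easy to reproduce outside the kernel. The Python sketch below shows a per-segment match missing a pattern that straddles two segments, and one common workaround: carrying over the last few bytes of the previous segment. This is a generic technique and not something nfgrep is stated to do; it also assumes segments arrive in order, without retransmissions or overlaps.

```python
PATTERN = b"HTTP/1.1"


def match_per_packet(segments):
    """Naive matching: each segment is inspected in isolation."""
    return any(PATTERN in seg for seg in segments)


def match_with_carry(segments):
    """Keep the last len(PATTERN) - 1 bytes so split patterns are still seen."""
    keep = len(PATTERN) - 1
    tail = b""
    for seg in segments:
        window = tail + seg
        if PATTERN in window:
            return True
        tail = window[-keep:] if keep else b""
    return False


# The pattern is cut in the middle by the segment boundary.
segments = [b"GET / HT", b"TP/1.1\r\nHost: example\r\n"]
```

Here `match_per_packet(segments)` fails to find the pattern while `match_with_carry(segments)` finds it, at the cost of keeping a small amount of per-flow state.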
The code should be released in the coming days.