The Netfilter workshop being a developer conference, I decided to present an introduction to the Coccinelle tool. Coccinelle is a program matching and transformation engine for the C language which is used in many places, among them the Linux kernel. It is able to perform clever modifications of C code. If you have ever had to modify multiple source files following an API change, I invite you to have a look at the slides or my Coccinelle for the newbie page. I also presented my coccigrep tool, which is an easy-to-use semantic grep.
The slides are available: nfws_coccinelle
Jesper’s IPTables::libiptc is a Perl module which allows you to modify Netfilter rules from Perl. He is the maintainer and the module is available on CPAN. It currently supports up to iptables 1.4.10 (version 0.51 of IPTables::libiptc).
It dynamically loads xtables.so and libiptc.so to access iptables features. It is fast because it does not suffer from the iptables limitation of applying modifications one by one. Performance is quite good: it takes only 16 seconds to generate and apply an 80000-rule ruleset, compared to the 42 hours that direct iptables calls would take.
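To get a feeling for why applying rules one at a time is so slow, here is a back-of-the-envelope sketch. It assumes a simplified cost model (not something measured in the talk) in which each individual iptables call fetches the whole table, changes it and writes it back, so the i-th call handles i rules, whereas a batch commit handles each rule once. The function names are made up for the example.

```python
def rules_handled_one_by_one(n):
    """Rules handled when each of the n insertions rewrites the whole table."""
    # The i-th call deals with a table of i rules: 1 + 2 + ... + n.
    return n * (n + 1) // 2


def rules_handled_batch(n):
    """Rules handled when the whole ruleset is committed in one go."""
    return n


n = 80000
model_ratio = rules_handled_one_by_one(n) / rules_handled_batch(n)

# Ratio between the two timings quoted above: 42 hours versus 16 seconds.
measured_ratio = (42 * 3600) / 16

print(f"model ratio:    {model_ratio:.1f}")
print(f"measured ratio: {measured_ratio:.1f}")
```

The model is crude, but it shows how a quadratic cost on one side and a linear cost on the other can produce a gap of several orders of magnitude for large rulesets.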
Jesper would like to have a complete iptables library giving access to all functions, and in particular to the do_command() function. One interesting thing for him would be to have access to the test command.
Pablo does not want the team to guarantee that libiptc will not break its API or ABI. As it is already exported, it is not possible to make it private again. As the part Jesper is interested in is tied to user commands, there should not be any API break. Thus exporting the function seems OK.
As next work, Jesper wishes to publish a wrapper module, IPTables::Interface, and move it to CPAN, maybe inside the IPTables::libiptc module.
Patrick presents work that aims at getting rid of the second tuple in the connection tracking. This second tuple is only necessary when NAT is used. The idea is not new, but at the time ct-extensions were not available and thus it was not possible to add the tuple only when needed. Patrick has done most of the work but there is still a missing point, which is the hash function. It has to be symmetrical:
hash_func(src, dst) = hash_func(dst, src), and it must be very fast to avoid slowing down conntrack.
If this point is fixed, it will be possible to get rid of the second tuple for all non-NATed connection tracking entries.
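One common way to obtain the symmetry property is to put the two endpoints into a canonical order before hashing. The sketch below only illustrates the property; it is not Patrick's implementation, and the `symmetric_hash` name, the (address, port) endpoint layout and the use of CRC32 are assumptions made for the example.

```python
import ipaddress
import zlib


def endpoint_key(endpoint):
    """Serialize an (ip_string, port) endpoint to bytes."""
    ip, port = endpoint
    return ipaddress.ip_address(ip).packed + port.to_bytes(2, "big")


def symmetric_hash(src, dst, buckets=16384):
    """Return the same bucket for (src, dst) and (dst, src)."""
    # Sorting the two serialized endpoints removes the direction of the flow.
    first, second = sorted((endpoint_key(src), endpoint_key(dst)))
    return zlib.crc32(first + second) % buckets


client = ("192.0.2.10", 40000)
server = ("198.51.100.1", 80)
print(symmetric_hash(client, server) == symmetric_hash(server, client))
```

With such a function, packets of both directions land in the same bucket, so a single tuple is enough to find the entry. Whether the extra comparison is cheap enough for conntrack is exactly the open question mentioned above.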
We have been ignoring the fact that NAT could be of some interest in IPv6 during the last 5 years. IPv6 will not fix everything and it may be time to reconsider NAT. There are some reasons for that:
- Dynamic IPv6 prefixes: some ISPs decide not to give fixed addresses to people
- Server load balancing, DMZ
- Uplink balancing (multi-homing): this is one of the most important reasons. IPv6 clients can handle multiple addresses, but you may not want your users to choose their Internet uplink.
IPv6 NAT has been available in OpenBSD for some years now. It is also available on FreeBSD when using pf. Cisco IOS has no IPv6 NAT support.
The Linux status is quite complicated. There are at least three implementations, and there is even an official one that comes from Linux Virtual Server.
NAT66 (RFC 6296) is now available. There is no port translation and the mapping must be checksum-neutral (changing the prefix must not change the checksum).
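To make the checksum-neutral idea concrete, here is a minimal sketch for /48 prefixes: the prefix is swapped and the fourth 16-bit word of the address is adjusted so that the one's complement sum of the whole address stays the same. The function names and the addresses are chosen for the example, and only the /48 case is handled.

```python
import ipaddress


def ones_sum(words):
    """16-bit one's complement sum with end-around carry."""
    total = 0
    for word in words:
        total += word
        total = (total & 0xFFFF) + (total >> 16)
    return total


def words_of(addr):
    """Split an IPv6 address into its eight 16-bit words."""
    packed = ipaddress.IPv6Address(addr).packed
    return [int.from_bytes(packed[i:i + 2], "big") for i in range(0, 16, 2)]


def translate_48(addr, internal_prefix, external_prefix):
    """Swap a /48 prefix and fix up the 4th word to keep the checksum."""
    words = words_of(addr)
    inner = words_of(internal_prefix)[:3]
    outer = words_of(external_prefix)[:3]
    if words[:3] != inner:
        raise ValueError("address is not inside the internal prefix")
    # adjustment = sum(internal) - sum(external), in one's complement
    adjustment = ones_sum([ones_sum(inner), ~ones_sum(outer) & 0xFFFF])
    words[:3] = outer
    words[3] = ones_sum([words[3], adjustment])
    if words[3] == 0xFFFF:  # 0xFFFF and 0x0000 are both zero; use 0x0000
        words[3] = 0
    packed = b"".join(word.to_bytes(2, "big") for word in words)
    return str(ipaddress.IPv6Address(packed))


inside = "fd01:203:405:1::1234"
outside = translate_48(inside, "fd01:203:405::", "2001:db8:1::")
print(outside)
print(ones_sum(words_of(inside)) == ones_sum(words_of(outside)))
```

Since the sum over the address is unchanged, TCP and UDP checksums, which cover the addresses through the pseudo-header, remain valid without touching the payload.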
Ulrich proposes some choices for integration into Netfilter:
- No IPv6 NAT
- NAT66 ip6tables target (with or without conntrack dependency)
- Make nf_nat protocol independent and move it to net/netfilter (let admins decide if they want 1:1 or n:1)
- Any other solutions?
The main discussion is about the impact of the change introduced by IPv6 NAT. Nobody seemed to be against introducing the feature, and it was finally agreed to add IPv6 NAT to Netfilter. The remaining point is who will do the job.
Pablo is presenting his work on protocol classification. As you may not have guessed, nfgrep does not use regular expressions but a descriptive language.
The basic architecture is the following:
- a layer-7 filter is developed in userspace
- the filter is passed to a tool that generates byte-code
- the tool loads the byte-code into the kernel via nfnetlink
- the kernel does the classification
- the nfgrep match can then be used to select or mark the flow
In userspace, nfgrep and libnfgrep can be used to interact with the system. There is also an nfgrep-test tool to validate filters before sending them.
Pablo started to work with BPF, but it was hard to develop filters with it. Getting a simple field could take something like 10 lines. He looked at existing languages like Lua and others, but they offer too many features and are not dedicated to that task.
By linking the data to the connection tracking entry, it is possible to store stateful information. Multiple pieces of information can be attached, so it is possible to have multiple matches.
The language is simple and contains only a few keywords. One of its advantages is the ability to match in multiple steps to ensure the matching is accurate. The image below is a description of the HTTP protocol:
Filters can be chained. It could thus be possible to detect HTTP and then to detect an HTTP subprotocol.
It is not currently possible to put the information about the detected protocol inside something like nfnetlink_queue, but this could be added and would provide very interesting classification information to an IPS like Suricata.
TCP segmentation is still an open issue. It could defeat the matching.
The code should be released in the coming days.
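A toy illustration of why segmentation matters: a pattern that is split across two TCP segments is invisible to a matcher that inspects each packet on its own, while a matcher that keeps a little per-flow state can still find it. This is not nfgrep code; the function names and the tail-buffer approach are assumptions made for the example, and the sketch ignores reordering and retransmission.

```python
def match_per_packet(segments, pattern):
    """Look for the pattern in each segment independently."""
    return any(pattern in segment for segment in segments)


def match_with_flow_state(segments, pattern):
    """Look for the pattern while remembering the end of the previous data."""
    keep = len(pattern) - 1  # bytes that could start a match in the next segment
    tail = b""
    for segment in segments:
        window = tail + segment
        if pattern in window:
            return True
        tail = window[-keep:] if keep > 0 else b""
    return False


# The request line is cut in the middle of "HTTP/1.1".
segments = [b"GET /index.html HT", b"TP/1.1\r\nHost: example.org\r\n\r\n"]
pattern = b"HTTP/1.1"

print(match_per_packet(segments, pattern))
print(match_with_flow_state(segments, pattern))
```

Keeping such state per flow is the kind of information that could be attached to the connection tracking entry, at the price of extra memory for each tracked connection.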
The Cyberoam team presents their work on active-active clusters. They have built a two-node active-active setup, with a primary and an auxiliary system. The primary takes care of load balancing. The setup uses virtual MAC addresses.
To avoid split-brain problems, the primary takes all decisions by always handling the SYN packet. It also transfers the NAT and mark information to the auxiliary. This is done via a module called ipt_SYNDATA, which is placed in PREROUTING.
Another problem they needed to fix was ARP resolution. There must be only one request and one answer. For that they developed an arptables extension: the primary does all the requests and transfers the answers over the dedicated link between the two nodes.