Mar 10, 2013


Iptables duplicates work for each protocol family and uses a socket protocol that is far too static. Xtables2 is an ongoing effort to evolve the packet filter. It aims at providing finer-grained modification of the ruleset (instead of replacing the whole ruleset).


  • rule packing: increases cache hits.
  • family independent: no more IPv4- and IPv6-specific code. Only the hooks remain specific, as they depend on the core network stack.
  • xt extension support
  • atomic replace support

The xtables2 syntax is quite similar to the iptables syntax but not identical. libxtadm is a high-level library for ruleset inspection and manipulation.


Mar 10, 2013

PF_PACKET introduction

PF_PACKET gives access to raw packets inside Linux. It is used by libpcap and by other projects like Suricata. PF_PACKET performance can be improved via dedicated features:

  • Zero-copy RX/TX
  • Socket clustering
  • Linux socket filtering (BPF)

The BPF architecture looks like a small virtual machine with registers and memory stores. It has its own instruction set, and the Linux kernel adds its own extensions to access things like the CPU number or the VLAN tag.
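
To make the virtual machine idea concrete, here is a toy interpreter in Python. It is only a sketch under assumptions: the mnemonics mimic classic BPF assembler (ldb, ldh, jeq, ret), but the tuple encoding, the `run_filter` name and the tiny instruction subset are invented for illustration and are not the kernel's instruction format.

```python
def run_filter(program, packet):
    """Run a toy classic-BPF-like program over raw packet bytes.

    Returns the number of bytes to keep; 0 means "drop the packet".
    """
    acc = 0   # accumulator register
    pc = 0    # program counter
    while pc < len(program):
        op, *args = program[pc]
        pc += 1
        if op == "ldh":                      # load a 16-bit big-endian value
            off = args[0]
            if off + 2 > len(packet):
                return 0                     # out-of-bounds load: drop
            acc = int.from_bytes(packet[off:off + 2], "big")
        elif op == "ldb":                    # load a single byte
            off = args[0]
            if off + 1 > len(packet):
                return 0
            acc = packet[off]
        elif op == "jeq":                    # relative jump on (acc == k)
            k, jump_true, jump_false = args
            pc += jump_true if acc == k else jump_false
        elif op == "ret":                    # terminate with a verdict
            return args[0]
        else:
            raise ValueError("unknown instruction: %r" % (op,))
    return 0


# Hypothetical program: keep IPv4/TCP frames, drop everything else.
IPV4_TCP = [
    ("ldh", 12),             # EtherType field of the Ethernet header
    ("jeq", 0x0800, 0, 3),   # not IPv4: jump to the final "ret 0"
    ("ldb", 23),             # IPv4 protocol field (14 + 9)
    ("jeq", 6, 0, 1),        # not TCP: jump to the final "ret 0"
    ("ret", 0xFFFF),         # keep up to 65535 bytes
    ("ret", 0),              # drop
]
```

Feeding an Ethernet frame to `run_filter(IPV4_TCP, frame)` returns a non-zero length only when the frame carries IPv4 and TCP. The real in-kernel machine has more registers, scratch memory and many more instructions.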


Netsniff-ng is a set of minimal tools:
  • netsniff-ng, a high-performance zero-copy analyzer, pcap capturing and replaying tool
  • trafgen, a high-performance zero-copy network traffic generator
  • mausezahn, a packet generator and analyzer for HW/SW appliances with a Cisco-CLI
  • bpfc, a Berkeley Packet Filter (BPF) compiler with Linux extensions
  • ifpps, a top-like kernel networking and system statistics tool
  • flowtop, a top-like netfilter connection tracking tool
  • curvetun, a lightweight multiuser IP tunnel based on elliptic curve cryptography
  • astraceroute, an autonomous system (AS) trace route utility

netsniff-ng can be used to capture with advanced options, such as using a dedicated CPU or using a BPF filter compiled via bpfc instead of a tcpdump-like expression.

trafgen can be used to generate high-speed traffic using multiple CPUs, with complex configurations and setups, even including fuzzing. A description of the packet can be given in which each element is built using different functions. It can even be combined with tc (for example netem, to simulate specific network conditions).

Future work includes finding a way to use multiple cores efficiently with packet_fanout and disk writing.

Mar 10, 2013


ConnMan is a connection manager which integrates all critical networking components. It provides a smart D-Bus API to develop a user interface. It is plugin-oriented, and the different network stacks are implemented in different modules. Connection sharing (aka tethering) uses Netfilter to set up NAT masquerading, which is a simple use case.

Switching to nftables

Application connectivity is a more advanced part involving Netfilter, as it makes use of statistics and differentiated routing. For example, in a car, service data must be sent to the manufacturer's operator and not over the owner's network.

To do so, a session system has been implemented. An application can be modified to open a session to ConnMan. This allows defining a per-session policy for routing and accounting.

The ConnMan team wanted to use a C API to modify rules, but this was difficult with iptables and xtables. There is no official API, so this approach is subject to bugs and changes.

The ConnMan team has since switched to nftables and is currently working on stabilizing nftables to ensure the acceptance of the project and the long-term maintainability of their solution. This work is not yet upstream, but there is a good chance it will be accepted.

Mar 10, 2013

The problem

A centralized firewall design does not scale well when dealing with a lot of servers. It begins to collapse after a few thousand rules. Furthermore, for an application A to be able to connect to a server B, it takes a workflow and possibly three weeks to get the port opened.

From Service Oriented Architecture to Service Oriented Security

Services are autonomous. They call each other using a standard protocol. The architecture is described by a list of dependencies between services. You can then specify security via statements like ACCEPT Caching TO Frontend ON PORT 80. But this forces you to do provisioning each time a server starts.

Using Chef for firewall provisioning

Chef is an open source automation server that is queried by servers to get information about their configuration.

Chef maintains a centralized database of all services, so the security policy can be derived from it. If we manage to express the constraints in terms of Netfilter rules, we will be able to build a firewall policy.

AFW is just that.

AFW uses a service-oriented syntax where objects such as a destination can be specified by a Chef search. Thus, each time Chef runs, the objects get reevaluated and the filtering rules are updated to the current state of the network.

Custom rules can be added using traditional iptables syntax. AFW writes outbound rules using the owner match. This allows defining a policy per service, and the developer can easily find this policy by simply looking at the policy of the user.
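
The expansion from a service-level rule to per-user outbound rules can be sketched as follows. This is not AFW's code or syntax: the `outbound_rules` function, the rule dictionary layout and the `search` callback (a stand-in for a Chef search) are all assumptions made for illustration. Only the generated iptables options (`-m owner --uid-owner`, `-p`, `-d`, `--dport`, `-j ACCEPT`) are standard iptables syntax.

```python
def outbound_rules(rule, search):
    """Expand one service-level rule into iptables OUTPUT commands.

    `search` is a hypothetical stand-in for a Chef search: it takes a
    query string and returns the IP addresses of the matching nodes.
    """
    commands = []
    for ip in sorted(search(rule["destination"])):
        commands.append(
            "iptables -A OUTPUT -m owner --uid-owner {user} "
            "-p {protocol} -d {ip} --dport {port} -j ACCEPT".format(ip=ip, **rule)
        )
    return commands


# Hypothetical inventory, as a search could return it at provisioning time.
inventory = {"roles:caching": ["10.0.0.6", "10.0.0.5"]}
rule = {
    "user": "frontend",
    "protocol": "tcp",
    "port": 80,
    "destination": "roles:caching",
}
commands = outbound_rules(rule, lambda query: inventory.get(query, []))
```

Because the search is evaluated at each run, adding a caching server to the inventory adds a rule on the next run without touching the rule definition.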


The JSON document pushed to the Chef node contains everything, and it can be modified by the server. So it is possible for a server to trigger the opening of ports by changing some of its parameters. A solution is to split Chef into one Chef for configuration and one Chef for policy.

Mar 10, 2013

Why ipset?

iptables is sufficient most of the time, but in some cases limits are reached:

  • High number of rules: iptables evaluates rules linearly
  • Need to change the rules often

An independent study available at d(a)emonkeeper’s purgatory has shown that the performance of ipset is almost constant with respect to the number of filtered hosts.
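
The difference between the two lookup strategies can be illustrated with a small Python model. This is a sketch and not a benchmark: the function names are invented, and a Python `set` is only a stand-in for a hash-based ipset type.

```python
def linear_match(rules, source):
    """iptables-like evaluation: test each rule in order.

    Returns (verdict, number_of_comparisons).
    """
    for count, (address, verdict) in enumerate(rules, start=1):
        if address == source:
            return verdict, count
    return "ACCEPT", len(rules)


def set_match(blocked, source):
    """ipset-like evaluation: a single hash lookup, whatever the set size."""
    return "DROP" if source in blocked else "ACCEPT"


# 1000 hypothetical blocked hosts, 10.0.0.0 to 10.0.3.231.
hosts = ["10.0.%d.%d" % (i // 256, i % 256) for i in range(1000)]
rules = [(host, "DROP") for host in hosts]
blocked = set(hosts)
```

With the linear model, matching the last host costs as many comparisons as there are rules, while the set lookup does not depend on the number of entries.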


The originating project was ippool, which featured a basic set. After some time it was taken over by Jozsef and renamed ipset. Many types of sets are now handled.

ipset 6.x is the current version and features an impressive number of sets.


The communication between kernel and userspace is done via netlink.

It is not possible to delete a set while it is referenced in the kernel by iptables. This may appear to be a problem, but it is possible to use the renaming and swapping operations to work around the issue.

The set types are numerous, as listed on the ipset features page.
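
One way to use swapping to replace the content of a referenced set is sketched below. The helper only builds the command lines and does not run them; its name and the `-tmp` suffix are assumptions. The `create`, `add`, `swap` and `destroy` subcommands are regular ipset commands.

```python
def reload_commands(name, set_type, members):
    """Build the ipset commands that replace the content of `name`.

    The new content is loaded into a temporary set, swapped with the
    live one, and the temporary set (now holding the old content and
    not referenced by any rule) is destroyed.
    """
    tmp = name + "-tmp"   # hypothetical naming convention
    commands = ["ipset create %s %s" % (tmp, set_type)]
    commands += ["ipset add %s %s" % (tmp, member) for member in members]
    commands.append("ipset swap %s %s" % (tmp, name))
    commands.append("ipset destroy %s" % tmp)
    return commands


plan = reload_commands("blacklist", "hash:ip", ["192.0.2.1", "192.0.2.2"])
```

The iptables rules keep referencing `blacklist` the whole time, so the live set never has to be deleted.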

Using different sets, it is possible to express a global policy in a few iptables rules.

Thanks to Florian Westphal, it is possible to access sets from tc.

Future of ipset

It will soon be possible to have per-element counters, which will allow some interesting accounting.

Mar 10, 2013


When asked about IPv6 NAT, Harald Welte used to answer: “it will be over my dead body”. It is now available in the official kernel.

Reasons for adding IPv6 NAT

  • Dynamic IPv6 prefixes: ISPs assign dynamic IPv6 prefixes, so internal network addresses change. NAT can bring stability.
  • Easier test setup.
  • Users are asking for it, and most operating systems have it.

To sum up the arguments about NAT, Patrick McHardy showed a video.

A single, clean, bug-free implementation is better than a lot of incomplete, unofficial implementations that were already available.

Release and implementation

It was implemented in 2011 and merged in Linux 3.7. It is mainly based on the IPv4 implementation and provides, from a user perspective, an interface similar to IPv4. It also features an implementation of RFC 6296: stateless network prefix translation. It operates on a per-packet basis and is completely stateless. It is implemented as two ip6tables targets, SNPT and DNPT, which have to be used in both directions to handle the modifications required by RFC 6296.
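
The stateless, checksum-neutral mapping of RFC 6296 can be sketched in Python. This is not the kernel code: the function names are invented and the sketch only handles /48 prefixes, where the adjustment is applied to the 16-bit subnet field. The prefixes and addresses used are the worked example from the RFC.

```python
import ipaddress


def _words(address):
    """Split an IPv6 address into eight 16-bit words."""
    raw = address.packed
    return [int.from_bytes(raw[i:i + 2], "big") for i in range(0, 16, 2)]


def _ones_add(a, b):
    """16-bit one's complement addition (end-around carry)."""
    total = a + b
    return (total & 0xFFFF) + (total >> 16)


def translate_prefix(address, from_prefix, to_prefix):
    """Rewrite the /48 prefix of `address`, keeping checksums unchanged."""
    address = ipaddress.IPv6Address(address)
    from_net = ipaddress.IPv6Network(from_prefix)
    to_net = ipaddress.IPv6Network(to_prefix)
    if from_net.prefixlen != 48 or to_net.prefixlen != 48:
        raise ValueError("this sketch only handles /48 prefixes")
    if address not in from_net:
        raise ValueError("address is not inside the source prefix")

    words = _words(address)
    from_sum = 0
    for word in _words(from_net.network_address)[:3]:
        from_sum = _ones_add(from_sum, word)
    to_sum = 0
    for word in _words(to_net.network_address)[:3]:
        to_sum = _ones_add(to_sum, word)

    # adjustment = from_sum - to_sum, in one's complement arithmetic
    adjustment = _ones_add(from_sum, ~to_sum & 0xFFFF)
    subnet = _ones_add(words[3], adjustment)
    if subnet == 0xFFFF:
        subnet = 0x0000          # RFC 6296 maps 0xFFFF to 0x0000
    new_words = _words(to_net.network_address)[:3] + [subnet] + words[4:]
    return ipaddress.IPv6Address(b"".join(w.to_bytes(2, "big") for w in new_words))


outside = translate_prefix("fd01:203:405:1::1234", "fd01:203:405::/48", "2001:db8:1::/48")
inside = translate_prefix(outside, "2001:db8:1::/48", "fd01:203:405::/48")
```

Here `outside` is 2001:db8:1:d550::1234, as in the RFC example, and translating it back gives the original internal address. No per-connection state is needed, which is why the same computation has to be applied in both directions.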

Current work

An implementation of NAT64 is currently in progress and should be available in the coming weeks.

Mar 10, 2013


nftables is a kernel packet filtering framework meant to replace iptables. It brings no changes to the core (conntrack, hooks). The match logic is changed: you fetch keys and, once you have your key set, you perform operations on them. Advanced and specialized matches are built upon this system.
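
The "fetch keys, then operate on them" logic can be modelled in a few lines of Python. This is a toy model, not the nftables implementation: the tuple encoding and the expression names ("payload", "cmp", "verdict") are assumptions chosen for illustration, and the packet is assumed to be a raw IPv4 packet without IP options.

```python
def evaluate(rule, packet):
    """Evaluate a rule made of small expressions against packet bytes.

    Returns the verdict string if every comparison matches, else None.
    """
    registers = {}
    for expression in rule:
        kind = expression[0]
        if kind == "payload":        # fetch a key from the packet
            _, register, offset, length = expression
            registers[register] = bytes(packet[offset:offset + length])
        elif kind == "cmp":          # operate on a previously fetched key
            _, register, value = expression
            if registers.get(register) != value:
                return None          # mismatch: the rule does not apply
        elif kind == "verdict":
            return expression[1]
        else:
            raise ValueError("unknown expression: %r" % (kind,))
    return None


# Hypothetical rule: drop TCP packets sent to port 22.
DROP_SSH = [
    ("payload", 1, 9, 1),                    # IPv4 protocol field
    ("cmp", 1, b"\x06"),                     # TCP
    ("payload", 2, 22, 2),                   # TCP destination port (20 + 2)
    ("cmp", 2, (22).to_bytes(2, "big")),
    ("verdict", "drop"),
]
```

A specialized match is then just a particular sequence of such fetch and compare steps, rather than a dedicated extension.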

nftables vs iptables

In iptables, extensions were coded in separate files and had to be put in the iptables source tree. To act, they must modify a binary array storing the ruleset and inject it back into the kernel. So every update involves a full download and upload of the whole ruleset. nftables works on a message basis (messages are exchanged via netlink) and thus allows better handling of incremental modifications.
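
The cost difference between the two update models can be illustrated with a small Python model. The class and method names are invented, and counting "rules transferred" is a simplification of what actually crosses the kernel boundary.

```python
class BlobRuleset:
    """iptables-like model: the ruleset is exchanged as one whole blob."""

    def __init__(self, rules):
        self._rules = list(rules)
        self.transferred = 0          # rules copied between kernel and userspace

    def dump(self):
        self.transferred += len(self._rules)
        return list(self._rules)

    def load(self, rules):
        self.transferred += len(rules)
        self._rules = list(rules)


def blob_append(ruleset, rule):
    """Appending one rule means download, modify, upload."""
    rules = ruleset.dump()
    rules.append(rule)
    ruleset.load(rules)


class MessageRuleset:
    """nftables-like model: each change is one message."""

    def __init__(self, rules):
        self._rules = list(rules)
        self.transferred = 0

    def send(self, operation, rule):
        self.transferred += 1
        if operation == "add":
            self._rules.append(rule)
        elif operation == "delete":
            self._rules.remove(rule)
        else:
            raise ValueError("unknown operation: %r" % (operation,))


existing = ["rule-%d" % i for i in range(1000)]
blob = BlobRuleset(existing)
blob_append(blob, "new-rule")
messages = MessageRuleset(existing)
messages.send("add", "new-rule")
```

In this model, adding one rule to a 1000-rule set moves 2001 rules with the blob approach (1000 down, 1001 up) and a single rule with the message approach.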

Nftables usage

nftables will provide a high-level library which can be used to manipulate the ruleset in dedicated tools. From userspace, backward compatibility is ensured by utilities fully compatible with iptables and ip6tables. Even scripts will not have to be changed. But there are some new things brought by the change:
  • event notifications: a program can listen to rule changes and log them. This is an interesting feature, as traceability is often required in secure environments
  • Better incremental rule update support
  • Enable or disable the chains you want per table (this will provide a performance optimisation)

There is work in progress on a new utility, nft. It will provide a new syntax that will allow more efficient matching. It will be possible to combine several keys into a tuple and do high-speed matching on them.