Pablo Neira Ayuso, Netfilter summary of changes since last workshop

Pablo Neira Ayuso has made a panorama of Netfilter changes since last workshop.

On user side, the first main change to be published after last workshop, is libnetfilter_cttimeout. It allows you to define different timeout policies and to apply them to connections by using the CT target.

An other important new “feature” is a possibility to disable to automatic helper assignment. More information on
Secure use of iptables and connection tracking helpers.

The ‘fail-open’ option of Netfilter allow to change the default behavior of kernel when packet are queued. When the maximum length of a queue is reached, the kernel is dropping by default all incoming packets. For some software, this is not the desired behavior. The ‘fail-open’ socket option allow you to change this and to have the kernel accept the packets if they can not be queued.

An other interesting feature is userspace cthelper. It is now possible to implement a connection tracking helper in userspace.

nfnetlink_queue has now CT support. This means it is possible to queue a packet with the conntrack information attached to it.

IPv6 NAT has been added by Patrick McHardy.

In october 2012, the old QUEUE target has been removed. A switch to NFQUEUE is now required.

connlabel has been added to kernel to provide more category than what was possible by using connmark.

A new bpf match has been committed but the final part in iptables part is missing.

Logging has been improved in helpers. They can reject packets and the user did not have the packet data and can not understand the reason of the drop. The nf_log_packet infrastructure can now be used to log the reason of the drop (with the packet). This should help user to understand the reason of the drops.

Martin Topholm: DDoS experiences with Linux and Netfilter

Martin is working for one.com a local ISP and is facing some DDoS. SYN cookie was implemented but the performance were too low with performance below 300kpps which is not what was expected. In fact SYN is on a slow path with a single spin lock protecting the SYN backtrack queue. So the system behave like a single core system relatively to SYN attacks.

Jesper Dangaard Brouer has proposed a patch to move the syn cookie out of the lock but it has some downside and could not be accepted. In particular, the syncookie system needs to check every type of packet to see if they belong to a previous syn cookie response and thus a central point is needed.

Alternate protection methods include using filtering in Netfilter. Regarding the performance, connection tracking is very costly as it split the packets rate by 2. With conntrack activated, the rate was 757 kpps and without conntrack it was 1738 kpps.

A Netfilter module implementing offloading of SYN cookies is proposed. The idea is to fake the SYN ACK part of the TCP handshake in the module which act as a proxy for the initiation of the connection. This would allow to treat syn cookie algorithm via a small dedicated table and will provided better performances.

David Miller: routing cache is dead, now what ?

The routing cache was maintaining a list of routing decisions. This was an hash table which was highly dynamic and was changing due to traffic. One of the major problem was the garbage collector. An other severe issue was the possibility of DoS using the increase

The routing cache has been suppressed in Linux 3.6 after a 2 years effort by David and the other Linux kernel developers. The global cache has been suppressed and some stored information have been moved to more separate resources like socket.

There was a lot of side effects following this big transformation. On user side, there is no more “neighbour cache overflow” thanks to synchronized sizes of routing and neighbour table.

Metrics were stored in the routing cache entry which has disappeared. So it has been necessary to introduce a separate TCP metrics cache. A netlink interface is available to update/delete/add entry to the cache.

A other side effect of these modifications is that, on TCP socket, xt_owner could be used on input socket but the code needs to be updated.

On security side, the Reverse path filtering has been updated. When activated it is causing up to two extra FIB lookups But when deactivated there is now no overhead at all.

Fabio Massimo Di Nitto: Kronosnet.org

Kronosnet is a “I conceived it when drunk but it works well” VPN implementation. It is using an Ether TAP for the VPN to provide a lyaer 2 vpn. To avoid reinventing the wheel, it is delegating most of the work to the kernel. It supports multilink and redundancy of servers. On multilink side, 8 links can be done per-host to help redundancy.

One of the use of this project is the creation of private network in the cloud as it can be easily setup to provide redundancy and connection for a lot of clients (64k simultaneous clients). And because a layer 2 VPN is really useful for this type of usage.

Configuration is made via a CLI comparable to the one of classical routers.

Fabio has run a demonstration on 4 servers and shows that cutting link has no impact on a ping flood thanks to the multilink system.

Eric Leblond: ulogd2, Netfilter logging reloaded

Introduction

I’ve made yesterday a presentation of ulogd2 at Open Source Days in Copenhagen. After a brief history of Netfilter logging, I’ve described the key features of ulogd2 and demonstrate two interfaces, nf3d and djedi.

The slides are available:
Ulogd2, Netfilter logging reloaded.

Screencasts

This video demonstrates some features of nf3d:

This screencast is showing some of the capabilities of djedi:

Thanks a lot to the organizers for this cool event.

Jan Engelhardt, Xtables2: Packet Filter Evolved

Introduction

Iptables duplicate work for each family and is using a socket protocol which is far too static. Xtables2 is an ongoing effort to evolve the packet filter.
It aims at providing finer frained modification (and not the whole ruleset modification).

Capabilities

  • rule packing: increase cache hit.
  • family independent: no more IPv4 and IPv6 specific code. Only the hook remains specific as they are dependant of core network.
  • xt extension support
  • atomic replace support

xtables syntax is quite similar but not the same. libxtadm is a high-level library for ruleset inspection/manipulation.

More info: xtables.de

Daniel Borkmann: Packets Sockets, BPF and Netsniff-NG

PF_PACKET introduction

This is access to raw packet inside Linux. It is used by libpcap and by other projects like Suricata.
PF_PACKET performance can be improved via dedicated features:

  • Zero-copy RX/TX
  • Socket clustering
  • Linux socket filtering (BPF)

BPF architecture looks like a small virtual machine with register and memory stores. It has different instructions and the kernel has its own kernel extensions to access to cpu number, vlan tag.

Netsniff-NG

Netsniff-ng is a set of minimal tools:

  • netsniff-ng, a high-performance zero-copy analyzer, pcap capturing and replaying tool
  • trafgen, a high-performance zero-copy network traffic generator
  • mausezahn, a packet generator and analyzer for HW/SW appliances with a Cisco-CLI
  • bpfc, a Berkeley Packet Filter (BPF) compiler with Linux extensions
  • ifpps, a top-like kernel networking and system statistics tool
  • flowtop, a top-like netfilter connection tracking tool
  • curvetun, a lightweight multiuser IP tunnel based on elliptic curve cryptography
  • astraceroute, an autonomous system (AS) trace route utility

netsniff-ng can be used to capture with advanced option like using a dedicated CPU or using a BPF filter compiled via bpfc instead of a tcpdump like expression.

trafgen can used to generate high speed traffic, using multiple CPUs, and complex configuration/setup even including fuzzing.
A description of the packet can be given where each element is built using different functions.
It can even be combined with tc (for example netem to simulate specific network condition).

The future include to find a way to utilize multicore efficiently with packet_fanout and disk writing.

Tomasz Bursztyka, ConnMan usage of Netfilter: a close overview

Introduction

ConnMan is a connection manager which integrate all critical networking components. It provides a smart D-Bus API to develop an User Interface. It is plugin oriented and all different network stacks are implemented in different modules.
Connection sharing (aka tethering) is using Netfilter to setup NAT masquerading. So it is a simple usage.

Switching to nftables

Application connectivity is a more advanced part involving Netfilter as it makes a use of statistics and differenciated routing. For example, in a car, service data must be sent to manufacturer operator and not on the owner network.

To do so a session system has been implemented. Application can be modified to open a session to ConnMan. This allow to define a per-session policy for routing and accounting.

ConnMan team wanted to use a C API to do rules modification but this was difficult with iptables and xtables. This is not an official API so it is subject to bugs and change.

ConnMan team has then switch to nftables and is currently working on stabilizing nftables to ensure the acceptation of the project and of the maintainability of their solution in the long time. This work is not yet upstream but there is good chance it will be accepted.

Julien Vehent, AFW: Automating host-based firewalls with Chef

The problem

Centralized firewall design does not scale well when dealing with a lot of servers. It begins to collapse after a few thousands rules.
Furthermore, to be able to have an application A to connect to server B, it would take a workflow and possibly 3 weeks to get the opening.

From Service Oriented Architecture to Service Oriented Security

Service are autonomous. They call each other using a standard protocol. The architecture is described by a list of dependencies between services.
You can then specify security via things like ACCEPT Caching TO Frontend ON PORT 80.
But this force you to do provisioning each time a server start.

Using Chef for firewall provisioning

Chef is an open source automation server that is queried by server to get information about their configuration.

Chef maintains a centralized database of all services and so it can derived from that the security policy. So if we manage to express the contraint in term of Netfilter rules, we will be able to build a firewall policy.

AFW is just that.

AFW is using a service oriented syntax where object like destination can be specified by doing a Chef search. Thus, each time Chef is launched, the object get reevaluated and the filtering rules is updated to the current state of the network.

Custom rules can be added using traditional iptables syntax. AFW is writing outbound rules using the owner target. This allow to define a policy per service and this policy can be easily searched by the developper by simply looking user policy.

Limitations

The JSON document pushed to the chef node contains everything and it can be modified by the server. So it is possible for a server to trigger opening of ports by changing some of its parameters. A solution is to split chef in a chef for configuration and a chef for policy.

Jozsef Kadlecsik, Faster firewalling with ipset

Why ipset ?

iptables is enough sufficient but in some cases limit are found:

  • High number of rules: iptables is linear
  • Need to change the rules often

Independant study available at d(a)emonkeeper’s purgatory has shown that the performance of ipset are almost constant with respect to the number of filtered hosts:

History

The originating project was ippool featuring a a basic set and after some time it has been taken over by Jozsef and renamed ipset. A lot of type of sets are now handled.

ipset 6.x is the current version and features an impressive number of sets.

Architecture

The communication between kernel and userspace is made via netlink

It is not possible to delete a set if it is referenced in kernel by iptables. So it may be appear as a problem but it is possible to use renaming and swapping operation to fix the issue.

The set type are numerous as pointed out here: ipset feature

Using different sets, it is possible to express a global policy in a few iptables rules.

Thanks to Florian Westphal, it is possible to access to set in tc.

Future of ipset

It will soon be possible to have per-element counters and this will allow to do some interesting accounting.