Mar 112013
 

Suricata and Netfilter can be better friend as they are doing some common work like decoding packet and maintaining flow table.

In IPS mode, Suricata is receiving raw packet from libnetfilter_queue. It has to made the parsing of this packet but this kind of thing has also been done by kernel. So it should be possible to avoid to duplicate the work.

In fact Netfilter work is limited as ipheaders srtucture are used. Patrik McHardy proposed that Netfilter offset but this is not the most costly part.

The flow discussion was more interesting because conntrack is really doing a similar work as the one done by Suricata. Using the CT extension of libnetfilter_queue, Suricata will be able to get access to all the CT information. And at a first glance, it seems it contains almost all information needed. So it should be possible to remove the flow engine from suricata. The garbage operation would not be necessary as Suricata will get information via NFCT destroy event.

Jozsef Kadlecsik proposed to use Tproxy to redirect flow and provide a “socket” stream instead of individual packet to Suricata. This would change Suricata a lot but could provide a interesting alternative mode.

Dec 132012
 

A huge work

Suricata 1.4 has been released December 13th 2012 and it has been a huge work. The number of modifications is just impressing:

390 files changed, 25299 insertions(+), 11982 deletions(-)

The following video is using gource to display the evolution of Suricata IDS/IPS source code between version 1.3 and version 1.4. It only displays the modified files and do not show the files existing at start.

A collaborative work

A total of 11 different authors have participated to this release. The following graph generated by gitstats shows the number of lines of code by author:

Some words about activity
The activity shows that most of the work is done during week day but there is some work done on sunday: As shown the following graph, the activity as decreased during the stabilization process:
0
0
0
0
0
1
0
0
6
7
9
10
5
11
11
11
15
51
15
35
23
22
16
24
10
29
32
26
18
16
6
8
3231302928272625242322212019181716151413121110987654321
Dec 052012
 

Introduction

Kernel oops have been reported by some users running Suricata with AF_PACKET multiple thread capture activated. This is due to a bug I’ve introduced in AF_PACKET when fixing an other bug.

Which kernel not to use with Suricata in AF_PACKET mode

The following kernel version will surely crash if Suricata or any other program is used with AF_PACKET capture with multiple capture threads:
  • Linux 3.2.30 to 3.2.33
  • Linux 3.4.12 to 3.4.18
  • Linux 3.5.5 to 3.5.7
  • Linux 3.6.0 to 3.6.6

If only one capture thread is used there is no risk of crash. If you are running a vulnerable kernel, your configuration should looks like:

af-packet:
  - interface: eth0
    # Number of receive threads (>1 will crash with bad kernel)
    threads: 1

Some explanations

AF_PACKET capture is one of my favorite capture method on Linux with Suricata. It is mainline and it offers really good performance with latest kernel. For example, this is deployed on one of our box and run at 10Gbps speed on non-expensive hardware. This speed is achieved by using load-balanced capture. This feature allows to have multiple thread/process listening to the same interface. The load-balancing is made by the kernel. This feature has been developed by David Miller and is available since Linux 3.1.

In summer 2012, I’ve worked on adding AF_PACKET IPS mode and I’ve discovered a kernel bug which was causing a packet sending loop in the IPS code. I’ve proposed a fix af_packet: don’t emit packet on orig fanout group. The patch has reached mainline with Linux 3.6. As it was fixing a real problem it was propagated to most Linux stable branches. Some distributions, like Ubuntu, have also backported the patch to their official kernel.

But the patch was buggy and some Suricata users have reported kernel oops. I’ve fixed the bug af-packet: fix oops when socket is not present and the patch will be available in Linux 3.7. The kernel stable team has incorporated this patch in their subsequent releases so most stable branches but 3.5 don’t suffer anymore of this problem.

Note

Ubuntu Quantal has a patched kernel since at least 3.5.0-25.

Nov 152012
 

The naive approach would consider that an IDS is just taking packet and doing a lot of matching on it. In fact, this is not at all what is happening. An IDS/IPS like Suricata is in fact rebuilding the data stream and in case of known protocols it is even normalizing the data stream and providing keyword which can be used to match on specific field of a protocol.

Let’s say, we a rule to match on a HTTP request where method is GET and the URL is “/download.php”.

But what happen inside Suricata when want to do such a match ? Let’s try to visualize this matching on a fictive example packets stream. With such a stream, we could have the following process: Flow reconstruction by Suricata

If you click on the image, you will get access to an interactive svg showing the alerting signatures.

A series of 7 IP packets are seen. They belong to the same flow but because of fragmentation they need to be assembled by the defrag engine. The #4 packet is invalid due to an invalid checksum. Suricata can alert on this packet if the following rule is activated:

alert ip any any -> any any (msg:"SURICATA IPv4 invalid checksum"; ipv4-csum:invalid; sid:2200073; rev:1;)
This rule is included in the provided signature file decoder-events.rules.

To match on individual TCP packet after defragmentation, one can use the following rule:

alert tcp-pkt any any -> any 80 (msg:"HTTP dl"; content:"Get /download.php"; sid:1; rev:1;)
It will try to find a per-packet match. This means that if the request is cut in two parts, there will be no detection.

At the TCP level, we’ve got three packets but one of them is invalid because of an invalid TCP windows. Suricata can alert on this by using the following rules:

alert tcp any any -> any any (msg:"SURICATA STREAM ESTABLISHED packet out of window"; stream-event:est_packet_out_of_window; sid:2210020; rev:1;)
This rule is included in the provided signature file stream-events.rules.

So at the stream level, we’ve got data made of packets #1, #2, #3 and #5. A matching rule at this level would be:

alert tcp any any -> any any (msg:"HTTP download"; flow:established,to_server; content:"Get /download.php";)
This is a case sensitive rule and this is not resistant to basic transformation on the request such as adding spaces between Get and the URI.

The data is part of an HTTP stream and it is normalized to avoid any application level manipulation that could alter the detection by signatures. In Suricata, we can use the following rule to have a match on normalized field:

alert http any any -> any any (msg:"Download"; content: "GET"; http_method; content: "/download.php"; http_uri)
The match is made when the traffic is identified as HTTP and we want the HTTP request method to be GET and the URL to match “/download.php”. We’ve got here one of the biggest advantage of Suricata, dedicated keywords and protocol recognition allow to write rule which are almost direct expression of our thinking.

I’m sure most of you are happy not to have this job cleanly done by Suricata!

Oct 232012
 

lstlisting is a convenient way to display code when using latex. It has no definition for suricata rules language and I’ve cooked one:

\lstdefinelanguage{suricata}
{morekeywords= {alert, tcp, http, tls, ip, ipv4, ipv4, drop, pass, sid, priority, rev, classtype, threshold, metadata, reference, tag, msg, content, uricontent, pcre, ack, seq, depth, distance, within, offset, replace, nocase, fast\_pattern, rawbytes, byte\_test, byte\_jump, sameip, ip\_proto, flow, window, ftpbounce, isdataat, id, rpc, dsize, flowvar, flowint, pktvar, noalert, flowbits, stream\_size, ttl, itype, icode, tos, icmp\_id, icmp\_seq, detection\_filter, ipopts, flags, fragbits, fragoffset, gid, nfq\_set\_mark, tls.version, tls.subject, tls.issuerdn, tls.fingerprint, tls.store, http\_cookie, http\_method, urilen, http\_client\_body, http\_server\_body, http\_header, http\_raw\_header, http\_uri, http\_raw\_uri, http\_stat\_msg, http\_stat\_code, http\_user\_agent, ssh.protoversion, ssh.softwareversion, ssl\_version, ssl\_state, byte\_extract, file\_data, dce\_iface, dce\_opnum, dce\_stub\_data, asn1, filename, fileext, filestore, filemagic, filemd5, filesize, l3\_proto, luajit},
otherkeywords={ipv4-csum, tcpv4-csum, tcpv6-csum, udpv4-csum, udpv6-csum, icmpv4-csum, icmpv6-csum, decode-event, app-layer-event, engine-event, stream-event},
sensitive=true,
morecomment=[l]{//},
morecomment=[s]{/*}{*/},
morestring=[b]",
}
To use it, you can simply add this code at start of your tex file and you can then use it:
\begin{lstlisting}[language=suricata]
alert tcp any 21 -> any any (msg:"Overlap data";   \
  flow:to_client; dsize:>0;                        \
  stream-event:reassembly_overlap_different_data;  \
  classtype:protocol-command-decode; sid:1; rev:1;)
\end{lstlisting}

which give you the following result:

By the way, the lst of keywords has been obtained by running the till now hidden command:

suricata --list-keywords

Oct 092012
 

Introduction

Some times ago, I’ve blogged about new IPS features in Suricata 1.1 and did not find at the time any killer application of the nfq_set_mark keyword. When using Suricata in Netfilter IPS mode, this keyword allows you to set the Netfilter mark on the packet when a rule match. This mark can be used by Netfilter or by other network subsystem to differentiate the treatment to apply to the packet.

It takes some time but I’ve found the idea. People around me know I don’t like Microsoft Office. One of the main reason, is that this is as awful as LibreOffice or OpenOffice.org but, on top of that, you’ve got pay for it. And, even if I don’t consider that people sending MS-Word file on the network should pay for that, I think they should at least benefit from a really slow bandwidth. I thus decided to implement this with Suricata and Netfilter.

If I’ve already told you that we will use the nfq_set_mark to mark the packet, one big task remains: how can I detect when someone upload a MS-Word file ? The answer is in the file extraction capabilities of Suricata. My preferred IDS/IPS is able to inspect HTTP traffic and to extract files that are uploaded or downloaded. You can then access to information about them. One of the new keyword is filemagic which return the same output as if you had run the unix file command on the file.

The setup

Now everything is in place. We just need to:

  • Write a signatures to mark packet relative to upload
  • Set up Suricata as IPS
  • Set up Netfilter to send all HTTP traffic to Suricata

The signature is really simple. We want HTTP traffic and a transferred file which is of type “Composite Document File V2 Document”. When this is the case, we alert and set the mark to 1:

alert http any any -> any any (msg:"Microsoft Word upload"; \
          nfq_set_mark:0x1/0x1; \
          filemagic:"Composite Document File V2 Document"; \
          sid:666 ; rev:1;)

Let suppose we’ve saved this signature into word.rules. Now we can start suricata with:

suricata -q 0 -S word.rules

This is cool, we now have a single packet that will be marked. But, as we want to slow down all the connection, we need to propagate the mark. We call Netfilter to the rescue and our friend here is CONNMARK that will transfer the mark to all packets:

iptables -A PREROUTING -t mangle -j CONNMARK --restore-mark
iptables -A POSTROUTING -t mangle -j CONNMARK --save-mark

If we are masochistic enough to try to slow yourself down, we need to add the following line to take care of OUTPUT packets:

iptables -A OUTPUT -t mangle -j CONNMARK --restore-mark

Now that the marking is done, let’s suppose that eth0 is the interface to Internet. In that case, we can setup the Quality of Service. We create a HTB disc and create a 1kbps class for our MS-Word uploader:

tc qdisc add dev eth0 root handle 1: htb default 0
tc class add dev eth0 parent 1: classid 1:1 htb rate 1kbps ceil 1kbps

Now, we can do the link with our packet marking by assigning packet of handle 1 fw to the 1kpbs class (flowid 1:1):

tc filter add dev eth0 parent 1: protocol ip prio 1 handle 1 fw flowid 1:1

The job is almost done, last step is to send the port 80 packet to suricata:

iptables -I FORWARD -p tcp –dport 80 -j NFQUEUE
iptables -I FORWARD -p tcp –sport 80 -j NFQUEUE

If we are sadistic, we can also treat the local traffic:

iptables -I OUTPUT -p tcp –dport 80 -j NFQUEUE
iptables -I INPUT -p tcp –sport 80 -j NFQUEUE

That’s all with that setup, all MS-Word uploader share a 1kbps bandwith.

Effort should pay

Even among the MS-Word users, there is some people with brain and they will try to escape the MS-Word curse by changing the extension of their document before uploading them. Effort must pay, so we will change the setup to provide the 10kbps for uploading. To do so, we add a signature to word.rules that will detect when a MS-Word file is uploaded with a changed extension:

alert http any any -> any any (msg:"Tricky Microsoft Word upload";
                nfq_set_mark:0x2/0x2; \
                fileext:!"doc"; \
                filemagic:"Composite Document File V2 Document";
                filestore;
                sid:667; rev:1;)

End of the task will be to send packet with mark 2 on a privileged pipe:

tc class add dev eth0 parent 1: classid 1:2 htb rate 10kbps ceil 10kbps
tc filter add dev eth0 parent 1: protocol ip prio 1 handle 2 fw flowid 1:2

I’m sure a lot of you think we have been to kind and that the little cheaters must be watch over. We’ve already used the filestore keyword in their signature to put the uploaded file on disk but that is not enough.

Keep an eye on cheaters

The traffic of the cheater must be watch over. To do so, we will send all the packets they exchange inside a pcap file. We will need multiple tools to do so. The first of them will be ipset that we will use to maintain a list of bad guys. And we will use ulogd to store their traffic into a pcap file.

To create the blacklist, we will just do:

ipset create cheaters hash:ip timeout 3600
iptables -A POSTROUTING -t mangle -m mark \
    --mark 0x2/0x2 \
    -j SET --add-set cheaters src --exists
The first line creates a set named cheaters that will contains IP. Every element of the set will stay for 1 hour in the set. The second line send all source IP that send packet which got the mark 2. If the IP is already in the set, it will see its timeout reset to 3600 thanks to the –exists option.

Now we will ask Nefilter to log all traffic of IP of the cheaters set:

iptables -A PREROUTING -t raw \
    -m set --match-set cheaters src,dst \
    -j NFLOG --nflog-group 1

The last step is to use ulogd to store the traffic in a pcap file. We need to modify a standard ulogd.conf to have the following lines not commented:

plugin="/home/eric/builds/ulogd/lib/ulogd/ulogd_output_PCAP.so"
stack=log2:NFLOG,base1:BASE,pcap1:PCAP

Now, we can start ulogd:

ulogd -c ulogd.conf
A PCAP file named /var/log/ulogd.pcap will be created and will contain all the packets of cheaters.

Conclusion

It has been hard. We’ve fight a lot. We’ve coded a lot. But now, they are done! We’re able to know more about them than a Facebook admin does!

Sep 182012
 

Introduction

I’ve been working for the past few days on a new Suricata feature. It is available in Suricata 1.4rc1.

Suricata can now listen to a unix socket and accept commands from the user. The exchange protocol is JSON-based and the format of the message has been done to be generic and it is described in this commit message. An example script called suricatasc is provided in the source and installed automatically when updating Suricata.

Unix socket command

By setting enabled to yes under unix-command in Suricata YAML configuration file, the creation of the socket is now activated:

unix-command:
  enabled: yes
  #filename: custom.socket # use this to specify an alternate file

This provides for now a set of really exciting commands like:

  • shutdown: this shutdown suricata
  • iface-list: list interfaces where Suricata is sniffing packets
  • iface-stat: list statistic for an interface
For example, a typical session with suricatasc will looks like:
# suricatasc 
>>> iface-list
Success: {'count': 2, 'ifaces': ['eth0', 'eth1']}
>>> iface-stat eth0
Success: {'pkts': 378, 'drop': 0, 'invalid-checksums': 0}

Useful but not really sexy for now. But the main part is the dedicated running mode.

Unix socket running mode

This mode is one of main motivation behind this code. The idea is to be able to ask to Suricata to treat different pcap files without having to restart Suricata between the files. This provides you a huge gain in time as you don’t need to wait for the signature engine to initialize.

To use this mode, start suricata with your preferred YAML file and provide the option –unix-socket as argument:

suricata -c /etc/suricata-full-sigs.yaml --unix-socket

Then, you can use the provided script suricatasc to connect to the command socket and ask for pcap treatment:

root@tiger:~# suricatasc
>>> pcap-file /home/benches/file1.pcap /tmp/file1
Success: Successfully added file to list
>>> pcap-file /home/benches/file2.pcap /tmp/file2
Success: Successfully added file to list

Yes, you can add multiple files without waiting the result: they will be sequentially processed and the generated log/alert files will be put into the directory specified as second arguments of the pcap-file command. You need to provide absolute path to the files and directory as suricata don’t know from where the script has been run.

To know how much files are waiting to get processed, you can do:

>>> pcap-file-number
Success: 3

To get the list of queued files, do:

>>> pcap-file-list
Success: {'count': 2, 'files': ['/home/benches/file1.pcap', '/home/benches/file2.pcap']}

suritasc script is just intended to be a basic tool to send commands. The protocol has been choozen to be simple so people can easily build their own scripts.

Some words about the protocol

The client connect to the socket and sends its protocol version:

{ "version": "$VERSION_ID" }
The server sends an answer:
{ "return": "OK|NOK" }
If server returns OK, the client is now allowed to send command.

The format of command is the following:

{
   "command": "$COMMAND_NAME",
   "arguments": { $KEY1: $VAL1, ..., $KEYN: $VALN }
}
For example, to ask suricata to treat a pcap, the command is something like:
{
  "command": "pcap-file",
  "arguments": { "filename": "smtp-clean.pcap", "output-dir": "/tmp/out" }
}
The server will try to execute the “command” specified with the (optional) provided “arguments”. The answer by server is the following:
{
  "return": "OK|NOK",
  "message": JSON_OBJECT or information string
}

Building it

This new features use jansson the encoding/decoding JSON and it needs thus to be installed on your system before you can do the build. On Debian (testing or sid) or Ubuntu, you can install libjansson-dev

Now, you can proceed with your normal build. For example:

./autogen.sh
./configure
make
make install

You can check if the support will be build in the message at end of configure:

Suricata Configuration:
  AF_PACKET support:                       yes
...
  Unix socket enabled:                     yes

  libnss support:                          no
  libnspr support:                         no
  libjansson support:                      yes

If this is your first install, you may want to run make install-full to get a working configuration of Suricata with Emerging Threats rules being downloaded and setup.

Now, you can just do:

suricata --unix-socket

Conclusion

This new features can be really interesting for people that are using Suricata to parse a large numbers of pcap. But the unix command may be really interesting when more commands will be available. Regarding this don’t hesitate to give me some feedbacks or ideas.

Sep 042012
 

A new Suricata IPS mode

Suricata IPS capabilities are not new. It is possible to use Suricata with Netfilter or ipfw to build a state-of-the-art IPS. On Linux, this system has not the best throughput performance. Patrick McHardy’s work on netlink: memory mapped I/O should bring some real improvement but this is not yet available.

I’ve thus decided to do an implementation of IPS based on AF_PACKET (read raw socket). The idea is based on one of the snort’s running mode. It peers two network interfaces and all packets received from one interface are sent to the other interface (if a signature with drop keyword does not fired on the packet). This requires to dedicate two network interfaces for Suricata but this provide a simple bridge system. As suricata is using latest AF_PACKET features (read load balancing), it was possible to build something really promising.

Victor Julien has commited my implementation which is now available in current git tree and will be part of Suricata 1.4beta1.

Configuration

You need to dedicate two network interfaces for this mode. The configuration is made via some new configuration variable available in the description of an AF_PACKET interface.

For example, the following configuration will create a Suricata acting as IPS between interface eth0 and eth1:

af-packet:
  - interface: eth0
    threads: 1
    defrag: yes
    cluster-type: cluster_flow
    cluster-id: 98
    copy-mode: ips
    copy-iface: eth1
    buffer-size: 64535
    use-mmap: yes
  - interface: eth1
    threads: 1
    cluster-id: 97
    defrag: yes
    cluster-type: cluster_flow
    copy-mode: ips
    copy-iface: eth0
    buffer-size: 64535
    use-mmap: yes

Basically, we’ve got an af-packet configuration with two interfaces. Interface eth0 will copy all received packets to eth1 because of the copy-* configuration variable:

    copy-mode: ips
    copy-iface: eth1

The configuration on eth1 is symetric:

    copy-mode: ips
    copy-iface: eth0

There is some important facts to consider when setting up this mode:

  • The implementation of this mode is dependent of the zero copy mode of AF_PACKET. Thus you need to set use-mmap to yes on both interface.
  • MTU on both interfaces have to be equal: the copy from one interface to the other is direct and packet bigger then the MTU will be dropped by kernel.
  • Set different values of cluster-id on both interfaces to avoid conflict.

The copy-mode variable can take the following values:

  • ips: the drop keyword is honored and matching packet are dropped.
  • tap: no drop occurs, Suricata act as a bridge

Please note, that mode can be different on the peered interfaces.

Current limitation

One other important point is that the threads variable must be equal on both interface. This is due to the fact the peering is done at the socket level: each packet received on a socket is sent to a peered socket listening to the peered interface.

At last but not least, you need at least Linux kernel 3.6 (or 3.4.12) to be able to set a threads value higher than 1. There is a bug in prior kernel that causes an infinite loop (packet being sent from an interface and reread, …). If you can’t wait or can’t made a full upgrade of your kernel you can use my patch which will be part of Linux 3.6 and of Linux 3.4.12.

Running it

Running Suricata in AF_PACKET IPS mode is straight forward. Once the configuration has been updated, you can simply run:

suricata -c /etc/suricata/suricata.yaml --af-packet

Please note that is possible to have normal IDS interface running simultaneously. For example, eth3 could be added to the af-packet configuration and used a regular interface.

Some implementation details

You can find some additional technical information in the commit.

As mentioned below, the peering is done at the socket level. In Suricata, each capture thread is opening a socket and is binding it to the cluster-id of the interface. Once a thread is started, it is peered with an thread listening to the copy-iface interface which socket will be used as a send interface. This system permit to avoid locking issue in workers running mode as only one thread will write to the peered socket at a time because we have only one packet treated by a capture thread at a time.

We need to reuse the interface for a simple reason: if a new dedicated socket was to be used, the packet sent to this socket would be received by Suricata read socket instead of being ignored. Linux kernel implement the natural feature of avoiding to send back packet to the sending socket.

This behavior was implemented in Linux kernel for regular socket but this was not the case for clustered sockets. This is why at least Linux 3.6 or my patch is needed to be able to use a value of threads bigger than one.

Conclusion

This mode is new in Suricata and is is really promising. It has suffered limited testing in high bandwidth environment. So, as usual, feedback is more than welcome. Don’t hesitate to send me a mail or to ask question on Suricata Mailing lists.

Aug 272012
 

Suricata TLS support

Victor Julien has just merged to main tree a branch containing some interesting new TLS related features. They have been contributed by me and Jean-Paul Roliers.

This patchset introduces TLS logging and brings some new keywords to Suricata engine. Here’s the list of all TLS related keywords that are available in latest Suricata git:

  • tls.version: match on version of protocol
  • tls.subject: match on subject of certificate
  • tls.issuerdn: match on issuer DN of certificate
  • tls.fingerprint: match on SHA1 fingerprint of certificate
  • tls.store: store the certificate on disk
You will find detailed explanation below.

TLS handshake parser in Suricata 1.3

In Suricata 1.3, Pierre Chifflier has contributed a TLS handshake parser which allows Suricata to analyse the TLS parameters of a connection. The first available keywords were tls.subject and tls.issuerdn. They allow the writing of rules similar to this one:

alert tls any any -> any any (msg:"forged ssl google user";
            tls.subject:"CN=*.googleusercontent.com";
            tls.issuerdn:!"CN=Google-Internet-Authority"; sid:8; rev:1;)

Here, suricata will alert if a certificate for CN=*.googleusercontent.com is not signed by CN=Google-Internet-Authority.

An other TLS related feature present in Suricata is the tls.version which allow you to match on the version of the protocol.

TLS Features included in the upcoming Suricata 1.4

TLS logging

Suricata can now log information about all TLS handshakes in a custom file. To activate this feature, you can add to your suricata.yaml file:

outputs:
  # a line based log of TLS handshake parameters (no alerts)
  - tls-log:
      enabled: yes  # Log TLS connections.
      filename: tls.log # File to store TLS logs.
      extended: yes # Log extended information like fingerprint

This will create a tls.log file that will contain information about TLS handshake:

08/27/2012-17:32:12.951518 2a01:0e35:1394:5ed0:0212:15ff:fe45:52bd:37957 -> 2001:41d0:0001:9598:0000:0000:0000:0001:443  TLS: Subject='OU=Domain Control Validated, OU=Gandi Standard SSL, CN=home.regit.org' Issuerdn='C=FR, O=GANDI SAS, CN=Gandi Standard SSL CA' SHA1='f3:40:21:48:70:2c:31:bc:b5:aa:22:ad:63:d6:bc:2e:b3:46:e2:5a' VERSION='TLS 1.2'
08/27/2012-17:45:07.812823 2a01:0e35:1394:5ed0:0212:15ff:fe45:52bd:56158 -> 2a00:1450:4007:0803:0020:0000:0400:100c:443  TLS: Subject='C=US, ST=California, L=Mountain View, O=Google Inc, CN=*.googleusercontent.com' Issuerdn='C=US, O=Google Inc, CN=Google Internet Authority' SHA1='c7:51:ea:f3:80:fa:ee:6e:3c:16:28:35:44:51:94:30:2d:76:31:9d' VERSION='TLS 1.1'
08/27/2012-17:45:13.600843 192.168.12.149:50035 -> 199.59.148.21:443  TLS: Subject='C=US, ST=CA, L=San Francisco, O=Twitter, Inc., OU=Twitter Security, CN=tdweb.twitter.com' Issuerdn='C=US, O=DigiCert Inc, OU=www.digicert.com, CN=DigiCert High Assurance CA-3' SHA1='4e:f3:3f:92:86:cf:52:6c:74:e9:fc:40:d0:d7:9d:f2:4e:e4:91:1f'VERSION='TLS 1.1' 

This can be used for statistical study and other usages.

TLS fingerprint match

The new keyword tls.fingerprint allows you to match on the fingerprint of a server certificate.

It allows you to write rules such as:

alert tls any any -> any any (msg:"Regit fingerprint"; 
         tls.subject:"CN=home.regit.org";
         tls.fingerprint:!"f3:40:21:48:70:2c:31:bc:b5:aa:22:ad:63:d6:bc:2e:b3:46:e2:5a"; sid:114; rev:1;)

Here, Suricata will alert when a certificate subject matches the one of home.regit.org but has not the correct fingerprint.

The SHA1 for a certificate can be retrieved from the tls.log file as well as from openssl or a browser.

$ openssl x509 -in cert.pem  -fingerprint
SHA1 Fingerprint=F3:40:21:48:70:2C:31:BC:B5:AA:22:AD:63:D6:BC:2E:B3:46:E2:5A
TLS store keyword

This last keyword permit to store server certificate and works in a similar way as suricata file storage. It does not take any arguments and trigger the storage of the certificate on the filesystem. Our home.regit.org rules can be improved to store the certificate by simply adding tls.store to it.

alert tls any any -> any any (msg:"Regit fingerprint"; 
         tls.subject:"OU=Domain Control Validated, OU=Gandi Standard SSL, CN=home.regit.org";
         tls.fingerprint:!"f3:40:21:48:70:2c:31:bc:b5:aa:22:ad:63:d6:bc:2e:b3:46:e2:5a";
         tls.store; sid:114; rev:1;)

Each certificate chain is stored in a separate PEM file and a meta file is a generated too. For example, a simple mach will create the following files:

  • 1346085544.64714-1.pem
  • 1346085544.64714-1.meta
The content of the meta file similar to this:
TIME:              08/27/2012-18:39:04.064714
SRC IP:            2a01:0e35:1394:5ed0:c147:93c6:8654:70fd
DST IP:            2001:41d0:0001:9598:0000:0000:0000:0001
PROTO:             6
SRC PORT:          53494
DST PORT:          443
TLS SUBJECT:       OU=Domain Control Validated, OU=Gandi Standard SSL, CN=home.regit.org
TLS ISSUERDN:      C=FR, O=GANDI SAS, CN=Gandi Standard SSL CA
TLS FINGERPRINT:   f3:40:21:48:70:2c:31:bc:b5:aa:22:ad:63:d6:bc:2e:b3:46:e2:5a
The whole certificates chain is stored in the PEM file:
-----BEGIN CERTIFICATE-----
MIIE2DCCA8CgAwIBAgIQXszObihffqDbQdTwC+ycCjANBgkqhkiG9w0BAQUFADBB
...
3WCiQrY7P56JbvG4jNJ5H4ERj90hQquHj/8Ej39db/MCCmNgjRLVLdVkW4l5DRCe
XX2Ac2xzZtLpVmft7sE3mQauGjW9tgCtpB/fxQqiA+uq+vm1PE37ukscByM=
-----END CERTIFICATE-----
-----BEGIN CERTIFICATE-----
MIIEozCCA4ugAwIBAgIQWrYdrB5NogYUx1U9Pamy3DANBgkqhkiG9w0BAQUFADCB
...
8/ifBlIK3se2e4/hEfcEejX/arxbx1BJCHBvlEPNnsdw8dvQbdqP
-----END CERTIFICATE-----
This allow you to get all needed information during analysis.

Have fun

All theses features are available for testing in the latest Suricata git. All feedbacks are welcome.
Jul 302012
 

Introduction

Since the beginning of July 2012, OISF team is able to access to a server where one interface is receiving some mirrored real European traffic. When reading "some", think between 5Gbps and 9.5Gbps constant traffic. With that traffic, this is around 1Mpps to 1.5M packet per seconds we have to study.

The box itself is a standard server with the following characteristics:

  • CPU: One Intel(R) Xeon(R) CPU E5-2680 0 @ 2.70GHz (16 cores counting Hyperthreading)
  • Memory: 32Go
  • capture NIC: Intel 82599EB 10-Gigabit SFI/SFP+

The objective is simple: be able to run Suricata on this box and treat the whole traffic with a decent number of rules. With the constraint not to use any non official system code (plain system and kernel if we omit a driver).

The code on the box have been updated October 4th:

  • It runs Suricata 1.4beta2
  • with 6719 signatures
  • and 0% packet loss
  • with setup described below

The setup is explained by the following schema: We want to use the multiqueue system on the Intel card to be able to load balance the treatment. Next goal is to have one single CPU to treat the packet from the start to the end.

Peter Manev, Matt Jonkmann, Anoop Saldanha and Eric Leblond (myself) have been involved in the here described setup.

Detailed method

The Intel NIC benefits from a multiqueue system. The RX/TX traffic can be load-balanced on different interrupts. In our case, this permit to handle a part of the flow on each CPU. One really interesting thing is that the load-balancing can be done with respect to the IP flows. By default, one RX queue is created per-CPU.

More information about multiqueue ethernet devices can be found in the document networking/scaling.txt in the Documentation directory of Linux sources.

Suricata is able to do zero-copy in AF_PACKET capture mode. One other interesting feature of this mode is that you can have multiple threads listening to the same interface. In our case, we can start one threads per queue to have a load-balancing of capture on all our resources.

Suricata has different running modes which define how the different parts of the engine (decoding, streaming, siganture, output) are chained. One of the mode is the ‘workers’ mode where all the treatment for a packet is made on a single thread. This mode is adapted to our setup as it will permit to keep the work from start to end on a single thread. By using the CPU affinity system available in Suricata, we can assign each thread to a single CPU. By doing this the treatment of each packet can be done on a single CPU.

But this does not solve one problem which is the link between the CPU receiving the packet and the one used in Suricata. To do so we have to ensure that when a packet is received on a queue, the CPU that will handle the packet will be the same as the one treating the packet in Suricata. David Miller had already planned this kind of setup when coding the fanout mode of AF_PACKET. One of the flow load balancing mode is flow_cpu. In this mode, the packet is delivered to the same track depending on the CPU.

The dispatch is made by using the formula "cpu % num" where cpu is the cpu number and num is the number of socket bound to the same fanout socket. By the way, this imply you can’t have a number of sockets superior to the number of CPUs. A code study shows that the assignement in the array of sockets is incrementally made. Thus first socket to bind will be assigned to the first CPU, etc.. In case, a socket disconnect from the set, the last socket of the array will take the empty place. This implies the optimizations will be partially lost in case of a disconnect.

By using the flow_cpu dispatch of AF_PACKET and the workers mode of Suricata, we can manage to keep all work on the same CPU.

Preparing the system

The operating system running on is an Ubuntu 12.04 and the driver igxbe was outdated. Using the instruction available on Intel website (README.txt), we’ve updated the driver. Next step was to unload the old driver and load the new one

sudo rmmod ixgbe
sudo modprobe ixgbe FdirPballoc=3

Doing this we’ve also tried to use the RSS variable. But it seems there is an issue, as we still having 16 queues although the RSS params was set to 8.

Once done, the next thing is to setup the IRQ handling to have each CPU linked in order with the corresponding RX queue. irqbalance was running on the system and the setup was correctly made.

The interface was using IRQ 101 to 116 and /proc/interrupts show a diagonale indicating that each CPU was assigned to one interrupt.

If not, it was possible to use the instruction contained in IRQ-affinity.txt available in the Documentation directory of Linux sources.

But the easy way to do it is to use the script provided with the driver:

ixgbe-3.10.16/scripts$ ./set_irq_affinity.sh eth3

Note: Intel latest driver was responsible of a decrease of CPU usage. With Ubuntu kernel version, the CPU usage as 80% and it is 45% with latest Intel driver.

The card used on the system was not load-balancing UDP flow using port. We had to use ‘ethtool’ to fix this

regit@suricata:~$ sudo ethtool -n eth3 rx-flow-hash udp4
UDP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA

regit@suricata:~$ sudo ethtool -N eth3 rx-flow-hash udp4 sdfn
regit@suricata:~$ sudo ethtool -n eth3 rx-flow-hash udp4
UDP over IPV4 flows use these fields for computing Hash flow key:
IP SA
IP DA
L4 bytes 0 & 1 [TCP/UDP src port]
L4 bytes 2 & 3 [TCP/UDP dst port]

In our case, the default setting of the ring parameters of the card, seems to indicate it is possible to increase the ring buffer on the card

regit@suricata:~$ ethtool -g eth3
Ring parameters for eth3:
Pre-set maximums:
RX:             4096
RX Mini:        0
RX Jumbo:       0
TX:             4096
Current hardware settings:
RX:             512
RX Mini:        0
RX Jumbo:       0
TX:             512

Our system is now ready and we can start configuring Suricata.

Suricata setup

Global variables

The run mode has been set to ‘workers’

max-pending-packets: 512
runmode: workers

As pointed out by Victor Julien, this is not necessary to increase max-pending-packets too much because only a number of packets equal to the total number of worker threads can be treated simultaneously.

Suricata 1.4beta1 introduce a delayed-detect variable under detect-engine. If set to yes, this trigger a build of signature after the packet capture threads have started working. This is a potential issue if your system is short in CPU as the task of building the detect engine is CPU intensive and can cause some packet loss. That’s why it is recommended to let it to the default value of no.

AF_PACKET

The AF_PACKET configuration is almost straight forward

af-packet:
  - interface: eth3
    threads: 16
    cluster-id: 99
    cluster-type: cluster_cpu
    defrag: yes
    use-mmap: yes
    ring-size: 300000
Affinity

Affinity settings permit to assign thread to set of CPUs. In our case, we onlSet to have in exclusive mode one packet thread dedicated to each CPU. The setting used to define packet thread property in ‘workers’ mode is ‘detect-cpu-set’

threading:
  set-cpu-affinity: yes
  cpu-affinity:
    - management-cpu-set:
      cpu: [ "all" ]
      mode: "balanced"
      prio:
        default: "low"

  - detect-cpu-set:
      cpu: ["all"]
      mode: "exclusive" # run detect threads in these cpus
      prio:
        default: "high"

The idea is to assign the highest prio to detect threads and to let the OS do its best to dispatch the remaining work among the CPUs (balanced mode on all CPUs for the management).

Defrag

Some tuning was needed here. The network was exhibing some serious fragmentation and we have to modify the default settings

defrag:
  memcap: 512mb
  trackers: 65535 # number of defragmented flows to follow
  max-frags: 65535  # number of fragments

The ‘trackers’ variable was not documented in the original YAML configuration file. Although defined in the YAML, the ‘max-frags’ one was not used by Suricata. A patch has been made to implement this.

Streaming

The variables relative to streaming have been set very high

stream:
  memcap: 12gb
  max-sessions: 20000000
  prealloc-sessions: 10000000
  inline: no                    # no inline mode
  reassembly:
    memcap: 14gb
    depth: 6mb                  # reassemble 1mb into a stream
    toserver-chunk-size: 2560
    toclient-chunk-size: 2560

To detect the potential issue with memcap, one can read the ‘stats.log’ file which contains various counters. Some of them matching the ‘memcap’ string.

Running suricata

Suricata can now be runned with the usual command line

sudo suricata -c /etc/suricata.yaml --af-packet=eth3

Our affinity setup is working as planned as show the following log line

Setting prio -2 for "AFPacketeth34" Module to cpu/core 3, thread id 30415

Tests

Tests have been made by simply running Suricata against the massive traffic mirrored on the eth3 interface.

At first, we started Suricata without rules to see if it was able to deal with the amount of packets for a long period. Most of the tuning was done during this phase.

To detect packet loss, the capture keyword can be search in ‘stats.log’. If ‘kernel_drops’ is set to 0, this is good

capture.kernel_packets    | AFPacketeth315            | 1436331302
capture.kernel_drops      | AFPacketeth315            | 0
capture.kernel_packets    | AFPacketeth316            | 1449320230
capture.kernel_drops      | AFPacketeth316            | 0

The statistics are available for each thread. For example, ‘AFPacketeth315′ is the 15th AFPacket thread bound to eth3.

When this phase was complete we did add some rules by using Emerging Threat PRO rules for malware, trojan and some others:

rule-files:
 - trojan.rules
 - malware.rules
 - chat.rules
 - current_events.rules
 - dns.rules
 - mobile_malware.rules
 - scan.rules
 - user_agents.rules
 - web_server.rules
 - worm.rules

This ruleset has the following characterics:

  • 6719 signatures processed
  • 8 are IP-only rules
  • 2307 are inspecting packet payload
  • 5295 inspect application layer
  • 0 are decoder event only

This is thus a decent ruleset with a high part of application level event which require a complex processing. With that ruleset, there is more than 16 alerts per second (output in unified2 format).

With the previously mentioned ruleset, the load of each CPU is around 60% and Suricata is remaining stable during hours long run.

In most run, we’ve observed some packet loss between capture start and first time Suricata grab the statistics. It seems the initialization phase is not fast enough.

Conclusion

OISF team has access to the box for a week now and has already managed to get real performance. We will continue to work on it to provide the best possible experience to all Suricata’s users.

Feel free to made any remark and suggestion about this blog post and this setup.