Nov 09 2014

Introduction

I’m currently working on a script that parses Suricata EVE log files and tries to detect whether some fields in the log are present in a list of bad patterns. So the script has two parts: reading the log file and searching for strings in a list of strings. This list can be big, with a target of around 20000 strings.

Note: This post may seem trivial to seasoned Python developers, but as I did not manage to find any documentation on this, here is this blog post.

The tests

For my tests I used a 653MB log file containing 896077 lines. Reading this JSON formatted file takes 5.0s. My list of strings was around 3000 elements, far below the targeted size, so a rule of thumb said I would be OK if the script stayed below 6 seconds with the matching added. The first test was a simple Python-style inclusion test, with the hostnames stored in a list:
if event['http']['hostname'] in hostname_list:
For that test the result was 9.5s: not awful, but a bit over my expectations. Just to check, I ran a test with a C-like implementation:
for elt in hostname_list:
   if elt == target:
       # we have a winner
The result was a nice surprise… for Python, with an execution time of 20.20s.

I was beginning to fear that some real development would be needed to reach the speed I wanted, so I gave it one last try. As I only care about whether there is a match, I can transform my list of strings into a Python set, which keeps only unique elements. So I ran the test using:

hostname_set = set(hostname_list)  # built once, before the event loop
if event['http']['hostname'] in hostname_set:

The result was an amazing execution time of 5.20s: only 0.20s were used to check the data against my set of strings.

Explanation

Python sets require their elements to be hashable. This is needed because the internal implementation uses a hash table, the same machinery that powers dictionaries. Looking for an element in a set is therefore equivalent to a hash table lookup, and this is really faster than searching in a list, where no real shortcut is possible.
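
If you want to verify this on your own data, a minimal benchmark sketch along these lines (list size and the sample hostname are arbitrary) shows the gap:

    import timeit

    hostname_list = ['host%d.example.com' % i for i in range(20000)]
    hostname_set = set(hostname_list)  # built once, outside the timed lookup
    target = 'host19999.example.com'   # worst case for the list scan

    # list membership is an O(n) scan
    print(timeit.timeit(lambda: target in hostname_list, number=1000))
    # set membership is a hash lookup, close to O(1)
    print(timeit.timeit(lambda: target in hostname_set, number=1000))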

So if you only care about matching and your elements are hashable, then use a Python set to test for the existence of an object in your collection.

Sep 29 2014

I’ve been lucky enough to give a talk during the third edition of Kernel Recipes. I presented the evolution of nftables during the previous year.

You can get the slides from here: 2014_kernel_recipes_nftables.

Thanks to Hupstream for uploading the video of the talk:

Not much material beyond these slides and the video of the work done during the previous year on nftables and its components.

Sep 24 2014

DOM and SSH honeypot

DOM is a solution comparable to fail2ban, but it uses the Suricata SSH log instead of the SSH server logs. The goal of DOM is to redirect an attacker based on its SSH client version. This allows sending attackers to a honeypot like pshitt directly after their first attempt. And this can be done for a whole network, as Suricata does not need to run on the targeted box.

Using DOM with nftables

I’ve pushed basic nftables support to DOM. Instead of adding elements via ipset, it uses an nftables set. It is simple to use: you just need to add the -n flag to specify which table the set is defined in:
./dom -f /usr/local/var/log/suricata/eve.json -n nat -s libssh -vvv -i -m OpenSSH
To activate the network address translation based on the set, you can use something like:
table ip nat {
        set libssh { 
                type ipv4_addr
        }

        chain prerouting {
                 type nat hook prerouting priority -150;
                 ip saddr @libssh ip protocol tcp counter dnat 192.168.1.1:2200
        }
}
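
DOM fills the set at runtime, but it can also be manipulated by hand with nft, which is handy for testing (the address below is just an example):

# manually add an address to the set
nft add element nat libssh { 192.0.2.1 }
# display the table, including the current set content
nft list table nat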

A complete basic ruleset

Here’s the ruleset running on the box implementing pshitt and DOM:
table inet filter {
        chain input {
                 type filter hook input priority 0;
                 ct state established,related accept
                 iif lo accept
                 ct state new iif != lo tcp dport {ssh, 2200} tcp flags == syn counter log prefix "SSH attempt" group 1 accept
                 iif br0 ip daddr 224.2.2.4 accept
                 ip saddr 192.168.0.0/24 tcp dport {9300, 3142} counter accept
                 ip saddr 192.168.1.0/24 counter accept
                 counter log prefix "Input Default DROP" group 2 drop
        }
}

table ip nat {
        set libssh { 
                type ipv4_addr
        }

        chain prerouting {
                 type nat hook prerouting priority -150;
                 ip saddr @libssh ip protocol tcp counter dnat 192.168.1.1:2200
        }

        chain postrouting {
                 type nat hook postrouting priority -150;
                 ip saddr 192.168.0.0/24 snat 192.168.1.1
        }
}
There is one particularly interesting rule in this ruleset:
ct state new iif != lo tcp dport {ssh, 2200} tcp flags == syn counter log prefix "SSH attempt" group 1 accept
It uses a negative construction to match on the interface: iif != lo means that the input interface is not lo. Note that it also uses an anonymous set to define the port list via tcp dport {ssh, 2200}. That way we have one single rule for normal and honeypot SSH. Last, this rule both logs and accepts, and the logging is done via nfnetlink_log because of the group parameter. This allows ulogd to capture the log messages triggered by this rule.
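Capturing those group 1 messages with ulogd requires a stack bound to the right NFLOG group. A possible ulogd.conf excerpt could look like the following (stack and plugin names follow the standard ulogd2 example configuration; adapt them to your setup):

# one stack reading NFLOG group 1 and writing JSON
stack=log1:NFLOG,base1:BASE,ifi1:IFINDEX,ip2str1:IP2STR,json1:JSON

[log1]
group=1   # must match the group parameter of the log rule

[json1]
sync=1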
Jun 26 2014

Introduction

I’ve been working lately on analyzing and characterizing SSH bruteforce attacks. I was a bit frustrated at only getting partial information:

  • ulogd can give information about scanner settings
  • suricata can give me information about software version
  • sshd server logs show the username
But having the username without the password is really frustrating.

So I decided to try to get them. Looking for an SSH server honeypot, I found kippo, but it went too far for me by providing fake shell access. So I decided to build my own based on paramiko.

pshitt, Passwords of SSH Intruders Transferred to Text, was born. It is a lightweight fake SSH server that collects authentication data sent by intruders. It basically collects usernames and passwords and writes the extracted data to a file in JSON format. For each authentication attempt, pshitt dumps a JSON formatted entry:

{"username": "admin", "src_ip": "116.10.191.236", "password": "passw0rd", "src_port": 36221, "timestamp": "2014-06-26T10:48:05.799316"}
The data can then easily be imported into Logstash (see the pshitt README) or Splunk.
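
To give an idea of the mechanism, here is a simplified sketch (not pshitt’s actual code) of how paramiko allows collecting credentials: you implement a ServerInterface whose password check records the attempt and always rejects it.

    import json
    import socket
    import threading
    from datetime import datetime

    import paramiko

    HOST_KEY = paramiko.RSAKey.generate(2048)  # in practice, load a persistent key

    class CredentialCollector(paramiko.ServerInterface):
        def __init__(self, src_ip, src_port):
            self.src_ip = src_ip
            self.src_port = src_port

        def check_auth_password(self, username, password):
            # record the attempt in the same JSON-per-line format as above
            entry = {"username": username, "password": password,
                     "src_ip": self.src_ip, "src_port": self.src_port,
                     "timestamp": datetime.now().isoformat()}
            with open("passwords.json", "a") as out:
                out.write(json.dumps(entry) + "\n")
            return paramiko.AUTH_FAILED  # always reject: we only want the data

    def handle(client, addr):
        transport = paramiko.Transport(client)
        transport.add_server_key(HOST_KEY)
        transport.start_server(server=CredentialCollector(addr[0], addr[1]))
        transport.join()

    sock = socket.socket(socket.AF_INET, socket.SOCK_STREAM)
    sock.setsockopt(socket.SOL_SOCKET, socket.SO_REUSEADDR, 1)
    sock.bind(("0.0.0.0", 2200))
    sock.listen(100)
    while True:
        client, addr = sock.accept()
        threading.Thread(target=handle, args=(client, addr)).start()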

The setup

As I still want to be able to connect to the box running ssh with a regular client, I needed a setup that automatically redirects the offenders, and only them, to the pshitt server. A simple solution was to use DOM. DOM parses the Suricata EVE JSON log file, in which Suricata gives us the SSH software version of each IP connecting to the SSH server. If DOM sees a software version containing libssh, it adds the originating IP to an ipset set. So the idea of our honeypot setup is simple:
  • Suricata outputs SSH software version to EVE
  • DOM adds IP using libssh to the ipset set
  • Netfilter NAT redirects all IPs of the set to pshitt when they try to connect to our ssh server
Getting the setup in place is really easy. We first create the set:
ipset create libssh hash:ip
then we start DOM so it adds all offenders to the set named libssh:
cd DOM
./dom -f /usr/local/var/log/suricata/eve.json -s libssh
A more accurate setup for dom can be the following. If you know that your legitimate clients are only based on OpenSSH, you can run dom so that it puts in the list all IPs that do not (-i) use an OpenSSH client (-m OpenSSH):
./dom -f /usr/local/var/log/suricata/eve.json -s libssh -vvv -i -m OpenSSH
If we want to list the elements of the set, we can use:
ipset list libssh
Now, we can start pshitt:
cd pshitt
./pshitt
And finally we redirect connections coming from IPs in the libssh set to port 2200:
iptables -A PREROUTING -m set --match-set libssh src -t nat -i eth0 -p tcp -m tcp --dport 22 -j REDIRECT --to-ports 2200

Some results

Here’s an extract of the most used passwords in attempts on the root account, and the same thing for the admin account (charts not reproduced here). Both datasets show around 24 hours of attempts on an anonymous box.

Conclusion

Thanks to paramiko, it was really fast to code pshitt. I’m now collecting data and I think they will help improve the categorization of SSH bruteforce tools.
Jun 11 2014

The slides of my lightning talk at SSTIC are available: Let’s talk about SELKS. The slides are in French and are intended to be humorous.

The presentation is about defensive security, which needs to get sexier. Suricata 2.0 with EVE logging, combined with Elasticsearch and Kibana, can really help reach that goal. If you want to try Suricata and Elasticsearch, you can download and test SELKS.


The talk also presents a small tool named Deny On Monitoring, which demonstrates how easy it is to extract information from Suricata EVE JSON logging.

May 19 2014

Introduction

I’m currently working on Scirius, the web management interface for Suricata developed by Stamus Networks. Scirius is able to fetch IDS signatures from external sources, and the backend stores these elements in a git tree. As Scirius is a Django application, this means we need to interact with git from Python.

Usually the documentation of a Python module is good enough to develop with. This is sadly not the case for GitPython. There is documentation, but the overall quality is not excellent, at least for a non-native Python developer, and some big parts are missing.

Doing a commit

Doing a commit is really simple once you have understood what to do. You need to open the repository and work on its index, which is the object you add files to before committing. In the following example, I want to add everything under the rules directory:

    # assumes "import git" (GitPython) at module level
    repo = git.Repo(source_git_dir)
    index = repo.index
    index.add(["rules"])
    message = 'source version at %s' % (self.updated_date)
    index.commit(message)

Set value in the configuration of a repository

It is possible to edit the configuration of a git repository with GitPython. To do that, you need to get a config writer and use its set_value function. For example, the following code snippet creates a repository and sets user.email and user.name for that repository:

    repo = git.Repo.init(source_git_dir)
    config = repo.config_writer()
    config.set_value("user", "email", "scirius@stamus-networks.com")
    config.set_value("user", "name", "Scirius")

OSError 25: Inappropriate ioctl for device

I’ve encountered this fabulous exception when trying to do a commit in Scirius. The problem only shows up when running the application in wsfcgi mode. It is documented in Issue 39 on GitHub, but no workaround is proposed there.

The error comes from the fact that the function used to guess the identity of the user running the application is called even if the values are set in the config. And this function fails when it is called outside of a real session: it tries to get information from the environment, but these values are not set when the application is started by init. To fix this, it is possible to force the USERNAME environment variable.

Here’s how it is implemented in Scirius:

+    os.environ['USERNAME'] = 'scirius'
    index.add(["rules"])
    message =  'source version at %s' % (self.updated_date)
    index.commit(message)

You can see the diff on GitHub.

Apr 27 2014

I gave a lightning talk about coccigrep at Hackito Ergo Sum to show how it can be used to search code during an audit or a hacking party. Here are the slides: coccigrep: a semantic grep for the C language.

The slides of my talk Suricata 2.0, Netfilter and the PRC will soon be available on Stamus Networks website.

Apr 17 2014

Sending packets with scapy

I’m currently writing some code based on scapy. This code reads data from a possibly huge file and sends a packet for each line in the file, using the contained information. So the code contains a simple loop and uses sendp(), because the frames must be sent at layer 2.

     def run(self):
         filedesc = open(self.filename, 'r')
         # loop on read line
         for line in filedesc:
             # Build and send packet
             sendp(pkt, iface = self.iface, verbose = verbose)
             # Inter packet treatment

With this approach, performance is a bit disappointing. For 18 packets, we get:

    real    0m2.437s
    user    0m0.056s
    sys     0m0.012s

If we strace the code, the explanation is quite obvious:

socket(PF_PACKET, SOCK_RAW, 768)        = 4
setsockopt(4, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0
select(5, [4], [], [], {0, 0})          = 0 (Timeout)
ioctl(4, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
bind(4, {sa_family=AF_PACKET, proto=0x03, if1, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
setsockopt(4, SOL_SOCKET, SO_RCVBUF, [1073741824], 4) = 0
setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1073741824], 4) = 0
getsockname(4, {sa_family=AF_PACKET, proto=0x03, if1, pkttype=PACKET_HOST, addr(6)={772, 000000000000}, [18]) = 0
ioctl(4, SIOCGIFNAME, {ifr_index=1, ifr_name="lo"}) = 0
sendto(4, "\377\377\377\377\377\377\0\0\0\0\0\0\10\0E\0\0S}0@\0*\6\265\373\307;\224\24\300\250"..., 97, 0, NULL, 0) = 97
select(0, NULL, NULL, NULL, {0, 0})     = 0 (Timeout)
close(4)                                = 0
socket(PF_PACKET, SOCK_RAW, 768)        = 4
setsockopt(4, SOL_SOCKET, SO_RCVBUF, [0], 4) = 0
select(5, [4], [], [], {0, 0})          = 0 (Timeout)
ioctl(4, SIOCGIFINDEX, {ifr_name="lo", ifr_index=1}) = 0
bind(4, {sa_family=AF_PACKET, proto=0x03, if1, pkttype=PACKET_HOST, addr(0)={0, }, 20) = 0
setsockopt(4, SOL_SOCKET, SO_RCVBUF, [1073741824], 4) = 0
setsockopt(4, SOL_SOCKET, SO_SNDBUF, [1073741824], 4) = 0
getsockname(4, {sa_family=AF_PACKET, proto=0x03, if1, pkttype=PACKET_HOST, addr(6)={772, 000000000000}, [18]) = 0
ioctl(4, SIOCGIFNAME, {ifr_index=1, ifr_name="lo"}) = 0
sendto(4, "\377\377\377\377\377\377\0\0\0\0\0\0\10\0E\0\0004}1@\0*\6\266\31\307;\224\24\300\250"..., 66, 0, NULL, 0) = 66
select(0, NULL, NULL, NULL, {0, 0})     = 0 (Timeout)
close(4)                                = 0

For each packet, a new socket is opened, and this takes ages.

Speeding up the sending

To speed up the sending, one solution is to build a list of packets and send the whole list via a single sendp() call.

     def run(self):
         filedesc = open(self.filename, 'r')
         pkt_list = []
         # loop on read line
         for line in filedesc:
             # Build and send packet
             pkt_list.append(pkt)
         sendp(pkt_list, iface = self.iface, verbose = verbose)

This is not possible in our case due to the inter-packet treatment we have to do. So the best way is to reuse the socket. This can be done easily once you’ve read the documentation^W code:

@@ -27,6 +27,7 @@ class replay:
     def run(self):
         # open filename
         filedesc = open(self.filename, 'r')
+        s = conf.L2socket(iface=self.iface)
         # loop on read line
         for line in filedesc:
             # Build and send packet
-            sendp(pkt, iface = self.iface, verbose = verbose)
+            s.send(pkt)

The idea is to create a socket via the same function sendp() uses internally, and then to use the send() method of that object to send the packets.
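
Put together, a minimal self-contained version of the pattern might look as follows (the packet built here is purely illustrative, and the file is assumed to contain one destination IP per line):

    from scapy.all import Ether, IP, TCP, conf

    def replay(filename, iface):
        s = conf.L2socket(iface=iface)  # open the layer-2 socket once
        with open(filename) as filedesc:
            for line in filedesc:
                # build an example packet from the line
                pkt = Ether() / IP(dst=line.strip()) / TCP(dport=22)
                s.send(pkt)  # reuse the same socket for every packet
        s.close()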

With that modification, the performance is far better:

    real    0m0.108s
    user    0m0.064s
    sys     0m0.004s

I’m not a scapy expert so ping me if there is a better way to do this.

Mar 08 2014

Some progress on the JSON side

Suricata 2.0-rc2 is out and it brings some progress on the JSON side. Logging of the SSH protocol has been added, and the timestamp format has been updated to be ISO 8601 compliant; the field is now named timestamp instead of time.

Ulogd, the Netfilter logging daemon, has seen a similar change: it now also uses an ISO 8601 compliant timestamp in its JSON output. This feature is available in git and will be part of ulogd 2.0.4.

Thanks to this format change, the integration with logstash or splunk is easier and more accurate. It fixes a problem with the timestamp attached to an event inside the log management tool: in logstash at least, the date used was the time of parsing, which was not really accurate. It could even be a problem when logstash parsed a file containing old entries, because the difference between the two timestamps could be huge.

It is now possible to update the logstash configuration so the timestamp is parsed correctly. After doing this, the internal @timestamp and the timestamp of the event are synchronized.


Logstash configuration


To configure logstash, you simply need to tell it that the timestamp field in the JSON message is a date. To do so, you need to add a date filter:

      date {
        match => [ "timestamp", "ISO8601" ]
      }
A complete logstash.conf would then look like:
input {
   file {
      path => [ "/usr/local/var/log/suricata/eve.json", "/var/log/ulogd.json" ]
      codec =>   json
      type => "json-log"
   }
}

filter {
   if [type] == "json-log" {
      date {
        match => [ "timestamp", "ISO8601" ]
      }
   }
}

output {
  stdout { codec => rubydebug }
  elasticsearch { embedded => true }
}

Splunk configuration


In splunk, auto-detection of the file format fails, and it seems you need to define a type to parse JSON in $SPLUNK_DIR/etc/system/local/props.conf:

[suricata]
KV_MODE = json
NO_BINARY_CHECK = 1
TRUNCATE = 0

Then you can simply declare the log file in $SPLUNK_DIR/etc/system/local/inputs.conf:

[monitor:///usr/local/var/log/suricata/eve.json]
sourcetype = suricata

[monitor:///var/log/ulogd.json]
sourcetype = suricata

You can now search events and build dashboards based on Suricata or Netfilter packet logging.

Feb 24 2014

Nftables logging

If nftables brings a lot of changes on the user side, this is also true in the logging area. There is now a single keyword for logging, log, and this statement uses the Netfilter logging framework. A corollary is that you may not see any log message, even when a rule with log matches, if the Netfilter logging framework has not been configured.

Netfilter logging framework

The Netfilter logging framework is a generic way of logging used in Netfilter components. This framework is implemented in two different kernel modules:

  • xt_LOG: printk based logging, outputting everything to syslog (the same module as the one used for the iptables LOG target). It can only log packets for IPv4 and IPv6.
  • nfnetlink_log: netlink based logging, which requires setting up ulogd2 to get the events (the same module as the one used for the iptables NFLOG target). It can log packets for any family.

To use one of the two modules, you need to load them with modprobe. It is possible to have both modules loaded, and in this case you can then set up logging on a per-protocol basis. The active configuration is available for reading in /proc:

# cat /proc/net/netfilter/nf_log 
 0 NONE (nfnetlink_log)
 1 NONE (nfnetlink_log)
 2 nfnetlink_log (nfnetlink_log,ipt_LOG)
 3 NONE (nfnetlink_log)
 4 NONE (nfnetlink_log)
 5 NONE (nfnetlink_log)
 6 NONE (nfnetlink_log)
 7 nfnetlink_log (nfnetlink_log)
 8 NONE (nfnetlink_log)
 9 NONE (nfnetlink_log)
10 nfnetlink_log (nfnetlink_log,ip6t_LOG)
11 NONE (nfnetlink_log)
12 NONE (nfnetlink_log)
The syntax is FAMILY ACTIVE_MODULE (AVAILABLE_MODULES). Here nfnetlink_log was loaded first and xt_LOG was loaded afterward (xt_LOG is aliased to ipt_LOG and ip6t_LOG).

The protocol family numbers can look a bit strange. They are in fact mapped to the socket family numbers used in the underlying code. The list is the following:

#define AF_UNSPEC	0
#define AF_UNIX		1	/* Unix domain sockets 		*/
#define AF_INET		2	/* Internet IP Protocol 	*/
#define AF_AX25		3	/* Amateur Radio AX.25 		*/
#define AF_IPX		4	/* Novell IPX 			*/
#define AF_APPLETALK	5	/* Appletalk DDP 		*/
#define	AF_NETROM	6	/* Amateur radio NetROM 	*/
#define AF_BRIDGE	7	/* Multiprotocol bridge 	*/
#define AF_AAL5		8	/* Reserved for Werner's ATM 	*/
#define AF_X25		9	/* Reserved for X.25 project 	*/
#define AF_INET6	10	/* IP version 6			*/
#define AF_MAX		12	/* For now.. */

To update the configuration, you need to write to the file corresponding to the family in the /proc/sys/net/netfilter/nf_log/ directory. For example, if you want to use ipt_LOG for IPv4 (2 in the list), you can do:

echo "ipt_LOG" >/proc/sys/net/netfilter/nf_log/2 
This will activate ipt_LOG for IPv4 logging:
# cat /proc/net/netfilter/nf_log 
 0 NONE (nfnetlink_log)
 1 NONE (nfnetlink_log)
 2 ipt_LOG (nfnetlink_log,ipt_LOG)
 3 NONE (nfnetlink_log)
 4 NONE (nfnetlink_log)
 5 NONE (nfnetlink_log)
 6 NONE (nfnetlink_log)
 7 nfnetlink_log (nfnetlink_log)
 8 NONE (nfnetlink_log)
 9 NONE (nfnetlink_log)
10 nfnetlink_log (nfnetlink_log,ip6t_LOG)
11 NONE (nfnetlink_log)
12 NONE (nfnetlink_log)

The logging framework is also used internally by Netfilter. For example, the connection tracking uses it to send messages when invalid packets are seen. These messages are useful because they contain the reason why a packet is considered invalid; one of them is “nf_ct_tcp: ACK is under the lower bound (possible overly delayed ACK)”. These log messages are only sent if logging of invalid packets has been requested, which is done with:

echo "255"> /proc/sys/net/netfilter/nf_conntrack_log_invalid
More information on the magic 255 value is available in the kernel documentation of the nf_conntrack sysctl. If the nfnetlink_log module is used for the protocol, these messages are sent to group 0. So if you want to keep them apart, it could be a good idea to use a non-zero nfnetlink group in your log rules. This way you will be able to differentiate the log sources in software like ulogd.

Logging with Nftables

As mentioned before, logging is done via the log keyword. A typical log-and-accept rule will look like:

nft add rule filter input tcp dport 22 ct state new log prefix \"SSH for ever\" group 2 accept
This rule accepts packets to port 22 in the state NEW and logs them with the prefix SSH for ever on group 2. The group is only used when the active logging kernel module is nfnetlink_log; the option has no effect if xt_LOG is used. In fact, when used with xt_LOG, the only available option is prefix (at least as of nftables 0.099).

The available options when using the nfnetlink_log module are the following (at least as of nftables 0.099):

  • prefix: A prefix string to include in the log message, up to 64 characters long, useful for distinguishing messages in the logs.
  • group: The netlink group (0 – 2^16-1) to which packets are sent (only applicable for nfnetlink_log). The default value is 0.
  • snaplen: The number of bytes to be copied to userspace (only applicable for nfnetlink_log). nfnetlink_log instances may specify their own copy range; this option overrides it.
  • queue-threshold: Number of packets to queue inside the kernel before sending them to userspace (only applicable for nfnetlink_log). Higher values result in less overhead per packet, but increase delay until the packets reach userspace. The default value is 1.
Note: these descriptions are extracted from the iptables man pages.

If you want to do some easy testing with nftables, simply load the xt_LOG module before nfnetlink_log. It will bind to the IPv4 and IPv6 protocols and provide logging via syslog. For more fancy stuff involving nfnetlink_log, you can have a look at Using ulogd and JSON output.
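
For a quick smoke test, something like the following should produce log lines in the kernel log (assuming a filter table with an input chain already exists, as in the earlier examples):

modprobe xt_LOG
nft add rule filter input icmp type echo-request log prefix \"ping test: \"
# generate matching traffic, then check the kernel log
ping -c 1 127.0.0.1
dmesg | tail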

Happy logging to all!