Coccigrep improved func operation

Coccigrep 1.11 is now available and mainly features some improvements related to the func search. The func operation can be used to search when a structure is used as argument of a function. For example, to search where the Packet structures are freed inside Suricata project, one can run:

$ coccigrep -t Packet -a "SCFree" -o func src/
src/alert-unified2-alert.c:1156 (Packet *p):         SCFree(p);
src/alert-unified2-alert.c:1161 (Packet *p):         SCFree(p);
...
src/alert-unified2-alert.c:1368 (Packet *pkt):         SCFree(pkt);

With coccigrep 1.11, it is now possible to look for a function with a regular expression. For example, to see how a time_t is used in the print function of Suricata which are all starting by SCLog (SCLogDebug, SCLogWarning, …), you can simply run:

$ coccigrep -t time_t -a "SCLog.*" -o func src/ 
src/defrag.c:480 (time_t *dc->timeout):     SCLogDebug("\tTimeout: %"PRIuMAX, (uintmax_t)dc->timeout);

With 1.11 version, the func operation is now more useful. It is also more accurate as casted parameters or direct use of a structure (instead of usage though a pointer) are now supported.

Minimal linux kernel config for Virtualbox

I was looking for some minimal Linux kernel configuration for Virtualbox guest and did only find some old one. I thus decide to build one and to publish them.
They are available on github: regit-config

For now, the only published configuration are for Linux kernel 3.5:

Run a build on all commits in a git branch

Sometime, you need to check that all the commits in a branch are building correctly. For example, when a rebase has been done, it is possible you or diff has made a mistake during the operation. The building operation can be run against all commits of the current branch with the following one-liner (splitted here for more readability):

for COMMIT in $(git log --reverse --format=format:%H origin/master..HEAD); do
    git checkout ${COMMIT} ;
    make -j8 1>/dev/null || { echo "Commit $COMMIT don't build";  break; }
done

The idea is trivial, we build the list of commits with git log using a simple format string (to get only the hash). We add the reverse tag to start from the oldest commit.
For each commit, we checkout and run the build command. If the build fails, we exit from the loop.

The result is a directory with the non-building code. Thus, don’t forget to get back to the original branch ORIG_BRANCH by running a git checkout ORIG_BRANCH.

Set or unset define variables in Coccigrep

Following a discussion with the great Julia Lawall, she added a new feature in coccinelle: it is now possible to define as set or unset some variables. This option has been added in coccigrep 1.9 and requires coccinelle 1.0-rc14.

For example, let’s have a code like Suricata where a lot of unit tests are implemented. The structure of the code is the following:

REGULAR CODE

#ifdef UNITTESTS
 TEST CODE
#endif

When doing search in the regular code, you don’t want to be bothered by results found in the test code. To obtain this result, you can pass the -U UNITTESTS option to coccigrep to tell him to consider UNITTESTS variable as undefined. If you want to define a variable, you can use the -D flag.

If you are using coccigrep inside vim, you can set the coccigrep_path variable with this option. The basic vim syntax is:

let g:coccigrep_path="coccigrep -U UNITTESTS"

As I wanted to have it for all query in my Suricata source directory, I’ve added at the end of my ~/.vim/after/syntax/suricata.vim file:

autocmd BufEnter,BufNewFile,BufRead */git/oisf/* let g:coccigrep_path="coccigrep -U UNITTESTS"

Using Scapy lfilter

Scapy BPF filtering is not working when some exotic interface are used. This includes Virtualbox interface such as vboxnet.

For example, the following code will not work if the interface is a virtualbox interface:

build_filter = "src host %s and src port 21"
sniff(iface=iface, prn=callback, filter=build_filter)

To fix this, you can use the lfilter option. The filtering is now done inside Scapy. This is powerful but less efficient.

The code can be modified like this:

build_lfilter = lambda (r): TCP in r and r[TCP].sport == 21 and r[IP].src == ip
sniff(iface=iface, prn=callback, lfilter=build_lfilter)

Tanks a lot to Guillaume Valadon for the tips!

What’s new in coccigrep 1.6?

I did not write any article on coccigrep since the 1.0 release. Here is an update on what has been added to the software since that release.

C++ support

Coccinelle has a basic C++ support which can be activated by using the –cpp flag in coccigrep.

Patches information

The -L -v options on command line will display a description of the match available on the system.

$ coccigrep -L -v
set: Search where a given attribute of structure 'type' is set
 * Confidence: 80%
 * Author: Eric Leblond 
 * Arguments: type, attribute
 * Revision: 2

For the developer, this is obtained from structured comments put at the start of the cocci file:

$ head src/data/set.cocci 
// Author: Eric Leblond 
// Desc: Search where a given attribute of structure 'type' is set
// Confidence: 80%
// Arguments: type, attribute
// Revision: 2
@init@

This is thus an easy way to document the search operation. Please note, that this will also work for the operations put in the user or system custom directory.

Context line display improvement

Guillaume Nault has contributed a series of patches that greatly improved the display of context lines:

$ coccigrep -C 3 -t Packet -a flags -o set decode*c 
decode.c-90            -     }
decode.c-91            - 
decode.c-92            -     PACKET_INITIALIZE(p);
decode.c:93 (Packet *p):     p->flags |= PKT_ALLOC;
decode.c-94            - 
decode.c-95            -     SCLogDebug("allocated a new packet only using alloc...");
decode.c-96            -

Documentation

A man page is now available.

Python 2.5 support

The 1.6 release came with a code modification that permit coccigrep to run with python 2.5. Some users seem to still use this old version of Python and the support was not requiring to degrade coccigrep code. It has even improved it.

Option to read file lists from a file

Thomas Graf has contributed the -l option which provides a way to specify a file containing the list of the files to search in.

Operation improvement

The set operation has been improved and is now more accurate thanks to the support of all related operators.

Conclusion

Coccigrep is becoming more and more mature over time. The existing code base remains and a polishing work is currently under progress. One last point on the project is that some Linux and *BSD distribution seems to have done packages. This is the case of Aur, Gentoo, NetBSD, OpenBSD, Mandriva and soon Debian if the intention to package is confirmed.

Victor Julien: Development status

Work has started in september 2007. The work depends on some externel library like multithread of input handling library. The main external depedency is libhtp which is initally developped by Ivan Ristic.

The development is managed in a single git repository. Victor is the only one with commit right. The review are done by Victor and cross review are made by developpers.

Work unit for developers are tasks which are written by Victor and describe a specific task to do. This task are mainly done by OISF funded developers. Some simpler task are let to the comunity and everyone can help with this.

To offload Victor’s load, subsystem mainteners are nominated:

  • Eric Leblond: packet acquisition
  • Anoop Saldanha: detection part

They will have freedom on the way to improve the subsystem they are in charge.

The development is currently done with two branches (1.0 which is bug fix only and master). Victor is currently not happy with this and would like to switch to time-based release. This is discussed as this can be difficult for company to deal with frequent updates. A funding of maintenance by companies could help to keep the current working system.

Peter Manev is in charge of the QA. There is a lot of work to do in this area. Unit test is currently good but there is a lot of work to do to improve detection of regression.

Performance has been improved in 1.1 with a focus on efficient algorithm.

CUDA support needs help. The performance is still lower with than without and thus this really need developer power!

Performance profiling have been added recently by Victor and it shows clearly that work is needed on this.

To sum up, the two main areas where help will be most than welcome are QA and performance profiling.

Coccigrep, a semantic grep for the C language

Introduction

When diving in some code with a relative important size, I’ve often ask myself: where is this attribute used for this structure ? Where it is set ? Using grep is not a good answer to theses questions: you can’t guess the name of the variable of a given type and even an attribute name can be shared between multiple structures. I was in need of a semantic grep!

Having played a bit with coccinelle, I did know that it could be used with success to build a semantic grep for the C language. But coccinelle syntax is not easy and I decided to build something with a simplistic syntax. Something that would integrate easily into vim to be able to use it like grep. I’ve launched my preferred editor and this something is now called coccigrep.

Meet the beast

The syntax is trivial, to find where the datalink attribute of the Packet structure is set in all source*c file:

$ coccigrep  -t Packet -a datalink -o set src/source*c   
src/source-af-packet.c:300 (Packet *p):     p->datalink = ptv->datalink;
src/source-erf-dag.c:525 (Packet *pkt):     pkt->datalink = LINKTYPE_ETHERNET;

You can use coccigrep from vim. If you run in vim:

:Coccigrep -t Packet -a datalink source-*.c

The matches will appear in the quickfix list and the file corresponding to first match will be opened at the corresponding line. Note that you can use completion on structure and attribute names based on tags (generated by :make tags).

Coccigrep supports syntax highlighting through the pygments module. For example, running coccigrep -t Packet -a datalink -o test -c -A 3 -B 3 -f html /tmp/test.c will output to stdout some colorized HTML code:

/tmp/test.c: l.300 -3, l.300 +3, Packet *p
        hdrp->sll_protocol = from.sll_protocol;
    }

    while (p->datalink >= ptv->datalink) {
    	SET_PKT_LEN(p, caplen + offset);
    	if (PacketCopyData(p, ptv->data, GET_PKT_LEN(p)) == -1) {
           TmqhOutputPacketpool(ptv->tv, p);

More information and download are available from coccigrep page.