Introduction
coccigrep is a semantic grep for the C language based on coccinelle. It can be used to find where a given structure is used in code files. coccigrep depends on the spatch program which comes with coccinelle.
Download and source
Latest version is 1.13: coccigrep-1.13.tar.gz
The source can be accessed via github.
Examples
To find where the structure named Packet is used in a set of files, you can run:
$ coccigrep -t Packet *c
source-af-packet.c:272: p = ptv->in_p;
source-af-packet.c:300: p->datalink = ptv->datalink;
source-af-packet.c:758: switch(p->datalink) {
To find where the datalink attribute of the structure named Packet is used in a set of files, you can simply do:
$ coccigrep -t Packet -a datalink *c
source-af-packet.c:300: p->datalink = ptv->datalink;
source-af-packet.c:758: switch(p->datalink) {
source-erf-dag.c:525: p->datalink = LINKTYPE_ETHERNET;
If you want to be more precise and find where this attribute is set, you can use the operation flag (-o). One of its values is set, which indicates that we only want the matches where the attribute is set:
$ coccigrep -t Packet -a datalink -o set source*c
source-af-packet.c:300: p->datalink = ptv->datalink;
source-erf-dag.c:525: p->datalink = LINKTYPE_ETHERNET;
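If you want to post-process these results in a script, the file:line: code layout shown above is easy to split. The following is only a minimal sketch: the names Match and parse_match are hypothetical, and it assumes the plain output format of the examples above (no color, no context lines).

```python
import re
from typing import NamedTuple, Optional


class Match(NamedTuple):
    """One coccigrep result line (hypothetical container, not part of coccigrep)."""
    path: str
    line: int
    code: str


# Assumed layout, taken from the examples above: "<file>:<line>: <code>"
_LINE_RE = re.compile(r"^(?P<path>[^:]+):(?P<line>\d+):\s?(?P<code>.*)$")


def parse_match(text: str) -> Optional[Match]:
    """Split one output line into its parts, or return None if it does not fit."""
    found = _LINE_RE.match(text)
    if found is None:
        return None
    return Match(found.group("path"), int(found.group("line")), found.group("code"))
```

Lines that do not follow this layout (for instance the headers printed when context options are used) are simply reported as None.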
coccigrep supports syntax highlighting through the pygments module. For example, running coccigrep -t Packet -a datalink -o test -c -A 3 -B 3 -f html /tmp/test.c will output some colorized HTML code to stdout:
/tmp/test.c: l.300 -3, l.300 +3, Packet *p
hdrp->sll_protocol = from.sll_protocol;
}
while (p->datalink >= ptv->datalink) {
SET_PKT_LEN(p, caplen + offset);
if (PacketCopyData(p, ptv->data, GET_PKT_LEN(p)) == -1) {
TmqhOutputPacketpool(ptv->tv, p);
Installation
coccigrep depends on spatch, which comes with coccinelle. On the Python side, you need setuptools and, optionally, pygments (for colorized output). Happy Debian users can run:
aptitude install python-setuptools python-pygments
To install coccigrep, run:
sudo python ./setup.py install
See the following sections for usage of coccigrep inside Vim or Emacs.
Usage
usage: coccigrep [-h] [-t TYPE] [-a ATTRIBUT]
                 [-o {set,callg,used,func,test,deref}] [-s SP] [-C CONTEXT]
                 [-A AFTER] [-B BEFORE] [-p NCPUS] [-c] [--cpp] [-V] [-E]
                 [-f {term,html}] [-v] [-L] [-l FILE_LIST] [--version]
                 [file [file ...]]

Semantic grep based on coccinelle

positional arguments:
  file                  List of files

optional arguments:
  -h, --help            show this help message and exit
  -t TYPE, --type TYPE  C type where looking for
  -a ATTRIBUT, --attribut ATTRIBUT
                        C attribut that is set
  -o {set,callg,used,func,test,deref}, --operation {set,callg,used,func,test,deref}
                        Operation on structure
  -s SP, --sp SP        Semantic patch to use
  -C CONTEXT, --context CONTEXT
                        Number of lines before and after context
  -A AFTER, --after-context AFTER
                        Number of lines after context
  -B BEFORE, --before-context BEFORE
                        Number of lines before context
  -p NCPUS, --process NCPUS
                        Number of cpus to use
  -c, --color           colorize output (need pygments)
  --cpp                 Activate coccinelle C++ support
  -V, --vim             vim output
  -E, --emacs           emacs output
  -f {term,html}, --output-format {term,html}
                        colorize format for output
  -v, --verbose         verbose output (including coccinelle error)
  -L, --list-operations
                        List available operations
  -l FILE_LIST, --file-list FILE_LIST
                        File containing a list of files to search in
  --version             show program's version number and exit
Run coccigrep -h for an up-to-date and complete list of options.
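When coccigrep is driven from another program, it can help to assemble the command line from the options listed above. The sketch below is an illustration only: build_coccigrep_command is a hypothetical helper, not something shipped with coccigrep, and the operation names are copied from the help text above (coccigrep -L gives the list valid on your system).

```python
# Operation names copied from the help text; `coccigrep -L` is authoritative.
KNOWN_OPERATIONS = ("set", "callg", "used", "func", "test", "deref")


def build_coccigrep_command(c_type, files, attribut=None, operation=None):
    """Return an argv list for coccigrep (hypothetical helper)."""
    if operation is not None and operation not in KNOWN_OPERATIONS:
        raise ValueError("unknown operation: %r" % (operation,))
    command = ["coccigrep", "-t", c_type]
    if attribut is not None:
        command += ["-a", attribut]   # attribute of the structure
    if operation is not None:
        command += ["-o", operation]  # restrict matches to one operation
    command += list(files)
    return command
```

The resulting list can be handed to subprocess.run. Note that no shell is involved in that case, so patterns such as source*c have to be expanded beforehand, for example with glob.glob.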
Vim integration
To use coccigrep in vim, you can use the cocci-grep.vim plugin provided in the editors directory. To do so, simply copy it to your plugin directory, which is usually ~/.vim/plugin/. If your coccigrep script is not in your path, you can use the coccigrep_path variable to give its complete path. For example, you can add to your .vimrc:
let g:coccigrep_path = '/usr/local/bin/coccigrep'
And then you can run commands like:
:Coccigrep
:Coccigrep Packet datalink source-*.c
:Coccigrep Packet datalink set source-*.c
The first command will interactively ask you for the values. The second one will search for all dereferences of the datalink attribute of the Packet structure. The last one will look for where the set operation is done on the datalink attribute of Packet. To get the list of operations available on your system, you can run coccigrep -L or look at the list provided when the operation is asked for in interactive mode.
The matches will appear in the quickfix list and the file corresponding to the first match will be opened at the corresponding line. Note that you can use completion on structure and attribute names based on tags (generated by :make tags).
Running coccigrep in emacs
To use coccigrep in emacs, you need to load the cocci-grep.el module provided in the editors directory of the source code. For example, if you copy it to ~/.emacs.d/site-lisp/, you can do:
(add-to-list 'load-path "~/.emacs.d/site-lisp/")
(require 'cocci-grep)
And then you can run something like:
Meta+x cocci-grep
The script is interactive and you will need to answer the following questions:
- Type: The structure type you are searching
- Attribut: The attribute in the structure
- Operation: The operation on the structure. The available operations include set, used, func, test and deref
- Files: A glob expression that will match the files you want to search in
The matches will appear in a buffer with mode set to grep-mode, and you will thus be able to jump to each occurrence. History is available on the different parameters.
Known bugs
The operation option could lead to some missed matches because the semantic patches used internally may need some improvement.
Reporting issue or idea
Please use github to report issues. All ideas are welcome.
Extending coccigrep
It is easy to extend coccigrep: adding a new match function only requires dropping a semantic patch in the data directory. For example, the latest addition at the time of writing is the func match. The commit on github will show you that:
- A file named func.cocci has been added in the src/data directory
- It contains something that looks like a semantic patch
Here’s the file:
@init@
typedef $type;
$type *p;
position p1;
@@

(
$attribut(p@p1, ...)
|
$attribut(..., p@p1, ...)
|
$attribut(..., p@p1)
)
In fact, this is a templatized semantic patch. There are two variables, $type and $attribut, that will be replaced by the content of the command line options -t and -a. The last constraint on the semantic patch is that there must be a position p1, which is used to display the positional information about the match.
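To make the substitution step concrete, here is a small sketch that fills in the two variables using Python's standard string.Template, whose $name placeholder syntax matches the one used in the file. It only illustrates the idea: render_patch is a hypothetical name, and coccigrep's own code may perform the replacement differently.

```python
from string import Template

# Content of func.cocci as shown above.
FUNC_COCCI = """\
@init@
typedef $type;
$type *p;
position p1;
@@

(
$attribut(p@p1, ...)
|
$attribut(..., p@p1, ...)
|
$attribut(..., p@p1)
)
"""


def render_patch(template_text, c_type, attribut):
    """Replace $type and $attribut with the values given to -t and -a."""
    return Template(template_text).substitute(type=c_type, attribut=attribut)


if __name__ == "__main__":
    # For the func operation, the -a value is the name of a function.
    print(render_patch(FUNC_COCCI, "Packet", "PacketCopyData"))
```

With the func operation, the value passed to -a takes the place of a function name, so the rendered patch matches calls that receive a Packet pointer as one of their arguments.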
Using coccinelle directly
It is possible to use coccinelle from the command line to perform some of these tasks. For example, to get all accesses to the datalink attribute, one can run:
$ spatch -sp "e:Packet *:->datalink" stream-tcp.c
init_defs_builtins: /home/eric/builds/coccinelle//share/coccinelle/standard.h
HANDLING: stream-tcp.c
diff =
@@
Packet * e;
@@
* e->datalink
--- stream-tcp.c
+++ /tmp/cocci-output-27859-9d2f27-stream-tcp.c
@@ -4467,7 +4467,6 @@ Packet *StreamTcpPseudoSetup(Packet *par
     /* copy packet and set lenght, proto */
     p->proto = parent->proto;
-    p->datalink = parent->datalink;
     PacketCopyData(p, pkt, len);
     p->recursion_level = parent->recursion_level + 1;
This feature is available since version 1.0-rc5. More information in this mail.
Bug report and feature request
Bug reports and feature requests can be made through the Github interface.
You can also follow coccigrep activity on Google+.
Developer documentation
Documentation of the coccigrep module is generated via sphinx: coccigrep documentation
Looks interesting, but it’s rather odd that it still uses PYTHONPATH:
Checking .pth file support in /tmp/yaourt-tmp-wishi/aur-coccigrep/pkg/usr/lib/python2.7/site-packages/
/usr/bin/python2 -E -c pass
TEST FAILED: /tmp/yaourt-tmp-wishi/aur-coccigrep/pkg/usr/lib/python2.7/site-packages/ does NOT support .pth files
error: bad install directory or PYTHONPATH
You are attempting to install a package to a directory that is not
on PYTHONPATH and which Python does not read ".pth" files from. The
installation directory you specified (via --install-dir, --prefix, or
the distutils default setting) was:
/tmp/yaourt-tmp-wishi/aur-coccigrep/pkg/usr/lib/python2.7/site-packages/
and your PYTHONPATH environment variable currently contains:
'/tmp/yaourt-tmp-wishi/aur-coccigrep/'
Here are some of your options for correcting the problem:
* You can choose a different installation directory, i.e., one that is
on PYTHONPATH or supports .pth files
* You can add the installation directory to the PYTHONPATH environment
variable. (It must then also be on PYTHONPATH whenever you run
Python and want to use the package(s) you are installing.)
* You can set up the installation directory to support ".pth" files by
using one of the approaches described here:
http://packages.python.org/distribute/easy_install.html#custom-installation-locations
Please make the appropriate changes for your system and try again.
Hello,
For some this is an issue with setuptools (easy_install): http://code.google.com/p/pydbgr/issues/detail?id=7
A workaround is to switch to distutils instead of setuptools. You can do so by changing setup.py:
-from setuptools import setup
+from distutils.core import setup
I will continue to investigate to see if there is a cleaner alternative.
Is it possible to use Coccigrep upon nesting structures ?
For example, I have the following structure :
struct stream_sys_t
{
    struct hls_playback_s
    {
        uint64_t offset;
        int stream;
        int segment;
    } playback;
};
and I want to know where “segment” attribute is set ?
Hello yhuelf,
The attribute argument in coccigrep is free form. You should thus be able to run:
coccigrep -t "struct stream_sys_t" -a "playback.segment" -o set FILES
Let me know if it is working.
It works perfectly. Thank you 🙂
Would it be possible to use Coccinelle or to expand coccigrep in order to search for non constant static data through a large code base?
I don’t really get what you want to do. Could you give a concrete example?
Well, the lead dev of VLC asked on vlc-devel [1] if somebody would know a way to look for static non constant data throughout the code base, because he discovered such static data in the code and he’s not happy with that (I suppose it’s not a good idea in a multi-threaded environment).
Of course the use of static data that is never modified is ok, so it seems not that easy to identify the bad guys.
I thought that maybe coccinelle could help, but it’s probably a silly idea 🙂
[1] http://mailman.videolan.org/pipermail/vlc-devel/2011-October/082517.html
You mean detect static variable and look for where they are set and thus modified ?
Yes, this is exactly what I meant. Indeed I could have been more clear and concise 😉
Hi Regit,
I’ve found a small bug: if you increment an integer attribute with the operator “++”, this is not detected by the “-set” operation (but it is if you replace “++” by “+= 1”).
For example you can test it with the following stupid code:
#include <stdlib.h>
typedef struct cocci_s {
    int a;
} cocci_t;
int main(void)
{
    cocci_t *cocci = (cocci_t *) malloc(sizeof(cocci_t));
    cocci->a = 0;
    cocci->a++;
    return 0;
}
and
coccigrep -t cocci_t -a a -c -o set coccitest.c
Ok I didn’t read the “Known bugs” and “Reporting issue” sections, sorry for the noise… 🙂
Hello yhuelf. Good catch, this has been fixed in git tree: https://github.com/regit/coccigrep/commit/9de42f62a04cd6df0ef62fd75c6c1f7ea2c17292
I can’t seem to find a place to put in bug reports.
$ coccigrep -t ast_group_info main/*.c
Traceback (most recent call last):
  File "/usr/bin/coccigrep", line 5, in <module>
    pkg_resources.run_script('coccigrep==1.5', 'coccigrep')
  File "/usr/lib/python2.5/site-packages/pkg_resources.py", line 467, in run_script
    self.require(requires)[0].run_script(script_name, ns)
  File "/usr/lib/python2.5/site-packages/pkg_resources.py", line 1200, in run_script
    execfile(script_filename, namespace, namespace)
  File "/usr/lib/python2.5/site-packages/coccigrep-1.5-py2.5.egg/EGG-INFO/scripts/coccigrep", line 106, in <module>
    coccigrep.run(args.file)
  File "/usr/lib/python2.5/site-packages/coccigrep-1.5-py2.5.egg/coccigrep/coccigrep.py", line 420, in run
    tmp_cocci_file = NamedTemporaryFile(suffix=".cocci", delete=False)
TypeError: NamedTemporaryFile() got an unexpected keyword argument 'delete'
Thank you Mark. You can open an issue on https://github.com/regit/coccigrep/issues for this bug report. This is linked to the fact that you are using python 2.5: the delete parameter appeared in python 2.6. I will try to make a fix.
I’ve just pushed a patch to the git tree. It should fix the problem: https://github.com/regit/coccigrep/commit/0db22aeae07c40081e4e07988f7a1834b2fe9598
I’ve introduced a bug when the run was done without concurrency. This is fixed by https://github.com/regit/coccigrep/commit/2739674926e8ce2dda08a9fd2125ee5a6b30ee2e
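For readers who hit the same TypeError on an old interpreter, one possible way to get a temporary .cocci file that is not deleted on close, without the delete keyword, is tempfile.mkstemp. This is only a sketch of that general technique and not necessarily what the commits above do; make_cocci_tempfile is a hypothetical name.

```python
import os
import tempfile


def make_cocci_tempfile(content):
    """Write content to a new temporary .cocci file and return its path.

    mkstemp never deletes the file by itself, so the caller is responsible
    for removing it once spatch has been run.
    """
    fd, path = tempfile.mkstemp(suffix=".cocci")
    handle = os.fdopen(fd, "w")
    try:
        handle.write(content)
    finally:
        handle.close()
    return path
```

The explicit try/finally is used instead of a with block so that the same code also works on interpreters where with is not enabled by default.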
Well, after giving coccigrep a try on a 64 bits Ubuntu 10.04 LTS machine, installing python-setuptools to be able to run install script and python-argparse that was claimed missing at first try, there is still this one
Config error: Unable to run spatch command ‘spatch’: No such file or directory.
Indeed, there is something imo missing in current version: The ability to recurse subdirs (as grep -r)!
Hello yann,
Thanks a lot for your interest in coccigrep.
For spatch, you need to install coccinelle which is the semantic tool used by coccigrep.
Regarding recursivity, if you give a directory on the command line, coccigrep will select recursively all files inside.
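As an illustration of what selecting recursively all files inside a directory amounts to, here is a short sketch based on os.walk. It is not taken from coccigrep's code, and expand_paths is a hypothetical name.

```python
import os


def expand_paths(paths):
    """Return the files for the given paths, walking directories recursively."""
    files = []
    for path in paths:
        if os.path.isdir(path):
            # Visit the directory and every subdirectory below it.
            for root, _dirs, names in os.walk(path):
                for name in sorted(names):
                    files.append(os.path.join(root, name))
        else:
            # Plain file names are passed through unchanged.
            files.append(path)
    return files
```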
Can coccigrep be used to list all used types in a sourcetree ? Handy for determining which autoconf tests I need.
No, I don’t think that coccigrep or coccinelle can be used easily to do an enumeration of the types.