From: Julia Lawall <julia.lawall@inria.fr>
To: Eric Wheeler <cocci@lists.ewheeler.net>
Cc: Julia Lawall <julia.lawall@inria.fr>, cocci@inria.fr
Subject: Re: [cocci] Using `parallel` to run over lots of .c files and avoiding stack overflows.
Date: Fri, 1 Apr 2022 22:28:17 +0200 (CEST)
Message-ID: <alpine.DEB.2.22.394.2204012224060.3048@hadrien>
In-Reply-To: <ac254b9c-3923-5b7b-5d56-564d50e06760@ewheeler.net>
On Fri, 1 Apr 2022, Eric Wheeler wrote:
> On Sun, 20 Mar 2022, Julia Lawall wrote:
> > On Sat, 19 Mar 2022, Eric Wheeler wrote:
> > > On Sat, 19 Mar 2022, Julia Lawall wrote:
> > > > On Sat, 19 Mar 2022, Eric Wheeler wrote:
> > > > > Just a quick tip for others (like me) who are new to Coccinelle:
> > > > >
> > > > > You might already use this, but if not, here's a hint for doing lots of
> > > > > replacements in parallel:
> > > > >
> > > > > parallel -j24 spatch --sp-file smpl.cocci {} --in-place ::: *.c
> > > > >
> > > > > If the SmPL depends on interactions between files then `parallel` won't
> > > > > work, but if the changes work per-file then it runs much faster with lots
> > > > > of big .c files.
> > > > >
> > > > > In cases where you do need to run over lots of files and they _do_
> > > > > interact, you might get a Stack Overflow error. In this case set
> > > > > something like this for 1GB of stack space:
> > > > > ulimit -s $((1024*1024))
> > > > >
> > > > > In my case spatch needed 1GB of stack and 2.4GB of RAM. It took a few
> > > > > minutes and finished thousands of replacements!
> > > > >
> > > > > Julia, you might add this to documentation if you think it would be
> > > > > useful.
> > > >
> > > > I'm not sure what you are trying to do. You can give spatch the name of a
> > > > > directory and the argument -j N for a number of cores N, and it will run
> > > > > in parallel on the files in the directory.
> > >
> > --jobs the number of processes to be used
> > -j the number of processes to be used
>
> Running `top` shows spatch at 100% cpu: it is not parallelizing, though
> the rules and files are completely independent of each other (no
> inter-SmPL-rule dependencies).
I don't understand what you mean by spatch at 100%. I think you should
have 24 spatches?

On the other hand, you should not put a bunch of files on the command
line. The intended behavior is for that not to be parallel. Coccinelle
makes no effort to figure out whether parallelism is useful.

Normally, one runs spatch on a directory, and then it takes care of
working on the different files in parallel.
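
For reference, a minimal shell sketch of the directory-based invocation described above. The semantic patch path, the source directory, and the job count are placeholders copied from the commands quoted earlier in this thread. The helper name spatch_dir_cmd and the dry-run fallback are additions for illustration only, not part of Coccinelle.

```shell
#!/bin/sh
# Sketch: hand spatch a directory and let it spread the files over JOBS
# processes, instead of listing every .c file on the command line.
# The defaults below are placeholders taken from the commands in this thread.
SP_FILE=${SP_FILE:-cocci/printf-refactor.cocci}
SRC_DIR=${SRC_DIR:-src}
JOBS=${JOBS:-24}

# Print the command line that would be run, so it can be inspected first.
# (Hypothetical helper, defined only for this example.)
spatch_dir_cmd() {
    printf 'spatch -j %s --sp-file %s --dir %s\n' "$JOBS" "$SP_FILE" "$SRC_DIR"
}

# Run only when spatch is installed and the directory exists;
# otherwise just show the command.
if command -v spatch >/dev/null 2>&1 && [ -d "$SRC_DIR" ]; then
    spatch -j "$JOBS" --sp-file "$SP_FILE" --dir "$SRC_DIR"
else
    echo "would run: $(spatch_dir_cmd)"
fi
```

Without --in-place, spatch only prints the proposed diff, so this form is safe to try before modifying any files.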

It seems that you don't want to process resources.c. Maybe the semantic
patch should be adjusted so that the code in resources.c is not matched?

With python or OCaml you can also cause Coccinelle to exit on a file you
don't want to process.

julia
>
> Not sure if this is an spatch bug or not, but it's ~3x faster to use
> `parallel` as follows:
>
> ]# time parallel -j24 -- spatch --sp-file cocci/printf-refactor.cocci ::: `ls src/*.c|grep -v resources.c` &> /dev/null
> real 0m1.224s
>
> ]# time spatch -j24 --sp-file cocci/printf-refactor.cocci `ls src/*.c|grep -v resources.c` &> /dev/null
> real 0m4.852s
>
>
> Here is the SmPL:
> @ p4 @
> expression list EL;
> constant C =~ "BUG";
> @@
> -printf(C, EL);
> +BUG(C, EL);
>
> @ p3 @
> expression list EL;
> constant C =~ "warn";
> @@
> -printf(C, EL);
> +pr_warn(C, EL);
>
> @ p @
> expression list EL;
> constant C;
> @@
> -printf(C, EL);
> +pr_info(C, EL);
>
> @ p2 @
> expression list EL;
> constant C;
> @@
> -fprintf(stderr, C, EL);
> +pr_err(C, EL);
>
>
> The .c files are from here:
> https://github.com/KJ7LNW/xnec2c
>
> --
> Eric Wheeler
>
>
> >
> > julia
> >
>