* [cocci] Multiple returns significant performance impact
@ 2024-10-24 11:56 Derek M Jones
2024-10-24 12:42 ` Markus Elfring
2024-10-24 14:48 ` Julia Lawall
0 siblings, 2 replies; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 11:56 UTC (permalink / raw)
To: cocci
All,
I'm interested in finding the start/end line of
functions, and the pattern below does this.
However, return statements cause multiple matches to
occur. For instance, in the following code the python
in the cocci script below is called twice.
This is not a problem because I can filter on the
highest line number.
The real problem is the performance hit. Matching
a function containing lots of returns takes forever (there
are lots of these in the Linux kernel). I am regularly
seeing timeouts after 600 seconds (the specified timeout).
Is there a way of improving performance, given that I am
not interested in the returns?
void f2(int a)
{
   a++;
   if (g)
   {
      a++;
      return;
   }
}
@func_def@
identifier f;
parameter list parms;
position p_1, p_2;
@@
f(parms)
{@p_1
...
}@p_2

@script:python@
func << func_def.f;
parm_n << func_def.parms;
loc_1 << func_def.p_1;
loc_2 << func_def.p_2;
@@
import sys

def printf(format, *args):
    sys.stdout.write(format % args)

printf("%s,func", loc_1[0].file)
printf(",%s", func)
printf(",\"%s\"", parm_n)
printf(",%s,%s", loc_1[0].line, loc_1[0].column)
printf(",%s,%s\n", loc_2[0].line, loc_2[0].column)
int g;

void f1(int a)
{
   a++;
}


void f2(int a)
{
   a++;
   if (g)
   {
      a++;
      return;
   }
}
tfunc.c,func,f1,int a,5,0,7,0
tfunc.c,func,f2,int a,11,0,17,3
tfunc.c,func,f2,int a,11,0,18,0
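The "filter on the highest line number" step mentioned above could look roughly like the following. This is only a sketch: the helper name keep_last_end is made up, and it assumes the eight-column row layout shown in the output above (file, kind, function, parameters, start line, start column, end line, end column).

```python
import csv
import io

def keep_last_end(rows):
    """Keep one row per function start, choosing the highest end line."""
    best = {}
    for row in rows:
        file, kind, func, parms, l1, c1, l2, c2 = row
        key = (file, func, l1, c1)          # one entry per function start
        if key not in best or int(l2) > int(best[key][6]):
            best[key] = row
    return list(best.values())

# The three rows reported for tfunc.c in this message.
sample = """tfunc.c,func,f1,int a,5,0,7,0
tfunc.c,func,f2,int a,11,0,17,3
tfunc.c,func,f2,int a,11,0,18,0
"""

rows = list(csv.reader(io.StringIO(sample)))
filtered = keep_last_end(rows)
for row in filtered:
    print(",".join(row))
```

This removes the duplicate rows after the fact, but it does nothing about the matching time itself.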
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 11:56 [cocci] Multiple returns significant performance impact Derek M Jones
@ 2024-10-24 12:42 ` Markus Elfring
2024-10-24 14:48 ` Julia Lawall
1 sibling, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 12:42 UTC (permalink / raw)
To: Derek M. Jones, cocci
> I'm interested in finding the start/end line of
> functions, and the pattern below does this.
It determines the positions of some curly brackets.
> However, return statements cause multiple matches to
> occur. For instance, in the following code the python
> in the cocci script below is called twice.
> This is not a problem because I can filter on the
> highest line number.
Is such behaviour relevant for the clarification of this test case?
Would you like to use any finalisation rules for the desired
Python or OCaml code?
https://gitlab.inria.fr/coccinelle/coccinelle/-/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/docs/manual/cocci_syntax.tex#L688
https://github.com/coccinelle/coccinelle/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/docs/manual/cocci_syntax.tex#L688
> The real problem is the performance hit. Matching
> a function containing lots of returns takes forever (there
> are lots of these in the Linux kernel). I am regularly
> seeing timeouts after 600 seconds (the specified timeout).
Would you like to increase the computation resources
if the software implementation is too limited so far?
> Is there a way of improving performance,
I hope so, too.
> given that I am not interested in the returns?
What do you think about influencing the availability of
development resources?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 11:56 [cocci] Multiple returns significant performance impact Derek M Jones
2024-10-24 12:42 ` Markus Elfring
@ 2024-10-24 14:48 ` Julia Lawall
2024-10-24 15:23 ` Derek M Jones
1 sibling, 1 reply; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 14:48 UTC (permalink / raw)
To: Derek M Jones; +Cc: cocci
On Thu, 24 Oct 2024, Derek M Jones wrote:
> All,
>
> I'm interested in finding the start/end line of
> functions, and the pattern below does this.
> However, return statements cause multiple matches to
> occur. For instance, in the following code the python
> in the cocci script below is called twice.
> This is not a problem because I can filter on the
> highest line number.
>
> The real problem is the performance hit. Matching
> a function containing lots of returns takes forever (there
> are lots of these in the Linux kernel). I am regularly
> seeing timeouts after 600 seconds (the specified timeout).
>
> Is there a way of improving performance, given that I am
> not interested in the returns?
I don't think it's multiple returns, just multiple code paths.
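To illustrate why the number of code paths matters, here is a small Python sketch. It is not a model of how Coccinelle matches anything; the control-flow graphs are hand-built and all node names are made up. It only shows that f2 has two paths from the opening brace to a function exit (which agrees with the two reported matches), and that a chain of independent branches multiplies the path count.

```python
def enumerate_paths(cfg, node, exit_node):
    """Yield every path from node to exit_node in an acyclic graph."""
    if node == exit_node:
        yield [node]
        return
    for succ in cfg[node]:
        for rest in enumerate_paths(cfg, succ, exit_node):
            yield [node] + rest

# Hand-built graph for f2: one path leaves through the return,
# the other falls through to the closing brace.
f2_cfg = {
    "{": ["a++"],
    "a++": ["if (g)"],
    "if (g)": ["inner a++", "outer }"],
    "inner a++": ["return"],
    "return": ["exit"],
    "outer }": ["exit"],
}

def diamond_chain(k):
    """Build a graph of k sequential if/else statements (k >= 1)."""
    cfg = {}
    for i in range(k):
        nxt = f"if{i + 1}" if i + 1 < k else "exit"
        cfg[f"if{i}"] = [f"then{i}", f"else{i}"]
        cfg[f"then{i}"] = [nxt]
        cfg[f"else{i}"] = [nxt]
    return cfg

f2_paths = list(enumerate_paths(f2_cfg, "{", "exit"))
chain_paths = list(enumerate_paths(diamond_chain(10), "if0", "exit"))
print(len(f2_paths), len(chain_paths))
```

Ten sequential if/else statements already give 2**10 paths, so a pattern that has to account for every path through a large function body can become expensive quickly.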
In any case, this information is stored in any position variable. In
OCaml it would be (List.hd p).current_element_line and (List.hd
p).current_element_line_end. I guess it would be e.g.
p[0].current_element_line in Python.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 14:48 ` Julia Lawall
@ 2024-10-24 15:23 ` Derek M Jones
2024-10-24 15:29 ` Julia Lawall
0 siblings, 1 reply; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 15:23 UTC (permalink / raw)
To: cocci
Julia,
>> Is there a way of improving performance, given that I am
>> not interested in the returns?
>
> I don't think it's multiple returns, just multiple code paths.
Ok.
Is there an option that reduces the overhead, or perhaps
an alternative way of writing the pattern?
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:23 ` Derek M Jones
@ 2024-10-24 15:29 ` Julia Lawall
2024-10-24 15:35 ` Derek M Jones
0 siblings, 1 reply; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 15:29 UTC (permalink / raw)
To: Derek M Jones; +Cc: cocci
On Thu, 24 Oct 2024, Derek M Jones wrote:
> Julia,
>
> > > Is there a way of improving performance, given that I am
> > > not interested in the returns?
> >
> > I don't think it's multiple returns, just multiple code paths.
>
> Ok.
>
> Is there an option that reduces the overhead, or perhaps
> an alternative way of writing the pattern?
Maybe using when exists on the ....
But the solution I proposed was created for precisely this problem.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:29 ` Julia Lawall
@ 2024-10-24 15:35 ` Derek M Jones
2024-10-24 15:38 ` Julia Lawall
0 siblings, 1 reply; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 15:35 UTC (permalink / raw)
To: cocci
Julia,
>> Is there an option that reduces the overhead, or perhaps
>> an alternative way of writing the pattern?
>
> Maybe using when exists on the ....
Would checking an always true condition have the
desired effect?
> But the solution I proposed was created for precisely this problem.
I did not understand your comment about any position variable.
Would changing the current code
printf(",%s,%s\n", loc_2[0].line, loc_2[0].column)
to use
p[0].current_element_line
make a difference?
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:35 ` Derek M Jones
@ 2024-10-24 15:38 ` Julia Lawall
2024-10-24 15:44 ` Victor Gambier
` (2 more replies)
0 siblings, 3 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 15:38 UTC (permalink / raw)
To: Derek M Jones; +Cc: cocci
On Thu, 24 Oct 2024, Derek M Jones wrote:
> Julia,
>
> > > Is there an option that reduces the overhead, or perhaps
> > > an alternative way of writing the pattern?
> >
> > Maybe using when exists on the ....
>
> Would checking an always true condition have the
> desired effect?
... when exists
    when any

might be faster than what you have now. But it will still be slower than
necessary.
> > But the solution I proposed was created for precisely this problem.
>
> I did not understand your comment about any position variable.
>
> Would changing the current code
> printf(",%s,%s\n", loc_2[0].line, loc_2[0].column)
>
> to use
> p[0].current_element_line
> make a difference?
Not in itself. In my solution you only need one position variable, which
is the one on the top {
Every position variable contains complete information about the enclosing
object.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:38 ` Julia Lawall
@ 2024-10-24 15:44 ` Victor Gambier
2024-10-24 16:00 ` Julia Lawall
2024-10-24 15:50 ` Derek M Jones
2024-10-24 16:16 ` [cocci] SmPL position variables … Markus Elfring
2 siblings, 1 reply; 39+ messages in thread
From: Victor Gambier @ 2024-10-24 15:44 UTC (permalink / raw)
To: cocci
> Not in itself. In my solution you only need one position variable, which
> is the one on the top {
>
> Every position variable contains complete information about the enclosing
> object.
>
> julia
Presumably something along the lines of:
$ cat temp.c
int g;
void f1(int a)
{
a++;
}
void f2(int a)
{
a++;
if (g)
{
a++;
return;
}
}
$ cat temp.cocci
@func_def@
identifier f;
parameter list parms;
position p_1;
@@
f(parms)
{@p_1
...
}

@script:python@
func << func_def.f;
parm_n << func_def.parms;
loc_1 << func_def.p_1;
@@
import sys

def printf(format, *args):
    sys.stdout.write(format % args)

printf("%s,func", loc_1[0].file)
printf(",%s", func)
printf(",\"%s\"", parm_n)
printf(",%s,%s", loc_1[0].current_element_line,
                 loc_1[0].current_element_column)
printf(",%s,%s\n", loc_1[0].current_element_line_end,
                   loc_1[0].current_element_column_end)
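For trying the output formatting outside of spatch, the record can be built from a stand-in object. The following is a sketch only: Pos is a hypothetical namedtuple that mimics just the attributes used above, it is not the coccilib class, and the sample values are invented rather than taken from a real run.

```python
from collections import namedtuple

# Hypothetical stand-in for one element of the list bound to a position
# variable; only the attributes used in the script above are modelled.
Pos = namedtuple("Pos", [
    "file",
    "current_element_line", "current_element_column",
    "current_element_line_end", "current_element_column_end",
])

def func_record(func, parms, pos):
    """Format one CSV record from a single position-like object."""
    return '%s,func,%s,"%s",%s,%s,%s,%s' % (
        pos.file, func, parms,
        pos.current_element_line, pos.current_element_column,
        pos.current_element_line_end, pos.current_element_column_end)

# Invented values, for illustration only.
example = Pos("temp.c", "10", "0", "18", "1")
record = func_record("f2", "int a", example)
print(record)
```

The point of the single-position approach is that only one object per function is needed, so nothing has to be filtered afterwards.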
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:38 ` Julia Lawall
2024-10-24 15:44 ` Victor Gambier
@ 2024-10-24 15:50 ` Derek M Jones
2024-10-24 16:07 ` Markus Elfring
` (3 more replies)
2024-10-24 16:16 ` [cocci] SmPL position variables … Markus Elfring
2 siblings, 4 replies; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 15:50 UTC (permalink / raw)
To: cocci
Julia,
> ... when exists
>     when any
>
> might be faster than what you have now. But it will still be slower than
> necessary.
Just tried
... when exists
on a couple of files that timed out after 600s.
The reported time was 0.209s, so it's a lot faster!
I will kill the run that has currently taken 16 hours and is
still going.
> Every position variable contains complete information about the enclosing
Thanks. This will simplify the scripts.
The OCaml source for the location declarations used such similar
meaning names
I found that the slide deck
Introduction to Semantic Patching of C programs with Coccinelle
by Michele MARTONE
was a very useful update for me on Coccinelle.
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:44 ` Victor Gambier
@ 2024-10-24 16:00 ` Julia Lawall
0 siblings, 0 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 16:00 UTC (permalink / raw)
To: Victor Gambier; +Cc: cocci
[-- Attachment #1: Type: text/plain, Size: 1386 bytes --]
On Thu, 24 Oct 2024, Victor Gambier wrote:
> > Not in itself. In my solution you only need one position variable, which
> > is the one on the top {
> >
> > Every position variable contains complete information about the enclosing
> > object.
> >
> > julia
>
> Presumably something along the lines of:
>
> $ cat temp.c
>
> int g;
>
> void f1(int a)
> {
> a++;
> }
>
>
> void f2(int a)
> {
> a++;
> if (g)
> {
> a++;
> return;
> }
> }
>
> $ cat temp.cocci
> @ func_def
> @
> identifier f;
> parameter list parms;
> position p_1;
> @@
> f(parms)
> {@p_1
> ...
> }
> @
>
> script:python @ func << func_def.f;
> parm_n << func_def.parms;
> loc_1 << func_def.p_1;
> @@
>
> import sys
> def printf(format, *args):
> sys.stdout.write(format % args)
>
> printf("%s,func", loc_1[0].file)
> printf(",%s", func)
> printf(",\"%s\"", parm_n)
> printf(",%s,%s", loc_1[0].current_element_line,
> loc_1[0].current_element_column)
> printf(",%s,%s\n", loc_1[0].current_element_line_end,
> loc_1[0].current_element_column_end)
That's the idea. Thanks for making it more concrete.
Perhaps that is worth an example in the demos directory.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:50 ` Derek M Jones
@ 2024-10-24 16:07 ` Markus Elfring
2024-10-24 16:14 ` Julia Lawall
2024-10-24 16:48 ` Markus Elfring
` (2 subsequent siblings)
3 siblings, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 16:07 UTC (permalink / raw)
To: Derek M. Jones; +Cc: cocci
>> ... when exists
>>     when any
>>
>> might be faster than what you have now. But it will still be slower than necessary.
>
> Just tried
> ... when exists
> on a couple of files that timed out after 600s.
> The reported time was 0.209s, so it's a lot faster!
Such an effect on the software run time characteristics is nice.
Would a program parameter like “--no-loops” also help a bit more?
> I will kill the run that has currently taken 16 hours, and
> still going.
>
>> Every position variable contains complete information about the enclosing
>
> Thanks. This will simplify the scripts.
Will a specification like “@finalize:python@” also get more attention?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 16:07 ` Markus Elfring
@ 2024-10-24 16:14 ` Julia Lawall
0 siblings, 0 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 16:14 UTC (permalink / raw)
To: Markus Elfring; +Cc: Derek M. Jones, cocci
[-- Attachment #1: Type: text/plain, Size: 664 bytes --]
On Thu, 24 Oct 2024, Markus Elfring wrote:
> >> ... when exists
> >>     when any
> >>
> >> might be faster than what you have now. But it will still be slower than necessary.
> >
> > Just tried
> > ... when exists
> > on a couple of files that timed out after 600s.
> > The reported time was 0.209s, so it's a lot faster!
>
> Such an effect on the software run time characteristics is nice.
>
> Will a program parameter like “--no-loops” be also helpful another bit?
With the use of current_element_line, --no-loops is irrelevant. The body
of the function is not matched at all when the position variable is at the
top of the function somewhere.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] SmPL position variables …
2024-10-24 15:38 ` Julia Lawall
2024-10-24 15:44 ` Victor Gambier
2024-10-24 15:50 ` Derek M Jones
@ 2024-10-24 16:16 ` Markus Elfring
2024-10-24 16:20 ` Julia Lawall
2 siblings, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 16:16 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones
> Every position variable contains complete information about the enclosing object.
Will such information trigger any improvements for the software documentation?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] SmPL position variables …
2024-10-24 16:16 ` [cocci] SmPL position variables … Markus Elfring
@ 2024-10-24 16:20 ` Julia Lawall
2024-10-24 16:28 ` Markus Elfring
0 siblings, 1 reply; 39+ messages in thread
From: Julia Lawall @ 2024-10-24 16:20 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones
On Thu, 24 Oct 2024, Markus Elfring wrote:
> > Every position variable contains complete information about the enclosing object.
>
> Will such information trigger any improvements for the software documentation?
man Coccilib already shows the information, at least in the OCaml case.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] SmPL position variables …
2024-10-24 16:20 ` Julia Lawall
@ 2024-10-24 16:28 ` Markus Elfring
0 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 16:28 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones
>>> Every position variable contains complete information about the enclosing object.
>>
>> Will such information trigger any improvements for the software documentation?
>
> man Coccilib already shows the information, at least in the OCaml case.
Can the information representation still become nicer anyhow?
https://gitlab.inria.fr/coccinelle/coccinelle/-/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/docs/Coccilib.3cocci#L36
https://gitlab.inria.fr/coccinelle/coccinelle/-/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/python/coccilib/elems.py#L1
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:50 ` Derek M Jones
2024-10-24 16:07 ` Markus Elfring
@ 2024-10-24 16:48 ` Markus Elfring
2024-10-24 17:30 ` Derek M Jones
2024-10-26 11:43 ` [cocci] Checking SmPL run time characteristics for code block position " Markus Elfring
3 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 16:48 UTC (permalink / raw)
To: Derek M. Jones, cocci
>> Every position variable contains complete information about the enclosing
>
> Thanks. This will simplify the scripts.
Do you transfer any source code position information into higher level structures
so that you can eventually benefit more from parallel data processing?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 15:50 ` Derek M Jones
2024-10-24 16:07 ` Markus Elfring
2024-10-24 16:48 ` Markus Elfring
@ 2024-10-24 17:30 ` Derek M Jones
2024-10-24 18:26 ` Markus Elfring
2024-10-26 11:43 ` [cocci] Checking SmPL run time characteristics for code block position " Markus Elfring
3 siblings, 1 reply; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 17:30 UTC (permalink / raw)
To: cocci
Julia,
> Just tried
> ... when exists
> on a couple of files that timed out after 600s.
> The reported time was 0.209s, so it's a lot faster!
For the linux-6.11.2 kernel
Processed 95,512 functions in (before being terminated)
real 1082m31.767s
user 1082m23.202s
sys 0m1.594s
After adding
... when exists
Processed 830,471 functions
real 83m54.085s
user 83m4.394s
sys 0m3.496s
Which is around 112 times faster
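Because the first run was terminated early, the two runs are best compared by throughput (functions per second) rather than by wall-clock time. The arithmetic behind the figure above, using only the counts and "real" times quoted in this message (the helper name is made up):

```python
def seconds(minutes, secs):
    """Convert the m/s format printed by `time` into seconds."""
    return minutes * 60 + secs

# Functions processed per second, from the "real" times above.
before = 95_512 / seconds(1082, 31.767)    # run without "when exists"
after = 830_471 / seconds(83, 54.085)      # run with "when exists"
speedup = after / before

print(f"before: {before:.2f}/s  after: {after:.2f}/s  speedup: {speedup:.0f}x")
```

That is roughly 1.5 functions per second before and roughly 165 after.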
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 17:30 ` Derek M Jones
@ 2024-10-24 18:26 ` Markus Elfring
2024-10-24 21:03 ` Derek M Jones
0 siblings, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-24 18:26 UTC (permalink / raw)
To: Derek M. Jones; +Cc: cocci
> For the linux-6.11.2 kernel
Which analysis goals did you try to achieve by extracting source code
position information from function implementations?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 18:26 ` Markus Elfring
@ 2024-10-24 21:03 ` Derek M Jones
2024-10-25 5:38 ` Markus Elfring
0 siblings, 1 reply; 39+ messages in thread
From: Derek M Jones @ 2024-10-24 21:03 UTC (permalink / raw)
To: cocci
Markus,
> Which analysis goals did you try to achieve by extracting source code
> position information from function implementations?
An example
https://shape-of-code.com/2024/10/13/if-statement-conditions-some-basic-measurements/
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-24 21:03 ` Derek M Jones
@ 2024-10-25 5:38 ` Markus Elfring
2024-10-25 11:54 ` Derek M Jones
2024-10-27 12:48 ` [cocci] Broken code block size determination in function implementations Markus Elfring
0 siblings, 2 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-25 5:38 UTC (permalink / raw)
To: Derek M. Jones; +Cc: cocci
>> Which analysis goals did you try to achieve by extracting source code
>> position information from function implementations?
>
> An example
> https://shape-of-code.com/2024/10/13/if-statement-conditions-some-basic-measurements/
* Do you plan to analyse any function implementation sizes (according to your statistic approach)?
* Would source code be taken into account also from header files?
* How will the clarification of software run time characteristics evolve further
for advanced usage of the semantic patch language (Coccinelle)?
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Multiple returns significant performance impact
2024-10-25 5:38 ` Markus Elfring
@ 2024-10-25 11:54 ` Derek M Jones
2024-10-27 12:48 ` [cocci] Broken code block size determination in function implementations Markus Elfring
1 sibling, 0 replies; 39+ messages in thread
From: Derek M Jones @ 2024-10-25 11:54 UTC (permalink / raw)
To: cocci
Markus,
> * Do you plan to analyse any function implementation sizes (according to your statistic approach)?
Some weekend reading for you (chapters 4 and 7 to start).
My book Evidence-based Software Engineering
discusses what is currently known about software engineering,
based on an analysis of all the publicly available data
pdf+code+all data freely available here:
http://knosof.co.uk/ESEUR/
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Checking SmPL run time characteristics for code block position determination in function implementations
2024-10-24 15:50 ` Derek M Jones
` (2 preceding siblings ...)
2024-10-24 17:30 ` Derek M Jones
@ 2024-10-26 11:43 ` Markus Elfring
3 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-26 11:43 UTC (permalink / raw)
To: cocci; +Cc: Derek M. Jones, Victor Gambier
[-- Attachment #1: Type: text/plain, Size: 1422 bytes --]
> Just tried
> ... when exists
> on a couple of files that timed out after 600s.
> The reported time was 0.209s, so it's a lot faster!
Corresponding measurements can be considerably improved also according to
the software combination “1.2-00097-g7bc8a7ac-dirty” with the help of attached
two files (for example).
Markus_Elfring@Sonne:…/Projekte/Coccinelle/janitor> time spatch --profile list_curly_bracket_positions_for_function_implementations1.cocci ../Probe/puts-test1.c
…
action|"source file"|"block start line"|"block start column"|"block end line"|"block end column"
my_message|../Probe/puts-test1.c|4|1|6|2
…
profiling result
…
Rule searching : 0.000173 sec 1 count
…
real 0m0,227s
user 0m0,189s
sys 0m0,034s
Markus_Elfring@Sonne:…/Projekte/Coccinelle/janitor> time spatch --profile list_curly_bracket_positions_for_function_implementations2.cocci ../Probe/puts-test1.c
…
action|"source file"|"block start line"|"block start column"|"block end line"|"block end column"
my_message|../Probe/puts-test1.c|4|1|6|2
…
profiling result
…
Rule searching : 0.000196 sec 1 count
…
real 0m0,242s
user 0m0,203s
sys 0m0,015s
Will effects be reconsidered any more for the discussed SmPL ellipsis option
together with similar source code search approaches?
Regards,
Markus
[-- Attachment #2: list_curly_bracket_positions_for_function_implementations1.cocci --]
[-- Type: text/plain, Size: 1788 bytes --]
@initialize:python@
@@
import sys

records = {}

# Must derive from Exception, otherwise "raise" fails with a TypeError.
class integrity_error(Exception):
    pass

def store_data(places):
    """Add source code positions to an internal table."""
    for place in places:
        key = place.file, place.line, int(place.column) + 1

        if key in records:
            sys.stderr.write("\n".join(["-> duplicate data",
                                        "file:", key[0],
                                        "function:", place.current_element,
                                        "line:", str(place.line)]))
            sys.stderr.write("\n")
            raise integrity_error
        else:
            records[key] = place.current_element, place.current_element_line_end, int(place.current_element_column_end) + 1

@searching@
identifier f;
position p;
@@
f(...)
{@p
...
}

@script:python collection@
place << searching.p;
@@
store_data(place)

@finalize:python@
@@
if len(records) > 0:
    delimiter = "|"
    sys.stdout.write(delimiter.join(['action',
                                     '"source file"',
                                     '"block start line"',
                                     '"block start column"',
                                     '"block end line"',
                                     '"block end column"'
                                    ]))
    sys.stdout.write("\n")

    for key, value in records.items():
        sys.stdout.write(delimiter.join([value[0],
                                         key[0],
                                         key[1],
                                         str(key[2]),
                                         value[1],
                                         str(value[2])
                                        ]))
        sys.stdout.write("\n")
else:
    sys.stderr.write("No result for this analysis!\n")
[-- Attachment #3: puts-test1.c --]
[-- Type: text/x-csrc, Size: 63 bytes --]
#include <stdio.h>
void my_message(void)
{
puts("Hello!");
}
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Broken code block size determination in function implementations
2024-10-25 5:38 ` Markus Elfring
2024-10-25 11:54 ` Derek M Jones
@ 2024-10-27 12:48 ` Markus Elfring
2024-10-27 13:31 ` Julia Lawall
1 sibling, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 12:48 UTC (permalink / raw)
To: cocci; +Cc: Derek M. Jones, Victor Gambier
> * Do you plan to analyse any function implementation sizes (…)?
I got into the mood to try further data processing out also together with
an SmPL script variant (like the following).
@initialize:python@
@@
import sys

records = {}

class integrity_error(Exception):
    pass

def store_data(places):
    """Add source code positions to an internal table."""
    for place in places:
        key = place.file, place.line, int(place.column) + 1

        if key in records:
            sys.stderr.write("\n".join(["-> duplicate data",
                                        "file:", key[0],
                                        "function:", place.current_element,
                                        "line:", str(place.line)]))
            sys.stderr.write("\n")
            raise integrity_error
        else:
            records[key] = place.current_element, int(place.current_element_line_end) - int(place.line) - 1

@searching@
identifier f;
position p;
@@
f(...)
{@p
... when exists
}

@script:python collection@
place << searching.p;
@@
store_data(place)

@finalize:python@
@@
if len(records) > 0:
    for key, value in records.items():
        sys.stdout.write("|".join([value[0],
                                   str(value[1]),
                                   key[0],
                                   key[1],
                                   str(key[2])
                                  ]))
        sys.stdout.write("\n")
else:
    sys.stderr.write("No result for this analysis!\n")
Unfortunately, I stumbled on a few test results which trigger further development considerations.
Examples:
A) GENERATE_PERMUTATIONS_3_EVENTS(name)
https://elixir.bootlin.com/linux/v6.12-rc4/source/lib/locking-selftest.c#L288
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci lib/locking-selftest.c
…
-> duplicate data
file:
lib/locking-selftest.c
function:
W1R2_W2R3_W3R1_123
line:
290
…
B) CTX_RQ_SEQ_OPS(name, type)
https://elixir.bootlin.com/linux/v6.12-rc4/source/block/blk-mq-debugfs.c#L500
Markus_Elfring@Sonne:…/elfring/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci block/blk-mq-debugfs.c
…
-> duplicate data
file:
block/blk-mq-debugfs.c
function:
ctx_poll_rq_list_next
line:
512
…
Will the chances grow to handle involved macro code better anyhow?
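One possible reading of these reports, not verified against Coccinelle itself: the reported lines (290 and 512) lie a few lines after the linked macro definitions (288 and 500), so the opening brace of every function generated by such a macro may be reported at the same place inside the macro definition. A table keyed only by (file, line, column) would then collide. The sketch below reproduces that situation with plain Python data. The second function name and the column value are invented for illustration.

```python
from collections import namedtuple

Match = namedtuple("Match", "file line column function")

def collect(matches, key_of):
    """Store matches under key_of(match) and list the ones that collide."""
    records, collisions = {}, []
    for m in matches:
        key = key_of(m)
        if key in records:
            collisions.append(m)
        else:
            records[key] = m
    return records, collisions

# Two functions assumed to come from one macro body (hypothetical data:
# only the first name and the line number appear in the report above).
matches = [
    Match("lib/locking-selftest.c", "290", 1, "W1R2_W2R3_W3R1_123"),
    Match("lib/locking-selftest.c", "290", 1, "W1R2_W2R3_W3R1_132"),
]

by_position, clash = collect(matches, lambda m: (m.file, m.line, m.column))
by_name, no_clash = collect(
    matches, lambda m: (m.file, m.line, m.column, m.function))

print(len(by_position), len(clash), len(by_name), len(no_clash))
```

If this reading holds, including the function name in the key would avoid the exception, although the computed block sizes would still refer to the macro definition rather than to the individual expansions.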
Regards,
Markus
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 12:48 ` [cocci] Broken code block size determination in function implementations Markus Elfring
@ 2024-10-27 13:31 ` Julia Lawall
2024-10-27 15:15 ` Markus Elfring
2024-10-27 16:40 ` Markus Elfring
0 siblings, 2 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-27 13:31 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones, Victor Gambier
[-- Attachment #1: Type: text/plain, Size: 2958 bytes --]
On Sun, 27 Oct 2024, Markus Elfring wrote:
> > * Do you plan to analyse any function implementation sizes (…)?
>
> I got into the mood to try further data processing out also together with
> an SmPL script variant (like the following).
>
>
> @initialize:python@
> @@
> import sys
>
> records = {}
>
> class integrity_error(Exception):
> pass
>
> def store_data(places):
> """Add source code positions to an internal table."""
> for place in places:
> key = place.file, place.line, int(place.column) + 1
>
> if key in records:
> sys.stderr.write("\n".join(["-> duplicate data",
> "file:", key[0],
> "function:", place.current_element,
> "line:", str(place.line)]))
> sys.stderr.write("\n")
> raise integrity_error
> else:
> records[key] = place.current_element, int(place.current_element_line_end) - int(place.line) - 1
>
> @searching@
> identifier f;
> position p;
> @@
> f(...)
> {@p
> ... when exists
> }
>
> @script:python collection@
> place << searching.p;
> @@
> store_data(place)
>
> @finalize:python@
> @@
> if len(records) > 0:
> for key, value in records.items():
> sys.stdout.write("|".join([value[0],
> str(value[1]),
> key[0],
> key[1],
> str(key[2])
> ]))
> sys.stdout.write("\n")
> else:
> sys.stderr.write("No result for this analysis!\n")
>
>
> Unfortunately, I stumbled on a few test results which trigger further development considerations.
>
> Examples:
>
> A) GENERATE_PERMUTATIONS_3_EVENTS(name)
> https://elixir.bootlin.com/linux/v6.12-rc4/source/lib/locking-selftest.c#L288
> Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci lib/locking-selftest.c
> …
> -> duplicate data
> file:
> lib/locking-selftest.c
> function:
> W1R2_W2R3_W3R1_123
> line:
> 290
> …
>
> B) CTX_RQ_SEQ_OPS(name, type)
> https://elixir.bootlin.com/linux/v6.12-rc4/source/block/blk-mq-debugfs.c#L500
> Markus_Elfring@Sonne:…/elfring/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci block/blk-mq-debugfs.c
> …
> -> duplicate data
> file:
> block/blk-mq-debugfs.c
> function:
> ctx_poll_rq_list_next
> line:
> 512
> …
>
>
> Will the chances grow to handle involved macro code better anyhow?
Derek reported off list something related to macros that Victor is looking
into.
If you could make a semantic patch that doesn't involve 30-some lines of
python code and would provide a minimal test case, then I could try to
understand if you are concerned about the same issue.
julia
^ permalink raw reply [flat|nested] 39+ messages in thread
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 13:31 ` Julia Lawall
@ 2024-10-27 15:15 ` Markus Elfring
2024-10-27 16:40 ` Markus Elfring
1 sibling, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 15:15 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
…
>> @searching@
>> identifier f;
>> position p;
>> @@
>> f(...)
>> {@p
>> ... when exists
>> }
…
> If you could make a semantic patch that doesn't involve 30-some lines of
> python code and would provide a minimal test case, then I could try to
> understand if you are concerned about the same issue.
Does the general source code search approach shown above give you further development ideas?
How will support for the mentioned macro code be improved?
(How many such macros generate variants of function implementations?)
Regards,
Markus
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 13:31 ` Julia Lawall
2024-10-27 15:15 ` Markus Elfring
@ 2024-10-27 16:40 ` Markus Elfring
2024-10-27 17:05 ` Julia Lawall
1 sibling, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 16:40 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
…
> If you could make a semantic patch that doesn't involve 30-some lines of
> python code and would provide a minimal test case, then I could try to
> understand if you are concerned about the same issue.
Would another example help to achieve a better understanding?
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci arch/sh/drivers/pci/common.c
…
-> duplicate data
file:
arch/sh/drivers/pci/common.c
function:
early_read_config_dword
line:
36
…
https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33 :
…
#define EARLY_PCI_OP(rw, size, type) \
int __init early_##rw##_config_##size(struct pci_channel *hose, \
int top_bus, int bus, int devfn, int offset, type value) \
{ \
return pci_##rw##_config_##size( \
fake_pci_dev(hose, top_bus, bus, devfn), \
offset, value); \
}
EARLY_PCI_OP(read, byte, u8 *)
EARLY_PCI_OP(read, word, u16 *)
EARLY_PCI_OP(read, dword, u32 *)
EARLY_PCI_OP(write, byte, u8)
EARLY_PCI_OP(write, word, u16)
EARLY_PCI_OP(write, dword, u32)
…
Regards,
Markus
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 16:40 ` Markus Elfring
@ 2024-10-27 17:05 ` Julia Lawall
2024-10-27 17:28 ` Markus Elfring
` (3 more replies)
0 siblings, 4 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-27 17:05 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones, Victor Gambier
On Sun, 27 Oct 2024, Markus Elfring wrote:
> …
> > If you could make a semantic patch that doesn't involve 30-some lines of
> > python code and would provide a minimal test case, then I could try to
> > understand if you are concerned about the same issue.
> Would another example help to achieve a better understanding?
>
>
> Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci arch/sh/drivers/pci/common.c
> …
> -> duplicate data
> file:
> arch/sh/drivers/pci/common.c
> function:
> early_read_config_dword
> line:
> 36
> …
>
>
> https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33 :
> …
> #define EARLY_PCI_OP(rw, size, type) \
> int __init early_##rw##_config_##size(struct pci_channel *hose, \
> int top_bus, int bus, int devfn, int offset, type value) \
> { \
> return pci_##rw##_config_##size( \
> fake_pci_dev(hose, top_bus, bus, devfn), \
> offset, value); \
> }
>
> EARLY_PCI_OP(read, byte, u8 *)
> EARLY_PCI_OP(read, word, u16 *)
> EARLY_PCI_OP(read, dword, u32 *)
> EARLY_PCI_OP(write, byte, u8)
> EARLY_PCI_OP(write, word, u16)
> EARLY_PCI_OP(write, dword, u32)
> …
OK, that was much more understandable. In general, we don't make a great
effort to give reasonable results for macro definitions. I don't even
know what result is wanted? The starting position of each EARLY and the
ending position of each )? I guess the problem is that EARLY_PCI_OP is
not recognized as a declarer name, and so a parse error is found, causing
the macro to be expanded.
julia
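The effect described above (one macro body yielding several function names at the same source position) can be checked after the fact. What follows is only a minimal Python sketch and not a Coccinelle feature: it tests whether a reported function name occurs literally in the source text. The macro text is the one quoted in this thread; `ordinary_function` and the helper `occurs_literally` are hypothetical names introduced for illustration.

```python
import re

# Macro text as quoted in this thread. ordinary_function is a hypothetical
# stand-in for a function whose name is written out literally.
SAMPLE = r"""
#define EARLY_PCI_OP(rw, size, type) \
int __init early_##rw##_config_##size(struct pci_channel *hose, \
int top_bus, int bus, int devfn, int offset, type value) \
{ \
return pci_##rw##_config_##size( \
fake_pci_dev(hose, top_bus, bus, devfn), \
offset, value); \
}
EARLY_PCI_OP(read, dword, u32 *)
void ordinary_function(void)
{
}
"""


def occurs_literally(source_text, name):
    """Return True if name appears as a whole identifier in source_text."""
    pattern = r"(?<![A-Za-z0-9_])" + re.escape(name) + r"(?![A-Za-z0-9_])"
    return re.search(pattern, source_text) is not None


reported = ["early_read_config_dword", "ordinary_function"]
# Names that never appear literally were presumably built by token pasting (##).
generated = [name for name in reported if not occurs_literally(SAMPLE, name)]
print(generated)
```

A name that is reported by the semantic patch but absent from the file text is a candidate for having been produced by macro expansion, which is one way to filter such entries from the results.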
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:05 ` Julia Lawall
@ 2024-10-27 17:28 ` Markus Elfring
2024-10-27 17:34 ` Julia Lawall
2024-10-28 9:00 ` [cocci] Searching for macro calls besides function implementations (with SmPL)? Markus Elfring
2024-10-28 10:44 ` [cocci] Broken code block size determination in function implementations Markus Elfring
` (2 subsequent siblings)
3 siblings, 2 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 17:28 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
>> …
>> -> duplicate data
>> file:
>> arch/sh/drivers/pci/common.c
>> function:
>> early_read_config_dword
>> line:
>> 36
>> …
>>
>>
>> https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33 :
>> …
>> #define EARLY_PCI_OP(rw, size, type) \
>> int __init early_##rw##_config_##size(struct pci_channel *hose, \
>> int top_bus, int bus, int devfn, int offset, type value) \
>> { \
>> return pci_##rw##_config_##size( \
>> fake_pci_dev(hose, top_bus, bus, devfn), \
>> offset, value); \
>> }
…
>> EARLY_PCI_OP(read, dword, u32 *)
>> EARLY_PCI_OP(write, byte, u8)
…
> OK, that was much more understandable. In general, we don't make a great
> effort to give reasonable results for macro definitions.
Under which circumstances can this situation be adjusted better?
> I don't even know what result is wanted?
The discussed source code search should be restricted to function implementations
(which would usually not be generated by special macros).
> The starting position of each EARLY and the ending position of each )?
Obviously, this description does not fit the desired data processing goal here.
> I guess the problem is that EARLY_PCI_OP is not recognized as a declarer name,
No information “SPECIAL NAMES: adding … as a declarer” was presented for this test case.
> and so a parse error is found,
I did not see such an indication so far.
> causing the macro to be expanded.
The mentioned test output indicated a questionable function name generation.
How will the software understanding evolve further?
Regards,
Markus
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:28 ` Markus Elfring
@ 2024-10-27 17:34 ` Julia Lawall
2024-10-27 17:45 ` Markus Elfring
2024-10-28 9:00 ` [cocci] Searching for macro calls besides function implementations (with SmPL)? Markus Elfring
1 sibling, 1 reply; 39+ messages in thread
From: Julia Lawall @ 2024-10-27 17:34 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones, Victor Gambier
On Sun, 27 Oct 2024, Markus Elfring wrote:
> >> …
> >> -> duplicate data
> >> file:
> >> arch/sh/drivers/pci/common.c
> >> function:
> >> early_read_config_dword
> >> line:
> >> 36
> >> …
> >>
> >>
> >> https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33 :
> >> …
> >> #define EARLY_PCI_OP(rw, size, type) \
> >> int __init early_##rw##_config_##size(struct pci_channel *hose, \
> >> int top_bus, int bus, int devfn, int offset, type value) \
> >> { \
> >> return pci_##rw##_config_##size( \
> >> fake_pci_dev(hose, top_bus, bus, devfn), \
> >> offset, value); \
> >> }
> …
> >> EARLY_PCI_OP(read, dword, u32 *)
> >> EARLY_PCI_OP(write, byte, u8)
> …
> > OK, that was much more understandable. In general, we don't make a great
> > effort to give reasonable results for macro definitions.
>
> Under which circumstances can this situation be adjusted better?
>
>
> > I don't even know what result is wanted?
>
> The discussed source code search should be restricted to function implementations
> (which would usually not be generated by special macros).
>
>
> > The starting position of each EARLY and the ending position of each )?
>
> Obviously, this description does not fit the desired data processing goal here.
>
>
> > I guess the problem is that EARLY_PCI_OP is not recognized as a declarer name,
>
> No information “SPECIAL NAMES: adding … as a declarer” was presented for this test case.
>
>
> > and so a parse error is found,
>
> I did not see such an indication so far.
Coccinelle doesn't report on parse errors unless an option like
--verbose-parsing or --parse-c is provided.
julia
>
>
> > causing the macro to be expanded.
> The mentioned test output indicated a questionable function name generation.
> How will the software understanding evolve further?
>
> Regards,
> Markus
>
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:34 ` Julia Lawall
@ 2024-10-27 17:45 ` Markus Elfring
2024-10-27 17:55 ` Julia Lawall
2024-10-27 18:00 ` Derek M Jones
0 siblings, 2 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 17:45 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
> Coccinelle doesn't report on parse errors unless an option like
> --verbose-parsing or --parse-c is provided.
How helpful do you find the following information then?
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch --verbose-parsing …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci arch/sh/drivers/pci/common.c
…
HANDLING: arch/sh/drivers/pci/common.c
ERROR-RECOV: found sync col 0 at line 49
parsing pass2: try again
ERROR-RECOV: found sync col 0 at line 49
parsing pass3: try again
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 983, 41, 0, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1014, 42, 30, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1046, 43, 31, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1079, 44, 32, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1109, 45, 29, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
, keeping typeD __init
; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1140, 46, 30, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
-> duplicate data
file:
arch/sh/drivers/pci/common.c
function:
early_read_config_dword
line:
36
…
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch --parse-c arch/sh/drivers/pci/common.c
…
NB total files = 1; perfect = 1; pbs = 0; timeout = 0; =========> 100%
nb good = 159, nb passed = 6 =========> 3.64% passed
nb good = 159, nb bad = 0 =========> 100.00% good or passed
Regards,
Markus
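The `--parse-c` summary shown above can be turned into numbers for comparison across files. This is a minimal Python sketch under the assumption that the summary keeps the `name = integer` layout seen in this message; the function name `parse_counts` is hypothetical, and the meaning of each counter is not explained in this thread.

```python
import re

# Summary lines copied from the spatch --parse-c output in this message.
SUMMARY = """\
NB total files = 1; perfect = 1; pbs = 0; timeout = 0; =========> 100%
nb good = 159, nb passed = 6 =========> 3.64% passed
nb good = 159, nb bad = 0 =========> 100.00% good or passed
"""

PAIR = re.compile(r"([A-Za-z][A-Za-z ]*?)[ \t]*=[ \t]*(\d+)")


def parse_counts(summary_text):
    """Collect every 'name = integer' pair from the summary into a dict."""
    counts = {}
    for line in summary_text.splitlines():
        for name, value in PAIR.findall(line):
            counts[name] = int(value)
    return counts


counts = parse_counts(SUMMARY)
for name, value in counts.items():
    print(f"{name}: {value}")
```

For this file the extracted values include `nb bad = 0` and `pbs = 0`, even though `--verbose-parsing` printed ERROR-RECOV messages, so the two outputs are worth reading side by side.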
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:45 ` Markus Elfring
@ 2024-10-27 17:55 ` Julia Lawall
2024-10-27 18:04 ` Markus Elfring
2024-10-27 18:00 ` Derek M Jones
1 sibling, 1 reply; 39+ messages in thread
From: Julia Lawall @ 2024-10-27 17:55 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones, Victor Gambier
On Sun, 27 Oct 2024, Markus Elfring wrote:
> > Coccinelle doesn't report on parse errors unless an option like
> > --verbose-parsing or --parse-c is provided.
>
> How helpful do you find the following information then?
>
>
> Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch --verbose-parsing …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations4.cocci arch/sh/drivers/pci/common.c
> …
> HANDLING: arch/sh/drivers/pci/common.c
> ERROR-RECOV: found sync col 0 at line 49
> parsing pass2: try again
> ERROR-RECOV: found sync col 0 at line 49
You can look in your file and see what is on line 49.
julia
> parsing pass3: try again
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 983, 41, 0, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1014, 42, 30, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1046, 43, 31, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1079, 44, 32, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1109, 45, 29, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> Warning: PARSING: arch/sh/drivers/pci/common.c:34: two or more data types: dropping int
> , keeping typeD __init
>
> ; value = ([0], [0], [(0, 0, (Tag9 ((["__init"; (Tag2 (("__init", 736, 34, 4, "arch/sh/drivers/pci/common.c"), (("\n", 1140, 46, 30, "arch/sh/drivers/pci/common.c"), 4)), (0), ((0, 0, 0, 0)), 0, (3))]), 0)))], [(0, 0, 0)], [0])
> -> duplicate data
> file:
> arch/sh/drivers/pci/common.c
> function:
> early_read_config_dword
> line:
> 36
> …
>
>
>
> Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> spatch --parse-c arch/sh/drivers/pci/common.c
> …
> NB total files = 1; perfect = 1; pbs = 0; timeout = 0; =========> 100%
> nb good = 159, nb passed = 6 =========> 3.64% passed
> nb good = 159, nb bad = 0 =========> 100.00% good or passed
>
>
> Regards,
> Markus
>
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:45 ` Markus Elfring
2024-10-27 17:55 ` Julia Lawall
@ 2024-10-27 18:00 ` Derek M Jones
2024-10-27 18:11 ` [cocci] Evolving experiences from evidence-based software engineering Markus Elfring
1 sibling, 1 reply; 39+ messages in thread
From: Derek M Jones @ 2024-10-27 18:00 UTC (permalink / raw)
To: cocci
Markus,
> How helpful do you find the following information then?
All these questions. Have you finished reading my book already?
--
Derek M. Jones Evidence-based software engineering
blog:https://shape-of-code.com
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:55 ` Julia Lawall
@ 2024-10-27 18:04 ` Markus Elfring
0 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 18:04 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
>>> Coccinelle doesn't report on parse errors unless an option like
>>> --verbose-parsing or --parse-c is provided.
>>
>> How helpful do you find the following information then?
…
>> parsing pass2: try again
>> ERROR-RECOV: found sync col 0 at line 49
>
> You can look in your file and see what is on line 49.
The definition of the function “pci_is_66mhz_capable” is started there
(after six calls of the special macro “EARLY_PCI_OP”).
See also:
https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L49
Regards,
Markus
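Looking up the line named in an ERROR-RECOV message can be automated for larger logs. A minimal Python sketch, assuming the message wording stays as in the log quoted in this thread; `sync_lines` is a hypothetical helper name.

```python
import re

# Log excerpt copied from the --verbose-parsing output quoted in this thread.
LOG = """\
HANDLING: arch/sh/drivers/pci/common.c
ERROR-RECOV: found sync col 0 at line 49
parsing pass2: try again
ERROR-RECOV: found sync col 0 at line 49
parsing pass3: try again
"""

SYNC = re.compile(r"ERROR-RECOV: found sync col \d+ at line (\d+)")


def sync_lines(log_text):
    """Return the sorted, de-duplicated line numbers from ERROR-RECOV messages."""
    return sorted({int(match.group(1)) for match in SYNC.finditer(log_text)})


print(sync_lines(LOG))
```

Each returned number is a line where the parser resynchronised, so the problematic code is expected somewhere before it; here that is the block of `EARLY_PCI_OP` calls that precedes line 49.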
* Re: [cocci] Evolving experiences from evidence-based software engineering
2024-10-27 18:00 ` Derek M Jones
@ 2024-10-27 18:11 ` Markus Elfring
0 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-27 18:11 UTC (permalink / raw)
To: Derek M. Jones; +Cc: cocci
>> How helpful do you find the following information then?
>
> All these questions. Have you finished reading my book already?
Yes.
Can this information source benefit any further from the running development discussion?
Regards,
Markus
* Re: [cocci] Searching for macro calls besides function implementations (with SmPL)?
2024-10-27 17:28 ` Markus Elfring
2024-10-27 17:34 ` Julia Lawall
@ 2024-10-28 9:00 ` Markus Elfring
2024-10-28 12:57 ` Julia Lawall
1 sibling, 1 reply; 39+ messages in thread
From: Markus Elfring @ 2024-10-28 9:00 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
> The mentioned test output indicated a questionable function name generation.
Will it ever become possible to find macro calls (between function implementations)
which can construct further functions?
A small source code search (like the following) does not yet produce the desired
results with the semantic patch language (Coccinelle software).
@display@
@@
*EARLY_PCI_OP(...)
See also:
https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33
Regards,
Markus
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:05 ` Julia Lawall
2024-10-27 17:28 ` Markus Elfring
@ 2024-10-28 10:44 ` Markus Elfring
2024-10-29 12:17 ` Markus Elfring
2024-10-31 7:50 ` Markus Elfring
3 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-28 10:44 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
> I guess the problem is that EARLY_PCI_OP is not recognized as a declarer name,
> and so a parse error is found,
There are probably other data processing challenges involved.
> causing the macro to be expanded.
Would we usually like to analyse (and eventually transform) unexpanded source code?
https://gitlab.inria.fr/coccinelle/coccinelle/-/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/docs/manual/introduction.tex#L1
https://github.com/coccinelle/coccinelle/blob/7bc8a7acf4673a3143cae7aa00ef9374c9fdf893/docs/manual/introduction.tex#L1
Do you find the following data representations helpful
also for further clarifications and development considerations?
Another example script for the semantic patch language:
@initialize:python@
@@
import sys

def show_data(places):
    for place in places:
        sys.stdout.write("|".join([place.current_element,
                                   str(int(place.current_element_line_end) - int(place.line) - 1),
                                   place.file,
                                   place.line,
                                   str(int(place.column) + 1)
                                  ]))
        sys.stdout.write("\n")

@searching@
identifier f;
position p;
@@
f(...)
{@p
...
}

@script:python presenting@
place << searching.p;
@@
show_data(place)
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> time spatch --profile …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations5.cocci arch/sh/drivers/pci/common.c
…
early_read_config_byte|3|arch/sh/drivers/pci/common.c|36|1
early_read_config_dword|3|arch/sh/drivers/pci/common.c|36|1
early_read_config_word|3|arch/sh/drivers/pci/common.c|36|1
early_write_config_byte|3|arch/sh/drivers/pci/common.c|36|1
early_write_config_dword|3|arch/sh/drivers/pci/common.c|36|1
early_write_config_word|3|arch/sh/drivers/pci/common.c|36|1
fake_pci_dev|17|arch/sh/drivers/pci/common.c|13|1
pci_is_66mhz_capable|35|arch/sh/drivers/pci/common.c|51|1
pcibios_enable_err|5|arch/sh/drivers/pci/common.c|90|1
pcibios_enable_serr|5|arch/sh/drivers/pci/common.c|99|1
pcibios_enable_timers|7|arch/sh/drivers/pci/common.c|108|1
pcibios_handle_status_errors|34|arch/sh/drivers/pci/common.c|125|1
---------------------
profiling result
…
Rule searching : 0.003574 sec 1 count
…
real 0m0,229s
user 0m0,179s
sys 0m0,050s
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> time spatch --profile …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations6.cocci arch/sh/drivers/pci/common.c
…
profiling result
…
Rule searching : 0.012334 sec 1 count
…
real 0m0,263s
user 0m0,194s
sys 0m0,042s
Regards,
Markus
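The six `early_*` entries in the first test output all share line 36, which can be detected by post-processing the pipe-delimited records. A minimal Python sketch, assuming the field order produced by `show_data` above (name, block size, file, line, column); `group_by_position` is a hypothetical helper and only a few of the reported lines are repeated here.

```python
from collections import defaultdict

# A few records copied from the test output above.
REPORT = [
    "early_read_config_byte|3|arch/sh/drivers/pci/common.c|36|1",
    "early_read_config_dword|3|arch/sh/drivers/pci/common.c|36|1",
    "early_read_config_word|3|arch/sh/drivers/pci/common.c|36|1",
    "fake_pci_dev|17|arch/sh/drivers/pci/common.c|13|1",
    "pci_is_66mhz_capable|35|arch/sh/drivers/pci/common.c|51|1",
]


def group_by_position(report_lines):
    """Group function names by the (file, line, column) of their opening brace."""
    groups = defaultdict(list)
    for line in report_lines:
        name, _size, path, start_line, column = line.split("|")
        groups[(path, int(start_line), int(column))].append(name)
    return groups


# Positions claimed by more than one function name.
shared = {
    position: names
    for position, names in group_by_position(REPORT).items()
    if len(names) > 1
}
for position, names in shared.items():
    print(position, names)
```

Positions that carry more than one name point at a macro body rather than at an ordinary function implementation, so they can be reported separately or dropped.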
* Re: [cocci] Searching for macro calls besides function implementations (with SmPL)?
2024-10-28 9:00 ` [cocci] Searching for macro calls besides function implementations (with SmPL)? Markus Elfring
@ 2024-10-28 12:57 ` Julia Lawall
0 siblings, 0 replies; 39+ messages in thread
From: Julia Lawall @ 2024-10-28 12:57 UTC (permalink / raw)
To: Markus Elfring; +Cc: cocci, Derek M. Jones, Victor Gambier
On Mon, 28 Oct 2024, Markus Elfring wrote:
> > The mentioned test output indicated a questionable function name generation.
> Will it ever become possible to find macro calls (between function implementations)
> which can construct further functions?
>
> A tiny source code search approach (like the following) does not produce desirable
> data processing results so far by the means of the semantic patch language
> (Coccinelle software).
>
>
> @display@
declarer name EARLY_PCI_OP;
> @@
> *EARLY_PCI_OP(...)
But maybe it won't work because the code doesn't end with a semicolon.
julia
>
> See also:
> https://elixir.bootlin.com/linux/v6.12-rc4/source/arch/sh/drivers/pci/common.c#L33
>
> Regards,
> Markus
>
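Independently of SmPL, candidate macro names for a `declarer name` declaration can be collected with a plain text scan. This is only a heuristic Python sketch, not part of Coccinelle: it assumes that such macros are written in upper case, start in column 0, and are called without a trailing semicolon. The name `macro_call_candidates` is hypothetical.

```python
import re
from collections import Counter

# Macro calls as quoted in this thread, preceded by the end of the macro body.
SOURCE = """\
offset, value); \\
}
EARLY_PCI_OP(read, byte, u8 *)
EARLY_PCI_OP(read, word, u16 *)
EARLY_PCI_OP(read, dword, u32 *)
EARLY_PCI_OP(write, byte, u8)
EARLY_PCI_OP(write, word, u16)
EARLY_PCI_OP(write, dword, u32)
"""

# Upper-case identifier, argument list, and nothing after the closing parenthesis.
CANDIDATE = re.compile(r"([A-Z][A-Z0-9_]*)\s*\(.*\)")


def macro_call_candidates(source_text):
    """Count unindented upper-case calls that do not end with a semicolon."""
    counts = Counter()
    for line in source_text.splitlines():
        match = CANDIDATE.fullmatch(line.rstrip())
        if match:
            counts[match.group(1)] += 1
    return counts


print(macro_call_candidates(SOURCE))
```

The resulting names are only suggestions; whether declaring them as declarer names makes the file parse as intended still has to be checked with spatch itself.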
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:05 ` Julia Lawall
2024-10-27 17:28 ` Markus Elfring
2024-10-28 10:44 ` [cocci] Broken code block size determination in function implementations Markus Elfring
@ 2024-10-29 12:17 ` Markus Elfring
2024-10-31 7:50 ` Markus Elfring
3 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-29 12:17 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
> OK, that was much more understandable. …
I would like to add another development idea.
@initialize:python@
@@
import sys

records = {}

def store_data(places):
    """Add source code positions to an internal table."""
    for place in places:
        key = place.file, place.line, int(place.column) + 1

        if key in records:
            value = records[key]
            actions = value[0]
            actions.append(place.current_element)
            records[key] = actions, value[1]
        else:
            records[key] = [place.current_element], str(int(place.current_element_line_end) - int(place.line) - 1)

@searching@
identifier f;
position p;
@@
f(...)
{@p
...
}

@script:python collection@
place << searching.p;
@@
store_data(place)

@finalize:python@
@@
if len(records) > 0:
    delimiter = "|"
    sys.stdout.write(delimiter.join(['action',
                                     '"block size"',
                                     '"source file"',
                                     '"block start line"',
                                     '"block start column"'
                                    ]))
    sys.stdout.write("\n")

    for key, value in records.items():
        sys.stdout.write(delimiter.join([str(value[0]),
                                         str(value[1]),
                                         key[0],
                                         key[1],
                                         str(key[2])
                                        ]))
        sys.stdout.write("\n")
else:
    sys.stderr.write("No result for this analysis!\n")
Test result:
Markus_Elfring@Sonne:…/Projekte/Linux/next-analyses> time spatch --python /usr/bin/python3 …/Projekte/Coccinelle/janitor/list_curly_bracket_positions_for_function_implementations7.cocci arch/sh/drivers/pci/common.c
…
action|"block size"|"source file"|"block start line"|"block start column"
['early_read_config_byte', 'early_read_config_dword', 'early_read_config_word', 'early_write_config_byte', 'early_write_config_dword', 'early_write_config_word']|3|arch/sh/drivers/pci/common.c|36|1
['fake_pci_dev']|17|arch/sh/drivers/pci/common.c|13|1
…
real 0m0,164s
user 0m0,133s
sys 0m0,028s
Would you need more powerful computing resources to repeat the source code
analysis shown above, with parallel data processing, on all Linux source files?
Would you be curious to take another look at the run time characteristics
for the biggest function implementations (which can also be found
by means of the semantic patch language)?
Regards,
Markus
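The report format above stores a Python list representation in its first column, so it can be read back without a custom parser. A minimal Python sketch, assuming the header line and field order shown in the test result; `parse_report` is a hypothetical helper name.

```python
import ast

# Header and two data lines copied from the test result above.
REPORT = """\
action|"block size"|"source file"|"block start line"|"block start column"
['early_read_config_byte', 'early_read_config_dword', 'early_read_config_word', 'early_write_config_byte', 'early_write_config_dword', 'early_write_config_word']|3|arch/sh/drivers/pci/common.c|36|1
['fake_pci_dev']|17|arch/sh/drivers/pci/common.c|13|1
"""


def parse_report(text):
    """Turn the pipe-delimited report into a list of dictionaries."""
    rows = []
    for line in text.strip().splitlines()[1:]:  # skip the header line
        # Split from the right so the list text in the first field stays intact.
        names, size, path, start_line, column = line.rsplit("|", 4)
        rows.append({
            "names": ast.literal_eval(names),
            "size": int(size),
            "file": path,
            "line": int(start_line),
            "column": int(column),
        })
    return rows


rows = parse_report(REPORT)
largest = max(rows, key=lambda row: row["size"])
shared = [row for row in rows if len(row["names"]) > 1]
print(largest["names"], largest["size"])
print(len(shared))
```

Sorting or filtering the parsed rows by `size` is one way to pick out the biggest function implementations, and rows with more than one name mark positions inside macro definitions.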
* Re: [cocci] Broken code block size determination in function implementations
2024-10-27 17:05 ` Julia Lawall
` (2 preceding siblings ...)
2024-10-29 12:17 ` Markus Elfring
@ 2024-10-31 7:50 ` Markus Elfring
3 siblings, 0 replies; 39+ messages in thread
From: Markus Elfring @ 2024-10-31 7:50 UTC (permalink / raw)
To: Julia Lawall, cocci; +Cc: Derek M. Jones, Victor Gambier
> OK, that was much more understandable. In general, we don't make a great
> effort to give reasonable results for macro definitions. …
Will the source code places found by a command like the following
become more interesting for related development considerations?
Markus_Elfring@Sonne:…/Projekte/Coccinelle/20160205> rg '##' parsing_c
…
Regards,
Markus
Thread overview: 39+ messages
2024-10-24 11:56 [cocci] Multiple returns significant performance impact Derek M Jones
2024-10-24 12:42 ` Markus Elfring
2024-10-24 14:48 ` Julia Lawall
2024-10-24 15:23 ` Derek M Jones
2024-10-24 15:29 ` Julia Lawall
2024-10-24 15:35 ` Derek M Jones
2024-10-24 15:38 ` Julia Lawall
2024-10-24 15:44 ` Victor Gambier
2024-10-24 16:00 ` Julia Lawall
2024-10-24 15:50 ` Derek M Jones
2024-10-24 16:07 ` Markus Elfring
2024-10-24 16:14 ` Julia Lawall
2024-10-24 16:48 ` Markus Elfring
2024-10-24 17:30 ` Derek M Jones
2024-10-24 18:26 ` Markus Elfring
2024-10-24 21:03 ` Derek M Jones
2024-10-25 5:38 ` Markus Elfring
2024-10-25 11:54 ` Derek M Jones
2024-10-27 12:48 ` [cocci] Broken code block size determination in function implementations Markus Elfring
2024-10-27 13:31 ` Julia Lawall
2024-10-27 15:15 ` Markus Elfring
2024-10-27 16:40 ` Markus Elfring
2024-10-27 17:05 ` Julia Lawall
2024-10-27 17:28 ` Markus Elfring
2024-10-27 17:34 ` Julia Lawall
2024-10-27 17:45 ` Markus Elfring
2024-10-27 17:55 ` Julia Lawall
2024-10-27 18:04 ` Markus Elfring
2024-10-27 18:00 ` Derek M Jones
2024-10-27 18:11 ` [cocci] Evolving experiences from evidence-based software engineering Markus Elfring
2024-10-28 9:00 ` [cocci] Searching for macro calls besides function implementations (with SmPL)? Markus Elfring
2024-10-28 12:57 ` Julia Lawall
2024-10-28 10:44 ` [cocci] Broken code block size determination in function implementations Markus Elfring
2024-10-29 12:17 ` Markus Elfring
2024-10-31 7:50 ` Markus Elfring
2024-10-26 11:43 ` [cocci] Checking SmPL run time characteristics for code block position " Markus Elfring
2024-10-24 16:16 ` [cocci] SmPL position variables … Markus Elfring
2024-10-24 16:20 ` Julia Lawall
2024-10-24 16:28 ` Markus Elfring