All of lore.kernel.org
 help / color / mirror / Atom feed
* Coccinelle Challenge 1
@ 2016-10-11 23:23 Elizabeth Ferdman
  0 siblings, 0 replies; 3+ messages in thread
From: Elizabeth Ferdman @ 2016-10-11 23:23 UTC (permalink / raw)
  To: julia.lawal; +Cc: outreachy-kernel

Hi Julia, 
I worked on the first challenge and I wanted to run my findings and
questions by you. Please confirm whether these observations are correct.

I looked at the difference between local idexpression, expression, and
identifier for ret.cocci. 

Local idexpression basically means what local identifier
would mean-- just local vars--  except that's invalid. 

Using expression in place of local identifier included results like
this: 

- a->b = c;
- return a->b;
+ return c;

I don't know whether or not that's always the right thing to do. I think
that if the changes are only local, it'd be fine to compress the lines.
But if a->b happens to have a global effect than it'd be wrong. I'm not
sure about this, but if the function receives *a as a parameter, then
would a->b = c be a persistent change? 

Using identifier included results where 'ret' was not necessarily local,
so compressing the lines would be bad because the global assignment
would be deleted.

I noticed some potentially bad results with ret.cocci:
1. When the type of 'ret' happens to be important, lines shouldn't be
compressed. For example, a function receives a variable of type u64 but
then returns it as u32. Deleting the assignment line means that the
value will be returned as the wrong type.

2. spatch deletes a comment between the assignment and the return
statement. E.g.:

-       start = iwe(info);
-       /* how to translate rssi to ?% */
-       return start;
+       return iwe(info);
 }

3. readability issues-- 
- the return value is very long with many |'s or ||'s
- the name of the variable might be meaningful to the reader
- there are if blocks with return statements: 
 same variable name but different values might provide meaning/order to
 the program.
In these cases the variable reassignment or creation isn't actually
necessary, but does it increase readability?

Questions:

I'm pretty confused about how these lines work: 

-ret =
+return
     e;
-return ret;

The first line deletes the name of the variable and the equals sign, 
but what happens to the expression on the right?
I'm confused as to why it doesn't say -ret = e to signify that the
expression on the right is what we want 'e' to be. Then return gets
added, and it's replacing 'ret ='. But why is 'e' even necessary if it
wasn't deleted? And why is it on a new line with spaces in front? 
The last line '-return ret;' makes sense, we just want to delete the 
word return followed by the variable name. 

I've read some basics of coccinelle, went through the slides, and read
some articles people wrote about it but I'm still pretty confused. :/

Thanks for your time, 
Elizabeth


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Coccinelle Challenge 1
@ 2016-10-11 23:32 Elizabeth Ferdman
  2016-10-12  5:37 ` Julia Lawall
  0 siblings, 1 reply; 3+ messages in thread
From: Elizabeth Ferdman @ 2016-10-11 23:32 UTC (permalink / raw)
  To: julia.lawall; +Cc: outreachy-kernel

Hi Julia,
I worked on the first challenge and I wanted to run my findings and
questions by you. Please confirm whether these observations are correct.

I looked at the difference between local idexpression, expression, and
identifier for ret.cocci.

Local idexpression basically means what local identifier
would mean-- just local vars--  except that's invalid.

Using expression in place of local identifier included results like
this:

- a->b = c;
- return a->b;
+ return c;

I don't know whether or not that's always the right thing to do. I think
that if the changes are only local, it'd be fine to compress the lines.
But if a->b happens to have a global effect than it'd be wrong. I'm not
sure about this, but if the function receives *a as a parameter, then
would a->b = c be a persistent change?

Using identifier included results where 'ret' was not necessarily local,
so compressing the lines would be bad because the global assignment
would be deleted.

I noticed some potentially bad results with ret.cocci:
1. When the type of 'ret' happens to be important, lines shouldn't be
compressed. For example, a function receives a variable of type u64 but
then returns it as u32. Deleting the assignment line means that the
value will be returned as the wrong type.

2. spatch deletes a comment between the assignment and the return
statement. E.g.:

-       start = iwe(info);
-       /* how to translate rssi to ?% */
-       return start;
+       return iwe(info);
 }

3. readability issues-- 
- the return value is very long with many |'s or ||'s 
- the name of the variable might be meaningful to the reader 
- there are if blocks with return statements: 
 same variable name but different values might provide meaning/order to 
  the program. 
  In these cases the variable reassignment or creation isn't actually
  necessary, but does it increase readability?

Questions: 
I'm pretty confused about how these lines work: 

-ret = 
+return 
     e; 
-return ret; 

The first line deletes the name of the variable and the equals sign, 
but what happens to the expression on the right? 
I'm confused as to why it doesn't say -ret = e to signify that the 
expression on the right is what we want 'e' to be. Then return gets 
added, and it's replacing 'ret ='. But why is 'e' even necessary if it 
wasn't deleted? And why is it on a new line with spaces in front? 
The last line '-return ret;' makes sense, we just want to delete the 
word return followed by the variable name. 

I've read some basics of coccinelle, went through the slides, and read 
some articles people wrote about it but I'm still pretty confused. :/ 

Thanks for your time, 
Elizabeth 


^ permalink raw reply	[flat|nested] 3+ messages in thread

* Re: Coccinelle Challenge 1
  2016-10-11 23:32 Elizabeth Ferdman
@ 2016-10-12  5:37 ` Julia Lawall
  0 siblings, 0 replies; 3+ messages in thread
From: Julia Lawall @ 2016-10-12  5:37 UTC (permalink / raw)
  To: Elizabeth Ferdman; +Cc: outreachy-kernel



On Tue, 11 Oct 2016, Elizabeth Ferdman wrote:

> Hi Julia,
> I worked on the first challenge and I wanted to run my findings and
> questions by you. Please confirm whether these observations are correct.
>
> I looked at the difference between local idexpression, expression, and
> identifier for ret.cocci.
>
> Local idexpression basically means what local identifier
> would mean-- just local vars--  except that's invalid.

An identifier is just a name.  It doesn't necessarily have any value.
For example, an identifier can be used in the place of a structure field
or a jump label.  An idexpression is an expression (term that has a value)
that is restricted to look like an identifier.  It cannot be used as a
field name or a jump label, only where other expressions are allowed.  As
such, a type can be specified for it, as in idexpression int x;, and as in
this case it can be specified to reference a local or global variable.

> Using expression in place of local identifier included results like
> this:
>
> - a->b = c;
> - return a->b;
> + return c;

Exactly.

> I don't know whether or not that's always the right thing to do. I think
> that if the changes are only local, it'd be fine to compress the lines.
> But if a->b happens to have a global effect than it'd be wrong. I'm not
> sure about this, but if the function receives *a as a parameter, then
> would a->b = c be a persistent change?

Yes it would.  If a->b = c doesn't make a persistent change, then
it probably doesn't make sense to write such code at all.  If it is a
purely local computation, one would use a local variable.

> Using identifier included results where 'ret' was not necessarily local,
> so compressing the lines would be bad because the global assignment
> would be deleted.

Correct as well.

> I noticed some potentially bad results with ret.cocci:
> 1. When the type of 'ret' happens to be important, lines shouldn't be
> compressed. For example, a function receives a variable of type u64 but
> then returns it as u32. Deleting the assignment line means that the
> value will be returned as the wrong type.

I don't think so.  The compiler does know the type of the function.  But
I'm not 100% sure.  Perhaps there could be a pathological case where the
expression is u64, the variable is u32, and the return type is also u64.
It could be nice for the person to use a cast in this case.  Coccinelle
can, however, detect that an assignment preserves the type:

@@
type T;
local idexpression T x;
T y;
@@

x = y // this assignment is type preserving

The only problem is that then Coccinelle has to figure out the type of x
(probably easy since it is a local variable) and y (probably hard; if it
is a call to a function defined in another file, then Coccinelle may not
know the return type).  Generally, determining type information may
require information in header files, and by default Coccinelle only
includes the ones in the same directory or with the same name as the .c
file.  Including more is possible, eg with the option --all-includes (all
directly included files) or --recursive-includes (also files included by
other included files), but these cause a significant slowdown, because a
lot more code needs to be processed.  There is an option
--include-headers-for-types that reads the header files but only collects
type information rather than trying to transform them.  This reduces the
performance impact.

> 2. spatch deletes a comment between the assignment and the return
> statement. E.g.:
>
> -       start = iwe(info);
> -       /* how to translate rssi to ?% */
> -       return start;
> +       return iwe(info);
>  }

That's true.  If you keep the word return instead of removing it and
adding it back, then the comment should stay as well.

>
> 3. readability issues--
> - the return value is very long with many |'s or ||'s
> - the name of the variable might be meaningful to the reader
> - there are if blocks with return statements:
>  same variable name but different values might provide meaning/order to
>   the program.
>   In these cases the variable reassignment or creation isn't actually
>   necessary, but does it increase readability?

Sure, it could be useful.  It's your responsability as a Coccinelle user
to look at the results and decide which are not a good idea.  Sometimes it
can be useful to adjust the semantic patch to remove the results you don't
like, but for things like meaningful names, the user would have to figure
out for themself.

> Questions:
> I'm pretty confused about how these lines work:
>
> -ret =
> +return
>      e;
> -return ret;
>
> The first line deletes the name of the variable and the equals sign,
> but what happens to the expression on the right?

Nothing.

> I'm confused as to why it doesn't say -ret = e to signify that the
> expression on the right is what we want 'e' to be. Then return gets
> added, and it's replacing 'ret ='. But why is 'e' even necessary if it
> wasn't deleted? And why is it on a new line with spaces in front?
> The last line '-return ret;' makes sense, we just want to delete the
> word return followed by the variable name.

In general, when you remove something and add it back, you change it from
how it was in the original code to how Coccinelle wants to pretty print
it.  This rule favors leaving the layout of the expression e as is.  But
there is no way to adapt this rule to leave the return as is, in order to
preserve any comment.  And the expression is now a function call, which
Coccinelle is now pretty good at pretty printing.  So it may be more
desirable to put eg:

- ret = e;
  return
-   ret
+   e
  ;

Coccinelle doesn't normally care about the spaces and newlines in the
specification.  So you can put newlines however needed for annotating
different tokens as - and +.  There is an option --smpl-spacing that will
preserve the spaces within a line of code.  This is useful if you are
working on a project that has layout conventions different from the Linux
kernel, such as putting a space after ( and before ).  But for the Linux
kernel, the default pretty printing strategy should be fine.

> I've read some basics of coccinelle, went through the slides, and read
> some articles people wrote about it but I'm still pretty confused. :/

There were a lot of good points in the above, so I don't think you are
actually so confused :)

julia

> Thanks for your time,
> Elizabeth
>


^ permalink raw reply	[flat|nested] 3+ messages in thread

end of thread, other threads:[~2016-10-12  5:37 UTC | newest]

Thread overview: 3+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-10-11 23:23 Coccinelle Challenge 1 Elizabeth Ferdman
  -- strict thread matches above, loose matches on Subject: below --
2016-10-11 23:32 Elizabeth Ferdman
2016-10-12  5:37 ` Julia Lawall

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.