* [Cocci] Filter out field names
@ 2016-01-21 9:47 Daniel Wagner
2016-01-21 11:02 ` Julia Lawall
0 siblings, 1 reply; 6+ messages in thread
From: Daniel Wagner @ 2016-01-21 9:47 UTC (permalink / raw)
To: cocci
Hi,
I am trying to transform Linux kernel code in order to test for data
races. Let me illustrate this by an example
struct lpctx {
spinlock_t lock;
int cnt;
};
static struct lpctx *lpctx;
static void lpctx_good1(struct lpctx *lx)
{
spin_lock(&lx->lock);
lx->cnt++;
spin_unlock(&lx->lock);
}
static void lpctx_warning1(struct lpctx *lx)
{
lx->cnt++;
}
As you can see lptctx_warning1() is increasing the counter without
holding the lock. The main idea(*) is to annotate all accesses to lx by
adding a WARN_ON(!lock_is_held(&lx->lock.dep_map)).
Thanks to Julia's excellent help on IRC I have currently this somewhat
working cocci script:
@depends on patch@
struct lpctx *x;
identifier f;
statement S;
@@
( S
+ WARN_ON(!lock_is_held(&x->lock.dep_map));
&
x->f
)
This results in:
@@ -16,10 +16,13 @@ static void lpctx_good1(struct lpctx *lx
DBG("");
spin_lock(&lx->lock);
+ WARN_ON(!lock_is_held(&lx->lock.dep_map));
lx->cnt++;
+ WARN_ON(!lock_is_held(&lx->lock.dep_map));
spin_unlock(&lx->lock);
+ WARN_ON(!lock_is_held(&lx->lock.dep_map));
}
static void lpctx_warning1(struct lpctx *lx)
@@ -27,6 +30,7 @@ static void lpctx_warning1(struct lpctx
DBG("");
lx->cnt++;
+ WARN_ON(!lock_is_held(&lx->lock.dep_map));
}
There are a bunch of problems, e,g. the hard coded type/names or the
WARN_ON() should be put in front of the deferences but that is something
for later to improve. The main problem I face at this point is to filter
out the spin_lock() and spin_unlock() access. All my attempts didn't let
to the expected result. I think the best thing would be to match on the
type. So if 'f' is of type spinlock_t ignore it. Any ideas how this
could be expressed?
Thanks,
Daniel
(*) Talking with Nicholas over a coffee helps a lot. Thanks a lot!
^ permalink raw reply [flat|nested] 6+ messages in thread* [Cocci] Filter out field names 2016-01-21 9:47 [Cocci] Filter out field names Daniel Wagner @ 2016-01-21 11:02 ` Julia Lawall 2016-01-22 14:32 ` Daniel Wagner ` (2 more replies) 0 siblings, 3 replies; 6+ messages in thread From: Julia Lawall @ 2016-01-21 11:02 UTC (permalink / raw) To: cocci > There are a bunch of problems, e,g. the hard coded type/names or the > WARN_ON() should be put in front of the dereferences but that is something > for later to improve. The main problem I face at this point is to filter > out the spin_lock() and spin_unlock() access. All my attempts didn't let > to the expected result. I think the best thing would be to match on the > type. So if 'f' is of type spinlock_t ignore it. Any ideas how this > could be expressed? Here is my suggestion (see below for explanations): @r@ identifier i, virtual.ty; @@ struct ty *i; @s@ identifier i,f,virtual.ty; @@ f (...,struct ty *i,...) { ... } @locks@ typedef spinlock_t; spinlock_t *e; statement S; position p; @@ e at S@p @safe@ position p1; identifier virtual.ty; struct ty *x; identifier f; @@ \(spin_lock\|spin_lock_irqsave\)(&x->lock,...) ... when != &x->lock x ->@p1 f @depends on r || s@ identifier virtual.ty; struct ty *x; identifier f; statement S; position p != locks.p; position p1 != safe.p1; @@ ( ++ WARN_ON(!lock_is_held(&x->lock.dep_map)); S at p & x ->@p1 f ) -------------------------------------------------------------------------- Same thing with explanations: @r@ identifier i, virtual.ty; @@ struct ty *i; @s@ identifier i,f,virtual.ty; @@ f (...,struct ty *i,...) { ... } The above two rules look for files that contain declarations of your type of interest. If you are sure that the only files you care about modifying contain this code, then including these rules will reduce running time very substantially, because Coccinelle will only work on files that contain these declarations. Otherwise, it will work on all files, which will be very slow, because your pattern is very generic. Perhaps something to do in a second step. @locks@ typedef spinlock_t; spinlock_t *e; statement S; position p; @@ e at S@p This finds all expressions that have type spinlock_t *, and stores their position in p. We will use this later to avoid transforming spin_lock calls. It would be nice if this could have been integrated into the later rule, but that triggers some parsing problem. @safe@ position p1; identifier virtual.ty; struct ty *x; identifier f; @@ \(spin_lock\|spin_lock_irqsave\)(&x->lock,...) ... when != &x->lock x ->@p1 f This is an optimization. There is no point to add the WARN_ON if there was a call to spin_lock on your lock just before. You may want to add more locking functions here. @depends on r || s@ identifier virtual.ty; struct ty *x; identifier f; statement S; position p != locks.p; position p1 != safe.p1; @@ ( ++ WARN_ON(!lock_is_held(&x->lock.dep_map)); S at p & x ->@p1 f ) This adds the WARN_ON before the code. Using positions that should be different than previously collected positions, this does not add WARN_ON on statements that contain a lock somewhere, and does not add WARN_ON on references where there was a lock call recently. This should be run with at least the argument -D ty=typeofinterest. Because this is looking for expressions of type spinlock_t *, it may be useful to cause Coccinelle to take into account header files. The following options may be useful: --recursive-includes --include-headers-for-types Recursive includes causes not only files explicitly included, but also file included by other includes to be taken into account. If you think that only explicitly included files would be good enough, then --all-includes would be fine. Include headers for types means that Coccinelle will look at the header files for parsing and type information, but will not take them into account during the transformation process. This makes things more efficient, if you don't think that transformations are needed in the header files anyway. The transformation being considered here would be transformation of the header file in the context of the information available in the .c file. If you think that it would be useful to transform the header file on its own, then you can use the argument --include-headers. An issue that remains is that you assume that the lock field has type lock. A quick grep in the Linux kernel shows that this is not always the case. You can make a virtual identifier for the lock field as well. A more complicated approach would be to have Coccinelle figure out the type from the structure type definition. A final issue is that my quick grep in the Linux kernel also shows that not all locks are in structures. But then it is less obvious how to connect the locks to the protected references. And thinking on, it is not always the case that because a structure has a lock, that every field has to be accessed under a lock. So in the end, it may be necessary to do something even more clever. For example, if a field is somewhere referenced under a lock, and not referenced under a lock in some other place, then perhaps that is more likely to be a real problem. But even that is not 100% certain, because there can be eg an initialization phase that is not done concurrently, so no locks are actually needed at that point, even though locks are needed elsewhere. julia ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Cocci] Filter out field names 2016-01-21 11:02 ` Julia Lawall @ 2016-01-22 14:32 ` Daniel Wagner 2016-01-22 16:09 ` SF Markus Elfring 2016-02-26 10:40 ` Daniel Wagner 2 siblings, 0 replies; 6+ messages in thread From: Daniel Wagner @ 2016-01-22 14:32 UTC (permalink / raw) To: cocci Hi Julia, On 21.01.16 12:02, Julia Lawall wrote: >> There are a bunch of problems, e,g. the hard coded type/names or the >> WARN_ON() should be put in front of the dereferences but that is something >> for later to improve. The main problem I face at this point is to filter >> out the spin_lock() and spin_unlock() access. All my attempts didn't let >> to the expected result. I think the best thing would be to match on the >> type. So if 'f' is of type spinlock_t ignore it. Any ideas how this >> could be expressed? > > Here is my suggestion (see below for explanations): Thanks a lot for your help! It's a really appreciated. Unfortunately, I haven't found time to play with it because of some fights with paperwork. Hopefully I have time on Monday to continue with this work. Have a nice weekend, Daniel ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Cocci] Filter out field names 2016-01-21 11:02 ` Julia Lawall 2016-01-22 14:32 ` Daniel Wagner @ 2016-01-22 16:09 ` SF Markus Elfring 2016-02-26 10:40 ` Daniel Wagner 2 siblings, 0 replies; 6+ messages in thread From: SF Markus Elfring @ 2016-01-22 16:09 UTC (permalink / raw) To: cocci >@safe@ >position p1; >identifier virtual.ty; >struct ty *x; >identifier f; > @@ > > \(spin_lock\|spin_lock_irqsave\)(&x->lock,...) Can this SmPL specifcation be extended a bit easier if the suggested disjunction would be written like the following instead? (spin_lock |spin_lock_irqsave )(&x->lock,...) How do you think about the reuse of a SmPL constraint for passing a longer function name list eventually? > Same thing with explanations: > > @r@ > identifier i, virtual.ty; > @@ > > struct ty *i; > > @s@ > identifier i,f,virtual.ty; > @@ > > f (...,struct ty *i,...) { ... } > If you are sure that the only files you care about modifying > contain this code, then including these rules will reduce running time > very substantially, I am also interested in similar fine-tuning for scripts of the semantic patch language. > because Coccinelle will only work on files that contain these declarations. Should filtering usually be performed for each source file in the desired analysis? > Otherwise, it will work on all files, which will bevery slow, > because your pattern is very generic. Do you consider any other file preprocessing here which will influence the mentioned execution speed? > It would be nice if this could have been integrated into the > later rule, but that triggers some parsing problem. Would you like to explain this information a bit more? > And thinking on, it is not always the case that because a structure > has a lock, that every field has to be accessed under a lock. > So in the end, it may be necessary to do something even more clever. > For example, if a field is somewhere referenced under a lock, and not > referenced under a lock in some other place, then perhaps that is more > likely to be a real problem. But even that is not 100% certain, > because there can be eg an initialization phase that is not done concurrently, > so no locks are actually needed at that point, even though locks are > needed elsewhere. Can it be that all the special cases and further challenges you mention here belong to the technology around data flow analysis? Regards, Markus ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Cocci] Filter out field names 2016-01-21 11:02 ` Julia Lawall 2016-01-22 14:32 ` Daniel Wagner 2016-01-22 16:09 ` SF Markus Elfring @ 2016-02-26 10:40 ` Daniel Wagner 2016-02-26 12:40 ` SF Markus Elfring 2 siblings, 1 reply; 6+ messages in thread From: Daniel Wagner @ 2016-02-26 10:40 UTC (permalink / raw) To: cocci Hi Julia, Sorry for the long delay. Got distracted again. On 01/21/2016 12:02 PM, Julia Lawall wrote: > An issue that remains is that you assume that the lock field has type > lock. A quick grep in the Linux kernel shows that this is not always the > case. You can make a virtual identifier for the lock field as well. A > more complicated approach would be to have Coccinelle figure out the type > from the structure type definition. > > A final issue is that my quick grep in the Linux kernel also shows that not > all locks are in structures. But then it is less obvious how to connect > the locks to the protected references. > > And thinking on, it is not always the case that because a structure has a > lock, that every field has to be accessed under a lock. So in the end, it > may be necessary to do something even more clever. For example, > if a field is somewhere referenced under a lock, and not referenced under a > lock in some other place, then perhaps that is more likely to be a real > problem. But even that is not 100% certain, because there can be eg an > initialization phase that is not done concurrently, so no locks are > actually needed at that point, even though locks are needed elsewhere. Your analysis is absolutely correct. Automatically annotating is not really feasibly. At least without a lot of additional hand work. Especially the more interesting data structure do no have a simple locking pattern. Instead all access is heavily optimized which makes writing generic cocci rules really difficult (impossible?). I think my idea isn't that simple as I hoped. Well, at least I have learned something. thanks a lot for your help, daniel ^ permalink raw reply [flat|nested] 6+ messages in thread
* [Cocci] Filter out field names 2016-02-26 10:40 ` Daniel Wagner @ 2016-02-26 12:40 ` SF Markus Elfring 0 siblings, 0 replies; 6+ messages in thread From: SF Markus Elfring @ 2016-02-26 12:40 UTC (permalink / raw) To: cocci > I think my idea isn't that simple as I hoped. Well, at least I have > learned something. How do you think about to extend your software development experiences for technology around data flow analysis? Regards, Markus ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-02-26 12:40 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-01-21 9:47 [Cocci] Filter out field names Daniel Wagner 2016-01-21 11:02 ` Julia Lawall 2016-01-22 14:32 ` Daniel Wagner 2016-01-22 16:09 ` SF Markus Elfring 2016-02-26 10:40 ` Daniel Wagner 2016-02-26 12:40 ` SF Markus Elfring
This is an external index of several public inboxes, see mirroring instructions on how to clone and mirror all data and code used by this external index.