* idea/question about sparse's context checking
@ 2017-08-18 13:20 Luc Van Oostenryck
From: Luc Van Oostenryck @ 2017-08-18 13:20 UTC
To: Josh Triplett; +Cc: linux-sparse
Hi Josh,
I was thinking lately about sparse's context checking.
I had an idea and I wondered if it hasn't already been
tried or discussed.
The context checking essentially works with the instruction
OP_CONTEXT, which does nothing more than add some constant
value to, or subtract it from, 'the context'. Then, at checking
time, these instructions are interpreted along all the paths and,
if there is a disagreement between values from different paths,
sparse emits a 'context imbalance' warning (there are also
checks for the expected return value of the context).
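A minimal sketch of the kind of code this checking applies to, assuming annotation macros in the style of the Linux kernel's `__acquires`/`__acquire` wrappers around sparse's `context` attribute and `__context__` statement. The toy lock and the names `my_lock`, `my_unlock` and `update` are hypothetical. The early return leaves one path at +1 and the other at 0, which is the kind of disagreement reported as a 'context imbalance'.

```c
/*
 * Annotation macros in the style of the Linux kernel's compiler.h.
 * sparse defines __CHECKER__; for a regular compiler they expand to nothing.
 */
#ifdef __CHECKER__
# define __acquires(x)  __attribute__((context(x, 0, 1)))
# define __releases(x)  __attribute__((context(x, 1, 0)))
# define __acquire(x)   __context__(x, 1)
# define __release(x)   __context__(x, -1)
#else
# define __acquires(x)
# define __releases(x)
# define __acquire(x)   (void)0
# define __release(x)   (void)0
#endif

/* Hypothetical toy lock: 1 means held, 0 means free. */
void my_lock(int *l) __acquires(l)
{
    *l = 1;
    __acquire(l);       /* context + 1 */
}

void my_unlock(int *l) __releases(l)
{
    *l = 0;
    __release(l);       /* context - 1 */
}

int update(int *l, int fail)
{
    my_lock(l);
    if (fail)
        return -1;      /* this path exits with the context still at +1 */
    my_unlock(l);
    return 0;           /* this path exits with the context back at 0 */
}
```

Compiled normally, the macros vanish and only the toy lock logic remains; the annotations matter only when the file is run through sparse.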
My idea/question is the following: this interpretation
is already done for all 'normal' values. So couldn't we
consider the context as a special kind of 'variable',
use a normal pseudo for it, use phi-nodes on them and
let the normal simplification process act on them?
If the context is (provably) well balanced, there
shouldn't be any phi-node left for this context.
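The "no phi-node left" criterion can be modelled with a toy merge function. This is only a sketch under stated assumptions: `struct ctx_val` and `ctx_phi` are hypothetical and do not correspond to sparse's actual pseudo or phi-node data structures. It shows the property being relied on: a merge of identical constants folds to that constant, while a merge of different values stays unresolved.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a context value arriving at a join point. */
struct ctx_val {
    bool is_const;  /* true when the value is a single known constant */
    int  value;     /* meaningful only when is_const is true */
};

/*
 * Toy phi-node simplification: if every incoming value is the same
 * constant, the phi folds to that constant. Otherwise the result is
 * not a constant, which models a phi-node that survives simplification
 * and therefore signals a possible context imbalance.
 */
struct ctx_val ctx_phi(const struct ctx_val *in, size_t n)
{
    struct ctx_val unknown = { false, 0 };
    size_t i;

    if (n == 0 || !in[0].is_const)
        return unknown;
    for (i = 1; i < n; i++) {
        if (!in[i].is_const || in[i].value != in[0].value)
            return unknown;
    }
    return in[0];
}
```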
I'm not sure if there would be any significant advantage,
but it seems to me that what is currently done is
somehow redundant with the 'normal' processing.
It could also maybe help to have several independent
contexts.
[of course, we can't really use add/sub for the context
increase/decrease as we want to check that the context
doesn't become negative].
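Why a plain add/sub is not enough can be sketched as follows. This is a hypothetical model, not sparse code: `struct ctx_state`, `ctx_adjust` and `ctx_ok_at_exit` are made-up names. A sequence such as -1 followed by +1 ends at the expected value, so only a sticky flag records that the context dipped below zero on the way.

```c
#include <stdbool.h>

/* Hypothetical running state of one context along one path. */
struct ctx_state {
    int  depth;          /* current context value */
    bool went_negative;  /* sticky: set once depth drops below zero */
};

/* Apply one context change and remember any dip below zero. */
void ctx_adjust(struct ctx_state *s, int delta)
{
    s->depth += delta;
    if (s->depth < 0)
        s->went_negative = true;
}

/* A path is fine only if it never went negative and ends as expected. */
bool ctx_ok_at_exit(const struct ctx_state *s, int expected)
{
    return !s->went_negative && s->depth == expected;
}
```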
Best regards,
-- Luc
* Re: idea/question about sparse's context checking
From: Josh Triplett @ 2017-08-18 14:26 UTC
To: Luc Van Oostenryck; +Cc: linux-sparse

On August 18, 2017 6:20:42 AM PDT, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
> Hi Josh,
>
> I was thinking lately about sparse's context checking.
> I had an idea and I wondered if it hasn't already been
> tried or discussed.
>
> The context checking essentially works with the instruction
> OP_CONTEXT, which does nothing more than add some constant
> value to, or subtract it from, 'the context'. Then, at checking
> time, these instructions are interpreted along all the paths and,
> if there is a disagreement between values from different paths,
> sparse emits a 'context imbalance' warning (there are also
> checks for the expected return value of the context).
>
> My idea/question is the following: this interpretation
> is already done for all 'normal' values. So couldn't we
> consider the context as a special kind of 'variable',
> use a normal pseudo for it, use phi-nodes on them and
> let the normal simplification process act on them?
> If the context is (provably) well balanced, there
> shouldn't be any phi-node left for this context.
>
> I'm not sure if there would be any significant advantage,
> but it seems to me that what is currently done is
> somehow redundant with the 'normal' processing.
> It could also maybe help to have several independent
> contexts.

I'd love to see sparse doing more general value tracking, for this
and other purposes. And yes, that would help with tracking multiple
contexts.
* Re: idea/question about sparse's context checking
From: Christopher Li @ 2017-08-18 14:37 UTC
To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 9:20 AM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
> Hi Josh,
>
> I was thinking lately about sparse's context checking.
> I had an idea and I wondered if it hasn't already been
> tried or discussed.
>
> The context checking essentially works with the instruction
> OP_CONTEXT, which does nothing more than add some constant
> value to, or subtract it from, 'the context'. Then, at checking
> time, these instructions are interpreted along all the paths and,
> if there is a disagreement between values from different paths,
> sparse emits a 'context imbalance' warning (there are also
> checks for the expected return value of the context).

That is right.

>
> My idea/question is the following: this interpretation
> is already done for all 'normal' values. So couldn't we
> consider the context as a special kind of 'variable',
> use a normal pseudo for it, use phi-nodes on them and

Question. What syntax are you considering for declaring
this special pseudo? Does it automatically attach to the
variable that has the context attribute?

> let the normal simplification process act on them?
> If the context is (provably) well balanced, there
> shouldn't be any phi-node left for this context.

Assume you attach the context pseudo to the lock
variable. Then you will have to handle pointer aliasing properly.
People might have different pointers pointing to that variable.

> I'm not sure if there would be any significant advantage,
> but it seems to me that what is currently done is
> somehow redundant with the 'normal' processing.
> It could also maybe help to have several independent
> contexts.

By 'normal' processing, do you mean the context statement?

>
> [of course, we can't really use add/sub for the context
> increase/decrease as we want to check that the context
> doesn't become negative].

It can become negative for some helper functions that
wrap the unlock function.

I think the biggest problem with context right now is actually
that sparse is not able to inline, or take into account, the
functions that change the context. We can add more context
attributes to the functions to describe what each function might
do to the context. The best way is to make this happen
automatically. I see that as the biggest cause of false alarms
in context warnings.

Chris
* Re: idea/question about sparse's context checking
From: Luc Van Oostenryck @ 2017-08-18 16:15 UTC
To: Christopher Li; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 4:37 PM, Christopher Li <sparse@chrisli.org> wrote:
>> My idea/question is the following: this interpretation
>> is already done for all 'normal' values. So couldn't we
>> consider the context as a special kind of 'variable',
>> use a normal pseudo for it, use phi-nodes on them and
>
> Question. What syntax are you considering for declaring
> this special pseudo? Does it automatically attach to the
> variable that has the context attribute?

First, it's only a vague idea. It has been soaking for a little
while, but it's still only a vague idea.
You don't need a special syntax at this point. For example,
the return value is currently stored in a special variable/symbol:
'return' in a special namespace/scope.

>> let the normal simplification process act on them?
>> If the context is (provably) well balanced, there
>> shouldn't be any phi-node left for this context.
>
> Assume you attach the context pseudo to the lock
> variable. Then you will have to handle pointer aliasing properly.
> People might have different pointers pointing to that variable.

No no, it's not the point here.
What I meant in the idea is that currently OP_CONTEXT
and its associated attribute work with a unique implicit
'object', called 'context' but not explicitly associated to
a symbol/variable and thus they need their own specific
machinery.

>> I'm not sure if there would be any significant advantage,
>> but it seems to me that what is currently done is
>> somehow redundant with the 'normal' processing.
>> It could also maybe help to have several independent
>> contexts.
>
> By 'normal' processing, do you mean the context statement?

No, the processing of variables and pseudos.

>> [of course, we can't really use add/sub for the context
>> increase/decrease as we want to check that the context
>> doesn't become negative].
>
> It can become negative for some helper functions that
> wrap the unlock function.

This needs to be checked (and currently there is a small gap
where it can go negative and positive again without any
warning about it).

> I think the biggest problem with context right now is actually
> that sparse is not able to inline, or take into account, the
> functions that change the context. We can add more context
> attributes to the functions to describe what each function might
> do to the context. The best way is to make this happen
> automatically. I see that as the biggest cause of false alarms
> in context warnings.

There are several problems here:
1) the false alarms. These are unavoidable, it's equivalent to the
   halting problem. We can only be correct and precise in restricted
   situations (no loops, for example). It's not what the idea here
   is about.
2) what you describe here about the inlines.
   The underlying problem is to match the 'names/expressions'
   given in the declarations with the ones used in(side) the
   definition. I have some drafts but I'm not totally sure about
   what exactly can be done.
3) because of 2) we ignore all the names and we simply care about
   the sum of all contexts.
   So, in truth, we can only deal with a single context.
4) the duplicated machinery this idea/question is about.

-- Luc
* Re: idea/question about sparse's context checking
From: Christopher Li @ 2017-08-18 18:18 UTC
To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 12:15 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
>>> let the normal simplification process act on them?
>>> If the context is (provably) well balanced, there
>>> shouldn't be any phi-node left for this context.
>>
>> Assume you attach the context pseudo to the lock
>> variable. Then you will have to handle pointer aliasing properly.
>> People might have different pointers pointing to that variable.
>
> No no, it's not the point here.
> What I meant in the idea is that currently OP_CONTEXT
> and its associated attribute work with a unique implicit
> 'object', called 'context' but not explicitly associated to
> a symbol/variable and thus they need their own specific
> machinery.

Ah, I did not get your idea on the first read. So you mean:
do not use bb->context to store the context. There will
still be one context per BB. Store the context in a pseudo
and somehow link that pseudo into the BB.

I was wondering whether you were going to create more than one
context per BB and attach the contexts to the locking objects.
That is not what you are trying to do.

If my understanding is right, you just want to move bb->context
into pseudos. In principle that is fine. It also depends on
a lot of implementation details. Do you still need to
look up the context per basic block?

>> It can become negative for some helper functions that
>> wrap the unlock function.
>
> This needs to be checked (and currently there is a small gap
> where it can go negative and positive again without any
> warning about it).

Not sure that is worthwhile to warn about. It can happen when a
function assumes the lock is taken on entry. Then it unlocks it,
does something, and relocks it. There are totally legitimate
reasons to do that.

Without cross-function checking, there is no good way to know
that the function is (or should be) always called with the lock held.

> There are several problems here:
> 1) the false alarms. These are unavoidable, it's equivalent to the
>    halting problem. We can only be correct and precise in restricted
>    situations (no loops, for example). It's not what the idea here
>    is about.

The really hard ones are not avoidable. The ones I am talking about,
which call some helper function to acquire the lock, are avoidable.
Last time I looked at it, they made up a large part of the false
alarms.

> 2) what you describe here about the inlines.
>    The underlying problem is to match the 'names/expressions'
>    given in the declarations with the ones used in(side) the
>    definition. I have some drafts but I'm not totally sure about
>    what exactly can be done.
> 3) because of 2) we ignore all the names and we simply care about
>    the sum of all contexts.
>    So, in truth, we can only deal with a single context.

There are two issues. I think having multiple contexts, or tying the
context to a variable, is nice to have but not that important.
Even if we implemented that, it would most likely yield more
warnings because the name/expression cannot be found properly.
It is very unlikely that the same function has two real context
errors that cancel each other out. So having one context for now
is more or less fine.

Not being able to do cross-function checks on context is the big
issue in context checking. If we could do cross-function
checking, a lot of warnings could be eliminated.

You can also take a look at the meta-level compilation papers.
The company Coverity started out doing meta-level compilation
checks. Coverity reported a lot of bugs in the Linux kernel back
in the day.

Chris
* Re: idea/question about sparse's context checking
From: Luc Van Oostenryck @ 2017-08-18 19:01 UTC
To: Christopher Li; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 8:18 PM, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Aug 18, 2017 at 12:15 PM, Luc Van Oostenryck wrote:
>>
>> No no, it's not the point here.
>> What I meant in the idea is that currently OP_CONTEXT
>> and its associated attribute work with a unique implicit
>> 'object', called 'context' but not explicitly associated to
>> a symbol/variable and thus they need their own specific
>> machinery.
>
> Ah, I did not get your idea on the first read. So you mean:
> do not use bb->context to store the context.

Exactly.

> There will
> still be one context per BB. Store the context in a pseudo
> and somehow link that pseudo into the BB.

No, it will just be a pseudo, like any other pseudo, so we
could have as many as we want. We'll just need to deal
with them during the simplification phase.

> I was wondering whether you were going to create more than one
> context per BB and attach the contexts to the locking objects.
> That is not what you are trying to do.

Sorta. The contexts will be in (special) pseudos.

> If my understanding is right, you just want to move bb->context
> into pseudos. In principle that is fine. It also depends on
> a lot of implementation details. Do you still need to
> look up the context per basic block?

The main idea is to avoid that.

>>> It can become negative for some helper functions that
>>> wrap the unlock function.
>>
>> This needs to be checked (and currently there is a small gap
>> where it can go negative and positive again without any
>> warning about it).
>
> Not sure that is worthwhile to warn about. It can happen when a
> function assumes the lock is taken on entry. Then it unlocks it,
> does something, and relocks it. There are totally legitimate
> reasons to do that.

No, it's an error to (try to) unlock a lock that is not taken.

> Without cross-function checking, there is no good way to know
> that the function is (or should be) always called with the lock held.
>
>> There are several problems here:
>> 1) the false alarms. These are unavoidable, it's equivalent to the
>>    halting problem. We can only be correct and precise in restricted
>>    situations (no loops, for example). It's not what the idea here
>>    is about.
>
> The really hard ones are not avoidable. The ones I am talking about,
> which call some helper function to acquire the lock, are avoidable.
> Last time I looked at it, they made up a large part of the false
> alarms.

Yes, most probably.

>> 2) what you describe here about the inlines.
>>    The underlying problem is to match the 'names/expressions'
>>    given in the declarations with the ones used in(side) the
>>    definition. I have some drafts but I'm not totally sure about
>>    what exactly can be done.
>> 3) because of 2) we ignore all the names and we simply care about
>>    the sum of all contexts.
>>    So, in truth, we can only deal with a single context.
>
> There are two issues. I think having multiple contexts, or tying the
> context to a variable, is nice to have but not that important.
> Even if we implemented that, it would most likely yield more
> warnings because the name/expression cannot be found properly.
> It is very unlikely that the same function has two real context
> errors that cancel each other out. So having one context for now
> is more or less fine.

Well, it can be more frequent than we might think.
And anyway, it's the goal of a checker to look for errors,
and the rarer they are, the more valuable the checker (which
finds them!) is.

> Not being able to do cross-function checks on context is the big
> issue in context checking. If we could do cross-function
> checking, a lot of warnings could be eliminated.

The problem is that the context attribute accepts arbitrary
*expressions* and not a simple name/argument. But it's
normal that it accepts expressions, as otherwise a lot of
the functions would need a specific argument for it
(and there is an idea behind that because, for the
functions where the expression is simply one of the
arguments, we can do the propagation almost for free
in the inliner).

-- Luc
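The disagreement over whether two context errors can cancel out when only the sum of all contexts is tracked can be made concrete with a small model. This is a sketch under stated assumptions: the two lock identifiers, `struct ctx_op`, `summed_context` and `per_lock_balanced` are all hypothetical and only illustrate the bookkeeping, not how sparse stores contexts.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical lock identifiers for the model. */
enum { LOCK_A, LOCK_B, NR_LOCKS };

/* One context change along a path: which lock, and by how much. */
struct ctx_op {
    int lock;
    int delta;
};

/* Single-context view: names are ignored, only the sum matters. */
int summed_context(const struct ctx_op *ops, size_t n)
{
    int sum = 0;
    size_t i;

    for (i = 0; i < n; i++)
        sum += ops[i].delta;
    return sum;
}

/* Independent-contexts view: each lock must balance on its own. */
bool per_lock_balanced(const struct ctx_op *ops, size_t n)
{
    int ctx[NR_LOCKS] = { 0 };
    size_t i;
    int k;

    for (i = 0; i < n; i++)
        ctx[ops[i].lock] += ops[i].delta;
    for (k = 0; k < NR_LOCKS; k++) {
        if (ctx[k] != 0)
            return false;
    }
    return true;
}
```

A path that acquires `LOCK_A` and releases `LOCK_B` sums to zero, so the single-context view sees nothing wrong, while the per-lock view reports both locks as unbalanced.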
* Re: idea/question about sparse's context checking
From: Christopher Li @ 2017-08-18 19:59 UTC
To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 3:01 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
> No, it will just be a pseudo, like any other pseudo, so we
> could have as many as we want. We'll just need to deal
> with them during the simplification phase.

I see. You use a special pseudo to store the context. Every context
change will correspond to an increase/decrease of that pseudo.
That is a good idea.

>> Not sure that is worthwhile to warn about. It can happen when a
>> function assumes the lock is taken on entry. Then it unlocks it,
>> does something, and relocks it. There are totally legitimate
>> reasons to do that.
>
> No, it's an error to (try to) unlock a lock that is not taken.

Right. But the function itself can't see whether the caller always
calls it with the lock already held. If the caller calls it without
holding the lock, that will be a bug.

>> There are two issues. I think having multiple contexts, or tying the
>> context to a variable, is nice to have but not that important.
>> Even if we implemented that, it would most likely yield more
>> warnings because the name/expression cannot be found properly.
>> It is very unlikely that the same function has two real context
>> errors that cancel each other out. So having one context for now
>> is more or less fine.
>
> Well, it can be more frequent than we might think.
> And anyway, it's the goal of a checker to look for errors,
> and the rarer they are, the more valuable the checker (which
> finds them!) is.

That is true. We could build some prototypes and see how many
new warnings they introduce. I am curious how many of the reports
are real bugs vs. false positives.

> The problem is that the context attribute accepts arbitrary
> *expressions* and not a simple name/argument. But it's
> normal that it accepts expressions, as otherwise a lot of
> the functions would need a specific argument for it
> (and there is an idea behind that because, for the
> functions where the expression is simply one of the
> arguments, we can do the propagation almost for free
> in the inliner).

When you inline it, it is not cross-function any more.
That is one way to solve it.

Chris
* Re: idea/question about sparse's context checking
From: Linus Torvalds @ 2017-08-18 19:34 UTC
To: Luc Van Oostenryck; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 6:20 AM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
>
> My idea/question is the following: this interpretation
> is already done for all 'normal' values. So couldn't we
> consider the context as a special kind of 'variable',

I think that's a great idea. We do all the flow stuff for variables
anyway, and using a hidden variable would make it potentially a lot
more flexible. You could make the context op do much more than just a
fixed inc/dec.

Linus
* Re: idea/question about sparse's context checking
From: Luc Van Oostenryck @ 2017-08-18 20:59 UTC
To: Linus Torvalds; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 9:34 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Fri, Aug 18, 2017 at 6:20 AM, Luc Van Oostenryck wrote:
>
> I think that's a great idea.

I confess that it's totally copied from how it's done for the
return value.

> We do all the flow stuff for variables
> anyway, and using a hidden variable would make it potentially a lot
> more flexible. You could make the context op do much more than just a
> fixed inc/dec.

Yes, I see potential too, but nothing very specific.
Have you something in mind?

-- Luc
* Re: idea/question about sparse's context checking
From: Linus Torvalds @ 2017-08-18 21:10 UTC
To: Luc Van Oostenryck; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 1:59 PM, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
>
>> We do all the flow stuff for variables
>> anyway, and using a hidden variable would make it potentially a lot
>> more flexible. You could make the context op do much more than just a
>> fixed inc/dec.
>
> Yes, I see potential too, but nothing very specific. Have you something in mind?

No, to a first approximation I'd just continue to add and subtract
constant values.

But it might allow us to do conditional contexts, which the kernel
actually needs. Right now the kernel does tricks like this:

    # define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)

(see include/linux/compiler.h) exactly because we want to not really
add a constant 1 to the context, but add it only if the condition "c"
was non-zero.

We then depend on sparse just doing the flow simplification etc. But
it *could* have been done by just instead allowing the context to be
updated with a boolean variable..

But sparse might prefer that flow-based approach anyway - I'm just
saying that sometimes a more flexible model could be a good thing at
least in theory.

Linus
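A sketch of how the `__cond_lock` trick quoted above is typically wrapped around a trylock-style primitive, assuming kernel-style annotation macros that vanish for a regular compiler. The toy lock and the names `raw_trylock`, `my_trylock`, `my_unlock` and `try_update` are hypothetical. The context is raised only on the branch where the condition is true, so it is sparse's flow simplification that keeps both exits of `try_update` balanced.

```c
/*
 * Kernel-style annotation macros. sparse defines __CHECKER__;
 * a regular compiler only sees the plain condition.
 */
#ifdef __CHECKER__
# define __releases(x)      __attribute__((context(x, 1, 0)))
# define __acquire(x)       __context__(x, 1)
# define __release(x)       __context__(x, -1)
# define __cond_lock(x, c)  ((c) ? ({ __acquire(x); 1; }) : 0)
#else
# define __releases(x)
# define __acquire(x)       (void)0
# define __release(x)       (void)0
# define __cond_lock(x, c)  (c)
#endif

/* Hypothetical toy lock: returns 1 and takes the lock if it was free. */
int raw_trylock(int *l)
{
    if (*l)
        return 0;
    *l = 1;
    return 1;
}

void my_unlock(int *l) __releases(l)
{
    *l = 0;
    __release(l);           /* context - 1 */
}

/* The context goes up by one only when the trylock succeeded. */
#define my_trylock(l)  __cond_lock(l, raw_trylock(l))

int try_update(int *l, int *counter)
{
    if (!my_trylock(l))
        return 0;           /* lock not taken: context unchanged */
    (*counter)++;
    my_unlock(l);           /* lock taken: context back to 0 */
    return 1;
}
```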
* Re: idea/question about sparse's context checking
From: Luc Van Oostenryck @ 2017-08-18 21:56 UTC
To: Linus Torvalds; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 11:10 PM, Linus Torvalds <torvalds@linux-foundation.org> wrote:
> On Fri, Aug 18, 2017 at 1:59 PM, Luc Van Oostenryck wrote:
>>
>> Yes, I see potential too, but nothing very specific. Have you something in mind?
>
> No, to a first approximation I'd just continue to add and subtract
> constant values.
>
> But it might allow us to do conditional contexts, which the kernel
> actually needs. Right now the kernel does tricks like this:
>
>     # define __cond_lock(x,c) ((c) ? ({ __acquire(x); 1; }) : 0)
>
> (see include/linux/compiler.h) exactly because we want to not really
> add a constant 1 to the context, but add it only if the condition "c"
> was non-zero.

Yes, it could be like
    __context_{set,clr}(x, <boolean>)
or even
    __context_{or, xor, and, clr}(x, <somebitmask>)

> We then depend on sparse just doing the flow simplification etc. But
> it *could* have been done by just instead allowing the context to be
> updated with a boolean variable..
>
> But sparse might prefer that flow-based approach anyway - I'm just
> saying that sometimes a more flexible model could be a good thing at
> least in theory.

Absolutely.

-- Luc