idea/question about sparse's context checking

linux-sparse.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

* idea/question about sparse's context checking
@ 2017-08-18 13:20 Luc Van Oostenryck
  2017-08-18 14:26 ` Josh Triplett
                   ` (2 more replies)
  0 siblings, 3 replies; 11+ messages in thread
From: Luc Van Oostenryck @ 2017-08-18 13:20 UTC (permalink / raw)
  To: Josh Triplett; +Cc: linux-sparse

Hi Josh,

I was thinking lately about sparse's context checking.
I had an idea and I wondered if it hasn't already been
tried or discussed.

The context checking essentially works with the instruction
OP_CONTEXT, that do nothing more than adding or subtracting
some constant value to 'the context'. Then, at checking time,
these instructions are interpreted along all the paths and if
there is a disagreement between values from different paths,
it emits a 'context imbalance' warning (there is also the
checks for the expected return value of the context).

My idea/question is the following: this interpretation
is already done for all 'normal' values. So couldn't we
consider the context as a special kind of 'variable',
use a normal pseudo for it, use phi-nodes on them and
let the normal simplification process act on them?
If the context is (proveably) well balanced, there
shouldn't be any phi-node left for this context.

I'm not sure if there would be any significant advantage,
but it seems to me that what is currently done is
somehow redundant with the 'normal' processing.
It could also maybe help to have several independent
contexts.

[of course, we can't really use add/sub for the context
increase/decrease as we want to check if the context
don't become negative].

Best regards,
-- Luc

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 13:20 idea/question about sparse's context checking Luc Van Oostenryck
@ 2017-08-18 14:26 ` Josh Triplett
  2017-08-18 14:37 ` Christopher Li
  2017-08-18 19:34 ` Linus Torvalds
  2 siblings, 0 replies; 11+ messages in thread
From: Josh Triplett @ 2017-08-18 14:26 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: linux-sparse

On August 18, 2017 6:20:42 AM PDT, Luc Van Oostenryck <luc.vanoostenryck@gmail.com> wrote:
>Hi Josh,
>
>I was thinking lately about sparse's context checking.
>I had an idea and I wondered if it hasn't already been
>tried or discussed.
>
>The context checking essentially works with the instruction
>OP_CONTEXT, that do nothing more than adding or subtracting
>some constant value to 'the context'. Then, at checking time,
>these instructions are interpreted along all the paths and if
>there is a disagreement between values from different paths,
>it emits a 'context imbalance' warning (there is also the
>checks for the expected return value of the context).
>
>My idea/question is the following: this interpretation
>is already done for all 'normal' values. So couldn't we
>consider the context as a special kind of 'variable',
>use a normal pseudo for it, use phi-nodes on them and
>let the normal simplification process act on them?
>If the context is (proveably) well balanced, there
>shouldn't be any phi-node left for this context.
>
>I'm not sure if there would be any significant advantage,
>but it seems to me that what is currently done is
>somehow redundant with the 'normal' processing.
>It could also maybe help to have several independent
>contexts.

I'd love to see sparse doing more general value tracking, for this and other purposes. And yes, that would help with tracking multiple contexts.


^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 13:20 idea/question about sparse's context checking Luc Van Oostenryck
  2017-08-18 14:26 ` Josh Triplett
@ 2017-08-18 14:37 ` Christopher Li
  2017-08-18 16:15   ` Luc Van Oostenryck
  2017-08-18 19:34 ` Linus Torvalds
  2 siblings, 1 reply; 11+ messages in thread
From: Christopher Li @ 2017-08-18 14:37 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 9:20 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
> Hi Josh,
>
> I was thinking lately about sparse's context checking.
> I had an idea and I wondered if it hasn't already been
> tried or discussed.
>
> The context checking essentially works with the instruction
> OP_CONTEXT, that do nothing more than adding or subtracting
> some constant value to 'the context'. Then, at checking time,
> these instructions are interpreted along all the paths and if
> there is a disagreement between values from different paths,
> it emits a 'context imbalance' warning (there is also the
> checks for the expected return value of the context).

That is right.

>
> My idea/question is the following: this interpretation
> is already done for all 'normal' values. So couldn't we
> consider the context as a special kind of 'variable',
> use a normal pseudo for it, use phi-nodes on them and

Question. What syntax are you considering for declare
this special pseudo? Does it automatically attach to the
variable that has the context attribute?

> let the normal simplification process act on them?
> If the context is (proveably) well balanced, there
> shouldn't be any phi-node left for this context.

Assume you do the context pseudo attach to the lock
variable. Then you will have to do the pointer alias properly.
People might have different pointer point to that variable.

> I'm not sure if there would be any significant advantage,
> but it seems to me that what is currently done is
> somehow redundant with the 'normal' processing.
> It could also maybe help to have several independent
> contexts.

'normal' process do you mean the context statement?

>
> [of course, we can't really use add/sub for the context
> increase/decrease as we want to check if the context
> don't become negative].

It can become negative for some helper function that
wrap around the unclock function.

I think the biggest problem with context right now is actually
not able to inline or take into account for the function that
change the context. We can add more context attribute to
the function to describe what this function might do to the
context. The best way is automatic to make this happen. I see
that as the biggest cause of the false alarm on context warnings.

Chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 14:37 ` Christopher Li
@ 2017-08-18 16:15   ` Luc Van Oostenryck
  2017-08-18 18:18     ` Christopher Li
  0 siblings, 1 reply; 11+ messages in thread
From: Luc Van Oostenryck @ 2017-08-18 16:15 UTC (permalink / raw)
  To: Christopher Li; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 4:37 PM, Christopher Li <sparse@chrisli.org> wrote:
>> My idea/question is the following: this interpretation
>> is already done for all 'normal' values. So couldn't we
>> consider the context as a special kind of 'variable',
>> use a normal pseudo for it, use phi-nodes on them and
>
> Question. What syntax are you considering for declare
> this special pseudo? Does it automatically attach to the
> variable that has the context attribute?

First, it's only a vague idea. Soaking since a little while
but still only a vague idea.

You don't need at this point a special syntax.
For example, the return value is currently stored in a special
variable/symbol: 'return' in a special namespace/scope.

>> let the normal simplification process act on them?
>> If the context is (proveably) well balanced, there
>> shouldn't be any phi-node left for this context.
>
> Assume you do the context pseudo attach to the lock
> variable. Then you will have to do the pointer alias properly.
> People might have different pointer point to that variable.

No no, it's not the point here.
What I meant in the idea is that currently OP_CONTEXT
and its associated attribute work with a unique implicit
'object', called 'context' but not explicitly associated to
a symbol/variable and thus they need their own specific
machinery.

>> I'm not sure if there would be any significant advantage,
>> but it seems to me that what is currently done is
>> somehow redundant with the 'normal' processing.
>> It could also maybe help to have several independent
>> contexts.
>
> 'normal' process do you mean the context statement?

No, the processing of variables and pseudos.

>> [of course, we can't really use add/sub for the context
>> increase/decrease as we want to check if the context
>> don't become negative].
>
> It can become negative for some helper function that
> wrap around the unclock function.

This need to be checked (and currently there is a small gap
where it can go negative and positive again and not warned
about it.

> I think the biggest problem with context right now is actually
> not able to inline or take into account for the function that
> change the context. We can add more context attribute to
> the function to describe what this function might do to the
> context. The best way is automatic to make this happen. I see
> that as the biggest cause of the false alarm on context warnings.

There are several problems here:
1) the false alarms. These are unavoidable, it's equivalent to the
    halting problem. We can only be correct and precise in restricted
   situations (no loops, for example). It's not what the idea here is about.
2) what you describe here about the inlines.
    The underlying problems is to match the 'names/expression'
    given in the declarations with the ones used in(side) the definition.
    I've some drafts but I'm not totally sure about what exactly can be
   done.
3) because of 2) we ignore all the names and we simply care about
    the sum of all contexts.
    So, in truth, we can only deal with a single context.
4) the duplicated machinery this idea/question is about.

-- Luc

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 16:15   ` Luc Van Oostenryck
@ 2017-08-18 18:18     ` Christopher Li
  2017-08-18 19:01       ` Luc Van Oostenryck
  0 siblings, 1 reply; 11+ messages in thread
From: Christopher Li @ 2017-08-18 18:18 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 12:15 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>>> let the normal simplification process act on them?
>>> If the context is (proveably) well balanced, there
>>> shouldn't be any phi-node left for this context.
>>
>> Assume you do the context pseudo attach to the lock
>> variable. Then you will have to do the pointer alias properly.
>> People might have different pointer point to that variable.
>
> No no, it's not the point here.
> What I meant in the idea is that currently OP_CONTEXT
> and its associated attribute work with a unique implicit
> 'object', called 'context' but not explicitly associated to
> a symbol/variable and thus they need their own specific
> machinery.

Ah, I did not get your idea at the first read. So you mean
do not use bb->context to store the context. There will
be still one context per BB. Store the context into pseudo
and some how that pseudo in link into the BB.

I was thinking weather you are going to create more than one
context per BB, attach the context to the locking objects.
That is not what you are trying to do.

If my understanding of you just try to move bb->context
into pseudos. In principle that is fine. It is also depend on
a lot of implementation details as well. Do you still need to
lookup the context per basic block?

>> It can become negative for some helper function that
>> wrap around the unclock function.
>
> This need to be checked (and currently there is a small gap
> where it can go negative and positive again and not warned
> about it.

Not sure that is worthy while to warn. It can happen in the function
assume the lock is taken when enter the function.
The unlock it, do some thing , relock it.
There is total legit reason to do it.

Without cross function checking, there is no good way to know
the function is(or should) always called with lock held.

> There are several problems here:
> 1) the false alarms. These are unavoidable, it's equivalent to the
>     halting problem. We can only be correct and precise in restricted
>    situations (no loops, for example). It's not what the idea here is about.

The really hard one are not avoidable. The one I am talking about.
Which call some helper function to acquired the lock is avoidable.
Last time I look at it, that make up a the large part of the false
alarm.

> 2) what you describe here about the inlines.
>     The underlying problems is to match the 'names/expression'
>     given in the declarations with the ones used in(side) the definition.
>     I've some drafts but I'm not totally sure about what exactly can be
>    done.
> 3) because of 2) we ignore all the names and we simply care about
>     the sum of all contexts.
>     So, in truth, we can only deal with a single context.

There are two issues. I think having multiple context or tight the
context into a variable is nice to have, but not that important.
Even if we implemented that, most likely it is going to yield more
warnings due to not able to find the name/expression properly.
It is very unlikely that in the same function has real two context
error cancel each other out. So having one context for now
is more or less fine.

Not able to do cross function check on context is the big
issue in context checking. If we can do cross function
checking, then a lot of warning can be eliminated.

You can also take a look at the metalevel compiling papers.
The company coverity was started out doing metalevel compiling
checks. Coverity report a lot of bugs in Linux kernel back in the days.

Chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 18:18     ` Christopher Li
@ 2017-08-18 19:01       ` Luc Van Oostenryck
  2017-08-18 19:59         ` Christopher Li
  0 siblings, 1 reply; 11+ messages in thread
From: Luc Van Oostenryck @ 2017-08-18 19:01 UTC (permalink / raw)
  To: Christopher Li; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 8:18 PM, Christopher Li <sparse@chrisli.org> wrote:
> On Fri, Aug 18, 2017 at 12:15 PM, Luc Van Oostenryck wrote:
>>
>> No no, it's not the point here.
>> What I meant in the idea is that currently OP_CONTEXT
>> and its associated attribute work with a unique implicit
>> 'object', called 'context' but not explicitly associated to
>> a symbol/variable and thus they need their own specific
>> machinery.
>
> Ah, I did not get your idea at the first read. So you mean
> do not use bb->context to store the context.

Exactly.
> There will
> be still one context per BB. Store the context into pseudo
> and some how that pseudo in link into the BB.

No, it will just be a pseudo, like any other pseudos so we
could have as much as we want. We'll just need to deal
with them during simplification phase.

> I was thinking weather you are going to create more than one
> context per BB, attach the context to the locking objects.
> That is not what you are trying to do.

Sorta. The contexts will be in (special) pseudos.

> If my understanding of you just try to move bb->context
> into pseudos. In principle that is fine. It is also depend on
> a lot of implementation details as well. Do you still need to
> lookup the context per basic block?

The main idea is to avoid that.

>>> It can become negative for some helper function that
>>> wrap around the unclock function.
>>
>> This need to be checked (and currently there is a small gap
>> where it can go negative and positive again and not warned
>> about it.
>
> Not sure that is worthy while to warn. It can happen in the function
> assume the lock is taken when enter the function.
> The unlock it, do some thing , relock it.
> There is total legit reason to do it.

No, it's an error to (try to) unlock a lock that is not taken.

> Without cross function checking, there is no good way to know
> the function is(or should) always called with lock held.
>
>> There are several problems here:
>> 1) the false alarms. These are unavoidable, it's equivalent to the
>>     halting problem. We can only be correct and precise in restricted
>>    situations (no loops, for example). It's not what the idea here is about.
>
> The really hard one are not avoidable. The one I am talking about.
> Which call some helper function to acquired the lock is avoidable.
> Last time I look at it, that make up a the large part of the false
> alarm.

Yes, most probably.

>> 2) what you describe here about the inlines.
>>     The underlying problems is to match the 'names/expression'
>>     given in the declarations with the ones used in(side) the definition.
>>     I've some drafts but I'm not totally sure about what exactly can be
>>    done.
>> 3) because of 2) we ignore all the names and we simply care about
>>     the sum of all contexts.
>>     So, in truth, we can only deal with a single context.
>
> There are two issues. I think having multiple context or tight the
> context into a variable is nice to have, but not that important.
> Even if we implemented that, most likely it is going to yield more
> warnings due to not able to find the name/expression properly.
> It is very unlikely that in the same function has real two context
> error cancel each other out. So having one context for now
> is more or less fine.

Well, it can more frequent that we could think of.
And anyway, it's the goal of a checker to look after error,
and the rarer they are the more the checker (which
find them!) is valuable.

> Not able to do cross function check on context is the big
> issue in context checking. If we can do cross function
> checking, then a lot of warning can be eliminated.

The problem is that the context attribute accept arbitrary
*expressions* and not a simple name/argument. But it's
normal that it accepts expressions as otherwise a lot of
the functions would need a specific argument for it
(and there is an idea behind that because for the
functions where the expression is simply one of the
arguments, we can do the propagation almost for free
in the inliner).

-- Luc

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 19:01       ` Luc Van Oostenryck
@ 2017-08-18 19:59         ` Christopher Li
  0 siblings, 0 replies; 11+ messages in thread
From: Christopher Li @ 2017-08-18 19:59 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Josh Triplett, Linux-Sparse

On Fri, Aug 18, 2017 at 3:01 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:

> No, it will just be a pseudo, like any other pseudos so we
> could have as much as we want. We'll just need to deal
> with them during simplification phase.

I see. You use a special pseudo to store the context. Every
context change will corresponding that pesudo add/decrease.
That is a good idea.

>> Not sure that is worthy while to warn. It can happen in the function
>> assume the lock is taken when enter the function.
>> The unlock it, do some thing , relock it.
>> There is total legit reason to do it.
>
> No, it's an error to (try to) unlock a lock that is not taken.

Right. But the function itself can't see the caller always call
this function with lock already holed. If the caller call without
holding the lock that will be a bug.

>> There are two issues. I think having multiple context or tight the
>> context into a variable is nice to have, but not that important.
>> Even if we implemented that, most likely it is going to yield more
>> warnings due to not able to find the name/expression properly.
>> It is very unlikely that in the same function has real two context
>> error cancel each other out. So having one context for now
>> is more or less fine.
>
> Well, it can more frequent that we could think of.
> And anyway, it's the goal of a checker to look after error,
> and the rarer they are the more the checker (which
> find them!) is valuable.

That is true. We have have some prototypes and see
how much new warnings it introduce. I am curious how much
of the report is real bug vs false psitives.

> The problem is that the context attribute accept arbitrary
> *expressions* and not a simple name/argument. But it's
> normal that it accepts expressions as otherwise a lot of
> the functions would need a specific argument for it
> (and there is an idea behind that because for the
> functions where the expression is simply one of the
> arguments, we can do the propagation almost for free
> in the inliner).

When you inline it, it is not cross function any more.
That is one way to solve it.

Chris

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 13:20 idea/question about sparse's context checking Luc Van Oostenryck
  2017-08-18 14:26 ` Josh Triplett
  2017-08-18 14:37 ` Christopher Li
@ 2017-08-18 19:34 ` Linus Torvalds
  2017-08-18 20:59   ` Luc Van Oostenryck
  2 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2017-08-18 19:34 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 6:20 AM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
> My idea/question is the following: this interpretation
> is already done for all 'normal' values. So couldn't we
> consider the context as a special kind of 'variable',

I think that's a great idea.  We do all the flow stuff for variables
anyway, and using a hidden variable would make it potentially a lot
more flexible. You could make the context op do much more than just a
fixed inc/dec.

                    Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 19:34 ` Linus Torvalds
@ 2017-08-18 20:59   ` Luc Van Oostenryck
  2017-08-18 21:10     ` Linus Torvalds
  0 siblings, 1 reply; 11+ messages in thread
From: Luc Van Oostenryck @ 2017-08-18 20:59 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 9:34 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Aug 18, 2017 at 6:20 AM, Luc Van Oostenryck wrote:
>
> I think that's a great idea.
I confess that it's totally copied from how it's done for the return value.

> We do all the flow stuff for variables
> anyway, and using a hidden variable would make it potentially a lot
> more flexible. You could make the context op do much more than just a
> fixed inc/dec.

Yes, I see potential too, but nothing very specific. Have you something in mind?

-- Luc

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 20:59   ` Luc Van Oostenryck
@ 2017-08-18 21:10     ` Linus Torvalds
  2017-08-18 21:56       ` Luc Van Oostenryck
  0 siblings, 1 reply; 11+ messages in thread
From: Linus Torvalds @ 2017-08-18 21:10 UTC (permalink / raw)
  To: Luc Van Oostenryck; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 1:59 PM, Luc Van Oostenryck
<luc.vanoostenryck@gmail.com> wrote:
>
>> We do all the flow stuff for variables
>> anyway, and using a hidden variable would make it potentially a lot
>> more flexible. You could make the context op do much more than just a
>> fixed inc/dec.
>
> Yes, I see potential too, but nothing very specific. Have you something in mind?

No, to a first approximation I'd just continue to add and subtract
constant values.

But it might allow us to do conditional contexts, which the kernel
actually needs. Right now the kernel does tricks like this:

     # define __cond_lock(x,c)      ((c) ? ({ __acquire(x); 1; }) : 0)

(see include/linux/compiler.h) exactly because we want to not really
add a constant 1 to the context, but add it only if the condition "c"
was non-zero.

We then depend on sparse just doing the flow simplification etc. But
it *could* have been done by just instead allowing the context to be
updated with a boolean variable..

But sparse might prefer that flow-based approach anyway - I'm just
saying that sometimes a more flexible model could be a good thing at
least in theory.

               Linus

^ permalink raw reply	[flat|nested] 11+ messages in thread

* Re: idea/question about sparse's context checking
  2017-08-18 21:10     ` Linus Torvalds
@ 2017-08-18 21:56       ` Luc Van Oostenryck
  0 siblings, 0 replies; 11+ messages in thread
From: Luc Van Oostenryck @ 2017-08-18 21:56 UTC (permalink / raw)
  To: Linus Torvalds; +Cc: Josh Triplett, Sparse Mailing-list

On Fri, Aug 18, 2017 at 11:10 PM, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
> On Fri, Aug 18, 2017 at 1:59 PM, Luc Van Oostenryck wrote:
>>
>> Yes, I see potential too, but nothing very specific. Have you something in mind?
>
> No, to a first approximation I'd just continue to add and subtract
> constant values.
>
> But it might allow us to do conditional contexts, which the kernel
> actually needs. Right now the kernel does tricks like this:
>
>      # define __cond_lock(x,c)      ((c) ? ({ __acquire(x); 1; }) : 0)
>
> (see include/linux/compiler.h) exactly because we want to not really
> add a constant 1 to the context, but add it only if the condition "c"
> was non-zero.

Yes, it could be like __context_{set,clr}(x, <boolean>)
or even                   __context_{or, xor, and, clr}(x, <somebitmask>)

> We then depend on sparse just doing the flow simplification etc. But
> it *could* have been done by just instead allowing the context to be
> updated with a boolean variable..
>
> But sparse might prefer that flow-based approach anyway - I'm just
> saying that sometimes a more flexible model could be a good thing at
> least in theory.

Absolutetly.

-- Luc

^ permalink raw reply	[flat|nested] 11+ messages in thread

end of thread, other threads:[~2017-08-18 21:56 UTC | newest]

Thread overview: 11+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2017-08-18 13:20 idea/question about sparse's context checking Luc Van Oostenryck
2017-08-18 14:26 ` Josh Triplett
2017-08-18 14:37 ` Christopher Li
2017-08-18 16:15   ` Luc Van Oostenryck
2017-08-18 18:18     ` Christopher Li
2017-08-18 19:01       ` Luc Van Oostenryck
2017-08-18 19:59         ` Christopher Li
2017-08-18 19:34 ` Linus Torvalds
2017-08-18 20:59   ` Luc Van Oostenryck
2017-08-18 21:10     ` Linus Torvalds
2017-08-18 21:56       ` Luc Van Oostenryck

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).