* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 10:49 ` Eric W. Biederman
@ 2017-05-29 10:49 ` Eric W. Biederman
2017-05-29 11:30 ` Eric W. Biederman
2017-05-29 12:13 ` Peter Zijlstra
2 siblings, 0 replies; 20+ messages in thread
From: Eric W. Biederman @ 2017-05-29 10:49 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening@lists.openwall.com,
LKML
Peter Zijlstra <peterz@infradead.org> writes:
> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>
>> Kees, I have a concern:
>>
>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>> {
>> 	unsigned int new, val = atomic_read(&r->refs);
>>
>> 	do {
>> 		if (!val)
>> 			return false;
>>
>> 		if (unlikely(val == UINT_MAX))
>> 			return true;
>>
>> 		new = val + i;
>> 		if (new < val)
>> 			new = UINT_MAX;
>>
>> 	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>>
>> 	WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>>
>> 	return true;
>> }
>>
>> Why in the world do you succeed when the value saturates????
>
> Why not? On saturation the object will leak and returning a reference to
> it is always good.
>
>> From a code perspective that is bizarre. The code already has to handle
>> the case when the counter does not increment.
>
> I don't see it as bizarre, we turned an overflow/use-after-free into a
> leak. That's the primary mechanism here.
>
> As long as we have a reference to a leaked object, we might as well use
> it; it's not going anywhere.
>
>> Fixing the return value would move refcount_t into the realm of
>> something that is desirable because it has better semantics and
>> is more useful just on a day to day correctness point of view. Even
>> ignoring the security implications.
>
> It changes the semantics between inc_not_zero() and inc(). It also
> complicates the semantics of inc_not_zero(), where currently the failure
> implies the count is 0 and means no-such-object, you complicate matters
> by basically returning 'busy'.

Busy is not a state of a reference count.

It is true I am suggesting treating something with a saturated reference
as not available, if that is what you mean by busy. But if its
reference is zero it is also not available. So there is no practical
difference.

> That is a completely new class of failure that is actually hard to deal
> with, not to mention that it completely destroys refcount_inc_not_zero()
> being a 'simple' replacement for atomic_inc_not_zero().
>
> In case of the current failure, the no-such-object, we can fix that by
> creating said object. But what to do on 'busy' ? Surely you don't want
> to create another. You'd have to somehow retrofit something to wait on
> in every user.

Using little words.

A return of true from inc_not_zero means we took a reference.
A return of false means we did not take a reference.

The code already handles "I took a reference" or "I did not take a
reference".

Therefore lying with refcount_t is not helpful. It takes failures
the code could easily handle and turns them into leaks.

At least that is how I have seen reference counts used. And those
are definitely the plain, obvious semantics.

Your changes are definitely not drop-in replacements for atomic_t in my
code.

Eric
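
The return-value semantics argued for above can be sketched in userspace C.
This is only a sketch under stated assumptions: C11 <stdatomic.h> stands in
for the kernel's atomic_t, and sketch_refcount_t and
sketch_add_not_zero_strict() are hypothetical names, not kernel API. It shows
one possible reading of the proposal, in which a saturated counter is
reported as "no reference taken" instead of success.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for the kernel's refcount_t. */
typedef struct {
	atomic_uint refs;
} sketch_refcount_t;

/*
 * Hypothetical strict variant: returns true only when the counter was
 * advanced by i and is still below UINT_MAX. A zero counter, an already
 * saturated counter, and an addition that reaches or overflows UINT_MAX
 * all report false.
 */
static bool sketch_add_not_zero_strict(unsigned int i, sketch_refcount_t *r)
{
	unsigned int val = atomic_load_explicit(&r->refs, memory_order_relaxed);
	unsigned int new;

	do {
		if (val == 0 || val == UINT_MAX)
			return false;	/* dead or saturated: nothing taken */

		new = val + i;
		if (new < val)
			new = UINT_MAX;	/* overflow: pin the counter */

		/* On failure val is reloaded and the checks run again. */
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &val, new,
							memory_order_relaxed,
							memory_order_relaxed));

	return new != UINT_MAX;
}
```

With this variant a caller's existing "did not take a reference" path also
covers saturation; whether that is preferable to returning true is exactly
the point under dispute in this thread.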
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 10:49 ` Eric W. Biederman
2017-05-29 10:49 ` Eric W. Biederman
@ 2017-05-29 11:30 ` Eric W. Biederman
2017-05-29 11:30 ` Eric W. Biederman
2017-05-29 11:39 ` Eric W. Biederman
2017-05-29 12:13 ` Peter Zijlstra
2 siblings, 2 replies; 20+ messages in thread
From: Eric W. Biederman @ 2017-05-29 11:30 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening
ebiederm@xmission.com (Eric W. Biederman) writes:
> Peter Zijlstra <peterz@infradead.org> writes:
>
>> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>>
>>> Kees, I have a concern:
>>>
>>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>>> {
>>> 	unsigned int new, val = atomic_read(&r->refs);
>>>
>>> 	do {
>>> 		if (!val)
>>> 			return false;
>>>
>>> 		if (unlikely(val == UINT_MAX))
>>> 			return true;
>>>
>>> 		new = val + i;
>>> 		if (new < val)
>>> 			new = UINT_MAX;
>>>
>>> 	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>>>
>>> 	WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>>>
>>> 	return true;
>>> }
>>>
>>> Why in the world do you succeed when the value saturates????
>>
>> Why not? On saturation the object will leak and returning a reference to
>> it is always good.
>>
>>> From a code perspective that is bizarre. The code already has to handle
>>> the case when the counter does not increment.
>>
>> I don't see it as bizarre, we turned an overflow/use-after-free into a
>> leak. That's the primary mechanism here.
>>
>> As long as we have a reference to a leaked object, we might as well use
>> it; it's not going anywhere.
>>
>>> Fixing the return value would move refcount_t into the realm of
>>> something that is desirable because it has better semantics and
>>> is more useful just on a day to day correctness point of view. Even
>>> ignoring the security implications.
>>
>> It changes the semantics between inc_not_zero() and inc(). It also
>> complicates the semantics of inc_not_zero(), where currently the failure
>> implies the count is 0 and means no-such-object, you complicate matters
>> by basically returning 'busy'.
>
> Busy is not a state of a reference count.
>
> It is true I am suggesting treating something with a saturated reference
> as not available, if that is what you mean by busy. But if its
> reference is zero it is also not available. So there is no practical
> difference.
>
>> That is a completely new class of failure that is actually hard to deal
>> with, not to mention that it completely destroys refcount_inc_not_zero()
>> being a 'simple' replacement for atomic_inc_not_zero().
>>
>> In case of the current failure, the no-such-object, we can fix that by
>> creating said object. But what to do on 'busy' ? Surely you don't want
>> to create another. You'd have to somehow retrofit something to wait on
>> in every user.
>
> Using little words.
>
> A return of true from inc_not_zero means we took a reference.
> A return of false means we did not take a reference.
>
> The code already handles "I took a reference" or "I did not take a
> reference".
>
> Therefore lying with refcount_t is not helpful. It takes failures
> the code could easily handle and turns them into leaks.
>
> At least that is how I have seen reference counts used. And those
> are definitely the plain, obvious semantics.
>
> Your changes are definitely not drop-in replacements for atomic_t in my
> code.

To clarify.

If my code uses atomic_inc it does not expect a failure of any sort,
and saturate semantics are a fine replacement.

If my code uses atomic_inc_not_zero it knows how to handle a failure
to take a reference count. That makes hiding the failure really bizarre.

A must-check function that hides a case I can handle, and requires
checking in a case where my code is built not to check, is a drop-in
replacement for neither.

So anyone who proposes a refcount_t change as a drop-in replacement
for any code I maintain, I will nack on sight, because refcount_t is not
currently a no-brain drop-in replacement.

Eric
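
The kind of caller described above, one that already handles a failed
inc_not_zero, might look like the following userspace sketch. It assumes C11
<stdatomic.h> rather than kernel atomics; struct sketch_obj,
sketch_get_unless_zero() and sketch_lookup() are hypothetical names made up
for illustration, and the flat array stands in for whatever lookup structure
real code would use.

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical object with an embedded counter; not a kernel type. */
struct sketch_obj {
	atomic_uint refs;
	int key;
};

/* Take a reference unless the object is already dead (count == 0). */
static bool sketch_get_unless_zero(struct sketch_obj *o)
{
	unsigned int val = atomic_load_explicit(&o->refs, memory_order_relaxed);
	unsigned int new;

	do {
		if (val == 0)
			return false;
		new = val + 1;
	} while (!atomic_compare_exchange_weak_explicit(&o->refs, &val, new,
							memory_order_relaxed,
							memory_order_relaxed));

	return true;
}

/*
 * Look up key in table[0..n). Returns the object with one extra
 * reference held, or NULL when the key is absent or the object could
 * not be pinned. The caller needs a NULL path either way.
 */
static struct sketch_obj *sketch_lookup(struct sketch_obj *table, size_t n,
					int key)
{
	for (size_t i = 0; i < n; i++) {
		if (table[i].key != key)
			continue;
		if (sketch_get_unless_zero(&table[i]))
			return &table[i];
		return NULL;	/* found, but no reference was taken */
	}

	return NULL;
}
```

The "found, but no reference was taken" branch is the failure path the
message refers to: it exists whether or not the counter can saturate.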
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 11:30 ` Eric W. Biederman
2017-05-29 11:30 ` Eric W. Biederman
@ 2017-05-29 11:39 ` Eric W. Biederman
2017-05-29 11:39 ` Eric W. Biederman
2017-05-29 12:23 ` Peter Zijlstra
1 sibling, 2 replies; 20+ messages in thread
From: Eric W. Biederman @ 2017-05-29 11:39 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening
ebiederm@xmission.com (Eric W. Biederman) writes:
> ebiederm@xmission.com (Eric W. Biederman) writes:
>
>> Peter Zijlstra <peterz@infradead.org> writes:
>>
>>> On Mon, May 29, 2017 at 04:11:13AM -0500, Eric W. Biederman wrote:
>>>
>>>> Kees, I have a concern:
>>>>
>>>> __must_check bool refcount_add_not_zero(unsigned int i, refcount_t *r)
>>>> {
>>>> 	unsigned int new, val = atomic_read(&r->refs);
>>>>
>>>> 	do {
>>>> 		if (!val)
>>>> 			return false;
>>>>
>>>> 		if (unlikely(val == UINT_MAX))
>>>> 			return true;
>>>>
>>>> 		new = val + i;
>>>> 		if (new < val)
>>>> 			new = UINT_MAX;
>>>>
>>>> 	} while (!atomic_try_cmpxchg_relaxed(&r->refs, &val, new));
>>>>
>>>> 	WARN_ONCE(new == UINT_MAX, "refcount_t: saturated; leaking memory.\n");
>>>>
>>>> 	return true;
>>>> }
>>>>
>>>> Why in the world do you succeed when the value saturates????
>>>
>>> Why not? On saturation the object will leak and returning a reference to
>>> it is always good.
>>>
>>>> From a code perspective that is bizarre. The code already has to handle
>>>> the case when the counter does not increment.
>>>
>>> I don't see it as bizarre, we turned an overflow/use-after-free into a
>>> leak. That's the primary mechanism here.
>>>
>>> As long as we have a reference to a leaked object, we might as well use
>>> it; it's not going anywhere.
>>>
>>>> Fixing the return value would move refcount_t into the realm of
>>>> something that is desirable because it has better semantics and
>>>> is more useful just on a day to day correctness point of view. Even
>>>> ignoring the security implications.
>>>
>>> It changes the semantics between inc_not_zero() and inc(). It also
>>> complicates the semantics of inc_not_zero(), where currently the failure
>>> implies the count is 0 and means no-such-object, you complicate matters
>>> by basically returning 'busy'.
>>
>> Busy is not a state of a reference count.
>>
>> It is true I am suggesting treating something with a saturated reference
>> as not available, if that is what you mean by busy. But if its
>> reference is zero it is also not available. So there is no practical
>> difference.
>>
>>> That is a completely new class of failure that is actually hard to deal
>>> with, not to mention that it completely destroys refcount_inc_not_zero()
>>> being a 'simple' replacement for atomic_inc_not_zero().
>>>
>>> In case of the current failure, the no-such-object, we can fix that by
>>> creating said object. But what to do on 'busy' ? Surely you don't want
>>> to create another. You'd have to somehow retrofit something to wait on
>>> in every user.
>>
>> Using little words.
>>
>> A return of true from inc_not_zero means we took a reference.
>> A return of false means we did not take a reference.
>>
>> The code already handles "I took a reference" or "I did not take a
>> reference".
>>
>> Therefore lying with refcount_t is not helpful. It takes failures
>> the code could easily handle and turns them into leaks.
>>
>> At least that is how I have seen reference counts used. And those
>> are definitely the plain, obvious semantics.
>>
>> Your changes are definitely not drop-in replacements for atomic_t in my
>> code.
>
> To clarify.
>
> If my code uses atomic_inc it does not expect a failure of any sort,
> and saturate semantics are a fine replacement.
>
> If my code uses atomic_inc_not_zero it knows how to handle a failure
> to take a reference count. That makes hiding the failure really bizarre.
>
> A must-check function that hides a case I can handle, and requires
> checking in a case where my code is built not to check, is a drop-in
> replacement for neither.
>
> So anyone who proposes a refcount_t change as a drop-in replacement
> for any code I maintain, I will nack on sight, because refcount_t is not
> currently a no-brain drop-in replacement.

*Blink*

I failed to see that there is a refcount_inc. Too much noise in
the header file I suppose.

But implementing refcount_inc in terms of refcount_inc_not_zero is
totally broken. The two operations are not the same, and they go to
different assumptions the code is making.

That explains why you think refcount_inc_not_zero should lie: you are
implementing refcount_inc with it. They are semantically very
different operations. Please separate them.

Eric
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 11:39 ` Eric W. Biederman
2017-05-29 11:39 ` Eric W. Biederman
@ 2017-05-29 12:23 ` Peter Zijlstra
2017-05-29 12:23 ` Peter Zijlstra
2017-05-29 15:43 ` Peter Zijlstra
1 sibling, 2 replies; 20+ messages in thread
From: Peter Zijlstra @ 2017-05-29 12:23 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening
On Mon, May 29, 2017 at 06:39:44AM -0500, Eric W. Biederman wrote:
> I failed to see that there is a refcount_inc. Too much noise in
> the header file I suppose.
>
> But implementing refcount_inc in terms of refcount_inc_not_zero is
> totally broken. The two operations are not the same, and they go to
> different assumptions the code is making.
>
> That explains why you think refcount_inc_not_zero should lie because
> you are implementing refcount_inc with it. They are semantically very
> different operations. Please separate them.

There has been much debate about this. And the best I'll do is add a
comment and/or retain these exact semantics.

What is done is:

  refcount_inc() := WARN_ON(!refcount_inc_not_zero())

Because incrementing a zero reference count is a use-after-free and
something we should not do ever.

This is where the whole usage count vs reference count pain comes from.

Once there are no more _references_ to an object, a reference count
frees the object. Therefore a zero reference count means a dead object
and incrementing from that is fail.

The usage count model otoh counts how many (active) users there are of
an object, and no active users is a good and expected situation. But it
is very explicitly not a reference count. Because even in the no users
case do we have a reference to the object (we've not leaked it after
all, we just don't track all references).

Similarly, refcount_dec() is implemented using dec_and_test() and will
WARN when it hits 0, because this is a leak and we don't want those
either.

A usage count variant otoh would be fine with hitting 0.
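
A rough userspace model of the two wrappers described above, for
illustration only. It assumes C11 <stdatomic.h> in place of kernel atomics;
every sketch_* name is hypothetical, and sketch_warn() merely counts and
prints where the kernel would WARN. It is not the code in lib/refcount.c.

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>
#include <stdio.h>

typedef struct {
	atomic_uint refs;
} sketch_refcount_t;

/* Stand-in for WARN_ON(): count and print instead of dumping a trace. */
static unsigned int sketch_warnings;

static void sketch_warn(const char *msg)
{
	sketch_warnings++;
	fprintf(stderr, "WARN: %s\n", msg);
}

static bool sketch_inc_not_zero(sketch_refcount_t *r)
{
	unsigned int val = atomic_load_explicit(&r->refs, memory_order_relaxed);
	unsigned int new;

	do {
		if (val == 0)
			return false;	/* dead object */
		if (val == UINT_MAX)
			return true;	/* saturated: stays pinned */
		new = val + 1;
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &val, new,
							memory_order_relaxed,
							memory_order_relaxed));
	return true;
}

/* refcount_inc() := WARN_ON(!refcount_inc_not_zero()) */
static void sketch_inc(sketch_refcount_t *r)
{
	if (!sketch_inc_not_zero(r))
		sketch_warn("increment on 0; use-after-free");
}

static bool sketch_dec_and_test(sketch_refcount_t *r)
{
	unsigned int val = atomic_load_explicit(&r->refs, memory_order_relaxed);
	unsigned int new;

	do {
		if (val == UINT_MAX)
			return false;	/* saturated counters never drop */
		if (val == 0) {
			sketch_warn("underflow; use-after-free");
			return false;
		}
		new = val - 1;
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &val, new,
							memory_order_release,
							memory_order_relaxed));
	return new == 0;
}

/* A plain dec that reaches 0 has nobody left to free the object. */
static void sketch_dec(sketch_refcount_t *r)
{
	if (sketch_dec_and_test(r))
		sketch_warn("decrement hit 0; leaking memory");
}
```

In this model a counter at 0 can never be revived by sketch_inc(), and the
final put has to go through sketch_dec_and_test() so that someone acts on the
true return; a usage count, by contrast, would accept both 0 -> 1 and 1 -> 0
silently.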
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 12:23 ` Peter Zijlstra
2017-05-29 12:23 ` Peter Zijlstra
@ 2017-05-29 15:43 ` Peter Zijlstra
2017-05-29 15:43 ` Peter Zijlstra
1 sibling, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2017-05-29 15:43 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening
On Mon, May 29, 2017 at 02:23:16PM +0200, Peter Zijlstra wrote:
> On Mon, May 29, 2017 at 06:39:44AM -0500, Eric W. Biederman wrote:
> > I failed to see that there is a refcount_inc. Too much noise in
> > the header file I suppose.
> >
> > But implementing refcount_inc in terms of refcount_inc_not_zero is
> > totally broken. The two operations are not the same, and they go to
> > different assumptions the code is making.
> >
> > That explains why you think refcount_inc_not_zero should lie because
> > you are implementing refcount_inc with it. They are semantically very
> > different operations. Please separate them.
>
> There has been much debate about this. And the best I'll do is add a
> comment and/or retain these exact semantics.
>
> What is done is:
>
> refcount_inc() := WARN_ON(!refcount_inc_not_zero())
>
> Because incrementing a zero reference count is a use-after-free and
> something we should not do ever.
>
> This is where the whole usage count vs reference count pain comes from.
>
> Once there are no more _references_ to an object, a reference count
> frees the object. Therefore a zero reference count means a dead object
> and incrementing from that is fail.
>
> The usage count model otoh counts how many (active) users there are of
> an object, and no active users is a good and expected situation. But it
> is very explicitly not a reference count. Because even in the no users
> case do we have a reference to the object (we've not leaked it after
> all, we just don't track all references).
>
>
> Similarly, refcount_dec() is implemented using dec_and_test() and will
> WARN when it hits 0, because this is a leak and we don't want those
> either.
>
> A usage count variant otoh would be fine with hitting 0.

A typical pattern for the usage count is caches, where objects are kept
in a data structure (hash/tree and/or list) and we count how many users
there are of said objects. A GC or shrinker will then iterate the
structure and prune those objects that have 0 users.

It is fairly straightforward to convert those to refcount_t by adding
one reference for the data structure itself. The GC/shrinker will then
have to use something like refcount_dec_if_one() to drop the reference
from 1->0 (and we could easily add something like dec_and_lock_if_one if
so desired).

Not all of them mind you, but simple cases can certainly be done without
too much pain.

But clearly there have been conversions of less than desired quality /
clarity though ...
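
The cache conversion described above could be sketched as follows. This is a
userspace illustration under stated assumptions: C11 <stdatomic.h> instead of
kernel atomics, and hypothetical names throughout (sketch_dec_if_one(),
struct sketch_entry, sketch_shrink_one()); locking and the actual unlink and
free are left out.

```c
#include <stdatomic.h>
#include <stdbool.h>

typedef struct {
	atomic_uint refs;
} sketch_refcount_t;

/* Drop the reference only when it is the last one (1 -> 0). */
static bool sketch_dec_if_one(sketch_refcount_t *r)
{
	unsigned int expected = 1;

	return atomic_compare_exchange_strong_explicit(&r->refs, &expected, 0,
						       memory_order_release,
						       memory_order_relaxed);
}

/* Hypothetical cache entry; the cache itself owns one reference. */
struct sketch_entry {
	sketch_refcount_t ref;	/* 1 means only the cache references it */
	bool in_cache;
};

static void sketch_entry_init(struct sketch_entry *e)
{
	atomic_init(&e->ref.refs, 1);	/* the data structure's reference */
	e->in_cache = true;
}

/* Shrinker step: prune the entry only if no user holds a reference. */
static bool sketch_shrink_one(struct sketch_entry *e)
{
	if (!sketch_dec_if_one(&e->ref))
		return false;	/* still in use; leave it in the cache */

	e->in_cache = false;	/* a real cache would unlink and free here */
	return true;
}
```

With the extra reference held by the cache, "0 users" in the usage-count
model becomes a count of 1, so the counter never has to be incremented from
0 and the shrinker is the only path that takes it to 0.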
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 15:43 ` Peter Zijlstra
@ 2017-05-29 15:43 ` Peter Zijlstra
0 siblings, 0 replies; 20+ messages in thread
From: Peter Zijlstra @ 2017-05-29 15:43 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening@lists.openwall.com,
LKML
On Mon, May 29, 2017 at 02:23:16PM +0200, Peter Zijlstra wrote:
> On Mon, May 29, 2017 at 06:39:44AM -0500, Eric W. Biederman wrote:
> > I failed to see that there is a refcount_inc. Too much noise in
> > the header file I suppose.
> >
> > But implementing refcount_inc in terms of refcount_inc_not_zero is
> > totally broken. The two operations are not the same and the go to
> > different assumptions the code is making.
> >
> > That explains why you think refcount_inc_not_zero should lie because
> > you are implementing refcount_inc with it. They are semantically very
> > different operations. Please separate them.
>
> There has been much debate about this. And the best I'll do is add a
> comment and/or retain these exact semantics.
>
> What is done is:
>
> refcount_inc() := WARN_ON(!refcount_inc_not_zero())
>
> Because incrementing a zero reference count is a use-after-free and
> something we should not do ever.
>
> This is where the whole usage count vs reference count pain comes from.
>
> Once there are no more _references_ to an object, a reference count
> frees the object. Therefore a zero reference count means a dead object
> and incrementing from that is fail.
>
> The usage count model otoh counts how many (active) users there are of
> an object, and no active users is a good and expected situation. But it
> is very explicitly not a reference count. Because even in the no users
> case do we have a reference to the object (we've not leaked it after
> all, we just don't track all references).
>
>
> Similarly, refcount_dec() is implemented using dec_and_test() and will
> WARN when it hits 0, because this is a leak and we don't want those
> either.
>
> A usage count variant otoh would be fine with hitting 0.
A typical pattern for the usage count is caches, where objects are kept
in a data structure (hash/tree and/or list) and we count how many users
there are of said objects. A GC or shrinker will then iterate the
structure and prune those objects that have 0 users.
It is fairly straightforward to convert those to refcount_t by adding
one reference for the data structure itself. The GC/shrinker will then
have to use something like refcount_dec_if_one() to drop the reference
from 1->0 (and we could easily add something like dec_and_lock_if_one if
so desired).
Not all of them mind you, but simple cases can certainly be done without
too much pain.
Clearly, though, some conversions have been of less than desired
quality / clarity ...
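A minimal userspace sketch of the conversion described above, assuming C11 atomics stand in for refcount_t. All names here (model_ref, cache_obj, cache_try_prune and friends) are hypothetical and saturation is left out for brevity. The cache holds one reference of its own, so "no users" corresponds to a count of 1, and the shrinker may only take the 1->0 step:

```c
#include <stdatomic.h>
#include <stdbool.h>
#include <stddef.h>

/* Userspace stand-in for the kernel's refcount_t (hypothetical name). */
struct model_ref {
	atomic_uint refs;
};

/* Take a reference unless the count is 0, i.e. the object is already dead. */
static bool model_inc_not_zero(struct model_ref *r)
{
	unsigned int val = atomic_load(&r->refs);

	do {
		if (val == 0)
			return false;
	} while (!atomic_compare_exchange_weak(&r->refs, &val, val + 1));

	return true;
}

/* Drop a reference; returns true when this was the last one. */
static bool model_dec_and_test(struct model_ref *r)
{
	return atomic_fetch_sub(&r->refs, 1) == 1;
}

/* Drop the reference only if it is the sole remaining one (1 -> 0). */
static bool model_dec_if_one(struct model_ref *r)
{
	unsigned int one = 1;

	return atomic_compare_exchange_strong(&r->refs, &one, 0);
}

struct cache_obj {
	struct model_ref ref;	/* 1 for the cache itself, plus 1 per user */
	bool in_cache;
};

static void cache_insert(struct cache_obj *o)
{
	atomic_init(&o->ref.refs, 1);	/* the data structure's own reference */
	o->in_cache = true;
}

/* Shrinker step: prune only when the cache holds the last reference. */
static bool cache_try_prune(struct cache_obj *o)
{
	if (!model_dec_if_one(&o->ref))
		return false;		/* still has users, leave it alone */

	o->in_cache = false;		/* a real cache would unlink and free here */
	return true;
}
```

With this layout the count never reaches 0 while the object is still reachable through the cache, which is what lets lookups keep using an inc_not_zero() style acquire.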
^ permalink raw reply [flat|nested] 20+ messages in thread
* Re: [PATCH 0/3] ipc subsystem refcounter conversions
2017-05-29 10:49 ` Eric W. Biederman
2017-05-29 10:49 ` Eric W. Biederman
2017-05-29 11:30 ` Eric W. Biederman
@ 2017-05-29 12:13 ` Peter Zijlstra
2017-05-29 12:13 ` Peter Zijlstra
2 siblings, 1 reply; 20+ messages in thread
From: Peter Zijlstra @ 2017-05-29 12:13 UTC (permalink / raw)
To: Eric W. Biederman
Cc: Christoph Hellwig, Kees Cook, Andrew Morton, Elena Reshetova,
Greg KH, Ingo Molnar, Alexey Dobriyan, Serge E. Hallyn, arozansk,
Davidlohr Bueso, Manfred Spraul, axboe@kernel.dk, James Bottomley,
x86@kernel.org, Ingo Molnar, Arnd Bergmann, David S. Miller,
Rik van Riel, linux-arch, kernel-hardening
On Mon, May 29, 2017 at 05:49:53AM -0500, Eric W. Biederman wrote:
> > It changes the semantics between inc_not_zero() and inc(). It also
> > complicates the semantics of inc_not_zero(), where currently the failure
> > implies the count is 0 and means no-such-object, you complicate matters
> > by basically returning 'busy'.
>
> Busy is not a state of a reference count.
>
> It is true I am suggesting treating something with a saturated reference
> as not available. If that is what you mean by busy. But if its
> reference is zero it is also not available. So there is no practical
> difference.
There is. Previously when inc_not_zero() failed, you _knew_ it was 0 and
therefore knew the object no longer 'exists'.
Similarly, if you know you're serialized against 1->0 you can then
assume it will not fail.
That goes out the window the moment you fail for any other condition.
> > That is a completely new class of failure that is actually hard to deal
> > with, not to mention that it completely destroys refcount_inc_not_zero()
> > being a 'simple' replacement for atomic_inc_not_zero().
> >
> > In case of the current failure, the no-such-object, we can fix that by
> > creating said object. But what to do on 'busy' ? Surely you don't want
> > to create another. You'd have to somehow retrofit something to wait on
> > in every user.
>
> Using little words.
>
> A return of true from inc_not_zero means we took a reference.
> A return of false means we did not take a reference.
>
> The code already handles I took a reference or I did not take a
> reference.
I can well imagine code relying on the fact that failing to take a
reference means 0, see below. And once you start to fail for more
conditions, the actual value you failed on becomes interesting in order
to determine how to deal with it.
> Therefore lying with refcount_t is not helpful.
It is _NOT_ lying. It does as promised, it increments when it is not
zero. The fact that that increment can saturate is irrelevant. A
saturated reference is still a valid reference. Sure it causes a leak,
but who bloody cares, it shouldn't happen in the first place.
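A rough userspace model of those semantics, assuming C11 atomics in place of the kernel's atomic_t and patterned on the refcount_add_not_zero() quoted earlier in the thread. The sat_* names are hypothetical. The only failure is a count of 0; a saturated count stays pinned at UINT_MAX and still hands out a valid reference:

```c
#include <limits.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Userspace stand-in for refcount_t (hypothetical name). */
struct sat_ref {
	atomic_uint refs;
};

/* Increment unless zero. Fails only on 0; saturation still succeeds. */
static bool sat_inc_not_zero(struct sat_ref *r)
{
	unsigned int new, val;

	val = atomic_load_explicit(&r->refs, memory_order_relaxed);
	do {
		if (val == 0)
			return false;	/* dead object: no reference taken */

		if (val == UINT_MAX)
			return true;	/* saturated: pinned, object leaks */

		new = val + 1;
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &val, new,
							memory_order_relaxed,
							memory_order_relaxed));

	return true;
}

/* Decrement; returns true only when the count really dropped to 0. */
static bool sat_dec_and_test(struct sat_ref *r)
{
	unsigned int new, val;

	val = atomic_load_explicit(&r->refs, memory_order_relaxed);
	do {
		if (val == UINT_MAX)
			return false;	/* saturated counts never come back down */

		if (val == 0)
			return false;	/* underflow: the kernel would WARN here */

		new = val - 1;
	} while (!atomic_compare_exchange_weak_explicit(&r->refs, &val, new,
							memory_order_release,
							memory_order_relaxed));

	return new == 0;
}
```

Because a saturated count is never decremented, the object can never be freed while someone still points at it, which is the leak-instead-of-use-after-free trade described above.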
> It takes failures
> the code could easily handle and turns them into leaks.
That is the 'feature', we get to have leaks. Also those leaks _should_
not happen. They are a result of 'broken' code. So I don't see how
exposing them to the wider world helps anything but spread the pain of
the failure.
Please explain how the below is not subtly broken by changing
inc_not_zero.
	struct obj *__obj_lookup(key)
	{
		struct obj *obj;

		obj = rcu_based_lookup(key);
		/* a failed increment means the count was 0: the object is gone */
		if (obj && refcount_inc_not_zero(&obj->ref))
			return obj;

		return NULL;
	}

	struct obj *obj_find_acquire(key)
	{
		struct obj *obj;

		/* fast path, lockless lookup */
		rcu_read_lock();
		obj = __obj_lookup(key);
		rcu_read_unlock();
		if (obj)
			return obj;

		/* slow path, serialize */
		lock(&obj_lock);

		/* we're serialized, if it exists we must get a ref */
		obj = __obj_lookup(key);
		if (obj)
			goto unlock;

		/* allocate a new object and insert it */
		obj = obj_alloc();
		obj_init(obj, key);

	unlock:
		unlock(&obj_lock);
		return obj;
	}
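To make the breakage concrete, here is a single-threaded userspace sketch, assuming a tiny array-backed table and plain unsigned counters. Locking and RCU are omitted, and every name in it is hypothetical. It runs the slow path of obj_find_acquire() with two inc_not_zero policies: the current one, which fails only on 0, and a stricter one that also fails on saturation.

```c
#include <limits.h>
#include <stdbool.h>
#include <stddef.h>

#define TABLE_SLOTS 4

struct obj {
	int key;
	unsigned int ref;
};

struct table {
	struct obj objs[TABLE_SLOTS];
	size_t used;
};

typedef bool (*inc_fn)(unsigned int *ref);

/* Current semantics: fail only on 0; a saturated count stays pinned. */
static bool inc_not_zero_saturating(unsigned int *ref)
{
	if (*ref == 0)
		return false;
	if (*ref != UINT_MAX)
		(*ref)++;
	return true;
}

/* Stricter variant: also report failure once the count is saturated. */
static bool inc_not_zero_strict(unsigned int *ref)
{
	if (*ref == 0 || *ref == UINT_MAX)
		return false;
	(*ref)++;
	return true;
}

static struct obj *lookup(struct table *t, int key, inc_fn inc)
{
	for (size_t i = 0; i < t->used; i++)
		if (t->objs[i].key == key && inc(&t->objs[i].ref))
			return &t->objs[i];
	return NULL;
}

/* Slow path: a failed lookup is read as "no such object", so create one. */
static struct obj *find_acquire(struct table *t, int key, inc_fn inc)
{
	struct obj *o = lookup(t, key, inc);

	if (o)
		return o;
	if (t->used == TABLE_SLOTS)
		return NULL;

	o = &t->objs[t->used++];
	o->key = key;
	o->ref = 1;
	return o;
}

/* Count how many live or dead entries carry the given key. */
static size_t count_key(const struct table *t, int key)
{
	size_t n = 0;

	for (size_t i = 0; i < t->used; i++)
		if (t->objs[i].key == key)
			n++;
	return n;
}
```

With the saturating policy a saturated object is simply returned again. With the strict policy the serialized lookup fails even though the object exists, so the slow path inserts a second object for the same key.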
> At least that is how I have seen reference counts used. And those
> are definitely the plain obvious semantics.
I very strongly disagree. The one thing this primitive does is change
add/sub to be saturating, *consistently*.
You argue for exposing the failure case, which leads to more error paths
and more complication. I'll argue that nobody wants more error
handling, esp. for something that should not happen to begin with.
> Your changes are definitely not drop in replacements for atomic_t in my
> code.
refcount_t was never meant to be a drop-in replacement for atomic_t.
It is meant to be a fairly painless replacement for reference
counts that were previously implemented using atomic_t.
There is also the fairly common usage count scenario, that does not fit
well with refcount_t. I'm not sure we want to shoehorn that into
refcount_t either, we could create yet another type for that.
And there are of course a lot of less common things we're not wanting to
replace at all. atomic_t isn't broken; we don't need to fix it.
I have no idea what you do so I cannot comment further.
^ permalink raw reply [flat|nested] 20+ messages in thread