public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
@ 2009-12-21  1:20 Andi Kleen
  2009-12-21  1:59 ` Eric W. Biederman
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2009-12-21  1:20 UTC (permalink / raw)
  To: linux-kernel, paulmck, ebiederm


With BKL-less sysctls most of the writable string sysctls are racy. There
is no locking on the reader side, so a reader could see an inconsistent
string or worse miss the terminating null and walk of beyond it.

This patch kit adds a new "rcu string" variant to avoid these 
problems and convers the racy users. One the writer side the strings are 
always copied to new memory and the readers use rcu_read_lock()
to get a stable view. For readers who access the string over
sleeps the reader copies the string. 

This is all hidden in a new generic "rcu_string" ADT which can be also 
used for other purposes.

This finally implements all the letters in RCU, most other users
leave out the 'C'.

I also fixed the zone order list sysctl to use a mutex to avoid
racing with itself. 

I left some obscure users in architectures (sparc, mips) alone and audited
all of the others. The sparc reboot_cmd one has references to asm files
which I didn't want to touch and the mips variant seemd just too obscure.

All the others are not racy.

-Andi

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
  2009-12-21  1:20 Andi Kleen
@ 2009-12-21  1:59 ` Eric W. Biederman
  2009-12-21  2:04   ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Eric W. Biederman @ 2009-12-21  1:59 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, paulmck

Andi Kleen <andi@firstfloor.org> writes:

> With BKL-less sysctls most of the writable string sysctls are racy. There
> is no locking on the reader side, so a reader could see an inconsistent
> string or worse miss the terminating null and walk of beyond it.

The walk will only extend up to the maximum length of the string.
So the worst case really is inconsistent data.

This is an unfortunate corner case.  This is not a regression as this
has been the way things have worked for years.  So probably 2.6.34
material.

> This patch kit adds a new "rcu string" variant to avoid these 
> problems and convers the racy users. One the writer side the strings are 
> always copied to new memory and the readers use rcu_read_lock()
> to get a stable view. For readers who access the string over
> sleeps the reader copies the string. 

I will have to look more after the holidays.  This rcu_string looks like
it introduces allocations on paths that did not use them before, which
has me wondering a bit.

Eric

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
  2009-12-21  1:59 ` Eric W. Biederman
@ 2009-12-21  2:04   ` Andi Kleen
  2009-12-21  2:31     ` Eric W. Biederman
  0 siblings, 1 reply; 7+ messages in thread
From: Andi Kleen @ 2009-12-21  2:04 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andi Kleen, linux-kernel, paulmck

On Sun, Dec 20, 2009 at 05:59:59PM -0800, Eric W. Biederman wrote:
> Andi Kleen <andi@firstfloor.org> writes:
> 
> > With BKL-less sysctls most of the writable string sysctls are racy. There
> > is no locking on the reader side, so a reader could see an inconsistent
> > string or worse miss the terminating null and walk of beyond it.
> 
> The walk will only extend up to the maximum length of the string.
> So the worst case really is inconsistent data.

It could still miss the 0 byte and walk out, can't it?

> This is an unfortunate corner case.  This is not a regression as this
> has been the way things have worked for years.  So probably 2.6.34
>  material.

The one that's a clear regression is the core pattern one, that 
was protected before by the BKL. A lot of others were always
broken yes.

> 
> > This patch kit adds a new "rcu string" variant to avoid these 
> > problems and convers the racy users. One the writer side the strings are 
> > always copied to new memory and the readers use rcu_read_lock()
> > to get a stable view. For readers who access the string over
> > sleeps the reader copies the string. 
> 
> I will have to look more after the holidays.  This rcu_string looks like
> it introduces allocations on paths that did not use them before, which
> has me wondering a bit.

On the reader side about all of them allocated before, e.g. for
call_usermodehelper.

If the strings were made a bit smaller this could be also 
put on the stack, but I didn't dare for 256 bytes. 

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
  2009-12-21  2:04   ` Andi Kleen
@ 2009-12-21  2:31     ` Eric W. Biederman
  2009-12-21  3:21       ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Eric W. Biederman @ 2009-12-21  2:31 UTC (permalink / raw)
  To: Andi Kleen; +Cc: linux-kernel, paulmck

Andi Kleen <andi@firstfloor.org> writes:

> On Sun, Dec 20, 2009 at 05:59:59PM -0800, Eric W. Biederman wrote:
>> Andi Kleen <andi@firstfloor.org> writes:
>> 
>> > With BKL-less sysctls most of the writable string sysctls are racy. There
>> > is no locking on the reader side, so a reader could see an inconsistent
>> > string or worse miss the terminating null and walk of beyond it.
>> 
>> The walk will only extend up to the maximum length of the string.
>> So the worst case really is inconsistent data.
>
> It could still miss the 0 byte and walk out, can't it?

Looking again.  Yes it appears there is a small vulnerability there.

The code does:

	len = strlen(data);
	if (len > maxlen)
		len = maxlen;

So we should be calling:
   len = strnlen(data, maxlen);

At which point we won't be able to walk out.

The write side appears to be in need of strnlen_user
as well, so it does not walk all of user space looking
for null byte.

>> This is an unfortunate corner case.  This is not a regression as this
>> has been the way things have worked for years.  So probably 2.6.34
>>  material.
>
> The one that's a clear regression is the core pattern one, that 
> was protected before by the BKL. A lot of others were always
> broken yes.

Nope.  The core pattern just thought it was protected by BKL.  I did
not change the /proc/sys code path to remove the BKL.  I don't know
if we ever took the BKL on the /proc/sys codepath.

I remember looking at the core pattern earlier and my memory is that
sysctl is new enough that core pattern was not protected by the BKL on the
/proc/sys path when it was introduced.

There was a lot of confusing code in the sys_sysctl code path (which
grabbed the BKL) so I expect people thought they were safe due to the
BKL when they were not.

So we have sysctl have locking problems, not new sysctl regressions.

>> > This patch kit adds a new "rcu string" variant to avoid these 
>> > problems and convers the racy users. One the writer side the strings are 
>> > always copied to new memory and the readers use rcu_read_lock()
>> > to get a stable view. For readers who access the string over
>> > sleeps the reader copies the string. 
>> 
>> I will have to look more after the holidays.  This rcu_string looks like
>> it introduces allocations on paths that did not use them before, which
>> has me wondering a bit.
>
> On the reader side about all of them allocated before, e.g. for
> call_usermodehelper.

That sounds like less of an issue.

> If the strings were made a bit smaller this could be also 
> put on the stack, but I didn't dare for 256 bytes. 

Hmm.  rcu wise that sounds wrong, but I haven't looked into your
cool new data structure yet.

Eric

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
  2009-12-21  2:31     ` Eric W. Biederman
@ 2009-12-21  3:21       ` Andi Kleen
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2009-12-21  3:21 UTC (permalink / raw)
  To: Eric W. Biederman; +Cc: Andi Kleen, linux-kernel, paulmck

> So we have sysctl have locking problems, not new sysctl regressions.

Ok.

> > If the strings were made a bit smaller this could be also 
> > put on the stack, but I didn't dare for 256 bytes. 
> 
> Hmm.  rcu wise that sounds wrong, but I haven't looked into your

What sounds wrong? 

The reason for the copies is that when the reader sleeps 
rcu_read_lock() cannot be used to protect the string completely,
so it copies instead.

The alternative would have been to use SRCU, but I didn't like that.

> cool new data structure yet.

It's not really particularly new or cool, it's just a simple RCU wrapper
around a string.

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
@ 2009-12-21  7:57 Alexey Dobriyan
  2009-12-21 10:50 ` Andi Kleen
  0 siblings, 1 reply; 7+ messages in thread
From: Alexey Dobriyan @ 2009-12-21  7:57 UTC (permalink / raw)
  To: andi; +Cc: linux-kernel, ebiederm, paulmck

> new = alloc_rcu_string(MAX_GLOBAL_SIZE, GFP_KERNEL);

I think adding failure mode is not right.
We can get away with per-sysctl-string spinlock and not introduce yet
another API.

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls
  2009-12-21  7:57 [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls Alexey Dobriyan
@ 2009-12-21 10:50 ` Andi Kleen
  0 siblings, 0 replies; 7+ messages in thread
From: Andi Kleen @ 2009-12-21 10:50 UTC (permalink / raw)
  To: Alexey Dobriyan; +Cc: andi, linux-kernel, ebiederm, paulmck

On Mon, Dec 21, 2009 at 09:57:32AM +0200, Alexey Dobriyan wrote:
> > new = alloc_rcu_string(MAX_GLOBAL_SIZE, GFP_KERNEL);
> 
> I think adding failure mode is not right.
> We can get away with per-sysctl-string spinlock and not introduce yet
> another API.

You mean when running out of memory here?

When this runs out of memory the system will be not usable anyways
and on the last legs and likely in a deadly swap storm. 

-Andi

-- 
ak@linux.intel.com -- Speaking for myself only.

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2009-12-21 10:50 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2009-12-21  7:57 [PATCH] [0/11] SYSCTL: Use RCU to avoid races with string sysctls Alexey Dobriyan
2009-12-21 10:50 ` Andi Kleen
  -- strict thread matches above, loose matches on Subject: below --
2009-12-21  1:20 Andi Kleen
2009-12-21  1:59 ` Eric W. Biederman
2009-12-21  2:04   ` Andi Kleen
2009-12-21  2:31     ` Eric W. Biederman
2009-12-21  3:21       ` Andi Kleen

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox