* [Patch] rwsem: fix rwsem_is_locked() bug
@ 2009-09-30 3:19 Amerigo Wang
From: Amerigo Wang @ 2009-09-30 3:19 UTC
To: linux-kernel
Cc: Brian Behlendorf, David Howells, Ben Woodard, Amerigo Wang,
Stable Team, akpm
rwsem_is_locked() tests ->activity without taking any locks, so we
should always keep ->activity consistent. However, the code in
__rwsem_do_wake() breaks this rule: it updates ->activity only after
_all_ readers have been woken up, so a newly woken reader may see a
stale ->activity value, causing rwsem_is_locked() to return the wrong
answer.
Brian has a kernel module that reproduces this; I can include it
if anyone needs it, with Brian's approval of course.
With this patch applied, I can't trigger that bug any more.
Reported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Cc: Ben Woodard <bwoodard@llnl.gov>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Stable Team <stable@kernel.org>
---
diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
index 9df3ca5..44e4484 100644
--- a/lib/rwsem-spinlock.c
+++ b/lib/rwsem-spinlock.c
@@ -49,7 +49,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
{
struct rwsem_waiter *waiter;
struct task_struct *tsk;
- int woken;
waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
@@ -78,24 +77,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
/* grant an infinite number of read locks to the front of the queue */
dont_wake_writers:
- woken = 0;
while (waiter->flags & RWSEM_WAITING_FOR_READ) {
struct list_head *next = waiter->list.next;
+ sem->activity++;
list_del(&waiter->list);
tsk = waiter->task;
smp_mb();
waiter->task = NULL;
wake_up_process(tsk);
put_task_struct(tsk);
- woken++;
if (list_empty(&sem->wait_list))
break;
waiter = list_entry(next, struct rwsem_waiter, list);
}
- sem->activity += woken;
-
out:
return sem;
}
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: Andrew Morton @ 2009-09-30 23:08 UTC
To: Amerigo Wang
Cc: linux-kernel, behlendorf1, dhowells, bwoodard, amwang, stable
On Tue, 29 Sep 2009 23:19:02 -0400
Amerigo Wang <amwang@redhat.com> wrote:
>
> rwsem_is_locked() tests ->activity without taking any locks, so we
> should always keep ->activity consistent. However, the code in
> __rwsem_do_wake() breaks this rule: it updates ->activity only after
> _all_ readers have been woken up, so a newly woken reader may see a
> stale ->activity value, causing rwsem_is_locked() to return the wrong
> answer.
>
> Brian has a kernel module that reproduces this; I can include it
> if anyone needs it, with Brian's approval of course.
>
> With this patch applied, I can't trigger that bug any more.
>
Changelog doesn't describe the bug well.
>
> ---
> diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
> index 9df3ca5..44e4484 100644
> --- a/lib/rwsem-spinlock.c
> +++ b/lib/rwsem-spinlock.c
> @@ -49,7 +49,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
> {
> struct rwsem_waiter *waiter;
> struct task_struct *tsk;
> - int woken;
>
> waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
>
> @@ -78,24 +77,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>
> /* grant an infinite number of read locks to the front of the queue */
> dont_wake_writers:
> - woken = 0;
> while (waiter->flags & RWSEM_WAITING_FOR_READ) {
> struct list_head *next = waiter->list.next;
>
> + sem->activity++;
> list_del(&waiter->list);
> tsk = waiter->task;
> smp_mb();
> waiter->task = NULL;
> wake_up_process(tsk);
> put_task_struct(tsk);
> - woken++;
> if (list_empty(&sem->wait_list))
> break;
> waiter = list_entry(next, struct rwsem_waiter, list);
> }
>
> - sem->activity += woken;
> -
> out:
> return sem;
> }
So if I understand this correctly
- we have one or more processes sleeping in down_read(), waiting for access.
- we wake one or more processes up without altering ->activity
- they start to run and they do rwsem_is_locked(). This incorrectly
returns "false", because the waker process is still crunching away in
__rwsem_do_wake().
- the waker now alters ->activity, but it was too late.
And the patch fixes this by updating ->activity prior to waking the
sleeping processes. So when they run, they'll see a non-zero value of
->activity.
Fair enough, I guess.
I don't know if we really need this in -stable. Do we expect that
there will be any real runtime bugs arising from this?
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: David Howells @ 2009-10-01 12:34 UTC
To: Amerigo Wang
Cc: dhowells, linux-kernel, Brian Behlendorf, Ben Woodard,
Stable Team, akpm
Amerigo Wang <amwang@redhat.com> wrote:
> rwsem_is_locked() tests ->activity without taking any locks, so we
> should always keep ->activity consistent. However, the code in
> __rwsem_do_wake() breaks this rule: it updates ->activity only after
> _all_ readers have been woken up, so a newly woken reader may see a
> stale ->activity value, causing rwsem_is_locked() to return the wrong
> answer.
NAK.
This does not fix the case where the active readers run out, but there's a
writer on the queue (see __up_read()), nor the case where the active writer
ends, but there's a waiter on the queue (see __up_write()). In both cases,
the lock is still held, though sem->activity is 0.
I'm leery of endorsing the presence of rwsem_is_locked() since, unless the
function calling it knows that the process it is running in has the rwsem
locked, the value is obsolete the moment the test has been performed.
The other problem with this change is that it has the potential to cause more
cacheline ping-pong under contention. That said, contention on an rwsem is
much less likely, I think, than on, say, a spinlock, so this change shouldn't
cause a significant slowdown.
Your patch would probably be better as:
- woken = 0;
+ woken = ++sem->activity;
while (waiter->flags & RWSEM_WAITING_FOR_READ) {
struct list_head *next = waiter->list.next;
list_del(&waiter->list);
tsk = waiter->task;
smp_mb();
waiter->task = NULL;
wake_up_process(tsk);
put_task_struct(tsk);
woken++;
if (list_empty(&sem->wait_list))
break;
waiter = list_entry(next, struct rwsem_waiter, list);
}
- sem->activity += woken;
+ sem->activity = woken;
However, as I said above, that is not sufficient. You really do need to put
spinlocks in rwsem_is_locked():
static inline int rwsem_is_locked(struct rw_semaphore *sem)
{
unsigned long flags;
__s32 activity;
spin_lock_irqsave(&sem->wait_lock, flags);
activity = sem->activity;
spin_unlock_irqrestore(&sem->wait_lock, flags);
return activity != 0;
}
You also need to check over lib/rwsem.c. rwsem_is_locked() is unreliable for
that algorithm.
David
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: Amerigo Wang @ 2009-10-05 3:23 UTC
To: Andrew Morton; +Cc: linux-kernel, behlendorf1, dhowells, bwoodard, stable
Andrew Morton wrote:
> On Tue, 29 Sep 2009 23:19:02 -0400
> Amerigo Wang <amwang@redhat.com> wrote:
>
>> rwsem_is_locked() tests ->activity without taking any locks, so we
>> should always keep ->activity consistent. However, the code in
>> __rwsem_do_wake() breaks this rule: it updates ->activity only after
>> _all_ readers have been woken up, so a newly woken reader may see a
>> stale ->activity value, causing rwsem_is_locked() to return the wrong
>> answer.
>>
>> Brian has a kernel module that reproduces this; I can include it
>> if anyone needs it, with Brian's approval of course.
>>
>> With this patch applied, I can't trigger that bug any more.
>>
>
> Changelog doesn't describe the bug well.
Sorry for my English. :-/
>
>> ---
>> diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
>> index 9df3ca5..44e4484 100644
>> --- a/lib/rwsem-spinlock.c
>> +++ b/lib/rwsem-spinlock.c
>> @@ -49,7 +49,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>> {
>> struct rwsem_waiter *waiter;
>> struct task_struct *tsk;
>> - int woken;
>>
>> waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
>>
>> @@ -78,24 +77,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
>>
>> /* grant an infinite number of read locks to the front of the queue */
>> dont_wake_writers:
>> - woken = 0;
>> while (waiter->flags & RWSEM_WAITING_FOR_READ) {
>> struct list_head *next = waiter->list.next;
>>
>> + sem->activity++;
>> list_del(&waiter->list);
>> tsk = waiter->task;
>> smp_mb();
>> waiter->task = NULL;
>> wake_up_process(tsk);
>> put_task_struct(tsk);
>> - woken++;
>> if (list_empty(&sem->wait_list))
>> break;
>> waiter = list_entry(next, struct rwsem_waiter, list);
>> }
>>
>> - sem->activity += woken;
>> -
>> out:
>> return sem;
>> }
>
> So if I understand this correctly
>
> - we have one or more processes sleeping in down_read(), waiting for access.
>
> - we wake one or more processes up without altering ->activity
>
> - they start to run and they do rwsem_is_locked(). This incorrectly
> returns "false", because the waker process is still crunching away in
> __rwsem_do_wake().
>
> - the waker now alters ->activity, but it was too late.
>
> And the patch fixes this by updating ->activity prior to waking the
> sleeping processes. So when they run, they'll see a non-zero value of
> ->activity.
>
> Fair enough, I guess.
Yes, exactly.
But after reading David's comments, I realized that rwsem_is_locked()
has more problems; this only fixes one of them.
I will try another fix.
>
> I don't know if we really need this in -stable. Do we expect that
> there will be any real runtime bugs arising from this?
Not sure; I needed an extra kernel module to trigger this bug,
so it probably doesn't affect real-world kernel use.
Thanks!
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: Amerigo Wang @ 2009-10-05 3:26 UTC
To: David Howells
Cc: linux-kernel, Brian Behlendorf, Ben Woodard, Stable Team, akpm
David Howells wrote:
> Amerigo Wang <amwang@redhat.com> wrote:
>
>> rwsem_is_locked() tests ->activity without taking any locks, so we
>> should always keep ->activity consistent. However, the code in
>> __rwsem_do_wake() breaks this rule: it updates ->activity only after
>> _all_ readers have been woken up, so a newly woken reader may see a
>> stale ->activity value, causing rwsem_is_locked() to return the wrong
>> answer.
>
> NAK.
>
> This does not fix the case where the active readers run out, but there's a
> writer on the queue (see __up_read()), nor the case where the active writer
> ends, but there's a waiter on the queue (see __up_write()). In both cases,
> the lock is still held, though sem->activity is 0.
Hmm, so the algorithm used in rwsem_is_locked() is not right. :-/
>
> I'm leery of endorsing the presence of rwsem_is_locked() since, unless the
> function calling it knows that the process it is running in has the rwsem
> locked, the value is obsolete the moment the test has been performed.
>
> The other problem with this change is that it has the potential to cause more
> cacheline ping-pong under contention. That said, contention on an rwsem is
> much less likely, I think, than on, say, a spinlock, so this change shouldn't
> cause a significant slowdown.
>
> Your patch would probably be better as:
>
> - woken = 0;
> + woken = ++sem->activity;
> while (waiter->flags & RWSEM_WAITING_FOR_READ) {
> struct list_head *next = waiter->list.next;
>
> list_del(&waiter->list);
> tsk = waiter->task;
> smp_mb();
> waiter->task = NULL;
> wake_up_process(tsk);
> put_task_struct(tsk);
> woken++;
> if (list_empty(&sem->wait_list))
> break;
> waiter = list_entry(next, struct rwsem_waiter, list);
> }
>
> - sem->activity += woken;
> + sem->activity = woken;
>
> However, as I said above, that is not sufficient. You really do need to put
> spinlocks in rwsem_is_locked():
>
> static inline int rwsem_is_locked(struct rw_semaphore *sem)
> {
> unsigned long flags;
> __s32 activity;
>
> spin_lock_irqsave(&sem->wait_lock, flags);
> activity = sem->activity;
> spin_unlock_irqrestore(&sem->wait_lock, flags);
> return activity != 0;
> }
Sure, adding spinlocks can solve this, but that would be expensive,
wouldn't it?
>
> You also need to check over lib/rwsem.c. rwsem_is_locked() is unreliable for
> that algorithm.
Yeah, I agree, I will try another fix.
Thank you!
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: Amerigo Wang @ 2009-10-05 6:30 UTC
To: David Howells
Cc: linux-kernel, Brian Behlendorf, Ben Woodard, Stable Team, akpm
David Howells wrote:
>
> Your patch would probably be better as:
>
> - woken = 0;
> + woken = ++sem->activity;
> while (waiter->flags & RWSEM_WAITING_FOR_READ) {
> struct list_head *next = waiter->list.next;
>
> list_del(&waiter->list);
> tsk = waiter->task;
> smp_mb();
> waiter->task = NULL;
> wake_up_process(tsk);
> put_task_struct(tsk);
> woken++;
> if (list_empty(&sem->wait_list))
> break;
> waiter = list_entry(next, struct rwsem_waiter, list);
> }
>
> - sem->activity += woken;
> + sem->activity = woken;
FYI, this seems wrong, probably even worse than the current code.
I tried this change and it breaks portmap.
* Re: [Patch] rwsem: fix rwsem_is_locked() bug
From: David Howells @ 2009-10-05 12:58 UTC
To: Amerigo Wang
Cc: dhowells, linux-kernel, Brian Behlendorf, Ben Woodard,
Stable Team, akpm
Amerigo Wang <amwang@redhat.com> wrote:
> FYI, this seems wrong, probably even worse than the current code.
> I tried this change and it breaks portmap.
Sorry, my mistake. The following:
> > - woken = 0;
> > + woken = ++sem->activity;
should have been:
- woken = 0;
+ woken = sem->activity++;
David