* [Patch v2] rwsem: fix rwsem_is_locked() bugs
@ 2009-10-05 6:36 Amerigo Wang
From: Amerigo Wang @ 2009-10-05 6:36 UTC (permalink / raw)
To: linux-kernel
Cc: Brian Behlendorf, David Howells, Ben Woodard, Amerigo Wang,
Stable Team, akpm
rwsem_is_locked() tests ->activity without taking any locks, so we should
always keep ->activity consistent. However, the code in __rwsem_do_wake()
breaks this rule: it updates ->activity only after _all_ readers have been
woken up, so a woken reader may observe a stale ->activity value, which
makes rwsem_is_locked() return the wrong answer.
Quote from Andrew:
"
- we have one or more processes sleeping in down_read(), waiting for access.
- we wake one or more processes up without altering ->activity
- they start to run and they do rwsem_is_locked(). This incorrectly
returns "false", because the waker process is still crunching away in
__rwsem_do_wake().
- the waker now alters ->activity, but it was too late.
And the patch fixes this by updating ->activity prior to waking the
sleeping processes. So when they run, they'll see a non-zero value of
->activity.
"
Also, we have more problems, as pointed out by David:
"... the case where the active readers run out, but there's a
writer on the queue (see __up_read()), nor the case where the active writer
ends, but there's a waiter on the queue (see __up_write()). In both cases,
the lock is still held, though sem->activity is 0."
This patch fixes these cases too.
David also said we may have "the potential to cause more cacheline ping-pong
under contention", but "this change shouldn't cause a significant slowdown."
With this patch applied, I can't trigger that bug any more.
Reported-by: Brian Behlendorf <behlendorf1@llnl.gov>
Cc: Ben Woodard <bwoodard@llnl.gov>
Cc: David Howells <dhowells@redhat.com>
Signed-off-by: WANG Cong <amwang@redhat.com>
Cc: Stable Team <stable@kernel.org>
---
diff --git a/include/linux/rwsem-spinlock.h b/include/linux/rwsem-spinlock.h
index 6c3c0f6..1395bb6 100644
--- a/include/linux/rwsem-spinlock.h
+++ b/include/linux/rwsem-spinlock.h
@@ -71,7 +71,7 @@ extern void __downgrade_write(struct rw_semaphore *sem);
static inline int rwsem_is_locked(struct rw_semaphore *sem)
{
- return (sem->activity != 0);
+ return !(sem->activity == 0 && list_empty(&sem->wait_list));
}
#endif /* __KERNEL__ */
diff --git a/lib/rwsem-spinlock.c b/lib/rwsem-spinlock.c
index 9df3ca5..44e4484 100644
--- a/lib/rwsem-spinlock.c
+++ b/lib/rwsem-spinlock.c
@@ -49,7 +49,6 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
{
struct rwsem_waiter *waiter;
struct task_struct *tsk;
- int woken;
waiter = list_entry(sem->wait_list.next, struct rwsem_waiter, list);
@@ -78,24 +77,21 @@ __rwsem_do_wake(struct rw_semaphore *sem, int wakewrite)
/* grant an infinite number of read locks to the front of the queue */
dont_wake_writers:
- woken = 0;
while (waiter->flags & RWSEM_WAITING_FOR_READ) {
struct list_head *next = waiter->list.next;
+ sem->activity++;
list_del(&waiter->list);
tsk = waiter->task;
smp_mb();
waiter->task = NULL;
wake_up_process(tsk);
put_task_struct(tsk);
- woken++;
if (list_empty(&sem->wait_list))
break;
waiter = list_entry(next, struct rwsem_waiter, list);
}
- sem->activity += woken;
-
out:
return sem;
}
* Re: [Patch v2] rwsem: fix rwsem_is_locked() bugs
From: David Howells @ 2009-10-05 13:13 UTC (permalink / raw)
To: Amerigo Wang
Cc: dhowells, linux-kernel, Brian Behlendorf, Ben Woodard,
Stable Team, akpm
Amerigo Wang <amwang@redhat.com> wrote:
> - return (sem->activity != 0);
> + return !(sem->activity == 0 && list_empty(&sem->wait_list));
This needs to be done in the opposite order with an smp_rmb() between[*], I
think, because someone releasing the lock will first reduce ->activity to
zero, and then attempt to empty the list, so with your altered code as it
stands, you can get:
CPU 1 CPU 2
=============================== ===============================
[sem is read locked, 1 queued writer]
-->up_read()
sem->activity-- -->rwsem_is_locked()
[sem->activity now 0] sem->activity == 0 [true]
<interrupt>
-->__rwsem_do_wake()
sem->activity = -1
[sem->activity now !=0]
list_del()
[sem->wait_list now empty] </interrupt>
list_empty(&sem->wait_list) [true]
wake_up_process()
<--__rwsem_do_wake()
<--up_read()
[sem is write locked] return false [ie. sem is not locked]
In fact, I don't think even swapping things around addresses the problem. You
do not prevent the state inside the sem changing under you whilst you try to
interpret it.
[*] there would also need to be an smp_wmb() between the update of
sem->activity and the deletion from sem->wait_list to balance out the
smp_rmb().
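The barrier pairing sketched in [*] might look like this in user space, using C11 fences as stand-ins for smp_wmb()/smp_rmb() (hypothetical names; and, as noted above, ordering alone still does not make the check race-free):

```c
#include <stdatomic.h>
#include <stdbool.h>

/* Sketch of the barrier pairing: the waker publishes ->activity
 * before emptying the list; the observer reads in the opposite
 * order, with matching fences between the accesses. */
struct toy_rwsem {
	atomic_int activity;
	atomic_bool wait_list_empty;
};

/* Waker side (~__rwsem_do_wake granting a write lock). */
static void waker_grant_write(struct toy_rwsem *sem)
{
	atomic_store_explicit(&sem->activity, -1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);	/* ~smp_wmb() */
	atomic_store_explicit(&sem->wait_list_empty, true,
			      memory_order_relaxed);	/* ~list_del() */
}

/* Observer side: check the list first, then ->activity, with a read
 * barrier in between (the "opposite order" above).  If the observer
 * sees the list already empty, the fence guarantees it also sees the
 * nonzero ->activity written before the list_del(). */
static bool observer_is_locked(struct toy_rwsem *sem)
{
	bool empty = atomic_load_explicit(&sem->wait_list_empty,
					  memory_order_relaxed);
	atomic_thread_fence(memory_order_acquire);	/* ~smp_rmb() */
	int activity = atomic_load_explicit(&sem->activity,
					    memory_order_relaxed);
	return !(activity == 0 && empty);
}
```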
David
* Re: [Patch v2] rwsem: fix rwsem_is_locked() bugs
From: Amerigo Wang @ 2009-10-06 7:02 UTC (permalink / raw)
To: David Howells
Cc: linux-kernel, Brian Behlendorf, Ben Woodard, Stable Team, akpm
David Howells wrote:
> Amerigo Wang <amwang@redhat.com> wrote:
>
>> - return (sem->activity != 0);
>> + return !(sem->activity == 0 && list_empty(&sem->wait_list));
>
> This needs to be done in the opposite order with an smp_rmb() between[*], I
> think, because someone releasing the lock will first reduce ->activity to
> zero, and then attempt to empty the list, so with your altered code as it
> stands, you can get:
>
> CPU 1 CPU 2
> =============================== ===============================
> [sem is read locked, 1 queued writer]
> -->up_read()
> sem->activity-- -->rwsem_is_locked()
> [sem->activity now 0] sem->activity == 0 [true]
> <interrupt>
> -->__rwsem_do_wake()
> sem->activity = -1
> [sem->activity now !=0]
> list_del()
> [sem->wait_list now empty] </interrupt>
> list_empty(&sem->wait_list) [true]
> wake_up_process()
> <--__rwsem_do_wake()
> <--up_read()
> [sem is write locked] return false [ie. sem is not locked]
>
> In fact, I don't think even swapping things around addresses the problem. You
> do not prevent the state inside the sem changing under you whilst you try to
> interpret it.
Hmm, right. I think we have to disable IRQs and preemption here, so
spin_trylock_irq() is probably a good choice.
And if we hold a lock, we don't need memory barriers any more, right?
I just sent out the updated patch.
Thanks!
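The trylock approach might look like the following user-space sketch (hypothetical names, with pthread_mutex_trylock() standing in for spin_trylock_irq() on sem->wait_lock): if the lock cannot be taken, someone is manipulating the sem right now, which itself means it is not free.

```c
#include <pthread.h>
#include <stdbool.h>

/* Toy model: wait_lock guards both counters, as sem->wait_lock
 * guards ->activity and ->wait_list in the real rwsem-spinlock. */
struct toy_rwsem {
	pthread_mutex_t wait_lock;
	int activity;
	int nr_waiters;
};

static bool toy_is_locked(struct toy_rwsem *sem)
{
	bool ret;

	/* Failing to get wait_lock means someone is updating the sem
	 * right now: conservatively report it as locked. */
	if (pthread_mutex_trylock(&sem->wait_lock) != 0)
		return true;

	/* Under wait_lock the state cannot change beneath us, so no
	 * memory barriers are needed here. */
	ret = !(sem->activity == 0 && sem->nr_waiters == 0);
	pthread_mutex_unlock(&sem->wait_lock);
	return ret;
}
```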
* Re: [Patch v2] rwsem: fix rwsem_is_locked() bugs
From: David Howells @ 2009-10-06 7:18 UTC (permalink / raw)
To: Amerigo Wang
Cc: dhowells, linux-kernel, Brian Behlendorf, Ben Woodard,
Stable Team, akpm
Amerigo Wang <amwang@redhat.com> wrote:
> Since if we have locks, we don't need memory barriers any more, right?
Indeed - locks are implicit memory barriers.
David