* [PATCH v2 1/3] sched: Avoid that __wait_on_bit_lock() hangs
2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
@ 2016-08-05 23:09 ` Bart Van Assche
2016-08-06 21:36 ` Bart Van Assche
2016-08-05 23:09 ` [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait() Bart Van Assche
2016-08-05 23:10 ` [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups Bart Van Assche
2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
Johannes Weiner, Neil Brown, Michael Shaver,
linux-kernel@vger.kernel.org
If delivery of a signal and __wake_up_common() happen concurrently
it is possible that the signal is delivered after __wake_up_common()
woke up the affected task and before bit_wait_io() checks whether a
signal is pending. Avoid that the next waiter is not woken up if this
happens. This patch fixes the following hang:
INFO: task systemd-udevd:10111 blocked for more than 480 seconds.
Not tainted 4.7.0-dbg+ #1
Call Trace:
[<ffffffff8161f397>] schedule+0x37/0x90
[<ffffffff816239ef>] schedule_timeout+0x27f/0x470
[<ffffffff8161e76f>] io_schedule_timeout+0x9f/0x110
[<ffffffff8161fb36>] bit_wait_io+0x16/0x60
[<ffffffff8161f929>] __wait_on_bit_lock+0x49/0xa0
[<ffffffff8114fe69>] __lock_page+0xb9/0xc0
[<ffffffff81165d90>] truncate_inode_pages_range+0x3e0/0x760
[<ffffffff81166120>] truncate_inode_pages+0x10/0x20
[<ffffffff81212a20>] kill_bdev+0x30/0x40
[<ffffffff81213d41>] __blkdev_put+0x71/0x360
[<ffffffff81214079>] blkdev_put+0x49/0x170
[<ffffffff812141c0>] blkdev_close+0x20/0x30
[<ffffffff811d48e8>] __fput+0xe8/0x1f0
[<ffffffff811d4a29>] ____fput+0x9/0x10
[<ffffffff810842d3>] task_work_run+0x83/0xb0
[<ffffffff8106606e>] do_exit+0x3ee/0xc40
[<ffffffff8106694b>] do_group_exit+0x4b/0xc0
[<ffffffff81073d9a>] get_signal+0x2ca/0x940
[<ffffffff8101bf43>] do_signal+0x23/0x660
[<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
[<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
[<ffffffff81624e33>] entry_SYSCALL_64_fastpath+0xa6/0xa8
Fixes: 777c6c5f1f6e ("wait: prevent exclusive waiter starvation")
References: https://lkml.org/lkml/2012/8/24/185
References: http://www.spinics.net/lists/raid/msg53056.html
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
Cc: <stable@vger.kernel.org> # v2.6.29+
---
kernel/sched/wait.c | 18 +++++++++++-------
1 file changed, 11 insertions(+), 7 deletions(-)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f15d6b6..fa12939 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -266,12 +266,16 @@ EXPORT_SYMBOL(finish_wait);
* the wait descriptor from the given waitqueue if still
* queued.
*
- * Wakes up the next waiter if the caller is concurrently
- * woken up through the queue.
- *
- * This prevents waiter starvation where an exclusive waiter
- * aborts and is woken up concurrently and no one wakes up
- * the next waiter.
+ * Wakes up the next waiter to prevent waiter starvation
+ * when an exclusive waiter aborts and is woken up
+ * concurrently and no one wakes up the next waiter. Note:
+ * even when a signal is pending it is possible that
+ * __wake_up_common() wakes up the current thread and hence
+ * that @wait has been removed from the wait queue @q. Hence
+ * test whether there are more waiters on the wait queue
+ * even if @wait is not on the wait queue @q. This approach
+ * will cause a spurious wakeup if a signal is delivered and
+ * no other thread calls __wake_up_common() concurrently.
*/
void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
unsigned int mode, void *key)
@@ -282,7 +286,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
spin_lock_irqsave(&q->lock, flags);
if (!list_empty(&wait->task_list))
list_del_init(&wait->task_list);
- else if (waitqueue_active(q))
+ if (waitqueue_active(q))
__wake_up_locked_key(q, mode, key);
spin_unlock_irqrestore(&q->lock, flags);
}
--
2.9.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait()
2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
@ 2016-08-05 23:09 ` Bart Van Assche
2016-08-05 23:10 ` [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups Bart Van Assche
2 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:09 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
Johannes Weiner, Neil Brown, Michael Shaver,
linux-kernel@vger.kernel.org
This patch does not change any functionality.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
---
kernel/sched/wait.c | 10 ++++++----
1 file changed, 6 insertions(+), 4 deletions(-)
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index fa12939..dcc7693 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -429,18 +429,20 @@ int __sched
__wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
wait_bit_action_f *action, unsigned mode)
{
+ struct wait_bit_key *const key = &q->key;
+
do {
int ret;
prepare_to_wait_exclusive(wq, &q->wait, mode);
- if (!test_bit(q->key.bit_nr, q->key.flags))
+ if (!test_bit(key->bit_nr, key->flags))
continue;
- ret = action(&q->key, mode);
+ ret = action(key, mode);
if (!ret)
continue;
- abort_exclusive_wait(wq, &q->wait, mode, &q->key);
+ abort_exclusive_wait(wq, &q->wait, mode, key);
return ret;
- } while (test_and_set_bit(q->key.bit_nr, q->key.flags));
+ } while (test_and_set_bit(key->bit_nr, key->flags));
finish_wait(wq, &q->wait);
return 0;
}
--
2.9.2
^ permalink raw reply related [flat|nested] 6+ messages in thread* [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups
2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
2016-08-05 23:09 ` [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait() Bart Van Assche
@ 2016-08-05 23:10 ` Bart Van Assche
2 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:10 UTC (permalink / raw)
To: Ingo Molnar
Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
Johannes Weiner, Neil Brown, Michael Shaver,
linux-kernel@vger.kernel.org
Patch "sched: Avoid that __wait_on_bit_lock() hangs" made spurious
wakeups possible. Avoid to trigger such spurious wakeups.
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
---
include/linux/wait.h | 5 +++--
kernel/sched/wait.c | 21 ++++++++-------------
2 files changed, 11 insertions(+), 15 deletions(-)
diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..5034fce 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -282,7 +282,7 @@ wait_queue_head_t *bit_waitqueue(void *, int);
__ret = __int; \
if (exclusive) { \
abort_exclusive_wait(&wq, &__wait, \
- state, NULL); \
+ condition, state, NULL); \
goto __out; \
} \
break; \
@@ -976,7 +976,8 @@ void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state);
void prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state);
long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state);
void finish_wait(wait_queue_head_t *q, wait_queue_t *wait);
-void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait, unsigned int mode, void *key);
+void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
+ bool wake_up_next, unsigned int mode, void *key);
long wait_woken(wait_queue_t *wait, unsigned mode, long timeout);
int woken_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index dcc7693..c5a913f 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -259,6 +259,7 @@ EXPORT_SYMBOL(finish_wait);
* abort_exclusive_wait - abort exclusive waiting in a queue
* @q: waitqueue waited on
* @wait: wait descriptor
+ * @wake_up_next: Whether or not to wake up another exclusive waiter
* @mode: runstate of the waiter to be woken
* @key: key to identify a wait bit queue or %NULL
*
@@ -266,19 +267,12 @@ EXPORT_SYMBOL(finish_wait);
* the wait descriptor from the given waitqueue if still
* queued.
*
- * Wakes up the next waiter to prevent waiter starvation
- * when an exclusive waiter aborts and is woken up
- * concurrently and no one wakes up the next waiter. Note:
- * even when a signal is pending it is possible that
- * __wake_up_common() wakes up the current thread and hence
- * that @wait has been removed from the wait queue @q. Hence
- * test whether there are more waiters on the wait queue
- * even if @wait is not on the wait queue @q. This approach
- * will cause a spurious wakeup if a signal is delivered and
- * no other thread calls __wake_up_common() concurrently.
+ * Wakes up the next waiter if @wake_up_next is set to prevent
+ * waiter starvation when an exclusive waiter aborts and is
+ * woken up concurrently and no one wakes up the next waiter.
*/
void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
- unsigned int mode, void *key)
+ bool wake_up_next, unsigned int mode, void *key)
{
unsigned long flags;
@@ -286,7 +280,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
spin_lock_irqsave(&q->lock, flags);
if (!list_empty(&wait->task_list))
list_del_init(&wait->task_list);
- if (waitqueue_active(q))
+ if (wake_up_next && waitqueue_active(q))
__wake_up_locked_key(q, mode, key);
spin_unlock_irqrestore(&q->lock, flags);
}
@@ -440,7 +434,8 @@ __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
ret = action(key, mode);
if (!ret)
continue;
- abort_exclusive_wait(wq, &q->wait, mode, key);
+ abort_exclusive_wait(wq, &q->wait,
+ test_bit(key->bit_nr, key->flags), mode, key);
return ret;
} while (test_and_set_bit(key->bit_nr, key->flags));
finish_wait(wq, &q->wait);
--
2.9.2
^ permalink raw reply related [flat|nested] 6+ messages in thread