public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs
@ 2016-08-05 23:08 Bart Van Assche
  2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
                   ` (2 more replies)
  0 siblings, 3 replies; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:08 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

Hello Ingo,

This patch series addresses an issue that my SRP driver test software 
can trigger reproducibly, namely that __wait_on_bit_lock() hangs. Please 
review and apply these patches whenever this is convenient for you. The 
changes compared to the first version of this series are:
* Reworked the abort_exclusive_wait() changes.
* Added two patches, namely a patch that introduces a local variable and
   a patch (nr. 3) that eliminates again the spurious wakeups introduced
   by patch 1.

These patches have been tested on top of kernel v4.7 with kernel 
debugging enabled (detection of list corruption, lockdep, SLUB 
debugging, kmemleak, ...).

See also https://lkml.org/lkml/2016/8/3/289.

Thanks,

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* [PATCH v2 1/3] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
@ 2016-08-05 23:09 ` Bart Van Assche
  2016-08-06 21:36   ` Bart Van Assche
  2016-08-05 23:09 ` [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait() Bart Van Assche
  2016-08-05 23:10 ` [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups Bart Van Assche
  2 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

If delivery of a signal and __wake_up_common() happen concurrently
it is possible that the signal is delivered after __wake_up_common()
woke up the affected task and before bit_wait_io() checks whether a
signal is pending. Avoid that the next waiter is not woken up if this
happens. This patch fixes the following hang:

INFO: task systemd-udevd:10111 blocked for more than 480 seconds.
      Not tainted 4.7.0-dbg+ #1
Call Trace:
 [<ffffffff8161f397>] schedule+0x37/0x90
 [<ffffffff816239ef>] schedule_timeout+0x27f/0x470
 [<ffffffff8161e76f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff8161fb36>] bit_wait_io+0x16/0x60
 [<ffffffff8161f929>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff8114fe69>] __lock_page+0xb9/0xc0
 [<ffffffff81165d90>] truncate_inode_pages_range+0x3e0/0x760
 [<ffffffff81166120>] truncate_inode_pages+0x10/0x20
 [<ffffffff81212a20>] kill_bdev+0x30/0x40
 [<ffffffff81213d41>] __blkdev_put+0x71/0x360
 [<ffffffff81214079>] blkdev_put+0x49/0x170
 [<ffffffff812141c0>] blkdev_close+0x20/0x30
 [<ffffffff811d48e8>] __fput+0xe8/0x1f0
 [<ffffffff811d4a29>] ____fput+0x9/0x10
 [<ffffffff810842d3>] task_work_run+0x83/0xb0
 [<ffffffff8106606e>] do_exit+0x3ee/0xc40
 [<ffffffff8106694b>] do_group_exit+0x4b/0xc0
 [<ffffffff81073d9a>] get_signal+0x2ca/0x940
 [<ffffffff8101bf43>] do_signal+0x23/0x660
 [<ffffffff810022b3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002cb0>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff81624e33>] entry_SYSCALL_64_fastpath+0xa6/0xa8

Fixes: 777c6c5f1f6e ("wait: prevent exclusive waiter starvation")
References: https://lkml.org/lkml/2012/8/24/185
References: http://www.spinics.net/lists/raid/msg53056.html
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
Cc: <stable@vger.kernel.org> # v2.6.29+
---
 kernel/sched/wait.c | 18 +++++++++++-------
 1 file changed, 11 insertions(+), 7 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index f15d6b6..fa12939 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -266,12 +266,16 @@ EXPORT_SYMBOL(finish_wait);
  * the wait descriptor from the given waitqueue if still
  * queued.
  *
- * Wakes up the next waiter if the caller is concurrently
- * woken up through the queue.
- *
- * This prevents waiter starvation where an exclusive waiter
- * aborts and is woken up concurrently and no one wakes up
- * the next waiter.
+ * Wakes up the next waiter to prevent waiter starvation
+ * when an exclusive waiter aborts and is woken up
+ * concurrently and no one wakes up the next waiter. Note:
+ * even when a signal is pending it is possible that
+ * __wake_up_common() wakes up the current thread and hence
+ * that @wait has been removed from the wait queue @q. Hence
+ * test whether there are more waiters on the wait queue
+ * even if @wait is not on the wait queue @q. This approach
+ * will cause a spurious wakeup if a signal is delivered and
+ * no other thread calls __wake_up_common() concurrently.
  */
 void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 			unsigned int mode, void *key)
@@ -282,7 +286,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 	spin_lock_irqsave(&q->lock, flags);
 	if (!list_empty(&wait->task_list))
 		list_del_init(&wait->task_list);
-	else if (waitqueue_active(q))
+	if (waitqueue_active(q))
 		__wake_up_locked_key(q, mode, key);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait()
  2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
  2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
@ 2016-08-05 23:09 ` Bart Van Assche
  2016-08-05 23:10 ` [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups Bart Van Assche
  2 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:09 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

This patch does not change any functionality.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
---
 kernel/sched/wait.c | 10 ++++++----
 1 file changed, 6 insertions(+), 4 deletions(-)

diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index fa12939..dcc7693 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -429,18 +429,20 @@ int __sched
 __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
 			wait_bit_action_f *action, unsigned mode)
 {
+	struct wait_bit_key *const key = &q->key;
+
 	do {
 		int ret;
 
 		prepare_to_wait_exclusive(wq, &q->wait, mode);
-		if (!test_bit(q->key.bit_nr, q->key.flags))
+		if (!test_bit(key->bit_nr, key->flags))
 			continue;
-		ret = action(&q->key, mode);
+		ret = action(key, mode);
 		if (!ret)
 			continue;
-		abort_exclusive_wait(wq, &q->wait, mode, &q->key);
+		abort_exclusive_wait(wq, &q->wait, mode, key);
 		return ret;
-	} while (test_and_set_bit(q->key.bit_nr, q->key.flags));
+	} while (test_and_set_bit(key->bit_nr, key->flags));
 	finish_wait(wq, &q->wait);
 	return 0;
 }
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups
  2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
  2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
  2016-08-05 23:09 ` [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait() Bart Van Assche
@ 2016-08-05 23:10 ` Bart Van Assche
  2 siblings, 0 replies; 6+ messages in thread
From: Bart Van Assche @ 2016-08-05 23:10 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

Patch "sched: Avoid that __wait_on_bit_lock() hangs" made spurious
wakeups possible. Avoid to trigger such spurious wakeups.

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: Ingo Molnar <mingo@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Cc: Oleg Nesterov <oleg@redhat.com>
Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Johannes Weiner <hannes@cmpxchg.org>
Cc: Neil Brown <neilb@suse.de>
Cc: Michael Shaver <jmshaver@gmail.com>
---
 include/linux/wait.h |  5 +++--
 kernel/sched/wait.c  | 21 ++++++++-------------
 2 files changed, 11 insertions(+), 15 deletions(-)

diff --git a/include/linux/wait.h b/include/linux/wait.h
index 27d7a0a..5034fce 100644
--- a/include/linux/wait.h
+++ b/include/linux/wait.h
@@ -282,7 +282,7 @@ wait_queue_head_t *bit_waitqueue(void *, int);
 			__ret = __int;					\
 			if (exclusive) {				\
 				abort_exclusive_wait(&wq, &__wait,	\
-						     state, NULL);	\
+					condition, state, NULL);	\
 				goto __out;				\
 			}						\
 			break;						\
@@ -976,7 +976,8 @@ void prepare_to_wait(wait_queue_head_t *q, wait_queue_t *wait, int state);
 void prepare_to_wait_exclusive(wait_queue_head_t *q, wait_queue_t *wait, int state);
 long prepare_to_wait_event(wait_queue_head_t *q, wait_queue_t *wait, int state);
 void finish_wait(wait_queue_head_t *q, wait_queue_t *wait);
-void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait, unsigned int mode, void *key);
+void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
+			  bool wake_up_next, unsigned int mode, void *key);
 long wait_woken(wait_queue_t *wait, unsigned mode, long timeout);
 int woken_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
 int autoremove_wake_function(wait_queue_t *wait, unsigned mode, int sync, void *key);
diff --git a/kernel/sched/wait.c b/kernel/sched/wait.c
index dcc7693..c5a913f 100644
--- a/kernel/sched/wait.c
+++ b/kernel/sched/wait.c
@@ -259,6 +259,7 @@ EXPORT_SYMBOL(finish_wait);
  * abort_exclusive_wait - abort exclusive waiting in a queue
  * @q: waitqueue waited on
  * @wait: wait descriptor
+ * @wake_up_next: Whether or not to wake up another exclusive waiter
  * @mode: runstate of the waiter to be woken
  * @key: key to identify a wait bit queue or %NULL
  *
@@ -266,19 +267,12 @@ EXPORT_SYMBOL(finish_wait);
  * the wait descriptor from the given waitqueue if still
  * queued.
  *
- * Wakes up the next waiter to prevent waiter starvation
- * when an exclusive waiter aborts and is woken up
- * concurrently and no one wakes up the next waiter. Note:
- * even when a signal is pending it is possible that
- * __wake_up_common() wakes up the current thread and hence
- * that @wait has been removed from the wait queue @q. Hence
- * test whether there are more waiters on the wait queue
- * even if @wait is not on the wait queue @q. This approach
- * will cause a spurious wakeup if a signal is delivered and
- * no other thread calls __wake_up_common() concurrently.
+ * Wakes up the next waiter if @wake_up_next is set to prevent
+ * waiter starvation when an exclusive waiter aborts and is
+ * woken up concurrently and no one wakes up the next waiter.
  */
 void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
-			unsigned int mode, void *key)
+			  bool wake_up_next, unsigned int mode, void *key)
 {
 	unsigned long flags;
 
@@ -286,7 +280,7 @@ void abort_exclusive_wait(wait_queue_head_t *q, wait_queue_t *wait,
 	spin_lock_irqsave(&q->lock, flags);
 	if (!list_empty(&wait->task_list))
 		list_del_init(&wait->task_list);
-	if (waitqueue_active(q))
+	if (wake_up_next && waitqueue_active(q))
 		__wake_up_locked_key(q, mode, key);
 	spin_unlock_irqrestore(&q->lock, flags);
 }
@@ -440,7 +434,8 @@ __wait_on_bit_lock(wait_queue_head_t *wq, struct wait_bit_queue *q,
 		ret = action(key, mode);
 		if (!ret)
 			continue;
-		abort_exclusive_wait(wq, &q->wait, mode, key);
+		abort_exclusive_wait(wq, &q->wait,
+				test_bit(key->bit_nr, key->flags), mode, key);
 		return ret;
 	} while (test_and_set_bit(key->bit_nr, key->flags));
 	finish_wait(wq, &q->wait);
-- 
2.9.2

^ permalink raw reply related	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/3] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
@ 2016-08-06 21:36   ` Bart Van Assche
  2016-08-09 17:24     ` Oleg Nesterov
  0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-08-06 21:36 UTC (permalink / raw)
  To: Ingo Molnar
  Cc: Peter Zijlstra, Oleg Nesterov, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

On 08/05/16 16:09, Bart Van Assche wrote:
> If delivery of a signal and __wake_up_common() happen concurrently
> it is possible that the signal is delivered after __wake_up_common()
> woke up the affected task and before bit_wait_io() checks whether a
> signal is pending. Avoid that the next waiter is not woken up if this
> happens.

(replying to my own e-mail)

Although this patch works reliably in my tests, the above description 
does not explain the hang. I will resend this patch with a better 
description.

Bart.

^ permalink raw reply	[flat|nested] 6+ messages in thread

* Re: [PATCH v2 1/3] sched: Avoid that __wait_on_bit_lock() hangs
  2016-08-06 21:36   ` Bart Van Assche
@ 2016-08-09 17:24     ` Oleg Nesterov
  0 siblings, 0 replies; 6+ messages in thread
From: Oleg Nesterov @ 2016-08-09 17:24 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Ingo Molnar, Peter Zijlstra, akpm@linux-foundation.org,
	Johannes Weiner, Neil Brown, Michael Shaver,
	linux-kernel@vger.kernel.org

On 08/06, Bart Van Assche wrote:
>
> Although this patch works reliably in my tests, the above description
> does not explain the hang. I will resend this patch with a better
> description.

Yes please, I'll ignore this series till then ;)

See also another patch I sent a minute ago.

Oleg.

^ permalink raw reply	[flat|nested] 6+ messages in thread

end of thread, other threads:[~2016-08-09 17:25 UTC | newest]

Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-08-05 23:08 [PATCH v2 0/3] sched: Avoid that __wait_on_bit_lock() hangs Bart Van Assche
2016-08-05 23:09 ` [PATCH v2 1/3] " Bart Van Assche
2016-08-06 21:36   ` Bart Van Assche
2016-08-09 17:24     ` Oleg Nesterov
2016-08-05 23:09 ` [PATCH v2 2/3] sched: Introduce a local variable in abort_exclusive_wait() Bart Van Assche
2016-08-05 23:10 ` [PATCH v2 3/3] sched: Avoid that abort_exclusive_wait() triggers spurious wakeups Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox