linux-rt-users.vger.kernel.org archive mirror
* [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper
@ 2017-11-13 17:56 Mikulas Patocka
From: Mikulas Patocka @ 2017-11-13 17:56 UTC
  To: linux-kernel
  Cc: Thomas Gleixner, Ingo Molnar, Sebastian Siewior, Steven Rostedt,
	linux-rt-users

Hi

I'm submitting this patch for the CONFIG_PREEMPT_RT patch set. It fixes
deadlocks in device mapper when real-time preemption is used.

Mikulas


From: Mikulas Patocka <mpatocka@redhat.com>

When some block device driver creates a bio and submits it to another 
block device driver, the bio is added to current->bio_list (in order to 
avoid unbounded recursion).
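
The queue-instead-of-recurse behaviour can be sketched in user space as
follows. This is only a rough model under stated assumptions: model_bio,
model_task, model_submit and model_driver are hypothetical names, not
kernel APIs, and the real submission path does considerably more.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical stand-ins; not the kernel's struct bio or task_struct. */
struct model_bio {
	struct model_bio *next;
	int levels_left;	/* lower drivers this bio still has to pass */
};

struct model_task {
	struct model_bio *head, *tail;	/* pending bios, FIFO order */
	int list_active;		/* nonzero while a submit loop runs */
	int depth, max_depth;		/* nesting of the driver callback */
};

static struct model_bio pool[16];	/* storage for child bios */
static size_t pool_used;

static void model_submit(struct model_task *t, struct model_bio *bio);

/* A stacked driver: creates a child bio and submits it one level down. */
static void model_driver(struct model_task *t, struct model_bio *bio)
{
	t->depth++;
	if (t->depth > t->max_depth)
		t->max_depth = t->depth;
	if (bio->levels_left > 0) {
		assert(pool_used < sizeof(pool) / sizeof(pool[0]));
		struct model_bio *child = &pool[pool_used++];
		child->next = NULL;
		child->levels_left = bio->levels_left - 1;
		model_submit(t, child);	/* queued, not recursed into */
	}
	t->depth--;
}

static void model_submit(struct model_task *t, struct model_bio *bio)
{
	if (t->list_active) {
		/* Already inside a submit loop: append and return. */
		bio->next = NULL;
		if (t->tail)
			t->tail->next = bio;
		else
			t->head = bio;
		t->tail = bio;
		return;
	}

	t->list_active = 1;
	while (bio) {
		model_driver(t, bio);
		/* Pop the next queued bio, if any. */
		bio = t->head;
		if (bio) {
			t->head = bio->next;
			if (!t->head)
				t->tail = NULL;
		}
	}
	t->list_active = 0;
}
```

Submitting a bio that has to pass through five lower levels creates five
child bios in this model, but model_driver is never nested more than one
level deep, because each child waits on the list until the loop picks it
up. The same property is what makes the queue a deadlock hazard: a queued
bio makes no progress while the driver that queued it is blocked.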

However, this queuing of bios can cause deadlocks. To avoid them, device
mapper registers a function, flush_current_bio_list, which is called when
the device mapper driver blocks. It redirects bios queued on
current->bio_list to helper workqueues, so that these bios can proceed
even if the driver is blocked.

The problem with CONFIG_PREEMPT_RT_FULL is that when the device mapper
driver blocks, it won't call flush_current_bio_list (because
tsk_is_pi_blocked returns true in sched_submit_work), so deadlocks in
the block device stack can happen.
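
The skipped flush can be illustrated with a small user-space model. It is
a sketch only: rt_model_task, model_flush and model_sched_submit_work are
hypothetical names that mirror just the early-return check described
above, not the kernel's actual implementation.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical model of the few task fields that matter here. */
struct rt_model_task {
	long state;		/* 0 = running; nonzero = about to sleep */
	bool pi_blocked;	/* models tsk_is_pi_blocked() being true */
	int queued;		/* plugged I/O and bios still queued */
	int flushed;		/* work handed off so it can proceed */
};

/* Models the flush: hand everything queued to a helper. */
static void model_flush(struct rt_model_task *t)
{
	assert(t->queued >= 0);
	t->flushed += t->queued;
	t->queued = 0;
}

/* Returns true if a flush happened before the task went to sleep. */
static bool model_sched_submit_work(struct rt_model_task *t)
{
	/* Early return: nothing is flushed for a PI-blocked task. */
	if (t->state == 0 || t->pi_blocked)
		return false;

	if (t->queued > 0) {
		model_flush(t);
		return true;
	}
	return false;
}
```

In this model a task that goes to sleep with pi_blocked unset has its
queued work flushed, while a task with pi_blocked set goes to sleep with
everything still queued, which is the situation the patch addresses.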

Note that we can't call blk_schedule_flush_plug if tsk_is_pi_blocked
returns true - that would cause
BUG_ON(rt_mutex_real_waiter(task->pi_blocked_on)) in
task_blocks_on_rt_mutex when flush_current_bio_list attempts to take a
spinlock.

So the proper fix is to call blk_schedule_flush_plug in rt_mutex_fastlock,
after the fast acquire has failed and before the task blocks.
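
The ordering this fix relies on can be sketched in user space as below.
The model is single-threaded and uses a plain pointer compare where the
real code uses an atomic cmpxchg; fl_task, fl_lock, model_fastlock and
model_slowlock are hypothetical names chosen for illustration.

```c
#include <assert.h>
#include <stddef.h>

struct fl_task {
	int queued;	/* I/O still queued on behalf of this task */
	int flushed;	/* I/O handed off so it can proceed */
};

struct fl_lock {
	struct fl_task *owner;	/* NULL when the lock is free */
};

static int slow_path_calls;

/* Stand-in for the slow path; a real one would block until acquired. */
static int model_slowlock(struct fl_lock *lock, struct fl_task *self)
{
	(void)lock;
	(void)self;
	slow_path_calls++;
	return 0;
}

static int model_fastlock(struct fl_lock *lock, struct fl_task *self)
{
	assert(lock != NULL && self != NULL);

	/* Fast path: uncontended, queued I/O is left alone. */
	if (lock->owner == NULL) {
		lock->owner = self;
		return 0;
	}

	/* Contended: flush queued I/O before entering the slow path. */
	if (self->queued > 0) {
		self->flushed += self->queued;
		self->queued = 0;
	}

	return model_slowlock(lock, self);
}
```

The uncontended case stays cheap because the flush check only runs once
the fast acquire has failed, so the common path pays nothing extra.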

Signed-off-by: Mikulas Patocka <mpatocka@redhat.com>

---
 kernel/locking/rtmutex.c |   13 +++++++++++++
 1 file changed, 13 insertions(+)

Index: linux-stable/kernel/locking/rtmutex.c
===================================================================
--- linux-stable.orig/kernel/locking/rtmutex.c
+++ linux-stable/kernel/locking/rtmutex.c
@@ -24,6 +24,7 @@
 #include <linux/sched/debug.h>
 #include <linux/timer.h>
 #include <linux/ww_mutex.h>
+#include <linux/blkdev.h>
 
 #include "rtmutex_common.h"
 
@@ -1939,6 +1940,15 @@ rt_mutex_fastlock(struct rt_mutex *lock,
 	if (likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	/*
+	 * If rt_mutex blocks, the function sched_submit_work will not call
+	 * blk_schedule_flush_plug (because tsk_is_pi_blocked would be true).
+	 * We must therefore call blk_schedule_flush_plug here; otherwise,
+	 * a deadlock in device mapper may happen.
+	 */
+	if (unlikely(blk_needs_flush_plug(current)))
+		blk_schedule_flush_plug(current);
+
 	return slowfn(lock, state, NULL, RT_MUTEX_MIN_CHAINWALK, ww_ctx);
 }
 
@@ -1956,6 +1966,9 @@ rt_mutex_timed_fastlock(struct rt_mutex
 	    likely(rt_mutex_cmpxchg_acquire(lock, NULL, current)))
 		return 0;
 
+	if (unlikely(blk_needs_flush_plug(current)))
+		blk_schedule_flush_plug(current);
+
 	return slowfn(lock, state, timeout, chwalk, ww_ctx);
 }
 


Thread overview: 18+ messages
2017-11-13 17:56 [PATCH PREEMPT RT] rt-mutex: fix deadlock in device mapper Mikulas Patocka
2017-11-17 14:57 ` Sebastian Siewior
2017-11-18 18:37   ` Mike Galbraith
2017-11-20 10:53     ` Sebastian Siewior
2017-11-20 12:43       ` Mike Galbraith
2017-11-20 13:49         ` Mike Galbraith
2017-11-20 21:31       ` Mikulas Patocka
2017-11-20 22:11         ` Mikulas Patocka
2017-11-20 21:33     ` Mikulas Patocka
2017-11-21  3:20       ` Mike Galbraith
2017-11-21  8:37         ` Thomas Gleixner
2017-11-21  9:18           ` Mike Galbraith
2017-11-21 16:11             ` Mikulas Patocka
2017-11-21 17:33               ` Mike Galbraith
2017-11-21 19:56                 ` Mikulas Patocka
2017-11-21 21:20                   ` Mike Galbraith
2017-11-23 14:42                     ` Sebastian Siewior
2017-11-23 14:50                       ` Mike Galbraith
