dm-devel.redhat.com archive mirror
* [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock
@ 2016-06-28  9:07 Bart Van Assche
  2016-06-28 18:15 ` Mike Snitzer
From: Bart Van Assche @ 2016-06-28  9:07 UTC
  To: Mike Snitzer; +Cc: device-mapper development

Prevent a deadlock when an ioctl is submitted to a dm device while an
underlying block device is being removed. If the deadlock occurs,
SysRq-w reports the following call traces:

sysrq: SysRq : Show Blocked State
  task                        PC stack   pid father
systemd-udevd   D ffff8803683f7878     0  6684    494 0x00000006
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
 [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
 [<ffffffff815ad786>] bit_wait_io+0x16/0x60
 [<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0
 [<ffffffff8111b3d6>] __lock_page+0xb6/0xc0
 [<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790
 [<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20
 [<ffffffff811d6ef0>] kill_bdev+0x30/0x40
 [<ffffffff811d8201>] __blkdev_put+0x71/0x360
 [<ffffffff811d8539>] blkdev_put+0x49/0x170
 [<ffffffff811d8680>] blkdev_close+0x20/0x30
 [<ffffffff8119e338>] __fput+0xe8/0x1f0
 [<ffffffff8119e479>] ____fput+0x9/0x10
 [<ffffffff8107876c>] task_work_run+0x7c/0xb0
 [<ffffffff8105d047>] do_exit+0x3b7/0xb10
 [<ffffffff8105d82b>] do_group_exit+0x4b/0xc0
 [<ffffffff81068f25>] get_signal+0x1c5/0x7f0
 [<ffffffff8101a1a3>] do_signal+0x23/0x700
 [<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0
 [<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0
 [<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac
systemd-udevd   D ffff880062613cd8     0  6767    494 0x00000000
Call Trace:
 [<ffffffff815acf97>] schedule+0x37/0x90
 [<ffffffff815b1487>] schedule_timeout+0x157/0x230
 [<ffffffff810c0d33>] msleep+0x33/0x40
 [<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod]
 [<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod]
 [<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980
 [<ffffffff811d79b8>] block_ioctl+0x38/0x40
 [<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660
 [<ffffffff811b036c>] SyS_ioctl+0x3c/0x70
 [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac

Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org>
---
 drivers/md/dm.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b2f962..f3564e1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -630,7 +630,8 @@ retry:
 
 out:
 	dm_put_live_table(md, srcu_idx);
-	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
+	if (r == -ENOTCONN && !fatal_signal_pending(current) &&
+	    !blk_queue_dying(bdev_get_queue(*bdev))) {
 		msleep(10);
 		goto retry;
 	}
-- 
2.8.4

* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
  2016-06-28  9:07 [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock Bart Van Assche
@ 2016-06-28 18:15 ` Mike Snitzer
  2016-06-28 18:29   ` Bart Van Assche
From: Mike Snitzer @ 2016-06-28 18:15 UTC
  To: Bart Van Assche; +Cc: device-mapper development

On Tue, Jun 28 2016 at  5:07am -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> Prevent a deadlock when an ioctl is submitted to a dm device while an
> underlying block device is being removed. If the deadlock occurs,
> SysRq-w reports the following call traces:
> 
> sysrq: SysRq : Show Blocked State
>   task                        PC stack   pid father
> systemd-udevd   D ffff8803683f7878     0  6684    494 0x00000006
> Call Trace:
>  [<ffffffff815acf97>] schedule+0x37/0x90
>  [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
>  [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
>  [<ffffffff815ad786>] bit_wait_io+0x16/0x60
>  [<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0
>  [<ffffffff8111b3d6>] __lock_page+0xb6/0xc0
>  [<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790
>  [<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20
>  [<ffffffff811d6ef0>] kill_bdev+0x30/0x40
>  [<ffffffff811d8201>] __blkdev_put+0x71/0x360
>  [<ffffffff811d8539>] blkdev_put+0x49/0x170
>  [<ffffffff811d8680>] blkdev_close+0x20/0x30
>  [<ffffffff8119e338>] __fput+0xe8/0x1f0
>  [<ffffffff8119e479>] ____fput+0x9/0x10
>  [<ffffffff8107876c>] task_work_run+0x7c/0xb0
>  [<ffffffff8105d047>] do_exit+0x3b7/0xb10
>  [<ffffffff8105d82b>] do_group_exit+0x4b/0xc0
>  [<ffffffff81068f25>] get_signal+0x1c5/0x7f0
>  [<ffffffff8101a1a3>] do_signal+0x23/0x700
>  [<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0
>  [<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0
>  [<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac
> systemd-udevd   D ffff880062613cd8     0  6767    494 0x00000000
> Call Trace:
>  [<ffffffff815acf97>] schedule+0x37/0x90
>  [<ffffffff815b1487>] schedule_timeout+0x157/0x230
>  [<ffffffff810c0d33>] msleep+0x33/0x40
>  [<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod]
>  [<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod]
>  [<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980
>  [<ffffffff811d79b8>] block_ioctl+0x38/0x40
>  [<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660
>  [<ffffffff811b036c>] SyS_ioctl+0x3c/0x70
>  [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
> 
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: <stable@vger.kernel.org>
> ---
>  drivers/md/dm.c | 3 ++-
>  1 file changed, 2 insertions(+), 1 deletion(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 1b2f962..f3564e1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -630,7 +630,8 @@ retry:
>  
>  out:
>  	dm_put_live_table(md, srcu_idx);
> -	if (r == -ENOTCONN && !fatal_signal_pending(current)) {
> +	if (r == -ENOTCONN && !fatal_signal_pending(current) &&
> +	    !blk_queue_dying(bdev_get_queue(*bdev))) {
>  		msleep(10);
>  		goto retry;
>  	}
> -- 
> 2.8.4

Hi Bart,

This patch doesn't make sense.

In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
valid if r == 0.  But r == -ENOTCONN so how can *bdev be valid?

Mike
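
The objection above can be illustrated with a small userspace sketch.
It is not kernel code: struct fake_bdev, prepare_ioctl() and
bdev_is_usable() are hypothetical stand-ins, and the only assumption
carried over from the thread is that the out-parameter is written
solely when the return value is 0.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical stand-in for a block device; not the kernel's struct. */
struct fake_bdev {
	int queue_dying;
};

/*
 * Hypothetical model of a prepare_ioctl hook: *bdev is written only on
 * success. On -ENOTCONN the caller's pointer is left untouched.
 */
static int prepare_ioctl(struct fake_bdev *path, struct fake_bdev **bdev)
{
	assert(bdev != NULL);
	if (!path)
		return -ENOTCONN;	/* no usable path: *bdev not set */
	*bdev = path;
	return 0;
}

/* Returns 1 only when the caller is allowed to look at bdev. */
static int bdev_is_usable(int r, const struct fake_bdev *bdev)
{
	return r == 0 && bdev != NULL;
}
```

In this model a caller that sees r == -ENOTCONN has nothing valid to
pass to a helper such as bdev_get_queue(), which is why testing the
queue state inside the retry condition cannot work as written.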

* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
  2016-06-28 18:15 ` Mike Snitzer
@ 2016-06-28 18:29   ` Bart Van Assche
  2016-06-28 18:59     ` Mike Snitzer
From: Bart Van Assche @ 2016-06-28 18:29 UTC
  To: Mike Snitzer; +Cc: device-mapper development

On 06/28/2016 08:15 PM, Mike Snitzer wrote:
> This patch doesn't make sense.
> 
> In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
> valid if r == 0.  But r == -ENOTCONN so how can *bdev be valid?

Sorry, but the dm code is not my area of expertise. How about the patch
below? Please note that so far only the queue-length path selector has
been tested.

Thanks,

Bart.

diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-length.c
index 23f1786..4c36648 100644
--- a/drivers/md/dm-queue-length.c
+++ b/drivers/md/dm-queue-length.c
@@ -199,6 +199,8 @@ static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
 	list_move_tail(s->valid_paths.next, &s->valid_paths);
 
 	list_for_each_entry(pi, &s->valid_paths, list) {
+		if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+			continue;
 		if (!best ||
 		    (atomic_read(&pi->qlen) < atomic_read(&best->qlen)))
 			best = pi;
diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
index 4ace1da..56e3919 100644
--- a/drivers/md/dm-round-robin.c
+++ b/drivers/md/dm-round-robin.c
@@ -217,13 +217,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
 			return current_path;
 	}
 
+	current_path = NULL;
+
 	spin_lock_irqsave(&s->lock, flags);
-	if (!list_empty(&s->valid_paths)) {
-		pi = list_entry(s->valid_paths.next, struct path_info, list);
+	list_for_each_entry(pi, &s->valid_paths, list) {
+		if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+			continue;
 		list_move_tail(&pi->list, &s->valid_paths);
 		percpu_counter_set(&s->repeat_count, pi->repeat_count);
 		set_percpu_current_path(s, pi->path);
 		current_path = pi->path;
+		break;
 	}
 	spin_unlock_irqrestore(&s->lock, flags);
 
diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-time.c
index 7b86420..fccf66a 100644
--- a/drivers/md/dm-service-time.c
+++ b/drivers/md/dm-service-time.c
@@ -285,9 +285,12 @@ static struct dm_path *st_select_path(struct path_selector *ps, size_t nr_bytes)
 	/* Change preferred (first in list) path to evenly balance. */
 	list_move_tail(s->valid_paths.next, &s->valid_paths);
 
-	list_for_each_entry(pi, &s->valid_paths, list)
+	list_for_each_entry(pi, &s->valid_paths, list) {
+		if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+			continue;
 		if (!best || (st_compare_load(pi, best, nr_bytes) < 0))
 			best = pi;
+	}
 
 	if (!best)
 		goto out;
-- 
2.8.4
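
The selection rule that the queue-length hunk adds can be modelled in
userspace as follows. This is only a sketch: struct model_path and
select_shortest_live() are hypothetical names, the dying flag stands in
for blk_queue_dying(), and the list rotation and locking done by the
real selectors are left out.

```c
#include <assert.h>
#include <stddef.h>

/* Hypothetical userspace model of a path; not struct path_info. */
struct model_path {
	int qlen;	/* number of in-flight I/Os on this path */
	int dying;	/* nonzero if the underlying queue is being torn down */
};

/*
 * Return the index of the live path with the shortest queue, or -1 if
 * every path is dying or the array is empty.
 */
static int select_shortest_live(const struct model_path *paths, size_t n)
{
	int best = -1;
	size_t i;

	assert(paths != NULL || n == 0);
	for (i = 0; i < n; i++) {
		if (paths[i].dying)
			continue;	/* skip dying paths, as the patch does */
		if (best < 0 || paths[i].qlen < paths[best].qlen)
			best = (int)i;
	}
	return best;
}
```

When every path is dying the function returns -1, which corresponds to
the selector returning no path at all.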

* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
  2016-06-28 18:29   ` Bart Van Assche
@ 2016-06-28 18:59     ` Mike Snitzer
  2016-06-28 19:16       ` Bart Van Assche
From: Mike Snitzer @ 2016-06-28 18:59 UTC
  To: Bart Van Assche; +Cc: device-mapper development

On Tue, Jun 28 2016 at  2:29pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 06/28/2016 08:15 PM, Mike Snitzer wrote:
> > This patch doesn't make sense.
> > 
> > In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
> > valid if r == 0.  But r == -ENOTCONN so how can *bdev be valid?
> 
> Sorry, but the dm code is not my area of expertise. How about the patch
> below? Please note that so far only the queue-length path selector has
> been tested.

Can we go back to what it is you've experienced?  Is it that, with
'queue_if_no_path' enabled and ioctls being issued to an mpath device
(while underlying paths are removed), you experience a live-lock (_not_
a deadlock) once no valid paths exist?

If that isn't what you're hitting then I'd like to better understand how
a request_queue that is "dying" isn't able to keep itself up enough to
fail IO issued to it (to allow normal error handling to trap the IO
failure).

Mike
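
The live-lock scenario asked about above can be sketched in userspace.
Everything here is a hypothetical model: struct model_mpath,
model_prepare_ioctl() and model_grab_bdev() are invented names, and the
sketch assumes (as the question implies) that the prepare step keeps
returning -ENOTCONN while no valid path exists. The max_polls bound
exists only so the model terminates; the retry condition in the first
patch's context has no such bound, and the msleep(10) between polls is
omitted.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>

/* Hypothetical model of the multipath state seen by the ioctl path. */
struct model_mpath {
	int valid_paths;	/* number of usable paths */
	int fatal_signal;	/* models fatal_signal_pending(current) */
};

/* Returns 0 once a path is available, -ENOTCONN otherwise. */
static int model_prepare_ioctl(const struct model_mpath *m)
{
	return m->valid_paths > 0 ? 0 : -ENOTCONN;
}

/*
 * Model of the retry loop: poll again while the result is -ENOTCONN
 * and no fatal signal is pending. *polls receives the number of polls.
 */
static int model_grab_bdev(const struct model_mpath *m, int max_polls,
			   int *polls)
{
	int r;

	assert(m != NULL && polls != NULL);
	*polls = 0;
	do {
		r = model_prepare_ioctl(m);
		(*polls)++;
	} while (r == -ENOTCONN && !m->fatal_signal && *polls < max_polls);
	return r;
}
```

With no valid paths and no fatal signal the model uses up every allowed
poll. The task keeps running and re-checking rather than blocking on a
lock, which is the distinction between a live-lock and a deadlock drawn
above.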

* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
  2016-06-28 18:59     ` Mike Snitzer
@ 2016-06-28 19:16       ` Bart Van Assche
  2016-06-28 19:33         ` Mike Snitzer
From: Bart Van Assche @ 2016-06-28 19:16 UTC
  To: Mike Snitzer; +Cc: device-mapper development

On 06/28/2016 08:59 PM, Mike Snitzer wrote:
> Can we go back to what it is you've experienced?  Is it that, with
> 'queue_if_no_path' enabled and ioctls being issued to an mpath device
> (while underlying paths are removed), you experience a live-lock (_not_
> a deadlock) once no valid paths exist?
>
> If that isn't what you're hitting then I'd like to better understand how
> a request_queue that is "dying" isn't able to keep itself up enough to
> fail IO issued to it (to allow normal error handling to trap the IO
> failure).

Hello Mike,

Since I started testing kernel v4.7-rc<n>, I have seen systemd-udevd get
stuck in truncate_inode_pages() about twenty times. I have not yet
seen this with any older kernel version. queue_if_no_path is indeed
enabled in my tests. The test I run consists of running fio on top of an 
mpath device and repeatedly removing and restoring the underlying 
devices. The test script is available at 
https://github.com/bvanassche/srp-test/blob/master/tests/02. Please let 
me know if you need more information.

Bart.

* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
  2016-06-28 19:16       ` Bart Van Assche
@ 2016-06-28 19:33         ` Mike Snitzer
From: Mike Snitzer @ 2016-06-28 19:33 UTC
  To: Bart Van Assche; +Cc: device-mapper development

On Tue, Jun 28 2016 at  3:16pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 06/28/2016 08:59 PM, Mike Snitzer wrote:
> >Can we go back to what it is you've experienced?  Is it that, with
> >'queue_if_no_path' enabled and ioctls being issued to an mpath device
> >(while underlying paths are removed), you experience a live-lock (_not_
> >a deadlock) once no valid paths exist?
> >
> >If that isn't what you're hitting then I'd like to better understand how
> >a request_queue that is "dying" isn't able to keep itself up enough to
> >fail IO issued to it (to allow normal error handling to trap the IO
> >failure).
> 
> Hello Mike,
> 
> Since I started testing kernel v4.7-rc<n>, I have seen systemd-udevd
> get stuck in truncate_inode_pages() about twenty times. I have
> not yet seen this with any older kernel version. queue_if_no_path is
> indeed enabled in my tests. The test I run consists of running fio
> on top of an mpath device and repeatedly removing and restoring the
> underlying devices. The test script is available at
> https://github.com/bvanassche/srp-test/blob/master/tests/02. Please
> let me know if you need more information.

I'm not going to be able to set up this test and chase this in the
near-term.  If you want this fixed soon then I'll need you to continue
chasing this.

Something else must be going on.  I fail to see why avoiding dying
queues, as your second (path selector) patch does, should be needed.

A dying queue, and the underlying device that is being torn down, still
needs to complete (fail) any of its outstanding IO -- or IO issued to it
e.g. via __blkdev_driver_ioctl -- right?

Could your driver's queue maybe not be getting torn down like it did in the
past? -- if it lingers in this "dying" state then that could start to
explain why this is happening all of a sudden in v4.7-rc<n>.  Would be
nice to know if that is what is happening.

But you've definitely seen that your path selector patch, which skips
paths with dying queues, avoids this live-lock issue?
