* [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock
@ 2016-06-28 9:07 Bart Van Assche
2016-06-28 18:15 ` Mike Snitzer
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-06-28 9:07 UTC (permalink / raw)
To: Mike Snitzer; +Cc: device-mapper development
Avoid that submitting an ioctl to a dm device while an underlying
block device is being removed triggers a deadlock. The call traces
reported by SysRq-w if the deadlock occurs are as follows:
sysrq: SysRq : Show Blocked State
task PC stack pid father
systemd-udevd D ffff8803683f7878 0 6684 494 0x00000006
Call Trace:
[<ffffffff815acf97>] schedule+0x37/0x90
[<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
[<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
[<ffffffff815ad786>] bit_wait_io+0x16/0x60
[<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0
[<ffffffff8111b3d6>] __lock_page+0xb6/0xc0
[<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790
[<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20
[<ffffffff811d6ef0>] kill_bdev+0x30/0x40
[<ffffffff811d8201>] __blkdev_put+0x71/0x360
[<ffffffff811d8539>] blkdev_put+0x49/0x170
[<ffffffff811d8680>] blkdev_close+0x20/0x30
[<ffffffff8119e338>] __fput+0xe8/0x1f0
[<ffffffff8119e479>] ____fput+0x9/0x10
[<ffffffff8107876c>] task_work_run+0x7c/0xb0
[<ffffffff8105d047>] do_exit+0x3b7/0xb10
[<ffffffff8105d82b>] do_group_exit+0x4b/0xc0
[<ffffffff81068f25>] get_signal+0x1c5/0x7f0
[<ffffffff8101a1a3>] do_signal+0x23/0x700
[<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0
[<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0
[<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac
systemd-udevd D ffff880062613cd8 0 6767 494 0x00000000
Call Trace:
[<ffffffff815acf97>] schedule+0x37/0x90
[<ffffffff815b1487>] schedule_timeout+0x157/0x230
[<ffffffff810c0d33>] msleep+0x33/0x40
[<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod]
[<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod]
[<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980
[<ffffffff811d79b8>] block_ioctl+0x38/0x40
[<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660
[<ffffffff811b036c>] SyS_ioctl+0x3c/0x70
[<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: <stable@vger.kernel.org>
---
drivers/md/dm.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/drivers/md/dm.c b/drivers/md/dm.c
index 1b2f962..f3564e1 100644
--- a/drivers/md/dm.c
+++ b/drivers/md/dm.c
@@ -630,7 +630,8 @@ retry:
out:
dm_put_live_table(md, srcu_idx);
- if (r == -ENOTCONN && !fatal_signal_pending(current)) {
+ if (r == -ENOTCONN && !fatal_signal_pending(current) &&
+ !blk_queue_dying(bdev_get_queue(*bdev))) {
msleep(10);
goto retry;
}
--
2.8.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
2016-06-28 9:07 [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock Bart Van Assche
@ 2016-06-28 18:15 ` Mike Snitzer
2016-06-28 18:29 ` Bart Van Assche
0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2016-06-28 18:15 UTC (permalink / raw)
To: Bart Van Assche; +Cc: device-mapper development
On Tue, Jun 28 2016 at 5:07am -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> Avoid that submitting an ioctl to a dm device while an underlying
> block device is being removed triggers a deadlock. The call traces
> reported by SysRq-w if the deadlock occurs are as follows:
>
> sysrq: SysRq : Show Blocked State
> task PC stack pid father
> systemd-udevd D ffff8803683f7878 0 6684 494 0x00000006
> Call Trace:
> [<ffffffff815acf97>] schedule+0x37/0x90
> [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230
> [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110
> [<ffffffff815ad786>] bit_wait_io+0x16/0x60
> [<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0
> [<ffffffff8111b3d6>] __lock_page+0xb6/0xc0
> [<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790
> [<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20
> [<ffffffff811d6ef0>] kill_bdev+0x30/0x40
> [<ffffffff811d8201>] __blkdev_put+0x71/0x360
> [<ffffffff811d8539>] blkdev_put+0x49/0x170
> [<ffffffff811d8680>] blkdev_close+0x20/0x30
> [<ffffffff8119e338>] __fput+0xe8/0x1f0
> [<ffffffff8119e479>] ____fput+0x9/0x10
> [<ffffffff8107876c>] task_work_run+0x7c/0xb0
> [<ffffffff8105d047>] do_exit+0x3b7/0xb10
> [<ffffffff8105d82b>] do_group_exit+0x4b/0xc0
> [<ffffffff81068f25>] get_signal+0x1c5/0x7f0
> [<ffffffff8101a1a3>] do_signal+0x23/0x700
> [<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0
> [<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0
> [<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac
> systemd-udevd D ffff880062613cd8 0 6767 494 0x00000000
> Call Trace:
> [<ffffffff815acf97>] schedule+0x37/0x90
> [<ffffffff815b1487>] schedule_timeout+0x157/0x230
> [<ffffffff810c0d33>] msleep+0x33/0x40
> [<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod]
> [<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod]
> [<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980
> [<ffffffff811d79b8>] block_ioctl+0x38/0x40
> [<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660
> [<ffffffff811b036c>] SyS_ioctl+0x3c/0x70
> [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac
>
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: <stable@vger.kernel.org>
> ---
> drivers/md/dm.c | 3 ++-
> 1 file changed, 2 insertions(+), 1 deletion(-)
>
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 1b2f962..f3564e1 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -630,7 +630,8 @@ retry:
>
> out:
> dm_put_live_table(md, srcu_idx);
> - if (r == -ENOTCONN && !fatal_signal_pending(current)) {
> + if (r == -ENOTCONN && !fatal_signal_pending(current) &&
> + !blk_queue_dying(bdev_get_queue(*bdev))) {
> msleep(10);
> goto retry;
> }
> --
> 2.8.4
Hi Bart,
This patch doesn't make sense.
In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
valid if r == 0. But r == -ENOTCONN so how can *bdev be valid?
Mike
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
2016-06-28 18:15 ` Mike Snitzer
@ 2016-06-28 18:29 ` Bart Van Assche
2016-06-28 18:59 ` Mike Snitzer
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-06-28 18:29 UTC (permalink / raw)
To: Mike Snitzer; +Cc: device-mapper development
On 06/28/2016 08:15 PM, Mike Snitzer wrote:
> This patch doesn't make sense.
>
> In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
> valid if r == 0. But r == -ENOTCONN so how can *bdev be valid?
Sorry but the dm code is not my area of expertise. How about the patch
below? Please note that so far only the queue-length path selector has
been tested.
Thanks,
Bart.
diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-length.c
index 23f1786..4c36648 100644
--- a/drivers/md/dm-queue-length.c
+++ b/drivers/md/dm-queue-length.c
@@ -199,6 +199,8 @@ static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes)
list_move_tail(s->valid_paths.next, &s->valid_paths);
list_for_each_entry(pi, &s->valid_paths, list) {
+ if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+ continue;
if (!best ||
(atomic_read(&pi->qlen) < atomic_read(&best->qlen)))
best = pi;
diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c
index 4ace1da..56e3919 100644
--- a/drivers/md/dm-round-robin.c
+++ b/drivers/md/dm-round-robin.c
@@ -217,13 +217,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes)
return current_path;
}
+ current_path = NULL;
+
spin_lock_irqsave(&s->lock, flags);
- if (!list_empty(&s->valid_paths)) {
- pi = list_entry(s->valid_paths.next, struct path_info, list);
+ list_for_each_entry(pi, &s->valid_paths, list) {
+ if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+ continue;
list_move_tail(&pi->list, &s->valid_paths);
percpu_counter_set(&s->repeat_count, pi->repeat_count);
set_percpu_current_path(s, pi->path);
current_path = pi->path;
+ break;
}
spin_unlock_irqrestore(&s->lock, flags);
diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-time.c
index 7b86420..fccf66a 100644
--- a/drivers/md/dm-service-time.c
+++ b/drivers/md/dm-service-time.c
@@ -285,9 +285,12 @@ static struct dm_path *st_select_path(struct path_selector *ps, size_t nr_bytes)
/* Change preferred (first in list) path to evenly balance. */
list_move_tail(s->valid_paths.next, &s->valid_paths);
- list_for_each_entry(pi, &s->valid_paths, list)
+ list_for_each_entry(pi, &s->valid_paths, list) {
+ if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev)))
+ continue;
if (!best || (st_compare_load(pi, best, nr_bytes) < 0))
best = pi;
+ }
if (!best)
goto out;
--
2.8.4
^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
2016-06-28 18:29 ` Bart Van Assche
@ 2016-06-28 18:59 ` Mike Snitzer
2016-06-28 19:16 ` Bart Van Assche
0 siblings, 1 reply; 6+ messages in thread
From: Mike Snitzer @ 2016-06-28 18:59 UTC (permalink / raw)
To: Bart Van Assche; +Cc: device-mapper development
On Tue, Jun 28 2016 at 2:29pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> On 06/28/2016 08:15 PM, Mike Snitzer wrote:
> > This patch doesn't make sense.
> >
> > In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only
> > valid if r == 0. But r == -ENOTCONN so how can *bdev be valid?
>
> Sorry but the dm code is not my area of expertise. How about the patch
> below? Please note that so far only the queue-length path selector has
> been tested.
Can we go back to what it is you've experienced? is it that you have
'queue_if_no_path' enabled and are issuing ioctls to an mpath device
(while removing underlying paths) you'll experience a live-lock (_not_
deadlock) once no valid paths exist?
If that isn't what you're hitting then I'd like to better understand how
a request_queue that is "dying" isn't able to keep itself up enough to
fail IO issued to it (to allow normal error handling to trap the IO
failure).
Mike
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
2016-06-28 18:59 ` Mike Snitzer
@ 2016-06-28 19:16 ` Bart Van Assche
2016-06-28 19:33 ` Mike Snitzer
0 siblings, 1 reply; 6+ messages in thread
From: Bart Van Assche @ 2016-06-28 19:16 UTC (permalink / raw)
To: Mike Snitzer; +Cc: device-mapper development
On 06/28/2016 08:59 PM, Mike Snitzer wrote:
> Can we go back to what it is you've experienced? is it that you have
> 'queue_if_no_path' enabled and are issuing ioctls to an mpath device
> (while removing underlying paths) you'll experience a live-lock (_not_
> deadlock) once no valid paths exist?
>
> If that isn't what you're hitting then I'd like to better understand how
> a request_queue that is "dying" isn't able to keep itself up enough to
> fail IO issued to it (to allow normal error handling to trap the IO
> failure).
Hello Mike,
Since I started testing kernel v4.7-rc<n> I noticed about twenty times
that systemd-udevd got stuck in truncate_inode_pages(). I have not yet
seen this with any older kernel version. queue_if_no_path is indeed
enabled in my tests. The test I run consists of running fio on top of an
mpath device and repeatedly removing and restoring the underlying
devices. The test script is available at
https://github.com/bvanassche/srp-test/blob/master/tests/02. Please let
me know if you need more information.
Bart.
^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock
2016-06-28 19:16 ` Bart Van Assche
@ 2016-06-28 19:33 ` Mike Snitzer
0 siblings, 0 replies; 6+ messages in thread
From: Mike Snitzer @ 2016-06-28 19:33 UTC (permalink / raw)
To: Bart Van Assche; +Cc: device-mapper development
On Tue, Jun 28 2016 at 3:16pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> On 06/28/2016 08:59 PM, Mike Snitzer wrote:
> >Can we go back to what it is you've experienced? is it that you have
> >'queue_if_no_path' enabled and are issuing ioctls to an mpath device
> >(while removing underlying paths) you'll experience a live-lock (_not_
> >deadlock) once no valid paths exist?
> >
> >If that isn't what you're hitting then I'd like to better understand how
> >a request_queue that is "dying" isn't able to keep itself up enough to
> >fail IO issued to it (to allow normal error handling to trap the IO
> >failure).
>
> Hello Mike,
>
> Since I started testing kernel v4.7-rc<n> I noticed about twenty
> times that systemd-udevd got stuck in truncate_inode_pages(). I have
> not yet seen this with any older kernel version. queue_if_no_path is
> indeed enabled in my tests. The test I run consists of running fio
> on top of an mpath device and repeatedly removing and restoring the
> underlying devices. The test script is available at
> https://github.com/bvanassche/srp-test/blob/master/tests/02. Please
> let me know if you need more information.
I'm not going to be able to setup this test and chase this in the
near-term. If you want this fixed soon then I'll need you to continue
chasing this.
Something else must be going on. I fail to see how avoiding dying
queues, like your 2nd path selectors patch does, should be needed.
A dying queue, and the underlying device that is being torn down, still
needs to complete (fail) any of its outstanding IO -- or IO issued to it
e.g. via __blkdev_driver_ioctl -- right?
Could your driver's queue maybe not be getting torn down like it did in the
past? -- if it lingers in this "dying" state then that could start to
explain why this is happening all of a sudden in v4.7-rc<n>. Would be
nice to know if that is what is happening.
But you've definitely seen that your path selector patch, that skips
selecting paths with dying queues, avoids this live-lock issue?
^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-28 19:33 UTC | newest]
Thread overview: 6+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2016-06-28 9:07 [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock Bart Van Assche
2016-06-28 18:15 ` Mike Snitzer
2016-06-28 18:29 ` Bart Van Assche
2016-06-28 18:59 ` Mike Snitzer
2016-06-28 19:16 ` Bart Van Assche
2016-06-28 19:33 ` Mike Snitzer
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).