* [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock @ 2016-06-28 9:07 Bart Van Assche 2016-06-28 18:15 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Bart Van Assche @ 2016-06-28 9:07 UTC (permalink / raw) To: Mike Snitzer; +Cc: device-mapper development Avoid that submitting an ioctl to a dm device while an underlying block device is being removed triggers a deadlock. The call traces reported by SysRq-w if the deadlock occurs are as follows: sysrq: SysRq : Show Blocked State task PC stack pid father systemd-udevd D ffff8803683f7878 0 6684 494 0x00000006 Call Trace: [<ffffffff815acf97>] schedule+0x37/0x90 [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230 [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110 [<ffffffff815ad786>] bit_wait_io+0x16/0x60 [<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0 [<ffffffff8111b3d6>] __lock_page+0xb6/0xc0 [<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790 [<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20 [<ffffffff811d6ef0>] kill_bdev+0x30/0x40 [<ffffffff811d8201>] __blkdev_put+0x71/0x360 [<ffffffff811d8539>] blkdev_put+0x49/0x170 [<ffffffff811d8680>] blkdev_close+0x20/0x30 [<ffffffff8119e338>] __fput+0xe8/0x1f0 [<ffffffff8119e479>] ____fput+0x9/0x10 [<ffffffff8107876c>] task_work_run+0x7c/0xb0 [<ffffffff8105d047>] do_exit+0x3b7/0xb10 [<ffffffff8105d82b>] do_group_exit+0x4b/0xc0 [<ffffffff81068f25>] get_signal+0x1c5/0x7f0 [<ffffffff8101a1a3>] do_signal+0x23/0x700 [<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0 [<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0 [<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac systemd-udevd D ffff880062613cd8 0 6767 494 0x00000000 Call Trace: [<ffffffff815acf97>] schedule+0x37/0x90 [<ffffffff815b1487>] schedule_timeout+0x157/0x230 [<ffffffff810c0d33>] msleep+0x33/0x40 [<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod] [<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod] [<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980 [<ffffffff811d79b8>] block_ioctl+0x38/0x40 [<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660 [<ffffffff811b036c>] SyS_ioctl+0x3c/0x70 [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> Cc: <stable@vger.kernel.org> --- drivers/md/dm.c | 3 ++- 1 file changed, 2 insertions(+), 1 deletion(-) diff --git a/drivers/md/dm.c b/drivers/md/dm.c index 1b2f962..f3564e1 100644 --- a/drivers/md/dm.c +++ b/drivers/md/dm.c @@ -630,7 +630,8 @@ retry: out: dm_put_live_table(md, srcu_idx); - if (r == -ENOTCONN && !fatal_signal_pending(current)) { + if (r == -ENOTCONN && !fatal_signal_pending(current) && + !blk_queue_dying(bdev_get_queue(*bdev))) { msleep(10); goto retry; } -- 2.8.4 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock 2016-06-28 9:07 [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock Bart Van Assche @ 2016-06-28 18:15 ` Mike Snitzer 2016-06-28 18:29 ` Bart Van Assche 0 siblings, 1 reply; 6+ messages in thread From: Mike Snitzer @ 2016-06-28 18:15 UTC (permalink / raw) To: Bart Van Assche; +Cc: device-mapper development On Tue, Jun 28 2016 at 5:07am -0400, Bart Van Assche <bart.vanassche@sandisk.com> wrote: > Avoid that submitting an ioctl to a dm device while an underlying > block device is being removed triggers a deadlock. The call traces > reported by SysRq-w if the deadlock occurs are as follows: > > sysrq: SysRq : Show Blocked State > task PC stack pid father > systemd-udevd D ffff8803683f7878 0 6684 494 0x00000006 > Call Trace: > [<ffffffff815acf97>] schedule+0x37/0x90 > [<ffffffff815b14bb>] schedule_timeout+0x18b/0x230 > [<ffffffff815ac61f>] io_schedule_timeout+0x9f/0x110 > [<ffffffff815ad786>] bit_wait_io+0x16/0x60 > [<ffffffff815ad579>] __wait_on_bit_lock+0x49/0xa0 > [<ffffffff8111b3d6>] __lock_page+0xb6/0xc0 > [<ffffffff8112f6a4>] truncate_inode_pages_range+0x444/0x790 > [<ffffffff8112fa00>] truncate_inode_pages+0x10/0x20 > [<ffffffff811d6ef0>] kill_bdev+0x30/0x40 > [<ffffffff811d8201>] __blkdev_put+0x71/0x360 > [<ffffffff811d8539>] blkdev_put+0x49/0x170 > [<ffffffff811d8680>] blkdev_close+0x20/0x30 > [<ffffffff8119e338>] __fput+0xe8/0x1f0 > [<ffffffff8119e479>] ____fput+0x9/0x10 > [<ffffffff8107876c>] task_work_run+0x7c/0xb0 > [<ffffffff8105d047>] do_exit+0x3b7/0xb10 > [<ffffffff8105d82b>] do_group_exit+0x4b/0xc0 > [<ffffffff81068f25>] get_signal+0x1c5/0x7f0 > [<ffffffff8101a1a3>] do_signal+0x23/0x700 > [<ffffffff810020d3>] exit_to_usermode_loop+0x73/0xb0 > [<ffffffff81002580>] syscall_return_slowpath+0xb0/0xc0 > [<ffffffff815b2537>] entry_SYSCALL_64_fastpath+0xaa/0xac > systemd-udevd D ffff880062613cd8 0 6767 494 0x00000000 > Call Trace: > [<ffffffff815acf97>] schedule+0x37/0x90 > [<ffffffff815b1487>] schedule_timeout+0x157/0x230 > [<ffffffff810c0d33>] msleep+0x33/0x40 > [<ffffffffa0341a5b>] dm_grab_bdev_for_ioctl+0x7b/0x150 [dm_mod] > [<ffffffffa0341e25>] dm_blk_ioctl+0x35/0x80 [dm_mod] > [<ffffffff812b36eb>] blkdev_ioctl+0x25b/0x980 > [<ffffffff811d79b8>] block_ioctl+0x38/0x40 > [<ffffffff811afd5e>] do_vfs_ioctl+0x8e/0x660 > [<ffffffff811b036c>] SyS_ioctl+0x3c/0x70 > [<ffffffff815b24a9>] entry_SYSCALL_64_fastpath+0x1c/0xac > > Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com> > Cc: <stable@vger.kernel.org> > --- > drivers/md/dm.c | 3 ++- > 1 file changed, 2 insertions(+), 1 deletion(-) > > diff --git a/drivers/md/dm.c b/drivers/md/dm.c > index 1b2f962..f3564e1 100644 > --- a/drivers/md/dm.c > +++ b/drivers/md/dm.c > @@ -630,7 +630,8 @@ retry: > > out: > dm_put_live_table(md, srcu_idx); > - if (r == -ENOTCONN && !fatal_signal_pending(current)) { > + if (r == -ENOTCONN && !fatal_signal_pending(current) && > + !blk_queue_dying(bdev_get_queue(*bdev))) { > msleep(10); > goto retry; > } > -- > 2.8.4 Hi Bart, This patch doesn't make sense. In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only valid if r == 0. But r == -ENOTCONN so how can *bdev be valid? Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock 2016-06-28 18:15 ` Mike Snitzer @ 2016-06-28 18:29 ` Bart Van Assche 2016-06-28 18:59 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Bart Van Assche @ 2016-06-28 18:29 UTC (permalink / raw) To: Mike Snitzer; +Cc: device-mapper development On 06/28/2016 08:15 PM, Mike Snitzer wrote: > This patch doesn't make sense. > > In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only > valid if r == 0. But r == -ENOTCONN so how can *bdev be valid? Sorry but the dm code is not my area of expertise. How about the patch below? Please note that so far only the queue-length path selector has been tested. Thanks, Bart. diff --git a/drivers/md/dm-queue-length.c b/drivers/md/dm-queue-length.c index 23f1786..4c36648 100644 --- a/drivers/md/dm-queue-length.c +++ b/drivers/md/dm-queue-length.c @@ -199,6 +199,8 @@ static struct dm_path *ql_select_path(struct path_selector *ps, size_t nr_bytes) list_move_tail(s->valid_paths.next, &s->valid_paths); list_for_each_entry(pi, &s->valid_paths, list) { + if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev))) + continue; if (!best || (atomic_read(&pi->qlen) < atomic_read(&best->qlen))) best = pi; diff --git a/drivers/md/dm-round-robin.c b/drivers/md/dm-round-robin.c index 4ace1da..56e3919 100644 --- a/drivers/md/dm-round-robin.c +++ b/drivers/md/dm-round-robin.c @@ -217,13 +217,17 @@ static struct dm_path *rr_select_path(struct path_selector *ps, size_t nr_bytes) return current_path; } + current_path = NULL; + spin_lock_irqsave(&s->lock, flags); - if (!list_empty(&s->valid_paths)) { - pi = list_entry(s->valid_paths.next, struct path_info, list); + list_for_each_entry(pi, &s->valid_paths, list) { + if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev))) + continue; list_move_tail(&pi->list, &s->valid_paths); percpu_counter_set(&s->repeat_count, pi->repeat_count); set_percpu_current_path(s, pi->path); current_path = pi->path; + break; } spin_unlock_irqrestore(&s->lock, flags); diff --git a/drivers/md/dm-service-time.c b/drivers/md/dm-service-time.c index 7b86420..fccf66a 100644 --- a/drivers/md/dm-service-time.c +++ b/drivers/md/dm-service-time.c @@ -285,9 +285,12 @@ static struct dm_path *st_select_path(struct path_selector *ps, size_t nr_bytes) /* Change preferred (first in list) path to evenly balance. */ list_move_tail(s->valid_paths.next, &s->valid_paths); - list_for_each_entry(pi, &s->valid_paths, list) + list_for_each_entry(pi, &s->valid_paths, list) { + if (blk_queue_dying(bdev_get_queue(pi->path->dev->bdev))) + continue; if (!best || (st_compare_load(pi, best, nr_bytes) < 0)) best = pi; + } if (!best) goto out; -- 2.8.4 ^ permalink raw reply related [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock 2016-06-28 18:29 ` Bart Van Assche @ 2016-06-28 18:59 ` Mike Snitzer 2016-06-28 19:16 ` Bart Van Assche 0 siblings, 1 reply; 6+ messages in thread From: Mike Snitzer @ 2016-06-28 18:59 UTC (permalink / raw) To: Bart Van Assche; +Cc: device-mapper development On Tue, Jun 28 2016 at 2:29pm -0400, Bart Van Assche <bart.vanassche@sandisk.com> wrote: > On 06/28/2016 08:15 PM, Mike Snitzer wrote: > > This patch doesn't make sense. > > > > In the context of dm-mpath.c:multipath_prepare_ioctl, *bdev is only > > valid if r == 0. But r == -ENOTCONN so how can *bdev be valid? > > Sorry but the dm code is not my area of expertise. How about the patch > below? Please note that so far only the queue-length path selector has > been tested. Can we go back to what it is you've experienced? is it that you have 'queue_if_no_path' enabled and are issuing ioctls to an mpath device (while removing underlying paths) you'll experience a live-lock (_not_ deadlock) once no valid paths exist? If that isn't what you're hitting then I'd like to better understand how a request_queue that is "dying" isn't able to keep itself up enough to fail IO issued to it (to allow normal error handling to trap the IO failure). Mike ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock 2016-06-28 18:59 ` Mike Snitzer @ 2016-06-28 19:16 ` Bart Van Assche 2016-06-28 19:33 ` Mike Snitzer 0 siblings, 1 reply; 6+ messages in thread From: Bart Van Assche @ 2016-06-28 19:16 UTC (permalink / raw) To: Mike Snitzer; +Cc: device-mapper development On 06/28/2016 08:59 PM, Mike Snitzer wrote: > Can we go back to what it is you've experienced? is it that you have > 'queue_if_no_path' enabled and are issuing ioctls to an mpath device > (while removing underlying paths) you'll experience a live-lock (_not_ > deadlock) once no valid paths exist? > > If that isn't what you're hitting then I'd like to better understand how > a request_queue that is "dying" isn't able to keep itself up enough to > fail IO issued to it (to allow normal error handling to trap the IO > failure). Hello Mike, Since I started testing kernel v4.7-rc<n> I noticed about twenty times that systemd-udevd got stuck in truncate_inode_pages(). I have not yet seen this with any older kernel version. queue_if_no_path is indeed enabled in my tests. The test I run consists of running fio on top of an mpath device and repeatedly removing and restoring the underlying devices. The test script is available at https://github.com/bvanassche/srp-test/blob/master/tests/02. Please let me know if you need more information. Bart. ^ permalink raw reply [flat|nested] 6+ messages in thread
* Re: dm mpath: Fix a dm_blk_ioctl() deadlock 2016-06-28 19:16 ` Bart Van Assche @ 2016-06-28 19:33 ` Mike Snitzer 0 siblings, 0 replies; 6+ messages in thread From: Mike Snitzer @ 2016-06-28 19:33 UTC (permalink / raw) To: Bart Van Assche; +Cc: device-mapper development On Tue, Jun 28 2016 at 3:16pm -0400, Bart Van Assche <bart.vanassche@sandisk.com> wrote: > On 06/28/2016 08:59 PM, Mike Snitzer wrote: > >Can we go back to what it is you've experienced? is it that you have > >'queue_if_no_path' enabled and are issuing ioctls to an mpath device > >(while removing underlying paths) you'll experience a live-lock (_not_ > >deadlock) once no valid paths exist? > > > >If that isn't what you're hitting then I'd like to better understand how > >a request_queue that is "dying" isn't able to keep itself up enough to > >fail IO issued to it (to allow normal error handling to trap the IO > >failure). > > Hello Mike, > > Since I started testing kernel v4.7-rc<n> I noticed about twenty > times that systemd-udevd got stuck in truncate_inode_pages(). I have > not yet seen this with any older kernel version. queue_if_no_path is > indeed enabled in my tests. The test I run consists of running fio > on top of an mpath device and repeatedly removing and restoring the > underlying devices. The test script is available at > https://github.com/bvanassche/srp-test/blob/master/tests/02. Please > let me know if you need more information. I'm not going to be able to setup this test and chase this in the near-term. If you want this fixed soon then I'll need you to continue chasing this. Something else must be going on. I fail to see how avoiding dying queues, like your 2nd path selectors patch does, should be needed. A dying queue, and the underlying device that is being torn down, still needs to complete (fail) any of its outstanding IO -- or IO issued to it e.g. via __blkdev_driver_ioctl -- right? Could your driver's queue maybe not be getting torn down like it did in the past? -- if it lingers in this "dying" state then that could start to explain why this is happening all of a sudden in v4.7-rc<n>. Would be nice to know if that is what is happening. But you've definitely seen that your path selector patch, that skips selecting paths with dying queues, avoids this live-lock issue? ^ permalink raw reply [flat|nested] 6+ messages in thread
end of thread, other threads:[~2016-06-28 19:33 UTC | newest] Thread overview: 6+ messages (download: mbox.gz follow: Atom feed -- links below jump to the message on this page -- 2016-06-28 9:07 [PATCH] dm mpath: Fix a dm_blk_ioctl() deadlock Bart Van Assche 2016-06-28 18:15 ` Mike Snitzer 2016-06-28 18:29 ` Bart Van Assche 2016-06-28 18:59 ` Mike Snitzer 2016-06-28 19:16 ` Bart Van Assche 2016-06-28 19:33 ` Mike Snitzer
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox; as well as URLs for NNTP newsgroup(s).