Re: v4.8 dm-mpath - Mike Snitzer

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mike Snitzer <snitzer@redhat.com>
To: Bart Van Assche <bart.vanassche@sandisk.com>
Cc: device-mapper development <dm-devel@redhat.com>
Subject: Re: v4.8 dm-mpath
Date: Wed, 17 Aug 2016 21:54:34 -0400	[thread overview]
Message-ID: <20160818015433.GA8972@redhat.com> (raw)
In-Reply-To: <a85fe9dc-e09a-152f-8576-13e899cde9c4@sandisk.com>

On Wed, Aug 17 2016 at  8:29pm -0400,
Bart Van Assche <bart.vanassche@sandisk.com> wrote:

> On 08/16/2016 07:43 PM, Mike Snitzer wrote:
> >On Tue, Aug 16 2016 at  3:12pm -0400, Mike Snitzer <snitzer@redhat.com> wrote:
> >>On Tue, Aug 16 2016 at  1:32pm -0400, Bart Van Assche <bart.vanassche@sandisk.com> wrote:
> >>>Can you have a look at this?
> >>
> >>I'll get linux-dm.git's 'dm-4.8' branch (v4.8-rc2) loaded on one of my
> >>testbed systems to run the mptest testsuite.  It should provide coverage
> >>for simple failover and failback.
> >
> >I successfully ran mptest against that v4.8-rc2 based kernel.  Any
> >chance you could try mptest to see if it also passes for you?  Or if it
> >is able to trigger the issue for you.
> >
> >mptest's runtest script defaults to setting up scsi-mq and dm-mq.
> 
> Hello Mike,
> 
> I will run mptest as soon as I have the time.

OK, thanks.
 
> Earlier today I came up with three patches that avoid the hang in
> truncate_inode_pages_range() that I had reported before. It would be
> appreciated if you could have a look at these patches. Please keep
> in mind that I'm not a dm expert.

I don't recall where you mentioned a hang in
truncate_inode_pages_range().. ah I see it here:
https://patchwork.kernel.org/patch/9202331/

Comments inlined below.

> From 0e8e9f7d7489f5e3b0bf9e0c59257277d08a2ec0 Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche@sandisk.com>
> Date: Wed, 17 Aug 2016 14:18:36 -0700
> Subject: [PATCH 1/3] block: Avoid that requests get stuck when stopping and
>  restarting a queue
> 
> If a driver calls blk_start_queue() after a queue has been marked
> "dead", avoid that requests get stuck by invoking __blk_run_queue()
> directly if QUEUE_FLAG_DEAD has been set.
> 
> While testing the dm-mpath driver I noticed that sporadically a
> readahead request was generated by the filesystem on top of the dm
> device but that that I/O request was never processed. This happened
> only while the dm core was suspending and resuming I/O.s
> 
> The following backtrace (triggered by a WARN_ON_ONCE(true) statement
> that I had inserted just before the __blk_run_queue(q) call) shows
> that this code path is relevant:
> 
> WARNING: CPU: 7 PID: 4707 at drivers/md/dm.c:2035 map_request+0x1d5/0x340 [dm_mod]
> Aug 17 15:47:35 ion-dev-ib-ini kernel: CPU: 7 PID: 4707 Comm: kdmwork-254:0 Tainted: G        W       4.7.0-dbg+ #2
> Call Trace:
>  [<ffffffff81322ee7>] dump_stack+0x68/0xa1
>  [<ffffffff81061e06>] __warn+0xc6/0xe0
>  [<ffffffff81061ed8>] warn_slowpath_null+0x18/0x20
>  [<ffffffffa04282d5>] map_request+0x1d5/0x340 [dm_mod]
>  [<ffffffffa042845d>] map_tio_request+0x1d/0x40 [dm_mod]
>  [<ffffffff81085fe1>] kthread_worker_fn+0xd1/0x1b0
>  [<ffffffff81085e9a>] kthread+0xea/0x100
>  [<ffffffff8162857f>] ret_from_fork+0x1f/0x40
> 
> References: commit 704605711ef0 ("block: Avoid scheduling delayed work on a dead queue")
> Signed-off-by: Bart Van Assche <bart.vanassche@sandisk.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Mike Snitzer <snitzer@redhat.com>
> Cc: stable <stable@vger.kernel.org>
> ---
>  block/blk-core.c | 2 ++
>  1 file changed, 2 insertions(+)
> 
> diff --git a/block/blk-core.c b/block/blk-core.c
> index 0d9638e..5f75709 100644
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -355,6 +355,8 @@ void blk_run_queue_async(struct request_queue *q)
>  {
>  	if (likely(!blk_queue_stopped(q) && !blk_queue_dead(q)))
>  		mod_delayed_work(kblockd_workqueue, &q->delay_work, 0);
> +	else
> +		__blk_run_queue(q);
>  }
>  EXPORT_SYMBOL(blk_run_queue_async);

Your else clause would catch the case when the queue is stopped but
not dead... __blk_run_queue() will handle this cleanly but I think the
logic could be: else if (blk_queue_dead(q))

Also, strikes me as odd to have an _async interface resort to a
synchronous call.  Why is that needed?  Are you sure it is safe for all
callers?

> From 9b103cfb2d9b4cdec73c7d09188b4d42b041d49e Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche@sandisk.com>
> Date: Wed, 17 Aug 2016 16:26:02 -0700
> Subject: [PATCH 2/3] dm: Change return type of __dm_internal_suspend() to int
> 
> ---
>  drivers/md/dm.c | 15 ++++++++++-----
>  1 file changed, 10 insertions(+), 5 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 3689b7f..6eccdf3 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -3379,16 +3379,18 @@ out:
>   * It may be used only from the kernel.
>   */
>  
> -static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_flags)
> +static int __dm_internal_suspend(struct mapped_device *md,
> +				 unsigned suspend_flags)
>  {
>  	struct dm_table *map = NULL;
> +	int ret = 0;
>  
>  	if (md->internal_suspend_count++)
> -		return; /* nested internal suspend */
> +		goto out; /* nested internal suspend */
>  
>  	if (dm_suspended_md(md)) {
>  		set_bit(DMF_SUSPENDED_INTERNALLY, &md->flags);
> -		return; /* nest suspend */
> +		goto out; /* nest suspend */
>  	}
>  
>  	map = rcu_dereference_protected(md->map, lockdep_is_held(&md->suspend_lock));
> @@ -3399,10 +3401,13 @@ static void __dm_internal_suspend(struct mapped_device *md, unsigned suspend_fla
>  	 * would require changing .presuspend to return an error -- avoid this
>  	 * until there is a need for more elaborate variants of internal suspend.
>  	 */
> -	(void) __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
> -			    DMF_SUSPENDED_INTERNALLY);
> +	ret = __dm_suspend(md, map, suspend_flags, TASK_UNINTERRUPTIBLE,
> +			   DMF_SUSPENDED_INTERNALLY);
>  
>  	dm_table_postsuspend_targets(map);
> +
> +out:
> +	return ret;
>  }
>  
>  static void __dm_internal_resume(struct mapped_device *md)

So you're only wanting to have your new __dm_internal_suspend() caller,
__dm_destroy, issue a WARN_ON if __dm_suspend() fails.

Is this _really_ important or did you just do this for debugging
purposes?


> From f6e58e71eec584b3f66fe8367340ca896d29408e Mon Sep 17 00:00:00 2001
> From: Bart Van Assche <bart.vanassche@sandisk.com>
> Date: Wed, 17 Aug 2016 15:26:07 -0700
> Subject: [PATCH 3/3] dm: Avoid that requests get stuck while destroying a dm
>  device
> 
> ---
>  drivers/md/dm.c | 25 ++++++++++++++-----------
>  1 file changed, 14 insertions(+), 11 deletions(-)
> 
> diff --git a/drivers/md/dm.c b/drivers/md/dm.c
> index 6eccdf3..ae4bfa0 100644
> --- a/drivers/md/dm.c
> +++ b/drivers/md/dm.c
> @@ -2958,10 +2958,13 @@ const char *dm_device_name(struct mapped_device *md)
>  }
>  EXPORT_SYMBOL_GPL(dm_device_name);
>  
> +static int __dm_internal_suspend(struct mapped_device *md,
> +				 unsigned suspend_flags);
> +
>  static void __dm_destroy(struct mapped_device *md, bool wait)
>  {
> -	struct dm_table *map;
> -	int srcu_idx;
> +	struct request_queue *q = dm_get_md_queue(md);
> +	int res;
>  
>  	might_sleep();
>  
> @@ -2970,21 +2973,21 @@ static void __dm_destroy(struct mapped_device *md, bool wait)
>  	set_bit(DMF_FREEING, &md->flags);
>  	spin_unlock(&_minor_lock);
>  
> -	if (dm_request_based(md) && md->kworker_task)
> -		flush_kthread_worker(&md->kworker);

Header needs detail.  E.g. __dm_suspend() via, __dm_internal_suspend(),
will call flush_kthread_worker().  But that is a nit in the grand scheme
of things.  Bigger concerns below.

> +	/*
> +	 * Avoid that dm_make_request() gets called after
> +	 * __dm_internal_suspend() finished.
> +	 */
> +	spin_lock_irq(q->queue_lock);
> +	queue_flag_set(QUEUE_FLAG_DYING, q);
> +	spin_unlock_irq(q->queue_lock);

Feels like the case for marking the queue as DYING can be made
independent of changes to the mechanism for flushing IO here.

>  
>  	/*
>  	 * Take suspend_lock so that presuspend and postsuspend methods
>  	 * do not race with internal suspend.
>  	 */
>  	mutex_lock(&md->suspend_lock);
> -	map = dm_get_live_table(md, &srcu_idx);
> -	if (!dm_suspended_md(md)) {
> -		dm_table_presuspend_targets(map);
> -		dm_table_postsuspend_targets(map);
> -	}
> -	/* dm_put_live_table must be before msleep, otherwise deadlock is possible */
> -	dm_put_live_table(md, srcu_idx);
> +	res = __dm_internal_suspend(md, DM_SUSPEND_NOFLUSH_FLAG);
> +	WARN_ONCE(res, "__dm_internal_suspend() returned %d\n", res);
>  	mutex_unlock(&md->suspend_lock);
>  
>  	/*

Please be explicit about why switching to using __dm_internal_suspend()
is important.  Are you just going for code reuse or is there a more
important reason?

I can infer you probably want the call to dm_stop_queue()...

But I need way more convincing on why __dm_internal_suspend() is needed.
As you can see from the header for commit ffcc3936416, the internal
suspend code is relatively complex.  Increasing its use, especially in a
method such as __dm_destroy, needs a good reason.

But that aside, what raises serious concerns for me is that you're using
DM_SUSPEND_NOFLUSH_FLAG where we absolutely _need_ to flush IO (given
we're in __dm_destroy).

At a minimum DM_SUSPEND_NOFLUSH_FLAG must not be used.

Mike

next prev parent reply	other threads:[~2016-08-18  1:54 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2016-08-16 17:32 v4.8 dm-mpath Bart Van Assche
2016-08-16 19:12 ` Mike Snitzer
2016-08-17  2:43   ` Mike Snitzer
2016-08-18  0:29     ` Bart Van Assche
2016-08-18  1:54       ` Mike Snitzer [this message]
2016-08-25 17:40         ` Bart Van Assche
2016-08-26 14:26           ` Mike Snitzer
2016-08-26 15:33             ` Bart Van Assche
2016-08-26 16:35               ` Mike Snitzer
2016-08-25 21:06     ` Bart Van Assche

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20160818015433.GA8972@redhat.com \
    --to=snitzer@redhat.com \
    --cc=bart.vanassche@sandisk.com \
    --cc=dm-devel@redhat.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.