From: Christian Borntraeger <borntraeger@de.ibm.com>
To: Tejun Heo <tj@kernel.org>, Jens Axboe <axboe@kernel.dk>
Cc: Kent Overstreet <kmo@daterainc.com>,
	Christoph Hellwig <hch@lst.de>,
	"linux-kernel@vger.kernel.org >> Linux Kernel Mailing List"
	<linux-kernel@vger.kernel.org>,
	linux-s390 <linux-s390@vger.kernel.org>
Subject: Re: [PATCH block/for-linus] blk-mq: make mq_queue_reinit_notify() freeze queues in parallel
Date: Tue, 04 Nov 2014 20:46:59 +0100
Message-ID: <54592D33.4080405@de.ibm.com>
In-Reply-To: <20141104185227.GH14459@htj.dyndns.org>

On 04.11.2014 19:52, Tejun Heo wrote:
> q->mq_usage_counter is a percpu_ref which is killed and drained when
> the queue is frozen.  On a CPU hotplug event, blk_mq_queue_reinit(),
> which involves freezing the queue, is invoked on all existing queues.
> Because percpu_ref killing and draining involve an RCU grace period,
> doing the above on one queue after another may take a long time if
> there are many queues on the system.
> 
> This patch splits out initiation of freezing and waiting for its
> completion, and updates blk_mq_queue_reinit_notify() so that the
> queues are frozen in parallel instead of one after another.  Note that
> freezing and unfreezing are moved from blk_mq_queue_reinit() to
> blk_mq_queue_reinit_notify().
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Christian Borntraeger <borntraeger@de.ibm.com>
Tested-by: Christian Borntraeger <borntraeger@de.ibm.com>

Thanks.

> ---
> Christian, can you please verify that this resolves the latency issue
> that you're seeing?  Jens, can you please route this patch once
> Christian confirms it?
> 
> Thanks!
> 
>  block/blk-mq.c |   41 +++++++++++++++++++++++++++++++++--------
>  1 file changed, 33 insertions(+), 8 deletions(-)
> 
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -107,11 +107,7 @@ static void blk_mq_usage_counter_release
>  	wake_up_all(&q->mq_freeze_wq);
>  }
> 
> -/*
> - * Guarantee no request is in use, so we can change any data structure of
> - * the queue afterward.
> - */
> -void blk_mq_freeze_queue(struct request_queue *q)
> +static void blk_mq_freeze_queue_start(struct request_queue *q)
>  {
>  	bool freeze;
> 
> @@ -123,9 +119,23 @@ void blk_mq_freeze_queue(struct request_
>  		percpu_ref_kill(&q->mq_usage_counter);
>  		blk_mq_run_queues(q, false);
>  	}
> +}
> +
> +static void blk_mq_freeze_queue_wait(struct request_queue *q)
> +{
>  	wait_event(q->mq_freeze_wq, percpu_ref_is_zero(&q->mq_usage_counter));
>  }
> 
> +/*
> + * Guarantee no request is in use, so we can change any data structure of
> + * the queue afterward.
> + */
> +void blk_mq_freeze_queue(struct request_queue *q)
> +{
> +	blk_mq_freeze_queue_start(q);
> +	blk_mq_freeze_queue_wait(q);
> +}
> +
>  static void blk_mq_unfreeze_queue(struct request_queue *q)
>  {
>  	bool wake;
> @@ -1921,7 +1931,7 @@ void blk_mq_free_queue(struct request_qu
>  /* Basically redo blk_mq_init_queue with queue frozen */
>  static void blk_mq_queue_reinit(struct request_queue *q)
>  {
> -	blk_mq_freeze_queue(q);
> +	WARN_ON_ONCE(!q->mq_freeze_depth);
> 
>  	blk_mq_sysfs_unregister(q);
> 
> @@ -1936,8 +1946,6 @@ static void blk_mq_queue_reinit(struct r
>  	blk_mq_map_swqueue(q);
> 
>  	blk_mq_sysfs_register(q);
> -
> -	blk_mq_unfreeze_queue(q);
>  }
> 
>  static int blk_mq_queue_reinit_notify(struct notifier_block *nb,
> @@ -1956,8 +1964,25 @@ static int blk_mq_queue_reinit_notify(st
>  		return NOTIFY_OK;
> 
>  	mutex_lock(&all_q_mutex);
> +
> +	/*
> +	 * We need to freeze and reinit all existing queues.  Freezing
> +	 * involves synchronous wait for an RCU grace period and doing it
> +	 * one by one may take a long time.  Start freezing all queues in
> +	 * one swoop and then wait for the completions so that freezing can
> +	 * take place in parallel.
> +	 */
> +	list_for_each_entry(q, &all_q_list, all_q_node)
> +		blk_mq_freeze_queue_start(q);
> +	list_for_each_entry(q, &all_q_list, all_q_node)
> +		blk_mq_freeze_queue_wait(q);
> +
>  	list_for_each_entry(q, &all_q_list, all_q_node)
>  		blk_mq_queue_reinit(q);
> +
> +	list_for_each_entry(q, &all_q_list, all_q_node)
> +		blk_mq_unfreeze_queue(q);
> +
>  	mutex_unlock(&all_q_mutex);
>  	return NOTIFY_OK;
>  }
> 
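
The latency argument in the patch description can be illustrated in
userspace.  The following is only a rough sketch, not kernel code:
struct fake_queue, GRACE_NS, now_ns() and the fake_*() and freeze_*_ns()
helpers are all hypothetical names, and the RCU grace period is modelled
as a fixed 20 ms deadline (an assumed value).  It shows why starting
every freeze first and waiting afterwards costs roughly one grace period
in total, while freezing one queue after another costs one per queue.

```c
#define _POSIX_C_SOURCE 200809L
#include <assert.h>
#include <stddef.h>
#include <time.h>

/* Assumed length of one simulated grace period: 20 ms. */
#define GRACE_NS 20000000LL

/* Hypothetical stand-in for a request queue; not the kernel's struct. */
struct fake_queue {
	int freeze_depth;		/* loosely mirrors q->mq_freeze_depth */
	long long drained_at_ns;	/* when the simulated drain completes */
};

static long long now_ns(void)
{
	struct timespec ts;

	clock_gettime(CLOCK_MONOTONIC, &ts);
	return (long long)ts.tv_sec * 1000000000LL + ts.tv_nsec;
}

/* Initiate a freeze: only record when the drain will have finished. */
static void fake_freeze_start(struct fake_queue *q)
{
	if (q->freeze_depth++ == 0)
		q->drained_at_ns = now_ns() + GRACE_NS;
}

/* Block until the simulated drain of this queue has completed. */
static void fake_freeze_wait(struct fake_queue *q)
{
	long long left;

	while ((left = q->drained_at_ns - now_ns()) > 0) {
		struct timespec ts;

		ts.tv_sec = (time_t)(left / 1000000000LL);
		ts.tv_nsec = (long)(left % 1000000000LL);
		nanosleep(&ts, NULL);	/* may wake early; loop re-checks */
	}
}

static void fake_unfreeze(struct fake_queue *q)
{
	assert(q->freeze_depth > 0);	/* must have been frozen before */
	q->freeze_depth--;
}

/* Old behaviour: freeze (start + wait) one queue after another. */
long long freeze_serial_ns(struct fake_queue *qs, size_t n)
{
	long long t0 = now_ns();
	size_t i;

	for (i = 0; i < n; i++) {
		fake_freeze_start(&qs[i]);
		fake_freeze_wait(&qs[i]);
	}
	long long elapsed = now_ns() - t0;

	for (i = 0; i < n; i++)
		fake_unfreeze(&qs[i]);
	return elapsed;
}

/* Patched behaviour: start all freezes, then wait for all of them. */
long long freeze_parallel_ns(struct fake_queue *qs, size_t n)
{
	long long t0 = now_ns();
	size_t i;

	for (i = 0; i < n; i++)
		fake_freeze_start(&qs[i]);
	for (i = 0; i < n; i++)
		fake_freeze_wait(&qs[i]);
	long long elapsed = now_ns() - t0;

	for (i = 0; i < n; i++)
		fake_unfreeze(&qs[i]);
	return elapsed;
}
```

Calling freeze_serial_ns() and freeze_parallel_ns() on the same array
of, say, eight zero-initialised fake queues should show the serial
variant taking at least eight simulated grace periods and the parallel
variant taking about one.  On POSIX systems this should build with a
plain C99 compiler; older glibc versions may additionally need -lrt for
clock_gettime().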
