From: Tejun Heo <tj@kernel.org>
To: Jens Axboe <axboe@kernel.dk>
Cc: Joseph Glanville <joseph.glanville@orionvm.com.au>,
	cgroups <cgroups@vger.kernel.org>,
	Vivek Goyal <vgoyal@redhat.com>,
	linux-kernel@vger.kernel.org
Subject: [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
Date: Thu, 20 Sep 2012 14:08:52 -0700
Message-ID: <20120920210852.GC7264@google.com>
In-Reply-To: <20120920201815.GB7264@google.com>

b82d4b197c ("blkcg: make request_queue bypassing on allocation") made
request_queues bypassed on allocation to avoid switching bypass mode
on and off for a queue being initialized.  Some drivers allocate and
then destroy a lot of queues without fully initializing them, and
incurring bypass latency on each of them could add up to
significant overhead.

Unfortunately, blk_init_allocated_queue() is never used by queues of
bio-based drivers, which means that all bio-based driver queues are in
bypass mode even after initialization and registration complete
successfully.

Due to the limited way request_queues are used by bio drivers, this
problem is hidden pretty well but it shows up when blk-throttle is
used in combination with a bio-based driver.  Trying to configure
(echoing to cgroupfs file) blk-throttle for a bio-based driver hangs
indefinitely in blkg_conf_prep() waiting for bypass mode to end.

This patch moves the initial blk_queue_bypass_end() call from
blk_init_allocated_queue() to blk_register_queue(), which is called for
any userland-visible queue regardless of its type.

I believe this is correct because I don't think there is any block
driver which needs or wants a working elevator and blk-cgroup on a
queue which isn't visible to userland.  If there are such users, we
need a different solution.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joseph Glanville <joseph.glanville@orionvm.com.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@vger.kernel.org
---
Jens, while these are fixes, I don't think they are extremely urgent,
and routing them through 3.7-rc1 should be enough.

Thanks.
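
For readers who want to see the accounting in isolation, the bypass
depth handling described in the changelog can be modeled in user space
roughly as follows.  This is only a sketch and not kernel code: the
toy_* names and the "patched" switch are made up for illustration,
locking is left out, and the only behavior taken from the patch is that
a queue is allocated with bypass_depth = 1 and that the initial
blk_queue_bypass_end() moves from blk_init_allocated_queue() to
blk_register_queue().

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-in for struct request_queue; only bypass state. */
struct toy_queue {
	int bypass_depth;
	bool bypass_flag;	/* models QUEUE_FLAG_BYPASS */
};

/* Allocation: the queue starts its life in bypass mode. */
static void toy_alloc_queue(struct toy_queue *q)
{
	q->bypass_depth = 1;
	q->bypass_flag = true;
}

/* Models the effect of blk_queue_bypass_end(): drop one level. */
static void toy_bypass_end(struct toy_queue *q)
{
	assert(q->bypass_depth > 0);
	if (--q->bypass_depth == 0)
		q->bypass_flag = false;
}

/* Only reached by request-based queues; bio-based ones skip it. */
static void toy_init_allocated_queue(struct toy_queue *q, bool patched)
{
	if (!patched)
		toy_bypass_end(q);	/* old location of the call */
}

/* Reached by every userland-visible queue. */
static void toy_register_queue(struct toy_queue *q, bool patched)
{
	if (patched)
		toy_bypass_end(q);	/* new location of the call */
}

/* Returns true if the queue is still bypassing after registration. */
bool toy_still_bypassing(bool request_based, bool patched)
{
	struct toy_queue q;

	toy_alloc_queue(&q);
	if (request_based)
		toy_init_allocated_queue(&q, patched);
	toy_register_queue(&q, patched);
	return q.bypass_flag;
}
```

In this model, toy_still_bypassing(false, false) is the reported case:
a bio-based queue without the patch never leaves bypass mode, which is
the state blkg_conf_prep() ends up waiting on.  With patched set, both
queue types leave bypass mode at registration.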

 block/blk-core.c  |    7 ++-----
 block/blk-sysfs.c |    6 ++++++
 2 files changed, 8 insertions(+), 5 deletions(-)

--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -608,8 +608,8 @@ struct request_queue *blk_alloc_queue_no
 	/*
 	 * A queue starts its life with bypass turned on to avoid
 	 * unnecessary bypass on/off overhead and nasty surprises during
-	 * init.  The initial bypass will be finished at the end of
-	 * blk_init_allocated_queue().
+	 * init.  The initial bypass will be finished when the queue is
+	 * registered by blk_register_queue().
 	 */
 	q->bypass_depth = 1;
 	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
@@ -714,9 +714,6 @@ blk_init_allocated_queue(struct request_
 		return NULL;
 
 	blk_queue_congestion_threshold(q);
-
-	/* all done, end the initial bypass */
-	blk_queue_bypass_end(q);
 	return q;
 }
 EXPORT_SYMBOL(blk_init_allocated_queue);
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -527,6 +527,12 @@ int blk_register_queue(struct gendisk *d
 	if (WARN_ON(!q))
 		return -ENXIO;
 
+	/*
+	 * Initialization must be complete by now.  Finish the initial
+	 * bypass from queue allocation.
+	 */
+	blk_queue_bypass_end(q);
+
 	ret = blk_trace_init_sysfs(dev);
 	if (ret)
 		return ret;
