public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
* [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
       [not found]                 ` <20120920201815.GB7264@google.com>
@ 2012-09-20 21:08                   ` Tejun Heo
  2012-09-20 21:09                     ` [PATCH 2/2] block: fix request_queue->flags initialization Tejun Heo
                                       ` (2 more replies)
  0 siblings, 3 replies; 7+ messages in thread
From: Tejun Heo @ 2012-09-20 21:08 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Joseph Glanville, cgroups, Vivek Goyal, linux-kernel

b82d4b197c ("blkcg: make request_queue bypassing on allocation") made
request_queues bypassed on allocation to avoid switching on and off
bypass mode on a queue being initialized.  Some drivers allocate and
then destroy a lot of queues without fully initializing them and
incurring bypass latency overhead on each of them could add upto
significant overhead.

Unfortunately, blk_init_allocated_queue() is never used by queues of
bio-based drivers, which means that all bio-based driver queues are in
bypass mode even after initialization and registration complete
successfully.

Due to the limited way request_queues are used by bio drivers, this
problem is hidden pretty well but it shows up when blk-throttle is
used in combination with a bio-based driver.  Trying to configure
(echoing to cgroupfs file) blk-throttle for a bio-based driver hangs
indefinitely in blkg_conf_prep() waiting for bypass mode to end.

This patch moves the initial blk_queue_bypass_end() call from
blk_init_allocated_queue() to blk_register_queue() which is called for
any userland-visible queues regardless of its type.

I believe this is correct because I don't think there is any block
driver which needs or wants working elevator and blk-cgroup on a queue
which isn't visible to userland.  If there are such users, we need a
different solution.

Signed-off-by: Tejun Heo <tj@kernel.org>
Reported-by: Joseph Glanville <joseph.glanville@orionvm.com.au>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@vger.kernel.org
---
Jens, while these are fixes, I think it isn't extremely urgent and
routing these through 3.7-rc1 should be enough.

Thanks.

 block/blk-core.c  |    7 ++-----
 block/blk-sysfs.c |    6 ++++++
 2 files changed, 8 insertions(+), 5 deletions(-)

--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -608,8 +608,8 @@ struct request_queue *blk_alloc_queue_no
 	/*
 	 * A queue starts its life with bypass turned on to avoid
 	 * unnecessary bypass on/off overhead and nasty surprises during
-	 * init.  The initial bypass will be finished at the end of
-	 * blk_init_allocated_queue().
+	 * init.  The initial bypass will be finished when the queue is
+	 * registered by blk_register_queue().
 	 */
 	q->bypass_depth = 1;
 	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
@@ -714,9 +714,6 @@ blk_init_allocated_queue(struct request_
 		return NULL;
 
 	blk_queue_congestion_threshold(q);
-
-	/* all done, end the initial bypass */
-	blk_queue_bypass_end(q);
 	return q;
 }
 EXPORT_SYMBOL(blk_init_allocated_queue);
--- a/block/blk-sysfs.c
+++ b/block/blk-sysfs.c
@@ -527,6 +527,12 @@ int blk_register_queue(struct gendisk *d
 	if (WARN_ON(!q))
 		return -ENXIO;
 
+	/*
+	 * Initialization must be complete by now.  Finish the initial
+	 * bypass from queue allocation.
+	 */
+	blk_queue_bypass_end(q);
+
 	ret = blk_trace_init_sysfs(dev);
 	if (ret)
 		return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH 2/2] block: fix request_queue->flags initialization
  2012-09-20 21:08                   ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Tejun Heo
@ 2012-09-20 21:09                     ` Tejun Heo
  2012-09-21 13:25                       ` Vivek Goyal
  2012-09-21 13:25                     ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Vivek Goyal
  2012-09-21 13:25                     ` Jens Axboe
  2 siblings, 1 reply; 7+ messages in thread
From: Tejun Heo @ 2012-09-20 21:09 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Joseph Glanville, cgroups, Vivek Goyal, linux-kernel

A queue newly allocated with blk_alloc_queue_node() has only
QUEUE_FLAG_BYPASS set.  For request-based drivers,
blk_init_allocated_queue() is called and q->queue_flags is overwritten
with QUEUE_FLAG_DEFAULT which doesn't include BYPASS even though the
initial bypass is still in effect.

In blk_init_allocated_queue(), or QUEUE_FLAG_DEFAULT to q->queue_flags
instead of overwriting.

Signed-off-by: Tejun Heo <tj@kernel.org>
Cc: Vivek Goyal <vgoyal@redhat.com>
Cc: stable@vger.kernel.org
---
 block/blk-core.c |    2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -696,7 +696,7 @@ blk_init_allocated_queue(struct request_
 	q->request_fn		= rfn;
 	q->prep_rq_fn		= NULL;
 	q->unprep_rq_fn		= NULL;
-	q->queue_flags		= QUEUE_FLAG_DEFAULT;
+	q->queue_flags		|= QUEUE_FLAG_DEFAULT;
 
 	/* Override internal queue lock with supplied lock pointer */
 	if (lock)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
  2012-09-20 21:08                   ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Tejun Heo
  2012-09-20 21:09                     ` [PATCH 2/2] block: fix request_queue->flags initialization Tejun Heo
@ 2012-09-21 13:25                     ` Vivek Goyal
  2012-09-21 13:25                     ` Jens Axboe
  2 siblings, 0 replies; 7+ messages in thread
From: Vivek Goyal @ 2012-09-21 13:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, Joseph Glanville, cgroups, linux-kernel

On Thu, Sep 20, 2012 at 02:08:52PM -0700, Tejun Heo wrote:
> b82d4b197c ("blkcg: make request_queue bypassing on allocation") made
> request_queues bypassed on allocation to avoid switching on and off
> bypass mode on a queue being initialized.  Some drivers allocate and
> then destroy a lot of queues without fully initializing them and
> incurring bypass latency overhead on each of them could add upto
> significant overhead.
> 
> Unfortunately, blk_init_allocated_queue() is never used by queues of
> bio-based drivers, which means that all bio-based driver queues are in
> bypass mode even after initialization and registration complete
> successfully.
> 
> Due to the limited way request_queues are used by bio drivers, this
> problem is hidden pretty well but it shows up when blk-throttle is
> used in combination with a bio-based driver.  Trying to configure
> (echoing to cgroupfs file) blk-throttle for a bio-based driver hangs
> indefinitely in blkg_conf_prep() waiting for bypass mode to end.
> 
> This patch moves the initial blk_queue_bypass_end() call from
> blk_init_allocated_queue() to blk_register_queue() which is called for
> any userland-visible queues regardless of its type.
> 
> I believe this is correct because I don't think there is any block
> driver which needs or wants working elevator and blk-cgroup on a queue
> which isn't visible to userland.  If there are such users, we need a
> different solution.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joseph Glanville <joseph.glanville@orionvm.com.au>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: stable@vger.kernel.org
> ---
> Jens, while these are fixes, I think it isn't extremely urgent and
> routing these through 3.7-rc1 should be enough.

Looks good to me. 

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Given the fact that blkcg throttling is broken on all bio based devices
(dm,md), I would think that we need to send these fixes out in 3.6
instead of pushing these out to 3.7.

Thanks
Vivek

> 
> Thanks.
> 
>  block/blk-core.c  |    7 ++-----
>  block/blk-sysfs.c |    6 ++++++
>  2 files changed, 8 insertions(+), 5 deletions(-)
> 
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -608,8 +608,8 @@ struct request_queue *blk_alloc_queue_no
>  	/*
>  	 * A queue starts its life with bypass turned on to avoid
>  	 * unnecessary bypass on/off overhead and nasty surprises during
> -	 * init.  The initial bypass will be finished at the end of
> -	 * blk_init_allocated_queue().
> +	 * init.  The initial bypass will be finished when the queue is
> +	 * registered by blk_register_queue().
>  	 */
>  	q->bypass_depth = 1;
>  	__set_bit(QUEUE_FLAG_BYPASS, &q->queue_flags);
> @@ -714,9 +714,6 @@ blk_init_allocated_queue(struct request_
>  		return NULL;
>  
>  	blk_queue_congestion_threshold(q);
> -
> -	/* all done, end the initial bypass */
> -	blk_queue_bypass_end(q);
>  	return q;
>  }
>  EXPORT_SYMBOL(blk_init_allocated_queue);
> --- a/block/blk-sysfs.c
> +++ b/block/blk-sysfs.c
> @@ -527,6 +527,12 @@ int blk_register_queue(struct gendisk *d
>  	if (WARN_ON(!q))
>  		return -ENXIO;
>  
> +	/*
> +	 * Initialization must be complete by now.  Finish the initial
> +	 * bypass from queue allocation.
> +	 */
> +	blk_queue_bypass_end(q);
> +
>  	ret = blk_trace_init_sysfs(dev);
>  	if (ret)
>  		return ret;

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 2/2] block: fix request_queue->flags initialization
  2012-09-20 21:09                     ` [PATCH 2/2] block: fix request_queue->flags initialization Tejun Heo
@ 2012-09-21 13:25                       ` Vivek Goyal
  0 siblings, 0 replies; 7+ messages in thread
From: Vivek Goyal @ 2012-09-21 13:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Jens Axboe, Joseph Glanville, cgroups, linux-kernel

On Thu, Sep 20, 2012 at 02:09:30PM -0700, Tejun Heo wrote:
> A queue newly allocated with blk_alloc_queue_node() has only
> QUEUE_FLAG_BYPASS set.  For request-based drivers,
> blk_init_allocated_queue() is called and q->queue_flags is overwritten
> with QUEUE_FLAG_DEFAULT which doesn't include BYPASS even though the
> initial bypass is still in effect.
> 
> In blk_init_allocated_queue(), or QUEUE_FLAG_DEFAULT to q->queue_flags
> instead of overwriting.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: stable@vger.kernel.org

Acked-by: Vivek Goyal <vgoyal@redhat.com>

Vivek

> ---
>  block/blk-core.c |    2 +-
>  1 file changed, 1 insertion(+), 1 deletion(-)
> 
> --- a/block/blk-core.c
> +++ b/block/blk-core.c
> @@ -696,7 +696,7 @@ blk_init_allocated_queue(struct request_
>  	q->request_fn		= rfn;
>  	q->prep_rq_fn		= NULL;
>  	q->unprep_rq_fn		= NULL;
> -	q->queue_flags		= QUEUE_FLAG_DEFAULT;
> +	q->queue_flags		|= QUEUE_FLAG_DEFAULT;
>  
>  	/* Override internal queue lock with supplied lock pointer */
>  	if (lock)

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
  2012-09-20 21:08                   ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Tejun Heo
  2012-09-20 21:09                     ` [PATCH 2/2] block: fix request_queue->flags initialization Tejun Heo
  2012-09-21 13:25                     ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Vivek Goyal
@ 2012-09-21 13:25                     ` Jens Axboe
  2012-10-16 10:00                       ` Joseph Glanville
  2 siblings, 1 reply; 7+ messages in thread
From: Jens Axboe @ 2012-09-21 13:25 UTC (permalink / raw)
  To: Tejun Heo; +Cc: Joseph Glanville, cgroups, Vivek Goyal, linux-kernel

On 09/20/2012 11:08 PM, Tejun Heo wrote:
> b82d4b197c ("blkcg: make request_queue bypassing on allocation") made
> request_queues bypassed on allocation to avoid switching on and off
> bypass mode on a queue being initialized.  Some drivers allocate and
> then destroy a lot of queues without fully initializing them and
> incurring bypass latency overhead on each of them could add upto
> significant overhead.
> 
> Unfortunately, blk_init_allocated_queue() is never used by queues of
> bio-based drivers, which means that all bio-based driver queues are in
> bypass mode even after initialization and registration complete
> successfully.
> 
> Due to the limited way request_queues are used by bio drivers, this
> problem is hidden pretty well but it shows up when blk-throttle is
> used in combination with a bio-based driver.  Trying to configure
> (echoing to cgroupfs file) blk-throttle for a bio-based driver hangs
> indefinitely in blkg_conf_prep() waiting for bypass mode to end.
> 
> This patch moves the initial blk_queue_bypass_end() call from
> blk_init_allocated_queue() to blk_register_queue() which is called for
> any userland-visible queues regardless of its type.
> 
> I believe this is correct because I don't think there is any block
> driver which needs or wants working elevator and blk-cgroup on a queue
> which isn't visible to userland.  If there are such users, we need a
> different solution.
> 
> Signed-off-by: Tejun Heo <tj@kernel.org>
> Reported-by: Joseph Glanville <joseph.glanville@orionvm.com.au>
> Cc: Vivek Goyal <vgoyal@redhat.com>
> Cc: stable@vger.kernel.org
> ---
> Jens, while these are fixes, I think it isn't extremely urgent and
> routing these through 3.7-rc1 should be enough.

Agree, I'll shove them into for-3.7/core

-- 
Jens Axboe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
  2012-09-21 13:25                     ` Jens Axboe
@ 2012-10-16 10:00                       ` Joseph Glanville
  2012-10-16 19:11                         ` Tejun Heo
  0 siblings, 1 reply; 7+ messages in thread
From: Joseph Glanville @ 2012-10-16 10:00 UTC (permalink / raw)
  To: Jens Axboe; +Cc: Tejun Heo, cgroups, Vivek Goyal, linux-kernel

On 21 September 2012 23:25, Jens Axboe <axboe@kernel.dk> wrote:
> On 09/20/2012 11:08 PM, Tejun Heo wrote:
>> b82d4b197c ("blkcg: make request_queue bypassing on allocation") made
>> request_queues bypassed on allocation to avoid switching on and off
>> bypass mode on a queue being initialized.  Some drivers allocate and
>> then destroy a lot of queues without fully initializing them and
>> incurring bypass latency overhead on each of them could add upto
>> significant overhead.
>>
>> Unfortunately, blk_init_allocated_queue() is never used by queues of
>> bio-based drivers, which means that all bio-based driver queues are in
>> bypass mode even after initialization and registration complete
>> successfully.
>>
>> Due to the limited way request_queues are used by bio drivers, this
>> problem is hidden pretty well but it shows up when blk-throttle is
>> used in combination with a bio-based driver.  Trying to configure
>> (echoing to cgroupfs file) blk-throttle for a bio-based driver hangs
>> indefinitely in blkg_conf_prep() waiting for bypass mode to end.
>>
>> This patch moves the initial blk_queue_bypass_end() call from
>> blk_init_allocated_queue() to blk_register_queue() which is called for
>> any userland-visible queues regardless of its type.
>>
>> I believe this is correct because I don't think there is any block
>> driver which needs or wants working elevator and blk-cgroup on a queue
>> which isn't visible to userland.  If there are such users, we need a
>> different solution.
>>
>> Signed-off-by: Tejun Heo <tj@kernel.org>
>> Reported-by: Joseph Glanville <joseph.glanville@orionvm.com.au>
>> Cc: Vivek Goyal <vgoyal@redhat.com>
>> Cc: stable@vger.kernel.org
>> ---
>> Jens, while these are fixes, I think it isn't extremely urgent and
>> routing these through 3.7-rc1 should be enough.
>
> Agree, I'll shove them into for-3.7/core
>
> --
> Jens Axboe
>

Hi,

Has this patch been marked for stable?
This is still currently broken on 3.6.2 (and I would assume other
stable kernels since 3.5)

Joseph.

-- 
CTO | Orion Virtualisation Solutions | www.orionvm.com.au
Phone: 1300 56 99 52 | Mobile: 0428 754 846

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue()
  2012-10-16 10:00                       ` Joseph Glanville
@ 2012-10-16 19:11                         ` Tejun Heo
  0 siblings, 0 replies; 7+ messages in thread
From: Tejun Heo @ 2012-10-16 19:11 UTC (permalink / raw)
  To: Joseph Glanville; +Cc: Jens Axboe, cgroups, Vivek Goyal, linux-kernel

On Tue, Oct 16, 2012 at 09:00:30PM +1100, Joseph Glanville wrote:
> Has this patch been marked for stable?

Yes, both have been marked.

Thanks.

-- 
tejun

^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2012-10-16 19:12 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
     [not found] <CAOzFzEhf2LfT0BCNPAgPgxZ3=pj2KvJ4Z6kP7XF8nnxag1dfvg@mail.gmail.com>
     [not found] ` <CAOzFzEiC4313K4H9393ffzNyBo398BPYSxTk7ZEmuH4GfW5qtw@mail.gmail.com>
     [not found]   ` <CAOzFzEhWk_7oQFOyW6Ri_9Cvsshj2s3pa=Oo-p8uL6r340MoTw@mail.gmail.com>
     [not found]     ` <20120919194231.GF31860@redhat.com>
     [not found]       ` <20120920183153.GI28934@google.com>
     [not found]         ` <20120920184219.GH4681@redhat.com>
     [not found]           ` <20120920191716.GI4681@redhat.com>
     [not found]             ` <20120920192038.GJ28934@google.com>
     [not found]               ` <20120920195759.GK4681@redhat.com>
     [not found]                 ` <20120920201815.GB7264@google.com>
2012-09-20 21:08                   ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Tejun Heo
2012-09-20 21:09                     ` [PATCH 2/2] block: fix request_queue->flags initialization Tejun Heo
2012-09-21 13:25                       ` Vivek Goyal
2012-09-21 13:25                     ` [PATCH 1/2] block: lift the initial queue bypass mode on blk_register_queue() instead of blk_init_allocated_queue() Vivek Goyal
2012-09-21 13:25                     ` Jens Axboe
2012-10-16 10:00                       ` Joseph Glanville
2012-10-16 19:11                         ` Tejun Heo

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox