public inbox for linux-kernel@vger.kernel.org
* [PATCH 0/2] block: fix flush regression introduced in 3.1-rc3
@ 2011-10-12 21:22 Jeff Moyer
  2011-10-12 21:22 ` [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request Jeff Moyer
  2011-10-12 21:22 ` [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush Jeff Moyer
  0 siblings, 2 replies; 6+ messages in thread
From: Jeff Moyer @ 2011-10-12 21:22 UTC (permalink / raw)
  To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer

Hi,

This patch series fixes a regression introduced by commit
4853abaae7e4a2af938115ce9071ef8684fb7af4.  I've tested that this solves
the problem when dm-mpath is backed by devices that require a write
cache flush.  I've also ensured that this does not reintroduce the
performance problems that the original fix solved.

Cheers,
Jeff



* [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
  2011-10-12 21:22 [PATCH 0/2] block: fix flush regression introduced in 3.1-rc3 Jeff Moyer
@ 2011-10-12 21:22 ` Jeff Moyer
  2011-10-12 22:17   ` Tejun Heo
  2011-10-12 21:22 ` [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush Jeff Moyer
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff Moyer @ 2011-10-12 21:22 UTC (permalink / raw)
  To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer, Jeff Moyer

A dm-multipath user reported[1] a problem when trying to boot
a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
(block: fix flush machinery for stacking drivers with differring
flush flags) applied.  It turns out that an empty flush request
can be sent into blk_insert_flush.  When the BUG_ON was fixed
to allow for this, I/O on the underlying device would stall.  The
reason is that blk_insert_cloned_request does not kick the queue.
In the aforementioned commit, I had added a special case to
kick the queue if data was sent down but the queue flags did
not require a flush.  A better solution is to push the queue
kick up into blk_insert_cloned_request.

This patch, along with a follow-on which fixes the BUG_ON, fixes
the issue reported.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
 block/blk-core.c  |    2 ++
 block/blk-flush.c |    1 -
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d34433a..795154e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1725,6 +1725,8 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 		where = ELEVATOR_INSERT_FLUSH;
 
 	add_acct_request(q, rq, where);
+	if (where == ELEVATOR_INSERT_FLUSH)
+		__blk_run_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	return 0;
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 491eb30..0ff29c6 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -330,7 +330,6 @@ void blk_insert_flush(struct request *rq)
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
 		list_add_tail(&rq->queuelist, &q->queue_head);
-		blk_run_queue_async(q);
 		return;
 	}
 
-- 
1.7.1

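The stall described in the commit message, and the effect of the added kick, can be illustrated with a toy user-space model. This is only a sketch under stated assumptions: `toy_queue`, `toy_run_queue()`, `toy_insert()` and `toy_insert_cloned()` are hypothetical names, not kernel functions, and the only behavior modeled is that a back insert runs the queue while a flush insert of a request needing no flush sequencing merely appends to the dispatch list.

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical stand-ins for ELEVATOR_INSERT_BACK / ELEVATOR_INSERT_FLUSH. */
enum toy_where { TOY_INSERT_BACK, TOY_INSERT_FLUSH };

struct toy_queue {
	int queued;     /* requests sitting on the dispatch list */
	int dispatched; /* requests handed to the driver */
};

/* Stand-in for __blk_run_queue(): hand everything queued to the driver. */
static void toy_run_queue(struct toy_queue *q)
{
	q->dispatched += q->queued;
	q->queued = 0;
}

/* Stand-in for the elevator insert: only the back insert kicks the queue. */
static void toy_insert(struct toy_queue *q, enum toy_where where)
{
	q->queued++;
	if (where == TOY_INSERT_BACK)
		toy_run_queue(q);
}

/*
 * Toy model of blk_insert_cloned_request(). With kick_on_flush set it
 * behaves like the patched function; without it, like the unpatched one.
 */
static void toy_insert_cloned(struct toy_queue *q, enum toy_where where,
			      bool kick_on_flush)
{
	toy_insert(q, where);
	if (kick_on_flush && where == TOY_INSERT_FLUSH)
		toy_run_queue(q);
}
```

In this model, a flush insert with `kick_on_flush` false leaves the request on the list until something else runs the queue, which corresponds to the stall reported on the underlying device.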


* [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush
  2011-10-12 21:22 [PATCH 0/2] block: fix flush regression introduced in 3.1-rc3 Jeff Moyer
  2011-10-12 21:22 ` [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request Jeff Moyer
@ 2011-10-12 21:22 ` Jeff Moyer
  2011-10-12 22:18   ` Tejun Heo
  1 sibling, 1 reply; 6+ messages in thread
From: Jeff Moyer @ 2011-10-12 21:22 UTC (permalink / raw)
  To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer, Jeff Moyer

A user reported a regression due to commit
4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush
machinery for stacking drivers with differring flush flags).
Part of the problem is that blk_insert_flush required a
single bio be attached to the request.  In reality, having
no attached bio is also a valid case, as can be observed with
an empty flush.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
 block/blk-flush.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 0ff29c6..720ad60 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -320,7 +320,7 @@ void blk_insert_flush(struct request *rq)
 		return;
 	}
 
-	BUG_ON(!rq->bio || rq->bio != rq->biotail);
+	BUG_ON(rq->bio != rq->biotail); /*assumes zero or single bio rq */
 
 	/*
 	 * If there's data but flush is not necessary, the request can be
-- 
1.7.1

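The difference between the old and new BUG_ON conditions can be checked in user space with a small sketch. The struct names mirror the kernel's, but the types here are stand-ins holding only the fields the check reads, and `would_bug_old()` and `would_bug_new()` are hypothetical helpers. It assumes an empty flush request carries `bio == biotail == NULL`.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-in types: only the fields the check looks at. */
struct bio {
	struct bio *bi_next;
};

struct request {
	struct bio *bio;     /* first bio, NULL for an empty flush */
	struct bio *biotail; /* last bio, NULL for an empty flush */
};

/* Old condition: fires when there is no bio, or more than one. */
static bool would_bug_old(const struct request *rq)
{
	return !rq->bio || rq->bio != rq->biotail;
}

/* New condition: fires only when first and last bio differ. */
static bool would_bug_new(const struct request *rq)
{
	return rq->bio != rq->biotail;
}
```

For a request with no bio, `would_bug_old()` returns true and `would_bug_new()` returns false; both return false for a single-bio request and true for a request whose first and last bio differ.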


* Re: [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
  2011-10-12 21:22 ` [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request Jeff Moyer
@ 2011-10-12 22:17   ` Tejun Heo
  2011-10-14 15:05     ` [dm-devel] " Vivek Goyal
  0 siblings, 1 reply; 6+ messages in thread
From: Tejun Heo @ 2011-10-12 22:17 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-kernel, jaxboe, christophe, dm-devel, msnitzer

On Wed, Oct 12, 2011 at 05:22:41PM -0400, Jeff Moyer wrote:
> A dm-multipath user reported[1] a problem when trying to boot
> a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
> (block: fix flush machinery for stacking drivers with differring
> flush flags) applied.  It turns out that an empty flush request
> can be sent into blk_insert_flush.  When the BUG_ON was fixed
> to allow for this, I/O on the underlying device would stall.  The
> reason is that blk_insert_cloned_request does not kick the queue.
> In the aforementioned commit, I had added a special case to
> kick the queue if data was sent down but the queue flags did
> not require a flush.  A better solution is to push the queue
> kick up into blk_insert_cloned_request.
> 
> This patch, along with a follow-on which fixes the BUG_ON, fixes
> the issue reported.
> 
> [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
> 
> Reported-by: Christophe Saout <christophe@saout.de>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thank you for fixing this, but one curiosity, what happens for !flush
cloned requests?  Is someone else responsible for kicking the queue?

-- 
tejun


* Re: [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush
  2011-10-12 21:22 ` [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush Jeff Moyer
@ 2011-10-12 22:18   ` Tejun Heo
  0 siblings, 0 replies; 6+ messages in thread
From: Tejun Heo @ 2011-10-12 22:18 UTC (permalink / raw)
  To: Jeff Moyer; +Cc: linux-kernel, jaxboe, christophe, dm-devel, msnitzer

On Wed, Oct 12, 2011 at 05:22:42PM -0400, Jeff Moyer wrote:
> A user reported a regression due to commit
> 4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush
> machinery for stacking drivers with differring flush flags).
> Part of the problem is that blk_insert_flush required a
> single bio be attached to the request.  In reality, having
> no attached bio is also a valid case, as can be observed with
> an empty flush.
> 
> [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
> 
> Reported-by: Christophe Saout <christophe@saout.de>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Duplicate SOB's and missing space after '/*' in the comment.  Other
than that,

 Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

-- 
tejun


* Re: [dm-devel] [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
  2011-10-12 22:17   ` Tejun Heo
@ 2011-10-14 15:05     ` Vivek Goyal
  0 siblings, 0 replies; 6+ messages in thread
From: Vivek Goyal @ 2011-10-14 15:05 UTC (permalink / raw)
  To: Tejun Heo
  Cc: Jeff Moyer, jaxboe, dm-devel, msnitzer, linux-kernel, christophe

On Wed, Oct 12, 2011 at 03:17:32PM -0700, Tejun Heo wrote:
> On Wed, Oct 12, 2011 at 05:22:41PM -0400, Jeff Moyer wrote:
> > A dm-multipath user reported[1] a problem when trying to boot
> > a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
> > (block: fix flush machinery for stacking drivers with differring
> > flush flags) applied.  It turns out that an empty flush request
> > can be sent into blk_insert_flush.  When the BUG_ON was fixed
> > to allow for this, I/O on the underlying device would stall.  The
> > reason is that blk_insert_cloned_request does not kick the queue.
> > In the aforementioned commit, I had added a special case to
> > kick the queue if data was sent down but the queue flags did
> > not require a flush.  A better solution is to push the queue
> > kick up into blk_insert_cloned_request.
> > 
> > This patch, along with a follow-on which fixes the BUG_ON, fixes
> > the issue reported.
> > 
> > [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
> > 
> > Reported-by: Christophe Saout <christophe@saout.de>
> > Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
> 
> Acked-by: Tejun Heo <tj@kernel.org>
> 
> Thank you for fixing this, but one curiosity, what happens for !flush
> cloned requests?  Is someone else responsible for kicking the queue?

I guess it is working for non-flush requests because
blk_insert_cloned_request() inserts requests at the back (ELEVATOR_INSERT_BACK)
and the elevator code kicks the queue in that case.

        case ELEVATOR_INSERT_BACK:
                rq->cmd_flags |= REQ_SOFTBARRIER;
                elv_drain_elevator(q);
                list_add_tail(&rq->queuelist, &q->queue_head);
                /*
                 * We kick the queue here for the following reasons.
                 * - The elevator might have returned NULL previously
                 *   to delay requests and returned them now.  As the
                 *   queue wasn't empty before this request, ll_rw_blk
                 *   won't run the queue on return, resulting in hang.
                 * - Usually, back inserted requests won't be merged
                 *   with anything.  There's no point in delaying queue
                 *   processing.
                 */
                __blk_run_queue(q);
                break;

So it is really not clear who should kick the queue, and when. An extra
kick does no harm, though, so it looks to me as if blk_insert_cloned_request()
should always kick the queue after inserting any request (either a back
insert or a flush insert, etc.).

According to the comment above, we kick the queue here because the elevator
might have returned NULL in the past despite having a request. If that's the
case, then somebody should have set up a timer to dispatch that request in
time. What happens if the next request does not come for the next 10 seconds?
That request will be sitting there for a long time.

So to me, blk_insert_cloned_request() should not rely on the queue kick being
provided by ELEVATOR_INSERT_BACK. It should always kick the queue after
inserting a request (as Tejun mentioned about blk_insert_request()
kicking the queue).

Thanks
Vivek

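The suggestion in the reply above, an unconditional kick after every insert, can be sketched with a toy user-space model. All names here (`toy_queue`, `toy_run_queue()`, `toy_insert_cloned_always_kick()`) are hypothetical, and the model assumes that running an already-empty queue has no effect beyond the call itself.

```c
#include <assert.h>

struct toy_queue {
	int queued;     /* requests waiting on the dispatch list */
	int dispatched; /* requests handed to the driver */
	int kicks;      /* how many times the queue was run */
};

/* Stand-in for __blk_run_queue(): dispatch whatever is queued. */
static void toy_run_queue(struct toy_queue *q)
{
	q->kicks++;
	q->dispatched += q->queued;
	q->queued = 0;
}

/*
 * insert_kicks says whether the insert path already ran the queue
 * (as a back insert does). The final kick is unconditional.
 */
static void toy_insert_cloned_always_kick(struct toy_queue *q, int insert_kicks)
{
	q->queued++;
	if (insert_kicks)
		toy_run_queue(q);
	toy_run_queue(q);
}
```

In this model the second kick on the back-insert path finds nothing to dispatch, so the cost of always kicking is one redundant queue run, while the path that did not kick on insert no longer depends on a later request to get dispatched.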