* [PATCH 0/2] block: fix flush regression introduced in 3.1-rc3
From: Jeff Moyer @ 2011-10-12 21:22 UTC
To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer

Hi,

This patch series fixes a regression introduced by commit
4853abaae7e4a2af938115ce9071ef8684fb7af4. I've tested that this solves
the problem when dm-mpath is backed by devices that require a write
cache flush. I've also ensured that this does not reintroduce the
performance problems that the original fix solved.

Cheers,
Jeff

* [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
From: Jeff Moyer @ 2011-10-12 21:22 UTC
To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer, Jeff Moyer

A dm-multipath user reported[1] a problem when trying to boot
a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
(block: fix flush machinery for stacking drivers with differring
flush flags) applied. It turns out that an empty flush request
can be sent into blk_insert_flush. When the BUG_ON was fixed
to allow for this, I/O on the underlying device would stall. The
reason is that blk_insert_cloned_request does not kick the queue.
In the aforementioned commit, I had added a special case to
kick the queue if data was sent down but the queue flags did
not require a flush. A better solution is to push the queue
kick up into blk_insert_cloned_request.

This patch, along with a follow-on which fixes the BUG_ON, fixes
the issue reported.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
 block/blk-core.c  |    2 ++
 block/blk-flush.c |    1 -
 2 files changed, 2 insertions(+), 1 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index d34433a..795154e 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1725,6 +1725,8 @@ int blk_insert_cloned_request(struct request_queue *q, struct request *rq)
 		where = ELEVATOR_INSERT_FLUSH;
 
 	add_acct_request(q, rq, where);
+	if (where == ELEVATOR_INSERT_FLUSH)
+		__blk_run_queue(q);
 	spin_unlock_irqrestore(q->queue_lock, flags);
 
 	return 0;
diff --git a/block/blk-flush.c b/block/blk-flush.c
index 491eb30..0ff29c6 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -330,7 +330,6 @@ void blk_insert_flush(struct request *rq)
 	if ((policy & REQ_FSEQ_DATA) &&
 	    !(policy & (REQ_FSEQ_PREFLUSH | REQ_FSEQ_POSTFLUSH))) {
 		list_add_tail(&rq->queuelist, &q->queue_head);
-		blk_run_queue_async(q);
 		return;
 	}
 
-- 
1.7.1

* Re: [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
From: Tejun Heo @ 2011-10-12 22:17 UTC
To: Jeff Moyer; +Cc: linux-kernel, jaxboe, christophe, dm-devel, msnitzer

On Wed, Oct 12, 2011 at 05:22:41PM -0400, Jeff Moyer wrote:
> A dm-multipath user reported[1] a problem when trying to boot
> a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
> (block: fix flush machinery for stacking drivers with differring
> flush flags) applied. It turns out that an empty flush request
> can be sent into blk_insert_flush. When the BUG_ON was fixed
> to allow for this, I/O on the underlying device would stall. The
> reason is that blk_insert_cloned_request does not kick the queue.
> In the aforementioned commit, I had added a special case to
> kick the queue if data was sent down but the queue flags did
> not require a flush. A better solution is to push the queue
> kick up into blk_insert_cloned_request.
>
> This patch, along with a follow-on which fixes the BUG_ON, fixes
> the issue reported.
>
> [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
>
> Reported-by: Christophe Saout <christophe@saout.de>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Acked-by: Tejun Heo <tj@kernel.org>

Thank you for fixing this, but one curiosity, what happens for !flush
cloned requests? Is someone else responsible for kicking the queue?

--
tejun

* Re: [dm-devel] [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request
From: Vivek Goyal @ 2011-10-14 15:05 UTC
To: Tejun Heo
Cc: Jeff Moyer, jaxboe, dm-devel, msnitzer, linux-kernel, christophe

On Wed, Oct 12, 2011 at 03:17:32PM -0700, Tejun Heo wrote:
> On Wed, Oct 12, 2011 at 05:22:41PM -0400, Jeff Moyer wrote:
> > A dm-multipath user reported[1] a problem when trying to boot
> > a kernel with commit 4853abaae7e4a2af938115ce9071ef8684fb7af4
> > (block: fix flush machinery for stacking drivers with differring
> > flush flags) applied. It turns out that an empty flush request
> > can be sent into blk_insert_flush. When the BUG_ON was fixed
> > to allow for this, I/O on the underlying device would stall. The
> > reason is that blk_insert_cloned_request does not kick the queue.
> > In the aforementioned commit, I had added a special case to
> > kick the queue if data was sent down but the queue flags did
> > not require a flush. A better solution is to push the queue
> > kick up into blk_insert_cloned_request.
> >
> > This patch, along with a follow-on which fixes the BUG_ON, fixes
> > the issue reported.
> >
> > [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
> >
> > Reported-by: Christophe Saout <christophe@saout.de>
> > Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
>
> Acked-by: Tejun Heo <tj@kernel.org>
>
> Thank you for fixing this, but one curiosity, what happens for !flush
> cloned requests? Is someone else responsible for kicking the queue?

I guess it is working for non-flush requests because
blk_insert_cloned_request() inserts requests at the back
(ELEVATOR_INSERT_BACK) and the elevator code kicks the queue in that case.

	case ELEVATOR_INSERT_BACK:
		rq->cmd_flags |= REQ_SOFTBARRIER;
		elv_drain_elevator(q);
		list_add_tail(&rq->queuelist, &q->queue_head);
		/*
		 * We kick the queue here for the following reasons.
		 * - The elevator might have returned NULL previously
		 *   to delay requests and returned them now.  As the
		 *   queue wasn't empty before this request, ll_rw_blk
		 *   won't run the queue on return, resulting in hang.
		 * - Usually, back inserted requests won't be merged
		 *   with anything.  There's no point in delaying queue
		 *   processing.
		 */
		__blk_run_queue(q);
		break;

So it is really not clear who should kick the queue and when. An extra
kick won't do any harm, though, so to me it looks like
blk_insert_cloned_request() should always kick the queue after inserting
any request (either a back insert or a flush insert, etc.).

According to the above comment, we kick the queue here because the
elevator might have returned NULL in the past despite having a request.
If that's the case, then somebody should have set up a timer to dispatch
that request in time. What happens if the next request does not come for
the next 10 seconds? This request will be sitting there for a long time.

So to me, blk_insert_cloned_request() should not rely on the queue kick
being provided by ELEVATOR_INSERT_BACK. It should always kick the queue
after inserting a request (as tejun mentioned about blk_insert_request()
kicking the queue).

Thanks
Vivek

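The difference between relying on the elevator's implicit kick and always kicking after insertion can be sketched with a small user-space model. This is not kernel code: `toy_queue`, `toy_elv_insert`, `toy_run_queue` and both insert helpers are hypothetical names invented for illustration, and the model assumes, as a simplification of what the thread describes, that only the back-insert path runs the queue on its own.

```c
#include <assert.h>

/* Toy stand-ins for the kernel types; every name here is hypothetical. */
enum toy_where { TOY_INSERT_BACK, TOY_INSERT_FLUSH };

struct toy_queue {
	int pending;    /* requests inserted but not yet dispatched */
	int dispatched; /* requests handed to the "driver" */
};

/* Models a queue kick: dispatch everything that is currently pending. */
static void toy_run_queue(struct toy_queue *q)
{
	assert(q->pending >= 0);
	q->dispatched += q->pending;
	q->pending = 0;
}

/*
 * Models the elevator insert. Assumption: only the back-insert case
 * kicks the queue by itself; the flush case leaves the request pending.
 */
static void toy_elv_insert(struct toy_queue *q, enum toy_where where)
{
	q->pending++;
	if (where == TOY_INSERT_BACK)
		toy_run_queue(q);
}

/* Variant that relies on whatever kick the elevator happens to provide. */
static void toy_insert_implicit_kick(struct toy_queue *q, enum toy_where where)
{
	toy_elv_insert(q, where);
}

/* Variant that always kicks after inserting, as suggested in the reply. */
static void toy_insert_always_kick(struct toy_queue *q, enum toy_where where)
{
	toy_elv_insert(q, where);
	toy_run_queue(q); /* redundant for back inserts, but harmless */
}
```

In this model a flush insert through `toy_insert_implicit_kick` stays pending until some later event runs the queue, while `toy_insert_always_kick` dispatches it immediately; for back inserts the second kick simply finds nothing left to do.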
* [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush
From: Jeff Moyer @ 2011-10-12 21:22 UTC
To: linux-kernel, jaxboe; +Cc: tj, christophe, dm-devel, msnitzer, Jeff Moyer

A user reported a regression due to commit
4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush
machinery for stacking drivers with differring flush flags).
Part of the problem is that blk_insert_flush required a
single bio be attached to the request. In reality, having
no attached bio is also a valid case, as can be observed with
an empty flush.

[1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html

Reported-by: Christophe Saout <christophe@saout.de>
Signed-off-by: Jeff Moyer <jmoyer@redhat.com
Signed-off-by: Jeff Moyer <jmoyer@redhat.com>
---
 block/blk-flush.c |    2 +-
 1 files changed, 1 insertions(+), 1 deletions(-)

diff --git a/block/blk-flush.c b/block/blk-flush.c
index 0ff29c6..720ad60 100644
--- a/block/blk-flush.c
+++ b/block/blk-flush.c
@@ -320,7 +320,7 @@ void blk_insert_flush(struct request *rq)
 		return;
 	}
 
-	BUG_ON(!rq->bio || rq->bio != rq->biotail);
+	BUG_ON(rq->bio != rq->biotail); /*assumes zero or single bio rq */
 
 	/*
 	 * If there's data but flush is not necessary, the request can be
-- 
1.7.1

* Re: [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush
From: Tejun Heo @ 2011-10-12 22:18 UTC
To: Jeff Moyer; +Cc: linux-kernel, jaxboe, christophe, dm-devel, msnitzer

On Wed, Oct 12, 2011 at 05:22:42PM -0400, Jeff Moyer wrote:
> A user reported a regression due to commit
> 4853abaae7e4a2af938115ce9071ef8684fb7af4 (block: fix flush
> machinery for stacking drivers with differring flush flags).
> Part of the problem is that blk_insert_flush required a
> single bio be attached to the request. In reality, having
> no attached bio is also a valid case, as can be observed with
> an empty flush.
>
> [1] http://www.redhat.com/archives/dm-devel/2011-September/msg00154.html
>
> Reported-by: Christophe Saout <christophe@saout.de>
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com
> Signed-off-by: Jeff Moyer <jmoyer@redhat.com>

Duplicate SOB's and missing space after '/*' in the comment. Other
than that,

Acked-by: Tejun Heo <tj@kernel.org>

Thanks.

--
tejun

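The effect of the BUG_ON change in patch 2/2 can be checked in isolation with a minimal user-space sketch. The `toy_bio` and `toy_request` structs and the two predicate helpers below are hypothetical stand-ins, not the kernel's definitions; only the two conditions themselves are copied from the diff.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical miniature versions of the bio chain and the request. */
struct toy_bio {
	struct toy_bio *next;
};

struct toy_request {
	struct toy_bio *bio;     /* first bio in the chain, or NULL */
	struct toy_bio *biotail; /* last bio in the chain, or NULL */
};

/* Condition from the removed line: trips on zero bios or on more than one. */
static bool old_check_fires(const struct toy_request *rq)
{
	return !rq->bio || rq->bio != rq->biotail;
}

/* Condition from the added line: trips only when head and tail differ. */
static bool new_check_fires(const struct toy_request *rq)
{
	return rq->bio != rq->biotail;
}
```

For an empty flush both pointers are NULL, so the old condition fires while the new one does not; single-bio requests pass both checks and multi-bio requests fail both, which matches the "zero or single bio" wording of the new comment.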
Thread overview: 6+ messages (newest: 2011-10-14 15:05 UTC)
2011-10-12 21:22 [PATCH 0/2] block: fix flush regression introduced in 3.1-rc3 Jeff Moyer
2011-10-12 21:22 ` [PATCH 1/2] blk-flush: move the queue kick into blk_insert_cloned_request Jeff Moyer
2011-10-12 22:17   ` Tejun Heo
2011-10-14 15:05     ` [dm-devel] " Vivek Goyal
2011-10-12 21:22 ` [PATCH 2/2] blk-flush: fix invalid BUG_ON in blk_insert_flush Jeff Moyer
2011-10-12 22:18   ` Tejun Heo