* [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
@ 2008-11-28 4:32 Tejun Heo
2008-11-28 4:32 ` [PATCH 1/6] block: reorganize QUEUE_ORDERED_* constants Tejun Heo
` (6 more replies)
0 siblings, 7 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel
Hello, Jens.
As it currently stands, empty barriers are handled a little
differently from other types of barriers.  The implementation wraps
around the normal barrier machinery but intercepts the BAR issue and
completes it in elv_next_request() without letting the low level
driver see it.  This works but is a bit hacky.
The barrier (ordered sequence) implementation is already quite
flexible in accommodating different modes; it only needs small
adjustments to support empty barriers the same way the other barrier
modes are handled.
This patchset makes start_ordered() and ordered sequence completion
more flexible such that the BAR step itself can be made optional, and
reimplements empty barrier by simply turning off the DO_BAR and
DO_POSTFLUSH flags in start_ordered().
This patchset also fixes empty barrier on write-through (or no cache)
w/ ordered tag.  Previously, such a barrier basically devolved into a
noop, and without any command to issue, ordering by tag wasn't
enforced at all.  This patchset promotes such empty barriers to
DRAIN.
This patchset contains the following six patches.
0001-block-reorganize-QUEUE_ORDERED_-constants.patch
0002-block-remove-duplicate-or-unused-barrier-discard-er.patch
0003-block-make-every-barrier-action-optional.patch
0004-block-make-barrier-completion-more-robust.patch
0005-block-simplify-empty-barrier-implementation.patch
0006-block-fix-empty-barrier-on-write-through-w-ordered.patch
0001-0002 are preps for the following changes.  0003-0004 make the
ordered sequence implementation more flexible.  0005 simplifies the
empty barrier implementation and 0006 fixes the noop bug.
This patchset is on top of
linux-2.6-block#for-2.6.29 (75f2e95a9641224c7077d5e677f2a5fe2fdd4f0e)
+ block-internal-dequeue-shouldnt-start-timer [1]
and contains the following changes.
block/blk-barrier.c | 120 +++++++++++++++++++++++++++++++------------------
block/blk-core.c | 44 ++++-------------
block/elevator.c | 18 ++-----
include/linux/blkdev.h | 47 +++++++++++--------
4 files changed, 122 insertions(+), 107 deletions(-)
Thanks.
--
tejun
[1] http://article.gmane.org/gmane.linux.kernel/763054
^ permalink raw reply [flat|nested] 13+ messages in thread
* [PATCH 1/6] block: reorganize QUEUE_ORDERED_* constants
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-11-28 4:32 ` [PATCH 2/6] block: remove duplicate or unused barrier/discard error paths Tejun Heo
` (5 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
Separate out the ordering types (drain, tag) and action masks
(preflush, postflush, fua) from the visible ordering mode selectors
(QUEUE_ORDERED_*).  Ordering types are now named QUEUE_ORDERED_BY_*
while action masks are named QUEUE_ORDERED_DO_*.
This change is necessary to add QUEUE_ORDERED_DO_BAR and make it
optional to improve empty barrier implementation.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 20 ++++++++++----------
include/linux/blkdev.h | 39 +++++++++++++++++++++++----------------
2 files changed, 33 insertions(+), 26 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 6e72d66..1d7adc7 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -24,8 +24,8 @@
int blk_queue_ordered(struct request_queue *q, unsigned ordered,
prepare_flush_fn *prepare_flush_fn)
{
- if (ordered & (QUEUE_ORDERED_PREFLUSH | QUEUE_ORDERED_POSTFLUSH) &&
- prepare_flush_fn == NULL) {
+ if (!prepare_flush_fn && (ordered & (QUEUE_ORDERED_DO_PREFLUSH |
+ QUEUE_ORDERED_DO_POSTFLUSH))) {
printk(KERN_ERR "%s: prepare_flush_fn required\n", __func__);
return -EINVAL;
}
@@ -134,7 +134,7 @@ static void queue_flush(struct request_queue *q, unsigned which)
struct request *rq;
rq_end_io_fn *end_io;
- if (which == QUEUE_ORDERED_PREFLUSH) {
+ if (which == QUEUE_ORDERED_DO_PREFLUSH) {
rq = &q->pre_flush_rq;
end_io = pre_flush_end_io;
} else {
@@ -167,7 +167,7 @@ static inline struct request *start_ordered(struct request_queue *q,
blk_rq_init(q, rq);
if (bio_data_dir(q->orig_bar_rq->bio) == WRITE)
rq->cmd_flags |= REQ_RW;
- if (q->ordered & QUEUE_ORDERED_FUA)
+ if (q->ordered & QUEUE_ORDERED_DO_FUA)
rq->cmd_flags |= REQ_FUA;
init_request_from_bio(rq, q->orig_bar_rq->bio);
rq->end_io = bar_end_io;
@@ -181,20 +181,20 @@ static inline struct request *start_ordered(struct request_queue *q,
* there will be no data written between the pre and post flush.
* Hence a single flush will suffice.
*/
- if ((q->ordered & QUEUE_ORDERED_POSTFLUSH) && !blk_empty_barrier(rq))
- queue_flush(q, QUEUE_ORDERED_POSTFLUSH);
+ if ((q->ordered & QUEUE_ORDERED_DO_POSTFLUSH) && !blk_empty_barrier(rq))
+ queue_flush(q, QUEUE_ORDERED_DO_POSTFLUSH);
else
q->ordseq |= QUEUE_ORDSEQ_POSTFLUSH;
elv_insert(q, rq, ELEVATOR_INSERT_FRONT);
- if (q->ordered & QUEUE_ORDERED_PREFLUSH) {
- queue_flush(q, QUEUE_ORDERED_PREFLUSH);
+ if (q->ordered & QUEUE_ORDERED_DO_PREFLUSH) {
+ queue_flush(q, QUEUE_ORDERED_DO_PREFLUSH);
rq = &q->pre_flush_rq;
} else
q->ordseq |= QUEUE_ORDSEQ_PREFLUSH;
- if ((q->ordered & QUEUE_ORDERED_TAG) || q->in_flight == 0)
+ if ((q->ordered & QUEUE_ORDERED_BY_TAG) || q->in_flight == 0)
q->ordseq |= QUEUE_ORDSEQ_DRAIN;
else
rq = NULL;
@@ -237,7 +237,7 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
rq != &q->pre_flush_rq && rq != &q->post_flush_rq)
return 1;
- if (q->ordered & QUEUE_ORDERED_TAG) {
+ if (q->ordered & QUEUE_ORDERED_BY_TAG) {
/* Ordered by tag. Blocking the next barrier is enough. */
if (is_barrier && rq != &q->bar_rq)
*rqp = NULL;
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index bedeea2..7f9cd08 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -523,22 +523,29 @@ enum {
* TAG_FLUSH : ordering by tag w/ pre and post flushes
* TAG_FUA : ordering by tag w/ pre flush and FUA write
*/
- QUEUE_ORDERED_NONE = 0x00,
- QUEUE_ORDERED_DRAIN = 0x01,
- QUEUE_ORDERED_TAG = 0x02,
-
- QUEUE_ORDERED_PREFLUSH = 0x10,
- QUEUE_ORDERED_POSTFLUSH = 0x20,
- QUEUE_ORDERED_FUA = 0x40,
-
- QUEUE_ORDERED_DRAIN_FLUSH = QUEUE_ORDERED_DRAIN |
- QUEUE_ORDERED_PREFLUSH | QUEUE_ORDERED_POSTFLUSH,
- QUEUE_ORDERED_DRAIN_FUA = QUEUE_ORDERED_DRAIN |
- QUEUE_ORDERED_PREFLUSH | QUEUE_ORDERED_FUA,
- QUEUE_ORDERED_TAG_FLUSH = QUEUE_ORDERED_TAG |
- QUEUE_ORDERED_PREFLUSH | QUEUE_ORDERED_POSTFLUSH,
- QUEUE_ORDERED_TAG_FUA = QUEUE_ORDERED_TAG |
- QUEUE_ORDERED_PREFLUSH | QUEUE_ORDERED_FUA,
+ QUEUE_ORDERED_BY_DRAIN = 0x01,
+ QUEUE_ORDERED_BY_TAG = 0x02,
+ QUEUE_ORDERED_DO_PREFLUSH = 0x10,
+ QUEUE_ORDERED_DO_POSTFLUSH = 0x40,
+ QUEUE_ORDERED_DO_FUA = 0x80,
+
+ QUEUE_ORDERED_NONE = 0x00,
+
+ QUEUE_ORDERED_DRAIN = QUEUE_ORDERED_BY_DRAIN,
+ QUEUE_ORDERED_DRAIN_FLUSH = QUEUE_ORDERED_DRAIN |
+ QUEUE_ORDERED_DO_PREFLUSH |
+ QUEUE_ORDERED_DO_POSTFLUSH,
+ QUEUE_ORDERED_DRAIN_FUA = QUEUE_ORDERED_DRAIN |
+ QUEUE_ORDERED_DO_PREFLUSH |
+ QUEUE_ORDERED_DO_FUA,
+
+ QUEUE_ORDERED_TAG = QUEUE_ORDERED_BY_TAG,
+ QUEUE_ORDERED_TAG_FLUSH = QUEUE_ORDERED_TAG |
+ QUEUE_ORDERED_DO_PREFLUSH |
+ QUEUE_ORDERED_DO_POSTFLUSH,
+ QUEUE_ORDERED_TAG_FUA = QUEUE_ORDERED_TAG |
+ QUEUE_ORDERED_DO_PREFLUSH |
+ QUEUE_ORDERED_DO_FUA,
/*
* Ordered operation sequence
--
1.5.6
* [PATCH 2/6] block: remove duplicate or unused barrier/discard error paths
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
2008-11-28 4:32 ` [PATCH 1/6] block: reorganize QUEUE_ORDERED_* constants Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-11-28 4:32 ` [PATCH 3/6] block: make every barrier action optional Tejun Heo
` (4 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
* Because the barrier mode can be changed dynamically, whether a
barrier is supported or not can be determined only when actually
issuing the barrier, and there is no point in checking it earlier.
Drop the barrier support checks in generic_make_request() and
__make_request(), and update the comment around the support check
in blk_do_ordered().
* There is no reason to check discard support in both
generic_make_request() and __make_request().  Drop the check in
__make_request().  While at it, move the error action block to the
end of the function and add unlikely() to the q existence test.
* A barrier request, be it empty or not, is never passed to the low
level driver, so it's meaningless to try to copy back req->sector
to bio->bi_sector on error.  In addition, the notion of a failed
sector doesn't make any sense for an empty barrier to begin with.
Drop the code block from __end_that_request_first().
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 4 ++--
block/blk-core.c | 44 +++++++++++---------------------------------
2 files changed, 13 insertions(+), 35 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 1d7adc7..43d479a 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -216,8 +216,8 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
return 1;
} else {
/*
- * This can happen when the queue switches to
- * ORDERED_NONE while this request is on it.
+ * Queue ordering not supported. Terminate
+ * with prejudice.
*/
elv_dequeue_request(q, rq);
if (__blk_end_request(rq, -EOPNOTSUPP,
diff --git a/block/blk-core.c b/block/blk-core.c
index d472c4b..43ed62b 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1131,7 +1131,7 @@ void init_request_from_bio(struct request *req, struct bio *bio)
static int __make_request(struct request_queue *q, struct bio *bio)
{
struct request *req;
- int el_ret, nr_sectors, barrier, discard, err;
+ int el_ret, nr_sectors;
const unsigned short prio = bio_prio(bio);
const int sync = bio_sync(bio);
int rw_flags;
@@ -1145,22 +1145,9 @@ static int __make_request(struct request_queue *q, struct bio *bio)
*/
blk_queue_bounce(q, &bio);
- barrier = bio_barrier(bio);
- if (unlikely(barrier) && bio_has_data(bio) &&
- (q->next_ordered == QUEUE_ORDERED_NONE)) {
- err = -EOPNOTSUPP;
- goto end_io;
- }
-
- discard = bio_discard(bio);
- if (unlikely(discard) && !q->prepare_discard_fn) {
- err = -EOPNOTSUPP;
- goto end_io;
- }
-
spin_lock_irq(q->queue_lock);
- if (unlikely(barrier) || elv_queue_empty(q))
+ if (unlikely(bio_barrier(bio)) || elv_queue_empty(q))
goto get_rq;
el_ret = elv_merge(q, &req, bio);
@@ -1254,10 +1241,6 @@ out:
__generic_unplug_device(q);
spin_unlock_irq(q->queue_lock);
return 0;
-
-end_io:
- bio_endio(bio, err);
- return 0;
}
/*
@@ -1410,15 +1393,13 @@ static inline void __generic_make_request(struct bio *bio)
char b[BDEVNAME_SIZE];
q = bdev_get_queue(bio->bi_bdev);
- if (!q) {
+ if (unlikely(!q)) {
printk(KERN_ERR
"generic_make_request: Trying to access "
"nonexistent block-device %s (%Lu)\n",
bdevname(bio->bi_bdev, b),
(long long) bio->bi_sector);
-end_io:
- bio_endio(bio, err);
- break;
+ goto end_io;
}
if (unlikely(nr_sectors > q->max_hw_sectors)) {
@@ -1455,14 +1436,19 @@ end_io:
if (bio_check_eod(bio, nr_sectors))
goto end_io;
- if ((bio_empty_barrier(bio) && !q->prepare_flush_fn) ||
- (bio_discard(bio) && !q->prepare_discard_fn)) {
+
+ if (bio_discard(bio) && !q->prepare_discard_fn) {
err = -EOPNOTSUPP;
goto end_io;
}
ret = q->make_request_fn(q, bio);
} while (ret);
+
+ return;
+
+end_io:
+ bio_endio(bio, err);
}
/*
@@ -1712,14 +1698,6 @@ static int __end_that_request_first(struct request *req, int error,
while ((bio = req->bio) != NULL) {
int nbytes;
- /*
- * For an empty barrier request, the low level driver must
- * store a potential error location in ->sector. We pass
- * that back up in ->bi_sector.
- */
- if (blk_empty_barrier(req))
- bio->bi_sector = req->sector;
-
if (nr_bytes >= bio->bi_size) {
req->bio = bio->bi_next;
nbytes = bio->bi_size;
--
1.5.6
* [PATCH 3/6] block: make every barrier action optional
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
2008-11-28 4:32 ` [PATCH 1/6] block: reorganize QUEUE_ORDERED_* constants Tejun Heo
2008-11-28 4:32 ` [PATCH 2/6] block: remove duplicate or unused barrier/discard error paths Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-11-28 4:32 ` [PATCH 4/6] block: make barrier completion more robust Tejun Heo
` (3 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
In all barrier sequences, the barrier write itself was always assumed
to be issued and thus didn't have a corresponding control flag.  This
patch adds QUEUE_ORDERED_DO_BAR and unifies action mask handling in
start_ordered() such that any barrier action can be skipped.
This patch doesn't introduce any visible behavior changes.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 41 ++++++++++++++++++++++++-----------------
include/linux/blkdev.h | 7 +++++--
2 files changed, 29 insertions(+), 19 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 43d479a..1efabf8 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -158,19 +158,10 @@ static inline struct request *start_ordered(struct request_queue *q,
q->ordered = q->next_ordered;
q->ordseq |= QUEUE_ORDSEQ_STARTED;
- /*
- * Prep proxy barrier request.
- */
+ /* stash away the original request */
elv_dequeue_request(q, rq);
q->orig_bar_rq = rq;
- rq = &q->bar_rq;
- blk_rq_init(q, rq);
- if (bio_data_dir(q->orig_bar_rq->bio) == WRITE)
- rq->cmd_flags |= REQ_RW;
- if (q->ordered & QUEUE_ORDERED_DO_FUA)
- rq->cmd_flags |= REQ_FUA;
- init_request_from_bio(rq, q->orig_bar_rq->bio);
- rq->end_io = bar_end_io;
+ rq = NULL;
/*
* Queue ordered sequence. As we stack them at the head, we
@@ -181,12 +172,28 @@ static inline struct request *start_ordered(struct request_queue *q,
* there will be no data written between the pre and post flush.
* Hence a single flush will suffice.
*/
- if ((q->ordered & QUEUE_ORDERED_DO_POSTFLUSH) && !blk_empty_barrier(rq))
+ if ((q->ordered & QUEUE_ORDERED_DO_POSTFLUSH) &&
+ !blk_empty_barrier(q->orig_bar_rq)) {
queue_flush(q, QUEUE_ORDERED_DO_POSTFLUSH);
- else
+ rq = &q->post_flush_rq;
+ } else
q->ordseq |= QUEUE_ORDSEQ_POSTFLUSH;
- elv_insert(q, rq, ELEVATOR_INSERT_FRONT);
+ if (q->ordered & QUEUE_ORDERED_DO_BAR) {
+ rq = &q->bar_rq;
+
+ /* initialize proxy request and queue it */
+ blk_rq_init(q, rq);
+ if (bio_data_dir(q->orig_bar_rq->bio) == WRITE)
+ rq->cmd_flags |= REQ_RW;
+ if (q->ordered & QUEUE_ORDERED_DO_FUA)
+ rq->cmd_flags |= REQ_FUA;
+ init_request_from_bio(rq, q->orig_bar_rq->bio);
+ rq->end_io = bar_end_io;
+
+ elv_insert(q, rq, ELEVATOR_INSERT_FRONT);
+ } else
+ q->ordseq |= QUEUE_ORDSEQ_BAR;
if (q->ordered & QUEUE_ORDERED_DO_PREFLUSH) {
queue_flush(q, QUEUE_ORDERED_DO_PREFLUSH);
@@ -194,10 +201,10 @@ static inline struct request *start_ordered(struct request_queue *q,
} else
q->ordseq |= QUEUE_ORDSEQ_PREFLUSH;
- if ((q->ordered & QUEUE_ORDERED_BY_TAG) || q->in_flight == 0)
- q->ordseq |= QUEUE_ORDSEQ_DRAIN;
- else
+ if ((q->ordered & QUEUE_ORDERED_BY_DRAIN) && q->in_flight)
rq = NULL;
+ else
+ q->ordseq |= QUEUE_ORDSEQ_DRAIN;
return rq;
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 7f9cd08..076620d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -526,12 +526,14 @@ enum {
QUEUE_ORDERED_BY_DRAIN = 0x01,
QUEUE_ORDERED_BY_TAG = 0x02,
QUEUE_ORDERED_DO_PREFLUSH = 0x10,
+ QUEUE_ORDERED_DO_BAR = 0x20,
QUEUE_ORDERED_DO_POSTFLUSH = 0x40,
QUEUE_ORDERED_DO_FUA = 0x80,
QUEUE_ORDERED_NONE = 0x00,
- QUEUE_ORDERED_DRAIN = QUEUE_ORDERED_BY_DRAIN,
+ QUEUE_ORDERED_DRAIN = QUEUE_ORDERED_BY_DRAIN |
+ QUEUE_ORDERED_DO_BAR,
QUEUE_ORDERED_DRAIN_FLUSH = QUEUE_ORDERED_DRAIN |
QUEUE_ORDERED_DO_PREFLUSH |
QUEUE_ORDERED_DO_POSTFLUSH,
@@ -539,7 +541,8 @@ enum {
QUEUE_ORDERED_DO_PREFLUSH |
QUEUE_ORDERED_DO_FUA,
- QUEUE_ORDERED_TAG = QUEUE_ORDERED_BY_TAG,
+ QUEUE_ORDERED_TAG = QUEUE_ORDERED_BY_TAG |
+ QUEUE_ORDERED_DO_BAR,
QUEUE_ORDERED_TAG_FLUSH = QUEUE_ORDERED_TAG |
QUEUE_ORDERED_DO_PREFLUSH |
QUEUE_ORDERED_DO_POSTFLUSH,
--
1.5.6
* [PATCH 4/6] block: make barrier completion more robust
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
` (2 preceding siblings ...)
2008-11-28 4:32 ` [PATCH 3/6] block: make every barrier action optional Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-11-28 4:32 ` [PATCH 5/6] block: simplify empty barrier implementation Tejun Heo
` (2 subsequent siblings)
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
Barrier completion was built on the following assumptions.
* start_ordered() never finishes the whole sequence by itself.  If
all actions were to be skipped, q->ordseq would be set correctly
but the actual completion would never be triggered, hanging the
barrier request.
* Drain completion in elv_completed_request() assumed that there is
always at least one request in the queue when the drain completes.
Both assumptions currently hold, but they need to be removed to
improve the empty barrier implementation.  This patch makes the
following changes.
* Make start_ordered() use blk_ordered_complete_seq() to mark skipped
steps complete and notify __elv_next_request() that it should fetch
the next request if the whole barrier has completed inside
start_ordered().
* Make the drain completion path in elv_completed_request() check
whether the queue is empty.  An empty queue also indicates drain
completion.
* While at it, convert 0/1 return from blk_do_ordered() to false/true.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 45 +++++++++++++++++++++++++++------------------
block/elevator.c | 10 +++++++---
include/linux/blkdev.h | 4 ++--
3 files changed, 36 insertions(+), 23 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index 1efabf8..b03d880 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -88,7 +88,7 @@ unsigned blk_ordered_req_seq(struct request *rq)
return QUEUE_ORDSEQ_DONE;
}
-void blk_ordered_complete_seq(struct request_queue *q, unsigned seq, int error)
+bool blk_ordered_complete_seq(struct request_queue *q, unsigned seq, int error)
{
struct request *rq;
@@ -99,7 +99,7 @@ void blk_ordered_complete_seq(struct request_queue *q, unsigned seq, int error)
q->ordseq |= seq;
if (blk_ordered_cur_seq(q) != QUEUE_ORDSEQ_DONE)
- return;
+ return false;
/*
* Okay, sequence complete.
@@ -109,6 +109,8 @@ void blk_ordered_complete_seq(struct request_queue *q, unsigned seq, int error)
if (__blk_end_request(rq, q->orderr, blk_rq_bytes(rq)))
BUG();
+
+ return true;
}
static void pre_flush_end_io(struct request *rq, int error)
@@ -151,9 +153,11 @@ static void queue_flush(struct request_queue *q, unsigned which)
elv_insert(q, rq, ELEVATOR_INSERT_FRONT);
}
-static inline struct request *start_ordered(struct request_queue *q,
- struct request *rq)
+static inline bool start_ordered(struct request_queue *q, struct request **rqp)
{
+ struct request *rq = *rqp;
+ unsigned skip = 0;
+
q->orderr = 0;
q->ordered = q->next_ordered;
q->ordseq |= QUEUE_ORDSEQ_STARTED;
@@ -177,7 +181,7 @@ static inline struct request *start_ordered(struct request_queue *q,
queue_flush(q, QUEUE_ORDERED_DO_POSTFLUSH);
rq = &q->post_flush_rq;
} else
- q->ordseq |= QUEUE_ORDSEQ_POSTFLUSH;
+ skip |= QUEUE_ORDSEQ_POSTFLUSH;
if (q->ordered & QUEUE_ORDERED_DO_BAR) {
rq = &q->bar_rq;
@@ -193,35 +197,40 @@ static inline struct request *start_ordered(struct request_queue *q,
elv_insert(q, rq, ELEVATOR_INSERT_FRONT);
} else
- q->ordseq |= QUEUE_ORDSEQ_BAR;
+ skip |= QUEUE_ORDSEQ_BAR;
if (q->ordered & QUEUE_ORDERED_DO_PREFLUSH) {
queue_flush(q, QUEUE_ORDERED_DO_PREFLUSH);
rq = &q->pre_flush_rq;
} else
- q->ordseq |= QUEUE_ORDSEQ_PREFLUSH;
+ skip |= QUEUE_ORDSEQ_PREFLUSH;
if ((q->ordered & QUEUE_ORDERED_BY_DRAIN) && q->in_flight)
rq = NULL;
else
- q->ordseq |= QUEUE_ORDSEQ_DRAIN;
+ skip |= QUEUE_ORDSEQ_DRAIN;
+
+ *rqp = rq;
- return rq;
+ /*
+ * Complete skipped sequences. If whole sequence is complete,
+ * return false to tell elevator that this request is gone.
+ */
+ return !blk_ordered_complete_seq(q, skip, 0);
}
-int blk_do_ordered(struct request_queue *q, struct request **rqp)
+bool blk_do_ordered(struct request_queue *q, struct request **rqp)
{
struct request *rq = *rqp;
const int is_barrier = blk_fs_request(rq) && blk_barrier_rq(rq);
if (!q->ordseq) {
if (!is_barrier)
- return 1;
+ return true;
- if (q->next_ordered != QUEUE_ORDERED_NONE) {
- *rqp = start_ordered(q, rq);
- return 1;
- } else {
+ if (q->next_ordered != QUEUE_ORDERED_NONE)
+ return start_ordered(q, rqp);
+ else {
/*
* Queue ordering not supported. Terminate
* with prejudice.
@@ -231,7 +240,7 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
blk_rq_bytes(rq)))
BUG();
*rqp = NULL;
- return 0;
+ return false;
}
}
@@ -242,7 +251,7 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
/* Special requests are not subject to ordering rules. */
if (!blk_fs_request(rq) &&
rq != &q->pre_flush_rq && rq != &q->post_flush_rq)
- return 1;
+ return true;
if (q->ordered & QUEUE_ORDERED_BY_TAG) {
/* Ordered by tag. Blocking the next barrier is enough. */
@@ -255,7 +264,7 @@ int blk_do_ordered(struct request_queue *q, struct request **rqp)
*rqp = NULL;
}
- return 1;
+ return true;
}
static void bio_end_empty_barrier(struct bio *bio, int err)
diff --git a/block/elevator.c b/block/elevator.c
index a6951f7..3b45d26 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -938,10 +938,14 @@ void elv_completed_request(struct request_queue *q, struct request *rq)
* drained for flush sequence.
*/
if (unlikely(q->ordseq)) {
- struct request *first_rq = list_entry_rq(q->queue_head.next);
- if (q->in_flight == 0 &&
+ struct request *next = NULL;
+
+ if (!list_empty(&q->queue_head))
+ next = list_entry_rq(q->queue_head.next);
+
+ if (!q->in_flight &&
blk_ordered_cur_seq(q) == QUEUE_ORDSEQ_DRAIN &&
- blk_ordered_req_seq(first_rq) > QUEUE_ORDSEQ_DRAIN) {
+ (!next || blk_ordered_req_seq(next) > QUEUE_ORDSEQ_DRAIN)) {
blk_ordered_complete_seq(q, QUEUE_ORDSEQ_DRAIN, 0);
blk_start_queueing(q);
}
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 076620d..0234726 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -865,10 +865,10 @@ extern void blk_queue_rq_timed_out(struct request_queue *, rq_timed_out_fn *);
extern void blk_queue_rq_timeout(struct request_queue *, unsigned int);
extern struct backing_dev_info *blk_get_backing_dev_info(struct block_device *bdev);
extern int blk_queue_ordered(struct request_queue *, unsigned, prepare_flush_fn *);
-extern int blk_do_ordered(struct request_queue *, struct request **);
+extern bool blk_do_ordered(struct request_queue *, struct request **);
extern unsigned blk_ordered_cur_seq(struct request_queue *);
extern unsigned blk_ordered_req_seq(struct request *);
-extern void blk_ordered_complete_seq(struct request_queue *, unsigned, int);
+extern bool blk_ordered_complete_seq(struct request_queue *, unsigned, int);
extern int blk_rq_map_sg(struct request_queue *, struct request *, struct scatterlist *);
extern void blk_dump_rq_flags(struct request *, char *);
--
1.5.6
* [PATCH 5/6] block: simplify empty barrier implementation
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
` (3 preceding siblings ...)
2008-11-28 4:32 ` [PATCH 4/6] block: make barrier completion more robust Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-11-28 4:32 ` [PATCH 6/6] block: fix empty barrier on write-through w/ ordered tag Tejun Heo
2008-12-12 3:23 ` [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
Empty barrier required special handling in elv_next_request() to
complete it without letting the low level driver see it.
With the previous changes, the barrier code is now flexible enough to
skip the BAR step using the same barrier sequence selection mechanism.
Drop the special handling and instead mask the unnecessary actions
off q->ordered in start_ordered().
Also remove the blk_empty_barrier() test, which now has no users.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 16 ++++++++++------
block/elevator.c | 8 --------
include/linux/blkdev.h | 1 -
3 files changed, 10 insertions(+), 15 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index b03d880..c63044e 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -162,6 +162,14 @@ static inline bool start_ordered(struct request_queue *q, struct request **rqp)
q->ordered = q->next_ordered;
q->ordseq |= QUEUE_ORDSEQ_STARTED;
+ /*
+ * For an empty barrier, there's no actual BAR request, which
+ * in turn makes POSTFLUSH unnecessary. Mask them off.
+ */
+ if (!rq->hard_nr_sectors)
+ q->ordered &= ~(QUEUE_ORDERED_DO_BAR |
+ QUEUE_ORDERED_DO_POSTFLUSH);
+
/* stash away the original request */
elv_dequeue_request(q, rq);
q->orig_bar_rq = rq;
@@ -171,13 +179,9 @@ static inline bool start_ordered(struct request_queue *q, struct request **rqp)
* Queue ordered sequence. As we stack them at the head, we
* need to queue in reverse order. Note that we rely on that
* no fs request uses ELEVATOR_INSERT_FRONT and thus no fs
- * request gets inbetween ordered sequence. If this request is
- * an empty barrier, we don't need to do a postflush ever since
- * there will be no data written between the pre and post flush.
- * Hence a single flush will suffice.
+ * request gets inbetween ordered sequence.
*/
- if ((q->ordered & QUEUE_ORDERED_DO_POSTFLUSH) &&
- !blk_empty_barrier(q->orig_bar_rq)) {
+ if (q->ordered & QUEUE_ORDERED_DO_POSTFLUSH) {
queue_flush(q, QUEUE_ORDERED_DO_POSTFLUSH);
rq = &q->post_flush_rq;
} else
diff --git a/block/elevator.c b/block/elevator.c
index 3b45d26..5562782 100644
--- a/block/elevator.c
+++ b/block/elevator.c
@@ -749,14 +749,6 @@ struct request *elv_next_request(struct request_queue *q)
int ret;
while ((rq = __elv_next_request(q)) != NULL) {
- /*
- * Kill the empty barrier place holder, the driver must
- * not ever see it.
- */
- if (blk_empty_barrier(rq)) {
- __blk_end_request(rq, 0, blk_rq_bytes(rq));
- continue;
- }
if (!(rq->cmd_flags & REQ_STARTED)) {
/*
* This is the first time the device driver
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 0234726..10616b2 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -596,7 +596,6 @@ enum {
#define blk_fua_rq(rq) ((rq)->cmd_flags & REQ_FUA)
#define blk_discard_rq(rq) ((rq)->cmd_flags & REQ_DISCARD)
#define blk_bidi_rq(rq) ((rq)->next_rq != NULL)
-#define blk_empty_barrier(rq) (blk_barrier_rq(rq) && blk_fs_request(rq) && !(rq)->hard_nr_sectors)
/* rq->queuelist of dequeued request must be list_empty() */
#define blk_queued_rq(rq) (!list_empty(&(rq)->queuelist))
--
1.5.6
* [PATCH 6/6] block: fix empty barrier on write-through w/ ordered tag
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
` (4 preceding siblings ...)
2008-11-28 4:32 ` [PATCH 5/6] block: simplify empty barrier implementation Tejun Heo
@ 2008-11-28 4:32 ` Tejun Heo
2008-12-12 3:23 ` [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
6 siblings, 0 replies; 13+ messages in thread
From: Tejun Heo @ 2008-11-28 4:32 UTC (permalink / raw)
To: jens.axboe, linux-kernel; +Cc: Tejun Heo
An empty barrier on a write-through (or no cache) device w/ ordered
tag has no command to execute, and without any command to issue the
ordered tag never reaches the device, so ordering is never achieved.
Force draining for such cases.
Signed-off-by: Tejun Heo <tj@kernel.org>
---
block/blk-barrier.c | 14 +++++++++++++-
1 files changed, 13 insertions(+), 1 deletions(-)
diff --git a/block/blk-barrier.c b/block/blk-barrier.c
index c63044e..8eba4e4 100644
--- a/block/blk-barrier.c
+++ b/block/blk-barrier.c
@@ -166,9 +166,21 @@ static inline bool start_ordered(struct request_queue *q, struct request **rqp)
* For an empty barrier, there's no actual BAR request, which
* in turn makes POSTFLUSH unnecessary. Mask them off.
*/
- if (!rq->hard_nr_sectors)
+ if (!rq->hard_nr_sectors) {
q->ordered &= ~(QUEUE_ORDERED_DO_BAR |
QUEUE_ORDERED_DO_POSTFLUSH);
+ /*
+ * Empty barrier on a write-through device w/ ordered
+ * tag has no command to issue and without any command
+ * to issue, ordering by tag can't be used. Drain
+ * instead.
+ */
+ if ((q->ordered & QUEUE_ORDERED_BY_TAG) &&
+ !(q->ordered & QUEUE_ORDERED_DO_PREFLUSH)) {
+ q->ordered &= ~QUEUE_ORDERED_BY_TAG;
+ q->ordered |= QUEUE_ORDERED_BY_DRAIN;
+ }
+ }
/* stash away the original request */
elv_dequeue_request(q, rq);
--
1.5.6
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-11-28 4:32 [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
` (5 preceding siblings ...)
2008-11-28 4:32 ` [PATCH 6/6] block: fix empty barrier on write-through w/ ordered tag Tejun Heo
@ 2008-12-12 3:23 ` Tejun Heo
2008-12-12 6:34 ` Jens Axboe
6 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2008-12-12 3:23 UTC (permalink / raw)
To: jens.axboe, linux-kernel
Jens, did you have time to review this?
Thanks.
--
tejun
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-12-12 3:23 ` [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier Tejun Heo
@ 2008-12-12 6:34 ` Jens Axboe
2008-12-12 7:19 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2008-12-12 6:34 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-kernel
On Fri, Dec 12 2008, Tejun Heo wrote:
> Jens, did you have time to review this?
Sorry yes, I thought I had sent a reply on this already, weird. I
committed it last week, and it's tested fine here.
--
Jens Axboe
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-12-12 6:34 ` Jens Axboe
@ 2008-12-12 7:19 ` Tejun Heo
2008-12-12 11:36 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2008-12-12 7:19 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe wrote:
> On Fri, Dec 12 2008, Tejun Heo wrote:
>> Jens, did you have time to review this?
>
> Sorry yes, I thought I had sent a reply on this already, weird. I
> committed it last week, and it's tested fine here.
Ah... thanks. Next patchset brewing then. :-)
--
tejun
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-12-12 7:19 ` Tejun Heo
@ 2008-12-12 11:36 ` Jens Axboe
2008-12-12 13:26 ` Tejun Heo
0 siblings, 1 reply; 13+ messages in thread
From: Jens Axboe @ 2008-12-12 11:36 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-kernel
On Fri, Dec 12 2008, Tejun Heo wrote:
> Jens Axboe wrote:
> > On Fri, Dec 12 2008, Tejun Heo wrote:
> >> Jens, did you have time to review this?
> >
> > Sorry yes, I thought I had sent a reply on this already, weird. I
> > committed it last week, and it's tested fine here.
>
> Ah... thanks. Next patchset brewing then. :-)
Excellent. Are you going forward with the explicit peek/dequeue
interface for elv_next_request()? If not, then I'll put some time into
that as I think it needs doing soon. But if you are, I'll happily do
something else instead.
--
Jens Axboe
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-12-12 11:36 ` Jens Axboe
@ 2008-12-12 13:26 ` Tejun Heo
2008-12-12 13:28 ` Jens Axboe
0 siblings, 1 reply; 13+ messages in thread
From: Tejun Heo @ 2008-12-12 13:26 UTC (permalink / raw)
To: Jens Axboe; +Cc: linux-kernel
Jens Axboe wrote:
> On Fri, Dec 12 2008, Tejun Heo wrote:
>> Jens Axboe wrote:
>>> On Fri, Dec 12 2008, Tejun Heo wrote:
>>>> Jens, did you have time to review this?
>>> Sorry yes, I thought I had sent a reply on this already, weird. I
>>> committed it last week, and it's tested fine here.
>> Ah... thanks. Next patchset brewing then. :-)
>
> Excellent. Are you going forward with the explicit peek/dequeue
> interface for elv_next_request()? If not, then I'll put some time into
> that as I think it needs doing soon. But if you are, I'll happily do
> something else instead.
I'm actually working on that.  As you opposed the original proposal,
I'm changing the approach, so it would be great if we didn't collide
on the same area.
Thanks. :-)
--
tejun
* Re: [PATCHSET block#for-2.6.29] block: simplify and fix empty barrier
2008-12-12 13:26 ` Tejun Heo
@ 2008-12-12 13:28 ` Jens Axboe
0 siblings, 0 replies; 13+ messages in thread
From: Jens Axboe @ 2008-12-12 13:28 UTC (permalink / raw)
To: Tejun Heo; +Cc: linux-kernel
On Fri, Dec 12 2008, Tejun Heo wrote:
> Jens Axboe wrote:
> > On Fri, Dec 12 2008, Tejun Heo wrote:
> >> Jens Axboe wrote:
> >>> On Fri, Dec 12 2008, Tejun Heo wrote:
> >>>> Jens, did you have time to review this?
> >>> Sorry yes, I thought I had sent a reply on this already, weird. I
> >>> committed it last week, and it's tested fine here.
> >> Ah... thanks. Next patchset brewing then. :-)
> >
> > Excellent. Are you going forward with the explicit peek/dequeue
> > interface for elv_next_request()? If not, then I'll put some time into
> > that as I think it needs doing soon. But if you are, I'll happily do
> > something else instead.
>
> I'm actually working on that. As you opposed to the original
> proposal, I'm changing the approach, so it would be great if we don't
> collide on the same area.
Ah nice, sounds promising!  I won't step on your toes, and it'd be
silly to waste time working on the same bits, so don't worry about
that.
--
Jens Axboe