* [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices
@ 2023-04-24 20:33 Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 1/9] block: Simplify blk_req_needs_zone_write_lock() Bart Van Assche
                   ` (8 more replies)
  0 siblings, 9 replies; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe; +Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche

Hi Jens,

This patch series improves support for zoned block devices in the mq-deadline
scheduler as follows:
* The order of requeued writes (REQ_OP_WRITE*) is preserved.
* The active zone limit is enforced.

Please consider this patch series for the next merge window.

Thanks,

Bart.

Changes compared to v2:
- In the patch that micro-optimizes blk_req_needs_zone_write_lock(), inline
  bdev_op_is_zoned_write() instead of modifying it.
- In patch "block: Introduce blk_rq_is_seq_zoned_write()", converted "case
  REQ_OP_ZONE_APPEND" into a source code comment.
- Reworked deadline_skip_seq_writes() as suggested by Christoph.
- Dropped the patch that disabled head insertion for zoned writes.
- Dropped patch "mq-deadline: Fix a race condition related to zoned writes".
- Reworked handling of requeued requests: the 'next_rq' pointer has been
  removed and instead the position of the most recently dispatched request is
  tracked.
- Dropped the patches for tracking zone capacity and for restricting the number
  of active zones.

Changes compared to v1:
- Left out the patches related to request insertion and requeuing since
  Christoph is busy with reworking these patches.
- Added a patch for enforcing the active zone limit.

Bart Van Assche (9):
  block: Simplify blk_req_needs_zone_write_lock()
  block: Micro-optimize blk_req_needs_zone_write_lock()
  block: Introduce blk_rq_is_seq_zoned_write()
  block: mq-deadline: Clean up deadline_check_fifo()
  block: mq-deadline: Simplify deadline_skip_seq_writes()
  block: mq-deadline: Improve deadline_skip_seq_writes()
  block: mq-deadline: Track the dispatch position
  block: mq-deadline: Handle requeued requests correctly
  block: mq-deadline: Fix handling of at-head zoned writes

 block/blk-mq.h         |   4 +-
 block/blk-zoned.c      |  28 ++++++----
 block/mq-deadline.c    | 114 +++++++++++++++++++++++++++--------------
 include/linux/blk-mq.h |   6 +++
 include/linux/blkdev.h |   9 ----
 5 files changed, 102 insertions(+), 59 deletions(-)



* [PATCH v3 1/9] block: Simplify blk_req_needs_zone_write_lock()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock() Bart Van Assche
                   ` (7 subsequent siblings)
  8 siblings, 0 replies; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Remove the blk_rq_is_passthrough() check because it is redundant:
blk_req_needs_zone_write_lock() also calls bdev_op_is_zoned_write()
and the latter function returns false for pass-through requests.
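
For context, a copy of the helper this relies on (from
include/linux/blkdev.h at this series' baseline), with a clarifying
comment added; pass-through requests carry REQ_OP_DRV_IN or
REQ_OP_DRV_OUT, so the opcode check below already rejects them:

	static inline bool bdev_op_is_zoned_write(struct block_device *bdev,
						  blk_opf_t op)
	{
		if (!bdev_is_zoned(bdev))
			return false;

		/* Pass-through opcodes (REQ_OP_DRV_IN/OUT) match neither op. */
		return op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES;
	}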

Reviewed-by: Christoph Hellwig <hch@lst.de>
Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-zoned.c | 3 ---
 1 file changed, 3 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index fce9082384d6..835d9e937d4d 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -57,9 +57,6 @@ EXPORT_SYMBOL_GPL(blk_zone_cond_str);
  */
 bool blk_req_needs_zone_write_lock(struct request *rq)
 {
-	if (blk_rq_is_passthrough(rq))
-		return false;
-
 	if (!rq->q->disk->seq_zones_wlock)
 		return false;
 


* [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 1/9] block: Simplify blk_req_needs_zone_write_lock() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:44   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write() Bart Van Assche
                   ` (6 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Micro-optimize blk_req_needs_zone_write_lock() by inlining
bdev_op_is_zoned_write(): this reduces the number of pointer
dereferences needed to test whether the request queue is associated
with a zoned device.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq.h         |  5 +++--
 block/blk-zoned.c      | 11 ++++-------
 include/linux/blkdev.h |  9 ---------
 3 files changed, 7 insertions(+), 18 deletions(-)

diff --git a/block/blk-mq.h b/block/blk-mq.h
index e876584d3516..6bb1281a61f2 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -367,8 +367,9 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
 static inline struct blk_plug *blk_mq_plug(struct bio *bio)
 {
 	/* Zoned block device write operation case: do not plug the BIO */
-	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
-	    bdev_op_is_zoned_write(bio->bi_bdev, bio_op(bio)))
+	if ((bio_op(bio) == REQ_OP_WRITE ||
+	     bio_op(bio) == REQ_OP_WRITE_ZEROES) &&
+	    disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector))
 		return NULL;
 
 	/*
diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 835d9e937d4d..4640b75ae66f 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -57,13 +57,10 @@ EXPORT_SYMBOL_GPL(blk_zone_cond_str);
  */
 bool blk_req_needs_zone_write_lock(struct request *rq)
 {
-	if (!rq->q->disk->seq_zones_wlock)
-		return false;
-
-	if (bdev_op_is_zoned_write(rq->q->disk->part0, req_op(rq)))
-		return blk_rq_zone_is_seq(rq);
-
-	return false;
+	return rq->q->disk->seq_zones_wlock &&
+		(req_op(rq) == REQ_OP_WRITE ||
+		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
+		blk_rq_zone_is_seq(rq);
 }
 EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index e3242e67a8e3..b700c935e230 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -1284,15 +1284,6 @@ static inline unsigned int bdev_zone_no(struct block_device *bdev, sector_t sec)
 	return disk_zone_no(bdev->bd_disk, sec);
 }
 
-static inline bool bdev_op_is_zoned_write(struct block_device *bdev,
-					  blk_opf_t op)
-{
-	if (!bdev_is_zoned(bdev))
-		return false;
-
-	return op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES;
-}
-
 static inline sector_t bdev_zone_sectors(struct block_device *bdev)
 {
 	struct request_queue *q = bdev_get_queue(bdev);


* [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 1/9] block: Simplify blk_req_needs_zone_write_lock() Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:45   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 4/9] block: mq-deadline: Clean up deadline_check_fifo() Bart Van Assche
                   ` (5 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Introduce the function blk_rq_is_seq_zoned_write(). This function will
be used in later patches to preserve the order of zoned writes that
require write serialization.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-zoned.c      | 22 +++++++++++++++++++---
 include/linux/blk-mq.h |  6 ++++++
 2 files changed, 25 insertions(+), 3 deletions(-)

diff --git a/block/blk-zoned.c b/block/blk-zoned.c
index 4640b75ae66f..78c39fc505bc 100644
--- a/block/blk-zoned.c
+++ b/block/blk-zoned.c
@@ -52,15 +52,31 @@ const char *blk_zone_cond_str(enum blk_zone_cond zone_cond)
 }
 EXPORT_SYMBOL_GPL(blk_zone_cond_str);
 
+/**
+ * blk_rq_is_seq_zoned_write() - Check if @rq requires write serialization.
+ * @rq: Request to examine.
+ *
+ * Note: REQ_OP_ZONE_APPEND requests do not require serialization.
+ */
+bool blk_rq_is_seq_zoned_write(struct request *rq)
+{
+	switch (req_op(rq)) {
+	case REQ_OP_WRITE:
+	case REQ_OP_WRITE_ZEROES:
+		return blk_rq_zone_is_seq(rq);
+	default:
+		return false;
+	}
+}
+EXPORT_SYMBOL_GPL(blk_rq_is_seq_zoned_write);
+
 /*
  * Return true if a request is a write requests that needs zone write locking.
  */
 bool blk_req_needs_zone_write_lock(struct request *rq)
 {
 	return rq->q->disk->seq_zones_wlock &&
-		(req_op(rq) == REQ_OP_WRITE ||
-		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
-		blk_rq_zone_is_seq(rq);
+		blk_rq_is_seq_zoned_write(rq);
 }
 EXPORT_SYMBOL_GPL(blk_req_needs_zone_write_lock);
 
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 06caacd77ed6..e498b85bc470 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -1164,6 +1164,7 @@ static inline unsigned int blk_rq_zone_is_seq(struct request *rq)
 	return disk_zone_is_seq(rq->q->disk, blk_rq_pos(rq));
 }
 
+bool blk_rq_is_seq_zoned_write(struct request *rq);
 bool blk_req_needs_zone_write_lock(struct request *rq);
 bool blk_req_zone_write_trylock(struct request *rq);
 void __blk_req_zone_write_lock(struct request *rq);
@@ -1194,6 +1195,11 @@ static inline bool blk_req_can_dispatch_to_zone(struct request *rq)
 	return !blk_req_zone_is_write_locked(rq);
 }
 #else /* CONFIG_BLK_DEV_ZONED */
+static inline bool blk_rq_is_seq_zoned_write(struct request *rq)
+{
+	return false;
+}
+
 static inline bool blk_req_needs_zone_write_lock(struct request *rq)
 {
 	return false;


* [PATCH v3 4/9] block: mq-deadline: Clean up deadline_check_fifo()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (2 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:46   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 5/9] block: mq-deadline: Simplify deadline_skip_seq_writes() Bart Van Assche
                   ` (4 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Change the return type of deadline_check_fifo() from 'int' into 'bool'.
Use time_is_before_eq_jiffies() instead of time_after_eq(). No
functionality has been changed.
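
For reference, the two forms are equivalent because
time_is_before_eq_jiffies() is defined in include/linux/jiffies.h as:

	#define time_is_before_eq_jiffies(a) time_after_eq(jiffies, a)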

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 16 +++++-----------
 1 file changed, 5 insertions(+), 11 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 5839a027e0f0..a016cafa54b3 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -272,21 +272,15 @@ static u32 dd_queued(struct deadline_data *dd, enum dd_prio prio)
 }
 
 /*
- * deadline_check_fifo returns 0 if there are no expired requests on the fifo,
- * 1 otherwise. Requires !list_empty(&dd->fifo_list[data_dir])
+ * deadline_check_fifo returns true if and only if there are expired requests
+ * in the FIFO list. Requires !list_empty(&dd->fifo_list[data_dir]).
  */
-static inline int deadline_check_fifo(struct dd_per_prio *per_prio,
-				      enum dd_data_dir data_dir)
+static inline bool deadline_check_fifo(struct dd_per_prio *per_prio,
+				       enum dd_data_dir data_dir)
 {
 	struct request *rq = rq_entry_fifo(per_prio->fifo_list[data_dir].next);
 
-	/*
-	 * rq is expired!
-	 */
-	if (time_after_eq(jiffies, (unsigned long)rq->fifo_time))
-		return 1;
-
-	return 0;
+	return time_is_before_eq_jiffies((unsigned long)rq->fifo_time);
 }
 
 /*


* [PATCH v3 5/9] block: mq-deadline: Simplify deadline_skip_seq_writes()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (3 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 4/9] block: mq-deadline: Clean up deadline_check_fifo() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:47   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 6/9] block: mq-deadline: Improve deadline_skip_seq_writes() Bart Van Assche
                   ` (3 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Make the deadline_skip_seq_writes() code shorter without changing its
functionality.

Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 9 +++------
 1 file changed, 3 insertions(+), 6 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index a016cafa54b3..6276afede9cd 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -304,14 +304,11 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 						struct request *rq)
 {
 	sector_t pos = blk_rq_pos(rq);
-	sector_t skipped_sectors = 0;
 
-	while (rq) {
-		if (blk_rq_pos(rq) != pos + skipped_sectors)
-			break;
-		skipped_sectors += blk_rq_sectors(rq);
+	do {
+		pos += blk_rq_sectors(rq);
 		rq = deadline_latter_request(rq);
-	}
+	} while (rq && blk_rq_pos(rq) == pos);
 
 	return rq;
 }


* [PATCH v3 6/9] block: mq-deadline: Improve deadline_skip_seq_writes()
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (4 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 5/9] block: mq-deadline: Simplify deadline_skip_seq_writes() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:48   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 7/9] block: mq-deadline: Track the dispatch position Bart Van Assche
                   ` (2 subsequent siblings)
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Make deadline_skip_seq_writes() do what its name suggests, namely to
skip sequential writes.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 6276afede9cd..dbc0feca963e 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -308,7 +308,7 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
 	do {
 		pos += blk_rq_sectors(rq);
 		rq = deadline_latter_request(rq);
-	} while (rq && blk_rq_pos(rq) == pos);
+	} while (rq && blk_rq_pos(rq) == pos && blk_rq_is_seq_zoned_write(rq));
 
 	return rq;
 }


* [PATCH v3 7/9] block: mq-deadline: Track the dispatch position
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (5 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 6/9] block: mq-deadline: Improve deadline_skip_seq_writes() Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:50   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly Bart Van Assche
  2023-04-24 20:33 ` [PATCH v3 9/9] block: mq-deadline: Fix handling of at-head zoned writes Bart Van Assche
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Track the position (sector_t) of the most recently dispatched request
instead of tracking a pointer to the next request to dispatch. This
patch simplifies the code for inserting and removing a request from the
red-black tree.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 45 +++++++++++++++++++++++++++++++--------------
 1 file changed, 31 insertions(+), 14 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index dbc0feca963e..bf1dfe9fe9c9 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -74,8 +74,8 @@ struct dd_per_prio {
 	struct list_head dispatch;
 	struct rb_root sort_list[DD_DIR_COUNT];
 	struct list_head fifo_list[DD_DIR_COUNT];
-	/* Next request in FIFO order. Read, write or both are NULL. */
-	struct request *next_rq[DD_DIR_COUNT];
+	/* Position of the most recently dispatched request. */
+	sector_t latest_pos[DD_DIR_COUNT];
 	struct io_stats_per_prio stats;
 };
 
@@ -156,6 +156,25 @@ deadline_latter_request(struct request *rq)
 	return NULL;
 }
 
+/* Return the first request for which blk_rq_pos() >= pos. */
+static inline struct request *deadline_from_pos(struct dd_per_prio *per_prio,
+				enum dd_data_dir data_dir, sector_t pos)
+{
+	struct rb_node *node = per_prio->sort_list[data_dir].rb_node;
+	struct request *rq, *res = NULL;
+
+	while (node) {
+		rq = rb_entry_rq(node);
+		if (blk_rq_pos(rq) >= pos) {
+			res = rq;
+			node = node->rb_left;
+		} else {
+			node = node->rb_right;
+		}
+	}
+	return res;
+}
+
 static void
 deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
 {
@@ -167,11 +186,6 @@ deadline_add_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
 static inline void
 deadline_del_rq_rb(struct dd_per_prio *per_prio, struct request *rq)
 {
-	const enum dd_data_dir data_dir = rq_data_dir(rq);
-
-	if (per_prio->next_rq[data_dir] == rq)
-		per_prio->next_rq[data_dir] = deadline_latter_request(rq);
-
 	elv_rb_del(deadline_rb_root(per_prio, rq), rq);
 }
 
@@ -251,10 +265,6 @@ static void
 deadline_move_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		      struct request *rq)
 {
-	const enum dd_data_dir data_dir = rq_data_dir(rq);
-
-	per_prio->next_rq[data_dir] = deadline_latter_request(rq);
-
 	/*
 	 * take it off the sort and fifo list
 	 */
@@ -363,7 +373,8 @@ deadline_next_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 	struct request *rq;
 	unsigned long flags;
 
-	rq = per_prio->next_rq[data_dir];
+	rq = deadline_from_pos(per_prio, data_dir,
+			       per_prio->latest_pos[data_dir]);
 	if (!rq)
 		return NULL;
 
@@ -426,6 +437,7 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 		if (started_after(dd, rq, latest_start))
 			return NULL;
 		list_del_init(&rq->queuelist);
+		data_dir = rq_data_dir(rq);
 		goto done;
 	}
 
@@ -433,9 +445,11 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 	 * batches are currently reads XOR writes
 	 */
 	rq = deadline_next_request(dd, per_prio, dd->last_dir);
-	if (rq && dd->batching < dd->fifo_batch)
+	if (rq && dd->batching < dd->fifo_batch) {
 		/* we have a next request are still entitled to batch */
+		data_dir = rq_data_dir(rq);
 		goto dispatch_request;
+	}
 
 	/*
 	 * at this point we are not running a batch. select the appropriate
@@ -513,6 +527,7 @@ static struct request *__dd_dispatch_request(struct deadline_data *dd,
 done:
 	ioprio_class = dd_rq_ioclass(rq);
 	prio = ioprio_class_to_prio[ioprio_class];
+	dd->per_prio[prio].latest_pos[data_dir] = blk_rq_pos(rq);
 	dd->per_prio[prio].stats.dispatched++;
 	/*
 	 * If the request needs its target zone locked, do it.
@@ -1026,8 +1041,10 @@ static int deadline_##name##_next_rq_show(void *data,			\
 	struct request_queue *q = data;					\
 	struct deadline_data *dd = q->elevator->elevator_data;		\
 	struct dd_per_prio *per_prio = &dd->per_prio[prio];		\
-	struct request *rq = per_prio->next_rq[data_dir];		\
+	struct request *rq;						\
 									\
+	rq = deadline_from_pos(per_prio, data_dir,			\
+			       per_prio->latest_pos[data_dir]);		\
 	if (rq)								\
 		__blk_mq_debugfs_rq_show(m, rq);			\
 	return 0;							\


* [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (6 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 7/9] block: mq-deadline: Track the dispatch position Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:54   ` Christoph Hellwig
  2023-04-24 20:33 ` [PATCH v3 9/9] block: mq-deadline: Fix handling of at-head zoned writes Bart Van Assche
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Start dispatching from the start of a zone instead of from the starting
position of the most recently dispatched request.

If a zoned write is requeued with an LBA that is lower than already
inserted zoned writes, make sure that it is submitted first.
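
A hypothetical scenario (sector numbers made up for illustration) for
the FIFO insertion hunk below:

	/*
	 * Zone size 8 sectors; zone 0 covers sectors 0-7.
	 * FIFO (oldest first): W2 @ sector 4, W3 @ sector 6.
	 * W1 @ sector 0 is requeued and reinserted: it is first added to
	 * the RB tree, deadline_latter_request(W1) finds W2 in the same
	 * zone, so W1 is inserted before W2 in the FIFO list. The FIFO
	 * becomes W1, W2, W3 and the zone is again written in LBA order.
	 */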

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 25 +++++++++++++++++++++++--
 1 file changed, 23 insertions(+), 2 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index bf1dfe9fe9c9..2de6c63190d8 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -156,13 +156,23 @@ deadline_latter_request(struct request *rq)
 	return NULL;
 }
 
-/* Return the first request for which blk_rq_pos() >= pos. */
+/*
+ * Return the first request for which blk_rq_pos() >= @pos. For zoned devices,
+ * return the first request after the highest zone start <= @pos.
+ */
 static inline struct request *deadline_from_pos(struct dd_per_prio *per_prio,
 				enum dd_data_dir data_dir, sector_t pos)
 {
 	struct rb_node *node = per_prio->sort_list[data_dir].rb_node;
 	struct request *rq, *res = NULL;
 
+	if (!node)
+		return NULL;
+
+	rq = rb_entry_rq(node);
+	if (blk_rq_is_seq_zoned_write(rq))
+		pos -= bdev_offset_from_zone_start(rq->q->disk->part0, pos);
+
 	while (node) {
 		rq = rb_entry_rq(node);
 		if (blk_rq_pos(rq) >= pos) {
@@ -809,6 +819,8 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		list_add(&rq->queuelist, &per_prio->dispatch);
 		rq->fifo_time = jiffies;
 	} else {
+		struct list_head *insert_before;
+
 		deadline_add_rq_rb(per_prio, rq);
 
 		if (rq_mergeable(rq)) {
@@ -821,7 +833,16 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
 		 * set expire time and add to fifo list
 		 */
 		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
-		list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
+		insert_before = &per_prio->fifo_list[data_dir];
+#ifdef CONFIG_BLK_DEV_ZONED
+		if (blk_rq_is_seq_zoned_write(rq)) {
+			struct request *rq2 = deadline_latter_request(rq);
+
+			if (rq2 && blk_rq_zone_no(rq2) == blk_rq_zone_no(rq))
+				insert_before = &rq2->queuelist;
+		}
+#endif
+		list_add_tail(&rq->queuelist, insert_before);
 	}
 }
 


* [PATCH v3 9/9] block: mq-deadline: Fix handling of at-head zoned writes
  2023-04-24 20:33 [PATCH v3 0/9] mq-deadline: Improve support for zoned block devices Bart Van Assche
                   ` (7 preceding siblings ...)
  2023-04-24 20:33 ` [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly Bart Van Assche
@ 2023-04-24 20:33 ` Bart Van Assche
  2023-04-28  5:58   ` Christoph Hellwig
  8 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-24 20:33 UTC (permalink / raw)
  To: Jens Axboe
  Cc: linux-block, Jaegeuk Kim, Christoph Hellwig, Bart Van Assche,
	Damien Le Moal, Ming Lei

Before dispatching a zoned write from the FIFO list, check whether there
are any zoned writes with a lower LBA for the same zone in the RB-tree.
This patch ensures that zoned writes happen in order even if at_head is
set for some writes to a zone and not for others.

Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
Cc: Christoph Hellwig <hch@lst.de>
Cc: Ming Lei <ming.lei@redhat.com>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/mq-deadline.c | 18 ++++++++++++------
 1 file changed, 12 insertions(+), 6 deletions(-)

diff --git a/block/mq-deadline.c b/block/mq-deadline.c
index 2de6c63190d8..ed0da04c51ec 100644
--- a/block/mq-deadline.c
+++ b/block/mq-deadline.c
@@ -341,7 +341,7 @@ static struct request *
 deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 		      enum dd_data_dir data_dir)
 {
-	struct request *rq;
+	struct request *rq, *rb_rq, *next;
 	unsigned long flags;
 
 	if (list_empty(&per_prio->fifo_list[data_dir]))
@@ -353,13 +353,19 @@ deadline_fifo_request(struct deadline_data *dd, struct dd_per_prio *per_prio,
 
 	/*
 	 * Look for a write request that can be dispatched, that is one with
-	 * an unlocked target zone. For some HDDs, breaking a sequential
-	 * write stream can lead to lower throughput, so make sure to preserve
-	 * sequential write streams, even if that stream crosses into the next
-	 * zones and these zones are unlocked.
+	 * an unlocked target zone. For each write request from the FIFO list,
+	 * check whether an earlier write request exists in the RB tree. For
+	 * some HDDs, breaking a sequential write stream can lead to lower
+	 * throughput, so make sure to preserve sequential write streams, even
+	 * if that stream crosses into the next zones and these zones are
+	 * unlocked.
 	 */
 	spin_lock_irqsave(&dd->zone_lock, flags);
-	list_for_each_entry(rq, &per_prio->fifo_list[DD_WRITE], queuelist) {
+	list_for_each_entry_safe(rq, next, &per_prio->fifo_list[DD_WRITE],
+				 queuelist) {
+		rb_rq = deadline_from_pos(per_prio, data_dir, blk_rq_pos(rq));
+		if (rb_rq && blk_rq_pos(rb_rq) < blk_rq_pos(rq))
+			rq = rb_rq;
 		if (blk_req_can_dispatch_to_zone(rq) &&
 		    (blk_queue_nonrot(rq->q) ||
 		     !deadline_is_seq_write(dd, rq)))


* Re: [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock()
  2023-04-24 20:33 ` [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock() Bart Van Assche
@ 2023-04-28  5:44   ` Christoph Hellwig
  2023-04-28 19:46     ` Bart Van Assche
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:44 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:22PM -0700, Bart Van Assche wrote:
> @@ -367,8 +367,9 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
>  static inline struct blk_plug *blk_mq_plug(struct bio *bio)
>  {
>  	/* Zoned block device write operation case: do not plug the BIO */
> -	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
> -	    bdev_op_is_zoned_write(bio->bi_bdev, bio_op(bio)))
> +	if ((bio_op(bio) == REQ_OP_WRITE ||
> +	     bio_op(bio) == REQ_OP_WRITE_ZEROES) &&
> +	    disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector))
>  		return NULL;

I find this a bit hard to read.  Why not:

	if (disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector)) {
	  	/*
		 * Do not plug for writes that require zone locking.
		 */
		if (bio_op(bio) == REQ_OP_WRITE ||
		    bio_op(bio) == REQ_OP_WRITE_ZEROES)
			return NULL;
	}

>  bool blk_req_needs_zone_write_lock(struct request *rq)
>  {
> -	if (!rq->q->disk->seq_zones_wlock)
> -		return false;
> -
> -	if (bdev_op_is_zoned_write(rq->q->disk->part0, req_op(rq)))
> -		return blk_rq_zone_is_seq(rq);
> -
> -	return false;
> +	return rq->q->disk->seq_zones_wlock &&
> +		(req_op(rq) == REQ_OP_WRITE ||
> +		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
> +		blk_rq_zone_is_seq(rq);

Similarly here.  The old version did flow much better, so I'd prefer
something like:

	if (!rq->q->disk->seq_zones_wlock || !blk_rq_zone_is_seq(rq))
		return false;
	return req_op(rq) == REQ_OP_WRITE || req_op(rq) == REQ_OP_WRITE_ZEROES;

I also wonder if a helper that checks whether an op is a write or write
zeroes, i.e. whether it needs zone locking, would be useful instead of
duplicating the check all over.  That is, instead of removing
bdev_op_is_zoned_write, keep an op_is_zoned_write without the
bdev_is_zoned check.
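
A minimal sketch of such a helper, assuming the name op_is_zoned_write
from the suggestion above:

	static inline bool op_is_zoned_write(blk_opf_t op)
	{
		return op == REQ_OP_WRITE || op == REQ_OP_WRITE_ZEROES;
	}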


* Re: [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write()
  2023-04-24 20:33 ` [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write() Bart Van Assche
@ 2023-04-28  5:45   ` Christoph Hellwig
  2023-04-28 19:59     ` Bart Van Assche
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:45 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

> + */
> +bool blk_rq_is_seq_zoned_write(struct request *rq)
> +{
> +	switch (req_op(rq)) {
> +	case REQ_OP_WRITE:
> +	case REQ_OP_WRITE_ZEROES:
> +		return blk_rq_zone_is_seq(rq);

This would be another use for op_is_zoned_write..

>  bool blk_req_needs_zone_write_lock(struct request *rq)
>  {
>  	return rq->q->disk->seq_zones_wlock &&
> -		(req_op(rq) == REQ_OP_WRITE ||
> -		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
> -		blk_rq_zone_is_seq(rq);
> +		blk_rq_is_seq_zoned_write(rq);
>  }

.. and given that this just reshuffles the previous patch it might
make sense to just move it before that.


* Re: [PATCH v3 4/9] block: mq-deadline: Clean up deadline_check_fifo()
  2023-04-24 20:33 ` [PATCH v3 4/9] block: mq-deadline: Clean up deadline_check_fifo() Bart Van Assche
@ 2023-04-28  5:46   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:46 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:24PM -0700, Bart Van Assche wrote:
> Change the return type of deadline_check_fifo() from 'int' into 'bool'.
> Use time_is_before_eq_jiffies() instead of time_after_eq(). No
> functionality has been changed.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v3 5/9] block: mq-deadline: Simplify deadline_skip_seq_writes()
  2023-04-24 20:33 ` [PATCH v3 5/9] block: mq-deadline: Simplify deadline_skip_seq_writes() Bart Van Assche
@ 2023-04-28  5:47   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:47 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:25PM -0700, Bart Van Assche wrote:
> Make the deadline_skip_seq_writes() code shorter without changing its
> functionality.
> 
> Reviewed-by: Damien Le Moal <dlemoal@kernel.org>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline.c | 9 +++------
>  1 file changed, 3 insertions(+), 6 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index a016cafa54b3..6276afede9cd 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -304,14 +304,11 @@ static struct request *deadline_skip_seq_writes(struct deadline_data *dd,
>  						struct request *rq)
>  {
>  	sector_t pos = blk_rq_pos(rq);
> -	sector_t skipped_sectors = 0;
>  
> -	while (rq) {
> -		if (blk_rq_pos(rq) != pos + skipped_sectors)
> -			break;
> -		skipped_sectors += blk_rq_sectors(rq);
> +	do {
> +		pos += blk_rq_sectors(rq);
>  		rq = deadline_latter_request(rq);
> -	}
> +	} while (rq && blk_rq_pos(rq) == pos);

Looks good, although I'd maybe rename pos to next_pos for clarity.

Reviewed-by: Christoph Hellwig <hch@lst.de>



* Re: [PATCH v3 6/9] block: mq-deadline: Improve deadline_skip_seq_writes()
  2023-04-24 20:33 ` [PATCH v3 6/9] block: mq-deadline: Improve deadline_skip_seq_writes() Bart Van Assche
@ 2023-04-28  5:48   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:48 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:26PM -0700, Bart Van Assche wrote:
> Make deadline_skip_seq_writes() do what its name suggests, namely to
> skip sequential writes.

Looks good:

Reviewed-by: Christoph Hellwig <hch@lst.de>


* Re: [PATCH v3 7/9] block: mq-deadline: Track the dispatch position
  2023-04-24 20:33 ` [PATCH v3 7/9] block: mq-deadline: Track the dispatch position Bart Van Assche
@ 2023-04-28  5:50   ` Christoph Hellwig
  2023-04-28 17:04     ` Bart Van Assche
  2023-04-28 20:04     ` Bart Van Assche
  0 siblings, 2 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:50 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:27PM -0700, Bart Van Assche wrote:
> Track the position (sector_t) of the most recently dispatched request
> instead of tracking a pointer to the next request to dispatch. This
> patch simplifies the code for inserting and removing a request from the
> red-black tree.

Can you explain what the simplification is?  As of this patch the
code looks more complicated to me..


* Re: [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly
  2023-04-24 20:33 ` [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly Bart Van Assche
@ 2023-04-28  5:54   ` Christoph Hellwig
  2023-04-28 17:03     ` Bart Van Assche
  0 siblings, 1 reply; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:54 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

On Mon, Apr 24, 2023 at 01:33:28PM -0700, Bart Van Assche wrote:
> Start dispatching from the start of a zone instead of from the starting
> position of the most recently dispatched request.
> 
> If a zoned write is requeued with an LBA that is lower than already
> inserted zoned writes, make sure that it is submitted first.
> 
> Cc: Damien Le Moal <damien.lemoal@opensource.wdc.com>
> Cc: Christoph Hellwig <hch@lst.de>
> Cc: Ming Lei <ming.lei@redhat.com>
> Signed-off-by: Bart Van Assche <bvanassche@acm.org>
> ---
>  block/mq-deadline.c | 25 +++++++++++++++++++++++--
>  1 file changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/block/mq-deadline.c b/block/mq-deadline.c
> index bf1dfe9fe9c9..2de6c63190d8 100644
> --- a/block/mq-deadline.c
> +++ b/block/mq-deadline.c
> @@ -156,13 +156,23 @@ deadline_latter_request(struct request *rq)
>  	return NULL;
>  }
>  
> -/* Return the first request for which blk_rq_pos() >= pos. */
> +/*
> + * Return the first request for which blk_rq_pos() >= @pos. For zoned devices,
> + * return the first request after the highest zone start <= @pos.
> + */
>  static inline struct request *deadline_from_pos(struct dd_per_prio *per_prio,
>  				enum dd_data_dir data_dir, sector_t pos)
>  {
>  	struct rb_node *node = per_prio->sort_list[data_dir].rb_node;
>  	struct request *rq, *res = NULL;
>  
> +	if (!node)
> +		return NULL;
> +
> +	rq = rb_entry_rq(node);
> +	if (blk_rq_is_seq_zoned_write(rq))
> +		pos -= bdev_offset_from_zone_start(rq->q->disk->part0, pos);

This looks a bit odd.  I'd write this as:

	if (blk_rq_is_seq_zoned_write(rq))
		pos = round_down(pos, rq->q->limits.chunk_sectors);
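
For reference, bdev_offset_from_zone_start() (include/linux/blkdev.h)
masks with the zone size, which the block layer requires to be a power
of two, and chunk_sectors holds the zone size for zoned devices, so
both forms round pos down to the zone start:

	static inline sector_t bdev_offset_from_zone_start(struct block_device *bdev,
							   sector_t sector)
	{
		return sector & (bdev_zone_sectors(bdev) - 1);
	}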

> @@ -821,7 +833,16 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>  		 * set expire time and add to fifo list
>  		 */
>  		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
> -		list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
> +		insert_before = &per_prio->fifo_list[data_dir];
> +#ifdef CONFIG_BLK_DEV_ZONED
> +		if (blk_rq_is_seq_zoned_write(rq)) {
> +			struct request *rq2 = deadline_latter_request(rq);
> +
> +			if (rq2 && blk_rq_zone_no(rq2) == blk_rq_zone_no(rq))
> +				insert_before = &rq2->queuelist;
> +		}
> +#endif

Why does this need an ifdef?

Also can you please always add comments for these special cases?
Same for the one above.

> +		list_add_tail(&rq->queuelist, insert_before);
>  	}
>  }
>  


* Re: [PATCH v3 9/9] block: mq-deadline: Fix handling of at-head zoned writes
  2023-04-24 20:33 ` [PATCH v3 9/9] block: mq-deadline: Fix handling of at-head zoned writes Bart Van Assche
@ 2023-04-28  5:58   ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-04-28  5:58 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Christoph Hellwig,
	Damien Le Moal, Ming Lei

>  	/*
>  	 * Look for a write request that can be dispatched, that is one with
> +	 * an unlocked target zone. For each write request from the FIFO list,
> +	 * check whether an earlier write request exists in the RB tree.

I can see that you are checking the RB tree, so the comment is a bit
useless.  It should explain why.  And preferably right next to the code
instead of hidden in this long and fairly hard to read comment above
the loop.


* Re: [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly
  2023-04-28  5:54   ` Christoph Hellwig
@ 2023-04-28 17:03     ` Bart Van Assche
  2023-05-01  4:37       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-28 17:03 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Damien Le Moal, Ming Lei

On 4/27/23 22:54, Christoph Hellwig wrote:
> On Mon, Apr 24, 2023 at 01:33:28PM -0700, Bart Van Assche wrote:
>> @@ -821,7 +833,16 @@ static void dd_insert_request(struct blk_mq_hw_ctx *hctx, struct request *rq,
>>   		 * set expire time and add to fifo list
>>   		 */
>>   		rq->fifo_time = jiffies + dd->fifo_expire[data_dir];
>> -		list_add_tail(&rq->queuelist, &per_prio->fifo_list[data_dir]);
>> +		insert_before = &per_prio->fifo_list[data_dir];
>> +#ifdef CONFIG_BLK_DEV_ZONED
>> +		if (blk_rq_is_seq_zoned_write(rq)) {
>> +			struct request *rq2 = deadline_latter_request(rq);
>> +
>> +			if (rq2 && blk_rq_zone_no(rq2) == blk_rq_zone_no(rq))
>> +				insert_before = &rq2->queuelist;
>> +		}
>> +#endif
> 
> Why does this need an ifdef?

Because the blk_rq_zone_no() definition is surrounded by #ifdef
CONFIG_BLK_DEV_ZONED / #endif. Without the #ifdef the above code would
trigger a compilation error with CONFIG_BLK_DEV_ZONED=n. I have
considered adding a definition of blk_rq_zone_no() for
CONFIG_BLK_DEV_ZONED=n, but I prefer not to: I think it's better to get
a compiler error if blk_rq_zone_no() is used in code that is also built
for the CONFIG_BLK_DEV_ZONED=n case.
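
For reference, the guarded definition in include/linux/blk-mq.h looks
like this (abridged; surrounding declarations omitted):

	#ifdef CONFIG_BLK_DEV_ZONED
	static inline unsigned int blk_rq_zone_no(struct request *rq)
	{
		return disk_zone_no(rq->q->disk, blk_rq_pos(rq));
	}
	#endif /* CONFIG_BLK_DEV_ZONED */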

> Also can you please always add comments for these special cases?

Will do.

Thanks,

Bart.



* Re: [PATCH v3 7/9] block: mq-deadline: Track the dispatch position
  2023-04-28  5:50   ` Christoph Hellwig
@ 2023-04-28 17:04     ` Bart Van Assche
  2023-04-28 20:04     ` Bart Van Assche
  1 sibling, 0 replies; 26+ messages in thread
From: Bart Van Assche @ 2023-04-28 17:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Damien Le Moal, Ming Lei

On 4/27/23 22:50, Christoph Hellwig wrote:
> On Mon, Apr 24, 2023 at 01:33:27PM -0700, Bart Van Assche wrote:
>> Track the position (sector_t) of the most recently dispatched request
>> instead of tracking a pointer to the next request to dispatch. This
>> patch simplifies the code for inserting and removing a request from the
>> red-black tree.
> 
> Can you explain what the simplification is?  As of this patch the
> code looks more complicated to me..

Hi Christoph,

I will rewrite the patch description.

Thanks,

Bart.


* Re: [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock()
  2023-04-28  5:44   ` Christoph Hellwig
@ 2023-04-28 19:46     ` Bart Van Assche
  2023-05-01  4:34       ` Christoph Hellwig
  0 siblings, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-28 19:46 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Damien Le Moal, Ming Lei

On 4/27/23 22:44, Christoph Hellwig wrote:
> On Mon, Apr 24, 2023 at 01:33:22PM -0700, Bart Van Assche wrote:
>> @@ -367,8 +367,9 @@ static inline void blk_mq_clear_mq_map(struct blk_mq_queue_map *qmap)
>>   static inline struct blk_plug *blk_mq_plug(struct bio *bio)
>>   {
>>   	/* Zoned block device write operation case: do not plug the BIO */
>> -	if (IS_ENABLED(CONFIG_BLK_DEV_ZONED) &&
>> -	    bdev_op_is_zoned_write(bio->bi_bdev, bio_op(bio)))
>> +	if ((bio_op(bio) == REQ_OP_WRITE ||
>> +	     bio_op(bio) == REQ_OP_WRITE_ZEROES) &&
>> +	    disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector))
>>   		return NULL;
> 
> I find this a bit hard to read.  Why not:
> 
> 	if (disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector)) {
> 	  	/*
> 		 * Do not plug for writes that require zone locking.
> 		 */
> 		if (bio_op(bio) == REQ_OP_WRITE ||
> 		    bio_op(bio) == REQ_OP_WRITE_ZEROES)
> 			return NULL;
> 	}

In the above alternative the expensive check happens before a check that is not
expensive at all. Do you really want me to call disk_zone_is_seq() before checking
the operation type?

>>   bool blk_req_needs_zone_write_lock(struct request *rq)
>>   {
>> -	if (!rq->q->disk->seq_zones_wlock)
>> -		return false;
>> -
>> -	if (bdev_op_is_zoned_write(rq->q->disk->part0, req_op(rq)))
>> -		return blk_rq_zone_is_seq(rq);
>> -
>> -	return false;
>> +	return rq->q->disk->seq_zones_wlock &&
>> +		(req_op(rq) == REQ_OP_WRITE ||
>> +		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
>> +		blk_rq_zone_is_seq(rq);
> 
> Similarly here.  The old version did flow much better, so I'd prefer
> something like:
> 
> 	if (!rq->q->disk->seq_zones_wlock || !blk_rq_zone_is_seq(rq))
> 		return false;
> 	return req_op(rq) == REQ_OP_WRITE || req_op(rq) == REQ_OP_WRITE_ZEROES;
> 
> I also wonder if a helper that checks whether an op is a write or write
> zeroes, i.e. whether it needs zone locking, would be useful instead of
> duplicating the check all over.  That is, instead of removing
> bdev_op_is_zoned_write, keep an op_is_zoned_write without the
> bdev_is_zoned check.

I will introduce an op_is_zoned_write() helper function.

Thanks,

Bart.


* Re: [PATCH v3 3/9] block: Introduce blk_rq_is_seq_zoned_write()
  2023-04-28  5:45   ` Christoph Hellwig
@ 2023-04-28 19:59     ` Bart Van Assche
  0 siblings, 0 replies; 26+ messages in thread
From: Bart Van Assche @ 2023-04-28 19:59 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Damien Le Moal, Ming Lei

On 4/27/23 22:45, Christoph Hellwig wrote:
>>   bool blk_req_needs_zone_write_lock(struct request *rq)
>>   {
>>   	return rq->q->disk->seq_zones_wlock &&
>> -		(req_op(rq) == REQ_OP_WRITE ||
>> -		 req_op(rq) == REQ_OP_WRITE_ZEROES) &&
>> -		blk_rq_zone_is_seq(rq);
>> +		blk_rq_is_seq_zoned_write(rq);
>>   }
> 
> .. and given that this just reshuffles the previous patch it might
> make sense to just move it before that.

Hi Christoph,

I will move this patch in front of the micro-optimization patch.

Thanks,

Bart.


* Re: [PATCH v3 7/9] block: mq-deadline: Track the dispatch position
  2023-04-28  5:50   ` Christoph Hellwig
  2023-04-28 17:04     ` Bart Van Assche
@ 2023-04-28 20:04     ` Bart Van Assche
  2023-05-01  4:35       ` Christoph Hellwig
  1 sibling, 1 reply; 26+ messages in thread
From: Bart Van Assche @ 2023-04-28 20:04 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Jens Axboe, linux-block, Jaegeuk Kim, Damien Le Moal, Ming Lei

On 4/27/23 22:50, Christoph Hellwig wrote:
> On Mon, Apr 24, 2023 at 01:33:27PM -0700, Bart Van Assche wrote:
>> Track the position (sector_t) of the most recently dispatched request
>> instead of tracking a pointer to the next request to dispatch. This
>> patch simplifies the code for inserting and removing a request from the
>> red-black tree.
> 
> Can you explain what the simplification is?  As of this patch the
> code looks more complicated to me..

How about changing the patch description to the following text?

"Track the position (sector_t) of the most recently dispatched request
instead of tracking a pointer to the next request to dispatch. This
patch is the basis for patch "Handle requeued requests correctly".
Without this patch it would me significantly more complicated to start
searching for a zoned write from the start of a zone."

Thanks,

Bart.


* Re: [PATCH v3 2/9] block: Micro-optimize blk_req_needs_zone_write_lock()
  2023-04-28 19:46     ` Bart Van Assche
@ 2023-05-01  4:34       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-05-01  4:34 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Jaegeuk Kim,
	Damien Le Moal, Ming Lei

On Fri, Apr 28, 2023 at 12:46:06PM -0700, Bart Van Assche wrote:
>>> +	if ((bio_op(bio) == REQ_OP_WRITE ||
>>> +	     bio_op(bio) == REQ_OP_WRITE_ZEROES) &&
>>> +	    disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector))
>>>   		return NULL;
>>
>> I find this a bit hard to read.  Why not:
>>
>> 	if (disk_zone_is_seq(bio->bi_bdev->bd_disk, bio->bi_iter.bi_sector)) {
>> 	  	/*
>> 		 * Do not plug for writes that require zone locking.
>> 		 */
>> 		if (bio_op(bio) == REQ_OP_WRITE ||
>> 		    bio_op(bio) == REQ_OP_WRITE_ZEROES)
>> 			return NULL;
>> 	}
>
> In the above alternative the expensive check happens before a check that is not
> expensive at all. Do you really want me to call disk_zone_is_seq() before checking
> the operation type?

What expensive check?  The first check in disk_zone_is_seq is for
a zoned device, avoiding any further check if it is not.
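
For reference, a sketch of disk_zone_is_seq() as of this series'
baseline (include/linux/blkdev.h); the cheap blk_queue_is_zoned() test
comes first:

	static inline bool disk_zone_is_seq(struct gendisk *disk, sector_t sector)
	{
		if (!blk_queue_is_zoned(disk->queue))
			return false;
		if (!disk->conv_zones_bitmap)
			return true;
		return !test_bit(disk_zone_no(disk, sector),
				 disk->conv_zones_bitmap);
	}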



* Re: [PATCH v3 7/9] block: mq-deadline: Track the dispatch position
  2023-04-28 20:04     ` Bart Van Assche
@ 2023-05-01  4:35       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-05-01  4:35 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Jaegeuk Kim,
	Damien Le Moal, Ming Lei

On Fri, Apr 28, 2023 at 01:04:01PM -0700, Bart Van Assche wrote:
> "Track the position (sector_t) of the most recently dispatched request
> instead of tracking a pointer to the next request to dispatch. This
> patch is the basis for patch "Handle requeued requests correctly".
> Without this patch it would me significantly more complicated to start

s/me/be/


* Re: [PATCH v3 8/9] block: mq-deadline: Handle requeued requests correctly
  2023-04-28 17:03     ` Bart Van Assche
@ 2023-05-01  4:37       ` Christoph Hellwig
  0 siblings, 0 replies; 26+ messages in thread
From: Christoph Hellwig @ 2023-05-01  4:37 UTC (permalink / raw)
  To: Bart Van Assche
  Cc: Christoph Hellwig, Jens Axboe, linux-block, Jaegeuk Kim,
	Damien Le Moal, Ming Lei

On Fri, Apr 28, 2023 at 10:03:45AM -0700, Bart Van Assche wrote:
> Because the blk_rq_zone_no() definition is surrounded with #ifdef 
> CONFIG_BLK_DEV_ZONED / #endif. Without the #ifdef the above code would 
> trigger a compilation error with CONFIG_BLK_DEV_ZONED=n. I have considered 
> to add a definition of blk_rq_zone_no() for CONFIG_BLK_DEV_ZONED=n. I 
> prefer not to do this because I think it's better to cause a compiler error 
> if blk_rq_zone_no() is used in code that is also used for the 
> CONFIG_BLK_DEV_ZONED=n case.

Ok.

