[PATCH 0/3] Preserve the request order in the block layer

public inbox for stable@vger.kernel.org
 help / color / mirror / Atom feed

* [PATCH 0/3] Preserve the request order in the block layer
@ 2025-04-18 17:53 Bart Van Assche
  2025-04-18 17:53 ` [PATCH 1/3] block: remove rq_list_move Bart Van Assche
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: Bart Van Assche @ 2025-04-18 17:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim,
	Bart Van Assche

Hi Greg,

In kernel v6.10 the zoned storage approach was changed from zoned write
locking to zone write plugging. Because of this change the block layer
must preserve the request order. Hence this backport of Christoph's
"don't reorder requests passed to ->queue_rqs" patch series. Please
consider this patch series for inclusion in the 6.12 stable kernel.

See also https://lore.kernel.org/linux-block/20241113152050.157179-1-hch@lst.de/.

Thanks,

Bart.

Christoph Hellwig (3):
  block: remove rq_list_move
  block: add a rq_list type
  block: don't reorder requests in blk_add_rq_to_plug

 block/blk-core.c              |  6 +--
 block/blk-merge.c             |  2 +-
 block/blk-mq.c                | 42 +++++++--------
 block/blk-mq.h                |  2 +-
 drivers/block/null_blk/main.c |  9 ++--
 drivers/block/virtio_blk.c    | 13 +++--
 drivers/nvme/host/apple.c     |  2 +-
 drivers/nvme/host/pci.c       | 15 +++---
 include/linux/blk-mq.h        | 99 +++++++++++++++++------------------
 include/linux/blkdev.h        | 11 ++--
 io_uring/rw.c                 |  4 +-
 11 files changed, 102 insertions(+), 103 deletions(-)


^ permalink raw reply	[flat|nested] 9+ messages in thread

* [PATCH 1/3] block: remove rq_list_move
  2025-04-18 17:53 [PATCH 0/3] Preserve the request order in the block layer Bart Van Assche
@ 2025-04-18 17:53 ` Bart Van Assche
  2025-04-19 11:47   ` Sasha Levin
  2025-04-18 17:54 ` [PATCH 2/3] block: add a rq_list type Bart Van Assche
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2025-04-18 17:53 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim,
	Bart Van Assche, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

Upstream commit e8225ab15006fbcdb14cef426a0a54475292fbbc.

Unused now.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241113152050.157179-4-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 include/linux/blk-mq.h | 17 -----------------
 1 file changed, 17 deletions(-)

diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 7b5e5388c380..cd04e71ecb88 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -269,23 +269,6 @@ static inline unsigned short req_get_ioprio(struct request *req)
 #define rq_list_next(rq)	(rq)->rq_next
 #define rq_list_empty(list)	((list) == (struct request *) NULL)
 
-/**
- * rq_list_move() - move a struct request from one list to another
- * @src: The source list @rq is currently in
- * @dst: The destination list that @rq will be appended to
- * @rq: The request to move
- * @prev: The request preceding @rq in @src (NULL if @rq is the head)
- */
-static inline void rq_list_move(struct request **src, struct request **dst,
-				struct request *rq, struct request *prev)
-{
-	if (prev)
-		prev->rq_next = rq->rq_next;
-	else
-		*src = rq->rq_next;
-	rq_list_add(dst, rq);
-}
-
 /**
  * enum blk_eh_timer_return - How the timeout handler should proceed
  * @BLK_EH_DONE: The block driver completed the command or will complete it at

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 2/3] block: add a rq_list type
  2025-04-18 17:53 [PATCH 0/3] Preserve the request order in the block layer Bart Van Assche
  2025-04-18 17:53 ` [PATCH 1/3] block: remove rq_list_move Bart Van Assche
@ 2025-04-18 17:54 ` Bart Van Assche
  2025-04-19 11:47   ` Sasha Levin
  2025-04-18 17:54 ` [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug Bart Van Assche
  2025-04-22 11:51 ` [PATCH 0/3] Preserve the request order in the block layer Greg Kroah-Hartman
  3 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2025-04-18 17:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim,
	Bart Van Assche, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

Upstream commit a3396b99990d8b4e5797e7b16fdeb64c15ae97bb.

Replace the semi-open coded request list helpers with a proper rq_list
type that mirrors the bio_list and has head and tail pointers.  Besides
better type safety this actually allows to insert at the tail of the
list, which will be useful soon.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241113152050.157179-5-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-core.c              |  6 +--
 block/blk-merge.c             |  2 +-
 block/blk-mq.c                | 40 ++++++++--------
 block/blk-mq.h                |  2 +-
 drivers/block/null_blk/main.c |  9 ++--
 drivers/block/virtio_blk.c    | 13 +++---
 drivers/nvme/host/apple.c     |  2 +-
 drivers/nvme/host/pci.c       | 15 +++---
 include/linux/blk-mq.h        | 88 +++++++++++++++++++++--------------
 include/linux/blkdev.h        | 11 +++--
 io_uring/rw.c                 |  4 +-
 11 files changed, 104 insertions(+), 88 deletions(-)

diff --git a/block/blk-core.c b/block/blk-core.c
index 42023addf9cd..c7b6c1f76359 100644
--- a/block/blk-core.c
+++ b/block/blk-core.c
@@ -1121,8 +1121,8 @@ void blk_start_plug_nr_ios(struct blk_plug *plug, unsigned short nr_ios)
 		return;
 
 	plug->cur_ktime = 0;
-	plug->mq_list = NULL;
-	plug->cached_rq = NULL;
+	rq_list_init(&plug->mq_list);
+	rq_list_init(&plug->cached_rqs);
 	plug->nr_ios = min_t(unsigned short, nr_ios, BLK_MAX_REQUEST_COUNT);
 	plug->rq_count = 0;
 	plug->multiple_queues = false;
@@ -1218,7 +1218,7 @@ void __blk_flush_plug(struct blk_plug *plug, bool from_schedule)
 	 * queue for cached requests, we don't want a blocked task holding
 	 * up a queue freeze/quiesce event.
 	 */
-	if (unlikely(!rq_list_empty(plug->cached_rq)))
+	if (unlikely(!rq_list_empty(&plug->cached_rqs)))
 		blk_mq_free_plug_rqs(plug);
 
 	plug->cur_ktime = 0;
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 5baa950f34fe..ceac64e796ea 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -1175,7 +1175,7 @@ bool blk_attempt_plug_merge(struct request_queue *q, struct bio *bio,
 	struct blk_plug *plug = current->plug;
 	struct request *rq;
 
-	if (!plug || rq_list_empty(plug->mq_list))
+	if (!plug || rq_list_empty(&plug->mq_list))
 		return false;
 
 	rq_list_for_each(&plug->mq_list, rq) {
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 662e52ab0646..c7fb3722d620 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -506,7 +506,7 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data)
 		prefetch(tags->static_rqs[tag]);
 		tag_mask &= ~(1UL << i);
 		rq = blk_mq_rq_ctx_init(data, tags, tag);
-		rq_list_add(data->cached_rq, rq);
+		rq_list_add_head(data->cached_rqs, rq);
 		nr++;
 	}
 	if (!(data->rq_flags & RQF_SCHED_TAGS))
@@ -515,7 +515,7 @@ __blk_mq_alloc_requests_batch(struct blk_mq_alloc_data *data)
 	percpu_ref_get_many(&data->q->q_usage_counter, nr - 1);
 	data->nr_tags -= nr;
 
-	return rq_list_pop(data->cached_rq);
+	return rq_list_pop(data->cached_rqs);
 }
 
 static struct request *__blk_mq_alloc_requests(struct blk_mq_alloc_data *data)
@@ -612,7 +612,7 @@ static struct request *blk_mq_rq_cache_fill(struct request_queue *q,
 		.flags		= flags,
 		.cmd_flags	= opf,
 		.nr_tags	= plug->nr_ios,
-		.cached_rq	= &plug->cached_rq,
+		.cached_rqs	= &plug->cached_rqs,
 	};
 	struct request *rq;
 
@@ -637,14 +637,14 @@ static struct request *blk_mq_alloc_cached_request(struct request_queue *q,
 	if (!plug)
 		return NULL;
 
-	if (rq_list_empty(plug->cached_rq)) {
+	if (rq_list_empty(&plug->cached_rqs)) {
 		if (plug->nr_ios == 1)
 			return NULL;
 		rq = blk_mq_rq_cache_fill(q, plug, opf, flags);
 		if (!rq)
 			return NULL;
 	} else {
-		rq = rq_list_peek(&plug->cached_rq);
+		rq = rq_list_peek(&plug->cached_rqs);
 		if (!rq || rq->q != q)
 			return NULL;
 
@@ -653,7 +653,7 @@ static struct request *blk_mq_alloc_cached_request(struct request_queue *q,
 		if (op_is_flush(rq->cmd_flags) != op_is_flush(opf))
 			return NULL;
 
-		plug->cached_rq = rq_list_next(rq);
+		rq_list_pop(&plug->cached_rqs);
 		blk_mq_rq_time_init(rq, 0);
 	}
 
@@ -830,7 +830,7 @@ void blk_mq_free_plug_rqs(struct blk_plug *plug)
 {
 	struct request *rq;
 
-	while ((rq = rq_list_pop(&plug->cached_rq)) != NULL)
+	while ((rq = rq_list_pop(&plug->cached_rqs)) != NULL)
 		blk_mq_free_request(rq);
 }
 
@@ -1386,8 +1386,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
 	 */
 	if (!plug->has_elevator && (rq->rq_flags & RQF_SCHED_TAGS))
 		plug->has_elevator = true;
-	rq->rq_next = NULL;
-	rq_list_add(&plug->mq_list, rq);
+	rq_list_add_head(&plug->mq_list, rq);
 	plug->rq_count++;
 }
 
@@ -2781,7 +2780,7 @@ static void blk_mq_plug_issue_direct(struct blk_plug *plug)
 	blk_status_t ret = BLK_STS_OK;
 
 	while ((rq = rq_list_pop(&plug->mq_list))) {
-		bool last = rq_list_empty(plug->mq_list);
+		bool last = rq_list_empty(&plug->mq_list);
 
 		if (hctx != rq->mq_hctx) {
 			if (hctx) {
@@ -2824,8 +2823,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
 {
 	struct blk_mq_hw_ctx *this_hctx = NULL;
 	struct blk_mq_ctx *this_ctx = NULL;
-	struct request *requeue_list = NULL;
-	struct request **requeue_lastp = &requeue_list;
+	struct rq_list requeue_list = {};
 	unsigned int depth = 0;
 	bool is_passthrough = false;
 	LIST_HEAD(list);
@@ -2839,12 +2837,12 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
 			is_passthrough = blk_rq_is_passthrough(rq);
 		} else if (this_hctx != rq->mq_hctx || this_ctx != rq->mq_ctx ||
 			   is_passthrough != blk_rq_is_passthrough(rq)) {
-			rq_list_add_tail(&requeue_lastp, rq);
+			rq_list_add_tail(&requeue_list, rq);
 			continue;
 		}
 		list_add(&rq->queuelist, &list);
 		depth++;
-	} while (!rq_list_empty(plug->mq_list));
+	} while (!rq_list_empty(&plug->mq_list));
 
 	plug->mq_list = requeue_list;
 	trace_block_unplug(this_hctx->queue, depth, !from_sched);
@@ -2899,19 +2897,19 @@ void blk_mq_flush_plug_list(struct blk_plug *plug, bool from_schedule)
 		if (q->mq_ops->queue_rqs) {
 			blk_mq_run_dispatch_ops(q,
 				__blk_mq_flush_plug_list(q, plug));
-			if (rq_list_empty(plug->mq_list))
+			if (rq_list_empty(&plug->mq_list))
 				return;
 		}
 
 		blk_mq_run_dispatch_ops(q,
 				blk_mq_plug_issue_direct(plug));
-		if (rq_list_empty(plug->mq_list))
+		if (rq_list_empty(&plug->mq_list))
 			return;
 	}
 
 	do {
 		blk_mq_dispatch_plug_list(plug, from_schedule);
-	} while (!rq_list_empty(plug->mq_list));
+	} while (!rq_list_empty(&plug->mq_list));
 }
 
 static void blk_mq_try_issue_list_directly(struct blk_mq_hw_ctx *hctx,
@@ -2976,7 +2974,7 @@ static struct request *blk_mq_get_new_requests(struct request_queue *q,
 	if (plug) {
 		data.nr_tags = plug->nr_ios;
 		plug->nr_ios = 1;
-		data.cached_rq = &plug->cached_rq;
+		data.cached_rqs = &plug->cached_rqs;
 	}
 
 	rq = __blk_mq_alloc_requests(&data);
@@ -2999,7 +2997,7 @@ static struct request *blk_mq_peek_cached_request(struct blk_plug *plug,
 
 	if (!plug)
 		return NULL;
-	rq = rq_list_peek(&plug->cached_rq);
+	rq = rq_list_peek(&plug->cached_rqs);
 	if (!rq || rq->q != q)
 		return NULL;
 	if (type != rq->mq_hctx->type &&
@@ -3013,14 +3011,14 @@ static struct request *blk_mq_peek_cached_request(struct blk_plug *plug,
 static void blk_mq_use_cached_rq(struct request *rq, struct blk_plug *plug,
 		struct bio *bio)
 {
-	WARN_ON_ONCE(rq_list_peek(&plug->cached_rq) != rq);
+	if (rq_list_pop(&plug->cached_rqs) != rq)
+		WARN_ON_ONCE(1);
 
 	/*
 	 * If any qos ->throttle() end up blocking, we will have flushed the
 	 * plug and hence killed the cached_rq list as well. Pop this entry
 	 * before we throttle.
 	 */
-	plug->cached_rq = rq_list_next(rq);
 	rq_qos_throttle(rq->q, bio);
 
 	blk_mq_rq_time_init(rq, 0);
diff --git a/block/blk-mq.h b/block/blk-mq.h
index 364c0415293c..a80d3b3105f9 100644
--- a/block/blk-mq.h
+++ b/block/blk-mq.h
@@ -155,7 +155,7 @@ struct blk_mq_alloc_data {
 
 	/* allocate multiple requests/tags in one go */
 	unsigned int nr_tags;
-	struct request **cached_rq;
+	struct rq_list *cached_rqs;
 
 	/* input & output parameter */
 	struct blk_mq_ctx *ctx;
diff --git a/drivers/block/null_blk/main.c b/drivers/block/null_blk/main.c
index c479348ce8ff..f10369ad90f7 100644
--- a/drivers/block/null_blk/main.c
+++ b/drivers/block/null_blk/main.c
@@ -1638,10 +1638,9 @@ static blk_status_t null_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
-static void null_queue_rqs(struct request **rqlist)
+static void null_queue_rqs(struct rq_list *rqlist)
 {
-	struct request *requeue_list = NULL;
-	struct request **requeue_lastp = &requeue_list;
+	struct rq_list requeue_list = {};
 	struct blk_mq_queue_data bd = { };
 	blk_status_t ret;
 
@@ -1651,8 +1650,8 @@ static void null_queue_rqs(struct request **rqlist)
 		bd.rq = rq;
 		ret = null_queue_rq(rq->mq_hctx, &bd);
 		if (ret != BLK_STS_OK)
-			rq_list_add_tail(&requeue_lastp, rq);
-	} while (!rq_list_empty(*rqlist));
+			rq_list_add_tail(&requeue_list, rq);
+	} while (!rq_list_empty(rqlist));
 
 	*rqlist = requeue_list;
 }
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 44a6937a4b65..2069bf9701f5 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -472,7 +472,7 @@ static bool virtblk_prep_rq_batch(struct request *req)
 }
 
 static void virtblk_add_req_batch(struct virtio_blk_vq *vq,
-					struct request **rqlist)
+		struct rq_list *rqlist)
 {
 	struct request *req;
 	unsigned long flags;
@@ -499,11 +499,10 @@ static void virtblk_add_req_batch(struct virtio_blk_vq *vq,
 		virtqueue_notify(vq->vq);
 }
 
-static void virtio_queue_rqs(struct request **rqlist)
+static void virtio_queue_rqs(struct rq_list *rqlist)
 {
-	struct request *submit_list = NULL;
-	struct request *requeue_list = NULL;
-	struct request **requeue_lastp = &requeue_list;
+	struct rq_list submit_list = { };
+	struct rq_list requeue_list = { };
 	struct virtio_blk_vq *vq = NULL;
 	struct request *req;
 
@@ -515,9 +514,9 @@ static void virtio_queue_rqs(struct request **rqlist)
 		vq = this_vq;
 
 		if (virtblk_prep_rq_batch(req))
-			rq_list_add(&submit_list, req); /* reverse order */
+			rq_list_add_head(&submit_list, req); /* reverse order */
 		else
-			rq_list_add_tail(&requeue_lastp, req);
+			rq_list_add_tail(&requeue_list, req);
 	}
 
 	if (vq)
diff --git a/drivers/nvme/host/apple.c b/drivers/nvme/host/apple.c
index e79a0adf1395..328f5a103628 100644
--- a/drivers/nvme/host/apple.c
+++ b/drivers/nvme/host/apple.c
@@ -650,7 +650,7 @@ static bool apple_nvme_handle_cq(struct apple_nvme_queue *q, bool force)
 
 	found = apple_nvme_poll_cq(q, &iob);
 
-	if (!rq_list_empty(iob.req_list))
+	if (!rq_list_empty(&iob.req_list))
 		apple_nvme_complete_batch(&iob);
 
 	return found;
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index af45a1b865ee..e943c1be0fca 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -985,7 +985,7 @@ static blk_status_t nvme_queue_rq(struct blk_mq_hw_ctx *hctx,
 	return BLK_STS_OK;
 }
 
-static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct request **rqlist)
+static void nvme_submit_cmds(struct nvme_queue *nvmeq, struct rq_list *rqlist)
 {
 	struct request *req;
 
@@ -1013,11 +1013,10 @@ static bool nvme_prep_rq_batch(struct nvme_queue *nvmeq, struct request *req)
 	return nvme_prep_rq(nvmeq->dev, req) == BLK_STS_OK;
 }
 
-static void nvme_queue_rqs(struct request **rqlist)
+static void nvme_queue_rqs(struct rq_list *rqlist)
 {
-	struct request *submit_list = NULL;
-	struct request *requeue_list = NULL;
-	struct request **requeue_lastp = &requeue_list;
+	struct rq_list submit_list = { };
+	struct rq_list requeue_list = { };
 	struct nvme_queue *nvmeq = NULL;
 	struct request *req;
 
@@ -1027,9 +1026,9 @@ static void nvme_queue_rqs(struct request **rqlist)
 		nvmeq = req->mq_hctx->driver_data;
 
 		if (nvme_prep_rq_batch(nvmeq, req))
-			rq_list_add(&submit_list, req); /* reverse order */
+			rq_list_add_head(&submit_list, req); /* reverse order */
 		else
-			rq_list_add_tail(&requeue_lastp, req);
+			rq_list_add_tail(&requeue_list, req);
 	}
 
 	if (nvmeq)
@@ -1176,7 +1175,7 @@ static irqreturn_t nvme_irq(int irq, void *data)
 	DEFINE_IO_COMP_BATCH(iob);
 
 	if (nvme_poll_cq(nvmeq, &iob)) {
-		if (!rq_list_empty(iob.req_list))
+		if (!rq_list_empty(&iob.req_list))
 			nvme_pci_complete_batch(&iob);
 		return IRQ_HANDLED;
 	}
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index cd04e71ecb88..b160d131204e 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -230,44 +230,60 @@ static inline unsigned short req_get_ioprio(struct request *req)
 #define rq_dma_dir(rq) \
 	(op_is_write(req_op(rq)) ? DMA_TO_DEVICE : DMA_FROM_DEVICE)
 
-#define rq_list_add(listptr, rq)	do {		\
-	(rq)->rq_next = *(listptr);			\
-	*(listptr) = rq;				\
-} while (0)
-
-#define rq_list_add_tail(lastpptr, rq)	do {		\
-	(rq)->rq_next = NULL;				\
-	**(lastpptr) = rq;				\
-	*(lastpptr) = &rq->rq_next;			\
-} while (0)
-
-#define rq_list_pop(listptr)				\
-({							\
-	struct request *__req = NULL;			\
-	if ((listptr) && *(listptr))	{		\
-		__req = *(listptr);			\
-		*(listptr) = __req->rq_next;		\
-	}						\
-	__req;						\
-})
+static inline int rq_list_empty(const struct rq_list *rl)
+{
+	return rl->head == NULL;
+}
 
-#define rq_list_peek(listptr)				\
-({							\
-	struct request *__req = NULL;			\
-	if ((listptr) && *(listptr))			\
-		__req = *(listptr);			\
-	__req;						\
-})
+static inline void rq_list_init(struct rq_list *rl)
+{
+	rl->head = NULL;
+	rl->tail = NULL;
+}
+
+static inline void rq_list_add_tail(struct rq_list *rl, struct request *rq)
+{
+	rq->rq_next = NULL;
+	if (rl->tail)
+		rl->tail->rq_next = rq;
+	else
+		rl->head = rq;
+	rl->tail = rq;
+}
+
+static inline void rq_list_add_head(struct rq_list *rl, struct request *rq)
+{
+	rq->rq_next = rl->head;
+	rl->head = rq;
+	if (!rl->tail)
+		rl->tail = rq;
+}
+
+static inline struct request *rq_list_pop(struct rq_list *rl)
+{
+	struct request *rq = rl->head;
+
+	if (rq) {
+		rl->head = rl->head->rq_next;
+		if (!rl->head)
+			rl->tail = NULL;
+		rq->rq_next = NULL;
+	}
+
+	return rq;
+}
 
-#define rq_list_for_each(listptr, pos)			\
-	for (pos = rq_list_peek((listptr)); pos; pos = rq_list_next(pos))
+static inline struct request *rq_list_peek(struct rq_list *rl)
+{
+	return rl->head;
+}
 
-#define rq_list_for_each_safe(listptr, pos, nxt)			\
-	for (pos = rq_list_peek((listptr)), nxt = rq_list_next(pos);	\
-		pos; pos = nxt, nxt = pos ? rq_list_next(pos) : NULL)
+#define rq_list_for_each(rl, pos)					\
+	for (pos = rq_list_peek((rl)); (pos); pos = pos->rq_next)
 
-#define rq_list_next(rq)	(rq)->rq_next
-#define rq_list_empty(list)	((list) == (struct request *) NULL)
+#define rq_list_for_each_safe(rl, pos, nxt)				\
+	for (pos = rq_list_peek((rl)), nxt = pos->rq_next;		\
+		pos; pos = nxt, nxt = pos ? pos->rq_next : NULL)
 
 /**
  * enum blk_eh_timer_return - How the timeout handler should proceed
@@ -560,7 +576,7 @@ struct blk_mq_ops {
 	 * empty the @rqlist completely, then the rest will be queued
 	 * individually by the block layer upon return.
 	 */
-	void (*queue_rqs)(struct request **rqlist);
+	void (*queue_rqs)(struct rq_list *rqlist);
 
 	/**
 	 * @get_budget: Reserve budget before queue request, once .queue_rq is
@@ -893,7 +909,7 @@ static inline bool blk_mq_add_to_batch(struct request *req,
 	else if (iob->complete != complete)
 		return false;
 	iob->need_ts |= blk_mq_need_time_stamp(req);
-	rq_list_add(&iob->req_list, req);
+	rq_list_add_head(&iob->req_list, req);
 	return true;
 }
 
diff --git a/include/linux/blkdev.h b/include/linux/blkdev.h
index 8f37c5dd52b2..402a7d7fe98d 100644
--- a/include/linux/blkdev.h
+++ b/include/linux/blkdev.h
@@ -996,6 +996,11 @@ extern void blk_put_queue(struct request_queue *);
 void blk_mark_disk_dead(struct gendisk *disk);
 
 #ifdef CONFIG_BLOCK
+struct rq_list {
+	struct request *head;
+	struct request *tail;
+};
+
 /*
  * blk_plug permits building a queue of related requests by holding the I/O
  * fragments for a short period. This allows merging of sequential requests
@@ -1008,10 +1013,10 @@ void blk_mark_disk_dead(struct gendisk *disk);
  * blk_flush_plug() is called.
  */
 struct blk_plug {
-	struct request *mq_list; /* blk-mq requests */
+	struct rq_list mq_list; /* blk-mq requests */
 
 	/* if ios_left is > 1, we can batch tag/rq allocations */
-	struct request *cached_rq;
+	struct rq_list cached_rqs;
 	u64 cur_ktime;
 	unsigned short nr_ios;
 
@@ -1660,7 +1665,7 @@ int bdev_thaw(struct block_device *bdev);
 void bdev_fput(struct file *bdev_file);
 
 struct io_comp_batch {
-	struct request *req_list;
+	struct rq_list req_list;
 	bool need_ts;
 	void (*complete)(struct io_comp_batch *);
 };
diff --git a/io_uring/rw.c b/io_uring/rw.c
index 6abc495602a4..a1ed64760eba 100644
--- a/io_uring/rw.c
+++ b/io_uring/rw.c
@@ -1190,12 +1190,12 @@ int io_do_iopoll(struct io_ring_ctx *ctx, bool force_nonspin)
 			poll_flags |= BLK_POLL_ONESHOT;
 
 		/* iopoll may have completed current req */
-		if (!rq_list_empty(iob.req_list) ||
+		if (!rq_list_empty(&iob.req_list) ||
 		    READ_ONCE(req->iopoll_completed))
 			break;
 	}
 
-	if (!rq_list_empty(iob.req_list))
+	if (!rq_list_empty(&iob.req_list))
 		iob.complete(&iob);
 	else if (!pos)
 		return 0;

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug
  2025-04-18 17:53 [PATCH 0/3] Preserve the request order in the block layer Bart Van Assche
  2025-04-18 17:53 ` [PATCH 1/3] block: remove rq_list_move Bart Van Assche
  2025-04-18 17:54 ` [PATCH 2/3] block: add a rq_list type Bart Van Assche
@ 2025-04-18 17:54 ` Bart Van Assche
  2025-04-19 11:47   ` Sasha Levin
  2025-04-22 11:51 ` [PATCH 0/3] Preserve the request order in the block layer Greg Kroah-Hartman
  3 siblings, 1 reply; 9+ messages in thread
From: Bart Van Assche @ 2025-04-18 17:54 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim,
	Bart Van Assche, Jens Axboe

From: Christoph Hellwig <hch@lst.de>

Upstream commit e70c301faece15b618e54b613b1fd6ece3dd05b4.

Add requests to the tail of the list instead of the front so that they
are queued up in submission order.

Remove the re-reordering in blk_mq_dispatch_plug_list, virtio_queue_rqs
and nvme_queue_rqs now that the list is ordered as expected.

Signed-off-by: Christoph Hellwig <hch@lst.de>
Link: https://lore.kernel.org/r/20241113152050.157179-6-hch@lst.de
Signed-off-by: Jens Axboe <axboe@kernel.dk>
Signed-off-by: Bart Van Assche <bvanassche@acm.org>
---
 block/blk-mq.c             | 4 ++--
 drivers/block/virtio_blk.c | 2 +-
 drivers/nvme/host/pci.c    | 2 +-
 3 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/block/blk-mq.c b/block/blk-mq.c
index c7fb3722d620..f26bee562693 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -1386,7 +1386,7 @@ static void blk_add_rq_to_plug(struct blk_plug *plug, struct request *rq)
 	 */
 	if (!plug->has_elevator && (rq->rq_flags & RQF_SCHED_TAGS))
 		plug->has_elevator = true;
-	rq_list_add_head(&plug->mq_list, rq);
+	rq_list_add_tail(&plug->mq_list, rq);
 	plug->rq_count++;
 }
 
@@ -2840,7 +2840,7 @@ static void blk_mq_dispatch_plug_list(struct blk_plug *plug, bool from_sched)
 			rq_list_add_tail(&requeue_list, rq);
 			continue;
 		}
-		list_add(&rq->queuelist, &list);
+		list_add_tail(&rq->queuelist, &list);
 		depth++;
 	} while (!rq_list_empty(&plug->mq_list));
 
diff --git a/drivers/block/virtio_blk.c b/drivers/block/virtio_blk.c
index 2069bf9701f5..fd6c565f8a50 100644
--- a/drivers/block/virtio_blk.c
+++ b/drivers/block/virtio_blk.c
@@ -514,7 +514,7 @@ static void virtio_queue_rqs(struct rq_list *rqlist)
 		vq = this_vq;
 
 		if (virtblk_prep_rq_batch(req))
-			rq_list_add_head(&submit_list, req); /* reverse order */
+			rq_list_add_tail(&submit_list, req);
 		else
 			rq_list_add_tail(&requeue_list, req);
 	}
diff --git a/drivers/nvme/host/pci.c b/drivers/nvme/host/pci.c
index e943c1be0fca..e70618e8d35e 100644
--- a/drivers/nvme/host/pci.c
+++ b/drivers/nvme/host/pci.c
@@ -1026,7 +1026,7 @@ static void nvme_queue_rqs(struct rq_list *rqlist)
 		nvmeq = req->mq_hctx->driver_data;
 
 		if (nvme_prep_rq_batch(nvmeq, req))
-			rq_list_add_head(&submit_list, req); /* reverse order */
+			rq_list_add_tail(&submit_list, req);
 		else
 			rq_list_add_tail(&requeue_list, req);
 	}

^ permalink raw reply related	[flat|nested] 9+ messages in thread

* Re: [PATCH 2/3] block: add a rq_list type
  2025-04-18 17:54 ` [PATCH 2/3] block: add a rq_list type Bart Van Assche
@ 2025-04-19 11:47   ` Sasha Levin
  0 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2025-04-19 11:47 UTC (permalink / raw)
  To: stable, bvanassche; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

Summary of potential issues:
ℹ️ This is part 2/3 of a series
⚠️ Found matching upstream commit but patch is missing proper reference to it
⚠️ Found follow-up fixes in mainline

Found matching upstream commit: a3396b99990d8b4e5797e7b16fdeb64c15ae97bb

WARNING: Author mismatch between patch and found commit:
Backport author: Bart Van Assche<bvanassche@acm.org>
Commit author: Christoph Hellwig<hch@lst.de>

Found fixes commits:
957860cbc1dc block: make struct rq_list available for !CONFIG_BLOCK

Note: The patch differs from the upstream commit:
---
1:  a3396b99990d8 < -:  ------------- block: add a rq_list type
-:  ------------- > 1:  9bc5c94e278f7 Linux 6.14.2
---

NOTE: These results are for this patch alone. Full series testing will be
performed when all parts are received.

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.4.y        |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 1/3] block: remove rq_list_move
  2025-04-18 17:53 ` [PATCH 1/3] block: remove rq_list_move Bart Van Assche
@ 2025-04-19 11:47   ` Sasha Levin
  0 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2025-04-19 11:47 UTC (permalink / raw)
  To: stable, bvanassche; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

Summary of potential issues:
⚠️ Found matching upstream commit but patch is missing proper reference to it

Found matching upstream commit: e8225ab15006fbcdb14cef426a0a54475292fbbc

WARNING: Author mismatch between patch and found commit:
Backport author: Bart Van Assche<bvanassche@acm.org>
Commit author: Christoph Hellwig<hch@lst.de>

Note: The patch differs from the upstream commit:
---
1:  e8225ab15006f < -:  ------------- block: remove rq_list_move
-:  ------------- > 1:  9bc5c94e278f7 Linux 6.14.2
---

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.4.y        |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug
  2025-04-18 17:54 ` [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug Bart Van Assche
@ 2025-04-19 11:47   ` Sasha Levin
  0 siblings, 0 replies; 9+ messages in thread
From: Sasha Levin @ 2025-04-19 11:47 UTC (permalink / raw)
  To: stable, bvanassche; +Cc: Sasha Levin

[ Sasha's backport helper bot ]

Hi,

Summary of potential issues:
ℹ️ This is part 3/3 of a series
⚠️ Found matching upstream commit but patch is missing proper reference to it

Found matching upstream commit: e70c301faece15b618e54b613b1fd6ece3dd05b4

WARNING: Author mismatch between patch and found commit:
Backport author: Bart Van Assche<bvanassche@acm.org>
Commit author: Christoph Hellwig<hch@lst.de>

Note: The patch differs from the upstream commit:
---
1:  e70c301faece1 < -:  ------------- block: don't reorder requests in blk_add_rq_to_plug
-:  ------------- > 1:  9bc5c94e278f7 Linux 6.14.2
---

NOTE: These results are for this patch alone. Full series testing will be
performed when all parts are received.

Results of testing on various branches:

| Branch                    | Patch Apply | Build Test |
|---------------------------|-------------|------------|
| stable/linux-5.4.y        |  Success    |  Success   |

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Preserve the request order in the block layer
  2025-04-18 17:53 [PATCH 0/3] Preserve the request order in the block layer Bart Van Assche
                   ` (2 preceding siblings ...)
  2025-04-18 17:54 ` [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug Bart Van Assche
@ 2025-04-22 11:51 ` Greg Kroah-Hartman
  2025-05-04 17:14   ` Bart Van Assche
  3 siblings, 1 reply; 9+ messages in thread
From: Greg Kroah-Hartman @ 2025-04-22 11:51 UTC (permalink / raw)
  To: Bart Van Assche; +Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim

On Fri, Apr 18, 2025 at 10:53:58AM -0700, Bart Van Assche wrote:
> Hi Greg,
> 
> In kernel v6.10 the zoned storage approach was changed from zoned write
> locking to zone write plugging. Because of this change the block layer
> must preserve the request order. Hence this backport of Christoph's
> "don't reorder requests passed to ->queue_rqs" patch series. Please
> consider this patch series for inclusion in the 6.12 stable kernel.
> 
> See also https://lore.kernel.org/linux-block/20241113152050.157179-1-hch@lst.de/.

You sent this twice, right?  I'll grab this "second" version as I'm
guessing they were the same?

thanks,

greg k-h

^ permalink raw reply	[flat|nested] 9+ messages in thread

* Re: [PATCH 0/3] Preserve the request order in the block layer
  2025-04-22 11:51 ` [PATCH 0/3] Preserve the request order in the block layer Greg Kroah-Hartman
@ 2025-05-04 17:14   ` Bart Van Assche
  0 siblings, 0 replies; 9+ messages in thread
From: Bart Van Assche @ 2025-05-04 17:14 UTC (permalink / raw)
  To: Greg Kroah-Hartman; +Cc: stable, Christoph Hellwig, Damien Le Moal, Jaegeuk Kim

On 4/22/25 4:51 AM, Greg Kroah-Hartman wrote:
> On Fri, Apr 18, 2025 at 10:53:58AM -0700, Bart Van Assche wrote:
>> Hi Greg,
>>
>> In kernel v6.10 the zoned storage approach was changed from zoned write
>> locking to zone write plugging. Because of this change the block layer
>> must preserve the request order. Hence this backport of Christoph's
>> "don't reorder requests passed to ->queue_rqs" patch series. Please
>> consider this patch series for inclusion in the 6.12 stable kernel.
>>
>> See also https://lore.kernel.org/linux-block/20241113152050.157179-1-hch@lst.de/.
> 
> You sent this twice, right?  I'll grab this "second" version as I'm
> guessing they were the same?

(took the past two weeks off)

Hi Greg,

stable@vger.kernel.org was not Cc-ed the first time I sent this series,
hence the resend. Both series are indeed identical.

Thank you,

Bart.

^ permalink raw reply	[flat|nested] 9+ messages in thread

end of thread, other threads:[~2025-05-04 17:18 UTC | newest]

Thread overview: 9+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-18 17:53 [PATCH 0/3] Preserve the request order in the block layer Bart Van Assche
2025-04-18 17:53 ` [PATCH 1/3] block: remove rq_list_move Bart Van Assche
2025-04-19 11:47   ` Sasha Levin
2025-04-18 17:54 ` [PATCH 2/3] block: add a rq_list type Bart Van Assche
2025-04-19 11:47   ` Sasha Levin
2025-04-18 17:54 ` [PATCH 3/3] block: don't reorder requests in blk_add_rq_to_plug Bart Van Assche
2025-04-19 11:47   ` Sasha Levin
2025-04-22 11:51 ` [PATCH 0/3] Preserve the request order in the block layer Greg Kroah-Hartman
2025-05-04 17:14   ` Bart Van Assche

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox