* [PATCHv3 00/10] block integrity merging and counting
@ 2024-09-04 15:25 Keith Busch
2024-09-04 15:25 ` [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio Keith Busch
` (9 more replies)
0 siblings, 10 replies; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:25 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
Just some fixes and cleanups to counting integrity segments when
metadata is used. This fixes merging issues.
Changes from v2:
Included all the follow up patches I mentioned in the v2 thread.
Incorporated Christoph's suggestion to rename blk_rq_count_integrity_sg.
Keith Busch (10):
blk-mq: set the nr_integrity_segments from bio
block: provide helper for nr_integrity_segments
scsi: use request helper to get integrity segments
nvme-rdma: use request helper to get integrity segments
block: unexport blk_rq_count_integrity_sg
blk-integrity: simplify counting segments
blk-integrity: simplify mapping sg
blk-integrity: remove inappropriate limit checks
blk-integrity: consider entire bio list for merging
blk-merge: properly account for integrity segments
block/bio-integrity.c | 7 ----
block/blk-integrity.c | 78 +++++++----------------------------
block/blk-merge.c | 13 +++---
block/blk-mq.c | 4 ++
block/blk.h | 34 ---------------
drivers/nvme/host/rdma.c | 6 +--
drivers/scsi/scsi_lib.c | 6 +--
include/linux/blk-integrity.h | 11 ++---
include/linux/blk-mq.h | 12 ++++++
9 files changed, 47 insertions(+), 124 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 33+ messages in thread
* [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
@ 2024-09-04 15:25 ` Keith Busch
2024-09-10 15:29 ` Christoph Hellwig
2024-09-04 15:25 ` [PATCHv3 02/10] block: provide helper for nr_integrity_segments Keith Busch
` (8 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:25 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
This value is used for potential merging later.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-mq.c | 4 ++++
1 file changed, 4 insertions(+)
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 36abbaefe3874..3ed5181c75610 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2546,6 +2546,10 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
rq->__sector = bio->bi_iter.bi_sector;
rq->write_hint = bio->bi_write_hint;
blk_rq_bio_prep(rq, bio, nr_segs);
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ if (bio->bi_opf & REQ_INTEGRITY)
+ rq->nr_integrity_segments = blk_rq_count_integrity_sg(rq->q, bio);
+#endif
/* This can't fail, since GFP_NOIO includes __GFP_DIRECT_RECLAIM. */
err = blk_crypto_rq_bio_prep(rq, bio, GFP_NOIO);
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 02/10] block: provide helper for nr_integrity_segments
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
2024-09-04 15:25 ` [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio Keith Busch
@ 2024-09-04 15:25 ` Keith Busch
2024-09-10 15:30 ` Christoph Hellwig
2024-09-04 15:25 ` [PATCHv3 03/10] scsi: use request helper to get integrity segments Keith Busch
` (7 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:25 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
This way drivers that want this value don't need to concern themselves
with the CONFIG_BLK_DEV_INTEGRITY setting.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
include/linux/blk-mq.h | 12 ++++++++++++
1 file changed, 12 insertions(+)
diff --git a/include/linux/blk-mq.h b/include/linux/blk-mq.h
index 8d304b1d16b15..3984aad9bf64a 100644
--- a/include/linux/blk-mq.h
+++ b/include/linux/blk-mq.h
@@ -228,6 +228,18 @@ static inline unsigned short req_get_ioprio(struct request *req)
return req->ioprio;
}
+#ifdef CONFIG_BLK_DEV_INTEGRITY
+static inline unsigned short blk_rq_integrity_segments(struct request *rq)
+{
+ return rq->nr_integrity_segments;
+}
+#else
+static inline unsigned short blk_rq_integrity_segments(struct request *rq)
+{
+ return 0;
+}
+#endif
+
#define rq_data_dir(rq) (op_is_write(req_op(rq)) ? WRITE : READ)
#define rq_dma_dir(rq) \
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 03/10] scsi: use request helper to get integrity segments
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
2024-09-04 15:25 ` [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio Keith Busch
2024-09-04 15:25 ` [PATCHv3 02/10] block: provide helper for nr_integrity_segments Keith Busch
@ 2024-09-04 15:25 ` Keith Busch
2024-09-10 15:32 ` Christoph Hellwig
2024-09-04 15:25 ` [PATCHv3 04/10] nvme-rdma: " Keith Busch
` (6 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:25 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The request tracks the integrity segments already, so no need to recount
the segements again.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/scsi/scsi_lib.c | 3 +--
1 file changed, 1 insertion(+), 2 deletions(-)
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index 3958a6d14bf45..dc1a1644cbc0c 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1175,8 +1175,7 @@ blk_status_t scsi_alloc_sgtables(struct scsi_cmnd *cmd)
goto out_free_sgtables;
}
- ivecs = blk_rq_count_integrity_sg(rq->q, rq->bio);
-
+ ivecs = blk_rq_integrity_segments(rq);
if (sg_alloc_table_chained(&prot_sdb->table, ivecs,
prot_sdb->table.sgl,
SCSI_INLINE_PROT_SG_CNT)) {
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 04/10] nvme-rdma: use request helper to get integrity segments
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (2 preceding siblings ...)
2024-09-04 15:25 ` [PATCHv3 03/10] scsi: use request helper to get integrity segments Keith Busch
@ 2024-09-04 15:25 ` Keith Busch
2024-09-10 15:32 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg Keith Busch
` (5 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:25 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The request tracks the integrity segments already, so no need to
recount the segements again.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
drivers/nvme/host/rdma.c | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 2eb33842f9711..dc0987d42c6b2 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1496,7 +1496,7 @@ static int nvme_rdma_dma_map_req(struct ib_device *ibdev, struct request *rq,
req->metadata_sgl->sg_table.sgl =
(struct scatterlist *)(req->metadata_sgl + 1);
ret = sg_alloc_table_chained(&req->metadata_sgl->sg_table,
- blk_rq_count_integrity_sg(rq->q, rq->bio),
+ blk_rq_integrity_segments(rq),
req->metadata_sgl->sg_table.sgl,
NVME_INLINE_METADATA_SG_CNT);
if (unlikely(ret)) {
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (3 preceding siblings ...)
2024-09-04 15:25 ` [PATCHv3 04/10] nvme-rdma: " Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-10 15:33 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 06/10] blk-integrity: simplify counting segments Keith Busch
` (4 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
There are no external users of this.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-integrity.c | 1 -
1 file changed, 1 deletion(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 010decc892eaa..4365960153b91 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -53,7 +53,6 @@ int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
return segments;
}
-EXPORT_SYMBOL(blk_rq_count_integrity_sg);
/**
* blk_rq_map_integrity_sg - Map integrity metadata into a scatterlist
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (4 preceding siblings ...)
2024-09-04 15:26 ` [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-10 15:41 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 07/10] blk-integrity: simplify mapping sg Keith Busch
` (3 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The segments are already packed to the queue limits when adding them to
the bio, so each vector is already its own segment. No need to attempt
compacting them even more. And give the function a more appropriate
name, since we're counting segments, not scatter-gather elements.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-integrity.c | 33 ++++++---------------------------
block/blk-mq.c | 2 +-
include/linux/blk-integrity.h | 5 ++---
3 files changed, 9 insertions(+), 31 deletions(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 4365960153b91..c180141b7871c 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -17,39 +17,18 @@
#include "blk.h"
/**
- * blk_rq_count_integrity_sg - Count number of integrity scatterlist elements
- * @q: request queue
+ * blk_rq_count_integrity_segs - Count number of integrity segments
* @bio: bio with integrity metadata attached
*
* Description: Returns the number of elements required in a
* scatterlist corresponding to the integrity metadata in a bio.
*/
-int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
+int blk_rq_count_integrity_segs(struct bio *bio)
{
- struct bio_vec iv, ivprv = { NULL };
unsigned int segments = 0;
- unsigned int seg_size = 0;
- struct bvec_iter iter;
- int prev = 0;
-
- bio_for_each_integrity_vec(iv, bio, iter) {
- if (prev) {
- if (!biovec_phys_mergeable(q, &ivprv, &iv))
- goto new_segment;
- if (seg_size + iv.bv_len > queue_max_segment_size(q))
- goto new_segment;
-
- seg_size += iv.bv_len;
- } else {
-new_segment:
- segments++;
- seg_size = iv.bv_len;
- }
-
- prev = 1;
- ivprv = iv;
- }
+ for_each_bio(bio)
+ segments += bio->bi_integrity->bip_vcnt;
return segments;
}
@@ -62,7 +41,7 @@ int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
*
* Description: Map the integrity vectors in request into a
* scatterlist. The scatterlist must be big enough to hold all
- * elements. I.e. sized using blk_rq_count_integrity_sg().
+ * elements. I.e. sized using blk_rq_count_integrity_segs().
*/
int blk_rq_map_integrity_sg(struct request_queue *q, struct bio *bio,
struct scatterlist *sglist)
@@ -145,7 +124,7 @@ bool blk_integrity_merge_bio(struct request_queue *q, struct request *req,
return false;
bio->bi_next = NULL;
- nr_integrity_segs = blk_rq_count_integrity_sg(q, bio);
+ nr_integrity_segs = blk_rq_count_integrity_segs(bio);
bio->bi_next = next;
if (req->nr_integrity_segments + nr_integrity_segs >
diff --git a/block/blk-mq.c b/block/blk-mq.c
index 3ed5181c75610..79cc66275f1cd 100644
--- a/block/blk-mq.c
+++ b/block/blk-mq.c
@@ -2548,7 +2548,7 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
blk_rq_bio_prep(rq, bio, nr_segs);
#if defined(CONFIG_BLK_DEV_INTEGRITY)
if (bio->bi_opf & REQ_INTEGRITY)
- rq->nr_integrity_segments = blk_rq_count_integrity_sg(rq->q, bio);
+ rq->nr_integrity_segments = blk_rq_count_integrity_segs(bio);
#endif
/* This can't fail, since GFP_NOIO includes __GFP_DIRECT_RECLAIM. */
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index de98049b7ded9..0de05278ac824 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -27,7 +27,7 @@ static inline bool queue_limits_stack_integrity_bdev(struct queue_limits *t,
#ifdef CONFIG_BLK_DEV_INTEGRITY
int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
struct scatterlist *);
-int blk_rq_count_integrity_sg(struct request_queue *, struct bio *);
+int blk_rq_count_integrity_segs(struct bio *);
static inline bool
blk_integrity_queue_supports_integrity(struct request_queue *q)
@@ -91,8 +91,7 @@ static inline struct bio_vec rq_integrity_vec(struct request *rq)
rq->bio->bi_integrity->bip_iter);
}
#else /* CONFIG_BLK_DEV_INTEGRITY */
-static inline int blk_rq_count_integrity_sg(struct request_queue *q,
- struct bio *b)
+static inline int blk_rq_count_integrity_segs(struct bio *b)
{
return 0;
}
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 07/10] blk-integrity: simplify mapping sg
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (5 preceding siblings ...)
2024-09-04 15:26 ` [PATCHv3 06/10] blk-integrity: simplify counting segments Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-10 15:43 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks Keith Busch
` (2 subsequent siblings)
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The segments are already packed to the queue limits when adding them to
the bio, so each vector is already its own segment. No need to attempt
compacting them even more.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-integrity.c | 35 +++++++++--------------------------
drivers/nvme/host/rdma.c | 4 ++--
drivers/scsi/scsi_lib.c | 3 +--
include/linux/blk-integrity.h | 6 ++----
4 files changed, 14 insertions(+), 34 deletions(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index c180141b7871c..cfb394eff35c8 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -35,7 +35,6 @@ int blk_rq_count_integrity_segs(struct bio *bio)
/**
* blk_rq_map_integrity_sg - Map integrity metadata into a scatterlist
- * @q: request queue
* @bio: bio with integrity metadata attached
* @sglist: target scatterlist
*
@@ -43,39 +42,23 @@ int blk_rq_count_integrity_segs(struct bio *bio)
* scatterlist. The scatterlist must be big enough to hold all
* elements. I.e. sized using blk_rq_count_integrity_segs().
*/
-int blk_rq_map_integrity_sg(struct request_queue *q, struct bio *bio,
- struct scatterlist *sglist)
+int blk_rq_map_integrity_sg(struct bio *bio, struct scatterlist *sglist)
{
- struct bio_vec iv, ivprv = { NULL };
struct scatterlist *sg = NULL;
unsigned int segments = 0;
struct bvec_iter iter;
- int prev = 0;
+ struct bio_vec iv;
bio_for_each_integrity_vec(iv, bio, iter) {
-
- if (prev) {
- if (!biovec_phys_mergeable(q, &ivprv, &iv))
- goto new_segment;
- if (sg->length + iv.bv_len > queue_max_segment_size(q))
- goto new_segment;
-
- sg->length += iv.bv_len;
- } else {
-new_segment:
- if (!sg)
- sg = sglist;
- else {
- sg_unmark_end(sg);
- sg = sg_next(sg);
- }
-
- sg_set_page(sg, iv.bv_page, iv.bv_len, iv.bv_offset);
- segments++;
+ if (!sg)
+ sg = sglist;
+ else {
+ sg_unmark_end(sg);
+ sg = sg_next(sg);
}
- prev = 1;
- ivprv = iv;
+ sg_set_page(sg, iv.bv_page, iv.bv_len, iv.bv_offset);
+ segments++;
}
if (sg)
diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index dc0987d42c6b2..fab205bb4f3ed 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1504,8 +1504,8 @@ static int nvme_rdma_dma_map_req(struct ib_device *ibdev, struct request *rq,
goto out_unmap_sg;
}
- req->metadata_sgl->nents = blk_rq_map_integrity_sg(rq->q,
- rq->bio, req->metadata_sgl->sg_table.sgl);
+ req->metadata_sgl->nents = blk_rq_map_integrity_sg(rq->bio,
+ req->metadata_sgl->sg_table.sgl);
*pi_count = ib_dma_map_sg(ibdev,
req->metadata_sgl->sg_table.sgl,
req->metadata_sgl->nents,
diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
index dc1a1644cbc0c..33a7d07dcbe26 100644
--- a/drivers/scsi/scsi_lib.c
+++ b/drivers/scsi/scsi_lib.c
@@ -1183,8 +1183,7 @@ blk_status_t scsi_alloc_sgtables(struct scsi_cmnd *cmd)
goto out_free_sgtables;
}
- count = blk_rq_map_integrity_sg(rq->q, rq->bio,
- prot_sdb->table.sgl);
+ count = blk_rq_map_integrity_sg(rq->bio, prot_sdb->table.sgl);
BUG_ON(count > ivecs);
BUG_ON(count > queue_max_integrity_segments(rq->q));
diff --git a/include/linux/blk-integrity.h b/include/linux/blk-integrity.h
index 0de05278ac824..38b43d6c797df 100644
--- a/include/linux/blk-integrity.h
+++ b/include/linux/blk-integrity.h
@@ -25,8 +25,7 @@ static inline bool queue_limits_stack_integrity_bdev(struct queue_limits *t,
}
#ifdef CONFIG_BLK_DEV_INTEGRITY
-int blk_rq_map_integrity_sg(struct request_queue *, struct bio *,
- struct scatterlist *);
+int blk_rq_map_integrity_sg(struct bio *, struct scatterlist *);
int blk_rq_count_integrity_segs(struct bio *);
static inline bool
@@ -95,8 +94,7 @@ static inline int blk_rq_count_integrity_segs(struct bio *b)
{
return 0;
}
-static inline int blk_rq_map_integrity_sg(struct request_queue *q,
- struct bio *b,
+static inline int blk_rq_map_integrity_sg(struct bio *b,
struct scatterlist *s)
{
return 0;
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (6 preceding siblings ...)
2024-09-04 15:26 ` [PATCHv3 07/10] blk-integrity: simplify mapping sg Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-10 15:45 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 09/10] blk-integrity: consider entire bio list for merging Keith Busch
2024-09-04 15:26 ` [PATCHv3 10/10] blk-merge: properly account for integrity segments Keith Busch
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
The queue limits for block access are not the same as metadata access.
Delete these.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/bio-integrity.c | 7 -------
block/blk-integrity.c | 3 ---
block/blk-merge.c | 6 ------
block/blk.h | 34 ----------------------------------
4 files changed, 50 deletions(-)
diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 8d1fb38f745f9..ddd85eb46fbfb 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -184,13 +184,6 @@ int bio_integrity_add_page(struct bio *bio, struct page *page,
if (bip->bip_vcnt >=
min(bip->bip_max_vcnt, queue_max_integrity_segments(q)))
return 0;
-
- /*
- * If the queue doesn't support SG gaps and adding this segment
- * would create a gap, disallow it.
- */
- if (bvec_gap_to_prev(&q->limits, bv, offset))
- return 0;
}
bvec_set_page(&bip->bip_vec[bip->bip_vcnt], page, len, offset);
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index cfb394eff35c8..f9367f3a04208 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -85,9 +85,6 @@ bool blk_integrity_merge_rq(struct request_queue *q, struct request *req,
q->limits.max_integrity_segments)
return false;
- if (integrity_req_gap_back_merge(req, next->bio))
- return false;
-
return true;
}
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 56769c4bcd799..43ab1ce09de65 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -650,9 +650,6 @@ int ll_back_merge_fn(struct request *req, struct bio *bio, unsigned int nr_segs)
{
if (req_gap_back_merge(req, bio))
return 0;
- if (blk_integrity_rq(req) &&
- integrity_req_gap_back_merge(req, bio))
- return 0;
if (!bio_crypt_ctx_back_mergeable(req, bio))
return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
@@ -669,9 +666,6 @@ static int ll_front_merge_fn(struct request *req, struct bio *bio,
{
if (req_gap_front_merge(req, bio))
return 0;
- if (blk_integrity_rq(req) &&
- integrity_req_gap_front_merge(req, bio))
- return 0;
if (!bio_crypt_ctx_front_mergeable(req, bio))
return 0;
if (blk_rq_sectors(req) + bio_sectors(bio) >
diff --git a/block/blk.h b/block/blk.h
index 32f4e9f630a3a..3f6198824b258 100644
--- a/block/blk.h
+++ b/block/blk.h
@@ -224,29 +224,6 @@ bool blk_integrity_merge_rq(struct request_queue *, struct request *,
struct request *);
bool blk_integrity_merge_bio(struct request_queue *, struct request *,
struct bio *);
-
-static inline bool integrity_req_gap_back_merge(struct request *req,
- struct bio *next)
-{
- struct bio_integrity_payload *bip = bio_integrity(req->bio);
- struct bio_integrity_payload *bip_next = bio_integrity(next);
-
- return bvec_gap_to_prev(&req->q->limits,
- &bip->bip_vec[bip->bip_vcnt - 1],
- bip_next->bip_vec[0].bv_offset);
-}
-
-static inline bool integrity_req_gap_front_merge(struct request *req,
- struct bio *bio)
-{
- struct bio_integrity_payload *bip = bio_integrity(bio);
- struct bio_integrity_payload *bip_next = bio_integrity(req->bio);
-
- return bvec_gap_to_prev(&req->q->limits,
- &bip->bip_vec[bip->bip_vcnt - 1],
- bip_next->bip_vec[0].bv_offset);
-}
-
extern const struct attribute_group blk_integrity_attr_group;
#else /* CONFIG_BLK_DEV_INTEGRITY */
static inline bool blk_integrity_merge_rq(struct request_queue *rq,
@@ -259,17 +236,6 @@ static inline bool blk_integrity_merge_bio(struct request_queue *rq,
{
return true;
}
-static inline bool integrity_req_gap_back_merge(struct request *req,
- struct bio *next)
-{
- return false;
-}
-static inline bool integrity_req_gap_front_merge(struct request *req,
- struct bio *bio)
-{
- return false;
-}
-
static inline void blk_flush_integrity(void)
{
}
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 09/10] blk-integrity: consider entire bio list for merging
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (7 preceding siblings ...)
2024-09-04 15:26 ` [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-10 15:48 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 10/10] blk-merge: properly account for integrity segments Keith Busch
9 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
If a bio is merged to a req, the entire bio list is merged, so don't
temporarily unchain it when counting segments for consideration.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-integrity.c | 4 ----
1 file changed, 4 deletions(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index f9367f3a04208..985de64409cf5 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -92,7 +92,6 @@ bool blk_integrity_merge_bio(struct request_queue *q, struct request *req,
struct bio *bio)
{
int nr_integrity_segs;
- struct bio *next = bio->bi_next;
if (blk_integrity_rq(req) == 0 && bio_integrity(bio) == NULL)
return true;
@@ -103,10 +102,7 @@ bool blk_integrity_merge_bio(struct request_queue *q, struct request *req,
if (bio_integrity(req->bio)->bip_flags != bio_integrity(bio)->bip_flags)
return false;
- bio->bi_next = NULL;
nr_integrity_segs = blk_rq_count_integrity_segs(bio);
- bio->bi_next = next;
-
if (req->nr_integrity_segments + nr_integrity_segs >
q->limits.max_integrity_segments)
return false;
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* [PATCHv3 10/10] blk-merge: properly account for integrity segments
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
` (8 preceding siblings ...)
2024-09-04 15:26 ` [PATCHv3 09/10] blk-integrity: consider entire bio list for merging Keith Busch
@ 2024-09-04 15:26 ` Keith Busch
2024-09-06 11:28 ` Anuj Gupta
2024-09-10 15:49 ` Christoph Hellwig
9 siblings, 2 replies; 33+ messages in thread
From: Keith Busch @ 2024-09-04 15:26 UTC (permalink / raw)
To: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi
Cc: Keith Busch
From: Keith Busch <kbusch@kernel.org>
Merging two requests wasn't accounting for the new segment count, so add
the "next" segement count to the first on a successful merge.
Merging a bio into an existing request was double counting the bio's
segments, even if the merge failed later on. Move the segment accounting
to the end when the merge is successful.
Signed-off-by: Keith Busch <kbusch@kernel.org>
---
block/blk-integrity.c | 2 --
block/blk-merge.c | 7 +++++++
2 files changed, 7 insertions(+), 2 deletions(-)
diff --git a/block/blk-integrity.c b/block/blk-integrity.c
index 985de64409cf5..4222c78eab18f 100644
--- a/block/blk-integrity.c
+++ b/block/blk-integrity.c
@@ -107,8 +107,6 @@ bool blk_integrity_merge_bio(struct request_queue *q, struct request *req,
q->limits.max_integrity_segments)
return false;
- req->nr_integrity_segments += nr_integrity_segs;
-
return true;
}
diff --git a/block/blk-merge.c b/block/blk-merge.c
index 43ab1ce09de65..734e6e22a6d22 100644
--- a/block/blk-merge.c
+++ b/block/blk-merge.c
@@ -639,6 +639,10 @@ static inline int ll_new_hw_segment(struct request *req, struct bio *bio,
* counters.
*/
req->nr_phys_segments += nr_phys_segs;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ if (bio->bi_opf & REQ_INTEGRITY)
+ req->nr_integrity_segments += blk_rq_count_integrity_segs(bio);
+#endif
return 1;
no_merge:
@@ -725,6 +729,9 @@ static int ll_merge_requests_fn(struct request_queue *q, struct request *req,
/* Merge is OK... */
req->nr_phys_segments = total_phys_segments;
+#if defined(CONFIG_BLK_DEV_INTEGRITY)
+ req->nr_integrity_segments += next->nr_integrity_segments;
+#endif
return 1;
}
--
2.43.5
^ permalink raw reply related [flat|nested] 33+ messages in thread
* Re: [PATCHv3 10/10] blk-merge: properly account for integrity segments
2024-09-04 15:26 ` [PATCHv3 10/10] blk-merge: properly account for integrity segments Keith Busch
@ 2024-09-06 11:28 ` Anuj Gupta
2024-09-10 15:49 ` Christoph Hellwig
1 sibling, 0 replies; 33+ messages in thread
From: Anuj Gupta @ 2024-09-06 11:28 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
[-- Attachment #1: Type: text/plain, Size: 243 bytes --]
On 04/09/24 08:26AM, Keith Busch wrote:
>From: Keith Busch <kbusch@kernel.org>
>
>Merging two requests wasn't accounting for the new segment count, so add
>the "next" segement count to the first on a successful merge.
Nit: s/segement/segment
[-- Attachment #2: Type: text/plain, Size: 0 bytes --]
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio
2024-09-04 15:25 ` [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio Keith Busch
@ 2024-09-10 15:29 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:29 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 02/10] block: provide helper for nr_integrity_segments
2024-09-04 15:25 ` [PATCHv3 02/10] block: provide helper for nr_integrity_segments Keith Busch
@ 2024-09-10 15:30 ` Christoph Hellwig
2024-09-10 15:39 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:30 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
On Wed, Sep 04, 2024 at 08:25:57AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> This way drivers that want this value don't need to concern themselves
> with the CONFIG_BLK_DEV_INTEGRITY setting.
Looks ok:
Reviewed-by: Christoph Hellwig <hch@lst.de>
Although I wonder if we should simply define the field unconditionally
given that it is only 2 bytes wide and packs nicely.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 03/10] scsi: use request helper to get integrity segments
2024-09-04 15:25 ` [PATCHv3 03/10] scsi: use request helper to get integrity segments Keith Busch
@ 2024-09-10 15:32 ` Christoph Hellwig
2024-09-10 23:02 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:32 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
> diff --git a/drivers/scsi/scsi_lib.c b/drivers/scsi/scsi_lib.c
> index 3958a6d14bf45..dc1a1644cbc0c 100644
> --- a/drivers/scsi/scsi_lib.c
> +++ b/drivers/scsi/scsi_lib.c
> @@ -1175,8 +1175,7 @@ blk_status_t scsi_alloc_sgtables(struct scsi_cmnd *cmd)
> goto out_free_sgtables;
> }
>
> - ivecs = blk_rq_count_integrity_sg(rq->q, rq->bio);
> -
> + ivecs = blk_rq_integrity_segments(rq);
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
although I'd be tempted to just remove the BUG_ON below (or move
it into blk_rq_map_integrity_sg) and the ivecs variable entirely.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 04/10] nvme-rdma: use request helper to get integrity segments
2024-09-04 15:25 ` [PATCHv3 04/10] nvme-rdma: " Keith Busch
@ 2024-09-10 15:32 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:32 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg
2024-09-04 15:26 ` [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg Keith Busch
@ 2024-09-10 15:33 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:33 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 02/10] block: provide helper for nr_integrity_segments
2024-09-10 15:30 ` Christoph Hellwig
@ 2024-09-10 15:39 ` Keith Busch
0 siblings, 0 replies; 33+ messages in thread
From: Keith Busch @ 2024-09-10 15:39 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:30:46PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 04, 2024 at 08:25:57AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > This way drivers that want this value don't need to concern themselves
> > with the CONFIG_BLK_DEV_INTEGRITY setting.
>
> Looks ok:
>
> Reviewed-by: Christoph Hellwig <hch@lst.de>
>
> Although I wonder if we should simply define the field unconditionally
> given that it is only 2 bytes wide and packs nicely.
Good idea, I didn't consider that. Various parts become cleaner if it's
unconditionally part of the request. I'll try it out.
BTW, just want to mention the the return value here is unreliable until
patch 10. I've reworked the series so that appears first to avoid a
bisect hazard. The end result is the same though, so I didn't want to
spam the mailing list with a v2 just yet.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-04 15:26 ` [PATCHv3 06/10] blk-integrity: simplify counting segments Keith Busch
@ 2024-09-10 15:41 ` Christoph Hellwig
2024-09-10 16:06 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:41 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
On Wed, Sep 04, 2024 at 08:26:01AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The segments are already packed to the queue limits when adding them to
> the bio,
I can't really parse this. I guess this talks about
bio_integrity_add_page trying to append the payload to the last
vector when possible?
> -int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
> +int blk_rq_count_integrity_segs(struct bio *bio)
> {
> - struct bio_vec iv, ivprv = { NULL };
> unsigned int segments = 0;
> - unsigned int seg_size = 0;
> - struct bvec_iter iter;
> - int prev = 0;
> -
> - bio_for_each_integrity_vec(iv, bio, iter) {
>
> - if (prev) {
> - if (!biovec_phys_mergeable(q, &ivprv, &iv))
> - goto new_segment;
> - if (seg_size + iv.bv_len > queue_max_segment_size(q))
> - goto new_segment;
> -
> - seg_size += iv.bv_len;
> - } else {
> -new_segment:
> - segments++;
> - seg_size = iv.bv_len;
Q: for the data path the caller submitted bio_vecs can be larger
than the max segment size, and given that the metadata API tries
to follow that in general, I'd assume we could also get metadata
segments larger than the segment size in theory, in which case
we'd need to split a bvec into multiple segments, similar to what
bvec_split_segs does. Do we need similar handling for metadata?
Or are we going to say that metadata must e.g. always be smaller
than PAGE_SIZE as max_segment_sizse must be >= PAGE_SIZE?
> + for_each_bio(bio)
> + segments += bio->bi_integrity->bip_vcnt;
If a bio was cloned bip_vcnt isn't the correct value here,
we'll need to use the iter to count the segments.
> bio->bi_next = NULL;
> - nr_integrity_segs = blk_rq_count_integrity_sg(q, bio);
> + nr_integrity_segs = blk_rq_count_integrity_segs(bio);
> bio->bi_next = next;
And instead of playing the magic with the bio chain here, I'd have
a low-level helper to count the bio segments here.
>
> if (req->nr_integrity_segments + nr_integrity_segs >
> diff --git a/block/blk-mq.c b/block/blk-mq.c
> index 3ed5181c75610..79cc66275f1cd 100644
> --- a/block/blk-mq.c
> +++ b/block/blk-mq.c
> @@ -2548,7 +2548,7 @@ static void blk_mq_bio_to_request(struct request *rq, struct bio *bio,
> blk_rq_bio_prep(rq, bio, nr_segs);
> #if defined(CONFIG_BLK_DEV_INTEGRITY)
> if (bio->bi_opf & REQ_INTEGRITY)
> - rq->nr_integrity_segments = blk_rq_count_integrity_sg(rq->q, bio);
> + rq->nr_integrity_segments = blk_rq_count_integrity_segs(bio);
And here I'm actually pretty sure this is always a single bio and not
a chain either.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 07/10] blk-integrity: simplify mapping sg
2024-09-04 15:26 ` [PATCHv3 07/10] blk-integrity: simplify mapping sg Keith Busch
@ 2024-09-10 15:43 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:43 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
On Wed, Sep 04, 2024 at 08:26:02AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The segments are already packed to the queue limits when adding them to
> the bio, so each vector is already its own segment. No need to attempt
> compacting them even more.
Well. For one the immutable bio API is explicitly designed so that
the callers don't need to know the limits, so this isn't really
true.
And the other thing the map_sg helpers for data and metadata do is
cross-bio merges, that is if the end of one bio is contiguous with
the start of the next, it will merge the segments. I don't really
know how useful that is there days - maybe we can removed it with
a proper rational after a bit of testing.
Again the current code closely mirrors the code for mapping the data
bvecs, and I'd preferably keep them as close as possible.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks
2024-09-04 15:26 ` [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks Keith Busch
@ 2024-09-10 15:45 ` Christoph Hellwig
2024-09-10 16:21 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:45 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
On Wed, Sep 04, 2024 at 08:26:03AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> The queue limits for block access are not the same as metadata access.
> Delete these.
While for NVMe-PCIe there isn't any metadata mapping using PRPs, for RDMA
the PRP-like scheme is used for anything, so the same virtual boundary
limits apply for all mappings.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 09/10] blk-integrity: consider entire bio list for merging
2024-09-04 15:26 ` [PATCHv3 09/10] blk-integrity: consider entire bio list for merging Keith Busch
@ 2024-09-10 15:48 ` Christoph Hellwig
2024-09-10 17:19 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:48 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
On Wed, Sep 04, 2024 at 08:26:04AM -0700, Keith Busch wrote:
> From: Keith Busch <kbusch@kernel.org>
>
> If a bio is merged to a req, the entire bio list is merged, so don't
> temporarily unchain it when counting segments for consideration.
As far as I can tell we never do merge decisions on bio lists. If
bi_next is non-NULL here it probably is due to scheduler lists or
something like it.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 10/10] blk-merge: properly account for integrity segments
2024-09-04 15:26 ` [PATCHv3 10/10] blk-merge: properly account for integrity segments Keith Busch
2024-09-06 11:28 ` Anuj Gupta
@ 2024-09-10 15:49 ` Christoph Hellwig
1 sibling, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-10 15:49 UTC (permalink / raw)
To: Keith Busch
Cc: axboe, hch, martin.petersen, linux-block, linux-nvme, linux-scsi,
sagi, Keith Busch
Looks good:
Reviewed-by: Christoph Hellwig <hch@lst.de>
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-10 15:41 ` Christoph Hellwig
@ 2024-09-10 16:06 ` Keith Busch
2024-09-11 8:17 ` Christoph Hellwig
0 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-10 16:06 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:41:05PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 04, 2024 at 08:26:01AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > The segments are already packed to the queue limits when adding them to
> > the bio,
>
> I can't really parse this. I guess this talks about
> bio_integrity_add_page trying to append the payload to the last
> vector when possible?
Exactly. bio_integrity_add_page will use the queue's limits to decide if
it can combine pages into one vector, so appending pages through that
interface will always result in the most compact bip vector.
This patch doesn't combine merged bio's but that's unlikely to have
mergable segments.
> > -int blk_rq_count_integrity_sg(struct request_queue *q, struct bio *bio)
> > +int blk_rq_count_integrity_segs(struct bio *bio)
> > {
> > - struct bio_vec iv, ivprv = { NULL };
> > unsigned int segments = 0;
> > - unsigned int seg_size = 0;
> > - struct bvec_iter iter;
> > - int prev = 0;
> > -
> > - bio_for_each_integrity_vec(iv, bio, iter) {
> >
> > - if (prev) {
> > - if (!biovec_phys_mergeable(q, &ivprv, &iv))
> > - goto new_segment;
> > - if (seg_size + iv.bv_len > queue_max_segment_size(q))
> > - goto new_segment;
> > -
> > - seg_size += iv.bv_len;
> > - } else {
> > -new_segment:
> > - segments++;
> > - seg_size = iv.bv_len;
>
> Q: for the data path the caller submitted bio_vecs can be larger
> than the max segment size, and given that the metadata API tries
> to follow that in general, I'd assume we could also get metadata
> segments larger than the segment size in theory, in which case
> we'd need to split a bvec into multiple segments, similar to what
> bvec_split_segs does. Do we need similar handling for metadata?
> Or are we going to say that metadata must e.g. always be smaller
> than PAGE_SIZE as max_segment_sizse must be >= PAGE_SIZE?
The common use cases don't add integrity data until after the bio is
already split in __bio_split_to_limits(), and it won't be split again
after integrity is added via bio_integrity_prep(). The common path
always adds integrity in a single segment, so it's always valid.
There are just a few other users that set their own bio integrity before
submitting (the nvme and scsi target drivers), and I think both can
break from possible bio splitting, but I haven't been able to test
those.
> > + for_each_bio(bio)
> > + segments += bio->bi_integrity->bip_vcnt;
>
> If a bio was cloned bip_vcnt isn't the correct value here,
> we'll need to use the iter to count the segments.
Darn. The common use case doesn't have integrity added until just before
it's dispatched, so the integrity cloning doesn't normally happen for
that case.
Let's just drop patches 6 and 7 from consideration for now. They are a
bit too optimistic, and doesn't really fix anything anyway.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks
2024-09-10 15:45 ` Christoph Hellwig
@ 2024-09-10 16:21 ` Keith Busch
2024-09-11 8:18 ` Christoph Hellwig
0 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-10 16:21 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:45:26PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 04, 2024 at 08:26:03AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > The queue limits for block access are not the same as metadata access.
> > Delete these.
>
> While for NVMe-PCIe there isn't any metadata mapping using PRPs, for RDMA
> the PRP-like scheme is used for anything, so the same virtual boundary
> limits apply for all mappings.
Oh shoot. The end goal here is to support NVMe META SGL mode, and that
virt boundary doesn't work there.
I was hoping to not introduce another queue limit for the metadata
virt_boundary_mask. We could remove the virt_boundary from the NVMe PCI
driver if it supports SGL's and then we're fine. But we're commiting to
that data format for all IO even if PRP might have been more efficient.
That might be alright, though.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 09/10] blk-integrity: consider entire bio list for merging
2024-09-10 15:48 ` Christoph Hellwig
@ 2024-09-10 17:19 ` Keith Busch
0 siblings, 0 replies; 33+ messages in thread
From: Keith Busch @ 2024-09-10 17:19 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:48:43PM +0200, Christoph Hellwig wrote:
> On Wed, Sep 04, 2024 at 08:26:04AM -0700, Keith Busch wrote:
> > From: Keith Busch <kbusch@kernel.org>
> >
> > If a bio is merged to a req, the entire bio list is merged, so don't
> > temporarily unchain it when counting segments for consideration.
>
> As far as I can tell we never do merge decisions on bio lists. If
> bi_next is non-NULL here it probably is due to scheduler lists or
> something like it.
I think bi_next is always NULL, so the unlinking should be a no-op. But
just in case it isn't, the current unlinking will get the wrong segment
count from the resulting merge.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 03/10] scsi: use request helper to get integrity segments
2024-09-10 15:32 ` Christoph Hellwig
@ 2024-09-10 23:02 ` Keith Busch
2024-09-11 8:06 ` Christoph Hellwig
0 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-10 23:02 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:32:17PM +0200, Christoph Hellwig wrote:
>
> although I'd be tempted to just remove the BUG_ON below (or move
> it into blk_rq_map_integrity_sg) and the ivecs variable entirely.
Right, I actually have more follow up's doing that. We just need to
change blk_rq_map_integrity_sg() first to take a request instead of a
request_queue + bio. I thought I might be getting carried away with the
"cleanups" though, so was saving that for later. I can definitely add
that into the series now though.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 03/10] scsi: use request helper to get integrity segments
2024-09-10 23:02 ` Keith Busch
@ 2024-09-11 8:06 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-11 8:06 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, martin.petersen,
linux-block, linux-nvme, linux-scsi, sagi
On Tue, Sep 10, 2024 at 05:02:18PM -0600, Keith Busch wrote:
> Right, I actually have more follow up's doing that. We just need to
> change blk_rq_map_integrity_sg() first to take a request instead of a
> request_queue + bio. I thought I might be getting carried away with the
> "cleanups" though, so was saving that for later. I can definitely add
> that into the series now though.
Well, it seems like the metadata code really needs a lot more attention
:(
And it seems like we really need the io_uring passthrough support to
be able to fully exercise it.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-10 16:06 ` Keith Busch
@ 2024-09-11 8:17 ` Christoph Hellwig
2024-09-11 15:28 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-11 8:17 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, martin.petersen,
linux-block, linux-nvme, linux-scsi, sagi
On Tue, Sep 10, 2024 at 10:06:03AM -0600, Keith Busch wrote:
> Exactly. bio_integrity_add_page will use the queue's limits to decide if
> it can combine pages into one vector, so appending pages through that
> interface will always result in the most compact bip vector.
>
> This patch doesn't combine merged bio's but that's unlikely to have
> mergable segments.
Oh, bio_integrity_add_page uses bvec_try_merge_hw_page. That means it
doesn't really work well for our stacking model, as any stacking driver
can and will change these. Maybe we need to take a step back and fully
apply the immutable biovec and split at final submission model to
metadata?
> The common use cases don't add integrity data until after the bio is
> already split in __bio_split_to_limits(), and it won't be split again
> after integrity is added via bio_integrity_prep(). The common path
> always adds integrity in a single segment, so it's always valid.
Where common case is the somewhat awful auto PI in the lowest level
driver. I'd really much prefer to move to the file system adding
the PI wherever possible, as that way it can actually look into it
(and return it to the driver, etc).
> There are just a few other users that set their own bio integrity before
> submitting (the nvme and scsi target drivers), and I think both can
> break from possible bio splitting, but I haven't been able to test
> those.
Yes. Plus dm-integrity and the new io_uring read/write with PI code
that's being submitted. I plan to also support this from the file
system eventually. None of these seems widely used, which I think
explains the current messy state of PI as soon as merging/splitting
or remapping is involved.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks
2024-09-10 16:21 ` Keith Busch
@ 2024-09-11 8:18 ` Christoph Hellwig
2024-09-11 15:18 ` Keith Busch
0 siblings, 1 reply; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-11 8:18 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, martin.petersen,
linux-block, linux-nvme, linux-scsi, sagi
On Tue, Sep 10, 2024 at 10:21:09AM -0600, Keith Busch wrote:
> I was hoping to not introduce another queue limit for the metadata
> virt_boundary_mask. We could remove the virt_boundary from the NVMe PCI
> driver if it supports SGL's and then we're fine. But we're commiting to
> that data format for all IO even if PRP might have been more efficient.
> That might be alright, though.
The real question is if we actually want that. If we want to take
fully advantage of the new dma mapping API that Leon is proposing we'd
need to impose a similar limit. And a lot of these super tiny SGL
segments probably aren't very efficient to start with. Something that
bounce buffers a page worth of PI tuples might end up beeing more
efficient in the end.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks
2024-09-11 8:18 ` Christoph Hellwig
@ 2024-09-11 15:18 ` Keith Busch
0 siblings, 0 replies; 33+ messages in thread
From: Keith Busch @ 2024-09-11 15:18 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Wed, Sep 11, 2024 at 10:18:46AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 10:21:09AM -0600, Keith Busch wrote:
> > I was hoping to not introduce another queue limit for the metadata
> > virt_boundary_mask. We could remove the virt_boundary from the NVMe PCI
> > driver if it supports SGL's and then we're fine. But we're commiting to
> > that data format for all IO even if PRP might have been more efficient.
> > That might be alright, though.
>
> The real question is if we actually want that. If we want to take
> fully advantage of the new dma mapping API that Leon is proposing we'd
> need to impose a similar limit. And a lot of these super tiny SGL
> segments probably aren't very efficient to start with. Something that
> bounce buffers a page worth of PI tuples might end up beeing more
> efficient in the end.
Yes, a bounce buffer is more efficient for the device side of the
transaction. We have the infrastructure for it when the integrity data
comes from user space addresses, so I suppose we could introduce a "kern"
equivalent.
But currently, instead of many small SGL's stiched together in a single
command, we have the same small SGLs in their own commands. I think this
patch set is a step in the right direction, and doesn't preclude bounce
optimizations.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-11 8:17 ` Christoph Hellwig
@ 2024-09-11 15:28 ` Keith Busch
2024-09-12 7:01 ` Christoph Hellwig
0 siblings, 1 reply; 33+ messages in thread
From: Keith Busch @ 2024-09-11 15:28 UTC (permalink / raw)
To: Christoph Hellwig
Cc: Keith Busch, axboe, martin.petersen, linux-block, linux-nvme,
linux-scsi, sagi
On Wed, Sep 11, 2024 at 10:17:20AM +0200, Christoph Hellwig wrote:
> On Tue, Sep 10, 2024 at 10:06:03AM -0600, Keith Busch wrote:
> > Exactly. bio_integrity_add_page will use the queue's limits to decide if
> > it can combine pages into one vector, so appending pages through that
> > interface will always result in the most compact bip vector.
> >
> > This patch doesn't combine merged bio's but that's unlikely to have
> > mergable segments.
>
> Oh, bio_integrity_add_page uses bvec_try_merge_hw_page. That means it
> doesn't really work well for our stacking model, as any stacking driver
> can and will change these. Maybe we need to take a step back and fully
> apply the immutable biovec and split at final submission model to
> metadata?
Would you be okay if I resubmit this patchset with just the minimum to
fix the existing merging? I agree stacking and splitting integrity is
currently broken, but I think the merging fixes from this series need to
happen regardless of the how the block stack might change when integrity
data is set in bios.
^ permalink raw reply [flat|nested] 33+ messages in thread
* Re: [PATCHv3 06/10] blk-integrity: simplify counting segments
2024-09-11 15:28 ` Keith Busch
@ 2024-09-12 7:01 ` Christoph Hellwig
0 siblings, 0 replies; 33+ messages in thread
From: Christoph Hellwig @ 2024-09-12 7:01 UTC (permalink / raw)
To: Keith Busch
Cc: Christoph Hellwig, Keith Busch, axboe, martin.petersen,
linux-block, linux-nvme, linux-scsi, sagi
On Wed, Sep 11, 2024 at 09:28:38AM -0600, Keith Busch wrote:
> Would you be okay if I resubmit this patchset with just the minimum to
> fix the existing merging? I agree stacking and splitting integrity is
> currently broken, but I think the merging fixes from this series need to
> happen regardless of the how the block stack might change when integrity
> data is set in bios.
I guess everything that makes it less broken is good. I'm a little
worried about removing some of the split/count code we'll eventually
need, though.
^ permalink raw reply [flat|nested] 33+ messages in thread
end of thread, other threads:[~2024-09-12 7:02 UTC | newest]
Thread overview: 33+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-09-04 15:25 [PATCHv3 00/10] block integrity merging and counting Keith Busch
2024-09-04 15:25 ` [PATCHv3 01/10] blk-mq: set the nr_integrity_segments from bio Keith Busch
2024-09-10 15:29 ` Christoph Hellwig
2024-09-04 15:25 ` [PATCHv3 02/10] block: provide helper for nr_integrity_segments Keith Busch
2024-09-10 15:30 ` Christoph Hellwig
2024-09-10 15:39 ` Keith Busch
2024-09-04 15:25 ` [PATCHv3 03/10] scsi: use request helper to get integrity segments Keith Busch
2024-09-10 15:32 ` Christoph Hellwig
2024-09-10 23:02 ` Keith Busch
2024-09-11 8:06 ` Christoph Hellwig
2024-09-04 15:25 ` [PATCHv3 04/10] nvme-rdma: " Keith Busch
2024-09-10 15:32 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 05/10] block: unexport blk_rq_count_integrity_sg Keith Busch
2024-09-10 15:33 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 06/10] blk-integrity: simplify counting segments Keith Busch
2024-09-10 15:41 ` Christoph Hellwig
2024-09-10 16:06 ` Keith Busch
2024-09-11 8:17 ` Christoph Hellwig
2024-09-11 15:28 ` Keith Busch
2024-09-12 7:01 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 07/10] blk-integrity: simplify mapping sg Keith Busch
2024-09-10 15:43 ` Christoph Hellwig
2024-09-04 15:26 ` [PATCHv3 08/10] blk-integrity: remove inappropriate limit checks Keith Busch
2024-09-10 15:45 ` Christoph Hellwig
2024-09-10 16:21 ` Keith Busch
2024-09-11 8:18 ` Christoph Hellwig
2024-09-11 15:18 ` Keith Busch
2024-09-04 15:26 ` [PATCHv3 09/10] blk-integrity: consider entire bio list for merging Keith Busch
2024-09-10 15:48 ` Christoph Hellwig
2024-09-10 17:19 ` Keith Busch
2024-09-04 15:26 ` [PATCHv3 10/10] blk-merge: properly account for integrity segments Keith Busch
2024-09-06 11:28 ` Anuj Gupta
2024-09-10 15:49 ` Christoph Hellwig
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).