public inbox for linux-nvme@lists.infradead.org
* [PATCH] nvme-rdma: Fix T10-PI when SW doesn't generate/verify metadata
@ 2023-06-06 10:51 Israel Rukshin
  2023-06-07  5:08 ` Christoph Hellwig
  0 siblings, 1 reply; 3+ messages in thread
From: Israel Rukshin @ 2023-06-06 10:51 UTC
  To: Linux-nvme, Sagi Grimberg, Christoph Hellwig
  Cc: Israel Rukshin, Nitzan Carmi, Max Gurtovoy

When the SW doesn't generate/verify metadata, the SG length is
smaller than the transfer length. This is because the SG length
doesn't include the metadata length that is added by the HW on
the wire. The target fails those commands with "Data SGL Length
Invalid" by comparing the transfer length and the SG length. The
commit fixes it by adding the metadata length to the transfer
length when there is no metadata SGL. The bug reproduces when
setting read_verify/write_generate to 0 at the child multipath
device or at the primary device when multipath is disabled. Note
that setting those to 0 on the multipath device doesn't have any
impact on the I/Os.

Fixes: 5ec5d3bddc6b ("nvme-rdma: add metadata/T10-PI support")
Signed-off-by: Israel Rukshin <israelr@nvidia.com>
Reviewed-by: Max Gurtovoy <mgurtovoy@nvidia.com>
---
 drivers/nvme/host/rdma.c | 11 ++++++++---
 1 file changed, 8 insertions(+), 3 deletions(-)

diff --git a/drivers/nvme/host/rdma.c b/drivers/nvme/host/rdma.c
index 0eb79696fb73..8bbf38918dd3 100644
--- a/drivers/nvme/host/rdma.c
+++ b/drivers/nvme/host/rdma.c
@@ -1435,6 +1435,8 @@ static int nvme_rdma_map_sg_pi(struct nvme_rdma_queue *queue,
 	struct nvme_ns *ns = rq->q->queuedata;
 	struct bio *bio = rq->bio;
 	struct nvme_keyed_sgl_desc *sg = &c->common.dptr.ksgl;
+	struct blk_integrity *bi = blk_get_integrity(bio->bi_bdev->bd_disk);
+	u32 xfer_len;
 	int nr;
 
 	req->mr = ib_mr_pool_get(queue->qp, &queue->qp->sig_mrs);
@@ -1447,8 +1449,7 @@ static int nvme_rdma_map_sg_pi(struct nvme_rdma_queue *queue,
 	if (unlikely(nr))
 		goto mr_put;
 
-	nvme_rdma_set_sig_attrs(blk_get_integrity(bio->bi_bdev->bd_disk), c,
-				req->mr->sig_attrs, ns->pi_type);
+	nvme_rdma_set_sig_attrs(bi, c, req->mr->sig_attrs, ns->pi_type);
 	nvme_rdma_set_prot_checks(c, &req->mr->sig_attrs->check_mask);
 
 	ib_update_fast_reg_key(req->mr, ib_inc_rkey(req->mr->rkey));
@@ -1466,7 +1467,11 @@ static int nvme_rdma_map_sg_pi(struct nvme_rdma_queue *queue,
 		     IB_ACCESS_REMOTE_WRITE;
 
 	sg->addr = cpu_to_le64(req->mr->iova);
-	put_unaligned_le24(req->mr->length, sg->length);
+	xfer_len = req->mr->length;
+	/* Check if PI is added by the HW */
+	if (!pi_count)
+		xfer_len += (xfer_len >> bi->interval_exp) * ns->pi_size;
+	put_unaligned_le24(xfer_len, sg->length);
 	put_unaligned_le32(req->mr->rkey, sg->key);
 	sg->type = NVME_KEY_SGL_FMT_DATA_DESC << 4;
 
-- 
2.18.2
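
The length adjustment in the last hunk can be checked in isolation. The
sketch below is a standalone userspace illustration, not kernel code:
wire_xfer_len() is a hypothetical helper that only mirrors the arithmetic
the patch adds, and the parameter names (data_len, interval_exp, pi_size,
pi_count) are stand-ins for req->mr->length, bi->interval_exp, ns->pi_size
and the pi_count argument of nvme_rdma_map_sg_pi().

```c
#include <assert.h>
#include <stdint.h>

/*
 * Hypothetical helper (not part of the patch): mirrors the arithmetic
 * added to nvme_rdma_map_sg_pi(). When no metadata SGL is mapped
 * (pi_count == 0), the HW adds pi_size bytes of PI per protection
 * interval on the wire, so the advertised length must grow accordingly.
 */
uint32_t wire_xfer_len(uint32_t data_len, unsigned int interval_exp,
		       uint32_t pi_size, int pi_count)
{
	uint32_t xfer_len = data_len;

	/* A shift by 32 or more would be undefined for a 32-bit value. */
	assert(interval_exp < 32);

	/* PI is added by the HW: account for it in the transfer length. */
	if (!pi_count)
		xfer_len += (data_len >> interval_exp) * pi_size;

	return xfer_len;
}
```

For example, assuming a 4096-byte transfer with 512-byte protection
intervals (interval_exp = 9) and 8 bytes of PI per interval, the helper
returns 4096 + 8 * 8 = 4160 when pi_count is 0, and leaves the length at
4096 when a metadata SGL is present.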




* Re: [PATCH] nvme-rdma: Fix T10-PI when SW doesn't generate/verify metadata
  2023-06-06 10:51 [PATCH] nvme-rdma: Fix T10-PI when SW doesn't generate/verify metadata Israel Rukshin
@ 2023-06-07  5:08 ` Christoph Hellwig
  2023-06-07  8:03   ` Max Gurtovoy
  0 siblings, 1 reply; 3+ messages in thread
From: Christoph Hellwig @ 2023-06-07  5:08 UTC
  To: Israel Rukshin
  Cc: Linux-nvme, Sagi Grimberg, Christoph Hellwig, Nitzan Carmi,
	Max Gurtovoy

On Tue, Jun 06, 2023 at 01:51:30PM +0300, Israel Rukshin wrote:
> When the SW doesn't generate/verify metadata, the SG length is
> smaller than the transfer length.

What is "the SW"?

> length when there is no metadata SGL. The bug reproduces when
> setting read_verify/write_generate to 0 at the child multipath
> device or at the primary device when multipath is disabled. Note
> that setting those to 0 on the multipath device doesn't have any
> impact on the I/Os.

And we really need to prevent this from happening per the last
discussion.  Please submit a patch to fail adding a new path to
a ns_head if the PI settings mismatch.




* Re: [PATCH] nvme-rdma: Fix T10-PI when SW doesn't generate/verify metadata
  2023-06-07  5:08 ` Christoph Hellwig
@ 2023-06-07  8:03   ` Max Gurtovoy
  0 siblings, 0 replies; 3+ messages in thread
From: Max Gurtovoy @ 2023-06-07  8:03 UTC
  To: Christoph Hellwig, Israel Rukshin
  Cc: Linux-nvme, Sagi Grimberg, Nitzan Carmi, oevron



On 07/06/2023 8:08, Christoph Hellwig wrote:
> On Tue, Jun 06, 2023 at 01:51:30PM +0300, Israel Rukshin wrote:
>> When the SW doesn't generate/verify metadata, the SG length is
>> smaller than the transfer length.
> 
> What is "the SW"?

Better to say "the block layer".

> 
>> length when there is no metadata SGL. The bug reproduces when
>> setting read_verify/write_generate to 0 at the child multipath
>> device or at the primary device when multipath is disabled. Note
>> that setting those to 0 on the multipath device doesn't have any
>> impact on the I/Os.
> 
> And we really need to prevent this from happening per the last
> discussion.  Please submit a patch to fail adding a new path to
> a ns_head if the PI settings mismatch.

We're working on it.

I'm not sure what the intention is regarding support for shared
namespaces that are not under multipath. There is a comment in the
code that some support will be removed.
Will we allow the same namespace to be exposed twice? Or will only the
first one be exposed when the PI-offload capabilities don't match?

But the above doesn't answer what the behavior should be if we change
read_verify/write_generate on the head device. Should it propagate to
the children?


