linux-block.vger.kernel.org archive mirror
* fine-grained PI control
@ 2024-07-05  8:32 Christoph Hellwig
  2024-07-08 14:17 ` Anuj gupta
  2024-07-09  3:35 ` Martin K. Petersen
  0 siblings, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2024-07-05  8:32 UTC (permalink / raw)
  To: Kanchan Joshi, Anuj Gupta, Martin K . Petersen
  Cc: linux-block, linux-scsi, linux-nvme

Hi all,

I'm trying to kick off a discussion on fine-grained control of T10 PI
flags.  This concerns the new io_uring metadata interface, including
comments made by Martin in response to earlier series, and observations
on how block devices with PI enabled don't work quite right at the
moment for a few use cases.

The SCSI and NVMe PI interfaces allow fine-grained checking of the guard,
reference and app tags.  The io_uring interface exposes them to
userspace, which is useful.  The in-kernel implementation in the posted
patch set only uses these flags when detecting user passthrough and
currently only in the nvme driver.  I think we'll need to change the
in-kernel interface so that it matches the user one, and have the
submitter of the PI data choose which of the tags to generate and check.

Martin also mentioned he wanted to see the BIP_CTRL_NOCHECK,
BIP_DISK_NOCHECK and BIP_IP_CHECKSUM checksum flags exposed.  Can you
explain how you want them to fit into the API?  Especially as AFAIK
they can't work generically, e.g. NVMe never has an IP checksum and
SCSI controllers might not offer them either.  NVMe doesn't have a way
to distinguish between disk and controller.

Last but not least the fact that all reads and writes on PI enabled
devices by default check the guard (and reference if available for the
PI type) tags leads to a lot of annoying warnings when the kernel or
userspace does speculative reads.  Most of this is to read signatures
of file systems or partitions, and that previously always succeeded,
but now returns an error and warns in the block layer.  I've been
wondering a few things here:

 - is there much of a point in having block layer and driver warnings
   (for NVMe we'll currently get both) by default, or should we leave
   that to the submitter that needs to check errors anyway?
 - should we have an opt-out or even opt-in for guard tag verification
   to avoid the higher level code tripping up on returned errors where
   they just want to check if a signature matches?

^ permalink raw reply	[flat|nested] 14+ messages in thread

* Re: fine-grained PI control
  2024-07-05  8:32 fine-grained PI control Christoph Hellwig
@ 2024-07-08 14:17 ` Anuj gupta
  2024-07-09  7:08   ` Christoph Hellwig
  2024-07-09  3:35 ` Martin K. Petersen
  1 sibling, 1 reply; 14+ messages in thread
From: Anuj gupta @ 2024-07-08 14:17 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kanchan Joshi, Anuj Gupta, Martin K . Petersen, linux-block,
	linux-scsi, linux-nvme

On Fri, Jul 5, 2024 at 2:02 PM Christoph Hellwig <hch@lst.de> wrote:
>
> Hi all,
>
> Martin also mentioned he wanted to see the BIP_CTRL_NOCHECK,
> BIP_DISK_NOCHECK and BIP_IP_CHECKSUM checksum flags exposed.  Can you
> explain how you want them to fit into the API?  Especially as AFAIK
> they can't work generically, e.g. NVMe never has an IP checksum and
> SCSI controllers might not offer them either.  NVMe doesn't have a way
> to distinguish between disk and controller.
Yes, these flags are only valid in the context of SCSI. One possible
scheme would be for the API to still accept these flags and have the
NVMe driver fail the I/O if the user specifies them.
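That scheme can be sketched in a few lines of plain C. The flag names and
the validation helper below are hypothetical illustrations, not the
kernel's actual identifiers:

```c
#include <assert.h>
#include <errno.h>

/* Hypothetical per-I/O integrity flags; only the CHK_* ones map to NVMe PRCHK. */
#define PIF_CHK_GUARD    (1 << 0)
#define PIF_CHK_APPTAG   (1 << 1)
#define PIF_CHK_REFTAG   (1 << 2)
#define PIF_DISK_NOCHECK (1 << 3) /* SCSI-only: suppress target checking  */
#define PIF_CTRL_NOCHECK (1 << 4) /* SCSI-only: suppress DIX HBA checking */
#define PIF_IP_CHECKSUM  (1 << 5) /* SCSI-only: DIX IP checksum guard     */

/* The API accepts all flags; each driver rejects the ones it cannot honor. */
static int nvme_validate_pi_flags(unsigned int flags)
{
	if (flags & (PIF_DISK_NOCHECK | PIF_CTRL_NOCHECK | PIF_IP_CHECKSUM))
		return -EINVAL; /* no NVMe equivalent exists */
	return 0;
}
```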

>
> Last but not least the fact that all reads and writes on PI enabled
> devices by default check the guard (and reference if available for the
> PI type) tags leads to a lot of annoying warnings when the kernel or
> userspace does speculative reads.
In the current series the application can choose not to specify the
GUARD check flag, which would disable the guard checking even for PI
enabled devices. Did you still encounter errors or am I missing
something here?

--
Anuj Gupta


* Re: fine-grained PI control
  2024-07-05  8:32 fine-grained PI control Christoph Hellwig
  2024-07-08 14:17 ` Anuj gupta
@ 2024-07-09  3:35 ` Martin K. Petersen
  2024-07-09  7:16   ` Christoph Hellwig
  2024-09-18  6:39   ` Anuj Gupta
  1 sibling, 2 replies; 14+ messages in thread
From: Martin K. Petersen @ 2024-07-09  3:35 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Kanchan Joshi, Anuj Gupta, Martin K . Petersen, linux-block,
	linux-scsi, linux-nvme


Christoph,

Sorry about the delay. Just got back from vacation.

> I think we'll need to change the in-kernel interface so that it
> matches the user one, and have the submitter of the PI data choose
> which of the tags to generate and check.

Yep. I discussed this with Kanchan a while back.

I don't like having the BIP_USER_CHECK_FOO flags exposed in the block
layer. The io_uring interface obviously needs to expose some flags in
the user API. But there should not be a separate set of BIP_USER_* flags
bolted onto the side of the existing kernel integrity flags.

The bip flags should describe the contents of the integrity buffer and
how the hardware needs to interpret and check that information.

> Martin also mentioned he wanted to see the BIP_CTRL_NOCHECK,
> BIP_DISK_NOCHECK and BIP_IP_CHECKSUM checksum flags exposed. Can you
> explain how you want them to fit into the API? Especially as AFAIK
> they can't work generically, e.g. NVMe never has an IP checksum and
> SCSI controllers might not offer them either. NVMe doesn't have a way
> to distinguish between disk and controller.

I am not sure how to handle the protocol differences other than
returning an error if flags are passed that are not valid for the given
device.

The other alternative is to only expose a generic CHECK or NOCHECK flag
(depending which polarity we prefer) which will enable or disable
checking for both controller and disk in the SCSI case. But that also
means porting the DI test tooling will be impossible.

Another wrinkle is that SCSI does not have a way to directly specify
which tags to check. You can check guard only, check app+ref only, or
all three. But you can't just check the ref tag if that's what you want
to do.

I addressed that in DIX by having individual tag check flags and NVMe
inherited those in PRCHK. But for the SCSI disk itself we're limited to
what RDPROTECT/WRPROTECT can express. And that's why BIP_DISK_NOCHECK
disables checking of all tags and why there are currently no separate
BIP_DISK_NO_{GUARD,APP,REF}_CHECK flags.
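The expressible combinations listed above can be captured in a short
sketch (the CHK_* names are illustrative, not the kernel's flags):

```c
#include <assert.h>
#include <stdbool.h>

/* Hypothetical per-tag check requests, mirroring the DIX/NVMe PRCHK bits. */
#define CHK_GUARD (1 << 0)
#define CHK_APP   (1 << 1)
#define CHK_REF   (1 << 2)

/*
 * RDPROTECT/WRPROTECT can only express: check nothing, check guard only,
 * check app+ref, or check all three.  Anything else (e.g. ref tag alone)
 * has no encoding on the SCSI target side.
 */
static bool scsi_target_can_express(unsigned int chk)
{
	switch (chk) {
	case 0:
	case CHK_GUARD:
	case CHK_APP | CHK_REF:
	case CHK_GUARD | CHK_APP | CHK_REF:
		return true;
	default:
		return false;
	}
}
```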

> Last but not least the fact that all reads and writes on PI enabled
> devices by default check the guard (and reference if available for the
> PI type) tags leads to a lot of annoying warnings when the kernel or
> userspace does speculative reads.

> Most of this is to read signatures of file systems or partitions, and
> that previously always succeeded, but now returns an error and warns
> in the block layer. I've been wondering a few things here:

Is that on NVMe? It's been a while since I've tried. We don't get errors
for readahead on SCSI, that would be a bug.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: fine-grained PI control
  2024-07-08 14:17 ` Anuj gupta
@ 2024-07-09  7:08   ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2024-07-09  7:08 UTC (permalink / raw)
  To: Anuj gupta
  Cc: Christoph Hellwig, Kanchan Joshi, Anuj Gupta, Martin K . Petersen,
	linux-block, linux-scsi, linux-nvme

On Mon, Jul 08, 2024 at 07:47:59PM +0530, Anuj gupta wrote:
> > Last but not least the fact that all reads and writes on PI enabled
> > devices by default check the guard (and reference if available for the
> > PI type) tags leads to a lot of annoying warnings when the kernel or
> > userspace does speculative reads.
> In the current series the application can choose not to specify the
> GUARD check flag, which would disable the guard checking even for PI
> enabled devices. Did you still encounter errors or am I missing
> something here?

Well, this is for probing reads from libblkid, the partition parser
or file system mount routines.  So we'll need a flag for the case where
PI is not controlled by the user, as we don't really want all the callers
to actually deal with PI.


* Re: fine-grained PI control
  2024-07-09  3:35 ` Martin K. Petersen
@ 2024-07-09  7:16   ` Christoph Hellwig
  2024-07-10  3:47     ` Martin K. Petersen
  2024-07-26 10:21     ` Anuj Gupta
  2024-09-18  6:39   ` Anuj Gupta
  1 sibling, 2 replies; 14+ messages in thread
From: Christoph Hellwig @ 2024-07-09  7:16 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Kanchan Joshi, Anuj Gupta, linux-block,
	linux-scsi, linux-nvme

On Mon, Jul 08, 2024 at 11:35:13PM -0400, Martin K. Petersen wrote:
> I don't like having the BIP_USER_CHECK_FOO flags exposed in the block
> layer. The io_uring interface obviously needs to expose some flags in
> the user API. But there should not be a separate set of BIP_USER_* flags
> bolted onto the side of the existing kernel integrity flags.
>
> The bip flags should describe the contents of the integrity buffer and
> how the hardware needs to interpret and check that information.

Yes, that was also my review comments.

> The other alternative is to only expose a generic CHECK or NOCHECK flag
> (depending which polarity we prefer) which will enable or disable
> checking for both controller and disk in the SCSI case. But that also
> means porting the DI test tooling will be impossible.
> 
> Another wrinkle is that SCSI does not have a way to directly specify
> which tags to check. You can check guard only, check app+ref only, or
> all three. But you can't just check the ref tag if that's what you want
> to do.
> 
> I addressed that in DIX by having individual tag check flags and NVMe
> inherited those in PRCHK. But for the SCSI disk itself we're limited to
> what RDPROTECT/WRPROTECT can express. And that's why BIP_DISK_NOCHECK
> disables checking of all tags and why there are currently no separate
> BIP_DISK_NO_{GUARD,APP,REF}_CHECK flags.

So what are useful APIs we can/should expose?

If we want full portability we can't support all the individual checks,
because the disk will check it for SCSI even if we don't do the extra
checks in the controller.  We could still expose the individual
flags, but refuse the combinations SCSI doesn't support,
although that would lead to surprises if people write their software
and test on NVMe and then move to SCSI.  Could we just expose the
valid SCSI combinations if people are fine with that for now?

> > Last but not least the fact that all reads and writes on PI enabled
> > devices by default check the guard (and reference if available for the
> > PI type) tags leads to a lot of annoying warnings when the kernel or
> > userspace does speculative reads.
> 
> > Most of this is to read signatures of file systems or partitions, and
> > that previously always succeeded, but now returns an error and warns
> > in the block layer. I've been wondering a few things here:
> 
> Is that on NVMe? It's been a while since I've tried. We don't get errors
> for readahead on SCSI, that would be a bug.

Note that these reads aren't readaheads (well, they actually are too,
because everything in the buffer cache does a readahead first), but
probing reads from blkid / partition scans / etc.  Right now the
driver has no way to distinguish them from reads that are really looking
for (meta-)data that is expected to be there.

I'm not currently seeing warnings on SCSI, but that's because my only
PI testing is scsi_debug which starts out with deallocated blocks.



* Re: fine-grained PI control
  2024-07-09  7:16   ` Christoph Hellwig
@ 2024-07-10  3:47     ` Martin K. Petersen
  2024-07-11  5:42       ` Christoph Hellwig
  2024-07-26 10:21     ` Anuj Gupta
  1 sibling, 1 reply; 14+ messages in thread
From: Martin K. Petersen @ 2024-07-10  3:47 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K. Petersen, Kanchan Joshi, Anuj Gupta, linux-block,
	linux-scsi, linux-nvme


Christoph,

> So what are useful APIs we can/should expose?
>
> If we want full portability we can't support all the individual
> checks, because the disk will check it for SCSI even if we don't do
> the extra checks in the controller. We could still expose the individual
> flags, but refuse the combinations SCSI doesn't support,
> although that would lead to surprises if people write their software
> and test on NVMe and then move to SCSI. Could we just expose the valid
> SCSI combinations if people are fine with that for now?

I didn't have any actual use for check-this-but-not-that. The rationale
behind having explicit checking flags was my dislike for the fact that
the policy decision about what to check was residing inside the disk
drive and depended on how it was formatted, which flags were wired up in
the EI VPD, etc. I preferred an approach where the OS tells the hardware
exactly what to do.

There are a couple of free bits in *PROTECT so we could conceivably work
with T10 to add the missing pieces. But it would have a pretty long
turnaround, of course, and wouldn't address existing devices.

Also, things are not entirely symmetric wrt. *PROTECT for reads and
writes either. I'll try to wrap my head around it tomorrow.

For the user API I think it would be most sensible to have CHECK_GUARD,
CHECK_APP, CHECK_REF to cover the common DIX/NVMe case.

And then we could have NO_CHECK_DISK and IP_CHECKSUM_CONVERSION to
handle the peculiar SCSI corner cases and document that these are
experimental flags to be used for test purposes only. Not particularly
elegant but I don't have a better idea. Especially since things are
inherently asymmetric with controller-to-target communication being
protected even if you don't attach PI to the bio.

I.e. I think the CHECK_{GUARD,APP,REF} flags should describe how a
DIX or NVMe controller should check the attached bip payload. And
nothing else.

The controller-to-target PI handling is orthogonal and refers to what
happens in the second protection envelope, i.e. the communication
between a DIX controller and a target. This may or may not be the same
PI as in the bip payload. Therefore I think these flags should be
separate.

I'll mull over it a bit more and revisit all the SCSI wrinkles.

> I'm not currently seeing warnings on SCSI, but that's because my only
> PI testing is scsi_debug which starts out with deallocated blocks.

SCSI says that deallocated blocks have 0xFFFF in the app tag and thus
checking should be disabled on read. And if you subsequently write a
block without providing PI, the drive generates a valid guard and ref
tag (for Type 1). So there should never be a situation where reading a
block returns a PI error unless the block is corrupted. Either the app
tag escape is present or the PI is valid.

SCSI subsequently added some blurriness to permit deviations from this
principle. But the original PI design explicitly ensured that PI was
never accidentally invalid and reads would never fail. Even if you wrote
the drive on a system that didn't know about PI things would be OK. This
was deliberately done so reading partition tables, etc. wouldn't fail.
In Linux we currently treat Type 2 as Type 1 for pretty much the same
reason: To ensure that the ref tag is always well-defined. I.e. it
contains the lower 32 bits of the LBA.
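
These never-fail rules (app tag escape 0xFFFF, Type 1 ref tag = lower 32
bits of the LBA, CRC-16 guard) condense into a small user-space sketch.
This is an illustration with fields kept in host byte order for
simplicity, not kernel code:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>

/* One 8-byte T10 PI tuple (fields in host order here; big-endian on the wire). */
struct t10_pi_tuple {
	uint16_t guard;   /* CRC-16/T10-DIF over the data block    */
	uint16_t app_tag; /* 0xFFFF = escape: skip all checking    */
	uint32_t ref_tag; /* Type 1: lower 32 bits of the LBA      */
};

/* Bitwise CRC-16/T10-DIF (poly 0x8BB7, init 0, no reflection). */
static uint16_t crc_t10dif(const uint8_t *buf, size_t len)
{
	uint16_t crc = 0;
	for (size_t i = 0; i < len; i++) {
		crc ^= (uint16_t)buf[i] << 8;
		for (int b = 0; b < 8; b++)
			crc = (crc & 0x8000) ? (crc << 1) ^ 0x8BB7 : crc << 1;
	}
	return crc;
}

/* Verify one Type 1 protected block, honoring the app tag escape. */
static bool pi_type1_verify(const uint8_t *data, size_t blksz,
			    const struct t10_pi_tuple *pi, uint64_t lba)
{
	if (pi->app_tag == 0xFFFF)           /* deallocated/unwritten block */
		return true;
	if (pi->guard != crc_t10dif(data, blksz))
		return false;
	return pi->ref_tag == (uint32_t)lba; /* low 32 bits of the LBA */
}
```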

The intent when we defined E2EDP in NVMe was to match this never-fail
SCSI behavior. So I'm puzzled as to why you see errors.

I'll try to connect my NVMe test box tomorrow. It's been offline after a
rack move. Would like to understand what's going on. Are we not setting
ILBRT/EILBRT appropriately?

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: fine-grained PI control
  2024-07-10  3:47     ` Martin K. Petersen
@ 2024-07-11  5:42       ` Christoph Hellwig
  2024-07-16  2:07         ` Martin K. Petersen
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2024-07-11  5:42 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Kanchan Joshi, Anuj Gupta, linux-block,
	linux-scsi, linux-nvme

On Tue, Jul 09, 2024 at 11:47:58PM -0400, Martin K. Petersen wrote:
> For the user API I think it would be most sensible to have CHECK_GUARD,
> CHECK_APP, CHECK_REF to cover the common DIX/NVMe case.
> 
> And then we could have NO_CHECK_DISK and IP_CHECKSUM_CONVERSION to
> handle the peculiar SCSI corner cases and document that these are
> experimental flags to be used for test purposes only. Not particularly
> elegant but I don't have a better idea. Especially since things are
> inherently asymmetric with controller-to-target communication being
> protected even if you don't attach PI to the bio.
> 
> I.e. I think the CHECK_{GUARD,APP,REF} flags should describe how a
> DIX or NVMe controller should check the attached bip payload. And
> nothing else.

I really hate an API that basically exposes a completely
different set of flags for SCSI vs NVMe.

I am also really not sold on IP_CHECKSUM_CONVERSION and the separate
no-check for disk vs controller things.  I can see why someone would
want to do that for testing, but as a general API at the syscall
level it is just not a useful abstraction and is highly confusing.

Can we figure out a way to do these as error injection points in
scsi or something similar so that we don't have to overload the
user interface with it?

> I'll try to connect my NVMe test box tomorrow. It's been offline after a
> rack move. Would like to understand what's going on. Are we not setting
> ILBRT/EILBRT appropriately?

I think NVMe just had a real mess with deallocation and Write Zeroes
in the past, and my test drive might be old enough to predate the ECNs
that eventually fixed this, assuming it actually got fixed
everywhere.



* Re: fine-grained PI control
  2024-07-11  5:42       ` Christoph Hellwig
@ 2024-07-16  2:07         ` Martin K. Petersen
  0 siblings, 0 replies; 14+ messages in thread
From: Martin K. Petersen @ 2024-07-16  2:07 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K. Petersen, Kanchan Joshi, Anuj Gupta, linux-block,
	linux-scsi, linux-nvme


Christoph,

> I really hate an API that basically exposes a completely different
> set of flags for SCSI vs NVMe.

I don't disagree.

It is unfortunate that PI in NVMe was defined at a time when PCIe was
the only transport and therefore it didn't make sense to have a
distinction between controller and target. In our experience there is a
lot of value in being able to identify where exactly the corruption
happened, we use it all the time.

In an ideal world we'd be able to have multiple orthogonal protection
envelopes as envisioned in the SNIA efforts. But with NVMe only
supporting one envelope and the SCSI host-target envelope being wonky, I
sadly don't see that happening.

That said, I don't particularly like the idea of removing existing
functionality to match whichever subset of PI/DIX/AMDI that NVMe decided
to adopt.

I am OK in principle with the idea of the user API only describing the
first envelope as long as I have a different way of controlling the
second envelope on a per-I/O basis. It is fine to require this use case
to be explicitly configured and enabled. Many PI devices require
configuration prior to testing anyway.

> Can we figure out a way to do these as error injection points in scsi
> or something similar so that we don't have to overload the user
> interface with it?

I'm experimenting with a few different approaches...

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: fine-grained PI control
  2024-07-09  7:16   ` Christoph Hellwig
  2024-07-10  3:47     ` Martin K. Petersen
@ 2024-07-26 10:21     ` Anuj Gupta
  2024-07-29 17:03       ` Christoph Hellwig
  1 sibling, 1 reply; 14+ messages in thread
From: Anuj Gupta @ 2024-07-26 10:21 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K. Petersen, Kanchan Joshi, linux-block, linux-scsi,
	linux-nvme


On Tue, Jul 09, 2024 at 09:16:04AM +0200, Christoph Hellwig wrote:
> On Mon, Jul 08, 2024 at 11:35:13PM -0400, Martin K. Petersen wrote:
> > I don't like having the BIP_USER_CHECK_FOO flags exposed in the block
> > layer. The io_uring interface obviously needs to expose some flags in
> > the user API. But there should not be a separate set of BIP_USER_* flags
> > bolted onto the side of the existing kernel integrity flags.
> >
> > The bip flags should describe the contents of the integrity buffer and
> > how the hardware needs to interpret and check that information.
> 
> Yes, that was also my review comments.

Hi Christoph, Martin,

I was thinking something like the patch below could help us get rid of the
BIP_USER_CHECK_FOO flags; the driver can then check flags passed by the
block layer instead of checking whether it's user passthrough data. I
haven't plumbed the SCSI side of things yet, but do you think it can work
with SCSI?

Subject: [PATCH] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags

This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
indicate how the hardware should check the payload. The driver can now
just rely on block layer flags, and doesn't need to know the integrity
source. Submitter of PI chooses which tags to check. This would also
give us a unified interface for user and kernel generated integrity.

Signed-off-by: Anuj Gupta <anuj20.g@samsung.com>
Signed-off-by: Kanchan Joshi <joshi.k@samsung.com>
---
 block/bio-integrity.c         |  5 +++++
 drivers/nvme/host/core.c      | 12 +++---------
 include/linux/bio-integrity.h |  3 +++
 3 files changed, 11 insertions(+), 9 deletions(-)

diff --git a/block/bio-integrity.c b/block/bio-integrity.c
index 8d1fb38f745f..d179b0134e1d 100644
--- a/block/bio-integrity.c
+++ b/block/bio-integrity.c
@@ -443,6 +443,11 @@ bool bio_integrity_prep(struct bio *bio)
 	if (bi->csum_type == BLK_INTEGRITY_CSUM_IP)
 		bip->bip_flags |= BIP_IP_CHECKSUM;
 
+	/* describe what tags to check in payload */
+	if (bi->csum_type)
+		bip->bip_flags |= BIP_CHECK_GUARD;
+	if (bi->flags & BLK_INTEGRITY_REF_TAG)
+		bip->bip_flags |= BIP_CHECK_REFTAG;
 	if (bio_integrity_add_page(bio, virt_to_page(buf), len,
 			offset_in_page(buf)) < len) {
 		printk(KERN_ERR "could not attach integrity payload\n");
diff --git a/drivers/nvme/host/core.c b/drivers/nvme/host/core.c
index 19917253ba7b..5991f048f394 100644
--- a/drivers/nvme/host/core.c
+++ b/drivers/nvme/host/core.c
@@ -1001,19 +1001,13 @@ static inline blk_status_t nvme_setup_rw(struct nvme_ns *ns,
 				return BLK_STS_NOTSUPP;
 			control |= NVME_RW_PRINFO_PRACT;
 		}
-
-		switch (ns->head->pi_type) {
-		case NVME_NS_DPS_PI_TYPE3:
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_GUARD))
 			control |= NVME_RW_PRINFO_PRCHK_GUARD;
-			break;
-		case NVME_NS_DPS_PI_TYPE1:
-		case NVME_NS_DPS_PI_TYPE2:
-			control |= NVME_RW_PRINFO_PRCHK_GUARD |
-					NVME_RW_PRINFO_PRCHK_REF;
+		if (bio_integrity_flagged(req->bio, BIP_CHECK_REFTAG)) {
+			control |= NVME_RW_PRINFO_PRCHK_REF;
 			if (op == nvme_cmd_zone_append)
 				control |= NVME_RW_APPEND_PIREMAP;
 			nvme_set_ref_tag(ns, cmnd, req);
-			break;
 		}
 	}
 
diff --git a/include/linux/bio-integrity.h b/include/linux/bio-integrity.h
index dd831c269e99..a7e3dfc994b0 100644
--- a/include/linux/bio-integrity.h
+++ b/include/linux/bio-integrity.h
@@ -11,6 +11,9 @@ enum bip_flags {
 	BIP_DISK_NOCHECK	= 1 << 3, /* disable disk integrity checking */
 	BIP_IP_CHECKSUM		= 1 << 4, /* IP checksum */
 	BIP_COPY_USER		= 1 << 5, /* Kernel bounce buffer in use */
+	BIP_CHECK_GUARD		= 1 << 6,
+	BIP_CHECK_REFTAG	= 1 << 7,
+	BIP_CHECK_APPTAG	= 1 << 8,
 };
 
 struct bio_integrity_payload {
-- 
2.25.1



* Re: fine-grained PI control
  2024-07-26 10:21     ` Anuj Gupta
@ 2024-07-29 17:03       ` Christoph Hellwig
  0 siblings, 0 replies; 14+ messages in thread
From: Christoph Hellwig @ 2024-07-29 17:03 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: Christoph Hellwig, Martin K. Petersen, Kanchan Joshi, linux-block,
	linux-scsi, linux-nvme

On Fri, Jul 26, 2024 at 03:51:56PM +0530, Anuj Gupta wrote:
> 
> I was thinking something like the patch below could help us get rid of the
> BIP_USER_CHECK_FOO flags; the driver can then check flags passed by the
> block layer instead of checking whether it's user passthrough data. I
> haven't plumbed the SCSI side of things yet, but do you think it can work
> with SCSI?
> 
> Subject: [PATCH] block: introduce BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags
> 
> This patch introduces BIP_CHECK_GUARD/REFTAG/APPTAG bip_flags which
> indicate how the hardware should check the payload. The driver can now
> just rely on block layer flags, and doesn't need to know the integrity
> source. Submitter of PI chooses which tags to check. This would also
> give us a unified interface for user and kernel generated integrity.

This looks reasonable to me for the in-kernel interface.  We'll still
need to deal with the fact that SCSI is a bit selective in what
combination of these flags it actually allows.



* Re: fine-grained PI control
  2024-07-09  3:35 ` Martin K. Petersen
  2024-07-09  7:16   ` Christoph Hellwig
@ 2024-09-18  6:39   ` Anuj Gupta
  2024-09-24  1:59     ` Martin K. Petersen
  1 sibling, 1 reply; 14+ messages in thread
From: Anuj Gupta @ 2024-09-18  6:39 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Christoph Hellwig, Kanchan Joshi, linux-block, linux-scsi,
	linux-nvme


>
>Another wrinkle is that SCSI does not have a way to directly specify
>which tags to check. You can check guard only, check app+ref only, or
>all three. But you can't just check the ref tag if that's what you want
>to do.

Hi Martin,

When the drive is formatted with type1 integrity and dix is supported,
the block layer would generate/verify the guard and reftag. For scsi,
apptag would also need to be generated/verified as reftag-check can't
be specified alone. But I am not able to find (in SCSI code) where
exactly: 
1. this apptag is being generated
2. and getting added to the PI buffer and scsi command.

Can you please point to where/how it is handled?

Thank you,
Anuj Gupta



* Re: fine-grained PI control
  2024-09-18  6:39   ` Anuj Gupta
@ 2024-09-24  1:59     ` Martin K. Petersen
  2024-09-24  5:36       ` Christoph Hellwig
  0 siblings, 1 reply; 14+ messages in thread
From: Martin K. Petersen @ 2024-09-24  1:59 UTC (permalink / raw)
  To: Anuj Gupta
  Cc: Martin K. Petersen, Christoph Hellwig, Kanchan Joshi, linux-block,
	linux-scsi, linux-nvme


Hi Anuj!

> When the drive is formatted with type1 integrity and dix is supported,
> the block layer would generate/verify the guard and reftag. For scsi,
> apptag would also need to be generated/verified as reftag-check can't
> be specified alone. But I am not able to find (in SCSI code) where
> exactly: 1. this apptag is being generated
> 2. and getting added to the PI buffer and scsi command.

The block layer on its own doesn't have a way to fill out the app tag in
any meaningful way. But if the PI comes from userland, the application
can set the app tag to whatever it pleases.

Originally we had code in Linux allowing filesystems to attach
additional metadata to bios which would then be stored in the app tag
space. This was intended for backpointers but never really took off.
However, if we move the burden of PI generation to filesystems as
Christoph suggested, then we can revisit using the app tag space.

No matter what, though, for target we need to be able to store whichever
app tag the initiator sends.

-- 
Martin K. Petersen	Oracle Linux Engineering


* Re: fine-grained PI control
  2024-09-24  1:59     ` Martin K. Petersen
@ 2024-09-24  5:36       ` Christoph Hellwig
  2024-09-27  2:01         ` Martin K. Petersen
  0 siblings, 1 reply; 14+ messages in thread
From: Christoph Hellwig @ 2024-09-24  5:36 UTC (permalink / raw)
  To: Martin K. Petersen
  Cc: Anuj Gupta, Christoph Hellwig, Kanchan Joshi, linux-block,
	linux-scsi, linux-nvme

On Mon, Sep 23, 2024 at 09:59:10PM -0400, Martin K. Petersen wrote:
> Originally we had code in Linux allowing filesystems to attach
> additional metadata to bios which would then be stored in the app tag
> space. This was intended for backpointers but never really took off.
> However, if we move the burden of PI generation to filesystems as
> Christoph suggested, then we can revisit using the app tag space.

Just to make it clear: I want file systems to be able to opt into
generating PI, not get rid of the auto-generation.  That is actually
kinda possible already, I just want to make it nicer.



* Re: fine-grained PI control
  2024-09-24  5:36       ` Christoph Hellwig
@ 2024-09-27  2:01         ` Martin K. Petersen
  0 siblings, 0 replies; 14+ messages in thread
From: Martin K. Petersen @ 2024-09-27  2:01 UTC (permalink / raw)
  To: Christoph Hellwig
  Cc: Martin K. Petersen, Anuj Gupta, Kanchan Joshi, linux-block,
	linux-scsi, linux-nvme


Christoph,

> Just to make clear I want file systems to be able to opt into
> generating PI, not get rid of the auto generation. That is actually
> kinda possible already, I just want to make it nicer.

Yep.

-- 
Martin K. Petersen	Oracle Linux Engineering


