linux-block.vger.kernel.org archive mirror
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Christoph Hellwig <hch@lst.de>
Cc: Kanchan Joshi <joshi.k@samsung.com>,
	Anuj Gupta <anuj20.g@samsung.com>,
	"Martin K . Petersen" <martin.petersen@oracle.com>,
	linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
	linux-nvme@lists.infradead.org
Subject: Re: fine-grained PI control
Date: Mon, 08 Jul 2024 23:35:13 -0400
Message-ID: <yq1ttgz5l6d.fsf@ca-mkp.ca.oracle.com>
In-Reply-To: <20240705083205.2111277-1-hch@lst.de> (Christoph Hellwig's message of "Fri, 5 Jul 2024 10:32:04 +0200")


Christoph,

Sorry about the delay. Just got back from vacation.

> I think we'll need to change the in-kernel interface so it matches the user
> one, and the submitter of the PI data chooses which of the tags to
> generate and check.

Yep. I discussed this with Kanchan a while back.

I don't like having the BIP_USER_CHECK_FOO flags exposed in the block
layer. The io_uring interface obviously needs to expose some flags in
the user API. But there should not be a separate set of BIP_USER_* flags
bolted onto the side of the existing kernel integrity flags.

The bip flags should describe the contents of the integrity buffer and
how the hardware needs to interpret and check that information.

> Martin also mentioned he wanted to see the BIP_CTRL_NOCHECK,
> BIP_DISK_NOCHECK and BIP_IP_CHECKSUM flags exposed. Can you
> explain how you want them to fit into the API? Especially as AFAIK
> they can't work generically, e.g. NVMe never has an IP checksum and
> SCSI controllers might not offer them either. NVMe doesn't have a way
> to distinguish between disk and controller.

I am not sure how to handle the protocol differences other than
returning an error if flags are passed that are not valid for the given
device.
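
A minimal sketch of that rejection path, for illustration only. The
EX_* flag bits, struct ex_pi_caps and ex_validate_pi_flags() are
hypothetical names made up for this example; they are not the kernel's
BIP_* definitions or an existing helper. The idea is simply that each
device advertises the flags it can honor and anything outside that
mask fails with -EINVAL.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical flag bits for this example; not the kernel's BIP_* values. */
#define EX_CTRL_NOCHECK	(1u << 0)	/* controller skips PI checking */
#define EX_DISK_NOCHECK	(1u << 1)	/* target device skips PI checking */
#define EX_IP_CHECKSUM	(1u << 2)	/* guard tag is an IP checksum */

/* Hypothetical per-device description of the flags it can honor. */
struct ex_pi_caps {
	uint32_t supported_flags;
};

/* Return 0 if every requested flag is supported, -EINVAL otherwise. */
static int ex_validate_pi_flags(const struct ex_pi_caps *caps, uint32_t flags)
{
	assert(caps != NULL);

	if (flags & ~caps->supported_flags)
		return -EINVAL;
	return 0;
}
```

Under this scheme a SCSI device behind a DIX-capable controller might
advertise all three bits, while an NVMe namespace would leave
EX_IP_CHECKSUM out of its mask, so a request carrying that flag is
rejected up front. Which of the controller/disk bits an NVMe device
should accept is left open here, as it is in the discussion above.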

The other alternative is to only expose a generic CHECK or NOCHECK flag
(depending on which polarity we prefer) which will enable or disable
checking for both controller and disk in the SCSI case. But that also
means porting the DI test tooling will be impossible.

Another wrinkle is that SCSI does not have a way to directly specify
which tags to check. You can check guard only, check app+ref only, or
all three. But you can't just check the ref tag if that's what you want
to do.

I addressed that in DIX by having individual tag check flags and NVMe
inherited those in PRCHK. But for the SCSI disk itself we're limited to
what RDPROTECT/WRPROTECT can express. And that's why BIP_DISK_NOCHECK
disables checking of all tags and why there are currently no separate
BIP_DISK_NO_{GUARD,APP,REF}_CHECK flags.
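
To make the limitation concrete, here is a rough sketch of mapping a
requested set of tag checks onto the 3-bit RDPROTECT/WRPROTECT field.
The function names are hypothetical and the encodings are written from
memory of SBC, so treat them as an assumption and verify them against
the spec. The sketch also ignores details such as the application tag
only being checked when the device owns it.

```c
#include <assert.h>
#include <stdbool.h>

/*
 * Pick a protect field value for the requested tag checks.
 * Returns -1 when the combination cannot be expressed.
 * Encodings are assumed from SBC and should be double-checked.
 */
static int ex_protect_field(bool guard, bool app, bool ref)
{
	if (app != ref)
		return -1;	/* app and ref can only be toggled together */
	if (guard && app)
		return 1;	/* 001b: check guard, app and ref */
	if (app)
		return 2;	/* 010b: check app and ref, skip guard */
	if (guard)
		return 4;	/* 100b: check guard only */
	return 3;		/* 011b: check nothing */
}

/* A few example requests, including one SCSI cannot express. */
static void ex_protect_examples(void)
{
	assert(ex_protect_field(true, true, true) == 1);
	assert(ex_protect_field(true, false, false) == 4);
	assert(ex_protect_field(false, false, true) == -1); /* ref only */
}
```

The ref-only request is the case described above: there is no field
value for it, so a per-tag disk flag could not be honored as given.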

> Last but not least the fact that all reads and writes on PI enabled
> devices by default check the guard (and reference if available for the
> PI type) tags leads to a lot of annoying warnings when the kernel or
> userspace does speculative reads.

> Most of this is to read signatures of file systems or partitions, and
> that previously always succeeded, but now returns an error and warns
> in the block layer. I've been wondering a few things here:

Is that on NVMe? It's been a while since I've tried. We don't get errors
for readahead on SCSI; that would be a bug.

-- 
Martin K. Petersen	Oracle Linux Engineering
