public inbox for linux-block@vger.kernel.org
 help / color / mirror / Atom feed
From: "Martin K. Petersen" <martin.petersen@oracle.com>
To: Kent Overstreet <kent.overstreet@linux.dev>
Cc: "Martin K. Petersen" <martin.petersen@oracle.com>,
	Keith Busch <kbusch@kernel.org>,
	Christoph Hellwig <hch@infradead.org>,
	Jens Axboe <axboe@kernel.dk>,
	linux-bcachefs@vger.kernel.org, linux-block@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>
Subject: Re: [PATCH 13/14] block: Allow REQ_FUA|REQ_READ
Date: Mon, 17 Mar 2025 13:57:53 -0400	[thread overview]
Message-ID: <yq1bjtzfyen.fsf@ca-mkp.ca.oracle.com> (raw)
In-Reply-To: <qhc7tpttpt57meqqyxrfuvvfaqg7hgrpivtwa5yxkvv22ubyia@ga3scmjr5kti> (Kent Overstreet's message of "Mon, 17 Mar 2025 11:43:46 -0400")


Kent,

>> At least for SCSI, given how FUA is usually implemented, I consider
>> it quite unlikely that two read operations back to back would somehow
>> cause different data to be transferred. Regardless of which flags you
>> use.
>
> Based on what, exactly?

Based on the fact that many devices will either blindly flush on FUA or
they'll do the equivalent of a media verify operation. In neither case
will you get different data returned. The emphasis for FUA is on media
durability, not caching.

In most implementations the cache isn't an optional memory buffer thingy
that can be sidestepped. It is the only access mechanism that exists
between the media and the host interface. Working memory if you will. So
bypassing the device cache is not really a good way to think about it.

The purpose of FUA is to ensure durability for future reads, it is a
media management flag. As such, any effect FUA may have on the device
cache is incidental.

For SCSI there is a different flag to specify caching behavior. That
flag is orthogonal to FUA and did not get carried over to NVMe.

> We _know_ devices are not perfect, and your claim that "it's quite
> unlikely that two reads back to back would return different data"
> amounts to claiming that there are no bugs in a good chunk of the IO
> path and all that is implemented perfectly.

I'm not saying that devices are perfect or that the standards make
sense. I'm just saying that your desired behavior does not match the
reality of how a large number of these devices are actually implemented.

The specs are largely written by device vendors and therefore
deliberately ambiguous. Many of the explicit cache management bits and
bobs have been removed from SCSI or are defined as hints because device
vendors don't want the OS to interfere with how they manage resources,
including caching. I get what your objective is. I just don't think FUA
offers sufficient guarantees in that department.

Also, given the amount of hardware checking done at the device level, my
experience tells me that you are way more likely to have undetected
corruption problems on the host side than inside the storage device. In
general storage devices implement very extensive checking on both
control and data paths. And they will return an error if there is a
mismatch (as opposed to returning random data).

-- 
Martin K. Petersen	Oracle Linux Engineering

  reply	other threads:[~2025-03-17 17:58 UTC|newest]

Thread overview: 42+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-11 20:15 [PATCH 00/14] better handling of checksum errors/bitrot Kent Overstreet
2025-03-11 20:15 ` [PATCH 13/14] block: Allow REQ_FUA|REQ_READ Kent Overstreet
2025-03-15 16:47   ` Jens Axboe
2025-03-15 17:01     ` Kent Overstreet
2025-03-15 17:03       ` Jens Axboe
2025-03-15 17:27         ` Kent Overstreet
2025-03-15 17:43           ` Jens Axboe
2025-03-15 18:07             ` Kent Overstreet
2025-03-15 18:32               ` Jens Axboe
2025-03-15 18:41                 ` Kent Overstreet
2025-03-17  6:00                   ` Christoph Hellwig
2025-03-17 12:15                     ` Kent Overstreet
2025-03-17 14:13                       ` Keith Busch
2025-03-17 14:49                         ` Kent Overstreet
2025-03-17 15:15                           ` Keith Busch
2025-03-17 15:22                             ` Kent Overstreet
2025-03-17 15:30                         ` Martin K. Petersen
2025-03-17 15:43                           ` Kent Overstreet
2025-03-17 17:57                             ` Martin K. Petersen [this message]
2025-03-17 18:21                               ` Kent Overstreet
2025-03-17 19:24                                 ` Keith Busch
2025-03-17 19:40                                   ` Kent Overstreet
2025-03-17 20:39                                     ` Keith Busch
2025-03-17 21:13                                       ` Bart Van Assche
2025-03-18  1:06                                         ` Kent Overstreet
2025-03-18  6:16                                           ` Christoph Hellwig
2025-03-18 17:49                                           ` Bart Van Assche
2025-03-18 18:00                                             ` Kent Overstreet
2025-03-18 18:10                                               ` Keith Busch
2025-03-18 18:13                                                 ` Kent Overstreet
2025-03-20  5:40                                                 ` Christoph Hellwig
2025-03-20 10:28                                                   ` Kent Overstreet
2025-03-18  0:27                                       ` Kent Overstreet
2025-03-18  6:11                                 ` Christoph Hellwig
2025-03-18 21:33                                   ` Kent Overstreet
2025-03-17 17:32                           ` Keith Busch
2025-03-18  6:19                             ` Christoph Hellwig
2025-03-18  6:01                       ` Christoph Hellwig
2025-03-17 20:55 ` [PATCH 00/14] better handling of checksum errors/bitrot John Stoffel
2025-03-18  1:15   ` Kent Overstreet
2025-03-18 14:47     ` John Stoffel
2025-03-20 17:15       ` Kent Overstreet

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=yq1bjtzfyen.fsf@ca-mkp.ca.oracle.com \
    --to=martin.petersen@oracle.com \
    --cc=axboe@kernel.dk \
    --cc=hch@infradead.org \
    --cc=kbusch@kernel.org \
    --cc=kent.overstreet@linux.dev \
    --cc=linux-bcachefs@vger.kernel.org \
    --cc=linux-block@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox