From: "Theodore Ts'o" <tytso@mit.edu>
To: Christoph Hellwig <hch@lst.de>
Cc: leah.rumancik@gmail.com, linux-ext4@vger.kernel.org,
linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-block@vger.kernel.org, linux-scsi@vger.kernel.org,
linux-nvme@lists.infradead.org
Subject: Re: [PATCH] ext4: fix EXT4_IOC_CHECKPOINT
Date: Wed, 7 Jul 2021 12:58:09 -0400
Message-ID: <YOXdIZARJ0Rwtfbd@mit.edu>
In-Reply-To: <20210707085644.3041867-1-hch@lst.de>
On Wed, Jul 07, 2021 at 10:56:44AM +0200, Christoph Hellwig wrote:
> Issuing a discard for any kind of "contention deletion SLO" is highly
> dangerous as discard as defined by Linux (as well as the underlying NVMe,
> SCSI, ATA, eMMC and virtio primitives) is defined to not guarantee
> erasing of data but just allow optional and nondeterministic reclamation
> of space. Instead issuing write zeroes is the only thing to perform
> such an operation. Remove the highly dangerous and misleading discard
> mode for EXT4_IOC_CHECKPOINT and only support the write zeroes based
> one, and clean up the resulting mess including the dry run mode.
A discard is not "dangerous"; how it behaves is simply not necessarily
guaranteed by the standards specification. The userspace which uses
the ioctl simply needs to know how a particular block device might
react when it is given a discard.
I'll note that there is a similar issue with "WRITE SAME" or "ZEROOUT".
A WRITE SAME might take a fraction of a second --- or it might take
days --- depending on how the storage device is implemented. It is
similarly unspecified by the various standards specifications. Hence,
userspace needs to know something about the block device before
deciding whether or not it would be a good idea to issue a "WRITE SAME"
operation for a large number of blocks.
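As a rough illustration of "knowing something about the block device", the
sketch below reads two queue limits that Linux exposes under
/sys/block/<dev>/queue/. The helper names (read_queue_limits,
offloaded_commands) are hypothetical. The limits only show which commands the
kernel can pass down to the device; they say nothing about how long a command
takes or whether the data is really erased, and that part still has to be
known out of band.

```python
from pathlib import Path


def read_queue_limits(queue_dir):
    """Read discard / write-zeroes limits (in bytes) from a queue sysfs dir.

    queue_dir is expected to look like /sys/block/<dev>/queue.
    A missing or unreadable attribute is reported as 0.
    """
    limits = {}
    for name in ("discard_max_bytes", "write_zeroes_max_bytes"):
        try:
            limits[name] = int((Path(queue_dir) / name).read_text().strip())
        except (OSError, ValueError):
            limits[name] = 0
    return limits


def offloaded_commands(limits):
    """List the commands the device advertises, based on non-zero limits.

    This is only a capability hint: it does not describe latency or
    whether the device actually erases the underlying data.
    """
    commands = []
    if limits.get("discard_max_bytes", 0) > 0:
        commands.append("discard")
    if limits.get("write_zeroes_max_bytes", 0) > 0:
        commands.append("zeroout")
    return commands
```

For example, offloaded_commands(read_queue_limits("/sys/block/sda/queue"))
would list what the (assumed) device sda advertises.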
This is why the API is implemented in terms of what command will be
issued to the block device, and not what the semantic meaning is for
that particular command. That's up to the userspace application to
know out of band, and we should be able to give the privileged
application the freedom to decide which command makes the most
sense.
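To make the "command, not semantics" shape of the interface concrete, here is
a minimal userspace sketch. The flag values and the ioctl number ('f', 43,
__u32) are what I understand the kernel's ext4 header to define, and the
request encoding assumes the asm-generic _IOW layout (x86, arm64); both should
be checked against the headers of the kernel in use. The helpers
checkpoint_flags and checkpoint are hypothetical names.

```python
import fcntl
import struct

# Assumed to match the kernel's ext4 header; verify before relying on them.
EXT4_IOC_CHECKPOINT_FLAG_DISCARD = 0x1
EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT = 0x2
EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN = 0x4


def _iow(type_char, nr, size):
    # asm-generic encoding: direction in bits 30-31 (write = 1),
    # argument size in bits 16-29, type in bits 8-15, number in bits 0-7.
    return (1 << 30) | (size << 16) | (ord(type_char) << 8) | nr


EXT4_IOC_CHECKPOINT = _iow("f", 43, 4)  # 4 == sizeof(__u32)


def checkpoint_flags(command, dry_run=False):
    """Map the caller's chosen block command to the ioctl flag word."""
    table = {
        "none": 0,
        "discard": EXT4_IOC_CHECKPOINT_FLAG_DISCARD,
        "zeroout": EXT4_IOC_CHECKPOINT_FLAG_ZEROOUT,
    }
    flags = table[command]
    if dry_run:
        flags |= EXT4_IOC_CHECKPOINT_FLAG_DRY_RUN
    return flags


def checkpoint(fd, flags):
    """Issue the ioctl on an fd that lives on the ext4 filesystem.

    Expected to need CAP_SYS_ADMIN; raises OSError on failure.
    """
    fcntl.ioctl(fd, EXT4_IOC_CHECKPOINT, struct.pack("I", flags))
```

The caller picks "discard" or "zeroout" based on what it knows about the
device; the interface itself attaches no erasure guarantee to either choice.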
- Ted