From: Chuck Lever <chuck.lever@oracle.com>
To: Theodore Ts'o <tytso@mit.edu>, Christoph Hellwig <hch@infradead.org>
Cc: Dave Chinner <david@fromorbit.com>,
Anna Schumaker <anna.schumaker@oracle.com>,
lsf-pc@lists.linux-foundation.org, linux-fsdevel@vger.kernel.org,
Linux NFS Mailing List <linux-nfs@vger.kernel.org>
Subject: Re: [LSF/MM/BPF TOPIC] Implementing the NFS v4.2 WRITE_SAME operation: VFS or NFS ioctl() ?
Date: Thu, 16 Jan 2025 08:59:19 -0500
Message-ID: <21c7789f-2d59-42ce-8fcc-fd4c08bcb06f@oracle.com>
In-Reply-To: <20250116133701.GB2446278@mit.edu>

On 1/16/25 8:37 AM, Theodore Ts'o wrote:
> On Wed, Jan 15, 2025 at 09:42:29PM -0800, Christoph Hellwig wrote:
>> On Wed, Jan 15, 2025 at 10:14:56AM +1100, Dave Chinner wrote:
>>> How closely does this match to the block device WRITE_SAME
>>> (SCSI/NVMe) commands? I note there is a reference to this in the
>>> RFC, but there are no details given.
>>
>> There is no write same in NVMe. In one of the few wise choices in
>> NVMe the protocol only does a write zeroes for zeroing instead of the
>> overly complex write same. And no one has complained about that so
>> far.
>
> It should be noted that there is currently a patch proposing to add to
> fallocate support for the operation FALLOC_FL_WRITE_ZEROES:
>
> https://lore.kernel.org/all/20250115114637.2705887-1-yi.zhang@huaweicloud.com/
>
> For those use cases where this is all the user requires, perhaps this
> is something that Linux's nfs4 client should consider implementing?

I've seen one or two other mentions of "let's make the NFS client do
such and such" in this thread.

To be clear: The proposal includes client and server implementation of
the NFSv4.2 WRITE_SAME operation. This is not a client-only thing.

In fact, the most recent requester mentioned only a server
implementation because they have a client that already implements
WRITE_SAME and want this feature in NFSD.

> In any case I'd suggest that interested file system developers comment
> on this patch series.
>
> Personally, I have no interest in using or implementing a
> WRITE_SAME operation which implements the all-singing, all-dancing
> WRITE_SAME as envisioned by the SCSI and NFSv4.2 specifications.

I think we need to consider a weak generic implementation that resides
in the VFS or a library for file systems that choose not to implement it.

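To make the idea concrete, a fallback of this kind could be as simple as
writing the pattern repeatedly through the ordinary write path. The
following is only a userspace sketch in C under that assumption:
`write_same_fallback` is a hypothetical name, not an existing VFS, libc,
or NFSD interface, and it ignores everything the NFSv4.2 operation
specifies beyond a fixed repeating pattern.

```c
#define _XOPEN_SOURCE 700  /* for pwrite() and mkstemp() */
#include <assert.h>
#include <errno.h>
#include <stdint.h>
#include <stdlib.h>
#include <string.h>
#include <sys/types.h>
#include <unistd.h>

/*
 * Hypothetical helper (not a kernel or libc API): write `count`
 * back-to-back copies of `pattern` (`pattern_len` bytes each) into `fd`
 * starting at `offset`. Returns 0 on success, -1 with errno set on failure.
 */
static int write_same_fallback(int fd, off_t offset, const void *pattern,
                               size_t pattern_len, uint64_t count)
{
    enum { BUF_TARGET = 1 << 20 };  /* aim for roughly 1 MiB per pwrite() */

    if (pattern == NULL || pattern_len == 0) {
        errno = EINVAL;
        return -1;
    }
    if (count == 0)
        return 0;

    /* Batch as many whole copies as fit in the target size, at least one. */
    size_t copies_per_buf = BUF_TARGET / pattern_len;
    if (copies_per_buf == 0)
        copies_per_buf = 1;
    if (copies_per_buf > count)
        copies_per_buf = (size_t)count;
    assert(copies_per_buf >= 1);

    unsigned char *buf = malloc(copies_per_buf * pattern_len);
    if (buf == NULL)
        return -1;
    for (size_t i = 0; i < copies_per_buf; i++)
        memcpy(buf + i * pattern_len, pattern, pattern_len);

    int rc = 0;
    while (rc == 0 && count > 0) {
        size_t copies = count < copies_per_buf ? (size_t)count : copies_per_buf;
        size_t left = copies * pattern_len;
        const unsigned char *p = buf;

        /* pwrite() may write fewer bytes than requested, so loop. */
        while (left > 0) {
            ssize_t n = pwrite(fd, p, left, offset);
            if (n < 0 && errno == EINTR)
                continue;
            if (n <= 0) {
                if (n == 0)
                    errno = EIO;  /* no progress: treat as an I/O error */
                rc = -1;
                break;
            }
            p += n;
            left -= (size_t)n;
            offset += n;
        }
        count -= copies;
    }

    /* Preserve errno from the failing call across free(). */
    int saved_errno = errno;
    free(buf);
    errno = saved_errno;
    return rc;
}
```

A file system or server that has a faster native path would bypass a
helper like this entirely; the sketch only shows the lowest common
denominator that a generic layer could fall back to.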
> I will also note that many Cloud vendors (AWS, GCE, Azure) are moving
> to using NVMe instead of SCSI, especially for the higher performance
> VM and software-defined block devices. So, I would suspect that a
> customer would have to wave a **very** large amount of money under my
> employer's nose before this would be something that would be funded by
> $WORK for block-based file systems (and even then, it appears that
> NVMe is so much better at higher performance storage, such that I'm
> not sure how many customers would really be all that interested).
>
> But hey, if someone knows of some AI-related workload that needs to
> write the same non-zero block a very large number of times, let me
> know. :-)

See my previous reply in this thread: WRITE_SAME has a long-standing
existing use case in the database world. The NFSv4.2 WRITE_SAME
operation was designed around this use case.

You remember database workloads, right? ;-)

--
Chuck Lever