linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Joe Damato <jdamato@fastly.com>
To: Jens Axboe <axboe@kernel.dk>
Cc: Christoph Hellwig <hch@infradead.org>,
	netdev@vger.kernel.org, linux-kernel@vger.kernel.org,
	asml.silence@gmail.com, linux-fsdevel@vger.kernel.org,
	edumazet@google.com, pabeni@redhat.com, horms@kernel.org,
	linux-api@vger.kernel.org, linux-arch@vger.kernel.org,
	viro@zeniv.linux.org.uk, jack@suse.cz, kuba@kernel.org,
	shuah@kernel.org, sdf@fomichev.me, mingo@redhat.com,
	arnd@arndb.de, brauner@kernel.org, akpm@linux-foundation.org,
	tglx@linutronix.de, jolsa@kernel.org,
	linux-kselftest@vger.kernel.org
Subject: Re: [RFC -next 00/10] Add ZC notifications to splice and sendfile
Date: Wed, 19 Mar 2025 10:04:36 -0700	[thread overview]
Message-ID: <Z9r5JE3AJdnsXy_u@LQ3V64L9R2> (raw)
In-Reply-To: <2d68bc91-c22c-4b48-a06d-fa9ec06dfb25@kernel.dk>

On Wed, Mar 19, 2025 at 10:07:27AM -0600, Jens Axboe wrote:
> On 3/19/25 9:32 AM, Joe Damato wrote:
> > On Wed, Mar 19, 2025 at 01:04:48AM -0700, Christoph Hellwig wrote:
> >> On Wed, Mar 19, 2025 at 12:15:11AM +0000, Joe Damato wrote:
> >>> One way to fix this is to add zerocopy notifications to sendfile similar
> >>> to how MSG_ZEROCOPY works with sendmsg. This is possible thanks to the
> >>> extensive work done by Pavel [1].
> >>
> >> What is a "zerocopy notification" 
> > 
> > See the docs on MSG_ZEROCOPY [1], but in short when a user app calls
> > sendmsg and passes MSG_ZEROCOPY a completion notification is added
> > to the error queue. The user app can poll for these to find out when
> > the TX has completed and the buffer it passed to the kernel can be
> > overwritten.
> > 
> > My series provides the same functionality via splice and sendfile2.
> > 
> > [1]: https://www.kernel.org/doc/html/v6.13/networking/msg_zerocopy.html
> > 
> >> and why aren't you simply plugging this into io_uring and generate
> >> a CQE so that it works like all other asynchronous operations?
> > 
> > I linked to the iouring work that Pavel did in the cover letter.
> > Please take a look.
> > 
> > That work refactored the internals of how zerocopy completion
> > notifications are wired up, allowing other pieces of code to use the
> > same infrastructure and extend it, if needed.
> > 
> > My series is using the same internals that iouring (and others) use
> > to generate zerocopy completion notifications. Unlike iouring,
> > though, I don't need a fully customized implementation with a new
> > user API for harvesting completion events; I can use the existing
> > mechanism already in the kernel that user apps already use for
> > sendmsg (the error queue, as explained above and in the
> > MSG_ZEROCOPY documentation).
> 
> The error queue is arguably a work-around for _not_ having a delivery
> mechanism that works with a sync syscall in the first place. The main
> question here imho would be "why add a whole new syscall etc when
> there's already an existing way to do accomplish this, with
> free-to-reuse notifications". If the answer is "because splice", then it
> would seem saner to plumb up those bits only. Would be much simpler
> too...

I may be misunderstanding your comment, but my response would be:

  There are existing apps which use sendfile today unsafely and
  it would be very nice to have a safe sendfile equivalent. Converting
  existing apps to using iouring (if I understood your suggestion?)
  would be significantly more work compared to calling sendfile2 and
  adding code to check the error queue.

I would also argue that there are likely user apps out there that
use both sendmsg MSG_ZEROCOPY for certain writes (for data in
memory) and also use sendfile (for data on disk). One example would
be a reverse proxy that might write HTTP headers to clients via
sendmsg but transmit the response body with sendfile.

For those apps, the code to check the error queue already exists for
sendmsg + MSG_ZEROCOPY, so swapping in sendfile2 seems like an easy
way to ensure safe sendfile usage.

As far as the bit about plumbing only the splice bits, sorry if I'm
being dense here, do you mean plumbing the error queue through to
splice only and dropping sendfile2?

That is an option. Then the apps currently using sendfile could use
splice instead and get completion notifications on the error queue.
That would probably work and be less work than rewriting to use
iouring, but probably a bit more work than using a new syscall.

Thanks for taking a look and responding.

  reply	other threads:[~2025-03-19 17:04 UTC|newest]

Thread overview: 38+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-03-19  0:15 [RFC -next 00/10] Add ZC notifications to splice and sendfile Joe Damato
2025-03-19  0:15 ` [RFC -next 01/10] splice: Add ubuf_info to prepare for ZC Joe Damato
2025-03-19  0:15 ` [RFC -next 02/10] splice: Add helper that passes through splice_desc Joe Damato
2025-03-19  0:15 ` [RFC -next 03/10] splice: Factor splice_socket into a helper Joe Damato
2025-03-19  0:15 ` [RFC -next 04/10] splice: Add SPLICE_F_ZC and attach ubuf Joe Damato
2025-03-19  0:15 ` [RFC -next 05/10] fs: Add splice_write_sd to file operations Joe Damato
2025-03-19  0:15 ` [RFC -next 06/10] fs: Extend do_sendfile to take a flags argument Joe Damato
2025-03-19  0:15 ` [RFC -next 07/10] fs: Add sendfile2 which accepts " Joe Damato
2025-03-19  0:15 ` [RFC -next 08/10] fs: Add sendfile flags for sendfile2 Joe Damato
2025-03-19  0:15 ` [RFC -next 09/10] fs: Add sendfile2 syscall Joe Damato
2025-03-19  0:15 ` [RFC -next 10/10] selftests: Add sendfile zerocopy notification test Joe Damato
2025-03-19  8:04 ` [RFC -next 00/10] Add ZC notifications to splice and sendfile Christoph Hellwig
2025-03-19 15:32   ` Joe Damato
2025-03-19 16:07     ` Jens Axboe
2025-03-19 17:04       ` Joe Damato [this message]
2025-03-19 17:20         ` Jens Axboe
2025-03-19 17:45           ` Joe Damato
2025-03-19 18:37             ` Jens Axboe
2025-03-19 19:15               ` Stefan Metzmacher
2025-03-20 10:46                 ` Pavel Begunkov
2025-03-21  7:55                   ` Stefan Metzmacher
2025-03-21 20:51                     ` Pavel Begunkov
2025-03-19 19:16               ` Joe Damato
2025-03-21 11:11                 ` Jens Axboe
2025-03-20  5:57             ` Christoph Hellwig
2025-03-20 18:23               ` Joe Damato
2025-03-21  5:56                 ` Christoph Hellwig
2025-03-21 11:14                   ` Jens Axboe
2025-03-21 16:36                     ` Joe Damato
2025-03-21 20:30                       ` Joe Damato
2025-03-21 20:33                         ` Jens Axboe
2025-03-21 21:28                           ` Joe Damato
2025-03-21 20:35                       ` Jens Axboe
2025-03-21 16:44                   ` Joe Damato
2025-03-19 23:22       ` Joe Damato
2025-03-21 11:13         ` Jens Axboe
2025-03-20  5:50     ` Christoph Hellwig
2025-03-20 18:05       ` Joe Damato

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=Z9r5JE3AJdnsXy_u@LQ3V64L9R2 \
    --to=jdamato@fastly.com \
    --cc=akpm@linux-foundation.org \
    --cc=arnd@arndb.de \
    --cc=asml.silence@gmail.com \
    --cc=axboe@kernel.dk \
    --cc=brauner@kernel.org \
    --cc=edumazet@google.com \
    --cc=hch@infradead.org \
    --cc=horms@kernel.org \
    --cc=jack@suse.cz \
    --cc=jolsa@kernel.org \
    --cc=kuba@kernel.org \
    --cc=linux-api@vger.kernel.org \
    --cc=linux-arch@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-kselftest@vger.kernel.org \
    --cc=mingo@redhat.com \
    --cc=netdev@vger.kernel.org \
    --cc=pabeni@redhat.com \
    --cc=sdf@fomichev.me \
    --cc=shuah@kernel.org \
    --cc=tglx@linutronix.de \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).