From: Miklos Szeredi <miklos@szeredi.hu>
To: Matt Whitlock <kernel@mattwhitlock.name>
Cc: David Howells <dhowells@redhat.com>,
netdev@vger.kernel.org, Matthew Wilcox <willy@infradead.org>,
Dave Chinner <david@fromorbit.com>,
Linus Torvalds <torvalds@linux-foundation.org>,
Jens Axboe <axboe@kernel.dk>,
linux-fsdevel@kvack.org, linux-mm@kvack.org,
linux-kernel@vger.kernel.org, Christoph Hellwig <hch@lst.de>,
linux-fsdevel@vger.kernel.org
Subject: Re: [RFC PATCH 1/4] splice: Fix corruption of spliced data after splice() returns
Date: Wed, 19 Jul 2023 21:35:33 +0200
Message-ID: <CAJfpegvq4M_Go7fHiWVBBkrK6h4ChLqQTd0+EOKbRWZDcVerWA@mail.gmail.com>
In-Reply-To: <c634a18e-9f2b-4746-bd8f-aa1d41e6ddf7@mattwhitlock.name>

On Wed, 19 Jul 2023 at 19:59, Matt Whitlock <kernel@mattwhitlock.name> wrote:
>
> On Wednesday, 19 July 2023 06:17:51 EDT, Miklos Szeredi wrote:
> > On Thu, 29 Jun 2023 at 17:56, David Howells <dhowells@redhat.com> wrote:
> >>
> >> Splicing data from, say, a file into a pipe currently leaves the source
> >> pages in the pipe after splice() returns - but this means that those pages
> >> can be subsequently modified by shared-writable mmap(), write(),
> >> fallocate(), etc. before they're consumed.
> >
> > What is this trying to fix? The above behavior is well known, so
> > it's not likely to be a problem.
>
> Respectfully, it's not well-known, as it's not documented. If the splice(2)
> man page had mentioned that pages can be mutated after they're already
> ostensibly at rest in the output pipe buffer, then my nightly backups
> wouldn't have been incurring corruption silently for many months.

splice(2):

    Though we talk of copying, actual copies are generally avoided.
    The kernel does this by implementing a pipe buffer as a set of
    reference-counted pointers to pages of kernel memory.  The kernel
    creates "copies" of pages in a buffer by creating new pointers
    (for the output buffer) referring to the pages, and increasing the
    reference counts for the pages: only pointers are copied, not the
    pages of the buffer.

While not explicitly stating that the contents of the pages can change
after being spliced, this can easily be inferred from the above
semantics.
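
To make the inference concrete, here is a toy userspace model of the
semantics quoted above. It is only a sketch and not kernel code: Page,
Pipe, splice_in, copy_in and demo are hypothetical names invented for this
illustration, and real behavior depends on the filesystem and kernel
version. It shows that a pipe holding a reference to a page sees later
modifications to that page, while a pipe holding a copy does not.

```python
class Page:
    """One page of simulated page-cache memory (hypothetical model)."""

    def __init__(self, data: bytes):
        self.data = bytearray(data)
        self.refcount = 1  # reference held by the simulated page cache


class Pipe:
    """A pipe buffer modeled as a list of page references."""

    def __init__(self):
        self.bufs = []

    def splice_in(self, page: Page) -> None:
        # Zero-copy: only the pointer is stored, the refcount goes up.
        page.refcount += 1
        self.bufs.append(page)

    def copy_in(self, page: Page) -> None:
        # Copying alternative: snapshot the contents into a private page.
        self.bufs.append(Page(bytes(page.data)))

    def read(self) -> bytes:
        # Drain the pipe, dropping one reference per consumed buffer.
        out = bytearray()
        while self.bufs:
            page = self.bufs.pop(0)
            out += page.data
            page.refcount -= 1
        return bytes(out)


def demo():
    page = Page(b"old contents")
    spliced, copied = Pipe(), Pipe()
    spliced.splice_in(page)   # shares the page
    copied.copy_in(page)      # owns a snapshot
    page.data[:3] = b"NEW"    # stands in for a later write() to the file
    return spliced.read(), copied.read(), page.refcount


if __name__ == "__main__":
    print(demo())
```

In this model the spliced pipe returns the modified bytes because the
reader dereferences the shared page only when it consumes the buffer; the
copying pipe returns the bytes as they were at the time of the call.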
Thanks,
Miklos