Linux Hardening
From: Pedro Falcato <pfalcato@suse.de>
To: Christian Brauner <brauner@kernel.org>
Cc: Jens Axboe <axboe@kernel.dk>,
	Alexander Viro <viro@zeniv.linux.org.uk>,
	 Jan Kara <jack@suse.cz>,
	linux-fsdevel@vger.kernel.org, linux-kernel@vger.kernel.org,
	 linux-mm@kvack.org, linux-hardening@vger.kernel.org,
	Kees Cook <kees@kernel.org>,  Mateusz Guzik <mjguzik@gmail.com>
Subject: Re: [RFC PATCH] fs/splice: allow for a way to block splice() with read-only files
Date: Mon, 18 May 2026 15:02:42 +0200
Message-ID: <agsJyFYLg5sd_34j@pedro-suse>
In-Reply-To: <20260518-starten-messdaten-3b8aa670ec85@brauner>

On Mon, May 18, 2026 at 02:20:30PM +0200, Christian Brauner wrote:
> On Sat, May 16, 2026 at 07:21:26PM +0100, Pedro Falcato wrote:
> > Since the advent of vulns like Dirty Pipe, Dirty Frag, Copy Fail
> > and Fragnasia, splicing a read-only file is fundamentally unsafe.
> > 
> > As such, as a mitigation, add a way for users to block splice() for
> > files they cannot write to. This eliminates this whole class of exploits
> > that use splice()+confusion in pipe/net/etc code to gain write-access to
> > files they can only read.
> > 
> > Users can simply toggle fs.splice_needs_write=1 and suddenly splice() will
> > refuse perfectly legal splices() from files it can only read, but not write.
> > 
> > For vmsplice(), make do with the address_space attached to the folio. Care
> > is taken to make sure the operation isn't slowed down too much by locks. The check
> > itself isn't entirely equivalent (the mapping's host can be the internal bdev
> > inode, etc, and not the one in /dev against which permissions are checked),
> > but doing it in a more correct way would require dropping from GUP-fast to
> > GUP, and that would be too slow.
> > 
> > Signed-off-by: Pedro Falcato <pfalcato@suse.de>
> > ---
> > 
> > Hello,
> > 
> > sending this out as an RFC so I can get better opinions from VFS & security
> > folks upstream. I wrote this out as a way to harden against all the page
> > cache attacks we've seen lately, that bottom out to splice() from a file
> > they cannot write + confusion elsewhere on the net stack/pipes/etc.
> > 
> > This is _obviously_ not perfect and not complete. My first (unsent) version
> > straight up returned -EPERM on splice() for these files. This one attempts
> > to retain some compatibility by only blocking the page splicing operation,
> > but still issuing the operation with normal copies (kindly suggested by Jan).
> > vmsplice() is a complicated issue, because gup_fast does not allow us access
> > to the VMA's vm_file. I tried hacking around it but it's not perfect (e.g you
> > cannot grab the mnt_idmap for the file, since we only have access to the
> > address_space + its host).
> > I'm also not a fan of having somewhat hairy MM code in the middle of
> > fs/splice.c but that's something we can simply hoist elsewhere as this gets
> > un-RFC'd. It's also missing the external-facing docs for the sysctl.
> > 
> > My big questions are:
> > 1) Is this a viable way forward?
> 
> I think that splice() and vmsplice() are pretty wonky APIs. Even ignoring their
> recent prominent role in page cache attacks, they suffer from weird issues
> due to their interactions with pipe_lock().
> 
> Bug with splice to a pipe preventing a process exit
> 20250122020850.2175427-1-kolyshkin@gmail.com
> Sendfile holding pipe->mutex blocks the peer's pipe_release() from do_exit().
> 
> Change in splice() behaviour after 5.10? (LTP splice07)
> 7F3B484F-9555-486A-B19A-5A8EB6442988@kernel.org
> 
> [PATCH v2 00/11] Avoid unprivileged splice(file->)/(->socket) pipe exclusion
> cover.1703126594.git.nabijaczleweli@nabijaczleweli.xyz
> Pending splice from tty/socket/FIFO holds pipe->mutex indefinitely, blocking all other FIFO ops incl. read(O_NONBLOCK)
> 
> splice: prevent deadlock when splicing a file to itself
> 20260320130615.1109449-1-kartikey406@gmail.com
> do_splice_direct_actor() still lacks file_inode(in) == file_inode(out) guard
> 
> AF_UNIX/zerocopy/pipe/vmsplice/splice vs FOLL_PIN
> 2135907.1747061490@warthog.procyon.org.uk
> vmsplice/splice into AF_UNIX/pipe doesn't FOLL_PIN the source memory
> 
> My main gripe with the patch as written is that I find it really hard to
> figure out who would deploy this. It half-cripples splice() and
> vmsplice() for some use-cases but leaves it intact for others.

Not just splice() and vmsplice(), but sendfile(), copy_file_range() too.
My bet (perhaps not informed enough) is that there simply aren't that many
users doing splice-like operations from files they do not own in some way.

(maybe not true for copy_file_range(), I admit)

> 
> At that point you can also just ENOSYS splice() and vmsplice() via
> seccomp and force a fallback on non-splice codepaths that userspace has
> to have anyway as splice() isn't supported unconditionally.

IIRC GNU grep is one simple example where they assume splice() from a pipe
to /dev/null Just Works(tm) and it exits(1) otherwise.
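
As a rough illustration of the kind of userspace fallback under discussion
(a sketch only, not code from the patch or from any particular program):
try splice() first, and fall back to a plain read()/write() copy if the
kernel refuses. The names move_bytes() and copy_fallback() are hypothetical,
and the set of errno values treated as "fall back" is an assumption.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <fcntl.h>
#include <unistd.h>

/* Copy at most len bytes (one chunk) from in_fd to out_fd via a bounce buffer. */
static ssize_t copy_fallback(int in_fd, int out_fd, size_t len)
{
	char buf[65536];
	size_t chunk = len < sizeof(buf) ? len : sizeof(buf);
	ssize_t n = read(in_fd, buf, chunk);
	ssize_t done = 0;

	if (n <= 0)
		return n;	/* 0 on EOF, -1 on error */

	while (done < n) {
		ssize_t w = write(out_fd, buf + done, (size_t)(n - done));

		if (w < 0) {
			if (errno == EINTR)
				continue;
			return -1;	/* sketch: data already read is lost here */
		}
		done += w;
	}
	return done;
}

/* Try splice(); fall back to copying if it is unavailable or refused. */
static ssize_t move_bytes(int in_fd, int out_fd, size_t len)
{
	ssize_t n = splice(in_fd, NULL, out_fd, NULL, len, 0);

	if (n >= 0)
		return n;
	/* Assumed fallback conditions: no syscall, unsupported fds, or denied. */
	if (errno == ENOSYS || errno == EINVAL || errno == EPERM)
		return copy_fallback(in_fd, out_fd, len);
	return -1;
}
```

A caller would loop on move_bytes() until it returns 0 or -1. A program that
treats any splice() failure as fatal has no such path, and that is the case
where a seccomp ENOSYS would break it.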

> It feels like a knee-jerk reaction to an exploit class originating in
> buggy modules that we have little control over and we would extend an
> API to users that is really difficult to use.
> 
> What might make more sense is to add a splice specific security_*() hook
> into the code so that an LSM can deny usage of splice in whatever way it
> wants to - bpf lsm or in-tree lsm.

I don't dislike that option, but I don't love leaving hardening to LSMs. The
kernel quite literally gets a new splice-related vulnerability every week now,
where userspace gets to pass pages it has no business passing to funky
codepaths that then write on these pages. I feel like natively restricting
what you can pass is simply a natural way forward.

> 
> Then we don't have to have all this gunk in the VFS layer that will be
> annoying to maintain with little value in the long-term. So I'm not very
> likely to pick this up as is.

Totally. That's what the RFC tag is for :)

-- 
Pedro
