linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	jack@suse.cz, david@fromorbit.com, amir73il@gmail.com,
	hch@lst.de
Subject: Re: [PATCH][RFC] fs: add levels to inode write access
Date: Thu, 30 May 2024 10:58:10 -0400	[thread overview]
Message-ID: <20240530145810.GA2205585@perftesting> (raw)
In-Reply-To: <20240530-atheismus-festland-c11c1d3b7671@brauner>

On Thu, May 30, 2024 at 12:32:06PM +0200, Christian Brauner wrote:
> On Wed, May 29, 2024 at 04:41:32PM -0400, Josef Bacik wrote:
> > NOTE:
> > This is compile tested only.  It's also based on an out of tree feature branch
> > from Amir that I'm extending to add page fault content events to allow us to
> > have on-demand binary loading at exec time.  If you need more context please let
> > me know, I'll push my current branch somewhere and wire up how I plan to use
> > this patch so you can see it in better context, but hopefully I've described
> > what I'm trying to accomplish enough that this leads to useful discussion.
> > 
> > 
> > Currently we have ->i_writecount to control write access to an inode.
> > Callers may deny write access by calling deny_write_access(), which will
> > cause ->i_writecount to go negative, and allow_write_access() to push it
> > back up to 0.
> > 
> > This is used in a few ways, the biggest user being exec.  Historically
> > we've blocked write access to files that are opened for executing.
> > fsverity is the other notable user.
> > 
> > With the upcoming fanotify feature that allows for on-demand population
> > of files, this blanket policy of denying writes to files opened for
> > executing creates a problem.  We have to issue events right before
> > denying access, and the entire file must be populated before we can
> > continue with the exec.
> > 
> > This creates a problem for users who have large binaries they want to
> > populate on demand.  Inside of Meta we often have multi-gigabyte
> > binaries, even on the order of tens of gigabytes.  Pre-loading these
> > large files is costly, especially when large portions of the binary may
> > never be read (think debuginfo).
> > 
> > In order to facilitate on-demand loading of binaries we need to have a
> > way to punch a hole in this exec related write denial.
> 
> Hm. I suggest we try to tackle this in a completely different way. Maybe
> I mentioned it during LSFMM but instead of doing this dance we should
> try and remove the deny_write_access() mechanisms for exec completely.
> 
> Back in 2021 we removed the remaining VM_DENYWRITE bits but left in
> deny_write_access() for exec. Back then I had thought that this was a
> bit risky for not too much gain. But looking at this code here I think
> we now have an even stronger reason to try and get rid of this
> restriction. And I've since changed my mind. See my notes on the
> _completely untested_ RFC patch I appended.
> 
> Ofc depends on whether Linus still agrees that removing this might be
> something we could try.
> 

Muahaha you took the bait.

I obviously much prefer this solution, and from what I can tell we're all pretty
well in agreement about this.  Turn it into a real patch and I'll happily add my
reviewed-by.  Thanks,

Josef

  parent reply	other threads:[~2024-05-30 14:58 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 20:41 [PATCH][RFC] fs: add levels to inode write access Josef Bacik
2024-05-29 22:00 ` Jeff Layton
2024-05-30  1:14 ` Matthew Wilcox
2024-05-30 10:32 ` Christian Brauner
2024-05-30 12:57   ` Amir Goldstein
2024-05-30 14:58   ` Josef Bacik [this message]
2024-05-30 15:23   ` Christian Brauner
2024-05-30 15:49   ` Linus Torvalds
2024-05-31 10:02     ` Christian Brauner
2024-05-31 12:32       ` Christian Brauner
2024-05-31 13:01         ` [PATCH] fs: don't block i_writecount during exec Christian Brauner
2024-05-31 15:40           ` Linus Torvalds
2024-05-31 18:08             ` Jeff Layton
2024-05-31 22:08           ` Josef Bacik
2024-06-03 13:52           ` (subset) " Christian Brauner
2024-06-06 12:45           ` Aishwarya TCV
2024-06-06 15:37             ` Mark Brown
2024-06-06 16:53               ` Josef Bacik
2024-06-06 17:33                 ` Mark Brown
2024-06-06 17:49                   ` Mark Brown
2024-08-07  9:59                     ` Tudor Ambarus
2024-09-04 17:04           ` Jann Horn
2024-09-05  7:38             ` Roberto Sassu
2024-05-31 13:09         ` [PATCH][RFC] fs: add levels to inode write access Amir Goldstein
2024-05-31 14:50           ` Christian Brauner
2024-05-31 15:47             ` Matthew Wilcox
2024-05-31 22:14               ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240530145810.GA2205585@perftesting \
    --to=josef@toxicpanda.com \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).