All of lore.kernel.org
 help / color / mirror / Atom feed
From: Josef Bacik <josef@toxicpanda.com>
To: Christian Brauner <brauner@kernel.org>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-fsdevel@vger.kernel.org, viro@zeniv.linux.org.uk,
	jack@suse.cz, david@fromorbit.com, amir73il@gmail.com,
	hch@lst.de
Subject: Re: [PATCH][RFC] fs: add levels to inode write access
Date: Thu, 30 May 2024 10:58:10 -0400	[thread overview]
Message-ID: <20240530145810.GA2205585@perftesting> (raw)
In-Reply-To: <20240530-atheismus-festland-c11c1d3b7671@brauner>

On Thu, May 30, 2024 at 12:32:06PM +0200, Christian Brauner wrote:
> On Wed, May 29, 2024 at 04:41:32PM -0400, Josef Bacik wrote:
> > NOTE:
> > This is compile tested only.  It's also based on an out of tree feature branch
> > from Amir that I'm extending to add page fault content events to allow us to
> > have on-demand binary loading at exec time.  If you need more context please let
> > me know, I'll push my current branch somewhere and wire up how I plan to use
> > this patch so you can see it in better context, but hopefully I've described
> > what I'm trying to accomplish enough that this leads to useful discussion.
> > 
> > 
> > Currently we have ->i_writecount to control write access to an inode.
> > Callers may deny write access by calling deny_write_access(), which will
> > cause ->i_writecount to go negative, and allow_write_access() to push it
> > back up to 0.
> > 
> > This is used in a few ways, the biggest user being exec.  Historically
> > we've blocked write access to files that are opened for executing.
> > fsverity is the other notable user.
> > 
> > With the upcoming fanotify feature that allows for on-demand population
> > of files, this blanket policy of denying writes to files opened for
> > executing creates a problem.  We have to issue events right before
> > denying access, and the entire file must be populated before we can
> > continue with the exec.
> > 
> > This creates a problem for users who have large binaries they want to
> > populate on demand.  Inside of Meta we often have multi-gigabyte
> > binaries, even on the order of tens of gigabytes.  Pre-loading these
> > large files is costly, especially when large portions of the binary may
> > never be read (think debuginfo).
> > 
> > In order to facilitate on-demand loading of binaries we need to have a
> > way to punch a hole in this exec related write denial.
> 
> Hm. I suggest we try to tackle this in a completely different way. Maybe
> I mentioned it during LSFMM but instead of doing this dance we should
> try and remove the deny_write_access() mechanisms for exec completely.
> 
> Back in 2021 we removed the remaining VM_DENYWRITE bits but left in
> deny_write_access() for exec. Back then I had thought that this was a
> bit risky for not too much gain. But looking at this code here I think
> we now have an even stronger reason to try and get rid of this
> restriction. And I've since changed my mind. See my notes on the
> _completely untested_ RFC patch I appended.
> 
> Ofc depends on whether Linus still agrees that removing this might be
> something we could try.
> 

Muahaha you took the bait.

I obviously much prefer this solution, and from what I can tell we're all pretty
well in agreement about this.  Turn it into a real patch and I'll happily add my
reviewed-by.  Thanks,

Josef

  parent reply	other threads:[~2024-05-30 14:58 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-05-29 20:41 [PATCH][RFC] fs: add levels to inode write access Josef Bacik
2024-05-29 22:00 ` Jeff Layton
2024-05-30  1:14 ` Matthew Wilcox
2024-05-30 10:32 ` Christian Brauner
2024-05-30 12:57   ` Amir Goldstein
2024-05-30 14:58   ` Josef Bacik [this message]
2024-05-30 15:23   ` Christian Brauner
2024-05-30 15:49   ` Linus Torvalds
2024-05-31 10:02     ` Christian Brauner
2024-05-31 12:32       ` Christian Brauner
2024-05-31 13:01         ` [PATCH] fs: don't block i_writecount during exec Christian Brauner
2024-05-31 15:40           ` Linus Torvalds
2024-05-31 18:08             ` Jeff Layton
2024-05-31 22:08           ` Josef Bacik
2024-06-03 13:52           ` (subset) " Christian Brauner
2024-06-06 12:45           ` Aishwarya TCV
2024-06-06 15:37             ` Mark Brown
2024-06-06 16:53               ` Josef Bacik
2024-06-06 17:33                 ` Mark Brown
2024-06-06 17:49                   ` Mark Brown
2024-08-07  9:59                     ` [LTP] " Tudor Ambarus
2024-08-07  9:59                       ` Tudor Ambarus
2024-09-04 17:04           ` Jann Horn
2024-09-05  7:38             ` Roberto Sassu
2024-05-31 13:09         ` [PATCH][RFC] fs: add levels to inode write access Amir Goldstein
2024-05-31 14:50           ` Christian Brauner
2024-05-31 15:47             ` Matthew Wilcox
2024-05-31 22:14               ` Matthew Wilcox

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20240530145810.GA2205585@perftesting \
    --to=josef@toxicpanda.com \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=david@fromorbit.com \
    --cc=hch@lst.de \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=torvalds@linux-foundation.org \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.