From: "Mickaël Salaün" <mic@digikod.net>
To: Linus Torvalds <torvalds@linux-foundation.org>
Cc: "Dave Chinner" <david@fromorbit.com>, "Jan Kara" <jack@suse.cz>,
"Christoph Hellwig" <hch@infradead.org>,
linux-fsdevel@vger.kernel.org, linux-xfs@vger.kernel.org,
linux-bcachefs@vger.kernel.org, kent.overstreet@linux.dev,
"Jann Horn" <jannh@google.com>, "Serge Hallyn" <serge@hallyn.com>,
"Kees Cook" <keescook@chromium.org>,
linux-security-module@vger.kernel.org,
"Amir Goldstein" <amir73il@gmail.com>,
"Günther Noack" <gnoack@google.com>
Subject: Re: lsm sb_delete hook, was Re: [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes()
Date: Tue, 8 Oct 2024 14:59:07 +0200 [thread overview]
Message-ID: <20241008.Pohc0dixeiZ8@digikod.net> (raw)
In-Reply-To: <CAHk-=wi5ZpW73nLn5h46Jxcng6wn_bCUxj6JjxyyEMAGzF5KZg@mail.gmail.com>
On Mon, Oct 07, 2024 at 05:28:57PM -0700, Linus Torvalds wrote:
> On Mon, 7 Oct 2024 at 16:33, Dave Chinner <david@fromorbit.com> wrote:
> >
> > There may be other inode references being held that make
> > the inode live longer than the dentry cache. When should the
> > fsnotify marks be removed from the inode in that case? Do they need
> > to remain until, e.g, writeback completes?
>
> Note that my idea is to just remove the fsnotify marks when the dentry
> discards the inode.
>
> That means that yes, the inode may still have a lifetime after the
> dentry (because of other references, _or_ just because I_DONTCACHE
> isn't set and we keep caching the inode).
>
> BUT - fsnotify won't care. There won't be any fsnotify marks on that
> inode any more, and without a dentry that points to it, there's no way
> to add such marks.
>
> (A new dentry may be re-attached to such an inode, and then fsnotify
> could re-add new marks, but that doesn't change anything - the next
> time the dentry is detached, the marks would go away again).
>
> And yes, this changes the timing on when fsnotify events happen, but
> what I'm actually hoping for is that Jan will agree that it doesn't
> actually matter semantically.
>
> > > Then at umount time, the dentry shrinking will deal with all live
> > > dentries, and at most the fsnotify layer would send the FS_UNMOUNT to
> > > just the root dentry inodes?
> >
> > I don't think even that is necessary, because
> > shrink_dcache_for_umount() drops the sb->s_root dentry after
> > trimming the dentry tree. Hence the dcache drop would cleanup all
> > inode references, roots included.
>
> Ahh - even better.
>
> I didn't actually look very closely at the actual umount path, I was
> looking just at the fsnotify_inoderemove() place in
> dentry_unlink_inode() and went "couldn't we do _this_ instead?"
>
> > > Wouldn't that make things much cleaner, and remove at least *one* odd
> > > use of the nasty s_inodes list?
> >
> > Yes, it would, but someone who knows exactly when the fsnotify
> > marks can be removed needs to chime in here...
>
> Yup. Honza?
>
> (Aside: I don't actually know if you prefer Jan or Honza, so I use
> both randomly and interchangeably?)
>
> > > I have this feeling that maybe we can just remove the other users too
> > > using similar models. I think the LSM layer use (in landlock) is bogus
> > > for exactly the same reason - there's really no reason to keep things
> > > around for a random cached inode without a dentry.
> >
> > Perhaps, but I'm not sure what the landlock code is actually trying
> > to do.
In Landlock, inodes (see landlock_object) may be referenced by several
rulesets, either tied to a task's cred or a ruleset's file descriptor.
A ruleset may outlive its referenced inodes, and this should not block
related umounts. security_sb_delete() is used to gracefully release
such references.
>
> Yeah, I wouldn't be surprised if it's just confused - it's very odd.
>
> But I'd be perfectly happy just removing one use at a time - even if
> we keep the s_inodes list around because of other users, it would
> still be "one less thing".
>
> > Hence, to me, the lifecycle and reference counting of inode related
> > objects in landlock doesn't seem quite right, and the use of the
> > security_sb_delete() callout appears to be papering over an internal
> > lifecycle issue.
> >
> > I'd love to get rid of it altogether.
I'm not sure to fully understand the implications for now, but it would
definitely be good to simplify this lifetime management. The only
requirement for Landlock is that inodes references should live as long
as the related inodes are accessible by user space or already in use.
The sooner these references are removed from related ruleset, the
better.
>
> Yeah, I think the inode lifetime is just so random these days that
> anything that depends on it is questionable.
>
> The quota case is probably the only thing where the inode lifetime
> *really* makes sense, and that's the one where I looked at the code
> and went "I *hope* this can be converted to traversing the dentry
> tree", but at the same time it did look sensible to make it be about
> inodes.
>
> If we can convert the quota side to be based on dentry lifetimes, it
> will almost certainly then have to react to the places that do
> "d_add()" when re-connecting an inode to a dentry at lookup time.
>
> So yeah, the quota code looks worse, but even if we could just remove
> fsnotify and landlock, I'd still be much happier.
>
> Linus
next prev parent reply other threads:[~2024-10-08 12:59 UTC|newest]
Thread overview: 72+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-10-02 1:33 [RFC PATCH 0/7] vfs: improving inode cache iteration scalability Dave Chinner
2024-10-02 1:33 ` [PATCH 1/7] vfs: replace invalidate_inodes() with evict_inodes() Dave Chinner
2024-10-03 7:07 ` Christoph Hellwig
2024-10-03 9:20 ` Jan Kara
2024-10-02 1:33 ` [PATCH 2/7] vfs: add inode iteration superblock method Dave Chinner
2024-10-03 7:12 ` Christoph Hellwig
2024-10-03 10:35 ` Dave Chinner
2024-10-04 9:53 ` kernel test robot
2024-10-02 1:33 ` [PATCH 3/7] vfs: convert vfs inode iterators to super_iter_inodes_unsafe() Dave Chinner
2024-10-03 7:14 ` Christoph Hellwig
2024-10-03 10:45 ` Dave Chinner
2024-10-04 10:55 ` kernel test robot
2024-10-02 1:33 ` [PATCH 4/7] vfs: Convert sb->s_inodes iteration to super_iter_inodes() Dave Chinner
2024-10-03 7:23 ` lsm sb_delete hook, was " Christoph Hellwig
2024-10-03 7:38 ` Christoph Hellwig
2024-10-03 11:57 ` Jan Kara
2024-10-03 12:11 ` Christoph Hellwig
2024-10-03 12:26 ` Jan Kara
2024-10-03 12:39 ` Christoph Hellwig
2024-10-03 12:56 ` Jan Kara
2024-10-03 13:04 ` Christoph Hellwig
2024-10-03 13:59 ` Dave Chinner
2024-10-03 16:17 ` Jan Kara
2024-10-04 0:46 ` Dave Chinner
2024-10-04 7:21 ` Christian Brauner
2024-10-04 12:14 ` Christoph Hellwig
2024-10-04 13:49 ` Jan Kara
2024-10-04 18:15 ` Paul Moore
2024-10-04 22:57 ` Dave Chinner
2024-10-05 15:21 ` Mickaël Salaün
2024-10-05 16:03 ` Mickaël Salaün
2024-10-05 16:03 ` Paul Moore
2024-10-07 20:37 ` Linus Torvalds
2024-10-07 23:33 ` Dave Chinner
2024-10-08 0:28 ` Linus Torvalds
2024-10-08 0:54 ` Linus Torvalds
2024-10-09 9:49 ` Jan Kara
2024-10-08 12:59 ` Mickaël Salaün [this message]
2024-10-09 0:21 ` Dave Chinner
2024-10-09 9:23 ` Mickaël Salaün
2024-10-08 8:57 ` Amir Goldstein
2024-10-08 11:23 ` Jan Kara
2024-10-08 12:16 ` Christian Brauner
2024-10-09 0:03 ` Dave Chinner
2024-10-08 23:44 ` Dave Chinner
2024-10-09 6:10 ` Amir Goldstein
2024-10-09 14:18 ` Jan Kara
2024-10-02 1:33 ` [PATCH 5/7] vfs: add inode iteration superblock method Dave Chinner
2024-10-03 7:24 ` Christoph Hellwig
2024-10-02 1:33 ` [PATCH 6/7] xfs: implement sb->iter_vfs_inodes Dave Chinner
2024-10-03 7:30 ` Christoph Hellwig
2024-10-02 1:33 ` [PATCH 7/7] bcachefs: " Dave Chinner
2024-10-02 10:00 ` [RFC PATCH 0/7] vfs: improving inode cache iteration scalability Christian Brauner
2024-10-02 12:34 ` Dave Chinner
2024-10-02 19:29 ` Kent Overstreet
2024-10-02 22:23 ` Dave Chinner
2024-10-02 23:20 ` Kent Overstreet
2024-10-03 1:41 ` Dave Chinner
2024-10-03 2:24 ` Kent Overstreet
2024-10-03 9:17 ` Jan Kara
2024-10-03 9:59 ` Dave Chinner
2024-10-02 19:49 ` Linus Torvalds
2024-10-02 20:28 ` Kent Overstreet
2024-10-02 23:17 ` Dave Chinner
2024-10-03 1:22 ` Kent Overstreet
2024-10-03 2:20 ` Dave Chinner
2024-10-03 2:42 ` Kent Overstreet
2024-10-03 11:45 ` Jan Kara
2024-10-03 12:18 ` Christoph Hellwig
2024-10-03 12:46 ` Jan Kara
2024-10-03 13:35 ` Dave Chinner
2024-10-03 13:03 ` Dave Chinner
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20241008.Pohc0dixeiZ8@digikod.net \
--to=mic@digikod.net \
--cc=amir73il@gmail.com \
--cc=david@fromorbit.com \
--cc=gnoack@google.com \
--cc=hch@infradead.org \
--cc=jack@suse.cz \
--cc=jannh@google.com \
--cc=keescook@chromium.org \
--cc=kent.overstreet@linux.dev \
--cc=linux-bcachefs@vger.kernel.org \
--cc=linux-fsdevel@vger.kernel.org \
--cc=linux-security-module@vger.kernel.org \
--cc=linux-xfs@vger.kernel.org \
--cc=serge@hallyn.com \
--cc=torvalds@linux-foundation.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).