From: Christian Brauner <brauner@kernel.org>
To: Amir Goldstein <amir73il@gmail.com>
Cc: jack@suse.cz, Tejun Heo <tj@kernel.org>,
"T.J. Mercier" <tjmercier@google.com>,
gregkh@linuxfoundation.org, driver-core@lists.linux.dev,
linux-kernel@vger.kernel.org, cgroups@vger.kernel.org,
linux-fsdevel@vger.kernel.org, shuah@kernel.org,
linux-kselftest@vger.kernel.org
Subject: Re: [PATCH v4 2/3] kernfs: Send IN_DELETE_SELF and IN_IGNORED
Date: Wed, 4 Mar 2026 14:26:11 +0100
Message-ID: <20260304-glasig-amtieren-5010757246ae@brauner>
In-Reply-To: <CAOQ4uxhSL3ZRzNjM6AM_poxeTsYgWb5_f3tO6_4ketg8sFSOBw@mail.gmail.com>
On Tue, Mar 03, 2026 at 03:27:52PM +0100, Amir Goldstein wrote:
> On Tue, Feb 24, 2026 at 12:03 PM Christian Brauner <brauner@kernel.org> wrote:
> >
> > On Mon, Feb 23, 2026 at 06:27:31AM -1000, Tejun Heo wrote:
> > > (cc'ing Christian Brauner)
> > >
> > > On Sat, Feb 21, 2026 at 06:11:28PM +0200, Amir Goldstein wrote:
> > > > On Sat, Feb 21, 2026 at 12:32 AM Tejun Heo <tj@kernel.org> wrote:
> > > > >
> > > > > Hello, Amir.
> > > > >
> > > > > On Fri, Feb 20, 2026 at 10:11:15PM +0200, Amir Goldstein wrote:
> > > > > > > Yeah, that can be useful. For cgroupfs, there would probably need to be a
> > > > > > > way to scope it so that it can be used on delegation boundaries too (which
> > > > > > > we can require to coincide with cgroup NS boundaries).
> > > > > >
> > > > > > I have no idea what the above means.
> > > > > > I could ask Gemini or you and I prefer the latter ;)
> > > > >
> > > > > Ah, you chose wrong. :)
> > > > >
> > > > > > > What are delegation boundaries and NS boundaries in this context?
> > > > >
> > > > > cgroup delegation is giving control of a subtree to someone else:
> > > > >
> > > > > https://git.kernel.org/pub/scm/linux/kernel/git/tj/cgroup.git/tree/Documentation/admin-guide/cgroup-v2.rst#n537
> > > > >
> > > > > > There's an old way of doing it by changing perms on some files and a new
> > > > > > way using cgroup namespace.
> > > > >
> > > > > > > Would it be possible to make FAN_MNT_ATTACH work for that?
> > > > > >
> > > > > > FAN_MNT_ATTACH is an event generated on a mntns object.
> > > > > > If "cgroup NS boundaries" is referring to a mntns object and if
> > > > > > this object is available in the context of cgroup create/destroy
> > > > > > then it should be possible.
> > > > >
> > > > > Great, yes, cgroup namespace way should work then.
> > > > >
> > > > > > But FAN_MNT_ATTACH reports a mountid. Is there a mountid
> > > > > > to report on cgroup create? Probably not?
> > > > >
> > > > > Sorry, I thought that was per-mount recursive file event monitoring.
> > > > > FAN_MARK_MOUNT looks like the right thing if we want to allow monitoring
> > > > > cgroup creations / destructions in a subtree without recursively watching
> > > > > each cgroup.
> > > >
> > > > The problem sounds very similar to subtree monitoring for mkdir/rmdir on
> > > > a filesystem, which is a problem that we have not yet solved.
> > > >
> > > > The problem with FAN_MARK_MOUNT is that it does not support the
> > > > events CREATE/DELETE, because those events are currently
> > >
> > > Ah, bummer.
> > >
> > > > > monitored in a context where the mount is not available, and anyway
> > > > > what users want is to get notified of a deleted file/dir in a subtree
> > > > > regardless of the mount through which the create/delete was done.
> > > >
> > > > > Since commit 58f5fbeb367ff ("fanotify: support watching filesystems
> > > > > and mounts inside userns"), fanotify groups can be associated
> > > > > with a userns.
> > > >
> > > > I was thinking that we can have a model where events are delivered
> > > > to a listener based on whether or not the uid/gid of the object are
> > > > mappable to the userns of the group.
> > >
> > > Given how different NSes can be used independently of each other, it'd
> > > probably be cleaner if it doesn't have to depend on another NS.
> > >
> > > > > In a filesystem, this criterion cannot guarantee subtree isolation.
> > > > > I imagine that for delegated cgroups this criterion could match what
> > > > > you need, but I am basing this on pure speculation.
> > >
> > > There's a lot of flexibility in the mechanism, so it's difficult to tell.
> > > e.g. There's nothing preventing somebody from creating two separate subtrees
> > > delegated to the same user.
> >
> > > Delegation is based on inode ownership. I'm not sure how well this will
> > fit into the fanotify model. Maybe the group logic for userns that
> > fanotify added works. I'm not super sure.
> >
> > > Christian was mentioning allowing separate super for different cgroup mounts
> > > in another thread. cc'ing him for context.
> >
> > If cgroupfs changes to tmpfs semantics where each mount gives you a new
> > superblock then it's possible to give each container its own superblock.
> > That in turn would make it possible to place fanotify watches on the
> > superblock itself. I think you'd roughly need something like the
> > following permission model:
> >
>
> It's hard for me to estimate the effort of changing to multi sb model,
> but judging by the length of the email I trimmed below, it does not
> sound trivial...
>
> How do you guys feel about something like this patch which associates
> an owner userns to every cgroup?
>
> I have this POC branch from a long time ago [1] to filter all events
> on sb by in_userns() criteria. The semantics for real filesystems
> were a bit difficult, but perhaps this model can work well for these
> pseudo singleton fs.
>
> I am trying to work on a model that could be useful for both cgroupfs
> and nsfs:
>
> If user is capable in userns, user will be able to set an sb
> watch for all events (say DELETE_SELF) on the sb, for objects
> whose owner_userns is in_userns() of the fanotify listener.
>
> This will enable watching for torn down cgroups and namespaces
> which are visible to said user via delegated cgroups mount
> or via listns().
>
> I would like to allow calling fsnotify_obj_remove() hook with
> encoded object fid (e.g. nsfs_file_handle) instead of the vfs inode,
> so that cgroupfs/nsfs could report dying objects without needing
> to associate a vfs inode with them.
>
> WDYT? Is this an interesting direction to pursue?
I'd need to see the patches. I barely remember the details tbh.
It doesn't sound crazy though.