linux-security-module.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Dominique Martinet <asmadeus@codewreck.org>
To: Tingmao Wang <m@maowtm.org>
Cc: "Christian Schoenebeck" <linux_oss@crudebyte.com>,
	"Mickaël Salaün" <mic@digikod.net>,
	"Eric Van Hensbergen" <ericvh@kernel.org>,
	"Latchesar Ionkov" <lucho@ionkov.net>,
	v9fs@lists.linux.dev, "Günther Noack" <gnoack@google.com>,
	linux-security-module@vger.kernel.org, "Jan Kara" <jack@suse.cz>,
	"Amir Goldstein" <amir73il@gmail.com>,
	"Matthew Bobrowski" <repnop@google.com>,
	"Al Viro" <viro@zeniv.linux.org.uk>,
	"Christian Brauner" <brauner@kernel.org>,
	linux-fsdevel@vger.kernel.org
Subject: Re: [PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid)
Date: Tue, 16 Sep 2025 08:31:49 +0900	[thread overview]
Message-ID: <aMih5XYYrpP559de@codewreck.org> (raw)
In-Reply-To: <41d761e6-d826-4c29-b673-1fb2b0af77b9@maowtm.org>

(Thanks Christian, replying just once but your reply was helpful)

Tingmao Wang wrote on Mon, Sep 15, 2025 at 02:44:44PM +0100:
> > I'm fuzzy on the details but I don't see how inode pointers would be
> > stable for other filesystems as well, what prevents
> > e.g. vm.drop_caches=3 to drop these inodes on ext4?
> 
> Landlock holds a reference to the inode in the ruleset, so they shouldn't
> be dropped.  On security_sb_delete Landlock will iput those inodes so they
> won't cause issue with unmounting.  There is some special mechanism
> ("landlock objects") to decouple the ruleset themselves from the actual
> inodes, so that previously Landlocked things can keep running even after
> the inode has disappeared as a result of unmounting.

Thank you for the explanation, that makes more sense.
iirc even in cacheless mode 9p should keep inode arounds if there's an
open fd somewhere

> > Although looking at the patches what 9p seems to need isn't a new stable
> > handle, but "just" not allocating new inodes in iget5...
> > This was attempted in 724a08450f74 ("fs/9p: simplify iget to remove
> > unnecessary paths"), but later reverted in be2ca3825372 ("Revert "fs/9p:
> > simplify iget to remove unnecessary paths"") because it broke too many
> > users, but if you're comfortable with a new mount option for the lookup
> > by path I think we could make a new option saying
> > "yes_my_server_has_unique_qids"... Which I assume would work for
> > landlock/fsnotify?
> 
> I noticed that, but assumed that simply reverting them without additional
> work (such as tracking the string path) would be a no go given the reason
> why they are reverted.

Yes, just reverting and using that as default broke too much things, so
this is unfortunately not acceptable... And 9p has no "negotiation"
phase on mount to say "okay this is qemu with remap mode so we can do
that" to enable to disable the behaviour automatically; which has been
annoying in the past too.

I understand you'd prefer something that works by default.

> > I'm not sure how much effort we want to spend on 32bit: as far as I
> > know, if we have inode number collision on 32 bit we're already in
> > trouble (tools like tar will consider such files to be hardlink of each
> > other and happily skip reading data, producing corrupted archives);
> > this is not a happy state but I don't know how to do better in any
> > reasonable way, so we can probably keep a similar limitation for 32bit
> > and use inode number directly...
> 
> I think if 9pfs export a handle it can be the full 64bit qid.path on any
> platform, right?

yes, file handle can be an arbitrary size.

> > I'm not familiar with the qid remap implementation in qemu, but I'm
> > curious in what case you hit that.
> > Deleting and recreating files? Or as you seem to say below the 'qid' is
> > "freed" when fd is closed qemu-side and re-used by later open of other
> > files?
> 
> I tried mounting a qemu-exported 9pfs backed on ext4, with
> multidevs=remap, and created a file, used stat to note its inode number,
> deleted the file, created another file (of the same OR different name),
> and that new file will have the same inode number.
> 
> (If I don't delete the file, then a newly created file would of course
> have a different ext4 inode number, and in that case QEMU exposes a
> different qid)

Ok so from Christian's reply this is just ext4 reusing the same inode..
I briefly hinted at this above, but in this case ext4 will give the
inode a different generation number (so the ext4 file handle will be
different, and accessing the old one will get ESTALE); but that's not
something qemu currently tracks and it'd be a bit of an overhaul...
In theory qemu could hash mount_id + file handle to get a properly
unique qid, if we need to improve that, but that'd be limited to root
users (and to filesystems that support name_to_handle_at) so I don't
think it's really appropriate either... hmm..

(I also thought of checking if nlink is 0 when getting a new inode, but
that's technically legimitate from /proc/x/fd opens so I don't think we
can do that either)

And then there's also all the servers that don't give unique qids at
all, so we'll just get weird landlock/fsnotify behaviours for them if we
go that way...

-----------------

Okay, you've convinced me something like path tracking seems more
appropriate; I'll just struggle one last time first with a few more open
questions:
 - afaiu this (reusing inodes) work in cached mode because the dentry is
kept around; I don't understand the vfs well enough but could the inodes
hold its dentry and dentries hold their parent dentry alive somehow?
So in cacheless mode, if you have a tree like this:
a
└── b
    ├── c
    └── d
with c 'open' (or a reference held by landlock), then dentries for a/b/c
would be kept, but d could be droppable?

My understanding is that in cacheless mode we're dropping dentries
aggressively so that things like readdir() are refreshed, but I'm
thinking this should work even if we keep some dentries alive when their
inode is held up.

 - if that doesn't work (or is too complicated), I'm thinking tracking
path is probably better than qid-based filtering based on what we
discussed as it only affects uncached mode.. I'll need to spend some
time testing but I think we can move forward with the current patchset
rather than try something new.

Thanks!
-- 
Dominique Martinet | Asmadeus

  reply	other threads:[~2025-09-15 23:32 UTC|newest]

Thread overview: 27+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-09-04  0:04 [PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid) Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 1/7] fs/9p: Add ability to identify inode by path for .L in uncached mode Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 2/7] fs/9p: add option for path-based inodes Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 3/7] fs/9p: Add ability to identify inode by path for non-.L in uncached mode Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 4/7] fs/9p: .L: Refresh stale inodes on reuse Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 5/7] fs/9p: non-.L: " Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 6/7] fs/9p: update the target's ino_path on rename Tingmao Wang
2025-09-04  0:04 ` [PATCH v2 7/7] docs: fs/9p: Document the "inodeident" option Tingmao Wang
2025-09-14 21:25 ` [PATCH v2 0/7] fs/9p: Reuse inode based on path (in addition to qid) Tingmao Wang
2025-09-15 12:53   ` Dominique Martinet
2025-09-15 13:44     ` Tingmao Wang
2025-09-15 23:31       ` Dominique Martinet [this message]
2025-09-16 12:44         ` Tingmao Wang
2025-09-16 13:35           ` Dominique Martinet
2025-09-16 14:01             ` Tingmao Wang
2025-09-16 19:22               ` Christian Schoenebeck
2025-09-16 23:59                 ` Tingmao Wang
2025-09-17  9:52                   ` Christian Schoenebeck
2025-09-17 15:00                     ` Mickaël Salaün
2025-09-21 16:24                       ` Tingmao Wang
2025-09-27 18:27                         ` Mickaël Salaün
2025-09-27 22:53                           ` Tingmao Wang
2025-09-29 13:06                         ` Christian Schoenebeck
2025-10-13  9:24                           ` Greg Kurz
2025-09-16 13:43           ` Christian Schoenebeck
2025-09-15 14:10     ` Christian Schoenebeck
2025-09-17 15:00       ` Mickaël Salaün

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aMih5XYYrpP559de@codewreck.org \
    --to=asmadeus@codewreck.org \
    --cc=amir73il@gmail.com \
    --cc=brauner@kernel.org \
    --cc=ericvh@kernel.org \
    --cc=gnoack@google.com \
    --cc=jack@suse.cz \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-security-module@vger.kernel.org \
    --cc=linux_oss@crudebyte.com \
    --cc=lucho@ionkov.net \
    --cc=m@maowtm.org \
    --cc=mic@digikod.net \
    --cc=repnop@google.com \
    --cc=v9fs@lists.linux.dev \
    --cc=viro@zeniv.linux.org.uk \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).