From: Eric Van Hensbergen <ericvh@kernel.org>
To: Christian Schoenebeck <linux_oss@crudebyte.com>
Cc: v9fs@lists.linux.dev, asmadeus@codewreck.org, ericvh@gmail.com
Subject: Re: cache fixes (redux)
Date: Thu, 28 Dec 2023 09:07:46 -0600 [thread overview]
Message-ID: <ZY2PQkJ6jq4dx9p_@FV7GG9FTHL> (raw)
In-Reply-To: <3936280.6WX1Z9CqS8@silver>
On Thu, Dec 28, 2023 at 11:44:32AM +0100, Christian Schoenebeck wrote:
> On Thursday, December 28, 2023 1:32:17 AM CET Dominique Martinet wrote:
> > Eric Van Hensbergen wrote on Wed, Dec 27, 2023 at 11:39:19AM -0600:
> > >
> > > Well, the server will do perm checks (ideally on walks), so do we really
> > > have to? Feels like overkill to check in both places, in principal we
> > > expect/assume server perm checks.
> >
> > The server should definitely check as well, yes.
>
> I would rather see this as a brave kernel option for people who really know
> what they are doing, not as a potential default behaviour.
>
Likely exposed first as one of the many cache options, but yes - I was
planning on potentially guarding this initially -- although as Dominique
says the VFS may be doing checking regardless of what we do in v9fs and
based on the protocol traces I was looking at before it seems we were likely
doing duplicate checks (or at the very least duplicate getattrs) and getting
rid of those is a bug fix not a feature change. So I'm going to proceed
carefully and will post diffs on protocol traces.
> With QEMU there are two distinct behaviours: 'passthrough' permissions vs.
> 'mapped(-*)' permissions:
>
> https://wiki.qemu.org/Documentation/9psetup#Starting_the_Guest_directly
>
> Where 'passthrough' means guest sees the exact same permissions for files as
> host (9p server) does. In this mode you could probably leave it to server
> doing the permission checks. However 'passthrough' mode is discouraged for
> security reasons, and one of the 'mapped' permissions modes is recommended
> instead, because we had several security issues in the past which showed that
> it's not a good idea using 'passthrough' (whereas 'mapped' was unaffected).
>
> With 'mapped-(*)' however guest should do permissions checks. Because in this
> mode file permissions on host are different and separately handled between
> host and guest. So host sees something else than guest does.
>
I think we've had this discussion before, but relying on the client to do
permission checks for the server sounds like a very dangerous behavior from
a security perspective. If the server is configured for some form of mapped
mode then the server should be responsible for checking those semantics to
protect from a compromised client. This is probably even more the case in
a hosted guest environment. In any case, the main point kicking off this
thread was vfs_lookup cases where we are doing multi-walk -- I expect VFS
is still going to do double checks on any sort of transactions on the files
themselves so while I suppose it opens the window up to a traversal bypassing
a directory which doesn't have x permissions, that seems like a corner case.
Besides which, as per the previous paragraph, I think VFS is going to do
the checks anyways and we were just doing duplicate checks so shouldn't be
a problem.
> >
> > qemu mixes in some bits from st_dev to avoid this:
> > https://gitlab.com/qemu-project/qemu/-/blob/master/hw/9pfs/9p.c?ref_type=heads#L881
>
> Inode numbers are always just unique within the same file system. And that
> makes sense. If a unique file ID is required system wide, then inode number
> must be combined with the file system's device ID.
>
Ah, okay, good to know. Christian do you remember anything about this
parallel unlink case? Is there any special handling to guard against
reused inode numbers? From the discussion this seems likely in the
tmpfs case.
> > > > > - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no
> > > > > longer used, can these be recovered? I was thinking of reusing one of
> > > > > these to mark unlinked files which we could do a more comprehensive
> > > > > compare for if its actually necessary.
> > > >
> > > See above, mark fid/qids that are unlinked with a different type I think.
> > > I´m gonna look at how other filesystems handle this case before doing
> > > anything.
> >
> > local filesystems don't do anything - the inode is just kept around, all
> > the way down to disk afaik (disk-mapped tmpfiles flush data to disk so
> > there must be some way to address them...)
> > (The last iput with nlink==0 calls evict_inode which does required
> > cleanup)
> >
> >
> > NFS doesn't delete the backing file but renames it - silly renames -
> > you've probably seen some .nfs1234 files around if you used a mount
> > point, the client won't let you delete them while it's open:
> > rm: cannot remove '.nfs000000000000f42300000001': Device or resource busy
> >
> > And it's unlinked when file is closed (DCACHE_NFSFS_RENAMED check in
> > nfs_dentry_iput). That's necessary because the connexion to the server
> > can drop at any time, and reconnecting needs to be able to open the
> > file again.
> > Servers will also scrub these files after they're been unused for a
> > while.
> >
Yeah, there's the files and then there's the inode. So if its just a rename,
its likely it kept the inode, right? I guess I should just gaze at some
VFS traces to understand the behavior and whether or not we need to do
anything from an inode/qid.path perspective.
I have publically stated I hate the temporary file hack. It just seems
awful, but I've punted this to be a server responsibility so I don't have
to think about it ;) I get where it may cause trouble on a reconnect,
but that could be handled different ways as well, just with a lot more
invasiveness on the server-side (like holding on to dangling fids on
unexpected disconnects) -- maybe something else to queue up for
future protocol revisions.
>
> I use to gather ideas for 9p protocol changes here:
> https://wiki.qemu.org/Documentation/9p#Protocol_Plans
>
> One thing that comes to my mind that's not yet on that list: server side push
> notifications, e.g. to inform guest that something changed on a file/dir guest
> currently keeps open. That might improve consistency when both on host and
> guest changes are made and caching being used by guest.
>
Yes, that list largely seems in line with mine except for adding the error
attribute on Rread/Rwrite -- that one maybe needs some discussion. I do agree
that looking into server-push or some other notification method would be good
to add to the protocol.
-eric
next prev parent reply other threads:[~2023-12-28 15:07 UTC|newest]
Thread overview: 18+ messages / expand[flat|nested] mbox.gz Atom feed top
2023-12-26 16:13 cache fixes (redux) Eric Van Hensbergen
2023-12-27 8:10 ` Dominique Martinet
2023-12-27 18:07 ` Eric Van Hensbergen
[not found] ` <CAFkjPT=8CZHATaraBrqAZGDKjQLOp=U1gdgteJ5jpXRGJyBojQ@mail.gmail.com>
2023-12-28 0:32 ` Dominique Martinet
2023-12-28 10:44 ` Christian Schoenebeck
2023-12-28 15:07 ` Eric Van Hensbergen [this message]
2023-12-28 21:48 ` asmadeus
2023-12-29 12:22 ` Christian Schoenebeck
2023-12-29 12:50 ` asmadeus
2023-12-29 16:16 ` Eric Van Hensbergen
2023-12-29 16:06 ` Eric Van Hensbergen
2023-12-29 15:50 ` Eric Van Hensbergen
2024-01-03 18:09 ` cache fixes (redux) (a tale of many inode numbers...) Eric Van Hensbergen
2024-01-03 20:24 ` cache fixes (redux) asmadeus
[not found] ` <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>
2024-01-03 22:42 ` Fwd: " Eric Van Hensbergen
2024-01-03 22:48 ` Eric Van Hensbergen
2024-01-04 12:37 ` Christian Schoenebeck
2024-01-04 15:59 ` Eric Van Hensbergen
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=ZY2PQkJ6jq4dx9p_@FV7GG9FTHL \
--to=ericvh@kernel.org \
--cc=asmadeus@codewreck.org \
--cc=ericvh@gmail.com \
--cc=linux_oss@crudebyte.com \
--cc=v9fs@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox