* cache fixes (redux)
@ 2023-12-26 16:13 Eric Van Hensbergen
From: Eric Van Hensbergen @ 2023-12-26 16:13 UTC (permalink / raw)
To: v9fs
It's the holidays, so I have some time to spend brain cycles on 9p --
I've been looking over the metadata caching problems and found several
low-hanging fixes that might help without having to do the big rewrite
I tried last year.

The two things I'm going after first:

- The fact that we never seem to use multi-walk (this isn't a
cache-specific problem; it's actually somewhat worse in non-cache
mode). This seems to be because fid_lookup goes recursive before
entering multiwalk. This should be a fairly simple fix - I'm just
going to add an argument to limit recursion (rough sketch after this
list), but I'm noodling on whether we need that path a bit first.
- figure out why even in loose mode we seem to be sending a lot of
stat/getattr -- this seems to be related to some code that came in
with the cache code. It seems we almost always stat/getattr because
we are constantly creating new inodes. There are some comments in
vfs_inode.c:755 that talk about needing to do this for parallel lookup
-- but this all seems like a hack. Anyone have better insight into
why we need this? We always do the stat/getattr to retrieve the stat
to compare against because we don't trust qid.path -- which I think is
wrong. If there is an issue with unlinked files, I think we can do
something smarter to differentiate them, maybe using a qid.type
- QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no longer
used; can these be recovered? I was thinking of reusing one of these
to mark unlinked files, which we could do a more comprehensive compare
for if it's actually necessary.
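
For concreteness, a rough sketch of the recursion cap plus multi-walk
batching I have in mind -- none of this is existing fs/9p code:
v9fs_fid_find_cached() stands in for the real dentry->fid lookup
helper (which also takes a uid), error handling is elided, and
P9_MAXWELEM is just the protocol's per-Twalk component limit:

---
#include <linux/dcache.h>
#include <linux/err.h>
#include <net/9p/9p.h>
#include <net/9p/client.h>

/* stand-in for the real fs/9p helper (which also takes a uid) */
static struct p9_fid *v9fs_fid_find_cached(struct dentry *dentry);

static struct p9_fid *v9fs_fid_lookup_sketch(struct dentry *dentry)
{
	const unsigned char *wnames[P9_MAXWELEM];
	struct dentry *d = dentry;
	struct p9_fid *fid = NULL;
	int n = 0;

	/* climb until we hit a dentry that already holds a fid,
	 * instead of recursing one component at a time */
	while (n < P9_MAXWELEM && !(fid = v9fs_fid_find_cached(d))) {
		wnames[n++] = d->d_name.name;	/* collected leaf-first */
		d = d->d_parent;
	}
	if (!fid)
		return ERR_PTR(-ENOENT);	/* fall back to old path */

	/* reverse so the components walk root-to-leaf */
	for (int i = 0; i < n / 2; i++) {
		const unsigned char *tmp = wnames[i];

		wnames[i] = wnames[n - 1 - i];
		wnames[n - 1 - i] = tmp;
	}

	/* one Twalk with n components instead of n single walks */
	return p9_client_walk(fid, n, wnames, 1);
}
---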
FWIW - I do plan to go back and try to do the broader rewrite on the
metadata cache, because I think we'll need it for temporal caches to
work properly, but I figure if I knock these two out it'll make life
better for many people's common use cases.
-eric

* Re: cache fixes (redux)
From: Dominique Martinet @ 2023-12-27 8:10 UTC (permalink / raw)
To: Eric Van Hensbergen; +Cc: v9fs, Christian Schoenebeck

Hi Eric!

Eric Van Hensbergen wrote on Tue, Dec 26, 2023 at 10:13:20AM -0600:
> - The fact that we never seem to use multi-walk (this isn't a
> cache-specific problem; it's actually somewhat worse in non-cache
> mode). This seems to be because fid_lookup goes recursive before
> entering multiwalk. This should be a fairly simple fix - I'm just
> going to add an argument to limit recursion, but I'm noodling on
> whether we need that path a bit first.

I'm not sure multi-walk is possible in practice: even opening a file
directly through e.g. cat /mnt/long/path/to/file will have the VFS
check permissions along the path, so the VFS will need to stat all
elements along the way.

If you can get it to work, though, it's certainly worth implementing
the possibility; let's give it a try.

> - figure out why even in loose mode we seem to be sending a lot of
> stat/getattr -- this seems to be related to some code that came in
> with the cache code. It seems we almost always stat/getattr because
> we are constantly creating new inodes. There are some comments in
> vfs_inode.c:755 that talk about needing to do this for parallel
> lookup -- but this all seems like a hack. Anyone have better insight
> into why we need this? We always do the stat/getattr to retrieve the
> stat to compare against because we don't trust qid.path -- which I
> think is wrong. If there is an issue with unlinked files, I think we
> can do something smarter to differentiate them, maybe using a
> qid.type

Hmm, in loose mode we definitely should be reusing the inodes...
No idea why we had this in the first place; I'd say rip it out and
compare?

> - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no longer
> used; can these be recovered? I was thinking of reusing one of these
> to mark unlinked files, which we could do a more comprehensive
> compare for if it's actually necessary.

Since we don't support reconnects, unlinked files have been working
just fine with servers that keep the underlying files open (e.g.
qemu).

What's your plan exactly?

We don't have any way of driving protocol changes forward (and there's
no compatibility negotiation on client connect...), but I recall
Christian also was looking forward to some other changes, so we might
want to try to move this forward again...

-- 
Dominique Martinet | Asmadeus

* Re: cache fixes (redux)
From: Eric Van Hensbergen @ 2023-12-27 18:07 UTC (permalink / raw)
To: Dominique Martinet; +Cc: v9fs, Christian Schoenebeck

On Wed, Dec 27, 2023 at 2:11 AM Dominique Martinet
<asmadeus@codewreck.org> wrote:
> I'm not sure multi-walk is possible in practice: even opening a file
> directly through e.g. cat /mnt/long/path/to/file will have the VFS
> check permissions along the path, so the VFS will need to stat all
> elements along the way.

Well, the server will do perm checks (ideally on walks), so do we
really have to? Feels like overkill to check in both places; in
principle we expect/assume server perm checks.

> Hmm, in loose mode we definitely should be reusing the inodes...
> No idea why we had this in the first place; I'd say rip it out and
> compare?

Yeah, I was thinking of using one of the unused qid bits to mark
negative inodes so we don't collide on lookups after unlink (although
I don't know how often this ever happened, and some of this will be
more straightforward with all fids tracked in inodes versus dentries).

It did strike me that inodes are only unique per filesystem and I
doubt the servers guarantee uniqueness across underlying mounts - but
in principle we shouldn't have to rely on looking up inodes by
qid.path anyway... However, nothing I'm planning should make this
worse; just something to keep an eye out for.

> > - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no longer
> > used; can these be recovered? I was thinking of reusing one of
> > these to mark unlinked files, which we could do a more
> > comprehensive compare for if it's actually necessary.
>
> Since we don't support reconnects, unlinked files have been working
> just fine with servers that keep the underlying files open (e.g.
> qemu).
>
> What's your plan exactly?
>
> We don't have any way of driving protocol changes forward (and
> there's no compatibility negotiation on client connect...), but I
> recall Christian also was looking forward to some other changes, so
> we might want to try to move this forward again...

Well, those bits are unused (I think) so this shouldn't be a protocol
change. However, we can bump to a new protocol when necessary (9p2024?
it's been 23 years, why not?). Big on my list would be qid.path -> 128
bit and qid.version -> 128 bit, maybe including a filesystem uid in
the qid to differentiate underlying mounts, and maybe i_generation.
This would help with cache maintenance and uniqueness concerns. I'm
also thinking of how we might handle security differently (use kernel
TLS, public key auth integration, etc., although those might not
require protocol changes). I'm also in favor of dirread semantics for
cache modes so we retrieve attributes with the file list. What other
things should we be considering? Maybe time to start putting together
an RFC.

It's probably worth driving things like qid changes together with the
Plan 9 community versus making 9p2024 a Linux-only version, since
there are still folks using it and a larger qid space is likely a
reality on newer systems.

-eric
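
To make the "mark unlinked files in qid.type" idea concrete, a sketch
of what it could look like. P9_QTUNLINKED is invented here (aliasing
P9_QTSYMLINK's bit, one of the candidates above) and would be strictly
client-local, never sent on the wire:

---
#include <linux/types.h>
#include <net/9p/9p.h>

/* assumed: reclaim an apparently-unused qid.type bit, client-side only */
#define P9_QTUNLINKED	P9_QTSYMLINK

static inline void v9fs_qid_mark_unlinked(struct p9_qid *qid)
{
	qid->type |= P9_QTUNLINKED;
}

/*
 * The inode-cache test callback could then demand the expensive
 * stat/getattr comparison only for unlinked files, instead of
 * distrusting qid.path on every lookup.
 */
static inline bool v9fs_qid_needs_full_compare(const struct p9_qid *qid)
{
	return qid->type & P9_QTUNLINKED;
}
---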
[parent not found: <CAFkjPT=8CZHATaraBrqAZGDKjQLOp=U1gdgteJ5jpXRGJyBojQ@mail.gmail.com>]
* Re: cache fixes (redux)
From: Dominique Martinet @ 2023-12-28 0:32 UTC (permalink / raw)
To: Eric Van Hensbergen; +Cc: Christian Schoenebeck, v9fs

Eric Van Hensbergen wrote on Wed, Dec 27, 2023 at 11:39:19AM -0600:
> On Wed, Dec 27, 2023 at 2:11 AM Dominique Martinet
> <asmadeus@codewreck.org> wrote:
> > I'm not sure multi-walk is possible in practice: even opening a
> > file directly through e.g. cat /mnt/long/path/to/file will have the
> > VFS check permissions along the path, so the VFS will need to stat
> > all elements along the way.
>
> Well, the server will do perm checks (ideally on walks), so do we
> really have to? Feels like overkill to check in both places; in
> principle we expect/assume server perm checks.

The server should definitely check as well, yes.
I'm not sure it's possible to tell the linux VFS that the server
already checked though; ultimately the plan back when I was working on
9p for my previous job was to plug MPI-IO into a user-space client
library (e.g. https://github.com/martinetd/space9 -- now unmaintained)
so we wouldn't have to deal with the vfs...

But once again, if you can manage to tell it that, I'm all for it.
Some servers might not check properly, but that can/should be fixed
when it comes up.

> it did strike me that inodes are only unique per filesystem and I
> doubt the servers guarantee uniqueness across underlying mounts - but
> in principle we shouldn't have to rely on looking up inodes by
> qid.path anyway... However, nothing I'm planning should make this
> worse; just something to keep an eye out for.

We've had collisions a few times -- most "real" file systems I'm aware
of pseudo-randomize inode generation, but in particular tmpfs inode
numbering is linear per mount, so it's very easy to trigger
collisions.

qemu mixes in some bits from st_dev to avoid this:
https://gitlab.com/qemu-project/qemu/-/blob/master/hw/9pfs/9p.c?ref_type=heads#L881

> > > - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no
> > > longer used; can these be recovered? I was thinking of reusing
> > > one of these to mark unlinked files, which we could do a more
> > > comprehensive compare for if it's actually necessary.
> >
> > Since we don't support reconnects, unlinked files have been working
> > just fine with servers that keep the underlying files open (e.g.
> > qemu).
> >
> > What's your plan exactly?
>
> See above, mark fid/qids that are unlinked with a different type I
> think. I'm gonna look at how other filesystems handle this case
> before doing anything.

local filesystems don't do anything - the inode is just kept around,
all the way down to disk afaik (disk-mapped tmpfiles flush data to
disk so there must be some way to address them...)
(The last iput with nlink==0 calls evict_inode which does required
cleanup)

NFS doesn't delete the backing file but renames it - silly renames -
you've probably seen some .nfs1234 files around if you used a mount
point; the client won't let you delete them while the file is open:
rm: cannot remove '.nfs000000000000f42300000001': Device or resource busy

And it's unlinked when the file is closed (DCACHE_NFSFS_RENAMED check
in nfs_dentry_iput). That's necessary because the connection to the
server can drop at any time, and reconnecting needs to be able to open
the file again.
Servers will also scrub these files after they've been unused for a
while.

For 9p, since we don't handle reconnect, I think we're fine just not
doing anything -- servers that don't close backing files will de facto
keep the file existing through their fd. There's a process with the
file open, so we have an open fid somewhere and the inode won't get
killed either. When the last fd is closed the inode is evicted, and
we'll have all fids closed, so the server will release its fd as well
and the backing filesystem will clean up appropriately.

If we want to handle reconnect at some point (CEA/Bull did that work
but never tried to publish it...), then we ought to do silly renames
as well, but servers really need background cleanup too or it'll end
up full of .nfsxxx files...

> > We don't have any way of driving protocol changes forward (and
> > there's no compatibility negotiation on client connect...), but I
> > recall Christian also was looking forward to some other changes, so
> > we might want to try to move this forward again...
>
> Well, those bits are unused (I think) so this shouldn't be a protocol
> change.

Right, I didn't realize it was just local use -- we can do whatever we
want in memory, yes.
I'd avoid reusing the bits and define some new ones at the end of the
permissible range to keep the "can come from server" and "client mess"
separation, but I'm leaving this up to you -- ultimately, as long as
there's no conflict with what's on the wire we should be fine.

> However we can bump to a new protocol when necessary (9p2023? it's
> been 23 years, why not?). Big on my list would be qid.path -> 128 bit
> and qid.version -> 128 bit, maybe including a filesystem uid in the
> qid to differentiate underlying mounts, and maybe i_generation. This
> would help with cache maintenance and uniqueness concerns. I'm also
> thinking of how we might handle security differently (use kernel TLS,
> public key auth integration, etc., although those might not require
> protocol changes). I'm also in favor of dirread semantics for cache
> modes so we retrieve attributes with the file list. What other things
> should we be considering? Maybe time to start putting together an
> RFC.

Right; there's plenty to improve if we start...
My free time is basically rock bottom, as it has been for a few years,
so I probably won't help much, but I wouldn't push back.

-- 
Dominique Martinet | Asmadeus
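
To illustrate the silly-rename dance described above, a rough sketch
of what an NFS-style variant could look like in a 9p unlink path. 9p
does not do this today; the .9p_ name prefix and the
v9fs_rename_on_server() helper are made up for the sketch:

---
#include <linux/dcache.h>
#include <linux/fs.h>
#include <linux/namei.h>
#include <linux/random.h>

/* assumed helper: issues Trename/Trenameat to move the file aside */
static int v9fs_rename_on_server(struct inode *dir, struct dentry *old,
				 struct dentry *new);

static int v9fs_sillyrename(struct inode *dir, struct dentry *dentry)
{
	struct dentry *sdentry;
	char name[32];
	int err;

	/* pick a recognizable name, a la .nfsXXXX, that the client
	 * would then refuse to let anyone unlink while it's open */
	snprintf(name, sizeof(name), ".9p_%08x%08x",
		 (u32)d_inode(dentry)->i_ino, get_random_u32());

	/* the unlink path already holds i_rwsem on dir, as lookup needs */
	sdentry = lookup_one_len(name, dentry->d_parent, strlen(name));
	if (IS_ERR(sdentry))
		return PTR_ERR(sdentry);

	err = v9fs_rename_on_server(dir, dentry, sdentry);
	if (!err)
		d_move(dentry, sdentry);	/* keep dcache coherent */

	dput(sdentry);
	return err;
}
---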
* Re: cache fixes (redux)
From: Christian Schoenebeck @ 2023-12-28 10:44 UTC (permalink / raw)
To: Eric Van Hensbergen, Dominique Martinet; +Cc: v9fs, Greg Kurz

On Thursday, December 28, 2023 1:32:17 AM CET Dominique Martinet wrote:
> Eric Van Hensbergen wrote on Wed, Dec 27, 2023 at 11:39:19AM -0600:
> > Well, the server will do perm checks (ideally on walks), so do we
> > really have to? Feels like overkill to check in both places; in
> > principle we expect/assume server perm checks.
>
> The server should definitely check as well, yes.
> I'm not sure it's possible to tell the linux VFS that the server
> already checked though; ultimately the plan back when I was working
> on 9p for my previous job was to plug MPI-IO into a user-space client
> library (e.g. https://github.com/martinetd/space9 -- now
> unmaintained) so we wouldn't have to deal with the vfs...
>
> But once again, if you can manage to tell it that, I'm all for it.
> Some servers might not check properly, but that can/should be fixed
> when it comes up.

I would rather see this as a brave kernel option for people who really
know what they are doing, not as a potential default behaviour.

With QEMU there are two distinct behaviours: 'passthrough' permissions
vs. 'mapped(-*)' permissions:

https://wiki.qemu.org/Documentation/9psetup#Starting_the_Guest_directly

Where 'passthrough' means the guest sees the exact same permissions
for files as the host (9p server) does. In this mode you could
probably leave it to the server to do the permission checks. However,
'passthrough' mode is discouraged for security reasons, and one of the
'mapped' permissions modes is recommended instead, because we had
several security issues in the past which showed that it's not a good
idea to use 'passthrough' (whereas 'mapped' was unaffected).

With 'mapped(-*)', however, the guest should do permission checks,
because in this mode file permissions on the host are different and
separately handled between host and guest. So the host sees something
else than the guest does.

> > it did strike me that inodes are only unique per filesystem and I
> > doubt the servers guarantee uniqueness across underlying mounts -
> > but in principle we shouldn't have to rely on looking up inodes by
> > qid.path anyway... However, nothing I'm planning should make this
> > worse; just something to keep an eye out for.
>
> We've had collisions a few times -- most "real" file systems I'm
> aware of pseudo-randomize inode generation, but in particular tmpfs
> inode numbering is linear per mount, so it's very easy to trigger
> collisions.
>
> qemu mixes in some bits from st_dev to avoid this:
> https://gitlab.com/qemu-project/qemu/-/blob/master/hw/9pfs/9p.c?ref_type=heads#L881

Inode numbers are always just unique within the same file system, and
that makes sense. If a unique file ID is required system wide, then
the inode number must be combined with the file system's device ID.

And yes, that's what QEMU does (combining inode # with device ID) when
the option `multidevs=remap` is supplied: it mixes the device ID into
qid.path to avoid file ID collisions on the guest. Unfortunately the
qid.path field is currently too small to just use the device ID
directly, so a little trick is used: the so-called 'Exponential
Golomb' algorithm is used to generate a much shorter device ID that
can be put into qid.path instead.

Note though, this behaviour is actually off by default ATM,
controllable by: multidevs=remap|forbid|warn

https://wiki.qemu.org/Documentation/9psetup#Starting_the_Guest_directly

By default (multidevs=warn) QEMU would currently only print/log a
warning:

    warn_report_once(
        "9p: Multiple devices detected in same VirtFS export, "
        "which might lead to file ID collisions and severe "
        "misbehaviours on guest! You should either use a "
        "separate export for each device shared from host or "
        "use virtfs option 'multidevs=remap'!"
    );

> > > > - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no
> > > > longer used; can these be recovered? [...]
>
> [...]
>
> If we want to handle reconnect at some point (CEA/Bull did that work
> but never tried to publish it...), then we ought to do silly renames
> as well, but servers really need background cleanup too or it'll end
> up full of .nfsxxx files...
>
> [...]
>
> Right; there's plenty to improve if we start...
> My free time is basically rock bottom, as it has been for a few
> years, so I probably won't help much, but I wouldn't push back.

I've been gathering ideas for 9p protocol changes here:
https://wiki.qemu.org/Documentation/9p#Protocol_Plans

One thing that comes to my mind that's not yet on that list: server
side push notifications, e.g. to inform the guest that something
changed on a file/dir the guest currently keeps open. That might
improve consistency when changes are made both on host and guest and
caching is being used by the guest.

/Christian
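
As a rough illustration of what multidevs=remap does -- QEMU's real
code assigns variable-length device prefixes via the Exponential
Golomb coding mentioned above; the fixed 16-bit split below is a
simplification, not the actual algorithm:

---
#include <stdint.h>

#define DEV_PREFIX_BITS 16

/* fold a small per-device prefix into the top bits of qid.path so
 * inodes from different host filesystems can't collide */
static uint64_t remap_qid_path(uint16_t dev_prefix, uint64_t ino,
                               int *overflow)
{
	uint64_t ino_mask = UINT64_MAX >> DEV_PREFIX_BITS;

	/* inode numbers that don't fit under the prefix can't be
	 * remapped this way (a real implementation must handle this) */
	*overflow = (ino & ~ino_mask) != 0;

	return ((uint64_t)dev_prefix << (64 - DEV_PREFIX_BITS)) |
	       (ino & ino_mask);
}
---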
* Re: cache fixes (redux)
From: Eric Van Hensbergen @ 2023-12-28 15:07 UTC (permalink / raw)
To: Christian Schoenebeck; +Cc: v9fs, asmadeus, ericvh

On Thu, Dec 28, 2023 at 11:44:32AM +0100, Christian Schoenebeck wrote:
> On Thursday, December 28, 2023 1:32:17 AM CET Dominique Martinet wrote:
> > Eric Van Hensbergen wrote on Wed, Dec 27, 2023 at 11:39:19AM -0600:
> > > Well, the server will do perm checks (ideally on walks), so do we
> > > really have to? Feels like overkill to check in both places; in
> > > principle we expect/assume server perm checks.
> >
> > The server should definitely check as well, yes.
>
> I would rather see this as a brave kernel option for people who
> really know what they are doing, not as a potential default
> behaviour.

Likely exposed first as one of the many cache options, but yes - I was
planning on potentially guarding this initially -- although, as
Dominique says, the VFS may be doing checking regardless of what we do
in v9fs, and based on the protocol traces I was looking at before, it
seems we were likely doing duplicate checks (or at the very least
duplicate getattrs), and getting rid of those is a bug fix, not a
feature change. So I'm going to proceed carefully and will post diffs
on protocol traces.

> With QEMU there are two distinct behaviours: 'passthrough'
> permissions vs. 'mapped(-*)' permissions:
>
> https://wiki.qemu.org/Documentation/9psetup#Starting_the_Guest_directly
>
> Where 'passthrough' means the guest sees the exact same permissions
> for files as the host (9p server) does. [...]
>
> With 'mapped(-*)', however, the guest should do permission checks,
> because in this mode file permissions on the host are different and
> separately handled between host and guest. So the host sees something
> else than the guest does.

I think we've had this discussion before, but relying on the client to
do permission checks for the server sounds like a very dangerous
behavior from a security perspective. If the server is configured for
some form of mapped mode, then the server should be responsible for
checking those semantics to protect from a compromised client. This
is probably even more the case in a hosted guest environment.

In any case, the main point kicking off this thread was vfs_lookup
cases where we are doing multi-walk -- I expect VFS is still going to
do double checks on any sort of transactions on the files themselves,
so while I suppose it opens the window up to a traversal bypassing a
directory which doesn't have x permissions, that seems like a corner
case. Besides which, as per the previous paragraph, I think VFS is
going to do the checks anyway and we were just doing duplicate checks,
so it shouldn't be a problem.

> > qemu mixes in some bits from st_dev to avoid this:
> > https://gitlab.com/qemu-project/qemu/-/blob/master/hw/9pfs/9p.c?ref_type=heads#L881
>
> Inode numbers are always just unique within the same file system, and
> that makes sense. If a unique file ID is required system wide, then
> the inode number must be combined with the file system's device ID.

Ah, okay, good to know. Christian, do you remember anything about this
parallel unlink case? Is there any special handling to guard against
reused inode numbers? From the discussion this seems likely in the
tmpfs case.

> > > > > - QUESTION: it seems like P9_QTLINK and P9_QTSYMLINK are no
> > > > > longer used; can these be recovered? [...]
> > >
> > > See above, mark fid/qids that are unlinked with a different type
> > > I think. I'm gonna look at how other filesystems handle this case
> > > before doing anything.
> >
> > local filesystems don't do anything - the inode is just kept
> > around, all the way down to disk afaik (disk-mapped tmpfiles flush
> > data to disk so there must be some way to address them...)
> > (The last iput with nlink==0 calls evict_inode which does required
> > cleanup)
> >
> > NFS doesn't delete the backing file but renames it - silly renames
> > - you've probably seen some .nfs1234 files around if you used a
> > mount point [...]
> > And it's unlinked when the file is closed (DCACHE_NFSFS_RENAMED
> > check in nfs_dentry_iput). That's necessary because the connection
> > to the server can drop at any time, and reconnecting needs to be
> > able to open the file again.
> > Servers will also scrub these files after they've been unused for a
> > while.

Yeah, there's the files and then there's the inode. So if it's just a
rename, it likely kept the inode, right? I guess I should just gaze at
some VFS traces to understand the behavior and whether or not we need
to do anything from an inode/qid.path perspective.

I have publicly stated I hate the temporary file hack. It just seems
awful, but I've punted this to be a server responsibility so I don't
have to think about it ;) I get where it may cause trouble on a
reconnect, but that could be handled in different ways as well, just
with a lot more invasiveness on the server side (like holding on to
dangling fids on unexpected disconnects) -- maybe something else to
queue up for future protocol revisions.

> I've been gathering ideas for 9p protocol changes here:
> https://wiki.qemu.org/Documentation/9p#Protocol_Plans
>
> One thing that comes to my mind that's not yet on that list: server
> side push notifications, e.g. to inform the guest that something
> changed on a file/dir the guest currently keeps open. That might
> improve consistency when changes are made both on host and guest and
> caching is being used by the guest.

Yes, that list largely seems in line with mine, except for adding the
error attribute on Rread/Rwrite -- that one maybe needs some
discussion. I do agree that looking into server-push or some other
notification method would be good to add to the protocol.

-eric
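
For the "brave kernel option" guard being discussed, a sketch of the
shape it could take. The V9FS_SERVER_PERMS flag is hypothetical; on
recent kernels ->permission takes a struct mnt_idmap as its first
argument (older kernels differ):

---
#include <linux/fs.h>
#include "v9fs.h"	/* fs/9p internal: v9fs_session_info et al. */

static int v9fs_permission_sketch(struct mnt_idmap *idmap,
				  struct inode *inode, int mask)
{
	struct v9fs_session_info *v9ses = v9fs_inode2v9ses(inode);

	/* hypothetical opt-in flag: defer entirely to the server's
	 * checks on Twalk/Topen instead of re-checking locally */
	if (v9ses->flags & V9FS_SERVER_PERMS)
		return 0;

	return generic_permission(idmap, inode, mask);
}
---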
* Re: cache fixes (redux)
From: asmadeus @ 2023-12-28 21:48 UTC (permalink / raw)
To: Eric Van Hensbergen; +Cc: Christian Schoenebeck, v9fs, ericvh

Eric Van Hensbergen wrote on Thu, Dec 28, 2023 at 09:07:46AM -0600:
> > > qemu mixes in some bits from st_dev to avoid this:
> > > https://gitlab.com/qemu-project/qemu/-/blob/master/hw/9pfs/9p.c?ref_type=heads#L881
> >
> > Inode numbers are always just unique within the same file system,
> > and that makes sense. If a unique file ID is required system wide,
> > then the inode number must be combined with the file system's
> > device ID.
>
> Ah, okay, good to know. Christian, do you remember anything about
> this parallel unlink case? Is there any special handling to guard
> against reused inode numbers? From the discussion this seems likely
> in the tmpfs case.

Just to clarify: I've quoted tmpfs as a way of manually generating
duplicates, but it still needs multiple mount points (a single tmpfs
will still ensure uniqueness), e.g.

---
# mkdir dup
# cd dup
# mkdir 1 2
# mount -t tmpfs tmpfs 1
# mount -t tmpfs tmpfs 2
# echo one > 1/one
# echo second > 2/second
# ls -li 1 2
1:
total 4
2 -rw-r--r--. 1 root root 4 Dec 29 06:10 one

2:
total 4
2 -rw-r--r--. 1 root root 7 Dec 29 06:10 second
---

Then export that 'dup' directory without remapping to see how it
behaves; you'll find some weird behaviours with cached modes.

I don't think we can do anything if the server sends us identical
inodes; that's the server's job.

(iiuc, from a networked filesystem's point of view (e.g. nfs), we
ought to check i_ino + i_generation, as an inode number can be reused
after the previous owner of the number was deleted, but we're
guaranteed the couple changes... At any given point though i_ino is
unique, so we've been relying on just that, for two reasons: no room
to send i_generation, and the field isn't easily obtainable from
userspace (you need to dig through the bytes of name_to_handle_at,
which is file system dependent). So we only rely on the inode number
-- if the field gets bigger it should be variable length, and we
should write the whole mnt_id + name_to_handle_at content for a
stable, truly unique handle. That's what userspace NFS servers do.)

> Yeah, there's the files and then there's the inode. So if it's just
> a rename, it likely kept the inode, right? I guess I should just
> gaze at some VFS traces to understand the behavior and whether or not
> we need to do anything from an inode/qid.path perspective.

Yes, renames won't change the inode at all (ls -i will keep the same
number, and all props will be identical).

In linux VFS speak, there's:

- dentries, what you called files?, pointers to an inode from
directories; that's what gets updated on renames; there can be
multiple dentries for a single inode (hardlinks for regular files).
We keep some fids in there for directories, to walk from there as
needed (one per uid)

- inode numbers, that identify a given ""file"" (common speech file,
actual data); it must not change for the lifetime of the file and
there should be no collision

- the inode struct, e.g. "file" information (size, various times,
i_version..) that gets updated when the file is modified (rename
shouldn't touch these either, but writes will);
I guess with what you said there can be multiple inode structs alive
for a given "file"? But that sounds like a bug to me; we should always
reuse, as it's also what governs the vfs cache data: at best you'll be
fetching stats twice, at worst you'll end up with two mmaps in
parallel that don't get updated the same...

- the file struct, that'll back an fd somewhere (there can be multiple
fds per file with dup and friends); we keep the open 'fid' information
around in the file's private data.

(I'm not 100% sure how process cwd is handled -- probably just a ref
on the dentry? When opening a file we need to figure it out from the
dentry, so get the parent's dentry and fids from there and use that
parent dentry's cached fid, or build it up -- I guess if there's no
cached fid here we could do a multi-level walk, but my understanding
was that there should always be something cached at that point...
Also, for something like ls you'll stat all files in a directory, so
if there wasn't a cached fid for the parent's dentry you'd do a lot of
multi-level walks, where caching the parent's dentry sounds more
efficient?)

So to recap: there should be only one inode, then as many dentries as
there are paths to it, and as many files as there are "open handles"
to it.

In cacheless modes we should drop dentries as soon as no file/process
uses them, and drop inodes as soon as there are no dentries left.
In cached mode we should have an extra ref to dentries (droppable on
memory pressure somehow -- we're not doing that correctly right now
afaik), that'll keep the inode alive, so we should try very hard to
always reuse the same inodes/dentries to avoid needless lookups.

> I have publicly stated I hate the temporary file hack. It just seems
> awful, but I've punted this to be a server responsibility so I don't
> have to think about it ;) I get where it may cause trouble on a
> reconnect, but that could be handled in different ways as well, just
> with a lot more invasiveness on the server side (like holding on to
> dangling fids on unexpected disconnects) -- maybe something else to
> queue up for future protocol revisions.

Well, there's reconnect (connection troubles), and there's reconnect
(server restart, or migration) -- in nfsv4 the server is expected to
have some persistent state, so e.g. file leases will persist, so I
guess it would be possible to also store information for tmpfiles
there too, but at the very least the server needs to be able to
re-open the file, so it needs to be kept around...
(speaking of which, I guess qemu live migration doesn't work with 9p
mounts? or maybe just not tmpfiles... Can't say I tried...)

Anyway, we don't need to work on reconnect here; that's distinct
enough from the problems you've seen. Let's not worry about it at this
point, and we can bring it back up later if/when someone has energy
for it :)

-- 
Dominique Martinet | Asmadeus
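
As an aside, the VFS mechanism that enforces "only one inode" is the
icache lookup; roughly the iget5_locked() shape 9p needs, with
test/set callbacks comparing the full qid rather than just i_ino.
Names here are illustrative -- the real callbacks live in
fs/9p/vfs_inode.c:

---
#include <linux/fs.h>
#include <net/9p/9p.h>

/* illustrative fs-private inode wrapper stashing the server qid */
struct v9fs_inode_sketch {
	struct p9_qid qid;
	struct inode vfs_inode;
};

static int v9fs_test_inode_sketch(struct inode *inode, void *data)
{
	struct p9_qid *qid = data;
	struct v9fs_inode_sketch *v =
		container_of(inode, struct v9fs_inode_sketch, vfs_inode);

	/* full qid compare, so a colliding i_ino from another export
	 * or a recycled path with a stale version isn't wrongly reused */
	return v->qid.path == qid->path &&
	       v->qid.version == qid->version &&
	       v->qid.type == qid->type;
}

static int v9fs_set_inode_sketch(struct inode *inode, void *data)
{
	struct v9fs_inode_sketch *v =
		container_of(inode, struct v9fs_inode_sketch, vfs_inode);

	v->qid = *(struct p9_qid *)data;
	return 0;
}

static struct inode *v9fs_iget_sketch(struct super_block *sb,
				      struct p9_qid *qid)
{
	/* hash on qid.path; test() disambiguates genuine collisions */
	return iget5_locked(sb, (unsigned long)qid->path,
			    v9fs_test_inode_sketch,
			    v9fs_set_inode_sketch, qid);
}
---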
* Re: cache fixes (redux)
From: Christian Schoenebeck @ 2023-12-29 12:22 UTC (permalink / raw)
To: Eric Van Hensbergen, asmadeus; +Cc: v9fs, ericvh

On Thursday, December 28, 2023 10:48:23 PM CET asmadeus@codewreck.org wrote:
> Eric Van Hensbergen wrote on Thu, Dec 28, 2023 at 09:07:46AM -0600:
> > Ah, okay, good to know. Christian, do you remember anything about
> > this parallel unlink case? Is there any special handling to guard
> > against reused inode numbers? From the discussion this seems likely
> > in the tmpfs case.
>
> Just to clarify: I've quoted tmpfs as a way of manually generating
> duplicates, but it still needs multiple mount points (a single tmpfs
> will still ensure uniqueness), e.g.
>
> ---
> # mkdir dup
> # cd dup
> # mkdir 1 2
> # mount -t tmpfs tmpfs 1
> # mount -t tmpfs tmpfs 2
> # echo one > 1/one
> # echo second > 2/second
> # ls -li 1 2
> 1:
> total 4
> 2 -rw-r--r--. 1 root root 4 Dec 29 06:10 one
>
> 2:
> total 4
> 2 -rw-r--r--. 1 root root 7 Dec 29 06:10 second
> ---

Which is the expected, correct behaviour, as you can still distinguish
the two by their device IDs:

# stat 1/one
  File: 1/one
  Size: 4         Blocks: 8          IO Block: 4096   regular file
Device: 2dh/45d   Inode: 2           Links: 1
...
# stat 2/second
  File: 2/second
  Size: 7         Blocks: 8          IO Block: 4096   regular file
Device: 2eh/46d   Inode: 2           Links: 1
...

> Then export that 'dup' directory without remapping to see how it
> behaves; you'll find some weird behaviours with cached modes.
>
> I don't think we can do anything if the server sends us identical
> inodes; that's the server's job.

Right. BTW, even if the qid.path length is increased as part of a 9p
protocol change, it would still be tricky on the client implementation
side, as that number would be too big to simply be exposed as the
inode number by the client. IIRC virtiofsd is handling this by
automatically creating separate devices when needed. I haven't checked
how exactly, e.g. whether there is some easy way to create
"subdevices" with Linux.

> (iiuc, from a networked filesystem's point of view (e.g. nfs), we
> ought to check i_ino + i_generation, as an inode number can be reused
> after the previous owner of the number was deleted, but we're
> guaranteed the couple changes... At any given point though i_ino is
> unique, so we've been relying on just that, for two reasons: no room
> to send i_generation, and the field isn't easily obtainable from
> userspace (you need to dig through the bytes of name_to_handle_at,
> which is file system dependent). So we only rely on the inode number
> -- if the field gets bigger it should be variable length, and we
> should write the whole mnt_id + name_to_handle_at content for a
> stable, truly unique handle. That's what userspace NFS servers do.)

Why is that? The inode number is a serial number which is
consecutively incremented:

# ls -li one
2 -rw-r--r-- 1 me me 4 Dec 29 12:42 one
# rm one
# echo one > one
# ls -li one
3 -rw-r--r-- 1 me me 4 Dec 29 12:55 one

So I would not expect an inode number to be reused before the
consecutive counter wraps at 2^64. And then, this discussion is about
files still being open, so the host's file system can't recycle the
inode number unless the 9p client closes the corresponding old FID.

> In linux VFS speak, there's:
> [...]
> In cacheless modes we should drop dentries as soon as no file/process
> uses them, and drop inodes as soon as there are no dentries left.
> In cached mode we should have an extra ref to dentries (droppable on
> memory pressure somehow -- we're not doing that correctly right now
> afaik), that'll keep the inode alive, so we should try very hard to
> always reuse the same inodes/dentries to avoid needless lookups.

Yes, the cache currently just grows indefinitely. The question is what
kind of metric shall be used to decide when to start dropping old
things from the cache.

> Well, there's reconnect (connection troubles), and there's reconnect
> (server restart, or migration) [...]
> (speaking of which, I guess qemu live migration doesn't work with 9p
> mounts? or maybe just not tmpfiles... Can't say I tried...)

Right, QEMU live-migration doesn't work with 9p ATM, and probably
won't in the near future.

> Anyway, we don't need to work on reconnect here; that's distinct
> enough from the problems you've seen. Let's not worry about it at
> this point, and we can bring it back up later if/when someone has
> energy for it :)
* Re: cache fixes (redux)
From: asmadeus @ 2023-12-29 12:50 UTC (permalink / raw)
To: Christian Schoenebeck; +Cc: Eric Van Hensbergen, v9fs, ericvh

Christian Schoenebeck wrote on Fri, Dec 29, 2023 at 01:22:45PM +0100:
> Which is the expected, correct behaviour, as you can still
> distinguish the two by their device IDs:

(Yes, just meant to say one can produce collisions easily if they
want)

> BTW, even if the qid.path length is increased as part of a 9p
> protocol change, it would still be tricky on the client
> implementation side, as that number would be too big to simply be
> exposed as the inode number by the client. IIRC virtiofsd is handling
> this by automatically creating separate devices when needed. I
> haven't checked how exactly, e.g. whether there is some easy way to
> create "subdevices" with Linux.

NFS also automatically creates a new mountpoint at junctions, so it's
definitely possible (fs/nfs/namespace.c):

/*
 * nfs_d_automount - Handle crossing a mountpoint on the server
 * @path - The mountpoint
 *
 * When we encounter a mountpoint on the server, we want to set up
 * a mountpoint on the client too, to prevent inode numbers from
 * colliding, and to allow "df" to work properly.
 * On NFSv4, we also want to allow for the fact that different
 * filesystems may be migrated to different servers in a failover
 * situation, and that different filesystems may want to use
 * different security flavours.
 */

It's not strictly needed though; in practice the vfs can handle
duplicate i_ino as long as appropriate functions are used for iget
(iget5_locked in fs/9p/vfs_inode.c); although now I'm thinking about
it, that won't work for programs like tar that'll notice the inode
number is identical and consider the second file to be a hard link of
the first...

I guess we could sweep it under the rug by making the inode number a
hash of everything in qid.path, and pretend collisions never happen™?

> > (iiuc, from a networked filesystem's point of view (e.g. nfs), we
> > ought to check i_ino + i_generation, as an inode number can be
> > reused after the previous owner of the number was deleted, but
> > we're guaranteed the couple changes... [...])
>
> Why is that? The inode number is a serial number which is
> consecutively incremented:

It's serial for tmpfs, but that's an implementation detail. ext4 and
xfs will both allocate through some larger stride.

... Actually, I didn't think it'd be this easy to reproduce, but ext4
reliably recycles the same inodes after unlink:

# touch a b c
# ls -li a b c
357 -rw-r--r--. 1 root root 0 Dec 29 21:39 a
365 -rw-r--r--. 1 root root 0 Dec 29 21:39 b
366 -rw-r--r--. 1 root root 0 Dec 29 21:39 c
# rm -f a b c
# touch new
# ls -li new
357 -rw-r--r--. 1 root root 0 Dec 29 21:40 new
# rm -f new
# touch again
# ls -li again
357 -rw-r--r--. 1 root root 0 Dec 29 21:40 again

(works even if I actually write something into the files and sync
them)

> So I would not expect an inode number to be reused before the
> consecutive counter wraps at 2^64. And then, this discussion is about
> files still being open, so the host's file system can't recycle the
> inode number unless the 9p client closes the corresponding old FID.

If the file's still open, it's not removed from the fs -- we're
guaranteed the inode won't come back yet.
For a networked file system, though, I was thinking of cache more than
open files; e.g. in the previous example a 9p client might think 'new'
and 'again' are the same file.

I guess it doesn't really matter since we have no cache invalidation /
refresh logic...

> > In cacheless modes we should drop dentries as soon as no
> > file/process uses them, and drop inodes as soon as there are no
> > dentries left.
> > In cached mode we should have an extra ref to dentries (droppable
> > on memory pressure somehow -- we're not doing that correctly right
> > now afaik), that'll keep the inode alive, so we should try very
> > hard to always reuse the same inodes/dentries to avoid needless
> > lookups.
>
> Yes, the cache currently just grows indefinitely. The question is
> what kind of metric shall be used to decide when to start dropping
> old things from the cache.

My understanding is that the vfs already can keep track of this for
us. Look at e.g. prune_icache_sb in fs/inode.c
It's called when there's memory pressure, and it'll call the inode
op's evict_inode when actually freeing things, so we could close any
still-remembered fid there, and I think it'd work out in general...
Except that we keep an explicit ref, so it doesn't have anything to
free iirc. Well, it's more stuff that'd need experimenting with...

-- 
Dominique Martinet | Asmadeus
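
A minimal sketch of what that evict hook could look like -- the
truncate/clear calls are the standard VFS eviction steps; the fid
stash in i_private and its release are the hypothetical part:

---
#include <linux/fs.h>
#include <linux/mm.h>
#include <net/9p/client.h>

/* assumed stash: a still-remembered fid kept in inode->i_private */
static inline struct p9_fid *v9fs_inode_cached_fid(struct inode *inode)
{
	return inode->i_private;
}

static void v9fs_evict_inode_sketch(struct inode *inode)
{
	struct p9_fid *fid = v9fs_inode_cached_fid(inode);

	truncate_inode_pages_final(&inode->i_data);
	clear_inode(inode);

	/* clunks the fid once its refcount drops to zero (older
	 * kernels would call p9_client_clunk directly), letting the
	 * server release its backing fd */
	if (fid)
		p9_fid_put(fid);
}
---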
* Re: cache fixes (redux)
From: Eric Van Hensbergen @ 2023-12-29 16:16 UTC (permalink / raw)
To: asmadeus; +Cc: Christian Schoenebeck, v9fs, ericvh

On Fri, Dec 29, 2023 at 09:50:39PM +0900, asmadeus@codewreck.org wrote:
> It's not strictly needed though; in practice the vfs can handle
> duplicate i_ino as long as appropriate functions are used for iget
> (iget5_locked in fs/9p/vfs_inode.c); although now I'm thinking about
> it, that won't work for programs like tar that'll notice the inode
> number is identical and consider the second file to be a hard link of
> the first...

This is the problem, though: the iget functions use lookup functions
which right now compare a bunch of state, requiring a getattr to
validate. This is one of the things I was trying to eliminate. The
other option would be putting some of these additional bits (device,
i_generation, etc.) in the qid.

> I guess we could sweep it under the rug by making the inode number a
> hash of everything in qid.path, and pretend collisions never happen™?

Feels wrong; hashes always have collisions and it makes me
uncomfortable. I know some of the other on-disk file systems have
moved to 128-bit inodes; does that help, or just push out the problem
(or make it less likely, I suppose)? The combo of qid.path and
qid.version is probably a pretty good mix; if qid.version is some hash
on time then the likelihood of collision goes way down (but notably
isn't zero..), and servers don't always populate qid.version, but
that's something to fix server side.

> ... Actually, I didn't think it'd be this easy to reproduce, but ext4
> reliably recycles the same inodes after unlink:
>
> # touch a b c
> # ls -li a b c
> 357 -rw-r--r--. 1 root root 0 Dec 29 21:39 a
> 365 -rw-r--r--. 1 root root 0 Dec 29 21:39 b
> 366 -rw-r--r--. 1 root root 0 Dec 29 21:39 c
> # rm -f a b c
> # touch new
> # ls -li new
> 357 -rw-r--r--. 1 root root 0 Dec 29 21:40 new
> # rm -f new
> # touch again
> # ls -li again
> 357 -rw-r--r--. 1 root root 0 Dec 29 21:40 again
>
> (works even if I actually write something into the files and sync
> them)
> [...]
> If the file's still open, it's not removed from the fs -- we're
> guaranteed the inode won't come back yet.
> For a networked file system, though, I was thinking of cache more
> than open files; e.g. in the previous example a 9p client might think
> 'new' and 'again' are the same file.

So this is just a cache race problem, potentially. The real concern
being that the cache is out of date with the server and then we get an
inode collision. But that's probably likely for other reasons with a
loose cache -- I guess the question is how to handle it for a tight
cache. qid.path+qid.vers may be enough - but races are still possible.

> I guess it doesn't really matter since we have no cache invalidation
> / refresh logic...

Well, we have logic - it's just always valid or always invalid. The
real question is how we want it to work in tight mode: is the qid
valid, or do we always need to do a stat/getattr?

-eric
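
For reference, the qid.path+qid.version folding being debated might
look something like this -- a sketch, not existing code, and as noted
above any such hash only makes collisions less likely, not impossible:

---
#include <linux/hash.h>
#include <net/9p/9p.h>

static inline unsigned long v9fs_qid_to_ino_sketch(const struct p9_qid *qid)
{
	/* mix version into the high bits so a recycled path with a
	 * fresh version no longer maps to the same inode number */
	u64 mixed = qid->path ^ ((u64)qid->version << 32);

	return (unsigned long)hash_64(mixed, 8 * sizeof(unsigned long));
}
---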
* Re: cache fixes (redux)
2023-12-29 12:22 ` Christian Schoenebeck
2023-12-29 12:50 ` asmadeus
@ 2023-12-29 16:06 ` Eric Van Hensbergen
1 sibling, 0 replies; 18+ messages in thread
From: Eric Van Hensbergen @ 2023-12-29 16:06 UTC (permalink / raw)
To: Christian Schoenebeck; +Cc: asmadeus, v9fs, ericvh

On Fri, Dec 29, 2023 at 01:22:45PM +0100, Christian Schoenebeck wrote:
> On Thursday, December 28, 2023 10:48:23 PM CET asmadeus@codewreck.org wrote:
> > Eric Van Hensbergen wrote on Thu, Dec 28, 2023 at 09:07:46AM -0600:
>
> BTW even if qid.path length is increased as part of a 9p protocol change, it
> would still be tricky for client implementation side, as that number is too
> big as simply being exposed as inode number by client. IIRC virtiofsd is
> handling this by automatically creating separate devices when needed. I
> haven't checked how exactly, whether there is e.g. some easy way to create
> "subdevices" with Linux.

I'll have a look at this and see if there is something we can do that
is similar with different mount points.

> > (iiuc from a networked filesystem's point of view (e.g. nfs), we ought
> > to check i_ino + i_generation as inode number can be reused after the
> > previous owner of the number was deleted, but we're guaranteed the
> > couple change...

Current iget validation logic checks i_generation from getattr.  This
was one of the things I wasn't sure we needed, and it would definitely
mess with qid lookup on our side.  This was why I was thinking that if
we marked unlinked inodes in some way on the client side we'd be less
likely to hit a collision; but if the inode isn't really released on
the server side (because it's open), maybe this doesn't happen and we
only have to worry about stale inodes in the client cache that haven't
been properly released.  But is that ever a valid case?  (It may be
happening because the cache code is broken, but that seems like a
different case -- maybe if we see an inode lose its last link and it's
not currently open we just garbage collect it?  Don't we do that
already?  I can see where this might screw up if we have multiple
instances of inode structs around, which the current code seems to
gesture towards, but I think it's a bug.  I may write up some auditing
checks to look for dup inode structs and other things as part of
debug.)

> > In cacheless modes we should drop dentries as soon as no file/process
> > use it, and drop inodes as soon as there's no dentries left.
> > In cached mode we should have an extra ref to dentries (droppable on
> > memory pressure somehow -- we're not doing that correctly right now
> > afaik), that'll keep the inode alive, so we should try very hard to
> > always reuse the same inodes/dentries to avoid needless lookups.
>
> Yes, the cache currently just grows indefinitely. The question is what kind of
> metric shall be used to decide when to start dropping old things from the
> cache.

Some of this is to do with the fid/dentry handling, though.  Fids stay
in loose mode, which means dentries stay, which means inodes stay.  I
think we can be a bit more aggressive with transient fid recovery and
decouple them from dentries by attaching them to inodes, so we can
aggressively clean up the dcache if we want to without impacting
things -- this will mean fewer interstitial fids; transient fids for
cwd-like references will still exist, but we'll have one instead of
one per path element.

I'll probably keep aggressive dentry recovery as a secondary change
set since it'll break a bunch of existing assumptions, and I think
that was what was giving me trouble in my rewrite branch.

> > Well there's reconnect (connection troubles), and there's reconnect
> > (server restart, or migration) -- in nfsv4 the server is expected to ...
>
> Right, QEMU live-migration doesn't work with 9p ATM, and probably won't in
> near future.

Yeah, I think there's a whole kettle of fish here.  I'm interested in
tilting at this windmill to support simulation checkpoint/restart
cases, but I agree it's a hard problem, so I will hold off until this
other stuff is sorted.

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
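
One possible shape for the fid-to-inode attachment Eric describes
above -- a sketch only: the cached_fid field is hypothetical, while
p9_fid_put() is the real refcount helper that clunks the fid once the
last reference drops.

---
/* Hedged sketch: hang one cached fid off the inode (cached_fid is a
 * hypothetical field) and clunk it at eviction time instead of on
 * dentry release. */
static void v9fs_evict_inode_sketch(struct inode *inode)
{
	struct v9fs_inode *v9inode = V9FS_I(inode);
	struct p9_fid *fid = xchg(&v9inode->cached_fid, NULL);

	if (fid)
		p9_fid_put(fid);	/* clunks once the refcount hits zero */
	clear_inode(inode);
}
---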
* Re: cache fixes (redux)
2023-12-28 21:48 ` asmadeus
2023-12-29 12:22 ` Christian Schoenebeck
@ 2023-12-29 15:50 ` Eric Van Hensbergen
2024-01-03 18:09 ` cache fixes (redux) (a tale of many inode numbers...) Eric Van Hensbergen
2024-01-03 20:24 ` cache fixes (redux) asmadeus
1 sibling, 2 replies; 18+ messages in thread
From: Eric Van Hensbergen @ 2023-12-29 15:50 UTC (permalink / raw)
To: asmadeus; +Cc: Christian Schoenebeck, v9fs, ericvh

On Fri, Dec 29, 2023 at 06:48:23AM +0900, asmadeus@codewreck.org wrote:
> Eric Van Hensbergen wrote on Thu, Dec 28, 2023 at 09:07:46AM -0600:
>
> - the file struct, that'll back a fd somewhere (can be multiple fd per
> file with dup and friends); we keep the open 'fid' information around in
> the file's private data.
>
> (I'm not 100% sure how process cwd is handled -- probably just a ref on
> the dentry?  When opening a file we need to figure it out from the
> dentry, so get the parent's dentry and fids from there

This was why I originally hooked transient fids up to dentries: it
seemed like that was what was holding the open reference.  Of course,
every dentry points to an inode, so either works for these purposes,
but I wanted to clunk transient fids on dentry release versus waiting
for inode release (which can take a while, particularly when caching).
In the cache rewrite I was looking at moving all fid management to the
inode, but clunking transient fids when inodes lose all dentry
references.  I need to go back over the patches, but I think I still
had a clever way of linking transient fids to particular dentries so
that the tracking worked as expected.

> that parent's dentry cached fid, or build it up -- I guess if there's no
> cache file here we could do multi-level walk, but my understanding was
> that there should always be something cached at that point...

The concern I have in the back of my head is a detached hierarchy,
where we have a reference to the directory at the top of that
hierarchy but not all the way back to the mount point.  In that
situation we can't multiwalk all the way from root, which was the
logic behind the step-by-step d_parent walk until we find a fid.  But
I think that's the exception case that has become the only case
because of the structure of the code.  I think if we can't find a
usable fid off of the d_parent, we try multi-walk from root, and if
that fails we do the step-by-step.  To be clear, I'm not even sure how
often this would ever happen (when we don't have a valid d_parent with
a fid) -- I don't think it happens at all in practice in cache mode,
but we are hanging on to way too many fids in cache mode, so maybe it
should occur more than it does....

> something like ls you'll stat all files in a directory, so if there
> wasn't a cached fid for the parent's dentry you'd do a lot of
> multi-level walks, where caching the parent's dentry sounds more
> efficient?)

Yes, and I think this is what Ron was seeing recently -- although that
was without any cache mode.

> So to recap there should only be one inode, then as many dentries as
> there are paths to it, and as many files as there are "open handles" to
> it
>
> In cacheless modes we should drop dentries as soon as no file/process
> use it, and drop inodes as soon as there's no dentries left.
> In cached mode we should have an extra ref to dentries (droppable on
> memory pressure somehow -- we're not doing that correctly right now
> afaik), that'll keep the inode alive, so we should try very hard to
> always reuse the same inodes/dentries to avoid needless lookups.

I'm not convinced of the second part; I think we don't want to hold
the extra ref to dentries because that causes fid explosion.  Inodes
can be kept alive without a dentry (but we do need to be able to look
up an inode by i_ino / qid.path -- which we are able to do).

There's another thing which has bugged me looking through the code:
there's lots of code (lookup, create, open, link, etc.) which passes
inode *parent that we only use for session information, but really
that should be fine as a start point for finding the parent fid versus
doing a d_parent traversal.

Then for refresh logic we have inode_revalidate, which can do
different things depending on cache policy:
- cacheless: release immediately, clunk all fids (this assumes the
  mountpoint inode will have a reference as long as we are mounted,
  but I think that is correct)
- data/dir cache: same semantics as now, inode holds data cache, all good
- policy differences (which will trigger on inode_revalidate/refresh):
  - none: always call walk/getattr to revalidate the inode from the
    server (the combo of walk/getattr may be a good new compound
    proto element)
  - loose: same as current semantics, keep around forever if we can,
    always assume the client-side cache is correct.
  - tight: use qid.version from walk/re-walk to validate, and getattr
    if stale.
  - temporal: cache times out after a mount-configured timeo and we
    call getattr
  - push (future): use a subscription method to get updates on cache
    invalidations from the server.

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
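
The policy list above maps naturally onto a dispatch in the revalidate
path.  A sketch under invented names: the V9FS_CACHE_* constants and
the three v9fs_* helpers below are assumptions, not current fs/9p code.

---
static int v9fs_inode_revalidate_sketch(struct inode *inode, int cache_mode)
{
	switch (cache_mode) {
	case V9FS_CACHE_NONE:
		/* always refresh from the server (walk + getattr) */
		return v9fs_refresh_inode_from_server(inode);
	case V9FS_CACHE_LOOSE:
		/* trust the client-side cache unconditionally */
		return 0;
	case V9FS_CACHE_TIGHT:
		/* re-walk, compare qid.version, getattr only if stale */
		return v9fs_check_qid_version(inode);
	case V9FS_CACHE_TEMPORAL:
		/* refresh only once the mount-configured timeo has expired */
		return v9fs_cache_expired(inode) ?
			v9fs_refresh_inode_from_server(inode) : 0;
	default:
		return -EINVAL;
	}
}
---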
* Re: cache fixes (redux) (a tale of many inode numbers...)
2023-12-29 15:50 ` Eric Van Hensbergen
@ 2024-01-03 18:09 ` Eric Van Hensbergen
0 siblings, 0 replies; 18+ messages in thread
From: Eric Van Hensbergen @ 2024-01-03 18:09 UTC (permalink / raw)
To: asmadeus; +Cc: Christian Schoenebeck, v9fs, ericvh

Okay, been delving through this code.  Most (all?) was added by Aneesh
when he did the cache stuff, and I clearly wasn't paying close
attention a decade ago (mostly because I didn't care about the cache
then).  In any case, I think there is a ton of code I am going to be
able to delete (patches coming soon), but I wanted to sanity check one
thing with you folks.

Looking at the inode get code (which also doubles as inode creation
code): the inodes are hashed with qid2ino(qid), which I get to take
the blame for because it's been there for 18 years.  If the size of
ino_t is 64-bit, it just assigns qid->path+2 (+2?) to the i_ino.  If
not (presumably for 32-bit), it sets the inode number to
(qid->path+2) ^ ((qid->path+2) >> 32).  If ino_t goes to 64-bit this
means we reduce precision for no reason, so that has to be fixed --
but I'm trying to remember the relevance of +2, which dates back all
the way to the original submission.  I don't remember why we did this,
but I think I'm gonna kill that in one patch and in a second patch fix
the length check.

Moving up the stack a bit: we use the modified qid.path to hash the
inode, but then we also assign it (overriding the allocated inode
number on the client).  I'm gonna say this is wrong (definitely wrong
in the case where we are allocating inodes willy-nilly), and we should
probably continue to use qid.path to hash but keep the client-side
inode number, which is guaranteed unique on allocation.  Or do you
folks think we should align the client-side inode with the server-side
inode to help with debug?  We should still have consistent qid.paths
regardless.  Thoughts?

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
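
For reference, the folding described in the message above reconstructs
to roughly the following; this is a sketch from that description, not
a verbatim copy of the in-tree qid2ino().

---
static unsigned long qid2ino_sketch(const struct p9_qid *qid)
{
	u64 path = qid->path + 2;	/* the '+2' offset being questioned */

	if (sizeof(ino_t) == sizeof(u64))
		return (unsigned long)path;	/* 64-bit ino_t: use as-is */
	/* 32-bit ino_t: fold the high half into the low half */
	return (unsigned long)(path ^ (path >> 32));
}
---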
* Re: cache fixes (redux)
2023-12-29 15:50 ` Eric Van Hensbergen
2024-01-03 18:09 ` cache fixes (redux) (a tale of many inode numbers...) Eric Van Hensbergen
@ 2024-01-03 20:24 ` asmadeus
[not found] ` <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>
2024-01-03 22:48 ` Eric Van Hensbergen
1 sibling, 2 replies; 18+ messages in thread
From: asmadeus @ 2024-01-03 20:24 UTC (permalink / raw)
To: Eric Van Hensbergen; +Cc: Christian Schoenebeck, v9fs, ericvh

Eric Van Hensbergen wrote on Fri, Dec 29, 2023 at 09:50:07AM -0600:
> > So to recap there should only be one inode, then as many dentries as
> > there are paths to it, and as many files as there are "open handles" to
> > it
> >
> > In cacheless modes we should drop dentries as soon as no file/process
> > use it, and drop inodes as soon as there's no dentries left.
> > In cached mode we should have an extra ref to dentries (droppable on
> > memory pressure somehow -- we're not doing that correctly right now
> > afaik), that'll keep the inode alive, so we should try very hard to
> > always reuse the same inodes/dentries to avoid needless lookups.
>
> I'm not convinced of the second part; I think we don't want to hold
> the extra ref to dentries because that causes fid explosion.

(Note: no longer relevant if we get rid of fid lookup by dentry.)

I might have used the wrong word -- we don't want to prevent the vfs
from cleaning up if there's cache pressure, but we need to
differentiate cacheless (drop the inode as soon as it's no longer
used) and cached (keep it around while it's here).

It definitely causes the total number of open fids to rise (which can
be a problem depending on the server); I guess we could count how many
there are and have a watermark to start reclaiming if too many are
around, as an option, but I've sort of given up on that at this
point...

> There's another thing which has bugged me looking through the code:
> there's lots of code (lookup, create, open, link, etc.) which passes
> inode *parent that we only use for session information, but really
> that should be fine as a start point for finding the parent fid versus
> doing a d_parent traversal.

That makes sense; I sort of assumed we didn't have the parent's inode,
but looking back we seem to always have an inode *dir..
Would be great to get rid of this dentry fid cache.
(Note the dentry will still be kept around by the vfs anyway, but we
don't care about that anymore if it doesn't hold fids.)

> Then for refresh logic we have inode_revalidate, which can do
> different things depending on cache policy:
> - cacheless: release immediately, clunk all fids (this assumes the
>   mountpoint inode will have a reference as long as we are mounted,
>   but I think that is correct)
> - data/dir cache: same semantics as now, inode holds data cache, all good
> - policy differences (which will trigger on inode_revalidate/refresh):
>   - none: always call walk/getattr to revalidate the inode from the
>     server (the combo of walk/getattr may be a good new compound
>     proto element)
>   - loose: same as current semantics, keep around forever if we can,
>     always assume the client-side cache is correct.
>   - tight: use qid.version from walk/re-walk to validate, and getattr
>     if stale.
>   - temporal: cache times out after a mount-configured timeo and we
>     call getattr
>   - push (future): use a subscription method to get updates on cache
>     invalidations from the server.

Sounds good.

Also, I need to correct one point I made earlier in this thread: I had
assumed creat+unlink worked, but something as simple as fstat on the
unlinked file doesn't work in cacheless mode:

---
#include <fcntl.h>
#include <stdio.h>
#include <sys/stat.h>
#include <unistd.h>

int main(int argc, char *argv[])
{
	int fd;
	struct stat statbuf;

	fd = open("test_unlink_stat", O_CREAT|O_RDWR, 0600);
	if (fd < 0) {
		perror("open");
		return 1;
	}
	if (unlink("test_unlink_stat") < 0) {
		perror("unlink");
		return 2;
	}
	write(fd, "data\n", 5);
	if (fstat(fd, &statbuf) < 0) {
		perror("fstat");
		return 3;
	}
	printf("stat ok, size %zd\n", (ssize_t)statbuf.st_size);
	return 0;
}
---

Perhaps that'll just start working with your fixes, but either way
it's already not working cacheless, so I don't think you need to worry
about it at this point...
(I noticed because I finally took the time to boot a debian on 9p
rootfs, and apt breaks without this -- it works fine with caches.)

Eric Van Hensbergen wrote on Wed, Jan 03, 2024 at 12:09:45PM -0600:
> Okay, been delving through this code.  Most (all?) was added by Aneesh
> when he did the cache stuff, and I clearly wasn't paying close
> attention a decade ago (mostly because I didn't care about the cache
> then).  In any case, I think there is a ton of code I am going to be
> able to delete (patches coming soon), but I wanted to sanity check one
> thing with you folks.

Good work!

> Looking at the inode get code (which also doubles as inode creation
> code): the inodes are hashed with qid2ino(qid), which I get to take
> the blame for because it's been there for 18 years.
> If the size of ino_t is 64-bit, it just assigns qid->path+2 (+2?)

Hm, inode 1 is definitely usable (a tmpfs' root directory uses it),
but perhaps 0 would cause problems?  In which case +1 makes sense if
qid->path can be 0, or +2 if it could be -1 (I wouldn't put it past
some servers); but I can only guess.

> to the i_ino.  If not (presumably for 32-bit)

I had assumed the conversion to inode64 was done in practice, but
struct inode's i_ino is still an unsigned long...  Although kstat's
ino field is u64, so if userspace is willing to handle it we could
just bump the size.

I remember having trouble with 64-bit inodes on 32-bit binaries quite
a few years ago, so I guess we still need to do that dance -- although
not necessarily internally; we could emulate xfs here:
- they use xfs_ino_t internally, which is unsigned long long (u64);
  there's even a confusingly named i_ino in xfs_inode, so the struct
  inode's i_ino is actually never used as far as I can see
- there are inode32/inode64 mount options to force allocating inodes
  in the 32-bit range if requested (but they don't try to remap
  existing inodes; this is just for new files)

In our case we could add a 64-bit inode field in v9fs_inode so we
don't need to worry about it, and remap for kstat (v9fs_vfs_getattr*)
if a new inode32 mount option is requested?  I wouldn't do it by
default on 32-bit systems as that'll blow up, and hopefully by now any
userspace that'll upgrade their kernel will have long converted to
_FILE_OFFSET_BITS=64 (which also controls the 64-bit inode for stat's
st_ino).

> it sets the inode number to (qid->path+2) ^ ((qid->path+2) >> 32).

(Seems a bit simplistic, but probably good enough in practice...)

> If ino_t goes to 64-bit this means we reduce precision for no reason,
> so that has to be fixed -- but I'm trying to remember the relevance
> of +2, which dates back all the way to the original submission.  I
> don't remember why we did this, but I think I'm gonna kill that in
> one patch and in a second patch fix the length check.

Unfortunately I cannot help, as my involvement started much later..

> Moving up the stack a bit: we use the modified qid.path to hash the
> inode, but then we also assign it (overriding the allocated inode
> number on the client).  I'm gonna say this is wrong (definitely wrong
> in the case where we are allocating inodes willy-nilly), and we should
> probably continue to use qid.path to hash but keep the client-side
> inode number, which is guaranteed unique on allocation.  Or do you
> folks think we should align the client-side inode with the server-side
> inode to help with debug?  We should still have consistent qid.paths
> regardless.  Thoughts?

I ended up suggesting something similar above without reading this
properly; if I read this correctly I think we agree -- use as many
bits as we can internally and keep it coherent, then just downsize for
getattr if required.

--
Dominique Martinet | Asmadeus

^ permalink raw reply	[flat|nested] 18+ messages in thread
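
A sketch of the xfs-style split Dominique suggests above: keep the
full 64-bit number internally and squash it only for stat.  The ino64
field and the inode32 mount option are the hypothetical additions
under discussion, not existing fs/9p code.

---
static u64 v9fs_export_ino_sketch(const struct v9fs_inode *v9inode,
				  bool inode32)
{
	u64 ino = v9inode->ino64;	/* full server-derived number */

	if (!inode32)
		return ino;	/* userspace copes with 64-bit st_ino */
	/* lossy fold into 32 bits; collisions possible, as noted above */
	return (ino ^ (ino >> 32)) & 0xffffffffULL;
}
---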
[parent not found: <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>]
* Fwd: cache fixes (redux)
[not found] ` <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>
@ 2024-01-03 22:42 ` Eric Van Hensbergen
0 siblings, 0 replies; 18+ messages in thread
From: Eric Van Hensbergen @ 2024-01-03 22:42 UTC (permalink / raw)
To: v9fs; +Cc: Christian Schoenebeck

ooops, didn't hit reply all.

---------- Forwarded message ---------
From: Eric Van Hensbergen <ericvh@kernel.org>
Date: Wed, Jan 3, 2024 at 2:52 PM
Subject: Re: cache fixes (redux)
To: Dominique Martinet <asmadeus@codewreck.org>

On Wed, Jan 3, 2024 at 2:25 PM <asmadeus@codewreck.org> wrote:
>
> > Moving up the stack a bit: we use the modified qid.path to hash the
> > inode, but then we also assign it (overriding the allocated inode
> > number on the client). ...
>
> I ended up suggesting something similar above without reading this
> properly; if I read this correctly I think we agree -- use as many
> bits as we can internally and keep it coherent, then just downsize for
> getattr if required.

One gotcha here, and I'm inclined to maybe ignore it: readdir uses
qid2ino to populate the inode number for directory listings.  So I'd
need to convert that code to look up the inode just to extract the
client-side inode number.  But this is all client side, so maybe not a
big deal, and we aren't even using the readdir code right now in .L,
so it's not really on the critical path.

The other problem, of course, is that the inode hash is also an
unsigned long in the kernel, so really the same problem applies...
But I'm inclined to switch to using qid.path to hash anyway, as I
think it makes the code more understandable and we can get rid of
weirdness like +2 and get guaranteed-unique inode numbers.

The other thing that needs to change, though, is that v9fs_mount needs
to use iunique to grab a new inode number versus using qid.path from
the mount, which means the logic there needs to change to using iget
so we can insert the mount point into the hash table.  I'm assuming
that just works, but I'm gonna reorder my changes to try doing that
first and see if anything blows up.

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
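
The mount-time change Eric outlines might look roughly like this:
allocate the root's inode number with the real iunique() helper
instead of deriving it from qid.path, while still hashing by qid so
later lookups find it.  v9fs_root_inode_sketch() and the
v9fs_iget_by_qid() helper from the earlier sketch are assumed names.

---
static struct inode *v9fs_root_inode_sketch(struct super_block *sb,
					    struct p9_qid *root_qid)
{
	struct inode *inode = v9fs_iget_by_qid(sb, root_qid);

	if (inode && (inode->i_state & I_NEW)) {
		/* client-side unique number, decoupled from qid.path */
		inode->i_ino = iunique(sb, 2);
		unlock_new_inode(inode);
	}
	return inode;
}
---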
* Re: cache fixes (redux)
2024-01-03 20:24 ` cache fixes (redux) asmadeus
[not found] ` <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>
@ 2024-01-03 22:48 ` Eric Van Hensbergen
2024-01-04 12:37 ` Christian Schoenebeck
1 sibling, 1 reply; 18+ messages in thread
From: Eric Van Hensbergen @ 2024-01-03 22:48 UTC (permalink / raw)
To: asmadeus; +Cc: Christian Schoenebeck, v9fs

One other thing I noticed when looking through the code (related to
xattrs and ACLs): it seems on create we are pre-populating client-side
ACLs -- this requires some extra protocol and steps it feels like we
don't have to take.  So I guess two thoughts:
- shouldn't the server side be populating ACLs appropriately on create
  if the underlying file system supports them?
- if this isn't the case, it feels like we might be able to get away
  with a slightly reduced approach of walking to the new fid and
  sending the appropriate setacl proto messages, and then rely on the
  inode refresh to fill the cache (if the cache is enabled).

A bunch of this has come up unwinding complexity in where we are
"creating inodes" versus using iget with optional create.  I think
everything needs to use the latter so things show up in the hash
appropriately.

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: cache fixes (redux)
2024-01-03 22:48 ` Eric Van Hensbergen
@ 2024-01-04 12:37 ` Christian Schoenebeck
2024-01-04 15:59 ` Eric Van Hensbergen
0 siblings, 1 reply; 18+ messages in thread
From: Christian Schoenebeck @ 2024-01-04 12:37 UTC (permalink / raw)
To: asmadeus, Eric Van Hensbergen; +Cc: v9fs, Greg Kurz

On Wednesday, January 3, 2024 11:48:13 PM CET Eric Van Hensbergen wrote:
> One other thing I noticed when looking through the code (related to
> xattrs and ACLs): it seems on create we are pre-populating client-side
> ACLs -- this requires some extra protocol and steps it feels like we
> don't have to take.  So I guess two thoughts:
> - shouldn't the server side be populating ACLs appropriately on create
>   if the underlying file system supports them?

To resume a discussion we recently had on this topic, i.e. relying on
the server handling all permissions: ACLs are actually the line where
it becomes difficult for the server to handle things solely on its
own.  Because AFAICS there is nothing defined in the protocol to
handle ACLs in the first place, the client instead sends xattr queries
to the 9p server, mapping ACLs as xattrs on the server side:

$ getfattr foo
# file: foo
user.virtfs.gid
user.virtfs.mode
user.virtfs.system.posix_acl_access   # <-- client encoded & managed
user.virtfs.system.posix_acl_default  # <-- client encoded & managed
user.virtfs.uid
...

Now even if the protocol got extended to handle ACLs, it would still
be difficult for the server to handle this task reliably; I am
thinking of setups where e.g. the guest runs in an Active Directory
domain, LDAP, PAM modules, and what not (all on the guest side, that
is).  There is a huge bunch of information the server would simply not
have, and a user would need to duplicate it as a setup on the host
side (not a good idea for security reasons).

So in short: simple Unix permissions, sure, but anything beyond that,
I don't think so.

> - if this isn't the case, it feels like we might be able to get away
>   with a slightly reduced approach of walking to the new fid and
>   sending the appropriate setacl proto messages, and then rely on the
>   inode refresh to fill the cache (if the cache is enabled).
>
> A bunch of this has come up unwinding complexity in where we are
> "creating inodes" versus using iget with optional create.  I think
> everything needs to use the latter so things show up in the hash
> appropriately.
>
> -eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
* Re: cache fixes (redux)
2024-01-04 12:37 ` Christian Schoenebeck
@ 2024-01-04 15:59 ` Eric Van Hensbergen
0 siblings, 0 replies; 18+ messages in thread
From: Eric Van Hensbergen @ 2024-01-04 15:59 UTC (permalink / raw)
To: Christian Schoenebeck; +Cc: asmadeus, v9fs, Greg Kurz

Makes sense I guess - I'll leave ACLs alone for now.

-eric

^ permalink raw reply	[flat|nested] 18+ messages in thread
end of thread, other threads:[~2024-01-04 15:59 UTC | newest]
Thread overview: 18+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2023-12-26 16:13 cache fixes (redux) Eric Van Hensbergen
2023-12-27 8:10 ` Dominique Martinet
2023-12-27 18:07 ` Eric Van Hensbergen
[not found] ` <CAFkjPT=8CZHATaraBrqAZGDKjQLOp=U1gdgteJ5jpXRGJyBojQ@mail.gmail.com>
2023-12-28 0:32 ` Dominique Martinet
2023-12-28 10:44 ` Christian Schoenebeck
2023-12-28 15:07 ` Eric Van Hensbergen
2023-12-28 21:48 ` asmadeus
2023-12-29 12:22 ` Christian Schoenebeck
2023-12-29 12:50 ` asmadeus
2023-12-29 16:16 ` Eric Van Hensbergen
2023-12-29 16:06 ` Eric Van Hensbergen
2023-12-29 15:50 ` Eric Van Hensbergen
2024-01-03 18:09 ` cache fixes (redux) (a tale of many inode numbers...) Eric Van Hensbergen
2024-01-03 20:24 ` cache fixes (redux) asmadeus
[not found] ` <CAFkjPT=61yygntbhSJMMWeK4JBrm85EZpz387aPHD_xmHVBbog@mail.gmail.com>
2024-01-03 22:42 ` Fwd: " Eric Van Hensbergen
2024-01-03 22:48 ` Eric Van Hensbergen
2024-01-04 12:37 ` Christian Schoenebeck
2024-01-04 15:59 ` Eric Van Hensbergen
This is a public inbox, see mirroring instructions for how to clone and mirror all data and code used for this inbox