From: "Eric Van Hensbergen" <eric.vanhensbergen@linux.dev>
To: "Joakim Sindholt" <opensource@zhasha.com>
Cc: v9fs@lists.linux.dev
Subject: Re: [PATCH v2 2/4] fs/9p: drop inodes immediately on non-.L too
Date: Thu, 28 Mar 2024 20:03:36 +0000 [thread overview]
Message-ID: <4d7d7f9020fecc66d34c3b925591371e2aeac42f@linux.dev> (raw)
In-Reply-To: <6a4f3bed800f9f5021e17311e6d7288f9a2b56c6@linux.dev>
Also, digging a bit deeper -- it looks like for legacy we always set nlink to 1, so the server
doesn't have anything to do with that, and the problem may be that the newer code isn't coping
with it properly. I guess I need to compare a legacy trace to a dotL trace and make sure that's
right -- I'm a bit worried about how directories show up, since they should have 3 nlinks
(. and ..) and this code looks generic.
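
For context, the drop decision in the generic path keys off the link count
and the hash state; this is roughly what generic_drop_inode() looks like in
include/linux/fs.h (paraphrased from memory, so treat it as a sketch rather
than the authoritative source):

    /* Returns non-zero at iput_final() time if the inode should be
     * evicted immediately: either no links remain or it was never
     * hashed.  Otherwise the inode stays cached until the shrinker
     * gets to it. */
    static inline int generic_drop_inode(struct inode *inode)
    {
            return !inode->i_nlink || inode_unhashed(inode);
    }

So with legacy always reporting nlink == 1, the !i_nlink test never fires,
and whether anything gets dropped early comes down to the unhashed check and
memory pressure.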
-eric
March 28, 2024 at 2:47 PM, "Eric Van Hensbergen" <eric.vanhensbergen@linux.dev> wrote:
>
> Not sure if your server source is public or not, but it would be useful to test against it and see what I see. It's possible that a mismatch in the nlink stat information could indeed cause inodes to stick around even under memory pressure, and that'd cause all sorts of problems.
>
> The newer kernel versions have much better qid->inode mapping that should prevent some of the weirdness in the previous implementation (there were lots of inodes with duplicate ino_num because of the way we crushed the allocated inode number with the qid.path), so it's definitely worth testing against the latest and greatest if you can. I will try to allocate some time to revive my regression runs against 9p2000 and 9p2000.u servers so we catch some of this earlier in the future, but I'm not sure I would have caught any degradation.
>
> -eric
>
> March 28, 2024 at 10:37 AM, "Joakim Sindholt" <opensource@zhasha.com> wrote:
>
> > On Thu, 28 Mar 2024 15:08:59 +0000, "Eric Van Hensbergen" <eric.vanhensbergen@linux.dev> wrote:
> >
> > > Slowly parsing through these, thanks for the fixes.
> > >
> > > This one has a bit of a problem in that we don't have a v9fs specific
> > > drop_inode anymore (either in legacy or .L). The release of the fid should
> > > be handled by v9fs_dir_release in both versions of the protocol.
> >
> > My apologies. I didn't realize that it had been changed in the next
> > branch until after sending it, and I haven't had the time to check how
> > it's supposed to work now.
> >
> > > I had convinced myself we could just use generic_drop_inode because we
> > > decoupled clunking fids from the dentry/inode structures and just handled
> > > them directly in v9fs_dir_release -- so really, by recovering the inode
> > > structure every time we were just forcing churn in the inode alloc/dealloc
> > > routines that seemed unnecessary (even in the uncached mode).
> >
> > I agree that the alloc followed by immediately dropping it from the
> > cache is a waste of cycles. This was just the most reliable way I knew
> > of fixing it without impacting anything else.
> >
> > > Was your concern the performance of the client-side lookup of the inode
> > > based on qid, or the server-side lookup based on fid? Can you give me a
> > > bit more information on what you are seeing? Are you not seeing fid
> > > clunks under certain conditions (i.e. transient fids, etc.)?
> >
> > This is what I'm seeing in slabtop on a system with no caching enabled
> > on a 6.1 kernel. I never tested whether it persists on the 6.6 kernel,
> > but I assume it would.
> >
> >  Active / Total Objects (% used)    : 2738808 / 2788118 (98.2%)
> >  Active / Total Slabs (% used)      : 89853 / 89853 (100.0%)
> >  Active / Total Caches (% used)     : 164 / 261 (62.8%)
> >  Active / Total Size (% used)       : 1071558.97K / 1084648.27K (98.8%)
> >  Minimum / Average / Maximum Object : 0.02K / 0.39K / 16.01K
> >
> >     OBJS  ACTIVE  USE OBJ SIZE  SLABS OBJ/SLAB CACHE SIZE NAME
> >  1235968 1235915  99%    0.06K  19312       64     77248K lsm_inode_cache
> >  1200339 1200339 100%    0.75K  57159       21    914544K v9fs_inode_cache
> >    62136   54815  88%    0.11K   1726       36      6904K buffer_head
> >    49400   45639  92%    0.20K   2600       19     10400K dentry
> >    26368   26364  99%    0.03K    206      128       824K avtab_node
> >    26222   15713  59%    0.57K   1873       14     14984K radix_tree_node
> >    19162   18337  95%    1.20K   1474       13     23584K ext4_inode_cache
> >  ...
> >
> > What this is not showing, and it's really rather difficult to show, is
> > that it also causes a large amount of CPU usage on any file operation.
> > I'm not sure if it's simply because the table is enormous or because
> > most of the qids v9fs has to contend with are identical ones from a
> > whole host of synthetic file systems that it then has to iterate over
> > linearly. The v9fs_inode_cache grows until it takes up all available RAM.
> >
> > And snatching this one from your second mail:
> >
> > > Which server are you testing against? I'm wondering if the underlying
> > > problem might actually be i_nlink being maintained incorrectly by the
> > > server, which would lead to inodes lingering, since that is the primary
> > > way generic_drop_inode differentiates. It is possible that we could add
> > > a v9fs_inode_always_drop for legacy if the servers weren't reporting
> > > i_nlink in a compatible fashion, although that might result in never
> > > caching legacy 9p2000 (which is probably what most legacy folks want
> > > anyway).
> >
> > I'm running against a regular old 9P2000 server that I wrote myself. No
> > extensions of any kind, so there are no links. I don't really have any
> > objection to the kind of qid.version contextual caching you implemented
> > in 6.6 and would actually quite like to use it in the near future, but I
> > wouldn't be terribly sad if you removed it either.
>
Thread overview: 11+ messages
2024-03-18 11:22 Bugfixes for plain 9P2000 Joakim Sindholt
2024-03-18 11:22 ` [PATCH v2 1/4] fs/9p: only translate RWX permissions " Joakim Sindholt
2024-03-18 11:22 ` [PATCH v2 2/4] fs/9p: drop inodes immediately on non-.L too Joakim Sindholt
2024-03-28 15:08 ` Eric Van Hensbergen
2024-03-28 15:17 ` Eric Van Hensbergen
2024-03-28 15:37 ` Joakim Sindholt
2024-03-28 19:47 ` Eric Van Hensbergen
2024-03-28 20:03 ` Eric Van Hensbergen [this message]
2024-03-30 6:47 ` Joakim Sindholt
2024-03-18 11:22 ` [PATCH v2 3/4] fs/9p: translate O_TRUNC into OTRUNC Joakim Sindholt
2024-03-18 11:22 ` [PATCH v2 4/4] fs/9p: fix the cache always being enabled on files with qid flags Joakim Sindholt