From: Horst Birthelmer <horst@birthelmer.de>
To: Jan Kara <jack@suse.cz>
Cc: Matthew Wilcox <willy@infradead.org>,
Horst Birthelmer <horst@birthelmer.com>,
Miklos Szeredi <miklos@szeredi.hu>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
Horst Birthelmer <hbirthelmer@ddn.com>
Subject: Re: Re: [PATCH v2] dcache: add fs.dentry-limit sysctl with negative-first reaper
Date: Tue, 19 May 2026 11:37:38 +0200
Message-ID: <agwpRu1CWU4X0ytf@fedora.fritz.box>
In-Reply-To: <mptmd2qxgqwkhfrq5dgwomysdnwoy6fnztr3ibrvbbsb7hvrv3@peg7mojzfucy>
On Tue, May 19, 2026 at 10:45:09AM +0200, Jan Kara wrote:
> Hi Horst!
>
> On Sun 17-05-26 09:57:41, Horst Birthelmer wrote:
> > On Sun, May 17, 2026 at 12:09:26AM +0100, Matthew Wilcox wrote:
> > > On Sat, May 16, 2026 at 04:52:54PM +0200, Horst Birthelmer wrote:
> > > > There was a discussion at LSFMM about servers with too many cached
> > > > negative dentries.
> > > > That gave me the idea to keep the dentries in general limited
> > > > if the system administrator needs it to.
> > >
> > > I feel you should link to the dozens of previous attempts at this kind
> > > of thing to show that you're aware that this has been tried before and
> > > you're doing something meaningfully different.
>
> <snip>
>
> > As a conclusion, I think I have an uncommon perspective on the cache entries
> > since I don't usually work on vfs but argue from the perspective of a fuse server
> > Where the kernel makes us waste resources. This hurts way more in the FUSE context
> > than in a 'normal' file system.
> > I have taken the look at the dentry cache just because people told me that this
> > has to be solved in the vfs (and I agree). I actually have a somewhat hacky patch
> > to do this from fuse and only for the fuse sb.
>
> So I'm a bit confused here. The changelog speaks only about negative
> dentries (and that's what the change also concentrates on). OTOH you've
> mentioned multiple times that you are not really interested in limiting
> negative dentries but rather positive ones because you have a problem with
> cached inodes. So can you perhaps formulate what is exactly the problem
> you're trying to solve?
Maybe the changelog was a bit misleading here.
I did of course prefer to reap negative entries first, since that could bring
down the number of cached entries. In retrospect that was probably a mistake,
but I was somewhat afraid that if I didn't reduce those too, someone would
surely point out that it would be easier to cut those, since they are not
really used anyway and would be cheap to free.
This was only to be more useful than just solving _my_ problem. Maybe not
a good approach, I don't know yet.
>
> Also you mention that cached (positive) dentries and inodes are a wasted
> memory when they aren't used. That is certainly a valid view, OTOH you can
> never predict future so you don't really know what will get used in the
> future and thus will be useful. That's why we currently side with the idea
> that memory that isn't used for something is wasted and unless there's
> something to use the memory for, we cache dentries & inodes & page cache in
> it.
>
> If I remember correctly the discussion we had at LSF, the problem why inode
> caching is a problem for you, although there's enough free memory and no
> memory pressure, is that these cached inodes pin memory on the other end of
> the FUSE communication channel and there we are getting short on memory. Is
> this what you're trying to solve?
You remember our conversation correctly and have masterfully summarized it in
the passage above. Yes, that is what I'm trying to solve.
The problem we are facing is that the fuse server has to keep a lot of private
data and some data for locks (DLM) for the cached inodes and dentries
(inodes are even more expensive due to byte range locking).
So my idea was to NOT keep unused (and negative) entries around.
Letting the admin set the limit where the kernel starts to clean up was just
for convenience. If it were up to me, I would like to set this in the initial
negotiation in FUSE during mount.
The waste of memory for me is not in the kernel but in the fuse server. The
kernel is just the master of what we have to keep, and thus the kernel moved
to the center of attention.
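A minimal sketch of that coupling, assuming a server that keeps one lookup
count per node ID and may only free its state once FORGET brings the count
to zero, as in the FUSE low-level protocol. The class and field names are
hypothetical and not taken from any real server:

```python
from dataclasses import dataclass, field


@dataclass
class NodeState:
    """Hypothetical per-inode private data a server has to hold."""
    nlookup: int = 0                             # references the kernel still holds
    locks: list = field(default_factory=list)    # stand-in for DLM / byte-range lock state


class ToyServer:
    def __init__(self):
        self.nodes = {}                          # nodeid -> NodeState

    def lookup(self, nodeid):
        # Each successful LOOKUP reply hands the kernel one more reference.
        state = self.nodes.setdefault(nodeid, NodeState())
        state.nlookup += 1
        return state

    def forget(self, nodeid, nlookup):
        # FORGET gives references back; state is freed only at zero.
        state = self.nodes[nodeid]
        state.nlookup -= nlookup
        if state.nlookup == 0:
            del self.nodes[nodeid]


server = ToyServer()
for nodeid in range(2, 1002):                    # the kernel caches 1000 entries
    server.lookup(nodeid)
print(len(server.nodes))                         # 1000: one state per cached entry
server.forget(2, 1)
print(len(server.nodes))                         # 999: freed only after FORGET
```

The server has no say in when the count drops: as long as the kernel keeps
an entry cached, the matching state stays pinned on the server side.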
In short:
all caching in the kernel hurts us since we have to keep our private data
for all positive dentries, and I want to get the most for the amount of pain.
OTOH caching metadata is really useful, but you have to have a good prediction
of what to keep. As we cannot predict that on either side of the kernel,
throwing away the unused parts when they get out of hand seemed like a good idea.
After the discussions here it seems like everybody has their own interpretation
of what useful data to cache is.
I'm really inclined to think about letting the lower layers decide what useful
cached data should be.
In this context that would probably be a fuse server message as a notification
of which data it thinks can be thrown out, similar to the FORGET call but in the
other direction; if the kernel agrees, it really sends a FORGET and we can clean
up on the other side.
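A toy model of that handshake, purely as an illustration: nothing like
handle_evict_hint() exists today, and the names and the "unused" check are
assumptions, not kernel logic:

```python
class ToyKernel:
    """Toy model of the kernel side; not real kernel logic."""

    def __init__(self):
        self.refcount = {}                  # nodeid -> active users of the cached entry

    def cache(self, nodeid, in_use=False):
        self.refcount[nodeid] = 1 if in_use else 0

    def handle_evict_hint(self, nodeids):
        """Hypothetical server->kernel hint; returns the FORGETs the kernel sends."""
        forgets = []
        for nodeid in nodeids:
            # The kernel agrees only for entries it caches but nobody uses.
            if self.refcount.get(nodeid) == 0:
                del self.refcount[nodeid]
                forgets.append(nodeid)
        return forgets


kernel = ToyKernel()
kernel.cache(10)                            # cached, unused
kernel.cache(11, in_use=True)               # e.g. held open by a process
server_state = {10: "private data", 11: "private data"}

# The server hints at 10, 11 and an unknown 12; only 10 comes back as FORGET.
for nodeid in kernel.handle_evict_hint([10, 11, 12]):
    del server_state[nodeid]                # clean up only after the FORGET arrives

print(sorted(server_state))                 # [11]
```

The kernel keeps the final word in this model: the server only suggests,
and it frees its private data only once the regular FORGET arrives.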
> --
> Jan Kara <jack@suse.com>
> SUSE Labs, CR
Thanks a lot for looking at this,
I really appreciate it!
... and I hope I could clarify what I was trying to do.
Horst