From: Horst Birthelmer <horst@birthelmer.de>
To: NeilBrown <neil@brown.name>
Cc: Horst Birthelmer <horst@birthelmer.com>,
Miklos Szeredi <miklos@szeredi.hu>,
Jonathan Corbet <corbet@lwn.net>,
Shuah Khan <skhan@linuxfoundation.org>,
Alexander Viro <viro@zeniv.linux.org.uk>,
Christian Brauner <brauner@kernel.org>, Jan Kara <jack@suse.cz>,
linux-doc@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-fsdevel@vger.kernel.org,
Horst Birthelmer <hbirthelmer@ddn.com>
Subject: Re: Re: [PATCH] dcache: add fs.dentry-limit sysctl with negative-first reaper
Date: Mon, 18 May 2026 09:01:56 +0200
Message-ID: <agq1xnx2lMvA22BL@fedora.fritz.box>
In-Reply-To: <177906210551.3947082.4313294634549021141@noble.neil.brown.name>
On Mon, May 18, 2026 at 09:55:05AM +1000, NeilBrown wrote:
> On Fri, 15 May 2026, Horst Birthelmer wrote:
> > From: Horst Birthelmer <hbirthelmer@ddn.com>
> >
> > The dcache only shrinks under memory pressure, which is rarely reached
> > on machines with ample RAM, so cached negative dentries can accumulate
> > without bound. Give administrators a soft cap they can set,
> > and a background worker that prefers negative dentries when reclaiming.
> >
> > Two new sysctls under /proc/sys/fs/:
> >
> > dentry-limit -- soft cap on nr_dentry. 0 (default)
> > disables the feature; behaviour is then
> > identical to before.
>
> Is a system-wide cap really a suitable tool? What guidance would you
> give to sysadmins who are considering setting a number?
I know it is a rhetorical question, but nevertheless:
It's a soft cap, so a sensible value depends on the number of open files
usually present on the machine. It even depends on the file systems in use.
That was actually my motivation (more than the negative dentries). Some cache
entries are expensive for our FUSE server due to our DLM usage and private
data held in user space.
> Is there a better approach?
After reading your thoughts and those of the others who have taken the time
to revisit this, I think there is no better solution in the VFS layer.
Since 2025 (commit 395b95530343e) shrink_dentry_list() has been an exported
symbol, and a specific file system can use it to do its own housekeeping.
This will probably be considered a misuse by some, but it would be more
specific and easier to control, especially for filesystems where certain
cache entries are more expensive than others and/or that run in user space (FUSE).
>
> According to the email you linked, a problem arises when a directory has
> a great many negative children. Code which walks the list of children
> (such as fsnotify) while holding a lock can suffer unpredictable delays
> and result in long lock-hold times. So maybe a limit on negative
> dentries for any parent is what we really want. That would be clumsy to
> implement I imagine.
>
> But what if we move dentries to the end of the list when they become
> negative, and to the start of the list when they become positive? Then
> code which walks the child list could simply abort on the first
> negative.
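For what it's worth, here is a minimal userspace sketch of that ordering
idea as I read it. It is not kernel code: every toy_* name is made up for
illustration, and it ignores locking, RCU and everything else the real
child list has to deal with. Positives go to the head, negatives to the
tail, and the walker stops at the first negative it meets.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for a dentry; NOT the kernel's struct dentry. */
struct toy_dentry {
	struct toy_dentry *prev, *next;
	bool negative;
};

/* Parent with a circular child list around a sentinel node. */
struct toy_parent {
	struct toy_dentry head;
};

static void toy_parent_init(struct toy_parent *p)
{
	p->head.prev = p->head.next = &p->head;
	p->head.negative = false;
}

static void toy_link(struct toy_dentry *d, struct toy_dentry *prev,
		     struct toy_dentry *next)
{
	d->prev = prev;
	d->next = next;
	prev->next = d;
	next->prev = d;
}

static void toy_unlink(struct toy_dentry *d)
{
	d->prev->next = d->next;
	d->next->prev = d->prev;
}

/* Positives are added at the head, negatives at the tail. */
static void toy_add(struct toy_parent *p, struct toy_dentry *d, bool negative)
{
	d->negative = negative;
	if (negative)
		toy_link(d, p->head.prev, &p->head);
	else
		toy_link(d, &p->head, p->head.next);
}

/* A state change moves the child to the matching end of the list. */
static void toy_set_negative(struct toy_parent *p, struct toy_dentry *d,
			     bool negative)
{
	toy_unlink(d);
	toy_add(p, d, negative);
}

/*
 * Count positive children, aborting on the first negative.
 * *visited reports how many list nodes were actually touched.
 */
static size_t toy_count_positive(const struct toy_parent *p, size_t *visited)
{
	size_t count = 0, seen = 0;
	const struct toy_dentry *d;

	for (d = p->head.next; d != &p->head; d = d->next) {
		seen++;
		if (d->negative)
			break;
		count++;
	}
	if (visited)
		*visited = seen;
	return count;
}
```

In this model the walker touches at most one negative child no matter how
many have piled up behind it, which is the property the proposal is after.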
>
> I doubt that would be quite as easy as it sounds, but it would at least
> be more focused on the observed symptom rather than some whole-system
> number which only vaguely correlates with the observed symptom.
>
> Maybe a completely different approach: change children-walking code to
> drop and retake the lock (with appropriate validation) periodically.
> That too would address the specific symptom.
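A rough userspace sketch of how I picture the drop-and-retake variant,
again not kernel code. The toy_* names, the TOY_BATCH value and the
generation counter used for validation are all assumptions of mine; a
real walker would more likely pin a cursor or restart instead of simply
giving up.

```c
#include <assert.h>
#include <stddef.h>

#define TOY_BATCH 4	/* hypothetical: children visited per lock hold */

struct toy_child {
	struct toy_child *next;
};

struct toy_dir {
	struct toy_child *first;
	unsigned long gen;	/* bumped on every change to the list */
	int locked;		/* stand-in for the real lock */
	size_t max_hold;	/* most children visited under one hold */
};

static void toy_lock(struct toy_dir *dir)
{
	assert(!dir->locked);
	dir->locked = 1;
}

static void toy_unlock(struct toy_dir *dir)
{
	assert(dir->locked);
	dir->locked = 0;
}

/* Simulates another task changing the list while the lock is dropped. */
static void toy_concurrent_change(struct toy_dir *dir)
{
	dir->gen++;
}

/*
 * Walk all children, dropping the lock after every TOY_BATCH visits.
 * unlocked_hook (may be NULL) runs while the lock is not held.
 * Returns the number of children visited, or -1 if the list changed
 * while unlocked, in which case the cursor is not touched again.
 */
static long toy_walk(struct toy_dir *dir,
		     void (*unlocked_hook)(struct toy_dir *))
{
	long visited = 0;
	size_t in_hold = 0;
	struct toy_child *c;

	toy_lock(dir);
	for (c = dir->first; c; c = c->next) {
		visited++;
		if (++in_hold > dir->max_hold)
			dir->max_hold = in_hold;

		if (in_hold == TOY_BATCH && c->next) {
			unsigned long gen = dir->gen;

			toy_unlock(dir);
			if (unlocked_hook)
				unlocked_hook(dir);
			toy_lock(dir);

			/* Validation: bail out if the list moved under us. */
			if (dir->gen != gen) {
				toy_unlock(dir);
				return -1;
			}
			in_hold = 0;
		}
	}
	toy_unlock(dir);
	return visited;
}
```

The point of the sketch is only that the hold time is bounded by the batch
size rather than by the number of children; what to do on a failed
validation is the hard part and is left open here.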
>
> Thanks for attempting to resolve this issue, but I'm not convinced that
> you have found a good solution yet.
Thanks for the clear words. I really appreciate it!
>
> NeilBrown
>