From: Dave Chinner <david@fromorbit.com>
To: Keyur Govande <keyurgovande@gmail.com>
Cc: linux-fsdevel@vger.kernel.org
Subject: Re: Limit dentry cache entries
Date: Mon, 27 May 2013 09:23:52 +1000
Message-ID: <20130526232352.GQ24543@dastard>
In-Reply-To: <CAJhmKH=7m8sOAxCtAttK7qKBH_GRY5_ppxisgnKdU-j1OszpHw@mail.gmail.com>

On Fri, May 24, 2013 at 11:12:50PM -0400, Keyur Govande wrote:
> On Mon, May 20, 2013 at 6:53 PM, Dave Chinner <david@fromorbit.com> wrote:
> > On Sun, May 19, 2013 at 11:50:55PM -0400, Keyur Govande wrote:
> >> Hello,
> >>
> >> We have a bunch of servers that create a lot of temp files, or check
> >> for the existence of non-existent files. Every such operation creates
> >> a dentry object and soon most of the free memory is consumed for
> >> 'negative' dentry entries. This behavior was observed on both CentOS
> >> kernel v.2.6.32-358 and Amazon Linux kernel v.3.4.43-4.
> >>
> >> There are also some processes running that occasionally allocate large
> >> chunks of memory, and when this happens the kernel clears out a bunch
> >> of stale dentry caches. This clearing takes some time. kswapd kicks
> >> in, and allocations and bzero() of 4GB that normally takes <1s, takes
> >> 20s or more.
> >>
> >> Because the memory needs are non-continuous but negative dentry
> >> generation is fairly continuous, vfs_cache_pressure doesn't help much.
> >>
> >> The thought I had was to have a sysctl that limits the number of
> >> dentries per super-block (sb-max-dentry). Every time a new dentry is
> >> allocated in d_alloc(), check if dentry_stat.nr_dentry exceeds (number
> >> of super blocks * sb-max-dentry). If yes, queue up an asynchronous
> >> workqueue call to prune_dcache(). Also have a separate sysctl to
> >> indicate by what percentage to reduce the dentry entries when this
> >> happens.
> >
> > This request does come up every so often. There are valid reasons
> > for being able to control the exact size of the dentry and page
> > caches - I've seen a few implementations in storage appliance
> > vendor kernels where total control of memory usage yields a few
> > percent better performance of industry specific benchmarks. Indeed,
> > years ago I thought that capping the size of the dentry cache was a
> > good idea, too.
> >
> > However, the problem that I've seen with every single one of these
> > implementations is that the limit is carefully tuned for best all
> > round performance in a given set of canned workloads. When the limit
> > is wrong, performance tanks, and it is just about impossible to set
> > a limit correctly for a machine that has a changing workload.
> >
> > If your problem is negative dentries building up, where do you set
> > the limit? Set it low enough to keep only a small number of total
> > dentries to keep the negative dentries down, and you'll end up
> > with a dentry cache that isn't big enough to hold all the dentries
> > needed for efficient performance with workloads that do directory
> > traversals. It's a two-edged sword, and most people do not have
> > enough knowledge to tune a knob correctly.
> >
> > IOWs, the automatic sizing of the dentry cache based on memory
> > pressure is the correct thing to do. Capping it, or allowing it to
> > be capped will simply generate bug reports for strange performance
> > problems....
> >
> > That said, keeping lots of negative dentries around until memory
> > pressure kicks them out is probably the wrong thing to do. Negative
> > dentries are an optimisation for some workloads, but they tend to
> > have references to negative dentries with a temporal locality that
> > matches the unlink time.
> >
> > Perhaps we need to separately reclaim negative dentries i.e. not
> > wait for memory pressure to reclaim them but use some other kind of
> > trigger for reclamation. That doesn't cap the size of the dentry
> > cache, but would address the problem of negative dentry buildup....
> >
> > Cheers,
> >
> > Dave.
> > --
> > Dave Chinner
> > david@fromorbit.com
>
> Hi Dave,
>
> Thank you for responding. Sorry it took so long for me to get back,
> been a bit busy.
>
> I do agree that having a knob, and then setting a bad value can tank
> performance. But not having a knob IMO is worse. Currently there are
> no options for controlling the cache, bar dropping the caches
> altogether every so often. The knob would have a default value of
> ((unsigned long) -1), so if one does not care for it, they would
> experience the same behavior as today.

And therein lies the problem with a knob. What's the point of having
a knob when only a handful of people know what it does, or even how
to recognise when they need to tweak it? It's long been Linux kernel
policy that the kernel should do the right thing by default. As
such, knobs to tweak things are a last resort.

> Also, setting a bad value for the knob would negatively impact file-IO
> performance, which on a spinning disk isn't guaranteed anyway. The
> current situation tanks memory performance which is more unexpected to
> a normal user.

Which is precisely why a knob is the wrong solution. If it's
something a normal, unsuspecting user has problems with, then it
needs to be handled automatically by the kernel. Expecting users who
don't even know what a dentry is to know about a magic knob that
fixes a problem they don't even know they have is not an acceptable
solution.

The first step to solving such a problem is to provide a
reproducible, measurable test case in a simple script that
demonstrates the problem that needs solving. If we can reproduce it
at will, then half the battle is already won....
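
As a rough illustration of what such a script might look like, here
is a minimal Python sketch. It covers only the buildup half of the
problem: it looks up many names that do not exist and prints the
dentry counters before and after. It does not time a large
allocation. The function names and the DIRECTORY/COUNT arguments are
hypothetical, chosen for this example. It assumes the first two
fields of /proc/sys/fs/dentry-state are nr_dentry and nr_unused, and
whether failed lookups leave negative dentries behind depends on the
filesystem backing DIRECTORY, so treat the numbers as indicative
only.

```python
#!/usr/bin/env python3
"""Sketch: build up negative dentries by looking up files that do not exist."""
import os
import sys

# Linux-only procfs file; assumed layout: nr_dentry nr_unused ...
DENTRY_STATE = "/proc/sys/fs/dentry-state"


def parse_dentry_state(text):
    """Return (nr_dentry, nr_unused) from the contents of dentry-state."""
    fields = text.split()
    return int(fields[0]), int(fields[1])


def read_dentry_state(path=DENTRY_STATE):
    """Read and parse the current dentry counters."""
    with open(path) as f:
        return parse_dentry_state(f.read())


def lookup_missing(directory, count, prefix="no-such-file-"):
    """stat() `count` distinct non-existent names under `directory`.

    Returns how many lookups failed with ENOENT. Each failed lookup is
    a candidate for a cached negative dentry.
    """
    missing = 0
    for i in range(count):
        try:
            os.stat(os.path.join(directory, f"{prefix}{i}"))
        except FileNotFoundError:
            missing += 1
    return missing


def main(directory, count):
    before = read_dentry_state()
    missing = lookup_missing(directory, count)
    after = read_dentry_state()
    print(f"failed lookups: {missing}")
    print(f"nr_dentry: {before[0]} -> {after[0]}")
    print(f"nr_unused: {before[1]} -> {after[1]}")


# Usage: script.py DIRECTORY COUNT
if __name__ == "__main__" and len(sys.argv) == 3:
    main(sys.argv[1], int(sys.argv[2]))
```

Running it against a directory on the affected filesystem with a
large COUNT, and then timing the big allocation separately, would be
one way to turn the report into something others can reproduce.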

Cheers,

Dave.
--
Dave Chinner
david@fromorbit.com