From: Keyur Govande
Subject: Re: Limit dentry cache entries
Date: Fri, 24 May 2013 23:12:50 -0400
To: Dave Chinner
Cc: linux-fsdevel@vger.kernel.org

On Mon, May 20, 2013 at 6:53 PM, Dave Chinner wrote:
> On Sun, May 19, 2013 at 11:50:55PM -0400, Keyur Govande wrote:
>> Hello,
>>
>> We have a bunch of servers that create a lot of temp files, or check
>> for the existence of non-existent files. Every such operation creates
>> a dentry object and soon most of the free memory is consumed for
>> 'negative' dentry entries. This behavior was observed on both CentOS
>> kernel v.2.6.32-358 and Amazon Linux kernel v.3.4.43-4.
>>
>> There are also some processes running that occasionally allocate large
>> chunks of memory, and when this happens the kernel clears out a bunch
>> of stale dentry caches. This clearing takes some time. kswapd kicks
>> in, and allocations and bzero() of 4GB that normally take <1s take
>> 20s or more.
>>
>> Because the memory needs are intermittent but negative dentry
>> generation is fairly continuous, vfs_cache_pressure doesn't help much.
>>
>> The thought I had was to have a sysctl that limits the number of
>> dentries per super-block (sb-max-dentry). Every time a new dentry is
>> allocated in d_alloc(), check if dentry_stat.nr_dentry exceeds (number
>> of super blocks * sb-max-dentry). If yes, queue up an asynchronous
>> workqueue call to prune_dcache(). Also have a separate sysctl to
>> indicate by what percentage to reduce the dentry entries when this
>> happens.
>
> This request does come up every so often. There are valid reasons
> for being able to control the exact size of the dentry and page
> caches - I've seen a few implementations in storage appliance
> vendor kernels where total control of memory usage yields a few
> percent better performance on industry-specific benchmarks. Indeed,
> years ago I thought that capping the size of the dentry cache was a
> good idea, too.
>
> However, the problem that I've seen with every single one of these
> implementations is that the limit is carefully tuned for best all
> round performance in a given set of canned workloads. When the limit
> is wrong, performance tanks, and it is just about impossible to set
> a limit correctly for a machine that has a changing workload.
>
> If your problem is negative dentries building up, where do you set
> the limit? Set it low enough to keep only a small number of total
> dentries to keep the negative dentries down, and you'll end up
> with a dentry cache that isn't big enough to hold all the dentries
> needed for efficient performance with workloads that do directory
> traversals. It's a two-edged sword, and most people do not have
> enough knowledge to tune such a knob correctly.
>
> IOWs, the automatic sizing of the dentry cache based on memory
> pressure is the correct thing to do. Capping it, or allowing it to
> be capped, will simply generate bug reports for strange performance
> problems....
>
> That said, keeping lots of negative dentries around until memory
> pressure kicks them out is probably the wrong thing to do. Negative
> dentries are an optimisation for some workloads, but they tend to
> have references to negative dentries with a temporal locality that
> matches the unlink time.
>
> Perhaps we need to separately reclaim negative dentries i.e. not
> wait for memory pressure to reclaim them but use some other kind of
> trigger for reclamation. That doesn't cap the size of the dentry
> cache, but would address the problem of negative dentry buildup....
>
> Cheers,
>
> Dave.
> --
> Dave Chinner
> david@fromorbit.com

Hi Dave,

Thank you for responding. Sorry it took so long for me to get back;
I've been a bit busy.

I do agree that having a knob and then setting a bad value can tank
performance, but not having a knob at all is, IMO, worse. Currently
there are no options for controlling the cache, bar dropping the
caches altogether every so often.

The knob would have a default value of ((unsigned long) -1), so
anyone who does not care about it would see exactly the same
behaviour as today.

Also, setting a bad value for the knob would hurt file-IO
performance, which on a spinning disk isn't guaranteed anyway. The
current situation tanks memory performance, which is much more
surprising to a normal user.

Thanks,
Keyur.
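
P.S. For concreteness, here is the rough shape of what I have in mind.
This is untested and purely illustrative, sketched as if added to the
2.6.32-era fs/dcache.c (prune_dcache() was later replaced by
per-superblock pruning, so 3.x would look different); the sysctl names
and the nr_super_blocks counter are invented for this example.

/*
 * Illustrative sketch only, not a tested patch. Assumes it lives in
 * fs/dcache.c so it can reach dentry_stat and prune_dcache().
 */

/* New knobs; the default means "unlimited", i.e. today's behaviour. */
unsigned long sysctl_sb_max_dentry = (unsigned long)-1;
unsigned int sysctl_dentry_prune_percent = 10;

/* Assumed counter of mounted super blocks, kept up to date at
 * mount/umount time; not something the kernel exports today. */
extern unsigned long nr_super_blocks;

static void dcache_prune_worker(struct work_struct *work)
{
	/* Drop the configured percentage of the current dentry count. */
	int nr = dentry_stat.nr_dentry * sysctl_dentry_prune_percent / 100;

	prune_dcache(nr);	/* static helper in 2.6.32 fs/dcache.c */
}

static DECLARE_WORK(dcache_prune_work, dcache_prune_worker);

/* Would be called from d_alloc() after the new dentry is accounted. */
static void check_dentry_limit(void)
{
	unsigned long limit;

	if (sysctl_sb_max_dentry == (unsigned long)-1)
		return;		/* knob unset: behave exactly as today */

	limit = nr_super_blocks * sysctl_sb_max_dentry;
	if (dentry_stat.nr_dentry > limit)
		schedule_work(&dcache_prune_work);
}

The pruning happens off a workqueue precisely so that the allocating
process in d_alloc() never pays the reclaim cost itself.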