linux-fsdevel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
From: Waiman Long <longman@redhat.com>
To: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jonathan Corbet <corbet@lwn.net>,
	Luis Chamberlain <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	Iurii Zaikin <yzaikin@google.com>
Cc: linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-doc@vger.kernel.org,
	Mauro Carvalho Chehab <mchehab+samsung@kernel.org>,
	Eric Biggers <ebiggers@google.com>,
	Dave Chinner <david@fromorbit.com>,
	Eric Sandeen <sandeen@redhat.com>,
	Waiman Long <longman@redhat.com>
Subject: [PATCH 04/11] fs/dcache: Add sysctl parameter dentry-dir-max
Date: Wed, 26 Feb 2020 11:13:57 -0500	[thread overview]
Message-ID: <20200226161404.14136-5-longman@redhat.com> (raw)
In-Reply-To: <20200226161404.14136-1-longman@redhat.com>

The number of positive dentries in a system is limited by the actual
number of files in the system. The number of negative dentries, however,
has no such limit. As a result, it is possible that a system can have a
significant amount of memory tied up in negative dentries. For example,
running a negative dentry generator on a 4-socket 256GB x86-64 system,
the system can use up more than 150GB of memory in dentries and more
than 200GB in slabs and almost running out of free memory before memory
reclaim kicks in.

There are two major problems with having too many negative dentries:

 1) When a filesystem with too many negative dentries is being unmounted,
    the process of draining the dentries associated with the filesystem
    can take some time. To users, the system may seem to hang for
    a while.  The long wait may also cause unexpected timeout error or
    other warnings.  This can happen when a long running container with
    many negative dentries is being destroyed, for instance.

 2) Tying up too much memory in unused negative dentries means there
    are less memory available for other use. Even though the kernel is
    able to reclaim unused dentries when running out of free memory,
    it will still introduce additional latency to the application
    reducing its performance.

This patch introduces a new sysctl parameter "dentry-dir-max" to
/proc/sys/fs. This syctl parameter represents a soft limit on the
total number of negative dentries allowable under any given directory.
Any non-negative integer is allowed. The default is 0 which means there
is no limit.

The actual negative dentry reclaim process to enforce the limit will
be done in a later patch.

Signed-off-by: Waiman Long <longman@redhat.com>
---
 Documentation/admin-guide/sysctl/fs.rst | 18 ++++++++++++++++++
 fs/dcache.c                             | 10 ++++++++++
 kernel/sysctl.c                         | 10 ++++++++++
 3 files changed, 38 insertions(+)

diff --git a/Documentation/admin-guide/sysctl/fs.rst b/Documentation/admin-guide/sysctl/fs.rst
index 2a45119e3331..7274a7b34ee4 100644
--- a/Documentation/admin-guide/sysctl/fs.rst
+++ b/Documentation/admin-guide/sysctl/fs.rst
@@ -28,6 +28,7 @@ Currently, these files are in /proc/sys/fs:
 
 - aio-max-nr
 - aio-nr
+- dentry-dir-max
 - dentry-state
 - dquot-max
 - dquot-nr
@@ -60,6 +61,23 @@ raising aio-max-nr does not result in the pre-allocation or re-sizing
 of any kernel data structures.
 
 
+dentry-dir-max
+--------------
+
+This integer value specifies a soft limit on the maximum number
+of negative dentries that are allowed under any given directory.
+A negative dentry contains filename that is known to be nonexistent
+in the directory.  No restriction is placed on the number of positive
+dentries as it is naturally limited by the number of files in the
+directory.
+
+The default value is 0 which means there is no limit.  Any non-negative
+value is allowed.  However, internal tracking is done on all dentry
+types. So the value given, if non-zero, should be larger than the
+number of files in a typical large directory in order to reduce the
+tracking overhead.
+
+
 dentry-state
 ------------
 
diff --git a/fs/dcache.c b/fs/dcache.c
index 0ee5aa2c31cf..8f3ac31a582b 100644
--- a/fs/dcache.c
+++ b/fs/dcache.c
@@ -123,6 +123,16 @@ static DEFINE_PER_CPU(long, nr_dentry);
 static DEFINE_PER_CPU(long, nr_dentry_unused);
 static DEFINE_PER_CPU(long, nr_dentry_negative);
 
+/*
+ * dcache_dentry_dir_max_sysctl:
+ *
+ * This is sysctl parameter "dentry-dir-max" which specifies a limit on
+ * the maximum number of negative dentries that are allowed under any
+ * given directory.
+ */
+int dcache_dentry_dir_max_sysctl __read_mostly;
+EXPORT_SYMBOL_GPL(dcache_dentry_dir_max_sysctl);
+
 #if defined(CONFIG_SYSCTL) && defined(CONFIG_PROC_FS)
 
 /*
diff --git a/kernel/sysctl.c b/kernel/sysctl.c
index d396aaaf19a3..cd0a83ff1029 100644
--- a/kernel/sysctl.c
+++ b/kernel/sysctl.c
@@ -118,6 +118,7 @@ extern unsigned int sysctl_nr_open_min, sysctl_nr_open_max;
 #ifndef CONFIG_MMU
 extern int sysctl_nr_trim_pages;
 #endif
+extern int dcache_dentry_dir_max_sysctl;
 
 /* Constants used for minimum and  maximum */
 #ifdef CONFIG_LOCKUP_DETECTOR
@@ -127,6 +128,7 @@ static int sixty = 60;
 static int __maybe_unused neg_one = -1;
 static int __maybe_unused two = 2;
 static int __maybe_unused four = 4;
+static int __maybe_unused zero;
 static unsigned long zero_ul;
 static unsigned long one_ul = 1;
 static unsigned long long_max = LONG_MAX;
@@ -1949,6 +1951,14 @@ static struct ctl_table fs_table[] = {
 		.proc_handler	= proc_dointvec_minmax,
 		.extra1		= SYSCTL_ONE,
 	},
+	{
+		.procname	= "dentry-dir-max",
+		.data		= &dcache_dentry_dir_max_sysctl,
+		.maxlen		= sizeof(dcache_dentry_dir_max_sysctl),
+		.mode		= 0644,
+		.proc_handler	= proc_dointvec_minmax,
+		.extra1		= &zero,
+	},
 	{ }
 };
 
-- 
2.18.1


  parent reply	other threads:[~2020-02-26 16:15 UTC|newest]

Thread overview: 34+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2020-02-26 16:13 [PATCH 00/11] fs/dcache: Limit # of negative dentries Waiman Long
2020-02-26 16:13 ` [PATCH 01/11] fs/dcache: Fix incorrect accounting " Waiman Long
2020-02-26 16:13 ` [PATCH 02/11] fs/dcache: Simplify __dentry_kill() Waiman Long
2020-02-26 16:13 ` [PATCH 03/11] fs/dcache: Add a counter to track number of children Waiman Long
2020-02-26 16:13 ` Waiman Long [this message]
2020-02-26 16:13 ` [PATCH 05/11] fs/dcache: Reclaim excessive negative dentries in directories Waiman Long
2020-02-26 16:13 ` [PATCH 06/11] fs/dcache: directory opportunistically stores # of positive dentries Waiman Long
2020-02-26 16:14 ` [PATCH 07/11] fs/dcache: Add static key negative_reclaim_enable Waiman Long
2020-02-26 16:14 ` [PATCH 08/11] fs/dcache: Limit dentry reclaim count in negative_reclaim_workfn() Waiman Long
2020-02-26 16:14 ` [PATCH 09/11] fs/dcache: Don't allow small values for dentry-dir-max Waiman Long
2020-02-26 16:14 ` [PATCH 10/11] fs/dcache: Kill off dentry as last resort Waiman Long
2020-02-26 16:14 ` [PATCH 11/11] fs/dcache: Track # of negative dentries reclaimed & killed Waiman Long
2020-02-26 16:29 ` [PATCH 00/11] fs/dcache: Limit # of negative dentries Matthew Wilcox
2020-02-26 19:19   ` Waiman Long
2020-02-26 21:28     ` Matthew Wilcox
2020-02-26 21:28   ` Andreas Dilger
2020-02-26 21:45     ` Matthew Wilcox
2020-02-27  8:07       ` Dave Chinner
2020-02-27  9:55     ` Ian Kent
2020-02-28  3:34       ` Matthew Wilcox
2020-02-28  4:16         ` Ian Kent
2020-02-28  4:36           ` Ian Kent
2020-02-28  4:52             ` Al Viro
2020-02-28  4:22         ` Al Viro
2020-02-28  4:52           ` Ian Kent
2020-02-28 15:32         ` Waiman Long
2020-02-28 15:39           ` Matthew Wilcox
2020-02-28 19:32         ` Theodore Y. Ts'o
2020-02-27 19:04   ` Eric Sandeen
2020-02-27 22:39     ` Dave Chinner
2020-02-27  8:30 ` Dave Chinner
2020-02-28 15:47   ` Waiman Long
2020-03-15  3:46 ` Matthew Wilcox
2020-03-21 10:17   ` Konstantin Khlebnikov

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20200226161404.14136-5-longman@redhat.com \
    --to=longman@redhat.com \
    --cc=corbet@lwn.net \
    --cc=david@fromorbit.com \
    --cc=ebiggers@google.com \
    --cc=keescook@chromium.org \
    --cc=linux-doc@vger.kernel.org \
    --cc=linux-fsdevel@vger.kernel.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=mcgrof@kernel.org \
    --cc=mchehab+samsung@kernel.org \
    --cc=sandeen@redhat.com \
    --cc=viro@zeniv.linux.org.uk \
    --cc=yzaikin@google.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).