From: Michal Hocko <mhocko@kernel.org>
To: Waiman Long <longman@redhat.com>
Cc: Alexander Viro <viro@zeniv.linux.org.uk>,
	Jonathan Corbet <corbet@lwn.net>,
	"Luis R. Rodriguez" <mcgrof@kernel.org>,
	Kees Cook <keescook@chromium.org>,
	linux-kernel@vger.kernel.org, linux-fsdevel@vger.kernel.org,
	linux-mm@kvack.org, linux-doc@vger.kernel.org,
	Linus Torvalds <torvalds@linux-foundation.org>,
	Jan Kara <jack@suse.cz>,
	"Paul E. McKenney" <paulmck@linux.vnet.ibm.com>,
	Andrew Morton <akpm@linux-foundation.org>,
	Ingo Molnar <mingo@kernel.org>,
	Miklos Szeredi <mszeredi@redhat.com>,
	Matthew Wilcox <willy@infradead.org>,
	Larry Woodman <lwoodman@redhat.com>,
	James Bottomley <James.Bottomley@HansenPartnership.com>,
	"Wangkai (Kevin C)" <wangkai86@huawei.com>
Subject: Re: [PATCH v6 0/7] fs/dcache: Track & limit # of negative dentries
Date: Mon, 9 Jul 2018 10:19:20 +0200
Message-ID: <20180709081920.GD22049@dhcp22.suse.cz>
In-Reply-To: <1530905572-817-1-git-send-email-longman@redhat.com>

On Fri 06-07-18 15:32:45, Waiman Long wrote:
[...]
> A rogue application can potentially create a large number of negative
> dentries in the system consuming most of the memory available if it
> is not under the direct control of a memory controller that enforce
> kernel memory limit.

How does this differ from other untracked allocations for untrusted
tasks in general? E.g. nothing really prevents a task from creating a
long chain of unreclaimable dentries and potentially even going OOM.
Negative dentries, on the other hand, should be easily reclaimable. So
why does the latter need special treatment while the former is OK?
There are quite a few resources that allow an unprivileged user to
consume a lot of memory, and the memory controller is the only reliable
way to mitigate the risk.

> This patchset introduces changes to the dcache subsystem to track and
> optionally limit the number of negative dentries allowed to be created by
> background pruning of excess negative dentries or even kill it after use.
> This capability will help to limit the amount of memory that can be
> consumed by negative dentries.

How are you going to balance that between workloads? What prevents a
rogue application from simply consuming the limit and forcing all
others in the system onto the slow path?

> Patch 1 tracks the number of negative dentries present in the LRU
> lists and reports it in /proc/sys/fs/dentry-state.

If anything, I _think_ vmstat would benefit from this, because the
behavior of memory reclaim does depend on the number of negative
dentries.
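
For anyone who wants to watch the counter from user space, here is a
minimal sketch of parsing /proc/sys/fs/dentry-state. It assumes the
six-integer layout documented for later mainline kernels, where the
fifth field is nr_negative and counts unused dentries that are also
negative; the layout proposed in this patchset may differ. The helper
names and the sample values are made up for illustration.

```python
from typing import NamedTuple


class DentryState(NamedTuple):
    # Assumed field order (later mainline kernels); not taken from this thread.
    nr_dentry: int
    nr_unused: int
    age_limit: int
    want_pages: int
    nr_negative: int
    dummy: int


def parse_dentry_state(text):
    """Parse the six whitespace-separated integers of dentry-state."""
    fields = [int(tok) for tok in text.split()]
    if len(fields) != 6:
        raise ValueError(f"expected 6 fields, got {len(fields)}")
    return DentryState(*fields)


def read_dentry_state(path="/proc/sys/fs/dentry-state"):
    """Read and parse the live file (Linux only)."""
    with open(path) as f:
        return parse_dentry_state(f.read())


# Sample line with made-up values, for illustration only.
sample = "81876\t60075\t45\t0\t12345\t0\n"
state = parse_dentry_state(sample)
share = state.nr_negative / state.nr_unused
print(f"negative share of unused dentries: {share:.1%}")
```

On a kernel that only exposes the older layout, the fifth field would
not carry this meaning, so the result should be treated with care.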

> Patch 2 adds a "neg-dentry-pc" sysctl parameter that can be used to to
> specify a soft limit on the number of negative allowed as a percentage
> of total system memory. This parameter is 0 by default which means no
> negative dentry limiting will be performed.

Percentage has turned out to be a really wrong unit for many tunables
over time. Even 1% can be just too much on really large machines.
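
To put rough numbers on that, here is a small sketch of what a 1%
setting would allow on machines of different sizes. It assumes the
limit is taken as a fraction of total RAM, as the cover letter
describes, and roughly 192 bytes per dentry, which is a typical x86-64
figure and not something stated in this thread. The function name and
the size constant are illustrative only.

```python
# Assumption: ~192 bytes per struct dentry (typical on x86-64; not from this thread).
DENTRY_SIZE_BYTES = 192
GIB = 1024 ** 3


def neg_dentry_budget(total_ram_bytes, percent, dentry_size=DENTRY_SIZE_BYTES):
    """Return (bytes, approximate dentry count) allowed by a percent-of-RAM limit."""
    budget_bytes = total_ram_bytes * percent // 100
    return budget_bytes, budget_bytes // dentry_size


# Print the budget for a range of machine sizes at the smallest integer setting.
for ram_gib in (4, 64, 1024, 12 * 1024):
    budget, count = neg_dentry_budget(ram_gib * GIB, 1)
    print(f"{ram_gib:>6} GiB RAM: 1% = {budget / GIB:8.2f} GiB, ~{count:,} dentries")
```

Under these assumptions the smallest non-zero setting already comes to
roughly 10 GiB on a 1 TiB machine, which is the scaling problem with a
percentage unit.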

> Patch 3 enables automatic pruning of least recently used negative
> dentries when the total number is close to the preset limit.

Please explain why this cannot be done in the standard dcache shrinking
way. I strongly suspect that you are developing yet another reclaim
mechanism with its own set of tunables, bypassing the existing
infrastructure. I haven't read the patches yet, but the cover letter
doesn't really explain the design much, so I am only guessing.
-- 
Michal Hocko
SUSE Labs

