From: Greg KH <greg@kroah.com>
To: Shaoying Xu <shaoyi@amazon.com>
Cc: stable@vger.kernel.org, fllinden@amazon.com, samjonas@amazon.com,
surajjs@amazon.com
Subject: Re: [PATCH 4.14] mm: memcontrol: fix excessive complexity in memory.stat reporting
Date: Sat, 19 Dec 2020 13:38:19 +0100
Message-ID: <X930O8w8w8VjCLP5@kroah.com>
In-Reply-To: <20201216221450.GB19206@amazon.com>
On Wed, Dec 16, 2020 at 10:14:50PM +0000, Shaoying Xu wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
>
> [ Upstream commit a983b5ebee57209c99f68c8327072f25e0e6e3da ]
>
> mm: memcontrol: fix excessive complexity in memory.stat reporting
>
> We've seen memory.stat reads in top-level cgroups take up to fourteen
> seconds during a userspace bug that created tens of thousands of ghost
> cgroups pinned by lingering page cache.
>
> Even with a more reasonable number of cgroups, aggregating memory.stat
> is unnecessarily heavy. The complexity is this:
>
> nr_cgroups * nr_stat_items * nr_possible_cpus
>
> where the stat items are ~70 at this point. With 128 cgroups and 128
> CPUs - decent, not enormous setups - reading the top-level memory.stat
> has to aggregate over a million per-cpu counters. This doesn't scale.
>
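To make that scaling concrete, a back-of-the-envelope calculation using the
figures quoted above (a standalone illustration, not code from the patch):

#include <stdio.h>

int main(void)
{
	long nr_cgroups = 128, nr_stat_items = 70, nr_possible_cpus = 128;

	/* One memory.stat read walks every per-cpu counter of every cgroup. */
	printf("%ld\n", nr_cgroups * nr_stat_items * nr_possible_cpus);
	/* prints 1146880 -- over a million counters touched per read */
	return 0;
}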
> Instead of spreading the source of truth across all CPUs, use the
> per-cpu counters merely to batch updates to shared atomic counters.
>
> This is the same as the per-cpu stocks we use for charging memory to the
> shared atomic page_counters, and also the way the global vmstat counters
> are implemented.
>
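For readers unfamiliar with the pattern, a minimal userspace analogue of that
scheme, with a thread-local standing in for a per-cpu counter (identifiers and
the batch value are illustrative; the kernel implementation differs in detail):

#include <stdatomic.h>
#include <stdio.h>
#include <stdlib.h>

#define STAT_BATCH 32			/* hypothetical flush threshold */

static atomic_long shared_stat;		/* the single source of truth */
static _Thread_local long cpu_delta;	/* unflushed local error */

/* Accumulate small updates locally; spill them into the shared atomic
 * only once the local error would exceed STAT_BATCH. */
static void mod_stat(long val)
{
	long x = cpu_delta + val;

	if (labs(x) > STAT_BATCH) {
		atomic_fetch_add(&shared_stat, x);
		x = 0;
	}
	cpu_delta = x;
}

int main(void)
{
	for (int i = 0; i < 100; i++)
		mod_stat(1);
	/* A reader touches one atomic instead of every CPU's counter;
	 * here 99 updates have been flushed and 1 remains as local error. */
	printf("%ld\n", atomic_load(&shared_stat));
	return 0;
}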
> Vmstat has elaborate spilling thresholds that depend on the number of
> CPUs, amount of memory, and memory pressure - carefully balancing the
> cost of counter updates with the amount of per-cpu error. That's
> because the vmstat counters are system-wide, but also used for decisions
> inside the kernel (e.g. NR_FREE_PAGES in the allocator). Neither is
> true for the memory controller.
>
> Use the same static batch size we already use for page_counter updates
> during charging. The per-cpu error in the stats will be 128k, which is
> an acceptable ratio of cores to memory accounting granularity.
>
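Spelling out where the 128k figure comes from, assuming 4 KiB pages and the
32-page charge batch (the page size is an assumption here, not stated in the
excerpt above):

	32 pages/CPU * 4096 bytes/page = 131072 bytes = 128 KiB

i.e. each CPU can hold at most 128 KiB of unflushed accounting error per
counter before it must spill into the shared atomic.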
> [hannes@cmpxchg.org: fix warning in __this_cpu_xchg() calls]
> Link: http://lkml.kernel.org/r/20171201135750.GB8097@cmpxchg.org
> Link: http://lkml.kernel.org/r/20171103153336.24044-3-hannes@cmpxchg.org
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: stable@vger.kernel.org c9019e9: mm: memcontrol: eliminate raw access to stat and event counters
> Cc: stable@vger.kernel.org 2845426: mm: memcontrol: implement lruvec stat functions on top of each other
> Cc: stable@vger.kernel.org
> [shaoyi@amazon.com: resolved the conflict introduced by commit 17ffa29c355658c8e9b19f56cbf0388500ca7905 in mm/memcontrol.c with a contextual fix]
> Signed-off-by: Shaoying Xu <shaoyi@amazon.com>
> ---
> The excessive complexity in memory.stat reporting was fixed upstream in v4.16 but didn't appear to make it into 4.14 stable. When backporting this patch, there is a small conflict, introduced by commit 17ffa29c355658c8e9b19f56cbf0388500ca7905, within free_mem_cgroup_per_node_info() in mm/memcontrol.c; it can be resolved by a contextual fix.
>
> include/linux/memcontrol.h | 96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------
> mm/memcontrol.c | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------
> 2 files changed, 113 insertions(+), 84 deletions(-)
>
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1ffc54ac4cc9..882046863581 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -108,7 +108,10 @@ struct lruvec_stat {
>   */
>  struct mem_cgroup_per_node {
>  	struct lruvec		lruvec;
> -	struct lruvec_stat __percpu *lruvec_stat;
> +
> +	struct lruvec_stat __percpu *lruvec_stat_cpu;
> +	atomic_long_t		lruvec_stat[NR_VM_NODE_STAT_ITEMS];
> +
>  	unsigned long		lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
> 
>  	struct mem_cgroup_reclaim_iter	iter[DEF_PRIORITY + 1];
This patch is corrupted and cannot be applied :(

Please fix up and resend.

thanks,

greg k-h