From: Greg KH <greg@kroah.com>
To: Shaoying Xu <shaoyi@amazon.com>
Cc: stable@vger.kernel.org, fllinden@amazon.com, samjonas@amazon.com,
	surajjs@amazon.com
Subject: Re: [PATCH 4.14] mm: memcontrol: fix excessive complexity in memory.stat reporting
Date: Sat, 19 Dec 2020 13:38:19 +0100
Message-ID: <X930O8w8w8VjCLP5@kroah.com>
In-Reply-To: <20201216221450.GB19206@amazon.com>

On Wed, Dec 16, 2020 at 10:14:50PM +0000, Shaoying Xu wrote:
> From: Johannes Weiner <hannes@cmpxchg.org>
> 
> [ Upstream commit a983b5ebee57209c99f68c8327072f25e0e6e3da ]
> 
> mm: memcontrol: fix excessive complexity in memory.stat reporting
> 
> We've seen memory.stat reads in top-level cgroups take up to fourteen
> seconds during a userspace bug that created tens of thousands of ghost
> cgroups pinned by lingering page cache.
> 
> Even with a more reasonable number of cgroups, aggregating memory.stat
> is unnecessarily heavy.  The complexity is this:
> 
>         nr_cgroups * nr_stat_items * nr_possible_cpus
> 
> where the stat items are ~70 at this point.  With 128 cgroups and 128
> CPUs - decent, not enormous setups - reading the top-level memory.stat
> has to aggregate over a million per-cpu counters.  This doesn't scale.
> 
> Instead of spreading the source of truth across all CPUs, use the
> per-cpu counters merely to batch updates to shared atomic counters.
> 
> This is the same as the per-cpu stocks we use for charging memory to the
> shared atomic page_counters, and also the way the global vmstat counters
> are implemented.
> 
> Vmstat has elaborate spilling thresholds that depend on the number of
> CPUs, amount of memory, and memory pressure - carefully balancing the
> cost of counter updates with the amount of per-cpu error.  That's
> because the vmstat counters are system-wide, but also used for decisions
> inside the kernel (e.g.  NR_FREE_PAGES in the allocator).  Neither is
> true for the memory controller.
> 
> Use the same static batch size we already use for page_counter updates
> during charging.  The per-cpu error in the stats will be 128k, which is
> an acceptable ratio of cores to memory accounting granularity.
> 
> [hannes@cmpxchg.org: fix warning in __this_cpu_xchg() calls]
> Link: http://lkml.kernel.org/r/20171201135750.GB8097@cmpxchg.org
> Link: http://lkml.kernel.org/r/20171103153336.24044-3-hannes@cmpxchg.org
> Signed-off-by: Johannes Weiner <hannes@cmpxchg.org>
> Acked-by: Vladimir Davydov <vdavydov.dev@gmail.com>
> Cc: Michal Hocko <mhocko@suse.com>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
> Signed-off-by: Linus Torvalds <torvalds@linux-foundation.org>
> Cc: stable@vger.kernel.org c9019e9: mm: memcontrol: eliminate raw access to stat and event counters
> Cc: stable@vger.kernel.org 2845426: mm: memcontrol: implement lruvec stat functions on top of each other
> Cc: stable@vger.kernel.org
> [shaoyi@amazon.com: resolved the conflict brought by commit 17ffa29c355658c8e9b19f56cbf0388500ca7905 in mm/memcontrol.c by contextual fix]
> Signed-off-by: Shaoying Xu <shaoyi@amazon.com>
> ---
> The excessive complexity in memory.stat reporting was fixed in v4.16, but
> the fix does not appear to have made it into 4.14 stable. Backporting this
> patch hits a small conflict with commit
> 17ffa29c355658c8e9b19f56cbf0388500ca7905 in
> free_mem_cgroup_per_node_info() of mm/memcontrol.c, which can be resolved
> with a contextual fix.
> 
> include/linux/memcontrol.h |  96 ++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++++----------------------------------
> mm/memcontrol.c            | 101 +++++++++++++++++++++++++++++++++++++++++++++++++++--------------------------------------------------
> 2 files changed, 113 insertions(+), 84 deletions(-)
> 
> diff --git a/include/linux/memcontrol.h b/include/linux/memcontrol.h
> index 1ffc54ac4cc9..882046863581 100644
> --- a/include/linux/memcontrol.h
> +++ b/include/linux/memcontrol.h
> @@ -108,7 +108,10 @@ struct lruvec_stat {
>   */
> struct mem_cgroup_per_node {
> 	struct lruvec		lruvec;
> -	struct lruvec_stat __percpu *lruvec_stat;
> +
> +	struct lruvec_stat __percpu *lruvec_stat_cpu;
> +	atomic_long_t		lruvec_stat[NR_VM_NODE_STAT_ITEMS];
> +
> 	unsigned long		lru_zone_size[MAX_NR_ZONES][NR_LRU_LISTS];
> 
> 	struct mem_cgroup_reclaim_iter	iter[DEF_PRIORITY + 1];


This patch is corrupted and cannot be applied :(

Please fix up and resend.

thanks,

greg k-h
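
For readers following the quoted commit message, here is a minimal userspace
sketch of the batching idea it describes: per-cpu counters only buffer
updates, and a shared atomic counter holds the value that readers see. It is
not the code from the patch. The names stat_counter, stat_mod, stat_read,
NR_CPUS and STAT_BATCH are hypothetical, and the batch of 32 pages is an
assumption chosen so that 32 pages of 4 KiB give the 128k per-cpu error
mentioned above. The explicit cpu argument stands in for the kernel's
this-cpu accessors.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdlib.h>

#define NR_CPUS    4   /* hypothetical CPU count for the sketch */
#define STAT_BATCH 32  /* assumed batch size, in pages */

struct stat_counter {
	atomic_long total;      /* shared value that readers consult */
	long percpu[NR_CPUS];   /* pending, not yet flushed deltas per CPU */
};

/* Apply a delta on one CPU; spill to the shared counter once the
 * pending amount exceeds the batch size in either direction. */
static void stat_mod(struct stat_counter *c, unsigned int cpu, long val)
{
	assert(cpu < NR_CPUS);

	long x = c->percpu[cpu] + val;

	if (labs(x) > STAT_BATCH) {
		atomic_fetch_add(&c->total, x);
		x = 0;
	}
	c->percpu[cpu] = x;
}

/* Read with a single atomic load instead of a walk over all CPUs.
 * Unflushed per-cpu deltas can leave the total briefly negative,
 * so clamp at zero. */
static long stat_read(struct stat_counter *c)
{
	long x = atomic_load(&c->total);

	return x < 0 ? 0 : x;
}
```

In this sketch a read costs one atomic load per stat item regardless of the
CPU count, at the price of an error of up to STAT_BATCH pages per CPU.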

Thread overview: 4+ messages
2020-12-16 22:14 [PATCH 4.14] mm: memcontrol: fix excessive complexity in memory.stat reporting Shaoying Xu
2020-12-19 12:38 ` Greg KH [this message]
  -- strict thread matches above, loose matches on Subject: below --
2020-12-21 19:35 Shaoying Xu
2020-12-28 11:31 ` Greg KH
