Re: [PATCH v2 3/7] memcg: reduce memory for the lruvec and memcg stats

From: Roman Gushchin <roman.gushchin@linux.dev>
To: Shakeel Butt <shakeel.butt@linux.dev>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	Johannes Weiner <hannes@cmpxchg.org>,
	Michal Hocko <mhocko@kernel.org>,
	Muchun Song <muchun.song@linux.dev>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org
Subject: Re: [PATCH v2 3/7] memcg: reduce memory for the lruvec and memcg stats
Date: Mon, 29 Apr 2024 09:00:16 -0700
Message-ID: <Zi_EEOUS_iCh2Nfh@P9FQF9L96D>
In-Reply-To: <20240427003733.3898961-4-shakeel.butt@linux.dev>

On Fri, Apr 26, 2024 at 05:37:29PM -0700, Shakeel Butt wrote:
> At the moment, the amount of memory allocated for stats related structs
> in the mem_cgroup corresponds to the size of enum node_stat_item.
> However, not all fields in enum node_stat_item have corresponding memcg
> stats. So, let's use an indirection mechanism similar to the one used for
> memcg vmstats management.
> 
> For a given x86_64 config, the size of stats with and without patch is:
> 
> structs size in bytes         w/o     with
> 
> struct lruvec_stats           1128     648
> struct lruvec_stats_percpu     752     432
> struct memcg_vmstats          1832    1352
> struct memcg_vmstats_percpu   1280     960
> 
> The memory savings are further compounded by the fact that these structs
> are allocated for each cpu and for each node. To be precise, for each
> memcg the memory saved would be:
> 
> Memory saved = ((21 * 3 * NR_NODES) + (21 * 2 * NR_NODES * NR_CPUS) +
> 	       (21 * 3) + (21 * 2 * NR_CPUS)) * sizeof(long)
> 
> Where 21 is the number of fields eliminated.

Nice savings!
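
To put a number on the formula quoted above, here is a minimal user-space
sketch that evaluates it. The function name and the node/CPU counts passed
to it are hypothetical and not part of the patch; only the constant 21 and
the shape of the formula come from the commit message.

```c
#include <stddef.h>

/* Number of stat fields the patch eliminates, per the commit message. */
#define ELIMINATED_FIELDS 21

/*
 * Bytes saved per memcg, following the commit message formula.
 * nr_nodes and nr_cpus are illustrative inputs chosen by the caller.
 */
static size_t memcg_stats_bytes_saved(size_t nr_nodes, size_t nr_cpus)
{
	size_t longs = (ELIMINATED_FIELDS * 3 * nr_nodes) +
		       (ELIMINATED_FIELDS * 2 * nr_nodes * nr_cpus) +
		       (ELIMINATED_FIELDS * 3) +
		       (ELIMINATED_FIELDS * 2 * nr_cpus);

	return longs * sizeof(long);
}
```

With made-up values of 2 nodes and 64 CPUs this comes to 8253 longs, which
is 66024 bytes per memcg where sizeof(long) is 8.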

> 
> Signed-off-by: Shakeel Butt <shakeel.butt@linux.dev>
> ---
>  mm/memcontrol.c | 138 ++++++++++++++++++++++++++++++++++++++++--------
>  1 file changed, 115 insertions(+), 23 deletions(-)
> 
> diff --git a/mm/memcontrol.c b/mm/memcontrol.c
> index 5e337ed6c6bf..c164bc9b8ed6 100644
> --- a/mm/memcontrol.c
> +++ b/mm/memcontrol.c
> @@ -576,35 +576,105 @@ mem_cgroup_largest_soft_limit_node(struct mem_cgroup_tree_per_node *mctz)
>  	return mz;
>  }
>  
> +/* Subset of node_stat_item for memcg stats */
> +static const unsigned int memcg_node_stat_items[] = {
> +	NR_INACTIVE_ANON,
> +	NR_ACTIVE_ANON,
> +	NR_INACTIVE_FILE,
> +	NR_ACTIVE_FILE,
> +	NR_UNEVICTABLE,
> +	NR_SLAB_RECLAIMABLE_B,
> +	NR_SLAB_UNRECLAIMABLE_B,
> +	WORKINGSET_REFAULT_ANON,
> +	WORKINGSET_REFAULT_FILE,
> +	WORKINGSET_ACTIVATE_ANON,
> +	WORKINGSET_ACTIVATE_FILE,
> +	WORKINGSET_RESTORE_ANON,
> +	WORKINGSET_RESTORE_FILE,
> +	WORKINGSET_NODERECLAIM,
> +	NR_ANON_MAPPED,
> +	NR_FILE_MAPPED,
> +	NR_FILE_PAGES,
> +	NR_FILE_DIRTY,
> +	NR_WRITEBACK,
> +	NR_SHMEM,
> +	NR_SHMEM_THPS,
> +	NR_FILE_THPS,
> +	NR_ANON_THPS,
> +	NR_KERNEL_STACK_KB,
> +	NR_PAGETABLE,
> +	NR_SECONDARY_PAGETABLE,
> +#ifdef CONFIG_SWAP
> +	NR_SWAPCACHE,
> +#endif
> +};
> +
> +static const unsigned int memcg_stat_items[] = {
> +	MEMCG_SWAP,
> +	MEMCG_SOCK,
> +	MEMCG_PERCPU_B,
> +	MEMCG_VMALLOC,
> +	MEMCG_KMEM,
> +	MEMCG_ZSWAP_B,
> +	MEMCG_ZSWAPPED,
> +};
> +
> +#define NR_MEMCG_NODE_STAT_ITEMS ARRAY_SIZE(memcg_node_stat_items)
> +#define NR_MEMCG_STATS (NR_MEMCG_NODE_STAT_ITEMS + ARRAY_SIZE(memcg_stat_items))
> +static int8_t mem_cgroup_stats_index[MEMCG_NR_STAT] __read_mostly;
> +
> +static void init_memcg_stats(void)
> +{
> +	int8_t i, j = 0;
> +
> +	/* Switch to short once this failure occurs. */
> +	BUILD_BUG_ON(NR_MEMCG_STATS >= 127 /* INT8_MAX */);
> +
> +	for (i = 0; i < NR_MEMCG_NODE_STAT_ITEMS; ++i)
> +		mem_cgroup_stats_index[memcg_node_stat_items[i]] = ++j;
> +
> +	for (i = 0; i < ARRAY_SIZE(memcg_stat_items); ++i)
> +		mem_cgroup_stats_index[memcg_stat_items[i]] = ++j;
> +}
> +
> +static inline int memcg_stats_index(int idx)
> +{
> +	return mem_cgroup_stats_index[idx] - 1;
> +}

Hm, I'm slightly worried about the performance penalty from the increased cache
footprint. Can't we use a formula to translate idx to memcg_idx instead of
a translation table?
If that requires re-arranging the items, we can add a translation table on the
read side to preserve the order visible in procfs/sysfs.
Or am I overthinking this, and the real difference is negligible?
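
To make the two options concrete, here is a rough user-space sketch that
contrasts a lookup table like the one in the patch with a formula over a
re-arranged enum. Everything named demo_* or SORTED_* is hypothetical and
only stands in for node_stat_item and the memcg index helpers; it is an
illustration under those assumptions, not a proposed implementation.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for enum node_stat_item: six items, three tracked. */
enum demo_item { DEMO_A, DEMO_B, DEMO_C, DEMO_D, DEMO_E, DEMO_F, DEMO_NR_ITEMS };

/* Table approach, as in the patch: list the tracked subset explicitly. */
static const unsigned int demo_tracked[] = { DEMO_B, DEMO_D, DEMO_E };
#define DEMO_NR_TRACKED (sizeof(demo_tracked) / sizeof(demo_tracked[0]))

/* Zero means "not tracked", so stored slots are offset by one. */
static int8_t demo_index_table[DEMO_NR_ITEMS];

static void demo_init_table(void)
{
	size_t i;

	for (i = 0; i < DEMO_NR_TRACKED; i++)
		demo_index_table[demo_tracked[i]] = (int8_t)(i + 1);
}

/* One extra memory load per lookup; returns -1 for untracked items. */
static int demo_table_index(int idx)
{
	return demo_index_table[idx] - 1;
}

/*
 * Formula approach: a hypothetical re-arranged enum in which the tracked
 * items come first, so the compact slot is the enum value itself.
 */
enum demo_sorted_item {
	SORTED_B, SORTED_D, SORTED_E,
	SORTED_NR_TRACKED,
	SORTED_A = SORTED_NR_TRACKED, SORTED_C, SORTED_F,
	SORTED_NR_ITEMS
};

/* No table, just a range check; returns -1 for untracked items. */
static int demo_formula_index(int idx)
{
	return (idx >= 0 && idx < SORTED_NR_TRACKED) ? idx : -1;
}
```

The table version touches one more cache line on the update path, while the
formula version avoids that load but ties the compact layout to the enum
order, which is why the read side might then need its own ordering table.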

Thanks!
