Re: [PATCH RFC 1/2] mm: collect the number of anon large folios

From: Ryan Roberts <ryan.roberts@arm.com>
To: Barry Song <21cnbao@gmail.com>,
	akpm@linux-foundation.org, linux-mm@kvack.org
Cc: chrisl@kernel.org, david@redhat.com, kaleshsingh@google.com,
	kasong@tencent.com, linux-kernel@vger.kernel.org,
	ioworker0@gmail.com, baolin.wang@linux.alibaba.com,
	ziy@nvidia.com, hanchuanhua@oppo.com,
	Barry Song <v-songbaohua@oppo.com>
Subject: Re: [PATCH RFC 1/2] mm: collect the number of anon large folios
Date: Fri, 9 Aug 2024 09:13:46 +0100
Message-ID: <e9f82fd8-e1da-49ea-a735-b174575c02bc@arm.com>
In-Reply-To: <20240808010457.228753-2-21cnbao@gmail.com>

On 08/08/2024 02:04, Barry Song wrote:
> From: Barry Song <v-songbaohua@oppo.com>
> 
> When a new anonymous mTHP is added to the rmap, we increase the count.
> We reduce the count whenever an mTHP is completely unmapped.
> 
> Signed-off-by: Barry Song <v-songbaohua@oppo.com>
> ---
>  Documentation/admin-guide/mm/transhuge.rst |  5 +++++
>  include/linux/huge_mm.h                    | 15 +++++++++++++--
>  mm/huge_memory.c                           |  2 ++
>  mm/rmap.c                                  |  3 +++
>  4 files changed, 23 insertions(+), 2 deletions(-)
> 
> diff --git a/Documentation/admin-guide/mm/transhuge.rst b/Documentation/admin-guide/mm/transhuge.rst
> index 058485daf186..715f181543f6 100644
> --- a/Documentation/admin-guide/mm/transhuge.rst
> +++ b/Documentation/admin-guide/mm/transhuge.rst
> @@ -527,6 +527,11 @@ split_deferred
>          it would free up some memory. Pages on split queue are going to
>          be split under memory pressure, if splitting is possible.
>  
> +anon_num
> +       the number of anon huge pages we have in the whole system.
> +       These huge pages could be still entirely mapped and have partially
> +       unmapped and unused subpages.

nit: "entirely mapped and have partially unmapped and unused subpages" ->
"entirely mapped or have partially unmapped/unused subpages"

> +
>  As the system ages, allocating huge pages may be expensive as the
>  system uses memory compaction to copy data around memory to free a
>  huge page for use. There are some counters in ``/proc/vmstat`` to help
> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> index e25d9ebfdf89..294c348fe3cc 100644
> --- a/include/linux/huge_mm.h
> +++ b/include/linux/huge_mm.h
> @@ -281,6 +281,7 @@ enum mthp_stat_item {
>  	MTHP_STAT_SPLIT,
>  	MTHP_STAT_SPLIT_FAILED,
>  	MTHP_STAT_SPLIT_DEFERRED,
> +	MTHP_STAT_NR_ANON,
>  	__MTHP_STAT_COUNT
>  };
>  
> @@ -291,14 +292,24 @@ struct mthp_stat {
>  #ifdef CONFIG_SYSFS
>  DECLARE_PER_CPU(struct mthp_stat, mthp_stats);
>  
> -static inline void count_mthp_stat(int order, enum mthp_stat_item item)
> +static inline void mod_mthp_stat(int order, enum mthp_stat_item item, int delta)
>  {
>  	if (order <= 0 || order > PMD_ORDER)
>  		return;
>  
> -	this_cpu_inc(mthp_stats.stats[order][item]);
> +	this_cpu_add(mthp_stats.stats[order][item], delta);
> +}
> +
> +static inline void count_mthp_stat(int order, enum mthp_stat_item item)
> +{
> +	mod_mthp_stat(order, item, 1);
>  }
> +
>  #else
> +static inline void mod_mthp_stat(int order, enum mthp_stat_item item, int delta)
> +{
> +}
> +
>  static inline void count_mthp_stat(int order, enum mthp_stat_item item)
>  {
>  }
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 697fcf89f975..b6bc2a3791e3 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -578,6 +578,7 @@ DEFINE_MTHP_STAT_ATTR(shmem_fallback_charge, MTHP_STAT_SHMEM_FALLBACK_CHARGE);
>  DEFINE_MTHP_STAT_ATTR(split, MTHP_STAT_SPLIT);
>  DEFINE_MTHP_STAT_ATTR(split_failed, MTHP_STAT_SPLIT_FAILED);
>  DEFINE_MTHP_STAT_ATTR(split_deferred, MTHP_STAT_SPLIT_DEFERRED);
> +DEFINE_MTHP_STAT_ATTR(anon_num, MTHP_STAT_NR_ANON);
>  
>  static struct attribute *stats_attrs[] = {
>  	&anon_fault_alloc_attr.attr,
> @@ -591,6 +592,7 @@ static struct attribute *stats_attrs[] = {
>  	&split_attr.attr,
>  	&split_failed_attr.attr,
>  	&split_deferred_attr.attr,
> +	&anon_num_attr.attr,
>  	NULL,
>  };
>  
> diff --git a/mm/rmap.c b/mm/rmap.c
> index 901950200957..2b722f26224c 100644
> --- a/mm/rmap.c
> +++ b/mm/rmap.c
> @@ -1467,6 +1467,7 @@ void folio_add_new_anon_rmap(struct folio *folio, struct vm_area_struct *vma,
>  	}
>  
>  	__folio_mod_stat(folio, nr, nr_pmdmapped);
> +	mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, 1);
>  }
>  
>  static __always_inline void __folio_add_file_rmap(struct folio *folio,
> @@ -1582,6 +1583,8 @@ static __always_inline void __folio_remove_rmap(struct folio *folio,
>  	    list_empty(&folio->_deferred_list))
>  		deferred_split_folio(folio);
>  	__folio_mod_stat(folio, -nr, -nr_pmdmapped);
> +	if (folio_test_anon(folio) && !atomic_read(mapped))

Agree that atomic_read() is dodgy here.
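
As a rough illustration only: a userspace sketch of the usual read-then-act
problem, assuming that is the concern with atomic_read() here (this mail does
not spell it out). It uses C11 atomics rather than the kernel's atomic_t, and
the struct and function names are hypothetical:

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdbool.h>

/* Hypothetical stand-in for a folio's mapped-page counter. */
struct unmap_model {
	atomic_int mapped;
};

/*
 * Fragile: the decrement and the check are two separate operations, so
 * another thread can map or unmap in between, and the load may observe
 * a value unrelated to this caller's own decrement.
 */
static bool model_unmap_racy(struct unmap_model *m, int nr)
{
	atomic_fetch_sub(&m->mapped, nr);
	return atomic_load(&m->mapped) == 0;
}

/*
 * Sturdier: decide from the value returned by the decrement itself, so
 * each transition to zero is observed by exactly one caller.
 */
static bool model_unmap_from_return(struct unmap_model *m, int nr)
{
	int old = atomic_fetch_sub(&m->mapped, nr);

	assert(old >= nr);	/* never unmap more than is mapped */
	return old == nr;
}
```

Single-threaded, both variants give the same answer; they can only differ when
another thread changes the counter between the subtract and the load.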

Not sure I fully understand why David prefers to do the unaccounting at
free-time though? It feels unbalanced to me to increment when first mapped but
decrement when freed. Surely it's safer to use either alloc/free or first
map/last unmap?

If using alloc/free, isn't there a THP constructor/destructor that prepares the
deferred list? (My memory may be failing me.) Could we use that?
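
To make the two balanced pairings concrete, here is a toy userspace model. It
is a sketch under stated assumptions, not kernel code: every name below is
hypothetical, the counter stands in for a single order's MTHP_STAT_NR_ANON,
and a caller is assumed to pick one pairing and stick with it:

```c
#include <assert.h>

/* Hypothetical toy model; none of these names exist in the kernel. */
struct toy_folio {
	int mapcount;
};

static long toy_nr_anon;	/* stands in for one order's MTHP_STAT_NR_ANON */

/* Pairing A: account at allocation, unaccount at free. */
static void toy_alloc_accounted(struct toy_folio *f)
{
	f->mapcount = 0;
	toy_nr_anon++;
}

static void toy_free_accounted(struct toy_folio *f)
{
	assert(f->mapcount == 0);	/* only unmapped folios are freed */
	toy_nr_anon--;
}

/* Pairing B: account at first map, unaccount at last unmap. */
static void toy_map(struct toy_folio *f)
{
	if (f->mapcount++ == 0)
		toy_nr_anon++;
}

static void toy_unmap(struct toy_folio *f)
{
	assert(f->mapcount > 0);
	if (--f->mapcount == 0)
		toy_nr_anon--;
}
```

In this model, mixing the pairings (incrementing in toy_map() but decrementing
in toy_free_accounted()) would drive the counter negative for a folio that is
freed without ever having been mapped, unless the free path can tell whether
the folio was accounted in the first place.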

> +		mod_mthp_stat(folio_order(folio), MTHP_STAT_NR_ANON, -1);
>  
>  	/*
>  	 * It would be tidy to reset folio_test_anon mapping when fully
