From: Barry Song <baohua@kernel.org>
To: Ryan Roberts <ryan.roberts@arm.com>
Cc: Lance Yang <ioworker0@gmail.com>,
akpm@linux-foundation.org, david@redhat.com,
baolin.wang@linux.alibaba.com, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm: add per-order mTHP split counters
Date: Fri, 9 Aug 2024 09:27:07 +1200 [thread overview]
Message-ID: <CAGsJ_4zO52_xGpbJ5MrJLqoxbGShgrwLChWCcr8O6Q6oHzceDw@mail.gmail.com> (raw)
In-Reply-To: <71fdab06-0442-4c55-811b-b38d3b024c85@arm.com>
On Mon, Jul 1, 2024 at 8:16 PM Ryan Roberts <ryan.roberts@arm.com> wrote:
>
> On 30/06/2024 12:34, Lance Yang wrote:
> > Hi Barry,
> >
> > Thanks for following up!
> >
> > On Sun, Jun 30, 2024 at 5:48 PM Barry Song <baohua@kernel.org> wrote:
> >>
> >> On Thu, Apr 25, 2024 at 3:41 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
> >>>
> >>> + Barry
> >>>
> >>> On 24/04/2024 14:51, Lance Yang wrote:
> >>>> At present, the split counters in THP statistics no longer include
> >>>> PTE-mapped mTHP. Therefore, this commit introduces per-order mTHP split
> >>>> counters to monitor the frequency of mTHP splits. This will assist
> >>>> developers in better analyzing and optimizing system performance.
> >>>>
> >>>> /sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats
> >>>> split_page
> >>>> split_page_failed
> >>>> deferred_split_page
> >>>>
> >>>> Signed-off-by: Lance Yang <ioworker0@gmail.com>
> >>>> ---
> >>>> include/linux/huge_mm.h | 3 +++
> >>>> mm/huge_memory.c | 14 ++++++++++++--
> >>>> 2 files changed, 15 insertions(+), 2 deletions(-)
> >>>>
> >>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
> >>>> index 56c7ea73090b..7b9c6590e1f7 100644
> >>>> --- a/include/linux/huge_mm.h
> >>>> +++ b/include/linux/huge_mm.h
> >>>> @@ -272,6 +272,9 @@ enum mthp_stat_item {
> >>>> MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
> >>>> MTHP_STAT_ANON_SWPOUT,
> >>>> MTHP_STAT_ANON_SWPOUT_FALLBACK,
> >>>> + MTHP_STAT_SPLIT_PAGE,
> >>>> + MTHP_STAT_SPLIT_PAGE_FAILED,
> >>>> + MTHP_STAT_DEFERRED_SPLIT_PAGE,
> >>>> __MTHP_STAT_COUNT
> >>>> };
> >>>>
> >>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> >>>> index 055df5aac7c3..52db888e47a6 100644
> >>>> --- a/mm/huge_memory.c
> >>>> +++ b/mm/huge_memory.c
> >>>> @@ -557,6 +557,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
> >>>> DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
> >>>> DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
> >>>> DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page, MTHP_STAT_SPLIT_PAGE);
> >>>> +DEFINE_MTHP_STAT_ATTR(split_page_failed, MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>> +DEFINE_MTHP_STAT_ATTR(deferred_split_page, MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>>
> >>>> static struct attribute *stats_attrs[] = {
> >>>> &anon_fault_alloc_attr.attr,
> >>>> @@ -564,6 +567,9 @@ static struct attribute *stats_attrs[] = {
> >>>> &anon_fault_fallback_charge_attr.attr,
> >>>> &anon_swpout_attr.attr,
> >>>> &anon_swpout_fallback_attr.attr,
> >>>> + &split_page_attr.attr,
> >>>> + &split_page_failed_attr.attr,
> >>>> + &deferred_split_page_attr.attr,
> >>>> NULL,
> >>>> };
> >>>>
> >>>> @@ -3083,7 +3089,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>> XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
> >>>> struct anon_vma *anon_vma = NULL;
> >>>> struct address_space *mapping = NULL;
> >>>> - bool is_thp = folio_test_pmd_mappable(folio);
> >>>> + int order = folio_order(folio);
> >>>> int extra_pins, ret;
> >>>> pgoff_t end;
> >>>> bool is_hzp;
> >>>> @@ -3262,8 +3268,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
> >>>> i_mmap_unlock_read(mapping);
> >>>> out:
> >>>> xas_destroy(&xas);
> >>>> - if (is_thp)
> >>>> + if (order >= HPAGE_PMD_ORDER)
> >>>> count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
> >>>> + count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT_PAGE :
> >>>> + MTHP_STAT_SPLIT_PAGE_FAILED);
> >>>> return ret;
> >>>> }
> >>>>
> >>>> @@ -3327,6 +3335,8 @@ void deferred_split_folio(struct folio *folio)
> >>>> if (list_empty(&folio->_deferred_list)) {
> >>>> if (folio_test_pmd_mappable(folio))
> >>>> count_vm_event(THP_DEFERRED_SPLIT_PAGE);
> >>>> + count_mthp_stat(folio_order(folio),
> >>>> + MTHP_STAT_DEFERRED_SPLIT_PAGE);
> >>>
> >>> There is a very long conversation with Barry about adding a 'global "mTHP became
> >>> partially mapped 1 or more processes" counter (inc only)', which terminates at
> >>> [1]. There is a lot of discussion about the required semantics around the need
> >>> for partial map to cover alignment and contiguity as well as whether all pages
> >>> are mapped, and to trigger once it becomes partial in at least 1 process.
> >>>
> >>> MTHP_STAT_DEFERRED_SPLIT_PAGE is giving much simpler semantics, but less
> >>> information as a result. Barry, what's your view here? I'm guessing this doesn't
> >>> quite solve what you are looking for?
> >>
> >> This doesn't quite solve what I am looking for but I still think the
> >> patch has its value.
> >>
> >> I'm looking for a solution that can:
> >>
> >> * Count the amount of memory in the system for each mTHP size.
> >> * Determine how much memory for each mTHP size is partially unmapped.
> >>
> >> For example, in a system with 16GB of memory, we might find that we have 3GB
> >> of 64KB mTHP, and within that, 512MB is partially unmapped, potentially wasting
> >> memory at this moment. I'm uncertain whether Lance is interested in
> >> this job :-)
> >
> > Nice, that's an interesting/valuable job for me ;)
> >
> > Let's do it separately, as 'split' and friends probably can’t be the
> > solution you
> > mentioned above, IMHO.
> >
> > Hmm... I don't have a good idea about the solution for now, but will
> > think it over
> > and come back to discuss it here.
>
> I have a grad starting in a couple of weeks and I had been planning to initially
> ask him to look at this to help him get up to speed on mTHP/mm stuff. But I have
> plenty of other things for him to do if Lance wants to take this :)
Hi Ryan, Lance,
My performance profiling is pending on the mTHP size and partially
unmapped mTHP size issues (understanding the distribution of folio
sizes within the system), so I'm not waiting for either Ryan's grad
or Lance. I've sent an RFC for this, and both of you are CC'd:
https://lore.kernel.org/all/20240808010457.228753-1-21cnbao@gmail.com/
Apologies for not waiting. You are still warmly welcomed to participate
in the discussion and review.
>
> >
> >>
> >> Counting deferred_split remains valuable as it can signal whether the system is
> >> experiencing significant partial unmapping.
> >
> > Have a nice weekend!
> > Lance
> >
> >>
> >>>
> >>> [1] https://lore.kernel.org/linux-mm/6cc7d781-884f-4d8f-a175-8609732b87eb@arm.com/
> >>>
> >>> Thanks,
> >>> Ryan
> >>>
> >>>> list_add_tail(&folio->_deferred_list, &ds_queue->split_queue);
> >>>> ds_queue->split_queue_len++;
> >>>> #ifdef CONFIG_MEMCG
> >>>
> >>
Thanks
Barry
next prev parent reply other threads:[~2024-08-08 21:27 UTC|newest]
Thread overview: 26+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-04-24 13:51 [PATCH 0/2] mm: introduce per-order mTHP split counters Lance Yang
2024-04-24 13:51 ` [PATCH 1/2] mm: add " Lance Yang
2024-04-24 15:41 ` Ryan Roberts
2024-06-30 9:48 ` Barry Song
2024-06-30 11:34 ` Lance Yang
2024-07-01 8:16 ` Ryan Roberts
2024-07-01 11:00 ` Lance Yang
2024-08-08 21:27 ` Barry Song [this message]
2024-08-09 7:50 ` Ryan Roberts
2024-07-01 8:56 ` David Hildenbrand
2024-07-01 11:06 ` Lance Yang
2024-07-01 11:43 ` Barry Song
2024-07-01 12:21 ` David Hildenbrand
2024-04-24 17:12 ` Bang Li
2024-04-24 17:58 ` Bang Li
2024-04-25 4:47 ` Lance Yang
2024-04-24 19:44 ` Yang Shi
2024-04-25 5:13 ` Lance Yang
2024-04-24 13:51 ` [PATCH 2/2] mm: add docs for " Lance Yang
2024-04-24 15:34 ` Ryan Roberts
2024-04-25 5:26 ` Lance Yang
2024-04-24 15:00 ` [PATCH 0/2] mm: introduce " David Hildenbrand
2024-04-24 15:20 ` Ryan Roberts
2024-04-24 15:29 ` David Hildenbrand
2024-04-24 15:53 ` Lance Yang
2024-04-24 15:54 ` Lance Yang
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=CAGsJ_4zO52_xGpbJ5MrJLqoxbGShgrwLChWCcr8O6Q6oHzceDw@mail.gmail.com \
--to=baohua@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=baolin.wang@linux.alibaba.com \
--cc=david@redhat.com \
--cc=ioworker0@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=ryan.roberts@arm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).