Re: [PATCH 1/2] mm: add per-order mTHP split counters

From: David Hildenbrand <david@redhat.com>
To: Barry Song <baohua@kernel.org>
Cc: Ryan Roberts <ryan.roberts@arm.com>,
	Lance Yang <ioworker0@gmail.com>,
	akpm@linux-foundation.org, baolin.wang@linux.alibaba.com,
	linux-kernel@vger.kernel.org, linux-mm@kvack.org
Subject: Re: [PATCH 1/2] mm: add per-order mTHP split counters
Date: Mon, 1 Jul 2024 14:21:05 +0200
Message-ID: <b3f11614-5e81-406e-b9f6-14db367af3b7@redhat.com>
In-Reply-To: <CAGsJ_4zz9KKpz51hgmLEv0v=rh1niB1DWqeEPrRrgRVO_0o+-A@mail.gmail.com>

On 01.07.24 13:43, Barry Song wrote:
> On Mon, Jul 1, 2024 at 8:56 PM David Hildenbrand <david@redhat.com> wrote:
>>
>> On 30.06.24 11:48, Barry Song wrote:
>>> On Thu, Apr 25, 2024 at 3:41 AM Ryan Roberts <ryan.roberts@arm.com> wrote:
>>>>
>>>> + Barry
>>>>
>>>> On 24/04/2024 14:51, Lance Yang wrote:
>>>>> At present, the split counters in THP statistics no longer include
>>>>> PTE-mapped mTHP. Therefore, this commit introduces per-order mTHP split
>>>>> counters to monitor the frequency of mTHP splits. This will assist
>>>>> developers in better analyzing and optimizing system performance.
>>>>>
>>>>> /sys/kernel/mm/transparent_hugepage/hugepages-<size>/stats
>>>>>           split_page
>>>>>           split_page_failed
>>>>>           deferred_split_page
>>>>>
>>>>> Signed-off-by: Lance Yang <ioworker0@gmail.com>
>>>>> ---
>>>>>    include/linux/huge_mm.h |  3 +++
>>>>>    mm/huge_memory.c        | 14 ++++++++++++--
>>>>>    2 files changed, 15 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/include/linux/huge_mm.h b/include/linux/huge_mm.h
>>>>> index 56c7ea73090b..7b9c6590e1f7 100644
>>>>> --- a/include/linux/huge_mm.h
>>>>> +++ b/include/linux/huge_mm.h
>>>>> @@ -272,6 +272,9 @@ enum mthp_stat_item {
>>>>>         MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE,
>>>>>         MTHP_STAT_ANON_SWPOUT,
>>>>>         MTHP_STAT_ANON_SWPOUT_FALLBACK,
>>>>> +     MTHP_STAT_SPLIT_PAGE,
>>>>> +     MTHP_STAT_SPLIT_PAGE_FAILED,
>>>>> +     MTHP_STAT_DEFERRED_SPLIT_PAGE,
>>>>>         __MTHP_STAT_COUNT
>>>>>    };
>>>>>
>>>>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>>>>> index 055df5aac7c3..52db888e47a6 100644
>>>>> --- a/mm/huge_memory.c
>>>>> +++ b/mm/huge_memory.c
>>>>> @@ -557,6 +557,9 @@ DEFINE_MTHP_STAT_ATTR(anon_fault_fallback, MTHP_STAT_ANON_FAULT_FALLBACK);
>>>>>    DEFINE_MTHP_STAT_ATTR(anon_fault_fallback_charge, MTHP_STAT_ANON_FAULT_FALLBACK_CHARGE);
>>>>>    DEFINE_MTHP_STAT_ATTR(anon_swpout, MTHP_STAT_ANON_SWPOUT);
>>>>>    DEFINE_MTHP_STAT_ATTR(anon_swpout_fallback, MTHP_STAT_ANON_SWPOUT_FALLBACK);
>>>>> +DEFINE_MTHP_STAT_ATTR(split_page, MTHP_STAT_SPLIT_PAGE);
>>>>> +DEFINE_MTHP_STAT_ATTR(split_page_failed, MTHP_STAT_SPLIT_PAGE_FAILED);
>>>>> +DEFINE_MTHP_STAT_ATTR(deferred_split_page, MTHP_STAT_DEFERRED_SPLIT_PAGE);
>>>>>
>>>>>    static struct attribute *stats_attrs[] = {
>>>>>         &anon_fault_alloc_attr.attr,
>>>>> @@ -564,6 +567,9 @@ static struct attribute *stats_attrs[] = {
>>>>>         &anon_fault_fallback_charge_attr.attr,
>>>>>         &anon_swpout_attr.attr,
>>>>>         &anon_swpout_fallback_attr.attr,
>>>>> +     &split_page_attr.attr,
>>>>> +     &split_page_failed_attr.attr,
>>>>> +     &deferred_split_page_attr.attr,
>>>>>         NULL,
>>>>>    };
>>>>>
>>>>> @@ -3083,7 +3089,7 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>>         XA_STATE_ORDER(xas, &folio->mapping->i_pages, folio->index, new_order);
>>>>>         struct anon_vma *anon_vma = NULL;
>>>>>         struct address_space *mapping = NULL;
>>>>> -     bool is_thp = folio_test_pmd_mappable(folio);
>>>>> +     int order = folio_order(folio);
>>>>>         int extra_pins, ret;
>>>>>         pgoff_t end;
>>>>>         bool is_hzp;
>>>>> @@ -3262,8 +3268,10 @@ int split_huge_page_to_list_to_order(struct page *page, struct list_head *list,
>>>>>                 i_mmap_unlock_read(mapping);
>>>>>    out:
>>>>>         xas_destroy(&xas);
>>>>> -     if (is_thp)
>>>>> +     if (order >= HPAGE_PMD_ORDER)
>>>>>                 count_vm_event(!ret ? THP_SPLIT_PAGE : THP_SPLIT_PAGE_FAILED);
>>>>> +     count_mthp_stat(order, !ret ? MTHP_STAT_SPLIT_PAGE :
>>>>> +                                   MTHP_STAT_SPLIT_PAGE_FAILED);
>>>>>         return ret;
>>>>>    }
>>>>>
>>>>> @@ -3327,6 +3335,8 @@ void deferred_split_folio(struct folio *folio)
>>>>>         if (list_empty(&folio->_deferred_list)) {
>>>>>                 if (folio_test_pmd_mappable(folio))
>>>>>                         count_vm_event(THP_DEFERRED_SPLIT_PAGE);
>>>>> +             count_mthp_stat(folio_order(folio),
>>>>> +                             MTHP_STAT_DEFERRED_SPLIT_PAGE);
>>>>
>>>> There is a very long conversation with Barry about adding a 'global "mTHP became
>>>> partially mapped 1 or more processes" counter (inc only)', which terminates at
>>>> [1]. There is a lot of discussion about the required semantics around the need
>>>> for partial map to cover alignment and contiguity as well as whether all pages
>>>> are mapped, and to trigger once it becomes partial in at least 1 process.
>>>>
>>>> MTHP_STAT_DEFERRED_SPLIT_PAGE is giving much simpler semantics, but less
>>>> information as a result. Barry, what's your view here? I'm guessing this doesn't
>>>> quite solve what you are looking for?
>>>
>>> This doesn't quite solve what I am looking for but I still think the
>>> patch has its value.
>>>
>>> I'm looking for a solution that can:
>>>
>>>     *  Count the amount of memory in the system for each mTHP size.
>>>     *  Determine how much memory for each mTHP size is partially unmapped.
>>>
>>> For example, in a system with 16GB of memory, we might find that we have 3GB
>>> of 64KB mTHP, and within that, 512MB is partially unmapped, potentially wasting
>>> memory at this moment.  I'm uncertain whether Lance is interested in
>>> this job :-)
>>>
>>> Counting deferred_split remains valuable as it can signal whether the system is
>>> experiencing significant partial unmapping.
>>
>> I'll note that, especially without subpage mapcounts, in the future we
>> won't have that information (how much is currently mapped) readily
>> available in all cases. To obtain that information on demand, we'd have
>> to scan page tables or walk the rmap.
> 
> I'd like to keep things simple. We can ignore the details about how
> the folio is partially unmapped. For example, whether 15 out of 16
> subpages are unmapped or just 1 is unmapped doesn't matter. When we add
> a folio to the deferred_list, we increase the count by 1. When we remove
> a folio from the deferred_list (for any reason, such as a real split),
> we decrease the count by 1.

Yes, that's valuable and not too complicated.
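
To make the proposed semantics concrete, here is a minimal userspace sketch of
such a per-order gauge. It is not kernel code and not part of the patch: the
names folio_sketch, nr_deferred, deferred_list_add(), deferred_list_del(),
deferred_bytes() and MAX_ORDER_SKETCH are all hypothetical, and locking and
per-CPU accounting are left out entirely. It only models "+1 when a folio
joins the deferred list, -1 when it leaves, regardless of how many subpages
were unmapped".

```c
#include <assert.h>
#include <stdbool.h>

#define MAX_ORDER_SKETCH 10	/* hypothetical upper bound on folio order */

/* Hypothetical stand-in for a folio: only the fields this sketch needs. */
struct folio_sketch {
	unsigned int order;
	bool on_deferred_list;
};

/* Per-order gauge: folios currently sitting on the deferred list. */
static long nr_deferred[MAX_ORDER_SKETCH + 1];

/*
 * Queue a folio for deferred split. It is counted once, no matter
 * whether 1 or 15 of its subpages were unmapped, and re-queueing an
 * already queued folio does not count again.
 */
static void deferred_list_add(struct folio_sketch *folio)
{
	assert(folio->order <= MAX_ORDER_SKETCH);
	if (folio->on_deferred_list)
		return;
	folio->on_deferred_list = true;
	nr_deferred[folio->order]++;
}

/* Remove a folio for any reason (real split, free, ...). */
static void deferred_list_del(struct folio_sketch *folio)
{
	if (!folio->on_deferred_list)
		return;
	folio->on_deferred_list = false;
	nr_deferred[folio->order]--;
}

/*
 * Bytes held in partially unmapped folios of one order. This covers
 * whole folios, so it is an upper bound on what a split could free.
 */
static unsigned long long deferred_bytes(unsigned int order,
					 unsigned long long page_size)
{
	return (unsigned long long)nr_deferred[order] * (page_size << order);
}
```

With 4KB base pages, two order-4 (64KB) folios on the list would be reported
by deferred_bytes(4, 4096) as 128KB, which is the kind of "how much memory of
this mTHP size is partially unmapped" number Barry describes above.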

-- 
Cheers,

David / dhildenb

Thread overview: 26+ messages
2024-04-24 13:51 [PATCH 0/2] mm: introduce per-order mTHP split counters Lance Yang
2024-04-24 13:51 ` [PATCH 1/2] mm: add " Lance Yang
2024-04-24 15:41   ` Ryan Roberts
2024-06-30  9:48     ` Barry Song
2024-06-30 11:34       ` Lance Yang
2024-07-01  8:16         ` Ryan Roberts
2024-07-01 11:00           ` Lance Yang
2024-08-08 21:27           ` Barry Song
2024-08-09  7:50             ` Ryan Roberts
2024-07-01  8:56       ` David Hildenbrand
2024-07-01 11:06         ` Lance Yang
2024-07-01 11:43         ` Barry Song
2024-07-01 12:21           ` David Hildenbrand [this message]
2024-04-24 17:12   ` Bang Li
2024-04-24 17:58     ` Bang Li
2024-04-25  4:47       ` Lance Yang
2024-04-24 19:44   ` Yang Shi
2024-04-25  5:13     ` Lance Yang
2024-04-24 13:51 ` [PATCH 2/2] mm: add docs for " Lance Yang
2024-04-24 15:34   ` Ryan Roberts
2024-04-25  5:26     ` Lance Yang
2024-04-24 15:00 ` [PATCH 0/2] mm: introduce " David Hildenbrand
2024-04-24 15:20   ` Ryan Roberts
2024-04-24 15:29     ` David Hildenbrand
2024-04-24 15:53       ` Lance Yang
2024-04-24 15:54     ` Lance Yang