From: Lance Yang <lance.yang@linux.dev>
To: David Hildenbrand <david@redhat.com>
Cc: "Qun-wei Lin (林群崴)" <Qun-wei.Lin@mediatek.com>,
"catalin.marinas@arm.com" <catalin.marinas@arm.com>,
"usamaarif642@gmail.com" <usamaarif642@gmail.com>,
"linux-mm@kvack.org" <linux-mm@kvack.org>,
"yuzhao@google.com" <yuzhao@google.com>,
"akpm@linux-foundation.org" <akpm@linux-foundation.org>,
"corbet@lwn.net" <corbet@lwn.net>,
"Andrew Yang (楊智強)" <Andrew.Yang@mediatek.com>,
"npache@redhat.com" <npache@redhat.com>,
"rppt@kernel.org" <rppt@kernel.org>,
"willy@infradead.org" <willy@infradead.org>,
"kernel-team@meta.com" <kernel-team@meta.com>,
"roman.gushchin@linux.dev" <roman.gushchin@linux.dev>,
"hannes@cmpxchg.org" <hannes@cmpxchg.org>,
"cerasuolodomenico@gmail.com" <cerasuolodomenico@gmail.com>,
"linux-kernel@vger.kernel.org" <linux-kernel@vger.kernel.org>,
"ryncsn@gmail.com" <ryncsn@gmail.com>,
"surenb@google.com" <surenb@google.com>,
"riel@surriel.com" <riel@surriel.com>,
"shakeel.butt@linux.dev" <shakeel.butt@linux.dev>,
"Chinwen Chang (張錦文)" <chinwen.chang@mediatek.com>,
"linux-doc@vger.kernel.org" <linux-doc@vger.kernel.org>,
"Casper Li (李中榮)" <casper.li@mediatek.com>,
"ryan.roberts@arm.com" <ryan.roberts@arm.com>,
"linux-mediatek@lists.infradead.org"
<linux-mediatek@lists.infradead.org>,
"baohua@kernel.org" <baohua@kernel.org>,
"kaleshsingh@google.com" <kaleshsingh@google.com>,
"zhais@google.com" <zhais@google.com>,
"linux-arm-kernel@lists.infradead.org"
<linux-arm-kernel@lists.infradead.org>
Subject: Re: [PATCH v5 2/6] mm: remap unused subpages to shared zeropage when splitting isolated thp
Date: Fri, 19 Sep 2025 18:53:05 +0800 [thread overview]
Message-ID: <dfb303b9-151b-49ec-b7ef-691c71bd5385@linux.dev> (raw)
In-Reply-To: <4cf41cd5-e93a-412b-b209-4180bd2d4015@linux.dev>
On 2025/9/19 16:14, Lance Yang wrote:
>
>
> On 2025/9/19 15:55, David Hildenbrand wrote:
>>>> I think where possible we really only want to identify problematic
>>>> (tagged) pages and skip them. And we should either look into fixing KSM
>>>> as well or finding out why KSM is not affected.
>>>
>>> Yeah. Seems like we could introduce a new helper,
>>> folio_test_mte_tagged(struct
>>> folio *folio). By default, it would return false, and architectures like
>>> arm64
>>> can override it.
>>
>> If we add a new helper it should instead express the semantics that we
>> cannot deduplicate.
>
> Agreed.
>
>>
>> For THP, I recall that only some pages might be tagged. So likely we
>> want to check per page.
>
> Yes, a per-page check would be simpler.
>
>>
>>>
>>> Looking at the code, the PG_mte_tagged flag is not set for regular THP.
>>
>> I think it's supported for THP per page. Only for hugetlb we tag the
>> whole thing through the head page instead of individual pages.
>
> Right. That's exactly what I meant.
>
>>
>>> The MTE
>>> status actually comes from the VM_MTE flag in the VMA that maps it.
>>>
>>
>> During the rmap walk we could check the VMA flag, but there would be
>> no way to just stop the THP shrinker scanning this page early.
>>
>>> static inline bool folio_test_hugetlb_mte_tagged(struct folio *folio)
>>> {
>>> bool ret = test_bit(PG_mte_tagged, &folio->flags.f);
>>>
>>> VM_WARN_ON_ONCE(!folio_test_hugetlb(folio));
>>>
>>> /*
>>> * If the folio is tagged, ensure ordering with a likely subsequent
>>> * read of the tags.
>>> */
>>> if (ret)
>>> smp_rmb();
>>> return ret;
>>> }
>>>
>>> static inline bool page_mte_tagged(struct page *page)
>>> {
>>> bool ret = test_bit(PG_mte_tagged, &page->flags.f);
>>>
>>> VM_WARN_ON_ONCE(folio_test_hugetlb(page_folio(page)));
>>>
>>> /*
>>> * If the page is tagged, ensure ordering with a likely subsequent
>>> * read of the tags.
>>> */
>>> if (ret)
>>> smp_rmb();
>>> return ret;
>>> }
>>>
>>> contpte_set_ptes()
>>> __set_ptes()
>>> __set_ptes_anysz()
>>> __sync_cache_and_tags()
>>> mte_sync_tags()
>>> set_page_mte_tagged()
>>>
>>> Then, having the THP shrinker skip any folios that are identified as
>>> MTE-tagged.
>>
>> Likely we should just do something like (maybe we want better naming)
>>
>> #ifndef page_is_mergable
>> #define page_is_mergable(page) (true)
>> #endif
>
>
> Maybe something like page_is_optimizable()? Just a thought ;p
>
>>
>> And for arm64 have it be
>>
>> #define page_is_mergable(page) (!page_mte_tagged(page))
>>
>>
>> And then do
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index 1f0813b956436..1cac9093918d6 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -4251,7 +4251,8 @@ static bool thp_underused(struct folio *folio)
>>
>> for (i = 0; i < folio_nr_pages(folio); i++) {
>> kaddr = kmap_local_folio(folio, i * PAGE_SIZE);
>> - if (!memchr_inv(kaddr, 0, PAGE_SIZE)) {
>> + if (page_is_mergable(folio_page(folio, i)) &&
>> + !memchr_inv(kaddr, 0, PAGE_SIZE)) {
>> num_zero_pages++;
>> if (num_zero_pages > khugepaged_max_ptes_none) {
>> kunmap_local(kaddr);
>> diff --git a/mm/migrate.c b/mm/migrate.c
>> index 946253c398072..476a9a9091bd3 100644
>> --- a/mm/migrate.c
>> +++ b/mm/migrate.c
>> @@ -306,6 +306,8 @@ static bool try_to_map_unused_to_zeropage(struct
>> page_vma_mapped_walk *pvmw,
>>
>> if (PageCompound(page))
>> return false;
>> + if (!page_is_mergable(page))
>> + return false;
>> VM_BUG_ON_PAGE(!PageAnon(page), page);
>> VM_BUG_ON_PAGE(!PageLocked(page), page);
>> VM_BUG_ON_PAGE(pte_present(ptep_get(pvmw->pte)), page);
>
> Looks good to me!
>
>>
>>
>> For KSM, similarly just bail out early. But still wondering if this is
>> already checked
>> somehow for KSM.
>
> +1 I'm looking for a machine to test it on.
Interestingly, it seems KSM is already skipping MTE-tagged pages. My test,
running on a v6.8.0 kernel inside QEMU (with MTE enabled), shows no merging
activity for those pages ...
next prev parent reply other threads:[~2025-09-19 10:53 UTC|newest]
Thread overview: 33+ messages / expand[flat|nested] mbox.gz Atom feed top
2024-08-30 10:03 [PATCH v5 0/6] mm: split underused THPs Usama Arif
2024-08-30 10:03 ` [PATCH v5 1/6] mm: free zapped tail pages when splitting isolated thp Usama Arif
2024-09-05 8:46 ` Hugh Dickins
2024-09-05 10:21 ` Usama Arif
2024-09-05 18:05 ` Hugh Dickins
2024-09-05 19:24 ` Usama Arif
2024-08-30 10:03 ` [PATCH v5 2/6] mm: remap unused subpages to shared zeropage " Usama Arif
2024-10-23 16:21 ` Zi Yan
2024-10-23 16:50 ` Usama Arif
2024-10-23 16:55 ` Zi Yan
2024-10-23 16:56 ` Yu Zhao
2025-09-18 8:53 ` Qun-wei Lin (林群崴)
2025-09-18 8:56 ` David Hildenbrand
2025-09-18 11:42 ` Usama Arif
2025-09-18 11:57 ` David Hildenbrand
2025-09-18 12:22 ` Lance Yang
2025-09-18 12:25 ` Lance Yang
2025-09-18 12:35 ` David Hildenbrand
2025-09-19 5:16 ` Lance Yang
2025-09-19 7:55 ` David Hildenbrand
2025-09-19 8:14 ` Lance Yang
2025-09-19 10:53 ` Lance Yang [this message]
2025-09-19 12:19 ` Lance Yang
2025-09-19 12:44 ` Lance Yang
2025-09-19 13:09 ` David Hildenbrand
2025-09-19 13:24 ` Lance Yang
2024-08-30 10:03 ` [PATCH v5 3/6] mm: selftest to verify zero-filled pages are mapped to zeropage Usama Arif
2024-08-30 10:03 ` [PATCH v5 4/6] mm: Introduce a pageflag for partially mapped folios Usama Arif
2024-12-11 15:03 ` David Hildenbrand
2024-12-12 10:30 ` Usama Arif
2024-12-12 10:49 ` David Hildenbrand
2024-08-30 10:03 ` [PATCH v5 5/6] mm: split underused THPs Usama Arif
2024-08-30 10:03 ` [PATCH v5 6/6] mm: add sysfs entry to disable splitting " Usama Arif
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=dfb303b9-151b-49ec-b7ef-691c71bd5385@linux.dev \
--to=lance.yang@linux.dev \
--cc=Andrew.Yang@mediatek.com \
--cc=Qun-wei.Lin@mediatek.com \
--cc=akpm@linux-foundation.org \
--cc=baohua@kernel.org \
--cc=casper.li@mediatek.com \
--cc=catalin.marinas@arm.com \
--cc=cerasuolodomenico@gmail.com \
--cc=chinwen.chang@mediatek.com \
--cc=corbet@lwn.net \
--cc=david@redhat.com \
--cc=hannes@cmpxchg.org \
--cc=kaleshsingh@google.com \
--cc=kernel-team@meta.com \
--cc=linux-arm-kernel@lists.infradead.org \
--cc=linux-doc@vger.kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mediatek@lists.infradead.org \
--cc=linux-mm@kvack.org \
--cc=npache@redhat.com \
--cc=riel@surriel.com \
--cc=roman.gushchin@linux.dev \
--cc=rppt@kernel.org \
--cc=ryan.roberts@arm.com \
--cc=ryncsn@gmail.com \
--cc=shakeel.butt@linux.dev \
--cc=surenb@google.com \
--cc=usamaarif642@gmail.com \
--cc=willy@infradead.org \
--cc=yuzhao@google.com \
--cc=zhais@google.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).