From: "David Hildenbrand (Red Hat)" <david@kernel.org>
To: Qi Zheng <qi.zheng@linux.dev>,
will@kernel.org, aneesh.kumar@kernel.org, npiggin@gmail.com,
peterz@infradead.org, dev.jain@arm.com,
akpm@linux-foundation.org, ioworker0@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
linux-mm@kvack.org, linux-alpha@vger.kernel.org,
linux-snps-arc@lists.infradead.org, loongarch@lists.linux.dev,
linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
linux-um@lists.infradead.org,
Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH 7/7] mm: make PT_RECLAIM depend on MMU_GATHER_RCU_TABLE_FREE && 64BIT
Date: Wed, 19 Nov 2025 13:24:35 +0100
Message-ID: <7160b6ec-4da5-4273-be91-1339bd00d009@kernel.org>
In-Reply-To: <479b0409-335f-4450-8eb2-5270a5847f5e@linux.dev>
On 19.11.25 13:13, Qi Zheng wrote:
>
>
> On 11/19/25 7:35 PM, David Hildenbrand (Red Hat) wrote:
>> On 19.11.25 12:02, Qi Zheng wrote:
>>> Hi David,
>>>
>>> On 11/19/25 6:19 PM, David Hildenbrand (Red Hat) wrote:
>>>> On 18.11.25 13:02, Qi Zheng wrote:
>>>>>
>>>>>
>>>>> On 11/18/25 12:57 AM, David Hildenbrand (Red Hat) wrote:
>>>>>> On 14.11.25 12:11, Qi Zheng wrote:
>>>>>>> From: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>>
>>>>>> Subject: s/&&/&/
>>>>>
>>>>> will do.
>>>>>
>>>>>>
>>>>>>>
>>>>>>> Make PT_RECLAIM depend on MMU_GATHER_RCU_TABLE_FREE so that PT_RECLAIM
>>>>>>> can be enabled by default on all architectures that support
>>>>>>> MMU_GATHER_RCU_TABLE_FREE.
>>>>>>>
>>>>>>> Considering that a large number of PTE page table pages (such as 100GB+)
>>>>>>> can only be caused on a 64-bit system, let PT_RECLAIM also depend on
>>>>>>> 64BIT.
>>>>>>>
>>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>>> ---
>>>>>>> arch/x86/Kconfig | 1 -
>>>>>>> mm/Kconfig | 6 +-----
>>>>>>> 2 files changed, 1 insertion(+), 6 deletions(-)
>>>>>>>
>>>>>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>>>>>> index eac2e86056902..96bff81fd4787 100644
>>>>>>> --- a/arch/x86/Kconfig
>>>>>>> +++ b/arch/x86/Kconfig
>>>>>>> @@ -330,7 +330,6 @@ config X86
>>>>>>> select FUNCTION_ALIGNMENT_4B
>>>>>>> imply IMA_SECURE_AND_OR_TRUSTED_BOOT if EFI
>>>>>>> select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>>>>>>> - select ARCH_SUPPORTS_PT_RECLAIM if X86_64
>>>>>>> select ARCH_SUPPORTS_SCHED_SMT if SMP
>>>>>>> select SCHED_SMT if SMP
>>>>>>> select ARCH_SUPPORTS_SCHED_CLUSTER if SMP
>>>>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>>>>> index a5a90b169435d..e795fbd69e50c 100644
>>>>>>> --- a/mm/Kconfig
>>>>>>> +++ b/mm/Kconfig
>>>>>>> @@ -1440,14 +1440,10 @@ config ARCH_HAS_USER_SHADOW_STACK
>>>>>>> The architecture has hardware support for userspace shadow call
>>>>>>> stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
>>>>>>> -config ARCH_SUPPORTS_PT_RECLAIM
>>>>>>> - def_bool n
>>>>>>> -
>>>>>>> config PT_RECLAIM
>>>>>>> bool "reclaim empty user page table pages"
>>>>>>> default y
>>>>>>> - depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
>>>>>>> - select MMU_GATHER_RCU_TABLE_FREE
>>>>>>> + depends on MMU_GATHER_RCU_TABLE_FREE && MMU && SMP && 64BIT
>>>>>>
>>>>>> How would we have MMU_GATHER_RCU_TABLE_FREE without MMU? (can we drop
>>>>>> the MMU part)
>>>>>
>>>>> OK.
>>>>>
>>>>>>
>>>>>> Why do we care about SMP in the first place? (can we drop SMP)
>>>>>
>>>>> OK.
>>>>>
>>>>>>
>>>>>> But I also wonder why we need "MMU_GATHER_RCU_TABLE_FREE && 64BIT":
>>>>>>
>>>>>> Would it be harmful on 32bit (sure, we might not reclaim as much, but
>>>>>> still there is memory to be reclaimed?)?
>>>>>
>>>>> This is also fine on 32bit, but the benefits are not significant, so I
>>>>> chose to enable it only on 64-bit.
>>>>
>>>> Right. Address space is smaller, but also memory is smaller. Not that I
>>>> think we strictly *must* support 32bit, I merely wonder why we
>>>> wouldn't just enable it here.
>>>>
>>>> OTOH, if there is a good reason we cannot enable it, we can definitely
>>>> just keep it 64bit only.
>>>
>>> The only difficulty is this:
>>>
>>>>
>>>>>
>>>>> I actually tried enabling MMU_GATHER_RCU_TABLE_FREE on all
>>>>> architectures, and apart from sparc32 being a bit troublesome (because
>>>>> it uses mm->page_table_lock for synchronization within
>>>>> __pte_free_tlb()), the modifications were relatively simple.
>>>
>>> in sparc32:
>>>
>>> void pte_free(struct mm_struct *mm, pgtable_t ptep)
>>> {
>>> 	struct page *page;
>>>
>>> 	page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>> 	spin_lock(&mm->page_table_lock);
>>> 	if (page_ref_dec_return(page) == 1)
>>> 		pagetable_dtor(page_ptdesc(page));
>>> 	spin_unlock(&mm->page_table_lock);
>>>
>>> 	srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>>> }
>>>
>>> #define __pte_free_tlb(tlb, pte, addr) pte_free((tlb)->mm, pte)
>>>
>>> To enable MMU_GATHER_RCU_TABLE_FREE on sparc32, we need to implement
>>> __tlb_remove_table(), and call the pte_free() above in
>>> __tlb_remove_table().
>>>
>>> However, the __tlb_remove_table() does not have an mm parameter:
>>>
>>> void __tlb_remove_table(void *_table)
>>>
>>> so we need to use another lock instead of mm->page_table_lock.
>>>
>>> I have already sent the v2 [1], and perhaps after that I can enable
>>> PT_RECLAIM on all 32-bit architectures as well.
>>>
>>
>> I guess if we just make it depend on MMU_GATHER_RCU_TABLE_FREE that will
>> be fine.
>>
>>> [1]. https://lore.kernel.org/all/cover.1763537007.git.zhengqi.arch@bytedance.com/
>>>
>>>>>
>>>>>>
>>>>>> If all 64BIT support MMU_GATHER_RCU_TABLE_FREE (as you previously
>>>>>> state), why can't we only check for 64BIT?
>>>>>
>>>>> OK, will do.
>>>>
>>>> This was also more of a question for discussion:
>>>>
>>>> Would it make sense to have
>>>>
>>>> config PT_RECLAIM
>>>> def_bool y
>>>> depends on MMU_GATHER_RCU_TABLE_FREE
>>>
>>> make sense.
>>>
>>>>
>>>> (a) Would we want to make it configurable (why?)
>>>
>>> No, it was just out of caution before.
>>>
>>>> (b) Do we really care about SMP (why?)
>>>
>>> No. Simply because the following situation is impossible to occur:
>>>
>>> pte_offset_map
>>> traversing the PTE page table
>>>
>>> <preemption or hardirq>
>>>
>>> call madvise(MADV_DONTNEED)
>>>
>>> so there's no need to free PTE page via RCU.
>>>
>>>> (c) Do we want to limit to 64bit (why?)
>>>
>>> No, just because the benefit is greater on 64-bit.
>>
>> I was briefly wondering if on 32bit (but maybe also on 64bit with
>> configurable user page table levels?) we could have the scenario that we
>> only have two page table levels.
>>
>> So reclaiming the PMD level (corresponding to the highest level) would
>
> reclaiming the PMD level? PT_RECLAIM only reclaims PTE pages, not PMD
> pages, am I misunderstanding something?
Sorry, I looked too much into PMD table sharing over the last few days :D
You're right, it would work in any case, even with only 2 levels of page
tables.
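
To illustrate the point about two levels, here is a minimal userspace
sketch. It is not kernel code: every name in it (struct toy_top, struct
toy_pte_table, toy_zap_and_reclaim(), TOY_ENTRIES) is hypothetical, and it
leaves out everything the real path needs (locking, TLB flushing, RCU-deferred
freeing). It only shows that freeing an empty leaf (PTE) table touches the
leaf and the single upper-level slot pointing to it, so the number of levels
above does not matter.

```c
#include <stdbool.h>
#include <stdlib.h>

#define TOY_ENTRIES 4	/* hypothetical size, tiny on purpose */

/* Leaf level: a toy "PTE table"; a zero entry means "not mapped". */
struct toy_pte_table {
	unsigned long pte[TOY_ENTRIES];
};

/* Top level of a two-level layout: each slot is NULL or a PTE table. */
struct toy_top {
	struct toy_pte_table *slot[TOY_ENTRIES];
};

static bool toy_pte_table_empty(const struct toy_pte_table *t)
{
	for (int i = 0; i < TOY_ENTRIES; i++)
		if (t->pte[i])
			return false;
	return true;
}

/*
 * Clear one leaf entry and, if the leaf table is now empty, unhook it
 * from the top level and free it. The top-level table itself is never
 * freed here. Returns true if the leaf table was reclaimed.
 */
static bool toy_zap_and_reclaim(struct toy_top *top, int idx, int pte_idx)
{
	struct toy_pte_table *t = top->slot[idx];

	if (!t)
		return false;

	t->pte[pte_idx] = 0;
	if (!toy_pte_table_empty(t))
		return false;

	top->slot[idx] = NULL;	/* clear the upper-level entry first */
	free(t);		/* the kernel would defer this; the toy does not */
	return true;
}
```

Zapping the last populated entry of a leaf table returns true and leaves
the top-level slot NULL; zapping any earlier entry returns false and keeps
the table in place.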
--
Cheers
David