linux-mm.kvack.org archive mirror
From: Qi Zheng <qi.zheng@linux.dev>
To: "David Hildenbrand (Red Hat)" <david@kernel.org>,
	will@kernel.org, aneesh.kumar@kernel.org, npiggin@gmail.com,
	peterz@infradead.org, dev.jain@arm.com,
	akpm@linux-foundation.org, ioworker0@gmail.com
Cc: linux-arch@vger.kernel.org, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org, linux-alpha@vger.kernel.org,
	linux-snps-arc@lists.infradead.org, loongarch@lists.linux.dev,
	linux-mips@vger.kernel.org, linux-parisc@vger.kernel.org,
	linux-um@lists.infradead.org,
	Qi Zheng <zhengqi.arch@bytedance.com>
Subject: Re: [PATCH 7/7] mm: make PT_RECLAIM depend on MMU_GATHER_RCU_TABLE_FREE && 64BIT
Date: Wed, 19 Nov 2025 20:13:10 +0800
Message-ID: <479b0409-335f-4450-8eb2-5270a5847f5e@linux.dev>
In-Reply-To: <6a22ff95-28c1-4c1d-a1a8-6a391bcc8c86@kernel.org>



On 11/19/25 7:35 PM, David Hildenbrand (Red Hat) wrote:
> On 19.11.25 12:02, Qi Zheng wrote:
>> Hi David,
>>
>> On 11/19/25 6:19 PM, David Hildenbrand (Red Hat) wrote:
>>> On 18.11.25 13:02, Qi Zheng wrote:
>>>>
>>>>
>>>> On 11/18/25 12:57 AM, David Hildenbrand (Red Hat) wrote:
>>>>> On 14.11.25 12:11, Qi Zheng wrote:
>>>>>> From: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>
>>>>> Subject: s/&&/&/
>>>>
>>>> will do.
>>>>
>>>>>
>>>>>>
>>>>>> Make PT_RECLAIM depend on MMU_GATHER_RCU_TABLE_FREE so that PT_RECLAIM
>>>>>> can be enabled by default on all architectures that support
>>>>>> MMU_GATHER_RCU_TABLE_FREE.
>>>>>>
>>>>>> Considering that a large number of PTE page table pages (such as 100GB+)
>>>>>> can only be caused on a 64-bit system, let PT_RECLAIM also depend on
>>>>>> 64BIT.
>>>>>>
>>>>>> Signed-off-by: Qi Zheng <zhengqi.arch@bytedance.com>
>>>>>> ---
>>>>>>     arch/x86/Kconfig | 1 -
>>>>>>     mm/Kconfig       | 6 +-----
>>>>>>     2 files changed, 1 insertion(+), 6 deletions(-)
>>>>>>
>>>>>> diff --git a/arch/x86/Kconfig b/arch/x86/Kconfig
>>>>>> index eac2e86056902..96bff81fd4787 100644
>>>>>> --- a/arch/x86/Kconfig
>>>>>> +++ b/arch/x86/Kconfig
>>>>>> @@ -330,7 +330,6 @@ config X86
>>>>>>         select FUNCTION_ALIGNMENT_4B
>>>>>>         imply IMA_SECURE_AND_OR_TRUSTED_BOOT    if EFI
>>>>>>         select HAVE_DYNAMIC_FTRACE_NO_PATCHABLE
>>>>>> -    select ARCH_SUPPORTS_PT_RECLAIM        if X86_64
>>>>>>         select ARCH_SUPPORTS_SCHED_SMT        if SMP
>>>>>>         select SCHED_SMT            if SMP
>>>>>>         select ARCH_SUPPORTS_SCHED_CLUSTER    if SMP
>>>>>> diff --git a/mm/Kconfig b/mm/Kconfig
>>>>>> index a5a90b169435d..e795fbd69e50c 100644
>>>>>> --- a/mm/Kconfig
>>>>>> +++ b/mm/Kconfig
>>>>>> @@ -1440,14 +1440,10 @@ config ARCH_HAS_USER_SHADOW_STACK
>>>>>>           The architecture has hardware support for userspace shadow call
>>>>>>               stacks (eg, x86 CET, arm64 GCS or RISC-V Zicfiss).
>>>>>> -config ARCH_SUPPORTS_PT_RECLAIM
>>>>>> -    def_bool n
>>>>>> -
>>>>>>     config PT_RECLAIM
>>>>>>         bool "reclaim empty user page table pages"
>>>>>>         default y
>>>>>> -    depends on ARCH_SUPPORTS_PT_RECLAIM && MMU && SMP
>>>>>> -    select MMU_GATHER_RCU_TABLE_FREE
>>>>>> +    depends on MMU_GATHER_RCU_TABLE_FREE && MMU && SMP && 64BIT
>>>>>
>>>>> Why would we have MMU_GATHER_RCU_TABLE_FREE without MMU? (Can we drop
>>>>> the MMU part?)
>>>>
>>>> OK.
>>>>
>>>>>
>>>>> Why do we care about SMP in the first place? (Can we drop SMP?)
>>>>
>>>> OK.
>>>>
>>>>>
>>>>> But I also wonder why we need "MMU_GATHER_RCU_TABLE_FREE && 64BIT":
>>>>>
>>>>> Would it be harmful on 32bit (sure, we might not reclaim as much, but
>>>>> still there is memory to be reclaimed?)?
>>>>
>>>> This is also fine on 32bit, but the benefits are not significant, so I
>>>> chose to enable it only on 64-bit.
>>>
>>> Right. Address space is smaller, but also memory is smaller. Not that I
>>> think we strictly *must* support 32bit, I merely wonder why we
>>> wouldn't just enable it here.
>>>
>>> OTOH, if there is a good reason we cannot enable it, we can definitely
>>> just keep it 64bit only.
>>
>> The only difficulty is this:
>>
>>>
>>>>
>>>> I actually tried enabling MMU_GATHER_RCU_TABLE_FREE on all
>>>> architectures, and apart from sparc32 being a bit troublesome (because
>>>> it uses mm->page_table_lock for synchronization within
>>>> __pte_free_tlb()), the modifications were relatively simple.
>>
>> in sparc32:
>>
>> void pte_free(struct mm_struct *mm, pgtable_t ptep)
>> {
>>           struct page *page;
>>
>>           page = pfn_to_page(__nocache_pa((unsigned long)ptep) >> PAGE_SHIFT);
>>           spin_lock(&mm->page_table_lock);
>>           if (page_ref_dec_return(page) == 1)
>>                   pagetable_dtor(page_ptdesc(page));
>>           spin_unlock(&mm->page_table_lock);
>>
>>           srmmu_free_nocache(ptep, SRMMU_PTE_TABLE_SIZE);
>> }
>>
>> #define __pte_free_tlb(tlb, pte, addr)  pte_free((tlb)->mm, pte)
>>
>> To enable MMU_GATHER_RCU_TABLE_FREE on sparc32, we need to implement
>> __tlb_remove_table() and call the pte_free() above from it.
>>
>> However, __tlb_remove_table() does not have an mm parameter:
>>
>> void __tlb_remove_table(void *_table)
>>
>> so we need to use another lock instead of mm->page_table_lock.
>>
>> I have already sent the v2 [1], and perhaps after that I can enable
>> PT_RECLAIM on all 32-bit architectures as well.
>>
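To make the sparc32 constraint concrete, here is a minimal userspace sketch in C, not kernel code. It assumes a single global lock (the hypothetical model_table_lock) stands in for mm->page_table_lock, because a callback shaped like __tlb_remove_table(void *_table) has no mm to take a lock from. The struct and function names are illustrative only, and an actual sparc32 change may look quite different.

```c
#include <stdatomic.h>

/* Hypothetical stand-in for the page backing several sub-page PTE tables. */
struct model_pte_page {
	int refcount;   /* models the page refcount tested in pte_free() */
	int dtor_calls; /* counts how often the destructor step would run */
};

/* Assumption: one global spinlock replaces the per-mm page_table_lock. */
static atomic_flag model_table_lock = ATOMIC_FLAG_INIT;

static void model_lock(void)
{
	while (atomic_flag_test_and_set_explicit(&model_table_lock,
						 memory_order_acquire))
		; /* spin until the flag is released */
}

static void model_unlock(void)
{
	atomic_flag_clear_explicit(&model_table_lock, memory_order_release);
}

/* Same shape as __tlb_remove_table(void *_table): no mm argument. */
static void model_tlb_remove_table(void *_table)
{
	struct model_pte_page *page = _table;

	model_lock();
	/* Mirrors "page_ref_dec_return(page) == 1" from the quoted pte_free(). */
	if (--page->refcount == 1)
		page->dtor_calls++;
	model_unlock();
}
```

The trade-off in this sketch is that every mm contends on the same lock, which is why it is only meant to show the shape of the problem.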
> 
> I guess if we just make it depend on MMU_GATHER_RCU_TABLE_FREE that will 
> be fine.
> 
>> [1] https://lore.kernel.org/all/cover.1763537007.git.zhengqi.arch@bytedance.com/
>>
>>>>
>>>>>
>>>>> If all 64BIT architectures support MMU_GATHER_RCU_TABLE_FREE (as you
>>>>> previously stated), why can't we only check for 64BIT?
>>>>
>>>> OK, will do.
>>>
>>> This was also more of a question for discussion:
>>>
>>> Would it make sense to have
>>>
>>> config PT_RECLAIM
>>>       def_bool y
>>>       depends on MMU_GATHER_RCU_TABLE_FREE
>>
>> Makes sense.
>>
>>>
>>> (a) Would we want to make it configurable (why?)
>>
>> No, it was just out of caution before.
>>
>>> (b) Do we really care about SMP (why?)
>>
>> No. The SMP dependency was only there because the following sequence
>> cannot occur on a !SMP kernel:
>>
>> pte_offset_map
>> traversing the PTE page table
>>
>> <preemption or hardirq>
>>
>> call madvise(MADV_DONTNEED)
>>
>> so there's no need to free PTE pages via RCU in that case.
>>
>>> (c) Do we want to limit to 64bit (why?)
>>
>> No, it was only because the benefit is greater on 64-bit.
> 
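As a rough illustration of why the benefit is larger on 64-bit, the following C sketch bounds the memory that PTE pages can consume for a given user address space size. It assumes 4 KiB pages, 8-byte PTEs on 64-bit, and 4-byte PTEs on a classic two-level 32-bit layout; real architectures differ, and all names here are hypothetical.

```c
#include <stdint.h>

/* Assumption for illustration: 4 KiB base pages on both layouts. */
#define MODEL_PAGE_SIZE 4096ULL

/* Bytes of virtual address space mapped by one fully used PTE page. */
static uint64_t span_per_pte_page(uint64_t pte_size)
{
	return (MODEL_PAGE_SIZE / pte_size) * MODEL_PAGE_SIZE;
}

/* Upper bound on bytes spent on PTE pages to map va_bytes of address space. */
static uint64_t max_pte_page_bytes(uint64_t va_bytes, uint64_t pte_size)
{
	uint64_t span = span_per_pte_page(pte_size);

	/* Round up to whole PTE pages, then convert pages to bytes. */
	return ((va_bytes + span - 1) / span) * MODEL_PAGE_SIZE;
}
```

Under these assumptions, max_pte_page_bytes(1ULL << 32, 4) gives 4 MiB for a full 32-bit address space, while max_pte_page_bytes(1ULL << 47, 8) gives 256 GiB for a 47-bit user address space, so a footprint of 100GB+ is only reachable on 64-bit.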
> I was briefly wondering if on 32bit (but maybe also on 64bit with 
> configurable user page table levels?) we could have the scenario that we 
> only have two page table levels.
> 
> So reclaiming the PMD level (corresponding to the highest level) would 

Reclaiming the PMD level? PT_RECLAIM only reclaims PTE pages, not PMD
pages. Am I misunderstanding something?

> be impossible. But for that to happen one would have to discard the 
> whole address range through MADV_DONTNEED (impossible I guess) :)
> 


