Re: [PATCH v6 1/5] mm: rmap: support batched checks of the references for large folios

From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: Barry Song <21cnbao@gmail.com>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
	akpm@linux-foundation.org, catalin.marinas@arm.com,
	will@kernel.org, lorenzo.stoakes@oracle.com,
	ryan.roberts@arm.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
	rppt@kernel.org, surenb@google.com, mhocko@suse.com,
	riel@surriel.com, harry.yoo@oracle.com, jannh@google.com,
	willy@infradead.org, dev.jain@arm.com, linux-mm@kvack.org,
	linux-arm-kernel@lists.infradead.org,
	linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 1/5] mm: rmap: support batched checks of the references for large folios
Date: Wed, 18 Mar 2026 09:37:12 +0800
Message-ID: <6bdc4b03-9631-4717-a3fa-2785a7930aba@linux.alibaba.com>
In-Reply-To: <CAGsJ_4wZq7ZvMzMq513a89=VYs7gGkeVwSnAczxs_yYCHGFDQA@mail.gmail.com>



On 3/17/26 3:30 PM, Barry Song wrote:
> On Mon, Mar 16, 2026 at 2:25 PM Baolin Wang
> <baolin.wang@linux.alibaba.com> wrote:
>>
>>
>>
>> On 3/10/26 4:17 PM, David Hildenbrand (Arm) wrote:
>>> On 3/10/26 02:37, Baolin Wang wrote:
>>>>
>>>>
>>>> On 3/7/26 4:02 PM, Barry Song wrote:
>>>>> On Sat, Mar 7, 2026 at 10:22 AM Baolin Wang
>>>>> <baolin.wang@linux.alibaba.com> wrote:
>>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Thanks.
>>>>>>
>>>>>>
>>>>>> Yes. In addition, this will involve many architectures’ implementations
>>>>>> and their differing TLB flush mechanisms, so it’s difficult to make a
>>>>>> reasonable per-architecture measurement. If any architecture has a more
>>>>>> efficient flush method, I’d prefer to implement an architecture‑specific
>>>>>> clear_flush_young_ptes().
>>>>>
>>>>> Right! Since TLBI is usually quite expensive, I wonder if a generic
>>>>> implementation for architectures lacking clear_flush_young_ptes()
>>>>> might benefit from something like the below (just a very rough idea):
>>>>>
>>>>> int clear_flush_young_ptes(struct vm_area_struct *vma,
>>>>>                    unsigned long addr, pte_t *ptep, unsigned int nr)
>>>>> {
>>>>>            unsigned long curr_addr = addr;
>>>>>            int young = 0;
>>>>>
>>>>>            while (nr--) {
>>>>>                    young |= ptep_test_and_clear_young(vma, curr_addr, ptep);
>>>>>                    ptep++;
>>>>>                    curr_addr += PAGE_SIZE;
>>>>>            }
>>>>>
>>>>>            if (young)
>>>>>                    flush_tlb_range(vma, addr, curr_addr);
>>>>>            return young;
>>>>> }
>>>>
>>>> I understand your point. I’m concerned that I can’t test this patch on
>>>> every architecture to validate the benefits. Anyway, let me try this on
>>>> my X86 machine first.
>>>
>>> In any case, please make that a follow-up patch :)
>>
>> Sure. However, after investigating RISC‑V and x86, I found that
>> ptep_clear_flush_young() does not flush the TLB on these architectures:
>>
>> int ptep_clear_flush_young(struct vm_area_struct *vma,
>>                             unsigned long address, pte_t *ptep)
>> {
>>          /*
>>           * On x86 CPUs, clearing the accessed bit without a TLB flush
>>           * doesn't cause data corruption. [ It could cause incorrect
>>           * page aging and the (mistaken) reclaim of hot pages, but the
>>           * chance of that should be relatively low. ]
>>           *
>>           * So as a performance optimization don't flush the TLB when
>>           * clearing the accessed bit, it will eventually be flushed by
>>           * a context switch or a VM operation anyway. [ In the rare
>>           * event of it not getting flushed for a long time the delay
>>           * shouldn't really matter because there's no real memory
>>           * pressure for swapout to react to. ]
>>           */
>>          return ptep_test_and_clear_young(vma, address, ptep);
>> }
>>
>> I don't have access to other architectures, so I think we can postpone
>> this optimization unless someone is interested in optimizing the TLB flush.
> 
> The comment is interesting. I think it likely applies to most
> architectures, including ARM64. The main reason ARM64 doesn’t use
> this approach is probably that it can issue tlbi_nosync and then
> rely on a final dsb to ensure all invalidations are completed—
> and tlbi_nosync itself is relatively cheap.

Actually, we both tried this a few years ago, but neither succeeded :).

My patch: https://lkml.org/lkml/2023/10/24/533

Your patch: 
https://lore.kernel.org/lkml/20220617070555.344368-1-21cnbao@gmail.com/

Now I’m more inclined toward your approach, to align with MGLRU. Is it
time to restart the discussion on this patch? :)
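
To make the trade-off discussed above concrete, here is a small user-space
simulation. It is only a sketch: every name in it (struct sim_pte,
sim_flush_range(), sim_flush_count and the three sim_clear_*() helpers) is
hypothetical and invented for illustration, none of it is a kernel API, and
counting simulated flushes says nothing about real TLBI cost on any
architecture. It contrasts three strategies: flushing after every young
entry, the batched idea quoted earlier (clear all entries, then flush the
range once), and the x86-style approach of clearing without any flush.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a PTE: only the accessed ("young") bit. */
struct sim_pte {
	bool young;
};

/* Number of simulated TLB flush operations issued so far. */
static unsigned long sim_flush_count;

/* Simulated range flush over entries [first, end): just counts calls. */
static void sim_flush_range(size_t first, size_t end)
{
	assert(first < end);
	sim_flush_count++;
}

/* Return the old young bit and clear it. */
static bool sim_test_and_clear_young(struct sim_pte *pte)
{
	bool was_young = pte->young;

	pte->young = false;
	return was_young;
}

/* Strategy 1: flush once for each entry that was young. */
static bool sim_clear_flush_young_each(struct sim_pte *ptes, size_t nr)
{
	bool young = false;

	assert(ptes != NULL);
	for (size_t i = 0; i < nr; i++) {
		if (sim_test_and_clear_young(&ptes[i])) {
			sim_flush_range(i, i + 1);
			young = true;
		}
	}
	return young;
}

/* Strategy 2: clear every entry, then flush the whole range at most once. */
static bool sim_clear_flush_young_batched(struct sim_pte *ptes, size_t nr)
{
	bool young = false;

	assert(ptes != NULL);
	for (size_t i = 0; i < nr; i++)
		young |= sim_test_and_clear_young(&ptes[i]);

	if (young)
		sim_flush_range(0, nr);
	return young;
}

/* Strategy 3: clear every entry and never flush (the x86-style choice). */
static bool sim_clear_young_noflush(struct sim_pte *ptes, size_t nr)
{
	bool young = false;

	assert(ptes != NULL);
	for (size_t i = 0; i < nr; i++)
		young |= sim_test_and_clear_young(&ptes[i]);
	return young;
}
```

With a batch of nr entries of which k are young, the first helper issues k
simulated flushes, the second issues at most one, and the third issues none.
Whether a single range flush is actually cheaper than k single-page flushes
depends on the architecture, which is exactly the part that cannot be
measured here.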
