From: Barry Song <21cnbao@gmail.com>
To: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>,
"David Hildenbrand (Arm)" <david@kernel.org>,
akpm@linux-foundation.org, catalin.marinas@arm.com,
will@kernel.org, lorenzo.stoakes@oracle.com,
ryan.roberts@arm.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
riel@surriel.com, harry.yoo@oracle.com, jannh@google.com,
willy@infradead.org, dev.jain@arm.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 1/5] mm: rmap: support batched checks of the references for large folios
Date: Thu, 26 Mar 2026 13:31:03 +0800
Message-ID: <CAGsJ_4xL8Rm929oXG54TRXeL8LoHqMYM76rGUc8eKWwC_iPZKA@mail.gmail.com>
In-Reply-To: <bfd282b1-523d-4945-97a0-ec30fe1df577@linux.alibaba.com>
On Thu, Mar 26, 2026 at 9:47 AM Baolin Wang
<baolin.wang@linux.alibaba.com> wrote:
>
>
>
> On 3/25/26 11:06 PM, Lorenzo Stoakes (Oracle) wrote:
> > On Wed, Mar 25, 2026 at 03:58:36PM +0100, David Hildenbrand (Arm) wrote:
> >> On 3/25/26 15:36, Lorenzo Stoakes (Oracle) wrote:
> >>> On Mon, Mar 16, 2026 at 03:15:18PM +0100, David Hildenbrand (Arm) wrote:
> >>>> On 3/16/26 07:25, Baolin Wang wrote:
> >>>>>
> >>>>>
> >>>>>
> >>>>> Sure. However, after investigating RISC‑V and x86, I found that
> >>>>> ptep_clear_flush_young() does not flush the TLB on these architectures:
> >>>>>
> >>>>> int ptep_clear_flush_young(struct vm_area_struct *vma,
> >>>>>                            unsigned long address, pte_t *ptep)
> >>>>> {
> >>>>>         /*
> >>>>>          * On x86 CPUs, clearing the accessed bit without a TLB flush
> >>>>>          * doesn't cause data corruption. [ It could cause incorrect
> >>>>>          * page aging and the (mistaken) reclaim of hot pages, but the
> >>>>>          * chance of that should be relatively low. ]
> >>>>>          *
> >>>>>          * So as a performance optimization don't flush the TLB when
> >>>>>          * clearing the accessed bit, it will eventually be flushed by
> >>>>>          * a context switch or a VM operation anyway. [ In the rare
> >>>>>          * event of it not getting flushed for a long time the delay
> >>>>>          * shouldn't really matter because there's no real memory
> >>>>>          * pressure for swapout to react to. ]
> >>>>>          */
> >>>>>         return ptep_test_and_clear_young(vma, address, ptep);
> >>>>> }
> >>>>
> >>>> You'd probably want an arch helper then, that tells you whether
> >>>> a flush_tlb_range() after ptep_test_and_clear_young() is required.
> >>>>
> >>>> Or some special flush_tlb_range() helper.
> >>>>
> >>>> I agree that it requires more work.
>
> (Sorry, David. I forgot to reply to your email because I've had a lot to
> sort out recently.)
>
> Rather than adding more arch helpers (we already have plenty for the
> young flag check), I think we should try removing the TLB flush, as I
> mentioned to Barry[1]. MGLRU reclaim already skips the TLB flush, and it
> seems to work fine. What do you think?
>
> Here are our previous attempts to remove the TLB flush:
>
> My patch: https://lkml.org/lkml/2023/10/24/533
> Barry's patch:
> https://lore.kernel.org/lkml/20220617070555.344368-1-21cnbao@gmail.com/
>
> [1]
> https://lore.kernel.org/all/6bdc4b03-9631-4717-a3fa-2785a7930aba@linux.alibaba.com/

To summarize what ptep_clear_flush_young() does on each architecture:

x86:
  Does not perform any TLB invalidation; it simply calls
  ptep_test_and_clear_young().

RISC-V:
  Same behavior as x86.

s390:
  Simply calls ptep_test_and_clear_young().

powerpc:
  Simply calls ptep_test_and_clear_young().

parisc:
  set_pte() + __flush_cache_page(), whereas
  ptep_test_and_clear_young() doesn't need __flush_cache_page().

arm64:
  ptep_test_and_clear_young() followed by flush_tlb_page_nosync(),
  which can still be expensive, based on my previous observations.

others:
  ptep_test_and_clear_young() + flush_tlb_page().

Revisiting the comment for x86:

        /*
         * On x86 CPUs, clearing the accessed bit without a TLB flush
         * doesn't cause data corruption. [ It could cause incorrect
         * page aging and the (mistaken) reclaim of hot pages, but the
         * chance of that should be relatively low. ]
         *
         * So as a performance optimization don't flush the TLB when
         * clearing the accessed bit, it will eventually be flushed by
         * a context switch or a VM operation anyway. [ In the rare
         * event of it not getting flushed for a long time the delay
         * shouldn't really matter because there's no real memory
         * pressure for swapout to react to. ]
         */

At least to me, this reasoning seems to apply to arm64 as well?

Maybe Ryan, Will, or Catalin can clarify why arm64 requires a
nosync TLBI, whereas x86 does not?
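
To make the trade-off described in the x86 comment concrete, here is a
minimal userspace sketch. It is a toy model, not kernel code. It assumes
the simplification that the young bit is only written to the PTE on a TLB
miss, so an access that hits a cached translation leaves the PTE untouched.
All toy_* names and TOY_PTE_YOUNG are hypothetical and exist only for
illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical young/accessed bit; not a real architecture's layout. */
#define TOY_PTE_YOUNG (1u << 0)

/* Toy model: one PTE plus one cached translation for it. */
struct toy_mmu {
	uint32_t pte;    /* the in-memory page table entry */
	bool tlb_valid;  /* is the translation cached in the toy TLB? */
};

/*
 * A CPU access. Assumption of this model: the young bit is written
 * only on a TLB miss; a TLB hit does not touch the PTE in memory.
 */
void toy_access(struct toy_mmu *m)
{
	assert(m != NULL);
	if (!m->tlb_valid) {
		m->pte |= TOY_PTE_YOUNG; /* page walk marks the PTE young */
		m->tlb_valid = true;     /* and caches the translation */
	}
}

/* Models ptep_test_and_clear_young(): report and clear, no flush. */
bool toy_test_and_clear_young(struct toy_mmu *m)
{
	bool young = (m->pte & TOY_PTE_YOUNG) != 0;

	m->pte &= ~TOY_PTE_YOUNG;
	return young;
}

/* Models a flushing variant: also drop the cached translation. */
bool toy_clear_flush_young(struct toy_mmu *m)
{
	bool young = toy_test_and_clear_young(m);

	if (young)
		m->tlb_valid = false;
	return young;
}

/* Stands in for the eventual flush (context switch, VM operation). */
void toy_eventual_flush(struct toy_mmu *m)
{
	m->tlb_valid = false;
}
```

In this model, a page that is touched again after
toy_test_and_clear_young() still looks idle until toy_eventual_flush()
runs, which is the "incorrect page aging" case the comment accepts. With
toy_clear_flush_young(), the next access re-marks the page as young, at
the cost of a flush on every clear.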
Thanks
Barry