From: Baolin Wang <baolin.wang@linux.alibaba.com>
To: "Lorenzo Stoakes (Oracle)" <ljs@kernel.org>
Cc: "David Hildenbrand (Arm)" <david@kernel.org>,
Barry Song <21cnbao@gmail.com>,
akpm@linux-foundation.org, catalin.marinas@arm.com,
will@kernel.org, lorenzo.stoakes@oracle.com,
ryan.roberts@arm.com, Liam.Howlett@oracle.com, vbabka@suse.cz,
rppt@kernel.org, surenb@google.com, mhocko@suse.com,
riel@surriel.com, harry.yoo@oracle.com, jannh@google.com,
willy@infradead.org, dev.jain@arm.com, linux-mm@kvack.org,
linux-arm-kernel@lists.infradead.org,
linux-kernel@vger.kernel.org
Subject: Re: [PATCH v6 1/5] mm: rmap: support batched checks of the references for large folios
Date: Fri, 27 Mar 2026 18:20:21 +0800
Message-ID: <cec6308f-699c-40ca-a8e7-80247e030fb9@linux.alibaba.com>
In-Reply-To: <5b1c0687-a4e4-4a95-8e8f-2d2ce171247c@lucifer.local>

On 3/26/26 8:21 PM, Lorenzo Stoakes (Oracle) wrote:
> On Thu, Mar 26, 2026 at 08:04:10PM +0800, Baolin Wang wrote:
>>
>>
>> On 3/26/26 7:10 PM, Lorenzo Stoakes (Oracle) wrote:
>>> On Thu, Mar 26, 2026 at 09:47:51AM +0800, Baolin Wang wrote:
>>>>
>>>>
>>>> On 3/25/26 11:06 PM, Lorenzo Stoakes (Oracle) wrote:
>>>>> On Wed, Mar 25, 2026 at 03:58:36PM +0100, David Hildenbrand (Arm) wrote:
>>>>>> On 3/25/26 15:36, Lorenzo Stoakes (Oracle) wrote:
>>>>>>> On Mon, Mar 16, 2026 at 03:15:18PM +0100, David Hildenbrand (Arm) wrote:
>>>>>>>> On 3/16/26 07:25, Baolin Wang wrote:
>>>>>>>>>
>>>>>>>>>
>>>>>>>>>
>>>>>>>>> Sure. However, after investigating RISC‑V and x86, I found that
>>>>>>>>> ptep_clear_flush_young() does not flush the TLB on these architectures:
>>>>>>>>>
>>>>>>>>> int ptep_clear_flush_young(struct vm_area_struct *vma,
>>>>>>>>>                            unsigned long address, pte_t *ptep)
>>>>>>>>> {
>>>>>>>>>         /*
>>>>>>>>>          * On x86 CPUs, clearing the accessed bit without a TLB flush
>>>>>>>>>          * doesn't cause data corruption. [ It could cause incorrect
>>>>>>>>>          * page aging and the (mistaken) reclaim of hot pages, but the
>>>>>>>>>          * chance of that should be relatively low. ]
>>>>>>>>>          *
>>>>>>>>>          * So as a performance optimization don't flush the TLB when
>>>>>>>>>          * clearing the accessed bit, it will eventually be flushed by
>>>>>>>>>          * a context switch or a VM operation anyway. [ In the rare
>>>>>>>>>          * event of it not getting flushed for a long time the delay
>>>>>>>>>          * shouldn't really matter because there's no real memory
>>>>>>>>>          * pressure for swapout to react to. ]
>>>>>>>>>          */
>>>>>>>>>         return ptep_test_and_clear_young(vma, address, ptep);
>>>>>>>>> }
>>>>>>>>
>>>>>>>> You'd probably want an arch helper then, that tells you whether
>>>>>>>> a flush_tlb_range() after ptep_test_and_clear_young() is required.
>>>>>>>>
>>>>>>>> Or some special flush_tlb_range() helper.
>>>>>>>>
>>>>>>>> I agree that it requires more work.
>>>>
>>>> (Sorry, David. I forgot to reply to your email because I've had a lot to
>>>> sort out recently.)
>>>>
>>>> Rather than adding more arch helpers (we already have plenty for the young
>>>> flag check), I think we should try removing the TLB flush, as I mentioned to
>>>> Barry[1]. MGLRU reclaim already skips the TLB flush, and it seems to work
>>>> fine. What do you think?
>>>>
>>>> Here are our previous attempts to remove the TLB flush:
>>>>
>>>> My patch: https://lkml.org/lkml/2023/10/24/533
>>>> Barry's patch:
>>>> https://lore.kernel.org/lkml/20220617070555.344368-1-21cnbao@gmail.com/
>>>>
>>>> [1] https://lore.kernel.org/all/6bdc4b03-9631-4717-a3fa-2785a7930aba@linux.alibaba.com/
>>>>
>>>>>>> Sorry unclear here - does the series need more work or does a follow up patch
>>>>>>> need more work?
>>>>>>
>>>>>> Follow up!
>>>>>
>>>>> Ok good as in mm-stable now. Sadly means I don't get to review it but there we
>>>>> go.
>>>>
>>>> Actually this patchset has already been merged upstream:)
>>
>> Let me try to make things clear.
>>
>>> Err but this revision was sent _during_ the merge window...?
>>>
>>> Was sent on 9th Feb on Monday in merge window week 1, with a functional change
>>> listed:
>>>
>>> - Skip batched unmapping for uffd case, reported by Dev. Thanks.
>>>
>>> And then sent in 2nd batch on 18th Feb (see [0]).
>>>
>>> So we were ok with 1 week of 'testing' (does anybody actually test -next during
>>> the merge window? Was it even sent to -next?) for what appears to be a
>>> functional change?
>>
>> I posted v5 on Dec 26th[0], and it collected quite a few Reviewed-by tags
>> and sat in mm-unstable for testing.
>>
>> Later, Dev reported a uffd-related issue (I hope you recall that
>> discussion). I posted a fix[1] for it on Jan 16th, which Andrew accepted.
>>
>> Since then, the v5 series (plus the fix) continued to be tested in
>> mm-unstable. We kept it there mainly because David mentioned he wanted to
>> review the series, so we were waiting for his time.
>>
>> On Feb 9th, after returning from vacation, David reviewed the series
>> (thanks, David!). I replied to and addressed all his comments, then posted
>> v6 on the same day[2].
>
> OK thanks, I see that now.
>
> I still don't think we should have made any changes _during_ the merge window,
> even if they were simple code quality things.
>
> Changing patches then seems just crazy to me, as even code quality stuff can
> cause unexpected bugs, and now we're having upstream take it.
>
> Also this speaks to -fix patches just being broken in general.
>
> If you'd just respun with the fix as a v6, then we'd know 'v6 sent on 16th Jan
> addressed this' and there'd be no issue.
>
> Now v5 isn't v5, there's v5 and something-not-v5 and to have a sense of the
> testing you have to go read a bunch of email chains.
>
> It also means change logs are now really inaccurate:
>
> Changes from v5:
> - Collect reviewed tags from Ryan, Harry and David. Thanks.
> - Fix some coding style issues (per David).
> - Skip batched unmapping for uffd case, reported by Dev. Thanks.
>
> And that to me means 'v5 didn't have this, v6 does'.
>
> And it's really hard to track timelines for testing.

Indeed. At the very least, I need to make the change history clearer.

>> Additionally, v6 had no functional changes compared to v5 + the fix, and it
>> mainly addressed some coding style issues pointed out by David. I also
>> discussed this with David off-list, and since there were no functional
>> changes, my expectation was that it could still make it into the merge
>> window. That is why v6 was merged.
>
> Yeah, we still shouldn't have taken changes to a series DURING the merge window,
> it's just crazy.
>
>>
>> [0] https://lore.kernel.org/linux-mm/cover.1766631066.git.baolin.wang@linux.alibaba.com/#t
>> [1] https://lore.kernel.org/linux-mm/20260116162652.176054-1-baolin.wang@linux.alibaba.com/
>> [2] https://lore.kernel.org/all/cover.1770645603.git.baolin.wang@linux.alibaba.com/
>>
>>> And there was ongoing feedback on this and the v5 series (at [1])?
>>
>> Regarding the feedback on v5, I believe everything has been addressed.
>>
>>> This doesn't really feel sane?
>>>
>>> And now I'm confused as to whether mm-stable patches can collect tags, since
>>> presumably this was in mm-stable at the point this respin was done?
>>>
>>> Maybe I'm missing something here but this doesn't feel like a sane process?
>>
>> Andrew, David, please correct me if I've missed anything. Also, please let
>> me know if there's anything in the process that needs to be improved.
>> Thanks.
>
> This isn't on you, it's about the process as a whole. We need clear rules about
> when changes will be accepted and when not.
>
> And frankly I think we need to do away with fix patches as a whole based on
> this, or at least anything even vaguely non-trivial or that potentially impacts
> code.

Understood. Next time I'll check first before sending fix patches, to
avoid this kind of confusion.