linux-mm.kvack.org archive mirror
 help / color / mirror / Atom feed
From: David Hildenbrand <david@redhat.com>
To: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: Andrew Morton <akpm@linux-foundation.org>,
	"Liam R . Howlett" <Liam.Howlett@oracle.com>,
	Vlastimil Babka <vbabka@suse.cz>, Jann Horn <jannh@google.com>,
	Pedro Falcato <pfalcato@suse.de>, Barry Song <baohua@kernel.org>,
	Dev Jain <dev.jain@arm.com>,
	linux-mm@kvack.org, linux-kernel@vger.kernel.org,
	Ryan Roberts <ryan.roberts@arm.com>
Subject: Re: [PATCH HOTFIX 6.17] mm/mremap: avoid expensive folio lookup on mremap folio pte batch
Date: Thu, 7 Aug 2025 21:41:24 +0200	[thread overview]
Message-ID: <ee3d7b32-af47-42b7-bd27-0675c065d736@redhat.com> (raw)
In-Reply-To: <3fc75720-8da7-4f6c-bdce-1e1280b8e28f@lucifer.local>

On 07.08.25 21:20, Lorenzo Stoakes wrote:
> +cc Ryan for ContPTE stuff.
> 
> On Thu, Aug 07, 2025 at 09:10:52PM +0200, David Hildenbrand wrote:
>> Acked-by: David Hildenbrand <david@redhat.com>
> 
> Thanks!
> 
>>
>> Wondering whether we could then just use the patch hint instead of going via
>> the folio.
>>
>> IOW,
>>
>> return pte_batch_hint(ptep, pte);
> 
> Wouldn't that break the A/D stuff? Also this doesn't mean that the PTE won't
> have some conflicting flags potentially. The check is empirical:
> 
> static inline unsigned int pte_batch_hint(pte_t *ptep, pte_t pte)
> {
> 	if (!pte_valid_cont(pte))
> 		return 1;
> 
> 	return CONT_PTES - (((unsigned long)ptep >> 3) & (CONT_PTES - 1));
> }
> 
> So it's 'the most number of PTEs that _might_ coalesce'.

No. If the bit is set, all PTEs in the aligned range (e.g., 64 KiB 
block) are coalesced. It's literally the bit telling the hardware that 
it can coalesce in that range because all PTEs are alike.

The function tells you exactly how many PTEs you can batch from the 
given PTEP in that 64 KiB block.

That's also why folio_pte_batch_flags() just jumps over that.

All you have to do is limit it by the maximum number.

So likely you would have to do here

diff --git a/mm/mremap.c b/mm/mremap.c
index 677a4d744df9c..58f9cf52eb6bd 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -174,16 +174,7 @@ static pte_t move_soft_dirty_pte(pte_t pte)
  static int mremap_folio_pte_batch(struct vm_area_struct *vma, unsigned 
long addr,
                 pte_t *ptep, pte_t pte, int max_nr)
  {
-       struct folio *folio;
-
-       if (max_nr == 1)
-               return 1;
-
-       folio = vm_normal_folio(vma, addr, pte);
-       if (!folio || !folio_test_large(folio))
-               return 1;
-
-       return folio_pte_batch(folio, ptep, pte, max_nr);
+       return min_t(unsigned int, max_nr, pte_batch_hint(ptep, pte));
  }

  static int move_ptes(struct pagetable_move_control *pmc,


And make sure that the compiler realizes that max_nr >= 1 and optimized 
away the min_t on !arm64.

> 
> (note that a bit grossly we'll call it _again_ in folio_pte_batch_flags()).
> 
 > I suppose we could not even bother with checking if same folio and 
_just_ check> if PTEs have consecutive PFNs, which is not very likely if 
different folio
> but... could that break something?
> 
> It seems the 'magic' is in set_ptes() on arm64 where it'll know to do the 'right
> thing' for a contPTE batch (I may be missing something - please correct me if so
> Dev/Ryan).
> 
> So actually do we even really care that much about folio?

I don't think so. Not in this case here where we don't use the folio for 
anything else.


-- 
Cheers,

David / dhildenb



  reply	other threads:[~2025-08-07 19:41 UTC|newest]

Thread overview: 26+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-08-07 18:58 [PATCH HOTFIX 6.17] mm/mremap: avoid expensive folio lookup on mremap folio pte batch Lorenzo Stoakes
2025-08-07 19:10 ` David Hildenbrand
2025-08-07 19:20   ` Lorenzo Stoakes
2025-08-07 19:41     ` David Hildenbrand [this message]
2025-08-07 20:11       ` Lorenzo Stoakes
2025-08-07 21:01         ` Lorenzo Stoakes
2025-08-07 19:56     ` Ryan Roberts
2025-08-07 20:58       ` Lorenzo Stoakes
2025-08-08  5:18         ` Dev Jain
2025-08-08  7:19         ` Ryan Roberts
2025-08-08  7:45           ` David Hildenbrand
2025-08-08  7:56             ` Ryan Roberts
2025-08-08  8:44               ` Dev Jain
2025-08-08  9:50                 ` Lorenzo Stoakes
2025-08-08  9:45             ` Lorenzo Stoakes
2025-08-08  9:40           ` Lorenzo Stoakes
2025-08-07 19:14 ` Pedro Falcato
2025-08-07 19:22   ` Lorenzo Stoakes
2025-08-07 19:33     ` David Hildenbrand
2025-08-08  5:19 ` Dev Jain
2025-08-08  9:56 ` Vlastimil Babka
2025-08-11  2:40 ` Barry Song
2025-08-11  4:57   ` Lorenzo Stoakes
2025-08-11  6:52     ` Barry Song
2025-08-11 15:08       ` Lorenzo Stoakes
2025-08-11 15:19         ` David Hildenbrand

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ee3d7b32-af47-42b7-bd27-0675c065d736@redhat.com \
    --to=david@redhat.com \
    --cc=Liam.Howlett@oracle.com \
    --cc=akpm@linux-foundation.org \
    --cc=baohua@kernel.org \
    --cc=dev.jain@arm.com \
    --cc=jannh@google.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=lorenzo.stoakes@oracle.com \
    --cc=pfalcato@suse.de \
    --cc=ryan.roberts@arm.com \
    --cc=vbabka@suse.cz \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).