From: Dev Jain <dev.jain@arm.com>
To: akpm@linux-foundation.org, david@redhat.com
Cc: ziy@nvidia.com, baolin.wang@linux.alibaba.com,
lorenzo.stoakes@oracle.com, Liam.Howlett@oracle.com,
npache@redhat.com, ryan.roberts@arm.com, baohua@kernel.org,
linux-mm@kvack.org, linux-kernel@vger.kernel.org,
Dev Jain <dev.jain@arm.com>
Subject: [PATCH v4 1/3] mm: add get_and_clear_ptes() and clear_ptes()
Date: Thu, 24 Jul 2025 10:52:59 +0530
Message-ID: <20250724052301.23844-2-dev.jain@arm.com>
In-Reply-To: <20250724052301.23844-1-dev.jain@arm.com>
From: David Hildenbrand <david@redhat.com>
Let's add variants to be used where "full" does not apply -- which will
be the majority of cases in the future. "full" really only applies if
we are about to tear down a full MM.
Use get_and_clear_ptes() in existing code; clear_ptes() users will
be added next.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
Signed-off-by: Dev Jain <dev.jain@arm.com>
---
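Editorial note (not part of the commit): the relationship between the
"full" helpers and the new wrappers can be sketched in plain userspace C.
This is only a model under stated assumptions. The pte_t typedef, the
PTE_* bit values and every model_* function are hypothetical stand-ins,
loosely following the generic fallback loop in include/linux/pgtable.h;
architectures may override the real helpers, and the real functions also
take the mm and the address.

```c
/* Userspace model only: pte_t, PTE_* and all model_* names are hypothetical. */
#include <assert.h>
#include <stdint.h>

typedef uint64_t pte_t;

#define PTE_PRESENT	(1ULL << 0)
#define PTE_DIRTY	(1ULL << 1)
#define PTE_YOUNG	(1ULL << 2)

/*
 * Stand-in for get_and_clear_full_ptes(): clear nr entries and return the
 * first entry with the dirty/young bits of the remaining entries folded in.
 * In this model 'full' is ignored; it is treated purely as a teardown hint.
 */
static pte_t model_get_and_clear_full_ptes(pte_t *ptep, unsigned int nr,
					   int full)
{
	pte_t pte = ptep[0];
	unsigned int i;

	(void)full;
	ptep[0] = 0;
	for (i = 1; i < nr; i++) {
		pte |= ptep[i] & (PTE_DIRTY | PTE_YOUNG);
		ptep[i] = 0;
	}
	return pte;
}

/* Stand-in for get_and_clear_ptes(): same operation with full == 0. */
static pte_t model_get_and_clear_ptes(pte_t *ptep, unsigned int nr)
{
	return model_get_and_clear_full_ptes(ptep, nr, 0);
}

/* Stand-in for clear_ptes(): clear the batch and discard the old value. */
static void model_clear_ptes(pte_t *ptep, unsigned int nr)
{
	(void)model_get_and_clear_full_ptes(ptep, nr, 0);
}
```

The point of the wrappers is only that callers outside of full-mm teardown
no longer have to spell out the trailing 0 at every call site.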
 arch/arm64/mm/mmu.c     |  2 +-
 include/linux/pgtable.h | 45 +++++++++++++++++++++++++++++++++++++++++
 mm/mremap.c             |  2 +-
 mm/rmap.c               |  2 +-
 4 files changed, 48 insertions(+), 3 deletions(-)

diff --git a/arch/arm64/mm/mmu.c b/arch/arm64/mm/mmu.c
index abd9725796e9..20a89ab97dc5 100644
--- a/arch/arm64/mm/mmu.c
+++ b/arch/arm64/mm/mmu.c
@@ -1528,7 +1528,7 @@ early_initcall(prevent_bootmem_remove_init);
 pte_t modify_prot_start_ptes(struct vm_area_struct *vma, unsigned long addr,
 			     pte_t *ptep, unsigned int nr)
 {
-	pte_t pte = get_and_clear_full_ptes(vma->vm_mm, addr, ptep, nr, /* full = */ 0);
+	pte_t pte = get_and_clear_ptes(vma->vm_mm, addr, ptep, nr);
 
 	if (alternative_has_cap_unlikely(ARM64_WORKAROUND_2645198)) {
 		/*
diff --git a/include/linux/pgtable.h b/include/linux/pgtable.h
index e3b99920be05..4c035637eeb7 100644
--- a/include/linux/pgtable.h
+++ b/include/linux/pgtable.h
@@ -736,6 +736,29 @@ static inline pte_t get_and_clear_full_ptes(struct mm_struct *mm,
}
#endif
+/**
+ * get_and_clear_ptes - Clear present PTEs that map consecutive pages of
+ * the same folio, collecting dirty/accessed bits.
+ * @mm: Address space the pages are mapped into.
+ * @addr: Address the first page is mapped at.
+ * @ptep: Page table pointer for the first entry.
+ * @nr: Number of entries to clear.
+ *
+ * Use this instead of get_and_clear_full_ptes() if it is known that we don't
+ * need to clear the full mm, which is mostly the case.
+ *
+ * Note that PTE bits in the PTE range besides the PFN can differ. For example,
+ * some PTEs might be write-protected.
+ *
+ * Context: The caller holds the page table lock. The PTEs map consecutive
+ * pages that belong to the same folio. The PTEs are all in the same PMD.
+ */
+static inline pte_t get_and_clear_ptes(struct mm_struct *mm, unsigned long addr,
+		pte_t *ptep, unsigned int nr)
+{
+	return get_and_clear_full_ptes(mm, addr, ptep, nr, 0);
+}
+
#ifndef clear_full_ptes
/**
* clear_full_ptes - Clear present PTEs that map consecutive pages of the same
@@ -768,6 +791,28 @@ static inline void clear_full_ptes(struct mm_struct *mm, unsigned long addr,
}
#endif
+/**
+ * clear_ptes - Clear present PTEs that map consecutive pages of the same folio.
+ * @mm: Address space the pages are mapped into.
+ * @addr: Address the first page is mapped at.
+ * @ptep: Page table pointer for the first entry.
+ * @nr: Number of entries to clear.
+ *
+ * Use this instead of clear_full_ptes() if it is known that we don't need to
+ * clear the full mm, which is mostly the case.
+ *
+ * Note that PTE bits in the PTE range besides the PFN can differ. For example,
+ * some PTEs might be write-protected.
+ *
+ * Context: The caller holds the page table lock. The PTEs map consecutive
+ * pages that belong to the same folio. The PTEs are all in the same PMD.
+ */
+static inline void clear_ptes(struct mm_struct *mm, unsigned long addr,
+		pte_t *ptep, unsigned int nr)
+{
+	clear_full_ptes(mm, addr, ptep, nr, 0);
+}
+
/*
* If two threads concurrently fault at the same page, the thread that
* won the race updates the PTE and its local TLB/Cache. The other thread
diff --git a/mm/mremap.c b/mm/mremap.c
index ac39845e9718..677a4d744df9 100644
--- a/mm/mremap.c
+++ b/mm/mremap.c
@@ -280,7 +280,7 @@ static int move_ptes(struct pagetable_move_control *pmc,
 							 old_pte, max_nr_ptes);
 			force_flush = true;
 		}
-		pte = get_and_clear_full_ptes(mm, old_addr, old_ptep, nr_ptes, 0);
+		pte = get_and_clear_ptes(mm, old_addr, old_ptep, nr_ptes);
 		pte = move_pte(pte, old_addr, new_addr);
 		pte = move_soft_dirty_pte(pte);
 
diff --git a/mm/rmap.c b/mm/rmap.c
index f93ce27132ab..568198e9efc2 100644
--- a/mm/rmap.c
+++ b/mm/rmap.c
@@ -2036,7 +2036,7 @@ static bool try_to_unmap_one(struct folio *folio, struct vm_area_struct *vma,
 			flush_cache_range(vma, address, end_addr);
 
 			/* Nuke the page table entry. */
-			pteval = get_and_clear_full_ptes(mm, address, pvmw.pte, nr_pages, 0);
+			pteval = get_and_clear_ptes(mm, address, pvmw.pte, nr_pages);
 			/*
 			 * We clear the PTE but do not flush so potentially
 			 * a remote CPU could still be writing to the folio.
--
2.30.2
Thread overview: 14+ messages
2025-07-24 5:22 [PATCH v4 0/3] Optimizations for khugepaged Dev Jain
2025-07-24 5:22 ` Dev Jain [this message]
2025-07-24 9:31 ` [PATCH v4 1/3] mm: add get_and_clear_ptes() and clear_ptes() Barry Song
2025-07-24 17:17 ` Lorenzo Stoakes
2025-07-29 14:30 ` Zi Yan
2025-07-24 5:23 ` [PATCH v4 2/3] khugepaged: Optimize __collapse_huge_page_copy_succeeded() by PTE batching Dev Jain
2025-07-24 17:55 ` Lorenzo Stoakes
2025-07-24 17:57 ` David Hildenbrand
2025-07-24 18:01 ` Lorenzo Stoakes
2025-07-24 18:02 ` Lorenzo Stoakes
2025-07-29 14:38 ` Zi Yan
2025-07-24 5:23 ` [PATCH v4 3/3] khugepaged: Optimize collapse_pte_mapped_thp() " Dev Jain
2025-07-24 18:07 ` Lorenzo Stoakes
2025-07-29 14:41 ` Zi Yan