linux-cxl.vger.kernel.org archive mirror
* [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*()
@ 2025-06-03 21:16 David Hildenbrand
  2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
                   ` (3 more replies)
  0 siblings, 4 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-03 21:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, David Hildenbrand, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

Based on Linus' master.

While working on improving vm_normal_page() and friends, I stumbled
over this issue: refcounted "normal" pages must not be marked
using pmd_special() / pud_special().

Fortunately, so far there doesn't seem to be serious damage.

This is only compile-tested so far. Still looking for an easy way to test
PMD/PUD mappings with DAX. Any tests I can easily run?

Cc: Andrew Morton <akpm@linux-foundation.org>
Cc: Alistair Popple <apopple@nvidia.com>
Cc: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
Cc: "Liam R. Howlett" <Liam.Howlett@oracle.com>
Cc: Vlastimil Babka <vbabka@suse.cz>
Cc: Mike Rapoport <rppt@kernel.org>
Cc: Suren Baghdasaryan <surenb@google.com>
Cc: Michal Hocko <mhocko@suse.com>
Cc: Zi Yan <ziy@nvidia.com>
Cc: Baolin Wang <baolin.wang@linux.alibaba.com>
Cc: Nico Pache <npache@redhat.com>
Cc: Ryan Roberts <ryan.roberts@arm.com>
Cc: Dev Jain <dev.jain@arm.com>
Cc: Dan Williams <dan.j.williams@intel.com>

David Hildenbrand (2):
  mm/huge_memory: don't mark refcounted pages special in
    vmf_insert_folio_pmd()
  mm/huge_memory: don't mark refcounted pages special in
    vmf_insert_folio_pud()

 include/linux/mm.h | 15 ++++++++++
 mm/huge_memory.c   | 72 +++++++++++++++++++++++++++++++++++-----------
 2 files changed, 70 insertions(+), 17 deletions(-)


base-commit: a9dfb7db96f7bc1f30feae673aab7fdbfbc94e9c
-- 
2.49.0


^ permalink raw reply	[flat|nested] 14+ messages in thread

* [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-03 21:16 [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
@ 2025-06-03 21:16 ` David Hildenbrand
  2025-06-06  8:20   ` Oscar Salvador
  2025-06-06  8:27   ` Oscar Salvador
  2025-06-03 21:16 ` [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud() David Hildenbrand
                   ` (2 subsequent siblings)
  3 siblings, 2 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-03 21:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, David Hildenbrand, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

Marking PMDs that map "normal" refcounted folios as special is
against our rules documented for vm_normal_page().

Fortunately, there are not that many pmd_special() checks that can be
misled, and most vm_normal_page_pmd()/vm_normal_folio_pmd() users that
would get this wrong right now are rather harmless: e.g., none so far
bases the decision whether to grab a folio reference on that check.

Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
implications, it seems.

Getting this right will get more important as we use
folio_normal_page_pmd() in more places.

Fix it by just inlining the relevant code, making the whole
pmd_none() handling cleaner. We can now use folio_mk_pmd().

While at it, make sure that a pmd that is not-none is actually present
before comparing PFNs.

Fixes: 6c88f72691f8 ("mm/huge_memory: add vmf_insert_folio_pmd()")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/huge_memory.c | 39 ++++++++++++++++++++++++++++++++-------
 1 file changed, 32 insertions(+), 7 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d3e66136e41a3..f9e23dfea76f8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1474,9 +1474,10 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 	struct vm_area_struct *vma = vmf->vma;
 	unsigned long addr = vmf->address & PMD_MASK;
 	struct mm_struct *mm = vma->vm_mm;
+	pmd_t *pmd = vmf->pmd;
 	spinlock_t *ptl;
 	pgtable_t pgtable = NULL;
-	int error;
+	pmd_t entry;
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
@@ -1490,17 +1491,41 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 			return VM_FAULT_OOM;
 	}
 
-	ptl = pmd_lock(mm, vmf->pmd);
-	if (pmd_none(*vmf->pmd)) {
+	ptl = pmd_lock(mm, pmd);
+	if (pmd_none(*pmd)) {
 		folio_get(folio);
 		folio_add_file_rmap_pmd(folio, &folio->page, vma);
 		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PMD_NR);
+
+		entry = folio_mk_pmd(folio, vma->vm_page_prot);
+		if (write) {
+			entry = pmd_mkyoung(pmd_mkdirty(entry));
+			entry = maybe_pmd_mkwrite(entry, vma);
+		}
+		set_pmd_at(mm, addr, pmd, entry);
+		update_mmu_cache_pmd(vma, addr, pmd);
+
+		if (pgtable) {
+			pgtable_trans_huge_deposit(mm, pmd, pgtable);
+			mm_inc_nr_ptes(mm);
+			pgtable = NULL;
+		}
+	} else if (pmd_present(*pmd) && write) {
+		/*
+		 * We only allow for upgrading write permissions if the
+		 * same folio is already mapped.
+		 */
+		if (pmd_pfn(*pmd) == folio_pfn(folio)) {
+			entry = pmd_mkyoung(*pmd);
+			entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
+			if (pmdp_set_access_flags(vma, addr, pmd, entry, 1))
+				update_mmu_cache_pmd(vma, addr, pmd);
+		} else {
+			WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
+		}
 	}
-	error = insert_pfn_pmd(vma, addr, vmf->pmd,
-			pfn_to_pfn_t(folio_pfn(folio)), vma->vm_page_prot,
-			write, pgtable);
 	spin_unlock(ptl);
-	if (error && pgtable)
+	if (pgtable)
 		pte_free(mm, pgtable);
 
 	return VM_FAULT_NOPAGE;
-- 
2.49.0



* [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud()
  2025-06-03 21:16 [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
  2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
@ 2025-06-03 21:16 ` David Hildenbrand
  2025-06-03 22:02   ` David Hildenbrand
  2025-06-06  8:27   ` Oscar Salvador
  2025-06-03 21:36 ` [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
  2025-06-05 23:47 ` Dan Williams
  3 siblings, 2 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-03 21:16 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, David Hildenbrand, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

Marking PUDs that map "normal" refcounted folios as special is
against our rules documented for vm_normal_page().

Fortunately, there are not that many pud_special() checks that can be
misled, and they are right now rather harmless: e.g., none so far
bases the decision whether to grab a folio reference on that check.

Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
implications, it seems.

Getting this right will get more important as we introduce
folio_normal_page_pud() and start using it in more places where we
currently special-case based on other VMA flags.

Fix it by just inlining the relevant code, making the whole
pud_none() handling cleaner.

Add folio_mk_pud() to mimic what we do with folio_mk_pmd().

While at it, make sure that the pud that is non-none is actually present
before comparing PFNs.

Fixes: dbe54153296d ("mm/huge_memory: add vmf_insert_folio_pud()")
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 15 +++++++++++++++
 mm/huge_memory.c   | 33 +++++++++++++++++++++++----------
 2 files changed, 38 insertions(+), 10 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 0ef2ba0c667af..047c8261d4002 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1816,6 +1816,21 @@ static inline pmd_t folio_mk_pmd(struct folio *folio, pgprot_t pgprot)
 {
 	return pmd_mkhuge(pfn_pmd(folio_pfn(folio), pgprot));
 }
+
+/**
+ * folio_mk_pud - Create a PUD for this folio
+ * @folio: The folio to create a PUD for
+ * @pgprot: The page protection bits to use
+ *
+ * Create a page table entry for the first page of this folio.
+ * This is suitable for passing to set_pud_at().
+ *
+ * Return: A page table entry suitable for mapping this folio.
+ */
+static inline pud_t folio_mk_pud(struct folio *folio, pgprot_t pgprot)
+{
+	return pud_mkhuge(pfn_pud(folio_pfn(folio), pgprot));
+}
 #endif
 #endif /* CONFIG_MMU */
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index f9e23dfea76f8..7b66a23089381 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1629,6 +1629,7 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 	pud_t *pud = vmf->pud;
 	struct mm_struct *mm = vma->vm_mm;
 	spinlock_t *ptl;
+	pud_t entry;
 
 	if (addr < vma->vm_start || addr >= vma->vm_end)
 		return VM_FAULT_SIGBUS;
@@ -1637,20 +1638,32 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 		return VM_FAULT_SIGBUS;
 
 	ptl = pud_lock(mm, pud);
-
-	/*
-	 * If there is already an entry present we assume the folio is
-	 * already mapped, hence no need to take another reference. We
-	 * still call insert_pfn_pud() though in case the mapping needs
-	 * upgrading to writeable.
-	 */
-	if (pud_none(*vmf->pud)) {
+	if (pud_none(*pud)) {
 		folio_get(folio);
 		folio_add_file_rmap_pud(folio, &folio->page, vma);
 		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
+
+		entry = folio_mk_pud(folio, vma->vm_page_prot);
+		if (write) {
+			entry = pud_mkyoung(pud_mkdirty(entry));
+			entry = maybe_pud_mkwrite(entry, vma);
+		}
+		set_pud_at(mm, addr, pud, entry);
+		update_mmu_cache_pud(vma, addr, pud);
+	} else if (pud_present(*pud) && write) {
+		/*
+		 * We only allow for upgrading write permissions if the
+		 * same folio is already mapped.
+		 */
+		if (pud_pfn(*pud) == folio_pfn(folio)) {
+			entry = pud_mkyoung(*pud);
+			entry = maybe_pud_mkwrite(pud_mkdirty(entry), vma);
+			if (pudp_set_access_flags(vma, addr, pud, entry, 1))
+				update_mmu_cache_pud(vma, addr, pud);
+		} else {
+			WARN_ON_ONCE(1);
+		}
 	}
-	insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)),
-		write);
 	spin_unlock(ptl);
 
 	return VM_FAULT_NOPAGE;
-- 
2.49.0



* Re: [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*()
  2025-06-03 21:16 [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
  2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
  2025-06-03 21:16 ` [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud() David Hildenbrand
@ 2025-06-03 21:36 ` David Hildenbrand
  2025-06-05 23:47 ` Dan Williams
  3 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-03 21:36 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, Andrew Morton, Alistair Popple,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Zi Yan, Baolin Wang, Nico Pache,
	Ryan Roberts, Dev Jain, Dan Williams

On 03.06.25 23:16, David Hildenbrand wrote:
> Based on Linus' master.
> 
> While working on improving vm_normal_page() and friends, I stumbled
> over this issue: refcounted "normal" pages must not be marked
> using pmd_special() / pud_special().
> 
> Fortunately, so far there doesn't seem to be serious damage.
> 
> This is only compile-tested so far. Still looking for an easy way to test
> PMD/PUD mappings with DAX. Any tests I can easily run?

... aaaand I should have waited for the cross compiles.

s390x and loongarch are not happy about pud_mkhuge() and pfn_pud().

Need to fence folio_mk_pud() by CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD.

-- 
Cheers,

David / dhildenb



* Re: [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud()
  2025-06-03 21:16 ` [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud() David Hildenbrand
@ 2025-06-03 22:02   ` David Hildenbrand
  2025-06-06  8:27   ` Oscar Salvador
  1 sibling, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-03 22:02 UTC (permalink / raw)
  To: linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, Andrew Morton, Alistair Popple,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Zi Yan, Baolin Wang, Nico Pache,
	Ryan Roberts, Dev Jain, Dan Williams

On 03.06.25 23:16, David Hildenbrand wrote:
> Marking PUDs that map "normal" refcounted folios as special is
> against our rules documented for vm_normal_page().
> 
> Fortunately, there are not that many pud_special() checks that can be
> misled, and they are right now rather harmless: e.g., none so far
> bases the decision whether to grab a folio reference on that check.
> 
> Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
> implications, it seems.
> 
> Getting this right will get more important as we introduce
> folio_normal_page_pud() and start using it in more places where we
> currently special-case based on other VMA flags.
> 
> Fix it by just inlining the relevant code, making the whole
> pud_none() handling cleaner.
> 
> Add folio_mk_pud() to mimic what we do with folio_mk_pmd().
> 
> While at it, make sure that the pud that is non-none is actually present
> before comparing PFNs.
> 
> Fixes: dbe54153296d ("mm/huge_memory: add vmf_insert_folio_pud()")
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>   include/linux/mm.h | 15 +++++++++++++++
>   mm/huge_memory.c   | 33 +++++++++++++++++++++++----------
>   2 files changed, 38 insertions(+), 10 deletions(-)
> 
> diff --git a/include/linux/mm.h b/include/linux/mm.h
> index 0ef2ba0c667af..047c8261d4002 100644
> --- a/include/linux/mm.h
> +++ b/include/linux/mm.h
> @@ -1816,6 +1816,21 @@ static inline pmd_t folio_mk_pmd(struct folio *folio, pgprot_t pgprot)
>   {
>   	return pmd_mkhuge(pfn_pmd(folio_pfn(folio), pgprot));
>   }
> +
> +/**
> + * folio_mk_pud - Create a PUD for this folio
> + * @folio: The folio to create a PUD for
> + * @pgprot: The page protection bits to use
> + *
> + * Create a page table entry for the first page of this folio.
> + * This is suitable for passing to set_pud_at().
> + *
> + * Return: A page table entry suitable for mapping this folio.
> + */
> +static inline pud_t folio_mk_pud(struct folio *folio, pgprot_t pgprot)
> +{
> +	return pud_mkhuge(pfn_pud(folio_pfn(folio), pgprot));
> +}
>   #endif
>   #endif /* CONFIG_MMU */

The following on top should make cross-compiles happy (git diff output because it's late
here, probably whitespace messed up):


diff --git a/include/linux/mm.h b/include/linux/mm.h
index 047c8261d4002..b7e2abd8ce0df 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -1817,6 +1817,7 @@ static inline pmd_t folio_mk_pmd(struct folio *folio, pgprot_t pgprot)
         return pmd_mkhuge(pfn_pmd(folio_pfn(folio), pgprot));
  }
  
+#ifdef CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD
  /**
   * folio_mk_pud - Create a PUD for this folio
   * @folio: The folio to create a PUD for
@@ -1831,7 +1832,8 @@ static inline pud_t folio_mk_pud(struct folio *folio, pgprot_t pgprot)
  {
         return pud_mkhuge(pfn_pud(folio_pfn(folio), pgprot));
  }
-#endif
+#endif /* CONFIG_HAVE_ARCH_TRANSPARENT_HUGEPAGE_PUD */
+#endif /* CONFIG_TRANSPARENT_HUGEPAGE */
  #endif /* CONFIG_MMU */
  
  static inline bool folio_has_pincount(const struct folio *folio

-- 
Cheers,

David / dhildenb



* Re: [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*()
  2025-06-03 21:16 [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
                   ` (2 preceding siblings ...)
  2025-06-03 21:36 ` [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
@ 2025-06-05 23:47 ` Dan Williams
  2025-06-06  7:28   ` David Hildenbrand
  3 siblings, 1 reply; 14+ messages in thread
From: Dan Williams @ 2025-06-05 23:47 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, David Hildenbrand, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

David Hildenbrand wrote:
> Based on Linus' master.
> 
> While working on improving vm_normal_page() and friends, I stumbled
> over this issue: refcounted "normal" pages must not be marked
> using pmd_special() / pud_special().
> 
> Fortunately, so far there doesn't seem to be serious damage.
> 
> This is only compile-tested so far. Still looking for an easy way to test
> PMD/PUD mappings with DAX. Any tests I can easily run?

The way I test this I would not classify as "easy"; it is a bit of a pain
to set up, but it is passing here:

[root@host ndctl]# meson test -C build --suite ndctl:dax
ninja: Entering directory `/root/git/ndctl/build'
[168/168] Linking target cxl/cxl
 1/13 ndctl:dax / daxdev-errors.sh          OK              14.30s
 2/13 ndctl:dax / multi-dax.sh              OK               2.89s
 3/13 ndctl:dax / sub-section.sh            OK               8.40s
 4/13 ndctl:dax / dax-dev                   OK               0.06s
 5/13 ndctl:dax / dax-ext4.sh               OK              20.53s
 6/13 ndctl:dax / dax-xfs.sh                OK              20.34s
 7/13 ndctl:dax / device-dax                OK              11.67s
 8/13 ndctl:dax / revoke-devmem             OK               0.25s
 9/13 ndctl:dax / device-dax-fio.sh         OK              34.02s
10/13 ndctl:dax / daxctl-devices.sh         OK               3.44s
11/13 ndctl:dax / daxctl-create.sh          SKIP             0.32s   exit status 77
12/13 ndctl:dax / dm.sh                     OK               1.33s
13/13 ndctl:dax / mmap.sh                   OK              85.12s

...ignore the SKIP, that seems to be caused by an acpi-einj regression.

However, how about not duplicating the internals of insert_pfn_p[mu]d()
with something like the below. Either way you can add:

Tested-by: Dan Williams <dan.j.williams@intel.com>
Reviewed-by: Dan Williams <dan.j.williams@intel.com>

base-commit: a9dfb7db96f7bc1f30feae673aab7fdbfbc94e9c

-- 8< --
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index d3e66136e41a..cce4456aa62b 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -1372,9 +1372,9 @@ vm_fault_t do_huge_pmd_anonymous_page(struct vm_fault *vmf)
 	return __do_huge_pmd_anonymous_page(vmf);
 }
 
-static int insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
-		pmd_t *pmd, pfn_t pfn, pgprot_t prot, bool write,
-		pgtable_t pgtable)
+static int insert_pmd(struct vm_area_struct *vma, unsigned long addr,
+		       pmd_t *pmd, pfn_t pfn, struct folio *folio,
+		       pgprot_t prot, bool write, pgtable_t pgtable)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pmd_t entry;
@@ -1397,9 +1397,7 @@ static int insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
 	}
 
 	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
-	if (pfn_t_devmap(pfn))
-		entry = pmd_mkdevmap(entry);
-	else
+	if (!folio)
 		entry = pmd_mkspecial(entry);
 	if (write) {
 		entry = pmd_mkyoung(pmd_mkdirty(entry));
@@ -1458,8 +1456,8 @@ vm_fault_t vmf_insert_pfn_pmd(struct vm_fault *vmf, pfn_t pfn, bool write)
 	pfnmap_setup_cachemode_pfn(pfn_t_to_pfn(pfn), &pgprot);
 
 	ptl = pmd_lock(vma->vm_mm, vmf->pmd);
-	error = insert_pfn_pmd(vma, addr, vmf->pmd, pfn, pgprot, write,
-			pgtable);
+	error = insert_pmd(vma, addr, vmf->pmd, pfn, NULL, pgprot, write,
+			   pgtable);
 	spin_unlock(ptl);
 	if (error && pgtable)
 		pte_free(vma->vm_mm, pgtable);
@@ -1496,9 +1494,8 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
 		folio_add_file_rmap_pmd(folio, &folio->page, vma);
 		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PMD_NR);
 	}
-	error = insert_pfn_pmd(vma, addr, vmf->pmd,
-			pfn_to_pfn_t(folio_pfn(folio)), vma->vm_page_prot,
-			write, pgtable);
+	error = insert_pmd(vma, addr, vmf->pmd, pfn_to_pfn_t(folio_pfn(folio)),
+			   folio, vma->vm_page_prot, write, pgtable);
 	spin_unlock(ptl);
 	if (error && pgtable)
 		pte_free(mm, pgtable);
@@ -1515,8 +1512,8 @@ static pud_t maybe_pud_mkwrite(pud_t pud, struct vm_area_struct *vma)
 	return pud;
 }
 
-static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
-		pud_t *pud, pfn_t pfn, bool write)
+static void insert_pud(struct vm_area_struct *vma, unsigned long addr,
+		       pud_t *pud, pfn_t pfn, struct folio *folio, bool write)
 {
 	struct mm_struct *mm = vma->vm_mm;
 	pgprot_t prot = vma->vm_page_prot;
@@ -1535,9 +1532,7 @@ static void insert_pfn_pud(struct vm_area_struct *vma, unsigned long addr,
 	}
 
 	entry = pud_mkhuge(pfn_t_pud(pfn, prot));
-	if (pfn_t_devmap(pfn))
-		entry = pud_mkdevmap(entry);
-	else
+	if (!folio)
 		entry = pud_mkspecial(entry);
 	if (write) {
 		entry = pud_mkyoung(pud_mkdirty(entry));
@@ -1581,7 +1576,7 @@ vm_fault_t vmf_insert_pfn_pud(struct vm_fault *vmf, pfn_t pfn, bool write)
 	pfnmap_setup_cachemode_pfn(pfn_t_to_pfn(pfn), &pgprot);
 
 	ptl = pud_lock(vma->vm_mm, vmf->pud);
-	insert_pfn_pud(vma, addr, vmf->pud, pfn, write);
+	insert_pud(vma, addr, vmf->pud, pfn, NULL, write);
 	spin_unlock(ptl);
 
 	return VM_FAULT_NOPAGE;
@@ -1616,7 +1611,7 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 	/*
 	 * If there is already an entry present we assume the folio is
 	 * already mapped, hence no need to take another reference. We
-	 * still call insert_pfn_pud() though in case the mapping needs
+	 * still call insert_pud() though in case the mapping needs
 	 * upgrading to writeable.
 	 */
 	if (pud_none(*vmf->pud)) {
@@ -1624,8 +1619,8 @@ vm_fault_t vmf_insert_folio_pud(struct vm_fault *vmf, struct folio *folio,
 		folio_add_file_rmap_pud(folio, &folio->page, vma);
 		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PUD_NR);
 	}
-	insert_pfn_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)),
-		write);
+	insert_pud(vma, addr, vmf->pud, pfn_to_pfn_t(folio_pfn(folio)), folio,
+		   write);
 	spin_unlock(ptl);
 
 	return VM_FAULT_NOPAGE;


* Re: [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*()
  2025-06-05 23:47 ` Dan Williams
@ 2025-06-06  7:28   ` David Hildenbrand
  0 siblings, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-06  7:28 UTC (permalink / raw)
  To: Dan Williams, linux-kernel
  Cc: linux-mm, nvdimm, linux-cxl, Andrew Morton, Alistair Popple,
	Lorenzo Stoakes, Liam R. Howlett, Vlastimil Babka, Mike Rapoport,
	Suren Baghdasaryan, Michal Hocko, Zi Yan, Baolin Wang, Nico Pache,
	Ryan Roberts, Dev Jain

Hi Dan,

On 06.06.25 01:47, Dan Williams wrote:
> David Hildenbrand wrote:
>> Based on Linus' master.
>>
>> While working on improving vm_normal_page() and friends, I stumbled
>> over this issue: refcounted "normal" pages must not be marked
>> using pmd_special() / pud_special().
>>
>> Fortunately, so far there doesn't seem to be serious damage.
>>
>> This is only compile-tested so far. Still looking for an easy way to test
>> PMD/PUD mappings with DAX. Any tests I can easily run?
> 
> The way I test this I would not classify as "easy"; it is a bit of a pain
> to set up, but it is passing here:

I guess most of the instructions are in

https://github.com/pmem/ndctl

?

I would assume that we need to set aside some special dax area using 
early boot params (memmap=).

Might come in handy in the future.
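
[Editor's note: for reference, a minimal memmap=-based setup along the lines
mentioned above might look as follows. This is a sketch only: the reserved
range, namespace name, and build directory are illustrative and depend on the
machine; the range must be backed by actual RAM.]

```shell
# Reserve 4G of RAM starting at the 16G boundary as emulated persistent
# memory by appending to the kernel command line, then reboot:
#   memmap=4G!16G

# A /dev/pmem0 device should then appear. Reconfigure the namespace into
# device-dax mode so PMD/PUD DAX mappings can be exercised:
ndctl create-namespace --mode=devdax --map=mem -e namespace0.0 -f

# Finally, run the dax suite from an ndctl build tree:
meson test -C build --suite ndctl:dax
```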

> 
> [root@host ndctl]# meson test -C build --suite ndctl:dax
> ninja: Entering directory `/root/git/ndctl/build'
> [168/168] Linking target cxl/cxl
>   1/13 ndctl:dax / daxdev-errors.sh          OK              14.30s
>   2/13 ndctl:dax / multi-dax.sh              OK               2.89s
>   3/13 ndctl:dax / sub-section.sh            OK               8.40s
>   4/13 ndctl:dax / dax-dev                   OK               0.06s
>   5/13 ndctl:dax / dax-ext4.sh               OK              20.53s
>   6/13 ndctl:dax / dax-xfs.sh                OK              20.34s
>   7/13 ndctl:dax / device-dax                OK              11.67s
>   8/13 ndctl:dax / revoke-devmem             OK               0.25s
>   9/13 ndctl:dax / device-dax-fio.sh         OK              34.02s
> 10/13 ndctl:dax / daxctl-devices.sh         OK               3.44s
> 11/13 ndctl:dax / daxctl-create.sh          SKIP             0.32s   exit status 77
> 12/13 ndctl:dax / dm.sh                     OK               1.33s
> 13/13 ndctl:dax / mmap.sh                   OK              85.12s
> 
> ...ignore the SKIP, that seems to be caused by an acpi-einj regression.

Thanks for running these tests!

> 
> However, how about not duplicating the internals of insert_pfn_p[mu]d()
> with something like the below. Either way you can add:

I considered that, but I prefer the current end result, where we clean up 
the pmd_none() handling and don't mess with folios and pfns at the same time.

... just like we do for insert_pfn() vs. insert_page(), I don't think 
these code paths should be unified.

(we should do more sanity checks like validate_page_before_insert() 
etc., but that's something for another day :) )


> 
> Tested-by: Dan Williams <dan.j.williams@intel.com>
> Reviewed-by: Dan Williams <dan.j.williams@intel.com>

Thanks!


Let me resend with the fixup squashed.

-- 
Cheers,

David / dhildenb



* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
@ 2025-06-06  8:20   ` Oscar Salvador
  2025-06-06  8:23     ` David Hildenbrand
  2025-06-06  8:27   ` Oscar Salvador
  1 sibling, 1 reply; 14+ messages in thread
From: Oscar Salvador @ 2025-06-06  8:20 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On Tue, Jun 03, 2025 at 11:16:33PM +0200, David Hildenbrand wrote:
> Marking PMDs that map "normal" refcounted folios as special is
> against our rules documented for vm_normal_page().
> 
> Fortunately, there are not that many pmd_special() checks that can be
> misled, and most vm_normal_page_pmd()/vm_normal_folio_pmd() users that
> would get this wrong right now are rather harmless: e.g., none so far
> bases the decision whether to grab a folio reference on that check.
> 
> Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
> implications, it seems.
> 
> Getting this right will get more important as we use
> folio_normal_page_pmd() in more places.
> 
> Fix it by just inlining the relevant code, making the whole
> pmd_none() handling cleaner. We can now use folio_mk_pmd().
> 
> While at it, make sure that a pmd that is not-none is actually present
> before comparing PFNs.
> 
> Fixes: 6c88f72691f8 ("mm/huge_memory: add vmf_insert_folio_pmd()")
> Signed-off-by: David Hildenbrand <david@redhat.com>

Hi David,

> ---
>  mm/huge_memory.c | 39 ++++++++++++++++++++++++++++++++-------
>  1 file changed, 32 insertions(+), 7 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index d3e66136e41a3..f9e23dfea76f8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -1474,9 +1474,10 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
>  	struct vm_area_struct *vma = vmf->vma;
>  	unsigned long addr = vmf->address & PMD_MASK;
>  	struct mm_struct *mm = vma->vm_mm;
> +	pmd_t *pmd = vmf->pmd;
>  	spinlock_t *ptl;
>  	pgtable_t pgtable = NULL;
> -	int error;
> +	pmd_t entry;
>  
>  	if (addr < vma->vm_start || addr >= vma->vm_end)
>  		return VM_FAULT_SIGBUS;
> @@ -1490,17 +1491,41 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
>  			return VM_FAULT_OOM;
>  	}
>  
> -	ptl = pmd_lock(mm, vmf->pmd);
> -	if (pmd_none(*vmf->pmd)) {
> +	ptl = pmd_lock(mm, pmd);
> +	if (pmd_none(*pmd)) {
>  		folio_get(folio);
>  		folio_add_file_rmap_pmd(folio, &folio->page, vma);
>  		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PMD_NR);
> +
> +		entry = folio_mk_pmd(folio, vma->vm_page_prot);
> +		if (write) {
> +			entry = pmd_mkyoung(pmd_mkdirty(entry));
> +			entry = maybe_pmd_mkwrite(entry, vma);
> +		}
> +		set_pmd_at(mm, addr, pmd, entry);
> +		update_mmu_cache_pmd(vma, addr, pmd);
> +
> +		if (pgtable) {
> +			pgtable_trans_huge_deposit(mm, pmd, pgtable);
> +			mm_inc_nr_ptes(mm);
> +			pgtable = NULL;
> +		}
> +	} else if (pmd_present(*pmd) && write) {
> +		/*
> +		 * We only allow for upgrading write permissions if the
> +		 * same folio is already mapped.
> +		 */
> +		if (pmd_pfn(*pmd) == folio_pfn(folio)) {
> +			entry = pmd_mkyoung(*pmd);
> +			entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
> +			if (pmdp_set_access_flags(vma, addr, pmd, entry, 1))
> +				update_mmu_cache_pmd(vma, addr, pmd);
> +		} else {
> +			WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
> +		}

So, this is pretty much insert_pfn_pmd() without pmd_mkdevmap()/pmd_mkspecial().
I guess vmf_insert_folio_pmd() doesn't have to be concerned with devmaps
either, right?

Looks good to me, just a nit: would it not be better to pass a boolean
to insert_pfn_pmd() that lets it know whether it "can" create
devmap/special entries?
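
[Editor's note: for illustration, the boolean variant suggested above might
look roughly like the sketch below. This is hypothetical and untested; the
parameter name is made up, and the elided body is unchanged from
insert_pfn_pmd().]

```c
/*
 * Sketch: let the caller state whether a special (non-refcounted)
 * mapping may be created, instead of duplicating the insertion logic
 * in vmf_insert_folio_pmd().
 */
static int insert_pfn_pmd(struct vm_area_struct *vma, unsigned long addr,
		pmd_t *pmd, pfn_t pfn, pgprot_t prot, bool write,
		bool special, pgtable_t pgtable)
{
	...
	entry = pmd_mkhuge(pfn_t_pmd(pfn, prot));
	if (special)
		entry = pmd_mkspecial(entry);
	...
}
```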


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-06  8:20   ` Oscar Salvador
@ 2025-06-06  8:23     ` David Hildenbrand
  2025-06-06  8:26       ` Oscar Salvador
  0 siblings, 1 reply; 14+ messages in thread
From: David Hildenbrand @ 2025-06-06  8:23 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On 06.06.25 10:20, Oscar Salvador wrote:
> On Tue, Jun 03, 2025 at 11:16:33PM +0200, David Hildenbrand wrote:
>> Marking PMDs that map "normal" refcounted folios as special is
>> against our rules documented for vm_normal_page().
>>
>> Fortunately, there are not that many pmd_special() checks that can be
>> misled, and most vm_normal_page_pmd()/vm_normal_folio_pmd() users that
>> would get this wrong right now are rather harmless: e.g., none so far
>> bases the decision whether to grab a folio reference on that check.
>>
>> Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
>> implications, it seems.
>>
>> Getting this right will get more important as we use
>> folio_normal_page_pmd() in more places.
>>
>> Fix it by just inlining the relevant code, making the whole
>> pmd_none() handling cleaner. We can now use folio_mk_pmd().
>>
>> While at it, make sure that a pmd that is not-none is actually present
>> before comparing PFNs.
>>
>> Fixes: 6c88f72691f8 ("mm/huge_memory: add vmf_insert_folio_pmd()")
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Hi David,
> 
>> ---
>>   mm/huge_memory.c | 39 ++++++++++++++++++++++++++++++++-------
>>   1 file changed, 32 insertions(+), 7 deletions(-)
>>
>> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
>> index d3e66136e41a3..f9e23dfea76f8 100644
>> --- a/mm/huge_memory.c
>> +++ b/mm/huge_memory.c
>> @@ -1474,9 +1474,10 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
>>   	struct vm_area_struct *vma = vmf->vma;
>>   	unsigned long addr = vmf->address & PMD_MASK;
>>   	struct mm_struct *mm = vma->vm_mm;
>> +	pmd_t *pmd = vmf->pmd;
>>   	spinlock_t *ptl;
>>   	pgtable_t pgtable = NULL;
>> -	int error;
>> +	pmd_t entry;
>>   
>>   	if (addr < vma->vm_start || addr >= vma->vm_end)
>>   		return VM_FAULT_SIGBUS;
>> @@ -1490,17 +1491,41 @@ vm_fault_t vmf_insert_folio_pmd(struct vm_fault *vmf, struct folio *folio,
>>   			return VM_FAULT_OOM;
>>   	}
>>   
>> -	ptl = pmd_lock(mm, vmf->pmd);
>> -	if (pmd_none(*vmf->pmd)) {
>> +	ptl = pmd_lock(mm, pmd);
>> +	if (pmd_none(*pmd)) {
>>   		folio_get(folio);
>>   		folio_add_file_rmap_pmd(folio, &folio->page, vma);
>>   		add_mm_counter(mm, mm_counter_file(folio), HPAGE_PMD_NR);
>> +
>> +		entry = folio_mk_pmd(folio, vma->vm_page_prot);
>> +		if (write) {
>> +			entry = pmd_mkyoung(pmd_mkdirty(entry));
>> +			entry = maybe_pmd_mkwrite(entry, vma);
>> +		}
>> +		set_pmd_at(mm, addr, pmd, entry);
>> +		update_mmu_cache_pmd(vma, addr, pmd);
>> +
>> +		if (pgtable) {
>> +			pgtable_trans_huge_deposit(mm, pmd, pgtable);
>> +			mm_inc_nr_ptes(mm);
>> +			pgtable = NULL;
>> +		}
>> +	} else if (pmd_present(*pmd) && write) {
>> +		/*
>> +		 * We only allow for upgrading write permissions if the
>> +		 * same folio is already mapped.
>> +		 */
>> +		if (pmd_pfn(*pmd) == folio_pfn(folio)) {
>> +			entry = pmd_mkyoung(*pmd);
>> +			entry = maybe_pmd_mkwrite(pmd_mkdirty(entry), vma);
>> +			if (pmdp_set_access_flags(vma, addr, pmd, entry, 1))
>> +				update_mmu_cache_pmd(vma, addr, pmd);
>> +		} else {
>> +			WARN_ON_ONCE(!is_huge_zero_pmd(*pmd));
>> +		}
> 
> So, this is pretty much insert_pfn_pmd without pmd_mkdevmap/pmd_mkspecial().
> I guess vmf_insert_folio_pmd() doesn't have to be concerned with devmaps
> either, right?
> 
> Looks good to me, just a nit: would it not be better to pass a boolean
> to insert_pfn_pmd() that lets it know whether it "can" create
> devmap/special entries?

See my reply to Dan.

Yet another boolean, yuck. Passing the folio and the pfn, yuck.

(I have a strong opinion here ;) )

-- 
Cheers,

David / dhildenb



* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-06  8:23     ` David Hildenbrand
@ 2025-06-06  8:26       ` Oscar Salvador
  2025-06-06  8:52         ` David Hildenbrand
  2025-06-06 18:41         ` David Hildenbrand
  0 siblings, 2 replies; 14+ messages in thread
From: Oscar Salvador @ 2025-06-06  8:26 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On Fri, Jun 06, 2025 at 10:23:11AM +0200, David Hildenbrand wrote:
> See my reply to Dan.
> 
> Yet another boolean, yuck. Passing the folio and the pfn, yuck.
> 
> (I have a strong opinion here ;) )

ok, I see it was already considered. No more questions then ;-)


-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
  2025-06-06  8:20   ` Oscar Salvador
@ 2025-06-06  8:27   ` Oscar Salvador
  1 sibling, 0 replies; 14+ messages in thread
From: Oscar Salvador @ 2025-06-06  8:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On Tue, Jun 03, 2025 at 11:16:33PM +0200, David Hildenbrand wrote:
> Marking PMDs that map "normal" refcounted folios as special is
> against our rules documented for vm_normal_page().
> 
> Fortunately, there are not that many pmd_special() checks that can be
> misled, and most vm_normal_page_pmd()/vm_normal_folio_pmd() users that
> would get this wrong right now are rather harmless: e.g., none so far
> bases the decision whether to grab a folio reference on it.
> 
> Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
> implications, as it seems.
> 
> Getting this right will get more important as we use
> folio_normal_page_pmd() in more places.
> 
> Fix it by just inlining the relevant code, making the whole
> pmd_none() handling cleaner. We can now use folio_mk_pmd().
> 
> While at it, make sure that a pmd that is not-none is actually present
> before comparing PFNs.
> 
> Fixes: 6c88f72691f8 ("mm/huge_memory: add vmf_insert_folio_pmd()")
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>



-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud()
  2025-06-03 21:16 ` [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud() David Hildenbrand
  2025-06-03 22:02   ` David Hildenbrand
@ 2025-06-06  8:27   ` Oscar Salvador
  1 sibling, 0 replies; 14+ messages in thread
From: Oscar Salvador @ 2025-06-06  8:27 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On Tue, Jun 03, 2025 at 11:16:34PM +0200, David Hildenbrand wrote:
> Marking PUDs that map "normal" refcounted folios as special is
> against our rules documented for vm_normal_page().
> 
> Fortunately, there are not that many pud_special() checks that can be
> misled, and they are right now rather harmless: e.g., none so far
> bases the decision whether to grab a folio reference on it.
> 
> Well, and GUP-fast will fall back to GUP-slow. All in all, so far no big
> implications, as it seems.
> 
> Getting this right will get more important as we introduce
> folio_normal_page_pud() and start using it in more places where we
> currently special-case based on other VMA flags.
> 
> Fix it by just inlining the relevant code, making the whole
> pud_none() handling cleaner.
> 
> Add folio_mk_pud() to mimic what we do with folio_mk_pmd().
> 
> While at it, make sure that the pud that is non-none is actually present
> before comparing PFNs.
> 
> Fixes: dbe54153296d ("mm/huge_memory: add vmf_insert_folio_pud()")
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Oscar Salvador <osalvador@suse.de>

 

-- 
Oscar Salvador
SUSE Labs


* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-06  8:26       ` Oscar Salvador
@ 2025-06-06  8:52         ` David Hildenbrand
  2025-06-06 18:41         ` David Hildenbrand
  1 sibling, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-06  8:52 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On 06.06.25 10:26, Oscar Salvador wrote:
> On Fri, Jun 06, 2025 at 10:23:11AM +0200, David Hildenbrand wrote:
>> See my reply to Dan.
>>
>> Yet another boolean, yuck. Passing the folio and the pfn, yuck.
>>
>> (I have a strong opinion here ;) )
> 
> ok, I see it was already considered. No more questions then ;-)

Yeah, the first version I had did exactly that (second boolean), and I 
concluded that we are simply not "inserting PFNs".

Especially the second patch is much clearer on the pud_none() handling
(similar to insert_page_into_pte_locked(), where we only have exactly one
such check).

I'll have some follow-up patches planned that will add more pfn-specific 
and folio-specific sanity checks.

-- 
Cheers,

David / dhildenb



* Re: [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd()
  2025-06-06  8:26       ` Oscar Salvador
  2025-06-06  8:52         ` David Hildenbrand
@ 2025-06-06 18:41         ` David Hildenbrand
  1 sibling, 0 replies; 14+ messages in thread
From: David Hildenbrand @ 2025-06-06 18:41 UTC (permalink / raw)
  To: Oscar Salvador
  Cc: linux-kernel, linux-mm, nvdimm, linux-cxl, Andrew Morton,
	Alistair Popple, Lorenzo Stoakes, Liam R. Howlett,
	Vlastimil Babka, Mike Rapoport, Suren Baghdasaryan, Michal Hocko,
	Zi Yan, Baolin Wang, Nico Pache, Ryan Roberts, Dev Jain,
	Dan Williams

On 06.06.25 10:26, Oscar Salvador wrote:
> On Fri, Jun 06, 2025 at 10:23:11AM +0200, David Hildenbrand wrote:
>> See my reply to Dan.
>>
>> Yet another boolean, yuck. Passing the folio and the pfn, yuck.
>>
>> (I have a strong opinion here ;) )
> 
> ok, I see it was already considered. No more questions then ;-)

Okay, change of plans, let me try to unify both functions in a clean way :)

I'll only have to find a clean way to make the other stuff I have 
planned with these functions work ...

-- 
Cheers,

David / dhildenb



end of thread, other threads:[~2025-06-06 18:41 UTC | newest]

Thread overview: 14+ messages
-- links below jump to the message on this page --
2025-06-03 21:16 [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
2025-06-03 21:16 ` [PATCH v1 1/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pmd() David Hildenbrand
2025-06-06  8:20   ` Oscar Salvador
2025-06-06  8:23     ` David Hildenbrand
2025-06-06  8:26       ` Oscar Salvador
2025-06-06  8:52         ` David Hildenbrand
2025-06-06 18:41         ` David Hildenbrand
2025-06-06  8:27   ` Oscar Salvador
2025-06-03 21:16 ` [PATCH v1 2/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_pud() David Hildenbrand
2025-06-03 22:02   ` David Hildenbrand
2025-06-06  8:27   ` Oscar Salvador
2025-06-03 21:36 ` [PATCH v1 0/2] mm/huge_memory: don't mark refcounted pages special in vmf_insert_folio_*() David Hildenbrand
2025-06-05 23:47 ` Dan Williams
2025-06-06  7:28   ` David Hildenbrand
