wireguard.lists.zx2c4.com archive mirror
* [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof()
       [not found] <20250827220141.262669-1-david@redhat.com>
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <3hpjmfa6p3onfdv4ma4nv2tdggvsyarh7m36aufy6hvwqtp2wd@2odohwxkl3rk>
       [not found]   ` <f195300e-42e2-4eaa-84c8-c37501c3339c@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 07/36] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() David Hildenbrand
                   ` (22 subsequent siblings)
  23 siblings, 2 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Zi Yan, SeongJae Park, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86

Let's reject them early, which in turn makes folio_alloc_gigantic() reject
them properly.

To avoid converting from order to nr_pages, let's just add MAX_FOLIO_ORDER
and calculate MAX_FOLIO_NR_PAGES based on that.

Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: SeongJae Park <sj@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 6 ++++--
 mm/page_alloc.c    | 5 ++++-
 2 files changed, 8 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 00c8a54127d37..77737cbf2216a 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2055,11 +2055,13 @@ static inline long folio_nr_pages(const struct folio *folio)
 
 /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
 #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-#define MAX_FOLIO_NR_PAGES	(1UL << PUD_ORDER)
+#define MAX_FOLIO_ORDER		PUD_ORDER
 #else
-#define MAX_FOLIO_NR_PAGES	MAX_ORDER_NR_PAGES
+#define MAX_FOLIO_ORDER		MAX_PAGE_ORDER
 #endif
 
+#define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
+
 /*
  * compound_nr() returns the number of pages in this potentially compound
  * page.  compound_nr() can be called on a tail page, and is defined to
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index baead29b3e67b..426bc404b80cc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6833,6 +6833,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
 int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 			      acr_flags_t alloc_flags, gfp_t gfp_mask)
 {
+	const unsigned int order = ilog2(end - start);
 	unsigned long outer_start, outer_end;
 	int ret = 0;
 
@@ -6850,6 +6851,9 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 					    PB_ISOLATE_MODE_CMA_ALLOC :
 					    PB_ISOLATE_MODE_OTHER;
 
+	if (WARN_ON_ONCE((gfp_mask & __GFP_COMP) && order > MAX_FOLIO_ORDER))
+		return -EINVAL;
+
 	gfp_mask = current_gfp_context(gfp_mask);
 	if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask))
 		return -EINVAL;
@@ -6947,7 +6951,6 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 			free_contig_range(end, outer_end - end);
 	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
 		struct page *head = pfn_to_page(start);
-		int order = ilog2(end - start);
 
 		check_new_pages(head, order);
 		prep_new_page(head, order, gfp_mask, 0);
-- 
2.50.1


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* [PATCH v1 07/36] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages()
       [not found] <20250827220141.262669-1-david@redhat.com>
  2025-08-27 22:01 ` [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 08/36] mm/hugetlb: check for unreasonable folio sizes when registering hstate David Hildenbrand
                   ` (21 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, SeongJae Park, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Let's reject unreasonable folio sizes early, where we can still fail.
We'll add sanity checks to prep_compound_head()/prep_compound_page()
next.

Is there a way to configure a system such that unreasonable folio sizes
would be possible here? If so, it would already be rather questionable,
and we'd probably want to bail out even earlier, avoiding the WARN and
reporting a proper error message that indicates what went wrong.

Acked-by: SeongJae Park <sj@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/memremap.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/mm/memremap.c b/mm/memremap.c
index b0ce0d8254bd8..a2d4bb88f64b6 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -275,6 +275,9 @@ void *memremap_pages(struct dev_pagemap *pgmap, int nid)
 
 	if (WARN_ONCE(!nr_range, "nr_range must be specified\n"))
 		return ERR_PTR(-EINVAL);
+	if (WARN_ONCE(pgmap->vmemmap_shift > MAX_FOLIO_ORDER,
+		      "requested folio size unsupported\n"))
+		return ERR_PTR(-EINVAL);
 
 	switch (pgmap->type) {
 	case MEMORY_DEVICE_PRIVATE:
-- 
2.50.1



* [PATCH v1 08/36] mm/hugetlb: check for unreasonable folio sizes when registering hstate
       [not found] <20250827220141.262669-1-david@redhat.com>
  2025-08-27 22:01 ` [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 07/36] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <fa3425dd-df25-4a0b-a27e-614c81d301c4@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 09/36] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() David Hildenbrand
                   ` (20 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Let's check that no hstate that corresponds to an unreasonable folio size
is registered by an architecture. If we were to succeed registering, we
could later try allocating an unsupported gigantic folio size.

Further, let's add a BUILD_BUG_ON() for checking that HUGETLB_PAGE_ORDER
is sane at build time. As HUGETLB_PAGE_ORDER is dynamic on powerpc, we have
to use a BUILD_BUG_ON_INVALID() to make it compile.

No existing kernel configuration should be able to trigger this check:
either SPARSEMEM without SPARSEMEM_VMEMMAP cannot be configured, or
gigantic folios cannot exceed a single memory section on such configs.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/hugetlb.c | 2 ++
 1 file changed, 2 insertions(+)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 572b6f7772841..4a97e4f14c0dc 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -4657,6 +4657,7 @@ static int __init hugetlb_init(void)
 
 	BUILD_BUG_ON(sizeof_field(struct page, private) * BITS_PER_BYTE <
 			__NR_HPAGEFLAGS);
+	BUILD_BUG_ON_INVALID(HUGETLB_PAGE_ORDER > MAX_FOLIO_ORDER);
 
 	if (!hugepages_supported()) {
 		if (hugetlb_max_hstate || default_hstate_max_huge_pages)
@@ -4740,6 +4741,7 @@ void __init hugetlb_add_hstate(unsigned int order)
 	}
 	BUG_ON(hugetlb_max_hstate >= HUGE_MAX_HSTATE);
 	BUG_ON(order < order_base_2(__NR_USED_SUBPAGE));
+	WARN_ON(order > MAX_FOLIO_ORDER);
 	h = &hstates[hugetlb_max_hstate++];
 	__mutex_init(&h->resize_lock, "resize mutex", &h->resize_key);
 	h->order = order;
-- 
2.50.1



* [PATCH v1 09/36] mm/mm_init: make memmap_init_compound() look more like prep_compound_page()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (2 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 08/36] mm/hugetlb: check for unreasonable folio sizes when registering hstate David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 10/36] mm: sanity-check maximum folio size in folio_set_order() David Hildenbrand
                   ` (19 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Mike Rapoport (Microsoft), Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

Grepping for "prep_compound_page" leaves one clueless about how devdax
gets its compound pages initialized.

Let's add a comment that might help finding this open-coded
prep_compound_page() initialization more easily.

Further, let's be less smart about the ordering of initialization and just
perform the prep_compound_head() call after all tail pages were
initialized: just like prep_compound_page() does.

No need for a comment to describe the initialization order: again,
just like prep_compound_page().

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/mm_init.c | 15 +++++++--------
 1 file changed, 7 insertions(+), 8 deletions(-)

diff --git a/mm/mm_init.c b/mm/mm_init.c
index 5c21b3af216b2..df614556741a4 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -1091,6 +1091,12 @@ static void __ref memmap_init_compound(struct page *head,
 	unsigned long pfn, end_pfn = head_pfn + nr_pages;
 	unsigned int order = pgmap->vmemmap_shift;
 
+	/*
+	 * We have to initialize the pages, including setting up page links.
+	 * prep_compound_page() does not take care of that, so instead we
+	 * open-code prep_compound_page() so we can take care of initializing
+	 * the pages in the same go.
+	 */
 	__SetPageHead(head);
 	for (pfn = head_pfn + 1; pfn < end_pfn; pfn++) {
 		struct page *page = pfn_to_page(pfn);
@@ -1098,15 +1104,8 @@ static void __ref memmap_init_compound(struct page *head,
 		__init_zone_device_page(page, pfn, zone_idx, nid, pgmap);
 		prep_compound_tail(head, pfn - head_pfn);
 		set_page_count(page, 0);
-
-		/*
-		 * The first tail page stores important compound page info.
-		 * Call prep_compound_head() after the first tail page has
-		 * been initialized, to not have the data overwritten.
-		 */
-		if (pfn == head_pfn + 1)
-			prep_compound_head(head, order);
 	}
+	prep_compound_head(head, order);
 }
 
 void __ref memmap_init_zone_device(struct zone *zone,
-- 
2.50.1



* [PATCH v1 10/36] mm: sanity-check maximum folio size in folio_set_order()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (3 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 09/36] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <f0c6e9f6-df09-4b10-9338-7bfe4aa46601@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 11/36] mm: limit folio/compound page sizes in problematic kernel configs David Hildenbrand
                   ` (18 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Zi Yan, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86

Let's sanity-check in folio_set_order() whether we would be trying to
create a folio with an order that would make it exceed MAX_FOLIO_ORDER.

This will enable the check whenever a folio/compound page is initialized
through prep_compound_head() / prep_compound_page().

Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/internal.h | 1 +
 1 file changed, 1 insertion(+)

diff --git a/mm/internal.h b/mm/internal.h
index 45da9ff5694f6..9b0129531d004 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -755,6 +755,7 @@ static inline void folio_set_order(struct folio *folio, unsigned int order)
 {
 	if (WARN_ON_ONCE(!order || !folio_test_large(folio)))
 		return;
+	VM_WARN_ON_ONCE(order > MAX_FOLIO_ORDER);
 
 	folio->_flags_1 = (folio->_flags_1 & ~0xffUL) | order;
 #ifdef NR_PAGES_IN_LARGE_FOLIO
-- 
2.50.1



* [PATCH v1 11/36] mm: limit folio/compound page sizes in problematic kernel configs
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (4 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 10/36] mm: sanity-check maximum folio size in folio_set_order() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <baa1b6cf-2fde-4149-8cdf-4b54e2d7c60d@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 12/36] mm: simplify folio_page() and folio_page_idx() David Hildenbrand
                   ` (17 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Zi Yan, Mike Rapoport (Microsoft),
	Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86

Let's limit the maximum folio size in problematic kernel configs where
the memmap is allocated per memory section (SPARSEMEM without
SPARSEMEM_VMEMMAP) to a single memory section.

Currently, only a single architecture supports ARCH_HAS_GIGANTIC_PAGE
but not SPARSEMEM_VMEMMAP: sh.

Fortunately, the biggest hugetlb size sh supports is 64 MiB
(HUGETLB_PAGE_SIZE_64MB) and the section size is at least 64 MiB
(SECTION_SIZE_BITS == 26), so their use case is not degraded.

As folios and memory sections are naturally aligned to their power-of-2
size in memory, a single folio can consequently no longer span multiple
memory sections in these problematic kernel configs.

nth_page() is no longer required when operating within a single compound
page / folio.

Reviewed-by: Zi Yan <ziy@nvidia.com>
Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h | 22 ++++++++++++++++++----
 1 file changed, 18 insertions(+), 4 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 77737cbf2216a..2dee79fa2efcf 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -2053,11 +2053,25 @@ static inline long folio_nr_pages(const struct folio *folio)
 	return folio_large_nr_pages(folio);
 }
 
-/* Only hugetlbfs can allocate folios larger than MAX_ORDER */
-#ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
-#define MAX_FOLIO_ORDER		PUD_ORDER
-#else
+#if !defined(CONFIG_ARCH_HAS_GIGANTIC_PAGE)
+/*
+ * We don't expect any folios that exceed buddy sizes (and consequently
+ * memory sections).
+ */
 #define MAX_FOLIO_ORDER		MAX_PAGE_ORDER
+#elif defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+/*
+ * Only pages within a single memory section are guaranteed to be
+ * contiguous. By limiting folios to a single memory section, all folio
+ * pages are guaranteed to be contiguous.
+ */
+#define MAX_FOLIO_ORDER		PFN_SECTION_SHIFT
+#else
+/*
+ * There is no real limit on the folio size. We limit them to the maximum we
+ * currently expect (e.g., hugetlb, dax).
+ */
+#define MAX_FOLIO_ORDER		PUD_ORDER
 #endif
 
 #define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
-- 
2.50.1



* [PATCH v1 12/36] mm: simplify folio_page() and folio_page_idx()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (5 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 11/36] mm: limit folio/compound page sizes in problematic kernel configs David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
                   ` (16 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Zi Yan, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86

Now that a single folio/compound page can no longer span memory sections
in problematic kernel configurations, we can stop using nth_page().

While at it, turn both macros into static inline functions and add
kernel doc for folio_page_idx().

Reviewed-by: Zi Yan <ziy@nvidia.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h         | 16 ++++++++++++++--
 include/linux/page-flags.h |  5 ++++-
 2 files changed, 18 insertions(+), 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2dee79fa2efcf..f6880e3225c5c 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -210,10 +210,8 @@ extern unsigned long sysctl_admin_reserve_kbytes;
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
-#define folio_page_idx(folio, p)	(page_to_pfn(p) - folio_pfn(folio))
 #else
 #define nth_page(page,n) ((page) + (n))
-#define folio_page_idx(folio, p)	((p) - &(folio)->page)
 #endif
 
 /* to align the pointer to the (next) page boundary */
@@ -225,6 +223,20 @@ extern unsigned long sysctl_admin_reserve_kbytes;
 /* test whether an address (unsigned long or pointer) is aligned to PAGE_SIZE */
 #define PAGE_ALIGNED(addr)	IS_ALIGNED((unsigned long)(addr), PAGE_SIZE)
 
+/**
+ * folio_page_idx - Return the number of a page in a folio.
+ * @folio: The folio.
+ * @page: The folio page.
+ *
+ * This function expects that the page is actually part of the folio.
+ * The returned number is relative to the start of the folio.
+ */
+static inline unsigned long folio_page_idx(const struct folio *folio,
+		const struct page *page)
+{
+	return page - &folio->page;
+}
+
 static inline struct folio *lru_to_folio(struct list_head *head)
 {
 	return list_entry((head)->prev, struct folio, lru);
diff --git a/include/linux/page-flags.h b/include/linux/page-flags.h
index 5ee6ffbdbf831..faf17ca211b4f 100644
--- a/include/linux/page-flags.h
+++ b/include/linux/page-flags.h
@@ -316,7 +316,10 @@ static __always_inline unsigned long _compound_head(const struct page *page)
  * check that the page number lies within @folio; the caller is presumed
  * to have a reference to the page.
  */
-#define folio_page(folio, n)	nth_page(&(folio)->page, n)
+static inline struct page *folio_page(struct folio *folio, unsigned long n)
+{
+	return &folio->page + n;
+}
 
 static __always_inline int PageTail(const struct page *page)
 {
-- 
2.50.1



* [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (6 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 12/36] mm: simplify folio_page() and folio_page_idx() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-28  7:21   ` Mike Rapoport
       [not found]   ` <cebd5356-0fc6-40aa-9bc6-a3a5ffe918f8@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 14/36] mm/percpu-km: drop nth_page() usage within single allocation David Hildenbrand
                   ` (15 subsequent siblings)
  23 siblings, 2 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

We can now safely iterate over all pages in a folio, so no need for the
pfn_to_page().

Also, as we already force the refcount in __init_single_page() to 1,
we can just set the refcount to 0 and avoid page_ref_freeze() +
VM_BUG_ON. Likely, in the future, we will just want to tell
__init_single_page() which value to initialize the refcount to.

Further, adjust the comments to highlight that we are dealing with an
open-coded prep_compound_page() variant, and add another comment explaining
why we really need the __init_single_page() only on the tail pages.

Note that the current code was likely problematic, but we never ran into
it: prep_compound_tail() could have been called with an offset exceeding
a memory section, and it would have simply added that offset to the page
pointer -- which would not have done the right thing on sparsemem
without vmemmap.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/hugetlb.c | 20 ++++++++++++--------
 1 file changed, 12 insertions(+), 8 deletions(-)

diff --git a/mm/hugetlb.c b/mm/hugetlb.c
index 4a97e4f14c0dc..1f42186a85ea4 100644
--- a/mm/hugetlb.c
+++ b/mm/hugetlb.c
@@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
 {
 	enum zone_type zone = zone_idx(folio_zone(folio));
 	int nid = folio_nid(folio);
+	struct page *page = folio_page(folio, start_page_number);
 	unsigned long head_pfn = folio_pfn(folio);
 	unsigned long pfn, end_pfn = head_pfn + end_page_number;
-	int ret;
-
-	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
-		struct page *page = pfn_to_page(pfn);
 
+	/*
+	 * We mark all tail pages with memblock_reserved_mark_noinit(),
+	 * so these pages are completely uninitialized.
+	 */
+	for (pfn = head_pfn + start_page_number; pfn < end_pfn; page++, pfn++) {
 		__init_single_page(page, pfn, zone, nid);
 		prep_compound_tail((struct page *)folio, pfn - head_pfn);
-		ret = page_ref_freeze(page, 1);
-		VM_BUG_ON(!ret);
+		set_page_count(page, 0);
 	}
 }
 
@@ -3257,12 +3258,15 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
 {
 	int ret;
 
-	/* Prepare folio head */
+	/*
+	 * This is an open-coded prep_compound_page() whereby we avoid
+	 * walking pages twice by initializing/preparing+freezing them in the
+	 * same go.
+	 */
 	__folio_clear_reserved(folio);
 	__folio_set_head(folio);
 	ret = folio_ref_freeze(folio, 1);
 	VM_BUG_ON(!ret);
-	/* Initialize the necessary tail struct pages */
 	hugetlb_folio_init_tail_vmemmap(folio, 1, nr_pages);
 	prep_compound_head((struct page *)folio, huge_page_order(h));
 }
-- 
2.50.1



* [PATCH v1 14/36] mm/percpu-km: drop nth_page() usage within single allocation
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (7 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 15/36] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() David Hildenbrand
                   ` (14 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

We're allocating a higher-order page from the buddy. For these pages
(which are guaranteed not to exceed a single memory section) there is no
need to use nth_page().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/percpu-km.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/percpu-km.c b/mm/percpu-km.c
index fe31aa19db81a..4efa74a495cb6 100644
--- a/mm/percpu-km.c
+++ b/mm/percpu-km.c
@@ -69,7 +69,7 @@ static struct pcpu_chunk *pcpu_create_chunk(gfp_t gfp)
 	}
 
 	for (i = 0; i < nr_pages; i++)
-		pcpu_set_page_chunk(nth_page(pages, i), chunk);
+		pcpu_set_page_chunk(pages + i, chunk);
 
 	chunk->data = pages;
 	chunk->base_addr = page_address(pages);
-- 
2.50.1



* [PATCH v1 15/36] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (8 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 14/36] mm/percpu-km: drop nth_page() usage within single allocation David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <1d74a0e2-51ff-462f-8f3c-75639fd21221@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 16/36] fs: hugetlbfs: cleanup " David Hildenbrand
                   ` (13 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

The nth_page() is not really required anymore, so let's remove it.
While at it, cleanup and simplify the code a bit.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 fs/hugetlbfs/inode.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index 34d496a2b7de6..c5a46d10afaa0 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -217,7 +217,7 @@ static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
 			break;
 		offset += n;
 		if (offset == PAGE_SIZE) {
-			page = nth_page(page, 1);
+			page++;
 			offset = 0;
 		}
 	}
-- 
2.50.1



* [PATCH v1 16/36] fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (9 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 15/36] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <71cf3600-d9cf-4d16-951c-44582b46c0fa@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 17/36] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() David Hildenbrand
                   ` (12 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Let's clean up and simplify the function a bit.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 fs/hugetlbfs/inode.c | 33 +++++++++++----------------------
 1 file changed, 11 insertions(+), 22 deletions(-)

diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
index c5a46d10afaa0..6ca1f6b45c1e5 100644
--- a/fs/hugetlbfs/inode.c
+++ b/fs/hugetlbfs/inode.c
@@ -198,31 +198,20 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
 static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
 		size_t bytes)
 {
-	struct page *page;
-	size_t n = 0;
-	size_t res = 0;
-
-	/* First page to start the loop. */
-	page = folio_page(folio, offset / PAGE_SIZE);
-	offset %= PAGE_SIZE;
-	while (1) {
-		if (is_raw_hwpoison_page_in_hugepage(page))
-			break;
+	struct page *page = folio_page(folio, offset / PAGE_SIZE);
+	size_t safe_bytes;
+
+	if (is_raw_hwpoison_page_in_hugepage(page))
+		return 0;
+	/* Safe to read the remaining bytes in this page. */
+	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
+	page++;
 
-		/* Safe to read n bytes without touching HWPOISON subpage. */
-		n = min(bytes, (size_t)PAGE_SIZE - offset);
-		res += n;
-		bytes -= n;
-		if (!bytes || !n)
+	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
+		if (is_raw_hwpoison_page_in_hugepage(page))
 			break;
-		offset += n;
-		if (offset == PAGE_SIZE) {
-			page++;
-			offset = 0;
-		}
-	}
 
-	return res;
+	return min(safe_bytes, bytes);
 }
 
 /*
-- 
2.50.1



* [PATCH v1 17/36] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (10 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 16/36] fs: hugetlbfs: cleanup " David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages David Hildenbrand
                   ` (11 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

It's no longer required to use nth_page() within a folio, so let's just
drop the nth_page() usage in folio_walk_start().

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/pagewalk.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/mm/pagewalk.c b/mm/pagewalk.c
index c6753d370ff4e..9e4225e5fcf5c 100644
--- a/mm/pagewalk.c
+++ b/mm/pagewalk.c
@@ -1004,7 +1004,7 @@ struct folio *folio_walk_start(struct folio_walk *fw,
 found:
 	if (expose_page)
 		/* Note: Offset from the mapped page, not the folio start. */
-		fw->page = nth_page(page, (addr & (entry_size - 1)) >> PAGE_SHIFT);
+		fw->page = page + ((addr & (entry_size - 1)) >> PAGE_SHIFT);
 	else
 		fw->page = NULL;
 	fw->ptl = ptl;
-- 
2.50.1



* [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (11 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 17/36] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <c0dadc4f-6415-4818-a319-e3e15ff47a24@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 19/36] io_uring/zcrx: remove nth_page() usage within folio David Hildenbrand
                   ` (10 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

nth_page() is no longer required when iterating over pages within a
single folio, so let's just drop it when recording subpages.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/gup.c | 7 +++----
 1 file changed, 3 insertions(+), 4 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index b2a78f0291273..89ca0813791ab 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -488,12 +488,11 @@ static int record_subpages(struct page *page, unsigned long sz,
 			   unsigned long addr, unsigned long end,
 			   struct page **pages)
 {
-	struct page *start_page;
 	int nr;
 
-	start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
+	page += (addr & (sz - 1)) >> PAGE_SHIFT;
 	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
-		pages[nr] = nth_page(start_page, nr);
+		pages[nr] = page++;
 
 	return nr;
 }
@@ -1512,7 +1511,7 @@ static long __get_user_pages(struct mm_struct *mm,
 			}
 
 			for (j = 0; j < page_increm; j++) {
-				subpage = nth_page(page, j);
+				subpage = page + j;
 				pages[i + j] = subpage;
 				flush_anon_page(vma, subpage, start + j * PAGE_SIZE);
 				flush_dcache_page(subpage);
-- 
2.50.1



* [PATCH v1 19/36] io_uring/zcrx: remove nth_page() usage within folio
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (12 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() David Hildenbrand
                   ` (9 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Pavel Begunkov, Jens Axboe,
	Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Johannes Weiner,
	John Hubbard, kasan-dev, kvm, Liam R. Howlett, Linus Torvalds,
	linux-arm-kernel, linux-arm-kernel, linux-crypto, linux-ide,
	linux-kselftest, linux-mips, linux-mmc, linux-mm, linux-riscv,
	linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

Within a folio/compound page, nth_page() is no longer required.
Given that we call folio_test_partial_kmap()+kmap_local_page(), the code
would already be problematic if the pages spanned multiple folios.

So let's just assume that all src pages belong to a single
folio/compound page and can be iterated ordinarily. The dst page is
currently always a single page, so we're not actually iterating
anything.

Reviewed-by: Pavel Begunkov <asml.silence@gmail.com>
Cc: Jens Axboe <axboe@kernel.dk>
Cc: Pavel Begunkov <asml.silence@gmail.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 io_uring/zcrx.c | 4 ++--
 1 file changed, 2 insertions(+), 2 deletions(-)

diff --git a/io_uring/zcrx.c b/io_uring/zcrx.c
index e5ff49f3425e0..18c12f4b56b6c 100644
--- a/io_uring/zcrx.c
+++ b/io_uring/zcrx.c
@@ -975,9 +975,9 @@ static ssize_t io_copy_page(struct io_copy_cache *cc, struct page *src_page,
 
 		if (folio_test_partial_kmap(page_folio(dst_page)) ||
 		    folio_test_partial_kmap(page_folio(src_page))) {
-			dst_page = nth_page(dst_page, dst_offset / PAGE_SIZE);
+			dst_page += dst_offset / PAGE_SIZE;
 			dst_offset = offset_in_page(dst_offset);
-			src_page = nth_page(src_page, src_offset / PAGE_SIZE);
+			src_page += src_offset / PAGE_SIZE;
 			src_offset = offset_in_page(src_offset);
 			n = min(PAGE_SIZE - src_offset, PAGE_SIZE - dst_offset);
 			n = min(n, len);
-- 
2.50.1



* [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (13 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 19/36] io_uring/zcrx: remove nth_page() usage within folio David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <ea74f0e3-bacf-449a-b7ad-213c74599df1@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 21/36] mm/cma: refuse handing out non-contiguous page ranges David Hildenbrand
                   ` (8 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Thomas Bogendoerfer, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Let's make it clearer that we are operating within a single folio by
providing both the folio and the page.

This implies that for flush_dcache_folio() we'll now avoid one more
page->folio lookup, and that we can safely drop the "nth_page" usage.

Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 arch/mips/include/asm/cacheflush.h | 11 +++++++----
 arch/mips/mm/cache.c               |  8 ++++----
 2 files changed, 11 insertions(+), 8 deletions(-)

diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
index 5d283ef89d90d..8d79bfc687d21 100644
--- a/arch/mips/include/asm/cacheflush.h
+++ b/arch/mips/include/asm/cacheflush.h
@@ -50,13 +50,14 @@ extern void (*flush_cache_mm)(struct mm_struct *mm);
 extern void (*flush_cache_range)(struct vm_area_struct *vma,
 	unsigned long start, unsigned long end);
 extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
-extern void __flush_dcache_pages(struct page *page, unsigned int nr);
+extern void __flush_dcache_folio_pages(struct folio *folio, struct page *page, unsigned int nr);
 
 #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
 static inline void flush_dcache_folio(struct folio *folio)
 {
 	if (cpu_has_dc_aliases)
-		__flush_dcache_pages(&folio->page, folio_nr_pages(folio));
+		__flush_dcache_folio_pages(folio, folio_page(folio, 0),
+					   folio_nr_pages(folio));
 	else if (!cpu_has_ic_fills_f_dc)
 		folio_set_dcache_dirty(folio);
 }
@@ -64,10 +65,12 @@ static inline void flush_dcache_folio(struct folio *folio)
 
 static inline void flush_dcache_page(struct page *page)
 {
+	struct folio *folio = page_folio(page);
+
 	if (cpu_has_dc_aliases)
-		__flush_dcache_pages(page, 1);
+		__flush_dcache_folio_pages(folio, page, 1);
 	else if (!cpu_has_ic_fills_f_dc)
-		folio_set_dcache_dirty(page_folio(page));
+		folio_set_dcache_dirty(folio);
 }
 
 #define flush_dcache_mmap_lock(mapping)		do { } while (0)
diff --git a/arch/mips/mm/cache.c b/arch/mips/mm/cache.c
index bf9a37c60e9f0..e3b4224c9a406 100644
--- a/arch/mips/mm/cache.c
+++ b/arch/mips/mm/cache.c
@@ -99,9 +99,9 @@ SYSCALL_DEFINE3(cacheflush, unsigned long, addr, unsigned long, bytes,
 	return 0;
 }
 
-void __flush_dcache_pages(struct page *page, unsigned int nr)
+void __flush_dcache_folio_pages(struct folio *folio, struct page *page,
+		unsigned int nr)
 {
-	struct folio *folio = page_folio(page);
 	struct address_space *mapping = folio_flush_mapping(folio);
 	unsigned long addr;
 	unsigned int i;
@@ -117,12 +117,12 @@ void __flush_dcache_pages(struct page *page, unsigned int nr)
 	 * get faulted into the tlb (and thus flushed) anyways.
 	 */
 	for (i = 0; i < nr; i++) {
-		addr = (unsigned long)kmap_local_page(nth_page(page, i));
+		addr = (unsigned long)kmap_local_page(page + i);
 		flush_data_cache_page(addr);
 		kunmap_local((void *)addr);
 	}
 }
-EXPORT_SYMBOL(__flush_dcache_pages);
+EXPORT_SYMBOL(__flush_dcache_folio_pages);
 
 void __flush_anon_page(struct page *page, unsigned long vmaddr)
 {
-- 
2.50.1



* [PATCH v1 21/36] mm/cma: refuse handing out non-contiguous page ranges
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (14 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <b772a0c0-6e09-4fa4-a113-fe5adf9c7fe0@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 22/36] dma-remap: drop nth_page() in dma_common_contiguous_remap() David Hildenbrand
                   ` (7 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexandru Elisei, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Let's disallow handing out PFN ranges with non-contiguous pages, so that
we can remove the nth_page() usage in __cma_alloc(), and so that callers
don't have to worry about it when blindly iterating pages.

This is really only a problem in configs with SPARSEMEM but without
SPARSEMEM_VMEMMAP, and only when we would cross memory sections in some
cases.

Will this cause harm? Probably not, because it's mostly 32bit that does
not support SPARSEMEM_VMEMMAP. If this ever becomes a problem we could
look into allocating the memmap for the memory sections spanned by a
single CMA region in one go from memblock.

Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h |  6 ++++++
 mm/cma.c           | 39 ++++++++++++++++++++++++---------------
 mm/util.c          | 33 +++++++++++++++++++++++++++++++++
 3 files changed, 63 insertions(+), 15 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index f6880e3225c5c..2ca1eb2db63ec 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -209,9 +209,15 @@ extern unsigned long sysctl_user_reserve_kbytes;
 extern unsigned long sysctl_admin_reserve_kbytes;
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+bool page_range_contiguous(const struct page *page, unsigned long nr_pages);
 #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 #else
 #define nth_page(page,n) ((page) + (n))
+static inline bool page_range_contiguous(const struct page *page,
+		unsigned long nr_pages)
+{
+	return true;
+}
 #endif
 
 /* to align the pointer to the (next) page boundary */
diff --git a/mm/cma.c b/mm/cma.c
index e56ec64d0567e..813e6dc7b0954 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -780,10 +780,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 				unsigned long count, unsigned int align,
 				struct page **pagep, gfp_t gfp)
 {
-	unsigned long mask, offset;
-	unsigned long pfn = -1;
-	unsigned long start = 0;
 	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
+	unsigned long start, pfn, mask, offset;
 	int ret = -EBUSY;
 	struct page *page = NULL;
 
@@ -795,7 +793,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 	if (bitmap_count > bitmap_maxno)
 		goto out;
 
-	for (;;) {
+	for (start = 0; ; start = bitmap_no + mask + 1) {
 		spin_lock_irq(&cma->lock);
 		/*
 		 * If the request is larger than the available number
@@ -812,6 +810,22 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 			spin_unlock_irq(&cma->lock);
 			break;
 		}
+
+		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
+		page = pfn_to_page(pfn);
+
+		/*
+		 * Do not hand out page ranges that are not contiguous, so
+		 * callers can just iterate the pages without having to worry
+		 * about these corner cases.
+		 */
+		if (!page_range_contiguous(page, count)) {
+			spin_unlock_irq(&cma->lock);
+			pr_warn_ratelimited("%s: %s: skipping incompatible area [0x%lx-0x%lx]",
+					    __func__, cma->name, pfn, pfn + count - 1);
+			continue;
+		}
+
 		bitmap_set(cmr->bitmap, bitmap_no, bitmap_count);
 		cma->available_count -= count;
 		/*
@@ -821,29 +835,24 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 		 */
 		spin_unlock_irq(&cma->lock);
 
-		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma->alloc_mutex);
 		ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
 		mutex_unlock(&cma->alloc_mutex);
-		if (ret == 0) {
-			page = pfn_to_page(pfn);
+		if (!ret)
 			break;
-		}
 
 		cma_clear_bitmap(cma, cmr, pfn, count);
 		if (ret != -EBUSY)
 			break;
 
 		pr_debug("%s(): memory range at pfn 0x%lx %p is busy, retrying\n",
-			 __func__, pfn, pfn_to_page(pfn));
+			 __func__, pfn, page);
 
-		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),
-					   count, align);
-		/* try again with a bit different memory target */
-		start = bitmap_no + mask + 1;
+		trace_cma_alloc_busy_retry(cma->name, pfn, page, count, align);
 	}
 out:
-	*pagep = page;
+	if (!ret)
+		*pagep = page;
 	return ret;
 }
 
@@ -882,7 +891,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
 	 */
 	if (page) {
 		for (i = 0; i < count; i++)
-			page_kasan_tag_reset(nth_page(page, i));
+			page_kasan_tag_reset(page + i);
 	}
 
 	if (ret && !(gfp & __GFP_NOWARN)) {
diff --git a/mm/util.c b/mm/util.c
index d235b74f7aff7..0bf349b19b652 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1280,4 +1280,37 @@ unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
 {
 	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, 0);
 }
+
+#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
+/**
+ * page_range_contiguous - test whether the page range is contiguous
+ * @page: the start of the page range.
+ * @nr_pages: the number of pages in the range.
+ *
+ * Test whether the page range is contiguous, such that they can be iterated
+ * naively, corresponding to iterating a contiguous PFN range.
+ *
+ * This function should primarily only be used for debug checks, or when
+ * working with page ranges that are not naturally contiguous (e.g., pages
+ * within a folio are).
+ *
+ * Returns true if contiguous, otherwise false.
+ */
+bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
+{
+	const unsigned long start_pfn = page_to_pfn(page);
+	const unsigned long end_pfn = start_pfn + nr_pages;
+	unsigned long pfn;
+
+	/*
+	 * The memmap is allocated per memory section. We need to check
+	 * each involved memory section once.
+	 */
+	for (pfn = ALIGN(start_pfn, PAGES_PER_SECTION);
+	     pfn < end_pfn; pfn += PAGES_PER_SECTION)
+		if (unlikely(page + (pfn - start_pfn) != pfn_to_page(pfn)))
+			return false;
+	return true;
+}
+#endif
 #endif /* CONFIG_MMU */
-- 
2.50.1



* [PATCH v1 22/36] dma-remap: drop nth_page() in dma_common_contiguous_remap()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (15 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 21/36] mm/cma: refuse handing out non-contiguous page ranges David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 23/36] scatterlist: disallow non-contiguous page ranges in a single SG entry David Hildenbrand
                   ` (6 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Marek Szyprowski, Robin Murphy,
	Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

dma_common_contiguous_remap() is used to remap an "allocated contiguous
region". Within a single allocation, there is no need to use nth_page()
anymore.

Neither the buddy, nor hugetlb, nor CMA will hand out problematic page
ranges.

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Marek Szyprowski <m.szyprowski@samsung.com>
Cc: Robin Murphy <robin.murphy@arm.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 kernel/dma/remap.c | 2 +-
 1 file changed, 1 insertion(+), 1 deletion(-)

diff --git a/kernel/dma/remap.c b/kernel/dma/remap.c
index 9e2afad1c6152..b7c1c0c92d0c8 100644
--- a/kernel/dma/remap.c
+++ b/kernel/dma/remap.c
@@ -49,7 +49,7 @@ void *dma_common_contiguous_remap(struct page *page, size_t size,
 	if (!pages)
 		return NULL;
 	for (i = 0; i < count; i++)
-		pages[i] = nth_page(page, i);
+		pages[i] = page++;
 	vaddr = vmap(pages, count, VM_DMA_COHERENT, prot);
 	kvfree(pages);
 
-- 
2.50.1



* [PATCH v1 23/36] scatterlist: disallow non-contiguous page ranges in a single SG entry
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (16 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 22/36] dma-remap: drop nth_page() in dma_common_contiguous_remap() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 33/36] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() David Hildenbrand
                   ` (5 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Marek Szyprowski, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

The expectation is that there is currently no user that would pass in
non-contiguous page ranges: no allocator, not even VMA, will hand these
out.

The only problematic part would be if someone would provide a range
obtained directly from memblock, or manually merge problematic ranges.
If we find such cases, we should fix them to create separate
SG entries.

Let's check in sg_set_page() that this is really the case. No need to
check in sg_set_folio(), as pages in a folio are guaranteed to be
contiguous. As sg_set_page() gets inlined into modules, we have to
export the page_range_contiguous() helper -- use EXPORT_SYMBOL, as there
is nothing special about this helper that would warrant restricting it
to GPL-only modules.

We can now drop the nth_page() usage in sg_page_iter_page().

Acked-by: Marek Szyprowski <m.szyprowski@samsung.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/scatterlist.h | 3 ++-
 mm/util.c                   | 1 +
 2 files changed, 3 insertions(+), 1 deletion(-)

diff --git a/include/linux/scatterlist.h b/include/linux/scatterlist.h
index 6f8a4965f9b98..29f6ceb98d74b 100644
--- a/include/linux/scatterlist.h
+++ b/include/linux/scatterlist.h
@@ -158,6 +158,7 @@ static inline void sg_assign_page(struct scatterlist *sg, struct page *page)
 static inline void sg_set_page(struct scatterlist *sg, struct page *page,
 			       unsigned int len, unsigned int offset)
 {
+	VM_WARN_ON_ONCE(!page_range_contiguous(page, ALIGN(len + offset, PAGE_SIZE) / PAGE_SIZE));
 	sg_assign_page(sg, page);
 	sg->offset = offset;
 	sg->length = len;
@@ -600,7 +601,7 @@ void __sg_page_iter_start(struct sg_page_iter *piter,
  */
 static inline struct page *sg_page_iter_page(struct sg_page_iter *piter)
 {
-	return nth_page(sg_page(piter->sg), piter->sg_pgoffset);
+	return sg_page(piter->sg) + piter->sg_pgoffset;
 }
 
 /**
diff --git a/mm/util.c b/mm/util.c
index 0bf349b19b652..e8b9da6b13230 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1312,5 +1312,6 @@ bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
 			return false;
 	return true;
 }
+EXPORT_SYMBOL(page_range_contiguous);
 #endif
 #endif /* CONFIG_MMU */
-- 
2.50.1



* [PATCH v1 33/36] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (17 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 23/36] scatterlist: disallow non-contiguous page ranges in a single SG entry David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <c9527014-9a29-48f4-8ca9-a6226f962c00@lucifer.local>
  2025-08-27 22:01 ` [PATCH v1 34/36] kfence: drop nth_page() usage David Hildenbrand
                   ` (4 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

There is the concern that unpin_user_page_range_dirty_lock() might do
some weird merging of PFN ranges -- either now or in the future -- such
that the PFN range is contiguous but the page range might not be.

Let's sanity-check for that and drop the nth_page() usage.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/gup.c | 7 ++++++-
 1 file changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/gup.c b/mm/gup.c
index 89ca0813791ab..c24f6009a7a44 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -237,7 +237,7 @@ void folio_add_pin(struct folio *folio)
 static inline struct folio *gup_folio_range_next(struct page *start,
 		unsigned long npages, unsigned long i, unsigned int *ntails)
 {
-	struct page *next = nth_page(start, i);
+	struct page *next = start + i;
 	struct folio *folio = page_folio(next);
 	unsigned int nr = 1;
 
@@ -342,6 +342,9 @@ EXPORT_SYMBOL(unpin_user_pages_dirty_lock);
  * "gup-pinned page range" refers to a range of pages that has had one of the
  * pin_user_pages() variants called on that page.
  *
+ * The page range must be truly contiguous: the page range corresponds
+ * to a contiguous PFN range and all pages can be iterated naturally.
+ *
  * For the page ranges defined by [page .. page+npages], make that range (or
  * its head pages, if a compound page) dirty, if @make_dirty is true, and if the
  * page range was previously listed as clean.
@@ -359,6 +362,8 @@ void unpin_user_page_range_dirty_lock(struct page *page, unsigned long npages,
 	struct folio *folio;
 	unsigned int nr;
 
+	VM_WARN_ON_ONCE(!page_range_contiguous(page, npages));
+
 	for (i = 0; i < npages; i += nr) {
 		folio = gup_folio_range_next(page, npages, i, &nr);
 		if (make_dirty && !folio_test_dirty(folio)) {
-- 
2.50.1



* [PATCH v1 34/36] kfence: drop nth_page() usage
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (18 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 33/36] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-28  8:43   ` Marco Elver
  2025-08-27 22:01 ` [PATCH v1 35/36] block: update comment of "struct bio_vec" regarding nth_page() David Hildenbrand
                   ` (3 subsequent siblings)
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Marco Elver,
	Dmitry Vyukov, Andrew Morton, Brendan Jackman, Christoph Lameter,
	Dennis Zhou, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

We want to get rid of nth_page(), and kfence init code is the last user.

Unfortunately, we might actually walk a PFN range where the pages are
not contiguous, because we might be allocating an area from memblock
that could span memory sections in problematic kernel configs (SPARSEMEM
without SPARSEMEM_VMEMMAP).

We could check whether the page range is contiguous using
page_range_contiguous() and fail kfence init, or make kfence
incompatible with these problematic kernel configs.

Let's keep it simple and simply use pfn_to_page() by iterating PFNs.

Cc: Alexander Potapenko <glider@google.com>
Cc: Marco Elver <elver@google.com>
Cc: Dmitry Vyukov <dvyukov@google.com>
Signed-off-by: David Hildenbrand <david@redhat.com>
---
 mm/kfence/core.c | 12 +++++++-----
 1 file changed, 7 insertions(+), 5 deletions(-)

diff --git a/mm/kfence/core.c b/mm/kfence/core.c
index 0ed3be100963a..727c20c94ac59 100644
--- a/mm/kfence/core.c
+++ b/mm/kfence/core.c
@@ -594,15 +594,14 @@ static void rcu_guarded_free(struct rcu_head *h)
  */
 static unsigned long kfence_init_pool(void)
 {
-	unsigned long addr;
-	struct page *pages;
+	unsigned long addr, start_pfn;
 	int i;
 
 	if (!arch_kfence_init_pool())
 		return (unsigned long)__kfence_pool;
 
 	addr = (unsigned long)__kfence_pool;
-	pages = virt_to_page(__kfence_pool);
+	start_pfn = PHYS_PFN(virt_to_phys(__kfence_pool));
 
 	/*
 	 * Set up object pages: they must have PGTY_slab set to avoid freeing
@@ -613,11 +612,12 @@ static unsigned long kfence_init_pool(void)
 	 * enters __slab_free() slow-path.
 	 */
 	for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
-		struct slab *slab = page_slab(nth_page(pages, i));
+		struct slab *slab;
 
 		if (!i || (i % 2))
 			continue;
 
+		slab = page_slab(pfn_to_page(start_pfn + i));
 		__folio_set_slab(slab_folio(slab));
 #ifdef CONFIG_MEMCG
 		slab->obj_exts = (unsigned long)&kfence_metadata_init[i / 2 - 1].obj_exts |
@@ -665,10 +665,12 @@ static unsigned long kfence_init_pool(void)
 
 reset_slab:
 	for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
-		struct slab *slab = page_slab(nth_page(pages, i));
+		struct slab *slab;
 
 		if (!i || (i % 2))
 			continue;
+
+		slab = page_slab(pfn_to_page(start_pfn + i));
 #ifdef CONFIG_MEMCG
 		slab->obj_exts = 0;
 #endif
-- 
2.50.1



* [PATCH v1 35/36] block: update comment of "struct bio_vec" regarding nth_page()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (19 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 34/36] kfence: drop nth_page() usage David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
  2025-08-27 22:01 ` [PATCH v1 36/36] mm: remove nth_page() David Hildenbrand
                   ` (2 subsequent siblings)
  23 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Ever since commit 858c708d9efb ("block: move the bi_size update out of
__bio_try_merge_page"), page_is_mergeable() no longer exists, and the
logic in bvec_try_merge_page() is now a simple page pointer
comparison.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/bvec.h | 7 ++-----
 1 file changed, 2 insertions(+), 5 deletions(-)

diff --git a/include/linux/bvec.h b/include/linux/bvec.h
index 0a80e1f9aa201..3fc0efa0825b1 100644
--- a/include/linux/bvec.h
+++ b/include/linux/bvec.h
@@ -22,11 +22,8 @@ struct page;
  * @bv_len:    Number of bytes in the address range.
  * @bv_offset: Start of the address range relative to the start of @bv_page.
  *
- * The following holds for a bvec if n * PAGE_SIZE < bv_offset + bv_len:
- *
- *   nth_page(@bv_page, n) == @bv_page + n
- *
- * This holds because page_is_mergeable() checks the above property.
+ * All pages within a bio_vec starting from @bv_page are contiguous and
+ * can simply be iterated (see bvec_advance()).
  */
 struct bio_vec {
 	struct page	*bv_page;
-- 
2.50.1



* [PATCH v1 36/36] mm: remove nth_page()
       [not found] <20250827220141.262669-1-david@redhat.com>
                   ` (20 preceding siblings ...)
  2025-08-27 22:01 ` [PATCH v1 35/36] block: update comment of "struct bio_vec" regarding nth_page() David Hildenbrand
@ 2025-08-27 22:01 ` David Hildenbrand
       [not found]   ` <18c6a175-507f-464c-b776-67d346863ddf@lucifer.local>
       [not found] ` <20250827220141.262669-25-david@redhat.com>
       [not found] ` <20250827220141.262669-33-david@redhat.com>
  23 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-27 22:01 UTC (permalink / raw)
  To: linux-kernel
  Cc: David Hildenbrand, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

Now that all users are gone, let's remove it.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
 include/linux/mm.h                   | 2 --
 tools/testing/scatterlist/linux/mm.h | 1 -
 2 files changed, 3 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 2ca1eb2db63ec..b26ca8b2162d9 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -210,9 +210,7 @@ extern unsigned long sysctl_admin_reserve_kbytes;
 
 #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
 bool page_range_contiguous(const struct page *page, unsigned long nr_pages);
-#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 #else
-#define nth_page(page,n) ((page) + (n))
 static inline bool page_range_contiguous(const struct page *page,
 		unsigned long nr_pages)
 {
diff --git a/tools/testing/scatterlist/linux/mm.h b/tools/testing/scatterlist/linux/mm.h
index 5bd9e6e806254..121ae78d6e885 100644
--- a/tools/testing/scatterlist/linux/mm.h
+++ b/tools/testing/scatterlist/linux/mm.h
@@ -51,7 +51,6 @@ static inline unsigned long page_to_phys(struct page *page)
 
 #define page_to_pfn(page) ((unsigned long)(page) / PAGE_SIZE)
 #define pfn_to_page(pfn) (void *)((pfn) * PAGE_SIZE)
-#define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
 
 #define __min(t1, t2, min1, min2, x, y) ({              \
 	t1 min1 = (x);                                  \
-- 
2.50.1



* Re: [PATCH v1 24/36] ata: libata-eh: drop nth_page() usage within SG entry
       [not found] ` <20250827220141.262669-25-david@redhat.com>
@ 2025-08-28  4:24   ` Damien Le Moal
       [not found]   ` <7612fdc2-97ff-4b89-a532-90c5de56acdc@lucifer.local>
  1 sibling, 0 replies; 49+ messages in thread
From: Damien Le Moal @ 2025-08-28  4:24 UTC (permalink / raw)
  To: David Hildenbrand, linux-kernel
  Cc: Niklas Cassel, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

On 8/28/25 7:01 AM, David Hildenbrand wrote:
> It's no longer required to use nth_page() when iterating pages within a
> single SG entry, so let's drop the nth_page() usage.
> 
> Cc: Damien Le Moal <dlemoal@kernel.org>
> Cc: Niklas Cassel <cassel@kernel.org>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Acked-by: Damien Le Moal <dlemoal@kernel.org>

-- 
Damien Le Moal
Western Digital Research


* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-27 22:01 ` [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
@ 2025-08-28  7:21   ` Mike Rapoport
  2025-08-28  7:44     ` David Hildenbrand
       [not found]   ` <cebd5356-0fc6-40aa-9bc6-a3a5ffe918f8@lucifer.local>
  1 sibling, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2025-08-28  7:21 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
> We can now safely iterate over all pages in a folio, so no need for the
> pfn_to_page().
> 
> Also, as we already force the refcount in __init_single_page() to 1,
> we can just set the refcount to 0 and avoid page_ref_freeze() +
> VM_BUG_ON. Likely, in the future, we would just want to tell
> __init_single_page() to which value to initialize the refcount.
> 
> Further, adjust the comments to highlight that we are dealing with an
> open-coded prep_compound_page() variant, and add another comment explaining
> why we really need the __init_single_page() only on the tail pages.
> 
> Note that the current code was likely problematic, but we never ran into
> it: prep_compound_tail() would have been called with an offset that might
> exceed a memory section, and prep_compound_tail() would have simply
> added that offset to the page pointer -- which would not have done the
> right thing on sparsemem without vmemmap.
> 
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  mm/hugetlb.c | 20 ++++++++++++--------
>  1 file changed, 12 insertions(+), 8 deletions(-)
> 
> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> index 4a97e4f14c0dc..1f42186a85ea4 100644
> --- a/mm/hugetlb.c
> +++ b/mm/hugetlb.c
> @@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
>  {
>  	enum zone_type zone = zone_idx(folio_zone(folio));
>  	int nid = folio_nid(folio);
> +	struct page *page = folio_page(folio, start_page_number);
>  	unsigned long head_pfn = folio_pfn(folio);
>  	unsigned long pfn, end_pfn = head_pfn + end_page_number;
> -	int ret;
> -
> -	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
> -		struct page *page = pfn_to_page(pfn);
>  
> +	/*
> +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
> +	 * so these pages are completely uninitialized.

                             ^ not? ;-)

> +	 */
> +	for (pfn = head_pfn + start_page_number; pfn < end_pfn; page++, pfn++) {
>  		__init_single_page(page, pfn, zone, nid);
>  		prep_compound_tail((struct page *)folio, pfn - head_pfn);
> -		ret = page_ref_freeze(page, 1);
> -		VM_BUG_ON(!ret);
> +		set_page_count(page, 0);
>  	}
>  }
>  
> @@ -3257,12 +3258,15 @@ static void __init hugetlb_folio_init_vmemmap(struct folio *folio,
>  {
>  	int ret;
>  
> -	/* Prepare folio head */
> +	/*
> +	 * This is an open-coded prep_compound_page() whereby we avoid
> +	 * walking pages twice by initializing/preparing+freezing them in the
> +	 * same go.
> +	 */
>  	__folio_clear_reserved(folio);
>  	__folio_set_head(folio);
>  	ret = folio_ref_freeze(folio, 1);
>  	VM_BUG_ON(!ret);
> -	/* Initialize the necessary tail struct pages */
>  	hugetlb_folio_init_tail_vmemmap(folio, 1, nr_pages);
>  	prep_compound_head((struct page *)folio, huge_page_order(h));
>  }
> -- 
> 2.50.1
> 

-- 
Sincerely yours,
Mike.


* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-28  7:21   ` Mike Rapoport
@ 2025-08-28  7:44     ` David Hildenbrand
  2025-08-28  8:06       ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-28  7:44 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On 28.08.25 09:21, Mike Rapoport wrote:
> On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
>> We can now safely iterate over all pages in a folio, so no need for the
>> pfn_to_page().
>>
>> Also, as we already force the refcount in __init_single_page() to 1,
>> we can just set the refcount to 0 and avoid page_ref_freeze() +
>> VM_BUG_ON. Likely, in the future, we would just want to tell
>> __init_single_page() to which value to initialize the refcount.
>>
>> Further, adjust the comments to highlight that we are dealing with an
>> open-coded prep_compound_page() variant, and add another comment explaining
>> why we really need the __init_single_page() only on the tail pages.
>>
>> Note that the current code was likely problematic, but we never ran into
>> it: prep_compound_tail() would have been called with an offset that might
>> exceed a memory section, and prep_compound_tail() would have simply
>> added that offset to the page pointer -- which would not have done the
>> right thing on sparsemem without vmemmap.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   mm/hugetlb.c | 20 ++++++++++++--------
>>   1 file changed, 12 insertions(+), 8 deletions(-)
>>
>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>> index 4a97e4f14c0dc..1f42186a85ea4 100644
>> --- a/mm/hugetlb.c
>> +++ b/mm/hugetlb.c
>> @@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
>>   {
>>   	enum zone_type zone = zone_idx(folio_zone(folio));
>>   	int nid = folio_nid(folio);
>> +	struct page *page = folio_page(folio, start_page_number);
>>   	unsigned long head_pfn = folio_pfn(folio);
>>   	unsigned long pfn, end_pfn = head_pfn + end_page_number;
>> -	int ret;
>> -
>> -	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
>> -		struct page *page = pfn_to_page(pfn);
>>   
>> +	/*
>> +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
>> +	 * so these pages are completely uninitialized.
> 
>                               ^ not? ;-)

Can you elaborate?

-- 
Cheers

David / dhildenb



* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-28  7:44     ` David Hildenbrand
@ 2025-08-28  8:06       ` Mike Rapoport
  2025-08-28  8:18         ` David Hildenbrand
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2025-08-28  8:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On Thu, Aug 28, 2025 at 09:44:27AM +0200, David Hildenbrand wrote:
> On 28.08.25 09:21, Mike Rapoport wrote:
> > On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
> > > We can now safely iterate over all pages in a folio, so no need for the
> > > pfn_to_page().
> > > 
> > > Also, as we already force the refcount in __init_single_page() to 1,
> > > we can just set the refcount to 0 and avoid page_ref_freeze() +
> > > VM_BUG_ON. Likely, in the future, we would just want to tell
> > > __init_single_page() to which value to initialize the refcount.
> > > 
> > > Further, adjust the comments to highlight that we are dealing with an
> > > open-coded prep_compound_page() variant, and add another comment explaining
> > > why we really need the __init_single_page() only on the tail pages.
> > > 
> > > Note that the current code was likely problematic, but we never ran into
> > > it: prep_compound_tail() would have been called with an offset that might
> > > exceed a memory section, and prep_compound_tail() would have simply
> > > added that offset to the page pointer -- which would not have done the
> > > right thing on sparsemem without vmemmap.
> > > 
> > > Signed-off-by: David Hildenbrand <david@redhat.com>
> > > ---
> > >   mm/hugetlb.c | 20 ++++++++++++--------
> > >   1 file changed, 12 insertions(+), 8 deletions(-)
> > > 
> > > diff --git a/mm/hugetlb.c b/mm/hugetlb.c
> > > index 4a97e4f14c0dc..1f42186a85ea4 100644
> > > --- a/mm/hugetlb.c
> > > +++ b/mm/hugetlb.c
> > > @@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
> > >   {
> > >   	enum zone_type zone = zone_idx(folio_zone(folio));
> > >   	int nid = folio_nid(folio);
> > > +	struct page *page = folio_page(folio, start_page_number);
> > >   	unsigned long head_pfn = folio_pfn(folio);
> > >   	unsigned long pfn, end_pfn = head_pfn + end_page_number;
> > > -	int ret;
> > > -
> > > -	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
> > > -		struct page *page = pfn_to_page(pfn);
> > > +	/*
> > > +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
> > > +	 * so these pages are completely uninitialized.
> > 
> >                               ^ not? ;-)
> 
> Can you elaborate?

Oh, sorry, I misread "uninitialized".
Still, I'd phrase it as 

	/*
	 * We marked all tail pages with memblock_reserved_mark_noinit(),
	 * so we must initialize them here.
	 */

> -- 
> Cheers
> 
> David / dhildenb
> 

-- 
Sincerely yours,
Mike.


* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-28  8:06       ` Mike Rapoport
@ 2025-08-28  8:18         ` David Hildenbrand
  2025-08-28  8:37           ` Mike Rapoport
  0 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-28  8:18 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On 28.08.25 10:06, Mike Rapoport wrote:
> On Thu, Aug 28, 2025 at 09:44:27AM +0200, David Hildenbrand wrote:
>> On 28.08.25 09:21, Mike Rapoport wrote:
>>> On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
>>>> We can now safely iterate over all pages in a folio, so no need for the
>>>> pfn_to_page().
>>>>
>>>> Also, as we already force the refcount in __init_single_page() to 1,
>>>> we can just set the refcount to 0 and avoid page_ref_freeze() +
>>>> VM_BUG_ON. Likely, in the future, we would just want to tell
>>>> __init_single_page() to which value to initialize the refcount.
>>>>
>>>> Further, adjust the comments to highlight that we are dealing with an
>>>> open-coded prep_compound_page() variant, and add another comment explaining
>>>> why we really need the __init_single_page() only on the tail pages.
>>>>
>>>> Note that the current code was likely problematic, but we never ran into
>>>> it: prep_compound_tail() would have been called with an offset that might
>>>> exceed a memory section, and prep_compound_tail() would have simply
>>>> added that offset to the page pointer -- which would not have done the
>>>> right thing on sparsemem without vmemmap.
>>>>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>    mm/hugetlb.c | 20 ++++++++++++--------
>>>>    1 file changed, 12 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/mm/hugetlb.c b/mm/hugetlb.c
>>>> index 4a97e4f14c0dc..1f42186a85ea4 100644
>>>> --- a/mm/hugetlb.c
>>>> +++ b/mm/hugetlb.c
>>>> @@ -3237,17 +3237,18 @@ static void __init hugetlb_folio_init_tail_vmemmap(struct folio *folio,
>>>>    {
>>>>    	enum zone_type zone = zone_idx(folio_zone(folio));
>>>>    	int nid = folio_nid(folio);
>>>> +	struct page *page = folio_page(folio, start_page_number);
>>>>    	unsigned long head_pfn = folio_pfn(folio);
>>>>    	unsigned long pfn, end_pfn = head_pfn + end_page_number;
>>>> -	int ret;
>>>> -
>>>> -	for (pfn = head_pfn + start_page_number; pfn < end_pfn; pfn++) {
>>>> -		struct page *page = pfn_to_page(pfn);
>>>> +	/*
>>>> +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
>>>> +	 * so these pages are completely uninitialized.
>>>
>>>                                ^ not? ;-)
>>
>> Can you elaborate?
> 
> Oh, sorry, I misread "uninitialized".
> Still, I'd phrase it as
> 
> 	/*
> 	 * We marked all tail pages with memblock_reserved_mark_noinit(),
> 	 * so we must initialize them here.
> 	 */

I prefer what I currently have, but thanks for the review.

-- 
Cheers

David / dhildenb



* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-28  8:18         ` David Hildenbrand
@ 2025-08-28  8:37           ` Mike Rapoport
  2025-08-29 12:00             ` David Hildenbrand
  0 siblings, 1 reply; 49+ messages in thread
From: Mike Rapoport @ 2025-08-28  8:37 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On Thu, Aug 28, 2025 at 10:18:23AM +0200, David Hildenbrand wrote:
> On 28.08.25 10:06, Mike Rapoport wrote:
> > On Thu, Aug 28, 2025 at 09:44:27AM +0200, David Hildenbrand wrote:
> > > On 28.08.25 09:21, Mike Rapoport wrote:
> > > > On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
> > > > > +	/*
> > > > > +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
> > > > > +	 * so these pages are completely uninitialized.
> > > > 
> > > >                                ^ not? ;-)
> > > 
> > > Can you elaborate?
> > 
> > Oh, sorry, I misread "uninitialized".
> > Still, I'd phrase it as
> > 
> > 	/*
> > 	 * We marked all tail pages with memblock_reserved_mark_noinit(),
> > 	 * so we must initialize them here.
> > 	 */
> 
> I prefer what I currently have, but thanks for the review.

No strong feelings, feel free to add

Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>

-- 
Sincerely yours,
Mike.


* Re: [PATCH v1 34/36] kfence: drop nth_page() usage
  2025-08-27 22:01 ` [PATCH v1 34/36] kfence: drop nth_page() usage David Hildenbrand
@ 2025-08-28  8:43   ` Marco Elver
  0 siblings, 0 replies; 49+ messages in thread
From: Marco Elver @ 2025-08-28  8:43 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, Alexander Potapenko, Dmitry Vyukov, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On Thu, 28 Aug 2025 at 00:11, 'David Hildenbrand' via kasan-dev
<kasan-dev@googlegroups.com> wrote:
>
> We want to get rid of nth_page(), and kfence init code is the last user.
>
> Unfortunately, we might actually walk a PFN range where the pages are
> not contiguous, because we might be allocating an area from memblock
> that could span memory sections in problematic kernel configs (SPARSEMEM
> without SPARSEMEM_VMEMMAP).
>
> We could check whether the page range is contiguous
> using page_range_contiguous() and fail kfence init, or make kfence
> incompatible with these problematic kernel configs.
>
> Let's keep it simple and just use pfn_to_page(), iterating over PFNs.
>
> Cc: Alexander Potapenko <glider@google.com>
> Cc: Marco Elver <elver@google.com>
> Cc: Dmitry Vyukov <dvyukov@google.com>
> Signed-off-by: David Hildenbrand <david@redhat.com>

Reviewed-by: Marco Elver <elver@google.com>

Thanks.

> ---
>  mm/kfence/core.c | 12 +++++++-----
>  1 file changed, 7 insertions(+), 5 deletions(-)
>
> diff --git a/mm/kfence/core.c b/mm/kfence/core.c
> index 0ed3be100963a..727c20c94ac59 100644
> --- a/mm/kfence/core.c
> +++ b/mm/kfence/core.c
> @@ -594,15 +594,14 @@ static void rcu_guarded_free(struct rcu_head *h)
>   */
>  static unsigned long kfence_init_pool(void)
>  {
> -       unsigned long addr;
> -       struct page *pages;
> +       unsigned long addr, start_pfn;
>         int i;
>
>         if (!arch_kfence_init_pool())
>                 return (unsigned long)__kfence_pool;
>
>         addr = (unsigned long)__kfence_pool;
> -       pages = virt_to_page(__kfence_pool);
> +       start_pfn = PHYS_PFN(virt_to_phys(__kfence_pool));
>
>         /*
>          * Set up object pages: they must have PGTY_slab set to avoid freeing
> @@ -613,11 +612,12 @@ static unsigned long kfence_init_pool(void)
>          * enters __slab_free() slow-path.
>          */
>         for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
> -               struct slab *slab = page_slab(nth_page(pages, i));
> +               struct slab *slab;
>
>                 if (!i || (i % 2))
>                         continue;
>
> +               slab = page_slab(pfn_to_page(start_pfn + i));
>                 __folio_set_slab(slab_folio(slab));
>  #ifdef CONFIG_MEMCG
>                 slab->obj_exts = (unsigned long)&kfence_metadata_init[i / 2 - 1].obj_exts |
> @@ -665,10 +665,12 @@ static unsigned long kfence_init_pool(void)
>
>  reset_slab:
>         for (i = 0; i < KFENCE_POOL_SIZE / PAGE_SIZE; i++) {
> -               struct slab *slab = page_slab(nth_page(pages, i));
> +               struct slab *slab;
>
>                 if (!i || (i % 2))
>                         continue;
> +
> +               slab = page_slab(pfn_to_page(start_pfn + i));
>  #ifdef CONFIG_MEMCG
>                 slab->obj_exts = 0;
>  #endif
> --
> 2.50.1
>


* Re: [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages()
       [not found]   ` <ea74f0e3-bacf-449a-b7ad-213c74599df1@lucifer.local>
@ 2025-08-28 20:51     ` David Hildenbrand
       [not found]       ` <549a60a6-25e2-48d5-b442-49404a857014@lucifer.local>
  0 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-28 20:51 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Thomas Bogendoerfer, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

On 28.08.25 18:57, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:24AM +0200, David Hildenbrand wrote:
>> Let's make it clearer that we are operating within a single folio by
>> providing both the folio and the page.
>>
>> This implies that for flush_dcache_folio() we'll now avoid one more
>> page->folio lookup, and that we can safely drop the "nth_page" usage.
>>
>> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
>> ---
>>   arch/mips/include/asm/cacheflush.h | 11 +++++++----
>>   arch/mips/mm/cache.c               |  8 ++++----
>>   2 files changed, 11 insertions(+), 8 deletions(-)
>>
>> diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
>> index 5d283ef89d90d..8d79bfc687d21 100644
>> --- a/arch/mips/include/asm/cacheflush.h
>> +++ b/arch/mips/include/asm/cacheflush.h
>> @@ -50,13 +50,14 @@ extern void (*flush_cache_mm)(struct mm_struct *mm);
>>   extern void (*flush_cache_range)(struct vm_area_struct *vma,
>>   	unsigned long start, unsigned long end);
>>   extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
>> -extern void __flush_dcache_pages(struct page *page, unsigned int nr);
>> +extern void __flush_dcache_folio_pages(struct folio *folio, struct page *page, unsigned int nr);
> 
> NIT: Be good to drop the extern.

I think I'll leave this one in, though; someone should clean up all of 
them in one go.

Just imagine how the other functions would think about the new guy 
showing off here. :)

> 
>>
>>   #define ARCH_IMPLEMENTS_FLUSH_DCACHE_PAGE 1
>>   static inline void flush_dcache_folio(struct folio *folio)
>>   {
>>   	if (cpu_has_dc_aliases)
>> -		__flush_dcache_pages(&folio->page, folio_nr_pages(folio));
>> +		__flush_dcache_folio_pages(folio, folio_page(folio, 0),
>> +					   folio_nr_pages(folio));
>>   	else if (!cpu_has_ic_fills_f_dc)
>>   		folio_set_dcache_dirty(folio);
>>   }
>> @@ -64,10 +65,12 @@ static inline void flush_dcache_folio(struct folio *folio)
>>
>>   static inline void flush_dcache_page(struct page *page)
>>   {
>> +	struct folio *folio = page_folio(page);
>> +
>>   	if (cpu_has_dc_aliases)
>> -		__flush_dcache_pages(page, 1);
>> +		__flush_dcache_folio_pages(folio, page, folio_nr_pages(folio));
> 
> Hmmm, shouldn't this be 1 not folio_nr_pages()? Seems that the original
> implementation only flushed a single page even if contained within a larger
> folio?

Yes, reworked it 3 times and messed it up during the last rework. Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 24/36] ata: libata-eh: drop nth_page() usage within SG entry
       [not found]   ` <7612fdc2-97ff-4b89-a532-90c5de56acdc@lucifer.local>
@ 2025-08-29  0:22     ` Damien Le Moal
  2025-08-29 14:37       ` David Hildenbrand
  0 siblings, 1 reply; 49+ messages in thread
From: Damien Le Moal @ 2025-08-29  0:22 UTC (permalink / raw)
  To: Lorenzo Stoakes, David Hildenbrand
  Cc: linux-kernel, Niklas Cassel, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

On 8/29/25 2:53 AM, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:28AM +0200, David Hildenbrand wrote:
>> It's no longer required to use nth_page() when iterating pages within a
>> single SG entry, so let's drop the nth_page() usage.
>>
>> Cc: Damien Le Moal <dlemoal@kernel.org>
>> Cc: Niklas Cassel <cassel@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> LGTM, so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>

Just noticed this:

s/libata-eh/libata-sff

in the commit title please.

-- 
Damien Le Moal
Western Digital Research

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof()
       [not found]   ` <3hpjmfa6p3onfdv4ma4nv2tdggvsyarh7m36aufy6hvwqtp2wd@2odohwxkl3rk>
@ 2025-08-29  9:58     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29  9:58 UTC (permalink / raw)
  To: Liam R. Howlett, linux-kernel, Zi Yan, SeongJae Park,
	Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Linus Torvalds,
	linux-arm-kernel, linux-arm-kernel, linux-crypto, linux-ide,
	linux-kselftest, linux-mips, linux-mmc, linux-mm, linux-riscv,
	linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86

On 29.08.25 02:33, Liam R. Howlett wrote:
> * David Hildenbrand <david@redhat.com> [250827 18:04]:
>> Let's reject them early, which in turn makes folio_alloc_gigantic() reject
>> them properly.
>>
>> To avoid converting from order to nr_pages, let's just add MAX_FOLIO_ORDER
>> and calculate MAX_FOLIO_NR_PAGES based on that.
>>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Acked-by: SeongJae Park <sj@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Nit below, but..
> 
> Reviewed-by: Liam R. Howlett <Liam.Howlett@oracle.com>
> 
>> ---
>>   include/linux/mm.h | 6 ++++--
>>   mm/page_alloc.c    | 5 ++++-
>>   2 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 00c8a54127d37..77737cbf2216a 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2055,11 +2055,13 @@ static inline long folio_nr_pages(const struct folio *folio)
>>   
>>   /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
>>   #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
>> -#define MAX_FOLIO_NR_PAGES	(1UL << PUD_ORDER)
>> +#define MAX_FOLIO_ORDER		PUD_ORDER
>>   #else
>> -#define MAX_FOLIO_NR_PAGES	MAX_ORDER_NR_PAGES
>> +#define MAX_FOLIO_ORDER		MAX_PAGE_ORDER
>>   #endif
>>   
>> +#define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
>> +
>>   /*
>>    * compound_nr() returns the number of pages in this potentially compound
>>    * page.  compound_nr() can be called on a tail page, and is defined to
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index baead29b3e67b..426bc404b80cc 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6833,6 +6833,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>>   int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   			      acr_flags_t alloc_flags, gfp_t gfp_mask)
>>   {
>> +	const unsigned int order = ilog2(end - start);
>>   	unsigned long outer_start, outer_end;
>>   	int ret = 0;
>>   
>> @@ -6850,6 +6851,9 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   					    PB_ISOLATE_MODE_CMA_ALLOC :
>>   					    PB_ISOLATE_MODE_OTHER;
>>   
>> +	if (WARN_ON_ONCE((gfp_mask & __GFP_COMP) && order > MAX_FOLIO_ORDER))
>> +		return -EINVAL;
>> +
>>   	gfp_mask = current_gfp_context(gfp_mask);
>>   	if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask))
>>   		return -EINVAL;
>> @@ -6947,7 +6951,6 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   			free_contig_range(end, outer_end - end);
>>   	} else if (start == outer_start && end == outer_end && is_power_of_2(end - start)) {
>>   		struct page *head = pfn_to_page(start);
>> -		int order = ilog2(end - start);
> 
> You have changed this from an int to a const unsigned int, which is
> totally fine but it was left out of the change log.  

I considered it too trivial to document, but I can add a sentence about that.

> Probably not really
> worth mentioning but curious why the change to unsigned here?

orders are always unsigned, like folio_order().

Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof()
       [not found]   ` <f195300e-42e2-4eaa-84c8-c37501c3339c@lucifer.local>
@ 2025-08-29 10:06     ` David Hildenbrand
       [not found]       ` <34edaa0d-0d5f-4041-9a3d-fb5b2dd584e8@lucifer.local>
  0 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 10:06 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Zi Yan, SeongJae Park, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86

On 28.08.25 16:37, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:10AM +0200, David Hildenbrand wrote:
>> Let's reject them early, which in turn makes folio_alloc_gigantic() reject
>> them properly.
>>
>> To avoid converting from order to nr_pages, let's just add MAX_FOLIO_ORDER
>> and calculate MAX_FOLIO_NR_PAGES based on that.
>>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Acked-by: SeongJae Park <sj@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Some nits, but overall LGTM so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
>> ---
>>   include/linux/mm.h | 6 ++++--
>>   mm/page_alloc.c    | 5 ++++-
>>   2 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index 00c8a54127d37..77737cbf2216a 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -2055,11 +2055,13 @@ static inline long folio_nr_pages(const struct folio *folio)
>>
>>   /* Only hugetlbfs can allocate folios larger than MAX_ORDER */
>>   #ifdef CONFIG_ARCH_HAS_GIGANTIC_PAGE
>> -#define MAX_FOLIO_NR_PAGES	(1UL << PUD_ORDER)
>> +#define MAX_FOLIO_ORDER		PUD_ORDER
>>   #else
>> -#define MAX_FOLIO_NR_PAGES	MAX_ORDER_NR_PAGES
>> +#define MAX_FOLIO_ORDER		MAX_PAGE_ORDER
>>   #endif
>>
>> +#define MAX_FOLIO_NR_PAGES	(1UL << MAX_FOLIO_ORDER)
> 
> BIT()?

I don't think we want to use BIT() whenever we convert from an order to 
a number of pages -- which is why we also don't do that in other code.

BIT() is nice in the context of flags and bitmaps, but not really in the 
context of converting orders to pages.

One could argue that maybe one would want an order_to_pages() helper 
(that could use BIT() internally), but I am certainly not someone who 
would suggest that at this point ...  :)
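
For illustration, such a helper could look like the sketch below; the 
name order_to_nr_pages() is invented here, and no such helper exists in 
the kernel today:

```c
#include <assert.h>

/*
 * Hypothetical helper sketch -- the name is invented for illustration
 * and no such helper exists in the kernel today.
 */
static inline unsigned long order_to_nr_pages(unsigned int order)
{
	/* An order-N allocation covers 2^N base pages. */
	return 1UL << order;
}
```

With something like this, MAX_FOLIO_NR_PAGES would simply be 
order_to_nr_pages(MAX_FOLIO_ORDER).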

> 
>> +
>>   /*
>>    * compound_nr() returns the number of pages in this potentially compound
>>    * page.  compound_nr() can be called on a tail page, and is defined to
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index baead29b3e67b..426bc404b80cc 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -6833,6 +6833,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>>   int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   			      acr_flags_t alloc_flags, gfp_t gfp_mask)
>>   {
>> +	const unsigned int order = ilog2(end - start);
>>   	unsigned long outer_start, outer_end;
>>   	int ret = 0;
>>
>> @@ -6850,6 +6851,9 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>   					    PB_ISOLATE_MODE_CMA_ALLOC :
>>   					    PB_ISOLATE_MODE_OTHER;
>>
>> +	if (WARN_ON_ONCE((gfp_mask & __GFP_COMP) && order > MAX_FOLIO_ORDER))
>> +		return -EINVAL;
> 
> Possibly not worth it for a one off, but be nice to have this as a helper function, like:
> 
> static bool is_valid_order(gfp_t gfp_mask, unsigned int order)
> {
> 	return !(gfp_mask & __GFP_COMP) || order <= MAX_FOLIO_ORDER;
> }
> 
> Then makes this:
> 
> 	if (WARN_ON_ONCE(!is_valid_order(gfp_mask, order)))
> 		return -EINVAL;
> 
> Kinda self-documenting!

I don't like it -- especially forwarding __GFP_COMP.

is_valid_folio_order() to wrap the order check? Also not sure.

So I'll leave it as is I think.
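
For reference, the semantics of the check as it stands can be modeled in 
plain userspace C; the constants below are stand-ins for the real kernel 
definitions, not the kernel code itself:

```c
#include <assert.h>

/* Stand-in values for illustration only -- not the kernel definitions. */
#define MAX_FOLIO_ORDER	18	/* e.g. PUD_ORDER with 4 KiB pages on x86-64 */
#define __GFP_COMP	0x1u	/* placeholder flag bit */
#define EINVAL		22

/* ilog2() of the pfn range, as computed in alloc_contig_range_noprof(). */
static unsigned int range_order(unsigned long nr_pages)
{
	unsigned int order = 0;

	while (nr_pages >> (order + 1))
		order++;
	return order;
}

/* Model of the early rejection: only __GFP_COMP requests are affected. */
static int check_contig_request(unsigned long nr_pages, unsigned int gfp_mask)
{
	if ((gfp_mask & __GFP_COMP) && range_order(nr_pages) > MAX_FOLIO_ORDER)
		return -EINVAL;
	return 0;
}
```

So a non-__GFP_COMP request of any size passes, while a compound request 
beyond MAX_FOLIO_ORDER is rejected early with -EINVAL.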

Thanks for all the review!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 08/36] mm/hugetlb: check for unreasonable folio sizes when registering hstate
       [not found]   ` <fa3425dd-df25-4a0b-a27e-614c81d301c4@lucifer.local>
@ 2025-08-29 10:07     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 10:07 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 16:45, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:12AM +0200, David Hildenbrand wrote:
>> Let's check that no hstate that corresponds to an unreasonable folio size
>> is registered by an architecture. If we were to succeed registering, we
>> could later try allocating an unsupported gigantic folio size.
>>
>> Further, let's add a BUILD_BUG_ON() for checking that HUGETLB_PAGE_ORDER
>> is sane at build time. As HUGETLB_PAGE_ORDER is dynamic on powerpc, we have
>> to use a BUILD_BUG_ON_INVALID() to make it compile.
>>
>> No existing kernel configuration should be able to trigger this check:
>> either SPARSEMEM without SPARSEMEM_VMEMMAP cannot be configured or
>> gigantic folios will not exceed a memory section (the case on sparse).
> 
> I am guessing it's implicit that MAX_FOLIO_ORDER <= section size?

Yes, we have a build-time check for that somewhere.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 10/36] mm: sanity-check maximum folio size in folio_set_order()
       [not found]   ` <f0c6e9f6-df09-4b10-9338-7bfe4aa46601@lucifer.local>
@ 2025-08-29 10:10     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 10:10 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Zi Yan, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86

On 28.08.25 17:00, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:14AM +0200, David Hildenbrand wrote:
>> Let's sanity-check in folio_set_order() whether we would be trying to
>> create a folio with an order that would make it exceed MAX_FOLIO_ORDER.
>>
>> This will enable the check whenever a folio/compound page is initialized
>> through prepare_compound_head() / prepare_compound_page().
> 
> NIT: with CONFIG_DEBUG_VM set :)

Yes, will add that.

> 
>>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> LGTM (apart from nit below), so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
>> ---
>>   mm/internal.h | 1 +
>>   1 file changed, 1 insertion(+)
>>
>> diff --git a/mm/internal.h b/mm/internal.h
>> index 45da9ff5694f6..9b0129531d004 100644
>> --- a/mm/internal.h
>> +++ b/mm/internal.h
>> @@ -755,6 +755,7 @@ static inline void folio_set_order(struct folio *folio, unsigned int order)
>>   {
>>   	if (WARN_ON_ONCE(!order || !folio_test_large(folio)))
>>   		return;
>> +	VM_WARN_ON_ONCE(order > MAX_FOLIO_ORDER);
> 
> Given we have 'full-fat' WARN_ON*()'s above, maybe worth making this one too?

The idea is that if you reach this point, the earlier checks I added 
have already failed. So this is the safety net, and for that 
VM_WARN_ON_ONCE() is sufficient.

I think we should rather convert the WARN_ON_ONCE to VM_WARN_ON_ONCE() 
at some point, because no sane code should ever trigger that.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 11/36] mm: limit folio/compound page sizes in problematic kernel configs
       [not found]   ` <baa1b6cf-2fde-4149-8cdf-4b54e2d7c60d@lucifer.local>
@ 2025-08-29 11:57     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 11:57 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Zi Yan, Mike Rapoport (Microsoft),
	Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86

On 28.08.25 17:10, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:15AM +0200, David Hildenbrand wrote:
>> Let's limit the maximum folio size in problematic kernel config where
>> the memmap is allocated per memory section (SPARSEMEM without
>> SPARSEMEM_VMEMMAP) to a single memory section.
>>
>> Currently, only a single architecture supports ARCH_HAS_GIGANTIC_PAGE
>> but not SPARSEMEM_VMEMMAP: sh.
>>
>> Fortunately, the biggest hugetlb size sh supports is 64 MiB
>> (HUGETLB_PAGE_SIZE_64MB) and the section size is at least 64 MiB
>> (SECTION_SIZE_BITS == 26), so their use case is not degraded.
>>
>> As folios and memory sections are naturally aligned to their power-of-2 size
>> in memory, consequently a single folio can no longer span multiple memory
>> sections on these problematic kernel configs.
>>
>> nth_page() is no longer required when operating within a single compound
>> page / folio.
>>
>> Reviewed-by: Zi Yan <ziy@nvidia.com>
>> Acked-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Realy great comments, like this!
> 
> I wonder if we could have this be part of the first patch where you fiddle
> with MAX_FOLIO_ORDER etc. but not a big deal.

I think it belongs into this patch where we actually impose the 
restrictions.

[...]

>> +/*
>> + * Only pages within a single memory section are guaranteed to be
>> + * contiguous. By limiting folios to a single memory section, all folio
>> + * pages are guaranteed to be contiguous.
>> + */
>> +#define MAX_FOLIO_ORDER		PFN_SECTION_SHIFT
> 
> Hmmm, was this implicit before somehow? I mean surely by the fact as you say
> that physical contiguity would not otherwise be guaranteed :))

Well, my patches up to this point made sure that any attempt to use a 
larger folio fails in a way that we can spot, so any offender would now 
be noticed.

That is why before this change, nth_page() was required within a folio.

Hope that clarifies it, thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
       [not found]   ` <cebd5356-0fc6-40aa-9bc6-a3a5ffe918f8@lucifer.local>
@ 2025-08-29 11:59     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 11:59 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 17:37, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
>> We can now safely iterate over all pages in a folio, so no need for the
>> pfn_to_page().
>>
>> Also, as we already force the refcount in __init_single_page() to 1,
> 
> Mega huge nit (ignore if you want), but maybe worth saying 'via
> init_page_count()'.

Will add, thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap()
  2025-08-28  8:37           ` Mike Rapoport
@ 2025-08-29 12:00             ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 12:00 UTC (permalink / raw)
  To: Mike Rapoport
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Lorenzo Stoakes, Marco Elver,
	Marek Szyprowski, Michal Hocko, Muchun Song, netdev,
	Oscar Salvador, Peter Xu, Robin Murphy, Suren Baghdasaryan,
	Tejun Heo, virtualization, Vlastimil Babka, wireguard, x86,
	Zi Yan

On 28.08.25 10:37, Mike Rapoport wrote:
> On Thu, Aug 28, 2025 at 10:18:23AM +0200, David Hildenbrand wrote:
>> On 28.08.25 10:06, Mike Rapoport wrote:
>>> On Thu, Aug 28, 2025 at 09:44:27AM +0200, David Hildenbrand wrote:
>>>> On 28.08.25 09:21, Mike Rapoport wrote:
>>>>> On Thu, Aug 28, 2025 at 12:01:17AM +0200, David Hildenbrand wrote:
>>>>>> +	/*
>>>>>> +	 * We mark all tail pages with memblock_reserved_mark_noinit(),
>>>>>> +	 * so these pages are completely uninitialized.
>>>>>
>>>>>                                 ^ not? ;-)
>>>>
>>>> Can you elaborate?
>>>
>>> Oh, sorry, I misread "uninitialized".
>>> Still, I'd phrase it as
>>>
>>> 	/*
>>> 	 * We marked all tail pages with memblock_reserved_mark_noinit(),
>>> 	 * so we must initialize them here.
>>> 	 */
>>
>> I prefer what I currently have, but thanks for the review.
> 
> No strong feelings, feel free to add
> 
> Reviewed-by: Mike Rapoport (Microsoft) <rppt@kernel.org>
> 

I now have

"As we marked all tail pages with memblock_reserved_mark_noinit(), we 
must initialize them ourselves here."

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 15/36] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison()
       [not found]   ` <1d74a0e2-51ff-462f-8f3c-75639fd21221@lucifer.local>
@ 2025-08-29 12:02     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 12:02 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 17:45, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:19AM +0200, David Hildenbrand wrote:
>> The nth_page() is not really required anymore, so let's remove it.
>> While at it, cleanup and simplify the code a bit.
> 
> Hm, not sure which bit is the cleanup? Was there meant to be more here?

Thanks, leftover from the pre-split of this patch!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof()
       [not found]       ` <34edaa0d-0d5f-4041-9a3d-fb5b2dd584e8@lucifer.local>
@ 2025-08-29 13:09         ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 13:09 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Zi Yan, SeongJae Park, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86

> 
> It seems a bit arbitrary, like we open-code this (at risk of making a mistake)
> in some places but not others.

[...]

>>
>> One could argue that maybe one would want an order_to_pages() helper (that
>> could use BIT() internally), but I am certainly not someone who would
>> suggest that at this point ...  :)
> 
> I mean maybe.
> 
> Anyway as I said none of this is massively important, the open-coding here is
> correct, just seems silly.

Maybe we really want an ORDER_PAGES() and PAGES_ORDER().

But I mean, we also have PHYS_PFN() / PFN_PHYS(), and see how many "<< 
PAGE_SHIFT" etc. we are using all over the place.

> 
>>
>>>
>>>> +
>>>>    /*
>>>>     * compound_nr() returns the number of pages in this potentially compound
>>>>     * page.  compound_nr() can be called on a tail page, and is defined to
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index baead29b3e67b..426bc404b80cc 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -6833,6 +6833,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
>>>>    int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>    			      acr_flags_t alloc_flags, gfp_t gfp_mask)
>>>>    {
>>>> +	const unsigned int order = ilog2(end - start);
>>>>    	unsigned long outer_start, outer_end;
>>>>    	int ret = 0;
>>>>
>>>> @@ -6850,6 +6851,9 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
>>>>    					    PB_ISOLATE_MODE_CMA_ALLOC :
>>>>    					    PB_ISOLATE_MODE_OTHER;
>>>>
>>>> +	if (WARN_ON_ONCE((gfp_mask & __GFP_COMP) && order > MAX_FOLIO_ORDER))
>>>> +		return -EINVAL;
>>>
>>> Possibly not worth it for a one off, but be nice to have this as a helper function, like:
>>>
>>> static bool is_valid_order(gfp_t gfp_mask, unsigned int order)
>>> {
>>> 	return !(gfp_mask & __GFP_COMP) || order <= MAX_FOLIO_ORDER;
>>> }
>>>
>>> Then makes this:
>>>
>>> 	if (WARN_ON_ONCE(!is_valid_order(gfp_mask, order)))
>>> 		return -EINVAL;
>>>
>>> Kinda self-documenting!
>>
>> I don't like it -- especially forwarding __GFP_COMP.
>>
>> is_valid_folio_order() to wrap the order check? Also not sure.
> 
> OK, it's not a big deal.
> 
> Can we have a comment explaining this though? As people might be confused
> as to why we check this here and not elsewhere.

I can add a comment.

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 16/36] fs: hugetlbfs: cleanup folio in adjust_range_hwpoison()
       [not found]   ` <71cf3600-d9cf-4d16-951c-44582b46c0fa@lucifer.local>
@ 2025-08-29 13:22     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 13:22 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan


> 
> Lord above.
> 
> Also semantics of 'if bytes == 0, then check first page anyway' which you do
> capture.

Yeah, I think bytes == 0 would not make any sense, though. Staring 
briefly at the single caller, that seems to be the case (bytes != 0).

> 
> OK think I have convinced myself this is right, so hopefully no deeply subtle
> off-by-one issues here :P
> 
> Anyway, LGTM, so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
>> ---
>>   fs/hugetlbfs/inode.c | 33 +++++++++++----------------------
>>   1 file changed, 11 insertions(+), 22 deletions(-)
>>
>> diff --git a/fs/hugetlbfs/inode.c b/fs/hugetlbfs/inode.c
>> index c5a46d10afaa0..6ca1f6b45c1e5 100644
>> --- a/fs/hugetlbfs/inode.c
>> +++ b/fs/hugetlbfs/inode.c
>> @@ -198,31 +198,20 @@ hugetlb_get_unmapped_area(struct file *file, unsigned long addr,
>>   static size_t adjust_range_hwpoison(struct folio *folio, size_t offset,
>>   		size_t bytes)
>>   {
>> -	struct page *page;
>> -	size_t n = 0;
>> -	size_t res = 0;
>> -
>> -	/* First page to start the loop. */
>> -	page = folio_page(folio, offset / PAGE_SIZE);
>> -	offset %= PAGE_SIZE;
>> -	while (1) {
>> -		if (is_raw_hwpoison_page_in_hugepage(page))
>> -			break;
>> +	struct page *page = folio_page(folio, offset / PAGE_SIZE);
>> +	size_t safe_bytes;
>> +
>> +	if (is_raw_hwpoison_page_in_hugepage(page))
>> +		return 0;
>> +	/* Safe to read the remaining bytes in this page. */
>> +	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
>> +	page++;
>>
>> -		/* Safe to read n bytes without touching HWPOISON subpage. */
>> -		n = min(bytes, (size_t)PAGE_SIZE - offset);
>> -		res += n;
>> -		bytes -= n;
>> -		if (!bytes || !n)
>> +	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page++)
> 
> OK this is quite subtle - so if safe_bytes == bytes, this means we've confirmed
> that all requested bytes are safe.
> 
> So offset=0, bytes = 4096 would fail this (as safe_bytes == 4096).
> 
> Maybe worth putting something like:
> 
> 	/*
> 	 * Now we check page-by-page in the folio to see if any bytes we don't
> 	 * yet know to be safe are contained within posioned pages or not.
> 	 */
> 
> Above the loop. Or something like this.

"Check each remaining page as long as we are not done yet."

> 
>> +		if (is_raw_hwpoison_page_in_hugepage(page))
>>   			break;
>> -		offset += n;
>> -		if (offset == PAGE_SIZE) {
>> -			page++;
>> -			offset = 0;
>> -		}
>> -	}
>>
>> -	return res;
>> +	return min(safe_bytes, bytes);
> 
> Yeah given above analysis this seems correct.
> 
> You must have torn your hair out over this :)

I could not resist the urge to clean that up, yes.

I'll also drop the "The implementation borrows the iteration logic from 
copy_page_to_iter*." part, because I suspect this comment no longer 
makes sense.
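
The resulting loop semantics can be modeled in userspace like this; 
is_poisoned() is a stand-in for is_raw_hwpoison_page_in_hugepage(), and 
page index 2 is assumed poisoned purely for illustration:

```c
#include <assert.h>
#include <stddef.h>

#define PAGE_SIZE 4096ul

/* Stand-in predicate: assume page index 2 of the folio is poisoned. */
static int is_poisoned(size_t page_idx)
{
	return page_idx == 2;
}

/* Userspace model of the reworked adjust_range_hwpoison() loop. */
static size_t adjust_range(size_t offset, size_t bytes)
{
	size_t page_idx = offset / PAGE_SIZE;
	size_t safe_bytes;

	if (is_poisoned(page_idx))
		return 0;
	/* Safe to read the remaining bytes in this page. */
	safe_bytes = PAGE_SIZE - (offset % PAGE_SIZE);
	page_idx++;

	/* Check each remaining page as long as we are not done yet. */
	for (; safe_bytes < bytes; safe_bytes += PAGE_SIZE, page_idx++)
		if (is_poisoned(page_idx))
			break;

	return safe_bytes < bytes ? safe_bytes : bytes;
}
```

So a fully clean range returns all requested bytes, a range starting in 
a poisoned page returns 0, and a range that runs into a poisoned page 
returns only the bytes up to that page boundary.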

Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages
       [not found]   ` <c0dadc4f-6415-4818-a319-e3e15ff47a24@lucifer.local>
@ 2025-08-29 13:41     ` David Hildenbrand
       [not found]       ` <8a26ae97-9a78-4db5-be98-9c1f6e4fb403@lucifer.local>
  0 siblings, 1 reply; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 13:41 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 18:37, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:22AM +0200, David Hildenbrand wrote:
>> nth_page() is no longer required when iterating over pages within a
>> single folio, so let's just drop it when recording subpages.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> This looks correct to me, so notwithstanding the suggestion below, LGTM and:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
>> ---
>>   mm/gup.c | 7 +++----
>>   1 file changed, 3 insertions(+), 4 deletions(-)
>>
>> diff --git a/mm/gup.c b/mm/gup.c
>> index b2a78f0291273..89ca0813791ab 100644
>> --- a/mm/gup.c
>> +++ b/mm/gup.c
>> @@ -488,12 +488,11 @@ static int record_subpages(struct page *page, unsigned long sz,
>>   			   unsigned long addr, unsigned long end,
>>   			   struct page **pages)
>>   {
>> -	struct page *start_page;
>>   	int nr;
>>
>> -	start_page = nth_page(page, (addr & (sz - 1)) >> PAGE_SHIFT);
>> +	page += (addr & (sz - 1)) >> PAGE_SHIFT;
>>   	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
>> -		pages[nr] = nth_page(start_page, nr);
>> +		pages[nr] = page++;
> 
> 
> This is really nice, but I wonder if (while we're here) we can't be even
> more clear as to what's going on here, e.g.:
> 
> static int record_subpages(struct page *page, unsigned long sz,
> 			   unsigned long addr, unsigned long end,
> 			   struct page **pages)
> {
> 	size_t offset_in_folio = (addr & (sz - 1)) >> PAGE_SHIFT;
> 	struct page *subpage = page + offset_in_folio;
> 	int nr;
> 
> 	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
> 		*pages++ = subpage++;
> 
> 	return nr;
> }
> 
> Or some variant of that with the masking stuff self-documented.

What about the following cleanup on top:


diff --git a/mm/gup.c b/mm/gup.c
index 89ca0813791ab..5a72a135ec70b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm)
  #ifdef CONFIG_MMU
  
  #ifdef CONFIG_HAVE_GUP_FAST
-static int record_subpages(struct page *page, unsigned long sz,
-                          unsigned long addr, unsigned long end,
-                          struct page **pages)
-{
-       int nr;
-
-       page += (addr & (sz - 1)) >> PAGE_SHIFT;
-       for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
-               pages[nr] = page++;
-
-       return nr;
-}
-
  /**
   * try_grab_folio_fast() - Attempt to get or pin a folio in fast path.
   * @page:  pointer to page to be grabbed
@@ -2963,8 +2950,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
         if (pmd_special(orig))
                 return 0;
  
-       page = pmd_page(orig);
-       refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
+       refs = (end - addr) >> PAGE_SHIFT;
+       page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
  
         folio = try_grab_folio_fast(page, refs, flags);
         if (!folio)
@@ -2985,6 +2972,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
         }
  
         *nr += refs;
+       for (; refs; refs--)
+               *(pages++) = page++;
         folio_set_referenced(folio);
         return 1;
  }
@@ -3003,8 +2992,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
         if (pud_special(orig))
                 return 0;
  
-       page = pud_page(orig);
-       refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr);
+       refs = (end - addr) >> PAGE_SHIFT;
+       page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
  
         folio = try_grab_folio_fast(page, refs, flags);
         if (!folio)
@@ -3026,6 +3015,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
         }
  
         *nr += refs;
+       for (; refs; refs--)
+               *(pages++) = page++;
         folio_set_referenced(folio);
         return 1;
  }


The nice thing is that we only record pages in the array if they actually passed our tests.


-- 
Cheers

David / dhildenb


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages()
       [not found]       ` <549a60a6-25e2-48d5-b442-49404a857014@lucifer.local>
@ 2025-08-29 13:44         ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 13:44 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Thomas Bogendoerfer, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

On 29.08.25 14:51, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 10:51:46PM +0200, David Hildenbrand wrote:
>> On 28.08.25 18:57, Lorenzo Stoakes wrote:
>>> On Thu, Aug 28, 2025 at 12:01:24AM +0200, David Hildenbrand wrote:
>>>> Let's make it clearer that we are operating within a single folio by
>>>> providing both the folio and the page.
>>>>
>>>> This implies that for flush_dcache_folio() we'll now avoid one more
>>>> page->folio lookup, and that we can safely drop the "nth_page" usage.
>>>>
>>>> Cc: Thomas Bogendoerfer <tsbogend@alpha.franken.de>
>>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>>> ---
>>>>    arch/mips/include/asm/cacheflush.h | 11 +++++++----
>>>>    arch/mips/mm/cache.c               |  8 ++++----
>>>>    2 files changed, 11 insertions(+), 8 deletions(-)
>>>>
>>>> diff --git a/arch/mips/include/asm/cacheflush.h b/arch/mips/include/asm/cacheflush.h
>>>> index 5d283ef89d90d..8d79bfc687d21 100644
>>>> --- a/arch/mips/include/asm/cacheflush.h
>>>> +++ b/arch/mips/include/asm/cacheflush.h
>>>> @@ -50,13 +50,14 @@ extern void (*flush_cache_mm)(struct mm_struct *mm);
>>>>    extern void (*flush_cache_range)(struct vm_area_struct *vma,
>>>>    	unsigned long start, unsigned long end);
>>>>    extern void (*flush_cache_page)(struct vm_area_struct *vma, unsigned long page, unsigned long pfn);
>>>> -extern void __flush_dcache_pages(struct page *page, unsigned int nr);
>>>> +extern void __flush_dcache_folio_pages(struct folio *folio, struct page *page, unsigned int nr);
>>>
>>> NIT: Be good to drop the extern.
>>
>> I think I'll leave the one in, though, someone should clean up all of them
>> in one go.
> 
> This is how we always clean these up though, buuut to be fair that's in mm.
> 

Well, okay, I'll make all the other functions jealous and blame it on 
you! :P

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 21/36] mm/cma: refuse handing out non-contiguous page ranges
       [not found]   ` <b772a0c0-6e09-4fa4-a113-fe5adf9c7fe0@lucifer.local>
@ 2025-08-29 14:34     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 14:34 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexandru Elisei, Alexander Potapenko,
	Andrew Morton, Brendan Jackman, Christoph Lameter, Dennis Zhou,
	Dmitry Vyukov, dri-devel, intel-gfx, iommu, io-uring,
	Jason Gunthorpe, Jens Axboe, Johannes Weiner, John Hubbard,
	kasan-dev, kvm, Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

On 28.08.25 19:28, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:25AM +0200, David Hildenbrand wrote:
>> Let's disallow handing out PFN ranges with non-contiguous pages, so we
>> can remove the nth-page usage in __cma_alloc(), and so any callers don't
>> have to worry about that either when wanting to blindly iterate pages.
>>
>> This is really only a problem in configs with SPARSEMEM but without
>> SPARSEMEM_VMEMMAP, and only when we would cross memory sections in some
>> cases.
> 
> I'm guessing this is something that we don't need to worry about in
> reality?

That's my theory, yes.

> 
>>
>> Will this cause harm? Probably not, because it's mostly 32bit that does
>> not support SPARSEMEM_VMEMMAP. If this ever becomes a problem we could
>> look into allocating the memmap for the memory sections spanned by a
>> single CMA region in one go from memblock.
>>
>> Reviewed-by: Alexandru Elisei <alexandru.elisei@arm.com>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> LGTM other than refactoring point below.
> 
> CMA stuff looks fine afaict after staring at it for a while, on proviso
> that handing out ranges within the same section is always going to be the
> case.
> 
> Anyway overall,
> 
> LGTM, so:
> 
> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> 
>> ---
>>   include/linux/mm.h |  6 ++++++
>>   mm/cma.c           | 39 ++++++++++++++++++++++++---------------
>>   mm/util.c          | 33 +++++++++++++++++++++++++++++++++
>>   3 files changed, 63 insertions(+), 15 deletions(-)
>>
>> diff --git a/include/linux/mm.h b/include/linux/mm.h
>> index f6880e3225c5c..2ca1eb2db63ec 100644
>> --- a/include/linux/mm.h
>> +++ b/include/linux/mm.h
>> @@ -209,9 +209,15 @@ extern unsigned long sysctl_user_reserve_kbytes;
>>   extern unsigned long sysctl_admin_reserve_kbytes;
>>
>>   #if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
>> +bool page_range_contiguous(const struct page *page, unsigned long nr_pages);
>>   #define nth_page(page,n) pfn_to_page(page_to_pfn((page)) + (n))
>>   #else
>>   #define nth_page(page,n) ((page) + (n))
>> +static inline bool page_range_contiguous(const struct page *page,
>> +		unsigned long nr_pages)
>> +{
>> +	return true;
>> +}
>>   #endif
>>
>>   /* to align the pointer to the (next) page boundary */
>> diff --git a/mm/cma.c b/mm/cma.c
>> index e56ec64d0567e..813e6dc7b0954 100644
>> --- a/mm/cma.c
>> +++ b/mm/cma.c
>> @@ -780,10 +780,8 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>>   				unsigned long count, unsigned int align,
>>   				struct page **pagep, gfp_t gfp)
>>   {
>> -	unsigned long mask, offset;
>> -	unsigned long pfn = -1;
>> -	unsigned long start = 0;
>>   	unsigned long bitmap_maxno, bitmap_no, bitmap_count;
>> +	unsigned long start, pfn, mask, offset;
>>   	int ret = -EBUSY;
>>   	struct page *page = NULL;
>>
>> @@ -795,7 +793,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>>   	if (bitmap_count > bitmap_maxno)
>>   		goto out;
>>
>> -	for (;;) {
>> +	for (start = 0; ; start = bitmap_no + mask + 1) {
>>   		spin_lock_irq(&cma->lock);
>>   		/*
>>   		 * If the request is larger than the available number
>> @@ -812,6 +810,22 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>>   			spin_unlock_irq(&cma->lock);
>>   			break;
>>   		}
>> +
>> +		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
>> +		page = pfn_to_page(pfn);
>> +
>> +		/*
>> +		 * Do not hand out page ranges that are not contiguous, so
>> +		 * callers can just iterate the pages without having to worry
>> +		 * about these corner cases.
>> +		 */
>> +		if (!page_range_contiguous(page, count)) {
>> +			spin_unlock_irq(&cma->lock);
>> +			pr_warn_ratelimited("%s: %s: skipping incompatible area [0x%lx-0x%lx]",
>> +					    __func__, cma->name, pfn, pfn + count - 1);
>> +			continue;
>> +		}
>> +
>>   		bitmap_set(cmr->bitmap, bitmap_no, bitmap_count);
>>   		cma->available_count -= count;
>>   		/*
>> @@ -821,29 +835,24 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
>>   		 */
>>   		spin_unlock_irq(&cma->lock);
>>
>> -		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
>>   		mutex_lock(&cma->alloc_mutex);
>>   		ret = alloc_contig_range(pfn, pfn + count, ACR_FLAGS_CMA, gfp);
>>   		mutex_unlock(&cma->alloc_mutex);
>> -		if (ret == 0) {
>> -			page = pfn_to_page(pfn);
>> +		if (!ret)
>>   			break;
>> -		}
>>
>>   		cma_clear_bitmap(cma, cmr, pfn, count);
>>   		if (ret != -EBUSY)
>>   			break;
>>
>>   		pr_debug("%s(): memory range at pfn 0x%lx %p is busy, retrying\n",
>> -			 __func__, pfn, pfn_to_page(pfn));
>> +			 __func__, pfn, page);
>>
>> -		trace_cma_alloc_busy_retry(cma->name, pfn, pfn_to_page(pfn),
>> -					   count, align);
>> -		/* try again with a bit different memory target */
>> -		start = bitmap_no + mask + 1;
>> +		trace_cma_alloc_busy_retry(cma->name, pfn, page, count, align);
>>   	}
>>   out:
>> -	*pagep = page;
>> +	if (!ret)
>> +		*pagep = page;
>>   	return ret;
>>   }
>>
>> @@ -882,7 +891,7 @@ static struct page *__cma_alloc(struct cma *cma, unsigned long count,
>>   	 */
>>   	if (page) {
>>   		for (i = 0; i < count; i++)
>> -			page_kasan_tag_reset(nth_page(page, i));
>> +			page_kasan_tag_reset(page + i);
>>   	}
>>
>>   	if (ret && !(gfp & __GFP_NOWARN)) {
>> diff --git a/mm/util.c b/mm/util.c
>> index d235b74f7aff7..0bf349b19b652 100644
>> --- a/mm/util.c
>> +++ b/mm/util.c
>> @@ -1280,4 +1280,37 @@ unsigned int folio_pte_batch(struct folio *folio, pte_t *ptep, pte_t pte,
>>   {
>>   	return folio_pte_batch_flags(folio, NULL, ptep, &pte, max_nr, 0);
>>   }
>> +
>> +#if defined(CONFIG_SPARSEMEM) && !defined(CONFIG_SPARSEMEM_VMEMMAP)
>> +/**
>> + * page_range_contiguous - test whether the page range is contiguous
>> + * @page: the start of the page range.
>> + * @nr_pages: the number of pages in the range.
>> + *
>> + * Test whether the page range is contiguous, such that they can be iterated
>> + * naively, corresponding to iterating a contiguous PFN range.
>> + *
>> + * This function should primarily only be used for debug checks, or when
>> + * working with page ranges that are not naturally contiguous (e.g., pages
>> + * within a folio are).
>> + *
>> + * Returns true if contiguous, otherwise false.
>> + */
>> +bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
>> +{
>> +	const unsigned long start_pfn = page_to_pfn(page);
>> +	const unsigned long end_pfn = start_pfn + nr_pages;
>> +	unsigned long pfn;
>> +
>> +	/*
>> +	 * The memmap is allocated per memory section. We need to check
>> +	 * each involved memory section once.
>> +	 */
>> +	for (pfn = ALIGN(start_pfn, PAGES_PER_SECTION);
>> +	     pfn < end_pfn; pfn += PAGES_PER_SECTION)
>> +		if (unlikely(page + (pfn - start_pfn) != pfn_to_page(pfn)))
>> +			return false;
> 
> I find this pretty confusing, my test for this is how many times I have to read
> the code to understand what it's doing :)
> 
> So we have something like:
> 
>    (pfn of page)
>     start_pfn        pfn = align UP
>          |                 |
>          v                 v
>   |         section        |
>          <----------------->
>            pfn - start_pfn
> 
> Then check page + (pfn - start_pfn) == pfn_to_page(pfn)
> 
> And loop such that:
> 
>    (pfn of page)
>     start_pfn                                      pfn
>          |                                          |
>          v                                          v
>   |         section        |         section        |
>          <------------------------------------------>
>                          pfn - start_pfn
> 
> Again check page + (pfn - start_pfn) == pfn_to_page(pfn)
> 
> And so on.
> 
> So the logic looks good, but it's just... that took me a hot second to
> parse :)
> 
> I think a few simple fixups
> 
> bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
> {
> 	const unsigned long start_pfn = page_to_pfn(page);
> 	const unsigned long end_pfn = start_pfn + nr_pages;
> 	/* The PFN of the start of the next section. */
> 	unsigned long pfn = ALIGN(start_pfn, PAGES_PER_SECTION);
> 	/* The page we'd expected to see if the range were contiguous. */
> 	struct page *expected = page + (pfn - start_pfn);
> 
> 	/*
> 	 * The memmap is allocated per memory section. We need to check
> 	 * each involved memory section once.
> 	 */
> 	for (; pfn < end_pfn; pfn += PAGES_PER_SECTION, expected += PAGES_PER_SECTION)
> 		if (unlikely(expected != pfn_to_page(pfn)))
> 			return false;
> 	return true;
> }
> 

Hm, I prefer my variant, especially where the pfn is calculated in the for loop. Likely a
matter of personal taste.

But I can see why skipping the first section might be a surprise when 
one doesn't have the semantics of ALIGN() cached.

So I'll add the following on top:

diff --git a/mm/util.c b/mm/util.c
index 0bf349b19b652..fbdb73aaf35fe 100644
--- a/mm/util.c
+++ b/mm/util.c
@@ -1303,8 +1303,10 @@ bool page_range_contiguous(const struct page *page, unsigned long nr_pages)
         unsigned long pfn;
  
         /*
-        * The memmap is allocated per memory section. We need to check
-        * each involved memory section once.
+        * The memmap is allocated per memory section, so no need to check
+        * within the first section. However, we need to check each other
+        * spanned memory section once, making sure the first page in a
+        * section could similarly be reached by just iterating pages.
          */
         for (pfn = ALIGN(start_pfn, PAGES_PER_SECTION);
              pfn < end_pfn; pfn += PAGES_PER_SECTION)

Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply related	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 24/36] ata: libata-eh: drop nth_page() usage within SG entry
  2025-08-29  0:22     ` Damien Le Moal
@ 2025-08-29 14:37       ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 14:37 UTC (permalink / raw)
  To: Damien Le Moal, Lorenzo Stoakes
  Cc: linux-kernel, Niklas Cassel, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Marco Elver, Marek Szyprowski, Michal Hocko,
	Mike Rapoport, Muchun Song, netdev, Oscar Salvador, Peter Xu,
	Robin Murphy, Suren Baghdasaryan, Tejun Heo, virtualization,
	Vlastimil Babka, wireguard, x86, Zi Yan

On 29.08.25 02:22, Damien Le Moal wrote:
> On 8/29/25 2:53 AM, Lorenzo Stoakes wrote:
>> On Thu, Aug 28, 2025 at 12:01:28AM +0200, David Hildenbrand wrote:
>>> It's no longer required to use nth_page() when iterating pages within a
>>> single SG entry, so let's drop the nth_page() usage.
>>>
>>> Cc: Damien Le Moal <dlemoal@kernel.org>
>>> Cc: Niklas Cassel <cassel@kernel.org>
>>> Signed-off-by: David Hildenbrand <david@redhat.com>
>>
>> LGTM, so:
>>
>> Reviewed-by: Lorenzo Stoakes <lorenzo.stoakes@oracle.com>
> 
> Just noticed this:
> 
> s/libata-eh/libata-sff
> 
> in the commit title please.
> 

Sure, I think some quick git-log search misled me.

Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 33/36] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock()
       [not found]   ` <c9527014-9a29-48f4-8ca9-a6226f962c00@lucifer.local>
@ 2025-08-29 14:41     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 14:41 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 20:09, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:37AM +0200, David Hildenbrand wrote:
>> There is the concern that unpin_user_page_range_dirty_lock() might do
>> some weird merging of PFN ranges -- either now or in the future -- such
>> that PFN range is contiguous but the page range might not be.
>>
>> Let's sanity-check for that and drop the nth_page() usage.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> Seems one user uses SG and the other is IOMMU and in each instance you'd
> expect physical contiguity (maybe Jason G. or somebody else more familiar
> with these uses can also chime in).

Right, and I added the sanity-check so we can identify and fix any such 
wrong merging of ranges.

Thanks!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 36/36] mm: remove nth_page()
       [not found]   ` <18c6a175-507f-464c-b776-67d346863ddf@lucifer.local>
@ 2025-08-29 14:42     ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-08-29 14:42 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

On 28.08.25 20:25, Lorenzo Stoakes wrote:
> On Thu, Aug 28, 2025 at 12:01:40AM +0200, David Hildenbrand wrote:
>> Now that all users are gone, let's remove it.
>>
>> Signed-off-by: David Hildenbrand <david@redhat.com>
> 
> HAPPY DAYYS!!!!
> 
> Happy to have reached this bit, great work! :)

I was just as happy when I made it to the end of this series :)

Thanks for all the review!!

-- 
Cheers

David / dhildenb


^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 32/36] crypto: remove nth_page() usage within SG entry
       [not found] ` <20250827220141.262669-33-david@redhat.com>
@ 2025-08-30  8:50   ` Herbert Xu
  0 siblings, 0 replies; 49+ messages in thread
From: Herbert Xu @ 2025-08-30  8:50 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: linux-kernel, David S. Miller, Alexander Potapenko, Andrew Morton,
	Brendan Jackman, Christoph Lameter, Dennis Zhou, Dmitry Vyukov,
	dri-devel, intel-gfx, iommu, io-uring, Jason Gunthorpe,
	Jens Axboe, Johannes Weiner, John Hubbard, kasan-dev, kvm,
	Liam R. Howlett, Linus Torvalds, linux-arm-kernel,
	linux-arm-kernel, linux-crypto, linux-ide, linux-kselftest,
	linux-mips, linux-mmc, linux-mm, linux-riscv, linux-s390,
	linux-scsi, Lorenzo Stoakes, Marco Elver, Marek Szyprowski,
	Michal Hocko, Mike Rapoport, Muchun Song, netdev, Oscar Salvador,
	Peter Xu, Robin Murphy, Suren Baghdasaryan, Tejun Heo,
	virtualization, Vlastimil Babka, wireguard, x86, Zi Yan

On Thu, Aug 28, 2025 at 12:01:36AM +0200, David Hildenbrand wrote:
> It's no longer required to use nth_page() when iterating pages within a
> single SG entry, so let's drop the nth_page() usage.
> 
> Cc: Herbert Xu <herbert@gondor.apana.org.au>
> Cc: "David S. Miller" <davem@davemloft.net>
> Signed-off-by: David Hildenbrand <david@redhat.com>
> ---
>  crypto/ahash.c               | 4 ++--
>  crypto/scompress.c           | 8 ++++----
>  include/crypto/scatterwalk.h | 4 ++--
>  3 files changed, 8 insertions(+), 8 deletions(-)

Acked-by: Herbert Xu <herbert@gondor.apana.org.au>

Thanks,
-- 
Email: Herbert Xu <herbert@gondor.apana.org.au>
Home Page: http://gondor.apana.org.au/~herbert/
PGP Key: http://gondor.apana.org.au/~herbert/pubkey.txt

^ permalink raw reply	[flat|nested] 49+ messages in thread

* Re: [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages
       [not found]       ` <8a26ae97-9a78-4db5-be98-9c1f6e4fb403@lucifer.local>
@ 2025-09-01 11:35         ` David Hildenbrand
  0 siblings, 0 replies; 49+ messages in thread
From: David Hildenbrand @ 2025-09-01 11:35 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: linux-kernel, Alexander Potapenko, Andrew Morton, Brendan Jackman,
	Christoph Lameter, Dennis Zhou, Dmitry Vyukov, dri-devel,
	intel-gfx, iommu, io-uring, Jason Gunthorpe, Jens Axboe,
	Johannes Weiner, John Hubbard, kasan-dev, kvm, Liam R. Howlett,
	Linus Torvalds, linux-arm-kernel, linux-arm-kernel, linux-crypto,
	linux-ide, linux-kselftest, linux-mips, linux-mmc, linux-mm,
	linux-riscv, linux-s390, linux-scsi, Marco Elver,
	Marek Szyprowski, Michal Hocko, Mike Rapoport, Muchun Song,
	netdev, Oscar Salvador, Peter Xu, Robin Murphy,
	Suren Baghdasaryan, Tejun Heo, virtualization, Vlastimil Babka,
	wireguard, x86, Zi Yan

>>
>>
>> The nice thing is that we only record pages in the array if they actually passed our tests.
> 
> Yeah that's nice actually.
> 
> This is fine (not the meme :P)

:D

> 
> So yes let's do this!

That leaves us with the following on top of this patch:

 From 4533c6e3590cab0c53e81045624d5949e0ad9015 Mon Sep 17 00:00:00 2001
From: David Hildenbrand <david@redhat.com>
Date: Fri, 29 Aug 2025 15:41:45 +0200
Subject: [PATCH] mm/gup: remove record_subpages()

We can clean up the code by calculating the #refs earlier,
so we can just inline what remains of record_subpages().

Calculate the number of references/pages ahead of time, and record them
only once all our tests have passed.

Signed-off-by: David Hildenbrand <david@redhat.com>
---
  mm/gup.c | 25 ++++++++-----------------
  1 file changed, 8 insertions(+), 17 deletions(-)

diff --git a/mm/gup.c b/mm/gup.c
index 89ca0813791ab..5a72a135ec70b 100644
--- a/mm/gup.c
+++ b/mm/gup.c
@@ -484,19 +484,6 @@ static inline void mm_set_has_pinned_flag(struct mm_struct *mm)
  #ifdef CONFIG_MMU
  
  #ifdef CONFIG_HAVE_GUP_FAST
-static int record_subpages(struct page *page, unsigned long sz,
-			   unsigned long addr, unsigned long end,
-			   struct page **pages)
-{
-	int nr;
-
-	page += (addr & (sz - 1)) >> PAGE_SHIFT;
-	for (nr = 0; addr != end; nr++, addr += PAGE_SIZE)
-		pages[nr] = page++;
-
-	return nr;
-}
-
  /**
   * try_grab_folio_fast() - Attempt to get or pin a folio in fast path.
   * @page:  pointer to page to be grabbed
@@ -2963,8 +2950,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
  	if (pmd_special(orig))
  		return 0;
  
-	page = pmd_page(orig);
-	refs = record_subpages(page, PMD_SIZE, addr, end, pages + *nr);
+	refs = (end - addr) >> PAGE_SHIFT;
+	page = pmd_page(orig) + ((addr & ~PMD_MASK) >> PAGE_SHIFT);
  
  	folio = try_grab_folio_fast(page, refs, flags);
  	if (!folio)
@@ -2985,6 +2972,8 @@ static int gup_fast_pmd_leaf(pmd_t orig, pmd_t *pmdp, unsigned long addr,
  	}
  
  	*nr += refs;
+	for (; refs; refs--)
+		*(pages++) = page++;
  	folio_set_referenced(folio);
  	return 1;
  }
@@ -3003,8 +2992,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
  	if (pud_special(orig))
  		return 0;
  
-	page = pud_page(orig);
-	refs = record_subpages(page, PUD_SIZE, addr, end, pages + *nr);
+	refs = (end - addr) >> PAGE_SHIFT;
+	page = pud_page(orig) + ((addr & ~PUD_MASK) >> PAGE_SHIFT);
  
  	folio = try_grab_folio_fast(page, refs, flags);
  	if (!folio)
@@ -3026,6 +3015,8 @@ static int gup_fast_pud_leaf(pud_t orig, pud_t *pudp, unsigned long addr,
  	}
  
  	*nr += refs;
+	for (; refs; refs--)
+		*(pages++) = page++;
  	folio_set_referenced(folio);
  	return 1;
  }
-- 
2.50.1


-- 
Cheers

David / dhildenb


^ permalink raw reply related	[flat|nested] 49+ messages in thread

end of thread, other threads:[~2025-09-01 11:35 UTC | newest]

Thread overview: 49+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
     [not found] <20250827220141.262669-1-david@redhat.com>
2025-08-27 22:01 ` [PATCH v1 06/36] mm/page_alloc: reject unreasonable folio/compound page sizes in alloc_contig_range_noprof() David Hildenbrand
     [not found]   ` <3hpjmfa6p3onfdv4ma4nv2tdggvsyarh7m36aufy6hvwqtp2wd@2odohwxkl3rk>
2025-08-29  9:58     ` David Hildenbrand
     [not found]   ` <f195300e-42e2-4eaa-84c8-c37501c3339c@lucifer.local>
2025-08-29 10:06     ` David Hildenbrand
     [not found]       ` <34edaa0d-0d5f-4041-9a3d-fb5b2dd584e8@lucifer.local>
2025-08-29 13:09         ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 07/36] mm/memremap: reject unreasonable folio/compound page sizes in memremap_pages() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 08/36] mm/hugetlb: check for unreasonable folio sizes when registering hstate David Hildenbrand
     [not found]   ` <fa3425dd-df25-4a0b-a27e-614c81d301c4@lucifer.local>
2025-08-29 10:07     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 09/36] mm/mm_init: make memmap_init_compound() look more like prep_compound_page() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 10/36] mm: sanity-check maximum folio size in folio_set_order() David Hildenbrand
     [not found]   ` <f0c6e9f6-df09-4b10-9338-7bfe4aa46601@lucifer.local>
2025-08-29 10:10     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 11/36] mm: limit folio/compound page sizes in problematic kernel configs David Hildenbrand
     [not found]   ` <baa1b6cf-2fde-4149-8cdf-4b54e2d7c60d@lucifer.local>
2025-08-29 11:57     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 12/36] mm: simplify folio_page() and folio_page_idx() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 13/36] mm/hugetlb: cleanup hugetlb_folio_init_tail_vmemmap() David Hildenbrand
2025-08-28  7:21   ` Mike Rapoport
2025-08-28  7:44     ` David Hildenbrand
2025-08-28  8:06       ` Mike Rapoport
2025-08-28  8:18         ` David Hildenbrand
2025-08-28  8:37           ` Mike Rapoport
2025-08-29 12:00             ` David Hildenbrand
     [not found]   ` <cebd5356-0fc6-40aa-9bc6-a3a5ffe918f8@lucifer.local>
2025-08-29 11:59     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 14/36] mm/mm/percpu-km: drop nth_page() usage within single allocation David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 15/36] fs: hugetlbfs: remove nth_page() usage within folio in adjust_range_hwpoison() David Hildenbrand
     [not found]   ` <1d74a0e2-51ff-462f-8f3c-75639fd21221@lucifer.local>
2025-08-29 12:02     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 16/36] fs: hugetlbfs: cleanup " David Hildenbrand
     [not found]   ` <71cf3600-d9cf-4d16-951c-44582b46c0fa@lucifer.local>
2025-08-29 13:22     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 17/36] mm/pagewalk: drop nth_page() usage within folio in folio_walk_start() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 18/36] mm/gup: drop nth_page() usage within folio when recording subpages David Hildenbrand
     [not found]   ` <c0dadc4f-6415-4818-a319-e3e15ff47a24@lucifer.local>
2025-08-29 13:41     ` David Hildenbrand
     [not found]       ` <8a26ae97-9a78-4db5-be98-9c1f6e4fb403@lucifer.local>
2025-09-01 11:35         ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 19/36] io_uring/zcrx: remove nth_page() usage within folio David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 20/36] mips: mm: convert __flush_dcache_pages() to __flush_dcache_folio_pages() David Hildenbrand
     [not found]   ` <ea74f0e3-bacf-449a-b7ad-213c74599df1@lucifer.local>
2025-08-28 20:51     ` David Hildenbrand
     [not found]       ` <549a60a6-25e2-48d5-b442-49404a857014@lucifer.local>
2025-08-29 13:44         ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 21/36] mm/cma: refuse handing out non-contiguous page ranges David Hildenbrand
     [not found]   ` <b772a0c0-6e09-4fa4-a113-fe5adf9c7fe0@lucifer.local>
2025-08-29 14:34     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 22/36] dma-remap: drop nth_page() in dma_common_contiguous_remap() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 23/36] scatterlist: disallow non-contigous page ranges in a single SG entry David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 33/36] mm/gup: drop nth_page() usage in unpin_user_page_range_dirty_lock() David Hildenbrand
     [not found]   ` <c9527014-9a29-48f4-8ca9-a6226f962c00@lucifer.local>
2025-08-29 14:41     ` David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 34/36] kfence: drop nth_page() usage David Hildenbrand
2025-08-28  8:43   ` Marco Elver
2025-08-27 22:01 ` [PATCH v1 35/36] block: update comment of "struct bio_vec" regarding nth_page() David Hildenbrand
2025-08-27 22:01 ` [PATCH v1 36/36] mm: remove nth_page() David Hildenbrand
     [not found]   ` <18c6a175-507f-464c-b776-67d346863ddf@lucifer.local>
2025-08-29 14:42     ` David Hildenbrand
     [not found] ` <20250827220141.262669-25-david@redhat.com>
2025-08-28  4:24   ` [PATCH v1 24/36] ata: libata-eh: drop nth_page() usage within SG entry Damien Le Moal
     [not found]   ` <7612fdc2-97ff-4b89-a532-90c5de56acdc@lucifer.local>
2025-08-29  0:22     ` Damien Le Moal
2025-08-29 14:37       ` David Hildenbrand
     [not found] ` <20250827220141.262669-33-david@redhat.com>
2025-08-30  8:50   ` [PATCH v1 32/36] crypto: remove " Herbert Xu
