linux-kernel.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
@ 2025-05-09 20:01 Zi Yan
  2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
                   ` (5 more replies)
  0 siblings, 6 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-09 20:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Zi Yan

Hi David and Oscar,

Can you take a look at Patch 2, which changes how online_pages() set
online pageblock migratetypes? It used to first set all pageblocks to
MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
Let me know if there is any issue with my changes.

Hi Johannes,

Patch 2 now have set_pageblock_migratetype() not accepting
MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
feedback.

Hi all,

This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
being overwritten during pageblock isolation process. Currently,
MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
migratetype. This causes pageblock migratetype loss during
alloc_contig_range() and memory offline, especially when the process
fails due to a failed pageblock isolation and the code tries to undo the
finished pageblock isolations.

It is on top of mm-everything-2025-05-09-01-57 with v3 reverted.

In terms of performance for changing pageblock types, no performance
change is observed:

1. I used perf to collect stats of offlining and onlining all memory of a
40GB VM 10 times and see that get_pfnblock_flags_mask() and
set_pfnblock_flags_mask() take about 0.12% and 0.02% of the whole process
respectively with and without this patchset across 3 runs.

2. I used perf to collect stats of dd from /dev/random to a 40GB tmpfs file
and find get_pfnblock_flags_mask() takes about 0.05% of the process with and
without this patchset across 3 runs.


Changelog
===
From v3[2]:
1. kept the original is_migrate_isolate_page()
2. moved {get,set,clear}_pageblock_isolate() to mm/page_isolation.c
3. used a single version for get_pageblock_migratetype() and
   get_pfnblock_migratetype().
4. replace get_pageblock_isolate() with
   get_pageblock_migratetype() == MIGRATE_ISOLATE, a
   get_pageblock_isolate() becomes private in mm/page_isolation.c
5. made set_pageblock_migratetype() not accept MIGRATE_ISOLATE, so that
   people need to use the dedicate {get,set,clear}_pageblock_isolate() APIs.
6. changed online_page() from mm/memory_hotplug.c to first set pageblock
   migratetype to MIGRATE_MOVABLE, then isolate pageblocks.
7. added __maybe_unused to get_pageblock_isolate(), since it is only
   used in VM_BUG_ON(), which could be not present when MM debug is off.
   It is reported by kernel test robot.
7. fixed test_pages_isolated() type issues reported by kernel test
   robot.

From v2[1]:
1. Moved MIGRATETYPE_NO_ISO_MASK to Patch 2, where it is used.
2. Removed spurious changes in Patch 1.
3. Refactored code so that migratetype mask is passed properly for all
callers to {get,set}_pfnblock_flags_mask().
4. Added toggle_pageblock_isolate() for setting and clearing
MIGRATE_ISOLATE.
5. Changed get_pageblock_migratetype() when CONFIG_MEMORY_ISOLATION to
handle MIGRATE_ISOLATE case. It acts like a parsing layer for
get_pfnblock_flags_mask().


Design
===

Pageblock flags are read in words to achieve good performance and existing
pageblock flags take 4 bits per pageblock. To avoid a substantial change
to the pageblock flag code, pageblock flag bits are expanded to use 8
and MIGRATE_ISOLATE is moved to use the last bit (bit 7).

It might look like the pageblock flags have doubled the overhead, but in
reality, the overhead is only 1 byte per 2MB/4MB (based on pageblock config),
or 0.0000476 %.

Any comment and/or suggestion is welcome. Thanks.

[1] https://lore.kernel.org/linux-mm/20250214154215.717537-1-ziy@nvidia.com/
[2] https://lore.kernel.org/linux-mm/20250507211059.2211628-2-ziy@nvidia.com/


Zi Yan (4):
  mm/page_isolation: make page isolation a standalone bit.
  mm/page_isolation: remove migratetype from
    move_freepages_block_isolate()
  mm/page_isolation: remove migratetype from undo_isolate_page_range()
  mm/page_isolation: remove migratetype parameter from more functions.

 drivers/virtio/virtio_mem.c     |   3 +-
 include/linux/gfp.h             |   6 +-
 include/linux/mmzone.h          |  17 +++--
 include/linux/page-isolation.h  |  33 +++++++--
 include/linux/pageblock-flags.h |   9 ++-
 include/trace/events/kmem.h     |  14 ++--
 mm/cma.c                        |   2 +-
 mm/memory_hotplug.c             |  14 ++--
 mm/page_alloc.c                 | 126 ++++++++++++++++++++++++--------
 mm/page_isolation.c             | 114 +++++++++++++++--------------
 10 files changed, 226 insertions(+), 112 deletions(-)

-- 
2.47.2


^ permalink raw reply	[flat|nested] 42+ messages in thread

* [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
@ 2025-05-09 20:01 ` Zi Yan
  2025-05-13 11:32   ` Brendan Jackman
  2025-05-19  8:08   ` David Hildenbrand
  2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
                   ` (4 subsequent siblings)
  5 siblings, 2 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-09 20:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Zi Yan

During page isolation, the original migratetype is overwritten, since
MIGRATE_* are enums and stored in pageblock bitmaps. Change
MIGRATE_ISOLATE to be stored a standalone bit, PB_migrate_isolate, like
PB_migrate_skip, so that migratetype is not lost during pageblock
isolation. pageblock bits needs to be word aligned, so expand
the number of pageblock bits from 4 to 8 and make PB_migrate_isolate bit 7.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/mmzone.h          | 15 ++++++++------
 include/linux/pageblock-flags.h |  9 ++++++++-
 mm/page_alloc.c                 | 36 ++++++++++++++++++++++++++++++++-
 mm/page_isolation.c             | 11 ++++++++++
 4 files changed, 63 insertions(+), 8 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index b19a98c20de8..7ef01fe148ce 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -106,14 +106,17 @@ static inline bool migratetype_is_mergeable(int mt)
 
 extern int page_group_by_mobility_disabled;
 
-#define MIGRATETYPE_MASK ((1UL << PB_migratetype_bits) - 1)
+#ifdef CONFIG_MEMORY_ISOLATION
+#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
+#else
+#define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
+#endif
+
+unsigned long get_pageblock_migratetype(const struct page *page);
 
-#define get_pageblock_migratetype(page)					\
-	get_pfnblock_flags_mask(page, page_to_pfn(page), MIGRATETYPE_MASK)
+#define folio_migratetype(folio)					\
+	get_pageblock_migratetype(&folio->page)
 
-#define folio_migratetype(folio)				\
-	get_pfnblock_flags_mask(&folio->page, folio_pfn(folio),		\
-			MIGRATETYPE_MASK)
 struct free_area {
 	struct list_head	free_list[MIGRATE_TYPES];
 	unsigned long		nr_free;
diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
index 0c4963339f0b..00040e7df8c8 100644
--- a/include/linux/pageblock-flags.h
+++ b/include/linux/pageblock-flags.h
@@ -20,7 +20,10 @@ enum pageblock_bits {
 	PB_migrate_end = PB_migrate + PB_migratetype_bits - 1,
 			/* 3 bits required for migrate types */
 	PB_migrate_skip,/* If set the block is skipped by compaction */
-
+#ifdef CONFIG_MEMORY_ISOLATION
+	PB_migrate_isolate = 7, /* If set the block is isolated */
+			/* set it to 7 to make pageblock bit word aligned */
+#endif
 	/*
 	 * Assume the bits will always align on a word. If this assumption
 	 * changes then get/set pageblock needs updating.
@@ -28,6 +31,10 @@ enum pageblock_bits {
 	NR_PAGEBLOCK_BITS
 };
 
+#ifdef CONFIG_MEMORY_ISOLATION
+#define PB_migrate_isolate_bit BIT(PB_migrate_isolate)
+#endif
+
 #if defined(CONFIG_PAGE_BLOCK_ORDER)
 #define PAGE_BLOCK_ORDER CONFIG_PAGE_BLOCK_ORDER
 #else
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index c77592b22256..04e301fb4879 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -381,10 +381,31 @@ unsigned long get_pfnblock_flags_mask(const struct page *page,
 	return (word >> bitidx) & mask;
 }
 
+unsigned long get_pageblock_migratetype(const struct page *page)
+{
+	unsigned long flags;
+
+	flags = get_pfnblock_flags_mask(page, page_to_pfn(page),
+			MIGRATETYPE_MASK);
+#ifdef CONFIG_MEMORY_ISOLATION
+	if (flags & PB_migrate_isolate_bit)
+		return MIGRATE_ISOLATE;
+#endif
+	return flags;
+}
+
 static __always_inline int get_pfnblock_migratetype(const struct page *page,
 					unsigned long pfn)
 {
-	return get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
+	unsigned long flags;
+
+	flags = get_pfnblock_flags_mask(page, pfn,
+			MIGRATETYPE_MASK);
+#ifdef CONFIG_MEMORY_ISOLATION
+	if (flags & PB_migrate_isolate_bit)
+		return MIGRATE_ISOLATE;
+#endif
+	return flags;
 }
 
 /**
@@ -402,8 +423,14 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
 	unsigned long bitidx, word_bitidx;
 	unsigned long word;
 
+#ifdef CONFIG_MEMORY_ISOLATION
+	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
+	/* extra one for MIGRATE_ISOLATE */
+	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits) + 1);
+#else
 	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
 	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
+#endif
 
 	bitmap = get_pageblock_bitmap(page, pfn);
 	bitidx = pfn_to_bitidx(page, pfn);
@@ -426,6 +453,13 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
 		     migratetype < MIGRATE_PCPTYPES))
 		migratetype = MIGRATE_UNMOVABLE;
 
+#ifdef CONFIG_MEMORY_ISOLATION
+	if (migratetype == MIGRATE_ISOLATE) {
+		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
+				page_to_pfn(page), PB_migrate_isolate_bit);
+		return;
+	}
+#endif
 	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
 				page_to_pfn(page), MIGRATETYPE_MASK);
 }
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index b2fc5266e3d2..751e21f6d85e 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -15,6 +15,17 @@
 #define CREATE_TRACE_POINTS
 #include <trace/events/page_isolation.h>
 
+static inline bool __maybe_unused get_pageblock_isolate(struct page *page)
+{
+	return get_pfnblock_flags_mask(page, page_to_pfn(page),
+			PB_migrate_isolate_bit);
+}
+static inline void clear_pageblock_isolate(struct page *page)
+{
+	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
+			PB_migrate_isolate_bit);
+}
+
 /*
  * This function checks whether the range [start_pfn, end_pfn) includes
  * unmovable pages or not. The range must fall into a single pageblock and
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
  2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
@ 2025-05-09 20:01 ` Zi Yan
  2025-05-12  6:25   ` kernel test robot
                     ` (2 more replies)
  2025-05-09 20:01 ` [PATCH v4 3/4] mm/page_isolation: remove migratetype from undo_isolate_page_range() Zi Yan
                   ` (3 subsequent siblings)
  5 siblings, 3 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-09 20:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Zi Yan

Since migratetype is no longer overwritten during pageblock isolation,
moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.

Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
when a pageblock is isolated. It is used by move_freepages_block_isolate().

Add pageblock_isolate_and_move_free_pages() and
pageblock_unisolate_and_move_free_pages() to be explicit about the page
isolation operations. Both share the common code in
__move_freepages_block_isolate(), which is renamed from
move_freepages_block_isolate().

Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
so that one should use set_pageblock_isolate() to isolate pageblocks.

Two consequential changes:
1. move pageblock migratetype code out of __move_freepages_block().
2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
   called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
   pageblocks are isolated afterwards. Otherwise, all online pageblocks
   will have non-determined migratetype.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 include/linux/mmzone.h         |  4 +-
 include/linux/page-isolation.h |  5 ++-
 mm/memory_hotplug.c            |  7 +++-
 mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
 mm/page_isolation.c            | 27 ++++++++-----
 5 files changed, 82 insertions(+), 34 deletions(-)

diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
index 7ef01fe148ce..f66895456974 100644
--- a/include/linux/mmzone.h
+++ b/include/linux/mmzone.h
@@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
 extern int page_group_by_mobility_disabled;
 
 #ifdef CONFIG_MEMORY_ISOLATION
-#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
+#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
+#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
 #else
+#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
 #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
 #endif
 
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 898bb788243b..b0a2af0a5357 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
 #define REPORT_FAILURE	0x2
 
 void set_pageblock_migratetype(struct page *page, int migratetype);
+void set_pageblock_isolate(struct page *page);
 
-bool move_freepages_block_isolate(struct zone *zone, struct page *page,
-				  int migratetype);
+bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
+bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
 
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 			     int migratetype, int flags);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b1caedbade5b..c86c47bba019 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 	const int nid = zone_to_nid(zone);
 	int ret;
 	struct memory_notify arg;
+	unsigned long isol_pfn;
 
 	/*
 	 * {on,off}lining is constrained to full memory sections (or more
@@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 
 
 	/* associate pfn range with the zone */
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
+	for (isol_pfn = pfn;
+	     isol_pfn < pfn + nr_pages;
+	     isol_pfn += pageblock_nr_pages)
+		set_pageblock_isolate(pfn_to_page(isol_pfn));
 
 	arg.start_pfn = pfn;
 	arg.nr_pages = nr_pages;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 04e301fb4879..cfd37b2d992e 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -454,11 +454,9 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
 		migratetype = MIGRATE_UNMOVABLE;
 
 #ifdef CONFIG_MEMORY_ISOLATION
-	if (migratetype == MIGRATE_ISOLATE) {
-		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
-				page_to_pfn(page), PB_migrate_isolate_bit);
-		return;
-	}
+	VM_WARN(migratetype == MIGRATE_ISOLATE,
+			"Use set_pageblock_isolate() for pageblock isolation");
+	return;
 #endif
 	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
 				page_to_pfn(page), MIGRATETYPE_MASK);
@@ -1819,8 +1817,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
 #endif
 
 /*
- * Change the type of a block and move all its free pages to that
- * type's freelist.
+ * Move all free pages of a block to new type's freelist. Caller needs to
+ * change the block type.
  */
 static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
 				  int old_mt, int new_mt)
@@ -1852,8 +1850,6 @@ static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
 		pages_moved += 1 << order;
 	}
 
-	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
-
 	return pages_moved;
 }
 
@@ -1911,11 +1907,16 @@ static int move_freepages_block(struct zone *zone, struct page *page,
 				int old_mt, int new_mt)
 {
 	unsigned long start_pfn;
+	int res;
 
 	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
 		return -1;
 
-	return __move_freepages_block(zone, start_pfn, old_mt, new_mt);
+	res = __move_freepages_block(zone, start_pfn, old_mt, new_mt);
+	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
+
+	return res;
+
 }
 
 #ifdef CONFIG_MEMORY_ISOLATION
@@ -1943,11 +1944,17 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
 	return start_pfn;
 }
 
+static inline void toggle_pageblock_isolate(struct page *page, bool isolate)
+{
+	set_pfnblock_flags_mask(page, (isolate << PB_migrate_isolate),
+			page_to_pfn(page), PB_migrate_isolate_bit);
+}
+
 /**
- * move_freepages_block_isolate - move free pages in block for page isolation
+ * __move_freepages_block_isolate - move free pages in block for page isolation
  * @zone: the zone
  * @page: the pageblock page
- * @migratetype: migratetype to set on the pageblock
+ * @isolate: to isolate the given pageblock or unisolate it
  *
  * This is similar to move_freepages_block(), but handles the special
  * case encountered in page isolation, where the block of interest
@@ -1962,10 +1969,15 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
  *
  * Returns %true if pages could be moved, %false otherwise.
  */
-bool move_freepages_block_isolate(struct zone *zone, struct page *page,
-				  int migratetype)
+static bool __move_freepages_block_isolate(struct zone *zone,
+		struct page *page, bool isolate)
 {
 	unsigned long start_pfn, pfn;
+	int from_mt;
+	int to_mt;
+
+	if (isolate == (get_pageblock_migratetype(page) == MIGRATE_ISOLATE))
+		return false;
 
 	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
 		return false;
@@ -1982,7 +1994,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 
 		del_page_from_free_list(buddy, zone, order,
 					get_pfnblock_migratetype(buddy, pfn));
-		set_pageblock_migratetype(page, migratetype);
+		toggle_pageblock_isolate(page, isolate);
 		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
 		return true;
 	}
@@ -1993,16 +2005,38 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
 
 		del_page_from_free_list(page, zone, order,
 					get_pfnblock_migratetype(page, pfn));
-		set_pageblock_migratetype(page, migratetype);
+		toggle_pageblock_isolate(page, isolate);
 		split_large_buddy(zone, page, pfn, order, FPI_NONE);
 		return true;
 	}
 move:
-	__move_freepages_block(zone, start_pfn,
-			       get_pfnblock_migratetype(page, start_pfn),
-			       migratetype);
+	/* use MIGRATETYPE_NO_ISO_MASK to get the non-isolate migratetype */
+	if (isolate) {
+		from_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
+				MIGRATETYPE_NO_ISO_MASK);
+		to_mt = MIGRATE_ISOLATE;
+	} else {
+		from_mt = MIGRATE_ISOLATE;
+		to_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
+				MIGRATETYPE_NO_ISO_MASK);
+	}
+
+	__move_freepages_block(zone, start_pfn, from_mt, to_mt);
+	toggle_pageblock_isolate(pfn_to_page(start_pfn), isolate);
+
 	return true;
 }
+
+bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
+{
+	return __move_freepages_block_isolate(zone, page, true);
+}
+
+bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page)
+{
+	return __move_freepages_block_isolate(zone, page, false);
+}
+
 #endif /* CONFIG_MEMORY_ISOLATION */
 
 static void change_pageblock_range(struct page *pageblock_page,
@@ -2194,6 +2228,7 @@ try_to_claim_block(struct zone *zone, struct page *page,
 	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
 			page_group_by_mobility_disabled) {
 		__move_freepages_block(zone, start_pfn, block_type, start_type);
+		set_pageblock_migratetype(pfn_to_page(start_pfn), start_type);
 		return __rmqueue_smallest(zone, order, start_type);
 	}
 
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 751e21f6d85e..4571940f14db 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
 	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
 			PB_migrate_isolate_bit);
 }
+void set_pageblock_isolate(struct page *page)
+{
+	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
+			page_to_pfn(page),
+			PB_migrate_isolate_bit);
+}
 
 /*
  * This function checks whether the range [start_pfn, end_pfn) includes
@@ -199,7 +205,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
 			migratetype, isol_flags);
 	if (!unmovable) {
-		if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
+		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
 			spin_unlock_irqrestore(&zone->lock, flags);
 			return -EBUSY;
 		}
@@ -220,7 +226,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 	return -EBUSY;
 }
 
-static void unset_migratetype_isolate(struct page *page, int migratetype)
+static void unset_migratetype_isolate(struct page *page)
 {
 	struct zone *zone;
 	unsigned long flags;
@@ -273,10 +279,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
 		 * Isolating this block already succeeded, so this
 		 * should not fail on zone boundaries.
 		 */
-		WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
+		WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
 	} else {
-		set_pageblock_migratetype(page, migratetype);
-		__putback_isolated_page(page, order, migratetype);
+		clear_pageblock_isolate(page);
+		__putback_isolated_page(page, order, get_pageblock_migratetype(page));
 	}
 	zone->nr_isolate_pageblock--;
 out:
@@ -394,7 +400,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 		if (PageBuddy(page)) {
 			int order = buddy_order(page);
 
-			/* move_freepages_block_isolate() handled this */
+			/* pageblock_isolate_and_move_free_pages() handled this */
 			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
 
 			pfn += 1UL << order;
@@ -444,7 +450,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 failed:
 	/* restore the original migratetype */
 	if (!skip_isolation)
-		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
+		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
 	return -EBUSY;
 }
 
@@ -515,7 +521,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	ret = isolate_single_pageblock(isolate_end, flags, true,
 			skip_isolation, migratetype);
 	if (ret) {
-		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
+		unset_migratetype_isolate(pfn_to_page(isolate_start));
 		return ret;
 	}
 
@@ -528,8 +534,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 					start_pfn, end_pfn)) {
 			undo_isolate_page_range(isolate_start, pfn, migratetype);
 			unset_migratetype_isolate(
-				pfn_to_page(isolate_end - pageblock_nr_pages),
-				migratetype);
+				pfn_to_page(isolate_end - pageblock_nr_pages));
 			return -EBUSY;
 		}
 	}
@@ -559,7 +564,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (!page || !is_migrate_isolate_page(page))
 			continue;
-		unset_migratetype_isolate(page, migratetype);
+		unset_migratetype_isolate(page);
 	}
 }
 /*
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v4 3/4] mm/page_isolation: remove migratetype from undo_isolate_page_range()
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
  2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
  2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
@ 2025-05-09 20:01 ` Zi Yan
  2025-05-09 20:01 ` [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions Zi Yan
                   ` (2 subsequent siblings)
  5 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-09 20:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Zi Yan

Since migratetype is no longer overwritten during pageblock isolation,
undoing pageblock isolation no longer needs which migratetype to restore.

Signed-off-by: Zi Yan <ziy@nvidia.com>
Acked-by: David Hildenbrand <david@redhat.com>
---
 include/linux/page-isolation.h | 3 +--
 mm/memory_hotplug.c            | 4 ++--
 mm/page_alloc.c                | 2 +-
 mm/page_isolation.c            | 9 +++------
 4 files changed, 7 insertions(+), 11 deletions(-)

diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index b0a2af0a5357..5e6538ab74f0 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -34,8 +34,7 @@ bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *pag
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 			     int migratetype, int flags);
 
-void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     int migratetype);
+void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn);
 
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 			int isol_flags);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index c86c47bba019..e63e115e2c08 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -1234,7 +1234,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
 		build_all_zonelists(NULL);
 
 	/* Basic onlining is complete, allow allocation of onlined pages. */
-	undo_isolate_page_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE);
+	undo_isolate_page_range(pfn, pfn + nr_pages);
 
 	/*
 	 * Freshly onlined pages aren't shuffled (e.g., all pages are placed to
@@ -2120,7 +2120,7 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 
 failed_removal_isolated:
 	/* pushback to free area */
-	undo_isolate_page_range(start_pfn, end_pfn, MIGRATE_MOVABLE);
+	undo_isolate_page_range(start_pfn, end_pfn);
 	memory_notify(MEM_CANCEL_OFFLINE, &arg);
 failed_removal_pcplists_disabled:
 	lru_cache_enable();
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index cfd37b2d992e..8ff7937e932a 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6887,7 +6887,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 		     start, end, outer_start, outer_end);
 	}
 done:
-	undo_isolate_page_range(start, end, migratetype);
+	undo_isolate_page_range(start, end);
 	return ret;
 }
 EXPORT_SYMBOL(alloc_contig_range_noprof);
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 4571940f14db..7b9b76620a96 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -532,7 +532,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		page = __first_valid_page(pfn, pageblock_nr_pages);
 		if (page && set_migratetype_isolate(page, migratetype, flags,
 					start_pfn, end_pfn)) {
-			undo_isolate_page_range(isolate_start, pfn, migratetype);
+			undo_isolate_page_range(isolate_start, pfn);
 			unset_migratetype_isolate(
 				pfn_to_page(isolate_end - pageblock_nr_pages));
 			return -EBUSY;
@@ -545,13 +545,10 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
  * undo_isolate_page_range - undo effects of start_isolate_page_range()
  * @start_pfn:		The first PFN of the isolated range
  * @end_pfn:		The last PFN of the isolated range
- * @migratetype:	New migrate type to set on the range
  *
- * This finds every MIGRATE_ISOLATE page block in the given range
- * and switches it to @migratetype.
+ * This finds and unsets every MIGRATE_ISOLATE page block in the given range
  */
-void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			    int migratetype)
+void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn)
 {
 	unsigned long pfn;
 	struct page *page;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions.
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
                   ` (2 preceding siblings ...)
  2025-05-09 20:01 ` [PATCH v4 3/4] mm/page_isolation: remove migratetype from undo_isolate_page_range() Zi Yan
@ 2025-05-09 20:01 ` Zi Yan
  2025-05-17 20:21   ` Vlastimil Babka
  2025-05-18 16:32   ` Johannes Weiner
  2025-05-17 20:26 ` [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Vlastimil Babka
  2025-05-19  7:44 ` David Hildenbrand
  5 siblings, 2 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-09 20:01 UTC (permalink / raw)
  To: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Zi Yan

migratetype is no longer overwritten during pageblock isolation,
start_isolate_page_range(), has_unmovable_pages(), and
set_migratetype_isolate() no longer need which migratetype to restore
during isolation failure.

For has_unmoable_pages(), it needs to know if the isolation is for CMA
allocation, so adding CMA_ALLOCATION to isolation flags to provide the
information.

alloc_contig_range() no longer needs migratetype. Replace it with
a newly defined acr_flags_t to tell if an allocation is for CMA. So does
__alloc_contig_migrate_range().

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 drivers/virtio/virtio_mem.c    |  3 +-
 include/linux/gfp.h            |  6 ++-
 include/linux/page-isolation.h | 25 +++++++++++--
 include/trace/events/kmem.h    | 14 ++++---
 mm/cma.c                       |  2 +-
 mm/memory_hotplug.c            |  3 +-
 mm/page_alloc.c                | 25 ++++++-------
 mm/page_isolation.c            | 67 ++++++++++++++++------------------
 8 files changed, 80 insertions(+), 65 deletions(-)

diff --git a/drivers/virtio/virtio_mem.c b/drivers/virtio/virtio_mem.c
index 56d0dbe62163..8accc0f255a8 100644
--- a/drivers/virtio/virtio_mem.c
+++ b/drivers/virtio/virtio_mem.c
@@ -1243,8 +1243,7 @@ static int virtio_mem_fake_offline(struct virtio_mem *vm, unsigned long pfn,
 		if (atomic_read(&vm->config_changed))
 			return -EAGAIN;
 
-		rc = alloc_contig_range(pfn, pfn + nr_pages, MIGRATE_MOVABLE,
-					GFP_KERNEL);
+		rc = alloc_contig_range(pfn, pfn + nr_pages, 0, GFP_KERNEL);
 		if (rc == -ENOMEM)
 			/* whoops, out of memory */
 			return rc;
diff --git a/include/linux/gfp.h b/include/linux/gfp.h
index c9fa6309c903..db4be1861736 100644
--- a/include/linux/gfp.h
+++ b/include/linux/gfp.h
@@ -423,9 +423,13 @@ static inline bool gfp_compaction_allowed(gfp_t gfp_mask)
 extern gfp_t vma_thp_gfp_mask(struct vm_area_struct *vma);
 
 #ifdef CONFIG_CONTIG_ALLOC
+
+typedef unsigned int __bitwise acr_flags_t;
+#define ACR_CMA		((__force acr_flags_t)BIT(0))	// allocate for CMA
+
 /* The below functions must be run on a range from a single zone. */
 extern int alloc_contig_range_noprof(unsigned long start, unsigned long end,
-			      unsigned migratetype, gfp_t gfp_mask);
+			      acr_flags_t alloc_flags, gfp_t gfp_mask);
 #define alloc_contig_range(...)			alloc_hooks(alloc_contig_range_noprof(__VA_ARGS__))
 
 extern struct page *alloc_contig_pages_noprof(unsigned long nr_pages, gfp_t gfp_mask,
diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
index 5e6538ab74f0..7aed5bf18cc6 100644
--- a/include/linux/page-isolation.h
+++ b/include/linux/page-isolation.h
@@ -22,8 +22,25 @@ static inline bool is_migrate_isolate(int migratetype)
 }
 #endif
 
-#define MEMORY_OFFLINE	0x1
-#define REPORT_FAILURE	0x2
+/*
+ * Isolation modes:
+ * ISOLATE_MODE_NONE - isolate for other purposes than those below
+ * MEMORY_OFFLINE    - isolate to offline (!allocate) memory e.g., skip over
+ *		       PageHWPoison() pages and PageOffline() pages.
+ * CMA_ALLOCATION    - isolate for CMA allocations
+ */
+enum isolate_mode_t {
+	ISOLATE_MODE_NONE,
+	MEMORY_OFFLINE,
+	CMA_ALLOCATION,
+};
+
+/*
+ * Isolation flags:
+ * REPORT_FAILURE - report details about the failure to isolate the range
+ */
+typedef unsigned int __bitwise isolate_flags_t;
+#define REPORT_FAILURE		((__force isolate_flags_t)BIT(0))
 
 void set_pageblock_migratetype(struct page *page, int migratetype);
 void set_pageblock_isolate(struct page *page);
@@ -32,10 +49,10 @@ bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
 bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
 
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     int migratetype, int flags);
+			     isolate_mode_t mode, isolate_flags_t flags);
 
 void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn);
 
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
-			int isol_flags);
+			isolate_mode_t mode);
 #endif
diff --git a/include/trace/events/kmem.h b/include/trace/events/kmem.h
index f74925a6cf69..efffcf578217 100644
--- a/include/trace/events/kmem.h
+++ b/include/trace/events/kmem.h
@@ -304,6 +304,7 @@ TRACE_EVENT(mm_page_alloc_extfrag,
 		__entry->change_ownership)
 );
 
+#ifdef CONFIG_CONTIG_ALLOC
 TRACE_EVENT(mm_alloc_contig_migrate_range_info,
 
 	TP_PROTO(unsigned long start,
@@ -311,9 +312,9 @@ TRACE_EVENT(mm_alloc_contig_migrate_range_info,
 		 unsigned long nr_migrated,
 		 unsigned long nr_reclaimed,
 		 unsigned long nr_mapped,
-		 int migratetype),
+		 acr_flags_t alloc_flags),
 
-	TP_ARGS(start, end, nr_migrated, nr_reclaimed, nr_mapped, migratetype),
+	TP_ARGS(start, end, nr_migrated, nr_reclaimed, nr_mapped, alloc_flags),
 
 	TP_STRUCT__entry(
 		__field(unsigned long, start)
@@ -321,7 +322,7 @@ TRACE_EVENT(mm_alloc_contig_migrate_range_info,
 		__field(unsigned long, nr_migrated)
 		__field(unsigned long, nr_reclaimed)
 		__field(unsigned long, nr_mapped)
-		__field(int, migratetype)
+		__field(acr_flags_t, alloc_flags)
 	),
 
 	TP_fast_assign(
@@ -330,17 +331,18 @@ TRACE_EVENT(mm_alloc_contig_migrate_range_info,
 		__entry->nr_migrated = nr_migrated;
 		__entry->nr_reclaimed = nr_reclaimed;
 		__entry->nr_mapped = nr_mapped;
-		__entry->migratetype = migratetype;
+		__entry->alloc_flags = alloc_flags;
 	),
 
-	TP_printk("start=0x%lx end=0x%lx migratetype=%d nr_migrated=%lu nr_reclaimed=%lu nr_mapped=%lu",
+	TP_printk("start=0x%lx end=0x%lx alloc_flags=%d nr_migrated=%lu nr_reclaimed=%lu nr_mapped=%lu",
 		  __entry->start,
 		  __entry->end,
-		  __entry->migratetype,
+		  __entry->alloc_flags,
 		  __entry->nr_migrated,
 		  __entry->nr_reclaimed,
 		  __entry->nr_mapped)
 );
+#endif
 
 TRACE_EVENT(mm_setup_per_zone_wmarks,
 
diff --git a/mm/cma.c b/mm/cma.c
index 15632939f20a..8606bfe19e5d 100644
--- a/mm/cma.c
+++ b/mm/cma.c
@@ -818,7 +818,7 @@ static int cma_range_alloc(struct cma *cma, struct cma_memrange *cmr,
 
 		pfn = cmr->base_pfn + (bitmap_no << cma->order_per_bit);
 		mutex_lock(&cma->alloc_mutex);
-		ret = alloc_contig_range(pfn, pfn + count, MIGRATE_CMA, gfp);
+		ret = alloc_contig_range(pfn, pfn + count, ACR_CMA, gfp);
 		mutex_unlock(&cma->alloc_mutex);
 		if (ret == 0) {
 			page = pfn_to_page(pfn);
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index e63e115e2c08..029a65b6ad27 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -2010,8 +2010,7 @@ int offline_pages(unsigned long start_pfn, unsigned long nr_pages,
 
 	/* set above range as isolated */
 	ret = start_isolate_page_range(start_pfn, end_pfn,
-				       MIGRATE_MOVABLE,
-				       MEMORY_OFFLINE | REPORT_FAILURE);
+				       MEMORY_OFFLINE, REPORT_FAILURE);
 	if (ret) {
 		reason = "failure to isolate range";
 		goto failed_removal_pcplists_disabled;
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 8ff7937e932a..b3476e0f59ad 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6580,11 +6580,12 @@ static void alloc_contig_dump_pages(struct list_head *page_list)
 
 /*
  * [start, end) must belong to a single zone.
- * @migratetype: using migratetype to filter the type of migration in
+ * @alloc_flags: using acr_flags_t to filter the type of migration in
  *		trace_mm_alloc_contig_migrate_range_info.
  */
 static int __alloc_contig_migrate_range(struct compact_control *cc,
-		unsigned long start, unsigned long end, int migratetype)
+					unsigned long start, unsigned long end,
+					acr_flags_t alloc_flags)
 {
 	/* This function is based on compact_zone() from compaction.c. */
 	unsigned int nr_reclaimed;
@@ -6656,7 +6657,7 @@ static int __alloc_contig_migrate_range(struct compact_control *cc,
 		putback_movable_pages(&cc->migratepages);
 	}
 
-	trace_mm_alloc_contig_migrate_range_info(start, end, migratetype,
+	trace_mm_alloc_contig_migrate_range_info(start, end, alloc_flags,
 						 total_migrated,
 						 total_reclaimed,
 						 total_mapped);
@@ -6727,10 +6728,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
  * alloc_contig_range() -- tries to allocate given range of pages
  * @start:	start PFN to allocate
  * @end:	one-past-the-last PFN to allocate
- * @migratetype:	migratetype of the underlying pageblocks (either
- *			#MIGRATE_MOVABLE or #MIGRATE_CMA).  All pageblocks
- *			in range must have the same migratetype and it must
- *			be either of the two.
+ * @alloc_flags:	allocation information
  * @gfp_mask:	GFP mask. Node/zone/placement hints are ignored; only some
  *		action and reclaim modifiers are supported. Reclaim modifiers
  *		control allocation behavior during compaction/migration/reclaim.
@@ -6747,7 +6745,7 @@ static int __alloc_contig_verify_gfp_mask(gfp_t gfp_mask, gfp_t *gfp_cc_mask)
  * need to be freed with free_contig_range().
  */
 int alloc_contig_range_noprof(unsigned long start, unsigned long end,
-		       unsigned migratetype, gfp_t gfp_mask)
+			acr_flags_t alloc_flags, gfp_t gfp_mask)
 {
 	unsigned long outer_start, outer_end;
 	int ret = 0;
@@ -6763,6 +6761,8 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	};
 	INIT_LIST_HEAD(&cc.migratepages);
 	bool is_range_aligned;
+	isolate_mode_t mode = (alloc_flags & ACR_CMA) ? CMA_ALLOCATION :
+							ISOLATE_MODE_NONE;
 
 	gfp_mask = current_gfp_context(gfp_mask);
 	if (__alloc_contig_verify_gfp_mask(gfp_mask, (gfp_t *)&cc.gfp_mask))
@@ -6789,7 +6789,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	 * put back to page allocator so that buddy can use them.
 	 */
 
-	ret = start_isolate_page_range(start, end, migratetype, 0);
+	ret = start_isolate_page_range(start, end, mode, 0);
 	if (ret)
 		goto done;
 
@@ -6805,7 +6805,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	 * allocated.  So, if we fall through be sure to clear ret so that
 	 * -EBUSY is not accidentally used or returned to caller.
 	 */
-	ret = __alloc_contig_migrate_range(&cc, start, end, migratetype);
+	ret = __alloc_contig_migrate_range(&cc, start, end, alloc_flags);
 	if (ret && ret != -EBUSY)
 		goto done;
 
@@ -6839,7 +6839,7 @@ int alloc_contig_range_noprof(unsigned long start, unsigned long end,
 	outer_start = find_large_buddy(start);
 
 	/* Make sure the range is really isolated. */
-	if (test_pages_isolated(outer_start, end, 0)) {
+	if (test_pages_isolated(outer_start, end, mode)) {
 		ret = -EBUSY;
 		goto done;
 	}
@@ -6897,8 +6897,7 @@ static int __alloc_contig_pages(unsigned long start_pfn,
 {
 	unsigned long end_pfn = start_pfn + nr_pages;
 
-	return alloc_contig_range_noprof(start_pfn, end_pfn, MIGRATE_MOVABLE,
-				   gfp_mask);
+	return alloc_contig_range_noprof(start_pfn, end_pfn, 0, gfp_mask);
 }
 
 static bool pfn_range_valid_contig(struct zone *z, unsigned long start_pfn,
diff --git a/mm/page_isolation.c b/mm/page_isolation.c
index 7b9b76620a96..71ed918deeb3 100644
--- a/mm/page_isolation.c
+++ b/mm/page_isolation.c
@@ -48,7 +48,7 @@ void set_pageblock_isolate(struct page *page)
  *
  */
 static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long end_pfn,
-				int migratetype, int flags)
+				isolate_mode_t mode, isolate_flags_t flags)
 {
 	struct page *page = pfn_to_page(start_pfn);
 	struct zone *zone = page_zone(page);
@@ -63,7 +63,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * isolate CMA pageblocks even when they are not movable in fact
 		 * so consider them movable here.
 		 */
-		if (is_migrate_cma(migratetype))
+		if (mode == CMA_ALLOCATION)
 			return NULL;
 
 		return page;
@@ -134,7 +134,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * The HWPoisoned page may be not in buddy system, and
 		 * page_count() is not 0.
 		 */
-		if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
+		if ((mode == MEMORY_OFFLINE) && PageHWPoison(page))
 			continue;
 
 		/*
@@ -147,7 +147,7 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
 		 * move these pages that still have a reference count > 0.
 		 * (false negatives in this function only)
 		 */
-		if ((flags & MEMORY_OFFLINE) && PageOffline(page))
+		if ((mode == MEMORY_OFFLINE) && PageOffline(page))
 			continue;
 
 		if (__PageMovable(page) || PageLRU(page))
@@ -168,8 +168,9 @@ static struct page *has_unmovable_pages(unsigned long start_pfn, unsigned long e
  * present in [start_pfn, end_pfn). The pageblock must intersect with
  * [start_pfn, end_pfn).
  */
-static int set_migratetype_isolate(struct page *page, int migratetype, int isol_flags,
-			unsigned long start_pfn, unsigned long end_pfn)
+static int set_migratetype_isolate(struct page *page, isolate_mode_t mode,
+			isolate_flags_t isol_flags, unsigned long start_pfn,
+			unsigned long end_pfn)
 {
 	struct zone *zone = page_zone(page);
 	struct page *unmovable;
@@ -203,7 +204,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
 				  end_pfn);
 
 	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
-			migratetype, isol_flags);
+			mode, isol_flags);
 	if (!unmovable) {
 		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
 			spin_unlock_irqrestore(&zone->lock, flags);
@@ -309,11 +310,11 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * isolate_single_pageblock() -- tries to isolate a pageblock that might be
  * within a free or in-use page.
  * @boundary_pfn:		pageblock-aligned pfn that a page might cross
+ * @mode:			isolation mode
  * @flags:			isolation flags
  * @isolate_before:	isolate the pageblock before the boundary_pfn
  * @skip_isolation:	the flag to skip the pageblock isolation in second
  *			isolate_single_pageblock()
- * @migratetype:	migrate type to set in error recovery.
  *
  * Free and in-use pages can be as big as MAX_PAGE_ORDER and contain more than one
  * pageblock. When not all pageblocks within a page are isolated at the same
@@ -328,8 +329,9 @@ __first_valid_page(unsigned long pfn, unsigned long nr_pages)
  * either. The function handles this by splitting the free page or migrating
  * the in-use page then splitting the free page.
  */
-static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
-		bool isolate_before, bool skip_isolation, int migratetype)
+static int isolate_single_pageblock(unsigned long boundary_pfn,
+			isolate_mode_t mode, isolate_flags_t flags,
+			bool isolate_before, bool skip_isolation)
 {
 	unsigned long start_pfn;
 	unsigned long isolate_pageblock;
@@ -355,12 +357,11 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
 				      zone->zone_start_pfn);
 
 	if (skip_isolation) {
-		int mt __maybe_unused = get_pageblock_migratetype(pfn_to_page(isolate_pageblock));
-
-		VM_BUG_ON(!is_migrate_isolate(mt));
+		VM_BUG_ON(!get_pageblock_isolate(pfn_to_page(isolate_pageblock)));
 	} else {
-		ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype,
-				flags, isolate_pageblock, isolate_pageblock + pageblock_nr_pages);
+		ret = set_migratetype_isolate(pfn_to_page(isolate_pageblock),
+				mode, flags, isolate_pageblock,
+				isolate_pageblock + pageblock_nr_pages);
 
 		if (ret)
 			return ret;
@@ -458,14 +459,8 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
  * start_isolate_page_range() - mark page range MIGRATE_ISOLATE
  * @start_pfn:		The first PFN of the range to be isolated.
  * @end_pfn:		The last PFN of the range to be isolated.
- * @migratetype:	Migrate type to set in error recovery.
- * @flags:		The following flags are allowed (they can be combined in
- *			a bit mask)
- *			MEMORY_OFFLINE - isolate to offline (!allocate) memory
- *					 e.g., skip over PageHWPoison() pages
- *					 and PageOffline() pages.
- *			REPORT_FAILURE - report details about the failure to
- *			isolate the range
+ * @mode:		isolation mode
+ * @flags:		isolation flags
  *
  * Making page-allocation-type to be MIGRATE_ISOLATE means free pages in
  * the range will never be allocated. Any free pages and pages freed in the
@@ -498,7 +493,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
  * Return: 0 on success and -EBUSY if any part of range cannot be isolated.
  */
 int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
-			     int migratetype, int flags)
+			     isolate_mode_t mode, isolate_flags_t flags)
 {
 	unsigned long pfn;
 	struct page *page;
@@ -509,8 +504,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	bool skip_isolation = false;
 
 	/* isolate [isolate_start, isolate_start + pageblock_nr_pages) pageblock */
-	ret = isolate_single_pageblock(isolate_start, flags, false,
-			skip_isolation, migratetype);
+	ret = isolate_single_pageblock(isolate_start, mode, flags, false,
+			skip_isolation);
 	if (ret)
 		return ret;
 
@@ -518,8 +513,8 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 		skip_isolation = true;
 
 	/* isolate [isolate_end - pageblock_nr_pages, isolate_end) pageblock */
-	ret = isolate_single_pageblock(isolate_end, flags, true,
-			skip_isolation, migratetype);
+	ret = isolate_single_pageblock(isolate_end, mode, flags, true,
+			skip_isolation);
 	if (ret) {
 		unset_migratetype_isolate(pfn_to_page(isolate_start));
 		return ret;
@@ -530,7 +525,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
 	     pfn < isolate_end - pageblock_nr_pages;
 	     pfn += pageblock_nr_pages) {
 		page = __first_valid_page(pfn, pageblock_nr_pages);
-		if (page && set_migratetype_isolate(page, migratetype, flags,
+		if (page && set_migratetype_isolate(page, mode, flags,
 					start_pfn, end_pfn)) {
 			undo_isolate_page_range(isolate_start, pfn);
 			unset_migratetype_isolate(
@@ -573,7 +568,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn)
  */
 static unsigned long
 __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
-				  int flags)
+				  isolate_mode_t mode)
 {
 	struct page *page;
 
@@ -586,10 +581,10 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
 			 * simple way to verify that as VM_BUG_ON(), though.
 			 */
 			pfn += 1 << buddy_order(page);
-		else if ((flags & MEMORY_OFFLINE) && PageHWPoison(page))
+		else if ((mode == MEMORY_OFFLINE) && PageHWPoison(page))
 			/* A HWPoisoned page cannot be also PageBuddy */
 			pfn++;
-		else if ((flags & MEMORY_OFFLINE) && PageOffline(page) &&
+		else if ((mode == MEMORY_OFFLINE) && PageOffline(page) &&
 			 !page_count(page))
 			/*
 			 * The responsible driver agreed to skip PageOffline()
@@ -608,11 +603,11 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
  * test_pages_isolated - check if pageblocks in range are isolated
  * @start_pfn:		The first PFN of the isolated range
  * @end_pfn:		The first PFN *after* the isolated range
- * @isol_flags:		Testing mode flags
+ * @mode:		Testing mode
  *
  * This tests if all in the specified range are free.
  *
- * If %MEMORY_OFFLINE is specified in @flags, it will consider
+ * If %MEMORY_OFFLINE is specified in @mode, it will consider
  * poisoned and offlined pages free as well.
  *
  * Caller must ensure the requested range doesn't span zones.
@@ -620,7 +615,7 @@ __test_page_isolated_in_pageblock(unsigned long pfn, unsigned long end_pfn,
  * Returns 0 if true, -EBUSY if one or more pages are in use.
  */
 int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
-			int isol_flags)
+			isolate_mode_t mode)
 {
 	unsigned long pfn, flags;
 	struct page *page;
@@ -656,7 +651,7 @@ int test_pages_isolated(unsigned long start_pfn, unsigned long end_pfn,
 	/* Check all pages are free or marked as ISOLATED */
 	zone = page_zone(page);
 	spin_lock_irqsave(&zone->lock, flags);
-	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, isol_flags);
+	pfn = __test_page_isolated_in_pageblock(start_pfn, end_pfn, mode);
 	spin_unlock_irqrestore(&zone->lock, flags);
 
 	ret = pfn < end_pfn ? -EBUSY : 0;
-- 
2.47.2


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
@ 2025-05-12  6:25   ` kernel test robot
  2025-05-12 16:10   ` Lorenzo Stoakes
  2025-05-19  8:21   ` David Hildenbrand
  2 siblings, 0 replies; 42+ messages in thread
From: kernel test robot @ 2025-05-12  6:25 UTC (permalink / raw)
  To: Zi Yan
  Cc: oe-lkp, lkp, linux-mm, David Hildenbrand, Oscar Salvador,
	Johannes Weiner, Andrew Morton, Vlastimil Babka, Baolin Wang,
	Kirill A . Shutemov, Mel Gorman, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Richard Chang, linux-kernel, Zi Yan, oliver.sang



Hello,

kernel test robot noticed "WARNING:at_mm/page_alloc.c:#__add_to_free_list" on:

commit: c095828c286182c38cfc8837ca4fec8bc4bdb81d ("[PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()")
url: https://github.com/intel-lab-lkp/linux/commits/Zi-Yan/mm-page_isolation-make-page-isolation-a-standalone-bit/20250510-040418
patch link: https://lore.kernel.org/all/20250509200111.3372279-3-ziy@nvidia.com/
patch subject: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()

in testcase: boot

config: x86_64-kexec
compiler: clang-20
test machine: qemu-system-x86_64 -enable-kvm -cpu SandyBridge -smp 2 -m 16G

(please refer to attached dmesg/kmsg for entire log/backtrace)


+-------------------------------------------------------+------------+------------+
|                                                       | 8e72f4e133 | c095828c28 |
+-------------------------------------------------------+------------+------------+
| WARNING:at_mm/page_alloc.c:#__add_to_free_list        | 0          | 24         |
| RIP:__add_to_free_list                                | 0          | 24         |
| WARNING:at_mm/page_alloc.c:#__del_page_from_free_list | 0          | 24         |
| RIP:__del_page_from_free_list                         | 0          | 24         |
+-------------------------------------------------------+------------+------------+


If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <oliver.sang@intel.com>
| Closes: https://lore.kernel.org/oe-lkp/202505121332.74fc97c-lkp@intel.com


[    0.337813][    T0] ------------[ cut here ]------------
[    0.338214][    T0] page type is 0, passed migratetype is 2 (nr=512)
[ 0.338706][ T0] WARNING: CPU: 0 PID: 0 at mm/page_alloc.c:703 __add_to_free_list (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:701) 
[    0.339381][    T0] Modules linked in:
[    0.339666][    T0] CPU: 0 UID: 0 PID: 0 Comm: swapper/0 Not tainted 6.15.0-rc5-next-20250509-00002-gc095828c2861 #1 PREEMPT(voluntary)
[    0.340589][    T0] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS 1.16.2-debian-1.16.2-1 04/01/2014
[ 0.341361][ T0] RIP: 0010:__add_to_free_list (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:701) 
[ 0.341789][ T0] Code: 48 c1 fe 06 ba 87 00 00 00 e8 53 5f ff ff 84 c0 be 05 00 00 00 48 0f 49 f0 48 c7 c7 b3 9b 7b 82 89 ea 44 89 f1 e8 b7 71 cd ff <0f> 0b e9 35 ff ff ff 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90
All code
========
   0:	48 c1 fe 06          	sar    $0x6,%rsi
   4:	ba 87 00 00 00       	mov    $0x87,%edx
   9:	e8 53 5f ff ff       	call   0xffffffffffff5f61
   e:	84 c0                	test   %al,%al
  10:	be 05 00 00 00       	mov    $0x5,%esi
  15:	48 0f 49 f0          	cmovns %rax,%rsi
  19:	48 c7 c7 b3 9b 7b 82 	mov    $0xffffffff827b9bb3,%rdi
  20:	89 ea                	mov    %ebp,%edx
  22:	44 89 f1             	mov    %r14d,%ecx
  25:	e8 b7 71 cd ff       	call   0xffffffffffcd71e1
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	e9 35 ff ff ff       	jmp    0xffffffffffffff66
  31:	90                   	nop
  32:	90                   	nop
  33:	90                   	nop
  34:	90                   	nop
  35:	90                   	nop
  36:	90                   	nop
  37:	90                   	nop
  38:	90                   	nop
  39:	90                   	nop
  3a:	90                   	nop
  3b:	90                   	nop
  3c:	90                   	nop
  3d:	90                   	nop
  3e:	90                   	nop
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	e9 35 ff ff ff       	jmp    0xffffffffffffff3c
   7:	90                   	nop
   8:	90                   	nop
   9:	90                   	nop
   a:	90                   	nop
   b:	90                   	nop
   c:	90                   	nop
   d:	90                   	nop
   e:	90                   	nop
   f:	90                   	nop
  10:	90                   	nop
  11:	90                   	nop
  12:	90                   	nop
  13:	90                   	nop
  14:	90                   	nop
  15:	90                   	nop
[    0.343261][    T0] RSP: 0000:ffffffff82a038c8 EFLAGS: 00010046
[    0.343707][    T0] RAX: 0000000000000000 RBX: ffff88843fff1528 RCX: 0000000000000001
[    0.344296][    T0] RDX: ffffffff82a036b8 RSI: 00000000ffff7fff RDI: 0000000000000001
[    0.344885][    T0] RBP: 0000000000000002 R08: 0000000000000000 R09: ffffffff82a036b0
[    0.345473][    T0] R10: 00000000ffff7fff R11: 0000000000000001 R12: ffffea0004328000
[    0.346062][    T0] R13: 0000000000000002 R14: 0000000000000200 R15: 0000000000000009
[    0.346650][    T0] FS:  0000000000000000(0000) GS:ffff8884ac41b000(0000) knlGS:0000000000000000
[    0.347330][    T0] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[    0.347813][    T0] CR2: ffff88843ffff000 CR3: 0000000002a30000 CR4: 00000000000000b0
[    0.348625][    T0] DR0: 0000000000000000 DR1: 0000000000000000 DR2: 0000000000000000
[    0.349523][    T0] DR3: 0000000000000000 DR6: 00000000fffe0ff0 DR7: 0000000000000400
[    0.350424][    T0] Call Trace:
[    0.350791][    T0]  <TASK>
[ 0.351122][ T0] try_to_claim_block (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:616) 
[ 0.351662][ T0] __rmqueue_pcplist (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:2280) 
[ 0.352211][ T0] ? string (kbuild/obj/consumer/x86_64-kexec/lib/vsprintf.c:718) 
[ 0.352641][ T0] ? string (kbuild/obj/consumer/x86_64-kexec/lib/vsprintf.c:718) 
[ 0.353067][ T0] ? move_right (kbuild/obj/consumer/x86_64-kexec/lib/vsprintf.c:?) 
[ 0.353553][ T0] get_page_from_freelist (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:3178) 
[ 0.354182][ T0] ? sprintf (kbuild/obj/consumer/x86_64-kexec/lib/vsprintf.c:3039) 
[ 0.354632][ T0] ? prb_first_seq (kbuild/obj/consumer/x86_64-kexec/kernel/printk/printk_ringbuffer.c:1963) 
[ 0.355163][ T0] __alloc_frozen_pages_noprof (kbuild/obj/consumer/x86_64-kexec/mm/page_alloc.c:5028) 
[ 0.355786][ T0] alloc_pages_mpol (kbuild/obj/consumer/x86_64-kexec/mm/mempolicy.c:2411) 
[ 0.356329][ T0] new_slab (kbuild/obj/consumer/x86_64-kexec/mm/slub.c:2454) 
[ 0.356780][ T0] ___slab_alloc (kbuild/obj/consumer/x86_64-kexec/arch/x86/include/asm/preempt.h:80 kbuild/obj/consumer/x86_64-kexec/mm/slub.c:3859) 
[ 0.357310][ T0] ? pcpu_block_refresh_hint (kbuild/obj/consumer/x86_64-kexec/include/linux/find.h:69) 
[ 0.357916][ T0] ? radix_tree_node_alloc (kbuild/obj/consumer/x86_64-kexec/include/linux/radix-tree.h:57 kbuild/obj/consumer/x86_64-kexec/lib/radix-tree.c:278) 
[ 0.358495][ T0] __slab_alloc (kbuild/obj/consumer/x86_64-kexec/arch/x86/include/asm/preempt.h:95 kbuild/obj/consumer/x86_64-kexec/mm/slub.c:3950) 
[ 0.358983][ T0] kmem_cache_alloc_noprof (kbuild/obj/consumer/x86_64-kexec/mm/slub.c:4023) 
[ 0.359609][ T0] ? radix_tree_node_alloc (kbuild/obj/consumer/x86_64-kexec/include/linux/radix-tree.h:57 kbuild/obj/consumer/x86_64-kexec/lib/radix-tree.c:278) 
[ 0.360207][ T0] radix_tree_node_alloc (kbuild/obj/consumer/x86_64-kexec/include/linux/radix-tree.h:57 kbuild/obj/consumer/x86_64-kexec/lib/radix-tree.c:278) 
[ 0.360783][ T0] idr_get_free (kbuild/obj/consumer/x86_64-kexec/lib/radix-tree.c:1508) 
[ 0.361266][ T0] idr_alloc_u32 (kbuild/obj/consumer/x86_64-kexec/include/linux/err.h:70 kbuild/obj/consumer/x86_64-kexec/lib/idr.c:47) 
[ 0.361765][ T0] idr_alloc (kbuild/obj/consumer/x86_64-kexec/lib/idr.c:88) 
[ 0.362220][ T0] init_cpu_worker_pool (kbuild/obj/consumer/x86_64-kexec/kernel/workqueue.c:714 kbuild/obj/consumer/x86_64-kexec/kernel/workqueue.c:7733) 
[ 0.362798][ T0] workqueue_init_early (kbuild/obj/consumer/x86_64-kexec/kernel/workqueue.c:7803) 
[ 0.363372][ T0] start_kernel (kbuild/obj/consumer/x86_64-kexec/init/main.c:991) 
[ 0.363882][ T0] x86_64_start_reservations (??:?) 
[ 0.364494][ T0] x86_64_start_kernel (kbuild/obj/consumer/x86_64-kexec/arch/x86/kernel/head64.c:238) 
[ 0.365039][ T0] common_startup_64 (kbuild/obj/consumer/x86_64-kexec/arch/x86/kernel/head_64.S:419) 
[    0.365589][    T0]  </TASK>
[    0.365919][    T0] ---[ end trace 0000000000000000 ]---


The kernel config and materials to reproduce are available at:
https://download.01.org/0day-ci/archive/20250512/202505121332.74fc97c-lkp@intel.com



-- 
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
  2025-05-12  6:25   ` kernel test robot
@ 2025-05-12 16:10   ` Lorenzo Stoakes
  2025-05-12 16:13     ` Zi Yan
                       ` (2 more replies)
  2025-05-19  8:21   ` David Hildenbrand
  2 siblings, 3 replies; 42+ messages in thread
From: Lorenzo Stoakes @ 2025-05-12 16:10 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

Andrew - please drop this series, it's broken in mm-new.

Zi - (as kernel bot reports actually!) I bisected a kernel splat to this
commit, triggerred by the mm/transhuge-stress test (please make sure to run
mm self tests before submitting series :)

You can trigger it manually with:

./transhuge-stress -d 20

(The same invocation run_vmtest.sh uses).

Note that this was reported in [0] (thanks to Harry Yoo for pointing this
out to me off-list! :)

[0]: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/T/#u

The decoded splat (at this commit in mm-new):

[   55.835700] ------------[ cut here ]------------
[   55.835705] page type is 0, passed migratetype is 2 (nr=32)
[   55.835720] WARNING: CPU: 2 PID: 288 at mm/page_alloc.c:727 move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
[   55.835734] Modules linked in:
[   55.835739] Tainted: [W]=WARN
[   55.835740] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
[   55.835741] RIP: 0010:move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
[ 55.835742] Code: e9 fe ff ff c6 05 f1 9b 7b 01 01 90 48 89 ef e8 11 d7 ff ff 44 89 e1 44 89 ea 48 c7 c7 58 dc 70 82 48 89 c6 e8 1c e3 e0 ff 90 <0f> 0b 90 90 e9 ba fe ff ff 66 90 90 90 90 90 90 90 90 90 90 90 90
All code
========
   0:	e9 fe ff ff c6       	jmp    0xffffffffc7000003
   5:	05 f1 9b 7b 01       	add    $0x17b9bf1,%eax
   a:	01 90 48 89 ef e8    	add    %edx,-0x171076b8(%rax)
  10:	11 d7                	adc    %edx,%edi
  12:	ff                   	(bad)
  13:	ff 44 89 e1          	incl   -0x1f(%rcx,%rcx,4)
  17:	44 89 ea             	mov    %r13d,%edx
  1a:	48 c7 c7 58 dc 70 82 	mov    $0xffffffff8270dc58,%rdi
  21:	48 89 c6             	mov    %rax,%rsi
  24:	e8 1c e3 e0 ff       	call   0xffffffffffe0e345
  29:	90                   	nop
  2a:*	0f 0b                	ud2		<-- trapping instruction
  2c:	90                   	nop
  2d:	90                   	nop
  2e:	e9 ba fe ff ff       	jmp    0xfffffffffffffeed
  33:	66 90                	xchg   %ax,%ax
  35:	90                   	nop
  36:	90                   	nop
  37:	90                   	nop
  38:	90                   	nop
  39:	90                   	nop
  3a:	90                   	nop
  3b:	90                   	nop
  3c:	90                   	nop
  3d:	90                   	nop
  3e:	90                   	nop
  3f:	90                   	nop

Code starting with the faulting instruction
===========================================
   0:	0f 0b                	ud2
   2:	90                   	nop
   3:	90                   	nop
   4:	e9 ba fe ff ff       	jmp    0xfffffffffffffec3
   9:	66 90                	xchg   %ax,%ax
   b:	90                   	nop
   c:	90                   	nop
   d:	90                   	nop
   e:	90                   	nop
   f:	90                   	nop
  10:	90                   	nop
  11:	90                   	nop
  12:	90                   	nop
  13:	90                   	nop
  14:	90                   	nop
  15:	90                   	nop
[   55.835743] RSP: 0018:ffffc900004eba20 EFLAGS: 00010086
[   55.835744] RAX: 000000000000002f RBX: ffff88826cccb080 RCX: 0000000000000027
[   55.835745] RDX: ffff888263d17b08 RSI: 0000000000000001 RDI: ffff888263d17b00
[   55.835746] RBP: ffffea0005fe0000 R08: 00000000ffffdfff R09: ffffffff82b16528
[   55.835746] R10: 80000000ffffe000 R11: 00000000ffffe000 R12: 0000000000000020
[   55.835746] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000005
[   55.835750] FS:  00007fef6a06a740(0000) GS:ffff8882e08a0000(0000) knlGS:0000000000000000
[   55.835751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[   55.835751] CR2: 00007fee20c00000 CR3: 0000000179321000 CR4: 0000000000750ef0
[   55.835751] PKRU: 55555554
[   55.835752] Call Trace:
[   55.835755]  <TASK>
[   55.835756] __move_freepages_block (mm/page_alloc.c:1849)
[   55.835758] try_to_claim_block (mm/page_alloc.c:452 (discriminator 3) mm/page_alloc.c:2231 (discriminator 3))
[   55.835759] __rmqueue_pcplist (mm/page_alloc.c:2287 mm/page_alloc.c:2383 mm/page_alloc.c:2422 mm/page_alloc.c:3140)
[   55.835760] get_page_from_freelist (./include/linux/spinlock.h:391 mm/page_alloc.c:3183 mm/page_alloc.c:3213 mm/page_alloc.c:3739)
[   55.835761] __alloc_frozen_pages_noprof (mm/page_alloc.c:5032)
[   55.835763] ? __blk_flush_plug (block/blk-core.c:1227 (discriminator 2))
[   55.835766] alloc_pages_mpol (mm/mempolicy.c:2413)
[   55.835768] vma_alloc_folio_noprof (mm/mempolicy.c:2432 mm/mempolicy.c:2465)
[   55.835769] ? __pte_alloc (mm/memory.c:444)
[   55.835771] do_anonymous_page (mm/memory.c:1064 (discriminator 4) mm/memory.c:4982 (discriminator 4) mm/memory.c:5039 (discriminator 4))
[   55.835772] ? do_huge_pmd_anonymous_page (mm/huge_memory.c:1226 mm/huge_memory.c:1372)
[   55.835774] __handle_mm_fault (mm/memory.c:4197 mm/memory.c:6038 mm/memory.c:6181)
[   55.835776] handle_mm_fault (mm/memory.c:6350)
[   55.835777] do_user_addr_fault (arch/x86/mm/fault.c:1338)
[   55.835779] exc_page_fault (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:114 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
[   55.835783] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
[   55.835785] RIP: 0033:0x403824
[ 55.835786] Code: e0 0f 85 7c 01 00 00 ba 0e 00 00 00 be 00 00 20 00 48 89 c7 48 89 c3 e8 4a ea ff ff 85 c0 0f 85 51 01 00 00 8b 0d b4 49 00 00 <48> 89 1b 85 c9 0f 84 b1 00 00 00 83 e9 03 48 89 e6 ba 10 00 00 00
All code
========
   0:	e0 0f                	loopne 0x11
   2:	85 7c 01 00          	test   %edi,0x0(%rcx,%rax,1)
   6:	00 ba 0e 00 00 00    	add    %bh,0xe(%rdx)
   c:	be 00 00 20 00       	mov    $0x200000,%esi
  11:	48 89 c7             	mov    %rax,%rdi
  14:	48 89 c3             	mov    %rax,%rbx
  17:	e8 4a ea ff ff       	call   0xffffffffffffea66
  1c:	85 c0                	test   %eax,%eax
  1e:	0f 85 51 01 00 00    	jne    0x175
  24:	8b 0d b4 49 00 00    	mov    0x49b4(%rip),%ecx        # 0x49de
  2a:*	48 89 1b             	mov    %rbx,(%rbx)		<-- trapping instruction
  2d:	85 c9                	test   %ecx,%ecx
  2f:	0f 84 b1 00 00 00    	je     0xe6
  35:	83 e9 03             	sub    $0x3,%ecx
  38:	48 89 e6             	mov    %rsp,%rsi
  3b:	ba 10 00 00 00       	mov    $0x10,%edx

Code starting with the faulting instruction
===========================================
   0:	48 89 1b             	mov    %rbx,(%rbx)
   3:	85 c9                	test   %ecx,%ecx
   5:	0f 84 b1 00 00 00    	je     0xbc
   b:	83 e9 03             	sub    $0x3,%ecx
   e:	48 89 e6             	mov    %rsp,%rsi
  11:	ba 10 00 00 00       	mov    $0x10,%edx
[   55.835786] RSP: 002b:00007ffd50b1e550 EFLAGS: 00010246
[   55.835787] RAX: 0000000000000000 RBX: 00007fee20c00000 RCX: 000000000000000c
[   55.835787] RDX: 000000000000000e RSI: 0000000000200000 RDI: 00007fee20c00000
[   55.835788] RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
[   55.835788] R10: 0000000000004032 R11: 0000000000000246 R12: 00007fee20c00000
[   55.835788] R13: 00007fef6a000000 R14: 00000000323ca6b0 R15: 0000000000000fd2
[   55.835789]  </TASK>
[   55.835789] ---[ end trace 0000000000000000 ]---


On Fri, May 09, 2025 at 04:01:09PM -0400, Zi Yan wrote:
> Since migratetype is no longer overwritten during pageblock isolation,
> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
>
> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
> when a pageblock is isolated. It is used by move_freepages_block_isolate().
>
> Add pageblock_isolate_and_move_free_pages() and
> pageblock_unisolate_and_move_free_pages() to be explicit about the page
> isolation operations. Both share the common code in
> __move_freepages_block_isolate(), which is renamed from
> move_freepages_block_isolate().
>
> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
> so that one should use set_pageblock_isolate() to isolate pageblocks.
>
> Two consequential changes:
> 1. move pageblock migratetype code out of __move_freepages_block().
> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>    called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>    pageblocks are isolated afterwards. Otherwise, all online pageblocks
>    will have non-determined migratetype.
>
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>  include/linux/mmzone.h         |  4 +-
>  include/linux/page-isolation.h |  5 ++-
>  mm/memory_hotplug.c            |  7 +++-
>  mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>  mm/page_isolation.c            | 27 ++++++++-----
>  5 files changed, 82 insertions(+), 34 deletions(-)
>
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7ef01fe148ce..f66895456974 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>  extern int page_group_by_mobility_disabled;
>
>  #ifdef CONFIG_MEMORY_ISOLATION
> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>  #else
> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>  #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>  #endif
>
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 898bb788243b..b0a2af0a5357 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>  #define REPORT_FAILURE	0x2
>
>  void set_pageblock_migratetype(struct page *page, int migratetype);
> +void set_pageblock_isolate(struct page *page);
>
> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> -				  int migratetype);
> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>
>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  			     int migratetype, int flags);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b1caedbade5b..c86c47bba019 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>  	const int nid = zone_to_nid(zone);
>  	int ret;
>  	struct memory_notify arg;
> +	unsigned long isol_pfn;
>
>  	/*
>  	 * {on,off}lining is constrained to full memory sections (or more
> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>
>
>  	/* associate pfn range with the zone */
> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
> +	for (isol_pfn = pfn;
> +	     isol_pfn < pfn + nr_pages;
> +	     isol_pfn += pageblock_nr_pages)
> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
>
>  	arg.start_pfn = pfn;
>  	arg.nr_pages = nr_pages;
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 04e301fb4879..cfd37b2d992e 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -454,11 +454,9 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>  		migratetype = MIGRATE_UNMOVABLE;
>
>  #ifdef CONFIG_MEMORY_ISOLATION
> -	if (migratetype == MIGRATE_ISOLATE) {
> -		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> -				page_to_pfn(page), PB_migrate_isolate_bit);
> -		return;
> -	}
> +	VM_WARN(migratetype == MIGRATE_ISOLATE,
> +			"Use set_pageblock_isolate() for pageblock isolation");
> +	return;
>  #endif
>  	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>  				page_to_pfn(page), MIGRATETYPE_MASK);
> @@ -1819,8 +1817,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
>  #endif
>
>  /*
> - * Change the type of a block and move all its free pages to that
> - * type's freelist.
> + * Move all free pages of a block to new type's freelist. Caller needs to
> + * change the block type.
>   */
>  static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>  				  int old_mt, int new_mt)
> @@ -1852,8 +1850,6 @@ static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>  		pages_moved += 1 << order;
>  	}
>
> -	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
> -
>  	return pages_moved;
>  }
>
> @@ -1911,11 +1907,16 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>  				int old_mt, int new_mt)
>  {
>  	unsigned long start_pfn;
> +	int res;
>
>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>  		return -1;
>
> -	return __move_freepages_block(zone, start_pfn, old_mt, new_mt);
> +	res = __move_freepages_block(zone, start_pfn, old_mt, new_mt);
> +	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
> +
> +	return res;
> +
>  }
>
>  #ifdef CONFIG_MEMORY_ISOLATION
> @@ -1943,11 +1944,17 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>  	return start_pfn;
>  }
>
> +static inline void toggle_pageblock_isolate(struct page *page, bool isolate)
> +{
> +	set_pfnblock_flags_mask(page, (isolate << PB_migrate_isolate),
> +			page_to_pfn(page), PB_migrate_isolate_bit);
> +}
> +
>  /**
> - * move_freepages_block_isolate - move free pages in block for page isolation
> + * __move_freepages_block_isolate - move free pages in block for page isolation
>   * @zone: the zone
>   * @page: the pageblock page
> - * @migratetype: migratetype to set on the pageblock
> + * @isolate: to isolate the given pageblock or unisolate it
>   *
>   * This is similar to move_freepages_block(), but handles the special
>   * case encountered in page isolation, where the block of interest
> @@ -1962,10 +1969,15 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>   *
>   * Returns %true if pages could be moved, %false otherwise.
>   */
> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> -				  int migratetype)
> +static bool __move_freepages_block_isolate(struct zone *zone,
> +		struct page *page, bool isolate)
>  {
>  	unsigned long start_pfn, pfn;
> +	int from_mt;
> +	int to_mt;
> +
> +	if (isolate == (get_pageblock_migratetype(page) == MIGRATE_ISOLATE))
> +		return false;
>
>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>  		return false;
> @@ -1982,7 +1994,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>
>  		del_page_from_free_list(buddy, zone, order,
>  					get_pfnblock_migratetype(buddy, pfn));
> -		set_pageblock_migratetype(page, migratetype);
> +		toggle_pageblock_isolate(page, isolate);
>  		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
>  		return true;
>  	}
> @@ -1993,16 +2005,38 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>
>  		del_page_from_free_list(page, zone, order,
>  					get_pfnblock_migratetype(page, pfn));
> -		set_pageblock_migratetype(page, migratetype);
> +		toggle_pageblock_isolate(page, isolate);
>  		split_large_buddy(zone, page, pfn, order, FPI_NONE);
>  		return true;
>  	}
>  move:
> -	__move_freepages_block(zone, start_pfn,
> -			       get_pfnblock_migratetype(page, start_pfn),
> -			       migratetype);
> +	/* use MIGRATETYPE_NO_ISO_MASK to get the non-isolate migratetype */
> +	if (isolate) {
> +		from_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
> +				MIGRATETYPE_NO_ISO_MASK);
> +		to_mt = MIGRATE_ISOLATE;
> +	} else {
> +		from_mt = MIGRATE_ISOLATE;
> +		to_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
> +				MIGRATETYPE_NO_ISO_MASK);
> +	}
> +
> +	__move_freepages_block(zone, start_pfn, from_mt, to_mt);
> +	toggle_pageblock_isolate(pfn_to_page(start_pfn), isolate);
> +
>  	return true;
>  }
> +
> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
> +{
> +	return __move_freepages_block_isolate(zone, page, true);
> +}
> +
> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page)
> +{
> +	return __move_freepages_block_isolate(zone, page, false);
> +}
> +
>  #endif /* CONFIG_MEMORY_ISOLATION */
>
>  static void change_pageblock_range(struct page *pageblock_page,
> @@ -2194,6 +2228,7 @@ try_to_claim_block(struct zone *zone, struct page *page,
>  	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
>  			page_group_by_mobility_disabled) {
>  		__move_freepages_block(zone, start_pfn, block_type, start_type);
> +		set_pageblock_migratetype(pfn_to_page(start_pfn), start_type);
>  		return __rmqueue_smallest(zone, order, start_type);
>  	}
>
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index 751e21f6d85e..4571940f14db 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
>  	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>  			PB_migrate_isolate_bit);
>  }
> +void set_pageblock_isolate(struct page *page)
> +{
> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> +			page_to_pfn(page),
> +			PB_migrate_isolate_bit);
> +}
>
>  /*
>   * This function checks whether the range [start_pfn, end_pfn) includes
> @@ -199,7 +205,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>  	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
>  			migratetype, isol_flags);
>  	if (!unmovable) {
> -		if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
> +		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
>  			spin_unlock_irqrestore(&zone->lock, flags);
>  			return -EBUSY;
>  		}
> @@ -220,7 +226,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>  	return -EBUSY;
>  }
>
> -static void unset_migratetype_isolate(struct page *page, int migratetype)
> +static void unset_migratetype_isolate(struct page *page)
>  {
>  	struct zone *zone;
>  	unsigned long flags;
> @@ -273,10 +279,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
>  		 * Isolating this block already succeeded, so this
>  		 * should not fail on zone boundaries.
>  		 */
> -		WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
> +		WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
>  	} else {
> -		set_pageblock_migratetype(page, migratetype);
> -		__putback_isolated_page(page, order, migratetype);
> +		clear_pageblock_isolate(page);
> +		__putback_isolated_page(page, order, get_pageblock_migratetype(page));
>  	}
>  	zone->nr_isolate_pageblock--;
>  out:
> @@ -394,7 +400,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  		if (PageBuddy(page)) {
>  			int order = buddy_order(page);
>
> -			/* move_freepages_block_isolate() handled this */
> +			/* pageblock_isolate_and_move_free_pages() handled this */
>  			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
>
>  			pfn += 1UL << order;
> @@ -444,7 +450,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>  failed:
>  	/* restore the original migratetype */
>  	if (!skip_isolation)
> -		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
> +		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
>  	return -EBUSY;
>  }
>
> @@ -515,7 +521,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  	ret = isolate_single_pageblock(isolate_end, flags, true,
>  			skip_isolation, migratetype);
>  	if (ret) {
> -		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
> +		unset_migratetype_isolate(pfn_to_page(isolate_start));
>  		return ret;
>  	}
>
> @@ -528,8 +534,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  					start_pfn, end_pfn)) {
>  			undo_isolate_page_range(isolate_start, pfn, migratetype);
>  			unset_migratetype_isolate(
> -				pfn_to_page(isolate_end - pageblock_nr_pages),
> -				migratetype);
> +				pfn_to_page(isolate_end - pageblock_nr_pages));
>  			return -EBUSY;
>  		}
>  	}
> @@ -559,7 +564,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>  		page = __first_valid_page(pfn, pageblock_nr_pages);
>  		if (!page || !is_migrate_isolate_page(page))
>  			continue;
> -		unset_migratetype_isolate(page, migratetype);
> +		unset_migratetype_isolate(page);
>  	}
>  }
>  /*
> --
> 2.47.2
>
>
>

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-12 16:10   ` Lorenzo Stoakes
@ 2025-05-12 16:13     ` Zi Yan
  2025-05-12 16:19       ` Lorenzo Stoakes
  2025-05-12 22:00     ` Andrew Morton
  2025-05-12 23:20     ` Zi Yan
  2 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-12 16:13 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

On 12 May 2025, at 12:10, Lorenzo Stoakes wrote:

> Andrew - please drop this series, it's broken in mm-new.
>
> Zi - (as kernel bot reports actually!) I bisected a kernel splat to this
> commit, triggerred by the mm/transhuge-stress test (please make sure to run
> mm self tests before submitting series :)
>
> You can trigger it manually with:
>
> ./transhuge-stress -d 20

Thanks. I will fix the issue and resend.

>
> (The same invocation run_vmtest.sh uses).
>
> Note that this was reported in [0] (thanks to Harry Yoo for pointing this
> out to me off-list! :)
>
> [0]: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/T/#u
>
> The decoded splat (at this commit in mm-new):
>
> [   55.835700] ------------[ cut here ]------------
> [   55.835705] page type is 0, passed migratetype is 2 (nr=32)
> [   55.835720] WARNING: CPU: 2 PID: 288 at mm/page_alloc.c:727 move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
> [   55.835734] Modules linked in:
> [   55.835739] Tainted: [W]=WARN
> [   55.835740] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> [   55.835741] RIP: 0010:move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
> [ 55.835742] Code: e9 fe ff ff c6 05 f1 9b 7b 01 01 90 48 89 ef e8 11 d7 ff ff 44 89 e1 44 89 ea 48 c7 c7 58 dc 70 82 48 89 c6 e8 1c e3 e0 ff 90 <0f> 0b 90 90 e9 ba fe ff ff 66 90 90 90 90 90 90 90 90 90 90 90 90
> All code
> ========
>    0:	e9 fe ff ff c6       	jmp    0xffffffffc7000003
>    5:	05 f1 9b 7b 01       	add    $0x17b9bf1,%eax
>    a:	01 90 48 89 ef e8    	add    %edx,-0x171076b8(%rax)
>   10:	11 d7                	adc    %edx,%edi
>   12:	ff                   	(bad)
>   13:	ff 44 89 e1          	incl   -0x1f(%rcx,%rcx,4)
>   17:	44 89 ea             	mov    %r13d,%edx
>   1a:	48 c7 c7 58 dc 70 82 	mov    $0xffffffff8270dc58,%rdi
>   21:	48 89 c6             	mov    %rax,%rsi
>   24:	e8 1c e3 e0 ff       	call   0xffffffffffe0e345
>   29:	90                   	nop
>   2a:*	0f 0b                	ud2		<-- trapping instruction
>   2c:	90                   	nop
>   2d:	90                   	nop
>   2e:	e9 ba fe ff ff       	jmp    0xfffffffffffffeed
>   33:	66 90                	xchg   %ax,%ax
>   35:	90                   	nop
>   36:	90                   	nop
>   37:	90                   	nop
>   38:	90                   	nop
>   39:	90                   	nop
>   3a:	90                   	nop
>   3b:	90                   	nop
>   3c:	90                   	nop
>   3d:	90                   	nop
>   3e:	90                   	nop
>   3f:	90                   	nop
>
> Code starting with the faulting instruction
> ===========================================
>    0:	0f 0b                	ud2
>    2:	90                   	nop
>    3:	90                   	nop
>    4:	e9 ba fe ff ff       	jmp    0xfffffffffffffec3
>    9:	66 90                	xchg   %ax,%ax
>    b:	90                   	nop
>    c:	90                   	nop
>    d:	90                   	nop
>    e:	90                   	nop
>    f:	90                   	nop
>   10:	90                   	nop
>   11:	90                   	nop
>   12:	90                   	nop
>   13:	90                   	nop
>   14:	90                   	nop
>   15:	90                   	nop
> [   55.835743] RSP: 0018:ffffc900004eba20 EFLAGS: 00010086
> [   55.835744] RAX: 000000000000002f RBX: ffff88826cccb080 RCX: 0000000000000027
> [   55.835745] RDX: ffff888263d17b08 RSI: 0000000000000001 RDI: ffff888263d17b00
> [   55.835746] RBP: ffffea0005fe0000 R08: 00000000ffffdfff R09: ffffffff82b16528
> [   55.835746] R10: 80000000ffffe000 R11: 00000000ffffe000 R12: 0000000000000020
> [   55.835746] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000005
> [   55.835750] FS:  00007fef6a06a740(0000) GS:ffff8882e08a0000(0000) knlGS:0000000000000000
> [   55.835751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> [   55.835751] CR2: 00007fee20c00000 CR3: 0000000179321000 CR4: 0000000000750ef0
> [   55.835751] PKRU: 55555554
> [   55.835752] Call Trace:
> [   55.835755]  <TASK>
> [   55.835756] __move_freepages_block (mm/page_alloc.c:1849)
> [   55.835758] try_to_claim_block (mm/page_alloc.c:452 (discriminator 3) mm/page_alloc.c:2231 (discriminator 3))
> [   55.835759] __rmqueue_pcplist (mm/page_alloc.c:2287 mm/page_alloc.c:2383 mm/page_alloc.c:2422 mm/page_alloc.c:3140)
> [   55.835760] get_page_from_freelist (./include/linux/spinlock.h:391 mm/page_alloc.c:3183 mm/page_alloc.c:3213 mm/page_alloc.c:3739)
> [   55.835761] __alloc_frozen_pages_noprof (mm/page_alloc.c:5032)
> [   55.835763] ? __blk_flush_plug (block/blk-core.c:1227 (discriminator 2))
> [   55.835766] alloc_pages_mpol (mm/mempolicy.c:2413)
> [   55.835768] vma_alloc_folio_noprof (mm/mempolicy.c:2432 mm/mempolicy.c:2465)
> [   55.835769] ? __pte_alloc (mm/memory.c:444)
> [   55.835771] do_anonymous_page (mm/memory.c:1064 (discriminator 4) mm/memory.c:4982 (discriminator 4) mm/memory.c:5039 (discriminator 4))
> [   55.835772] ? do_huge_pmd_anonymous_page (mm/huge_memory.c:1226 mm/huge_memory.c:1372)
> [   55.835774] __handle_mm_fault (mm/memory.c:4197 mm/memory.c:6038 mm/memory.c:6181)
> [   55.835776] handle_mm_fault (mm/memory.c:6350)
> [   55.835777] do_user_addr_fault (arch/x86/mm/fault.c:1338)
> [   55.835779] exc_page_fault (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:114 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
> [   55.835783] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> [   55.835785] RIP: 0033:0x403824
> [ 55.835786] Code: e0 0f 85 7c 01 00 00 ba 0e 00 00 00 be 00 00 20 00 48 89 c7 48 89 c3 e8 4a ea ff ff 85 c0 0f 85 51 01 00 00 8b 0d b4 49 00 00 <48> 89 1b 85 c9 0f 84 b1 00 00 00 83 e9 03 48 89 e6 ba 10 00 00 00
> All code
> ========
>    0:	e0 0f                	loopne 0x11
>    2:	85 7c 01 00          	test   %edi,0x0(%rcx,%rax,1)
>    6:	00 ba 0e 00 00 00    	add    %bh,0xe(%rdx)
>    c:	be 00 00 20 00       	mov    $0x200000,%esi
>   11:	48 89 c7             	mov    %rax,%rdi
>   14:	48 89 c3             	mov    %rax,%rbx
>   17:	e8 4a ea ff ff       	call   0xffffffffffffea66
>   1c:	85 c0                	test   %eax,%eax
>   1e:	0f 85 51 01 00 00    	jne    0x175
>   24:	8b 0d b4 49 00 00    	mov    0x49b4(%rip),%ecx        # 0x49de
>   2a:*	48 89 1b             	mov    %rbx,(%rbx)		<-- trapping instruction
>   2d:	85 c9                	test   %ecx,%ecx
>   2f:	0f 84 b1 00 00 00    	je     0xe6
>   35:	83 e9 03             	sub    $0x3,%ecx
>   38:	48 89 e6             	mov    %rsp,%rsi
>   3b:	ba 10 00 00 00       	mov    $0x10,%edx
>
> Code starting with the faulting instruction
> ===========================================
>    0:	48 89 1b             	mov    %rbx,(%rbx)
>    3:	85 c9                	test   %ecx,%ecx
>    5:	0f 84 b1 00 00 00    	je     0xbc
>    b:	83 e9 03             	sub    $0x3,%ecx
>    e:	48 89 e6             	mov    %rsp,%rsi
>   11:	ba 10 00 00 00       	mov    $0x10,%edx
> [   55.835786] RSP: 002b:00007ffd50b1e550 EFLAGS: 00010246
> [   55.835787] RAX: 0000000000000000 RBX: 00007fee20c00000 RCX: 000000000000000c
> [   55.835787] RDX: 000000000000000e RSI: 0000000000200000 RDI: 00007fee20c00000
> [   55.835788] RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
> [   55.835788] R10: 0000000000004032 R11: 0000000000000246 R12: 00007fee20c00000
> [   55.835788] R13: 00007fef6a000000 R14: 00000000323ca6b0 R15: 0000000000000fd2
> [   55.835789]  </TASK>
> [   55.835789] ---[ end trace 0000000000000000 ]---
>
>
> On Fri, May 09, 2025 at 04:01:09PM -0400, Zi Yan wrote:
>> Since migratetype is no longer overwritten during pageblock isolation,
>> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
>>
>> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
>> when a pageblock is isolated. It is used by move_freepages_block_isolate().
>>
>> Add pageblock_isolate_and_move_free_pages() and
>> pageblock_unisolate_and_move_free_pages() to be explicit about the page
>> isolation operations. Both share the common code in
>> __move_freepages_block_isolate(), which is renamed from
>> move_freepages_block_isolate().
>>
>> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
>> so that one should use set_pageblock_isolate() to isolate pageblocks.
>>
>> Two consequential changes:
>> 1. move pageblock migratetype code out of __move_freepages_block().
>> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>>    called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>>    pageblocks are isolated afterwards. Otherwise, all online pageblocks
>>    will have non-determined migratetype.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>  include/linux/mmzone.h         |  4 +-
>>  include/linux/page-isolation.h |  5 ++-
>>  mm/memory_hotplug.c            |  7 +++-
>>  mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>>  mm/page_isolation.c            | 27 ++++++++-----
>>  5 files changed, 82 insertions(+), 34 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 7ef01fe148ce..f66895456974 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>>  extern int page_group_by_mobility_disabled;
>>
>>  #ifdef CONFIG_MEMORY_ISOLATION
>> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
>> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
>> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>>  #else
>> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>>  #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>>  #endif
>>
>> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
>> index 898bb788243b..b0a2af0a5357 100644
>> --- a/include/linux/page-isolation.h
>> +++ b/include/linux/page-isolation.h
>> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>>  #define REPORT_FAILURE	0x2
>>
>>  void set_pageblock_migratetype(struct page *page, int migratetype);
>> +void set_pageblock_isolate(struct page *page);
>>
>> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>> -				  int migratetype);
>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>>
>>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>  			     int migratetype, int flags);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index b1caedbade5b..c86c47bba019 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>  	const int nid = zone_to_nid(zone);
>>  	int ret;
>>  	struct memory_notify arg;
>> +	unsigned long isol_pfn;
>>
>>  	/*
>>  	 * {on,off}lining is constrained to full memory sections (or more
>> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>
>>
>>  	/* associate pfn range with the zone */
>> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
>> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
>> +	for (isol_pfn = pfn;
>> +	     isol_pfn < pfn + nr_pages;
>> +	     isol_pfn += pageblock_nr_pages)
>> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
>>
>>  	arg.start_pfn = pfn;
>>  	arg.nr_pages = nr_pages;
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index 04e301fb4879..cfd37b2d992e 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -454,11 +454,9 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>>  		migratetype = MIGRATE_UNMOVABLE;
>>
>>  #ifdef CONFIG_MEMORY_ISOLATION
>> -	if (migratetype == MIGRATE_ISOLATE) {
>> -		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>> -				page_to_pfn(page), PB_migrate_isolate_bit);
>> -		return;
>> -	}
>> +	VM_WARN(migratetype == MIGRATE_ISOLATE,
>> +			"Use set_pageblock_isolate() for pageblock isolation");
>> +	return;
>>  #endif
>>  	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>>  				page_to_pfn(page), MIGRATETYPE_MASK);
>> @@ -1819,8 +1817,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
>>  #endif
>>
>>  /*
>> - * Change the type of a block and move all its free pages to that
>> - * type's freelist.
>> + * Move all free pages of a block to new type's freelist. Caller needs to
>> + * change the block type.
>>   */
>>  static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>>  				  int old_mt, int new_mt)
>> @@ -1852,8 +1850,6 @@ static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>>  		pages_moved += 1 << order;
>>  	}
>>
>> -	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
>> -
>>  	return pages_moved;
>>  }
>>
>> @@ -1911,11 +1907,16 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>>  				int old_mt, int new_mt)
>>  {
>>  	unsigned long start_pfn;
>> +	int res;
>>
>>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>>  		return -1;
>>
>> -	return __move_freepages_block(zone, start_pfn, old_mt, new_mt);
>> +	res = __move_freepages_block(zone, start_pfn, old_mt, new_mt);
>> +	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
>> +
>> +	return res;
>> +
>>  }
>>
>>  #ifdef CONFIG_MEMORY_ISOLATION
>> @@ -1943,11 +1944,17 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>>  	return start_pfn;
>>  }
>>
>> +static inline void toggle_pageblock_isolate(struct page *page, bool isolate)
>> +{
>> +	set_pfnblock_flags_mask(page, (isolate << PB_migrate_isolate),
>> +			page_to_pfn(page), PB_migrate_isolate_bit);
>> +}
>> +
>>  /**
>> - * move_freepages_block_isolate - move free pages in block for page isolation
>> + * __move_freepages_block_isolate - move free pages in block for page isolation
>>   * @zone: the zone
>>   * @page: the pageblock page
>> - * @migratetype: migratetype to set on the pageblock
>> + * @isolate: to isolate the given pageblock or unisolate it
>>   *
>>   * This is similar to move_freepages_block(), but handles the special
>>   * case encountered in page isolation, where the block of interest
>> @@ -1962,10 +1969,15 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>>   *
>>   * Returns %true if pages could be moved, %false otherwise.
>>   */
>> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>> -				  int migratetype)
>> +static bool __move_freepages_block_isolate(struct zone *zone,
>> +		struct page *page, bool isolate)
>>  {
>>  	unsigned long start_pfn, pfn;
>> +	int from_mt;
>> +	int to_mt;
>> +
>> +	if (isolate == (get_pageblock_migratetype(page) == MIGRATE_ISOLATE))
>> +		return false;
>>
>>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>>  		return false;
>> @@ -1982,7 +1994,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>
>>  		del_page_from_free_list(buddy, zone, order,
>>  					get_pfnblock_migratetype(buddy, pfn));
>> -		set_pageblock_migratetype(page, migratetype);
>> +		toggle_pageblock_isolate(page, isolate);
>>  		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
>>  		return true;
>>  	}
>> @@ -1993,16 +2005,38 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>
>>  		del_page_from_free_list(page, zone, order,
>>  					get_pfnblock_migratetype(page, pfn));
>> -		set_pageblock_migratetype(page, migratetype);
>> +		toggle_pageblock_isolate(page, isolate);
>>  		split_large_buddy(zone, page, pfn, order, FPI_NONE);
>>  		return true;
>>  	}
>>  move:
>> -	__move_freepages_block(zone, start_pfn,
>> -			       get_pfnblock_migratetype(page, start_pfn),
>> -			       migratetype);
>> +	/* use MIGRATETYPE_NO_ISO_MASK to get the non-isolate migratetype */
>> +	if (isolate) {
>> +		from_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
>> +				MIGRATETYPE_NO_ISO_MASK);
>> +		to_mt = MIGRATE_ISOLATE;
>> +	} else {
>> +		from_mt = MIGRATE_ISOLATE;
>> +		to_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
>> +				MIGRATETYPE_NO_ISO_MASK);
>> +	}
>> +
>> +	__move_freepages_block(zone, start_pfn, from_mt, to_mt);
>> +	toggle_pageblock_isolate(pfn_to_page(start_pfn), isolate);
>> +
>>  	return true;
>>  }
>> +
>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
>> +{
>> +	return __move_freepages_block_isolate(zone, page, true);
>> +}
>> +
>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page)
>> +{
>> +	return __move_freepages_block_isolate(zone, page, false);
>> +}
>> +
>>  #endif /* CONFIG_MEMORY_ISOLATION */
>>
>>  static void change_pageblock_range(struct page *pageblock_page,
>> @@ -2194,6 +2228,7 @@ try_to_claim_block(struct zone *zone, struct page *page,
>>  	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
>>  			page_group_by_mobility_disabled) {
>>  		__move_freepages_block(zone, start_pfn, block_type, start_type);
>> +		set_pageblock_migratetype(pfn_to_page(start_pfn), start_type);
>>  		return __rmqueue_smallest(zone, order, start_type);
>>  	}
>>
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index 751e21f6d85e..4571940f14db 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
>>  	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>>  			PB_migrate_isolate_bit);
>>  }
>> +void set_pageblock_isolate(struct page *page)
>> +{
>> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>> +			page_to_pfn(page),
>> +			PB_migrate_isolate_bit);
>> +}
>>
>>  /*
>>   * This function checks whether the range [start_pfn, end_pfn) includes
>> @@ -199,7 +205,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>>  	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
>>  			migratetype, isol_flags);
>>  	if (!unmovable) {
>> -		if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
>> +		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
>>  			spin_unlock_irqrestore(&zone->lock, flags);
>>  			return -EBUSY;
>>  		}
>> @@ -220,7 +226,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>>  	return -EBUSY;
>>  }
>>
>> -static void unset_migratetype_isolate(struct page *page, int migratetype)
>> +static void unset_migratetype_isolate(struct page *page)
>>  {
>>  	struct zone *zone;
>>  	unsigned long flags;
>> @@ -273,10 +279,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
>>  		 * Isolating this block already succeeded, so this
>>  		 * should not fail on zone boundaries.
>>  		 */
>> -		WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
>> +		WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
>>  	} else {
>> -		set_pageblock_migratetype(page, migratetype);
>> -		__putback_isolated_page(page, order, migratetype);
>> +		clear_pageblock_isolate(page);
>> +		__putback_isolated_page(page, order, get_pageblock_migratetype(page));
>>  	}
>>  	zone->nr_isolate_pageblock--;
>>  out:
>> @@ -394,7 +400,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>  		if (PageBuddy(page)) {
>>  			int order = buddy_order(page);
>>
>> -			/* move_freepages_block_isolate() handled this */
>> +			/* pageblock_isolate_and_move_free_pages() handled this */
>>  			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
>>
>>  			pfn += 1UL << order;
>> @@ -444,7 +450,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>  failed:
>>  	/* restore the original migratetype */
>>  	if (!skip_isolation)
>> -		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
>> +		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
>>  	return -EBUSY;
>>  }
>>
>> @@ -515,7 +521,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>  	ret = isolate_single_pageblock(isolate_end, flags, true,
>>  			skip_isolation, migratetype);
>>  	if (ret) {
>> -		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
>> +		unset_migratetype_isolate(pfn_to_page(isolate_start));
>>  		return ret;
>>  	}
>>
>> @@ -528,8 +534,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>  					start_pfn, end_pfn)) {
>>  			undo_isolate_page_range(isolate_start, pfn, migratetype);
>>  			unset_migratetype_isolate(
>> -				pfn_to_page(isolate_end - pageblock_nr_pages),
>> -				migratetype);
>> +				pfn_to_page(isolate_end - pageblock_nr_pages));
>>  			return -EBUSY;
>>  		}
>>  	}
>> @@ -559,7 +564,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>  		page = __first_valid_page(pfn, pageblock_nr_pages);
>>  		if (!page || !is_migrate_isolate_page(page))
>>  			continue;
>> -		unset_migratetype_isolate(page, migratetype);
>> +		unset_migratetype_isolate(page);
>>  	}
>>  }
>>  /*
>> --
>> 2.47.2
>>
>>
>>


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-12 16:13     ` Zi Yan
@ 2025-05-12 16:19       ` Lorenzo Stoakes
  2025-05-12 16:28         ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: Lorenzo Stoakes @ 2025-05-12 16:19 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

On Mon, May 12, 2025 at 12:13:35PM -0400, Zi Yan wrote:
> On 12 May 2025, at 12:10, Lorenzo Stoakes wrote:
>
> > Andrew - please drop this series, it's broken in mm-new.
> >
> > Zi - (as kernel bot reports actually!) I bisected a kernel splat to this
> > commit, triggerred by the mm/transhuge-stress test (please make sure to run
> > mm self tests before submitting series :)
> >
> > You can trigger it manually with:
> >
> > ./transhuge-stress -d 20
>
> Thanks. I will fix the issue and resend.

Thanks :)

Sorry re-reading the 'please make sure to run mm self tests' comment sounds
more snarky thank I intended, and I've definitely forgotten to do it
sometimes myself, but obviously a useful thing to do :P

I wonder if the issue I mention below is related, actually, unless they're
running your series on top of v6.15-rc5...

I pinged there anyway just in case.

Cheers, Lorenzo

>
> >
> > (The same invocation run_vmtest.sh uses).
> >
> > Note that this was reported in [0] (thanks to Harry Yoo for pointing this
> > out to me off-list! :)
> >
> > [0]: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/T/#u
> >
> > The decoded splat (at this commit in mm-new):
> >
> > [   55.835700] ------------[ cut here ]------------
> > [   55.835705] page type is 0, passed migratetype is 2 (nr=32)
> > [   55.835720] WARNING: CPU: 2 PID: 288 at mm/page_alloc.c:727 move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
> > [   55.835734] Modules linked in:
> > [   55.835739] Tainted: [W]=WARN
> > [   55.835740] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
> > [   55.835741] RIP: 0010:move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
> > [ 55.835742] Code: e9 fe ff ff c6 05 f1 9b 7b 01 01 90 48 89 ef e8 11 d7 ff ff 44 89 e1 44 89 ea 48 c7 c7 58 dc 70 82 48 89 c6 e8 1c e3 e0 ff 90 <0f> 0b 90 90 e9 ba fe ff ff 66 90 90 90 90 90 90 90 90 90 90 90 90
> > All code
> > ========
> >    0:	e9 fe ff ff c6       	jmp    0xffffffffc7000003
> >    5:	05 f1 9b 7b 01       	add    $0x17b9bf1,%eax
> >    a:	01 90 48 89 ef e8    	add    %edx,-0x171076b8(%rax)
> >   10:	11 d7                	adc    %edx,%edi
> >   12:	ff                   	(bad)
> >   13:	ff 44 89 e1          	incl   -0x1f(%rcx,%rcx,4)
> >   17:	44 89 ea             	mov    %r13d,%edx
> >   1a:	48 c7 c7 58 dc 70 82 	mov    $0xffffffff8270dc58,%rdi
> >   21:	48 89 c6             	mov    %rax,%rsi
> >   24:	e8 1c e3 e0 ff       	call   0xffffffffffe0e345
> >   29:	90                   	nop
> >   2a:*	0f 0b                	ud2		<-- trapping instruction
> >   2c:	90                   	nop
> >   2d:	90                   	nop
> >   2e:	e9 ba fe ff ff       	jmp    0xfffffffffffffeed
> >   33:	66 90                	xchg   %ax,%ax
> >   35:	90                   	nop
> >   36:	90                   	nop
> >   37:	90                   	nop
> >   38:	90                   	nop
> >   39:	90                   	nop
> >   3a:	90                   	nop
> >   3b:	90                   	nop
> >   3c:	90                   	nop
> >   3d:	90                   	nop
> >   3e:	90                   	nop
> >   3f:	90                   	nop
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0:	0f 0b                	ud2
> >    2:	90                   	nop
> >    3:	90                   	nop
> >    4:	e9 ba fe ff ff       	jmp    0xfffffffffffffec3
> >    9:	66 90                	xchg   %ax,%ax
> >    b:	90                   	nop
> >    c:	90                   	nop
> >    d:	90                   	nop
> >    e:	90                   	nop
> >    f:	90                   	nop
> >   10:	90                   	nop
> >   11:	90                   	nop
> >   12:	90                   	nop
> >   13:	90                   	nop
> >   14:	90                   	nop
> >   15:	90                   	nop
> > [   55.835743] RSP: 0018:ffffc900004eba20 EFLAGS: 00010086
> > [   55.835744] RAX: 000000000000002f RBX: ffff88826cccb080 RCX: 0000000000000027
> > [   55.835745] RDX: ffff888263d17b08 RSI: 0000000000000001 RDI: ffff888263d17b00
> > [   55.835746] RBP: ffffea0005fe0000 R08: 00000000ffffdfff R09: ffffffff82b16528
> > [   55.835746] R10: 80000000ffffe000 R11: 00000000ffffe000 R12: 0000000000000020
> > [   55.835746] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000005
> > [   55.835750] FS:  00007fef6a06a740(0000) GS:ffff8882e08a0000(0000) knlGS:0000000000000000
> > [   55.835751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
> > [   55.835751] CR2: 00007fee20c00000 CR3: 0000000179321000 CR4: 0000000000750ef0
> > [   55.835751] PKRU: 55555554
> > [   55.835752] Call Trace:
> > [   55.835755]  <TASK>
> > [   55.835756] __move_freepages_block (mm/page_alloc.c:1849)
> > [   55.835758] try_to_claim_block (mm/page_alloc.c:452 (discriminator 3) mm/page_alloc.c:2231 (discriminator 3))
> > [   55.835759] __rmqueue_pcplist (mm/page_alloc.c:2287 mm/page_alloc.c:2383 mm/page_alloc.c:2422 mm/page_alloc.c:3140)
> > [   55.835760] get_page_from_freelist (./include/linux/spinlock.h:391 mm/page_alloc.c:3183 mm/page_alloc.c:3213 mm/page_alloc.c:3739)
> > [   55.835761] __alloc_frozen_pages_noprof (mm/page_alloc.c:5032)
> > [   55.835763] ? __blk_flush_plug (block/blk-core.c:1227 (discriminator 2))
> > [   55.835766] alloc_pages_mpol (mm/mempolicy.c:2413)
> > [   55.835768] vma_alloc_folio_noprof (mm/mempolicy.c:2432 mm/mempolicy.c:2465)
> > [   55.835769] ? __pte_alloc (mm/memory.c:444)
> > [   55.835771] do_anonymous_page (mm/memory.c:1064 (discriminator 4) mm/memory.c:4982 (discriminator 4) mm/memory.c:5039 (discriminator 4))
> > [   55.835772] ? do_huge_pmd_anonymous_page (mm/huge_memory.c:1226 mm/huge_memory.c:1372)
> > [   55.835774] __handle_mm_fault (mm/memory.c:4197 mm/memory.c:6038 mm/memory.c:6181)
> > [   55.835776] handle_mm_fault (mm/memory.c:6350)
> > [   55.835777] do_user_addr_fault (arch/x86/mm/fault.c:1338)
> > [   55.835779] exc_page_fault (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:114 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
> > [   55.835783] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
> > [   55.835785] RIP: 0033:0x403824
> > [ 55.835786] Code: e0 0f 85 7c 01 00 00 ba 0e 00 00 00 be 00 00 20 00 48 89 c7 48 89 c3 e8 4a ea ff ff 85 c0 0f 85 51 01 00 00 8b 0d b4 49 00 00 <48> 89 1b 85 c9 0f 84 b1 00 00 00 83 e9 03 48 89 e6 ba 10 00 00 00
> > All code
> > ========
> >    0:	e0 0f                	loopne 0x11
> >    2:	85 7c 01 00          	test   %edi,0x0(%rcx,%rax,1)
> >    6:	00 ba 0e 00 00 00    	add    %bh,0xe(%rdx)
> >    c:	be 00 00 20 00       	mov    $0x200000,%esi
> >   11:	48 89 c7             	mov    %rax,%rdi
> >   14:	48 89 c3             	mov    %rax,%rbx
> >   17:	e8 4a ea ff ff       	call   0xffffffffffffea66
> >   1c:	85 c0                	test   %eax,%eax
> >   1e:	0f 85 51 01 00 00    	jne    0x175
> >   24:	8b 0d b4 49 00 00    	mov    0x49b4(%rip),%ecx        # 0x49de
> >   2a:*	48 89 1b             	mov    %rbx,(%rbx)		<-- trapping instruction
> >   2d:	85 c9                	test   %ecx,%ecx
> >   2f:	0f 84 b1 00 00 00    	je     0xe6
> >   35:	83 e9 03             	sub    $0x3,%ecx
> >   38:	48 89 e6             	mov    %rsp,%rsi
> >   3b:	ba 10 00 00 00       	mov    $0x10,%edx
> >
> > Code starting with the faulting instruction
> > ===========================================
> >    0:	48 89 1b             	mov    %rbx,(%rbx)
> >    3:	85 c9                	test   %ecx,%ecx
> >    5:	0f 84 b1 00 00 00    	je     0xbc
> >    b:	83 e9 03             	sub    $0x3,%ecx
> >    e:	48 89 e6             	mov    %rsp,%rsi
> >   11:	ba 10 00 00 00       	mov    $0x10,%edx
> > [   55.835786] RSP: 002b:00007ffd50b1e550 EFLAGS: 00010246
> > [   55.835787] RAX: 0000000000000000 RBX: 00007fee20c00000 RCX: 000000000000000c
> > [   55.835787] RDX: 000000000000000e RSI: 0000000000200000 RDI: 00007fee20c00000
> > [   55.835788] RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
> > [   55.835788] R10: 0000000000004032 R11: 0000000000000246 R12: 00007fee20c00000
> > [   55.835788] R13: 00007fef6a000000 R14: 00000000323ca6b0 R15: 0000000000000fd2
> > [   55.835789]  </TASK>
> > [   55.835789] ---[ end trace 0000000000000000 ]---
> >
> >
> > On Fri, May 09, 2025 at 04:01:09PM -0400, Zi Yan wrote:
> >> Since migratetype is no longer overwritten during pageblock isolation,
> >> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
> >>
> >> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
> >> when a pageblock is isolated. It is used by move_freepages_block_isolate().
> >>
> >> Add pageblock_isolate_and_move_free_pages() and
> >> pageblock_unisolate_and_move_free_pages() to be explicit about the page
> >> isolation operations. Both share the common code in
> >> __move_freepages_block_isolate(), which is renamed from
> >> move_freepages_block_isolate().
> >>
> >> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
> >> so that one should use set_pageblock_isolate() to isolate pageblocks.
> >>
> >> Two consequential changes:
> >> 1. move pageblock migratetype code out of __move_freepages_block().
> >> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
> >>    called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
> >>    pageblocks are isolated afterwards. Otherwise, all online pageblocks
> >>    will have non-determined migratetype.
> >>
> >> Signed-off-by: Zi Yan <ziy@nvidia.com>
> >> ---
> >>  include/linux/mmzone.h         |  4 +-
> >>  include/linux/page-isolation.h |  5 ++-
> >>  mm/memory_hotplug.c            |  7 +++-
> >>  mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
> >>  mm/page_isolation.c            | 27 ++++++++-----
> >>  5 files changed, 82 insertions(+), 34 deletions(-)
> >>
> >> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> >> index 7ef01fe148ce..f66895456974 100644
> >> --- a/include/linux/mmzone.h
> >> +++ b/include/linux/mmzone.h
> >> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
> >>  extern int page_group_by_mobility_disabled;
> >>
> >>  #ifdef CONFIG_MEMORY_ISOLATION
> >> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
> >> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
> >> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
> >>  #else
> >> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
> >>  #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
> >>  #endif
> >>
> >> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> >> index 898bb788243b..b0a2af0a5357 100644
> >> --- a/include/linux/page-isolation.h
> >> +++ b/include/linux/page-isolation.h
> >> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
> >>  #define REPORT_FAILURE	0x2
> >>
> >>  void set_pageblock_migratetype(struct page *page, int migratetype);
> >> +void set_pageblock_isolate(struct page *page);
> >>
> >> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> >> -				  int migratetype);
> >> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
> >> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
> >>
> >>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  			     int migratetype, int flags);
> >> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> >> index b1caedbade5b..c86c47bba019 100644
> >> --- a/mm/memory_hotplug.c
> >> +++ b/mm/memory_hotplug.c
> >> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
> >>  	const int nid = zone_to_nid(zone);
> >>  	int ret;
> >>  	struct memory_notify arg;
> >> +	unsigned long isol_pfn;
> >>
> >>  	/*
> >>  	 * {on,off}lining is constrained to full memory sections (or more
> >> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
> >>
> >>
> >>  	/* associate pfn range with the zone */
> >> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> >> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
> >> +	for (isol_pfn = pfn;
> >> +	     isol_pfn < pfn + nr_pages;
> >> +	     isol_pfn += pageblock_nr_pages)
> >> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
> >>
> >>  	arg.start_pfn = pfn;
> >>  	arg.nr_pages = nr_pages;
> >> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> >> index 04e301fb4879..cfd37b2d992e 100644
> >> --- a/mm/page_alloc.c
> >> +++ b/mm/page_alloc.c
> >> @@ -454,11 +454,9 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
> >>  		migratetype = MIGRATE_UNMOVABLE;
> >>
> >>  #ifdef CONFIG_MEMORY_ISOLATION
> >> -	if (migratetype == MIGRATE_ISOLATE) {
> >> -		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> >> -				page_to_pfn(page), PB_migrate_isolate_bit);
> >> -		return;
> >> -	}
> >> +	VM_WARN(migratetype == MIGRATE_ISOLATE,
> >> +			"Use set_pageblock_isolate() for pageblock isolation");
> >> +	return;
> >>  #endif
> >>  	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
> >>  				page_to_pfn(page), MIGRATETYPE_MASK);
> >> @@ -1819,8 +1817,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
> >>  #endif
> >>
> >>  /*
> >> - * Change the type of a block and move all its free pages to that
> >> - * type's freelist.
> >> + * Move all free pages of a block to new type's freelist. Caller needs to
> >> + * change the block type.
> >>   */
> >>  static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
> >>  				  int old_mt, int new_mt)
> >> @@ -1852,8 +1850,6 @@ static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
> >>  		pages_moved += 1 << order;
> >>  	}
> >>
> >> -	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
> >> -
> >>  	return pages_moved;
> >>  }
> >>
> >> @@ -1911,11 +1907,16 @@ static int move_freepages_block(struct zone *zone, struct page *page,
> >>  				int old_mt, int new_mt)
> >>  {
> >>  	unsigned long start_pfn;
> >> +	int res;
> >>
> >>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
> >>  		return -1;
> >>
> >> -	return __move_freepages_block(zone, start_pfn, old_mt, new_mt);
> >> +	res = __move_freepages_block(zone, start_pfn, old_mt, new_mt);
> >> +	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
> >> +
> >> +	return res;
> >> +
> >>  }
> >>
> >>  #ifdef CONFIG_MEMORY_ISOLATION
> >> @@ -1943,11 +1944,17 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
> >>  	return start_pfn;
> >>  }
> >>
> >> +static inline void toggle_pageblock_isolate(struct page *page, bool isolate)
> >> +{
> >> +	set_pfnblock_flags_mask(page, (isolate << PB_migrate_isolate),
> >> +			page_to_pfn(page), PB_migrate_isolate_bit);
> >> +}
> >> +
> >>  /**
> >> - * move_freepages_block_isolate - move free pages in block for page isolation
> >> + * __move_freepages_block_isolate - move free pages in block for page isolation
> >>   * @zone: the zone
> >>   * @page: the pageblock page
> >> - * @migratetype: migratetype to set on the pageblock
> >> + * @isolate: to isolate the given pageblock or unisolate it
> >>   *
> >>   * This is similar to move_freepages_block(), but handles the special
> >>   * case encountered in page isolation, where the block of interest
> >> @@ -1962,10 +1969,15 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
> >>   *
> >>   * Returns %true if pages could be moved, %false otherwise.
> >>   */
> >> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> >> -				  int migratetype)
> >> +static bool __move_freepages_block_isolate(struct zone *zone,
> >> +		struct page *page, bool isolate)
> >>  {
> >>  	unsigned long start_pfn, pfn;
> >> +	int from_mt;
> >> +	int to_mt;
> >> +
> >> +	if (isolate == (get_pageblock_migratetype(page) == MIGRATE_ISOLATE))
> >> +		return false;
> >>
> >>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
> >>  		return false;
> >> @@ -1982,7 +1994,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> >>
> >>  		del_page_from_free_list(buddy, zone, order,
> >>  					get_pfnblock_migratetype(buddy, pfn));
> >> -		set_pageblock_migratetype(page, migratetype);
> >> +		toggle_pageblock_isolate(page, isolate);
> >>  		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
> >>  		return true;
> >>  	}
> >> @@ -1993,16 +2005,38 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> >>
> >>  		del_page_from_free_list(page, zone, order,
> >>  					get_pfnblock_migratetype(page, pfn));
> >> -		set_pageblock_migratetype(page, migratetype);
> >> +		toggle_pageblock_isolate(page, isolate);
> >>  		split_large_buddy(zone, page, pfn, order, FPI_NONE);
> >>  		return true;
> >>  	}
> >>  move:
> >> -	__move_freepages_block(zone, start_pfn,
> >> -			       get_pfnblock_migratetype(page, start_pfn),
> >> -			       migratetype);
> >> +	/* use MIGRATETYPE_NO_ISO_MASK to get the non-isolate migratetype */
> >> +	if (isolate) {
> >> +		from_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
> >> +				MIGRATETYPE_NO_ISO_MASK);
> >> +		to_mt = MIGRATE_ISOLATE;
> >> +	} else {
> >> +		from_mt = MIGRATE_ISOLATE;
> >> +		to_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
> >> +				MIGRATETYPE_NO_ISO_MASK);
> >> +	}
> >> +
> >> +	__move_freepages_block(zone, start_pfn, from_mt, to_mt);
> >> +	toggle_pageblock_isolate(pfn_to_page(start_pfn), isolate);
> >> +
> >>  	return true;
> >>  }
> >> +
> >> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
> >> +{
> >> +	return __move_freepages_block_isolate(zone, page, true);
> >> +}
> >> +
> >> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page)
> >> +{
> >> +	return __move_freepages_block_isolate(zone, page, false);
> >> +}
> >> +
> >>  #endif /* CONFIG_MEMORY_ISOLATION */
> >>
> >>  static void change_pageblock_range(struct page *pageblock_page,
> >> @@ -2194,6 +2228,7 @@ try_to_claim_block(struct zone *zone, struct page *page,
> >>  	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
> >>  			page_group_by_mobility_disabled) {
> >>  		__move_freepages_block(zone, start_pfn, block_type, start_type);
> >> +		set_pageblock_migratetype(pfn_to_page(start_pfn), start_type);
> >>  		return __rmqueue_smallest(zone, order, start_type);
> >>  	}
> >>
> >> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> >> index 751e21f6d85e..4571940f14db 100644
> >> --- a/mm/page_isolation.c
> >> +++ b/mm/page_isolation.c
> >> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
> >>  	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
> >>  			PB_migrate_isolate_bit);
> >>  }
> >> +void set_pageblock_isolate(struct page *page)
> >> +{
> >> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> >> +			page_to_pfn(page),
> >> +			PB_migrate_isolate_bit);
> >> +}
> >>
> >>  /*
> >>   * This function checks whether the range [start_pfn, end_pfn) includes
> >> @@ -199,7 +205,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
> >>  	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
> >>  			migratetype, isol_flags);
> >>  	if (!unmovable) {
> >> -		if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
> >> +		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
> >>  			spin_unlock_irqrestore(&zone->lock, flags);
> >>  			return -EBUSY;
> >>  		}
> >> @@ -220,7 +226,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
> >>  	return -EBUSY;
> >>  }
> >>
> >> -static void unset_migratetype_isolate(struct page *page, int migratetype)
> >> +static void unset_migratetype_isolate(struct page *page)
> >>  {
> >>  	struct zone *zone;
> >>  	unsigned long flags;
> >> @@ -273,10 +279,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
> >>  		 * Isolating this block already succeeded, so this
> >>  		 * should not fail on zone boundaries.
> >>  		 */
> >> -		WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
> >> +		WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
> >>  	} else {
> >> -		set_pageblock_migratetype(page, migratetype);
> >> -		__putback_isolated_page(page, order, migratetype);
> >> +		clear_pageblock_isolate(page);
> >> +		__putback_isolated_page(page, order, get_pageblock_migratetype(page));
> >>  	}
> >>  	zone->nr_isolate_pageblock--;
> >>  out:
> >> @@ -394,7 +400,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
> >>  		if (PageBuddy(page)) {
> >>  			int order = buddy_order(page);
> >>
> >> -			/* move_freepages_block_isolate() handled this */
> >> +			/* pageblock_isolate_and_move_free_pages() handled this */
> >>  			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
> >>
> >>  			pfn += 1UL << order;
> >> @@ -444,7 +450,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
> >>  failed:
> >>  	/* restore the original migratetype */
> >>  	if (!skip_isolation)
> >> -		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
> >> +		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
> >>  	return -EBUSY;
> >>  }
> >>
> >> @@ -515,7 +521,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  	ret = isolate_single_pageblock(isolate_end, flags, true,
> >>  			skip_isolation, migratetype);
> >>  	if (ret) {
> >> -		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
> >> +		unset_migratetype_isolate(pfn_to_page(isolate_start));
> >>  		return ret;
> >>  	}
> >>
> >> @@ -528,8 +534,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  					start_pfn, end_pfn)) {
> >>  			undo_isolate_page_range(isolate_start, pfn, migratetype);
> >>  			unset_migratetype_isolate(
> >> -				pfn_to_page(isolate_end - pageblock_nr_pages),
> >> -				migratetype);
> >> +				pfn_to_page(isolate_end - pageblock_nr_pages));
> >>  			return -EBUSY;
> >>  		}
> >>  	}
> >> @@ -559,7 +564,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> >>  		page = __first_valid_page(pfn, pageblock_nr_pages);
> >>  		if (!page || !is_migrate_isolate_page(page))
> >>  			continue;
> >> -		unset_migratetype_isolate(page, migratetype);
> >> +		unset_migratetype_isolate(page);
> >>  	}
> >>  }
> >>  /*
> >> --
> >> 2.47.2
> >>
> >>
> >>
>
>
> --
> Best Regards,
> Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-12 16:19       ` Lorenzo Stoakes
@ 2025-05-12 16:28         ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-12 16:28 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

On 12 May 2025, at 12:19, Lorenzo Stoakes wrote:

> On Mon, May 12, 2025 at 12:13:35PM -0400, Zi Yan wrote:
>> On 12 May 2025, at 12:10, Lorenzo Stoakes wrote:
>>
>>> Andrew - please drop this series, it's broken in mm-new.
>>>
>>> Zi - (as kernel bot reports actually!) I bisected a kernel splat to this
>>> commit, triggerred by the mm/transhuge-stress test (please make sure to run
>>> mm self tests before submitting series :)
>>>
>>> You can trigger it manually with:
>>>
>>> ./transhuge-stress -d 20
>>
>> Thanks. I will fix the issue and resend.
>
> Thanks :)
>
> Sorry re-reading the 'please make sure to run mm self tests' comment sounds
> more snarky thank I intended, and I've definitely forgotten to do it
> sometimes myself, but obviously a useful thing to do :P

You got me. I did not run mm self tests for my series, but will do that
from now on. I was using memory hotplug and hotremove to test my series,
but obviously it is not enough.

>
> I wonder if the issue I mention below is related, actually, unless they're
> running your series on top of v6.15-rc5...

I wonder if something else is causing it. The warning is the check of
making sure pageblock migratetype matches the free list movement.
Anyway, let me reply there. A bisect would be helpful.

>
> I pinged there anyway just in case.
>
> Cheers, Lorenzo
>
>>
>>>
>>> (The same invocation run_vmtest.sh uses).
>>>
>>> Note that this was reported in [0] (thanks to Harry Yoo for pointing this
>>> out to me off-list! :)
>>>
>>> [0]: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/T/#u
>>>
>>> The decoded splat (at this commit in mm-new):
>>>
>>> [   55.835700] ------------[ cut here ]------------
>>> [   55.835705] page type is 0, passed migratetype is 2 (nr=32)
>>> [   55.835720] WARNING: CPU: 2 PID: 288 at mm/page_alloc.c:727 move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
>>> [   55.835734] Modules linked in:
>>> [   55.835739] Tainted: [W]=WARN
>>> [   55.835740] Hardware name: QEMU Standard PC (i440FX + PIIX, 1996), BIOS Arch Linux 1.16.3-1-1 04/01/2014
>>> [   55.835741] RIP: 0010:move_to_free_list (mm/page_alloc.c:727 (discriminator 16))
>>> [ 55.835742] Code: e9 fe ff ff c6 05 f1 9b 7b 01 01 90 48 89 ef e8 11 d7 ff ff 44 89 e1 44 89 ea 48 c7 c7 58 dc 70 82 48 89 c6 e8 1c e3 e0 ff 90 <0f> 0b 90 90 e9 ba fe ff ff 66 90 90 90 90 90 90 90 90 90 90 90 90
>>> All code
>>> ========
>>>    0:	e9 fe ff ff c6       	jmp    0xffffffffc7000003
>>>    5:	05 f1 9b 7b 01       	add    $0x17b9bf1,%eax
>>>    a:	01 90 48 89 ef e8    	add    %edx,-0x171076b8(%rax)
>>>   10:	11 d7                	adc    %edx,%edi
>>>   12:	ff                   	(bad)
>>>   13:	ff 44 89 e1          	incl   -0x1f(%rcx,%rcx,4)
>>>   17:	44 89 ea             	mov    %r13d,%edx
>>>   1a:	48 c7 c7 58 dc 70 82 	mov    $0xffffffff8270dc58,%rdi
>>>   21:	48 89 c6             	mov    %rax,%rsi
>>>   24:	e8 1c e3 e0 ff       	call   0xffffffffffe0e345
>>>   29:	90                   	nop
>>>   2a:*	0f 0b                	ud2		<-- trapping instruction
>>>   2c:	90                   	nop
>>>   2d:	90                   	nop
>>>   2e:	e9 ba fe ff ff       	jmp    0xfffffffffffffeed
>>>   33:	66 90                	xchg   %ax,%ax
>>>   35:	90                   	nop
>>>   36:	90                   	nop
>>>   37:	90                   	nop
>>>   38:	90                   	nop
>>>   39:	90                   	nop
>>>   3a:	90                   	nop
>>>   3b:	90                   	nop
>>>   3c:	90                   	nop
>>>   3d:	90                   	nop
>>>   3e:	90                   	nop
>>>   3f:	90                   	nop
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>>    0:	0f 0b                	ud2
>>>    2:	90                   	nop
>>>    3:	90                   	nop
>>>    4:	e9 ba fe ff ff       	jmp    0xfffffffffffffec3
>>>    9:	66 90                	xchg   %ax,%ax
>>>    b:	90                   	nop
>>>    c:	90                   	nop
>>>    d:	90                   	nop
>>>    e:	90                   	nop
>>>    f:	90                   	nop
>>>   10:	90                   	nop
>>>   11:	90                   	nop
>>>   12:	90                   	nop
>>>   13:	90                   	nop
>>>   14:	90                   	nop
>>>   15:	90                   	nop
>>> [   55.835743] RSP: 0018:ffffc900004eba20 EFLAGS: 00010086
>>> [   55.835744] RAX: 000000000000002f RBX: ffff88826cccb080 RCX: 0000000000000027
>>> [   55.835745] RDX: ffff888263d17b08 RSI: 0000000000000001 RDI: ffff888263d17b00
>>> [   55.835746] RBP: ffffea0005fe0000 R08: 00000000ffffdfff R09: ffffffff82b16528
>>> [   55.835746] R10: 80000000ffffe000 R11: 00000000ffffe000 R12: 0000000000000020
>>> [   55.835746] R13: 0000000000000002 R14: 0000000000000001 R15: 0000000000000005
>>> [   55.835750] FS:  00007fef6a06a740(0000) GS:ffff8882e08a0000(0000) knlGS:0000000000000000
>>> [   55.835751] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
>>> [   55.835751] CR2: 00007fee20c00000 CR3: 0000000179321000 CR4: 0000000000750ef0
>>> [   55.835751] PKRU: 55555554
>>> [   55.835752] Call Trace:
>>> [   55.835755]  <TASK>
>>> [   55.835756] __move_freepages_block (mm/page_alloc.c:1849)
>>> [   55.835758] try_to_claim_block (mm/page_alloc.c:452 (discriminator 3) mm/page_alloc.c:2231 (discriminator 3))
>>> [   55.835759] __rmqueue_pcplist (mm/page_alloc.c:2287 mm/page_alloc.c:2383 mm/page_alloc.c:2422 mm/page_alloc.c:3140)
>>> [   55.835760] get_page_from_freelist (./include/linux/spinlock.h:391 mm/page_alloc.c:3183 mm/page_alloc.c:3213 mm/page_alloc.c:3739)
>>> [   55.835761] __alloc_frozen_pages_noprof (mm/page_alloc.c:5032)
>>> [   55.835763] ? __blk_flush_plug (block/blk-core.c:1227 (discriminator 2))
>>> [   55.835766] alloc_pages_mpol (mm/mempolicy.c:2413)
>>> [   55.835768] vma_alloc_folio_noprof (mm/mempolicy.c:2432 mm/mempolicy.c:2465)
>>> [   55.835769] ? __pte_alloc (mm/memory.c:444)
>>> [   55.835771] do_anonymous_page (mm/memory.c:1064 (discriminator 4) mm/memory.c:4982 (discriminator 4) mm/memory.c:5039 (discriminator 4))
>>> [   55.835772] ? do_huge_pmd_anonymous_page (mm/huge_memory.c:1226 mm/huge_memory.c:1372)
>>> [   55.835774] __handle_mm_fault (mm/memory.c:4197 mm/memory.c:6038 mm/memory.c:6181)
>>> [   55.835776] handle_mm_fault (mm/memory.c:6350)
>>> [   55.835777] do_user_addr_fault (arch/x86/mm/fault.c:1338)
>>> [   55.835779] exc_page_fault (./arch/x86/include/asm/irqflags.h:37 ./arch/x86/include/asm/irqflags.h:114 arch/x86/mm/fault.c:1488 arch/x86/mm/fault.c:1538)
>>> [   55.835783] asm_exc_page_fault (./arch/x86/include/asm/idtentry.h:623)
>>> [   55.835785] RIP: 0033:0x403824
>>> [ 55.835786] Code: e0 0f 85 7c 01 00 00 ba 0e 00 00 00 be 00 00 20 00 48 89 c7 48 89 c3 e8 4a ea ff ff 85 c0 0f 85 51 01 00 00 8b 0d b4 49 00 00 <48> 89 1b 85 c9 0f 84 b1 00 00 00 83 e9 03 48 89 e6 ba 10 00 00 00
>>> All code
>>> ========
>>>    0:	e0 0f                	loopne 0x11
>>>    2:	85 7c 01 00          	test   %edi,0x0(%rcx,%rax,1)
>>>    6:	00 ba 0e 00 00 00    	add    %bh,0xe(%rdx)
>>>    c:	be 00 00 20 00       	mov    $0x200000,%esi
>>>   11:	48 89 c7             	mov    %rax,%rdi
>>>   14:	48 89 c3             	mov    %rax,%rbx
>>>   17:	e8 4a ea ff ff       	call   0xffffffffffffea66
>>>   1c:	85 c0                	test   %eax,%eax
>>>   1e:	0f 85 51 01 00 00    	jne    0x175
>>>   24:	8b 0d b4 49 00 00    	mov    0x49b4(%rip),%ecx        # 0x49de
>>>   2a:*	48 89 1b             	mov    %rbx,(%rbx)		<-- trapping instruction
>>>   2d:	85 c9                	test   %ecx,%ecx
>>>   2f:	0f 84 b1 00 00 00    	je     0xe6
>>>   35:	83 e9 03             	sub    $0x3,%ecx
>>>   38:	48 89 e6             	mov    %rsp,%rsi
>>>   3b:	ba 10 00 00 00       	mov    $0x10,%edx
>>>
>>> Code starting with the faulting instruction
>>> ===========================================
>>>    0:	48 89 1b             	mov    %rbx,(%rbx)
>>>    3:	85 c9                	test   %ecx,%ecx
>>>    5:	0f 84 b1 00 00 00    	je     0xbc
>>>    b:	83 e9 03             	sub    $0x3,%ecx
>>>    e:	48 89 e6             	mov    %rsp,%rsi
>>>   11:	ba 10 00 00 00       	mov    $0x10,%edx
>>> [   55.835786] RSP: 002b:00007ffd50b1e550 EFLAGS: 00010246
>>> [   55.835787] RAX: 0000000000000000 RBX: 00007fee20c00000 RCX: 000000000000000c
>>> [   55.835787] RDX: 000000000000000e RSI: 0000000000200000 RDI: 00007fee20c00000
>>> [   55.835788] RBP: 0000000000000003 R08: 00000000ffffffff R09: 0000000000000000
>>> [   55.835788] R10: 0000000000004032 R11: 0000000000000246 R12: 00007fee20c00000
>>> [   55.835788] R13: 00007fef6a000000 R14: 00000000323ca6b0 R15: 0000000000000fd2
>>> [   55.835789]  </TASK>
>>> [   55.835789] ---[ end trace 0000000000000000 ]---
>>>
>>>
>>> On Fri, May 09, 2025 at 04:01:09PM -0400, Zi Yan wrote:
>>>> Since migratetype is no longer overwritten during pageblock isolation,
>>>> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
>>>>
>>>> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
>>>> when a pageblock is isolated. It is used by move_freepages_block_isolate().
>>>>
>>>> Add pageblock_isolate_and_move_free_pages() and
>>>> pageblock_unisolate_and_move_free_pages() to be explicit about the page
>>>> isolation operations. Both share the common code in
>>>> __move_freepages_block_isolate(), which is renamed from
>>>> move_freepages_block_isolate().
>>>>
>>>> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
>>>> so that one should use set_pageblock_isolate() to isolate pageblocks.
>>>>
>>>> Two consequential changes:
>>>> 1. move pageblock migratetype code out of __move_freepages_block().
>>>> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>>>>    called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>>>>    pageblocks are isolated afterwards. Otherwise, all online pageblocks
>>>>    will have non-determined migratetype.
>>>>
>>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>>> ---
>>>>  include/linux/mmzone.h         |  4 +-
>>>>  include/linux/page-isolation.h |  5 ++-
>>>>  mm/memory_hotplug.c            |  7 +++-
>>>>  mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>>>>  mm/page_isolation.c            | 27 ++++++++-----
>>>>  5 files changed, 82 insertions(+), 34 deletions(-)
>>>>
>>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>>> index 7ef01fe148ce..f66895456974 100644
>>>> --- a/include/linux/mmzone.h
>>>> +++ b/include/linux/mmzone.h
>>>> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>>>>  extern int page_group_by_mobility_disabled;
>>>>
>>>>  #ifdef CONFIG_MEMORY_ISOLATION
>>>> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
>>>> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
>>>> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>>>>  #else
>>>> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>>>>  #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>>>>  #endif
>>>>
>>>> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
>>>> index 898bb788243b..b0a2af0a5357 100644
>>>> --- a/include/linux/page-isolation.h
>>>> +++ b/include/linux/page-isolation.h
>>>> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>>>>  #define REPORT_FAILURE	0x2
>>>>
>>>>  void set_pageblock_migratetype(struct page *page, int migratetype);
>>>> +void set_pageblock_isolate(struct page *page);
>>>>
>>>> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>>> -				  int migratetype);
>>>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
>>>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>>>>
>>>>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>>>  			     int migratetype, int flags);
>>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>>> index b1caedbade5b..c86c47bba019 100644
>>>> --- a/mm/memory_hotplug.c
>>>> +++ b/mm/memory_hotplug.c
>>>> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>>>  	const int nid = zone_to_nid(zone);
>>>>  	int ret;
>>>>  	struct memory_notify arg;
>>>> +	unsigned long isol_pfn;
>>>>
>>>>  	/*
>>>>  	 * {on,off}lining is constrained to full memory sections (or more
>>>> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>>>
>>>>
>>>>  	/* associate pfn range with the zone */
>>>> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
>>>> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
>>>> +	for (isol_pfn = pfn;
>>>> +	     isol_pfn < pfn + nr_pages;
>>>> +	     isol_pfn += pageblock_nr_pages)
>>>> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
>>>>
>>>>  	arg.start_pfn = pfn;
>>>>  	arg.nr_pages = nr_pages;
>>>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>>>> index 04e301fb4879..cfd37b2d992e 100644
>>>> --- a/mm/page_alloc.c
>>>> +++ b/mm/page_alloc.c
>>>> @@ -454,11 +454,9 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>>>>  		migratetype = MIGRATE_UNMOVABLE;
>>>>
>>>>  #ifdef CONFIG_MEMORY_ISOLATION
>>>> -	if (migratetype == MIGRATE_ISOLATE) {
>>>> -		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>>>> -				page_to_pfn(page), PB_migrate_isolate_bit);
>>>> -		return;
>>>> -	}
>>>> +	VM_WARN(migratetype == MIGRATE_ISOLATE,
>>>> +			"Use set_pageblock_isolate() for pageblock isolation");
>>>> +	return;
>>>>  #endif
>>>>  	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>>>>  				page_to_pfn(page), MIGRATETYPE_MASK);
>>>> @@ -1819,8 +1817,8 @@ static inline struct page *__rmqueue_cma_fallback(struct zone *zone,
>>>>  #endif
>>>>
>>>>  /*
>>>> - * Change the type of a block and move all its free pages to that
>>>> - * type's freelist.
>>>> + * Move all free pages of a block to new type's freelist. Caller needs to
>>>> + * change the block type.
>>>>   */
>>>>  static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>>>>  				  int old_mt, int new_mt)
>>>> @@ -1852,8 +1850,6 @@ static int __move_freepages_block(struct zone *zone, unsigned long start_pfn,
>>>>  		pages_moved += 1 << order;
>>>>  	}
>>>>
>>>> -	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
>>>> -
>>>>  	return pages_moved;
>>>>  }
>>>>
>>>> @@ -1911,11 +1907,16 @@ static int move_freepages_block(struct zone *zone, struct page *page,
>>>>  				int old_mt, int new_mt)
>>>>  {
>>>>  	unsigned long start_pfn;
>>>> +	int res;
>>>>
>>>>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>>>>  		return -1;
>>>>
>>>> -	return __move_freepages_block(zone, start_pfn, old_mt, new_mt);
>>>> +	res = __move_freepages_block(zone, start_pfn, old_mt, new_mt);
>>>> +	set_pageblock_migratetype(pfn_to_page(start_pfn), new_mt);
>>>> +
>>>> +	return res;
>>>> +
>>>>  }
>>>>
>>>>  #ifdef CONFIG_MEMORY_ISOLATION
>>>> @@ -1943,11 +1944,17 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>>>>  	return start_pfn;
>>>>  }
>>>>
>>>> +static inline void toggle_pageblock_isolate(struct page *page, bool isolate)
>>>> +{
>>>> +	set_pfnblock_flags_mask(page, (isolate << PB_migrate_isolate),
>>>> +			page_to_pfn(page), PB_migrate_isolate_bit);
>>>> +}
>>>> +
>>>>  /**
>>>> - * move_freepages_block_isolate - move free pages in block for page isolation
>>>> + * __move_freepages_block_isolate - move free pages in block for page isolation
>>>>   * @zone: the zone
>>>>   * @page: the pageblock page
>>>> - * @migratetype: migratetype to set on the pageblock
>>>> + * @isolate: to isolate the given pageblock or unisolate it
>>>>   *
>>>>   * This is similar to move_freepages_block(), but handles the special
>>>>   * case encountered in page isolation, where the block of interest
>>>> @@ -1962,10 +1969,15 @@ static unsigned long find_large_buddy(unsigned long start_pfn)
>>>>   *
>>>>   * Returns %true if pages could be moved, %false otherwise.
>>>>   */
>>>> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>>> -				  int migratetype)
>>>> +static bool __move_freepages_block_isolate(struct zone *zone,
>>>> +		struct page *page, bool isolate)
>>>>  {
>>>>  	unsigned long start_pfn, pfn;
>>>> +	int from_mt;
>>>> +	int to_mt;
>>>> +
>>>> +	if (isolate == (get_pageblock_migratetype(page) == MIGRATE_ISOLATE))
>>>> +		return false;
>>>>
>>>>  	if (!prep_move_freepages_block(zone, page, &start_pfn, NULL, NULL))
>>>>  		return false;
>>>> @@ -1982,7 +1994,7 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>>>
>>>>  		del_page_from_free_list(buddy, zone, order,
>>>>  					get_pfnblock_migratetype(buddy, pfn));
>>>> -		set_pageblock_migratetype(page, migratetype);
>>>> +		toggle_pageblock_isolate(page, isolate);
>>>>  		split_large_buddy(zone, buddy, pfn, order, FPI_NONE);
>>>>  		return true;
>>>>  	}
>>>> @@ -1993,16 +2005,38 @@ bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>>>
>>>>  		del_page_from_free_list(page, zone, order,
>>>>  					get_pfnblock_migratetype(page, pfn));
>>>> -		set_pageblock_migratetype(page, migratetype);
>>>> +		toggle_pageblock_isolate(page, isolate);
>>>>  		split_large_buddy(zone, page, pfn, order, FPI_NONE);
>>>>  		return true;
>>>>  	}
>>>>  move:
>>>> -	__move_freepages_block(zone, start_pfn,
>>>> -			       get_pfnblock_migratetype(page, start_pfn),
>>>> -			       migratetype);
>>>> +	/* use MIGRATETYPE_NO_ISO_MASK to get the non-isolate migratetype */
>>>> +	if (isolate) {
>>>> +		from_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
>>>> +				MIGRATETYPE_NO_ISO_MASK);
>>>> +		to_mt = MIGRATE_ISOLATE;
>>>> +	} else {
>>>> +		from_mt = MIGRATE_ISOLATE;
>>>> +		to_mt = get_pfnblock_flags_mask(page, page_to_pfn(page),
>>>> +				MIGRATETYPE_NO_ISO_MASK);
>>>> +	}
>>>> +
>>>> +	__move_freepages_block(zone, start_pfn, from_mt, to_mt);
>>>> +	toggle_pageblock_isolate(pfn_to_page(start_pfn), isolate);
>>>> +
>>>>  	return true;
>>>>  }
>>>> +
>>>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
>>>> +{
>>>> +	return __move_freepages_block_isolate(zone, page, true);
>>>> +}
>>>> +
>>>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page)
>>>> +{
>>>> +	return __move_freepages_block_isolate(zone, page, false);
>>>> +}
>>>> +
>>>>  #endif /* CONFIG_MEMORY_ISOLATION */
>>>>
>>>>  static void change_pageblock_range(struct page *pageblock_page,
>>>> @@ -2194,6 +2228,7 @@ try_to_claim_block(struct zone *zone, struct page *page,
>>>>  	if (free_pages + alike_pages >= (1 << (pageblock_order-1)) ||
>>>>  			page_group_by_mobility_disabled) {
>>>>  		__move_freepages_block(zone, start_pfn, block_type, start_type);
>>>> +		set_pageblock_migratetype(pfn_to_page(start_pfn), start_type);
>>>>  		return __rmqueue_smallest(zone, order, start_type);
>>>>  	}
>>>>
>>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>>> index 751e21f6d85e..4571940f14db 100644
>>>> --- a/mm/page_isolation.c
>>>> +++ b/mm/page_isolation.c
>>>> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
>>>>  	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>>>>  			PB_migrate_isolate_bit);
>>>>  }
>>>> +void set_pageblock_isolate(struct page *page)
>>>> +{
>>>> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>>>> +			page_to_pfn(page),
>>>> +			PB_migrate_isolate_bit);
>>>> +}
>>>>
>>>>  /*
>>>>   * This function checks whether the range [start_pfn, end_pfn) includes
>>>> @@ -199,7 +205,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>>>>  	unmovable = has_unmovable_pages(check_unmovable_start, check_unmovable_end,
>>>>  			migratetype, isol_flags);
>>>>  	if (!unmovable) {
>>>> -		if (!move_freepages_block_isolate(zone, page, MIGRATE_ISOLATE)) {
>>>> +		if (!pageblock_isolate_and_move_free_pages(zone, page)) {
>>>>  			spin_unlock_irqrestore(&zone->lock, flags);
>>>>  			return -EBUSY;
>>>>  		}
>>>> @@ -220,7 +226,7 @@ static int set_migratetype_isolate(struct page *page, int migratetype, int isol_
>>>>  	return -EBUSY;
>>>>  }
>>>>
>>>> -static void unset_migratetype_isolate(struct page *page, int migratetype)
>>>> +static void unset_migratetype_isolate(struct page *page)
>>>>  {
>>>>  	struct zone *zone;
>>>>  	unsigned long flags;
>>>> @@ -273,10 +279,10 @@ static void unset_migratetype_isolate(struct page *page, int migratetype)
>>>>  		 * Isolating this block already succeeded, so this
>>>>  		 * should not fail on zone boundaries.
>>>>  		 */
>>>> -		WARN_ON_ONCE(!move_freepages_block_isolate(zone, page, migratetype));
>>>> +		WARN_ON_ONCE(!pageblock_unisolate_and_move_free_pages(zone, page));
>>>>  	} else {
>>>> -		set_pageblock_migratetype(page, migratetype);
>>>> -		__putback_isolated_page(page, order, migratetype);
>>>> +		clear_pageblock_isolate(page);
>>>> +		__putback_isolated_page(page, order, get_pageblock_migratetype(page));
>>>>  	}
>>>>  	zone->nr_isolate_pageblock--;
>>>>  out:
>>>> @@ -394,7 +400,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>>>  		if (PageBuddy(page)) {
>>>>  			int order = buddy_order(page);
>>>>
>>>> -			/* move_freepages_block_isolate() handled this */
>>>> +			/* pageblock_isolate_and_move_free_pages() handled this */
>>>>  			VM_WARN_ON_ONCE(pfn + (1 << order) > boundary_pfn);
>>>>
>>>>  			pfn += 1UL << order;
>>>> @@ -444,7 +450,7 @@ static int isolate_single_pageblock(unsigned long boundary_pfn, int flags,
>>>>  failed:
>>>>  	/* restore the original migratetype */
>>>>  	if (!skip_isolation)
>>>> -		unset_migratetype_isolate(pfn_to_page(isolate_pageblock), migratetype);
>>>> +		unset_migratetype_isolate(pfn_to_page(isolate_pageblock));
>>>>  	return -EBUSY;
>>>>  }
>>>>
>>>> @@ -515,7 +521,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>>>  	ret = isolate_single_pageblock(isolate_end, flags, true,
>>>>  			skip_isolation, migratetype);
>>>>  	if (ret) {
>>>> -		unset_migratetype_isolate(pfn_to_page(isolate_start), migratetype);
>>>> +		unset_migratetype_isolate(pfn_to_page(isolate_start));
>>>>  		return ret;
>>>>  	}
>>>>
>>>> @@ -528,8 +534,7 @@ int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>>>  					start_pfn, end_pfn)) {
>>>>  			undo_isolate_page_range(isolate_start, pfn, migratetype);
>>>>  			unset_migratetype_isolate(
>>>> -				pfn_to_page(isolate_end - pageblock_nr_pages),
>>>> -				migratetype);
>>>> +				pfn_to_page(isolate_end - pageblock_nr_pages));
>>>>  			return -EBUSY;
>>>>  		}
>>>>  	}
>>>> @@ -559,7 +564,7 @@ void undo_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>>>  		page = __first_valid_page(pfn, pageblock_nr_pages);
>>>>  		if (!page || !is_migrate_isolate_page(page))
>>>>  			continue;
>>>> -		unset_migratetype_isolate(page, migratetype);
>>>> +		unset_migratetype_isolate(page);
>>>>  	}
>>>>  }
>>>>  /*
>>>> --
>>>> 2.47.2
>>>>
>>>>
>>>>
>>
>>
>> --
>> Best Regards,
>> Yan, Zi


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-12 16:10   ` Lorenzo Stoakes
  2025-05-12 16:13     ` Zi Yan
@ 2025-05-12 22:00     ` Andrew Morton
  2025-05-12 23:20     ` Zi Yan
  2 siblings, 0 replies; 42+ messages in thread
From: Andrew Morton @ 2025-05-12 22:00 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: Zi Yan, David Hildenbrand, Oscar Salvador, Johannes Weiner,
	linux-mm, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

On Mon, 12 May 2025 17:10:56 +0100 Lorenzo Stoakes <lorenzo.stoakes@oracle.com> wrote:

> Andrew - please drop this series, it's broken in mm-new.

Gone, thanks.

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-12 16:10   ` Lorenzo Stoakes
  2025-05-12 16:13     ` Zi Yan
  2025-05-12 22:00     ` Andrew Morton
@ 2025-05-12 23:20     ` Zi Yan
  2 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-12 23:20 UTC (permalink / raw)
  To: Lorenzo Stoakes
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel, Harry Yoo

On 12 May 2025, at 12:10, Lorenzo Stoakes wrote:

> Andrew - please drop this series, it's broken in mm-new.
>
> Zi - (as kernel bot reports actually!) I bisected a kernel splat to this
> commit, triggerred by the mm/transhuge-stress test (please make sure to run
> mm self tests before submitting series :)
>
> You can trigger it manually with:
>
> ./transhuge-stress -d 20
>
> (The same invocation run_vmtest.sh uses).
>
> Note that this was reported in [0] (thanks to Harry Yoo for pointing this
> out to me off-list! :)
>
> [0]: https://lore.kernel.org/linux-mm/87wmalyktd.fsf@linux.ibm.com/T/#u
>

The patch below fixed the issue and all mm tests passed. Will send v5 later
and hopefully I can get some feedback on this series before that.

From 81f4ff35a5e6abf4779597861f69e9c3cce16d41 Mon Sep 17 00:00:00 2001
From: Zi Yan <ziy@nvidia.com>
Date: Mon, 12 May 2025 17:57:49 -0400
Subject: [PATCH] fixup: make set_pageblock_migratetype() set migratetype.

Signed-off-by: Zi Yan <ziy@nvidia.com>
---
 mm/page_alloc.c | 6 ++++--
 1 file changed, 4 insertions(+), 2 deletions(-)

diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index b3476e0f59ad..4b9c2c3d1b89 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -454,9 +454,11 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
 		migratetype = MIGRATE_UNMOVABLE;

 #ifdef CONFIG_MEMORY_ISOLATION
-	VM_WARN(migratetype == MIGRATE_ISOLATE,
+	if (migratetype == MIGRATE_ISOLATE) {
+		VM_WARN(1,
 			"Use set_pageblock_isolate() for pageblock isolation");
-	return;
+		return;
+	}
 #endif
 	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
 				page_to_pfn(page), MIGRATETYPE_MASK);
-- 
2.47.2



Best Regards,
Yan, Zi

^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
@ 2025-05-13 11:32   ` Brendan Jackman
  2025-05-13 14:53     ` Zi Yan
  2025-05-19  8:08   ` David Hildenbrand
  1 sibling, 1 reply; 42+ messages in thread
From: Brendan Jackman @ 2025-05-13 11:32 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, Oscar Salvador, Johannes Weiner,
	linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Richard Chang,
	linux-kernel

Hi Zi,

I hope you don't mind me jumping in on a late revision like this...

On Fri May 9, 2025 at 8:01 PM UTC, Zi Yan wrote:
> During page isolation, the original migratetype is overwritten, since
> MIGRATE_* are enums and stored in pageblock bitmaps. Change
> MIGRATE_ISOLATE to be stored a standalone bit, PB_migrate_isolate, like
> PB_migrate_skip, so that migratetype is not lost during pageblock
> isolation. pageblock bits needs to be word aligned, so expand
> the number of pageblock bits from 4 to 8 and make PB_migrate_isolate bit 7.

Forgive my ignorance but can you help me confirm if I'm following this -
Do you just mean that NR_PAGEBLOCK_BITS must divide the word size? Or is
there something else going on here?

> +#ifdef CONFIG_MEMORY_ISOLATION
> +	PB_migrate_isolate = 7, /* If set the block is isolated */
> +			/* set it to 7 to make pageblock bit word aligned */
> +#endif

I feel I'm always just asking for commentary so please feel free to
complain if this is annoying. But I think it would be worth adding the 
context from the commit message into the code here (or somewhere else),
e.g:

/*
 * Page isolation is represented with a separate bit, so that the other
 * bits can store the migratetype that the block had before it was
 * isolated.
 */

Just adding in that detail about the intent will help readers a lot IMO.

>  
> +unsigned long get_pageblock_migratetype(const struct page *page)
> +{
> +	unsigned long flags;
> +
> +	flags = get_pfnblock_flags_mask(page, page_to_pfn(page),
> +			MIGRATETYPE_MASK);
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (flags & PB_migrate_isolate_bit)
> +		return MIGRATE_ISOLATE;
> +#endif
> +	return flags;
> +}

Can we just do get_pageblock_migratetype(page, page_to_pfn(page)) here?

>  static __always_inline int get_pfnblock_migratetype(const struct page *page,
>  					unsigned long pfn)
>  {
> -	return get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
> +	unsigned long flags;
> +
> +	flags = get_pfnblock_flags_mask(page, pfn,
> +			MIGRATETYPE_MASK);
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (flags & PB_migrate_isolate_bit)
> +		return MIGRATE_ISOLATE;
> +#endif
> +	return flags;
>  }

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-13 11:32   ` Brendan Jackman
@ 2025-05-13 14:53     ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-13 14:53 UTC (permalink / raw)
  To: Brendan Jackman
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Richard Chang,
	linux-kernel

On 13 May 2025, at 7:32, Brendan Jackman wrote:

> Hi Zi,
>
> I hope you don't mind me jumping in on a late revision like this...

Right on time. :)

>
> On Fri May 9, 2025 at 8:01 PM UTC, Zi Yan wrote:
>> During page isolation, the original migratetype is overwritten, since
>> MIGRATE_* are enums and stored in pageblock bitmaps. Change
>> MIGRATE_ISOLATE to be stored a standalone bit, PB_migrate_isolate, like
>> PB_migrate_skip, so that migratetype is not lost during pageblock
>> isolation. pageblock bits needs to be word aligned, so expand
>> the number of pageblock bits from 4 to 8 and make PB_migrate_isolate bit 7.
>
> Forgive my ignorance but can you help me confirm if I'm following this -
> Do you just mean that NR_PAGEBLOCK_BITS must divide the word size? Or is
> there something else going on here?

You are right. NR_PAGEBLOCK_BITS must divide the word size. I will fix
the commit log in the next version.

>
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	PB_migrate_isolate = 7, /* If set the block is isolated */
>> +			/* set it to 7 to make pageblock bit word aligned */
I will fix this comment too.

>> +#endif
>
> I feel I'm always just asking for commentary so please feel free to
> complain if this is annoying. But I think it would be worth adding the
> context from the commit message into the code here (or somewhere else),
> e.g:
>
> /*
>  * Page isolation is represented with a separate bit, so that the other
>  * bits can store the migratetype that the block had before it was
>  * isolated.
>  */
>
> Just adding in that detail about the intent will help readers a lot IMO.

Sure. Will add it.

>
>>
>> +unsigned long get_pageblock_migratetype(const struct page *page)
>> +{
>> +	unsigned long flags;
>> +
>> +	flags = get_pfnblock_flags_mask(page, page_to_pfn(page),
>> +			MIGRATETYPE_MASK);
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	if (flags & PB_migrate_isolate_bit)
>> +		return MIGRATE_ISOLATE;
>> +#endif
>> +	return flags;
>> +}
>
> Can we just do get_pageblock_migratetype(page, page_to_pfn(page)) here?

Based on my observation, the callers all have page and pfn, so using the
current implementation would save a call to page_to_pfn(). I can add a comment
to this function.

/*
 * Use get_pageblock_migratetype() if caller already has both @page and @pfn
 * to save a call to page_to_pfn().
 */

>
>>  static __always_inline int get_pfnblock_migratetype(const struct page *page,
>>  					unsigned long pfn)
>>  {
>> -	return get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
>> +	unsigned long flags;
>> +
>> +	flags = get_pfnblock_flags_mask(page, pfn,
>> +			MIGRATETYPE_MASK);
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	if (flags & PB_migrate_isolate_bit)
>> +		return MIGRATE_ISOLATE;
>> +#endif
>> +	return flags;
>>  }


Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions.
  2025-05-09 20:01 ` [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions Zi Yan
@ 2025-05-17 20:21   ` Vlastimil Babka
  2025-05-18  0:07     ` Zi Yan
  2025-05-18 16:32   ` Johannes Weiner
  1 sibling, 1 reply; 42+ messages in thread
From: Vlastimil Babka @ 2025-05-17 20:21 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, Oscar Salvador, Johannes Weiner,
	linux-mm
  Cc: Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 5/9/25 22:01, Zi Yan wrote:
> migratetype is no longer overwritten during pageblock isolation,
> start_isolate_page_range(), has_unmovable_pages(), and
> set_migratetype_isolate() no longer need which migratetype to restore
> during isolation failure.
> 
> For has_unmoable_pages(), it needs to know if the isolation is for CMA
> allocation, so adding CMA_ALLOCATION to isolation flags to provide the
> information.
> 
> alloc_contig_range() no longer needs migratetype. Replace it with
> a newly defined acr_flags_t to tell if an allocation is for CMA. So does
> __alloc_contig_migrate_range().
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>

AFAICS has_unmovable_pages() adds the flags parameter but doesn't use it.

But also, I think having both mode and flags is just unnecessary complexity
in this case? CMA_ALLOCATION could be just a new flag? Even if some flag
combinations wouldn't logicaly make sense, this has only so few users so we
don't have to care to make them exclusive with the mode thing.
Also I think REPORT_FAILURE is only used with MEMORY_OFFLINE so it could be
squashed?


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
                   ` (3 preceding siblings ...)
  2025-05-09 20:01 ` [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions Zi Yan
@ 2025-05-17 20:26 ` Vlastimil Babka
  2025-05-18  0:20   ` Zi Yan
  2025-05-19  7:44 ` David Hildenbrand
  5 siblings, 1 reply; 42+ messages in thread
From: Vlastimil Babka @ 2025-05-17 20:26 UTC (permalink / raw)
  To: Zi Yan, David Hildenbrand, Oscar Salvador, Johannes Weiner,
	linux-mm
  Cc: Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 5/9/25 22:01, Zi Yan wrote:
> Hi David and Oscar,
> 
> Can you take a look at Patch 2, which changes how online_pages() set
> online pageblock migratetypes? It used to first set all pageblocks to
> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
> Let me know if there is any issue with my changes.
> 
> Hi Johannes,
> 
> Patch 2 now have set_pageblock_migratetype() not accepting
> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
> feedback.
> 
> Hi all,
> 
> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
> being overwritten during pageblock isolation process. Currently,
> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
> migratetype. This causes pageblock migratetype loss during
> alloc_contig_range() and memory offline, especially when the process
> fails due to a failed pageblock isolation and the code tries to undo the
> finished pageblock isolations.

Seems mostly fine to me, just sent suggestion for 4/4.
I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
I also see that it's useful for it to be because then it means it has the
freelists in the buddy allocator, can work via __move_freepages_block() etc.
Oh well. So it's still a migratetype, but the pageblock handling is now
different.


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions.
  2025-05-17 20:21   ` Vlastimil Babka
@ 2025-05-18  0:07     ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-18  0:07 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 17 May 2025, at 16:21, Vlastimil Babka wrote:

> On 5/9/25 22:01, Zi Yan wrote:
>> migratetype is no longer overwritten during pageblock isolation,
>> start_isolate_page_range(), has_unmovable_pages(), and
>> set_migratetype_isolate() no longer need which migratetype to restore
>> during isolation failure.
>>
>> For has_unmoable_pages(), it needs to know if the isolation is for CMA
>> allocation, so adding CMA_ALLOCATION to isolation flags to provide the
>> information.
>>
>> alloc_contig_range() no longer needs migratetype. Replace it with
>> a newly defined acr_flags_t to tell if an allocation is for CMA. So does
>> __alloc_contig_migrate_range().
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>
> AFAICS has_unmovable_pages() adds the flags parameter but doesn't use it.

Yes, will remove it.

>
> But also, I think having both mode and flags is just unnecessary complexity
> in this case? CMA_ALLOCATION could be just a new flag? Even if some flag
> combinations wouldn't logicaly make sense, this has only so few users so we
> don't have to care to make them exclusive with the mode thing.

I was doing that until v3.

> Also I think REPORT_FAILURE is only used with MEMORY_OFFLINE so it could be
> squashed?

Yes, let me do that. Johannes also pointed this out but I missed it.

In the next version, I will remove REPORT_FAILURE as it is implied by
MEMORY_OFFLINE, including isolate_flags_t, and keep the existing enum
with MEMORY_OFFLINE and CMA_ALLOCATION.

Thanks for the review.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-17 20:26 ` [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Vlastimil Babka
@ 2025-05-18  0:20   ` Zi Yan
  2025-05-19 14:15     ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-18  0:20 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: David Hildenbrand, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 17 May 2025, at 16:26, Vlastimil Babka wrote:

> On 5/9/25 22:01, Zi Yan wrote:
>> Hi David and Oscar,
>>
>> Can you take a look at Patch 2, which changes how online_pages() set
>> online pageblock migratetypes? It used to first set all pageblocks to
>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>> Let me know if there is any issue with my changes.
>>
>> Hi Johannes,
>>
>> Patch 2 now have set_pageblock_migratetype() not accepting
>> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
>> feedback.
>>
>> Hi all,
>>
>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>> being overwritten during pageblock isolation process. Currently,
>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>> migratetype. This causes pageblock migratetype loss during
>> alloc_contig_range() and memory offline, especially when the process
>> fails due to a failed pageblock isolation and the code tries to undo the
>> finished pageblock isolations.
>
> Seems mostly fine to me, just sent suggestion for 4/4.

Thanks.

> I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
> I also see that it's useful for it to be because then it means it has the
> freelists in the buddy allocator, can work via __move_freepages_block() etc.

Yeah, I wanted to remove MIGRATE_ISOLATE from migratetype too, but there
is a MIGRATE_ISOLATE freelist and /proc/pagetypeinfo also shows isolated
free pages.

> Oh well. So it's still a migratetype, but the pageblock handling is now
> different.

Yep. We also have PB_migrate_skip, a bit in pageblock_bits used for memory
compaction and not part of migratetype.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions.
  2025-05-09 20:01 ` [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions Zi Yan
  2025-05-17 20:21   ` Vlastimil Babka
@ 2025-05-18 16:32   ` Johannes Weiner
  2025-05-18 17:24     ` Zi Yan
  1 sibling, 1 reply; 42+ messages in thread
From: Johannes Weiner @ 2025-05-18 16:32 UTC (permalink / raw)
  To: Zi Yan
  Cc: David Hildenbrand, Oscar Salvador, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On Fri, May 09, 2025 at 04:01:11PM -0400, Zi Yan wrote:
> @@ -22,8 +22,25 @@ static inline bool is_migrate_isolate(int migratetype)
>  }
>  #endif
>  
> -#define MEMORY_OFFLINE	0x1
> -#define REPORT_FAILURE	0x2
> +/*
> + * Isolation modes:
> + * ISOLATE_MODE_NONE - isolate for other purposes than those below
> + * MEMORY_OFFLINE    - isolate to offline (!allocate) memory e.g., skip over
> + *		       PageHWPoison() pages and PageOffline() pages.
> + * CMA_ALLOCATION    - isolate for CMA allocations
> + */
> +enum isolate_mode_t {
> +	ISOLATE_MODE_NONE,
> +	MEMORY_OFFLINE,
> +	CMA_ALLOCATION,
> +};
> +
> +/*
> + * Isolation flags:
> + * REPORT_FAILURE - report details about the failure to isolate the range
> + */
> +typedef unsigned int __bitwise isolate_flags_t;
> +#define REPORT_FAILURE		((__force isolate_flags_t)BIT(0))
>  
>  void set_pageblock_migratetype(struct page *page, int migratetype);
>  void set_pageblock_isolate(struct page *page);
> @@ -32,10 +49,10 @@ bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
>  bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>  
>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
> -			     int migratetype, int flags);
> +			     isolate_mode_t mode, isolate_flags_t flags);

This should be 'enum isolate_mode_t', right?

(isolate_mode_t also exists, but it's something else)

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions.
  2025-05-18 16:32   ` Johannes Weiner
@ 2025-05-18 17:24     ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-18 17:24 UTC (permalink / raw)
  To: Johannes Weiner
  Cc: David Hildenbrand, Oscar Salvador, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 18 May 2025, at 12:32, Johannes Weiner wrote:

> On Fri, May 09, 2025 at 04:01:11PM -0400, Zi Yan wrote:
>> @@ -22,8 +22,25 @@ static inline bool is_migrate_isolate(int migratetype)
>>  }
>>  #endif
>>
>> -#define MEMORY_OFFLINE	0x1
>> -#define REPORT_FAILURE	0x2
>> +/*
>> + * Isolation modes:
>> + * ISOLATE_MODE_NONE - isolate for other purposes than those below
>> + * MEMORY_OFFLINE    - isolate to offline (!allocate) memory e.g., skip over
>> + *		       PageHWPoison() pages and PageOffline() pages.
>> + * CMA_ALLOCATION    - isolate for CMA allocations
>> + */
>> +enum isolate_mode_t {
>> +	ISOLATE_MODE_NONE,
>> +	MEMORY_OFFLINE,
>> +	CMA_ALLOCATION,
>> +};
>> +
>> +/*
>> + * Isolation flags:
>> + * REPORT_FAILURE - report details about the failure to isolate the range
>> + */
>> +typedef unsigned int __bitwise isolate_flags_t;
>> +#define REPORT_FAILURE		((__force isolate_flags_t)BIT(0))
>>
>>  void set_pageblock_migratetype(struct page *page, int migratetype);
>>  void set_pageblock_isolate(struct page *page);
>> @@ -32,10 +49,10 @@ bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page)
>>  bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>>
>>  int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>> -			     int migratetype, int flags);
>> +			     isolate_mode_t mode, isolate_flags_t flags);
>
> This should be 'enum isolate_mode_t', right?
>
> (isolate_mode_t also exists, but it's something else)

Oh, I did not realize that. Let me rename it to pb_isolate_mode_t.
Thanks.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
                   ` (4 preceding siblings ...)
  2025-05-17 20:26 ` [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Vlastimil Babka
@ 2025-05-19  7:44 ` David Hildenbrand
  2025-05-19 14:01   ` Zi Yan
  5 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-19  7:44 UTC (permalink / raw)
  To: Zi Yan, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel

On 09.05.25 22:01, Zi Yan wrote:
> Hi David and Oscar,

Hi,

> 
> Can you take a look at Patch 2, which changes how online_pages() set
> online pageblock migratetypes?

Sorry, now looking :)

> It used to first set all pageblocks to
> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
> Let me know if there is any issue with my changes.

Conceptually, we should start with MIGRATE_MOVABLE + isolated, to then 
clear the isolated bit.

Let me take a look.


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
  2025-05-13 11:32   ` Brendan Jackman
@ 2025-05-19  8:08   ` David Hildenbrand
  2025-05-19 15:08     ` Zi Yan
  1 sibling, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-19  8:08 UTC (permalink / raw)
  To: Zi Yan, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel

On 09.05.25 22:01, Zi Yan wrote:
> During page isolation, the original migratetype is overwritten, since
> MIGRATE_* are enums and stored in pageblock bitmaps. Change
> MIGRATE_ISOLATE to be stored a standalone bit, PB_migrate_isolate, like
> PB_migrate_skip, so that migratetype is not lost during pageblock
> isolation. pageblock bits needs to be word aligned, so expand
> the number of pageblock bits from 4 to 8 and make PB_migrate_isolate bit 7.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>   include/linux/mmzone.h          | 15 ++++++++------
>   include/linux/pageblock-flags.h |  9 ++++++++-
>   mm/page_alloc.c                 | 36 ++++++++++++++++++++++++++++++++-
>   mm/page_isolation.c             | 11 ++++++++++
>   4 files changed, 63 insertions(+), 8 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index b19a98c20de8..7ef01fe148ce 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -106,14 +106,17 @@ static inline bool migratetype_is_mergeable(int mt)
>   
>   extern int page_group_by_mobility_disabled;
>   
> -#define MIGRATETYPE_MASK ((1UL << PB_migratetype_bits) - 1)
> +#ifdef CONFIG_MEMORY_ISOLATION
> +#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
> +#else
> +#define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
> +#endif
> +
> +unsigned long get_pageblock_migratetype(const struct page *page);
>   
> -#define get_pageblock_migratetype(page)					\
> -	get_pfnblock_flags_mask(page, page_to_pfn(page), MIGRATETYPE_MASK)
> +#define folio_migratetype(folio)					\
> +	get_pageblock_migratetype(&folio->page)
>   
> -#define folio_migratetype(folio)				\
> -	get_pfnblock_flags_mask(&folio->page, folio_pfn(folio),		\
> -			MIGRATETYPE_MASK)
>   struct free_area {
>   	struct list_head	free_list[MIGRATE_TYPES];
>   	unsigned long		nr_free;
> diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
> index 0c4963339f0b..00040e7df8c8 100644
> --- a/include/linux/pageblock-flags.h
> +++ b/include/linux/pageblock-flags.h
> @@ -20,7 +20,10 @@ enum pageblock_bits {
>   	PB_migrate_end = PB_migrate + PB_migratetype_bits - 1,
>   			/* 3 bits required for migrate types */
>   	PB_migrate_skip,/* If set the block is skipped by compaction */
> -
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	PB_migrate_isolate = 7, /* If set the block is isolated */
> +			/* set it to 7 to make pageblock bit word aligned */

I think what we want to do here is align NR_PAGEBLOCK_BITS up to 4 bits 
at relevant places. Or go to the next power-of-2.

Could we simply to that using something like

#ifdef CONFIG_MEMORY_ISOLATION
	PB_migrate_isolate, /* If set the block is isolated */
#endif
	__NR_PAGEBLOCK_BITS
};

/* We always want the bits to be a power of 2. */
#define NR_PAGEBLOCK_BITS (roundup_pow_of_two(__NR_PAGEBLOCK_BITS))


Would something like that work?

> +#endif
>   	/*
>   	 * Assume the bits will always align on a word. If this assumption
>   	 * changes then get/set pageblock needs updating.
> @@ -28,6 +31,10 @@ enum pageblock_bits {
>   	NR_PAGEBLOCK_BITS
 >   };>
> +#ifdef CONFIG_MEMORY_ISOLATION
> +#define PB_migrate_isolate_bit BIT(PB_migrate_isolate)
> +#endif
> +

I assume we should first change users ot "1 << (PB_migrate_skip)" to 
PB_migrate_skip_bit to keep it similar.

>   #if defined(CONFIG_PAGE_BLOCK_ORDER)
>   #define PAGE_BLOCK_ORDER CONFIG_PAGE_BLOCK_ORDER
>   #else
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index c77592b22256..04e301fb4879 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -381,10 +381,31 @@ unsigned long get_pfnblock_flags_mask(const struct page *page,
>   	return (word >> bitidx) & mask;
>   }
>   
> +unsigned long get_pageblock_migratetype(const struct page *page)
> +{
> +	unsigned long flags;
> +
> +	flags = get_pfnblock_flags_mask(page, page_to_pfn(page),
> +			MIGRATETYPE_MASK);

When calling functions, we usually indent up to the beginning of the 
parameters. Same for the other cases below.

... or just exceed the 80 chars a bit in this case. :)

> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (flags & PB_migrate_isolate_bit)
> +		return MIGRATE_ISOLATE;
> +#endif
> +	return flags;
> +}
> +
>   static __always_inline int get_pfnblock_migratetype(const struct page *page,
>   					unsigned long pfn)
>   {
> -	return get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
> +	unsigned long flags;
> +
> +	flags = get_pfnblock_flags_mask(page, pfn,
> +			MIGRATETYPE_MASK);

This should fit into a single line.

> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (flags & PB_migrate_isolate_bit)
> +		return MIGRATE_ISOLATE;
> +#endif

If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could 
you ever get PB_migrate_isolate_bit?


I think what we should do is

1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()

2) Remove the mask parameter

3) Perform the masking in all callers.



Maybe, we should convert set_pfnblock_flags_mask() to

void set_clear_pfnblock_flags(struct page *page, unsigned long
			      set_flags, unsigned long clear_flags);

And better, splitting it up (or providing helpers)

set_pfnblock_flags(struct page *page, unsigned long flags);
clear_pfnblock_flags(struct page *page, unsigned long flags);


This implies some more code cleanups first that make the code easier to 
extend.

> +	return flags;
>   }
>   
>   /**
> @@ -402,8 +423,14 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
>   	unsigned long bitidx, word_bitidx;
>   	unsigned long word;
>   
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
> +	/* extra one for MIGRATE_ISOLATE */
> +	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits) + 1);
> +#else
>   	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
>   	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
> +#endif
>   
>   	bitmap = get_pageblock_bitmap(page, pfn);
>   	bitidx = pfn_to_bitidx(page, pfn);
> @@ -426,6 +453,13 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>   		     migratetype < MIGRATE_PCPTYPES))
>   		migratetype = MIGRATE_UNMOVABLE;
>   
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (migratetype == MIGRATE_ISOLATE) {
> +		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> +				page_to_pfn(page), PB_migrate_isolate_bit);
> +		return;
> +	}
> +#endif
>   	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>   				page_to_pfn(page), MIGRATETYPE_MASK);
>   }
> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
> index b2fc5266e3d2..751e21f6d85e 100644
> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -15,6 +15,17 @@
>   #define CREATE_TRACE_POINTS
>   #include <trace/events/page_isolation.h>
>   
> +static inline bool __maybe_unused get_pageblock_isolate(struct page *page)
> +{
> +	return get_pfnblock_flags_mask(page, page_to_pfn(page),
> +			PB_migrate_isolate_bit);
> +}
> +static inline void clear_pageblock_isolate(struct page *page)
> +{
> +	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
> +			PB_migrate_isolate_bit);
> +}

Should these reside in include/linux/pageblock-flags.h, just like the
CONFIG_COMPACTION "skip" variants?

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
  2025-05-12  6:25   ` kernel test robot
  2025-05-12 16:10   ` Lorenzo Stoakes
@ 2025-05-19  8:21   ` David Hildenbrand
  2025-05-19 23:06     ` Zi Yan
  2 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-19  8:21 UTC (permalink / raw)
  To: Zi Yan, Oscar Salvador, Johannes Weiner, linux-mm
  Cc: Andrew Morton, Vlastimil Babka, Baolin Wang, Kirill A . Shutemov,
	Mel Gorman, Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Richard Chang, linux-kernel

On 09.05.25 22:01, Zi Yan wrote:
> Since migratetype is no longer overwritten during pageblock isolation,
> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
> 
> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
> when a pageblock is isolated. It is used by move_freepages_block_isolate().
> 
> Add pageblock_isolate_and_move_free_pages() and
> pageblock_unisolate_and_move_free_pages() to be explicit about the page
> isolation operations. Both share the common code in
> __move_freepages_block_isolate(), which is renamed from
> move_freepages_block_isolate().
> 
> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
> so that one should use set_pageblock_isolate() to isolate pageblocks.
> 
> Two consequential changes:
> 1. move pageblock migratetype code out of __move_freepages_block().
> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>     called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>     pageblocks are isolated afterwards. Otherwise, all online pageblocks
>     will have non-determined migratetype.
> 
> Signed-off-by: Zi Yan <ziy@nvidia.com>
> ---
>   include/linux/mmzone.h         |  4 +-
>   include/linux/page-isolation.h |  5 ++-
>   mm/memory_hotplug.c            |  7 +++-
>   mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>   mm/page_isolation.c            | 27 ++++++++-----
>   5 files changed, 82 insertions(+), 34 deletions(-)
> 
> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
> index 7ef01fe148ce..f66895456974 100644
> --- a/include/linux/mmzone.h
> +++ b/include/linux/mmzone.h
> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>   extern int page_group_by_mobility_disabled;
>   
>   #ifdef CONFIG_MEMORY_ISOLATION
> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>   #else
> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>   #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>   #endif
>   
> diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
> index 898bb788243b..b0a2af0a5357 100644
> --- a/include/linux/page-isolation.h
> +++ b/include/linux/page-isolation.h
> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>   #define REPORT_FAILURE	0x2
>   
>   void set_pageblock_migratetype(struct page *page, int migratetype);
> +void set_pageblock_isolate(struct page *page);
>   
> -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
> -				  int migratetype);
> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>   
>   int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>   			     int migratetype, int flags);
> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b1caedbade5b..c86c47bba019 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>   	const int nid = zone_to_nid(zone);
>   	int ret;
>   	struct memory_notify arg;
> +	unsigned long isol_pfn;
>   
>   	/*
>   	 * {on,off}lining is constrained to full memory sections (or more
> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>   
>   
>   	/* associate pfn range with the zone */
> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
> +	for (isol_pfn = pfn;
> +	     isol_pfn < pfn + nr_pages;
> +	     isol_pfn += pageblock_nr_pages)
> +		set_pageblock_isolate(pfn_to_page(isol_pfn));

Can we move that all the way into memmap_init_range(), where we do the
set_pageblock_migratetype()?

The MIGRATE_UNMOVABLE in mhp_init_memmap_on_memory() is likely fine: all
pages in that pageblock will be used for the memmap. Everything is unmovable,
but no free pages so ... nobody cares? :)


diff --git a/mm/internal.h b/mm/internal.h
index 6b8ed20177432..bc102846fcf1f 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -821,7 +821,7 @@ extern void *memmap_alloc(phys_addr_t size, phys_addr_t align,
  			  int nid, bool exact_nid);
  
  void memmap_init_range(unsigned long, int, unsigned long, unsigned long,
-		unsigned long, enum meminit_context, struct vmem_altmap *, int);
+		unsigned long, enum meminit_context, struct vmem_altmap *, bool);
  
  #if defined CONFIG_COMPACTION || defined CONFIG_CMA
  
diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
index b1caedbade5b1..4b2cf20ad21fb 100644
--- a/mm/memory_hotplug.c
+++ b/mm/memory_hotplug.c
@@ -764,13 +764,13 @@ static inline void section_taint_zone_device(unsigned long pfn)
   * and resizing the pgdat/zone data to span the added pages. After this
   * call, all affected pages are PageOffline().
   *
- * All aligned pageblocks are initialized to the specified migratetype
- * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
- * zone stats (e.g., nr_isolate_pageblock) are touched.
+ * All aligned pageblocks are initialized to MIGRATE_MOVABLE, and are isolated
+ * if requested. Besides setting the migratetype, no related zone stats (e.g.,
+ * nr_isolate_pageblock) are touched.
   */
  void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
  				  unsigned long nr_pages,
-				  struct vmem_altmap *altmap, int migratetype)
+				  struct vmem_altmap *altmap, bool isolate)
  {
  	struct pglist_data *pgdat = zone->zone_pgdat;
  	int nid = pgdat->node_id;
@@ -802,7 +802,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
  	 * are reserved so nobody should be touching them so we should be safe
  	 */
  	memmap_init_range(nr_pages, nid, zone_idx(zone), start_pfn, 0,
-			 MEMINIT_HOTPLUG, altmap, migratetype);
+			 MEMINIT_HOTPLUG, altmap, isolate);
  
  	set_zone_contiguous(zone);
  }
@@ -1127,7 +1127,7 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
  	if (mhp_off_inaccessible)
  		page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
  
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, false);
  
  	for (i = 0; i < nr_pages; i++) {
  		struct page *page = pfn_to_page(pfn + i);
@@ -1192,7 +1192,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
  
  
  	/* associate pfn range with the zone */
-	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
+	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, true);
  
  	arg.start_pfn = pfn;
  	arg.nr_pages = nr_pages;
diff --git a/mm/memremap.c b/mm/memremap.c
index c417c843e9b1f..e47f6809f254b 100644
--- a/mm/memremap.c
+++ b/mm/memremap.c
@@ -254,7 +254,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
  		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
  		move_pfn_range_to_zone(zone, PHYS_PFN(range->start),
  				PHYS_PFN(range_len(range)), params->altmap,
-				MIGRATE_MOVABLE);
+				false);
  	}
  
  	mem_hotplug_done();
diff --git a/mm/mm_init.c b/mm/mm_init.c
index 1c5444e188f82..041106fc524be 100644
--- a/mm/mm_init.c
+++ b/mm/mm_init.c
@@ -867,14 +867,14 @@ static void __init init_unavailable_range(unsigned long spfn,
   * up by memblock_free_all() once the early boot process is
   * done. Non-atomic initialization, single-pass.
   *
- * All aligned pageblocks are initialized to the specified migratetype
- * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
- * zone stats (e.g., nr_isolate_pageblock) are touched.
+ * All aligned pageblocks are initialized to MIGRATE_MOVABLE, and are isolated
+ * if requested. Besides setting the migratetype, no related zone stats (e.g.,
+ * nr_isolate_pageblock) are touched.
   */
  void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone,
  		unsigned long start_pfn, unsigned long zone_end_pfn,
  		enum meminit_context context,
-		struct vmem_altmap *altmap, int migratetype)
+		struct vmem_altmap *altmap, bool isolate)
  {
  	unsigned long pfn, end_pfn = start_pfn + size;
  	struct page *page;
@@ -931,7 +931,9 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
  		 * over the place during system boot.
  		 */
  		if (pageblock_aligned(pfn)) {
-			set_pageblock_migratetype(page, migratetype);
+			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
+			if (isolate)
+				set_pageblock_isolate(page, isolate)
  			cond_resched();
  		}
  		pfn++;
@@ -954,7 +956,7 @@ static void __init memmap_init_zone_range(struct zone *zone,
  		return;
  
  	memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
-			  zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
+			  zone_end_pfn, MEMINIT_EARLY, NULL, false);
  
  	if (*hole_pfn < start_pfn)
  		init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
-- 
2.49.0



As an alterantive, a second "isolate" parameter and make sure that migratetype is
never MIGRATE_ISOLATE.

[...]

> --- a/mm/page_isolation.c
> +++ b/mm/page_isolation.c
> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
>   	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>   			PB_migrate_isolate_bit);
>   }
> +void set_pageblock_isolate(struct page *page)
> +{
> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
> +			page_to_pfn(page),
> +			PB_migrate_isolate_bit);
> +}

Probably better placed in the previous patch, and in the header (see comment to #1).

-- 
Cheers,

David / dhildenb


^ permalink raw reply related	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-19  7:44 ` David Hildenbrand
@ 2025-05-19 14:01   ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-19 14:01 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 3:44, David Hildenbrand wrote:

> On 09.05.25 22:01, Zi Yan wrote:
>> Hi David and Oscar,
>
> Hi,
>
>>
>> Can you take a look at Patch 2, which changes how online_pages() set
>> online pageblock migratetypes?
>
> Sorry, now looking :)
>
>> It used to first set all pageblocks to
>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>> Let me know if there is any issue with my changes.
>
> Conceptually, we should start with MIGRATE_MOVABLE + isolated, to then clear the isolated bit.

OK, in my current V5, I added
void init_pageblock_migratetype(struct page, int migratetype, bool isolate) to
do this, so that one can initialize a pageblock with a migratetype + isolated or not.
Let me check your comments on Patch 4 too.


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-18  0:20   ` Zi Yan
@ 2025-05-19 14:15     ` David Hildenbrand
  2025-05-19 14:35       ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-19 14:15 UTC (permalink / raw)
  To: Zi Yan, Vlastimil Babka
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Baolin Wang, Kirill A . Shutemov, Mel Gorman, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Richard Chang, linux-kernel

On 18.05.25 02:20, Zi Yan wrote:
> On 17 May 2025, at 16:26, Vlastimil Babka wrote:
> 
>> On 5/9/25 22:01, Zi Yan wrote:
>>> Hi David and Oscar,
>>>
>>> Can you take a look at Patch 2, which changes how online_pages() set
>>> online pageblock migratetypes? It used to first set all pageblocks to
>>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>>> Let me know if there is any issue with my changes.
>>>
>>> Hi Johannes,
>>>
>>> Patch 2 now have set_pageblock_migratetype() not accepting
>>> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
>>> feedback.
>>>
>>> Hi all,
>>>
>>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>>> being overwritten during pageblock isolation process. Currently,
>>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>>> migratetype. This causes pageblock migratetype loss during
>>> alloc_contig_range() and memory offline, especially when the process
>>> fails due to a failed pageblock isolation and the code tries to undo the
>>> finished pageblock isolations.
>>
>> Seems mostly fine to me, just sent suggestion for 4/4.
> 
> Thanks.
> 
>> I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
>> I also see that it's useful for it to be because then it means it has the
>> freelists in the buddy allocator, can work via __move_freepages_block() etc.
> 
> Yeah, I wanted to remove MIGRATE_ISOLATE from migratetype too, but there
> is a MIGRATE_ISOLATE freelist and /proc/pagetypeinfo also shows isolated
> free pages.

The latter, we can likely fake.

Is there a reasonable way to remove MIGRATE_ISOLATE completely?

Of course, we could simply duplicate the page lists (one set for 
isolated, one set for !isolated), or keep it as is and simply have a
separate one that we separate out. So, we could have a 
migratetype+isolated pair instead.

Just a thought, did not look into all the ugly details.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-19 14:15     ` David Hildenbrand
@ 2025-05-19 14:35       ` Zi Yan
  2025-05-20  8:58         ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-19 14:35 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 10:15, David Hildenbrand wrote:

> On 18.05.25 02:20, Zi Yan wrote:
>> On 17 May 2025, at 16:26, Vlastimil Babka wrote:
>>
>>> On 5/9/25 22:01, Zi Yan wrote:
>>>> Hi David and Oscar,
>>>>
>>>> Can you take a look at Patch 2, which changes how online_pages() set
>>>> online pageblock migratetypes? It used to first set all pageblocks to
>>>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>>>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>>>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>>>> Let me know if there is any issue with my changes.
>>>>
>>>> Hi Johannes,
>>>>
>>>> Patch 2 now have set_pageblock_migratetype() not accepting
>>>> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
>>>> feedback.
>>>>
>>>> Hi all,
>>>>
>>>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>>>> being overwritten during pageblock isolation process. Currently,
>>>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>>>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>>>> migratetype. This causes pageblock migratetype loss during
>>>> alloc_contig_range() and memory offline, especially when the process
>>>> fails due to a failed pageblock isolation and the code tries to undo the
>>>> finished pageblock isolations.
>>>
>>> Seems mostly fine to me, just sent suggestion for 4/4.
>>
>> Thanks.
>>
>>> I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
>>> I also see that it's useful for it to be because then it means it has the
>>> freelists in the buddy allocator, can work via __move_freepages_block() etc.
>>
>> Yeah, I wanted to remove MIGRATE_ISOLATE from migratetype too, but there
>> is a MIGRATE_ISOLATE freelist and /proc/pagetypeinfo also shows isolated
>> free pages.
>
> The latter, we can likely fake.
>
> Is there a reasonable way to remove MIGRATE_ISOLATE completely?
>
> Of course, we could simply duplicate the page lists (one set for isolated, one set for !isolated), or keep it as is and simply have a

That could work. It will change vmcore layout and I wonder if that is a concern
or not.

> separate one that we separate out. So, we could have a migratetype+isolated pair instead.

What do you mean by a migratetype+isolate pair?

>
> Just a thought, did not look into all the ugly details.

Another thought is that maybe caller should keep the isolated free pages instead
to make it actually isolated. We might need to keep per-order isolated free page
stats to fake /proc/pagetypeinfo.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-19  8:08   ` David Hildenbrand
@ 2025-05-19 15:08     ` Zi Yan
  2025-05-19 16:42       ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-19 15:08 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 4:08, David Hildenbrand wrote:

> On 09.05.25 22:01, Zi Yan wrote:
>> During page isolation, the original migratetype is overwritten, since
>> MIGRATE_* are enums and stored in pageblock bitmaps. Change
>> MIGRATE_ISOLATE to be stored a standalone bit, PB_migrate_isolate, like
>> PB_migrate_skip, so that migratetype is not lost during pageblock
>> isolation. pageblock bits needs to be word aligned, so expand
>> the number of pageblock bits from 4 to 8 and make PB_migrate_isolate bit 7.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>   include/linux/mmzone.h          | 15 ++++++++------
>>   include/linux/pageblock-flags.h |  9 ++++++++-
>>   mm/page_alloc.c                 | 36 ++++++++++++++++++++++++++++++++-
>>   mm/page_isolation.c             | 11 ++++++++++
>>   4 files changed, 63 insertions(+), 8 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index b19a98c20de8..7ef01fe148ce 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -106,14 +106,17 @@ static inline bool migratetype_is_mergeable(int mt)
>>    extern int page_group_by_mobility_disabled;
>>  -#define MIGRATETYPE_MASK ((1UL << PB_migratetype_bits) - 1)
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
>> +#else
>> +#define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>> +#endif
>> +
>> +unsigned long get_pageblock_migratetype(const struct page *page);
>>  -#define get_pageblock_migratetype(page)					\
>> -	get_pfnblock_flags_mask(page, page_to_pfn(page), MIGRATETYPE_MASK)
>> +#define folio_migratetype(folio)					\
>> +	get_pageblock_migratetype(&folio->page)
>>  -#define folio_migratetype(folio)				\
>> -	get_pfnblock_flags_mask(&folio->page, folio_pfn(folio),		\
>> -			MIGRATETYPE_MASK)
>>   struct free_area {
>>   	struct list_head	free_list[MIGRATE_TYPES];
>>   	unsigned long		nr_free;
>> diff --git a/include/linux/pageblock-flags.h b/include/linux/pageblock-flags.h
>> index 0c4963339f0b..00040e7df8c8 100644
>> --- a/include/linux/pageblock-flags.h
>> +++ b/include/linux/pageblock-flags.h
>> @@ -20,7 +20,10 @@ enum pageblock_bits {
>>   	PB_migrate_end = PB_migrate + PB_migratetype_bits - 1,
>>   			/* 3 bits required for migrate types */
>>   	PB_migrate_skip,/* If set the block is skipped by compaction */
>> -
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	PB_migrate_isolate = 7, /* If set the block is isolated */
>> +			/* set it to 7 to make pageblock bit word aligned */
>
> I think what we want to do here is align NR_PAGEBLOCK_BITS up to 4 bits at relevant places. Or go to the next power-of-2.
>
> Could we simply to that using something like
>
> #ifdef CONFIG_MEMORY_ISOLATION
> 	PB_migrate_isolate, /* If set the block is isolated */
> #endif
> 	__NR_PAGEBLOCK_BITS
> };
>
> /* We always want the bits to be a power of 2. */
> #define NR_PAGEBLOCK_BITS (roundup_pow_of_two(__NR_PAGEBLOCK_BITS))
>
>
> Would something like that work?

Yes, it builds and boots on x86_64 for MEMROY_ISOLATION and !MEMORY_ISOLATION.
Will add this change.

>
>> +#endif
>>   	/*
>>   	 * Assume the bits will always align on a word. If this assumption
>>   	 * changes then get/set pageblock needs updating.
>> @@ -28,6 +31,10 @@ enum pageblock_bits {
>>   	NR_PAGEBLOCK_BITS
>>   };>
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +#define PB_migrate_isolate_bit BIT(PB_migrate_isolate)
>> +#endif
>> +
>
> I assume we should first change users ot "1 << (PB_migrate_skip)" to PB_migrate_skip_bit to keep it similar.

Will add this.
>
>>   #if defined(CONFIG_PAGE_BLOCK_ORDER)
>>   #define PAGE_BLOCK_ORDER CONFIG_PAGE_BLOCK_ORDER
>>   #else
>> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
>> index c77592b22256..04e301fb4879 100644
>> --- a/mm/page_alloc.c
>> +++ b/mm/page_alloc.c
>> @@ -381,10 +381,31 @@ unsigned long get_pfnblock_flags_mask(const struct page *page,
>>   	return (word >> bitidx) & mask;
>>   }
>>  +unsigned long get_pageblock_migratetype(const struct page *page)
>> +{
>> +	unsigned long flags;
>> +
>> +	flags = get_pfnblock_flags_mask(page, page_to_pfn(page),
>> +			MIGRATETYPE_MASK);
>
> When calling functions, we usually indent up to the beginning of the parameters. Same for the other cases below.

OK, will follow this. I was confused, since I see various indentations across files.
There is .clang-format and clang-format indeed indent parameters like you said,
then I will use clang-format.

>
> ... or just exceed the 80 chars a bit in this case. :)
>
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	if (flags & PB_migrate_isolate_bit)
>> +		return MIGRATE_ISOLATE;
>> +#endif
>> +	return flags;
>> +}
>> +
>>   static __always_inline int get_pfnblock_migratetype(const struct page *page,
>>   					unsigned long pfn)
>>   {
>> -	return get_pfnblock_flags_mask(page, pfn, MIGRATETYPE_MASK);
>> +	unsigned long flags;
>> +
>> +	flags = get_pfnblock_flags_mask(page, pfn,
>> +			MIGRATETYPE_MASK);
>
> This should fit into a single line.

Sure.

>
>> +#ifdef CONFIG_MEMORY_ISOLATION
>> +	if (flags & PB_migrate_isolate_bit)
>> +		return MIGRATE_ISOLATE;
>> +#endif
>
> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?

MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
so it gets PB_migrate_isolate_bit.

>
>
> I think what we should do is
>
> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>
> 2) Remove the mask parameter
>
> 3) Perform the masking in all callers.

get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
get PB_migrate_skip. I do not think we want to include PB_migrate_skip
in the mask to confuse readers.

>
>
>
> Maybe, we should convert set_pfnblock_flags_mask() to
>
> void set_clear_pfnblock_flags(struct page *page, unsigned long
> 			      set_flags, unsigned long clear_flags);
>
> And better, splitting it up (or providing helpers)
>
> set_pfnblock_flags(struct page *page, unsigned long flags);
> clear_pfnblock_flags(struct page *page, unsigned long flags);
>
>
> This implies some more code cleanups first that make the code easier to extend.
>

The same due to PB_migrate_skip.

Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
internal APIs by prepending "__". They are only used by the new
{get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.

WDYT?

>> +	return flags;
>>   }
>>    /**
>> @@ -402,8 +423,14 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
>>   	unsigned long bitidx, word_bitidx;
>>   	unsigned long word;
>>  +#ifdef CONFIG_MEMORY_ISOLATION
>> +	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
>> +	/* extra one for MIGRATE_ISOLATE */
>> +	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits) + 1);
>> +#else
>>   	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
>>   	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
>> +#endif
>>    	bitmap = get_pageblock_bitmap(page, pfn);
>>   	bitidx = pfn_to_bitidx(page, pfn);
>> @@ -426,6 +453,13 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>>   		     migratetype < MIGRATE_PCPTYPES))
>>   		migratetype = MIGRATE_UNMOVABLE;
>>  +#ifdef CONFIG_MEMORY_ISOLATION
>> +	if (migratetype == MIGRATE_ISOLATE) {
>> +		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>> +				page_to_pfn(page), PB_migrate_isolate_bit);
>> +		return;
>> +	}
>> +#endif
>>   	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>>   				page_to_pfn(page), MIGRATETYPE_MASK);
>>   }
>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>> index b2fc5266e3d2..751e21f6d85e 100644
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -15,6 +15,17 @@
>>   #define CREATE_TRACE_POINTS
>>   #include <trace/events/page_isolation.h>
>>  +static inline bool __maybe_unused get_pageblock_isolate(struct page *page)
>> +{
>> +	return get_pfnblock_flags_mask(page, page_to_pfn(page),
>> +			PB_migrate_isolate_bit);
>> +}
>> +static inline void clear_pageblock_isolate(struct page *page)
>> +{
>> +	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>> +			PB_migrate_isolate_bit);
>> +}
>
> Should these reside in include/linux/pageblock-flags.h, just like the
> CONFIG_COMPACTION "skip" variants?

They are only used inside mm/page_isolation.c, so I would leave them
here until other users come out.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-19 15:08     ` Zi Yan
@ 2025-05-19 16:42       ` David Hildenbrand
  2025-05-19 17:15         ` Zi Yan
  2025-05-21 11:16         ` Zi Yan
  0 siblings, 2 replies; 42+ messages in thread
From: David Hildenbrand @ 2025-05-19 16:42 UTC (permalink / raw)
  To: Zi Yan
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel


>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>> +	if (flags & PB_migrate_isolate_bit)
>>> +		return MIGRATE_ISOLATE;
>>> +#endif
>>
>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
> 
> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
> so it gets PB_migrate_isolate_bit.
> 

Oh ... that's confusing.

>>
>>
>> I think what we should do is
>>
>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>
>> 2) Remove the mask parameter
>>
>> 3) Perform the masking in all callers.
> 
> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
> in the mask to confuse readers.

The masking will be handled in the caller.

So get_pageblock_skip() would essentially do a

return get_pfnblock_flags() & PB_migrate_skip_bit;

etc.

> 
>>
>>
>>
>> Maybe, we should convert set_pfnblock_flags_mask() to
>>
>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>> 			      set_flags, unsigned long clear_flags);
>>
>> And better, splitting it up (or providing helpers)
>>
>> set_pfnblock_flags(struct page *page, unsigned long flags);
>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>
>>
>> This implies some more code cleanups first that make the code easier to extend.
>>
> 
> The same due to PB_migrate_skip.
> 
> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
> internal APIs by prepending "__". They are only used by the new
> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
> 
> WDYT?

In general, lgtm. I just hope we can avoid the "_mask" part and just 
handle it in these functions directly?

> 
>>> +	return flags;
>>>    }
>>>     /**
>>> @@ -402,8 +423,14 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
>>>    	unsigned long bitidx, word_bitidx;
>>>    	unsigned long word;
>>>   +#ifdef CONFIG_MEMORY_ISOLATION
>>> +	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
>>> +	/* extra one for MIGRATE_ISOLATE */
>>> +	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits) + 1);
>>> +#else
>>>    	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
>>>    	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
>>> +#endif
>>>     	bitmap = get_pageblock_bitmap(page, pfn);
>>>    	bitidx = pfn_to_bitidx(page, pfn);
>>> @@ -426,6 +453,13 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>>>    		     migratetype < MIGRATE_PCPTYPES))
>>>    		migratetype = MIGRATE_UNMOVABLE;
>>>   +#ifdef CONFIG_MEMORY_ISOLATION
>>> +	if (migratetype == MIGRATE_ISOLATE) {
>>> +		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>>> +				page_to_pfn(page), PB_migrate_isolate_bit);
>>> +		return;
>>> +	}
>>> +#endif
>>>    	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>>>    				page_to_pfn(page), MIGRATETYPE_MASK);
>>>    }
>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>> index b2fc5266e3d2..751e21f6d85e 100644
>>> --- a/mm/page_isolation.c
>>> +++ b/mm/page_isolation.c
>>> @@ -15,6 +15,17 @@
>>>    #define CREATE_TRACE_POINTS
>>>    #include <trace/events/page_isolation.h>
>>>   +static inline bool __maybe_unused get_pageblock_isolate(struct page *page)
>>> +{
>>> +	return get_pfnblock_flags_mask(page, page_to_pfn(page),
>>> +			PB_migrate_isolate_bit);
>>> +}
>>> +static inline void clear_pageblock_isolate(struct page *page)
>>> +{
>>> +	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>>> +			PB_migrate_isolate_bit);
>>> +}
>>
>> Should these reside in include/linux/pageblock-flags.h, just like the
>> CONFIG_COMPACTION "skip" variants?
> 
> They are only used inside mm/page_isolation.c, so I would leave them
> here until other users come out.

get_pageblock_skip() and friends are also only used in mm/compaction.c.

Having these simple wrapper as inline functions in the same header 
should make it consistent.

... and avoid tricks like "__maybe_unused" here :)


-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-19 16:42       ` David Hildenbrand
@ 2025-05-19 17:15         ` Zi Yan
  2025-05-21 11:16         ` Zi Yan
  1 sibling, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-19 17:15 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 12:42, David Hildenbrand wrote:

>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>> +	if (flags & PB_migrate_isolate_bit)
>>>> +		return MIGRATE_ISOLATE;
>>>> +#endif
>>>
>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>
>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>> so it gets PB_migrate_isolate_bit.
>>
>
> Oh ... that's confusing.
>
>>>
>>>
>>> I think what we should do is
>>>
>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>
>>> 2) Remove the mask parameter
>>>
>>> 3) Perform the masking in all callers.
>>
>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>> in the mask to confuse readers.
>
> The masking will be handled in the caller.
>
> So get_pageblock_skip() would essentially do a
>
> return get_pfnblock_flags() & PB_migrate_skip_bit;
>
> etc.

Got it. Sounds good to me. Will do this.

>
>>
>>>
>>>
>>>
>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>
>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>> 			      set_flags, unsigned long clear_flags);
>>>
>>> And better, splitting it up (or providing helpers)
>>>
>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>
>>>
>>> This implies some more code cleanups first that make the code easier to extend.
>>>
>>
>> The same due to PB_migrate_skip.
>>
>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>> internal APIs by prepending "__". They are only used by the new
>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>
>> WDYT?
>
> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?

Sounds good to me. Will put this and
"#define NR_PAGEBLOCK_BITS (roundup_pow_of_two(__NR_PAGEBLOCK_BITS))"
in a cleanup patch before Patch 1.

>
>>
>>>> +	return flags;
>>>>    }
>>>>     /**
>>>> @@ -402,8 +423,14 @@ void set_pfnblock_flags_mask(struct page *page, unsigned long flags,
>>>>    	unsigned long bitidx, word_bitidx;
>>>>    	unsigned long word;
>>>>   +#ifdef CONFIG_MEMORY_ISOLATION
>>>> +	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 8);
>>>> +	/* extra one for MIGRATE_ISOLATE */
>>>> +	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits) + 1);
>>>> +#else
>>>>    	BUILD_BUG_ON(NR_PAGEBLOCK_BITS != 4);
>>>>    	BUILD_BUG_ON(MIGRATE_TYPES > (1 << PB_migratetype_bits));
>>>> +#endif
>>>>     	bitmap = get_pageblock_bitmap(page, pfn);
>>>>    	bitidx = pfn_to_bitidx(page, pfn);
>>>> @@ -426,6 +453,13 @@ void set_pageblock_migratetype(struct page *page, int migratetype)
>>>>    		     migratetype < MIGRATE_PCPTYPES))
>>>>    		migratetype = MIGRATE_UNMOVABLE;
>>>>   +#ifdef CONFIG_MEMORY_ISOLATION
>>>> +	if (migratetype == MIGRATE_ISOLATE) {
>>>> +		set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>>>> +				page_to_pfn(page), PB_migrate_isolate_bit);
>>>> +		return;
>>>> +	}
>>>> +#endif
>>>>    	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
>>>>    				page_to_pfn(page), MIGRATETYPE_MASK);
>>>>    }
>>>> diff --git a/mm/page_isolation.c b/mm/page_isolation.c
>>>> index b2fc5266e3d2..751e21f6d85e 100644
>>>> --- a/mm/page_isolation.c
>>>> +++ b/mm/page_isolation.c
>>>> @@ -15,6 +15,17 @@
>>>>    #define CREATE_TRACE_POINTS
>>>>    #include <trace/events/page_isolation.h>
>>>>   +static inline bool __maybe_unused get_pageblock_isolate(struct page *page)
>>>> +{
>>>> +	return get_pfnblock_flags_mask(page, page_to_pfn(page),
>>>> +			PB_migrate_isolate_bit);
>>>> +}
>>>> +static inline void clear_pageblock_isolate(struct page *page)
>>>> +{
>>>> +	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>>>> +			PB_migrate_isolate_bit);
>>>> +}
>>>
>>> Should these reside in include/linux/pageblock-flags.h, just like the
>>> CONFIG_COMPACTION "skip" variants?
>>
>> They are only used inside mm/page_isolation.c, so I would leave them
>> here until other users come out.
>
> get_pageblock_skip() and friends are also only used in mm/compaction.c.
>
> Having these simple wrapper as inline functions in the same header should make it consistent.
>
> ... and avoid tricks like "__maybe_unused" here :)

OK, will do this.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-19  8:21   ` David Hildenbrand
@ 2025-05-19 23:06     ` Zi Yan
  2025-05-20  8:58       ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-19 23:06 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 4:21, David Hildenbrand wrote:

> On 09.05.25 22:01, Zi Yan wrote:
>> Since migratetype is no longer overwritten during pageblock isolation,
>> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
>>
>> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
>> when a pageblock is isolated. It is used by move_freepages_block_isolate().
>>
>> Add pageblock_isolate_and_move_free_pages() and
>> pageblock_unisolate_and_move_free_pages() to be explicit about the page
>> isolation operations. Both share the common code in
>> __move_freepages_block_isolate(), which is renamed from
>> move_freepages_block_isolate().
>>
>> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
>> so that one should use set_pageblock_isolate() to isolate pageblocks.
>>
>> Two consequential changes:
>> 1. move pageblock migratetype code out of __move_freepages_block().
>> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>>     called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>>     pageblocks are isolated afterwards. Otherwise, all online pageblocks
>>     will have non-determined migratetype.
>>
>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>> ---
>>   include/linux/mmzone.h         |  4 +-
>>   include/linux/page-isolation.h |  5 ++-
>>   mm/memory_hotplug.c            |  7 +++-
>>   mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>>   mm/page_isolation.c            | 27 ++++++++-----
>>   5 files changed, 82 insertions(+), 34 deletions(-)
>>
>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>> index 7ef01fe148ce..f66895456974 100644
>> --- a/include/linux/mmzone.h
>> +++ b/include/linux/mmzone.h
>> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>>   extern int page_group_by_mobility_disabled;
>>    #ifdef CONFIG_MEMORY_ISOLATION
>> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
>> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
>> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>>   #else
>> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>>   #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>>   #endif
>>  diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
>> index 898bb788243b..b0a2af0a5357 100644
>> --- a/include/linux/page-isolation.h
>> +++ b/include/linux/page-isolation.h
>> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>>   #define REPORT_FAILURE	0x2
>>    void set_pageblock_migratetype(struct page *page, int migratetype);
>> +void set_pageblock_isolate(struct page *page);
>>  -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>> -				  int migratetype);
>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>>    int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>   			     int migratetype, int flags);
>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>> index b1caedbade5b..c86c47bba019 100644
>> --- a/mm/memory_hotplug.c
>> +++ b/mm/memory_hotplug.c
>> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>   	const int nid = zone_to_nid(zone);
>>   	int ret;
>>   	struct memory_notify arg;
>> +	unsigned long isol_pfn;
>>    	/*
>>   	 * {on,off}lining is constrained to full memory sections (or more
>> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>      	/* associate pfn range with the zone */
>> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
>> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
>> +	for (isol_pfn = pfn;
>> +	     isol_pfn < pfn + nr_pages;
>> +	     isol_pfn += pageblock_nr_pages)
>> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
>
> Can we move that all the way into memmap_init_range(), where we do the
> set_pageblock_migratetype()?
>
> The MIGRATE_UNMOVABLE in mhp_init_memmap_on_memory() is likely fine: all
> pages in that pageblock will be used for the memmap. Everything is unmovable,
> but no free pages so ... nobody cares? :)

My approach is similar, but a new init_pageblock_migratetype() like
below. Then, I added "bool isolate" instead of replacing the existing
"int migratetype". The advantage is that it saves a call to
set_pfnblock_flags_mask() for each pageblock. Like the alternative
you suggested below.

+void __meminit init_pageblock_migratetype(struct page *page, int migratetype,
+		bool isolate)
+{
+	if (unlikely(page_group_by_mobility_disabled &&
+		     migratetype < MIGRATE_PCPTYPES))
+		migratetype = MIGRATE_UNMOVABLE;
+
+#ifdef CONFIG_MEMORY_ISOLATION
+	if (migratetype == MIGRATE_ISOLATE) {
+		VM_WARN(1,
+			"Set isolate=true to isolate pageblock with a migratetype");
+		return;
+	}
+	if (isolate)
+		migratetype |= PB_migrate_isolate_bit;
+#endif
+	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
+				page_to_pfn(page), MIGRATETYPE_MASK);
+}
+

>
>
> diff --git a/mm/internal.h b/mm/internal.h
> index 6b8ed20177432..bc102846fcf1f 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -821,7 +821,7 @@ extern void *memmap_alloc(phys_addr_t size, phys_addr_t align,
>  			  int nid, bool exact_nid);
>   void memmap_init_range(unsigned long, int, unsigned long, unsigned long,
> -		unsigned long, enum meminit_context, struct vmem_altmap *, int);
> +		unsigned long, enum meminit_context, struct vmem_altmap *, bool);
>   #if defined CONFIG_COMPACTION || defined CONFIG_CMA
>  diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
> index b1caedbade5b1..4b2cf20ad21fb 100644
> --- a/mm/memory_hotplug.c
> +++ b/mm/memory_hotplug.c
> @@ -764,13 +764,13 @@ static inline void section_taint_zone_device(unsigned long pfn)
>   * and resizing the pgdat/zone data to span the added pages. After this
>   * call, all affected pages are PageOffline().
>   *
> - * All aligned pageblocks are initialized to the specified migratetype
> - * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
> - * zone stats (e.g., nr_isolate_pageblock) are touched.
> + * All aligned pageblocks are initialized to MIGRATE_MOVABLE, and are isolated
> + * if requested. Besides setting the migratetype, no related zone stats (e.g.,
> + * nr_isolate_pageblock) are touched.
>   */
>  void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>  				  unsigned long nr_pages,
> -				  struct vmem_altmap *altmap, int migratetype)
> +				  struct vmem_altmap *altmap, bool isolate)
>  {
>  	struct pglist_data *pgdat = zone->zone_pgdat;
>  	int nid = pgdat->node_id;
> @@ -802,7 +802,7 @@ void move_pfn_range_to_zone(struct zone *zone, unsigned long start_pfn,
>  	 * are reserved so nobody should be touching them so we should be safe
>  	 */
>  	memmap_init_range(nr_pages, nid, zone_idx(zone), start_pfn, 0,
> -			 MEMINIT_HOTPLUG, altmap, migratetype);
> +			 MEMINIT_HOTPLUG, altmap, isolate);
>   	set_zone_contiguous(zone);
>  }
> @@ -1127,7 +1127,7 @@ int mhp_init_memmap_on_memory(unsigned long pfn, unsigned long nr_pages,
>  	if (mhp_off_inaccessible)
>  		page_init_poison(pfn_to_page(pfn), sizeof(struct page) * nr_pages);
>  -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_UNMOVABLE);
> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, false);
>   	for (i = 0; i < nr_pages; i++) {
>  		struct page *page = pfn_to_page(pfn + i);
> @@ -1192,7 +1192,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>    	/* associate pfn range with the zone */
> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, true);
>   	arg.start_pfn = pfn;
>  	arg.nr_pages = nr_pages;
> diff --git a/mm/memremap.c b/mm/memremap.c
> index c417c843e9b1f..e47f6809f254b 100644
> --- a/mm/memremap.c
> +++ b/mm/memremap.c
> @@ -254,7 +254,7 @@ static int pagemap_range(struct dev_pagemap *pgmap, struct mhp_params *params,
>  		zone = &NODE_DATA(nid)->node_zones[ZONE_DEVICE];
>  		move_pfn_range_to_zone(zone, PHYS_PFN(range->start),
>  				PHYS_PFN(range_len(range)), params->altmap,
> -				MIGRATE_MOVABLE);
> +				false);
>  	}
>   	mem_hotplug_done();
> diff --git a/mm/mm_init.c b/mm/mm_init.c
> index 1c5444e188f82..041106fc524be 100644
> --- a/mm/mm_init.c
> +++ b/mm/mm_init.c
> @@ -867,14 +867,14 @@ static void __init init_unavailable_range(unsigned long spfn,
>   * up by memblock_free_all() once the early boot process is
>   * done. Non-atomic initialization, single-pass.
>   *
> - * All aligned pageblocks are initialized to the specified migratetype
> - * (usually MIGRATE_MOVABLE). Besides setting the migratetype, no related
> - * zone stats (e.g., nr_isolate_pageblock) are touched.
> + * All aligned pageblocks are initialized to MIGRATE_MOVABLE, and are isolated
> + * if requested. Besides setting the migratetype, no related zone stats (e.g.,
> + * nr_isolate_pageblock) are touched.
>   */
>  void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone,
>  		unsigned long start_pfn, unsigned long zone_end_pfn,
>  		enum meminit_context context,
> -		struct vmem_altmap *altmap, int migratetype)
> +		struct vmem_altmap *altmap, bool isolate)
>  {
>  	unsigned long pfn, end_pfn = start_pfn + size;
>  	struct page *page;
> @@ -931,7 +931,9 @@ void __meminit memmap_init_range(unsigned long size, int nid, unsigned long zone
>  		 * over the place during system boot.
>  		 */
>  		if (pageblock_aligned(pfn)) {
> -			set_pageblock_migratetype(page, migratetype);
> +			set_pageblock_migratetype(page, MIGRATE_MOVABLE);
> +			if (isolate)
> +				set_pageblock_isolate(page, isolate)
>  			cond_resched();
>  		}
>  		pfn++;
> @@ -954,7 +956,7 @@ static void __init memmap_init_zone_range(struct zone *zone,
>  		return;
>   	memmap_init_range(end_pfn - start_pfn, nid, zone_id, start_pfn,
> -			  zone_end_pfn, MEMINIT_EARLY, NULL, MIGRATE_MOVABLE);
> +			  zone_end_pfn, MEMINIT_EARLY, NULL, false);
>   	if (*hole_pfn < start_pfn)
>  		init_unavailable_range(*hole_pfn, start_pfn, zone_id, nid);
> -- 
> 2.49.0
>
>
>
> As an alterantive, a second "isolate" parameter and make sure that migratetype is
> never MIGRATE_ISOLATE.
>
> [...]
>
>> --- a/mm/page_isolation.c
>> +++ b/mm/page_isolation.c
>> @@ -25,6 +25,12 @@ static inline void clear_pageblock_isolate(struct page *page)
>>   	set_pfnblock_flags_mask(page, 0, page_to_pfn(page),
>>   			PB_migrate_isolate_bit);
>>   }
>> +void set_pageblock_isolate(struct page *page)
>> +{
>> +	set_pfnblock_flags_mask(page, PB_migrate_isolate_bit,
>> +			page_to_pfn(page),
>> +			PB_migrate_isolate_bit);
>> +}
>
> Probably better placed in the previous patch, and in the header (see comment to #1).

Sure.


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-19 14:35       ` Zi Yan
@ 2025-05-20  8:58         ` David Hildenbrand
  2025-05-20 13:18           ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-20  8:58 UTC (permalink / raw)
  To: Zi Yan
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19.05.25 16:35, Zi Yan wrote:
> On 19 May 2025, at 10:15, David Hildenbrand wrote:
> 
>> On 18.05.25 02:20, Zi Yan wrote:
>>> On 17 May 2025, at 16:26, Vlastimil Babka wrote:
>>>
>>>> On 5/9/25 22:01, Zi Yan wrote:
>>>>> Hi David and Oscar,
>>>>>
>>>>> Can you take a look at Patch 2, which changes how online_pages() set
>>>>> online pageblock migratetypes? It used to first set all pageblocks to
>>>>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>>>>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>>>>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>>>>> Let me know if there is any issue with my changes.
>>>>>
>>>>> Hi Johannes,
>>>>>
>>>>> Patch 2 now have set_pageblock_migratetype() not accepting
>>>>> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
>>>>> feedback.
>>>>>
>>>>> Hi all,
>>>>>
>>>>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>>>>> being overwritten during pageblock isolation process. Currently,
>>>>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>>>>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>>>>> migratetype. This causes pageblock migratetype loss during
>>>>> alloc_contig_range() and memory offline, especially when the process
>>>>> fails due to a failed pageblock isolation and the code tries to undo the
>>>>> finished pageblock isolations.
>>>>
>>>> Seems mostly fine to me, just sent suggestion for 4/4.
>>>
>>> Thanks.
>>>
>>>> I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
>>>> I also see that it's useful for it to be because then it means it has the
>>>> freelists in the buddy allocator, can work via __move_freepages_block() etc.
>>>
>>> Yeah, I wanted to remove MIGRATE_ISOLATE from migratetype too, but there
>>> is a MIGRATE_ISOLATE freelist and /proc/pagetypeinfo also shows isolated
>>> free pages.
>>
>> The latter, we can likely fake.
>>
>> Is there a reasonable way to remove MIGRATE_ISOLATE completely?
>>
>> Of course, we could simply duplicate the page lists (one set for isolated, one set for !isolated), or keep it as is and simply have a
> 
> That could work. It will change vmcore layout and I wonder if that is a concern
> or not.

Not really. makedumpfile will have to implement support for the new 
layout as it adds support for the new kernel version.

> 
>> separate one that we separate out. So, we could have a migratetype+isolated pair instead.
> 
> What do you mean by a migratetype+isolate pair?

If MIGRATE_ISOLATE no longer exists, relevant code would have to pass 
migratetype+isolated (essentially, what you did in 
init_pageblock_migratetype ).


E.g., we could pass around a "pageblock_info" (or however we call it, 
using a different type than a bare migratetype) from which we can easily 
extract the migratetype and the isolated state.


E.g., init_pageblock_migratetype() could then become

struct pageblock_info pb_info = {
	.migratetype = MIGRATE_MOVABLE,
	.isolated = true,
}
init_pageblock_info(page, pb_info);

So, we'd decouple the migratetype we pass around from the "isolated" 
state. Whoever needs the "isolated" state in addition to the migratetype 
should use get_pageblock_info().

When adding to lists, we can decide what to do based on that information.

> 
>>
>> Just a thought, did not look into all the ugly details.
> 
> Another thought is that maybe caller should keep the isolated free pages instead
> to make it actually isolated.

You mean, not adding them to a list at all in the buddy? I think the 
problem is that if a page gets freed while the pageblock is isolated, it 
cannot get added to the list of an owner easily.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate()
  2025-05-19 23:06     ` Zi Yan
@ 2025-05-20  8:58       ` David Hildenbrand
  0 siblings, 0 replies; 42+ messages in thread
From: David Hildenbrand @ 2025-05-20  8:58 UTC (permalink / raw)
  To: Zi Yan
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 20.05.25 01:06, Zi Yan wrote:
> On 19 May 2025, at 4:21, David Hildenbrand wrote:
> 
>> On 09.05.25 22:01, Zi Yan wrote:
>>> Since migratetype is no longer overwritten during pageblock isolation,
>>> moving pageblocks to and from MIGRATE_ISOLATE no longer needs migratetype.
>>>
>>> Add MIGRATETYPE_NO_ISO_MASK to allow read before-isolation migratetype
>>> when a pageblock is isolated. It is used by move_freepages_block_isolate().
>>>
>>> Add pageblock_isolate_and_move_free_pages() and
>>> pageblock_unisolate_and_move_free_pages() to be explicit about the page
>>> isolation operations. Both share the common code in
>>> __move_freepages_block_isolate(), which is renamed from
>>> move_freepages_block_isolate().
>>>
>>> Make set_pageblock_migratetype() only accept non MIGRATE_ISOLATE types,
>>> so that one should use set_pageblock_isolate() to isolate pageblocks.
>>>
>>> Two consequential changes:
>>> 1. move pageblock migratetype code out of __move_freepages_block().
>>> 2. in online_pages() from mm/memory_hotplug.c, move_pfn_range_to_zone() is
>>>      called with MIGRATE_MOVABLE instead of MIGRATE_ISOLATE and all affected
>>>      pageblocks are isolated afterwards. Otherwise, all online pageblocks
>>>      will have non-determined migratetype.
>>>
>>> Signed-off-by: Zi Yan <ziy@nvidia.com>
>>> ---
>>>    include/linux/mmzone.h         |  4 +-
>>>    include/linux/page-isolation.h |  5 ++-
>>>    mm/memory_hotplug.c            |  7 +++-
>>>    mm/page_alloc.c                | 73 +++++++++++++++++++++++++---------
>>>    mm/page_isolation.c            | 27 ++++++++-----
>>>    5 files changed, 82 insertions(+), 34 deletions(-)
>>>
>>> diff --git a/include/linux/mmzone.h b/include/linux/mmzone.h
>>> index 7ef01fe148ce..f66895456974 100644
>>> --- a/include/linux/mmzone.h
>>> +++ b/include/linux/mmzone.h
>>> @@ -107,8 +107,10 @@ static inline bool migratetype_is_mergeable(int mt)
>>>    extern int page_group_by_mobility_disabled;
>>>     #ifdef CONFIG_MEMORY_ISOLATION
>>> -#define MIGRATETYPE_MASK ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit)
>>> +#define MIGRATETYPE_NO_ISO_MASK (BIT(PB_migratetype_bits) - 1)
>>> +#define MIGRATETYPE_MASK (MIGRATETYPE_NO_ISO_MASK | PB_migrate_isolate_bit)
>>>    #else
>>> +#define MIGRATETYPE_NO_ISO_MASK MIGRATETYPE_MASK
>>>    #define MIGRATETYPE_MASK (BIT(PB_migratetype_bits) - 1)
>>>    #endif
>>>   diff --git a/include/linux/page-isolation.h b/include/linux/page-isolation.h
>>> index 898bb788243b..b0a2af0a5357 100644
>>> --- a/include/linux/page-isolation.h
>>> +++ b/include/linux/page-isolation.h
>>> @@ -26,9 +26,10 @@ static inline bool is_migrate_isolate(int migratetype)
>>>    #define REPORT_FAILURE	0x2
>>>     void set_pageblock_migratetype(struct page *page, int migratetype);
>>> +void set_pageblock_isolate(struct page *page);
>>>   -bool move_freepages_block_isolate(struct zone *zone, struct page *page,
>>> -				  int migratetype);
>>> +bool pageblock_isolate_and_move_free_pages(struct zone *zone, struct page *page);
>>> +bool pageblock_unisolate_and_move_free_pages(struct zone *zone, struct page *page);
>>>     int start_isolate_page_range(unsigned long start_pfn, unsigned long end_pfn,
>>>    			     int migratetype, int flags);
>>> diff --git a/mm/memory_hotplug.c b/mm/memory_hotplug.c
>>> index b1caedbade5b..c86c47bba019 100644
>>> --- a/mm/memory_hotplug.c
>>> +++ b/mm/memory_hotplug.c
>>> @@ -1178,6 +1178,7 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>>    	const int nid = zone_to_nid(zone);
>>>    	int ret;
>>>    	struct memory_notify arg;
>>> +	unsigned long isol_pfn;
>>>     	/*
>>>    	 * {on,off}lining is constrained to full memory sections (or more
>>> @@ -1192,7 +1193,11 @@ int online_pages(unsigned long pfn, unsigned long nr_pages,
>>>       	/* associate pfn range with the zone */
>>> -	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_ISOLATE);
>>> +	move_pfn_range_to_zone(zone, pfn, nr_pages, NULL, MIGRATE_MOVABLE);
>>> +	for (isol_pfn = pfn;
>>> +	     isol_pfn < pfn + nr_pages;
>>> +	     isol_pfn += pageblock_nr_pages)
>>> +		set_pageblock_isolate(pfn_to_page(isol_pfn));
>>
>> Can we move that all the way into memmap_init_range(), where we do the
>> set_pageblock_migratetype()?
>>
>> The MIGRATE_UNMOVABLE in mhp_init_memmap_on_memory() is likely fine: all
>> pages in that pageblock will be used for the memmap. Everything is unmovable,
>> but no free pages so ... nobody cares? :)
> 
> My approach is similar, but a new init_pageblock_migratetype() like
> below. Then, I added "bool isolate" instead of replacing the existing
> "int migratetype". The advantage is that it saves a call to
> set_pfnblock_flags_mask() for each pageblock. Like the alternative
> you suggested below.
> 
> +void __meminit init_pageblock_migratetype(struct page *page, int migratetype,
> +		bool isolate)
> +{
> +	if (unlikely(page_group_by_mobility_disabled &&
> +		     migratetype < MIGRATE_PCPTYPES))
> +		migratetype = MIGRATE_UNMOVABLE;
> +
> +#ifdef CONFIG_MEMORY_ISOLATION
> +	if (migratetype == MIGRATE_ISOLATE) {
> +		VM_WARN(1,
> +			"Set isolate=true to isolate pageblock with a migratetype");
> +		return;
> +	}
> +	if (isolate)
> +		migratetype |= PB_migrate_isolate_bit;
> +#endif
> +	set_pfnblock_flags_mask(page, (unsigned long)migratetype,
> +				page_to_pfn(page), MIGRATETYPE_MASK);
> +}
> +

See my other reply on maybe introducing a "struct pageblock_info" where 
we embed these things, to decouple the actual migratetype from flags.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-20  8:58         ` David Hildenbrand
@ 2025-05-20 13:18           ` Zi Yan
  2025-05-20 13:20             ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-20 13:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 20 May 2025, at 4:58, David Hildenbrand wrote:

> On 19.05.25 16:35, Zi Yan wrote:
>> On 19 May 2025, at 10:15, David Hildenbrand wrote:
>>
>>> On 18.05.25 02:20, Zi Yan wrote:
>>>> On 17 May 2025, at 16:26, Vlastimil Babka wrote:
>>>>
>>>>> On 5/9/25 22:01, Zi Yan wrote:
>>>>>> Hi David and Oscar,
>>>>>>
>>>>>> Can you take a look at Patch 2, which changes how online_pages() set
>>>>>> online pageblock migratetypes? It used to first set all pageblocks to
>>>>>> MIGRATE_ISOLATE, then let undo_isolate_page_range() move the pageblocks
>>>>>> to MIGRATE_MOVABLE. After MIGRATE_ISOLATE becomes a standalone bit, all
>>>>>> online pageblocks need to have a migratetype other than MIGRATE_ISOLATE.
>>>>>> Let me know if there is any issue with my changes.
>>>>>>
>>>>>> Hi Johannes,
>>>>>>
>>>>>> Patch 2 now have set_pageblock_migratetype() not accepting
>>>>>> MIGRATE_ISOLATE. I think it makes code better. Thank you for the great
>>>>>> feedback.
>>>>>>
>>>>>> Hi all,
>>>>>>
>>>>>> This patchset moves MIGRATE_ISOLATE to a standalone bit to avoid
>>>>>> being overwritten during pageblock isolation process. Currently,
>>>>>> MIGRATE_ISOLATE is part of enum migratetype (in include/linux/mmzone.h),
>>>>>> thus, setting a pageblock to MIGRATE_ISOLATE overwrites its original
>>>>>> migratetype. This causes pageblock migratetype loss during
>>>>>> alloc_contig_range() and memory offline, especially when the process
>>>>>> fails due to a failed pageblock isolation and the code tries to undo the
>>>>>> finished pageblock isolations.
>>>>>
>>>>> Seems mostly fine to me, just sent suggestion for 4/4.
>>>>
>>>> Thanks.
>>>>
>>>>> I was kinda hoping that MIGRATE_ISOLATE could stop being a migratetype. But
>>>>> I also see that it's useful for it to be because then it means it has the
>>>>> freelists in the buddy allocator, can work via __move_freepages_block() etc.
>>>>
>>>> Yeah, I wanted to remove MIGRATE_ISOLATE from migratetype too, but there
>>>> is a MIGRATE_ISOLATE freelist and /proc/pagetypeinfo also shows isolated
>>>> free pages.
>>>
>>> The latter, we can likely fake.
>>>
>>> Is there a reasonable way to remove MIGRATE_ISOLATE completely?
>>>
>>> Of course, we could simply duplicate the page lists (one set for isolated, one set for !isolated), or keep it as is and simply have a
>>
>> That could work. It will change vmcore layout and I wonder if that is a concern
>> or not.
>
> Not really. makedumpfile will have to implement support for the new layout as it adds support for the new kernel version.

Got it.

>
>>
>>> separate one that we separate out. So, we could have a migratetype+isolated pair instead.
>>
>> What do you mean by a migratetype+isolate pair?
>
> If MIGRATE_ISOLATE no longer exists, relevant code would have to pass migratetype+isolated (essentially, what you did in init_pageblock_migratetype ).
>
>
> E.g., we could pass around a "pageblock_info" (or however we call it, using a different type than a bare migratetype) from which we can easily extract the migratetype and the isolated state.
>
>
> E.g., init_pageblock_migratetype() could then become
>
> struct pageblock_info pb_info = {
> 	.migratetype = MIGRATE_MOVABLE,
> 	.isolated = true,
> }
> init_pageblock_info(page, pb_info);
>
> So, we'd decouple the migratetype we pass around from the "isolated" state. Whoever needs the "isolated" state in addition to the migratetype should use get_pageblock_info().
>
> When adding to lists, we can decide what to do based on that information.

This looks good to me. I can send a follow-up patchset to get rid of
MIGRATE_ISOLATE along with more cleanups like changing "int migratetype" to
"enum migratetype migratetype" in mm/page_alloc.c.

>
>>
>>>
>>> Just a thought, did not look into all the ugly details.
>>
>> Another thought is that maybe caller should keep the isolated free pages instead
>> to make it actually isolated.
>
> You mean, not adding them to a list at all in the buddy? I think the problem is that
Yes.
> if a page gets freed while the pageblock is isolated, it cannot get added to the list of an owner easily.

Right. In theory, it is possible, since when a MIGRATED_ISOLATE page is freed,
__free_one_page() can find its buddy and add the freed page to its buddy's
buddy_list without performing a merge like current code. But it needs a new
code path in __add_to_free_list(), since it is not added to the head nor the
tail of a free list.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-20 13:18           ` Zi Yan
@ 2025-05-20 13:20             ` David Hildenbrand
  2025-05-20 13:31               ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-20 13:20 UTC (permalink / raw)
  To: Zi Yan
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

>> if a page gets freed while the pageblock is isolated, it cannot get added to the list of an owner easily.
> 
> Right. In theory, it is possible, since when a MIGRATED_ISOLATE page is freed,
> __free_one_page() can find its buddy and add the freed page to its buddy's
> buddy_list without performing a merge like current code. But it needs a new
> code path in __add_to_free_list(), since it is not added to the head nor the
> tail of a free list.

But what if the whole pageblock gets freed in a single shot (IOW, there 
is no buddy to lookup the list for?).

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-20 13:20             ` David Hildenbrand
@ 2025-05-20 13:31               ` Zi Yan
  2025-05-20 13:33                 ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-20 13:31 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 20 May 2025, at 9:20, David Hildenbrand wrote:

>>> if a page gets freed while the pageblock is isolated, it cannot get added to the list of an owner easily.
>>
>> Right. In theory, it is possible, since when a MIGRATED_ISOLATE page is freed,
>> __free_one_page() can find its buddy and add the freed page to its buddy's
>> buddy_list without performing a merge like current code. But it needs a new
>> code path in __add_to_free_list(), since it is not added to the head nor the
>> tail of a free list.
>
> But what if the whole pageblock gets freed in a single shot (IOW, there is no buddy to lookup the list for?).

You are right. This means when MIGRATE_ISOLATE is removed, the global
MIGRATE_ISOLATE free list stays. BTW, looking at struct free_area,
its nr_free accounts for free pages in all migratetypes. Either struct free_area
stays the same or more code changes are needed to have isolated free list
+ !isolated free list. I will try to figure this out in the follow-up patchset.

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-20 13:31               ` Zi Yan
@ 2025-05-20 13:33                 ` David Hildenbrand
  2025-05-20 14:07                   ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-20 13:33 UTC (permalink / raw)
  To: Zi Yan
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 20.05.25 15:31, Zi Yan wrote:
> On 20 May 2025, at 9:20, David Hildenbrand wrote:
> 
>>>> if a page gets freed while the pageblock is isolated, it cannot get added to the list of an owner easily.
>>>
>>> Right. In theory, it is possible, since when a MIGRATED_ISOLATE page is freed,
>>> __free_one_page() can find its buddy and add the freed page to its buddy's
>>> buddy_list without performing a merge like current code. But it needs a new
>>> code path in __add_to_free_list(), since it is not added to the head nor the
>>> tail of a free list.
>>
>> But what if the whole pageblock gets freed in a single shot (IOW, there is no buddy to lookup the list for?).
> 

And thinking about it, another problem is if a page gets freed that has 
no buddies.

> You are right. This means when MIGRATE_ISOLATE is removed, the global
> MIGRATE_ISOLATE free list stays.

Right. We could just have that one separately from the existing array.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit
  2025-05-20 13:33                 ` David Hildenbrand
@ 2025-05-20 14:07                   ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-20 14:07 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Vlastimil Babka, Oscar Salvador, Johannes Weiner, linux-mm,
	Andrew Morton, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 20 May 2025, at 9:33, David Hildenbrand wrote:

> On 20.05.25 15:31, Zi Yan wrote:
>> On 20 May 2025, at 9:20, David Hildenbrand wrote:
>>
>>>>> if a page gets freed while the pageblock is isolated, it cannot get added to the list of an owner easily.
>>>>
>>>> Right. In theory, it is possible, since when a MIGRATED_ISOLATE page is freed,
>>>> __free_one_page() can find its buddy and add the freed page to its buddy's
>>>> buddy_list without performing a merge like current code. But it needs a new
>>>> code path in __add_to_free_list(), since it is not added to the head nor the
>>>> tail of a free list.
>>>
>>> But what if the whole pageblock gets freed in a single shot (IOW, there is no buddy to lookup the list for?).
>>
>
> And thinking about it, another problem is if a page gets freed that has no buddies.
>
>> You are right. This means when MIGRATE_ISOLATE is removed, the global
>> MIGRATE_ISOLATE free list stays.
>
> Right. We could just have that one separately from the existing array.

Yep,

struct free_area {
	struct list_head	free_list[MIGRATE_TYPES];
	struct list_head	isolate_list;
	unsigned long		nr_free;
};

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-19 16:42       ` David Hildenbrand
  2025-05-19 17:15         ` Zi Yan
@ 2025-05-21 11:16         ` Zi Yan
  2025-05-21 11:57           ` David Hildenbrand
  1 sibling, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-21 11:16 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 19 May 2025, at 12:42, David Hildenbrand wrote:

>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>> +	if (flags & PB_migrate_isolate_bit)
>>>> +		return MIGRATE_ISOLATE;
>>>> +#endif
>>>
>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>
>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>> so it gets PB_migrate_isolate_bit.
>>
>
> Oh ... that's confusing.
>
>>>
>>>
>>> I think what we should do is
>>>
>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>
>>> 2) Remove the mask parameter
>>>
>>> 3) Perform the masking in all callers.
>>
>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>> in the mask to confuse readers.
>
> The masking will be handled in the caller.
>
> So get_pageblock_skip() would essentially do a
>
> return get_pfnblock_flags() & PB_migrate_skip_bit;
>
> etc.
>
>>
>>>
>>>
>>>
>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>
>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>> 			      set_flags, unsigned long clear_flags);
>>>
>>> And better, splitting it up (or providing helpers)
>>>
>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>
>>>
>>> This implies some more code cleanups first that make the code easier to extend.
>>>
>>
>> The same due to PB_migrate_skip.
>>
>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>> internal APIs by prepending "__". They are only used by the new
>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>
>> WDYT?
>
> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?

After implementing {get, set, clear}_pfnblock_flags(), I find that
get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
since migratetype and skip/isolate bits are in the same word, meaning
I will need to first read them out, change the field, then write them back.
But it will cause inconsistency if there is a parallel writer to the same
word. So for set and clear, mask is required.

I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
only handle standalone bits by using the given @bits as the mask and
{set,get}_pageblock_migratetype() still use the mask.

WDYT?

--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-21 11:16         ` Zi Yan
@ 2025-05-21 11:57           ` David Hildenbrand
  2025-05-21 12:00             ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-21 11:57 UTC (permalink / raw)
  To: Zi Yan
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 21.05.25 13:16, Zi Yan wrote:
> On 19 May 2025, at 12:42, David Hildenbrand wrote:
> 
>>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>>> +	if (flags & PB_migrate_isolate_bit)
>>>>> +		return MIGRATE_ISOLATE;
>>>>> +#endif
>>>>
>>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>>
>>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>>> so it gets PB_migrate_isolate_bit.
>>>
>>
>> Oh ... that's confusing.
>>
>>>>
>>>>
>>>> I think what we should do is
>>>>
>>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>>
>>>> 2) Remove the mask parameter
>>>>
>>>> 3) Perform the masking in all callers.
>>>
>>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>>> in the mask to confuse readers.
>>
>> The masking will be handled in the caller.
>>
>> So get_pageblock_skip() would essentially do a
>>
>> return get_pfnblock_flags() & PB_migrate_skip_bit;
>>
>> etc.
>>
>>>
>>>>
>>>>
>>>>
>>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>>
>>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>>> 			      set_flags, unsigned long clear_flags);
>>>>
>>>> And better, splitting it up (or providing helpers)
>>>>
>>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>>
>>>>
>>>> This implies some more code cleanups first that make the code easier to extend.
>>>>
>>>
>>> The same due to PB_migrate_skip.
>>>
>>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>>> internal APIs by prepending "__". They are only used by the new
>>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>>
>>> WDYT?
>>
>> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?
> 
> After implementing {get, set, clear}_pfnblock_flags(), I find that
> get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
> since migratetype and skip/isolate bits are in the same word, meaning
> I will need to first read them out, change the field, then write them back.

Like existing set_pfnblock_flags_mask() I guess, with the try_cmpxchg() 
loop.

> But it will cause inconsistency if there is a parallel writer to the same
> word. So for set and clear, mask is required.
> 
> I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
> only handle standalone bits by using the given @bits as the mask and
> {set,get}_pageblock_migratetype() still use the mask.

We'd still have to do the try_cmpxchg() when dealing with multiple bits, 
right?

For single bits, we could just use set_bit() etc.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-21 11:57           ` David Hildenbrand
@ 2025-05-21 12:00             ` Zi Yan
  2025-05-21 12:11               ` David Hildenbrand
  0 siblings, 1 reply; 42+ messages in thread
From: Zi Yan @ 2025-05-21 12:00 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 21 May 2025, at 7:57, David Hildenbrand wrote:

> On 21.05.25 13:16, Zi Yan wrote:
>> On 19 May 2025, at 12:42, David Hildenbrand wrote:
>>
>>>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>>>> +	if (flags & PB_migrate_isolate_bit)
>>>>>> +		return MIGRATE_ISOLATE;
>>>>>> +#endif
>>>>>
>>>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>>>
>>>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>>>> so it gets PB_migrate_isolate_bit.
>>>>
>>>
>>> Oh ... that's confusing.
>>>
>>>>>
>>>>>
>>>>> I think what we should do is
>>>>>
>>>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>>>
>>>>> 2) Remove the mask parameter
>>>>>
>>>>> 3) Perform the masking in all callers.
>>>>
>>>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>>>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>>>> in the mask to confuse readers.
>>>
>>> The masking will be handled in the caller.
>>>
>>> So get_pageblock_skip() would essentially do a
>>>
>>> return get_pfnblock_flags() & PB_migrate_skip_bit;
>>>
>>> etc.
>>>
>>>>
>>>>>
>>>>>
>>>>>
>>>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>>>
>>>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>>>> 			      set_flags, unsigned long clear_flags);
>>>>>
>>>>> And better, splitting it up (or providing helpers)
>>>>>
>>>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>
>>>>>
>>>>> This implies some more code cleanups first that make the code easier to extend.
>>>>>
>>>>
>>>> The same due to PB_migrate_skip.
>>>>
>>>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>>>> internal APIs by prepending "__". They are only used by the new
>>>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>>>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>>>
>>>> WDYT?
>>>
>>> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?
>>
>> After implementing {get, set, clear}_pfnblock_flags(), I find that
>> get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
>> since migratetype and skip/isolate bits are in the same word, meaning
>> I will need to first read them out, change the field, then write them back.
>
> Like existing set_pfnblock_flags_mask() I guess, with the try_cmpxchg() loop.

Are you saying I duplicate the code in set_pfnblock_flags_mask() to implement
set_pfnblock_flags()? Or just replace set_pfnblock_flags_mask() entirely?

>
>> But it will cause inconsistency if there is a parallel writer to the same
>> word. So for set and clear, mask is required.
>>
>> I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
>> only handle standalone bits by using the given @bits as the mask and
>> {set,get}_pageblock_migratetype() still use the mask.
>
> We'd still have to do the try_cmpxchg() when dealing with multiple bits, right?
>
> For single bits, we could just use set_bit() etc.

Mel moved from set_bit() to try_cmpxchg() a word for performance reason. I am
not sure we want to move back.


--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-21 12:00             ` Zi Yan
@ 2025-05-21 12:11               ` David Hildenbrand
  2025-05-21 12:18                 ` Zi Yan
  0 siblings, 1 reply; 42+ messages in thread
From: David Hildenbrand @ 2025-05-21 12:11 UTC (permalink / raw)
  To: Zi Yan
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 21.05.25 14:00, Zi Yan wrote:
> On 21 May 2025, at 7:57, David Hildenbrand wrote:
> 
>> On 21.05.25 13:16, Zi Yan wrote:
>>> On 19 May 2025, at 12:42, David Hildenbrand wrote:
>>>
>>>>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>>>>> +	if (flags & PB_migrate_isolate_bit)
>>>>>>> +		return MIGRATE_ISOLATE;
>>>>>>> +#endif
>>>>>>
>>>>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>>>>
>>>>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>>>>> so it gets PB_migrate_isolate_bit.
>>>>>
>>>>
>>>> Oh ... that's confusing.
>>>>
>>>>>>
>>>>>>
>>>>>> I think what we should do is
>>>>>>
>>>>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>>>>
>>>>>> 2) Remove the mask parameter
>>>>>>
>>>>>> 3) Perform the masking in all callers.
>>>>>
>>>>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>>>>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>>>>> in the mask to confuse readers.
>>>>
>>>> The masking will be handled in the caller.
>>>>
>>>> So get_pageblock_skip() would essentially do a
>>>>
>>>> return get_pfnblock_flags() & PB_migrate_skip_bit;
>>>>
>>>> etc.
>>>>
>>>>>
>>>>>>
>>>>>>
>>>>>>
>>>>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>>>>
>>>>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>>>>> 			      set_flags, unsigned long clear_flags);
>>>>>>
>>>>>> And better, splitting it up (or providing helpers)
>>>>>>
>>>>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>>
>>>>>>
>>>>>> This implies some more code cleanups first that make the code easier to extend.
>>>>>>
>>>>>
>>>>> The same due to PB_migrate_skip.
>>>>>
>>>>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>>>>> internal APIs by prepending "__". They are only used by the new
>>>>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>>>>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>>>>
>>>>> WDYT?
>>>>
>>>> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?
>>>
>>> After implementing {get, set, clear}_pfnblock_flags(), I find that
>>> get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
>>> since migratetype and skip/isolate bits are in the same word, meaning
>>> I will need to first read them out, change the field, then write them back.
>>
>> Like existing set_pfnblock_flags_mask() I guess, with the try_cmpxchg() loop.
> 
> Are you saying I duplicate the code in set_pfnblock_flags_mask() to implement
> set_pfnblock_flags()? Or just replace set_pfnblock_flags_mask() entirely?

The latter as possible.

> 
>>
>>> But it will cause inconsistency if there is a parallel writer to the same
>>> word. So for set and clear, mask is required.
>>>
>>> I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
>>> only handle standalone bits by using the given @bits as the mask and
>>> {set,get}_pageblock_migratetype() still use the mask.
>>
>> We'd still have to do the try_cmpxchg() when dealing with multiple bits, right?
>>
>> For single bits, we could just use set_bit() etc.
> 
> Mel moved from set_bit() to try_cmpxchg() a word for performance reason. I am
> not sure we want to move back.

In e58469bafd05 we moved from multiple set_bit etc to a cmpxchange.

-       for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
-               if (flags & value)
-                       __set_bit(bitidx + start_bitidx, bitmap);
-               else
-                       __clear_bit(bitidx + start_bitidx, bitmap);


However, when only setting/clearing a single bit (e.g., isolated), 
set_bit etc should be much cheaper.

For multiple bits, the existing try_cmpxchg should be kept IMHO.

-- 
Cheers,

David / dhildenb


^ permalink raw reply	[flat|nested] 42+ messages in thread

* Re: [PATCH v4 1/4] mm/page_isolation: make page isolation a standalone bit.
  2025-05-21 12:11               ` David Hildenbrand
@ 2025-05-21 12:18                 ` Zi Yan
  0 siblings, 0 replies; 42+ messages in thread
From: Zi Yan @ 2025-05-21 12:18 UTC (permalink / raw)
  To: David Hildenbrand
  Cc: Oscar Salvador, Johannes Weiner, linux-mm, Andrew Morton,
	Vlastimil Babka, Baolin Wang, Kirill A . Shutemov, Mel Gorman,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman, Richard Chang,
	linux-kernel

On 21 May 2025, at 8:11, David Hildenbrand wrote:

> On 21.05.25 14:00, Zi Yan wrote:
>> On 21 May 2025, at 7:57, David Hildenbrand wrote:
>>
>>> On 21.05.25 13:16, Zi Yan wrote:
>>>> On 19 May 2025, at 12:42, David Hildenbrand wrote:
>>>>
>>>>>>>> +#ifdef CONFIG_MEMORY_ISOLATION
>>>>>>>> +	if (flags & PB_migrate_isolate_bit)
>>>>>>>> +		return MIGRATE_ISOLATE;
>>>>>>>> +#endif
>>>>>>>
>>>>>>> If you call get_pfnblock_flags_mask() with MIGRATETYPE_MASK, how could you ever get PB_migrate_isolate_bit?
>>>>>>
>>>>>> MIGRATETYPE_MASK is ((BIT(PB_migratetype_bits) - 1) | PB_migrate_isolate_bit),
>>>>>> so it gets PB_migrate_isolate_bit.
>>>>>>
>>>>>
>>>>> Oh ... that's confusing.
>>>>>
>>>>>>>
>>>>>>>
>>>>>>> I think what we should do is
>>>>>>>
>>>>>>> 1) Rename get_pfnblock_flags_mask() to get_pfnblock_flags()
>>>>>>>
>>>>>>> 2) Remove the mask parameter
>>>>>>>
>>>>>>> 3) Perform the masking in all callers.
>>>>>>
>>>>>> get_pfnblock_flags_mask() is also used by get_pageblock_skip() to
>>>>>> get PB_migrate_skip. I do not think we want to include PB_migrate_skip
>>>>>> in the mask to confuse readers.
>>>>>
>>>>> The masking will be handled in the caller.
>>>>>
>>>>> So get_pageblock_skip() would essentially do a
>>>>>
>>>>> return get_pfnblock_flags() & PB_migrate_skip_bit;
>>>>>
>>>>> etc.
>>>>>
>>>>>>
>>>>>>>
>>>>>>>
>>>>>>>
>>>>>>> Maybe, we should convert set_pfnblock_flags_mask() to
>>>>>>>
>>>>>>> void set_clear_pfnblock_flags(struct page *page, unsigned long
>>>>>>> 			      set_flags, unsigned long clear_flags);
>>>>>>>
>>>>>>> And better, splitting it up (or providing helpers)
>>>>>>>
>>>>>>> set_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>>> clear_pfnblock_flags(struct page *page, unsigned long flags);
>>>>>>>
>>>>>>>
>>>>>>> This implies some more code cleanups first that make the code easier to extend.
>>>>>>>
>>>>>>
>>>>>> The same due to PB_migrate_skip.
>>>>>>
>>>>>> Based on your suggestion, we could make {set,get}_pfnblock_flags_mask()
>>>>>> internal APIs by prepending "__". They are only used by the new
>>>>>> {get, set, clear}_pfnblock_flags() and {get, set, clear}_pageblock_{skip, isolate}().
>>>>>> Then use {get, set, clear}_pfnblock_flags() for all migratetype operations.
>>>>>>
>>>>>> WDYT?
>>>>>
>>>>> In general, lgtm. I just hope we can avoid the "_mask" part and just handle it in these functions directly?
>>>>
>>>> After implementing {get, set, clear}_pfnblock_flags(), I find that
>>>> get_pfnblock_flags() is easy like you wrote above, but set and clear are not,
>>>> since migratetype and skip/isolate bits are in the same word, meaning
>>>> I will need to first read them out, change the field, then write them back.
>>>
>>> Like existing set_pfnblock_flags_mask() I guess, with the try_cmpxchg() loop.
>>
>> Are you saying I duplicate the code in set_pfnblock_flags_mask() to implement
>> set_pfnblock_flags()? Or just replace set_pfnblock_flags_mask() entirely?
>
> The latter as possible.
>
>>
>>>
>>>> But it will cause inconsistency if there is a parallel writer to the same
>>>> word. So for set and clear, mask is required.
>>>>
>>>> I can try to implement {get, set, clear}_pfnblock_bits(page,pfn, bits) to
>>>> only handle standalone bits by using the given @bits as the mask and
>>>> {set,get}_pageblock_migratetype() still use the mask.
>>>
>>> We'd still have to do the try_cmpxchg() when dealing with multiple bits, right?
>>>
>>> For single bits, we could just use set_bit() etc.
>>
>> Mel moved from set_bit() to try_cmpxchg() a word for performance reason. I am
>> not sure we want to move back.
>
> In e58469bafd05 we moved from multiple set_bit etc to a cmpxchange.
>
> -       for (; start_bitidx <= end_bitidx; start_bitidx++, value <<= 1)
> -               if (flags & value)
> -                       __set_bit(bitidx + start_bitidx, bitmap);
> -               else
> -                       __clear_bit(bitidx + start_bitidx, bitmap);
>
>
> However, when only setting/clearing a single bit (e.g., isolated), set_bit etc should be much cheaper.
>
> For multiple bits, the existing try_cmpxchg should be kept IMHO.

Yes, I was thinking about that too. Let me do that as a standalone cleanup series
first, then resend this one afterwards.
--
Best Regards,
Yan, Zi

^ permalink raw reply	[flat|nested] 42+ messages in thread

end of thread, other threads:[~2025-05-21 12:18 UTC | newest]

Thread overview: 42+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-05-09 20:01 [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Zi Yan
2025-05-09 20:01 ` [PATCH v4 1/4] mm/page_isolation: make page isolation " Zi Yan
2025-05-13 11:32   ` Brendan Jackman
2025-05-13 14:53     ` Zi Yan
2025-05-19  8:08   ` David Hildenbrand
2025-05-19 15:08     ` Zi Yan
2025-05-19 16:42       ` David Hildenbrand
2025-05-19 17:15         ` Zi Yan
2025-05-21 11:16         ` Zi Yan
2025-05-21 11:57           ` David Hildenbrand
2025-05-21 12:00             ` Zi Yan
2025-05-21 12:11               ` David Hildenbrand
2025-05-21 12:18                 ` Zi Yan
2025-05-09 20:01 ` [PATCH v4 2/4] mm/page_isolation: remove migratetype from move_freepages_block_isolate() Zi Yan
2025-05-12  6:25   ` kernel test robot
2025-05-12 16:10   ` Lorenzo Stoakes
2025-05-12 16:13     ` Zi Yan
2025-05-12 16:19       ` Lorenzo Stoakes
2025-05-12 16:28         ` Zi Yan
2025-05-12 22:00     ` Andrew Morton
2025-05-12 23:20     ` Zi Yan
2025-05-19  8:21   ` David Hildenbrand
2025-05-19 23:06     ` Zi Yan
2025-05-20  8:58       ` David Hildenbrand
2025-05-09 20:01 ` [PATCH v4 3/4] mm/page_isolation: remove migratetype from undo_isolate_page_range() Zi Yan
2025-05-09 20:01 ` [PATCH v4 4/4] mm/page_isolation: remove migratetype parameter from more functions Zi Yan
2025-05-17 20:21   ` Vlastimil Babka
2025-05-18  0:07     ` Zi Yan
2025-05-18 16:32   ` Johannes Weiner
2025-05-18 17:24     ` Zi Yan
2025-05-17 20:26 ` [PATCH v4 0/4] Make MIGRATE_ISOLATE a standalone bit Vlastimil Babka
2025-05-18  0:20   ` Zi Yan
2025-05-19 14:15     ` David Hildenbrand
2025-05-19 14:35       ` Zi Yan
2025-05-20  8:58         ` David Hildenbrand
2025-05-20 13:18           ` Zi Yan
2025-05-20 13:20             ` David Hildenbrand
2025-05-20 13:31               ` Zi Yan
2025-05-20 13:33                 ` David Hildenbrand
2025-05-20 14:07                   ` Zi Yan
2025-05-19  7:44 ` David Hildenbrand
2025-05-19 14:01   ` Zi Yan

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).