public inbox for stable@vger.kernel.org
* [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport
@ 2024-10-17 21:58 chrisl
  2024-10-17 21:58 ` [PATCH 6.11.y 1/3] mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO chrisl
                   ` (3 more replies)
  0 siblings, 4 replies; 10+ messages in thread
From: chrisl @ 2024-10-17 21:58 UTC (permalink / raw)
  To: stable
  Cc: Greg KH, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka, kernel test robot,
	Janosch Frank, Marc Hartmayer

A few commits from Yu Zhao have been merged into 6.12 and need to be backported to 6.11:
- c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
- 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
- e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")

---
Yu Zhao (3):
      mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO
      mm/codetag: fix pgalloc_tag_split()
      mm/codetag: add pgalloc_tag_copy()

 include/linux/alloc_tag.h   | 24 ++++++++-----------
 include/linux/mm.h          | 57 +++++++++++++++++++++++++++++++++++++++++++++
 include/linux/pgalloc_tag.h | 31 ------------------------
 mm/huge_memory.c            |  2 +-
 mm/hugetlb_vmemmap.c        | 40 +++++++++++++++----------------
 mm/migrate.c                |  1 +
 mm/page_alloc.c             |  4 ++--
 7 files changed, 91 insertions(+), 68 deletions(-)
---
base-commit: 8e24a758d14c0b1cd42ab0aea980a1030eea811f
change-id: 20241016-stable-yuzhao-7779910482e8

Best regards,
-- 
Chris Li <chrisl@kernel.org>


^ permalink raw reply	[flat|nested] 10+ messages in thread

* [PATCH 6.11.y 1/3] mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO
  2024-10-17 21:58 [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport chrisl
@ 2024-10-17 21:58 ` chrisl
  2024-10-17 21:58 ` [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split() chrisl
                   ` (2 subsequent siblings)
  3 siblings, 0 replies; 10+ messages in thread
From: chrisl @ 2024-10-17 21:58 UTC (permalink / raw)
  To: stable
  Cc: Greg KH, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka, kernel test robot,
	Janosch Frank, Marc Hartmayer

From: Yu Zhao <yuzhao@google.com>

[ Upstream commit c2a967f6ab0ec896648c0497d3dc15d8f136b148 ]

hugetlb_vmemmap_optimize_folio() and hugetlb_vmemmap_restore_folio() are
wrappers meant to be called regardless of whether HVO is enabled.
Therefore, they should not call synchronize_rcu().  Otherwise, it
regresses use cases not enabling HVO.

So move synchronize_rcu() to __hugetlb_vmemmap_optimize_folio() and
__hugetlb_vmemmap_restore_folio(), and call it once for each batch of
folios when HVO is enabled.

Link: https://lkml.kernel.org/r/20240719042503.2752316-1-yuzhao@google.com
Fixes: bd225530a4c7 ("mm/hugetlb_vmemmap: fix race with speculative PFN walkers")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Reported-by: kernel test robot <oliver.sang@intel.com>
Closes: https://lore.kernel.org/oe-lkp/202407091001.1250ad4a-oliver.sang@intel.com
Reported-by: Janosch Frank <frankja@linux.ibm.com>
Tested-by: Marc Hartmayer <mhartmay@linux.ibm.com>
Acked-by: Muchun Song <muchun.song@linux.dev>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 mm/hugetlb_vmemmap.c | 40 ++++++++++++++++++++--------------------
 1 file changed, 20 insertions(+), 20 deletions(-)

diff --git a/mm/hugetlb_vmemmap.c b/mm/hugetlb_vmemmap.c
index 0c3f56b3578eb..57b7f591eee82 100644
--- a/mm/hugetlb_vmemmap.c
+++ b/mm/hugetlb_vmemmap.c
@@ -43,6 +43,8 @@ struct vmemmap_remap_walk {
 #define VMEMMAP_SPLIT_NO_TLB_FLUSH	BIT(0)
 /* Skip the TLB flush when we remap the PTE */
 #define VMEMMAP_REMAP_NO_TLB_FLUSH	BIT(1)
+/* synchronize_rcu() to avoid writes from page_ref_add_unless() */
+#define VMEMMAP_SYNCHRONIZE_RCU		BIT(2)
 	unsigned long		flags;
 };
 
@@ -457,6 +459,9 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
 	if (!folio_test_hugetlb_vmemmap_optimized(folio))
 		return 0;
 
+	if (flags & VMEMMAP_SYNCHRONIZE_RCU)
+		synchronize_rcu();
+
 	vmemmap_end	= vmemmap_start + hugetlb_vmemmap_size(h);
 	vmemmap_reuse	= vmemmap_start;
 	vmemmap_start	+= HUGETLB_VMEMMAP_RESERVE_SIZE;
@@ -489,10 +494,7 @@ static int __hugetlb_vmemmap_restore_folio(const struct hstate *h,
  */
 int hugetlb_vmemmap_restore_folio(const struct hstate *h, struct folio *folio)
 {
-	/* avoid writes from page_ref_add_unless() while unfolding vmemmap */
-	synchronize_rcu();
-
-	return __hugetlb_vmemmap_restore_folio(h, folio, 0);
+	return __hugetlb_vmemmap_restore_folio(h, folio, VMEMMAP_SYNCHRONIZE_RCU);
 }
 
 /**
@@ -515,14 +517,14 @@ long hugetlb_vmemmap_restore_folios(const struct hstate *h,
 	struct folio *folio, *t_folio;
 	long restored = 0;
 	long ret = 0;
-
-	/* avoid writes from page_ref_add_unless() while unfolding vmemmap */
-	synchronize_rcu();
+	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
 	list_for_each_entry_safe(folio, t_folio, folio_list, lru) {
 		if (folio_test_hugetlb_vmemmap_optimized(folio)) {
-			ret = __hugetlb_vmemmap_restore_folio(h, folio,
-							      VMEMMAP_REMAP_NO_TLB_FLUSH);
+			ret = __hugetlb_vmemmap_restore_folio(h, folio, flags);
+			/* only need to synchronize_rcu() once for each batch */
+			flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
+
 			if (ret)
 				break;
 			restored++;
@@ -570,6 +572,9 @@ static int __hugetlb_vmemmap_optimize_folio(const struct hstate *h,
 		return ret;
 
 	static_branch_inc(&hugetlb_optimize_vmemmap_key);
+
+	if (flags & VMEMMAP_SYNCHRONIZE_RCU)
+		synchronize_rcu();
 	/*
 	 * Very Subtle
 	 * If VMEMMAP_REMAP_NO_TLB_FLUSH is set, TLB flushing is not performed
@@ -617,10 +622,7 @@ void hugetlb_vmemmap_optimize_folio(const struct hstate *h, struct folio *folio)
 {
 	LIST_HEAD(vmemmap_pages);
 
-	/* avoid writes from page_ref_add_unless() while folding vmemmap */
-	synchronize_rcu();
-
-	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, 0);
+	__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, VMEMMAP_SYNCHRONIZE_RCU);
 	free_vmemmap_page_list(&vmemmap_pages);
 }
 
@@ -647,6 +649,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 {
 	struct folio *folio;
 	LIST_HEAD(vmemmap_pages);
+	unsigned long flags = VMEMMAP_REMAP_NO_TLB_FLUSH | VMEMMAP_SYNCHRONIZE_RCU;
 
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret = hugetlb_vmemmap_split_folio(h, folio);
@@ -663,14 +666,12 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 
 	flush_tlb_all();
 
-	/* avoid writes from page_ref_add_unless() while folding vmemmap */
-	synchronize_rcu();
-
 	list_for_each_entry(folio, folio_list, lru) {
 		int ret;
 
-		ret = __hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
-						       VMEMMAP_REMAP_NO_TLB_FLUSH);
+		ret = __hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
+		/* only need to synchronize_rcu() once for each batch */
+		flags &= ~VMEMMAP_SYNCHRONIZE_RCU;
 
 		/*
 		 * Pages to be freed may have been accumulated.  If we
@@ -684,8 +685,7 @@ void hugetlb_vmemmap_optimize_folios(struct hstate *h, struct list_head *folio_l
 			flush_tlb_all();
 			free_vmemmap_page_list(&vmemmap_pages);
 			INIT_LIST_HEAD(&vmemmap_pages);
-			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages,
-							 VMEMMAP_REMAP_NO_TLB_FLUSH);
+			__hugetlb_vmemmap_optimize_folio(h, folio, &vmemmap_pages, flags);
 		}
 	}
 

-- 
2.47.0.rc1.288.g06298d1525-goog



* [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split()
  2024-10-17 21:58 [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport chrisl
  2024-10-17 21:58 ` [PATCH 6.11.y 1/3] mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO chrisl
@ 2024-10-17 21:58 ` chrisl
  2024-10-18  8:29   ` Greg KH
  2024-10-17 21:58 ` [PATCH 6.11.y 3/3] mm/codetag: add pgalloc_tag_copy() chrisl
  2024-10-18  5:58 ` [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport Greg KH
  3 siblings, 1 reply; 10+ messages in thread
From: chrisl @ 2024-10-17 21:58 UTC (permalink / raw)
  To: stable
  Cc: Greg KH, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka

From: Yu Zhao <yuzhao@google.com>

[ Upstream commit 95599ef684d01136a8b77c16a7c853496786e173 ]

The current assumption is that a large folio can only be split into
order-0 folios.  That is not the case for hugeTLB demotion, nor for THP
split: see commit c010d47f107f ("mm: thp: split huge page to any lower
order pages").

When a large folio is split into ones of a lower non-zero order, only the
new head pages should be tagged.  Tagging tail pages can cause imbalanced
"calls" counters, since only head pages are untagged by pgalloc_tag_sub()
and the "calls" counts on tail pages are leaked, e.g.,

  # echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
  # echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
  # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
  # echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
  # grep alloc_gigantic_folio /proc/allocinfo

Before this patch:
  0  549427200  mm/hugetlb.c:1549 func:alloc_gigantic_folio

  real  0m2.057s
  user  0m0.000s
  sys   0m2.051s

After this patch:
  0          0  mm/hugetlb.c:1549 func:alloc_gigantic_folio

  real  0m1.711s
  user  0m0.000s
  sys   0m1.704s

Not tagging tail pages also improves the splitting time, e.g., by about
15% when demoting 1GB hugeTLB folios to 2MB ones, as shown above.

Link: https://lkml.kernel.org/r/20240906042108.1150526-2-yuzhao@google.com
Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/mm.h          | 30 ++++++++++++++++++++++++++++++
 include/linux/pgalloc_tag.h | 31 -------------------------------
 mm/huge_memory.c            |  2 +-
 mm/page_alloc.c             |  4 ++--
 4 files changed, 33 insertions(+), 34 deletions(-)

diff --git a/include/linux/mm.h b/include/linux/mm.h
index 1470736017168..8330363126918 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4216,4 +4216,34 @@ void vma_pgtable_walk_end(struct vm_area_struct *vma);
 
 int reserve_mem_find_by_name(const char *name, phys_addr_t *start, phys_addr_t *size);
 
+#ifdef CONFIG_MEM_ALLOC_PROFILING
+static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
+{
+	int i;
+	struct alloc_tag *tag;
+	unsigned int nr_pages = 1 << new_order;
+
+	if (!mem_alloc_profiling_enabled())
+		return;
+
+	tag = pgalloc_tag_get(&folio->page);
+	if (!tag)
+		return;
+
+	for (i = nr_pages; i < (1 << old_order); i += nr_pages) {
+		union codetag_ref *ref = get_page_tag_ref(folio_page(folio, i));
+
+		if (ref) {
+			/* Set new reference to point to the original tag */
+			alloc_tag_ref_set(ref, tag);
+			put_page_tag_ref(ref);
+		}
+	}
+}
+#else /* !CONFIG_MEM_ALLOC_PROFILING */
+static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
+{
+}
+#endif /* CONFIG_MEM_ALLOC_PROFILING */
+
 #endif /* _LINUX_MM_H */
diff --git a/include/linux/pgalloc_tag.h b/include/linux/pgalloc_tag.h
index 207f0c83c8e97..59a3deb792a8d 100644
--- a/include/linux/pgalloc_tag.h
+++ b/include/linux/pgalloc_tag.h
@@ -80,36 +80,6 @@ static inline void pgalloc_tag_sub(struct page *page, unsigned int nr)
 	}
 }
 
-static inline void pgalloc_tag_split(struct page *page, unsigned int nr)
-{
-	int i;
-	struct page_ext *first_page_ext;
-	struct page_ext *page_ext;
-	union codetag_ref *ref;
-	struct alloc_tag *tag;
-
-	if (!mem_alloc_profiling_enabled())
-		return;
-
-	first_page_ext = page_ext = page_ext_get(page);
-	if (unlikely(!page_ext))
-		return;
-
-	ref = codetag_ref_from_page_ext(page_ext);
-	if (!ref->ct)
-		goto out;
-
-	tag = ct_to_alloc_tag(ref->ct);
-	page_ext = page_ext_next(page_ext);
-	for (i = 1; i < nr; i++) {
-		/* Set new reference to point to the original tag */
-		alloc_tag_ref_set(codetag_ref_from_page_ext(page_ext), tag);
-		page_ext = page_ext_next(page_ext);
-	}
-out:
-	page_ext_put(first_page_ext);
-}
-
 static inline struct alloc_tag *pgalloc_tag_get(struct page *page)
 {
 	struct alloc_tag *tag = NULL;
@@ -142,7 +112,6 @@ static inline void clear_page_tag_ref(struct page *page) {}
 static inline void pgalloc_tag_add(struct page *page, struct task_struct *task,
 				   unsigned int nr) {}
 static inline void pgalloc_tag_sub(struct page *page, unsigned int nr) {}
-static inline void pgalloc_tag_split(struct page *page, unsigned int nr) {}
 static inline struct alloc_tag *pgalloc_tag_get(struct page *page) { return NULL; }
 static inline void pgalloc_tag_sub_pages(struct alloc_tag *tag, unsigned int nr) {}
 
diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 99b146d16a185..837d41906f2ac 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -2976,7 +2976,7 @@ static void __split_huge_page(struct page *page, struct list_head *list,
 	/* Caller disabled irqs, so they are still disabled here */
 
 	split_page_owner(head, order, new_order);
-	pgalloc_tag_split(head, 1 << order);
+	pgalloc_tag_split(folio, order, new_order);
 
 	/* See comment in __split_huge_page_tail() */
 	if (folio_test_anon(folio)) {
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 91ace8ca97e21..72b710566cdbc 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -2764,7 +2764,7 @@ void split_page(struct page *page, unsigned int order)
 	for (i = 1; i < (1 << order); i++)
 		set_page_refcounted(page + i);
 	split_page_owner(page, order, 0);
-	pgalloc_tag_split(page, 1 << order);
+	pgalloc_tag_split(page_folio(page), order, 0);
 	split_page_memcg(page, order, 0);
 }
 EXPORT_SYMBOL_GPL(split_page);
@@ -4950,7 +4950,7 @@ static void *make_alloc_exact(unsigned long addr, unsigned int order,
 		struct page *last = page + nr;
 
 		split_page_owner(page, order, 0);
-		pgalloc_tag_split(page, 1 << order);
+		pgalloc_tag_split(page_folio(page), order, 0);
 		split_page_memcg(page, order, 0);
 		while (page < --last)
 			set_page_refcounted(last);

-- 
2.47.0.rc1.288.g06298d1525-goog



* [PATCH 6.11.y 3/3] mm/codetag: add pgalloc_tag_copy()
  2024-10-17 21:58 [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport chrisl
  2024-10-17 21:58 ` [PATCH 6.11.y 1/3] mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO chrisl
  2024-10-17 21:58 ` [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split() chrisl
@ 2024-10-17 21:58 ` chrisl
  2024-10-18  5:58 ` [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport Greg KH
  3 siblings, 0 replies; 10+ messages in thread
From: chrisl @ 2024-10-17 21:58 UTC (permalink / raw)
  To: stable
  Cc: Greg KH, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka

From: Yu Zhao <yuzhao@google.com>

[ Upstream commit e0a955bf7f61cb034d228736d81c1ab3a47a3dca ]

Add pgalloc_tag_copy() to transfer the codetag from the old folio to the
new one during migration.  This makes original allocation sites persist
cross migration rather than lump into the get_new_folio callbacks passed
into migrate_pages(), e.g., compaction_alloc():

  # echo 1 >/proc/sys/vm/compact_memory
  # grep compaction_alloc /proc/allocinfo

Before this patch:
  132968448  32463  mm/compaction.c:1880 func:compaction_alloc

After this patch:
          0      0  mm/compaction.c:1880 func:compaction_alloc

Link: https://lkml.kernel.org/r/20240906042108.1150526-3-yuzhao@google.com
Fixes: dcfe378c81f7 ("lib: introduce support for page allocation tagging")
Signed-off-by: Yu Zhao <yuzhao@google.com>
Acked-by: Suren Baghdasaryan <surenb@google.com>
Cc: Kent Overstreet <kent.overstreet@linux.dev>
Cc: Muchun Song <muchun.song@linux.dev>
Cc: <stable@vger.kernel.org>
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
---
 include/linux/alloc_tag.h | 24 ++++++++++--------------
 include/linux/mm.h        | 27 +++++++++++++++++++++++++++
 mm/migrate.c              |  1 +
 3 files changed, 38 insertions(+), 14 deletions(-)

diff --git a/include/linux/alloc_tag.h b/include/linux/alloc_tag.h
index 8c61ccd161ba3..39a7fd60e389a 100644
--- a/include/linux/alloc_tag.h
+++ b/include/linux/alloc_tag.h
@@ -137,7 +137,16 @@ static inline void alloc_tag_sub_check(union codetag_ref *ref) {}
 /* Caller should verify both ref and tag to be valid */
 static inline void __alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *tag)
 {
+	alloc_tag_add_check(ref, tag);
+	if (!ref || !tag)
+		return;
+
 	ref->ct = &tag->ct;
+}
+
+static inline void alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *tag)
+{
+	__alloc_tag_ref_set(ref, tag);
 	/*
 	 * We need in increment the call counter every time we have a new
 	 * allocation or when we split a large allocation into smaller ones.
@@ -147,22 +156,9 @@ static inline void __alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag
 	this_cpu_inc(tag->counters->calls);
 }
 
-static inline void alloc_tag_ref_set(union codetag_ref *ref, struct alloc_tag *tag)
-{
-	alloc_tag_add_check(ref, tag);
-	if (!ref || !tag)
-		return;
-
-	__alloc_tag_ref_set(ref, tag);
-}
-
 static inline void alloc_tag_add(union codetag_ref *ref, struct alloc_tag *tag, size_t bytes)
 {
-	alloc_tag_add_check(ref, tag);
-	if (!ref || !tag)
-		return;
-
-	__alloc_tag_ref_set(ref, tag);
+	alloc_tag_ref_set(ref, tag);
 	this_cpu_add(tag->counters->bytes, bytes);
 }
 
diff --git a/include/linux/mm.h b/include/linux/mm.h
index 8330363126918..a3a86fc407385 100644
--- a/include/linux/mm.h
+++ b/include/linux/mm.h
@@ -4240,10 +4240,37 @@ static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new
 		}
 	}
 }
+
+static inline void pgalloc_tag_copy(struct folio *new, struct folio *old)
+{
+	struct alloc_tag *tag;
+	union codetag_ref *ref;
+
+	tag = pgalloc_tag_get(&old->page);
+	if (!tag)
+		return;
+
+	ref = get_page_tag_ref(&new->page);
+	if (!ref)
+		return;
+
+	/* Clear the old ref to the original allocation tag. */
+	clear_page_tag_ref(&old->page);
+	/* Decrement the counters of the tag on get_new_folio. */
+	alloc_tag_sub(ref, folio_nr_pages(new));
+
+	__alloc_tag_ref_set(ref, tag);
+
+	put_page_tag_ref(ref);
+}
 #else /* !CONFIG_MEM_ALLOC_PROFILING */
 static inline void pgalloc_tag_split(struct folio *folio, int old_order, int new_order)
 {
 }
+
+static inline void pgalloc_tag_copy(struct folio *new, struct folio *old)
+{
+}
 #endif /* CONFIG_MEM_ALLOC_PROFILING */
 
 #endif /* _LINUX_MM_H */
diff --git a/mm/migrate.c b/mm/migrate.c
index 368ab3878fa6e..028282b28242e 100644
--- a/mm/migrate.c
+++ b/mm/migrate.c
@@ -666,6 +666,7 @@ void folio_migrate_flags(struct folio *newfolio, struct folio *folio)
 		folio_set_readahead(newfolio);
 
 	folio_copy_owner(newfolio, folio);
+	pgalloc_tag_copy(newfolio, folio);
 
 	mem_cgroup_migrate(folio, newfolio);
 }

-- 
2.47.0.rc1.288.g06298d1525-goog



* Re: [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport
  2024-10-17 21:58 [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport chrisl
                   ` (2 preceding siblings ...)
  2024-10-17 21:58 ` [PATCH 6.11.y 3/3] mm/codetag: add pgalloc_tag_copy() chrisl
@ 2024-10-18  5:58 ` Greg KH
  2024-10-18  6:39   ` Kent Overstreet
  2024-10-18  8:18   ` Vlastimil Babka
  3 siblings, 2 replies; 10+ messages in thread
From: Greg KH @ 2024-10-18  5:58 UTC (permalink / raw)
  To: chrisl
  Cc: stable, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka, kernel test robot,
	Janosch Frank, Marc Hartmayer

On Thu, Oct 17, 2024 at 02:58:01PM -0700, chrisl@kernel.org wrote:
> A few commits from Yu Zhao have been merged into 6.12.
> They need to be backported to 6.11.

Why?

> - c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
> - 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
> - e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")

For mm changes, we need an explicit ack from the mm maintainers to take
patches into the stable tree.  Why were these not tagged with the normal
"cc: stable@" tag in the first place?

thanks,

greg k-h


* Re: [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport
  2024-10-18  5:58 ` [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport Greg KH
@ 2024-10-18  6:39   ` Kent Overstreet
  2024-10-18  8:18   ` Vlastimil Babka
  1 sibling, 0 replies; 10+ messages in thread
From: Kent Overstreet @ 2024-10-18  6:39 UTC (permalink / raw)
  To: Greg KH
  Cc: chrisl, stable, Muchun Song, Andrew Morton, Yu Zhao,
	Suren Baghdasaryan, Vlastimil Babka, kernel test robot,
	Janosch Frank, Marc Hartmayer

On Fri, Oct 18, 2024 at 07:58:41AM +0200, Greg KH wrote:
> On Thu, Oct 17, 2024 at 02:58:01PM -0700, chrisl@kernel.org wrote:
> > A few commits from Yu Zhao have been merged into 6.12.
> > They need to be backported to 6.11.
> 
> Why?
> 
> > - c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
> > - 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
> > - e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")
> 
> For mm changes, we need an explicit ack from the mm maintainers to take
> patches into the stable tree.  Why were these not tagged with the normal
> "cc: stable@" tag in the first place?

I've got a possible regression in memory allocation profiling, and page
splitting would be a likely culprit, so we should probably hold off on
backporting until I bisect.


* Re: [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport
  2024-10-18  5:58 ` [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport Greg KH
  2024-10-18  6:39   ` Kent Overstreet
@ 2024-10-18  8:18   ` Vlastimil Babka
  2024-10-18  8:30     ` Greg KH
  1 sibling, 1 reply; 10+ messages in thread
From: Vlastimil Babka @ 2024-10-18  8:18 UTC (permalink / raw)
  To: Greg KH, chrisl
  Cc: stable, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, kernel test robot, Janosch Frank, Marc Hartmayer

On 10/18/24 07:58, Greg KH wrote:
> On Thu, Oct 17, 2024 at 02:58:01PM -0700, chrisl@kernel.org wrote:
>> A few commits from Yu Zhao have been merged into 6.12.
>> They need to be backported to 6.11.
> 
> Why?
> 
>> - c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
>> - 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
>> - e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")
> 
> For mm changes, we need an explicit ack from the mm maintainers to take
> patches into the stable tree.  Why were these not tagged with the normal
> "cc: stable@" tag in the first place?

The two codetag commits actually were tagged. c2a967f6ab0ec seems to me an
unrelated performance optimization, could make sense in a LTS kernel but is
there a lot of benefit expected for the lifetime of 6.11?


> thanks,
> 
> greg k-h



* Re: [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split()
  2024-10-17 21:58 ` [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split() chrisl
@ 2024-10-18  8:29   ` Greg KH
  2024-10-18 17:15     ` Chris Li
  0 siblings, 1 reply; 10+ messages in thread
From: Greg KH @ 2024-10-18  8:29 UTC (permalink / raw)
  To: chrisl
  Cc: stable, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka

On Thu, Oct 17, 2024 at 02:58:03PM -0700, chrisl@kernel.org wrote:
> From: Yu Zhao <yuzhao@google.com>
> 
> [ Upstream commit 95599ef684d01136a8b77c16a7c853496786e173 ]
> 
> The current assumption is that a large folio can only be split into
> order-0 folios.  That is not the case for hugeTLB demotion, nor for THP
> split: see commit c010d47f107f ("mm: thp: split huge page to any lower
> order pages").
> 
> When a large folio is split into ones of a lower non-zero order, only the
> new head pages should be tagged.  Tagging tail pages can cause imbalanced
> "calls" counters, since only head pages are untagged by pgalloc_tag_sub()
> and the "calls" counts on tail pages are leaked, e.g.,
> 
>   # echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
>   # echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
>   # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
>   # echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
>   # grep alloc_gigantic_folio /proc/allocinfo
> 
> Before this patch:
>   0  549427200  mm/hugetlb.c:1549 func:alloc_gigantic_folio
> 
>   real  0m2.057s
>   user  0m0.000s
>   sys   0m2.051s
> 
> After this patch:
>   0          0  mm/hugetlb.c:1549 func:alloc_gigantic_folio
> 
>   real  0m1.711s
>   user  0m0.000s
>   sys   0m1.704s
> 
> Not tagging tail pages also improves the splitting time, e.g., by about
> 15% when demoting 1GB hugeTLB folios to 2MB ones, as shown above.
> 
> Link: https://lkml.kernel.org/r/20240906042108.1150526-2-yuzhao@google.com
> Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting")
> Signed-off-by: Yu Zhao <yuzhao@google.com>
> Acked-by: Suren Baghdasaryan <surenb@google.com>
> Cc: Kent Overstreet <kent.overstreet@linux.dev>
> Cc: Muchun Song <muchun.song@linux.dev>
> Cc: <stable@vger.kernel.org>
> Signed-off-by: Andrew Morton <akpm@linux-foundation.org>

You did not sign off on this backport, so even if I wanted to take it, I
couldn't :(

Please fix this, and patch 3/3 up, and just send those.

sorry,

greg k-h


* Re: [PATCH 6.11.y 0/3] : Yu Zhao's memory fix backport
  2024-10-18  8:18   ` Vlastimil Babka
@ 2024-10-18  8:30     ` Greg KH
  0 siblings, 0 replies; 10+ messages in thread
From: Greg KH @ 2024-10-18  8:30 UTC (permalink / raw)
  To: Vlastimil Babka
  Cc: chrisl, stable, Muchun Song, Andrew Morton, Yu Zhao,
	Suren Baghdasaryan, Kent Overstreet, kernel test robot,
	Janosch Frank, Marc Hartmayer

On Fri, Oct 18, 2024 at 10:18:19AM +0200, Vlastimil Babka wrote:
> On 10/18/24 07:58, Greg KH wrote:
> > On Thu, Oct 17, 2024 at 02:58:01PM -0700, chrisl@kernel.org wrote:
> >> A few commits from Yu Zhao have been merged into 6.12.
> >> They need to be backported to 6.11.
> > 
> > Why?
> > 
> >> - c2a967f6ab0ec ("mm/hugetlb_vmemmap: don't synchronize_rcu() without HVO")
> >> - 95599ef684d01 ("mm/codetag: fix pgalloc_tag_split()")
> >> - e0a955bf7f61c ("mm/codetag: add pgalloc_tag_copy()")
> > 
> > For mm changes, we need an explicit ack from the mm maintainers to take
> > patches into the stable tree.  Why were these not tagged with the normal
> > "cc: stable@" tag in the first place?
> 
> The two codetag commits actually were tagged.

Ah, I missed that, I only looked at the first one and stopped there.

> c2a967f6ab0ec seems to me an
> unrelated performance optimization, could make sense in a LTS kernel but is
> there a lot of benefit expected for the lifetime of 6.11?

I can't take any of these as they aren't properly signed-off so I'll
wait for a resubmit to consider any of these (but patch 1 better have
some justification and ack from the mm developers that we can take
it...)

thanks,

greg k-h


* Re: [PATCH 6.11.y 2/3] mm/codetag: fix pgalloc_tag_split()
  2024-10-18  8:29   ` Greg KH
@ 2024-10-18 17:15     ` Chris Li
  0 siblings, 0 replies; 10+ messages in thread
From: Chris Li @ 2024-10-18 17:15 UTC (permalink / raw)
  To: Greg KH
  Cc: stable, Muchun Song, Andrew Morton, Yu Zhao, Suren Baghdasaryan,
	Kent Overstreet, Vlastimil Babka

Hi Greg,

On Fri, Oct 18, 2024 at 1:29 AM Greg KH <gregkh@linuxfoundation.org> wrote:
>
> On Thu, Oct 17, 2024 at 02:58:03PM -0700, chrisl@kernel.org wrote:
> > From: Yu Zhao <yuzhao@google.com>
> >
> > [ Upstream commit 95599ef684d01136a8b77c16a7c853496786e173 ]
> >
> > The current assumption is that a large folio can only be split into
> > order-0 folios.  That is not the case for hugeTLB demotion, nor for THP
> > split: see commit c010d47f107f ("mm: thp: split huge page to any lower
> > order pages").
> >
> > When a large folio is split into ones of a lower non-zero order, only the
> > new head pages should be tagged.  Tagging tail pages can cause imbalanced
> > "calls" counters, since only head pages are untagged by pgalloc_tag_sub()
> > and the "calls" counts on tail pages are leaked, e.g.,
> >
> >   # echo 2048kB >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote_size
> >   # echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/nr_hugepages
> >   # time echo 700 >/sys/kernel/mm/hugepages/hugepages-1048576kB/demote
> >   # echo 0 >/sys/kernel/mm/hugepages/hugepages-2048kB/nr_hugepages
> >   # grep alloc_gigantic_folio /proc/allocinfo
> >
> > Before this patch:
> >   0  549427200  mm/hugetlb.c:1549 func:alloc_gigantic_folio
> >
> >   real  0m2.057s
> >   user  0m0.000s
> >   sys   0m2.051s
> >
> > After this patch:
> >   0          0  mm/hugetlb.c:1549 func:alloc_gigantic_folio
> >
> >   real  0m1.711s
> >   user  0m0.000s
> >   sys   0m1.704s
> >
> > Not tagging tail pages also improves the splitting time, e.g., by about
> > 15% when demoting 1GB hugeTLB folios to 2MB ones, as shown above.
> >
> > Link: https://lkml.kernel.org/r/20240906042108.1150526-2-yuzhao@google.com
> > Fixes: be25d1d4e822 ("mm: create new codetag references during page splitting")
> > Signed-off-by: Yu Zhao <yuzhao@google.com>
> > Acked-by: Suren Baghdasaryan <surenb@google.com>
> > Cc: Kent Overstreet <kent.overstreet@linux.dev>
> > Cc: Muchun Song <muchun.song@linux.dev>
> > Cc: <stable@vger.kernel.org>
> > Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
>
> You did not sign off on this backport, so even if I wanted to take it, I
> couldn't :(

My bad, sorry; this was my first attempt to backport patches for someone else.

> Please fix this, and patch 3/3 up, and just send those.

I will add my Signed-off-by to all 3 patches and send out the v2. I haven't
found an easy way to skip the cover letter in b4 yet. It might be
easier just to refresh the whole series.

Chris
