public inbox for linux-mm@kvack.org
* [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls
@ 2026-03-09 11:07 Breno Leitao
  2026-03-09 11:07 ` [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
                   ` (3 more replies)
  0 siblings, 4 replies; 12+ messages in thread
From: Breno Leitao @ 2026-03-09 11:07 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle), Breno Leitao

Writing to /sys/kernel/mm/transparent_hugepage/enabled causes
start_stop_khugepaged() to be called regardless of whether anything
changed. start_stop_khugepaged() then spams the printk ring buffer
with the exact same message, even when nothing changes.

For instance, if you have a custom vm.min_free_kbytes, just touching
/sys/kernel/mm/transparent_hugepage/enabled causes a printk message.
Example:

      # sysctl -w vm.min_free_kbytes=112382
      # for i in $(seq 100); do echo never > /sys/kernel/mm/transparent_hugepage/enabled ; done

and you get 100 identical WARN messages like the following:

      khugepaged: min_free_kbytes is not updated to 112381 because user defined value 112382 is preferred

A similar message shows up when setting thp to "always":

      # for i in $(seq 100); do
      #       echo 1024 > /proc/sys/vm/min_free_kbytes
      #       echo always > /sys/kernel/mm/transparent_hugepage/enabled
      # done

And then, we have 100 messages like:

      khugepaged: raising min_free_kbytes from 1024 to 67584 to help transparent hugepage allocations

This is especially common with configuration management systems that
write the THP configuration without reading it first, assuming that a
write is a no-op when the configuration does not change. Each such
write still prints one of these messages.

For instance, in Meta's fleet, ~10K servers were producing 3.5M of
these messages per day.

Fix this by refactoring the sysfs _store helpers to act only on real
changes, and by ratelimiting the messages.

This version is heavily based on Lorenzo's suggestion on V1.

---
Changes in v4:
- Use the enum instead of int in the new functions (akpm).
- Explicitly initialize the enum values (akpm).
- Link to v3: https://patch.msgid.link/20260307-thp_logs-v3-0-a45d2c8f3685@debian.org

Changes in v3:
- Extra ratelimit patch.
- Create two enums, one for anon and one for global. (Lorenzo)
- Remove the `extern` from set_recommended_min_free_kbytes (Lorenzo)
- Export set_recommended_min_free_kbytes() definition to mm/internal.h
  (Lorenzo)
- Link to v2: https://patch.msgid.link/20260305-thp_logs-v2-0-96b3ad795894@debian.org

Changes in v2:
- V2 is heavily based on Lorenzo and Kiryl feedback on v1.
- Link to v1: https://patch.msgid.link/20260304-thp_logs-v1-0-59038218a253@debian.org

---
Breno Leitao (4):
      mm: khugepaged: export set_recommended_min_free_kbytes()
      mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
      mm: huge_memory: refactor enabled_store() with change_enabled()
      mm: ratelimit min_free_kbytes adjustment messages

 mm/huge_memory.c | 147 +++++++++++++++++++++++++++++++++++++------------------
 mm/internal.h    |   5 ++
 mm/khugepaged.c  |   6 +--
 mm/page_alloc.c  |   4 +-
 4 files changed, 110 insertions(+), 52 deletions(-)
---
base-commit: 9dd5012f78d699f7a6051583dc53adeb401e28f0
change-id: 20260303-thp_logs-059d6b80f6d6

Best regards,
--  
Breno Leitao <leitao@debian.org>



^ permalink raw reply	[flat|nested] 12+ messages in thread

* [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes()
  2026-03-09 11:07 [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
@ 2026-03-09 11:07 ` Breno Leitao
  2026-03-09 13:30   ` David Hildenbrand (Arm)
  2026-03-09 11:07 ` [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2026-03-09 11:07 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle), Breno Leitao

Make set_recommended_min_free_kbytes() callable from outside
khugepaged.c by removing the static qualifier and adding a
declaration in mm/internal.h.

This allows callers that change THP settings to recalculate
watermarks without going through start_stop_khugepaged().

Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
 mm/internal.h   | 5 +++++
 mm/khugepaged.c | 2 +-
 2 files changed, 6 insertions(+), 1 deletion(-)

diff --git a/mm/internal.h b/mm/internal.h
index cb0af847d7d99..7bd768e367793 100644
--- a/mm/internal.h
+++ b/mm/internal.h
@@ -623,6 +623,11 @@ int user_proactive_reclaim(char *buf,
  */
 pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
 
+/*
+ * in mm/khugepaged.c
+ */
+void set_recommended_min_free_kbytes(void);
+
 /*
  * in mm/page_alloc.c
  */
diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 1dd3cfca610db..56a41c21b44c9 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2630,7 +2630,7 @@ static int khugepaged(void *none)
 	return 0;
 }
 
-static void set_recommended_min_free_kbytes(void)
+void set_recommended_min_free_kbytes(void)
 {
 	struct zone *zone;
 	int nr_zones = 0;

-- 
2.47.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
  2026-03-09 11:07 [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
  2026-03-09 11:07 ` [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
@ 2026-03-09 11:07 ` Breno Leitao
  2026-03-09 13:43   ` David Hildenbrand (Arm)
  2026-03-09 11:07 ` [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
  2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
  3 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2026-03-09 11:07 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Breno Leitao, Lorenzo Stoakes (Oracle)

Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
anon_enabled_store() into a new change_anon_orders() helper that
loops over an orders[] array, setting the bit for the selected mode
and clearing the others.

Introduce enum anon_enabled_mode and anon_enabled_mode_strings[]
for the per-order anon THP setting.

Use sysfs_match_string() with the anon_enabled_mode_strings[] table
to replace the if/else chain of sysfs_streq() calls.

The helper uses test_and_set_bit()/test_and_clear_bit() to track
whether the state actually changed, so start_stop_khugepaged() is
only called when needed. When the mode is unchanged,
set_recommended_min_free_kbytes() is called directly to preserve
the watermark recalculation behavior of the original code.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
 mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
 1 file changed, 52 insertions(+), 32 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 8e2746ea74adf..2d5b05a416dab 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
 	return sysfs_emit(buf, "%s\n", output);
 }
 
+enum anon_enabled_mode {
+	ANON_ENABLED_ALWAYS	= 0,
+	ANON_ENABLED_MADVISE	= 1,
+	ANON_ENABLED_INHERIT	= 2,
+	ANON_ENABLED_NEVER	= 3,
+};
+
+static const char * const anon_enabled_mode_strings[] = {
+	[ANON_ENABLED_ALWAYS]	= "always",
+	[ANON_ENABLED_MADVISE]	= "madvise",
+	[ANON_ENABLED_INHERIT]	= "inherit",
+	[ANON_ENABLED_NEVER]	= "never",
+};
+
 static ssize_t enabled_store(struct kobject *kobj,
 			     struct kobj_attribute *attr,
 			     const char *buf, size_t count)
@@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
 	return sysfs_emit(buf, "%s\n", output);
 }
 
+static bool change_anon_orders(int order, enum anon_enabled_mode mode)
+{
+	static unsigned long *orders[] = {
+		&huge_anon_orders_always,
+		&huge_anon_orders_madvise,
+		&huge_anon_orders_inherit,
+	};
+	enum anon_enabled_mode m;
+	bool changed = false;
+
+	spin_lock(&huge_anon_orders_lock);
+	for (m = 0; m < ARRAY_SIZE(orders); m++) {
+		if (m == mode)
+			changed |= !test_and_set_bit(order, orders[m]);
+		else
+			changed |= test_and_clear_bit(order, orders[m]);
+	}
+	spin_unlock(&huge_anon_orders_lock);
+
+	return changed;
+}
+
 static ssize_t anon_enabled_store(struct kobject *kobj,
 				  struct kobj_attribute *attr,
 				  const char *buf, size_t count)
 {
 	int order = to_thpsize(kobj)->order;
-	ssize_t ret = count;
+	int mode;
 
-	if (sysfs_streq(buf, "always")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_always);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "inherit")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_madvise);
-		set_bit(order, &huge_anon_orders_inherit);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "madvise")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		set_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else if (sysfs_streq(buf, "never")) {
-		spin_lock(&huge_anon_orders_lock);
-		clear_bit(order, &huge_anon_orders_always);
-		clear_bit(order, &huge_anon_orders_inherit);
-		clear_bit(order, &huge_anon_orders_madvise);
-		spin_unlock(&huge_anon_orders_lock);
-	} else
-		ret = -EINVAL;
+	mode = sysfs_match_string(anon_enabled_mode_strings, buf);
+	if (mode < 0)
+		return -EINVAL;
 
-	if (ret > 0) {
-		int err;
+	if (change_anon_orders(order, mode)) {
+		int err = start_stop_khugepaged();
 
-		err = start_stop_khugepaged();
 		if (err)
-			ret = err;
+			return err;
+	} else {
+		/*
+		 * Recalculate watermarks even when the mode didn't
+		 * change, as the previous code always called
+		 * start_stop_khugepaged() which does this internally.
+		 */
+		set_recommended_min_free_kbytes();
 	}
-	return ret;
+
+	return count;
 }
 
 static struct kobj_attribute anon_enabled_attr =

-- 
2.47.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled()
  2026-03-09 11:07 [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
  2026-03-09 11:07 ` [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
  2026-03-09 11:07 ` [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
@ 2026-03-09 11:07 ` Breno Leitao
  2026-03-09 13:45   ` David Hildenbrand (Arm)
  2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
  3 siblings, 1 reply; 12+ messages in thread
From: Breno Leitao @ 2026-03-09 11:07 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Breno Leitao, Lorenzo Stoakes (Oracle)

Refactor enabled_store() to use a new change_enabled() helper.
Introduce a separate enum global_enabled_mode and
global_enabled_mode_strings[], mirroring the anon_enabled_mode
pattern from the previous commit.

A separate enum is necessary because the global THP setting does
not support "inherit", only "always", "madvise", and "never".
Reusing anon_enabled_mode would leave a NULL gap in the string
array, causing sysfs_match_string() to stop early and fail to
match entries after the gap.

The helper uses the same loop pattern as change_anon_orders(),
iterating over an array of flag bit positions and using
test_and_set_bit()/test_and_clear_bit() to track whether the state
actually changed.

Signed-off-by: Breno Leitao <leitao@debian.org>
Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
---
 mm/huge_memory.c | 63 ++++++++++++++++++++++++++++++++++++++++++--------------
 1 file changed, 48 insertions(+), 15 deletions(-)

diff --git a/mm/huge_memory.c b/mm/huge_memory.c
index 2d5b05a416dab..be42a28da31d8 100644
--- a/mm/huge_memory.c
+++ b/mm/huge_memory.c
@@ -330,30 +330,63 @@ static const char * const anon_enabled_mode_strings[] = {
 	[ANON_ENABLED_NEVER]	= "never",
 };
 
+enum global_enabled_mode {
+	GLOBAL_ENABLED_ALWAYS	= 0,
+	GLOBAL_ENABLED_MADVISE	= 1,
+	GLOBAL_ENABLED_NEVER	= 2,
+};
+
+static const char * const global_enabled_mode_strings[] = {
+	[GLOBAL_ENABLED_ALWAYS]		= "always",
+	[GLOBAL_ENABLED_MADVISE]	= "madvise",
+	[GLOBAL_ENABLED_NEVER]		= "never",
+};
+
+static bool change_enabled(enum global_enabled_mode mode)
+{
+	static const unsigned long thp_flags[] = {
+		TRANSPARENT_HUGEPAGE_FLAG,
+		TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG,
+	};
+	enum global_enabled_mode m;
+	bool changed = false;
+
+	for (m = 0; m < ARRAY_SIZE(thp_flags); m++) {
+		if (m == mode)
+			changed |= !test_and_set_bit(thp_flags[m],
+						     &transparent_hugepage_flags);
+		else
+			changed |= test_and_clear_bit(thp_flags[m],
+						      &transparent_hugepage_flags);
+	}
+
+	return changed;
+}
+
 static ssize_t enabled_store(struct kobject *kobj,
 			     struct kobj_attribute *attr,
 			     const char *buf, size_t count)
 {
-	ssize_t ret = count;
+	int mode;
 
-	if (sysfs_streq(buf, "always")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-		set_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-	} else if (sysfs_streq(buf, "madvise")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-		set_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else if (sysfs_streq(buf, "never")) {
-		clear_bit(TRANSPARENT_HUGEPAGE_FLAG, &transparent_hugepage_flags);
-		clear_bit(TRANSPARENT_HUGEPAGE_REQ_MADV_FLAG, &transparent_hugepage_flags);
-	} else
-		ret = -EINVAL;
+	mode = sysfs_match_string(global_enabled_mode_strings, buf);
+	if (mode < 0)
+		return -EINVAL;
 
-	if (ret > 0) {
+	if (change_enabled(mode)) {
 		int err = start_stop_khugepaged();
+
 		if (err)
-			ret = err;
+			return err;
+	} else {
+		/*
+		 * Recalculate watermarks even when the mode didn't
+		 * change, as the previous code always called
+		 * start_stop_khugepaged() which does this internally.
+		 */
+		set_recommended_min_free_kbytes();
 	}
-	return ret;
+	return count;
 }
 
 static struct kobj_attribute enabled_attr = __ATTR_RW(enabled);

-- 
2.47.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages
  2026-03-09 11:07 [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
                   ` (2 preceding siblings ...)
  2026-03-09 11:07 ` [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
@ 2026-03-09 11:07 ` Breno Leitao
  2026-03-09 13:33   ` Lorenzo Stoakes (Oracle)
                     ` (2 more replies)
  3 siblings, 3 replies; 12+ messages in thread
From: Breno Leitao @ 2026-03-09 11:07 UTC (permalink / raw)
  To: Andrew Morton, David Hildenbrand, Lorenzo Stoakes, Zi Yan,
	Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Breno Leitao

The "raising min_free_kbytes" pr_info message in
set_recommended_min_free_kbytes() and the "min_free_kbytes is not
updated to" pr_warn in calculate_min_free_kbytes() can spam the
kernel log when called repeatedly.

Switch the pr_info in set_recommended_min_free_kbytes() and the
pr_warn in calculate_min_free_kbytes() to their _ratelimited variants
to prevent the log spam for this message.

Signed-off-by: Breno Leitao <leitao@debian.org>
---
 mm/khugepaged.c | 4 ++--
 mm/page_alloc.c | 4 ++--
 2 files changed, 4 insertions(+), 4 deletions(-)

diff --git a/mm/khugepaged.c b/mm/khugepaged.c
index 56a41c21b44c9..d44d463ccfd3e 100644
--- a/mm/khugepaged.c
+++ b/mm/khugepaged.c
@@ -2671,8 +2671,8 @@ void set_recommended_min_free_kbytes(void)
 
 	if (recommended_min > min_free_kbytes) {
 		if (user_min_free_kbytes >= 0)
-			pr_info("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
-				min_free_kbytes, recommended_min);
+			pr_info_ratelimited("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
+					    min_free_kbytes, recommended_min);
 
 		min_free_kbytes = recommended_min;
 	}
diff --git a/mm/page_alloc.c b/mm/page_alloc.c
index 2d4b6f1a554ed..c840c886807bf 100644
--- a/mm/page_alloc.c
+++ b/mm/page_alloc.c
@@ -6553,8 +6553,8 @@ void calculate_min_free_kbytes(void)
 	if (new_min_free_kbytes > user_min_free_kbytes)
 		min_free_kbytes = clamp(new_min_free_kbytes, 128, 262144);
 	else
-		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
-				new_min_free_kbytes, user_min_free_kbytes);
+		pr_warn_ratelimited("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
+				    new_min_free_kbytes, user_min_free_kbytes);
 
 }
 

-- 
2.47.3



^ permalink raw reply related	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes()
  2026-03-09 11:07 ` [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
@ 2026-03-09 13:30   ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-09 13:30 UTC (permalink / raw)
  To: Breno Leitao, Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle)

On 3/9/26 12:07, Breno Leitao wrote:
> Make set_recommended_min_free_kbytes() callable from outside
> khugepaged.c by removing the static qualifier and adding a
> declaration in mm/internal.h.
> 
> This allows callers that change THP settings to recalculate
> watermarks without going through start_stop_khugepaged().
> 
> Suggested-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
>  mm/internal.h   | 5 +++++
>  mm/khugepaged.c | 2 +-
>  2 files changed, 6 insertions(+), 1 deletion(-)
> 
> diff --git a/mm/internal.h b/mm/internal.h
> index cb0af847d7d99..7bd768e367793 100644
> --- a/mm/internal.h
> +++ b/mm/internal.h
> @@ -623,6 +623,11 @@ int user_proactive_reclaim(char *buf,
>   */
>  pmd_t *mm_find_pmd(struct mm_struct *mm, unsigned long address);
>  
> +/*
> + * in mm/khugepaged.c
> + */
> +void set_recommended_min_free_kbytes(void);
> +
>  /*
>   * in mm/page_alloc.c
>   */
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 1dd3cfca610db..56a41c21b44c9 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2630,7 +2630,7 @@ static int khugepaged(void *none)
>  	return 0;
>  }
>  
> -static void set_recommended_min_free_kbytes(void)
> +void set_recommended_min_free_kbytes(void)
>  {
>  	struct zone *zone;
>  	int nr_zones = 0;
> 

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages
  2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
@ 2026-03-09 13:33   ` Lorenzo Stoakes (Oracle)
  2026-03-09 13:46   ` David Hildenbrand (Arm)
  2026-03-10  3:02   ` Baolin Wang
  2 siblings, 0 replies; 12+ messages in thread
From: Lorenzo Stoakes (Oracle) @ 2026-03-09 13:33 UTC (permalink / raw)
  To: Breno Leitao
  Cc: Andrew Morton, David Hildenbrand, Lorenzo Stoakes (Oracle),
	Zi Yan, Baolin Wang, Liam R. Howlett, Nico Pache, Ryan Roberts,
	Dev Jain, Barry Song, Lance Yang, Vlastimil Babka,
	Suren Baghdasaryan, Michal Hocko, Brendan Jackman,
	Johannes Weiner, Mike Rapoport, linux-mm, linux-kernel,
	usamaarif642, kas, kernel-team

-cc old mail +cc new one :) [it'll take a while for this to propagate I know]

On Mon, Mar 09, 2026 at 04:07:33AM -0700, Breno Leitao wrote:
> The "raising min_free_kbytes" pr_info message in
> set_recommended_min_free_kbytes() and the "min_free_kbytes is not
> updated to" pr_warn in calculate_min_free_kbytes() can spam the
> kernel log when called repeatedly.
>
> Switch the pr_info in set_recommended_min_free_kbytes() and the
> pr_warn in calculate_min_free_kbytes() to their _ratelimited variants
> to prevent the log spam for this message.
>
> Signed-off-by: Breno Leitao <leitao@debian.org>

LGTM, so:

Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>

> ---
>  mm/khugepaged.c | 4 ++--
>  mm/page_alloc.c | 4 ++--
>  2 files changed, 4 insertions(+), 4 deletions(-)
>
> diff --git a/mm/khugepaged.c b/mm/khugepaged.c
> index 56a41c21b44c9..d44d463ccfd3e 100644
> --- a/mm/khugepaged.c
> +++ b/mm/khugepaged.c
> @@ -2671,8 +2671,8 @@ void set_recommended_min_free_kbytes(void)
>
>  	if (recommended_min > min_free_kbytes) {
>  		if (user_min_free_kbytes >= 0)
> -			pr_info("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
> -				min_free_kbytes, recommended_min);
> +			pr_info_ratelimited("raising min_free_kbytes from %d to %lu to help transparent hugepage allocations\n",
> +					    min_free_kbytes, recommended_min);
>
>  		min_free_kbytes = recommended_min;
>  	}
> diff --git a/mm/page_alloc.c b/mm/page_alloc.c
> index 2d4b6f1a554ed..c840c886807bf 100644
> --- a/mm/page_alloc.c
> +++ b/mm/page_alloc.c
> @@ -6553,8 +6553,8 @@ void calculate_min_free_kbytes(void)
>  	if (new_min_free_kbytes > user_min_free_kbytes)
>  		min_free_kbytes = clamp(new_min_free_kbytes, 128, 262144);
>  	else
> -		pr_warn("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> -				new_min_free_kbytes, user_min_free_kbytes);
> +		pr_warn_ratelimited("min_free_kbytes is not updated to %d because user defined value %d is preferred\n",
> +				    new_min_free_kbytes, user_min_free_kbytes);
>
>  }
>
>
> --
> 2.47.3
>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
  2026-03-09 11:07 ` [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
@ 2026-03-09 13:43   ` David Hildenbrand (Arm)
  2026-03-10 16:31     ` Breno Leitao
  0 siblings, 1 reply; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-09 13:43 UTC (permalink / raw)
  To: Breno Leitao, Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle)

On 3/9/26 12:07, Breno Leitao wrote:
> Consolidate the repeated spin_lock/set_bit/clear_bit pattern in
> anon_enabled_store() into a new change_anon_orders() helper that
> loops over an orders[] array, setting the bit for the selected mode
> and clearing the others.
> 
> Introduce enum anon_enabled_mode and anon_enabled_mode_strings[]
> for the per-order anon THP setting.
> 
> Use sysfs_match_string() with the anon_enabled_mode_strings[] table
> to replace the if/else chain of sysfs_streq() calls.
> 
> The helper uses test_and_set_bit()/test_and_clear_bit() to track
> whether the state actually changed, so start_stop_khugepaged() is
> only called when needed. When the mode is unchanged,
> set_recommended_min_free_kbytes() is called directly to preserve
> the watermark recalculation behavior of the original code.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
>  mm/huge_memory.c | 84 +++++++++++++++++++++++++++++++++++---------------------
>  1 file changed, 52 insertions(+), 32 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 8e2746ea74adf..2d5b05a416dab 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -316,6 +316,20 @@ static ssize_t enabled_show(struct kobject *kobj,
>  	return sysfs_emit(buf, "%s\n", output);
>  }
>  
> +enum anon_enabled_mode {
> +	ANON_ENABLED_ALWAYS	= 0,
> +	ANON_ENABLED_MADVISE	= 1,
> +	ANON_ENABLED_INHERIT	= 2,
> +	ANON_ENABLED_NEVER	= 3,
> +};
> +
> +static const char * const anon_enabled_mode_strings[] = {
> +	[ANON_ENABLED_ALWAYS]	= "always",
> +	[ANON_ENABLED_MADVISE]	= "madvise",
> +	[ANON_ENABLED_INHERIT]	= "inherit",
> +	[ANON_ENABLED_NEVER]	= "never",
> +};
> +
>  static ssize_t enabled_store(struct kobject *kobj,
>  			     struct kobj_attribute *attr,
>  			     const char *buf, size_t count)
> @@ -515,48 +529,54 @@ static ssize_t anon_enabled_show(struct kobject *kobj,
>  	return sysfs_emit(buf, "%s\n", output);
>  }
>  
> +static bool change_anon_orders(int order, enum anon_enabled_mode mode)

I would suggest something a bit longer but clearer

"set_anon_enabled_mode_for_order()"

Or shorter

"set_anon_enabled_mode"

1) set vs. change. the function returns whether actually something
   changed.

2) We're not really changing "anon_orders". Yeah, we're updating
   variables that are named "huge_anon_orders_XXX", but that's more an
   implementation detail when setting the anon_enabled mode for a
   specific order.

> +{
> +	static unsigned long *orders[] = {
> +		&huge_anon_orders_always,
> +		&huge_anon_orders_madvise,
> +		&huge_anon_orders_inherit,
> +	};

Having a "order" and "orders" variable that have different semantics is
a bit confusing. Don't really have a better suggestion. "enabled_orders"
? hm.


> +	enum anon_enabled_mode m;
> +	bool changed = false;
> +
> +	spin_lock(&huge_anon_orders_lock);
> +	for (m = 0; m < ARRAY_SIZE(orders); m++) {
> +		if (m == mode)
> +			changed |= !test_and_set_bit(order, orders[m]);
> +		else
> +			changed |= test_and_clear_bit(order, orders[m]);
> +	}

Can we use the non-atomic variant here? __test_and_set_bit(). Just
wondering as the lock protects concurrent modifications.


> +	spin_unlock(&huge_anon_orders_lock);
> +
> +	return changed;
> +}
> +

Apart from that LGTM.

-- 
Cheers,

David


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled()
  2026-03-09 11:07 ` [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
@ 2026-03-09 13:45   ` David Hildenbrand (Arm)
  0 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-09 13:45 UTC (permalink / raw)
  To: Breno Leitao, Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle)

On 3/9/26 12:07, Breno Leitao wrote:
> Refactor enabled_store() to use a new change_enabled() helper.
> Introduce a separate enum global_enabled_mode and
> global_enabled_mode_strings[], mirroring the anon_enabled_mode
> pattern from the previous commit.
> 
> A separate enum is necessary because the global THP setting does
> not support "inherit", only "always", "madvise", and "never".
> Reusing anon_enabled_mode would leave a NULL gap in the string
> array, causing sysfs_match_string() to stop early and fail to
> match entries after the gap.
> 
> The helper uses the same loop pattern as change_anon_orders(),
> iterating over an array of flag bit positions and using
> test_and_set_bit()/test_and_clear_bit() to track whether the state
> actually changed.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> Reviewed-by: Lorenzo Stoakes (Oracle) <ljs@kernel.org>
> ---
>  mm/huge_memory.c | 63 ++++++++++++++++++++++++++++++++++++++++++--------------
>  1 file changed, 48 insertions(+), 15 deletions(-)
> 
> diff --git a/mm/huge_memory.c b/mm/huge_memory.c
> index 2d5b05a416dab..be42a28da31d8 100644
> --- a/mm/huge_memory.c
> +++ b/mm/huge_memory.c
> @@ -330,30 +330,63 @@ static const char * const anon_enabled_mode_strings[] = {
>  	[ANON_ENABLED_NEVER]	= "never",
>  };
>  
> +enum global_enabled_mode {
> +	GLOBAL_ENABLED_ALWAYS	= 0,
> +	GLOBAL_ENABLED_MADVISE	= 1,
> +	GLOBAL_ENABLED_NEVER	= 2,
> +};
> +
> +static const char * const global_enabled_mode_strings[] = {
> +	[GLOBAL_ENABLED_ALWAYS]		= "always",
> +	[GLOBAL_ENABLED_MADVISE]	= "madvise",
> +	[GLOBAL_ENABLED_NEVER]		= "never",
> +};
> +
> +static bool change_enabled(enum global_enabled_mode mode)

I'd similarly call this something like "set_global_enabled_mode"


-- 
Cheers,

David


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages
  2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
  2026-03-09 13:33   ` Lorenzo Stoakes (Oracle)
@ 2026-03-09 13:46   ` David Hildenbrand (Arm)
  2026-03-10  3:02   ` Baolin Wang
  2 siblings, 0 replies; 12+ messages in thread
From: David Hildenbrand (Arm) @ 2026-03-09 13:46 UTC (permalink / raw)
  To: Breno Leitao, Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team

On 3/9/26 12:07, Breno Leitao wrote:
> The "raising min_free_kbytes" pr_info message in
> set_recommended_min_free_kbytes() and the "min_free_kbytes is not
> updated to" pr_warn in calculate_min_free_kbytes() can spam the
> kernel log when called repeatedly.
> 
> Switch the pr_info in set_recommended_min_free_kbytes() and the
> pr_warn in calculate_min_free_kbytes() to their _ratelimited variants
> to prevent the log spam for this message.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---

Acked-by: David Hildenbrand (Arm) <david@kernel.org>

-- 
Cheers,

David


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages
  2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
  2026-03-09 13:33   ` Lorenzo Stoakes (Oracle)
  2026-03-09 13:46   ` David Hildenbrand (Arm)
@ 2026-03-10  3:02   ` Baolin Wang
  2 siblings, 0 replies; 12+ messages in thread
From: Baolin Wang @ 2026-03-10  3:02 UTC (permalink / raw)
  To: Breno Leitao, Andrew Morton, David Hildenbrand, Lorenzo Stoakes,
	Zi Yan, Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain,
	Barry Song, Lance Yang, Vlastimil Babka, Suren Baghdasaryan,
	Michal Hocko, Brendan Jackman, Johannes Weiner, Mike Rapoport
  Cc: linux-mm, linux-kernel, usamaarif642, kas, kernel-team



On 3/9/26 7:07 PM, Breno Leitao wrote:
> The "raising min_free_kbytes" pr_info message in
> set_recommended_min_free_kbytes() and the "min_free_kbytes is not
> updated to" pr_warn in calculate_min_free_kbytes() can spam the
> kernel log when called repeatedly.
> 
> Switch the pr_info in set_recommended_min_free_kbytes() and the
> pr_warn in calculate_min_free_kbytes() to their _ratelimited variants
> to prevent the log spam for this message.
> 
> Signed-off-by: Breno Leitao <leitao@debian.org>
> ---

LGTM.
Reviewed-by: Baolin Wang <baolin.wang@linux.alibaba.com>


^ permalink raw reply	[flat|nested] 12+ messages in thread

* Re: [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders()
  2026-03-09 13:43   ` David Hildenbrand (Arm)
@ 2026-03-10 16:31     ` Breno Leitao
  0 siblings, 0 replies; 12+ messages in thread
From: Breno Leitao @ 2026-03-10 16:31 UTC (permalink / raw)
  To: David Hildenbrand (Arm)
  Cc: Andrew Morton, Lorenzo Stoakes, Zi Yan, Baolin Wang,
	Liam R. Howlett, Nico Pache, Ryan Roberts, Dev Jain, Barry Song,
	Lance Yang, Vlastimil Babka, Suren Baghdasaryan, Michal Hocko,
	Brendan Jackman, Johannes Weiner, Mike Rapoport, linux-mm,
	linux-kernel, usamaarif642, kas, kernel-team,
	Lorenzo Stoakes (Oracle)

On Mon, Mar 09, 2026 at 02:43:11PM +0100, David Hildenbrand (Arm) wrote:
> On 3/9/26 12:07, Breno Leitao wrote:
> > +static bool change_anon_orders(int order, enum anon_enabled_mode mode)
> 
> I would suggest something a bit longer but clearer
> 
> "set_anon_enabled_mode_for_order()"
> 
> Or shorter
> 
> "set_anon_enabled_mode"
set_anon_enabled_mode() seems to be better. Then I have:

set_global_enabled_mode() and set_anon_enabled_mode().

> 1) set vs. change. The function returns whether something actually
>    changed.
> 
> 2) We're not really changing "anon_orders". Yeah, we're updating
>    variables that are named "huge_anon_orders_XXX", but that's more an
>    implementation detail when setting the anon_enabled mode for a
>    specific order.
> 
> > +{
> > +	static unsigned long *orders[] = {
> > +		&huge_anon_orders_always,
> > +		&huge_anon_orders_madvise,
> > +		&huge_anon_orders_inherit,
> > +	};
> 
> Having an "order" and an "orders" variable with different semantics is
> a bit confusing. Don't really have a better suggestion. "enabled_orders"
> ? hm.

Ack. renaming to enabled_orders.

> > +	enum anon_enabled_mode m;
> > +	bool changed = false;
> > +
> > +	spin_lock(&huge_anon_orders_lock);
> > +	for (m = 0; m < ARRAY_SIZE(orders); m++) {
> > +		if (m == mode)
> > +			changed |= !test_and_set_bit(order, orders[m]);
> > +		else
> > +			changed |= test_and_clear_bit(order, orders[m]);
> > +	}
> 
> Can we use the non-atomic variant here? __test_and_set_bit(). Just
> wondering as the lock protects concurrent modifications.

Ack!
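
Putting the three points together (set_anon_enabled_mode(), enabled_orders, non-atomic bit ops under the lock), the respun helper might look roughly like the userspace sketch below. The enum tags and the *_na() bit helpers are stand-ins emulating the kernel's __test_and_set_bit()/__test_and_clear_bit(), and huge_anon_orders_lock is elided, so treat this as a shape sketch rather than the actual v5 code:

```c
#include <assert.h>
#include <stdbool.h>

enum anon_enabled_mode { ANON_ALWAYS, ANON_MADVISE, ANON_INHERIT, ANON_NR_MODES };

static unsigned long huge_anon_orders_always;
static unsigned long huge_anon_orders_madvise;
static unsigned long huge_anon_orders_inherit;

/* Non-atomic test-and-set/clear, like the kernel's __ variants. */
static bool test_and_set_bit_na(int nr, unsigned long *addr)
{
	bool old = *addr & (1UL << nr);
	*addr |= 1UL << nr;
	return old;
}

static bool test_and_clear_bit_na(int nr, unsigned long *addr)
{
	bool old = *addr & (1UL << nr);
	*addr &= ~(1UL << nr);
	return old;
}

/* Returns true iff the effective mode for 'order' actually changed. */
static bool set_anon_enabled_mode(int order, enum anon_enabled_mode mode)
{
	unsigned long *enabled_orders[] = {
		[ANON_ALWAYS]	= &huge_anon_orders_always,
		[ANON_MADVISE]	= &huge_anon_orders_madvise,
		[ANON_INHERIT]	= &huge_anon_orders_inherit,
	};
	enum anon_enabled_mode m;
	bool changed = false;

	/* huge_anon_orders_lock would be held across this loop in the kernel. */
	for (m = 0; m < ANON_NR_MODES; m++) {
		if (m == mode)
			changed |= !test_and_set_bit_na(order, enabled_orders[m]);
		else
			changed |= test_and_clear_bit_na(order, enabled_orders[m]);
	}
	return changed;
}
```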

I will respin a new version,
--breno


^ permalink raw reply	[flat|nested] 12+ messages in thread

end of thread, other threads:[~2026-03-10 16:32 UTC | newest]

Thread overview: 12+ messages (download: mbox.gz / follow: Atom feed)
-- links below jump to the message on this page --
2026-03-09 11:07 [PATCH v4 0/4] mm: thp: reduce unnecessary start_stop_khugepaged() calls Breno Leitao
2026-03-09 11:07 ` [PATCH v4 1/4] mm: khugepaged: export set_recommended_min_free_kbytes() Breno Leitao
2026-03-09 13:30   ` David Hildenbrand (Arm)
2026-03-09 11:07 ` [PATCH v4 2/4] mm: huge_memory: refactor anon_enabled_store() with change_anon_orders() Breno Leitao
2026-03-09 13:43   ` David Hildenbrand (Arm)
2026-03-10 16:31     ` Breno Leitao
2026-03-09 11:07 ` [PATCH v4 3/4] mm: huge_memory: refactor enabled_store() with change_enabled() Breno Leitao
2026-03-09 13:45   ` David Hildenbrand (Arm)
2026-03-09 11:07 ` [PATCH v4 4/4] mm: ratelimit min_free_kbytes adjustment messages Breno Leitao
2026-03-09 13:33   ` Lorenzo Stoakes (Oracle)
2026-03-09 13:46   ` David Hildenbrand (Arm)
2026-03-10  3:02   ` Baolin Wang

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox