DAMON development mailing list
 help / color / mirror / Atom feed
* [RFC PATCH v2 0/3] mm/damon/modules: detect and use fresh status
@ 2026-04-18  1:44 SeongJae Park
  2026-04-18  1:44 ` [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values SeongJae Park
                   ` (2 more replies)
  0 siblings, 3 replies; 10+ messages in thread
From: SeongJae Park @ 2026-04-18  1:44 UTC (permalink / raw)
  Cc: SeongJae Park, # 5 . 19 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

DAMON modules including DAMON_RECLAIM, DAMON_LRU_SORT and DAMON_STAT
commonly expose the kdamond running status via their parameters.  Under
certain scenarios including wrong user inputs and memory allocation
failures, those parameter values can be stale.  It can confuse users.
For DAMON_RECLAIM and DAMON_LRU_SORT, it even makes the kdamond unable
to be restarted before the system reboot.

The problem comes from the fact that there are multiple events for the
status changes and it is difficult to follow up all the scenarios.  Fix
the issue by detecting and using the status on demand, instead of using
a cached status that is difficult to be updated.

Patches 1-3 fix the bugs in DAMON_RECLAIM, DAMON_LRU_SORT and DAMON_STAT
in the order.

Changes from v2
- v2: https://lore.kernel.org/20260413185249.5921-1-aethernet65535@gmail.com
- Add RFC tag back, for sashiko review.
- Detect and use fresh status instead of trying to catch up all scenarios.
- Change Liew from the responsible author to a credit-deserved co-developer.
- Move authorship responsibility to SJ.
- Add DAMON_STAT fix.
  - RFC of the fix was posted separately
    (https://lore.kernel.org/20260416143857.76146-1-sj@kernel.org), and
    only commit message wordsmithing is added in this version.
Changes from RFC
- rfc: https://lore.kernel.org/20260330164347.12772-1-aethernet65535@gmail.com
- Remove RFC tag.
- Remove 'damon_thread_status' structure and damon_update_thread_status()
  (SJ pointed out this was too much extension of core API for a problem
  that can be fixed more simply).
- Add a fallback in damon_{lru_sort, reclaim}_turn() 'N' path. If
  damon_stop() fails but kdamond is not running, forcefully reset the
  parameters.
- Reset 'enabled' and 'kdamond_pid' when damon_commit_ctx() fails in
  damon_{lru_sort, reclaim}_apply_parameters() (kdamond will terminate
  eventually in this case).

SeongJae Park (3):
  mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
  mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
  mm/damon/stat: detect and use fresh enabled value

 mm/damon/lru_sort.c | 63 ++++++++++++++++++++++++++++++---------------
 mm/damon/reclaim.c  | 63 ++++++++++++++++++++++++++++++---------------
 mm/damon/stat.c     | 30 ++++++++++++++-------
 3 files changed, 104 insertions(+), 52 deletions(-)


base-commit: 045e2ae4d82f0ee748f2f72fe64696c7da155b1c
-- 
2.47.3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
  2026-04-18  1:44 [RFC PATCH v2 0/3] mm/damon/modules: detect and use fresh status SeongJae Park
@ 2026-04-18  1:44 ` SeongJae Park
  2026-04-18  4:57   ` sashiko-bot
  2026-04-18  1:44 ` [RFC PATCH v2 2/3] mm/damon/lru_sort: " SeongJae Park
  2026-04-18  1:44 ` [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value SeongJae Park
  2 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2026-04-18  1:44 UTC (permalink / raw)
  Cc: SeongJae Park, # 5 . 19 . x, Andrew Morton, damon, linux-kernel,
	linux-mm, Liew Rui Yan

DAMON_RECLAIM updates 'enabled' and 'kdamond_pid' parameter values,
which represents the running status of its kdamond, when the user
explicitly requests start/stop of the kdamond.  The kdamond can,
however, be stopped in events other than the explicit user request in
the following three events.

1. ctx->regions_score_histogram allocation failure at beginning of the
   execution,
2. damon_commit_ctx() failure due to invalid user input, and
3. damon_commit_ctx() failure due to its internal allocation failures.

Hence, if the kdamond is stopped by the above three events, the values
of the status parameters can be stale.  Users could show the stale
values and be confused.  This is already bad, but the real consequence
is worse.  DAMON_RECLAIM avoids unnecessary damon_start() and
damon_stop() calls based on the 'enabled' parameter value.  And the
update of 'enabled' parameter value depends on the damon_start() and
damon_stop() call results.  Hence, once the kdamond has stopped by the
unintentional events, the user cannot restart the kdamond before the
system reboot.  For example, the issue can be reproduced via below
steps.

    # cd /sys/module/damon_reclaim/parameters
    #
    # # start DAMON_RECLAIM
    # echo Y > enabled
    # ps -ef | grep kdamond
    root         806       2  0 17:53 ?        00:00:00 [kdamond.0]
    root         808     803  0 17:53 pts/4    00:00:00 grep kdamond
    #
    # # commit wrong input to stop kdamond withou explicit stop request
    # echo 3 > addr_unit
    # echo Y > commit_inputs
    bash: echo: write error: Invalid argument
    #
    # # confirm kdamond is stopped
    # ps -ef | grep kdamond
    root         811     803  0 17:53 pts/4    00:00:00 grep kdamond
    #
    # # users casn now show stable status
    # cat enabled
    Y
    # cat kdamond_pid
    806
    #
    # # even after fixing the wrong parameter,
    # # kdamond cannot be restarted.
    # echo 1 > addr_unit
    # echo Y > enabled
    # ps -ef | grep kdamond
    root         815     803  0 17:54 pts/4    00:00:00 grep kdamond

The problem will only rarely happen in real and common setups for the
following reasons.  The allocation failures are unlikely in such setups
since those allocations are arguably too small to fail.  Also sane users
on real production environments may not commit wrong input parameters.
But once it happens, the consequence is quite bad.  And the bug is a
bug.

The issue stems from the fact that there are multiple events that can
change the status, and following all the events is challenging.
Dynamically detect and use the fresh status for the parameters when
those are requested.

Fixes: e035c280f6df ("mm/damon/reclaim: support online inputs update")
Cc: <stable@vger.kernel.org> # 5.19.x
Co-developed-by: Liew Rui Yan <aethernet65535@gmail.com>
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/reclaim.c | 63 ++++++++++++++++++++++++++++++----------------
 1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 86da147786583..46660c8d4d422 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -144,15 +144,6 @@ static unsigned long addr_unit __read_mostly = 1;
 static bool skip_anon __read_mostly;
 module_param(skip_anon, bool, 0600);
 
-/*
- * PID of the DAMON thread
- *
- * If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.
- * Else, -1.
- */
-static int kdamond_pid __read_mostly = -1;
-module_param(kdamond_pid, int, 0400);
-
 static struct damos_stat damon_reclaim_stat;
 DEFINE_DAMON_MODULES_DAMOS_STATS_PARAMS(damon_reclaim_stat,
 		reclaim_tried_regions, reclaimed_regions, quota_exceeds);
@@ -288,12 +279,8 @@ static int damon_reclaim_turn(bool on)
 {
 	int err;
 
-	if (!on) {
-		err = damon_stop(&ctx, 1);
-		if (!err)
-			kdamond_pid = -1;
-		return err;
-	}
+	if (!on)
+		return damon_stop(&ctx, 1);
 
 	err = damon_reclaim_apply_parameters();
 	if (err)
@@ -302,9 +289,6 @@ static int damon_reclaim_turn(bool on)
 	err = damon_start(&ctx, 1, true);
 	if (err)
 		return err;
-	kdamond_pid = damon_kdamond_pid(ctx);
-	if (kdamond_pid < 0)
-		return kdamond_pid;
 	return damon_call(ctx, &call_control);
 }
 
@@ -332,10 +316,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
 MODULE_PARM_DESC(addr_unit,
 	"Scale factor for DAMON_RECLAIM to ops address conversion (default: 1)");
 
+static bool damon_reclaim_enabled(void)
+{
+	if (!ctx)
+		return false;
+	return damon_is_running(ctx);
+}
+
 static int damon_reclaim_enabled_store(const char *val,
 		const struct kernel_param *kp)
 {
-	bool is_enabled = enabled;
 	bool enable;
 	int err;
 
@@ -343,7 +333,7 @@ static int damon_reclaim_enabled_store(const char *val,
 	if (err)
 		return err;
 
-	if (is_enabled == enable)
+	if (damon_reclaim_enabled() == enable)
 		return 0;
 
 	/* Called before init function.  The function will handle this. */
@@ -359,15 +349,46 @@ static int damon_reclaim_enabled_store(const char *val,
 	return err;
 }
 
+static int damon_reclaim_enabled_load(char *buffer,
+		const struct kernel_param *kp)
+{
+	return sprintf(buffer, "%c\n", damon_reclaim_enabled() ? 'Y' : 'N');
+}
+
 static const struct kernel_param_ops enabled_param_ops = {
 	.set = damon_reclaim_enabled_store,
-	.get = param_get_bool,
+	.get = damon_reclaim_enabled_load,
 };
 
 module_param_cb(enabled, &enabled_param_ops, &enabled, 0600);
 MODULE_PARM_DESC(enabled,
 	"Enable or disable DAMON_RECLAIM (default: disabled)");
 
+static int damon_reclaim_kdamond_pid_load(char *buffer,
+		const struct kernel_param *kp)
+{
+	int kdamond_pid = -1;
+
+	if (ctx) {
+		kdamond_pid = damon_kdamond_pid(ctx);
+		if (kdamond_pid < 0)
+			kdamond_pid = -1;
+	}
+	return sprintf(buffer, "%d\n", kdamond_pid);
+}
+
+static const struct kernel_param_ops kdamond_pid_param_ops = {
+	.get = damon_reclaim_kdamond_pid_load,
+};
+
+/*
+ * PID of the DAMON thread
+ *
+ * If DAMON_RECLAIM is enabled, this becomes the PID of the worker thread.
+ * Else, -1.
+ */
+module_param_cb(kdamond_pid, &kdamond_pid_param_ops, NULL, 0400);
+
 static int __init damon_reclaim_init(void)
 {
 	int err;
-- 
2.47.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH v2 2/3] mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
  2026-04-18  1:44 [RFC PATCH v2 0/3] mm/damon/modules: detect and use fresh status SeongJae Park
  2026-04-18  1:44 ` [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values SeongJae Park
@ 2026-04-18  1:44 ` SeongJae Park
  2026-04-18  5:30   ` sashiko-bot
  2026-04-18  1:44 ` [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value SeongJae Park
  2 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2026-04-18  1:44 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 0 . x, Andrew Morton, damon, linux-kernel,
	linux-mm, Liew Rui Yan

DAMON_LRU_SORT updates 'enabled' and 'kdamond_pid' parameter values,
which represents the running status of its kdamond, when the user
explicitly requests start/stop of the kdamond.  The kdamond can,
however, be stopped in events other than the explicit user request in
the following three events.

1. ctx->regions_score_histogram allocation failure at beginning of the
   execution,
2. damon_commit_ctx() failure due to invalid user input, and
3. damon_commit_ctx() failure due to its internal allocation failures.

Hence, if the kdamond is stopped by the above three events, the values
of the status parameters can be stale.  Users could show the stale
values and be confused.  This is already bad, but the real consequence
is worse.  DAMON_LRU_SORT avoids unnecessary damon_start() and
damon_stop() calls based on the 'enabled' parameter value.  And the
update of 'enabled' parameter value depends on the damon_start() and
damon_stop() call results.  Hence, once the kdamond has stopped by the
unintentional events, the user cannot restart the kdamond before the
system reboot.  For example, the issue can be reproduced via below
steps.

    # cd /sys/module/damon_lru_sort/parameters
    #
    # # start DAMON_LRU_SORT
    # echo Y > enabled
    # ps -ef | grep kdamond
    root         806       2  0 17:53 ?        00:00:00 [kdamond.0]
    root         808     803  0 17:53 pts/4    00:00:00 grep kdamond
    #
    # # commit wrong input to stop kdamond withou explicit stop request
    # echo 3 > addr_unit
    # echo Y > commit_inputs
    bash: echo: write error: Invalid argument
    #
    # # confirm kdamond is stopped
    # ps -ef | grep kdamond
    root         811     803  0 17:53 pts/4    00:00:00 grep kdamond
    #
    # # users casn now show stable status
    # cat enabled
    Y
    # cat kdamond_pid
    806
    #
    # # even after fixing the wrong parameter,
    # # kdamond cannot be restarted.
    # echo 1 > addr_unit
    # echo Y > enabled
    # ps -ef | grep kdamond
    root         815     803  0 17:54 pts/4    00:00:00 grep kdamond

The problem will only rarely happen in real and common setups for the
following reasons.  The allocation failures are unlikely in such setups
since those allocations are arguably too small to fail.  Also sane users
on real production environments may not commit wrong input parameters.
But once it happens, the consequence is quite bad.  And the bug is a
bug.

The issue stems from the fact that there are multiple events that can
change the status, and following all the events is challenging.
Dynamically detect and use the fresh status for the parameters when
those are requested.

Fixes: 40e983cca927 ("mm/damon: introduce DAMON-based LRU-lists Sorting")
Cc: <stable@vger.kernel.org> # 6.0.x
Co-developed-by: Liew Rui Yan <aethernet65535@gmail.com>
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/lru_sort.c | 63 ++++++++++++++++++++++++++++++---------------
 1 file changed, 42 insertions(+), 21 deletions(-)

diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 554559d729760..d284cb336b662 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -161,15 +161,6 @@ module_param(monitor_region_end, ulong, 0600);
  */
 static unsigned long addr_unit __read_mostly = 1;
 
-/*
- * PID of the DAMON thread
- *
- * If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread.
- * Else, -1.
- */
-static int kdamond_pid __read_mostly = -1;
-module_param(kdamond_pid, int, 0400);
-
 static struct damos_stat damon_lru_sort_hot_stat;
 DEFINE_DAMON_MODULES_DAMOS_STATS_PARAMS(damon_lru_sort_hot_stat,
 		lru_sort_tried_hot_regions, lru_sorted_hot_regions,
@@ -386,12 +377,8 @@ static int damon_lru_sort_turn(bool on)
 {
 	int err;
 
-	if (!on) {
-		err = damon_stop(&ctx, 1);
-		if (!err)
-			kdamond_pid = -1;
-		return err;
-	}
+	if (!on)
+		return damon_stop(&ctx, 1);
 
 	err = damon_lru_sort_apply_parameters();
 	if (err)
@@ -400,9 +387,6 @@ static int damon_lru_sort_turn(bool on)
 	err = damon_start(&ctx, 1, true);
 	if (err)
 		return err;
-	kdamond_pid = damon_kdamond_pid(ctx);
-	if (kdamond_pid < 0)
-		return kdamond_pid;
 	return damon_call(ctx, &call_control);
 }
 
@@ -430,10 +414,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
 MODULE_PARM_DESC(addr_unit,
 	"Scale factor for DAMON_LRU_SORT to ops address conversion (default: 1)");
 
+static bool damon_lru_sort_enabled(void)
+{
+	if (!ctx)
+		return false;
+	return damon_is_running(ctx);
+}
+
 static int damon_lru_sort_enabled_store(const char *val,
 		const struct kernel_param *kp)
 {
-	bool is_enabled = enabled;
 	bool enable;
 	int err;
 
@@ -441,7 +431,7 @@ static int damon_lru_sort_enabled_store(const char *val,
 	if (err)
 		return err;
 
-	if (is_enabled == enable)
+	if (damon_lru_sort_enabled() == enable)
 		return 0;
 
 	/* Called before init function.  The function will handle this. */
@@ -457,15 +447,46 @@ static int damon_lru_sort_enabled_store(const char *val,
 	return err;
 }
 
+static int damon_lru_sort_enabled_load(char *buffer,
+		const struct kernel_param *kp)
+{
+	return sprintf(buffer, "%c\n", damon_lru_sort_enabled() ? 'Y' : 'N');
+}
+
 static const struct kernel_param_ops enabled_param_ops = {
 	.set = damon_lru_sort_enabled_store,
-	.get = param_get_bool,
+	.get = damon_lru_sort_enabled_load,
 };
 
 module_param_cb(enabled, &enabled_param_ops, &enabled, 0600);
 MODULE_PARM_DESC(enabled,
 	"Enable or disable DAMON_LRU_SORT (default: disabled)");
 
+static int damon_lru_sort_kdamond_pid_load(char *buffer,
+		const struct kernel_param *kp)
+{
+	int kdamond_pid = -1;
+
+	if (ctx) {
+		kdamond_pid = damon_kdamond_pid(ctx);
+		if (kdamond_pid < 0)
+			kdamond_pid = -1;
+	}
+	return sprintf(buffer, "%d\n", kdamond_pid);
+}
+
+static const struct kernel_param_ops kdamond_pid_param_ops = {
+	.get = damon_lru_sort_kdamond_pid_load,
+};
+
+/*
+ * PID of the DAMON thread
+ *
+ * If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread.
+ * Else, -1.
+ */
+module_param_cb(kdamond_pid, &kdamond_pid_param_ops, NULL, 0400);
+
 static int __init damon_lru_sort_init(void)
 {
 	int err;
-- 
2.47.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value
  2026-04-18  1:44 [RFC PATCH v2 0/3] mm/damon/modules: detect and use fresh status SeongJae Park
  2026-04-18  1:44 ` [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values SeongJae Park
  2026-04-18  1:44 ` [RFC PATCH v2 2/3] mm/damon/lru_sort: " SeongJae Park
@ 2026-04-18  1:44 ` SeongJae Park
  2026-04-18  5:54   ` sashiko-bot
  2 siblings, 1 reply; 10+ messages in thread
From: SeongJae Park @ 2026-04-18  1:44 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 17 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

DAMON_STAT updates 'enabled' parameter value, which represents the
running status of its kdamond, when the user explicitly requests
start/stop of the kdamond.  The kdamond can, however, be stopped even if
the user explicitly requested the stop, if ctx->regions_score_histogram
allocation failure at beginning of the execution of the kdamond.  Hence,
if the kdamond is stopped by the allocation failure, the value of the
parameter can be stale.

Users could show the stale value and be confused.  The problem will only
rarely happen in real and common setups because the allocation is
arguably too small to fail.  Also, unlike the similar bugs that are now
fixed in DAMON_RECLAIM and DAMON_LRU_SORT, kdamond can be restarted in
this case, because DAMON_STAT force-updates the enabled parameter value
for user inputs.  The bug is a bug, though.

The issue stems from the fact that there are multiple events that can
change the status, and following all the events is challenging.
Dynamically detect and use the fresh status for the parameters when
those are requested.

Fixes: 369c415e6073 ("mm/damon: introduce DAMON_STAT module")
Cc: <stable@vger.kernel.org> # 6.17.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 mm/damon/stat.c | 30 ++++++++++++++++++++----------
 1 file changed, 20 insertions(+), 10 deletions(-)

diff --git a/mm/damon/stat.c b/mm/damon/stat.c
index 99ba346f9e325..3951b762cbddf 100644
--- a/mm/damon/stat.c
+++ b/mm/damon/stat.c
@@ -19,14 +19,17 @@
 static int damon_stat_enabled_store(
 		const char *val, const struct kernel_param *kp);
 
+static int damon_stat_enabled_load(char *buffer,
+		const struct kernel_param *kp);
+
 static const struct kernel_param_ops enabled_param_ops = {
 	.set = damon_stat_enabled_store,
-	.get = param_get_bool,
+	.get = damon_stat_enabled_load,
 };
 
 static bool enabled __read_mostly = IS_ENABLED(
 	CONFIG_DAMON_STAT_ENABLED_DEFAULT);
-module_param_cb(enabled, &enabled_param_ops, &enabled, 0600);
+module_param_cb(enabled, &enabled_param_ops, NULL, 0600);
 MODULE_PARM_DESC(enabled, "Enable of disable DAMON_STAT");
 
 static unsigned long estimated_memory_bandwidth __read_mostly;
@@ -273,17 +276,23 @@ static void damon_stat_stop(void)
 	damon_stat_context = NULL;
 }
 
+static bool damon_stat_enabled(void)
+{
+	if (!damon_stat_context)
+		return false;
+	return damon_is_running(damon_stat_context);
+}
+
 static int damon_stat_enabled_store(
 		const char *val, const struct kernel_param *kp)
 {
-	bool is_enabled = enabled;
 	int err;
 
 	err = kstrtobool(val, &enabled);
 	if (err)
 		return err;
 
-	if (is_enabled == enabled)
+	if (damon_stat_enabled() == enabled)
 		return 0;
 
 	if (!damon_initialized())
@@ -293,16 +302,17 @@ static int damon_stat_enabled_store(
 		 */
 		return 0;
 
-	if (enabled) {
-		err = damon_stat_start();
-		if (err)
-			enabled = false;
-		return err;
-	}
+	if (enabled)
+		return damon_stat_start();
 	damon_stat_stop();
 	return 0;
 }
 
+static int damon_stat_enabled_load(char *buffer, const struct kernel_param *kp)
+{
+	return sprintf(buffer, "%c\n", damon_stat_enabled() ? 'Y' : 'N');
+}
+
 static int __init damon_stat_init(void)
 {
 	int err = 0;
-- 
2.47.3

^ permalink raw reply related	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
  2026-04-18  1:44 ` [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values SeongJae Park
@ 2026-04-18  4:57   ` sashiko-bot
  2026-04-18 15:14     ` SeongJae Park
  0 siblings, 1 reply; 10+ messages in thread
From: sashiko-bot @ 2026-04-18  4:57 UTC (permalink / raw)
  To: SeongJae Park; +Cc: damon

Sashiko AI review found 1 potential issue(s):
- [Medium] Command-line override for the 'enabled' parameter is broken: passing 'enabled=N' after 'enabled=Y' fails to disable the module.
--

commit 6f46639cc34c6ddcbecc0aba915988dff8f651de
Author: SeongJae Park <sj@kernel.org>

mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values

This commit updates DAMON_RECLAIM to dynamically query the running status
of kdamond instead of relying on cached parameter values. This ensures
that when kdamond stops unexpectedly, the enabled and kdamond_pid parameters
reflect the correct state.

> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 86da147786583..46660c8d4d422 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c

[ ... ]

> @@ -332,10 +316,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
>  MODULE_PARM_DESC(addr_unit,
>  	"Scale factor for DAMON_RECLAIM to ops address conversion (default: 1)");
>  
> +static bool damon_reclaim_enabled(void)
> +{
> +	if (!ctx)
> +		return false;
> +	return damon_is_running(ctx);
> +}
> +
>  static int damon_reclaim_enabled_store(const char *val,
>  		const struct kernel_param *kp)
>  {
> -	bool is_enabled = enabled;
>  	bool enable;
>  	int err;
>  
> @@ -343,7 +333,7 @@ static int damon_reclaim_enabled_store(const char *val,
>  	if (err)
>  		return err;
>  
> -	if (is_enabled == enable)
> +	if (damon_reclaim_enabled() == enable)
>  		return 0;

Does changing this check break kernel command-line overrides during boot?

During early boot when command-line parameters are parsed, ctx is not yet
allocated, so damon_reclaim_enabled() unconditionally returns false.

If a system has damon_reclaim.enabled=Y in the boot parameters and an
administrator appends damon_reclaim.enabled=N to override it, the first
evaluation correctly sets the internal enabled flag to true.

However, when the second parameter (N) is processed, enable is false.
Since damon_reclaim_enabled() also returns false, the condition matches
and the function returns 0 immediately. The internal enabled variable is
never updated to false, causing the module to start unexpectedly when
damon_reclaim_init() runs.

Should this keep checking the internal enabled variable during early boot?

>  
>  	/* Called before init function.  The function will handle this. */
> @@ -359,15 +349,46 @@ static int damon_reclaim_enabled_store(const char *val,

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260418014439.6353-1-sj@kernel.org?part=1

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 2/3] mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
  2026-04-18  1:44 ` [RFC PATCH v2 2/3] mm/damon/lru_sort: " SeongJae Park
@ 2026-04-18  5:30   ` sashiko-bot
  2026-04-18 15:36     ` SeongJae Park
  0 siblings, 1 reply; 10+ messages in thread
From: sashiko-bot @ 2026-04-18  5:30 UTC (permalink / raw)
  To: SeongJae Park; +Cc: damon

Sashiko AI review found 3 potential issue(s):
- [Critical] Missing `.set` callback in `kdamond_pid_param_ops` will cause a kernel panic if the parameter is passed at boot or via modprobe.
- [High] Loading the module with `enabled=Y` causes a NULL pointer dereference on `ctx`.
- [High] Concurrent calls to `damon_lru_sort_turn()` lack locking, leading to list corruption and kernel panics.
--

commit 905c8d84b8daa6d4ddb6fc214e81e98101608d53
Author: SeongJae Park <sj@kernel.org>

mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values

This commit updates DAMON_LRU_SORT to dynamically check the running status
of kdamond instead of relying on cached parameter values. This ensures the
status correctly reflects when kdamond stops due to events other than
explicit user requests.

> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d729760..d284cb336b662 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
[ ... ]
> @@ -386,12 +377,8 @@ static int damon_lru_sort_turn(bool on)
>  {
>  	int err;
>  
> -	if (!on) {
> -		err = damon_stop(&ctx, 1);
> -		if (!err)
> -			kdamond_pid = -1;
> -		return err;
> -	}
> +	if (!on)
> +		return damon_stop(&ctx, 1);

Could concurrent calls to this function lead to list corruption? 

If a user writes 'Y' to the enabled parameter via sysfs at the same
time damon_lru_sort_init() calls damon_lru_sort_turn(true) during module
initialization, damon_commit_ctx() might concurrently modify ctx->schemes
and ctx->targets without internal locking.

[ ... ]
> @@ -430,10 +414,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
>  MODULE_PARM_DESC(addr_unit,
>  	"Scale factor for DAMON_LRU_SORT to ops address conversion (default: 1)");
>  
> +static bool damon_lru_sort_enabled(void)
> +{
> +	if (!ctx)
> +		return false;
> +	return damon_is_running(ctx);
> +}
> +
>  static int damon_lru_sort_enabled_store(const char *val,
>  		const struct kernel_param *kp)
>  {
> -	bool is_enabled = enabled;
>  	bool enable;
>  	int err;

Is it possible to hit a NULL pointer dereference on ctx when loading the
module with enabled=Y? 

If the DAMON core is already initialized (e.g. built-in), damon_initialized()
would return true, bypassing the early exit in this function. This could
lead to calling damon_lru_sort_turn(true) before damon_lru_sort_init() has
run, causing damon_commit_ctx() to operate on a NULL ctx pointer.

[ ... ]
> +static const struct kernel_param_ops kdamond_pid_param_ops = {
> +	.get = damon_lru_sort_kdamond_pid_load,
> +};
> +
> +/*
> + * PID of the DAMON thread
> + *
> + * If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread.
> + * Else, -1.
> + */
> +module_param_cb(kdamond_pid, &kdamond_pid_param_ops, NULL, 0400);

Does this cause a NULL pointer dereference if the parameter is passed at
boot or via modprobe?

The kernel parameter parsing code in parse_one() doesn't check if .set is
NULL before invoking it:

    err = params[i].ops->set(val, &params[i]);

A dummy .set function returning an error might be needed to prevent a NULL
pointer dereference when parsing command line arguments.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260418014439.6353-1-sj@kernel.org?part=2

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value
  2026-04-18  1:44 ` [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value SeongJae Park
@ 2026-04-18  5:54   ` sashiko-bot
  2026-04-18 15:40     ` SeongJae Park
  0 siblings, 1 reply; 10+ messages in thread
From: sashiko-bot @ 2026-04-18  5:54 UTC (permalink / raw)
  To: SeongJae Park; +Cc: damon

Sashiko AI review found 1 potential issue(s):
- [Medium] Leaking damon_stat_context when explicitly disabling DAMON_STAT after an unexpected kdamond termination.
--

commit 4bf999258ec2f00a1ac02dbd77ac9f923e38bd2f
Author: SeongJae Park <sj@kernel.org>

mm/damon/stat: detect and use fresh enabled value

This commit modifies DAMON_STAT to dynamically check the kdamond status
when the enabled parameter is requested, rather than relying on a cached
parameter value. This ensures users do not see stale values if kdamond
stops unexpectedly.

> diff --git a/mm/damon/stat.c b/mm/damon/stat.c
> index 99ba346f9e325..3951b762cbddf 100644
> --- a/mm/damon/stat.c
> +++ b/mm/damon/stat.c
[ ... ]
> @@ -273,17 +276,23 @@ static void damon_stat_stop(void)
>  	damon_stat_context = NULL;
>  }
>  
> +static bool damon_stat_enabled(void)
> +{
> +	if (!damon_stat_context)
> +		return false;
> +	return damon_is_running(damon_stat_context);
> +}
> +
>  static int damon_stat_enabled_store(
>  		const char *val, const struct kernel_param *kp)
>  {
> -	bool is_enabled = enabled;
>  	int err;
>  
>  	err = kstrtobool(val, &enabled);
>  	if (err)
>  		return err;
>  
> -	if (is_enabled == enabled)
> +	if (damon_stat_enabled() == enabled)
>  		return 0;

Could this leak damon_stat_context if the kdamond thread stops unexpectedly?

If kdamond terminates due to an internal failure,
damon_is_running(damon_stat_context) returns false, making
damon_stat_enabled() also return false.

If a user then explicitly disables DAMON_STAT by writing 'N' to the enabled
parameter, both damon_stat_enabled() and enabled would be false. This causes
the function to return 0 early, bypassing the call to damon_stat_stop().

Since damon_stat_stop() is the function responsible for invoking
damon_destroy_ctx() and setting damon_stat_context to NULL, does this bypass
leave the context structure allocated?

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260418014439.6353-1-sj@kernel.org?part=3

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
  2026-04-18  4:57   ` sashiko-bot
@ 2026-04-18 15:14     ` SeongJae Park
  0 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2026-04-18 15:14 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, damon

On Sat, 18 Apr 2026 04:57:15 +0000 sashiko-bot@kernel.org wrote:

> Sashiko AI review found 1 potential issue(s):
> - [Medium] Command-line override for the 'enabled' parameter is broken: passing 'enabled=N' after 'enabled=Y' fails to disable the module.
> --
> 
> commit 6f46639cc34c6ddcbecc0aba915988dff8f651de
> Author: SeongJae Park <sj@kernel.org>
> 
> mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values
> 
> This commit updates DAMON_RECLAIM to dynamically query the running status
> of kdamond instead of relying on cached parameter values. This ensures
> that when kdamond stops unexpectedly, the enabled and kdamond_pid parameters
> reflect the correct state.
> 
> > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > index 86da147786583..46660c8d4d422 100644
> > --- a/mm/damon/reclaim.c
> > +++ b/mm/damon/reclaim.c
> 
> [ ... ]
> 
> > @@ -332,10 +316,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
> >  MODULE_PARM_DESC(addr_unit,
> >  	"Scale factor for DAMON_RECLAIM to ops address conversion (default: 1)");
> >  
> > +static bool damon_reclaim_enabled(void)
> > +{
> > +	if (!ctx)
> > +		return false;
> > +	return damon_is_running(ctx);
> > +}
> > +
> >  static int damon_reclaim_enabled_store(const char *val,
> >  		const struct kernel_param *kp)
> >  {
> > -	bool is_enabled = enabled;
> >  	bool enable;
> >  	int err;
> >  
> > @@ -343,7 +333,7 @@ static int damon_reclaim_enabled_store(const char *val,
> >  	if (err)
> >  		return err;
> >  
> > -	if (is_enabled == enable)
> > +	if (damon_reclaim_enabled() == enable)
> >  		return 0;
> 
> Does changing this check break kernel command-line overrides during boot?
> 
> During early boot when command-line parameters are parsed, ctx is not yet
> allocated, so damon_reclaim_enabled() unconditionally returns false.
> 
> If a system has damon_reclaim.enabled=Y in the boot parameters and an
> administrator appends damon_reclaim.enabled=N to override it, the first
> evaluation correctly sets the internal enabled flag to true.
> 
> However, when the second parameter (N) is processed, enable is false.
> Since damon_reclaim_enabled() also returns false, the condition matches
> and the function returns 0 immediately. The internal enabled variable is
> never updated to false, causing the module to start unexpectedly when
> damon_reclaim_init() runs.
> 
> Should this keep checking the internal enabled variable during early boot?

I don't think such override is common, but I also think there is no harm at
making the old behavior be kept.  I will address this like below, in the next
revision.

'''
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -381,27 +381,20 @@ static bool damon_reclaim_enabled(void)
 static int damon_reclaim_enabled_store(const char *val,
                const struct kernel_param *kp)
 {
-       bool enable;
        int err;

-       err = kstrtobool(val, &enable);
+       err = kstrtobool(val, &enabled);
        if (err)
                return err;

-       if (damon_reclaim_enabled() == enable)
+       if (damon_reclaim_enabled() == enabled)
                return 0;

        /* Called before init function.  The function will handle this. */
        if (!damon_initialized())
-               goto set_param_out;
-
-       err = damon_reclaim_turn(enable);
-       if (err)
-               return err;
+               return 0;

-set_param_out:
-       enabled = enable;
-       return err;
+       return damon_reclaim_turn(enabled);
 }

 static int damon_reclaim_enabled_load(char *buffer,
'''


Thanks,
SJ

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 2/3] mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
  2026-04-18  5:30   ` sashiko-bot
@ 2026-04-18 15:36     ` SeongJae Park
  0 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2026-04-18 15:36 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, damon

On Sat, 18 Apr 2026 05:30:59 +0000 sashiko-bot@kernel.org wrote:

> Sashiko AI review found 3 potential issue(s):
> - [Critical] Missing `.set` callback in `kdamond_pid_param_ops` will cause a kernel panic if the parameter is passed at boot or via modprobe.

I will fix this in the next revision.

> - [High] Loading the module with `enabled=Y` causes a NULL pointer dereference on `ctx`.
> - [High] Concurrent calls to `damon_lru_sort_turn()` lack locking, leading to list corruption and kernel panics.

Above two are valid points but not caused by this patch.  I will separately
work for those.

> --
> 
> commit 905c8d84b8daa6d4ddb6fc214e81e98101608d53
> Author: SeongJae Park <sj@kernel.org>
> 
> mm/damon/lru_sort: detect and use fresh enabled and kdamond_pid values
> 
> This commit updates DAMON_LRU_SORT to dynamically check the running status
> of kdamond instead of relying on cached parameter values. This ensures the
> status correctly reflects when kdamond stops due to events other than
> explicit user requests.
> 
> > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > index 554559d729760..d284cb336b662 100644
> > --- a/mm/damon/lru_sort.c
> > +++ b/mm/damon/lru_sort.c
> [ ... ]
> > @@ -386,12 +377,8 @@ static int damon_lru_sort_turn(bool on)
> >  {
> >  	int err;
> >  
> > -	if (!on) {
> > -		err = damon_stop(&ctx, 1);
> > -		if (!err)
> > -			kdamond_pid = -1;
> > -		return err;
> > -	}
> > +	if (!on)
> > +		return damon_stop(&ctx, 1);
> 
> Could concurrent calls to this function lead to list corruption? 
> 
> If a user writes 'Y' to the enabled parameter via sysfs at the same
> time damon_lru_sort_init() calls damon_lru_sort_turn(true) during module
> initialization, damon_commit_ctx() might concurrently modify ctx->schemes
> and ctx->targets without internal locking.

Hmm, good finding.  But the race existis even before this patch.  I will
separately work on this.

> 
> [ ... ]
> > @@ -430,10 +414,16 @@ module_param_cb(addr_unit, &addr_unit_param_ops, &addr_unit, 0600);
> >  MODULE_PARM_DESC(addr_unit,
> >  	"Scale factor for DAMON_LRU_SORT to ops address conversion (default: 1)");
> >  
> > +static bool damon_lru_sort_enabled(void)
> > +{
> > +	if (!ctx)
> > +		return false;
> > +	return damon_is_running(ctx);
> > +}
> > +
> >  static int damon_lru_sort_enabled_store(const char *val,
> >  		const struct kernel_param *kp)
> >  {
> > -	bool is_enabled = enabled;
> >  	bool enable;
> >  	int err;
> 
> Is it possible to hit a NULL pointer dereference on ctx when loading the
> module with enabled=Y? 
> 
> If the DAMON core is already initialized (e.g. built-in), damon_initialized()
> would return true, bypassing the early exit in this function. This could
> lead to calling damon_lru_sort_turn(true) before damon_lru_sort_init() has
> run, causing damon_commit_ctx() to operate on a NULL ctx pointer.

Hmm, good finding.  This is a race of a type that is same to the above one.
And again the race exists even before this patch.  I will separately work on
this.

> 
> [ ... ]
> > +static const struct kernel_param_ops kdamond_pid_param_ops = {
> > +	.get = damon_lru_sort_kdamond_pid_load,
> > +};
> > +
> > +/*
> > + * PID of the DAMON thread
> > + *
> > + * If DAMON_LRU_SORT is enabled, this becomes the PID of the worker thread.
> > + * Else, -1.
> > + */
> > +module_param_cb(kdamond_pid, &kdamond_pid_param_ops, NULL, 0400);
> 
> Does this cause a NULL pointer dereference if the parameter is passed at
> boot or via modprobe?
> 
> The kernel parameter parsing code in parse_one() doesn't check if .set is
> NULL before invoking it:
> 
>     err = params[i].ops->set(val, &params[i]);
> 
> A dummy .set function returning an error might be needed to prevent a NULL
> pointer dereference when parsing command line arguments.

Good point.  I will fix this in the next revision like below.

'''
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -412,6 +412,16 @@ module_param_cb(enabled, &enabled_param_ops, &enabled, 0600);
 MODULE_PARM_DESC(enabled,
        "Enable or disable DAMON_RECLAIM (default: disabled)");

+static int damon_reclaim_kdamond_pid_store(const char *val,
+               const struct kernel_param *kp)
+{
+       /*
+        * kdamond_pid is read-only, but kernel command line could write it.
+        * Do nothing here.
+        */
+       return 0;
+}
+
 static int damon_reclaim_kdamond_pid_load(char *buffer,
                const struct kernel_param *kp)
 {
@@ -426,6 +436,7 @@ static int damon_reclaim_kdamond_pid_load(char *buffer,
 }

 static const struct kernel_param_ops kdamond_pid_param_ops = {
+       .set = damon_reclaim_kdamond_pid_store,
        .get = damon_reclaim_kdamond_pid_load,
 };
'''


Thanks,
SJ

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

* Re: [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value
  2026-04-18  5:54   ` sashiko-bot
@ 2026-04-18 15:40     ` SeongJae Park
  0 siblings, 0 replies; 10+ messages in thread
From: SeongJae Park @ 2026-04-18 15:40 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, damon

On Sat, 18 Apr 2026 05:54:16 +0000 sashiko-bot@kernel.org wrote:

> Sashiko AI review found 1 potential issue(s):
> - [Medium] Leaking damon_stat_context when explicitly disabling DAMON_STAT after an unexpected kdamond termination.

This is wrong.  I replied the detail [1] in the previous version.

[1] https://lore.kernel.org/20260417000308.58975-1-sj@kernel.org


Thanks,
SJ

[...]

^ permalink raw reply	[flat|nested] 10+ messages in thread

end of thread, other threads:[~2026-04-18 15:40 UTC | newest]

Thread overview: 10+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-18  1:44 [RFC PATCH v2 0/3] mm/damon/modules: detect and use fresh status SeongJae Park
2026-04-18  1:44 ` [RFC PATCH v2 1/3] mm/damon/reclaim: detect and use fresh enabled and kdamond_pid values SeongJae Park
2026-04-18  4:57   ` sashiko-bot
2026-04-18 15:14     ` SeongJae Park
2026-04-18  1:44 ` [RFC PATCH v2 2/3] mm/damon/lru_sort: " SeongJae Park
2026-04-18  5:30   ` sashiko-bot
2026-04-18 15:36     ` SeongJae Park
2026-04-18  1:44 ` [RFC PATCH v2 3/3] mm/damon/stat: detect and use fresh enabled value SeongJae Park
2026-04-18  5:54   ` sashiko-bot
2026-04-18 15:40     ` SeongJae Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox