* [RFC v5] mm/damon: add synchronous commit for commit_inputs
@ 2026-03-25 1:39 Liew Rui Yan
2026-03-25 2:53 ` (sashiko review) " SeongJae Park
2026-03-25 14:29 ` SeongJae Park
0 siblings, 2 replies; 7+ messages in thread
From: Liew Rui Yan @ 2026-03-25 1:39 UTC (permalink / raw)
To: sj; +Cc: damon, linux-mm, Liew Rui Yan
Problem
=======
Writing invalid parameters to sysfs followed by 'commit_inputs=Y' fails
silently (no error returned to shell), because the validation happens
asynchronously in the kdamond.
Solution
========
To fix this, the commit_inputs_store() callback now uses damon_call() to
synchronously commit parameters in the kdamond thread's safe context.
This ensures that validation errors are returned immediately to
userspace, following the pattern used by DAMON_SYSFS.
Changes
=======
1. Added commit_inputs_store() and commit_inputs_fn() to commit
synchronously.
2. Removed handle_commit_inputs().
This change is motivated from another discussion [1].
[1] https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
---
Changes from RFC-v4:
- Rename the 'yes' variable in commit_inputs_store() to the more
understandable 'commit_inputs_request'.
- Return -EBUSY instead of -EINVAL when 'commit_inputs' is triggered
while kdamond is not running.
- Link to RFC-v4: https://lore.kernel.org/20260323021648.6590-1-aethernet65535@gmail.com
Changes from RFC-v3:
- Added checks for 'ctx' and 'damon_is_running()' to prevent NULL
pointer dereference during early boot. (Found by Sashiko.dev)
- Removed handle_commit_inputs() and its associated polling logic as
they have become dead code after moving to the synchronous damon_call()
approach.
- Ensure the 'commit_inputs' is properly updated.
Link to RFC-v3: https://lore.kernel.org/20260322231522.32700-1-aethernet65535@gmail.com
Changes from RFC-v2:
- Removed damon_validate_attrs(), now using damon_commit_ctx() for
synchronous validation in the kdamond context.
- Following DAMON_SYSFS pattern for synchronous commit via damon_call().
- Link to RFC-v2: https://lore.kernel.org/20260321140926.22163-1-aethernet65535@gmail.com
Changes from RFC-v1:
- Remove question from commit message area.
- Added synchronous validation for DAMON_RECLAIM.
- Rename damon_valid_attrs() -> damon_validate_attrs().
- Exported a new function damon_validate_attrs() and declared it in
damon.h.
- Link to RFC-v1: https://lore.kernel.org/20260321002642.22712-1-aethernet65535@gmail.com
mm/damon/lru_sort.c | 54 +++++++++++++++++++++++++++++++++++++++------
mm/damon/reclaim.c | 54 +++++++++++++++++++++++++++++++++++++++------
2 files changed, 94 insertions(+), 14 deletions(-)
diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 554559d72976..a2410f648b51 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -39,7 +39,6 @@ static bool enabled __read_mostly;
* the re-reading, DAMON_LRU_SORT will be disabled.
*/
static bool commit_inputs __read_mostly;
-module_param(commit_inputs, bool, 0600);
/*
* Desired active to [in]active memory ratio in bp (1/10,000).
@@ -349,18 +348,59 @@ static int damon_lru_sort_apply_parameters(void)
return err;
}
-static int damon_lru_sort_handle_commit_inputs(void)
+static int damon_lru_sort_commit_inputs_fn(void *arg)
{
+ return damon_lru_sort_apply_parameters();
+}
+
+static int damon_lru_sort_commit_inputs_store(const char *val,
+ const struct kernel_param *kp)
+{
+ bool commit_inputs_request;
int err;
+ struct damon_call_control control = {
+ .fn = damon_lru_sort_commit_inputs_fn,
+ .data = ctx,
+ .repeat = false,
+ };
- if (!commit_inputs)
+ err = kstrtobool(val, &commit_inputs_request);
+ if (err)
+ return err;
+
+ if (!commit_inputs_request)
return 0;
- err = damon_lru_sort_apply_parameters();
- commit_inputs = false;
- return err;
+ /*
+ * Skip damon_call() during early boot or when kdamond is
+ * not running to avoid NULL pointer dereference.
+ */
+ if (!ctx)
+ return -EBUSY;
+
+ err = damon_call(ctx, &control);
+ if (err) {
+ /*
+ * If kdamond is not running, damon_call() returns -EINVAL.
+ * Remap it to -EBUSY to better reflect that the operation
+ * failed due to the current state of the system.
+ */
+ if (err == -EINVAL && !damon_is_running(ctx))
+ return -EBUSY;
+
+ return err;
+ }
+
+ return control.return_code;
}
+static const struct kernel_param_ops commit_inputs_param_ops = {
+ .set = damon_lru_sort_commit_inputs_store,
+ .get = param_get_bool,
+};
+
+module_param_cb(commit_inputs, &commit_inputs_param_ops, &commit_inputs, 0600);
+
static int damon_lru_sort_damon_call_fn(void *arg)
{
struct damon_ctx *c = arg;
@@ -374,7 +414,7 @@ static int damon_lru_sort_damon_call_fn(void *arg)
damon_lru_sort_cold_stat = s->stat;
}
- return damon_lru_sort_handle_commit_inputs();
+ return 0;
}
static struct damon_call_control call_control = {
diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 86da14778658..4cbd67e08d56 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -39,7 +39,6 @@ static bool enabled __read_mostly;
* re-reading, DAMON_RECLAIM will be disabled.
*/
static bool commit_inputs __read_mostly;
-module_param(commit_inputs, bool, 0600);
/*
* Time threshold for cold memory regions identification in microseconds.
@@ -255,18 +254,59 @@ static int damon_reclaim_apply_parameters(void)
return err;
}
-static int damon_reclaim_handle_commit_inputs(void)
+static int damon_reclaim_commit_inputs_fn(void *arg)
{
+ return damon_reclaim_apply_parameters();
+}
+
+static int damon_reclaim_commit_inputs_store(const char *val,
+ const struct kernel_param *kp)
+{
+ bool commit_inputs_request;
int err;
+ struct damon_call_control control = {
+ .fn = damon_reclaim_commit_inputs_fn,
+ .data = ctx,
+ .repeat = false,
+ };
- if (!commit_inputs)
+ err = kstrtobool(val, &commit_inputs_request);
+ if (err)
+ return err;
+
+ if (!commit_inputs_request)
return 0;
- err = damon_reclaim_apply_parameters();
- commit_inputs = false;
- return err;
+ /*
+ * Skip damon_call() during early boot or when kdamond is
+ * not running to avoid NULL pointer dereference.
+ */
+ if (!ctx)
+ return -EBUSY;
+
+ err = damon_call(ctx, &control);
+ if (err) {
+ /*
+ * If kdamond is not running, damon_call() returns -EINVAL.
+ * Remap it to -EBUSY to better reflect that the operation
+ * failed due to the current state of the system.
+ */
+ if (err == -EINVAL && !damon_is_running(ctx))
+ return -EBUSY;
+
+ return err;
+ }
+
+ return control.return_code;
}
+static const struct kernel_param_ops commit_inputs_param_ops = {
+ .set = damon_reclaim_commit_inputs_store,
+ .get = param_get_bool,
+};
+
+module_param_cb(commit_inputs, &commit_inputs_param_ops, &commit_inputs, 0600);
+
static int damon_reclaim_damon_call_fn(void *arg)
{
struct damon_ctx *c = arg;
@@ -276,7 +316,7 @@ static int damon_reclaim_damon_call_fn(void *arg)
damon_for_each_scheme(s, c)
damon_reclaim_stat = s->stat;
- return damon_reclaim_handle_commit_inputs();
+ return 0;
}
static struct damon_call_control call_control = {
--
2.53.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: (sashiko review) [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 1:39 [RFC v5] mm/damon: add synchronous commit for commit_inputs Liew Rui Yan
@ 2026-03-25 2:53 ` SeongJae Park
2026-03-25 7:17 ` Liew Rui Yan
2026-03-25 14:29 ` SeongJae Park
1 sibling, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-03-25 2:53 UTC (permalink / raw)
To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm
Forwarding Sashiko review for doing discussions via mails.
# review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
# start of sashiko.dev inline review
commit 60ccea4154b0c58741fae2323454a5a9496b67fa
Author: Liew Rui Yan <aethernet65535@gmail.com>
mm/damon: add synchronous commit for commit_inputs
Writing invalid parameters to sysfs followed by commit_inputs=Y previously
failed silently. This patch uses damon_call() to synchronously commit
parameters in the kdamond thread's safe context, returning validation errors
immediately to userspace.
Link: https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d72976..a2410f648b51 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
[ ... ]
> +static int damon_lru_sort_commit_inputs_store(const char *val,
> + const struct kernel_param *kp)
> +{
> + bool commit_inputs_request;
> int err;
> + struct damon_call_control control = {
> + .fn = damon_lru_sort_commit_inputs_fn,
> + .data = ctx,
> + .repeat = false,
> + };
>
> - if (!commit_inputs)
> + err = kstrtobool(val, &commit_inputs_request);
> + if (err)
> + return err;
> +
> + if (!commit_inputs_request)
> return 0;
>
> - err = damon_lru_sort_apply_parameters();
> - commit_inputs = false;
> - return err;
> + /*
> + * Skip damon_call() during early boot or when kdamond is
> + * not running to avoid NULL pointer dereference.
> + */
> + if (!ctx)
> + return -EBUSY;
> +
> + err = damon_call(ctx, &control);
Can this sequence lead to a system-wide deadlock on kernel_param_lock?
Because damon_lru_sort_commit_inputs_store() is a sysfs .set callback, it
executes with the global kernel_param_lock held.
If kdamond_fn terminates (e.g., due to an invalid parameter like addr_unit=3
causing a non-power-of-2 min_region_sz), it processes its exit sequence by
calling kdamond_call(ctx, true) to flush pending requests. After flushing, it
eventually sets ctx->kdamond = NULL at the very end of the thread.
If a user writes to commit_inputs exactly in the window after
kdamond_call() finishes but before ctx->kdamond = NULL is set:
Thread 1 (kdamond_fn):
kdamond_call(ctx, true);
Thread 2 (sysfs write):
damon_call(ctx, &control);
if (damon_is_running(ctx))
wait_for_completion(&control.completion);
Thread 1 (kdamond_fn):
ctx->kdamond = NULL;
Thread 2 will sleep indefinitely in wait_for_completion(). Because Thread 2
holds the global kernel_param_lock, this permanently freezes all module
parameter sysfs interfaces system-wide.
> + if (err) {
> + /*
> + * If kdamond is not running, damon_call() returns -EINVAL.
> + * Remap it to -EBUSY to better reflect that the operation
> + * failed due to the current state of the system.
> + */
> + if (err == -EINVAL && !damon_is_running(ctx))
> + return -EBUSY;
> +
> + return err;
> + }
> +
> + return control.return_code;
> }
> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 86da14778658..4cbd67e08d56 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c
[ ... ]
> + /*
> + * Skip damon_call() during early boot or when kdamond is
> + * not running to avoid NULL pointer dereference.
> + */
> + if (!ctx)
> + return -EBUSY;
> +
> + err = damon_call(ctx, &control);
Does this code share the same kernel_param_lock deadlock potential as
damon_lru_sort_commit_inputs_store() described above?
# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
#
# hkml [1] generated a draft of this mail. It can be regenerated
# using below command:
#
# hkml patch sashiko_dev --for_forwarding \
# 20260325013939.18167-1-aethernet65535@gmail.com
#
# [1] https://github.com/sjp38/hackermail
Sent using hkml (https://github.com/sjp38/hackermail)
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: (sashiko review) [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 2:53 ` (sashiko review) " SeongJae Park
@ 2026-03-25 7:17 ` Liew Rui Yan
2026-03-25 14:19 ` SeongJae Park
0 siblings, 1 reply; 7+ messages in thread
From: Liew Rui Yan @ 2026-03-25 7:17 UTC (permalink / raw)
To: sj; +Cc: aethernet65535, damon, linux-mm
Hi SeongJae,
> Forwarding Sashiko review for doing discussions via mails.
>
> # review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
> # start of sashiko.dev inline review
> commit 60ccea4154b0c58741fae2323454a5a9496b67fa
> Author: Liew Rui Yan <aethernet65535@gmail.com>
>
> mm/damon: add synchronous commit for commit_inputs
>
> Writing invalid parameters to sysfs followed by commit_inputs=Y previously
> failed silently. This patch uses damon_call() to synchronously commit
> parameters in the kdamond thread's safe context, returning validation errors
> immediately to userspace.
>
> Link: https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
>
> > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > index 554559d72976..a2410f648b51 100644
> > --- a/mm/damon/lru_sort.c
> > +++ b/mm/damon/lru_sort.c
Thank you for forwarding the review from Sashiko.dev!
> > +static int damon_lru_sort_commit_inputs_store(const char *val,
> > + const struct kernel_param *kp)
> > +{
> > + bool commit_inputs_request;
> > int err;
> > + struct damon_call_control control = {
> > + .fn = damon_lru_sort_commit_inputs_fn,
> > + .data = ctx,
> > + .repeat = false,
> > + };
> >
> > - if (!commit_inputs)
> > + err = kstrtobool(val, &commit_inputs_request);
> > + if (err)
> > + return err;
> > +
> > + if (!commit_inputs_request)
> > return 0;
> >
> > - err = damon_lru_sort_apply_parameters();
> > - commit_inputs = false;
> > - return err;
> > + /*
> > + * Skip damon_call() during early boot or when kdamond is
> > + * not running to avoid NULL pointer dereference.
> > + */
> > + if (!ctx)
> > + return -EBUSY;
> > +
> > + err = damon_call(ctx, &control);
>
> Can this sequence lead to a system-wide deadlock on kernel_param_lock?
>
> Because damon_lru_sort_commit_inputs_store() is a sysfs .set callback, it
> executes with the global kernel_param_lock held.
>
> If kdamond_fn terminates (e.g., due to an invalid parameter like addr_unit=3
> causing a non-power-of-2 min_region_sz), it processes its exit sequence by
> calling kdamond_call(ctx, true) to flush pending requests. After flushing, it
> eventually sets ctx->kdamond = NULL at the very end of the thread.
>
> If a user writes to commit_inputs exactly in the window after
> kdamond_call() finishes but before ctx->kdamond = NULL is set:
>
> Thread 1 (kdamond_fn):
> kdamond_call(ctx, true);
>
> Thread 2 (sysfs write):
> damon_call(ctx, &control);
> if (damon_is_running(ctx))
> wait_for_completion(&control.completion);
>
> Thread 1 (kdamond_fn):
> ctx->kdamond = NULL;
>
> Thread 2 will sleep indefinitely in wait_for_completion(). Because Thread 2
> holds the global kernel_param_lock, this permanently freezes all module
> parameter sysfs interfaces system-wide.
I have verified this behavior with the following test case:
# cd /sys/module/damon_lru_sort/parameters/
# echo Y > enabled
# ps aux | grep "[k]damond"
root 70 0.0 0.0 0 0 ? I 12:16 0:00 [kdamond.0]
# echo 3 > addr_unit
# echo Y > commit_inputs
bash: echo: write error: Invalid argument
# ps aux | grep "[k]damond"
... kdamond has exited unexpectedly
I will add a patch in v6 to validate 'addr_unit' in addr_unit_store() to
reject non-power-of-2 inputs immediately.
> > + if (err) {
> > + /*
> > + * If kdamond is not running, damon_call() returns -EINVAL.
> > + * Remap it to -EBUSY to better reflect that the operation
> > + * failed due to the current state of the system.
> > + */
> > + if (err == -EINVAL && !damon_is_running(ctx))
> > + return -EBUSY;
> > +
> > + return err;
> > + }
> > +
> > + return control.return_code;
> > }
>
> > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > index 86da14778658..4cbd67e08d56 100644
> > --- a/mm/damon/reclaim.c
> > +++ b/mm/damon/reclaim.c
>
> [ ... ]
>
> > + /*
> > + * Skip damon_call() during early boot or when kdamond is
> > + * not running to avoid NULL pointer dereference.
> > + */
> > + if (!ctx)
> > + return -EBUSY;
> > +
> > + err = damon_call(ctx, &control);
>
> Does this code share the same kernel_param_lock deadlock potential as
> damon_lru_sort_commit_inputs_store() described above?
As we discussed in the RFC-v4 thread [1], this is a false positive.
Since 'enabled=N' and 'commit_inputs=Y' are both serialized by the
global 'kernel_param_lock', and kthreads cannot be forcibly terminated
by userspace signals, the completion signal in damon_call() will always
be reached in a controlled manner.
[1] https://lore.kernel.org/20260323021648.6590-1-aethernet65535@gmail.com
Best regards,
Rui Yan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: (sashiko review) [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 7:17 ` Liew Rui Yan
@ 2026-03-25 14:19 ` SeongJae Park
2026-03-26 6:15 ` Liew Rui Yan
0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-03-25 14:19 UTC (permalink / raw)
To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm
Hi Liew,
On Wed, 25 Mar 2026 15:17:09 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
> Hi SeongJae,
>
> > Forwarding Sashiko review for doing discussions via mails.
> >
> > # review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
> > # start of sashiko.dev inline review
> > commit 60ccea4154b0c58741fae2323454a5a9496b67fa
> > Author: Liew Rui Yan <aethernet65535@gmail.com>
> >
> > mm/damon: add synchronous commit for commit_inputs
> >
> > Writing invalid parameters to sysfs followed by commit_inputs=Y previously
> > failed silently. This patch uses damon_call() to synchronously commit
> > parameters in the kdamond thread's safe context, returning validation errors
> > immediately to userspace.
> >
> > Link: https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
> >
> > > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > > index 554559d72976..a2410f648b51 100644
> > > --- a/mm/damon/lru_sort.c
> > > +++ b/mm/damon/lru_sort.c
>
> Thank you for forwarding the review from Sashiko.dev!
Thank you for reviewing the review.
>
> > > +static int damon_lru_sort_commit_inputs_store(const char *val,
> > > + const struct kernel_param *kp)
> > > +{
> > > + bool commit_inputs_request;
> > > int err;
> > > + struct damon_call_control control = {
> > > + .fn = damon_lru_sort_commit_inputs_fn,
> > > + .data = ctx,
> > > + .repeat = false,
> > > + };
> > >
> > > - if (!commit_inputs)
> > > + err = kstrtobool(val, &commit_inputs_request);
> > > + if (err)
> > > + return err;
> > > +
> > > + if (!commit_inputs_request)
> > > return 0;
> > >
> > > - err = damon_lru_sort_apply_parameters();
> > > - commit_inputs = false;
> > > - return err;
> > > + /*
> > > + * Skip damon_call() during early boot or when kdamond is
> > > + * not running to avoid NULL pointer dereference.
> > > + */
> > > + if (!ctx)
> > > + return -EBUSY;
> > > +
> > > + err = damon_call(ctx, &control);
> >
> > Can this sequence lead to a system-wide deadlock on kernel_param_lock?
> >
> > Because damon_lru_sort_commit_inputs_store() is a sysfs .set callback, it
> > executes with the global kernel_param_lock held.
> >
> > If kdamond_fn terminates (e.g., due to an invalid parameter like addr_unit=3
> > causing a non-power-of-2 min_region_sz), it processes its exit sequence by
> > calling kdamond_call(ctx, true) to flush pending requests. After flushing, it
> > eventually sets ctx->kdamond = NULL at the very end of the thread.
> >
> > If a user writes to commit_inputs exactly in the window after
> > kdamond_call() finishes but before ctx->kdamond = NULL is set:
> >
> > Thread 1 (kdamond_fn):
> > kdamond_call(ctx, true);
> >
> > Thread 2 (sysfs write):
> > damon_call(ctx, &control);
> > if (damon_is_running(ctx))
> > wait_for_completion(&control.completion);
> >
> > Thread 1 (kdamond_fn):
> > ctx->kdamond = NULL;
> >
> > Thread 2 will sleep indefinitely in wait_for_completion(). Because Thread 2
> > holds the global kernel_param_lock, this permanently freezes all module
> > parameter sysfs interfaces system-wide.
>
> I have verified this behavior with the following test case:
>
> # cd /sys/module/damon_lru_sort/parameters/
> # echo Y > enabled
> # ps aux | grep "[k]damond"
> root 70 0.0 0.0 0 0 ? I 12:16 0:00 [kdamond.0]
FYI, you can read kdamond_pid parameter.
> # echo 3 > addr_unit
> # echo Y > commit_inputs
> bash: echo: write error: Invalid argument
> # ps aux | grep "[k]damond"
>
> ... kdamond has exited unexpectedly
>
> I will add a patch in v6 to validate 'addr_unit' in addr_unit_store() to
> reject non-power-of-2 inputs immediately.
That's because kdamond_fn() has recently changed to exit the loop if
damon_commit_ctx() fails in the loop for any reason. Refer to commit
26f775a054c3 ("mm/damon/core: avoid use of half-online-committed context") for
more details.
Adding the validation in addr_unit_store() would work, but it could be a kind
of whack-a-mole game. I'd prefer doing the validation of all inputs in one
place, just before the last damon_commit_ctx() call. DAMON_SYSFS is doing so,
by making a test context and try committing user input to the test context
before doing that to the running context. How about doing so here, too?
>
> > > + if (err) {
> > > + /*
> > > + * If kdamond is not running, damon_call() returns -EINVAL.
> > > + * Remap it to -EBUSY to better reflect that the operation
> > > + * failed due to the current state of the system.
> > > + */
> > > + if (err == -EINVAL && !damon_is_running(ctx))
> > > + return -EBUSY;
> > > +
> > > + return err;
> > > + }
> > > +
> > > + return control.return_code;
> > > }
> >
> > > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > > index 86da14778658..4cbd67e08d56 100644
> > > --- a/mm/damon/reclaim.c
> > > +++ b/mm/damon/reclaim.c
> >
> > [ ... ]
> >
> > > + /*
> > > + * Skip damon_call() during early boot or when kdamond is
> > > + * not running to avoid NULL pointer dereference.
> > > + */
> > > + if (!ctx)
> > > + return -EBUSY;
> > > +
> > > + err = damon_call(ctx, &control);
> >
> > Does this code share the same kernel_param_lock deadlock potential as
> > damon_lru_sort_commit_inputs_store() described above?
>
> As we discussed in the RFC-v4 thread [1], this is a false positive.
> Since 'enabled=N' and 'commit_inputs=Y' are both serialized by the
> global 'kernel_param_lock', and kthreads cannot be forcibly terminated
> by userspace signals, the completion signal in damon_call() will always
> be reached in a controlled manner.
TL; DR: I think the deadlock can happen. But that's a separate issue.
Let's suppose the user writes 'Y' to commit_inputs, and damon_commit_ctx()
fails due to an allocation failure. The write to commit_inputs will complete,
and kdmond_fn() will start its exit sequence.
The user can write 'Y' to commit_inputs again, when kdamond_fn() finished the
last kdamond_call() with 'cancel=True', but before it unset the ->kdamond
field.
Then, the deadlock that sashiko pointed out could happen.
The deadlock is due to the incomplete design of damon_call() cleanup. And I
think the same deadlock is already available with DAMON_SYSFS. I will work on
fixing this.
So the issue already exists. But let's ensure the fix is merged before this
patch, since this patch adds another exploitable path that can consequence in
whole param_lock deadlock.
>
> [1] https://lore.kernel.org/20260323021648.6590-1-aethernet65535@gmail.com
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 1:39 [RFC v5] mm/damon: add synchronous commit for commit_inputs Liew Rui Yan
2026-03-25 2:53 ` (sashiko review) " SeongJae Park
@ 2026-03-25 14:29 ` SeongJae Park
2026-03-26 6:16 ` Liew Rui Yan
1 sibling, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-03-25 14:29 UTC (permalink / raw)
To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm
On Wed, 25 Mar 2026 09:39:39 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
> Problem
> =======
> Writing invalid parameters to sysfs followed by 'commit_inputs=Y' fails
> silently (no error returned to shell), because the validation happens
> asynchronously in the kdamond.
>
> Solution
> ========
> To fix this, the commit_inputs_store() callback now uses damon_call() to
> synchronously commit parameters in the kdamond thread's safe context.
> This ensures that validation errors are returned immediately to
> userspace, following the pattern used by DAMON_SYSFS.
>
> Changes
> =======
> 1. Added commit_inputs_store() and commit_inputs_fn() to commit
> synchronously.
> 2. Removed handle_commit_inputs().
>
> This change is motivated from another discussion [1].
>
> [1] https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
>
> Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
> ---
> Changes from RFC-v4:
> - Rename the 'yes' variable in commit_inputs_store() to the more
> understandable 'commit_inputs_request'.
> - Return -EBUSY instead of -EINVAL when 'commit_inputs' is triggered
> while kdamond is not running.
> - Link to RFC-v4: https://lore.kernel.org/20260323021648.6590-1-aethernet65535@gmail.com
>
> Changes from RFC-v3:
> - Added checks for 'ctx' and 'damon_is_running()' to prevent NULL
> pointer dereference during early boot. (Found by Sashiko.dev)
> - Removed handle_commit_inputs() and its associated polling logic as
> they have become dead code after moving to the synchronous damon_call()
> approach.
> - Ensure the 'commit_inputs' is properly updated.
> Link to RFC-v3: https://lore.kernel.org/20260322231522.32700-1-aethernet65535@gmail.com
>
> Changes from RFC-v2:
> - Removed damon_validate_attrs(), now using damon_commit_ctx() for
> synchronous validation in the kdamond context.
> - Following DAMON_SYSFS pattern for synchronous commit via damon_call().
> - Link to RFC-v2: https://lore.kernel.org/20260321140926.22163-1-aethernet65535@gmail.com
>
> Changes from RFC-v1:
> - Remove question from commit message area.
> - Added synchronous validation for DAMON_RECLAIM.
> - Rename damon_valid_attrs() -> damon_validate_attrs().
> - Exported a new function damon_validate_attrs() and declared it in
> damon.h.
> - Link to RFC-v1: https://lore.kernel.org/20260321002642.22712-1-aethernet65535@gmail.com
>
> mm/damon/lru_sort.c | 54 +++++++++++++++++++++++++++++++++++++++------
> mm/damon/reclaim.c | 54 +++++++++++++++++++++++++++++++++++++++------
> 2 files changed, 94 insertions(+), 14 deletions(-)
>
> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d72976..a2410f648b51 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
> @@ -39,7 +39,6 @@ static bool enabled __read_mostly;
> * the re-reading, DAMON_LRU_SORT will be disabled.
> */
> static bool commit_inputs __read_mostly;
> -module_param(commit_inputs, bool, 0600);
>
> /*
> * Desired active to [in]active memory ratio in bp (1/10,000).
> @@ -349,18 +348,59 @@ static int damon_lru_sort_apply_parameters(void)
> return err;
> }
>
> -static int damon_lru_sort_handle_commit_inputs(void)
> +static int damon_lru_sort_commit_inputs_fn(void *arg)
> {
> + return damon_lru_sort_apply_parameters();
> +}
> +
> +static int damon_lru_sort_commit_inputs_store(const char *val,
> + const struct kernel_param *kp)
> +{
> + bool commit_inputs_request;
> int err;
> + struct damon_call_control control = {
> + .fn = damon_lru_sort_commit_inputs_fn,
> + .data = ctx,
You don't really use the data in the callback function. Why don't you just
unset it?
> + .repeat = false,
> + };
>
> - if (!commit_inputs)
> + err = kstrtobool(val, &commit_inputs_request);
> + if (err)
> + return err;
> +
> + if (!commit_inputs_request)
> return 0;
>
> - err = damon_lru_sort_apply_parameters();
> - commit_inputs = false;
> - return err;
> + /*
> + * Skip damon_call() during early boot or when kdamond is
> + * not running to avoid NULL pointer dereference.
> + */
> + if (!ctx)
> + return -EBUSY;
> +
> + err = damon_call(ctx, &control);
> + if (err) {
> + /*
> + * If kdamond is not running, damon_call() returns -EINVAL.
> + * Remap it to -EBUSY to better reflect that the operation
> + * failed due to the current state of the system.
> + */
> + if (err == -EINVAL && !damon_is_running(ctx))
> + return -EBUSY;
In my opinion, EINVAL is a better return value. It was failed not because
something is busy but just because DAMON is not running. Committing something
to DAMON while DAMON is not running seems invalid operation to me.
> +
> + return err;
> + }
> +
> + return control.return_code;
> }
>
> +static const struct kernel_param_ops commit_inputs_param_ops = {
> + .set = damon_lru_sort_commit_inputs_store,
> + .get = param_get_bool,
> +};
> +
> +module_param_cb(commit_inputs, &commit_inputs_param_ops, &commit_inputs, 0600);
> +
> static int damon_lru_sort_damon_call_fn(void *arg)
> {
> struct damon_ctx *c = arg;
> @@ -374,7 +414,7 @@ static int damon_lru_sort_damon_call_fn(void *arg)
> damon_lru_sort_cold_stat = s->stat;
> }
>
> - return damon_lru_sort_handle_commit_inputs();
> + return 0;
> }
>
> static struct damon_call_control call_control = {
> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 86da14778658..4cbd67e08d56 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c
Comments for lru_sort.c apply here, too.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: (sashiko review) [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 14:19 ` SeongJae Park
@ 2026-03-26 6:15 ` Liew Rui Yan
0 siblings, 0 replies; 7+ messages in thread
From: Liew Rui Yan @ 2026-03-26 6:15 UTC (permalink / raw)
To: sj; +Cc: aethernet65535, damon, linux-mm
Hi SeongJae,
On Wed, 25 Mar 2026 07:19:56 -0700, SeongJae Park <sj@kernel.org> wrote:
> Hi Liew,
>
> On Wed, 25 Mar 2026 15:17:09 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
>
> > Hi SeongJae,
> >
> > > Forwarding Sashiko review for doing discussions via mails.
> > >
> > > # review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
> > > # start of sashiko.dev inline review
> > > commit 60ccea4154b0c58741fae2323454a5a9496b67fa
> > > Author: Liew Rui Yan <aethernet65535@gmail.com>
> > >
> > > mm/damon: add synchronous commit for commit_inputs
> > >
> > > Writing invalid parameters to sysfs followed by commit_inputs=Y previously
> > > failed silently. This patch uses damon_call() to synchronously commit
> > > parameters in the kdamond thread's safe context, returning validation errors
> > > immediately to userspace.
> > >
> > > Link: https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
> > >
> > > > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > > > index 554559d72976..a2410f648b51 100644
> > > > --- a/mm/damon/lru_sort.c
> > > > +++ b/mm/damon/lru_sort.c
> >
> > Thank you for forwarding the review from Sashiko.dev!
>
> Thank you for reviewing the review.
>
> >
> > > > +static int damon_lru_sort_commit_inputs_store(const char *val,
> > > > + const struct kernel_param *kp)
> > > > +{
> > > > + bool commit_inputs_request;
> > > > int err;
> > > > + struct damon_call_control control = {
> > > > + .fn = damon_lru_sort_commit_inputs_fn,
> > > > + .data = ctx,
> > > > + .repeat = false,
> > > > + };
> > > >
> > > > - if (!commit_inputs)
> > > > + err = kstrtobool(val, &commit_inputs_request);
> > > > + if (err)
> > > > + return err;
> > > > +
> > > > + if (!commit_inputs_request)
> > > > return 0;
> > > >
> > > > - err = damon_lru_sort_apply_parameters();
> > > > - commit_inputs = false;
> > > > - return err;
> > > > + /*
> > > > + * Skip damon_call() during early boot or when kdamond is
> > > > + * not running to avoid NULL pointer dereference.
> > > > + */
> > > > + if (!ctx)
> > > > + return -EBUSY;
> > > > +
> > > > + err = damon_call(ctx, &control);
> > >
> > > Can this sequence lead to a system-wide deadlock on kernel_param_lock?
> > >
> > > Because damon_lru_sort_commit_inputs_store() is a sysfs .set callback, it
> > > executes with the global kernel_param_lock held.
> > >
> > > If kdamond_fn terminates (e.g., due to an invalid parameter like addr_unit=3
> > > causing a non-power-of-2 min_region_sz), it processes its exit sequence by
> > > calling kdamond_call(ctx, true) to flush pending requests. After flushing, it
> > > eventually sets ctx->kdamond = NULL at the very end of the thread.
> > >
> > > If a user writes to commit_inputs exactly in the window after
> > > kdamond_call() finishes but before ctx->kdamond = NULL is set:
> > >
> > > Thread 1 (kdamond_fn):
> > > kdamond_call(ctx, true);
> > >
> > > Thread 2 (sysfs write):
> > > damon_call(ctx, &control);
> > > if (damon_is_running(ctx))
> > > wait_for_completion(&control.completion);
> > >
> > > Thread 1 (kdamond_fn):
> > > ctx->kdamond = NULL;
> > >
> > > Thread 2 will sleep indefinitely in wait_for_completion(). Because Thread 2
> > > holds the global kernel_param_lock, this permanently freezes all module
> > > parameter sysfs interfaces system-wide.
> >
> > I have verified this behavior with the following test case:
> >
> > # cd /sys/module/damon_lru_sort/parameters/
> > # echo Y > enabled
> > # ps aux | grep "[k]damond"
> > root 70 0.0 0.0 0 0 ? I 12:16 0:00 [kdamond.0]
>
> FYI, you can read kdamond_pid parameter.
>
> > # echo 3 > addr_unit
> > # echo Y > commit_inputs
> > bash: echo: write error: Invalid argument
> > # ps aux | grep "[k]damond"
> >
> > ... kdamond has exited unexpectedly
> >
> > I will add a patch in v6 to validate 'addr_unit' in addr_unit_store() to
> > reject non-power-of-2 inputs immediately.
>
> That's because kdamond_fn() has recently changed to exit the loop if
> damon_commit_ctx() fails in the loop for any reason. Refer to commit
> 26f775a054c3 ("mm/damon/core: avoid use of half-online-committed context") for
> more details.
>
> Adding the validation in addr_unit_store() would work, but it could be a kind
> of whack-a-mole game. I'd prefer doing the validation of all inputs in one
> place, just before the last damon_commit_ctx() call. DAMON_SYSFS is doing so,
> by making a test context and try committing user input to the test context
> before doing that to the running context. How about doing so here, too?
Thank you for the suggestion. I have now added the check
'!src->addr_unit || ! is_power_of_2(src->addr_unit)' to
damon_commit_ctx().
> >
> > > > + if (err) {
> > > > + /*
> > > > + * If kdamond is not running, damon_call() returns -EINVAL.
> > > > + * Remap it to -EBUSY to better reflect that the operation
> > > > + * failed due to the current state of the system.
> > > > + */
> > > > + if (err == -EINVAL && !damon_is_running(ctx))
> > > > + return -EBUSY;
> > > > +
> > > > + return err;
> > > > + }
> > > > +
> > > > + return control.return_code;
> > > > }
> > >
> > > > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > > > index 86da14778658..4cbd67e08d56 100644
> > > > --- a/mm/damon/reclaim.c
> > > > +++ b/mm/damon/reclaim.c
> > >
> > > [ ... ]
> > >
> > > > + /*
> > > > + * Skip damon_call() during early boot or when kdamond is
> > > > + * not running to avoid NULL pointer dereference.
> > > > + */
> > > > + if (!ctx)
> > > > + return -EBUSY;
> > > > +
> > > > + err = damon_call(ctx, &control);
> > >
> > > Does this code share the same kernel_param_lock deadlock potential as
> > > damon_lru_sort_commit_inputs_store() described above?
> >
> > As we discussed in the RFC-v4 thread [1], this is a false positive.
> > Since 'enabled=N' and 'commit_inputs=Y' are both serialized by the
> > global 'kernel_param_lock', and kthreads cannot be forcibly terminated
> > by userspace signals, the completion signal in damon_call() will always
> > be reached in a controlled manner.
>
> TL; DR: I think the deadlock can happen. But that's a separate issue.
>
> Let's suppose the user writes 'Y' to commit_inputs, and damon_commit_ctx()
> fails due to an allocation failure. The write to commit_inputs will complete,
> and kdmond_fn() will start its exit sequence.
>
> The user can write 'Y' to commit_inputs again, when kdamond_fn() finished the
> last kdamond_call() with 'cancel=True', but before it unset the ->kdamond
> field.
>
> Then, the deadlock that sashiko pointed out could happen.
>
> The deadlock is due to the incomplete design of damon_call() cleanup. And I
> think the same deadlock is already available with DAMON_SYSFS. I will work on
> fixing this.
I really appreciate your dedication to DAMON.
> So the issue already exists. But let's ensure the fix is merged before this
> patch, since this patch adds another exploitable path that can consequence in
> whole param_lock deadlock.
Should I wait for your fix to be merged into damon/next before I post
the next version?
Best regards,
Rui Yan
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC v5] mm/damon: add synchronous commit for commit_inputs
2026-03-25 14:29 ` SeongJae Park
@ 2026-03-26 6:16 ` Liew Rui Yan
0 siblings, 0 replies; 7+ messages in thread
From: Liew Rui Yan @ 2026-03-26 6:16 UTC (permalink / raw)
To: sj; +Cc: aethernet65535, damon, linux-mm
On Wed, 25 Mar 2026 07:29:20 -0700 SeongJae Park <sj@kernel.org> wrote:
> On Wed, 25 Mar 2026 09:39:39 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
>
> > Problem
> > =======
> > Writing invalid parameters to sysfs followed by 'commit_inputs=Y' fails
> > silently (no error returned to shell), because the validation happens
> > asynchronously in the kdamond.
> >
> > Solution
> > ========
> > To fix this, the commit_inputs_store() callback now uses damon_call() to
> > synchronously commit parameters in the kdamond thread's safe context.
> > This ensures that validation errors are returned immediately to
> > userspace, following the pattern used by DAMON_SYSFS.
> >
> > Changes
> > =======
> > 1. Added commit_inputs_store() and commit_inputs_fn() to commit
> > synchronously.
> > 2. Removed handle_commit_inputs().
> >
> > This change is motivated from another discussion [1].
> >
> > [1] https://lore.kernel.org/20260318153731.97470-1-aethernet65535@gmail.com
> >
> > Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
> > ---
> > Changes from RFC-v4:
> > - Rename the 'yes' variable in commit_inputs_store() to the more
> > understandable 'commit_inputs_request'.
> > - Return -EBUSY instead of -EINVAL when 'commit_inputs' is triggered
> > while kdamond is not running.
> > - Link to RFC-v4: https://lore.kernel.org/20260323021648.6590-1-aethernet65535@gmail.com
> >
> > Changes from RFC-v3:
> > - Added checks for 'ctx' and 'damon_is_running()' to prevent NULL
> > pointer dereference during early boot. (Found by Sashiko.dev)
> > - Removed handle_commit_inputs() and its associated polling logic as
> > they have become dead code after moving to the synchronous damon_call()
> > approach.
> > - Ensure the 'commit_inputs' is properly updated.
> > Link to RFC-v3: https://lore.kernel.org/20260322231522.32700-1-aethernet65535@gmail.com
> >
> > Changes from RFC-v2:
> > - Removed damon_validate_attrs(), now using damon_commit_ctx() for
> > synchronous validation in the kdamond context.
> > - Following DAMON_SYSFS pattern for synchronous commit via damon_call().
> > - Link to RFC-v2: https://lore.kernel.org/20260321140926.22163-1-aethernet65535@gmail.com
> >
> > Changes from RFC-v1:
> > - Remove question from commit message area.
> > - Added synchronous validation for DAMON_RECLAIM.
> > - Rename damon_valid_attrs() -> damon_validate_attrs().
> > - Exported a new function damon_validate_attrs() and declared it in
> > damon.h.
> > - Link to RFC-v1: https://lore.kernel.org/20260321002642.22712-1-aethernet65535@gmail.com
> >
> > mm/damon/lru_sort.c | 54 +++++++++++++++++++++++++++++++++++++++------
> > mm/damon/reclaim.c | 54 +++++++++++++++++++++++++++++++++++++++------
> > 2 files changed, 94 insertions(+), 14 deletions(-)
> >
> > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > index 554559d72976..a2410f648b51 100644
> > --- a/mm/damon/lru_sort.c
> > +++ b/mm/damon/lru_sort.c
> > @@ -39,7 +39,6 @@ static bool enabled __read_mostly;
> > * the re-reading, DAMON_LRU_SORT will be disabled.
> > */
> > static bool commit_inputs __read_mostly;
> > -module_param(commit_inputs, bool, 0600);
> >
> > /*
> > * Desired active to [in]active memory ratio in bp (1/10,000).
> > @@ -349,18 +348,59 @@ static int damon_lru_sort_apply_parameters(void)
> > return err;
> > }
> >
> > -static int damon_lru_sort_handle_commit_inputs(void)
> > +static int damon_lru_sort_commit_inputs_fn(void *arg)
> > {
> > + return damon_lru_sort_apply_parameters();
> > +}
> > +
> > +static int damon_lru_sort_commit_inputs_store(const char *val,
> > + const struct kernel_param *kp)
> > +{
> > + bool commit_inputs_request;
> > int err;
> > + struct damon_call_control control = {
> > + .fn = damon_lru_sort_commit_inputs_fn,
> > + .data = ctx,
>
> You don't really use the data in the callback function. Why don't you just
> unset it?
You are right, thank you for pointing this out. I will remove the
unnecessary assignment in the next version.
> > + .repeat = false,
> > + };
> >
> > - if (!commit_inputs)
> > + err = kstrtobool(val, &commit_inputs_request);
> > + if (err)
> > + return err;
> > +
> > + if (!commit_inputs_request)
> > return 0;
> >
> > - err = damon_lru_sort_apply_parameters();
> > - commit_inputs = false;
> > - return err;
> > + /*
> > + * Skip damon_call() during early boot or when kdamond is
> > + * not running to avoid NULL pointer dereference.
> > + */
> > + if (!ctx)
> > + return -EBUSY;
> > +
> > + err = damon_call(ctx, &control);
> > + if (err) {
> > + /*
> > + * If kdamond is not running, damon_call() returns -EINVAL.
> > + * Remap it to -EBUSY to better reflect that the operation
> > + * failed due to the current state of the system.
> > + */
> > + if (err == -EINVAL && !damon_is_running(ctx))
> > + return -EBUSY;
>
> In my opinion, EINVAL is a better return value. It was failed not because
> something is busy but just because DAMON is not running. Committing something
> to DAMON while DAMON is not running seems invalid operation to me.
That makes sense. I will revert this to return -EINVAL when DAMON is not
running.
Best regards,
Rui Yan
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-03-26 6:16 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-03-25 1:39 [RFC v5] mm/damon: add synchronous commit for commit_inputs Liew Rui Yan
2026-03-25 2:53 ` (sashiko review) " SeongJae Park
2026-03-25 7:17 ` Liew Rui Yan
2026-03-25 14:19 ` SeongJae Park
2026-03-26 6:15 ` Liew Rui Yan
2026-03-25 14:29 ` SeongJae Park
2026-03-26 6:16 ` Liew Rui Yan
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox