public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
* [PATCH v3 0/2] mm/damon: validate min_region_size to be power of 2
@ 2026-04-03  5:23 Liew Rui Yan
  2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
  2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
  0 siblings, 2 replies; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  5:23 UTC (permalink / raw)
  To: sj; +Cc: yanquanmin1, damon, linux-mm, Liew Rui Yan

Problem
=======
When a user provides an invalid min_region_sz (not a power-of-2) via the
DAMON sysfs interface, damon_commit_ctx() fails the validation, which
causes the kdamond to terminate unexpectedly.

Reproduction steps:
1. Enable DAMON_LRU_SORT or DAMON_RECLAIM.
2. Set an invalid 'min_region_sz' by setting 'addr_unit' such that
   'DAMON_MIN_REGION_SZ / addr_unit' is not a power-of-2 (e.g.,
   addr_unit=3).
3. Commit the parameters via 'commit_inputs'.
4. kdamond terminates due to the validation failure.

Solution
========
Add power-of-2 validation in damon_lru_sort_apply_parameters() and
damon_reclaim_apply_parameters(), return -EINVAL immediately.

Patch 1 fixes the issue for DAMON_LRU_SORT.
Patch 2 fixes the issue for DAMON_RECLAIM.

Changes from v2
(https://lore.kernel.org/20260402053756.26606-1-aethernet65535@gmail.com)
- Split the patch into two per-module patches.
- Add Fixes: and Cc: stable tags.
- Elaborate user impact and reproduction steps.
Changes from v1
(https://lore.kernel.org/20260331073231.30060-1-aethernet65535@gmail.com)
- Fix memory leak issue.
Changes from first attempt
(https://lore.kernel.org/20260327062627.66426-1-aethernet65535@gmail.com)
- Renamed the subject.
- Validate min_region_sz rather than addr_unit.

Liew Rui Yan (2):
  mm/damon/lru_sort: validate min_region_size to be power of 2
  mm/damon/reclaim: validate min_region_size to be power of 2

 mm/damon/lru_sort.c | 5 +++++
 mm/damon/reclaim.c  | 5 +++++
 2 files changed, 10 insertions(+)

-- 
2.53.0



^ permalink raw reply	[flat|nested] 17+ messages in thread

* [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  5:23 [PATCH v3 0/2] mm/damon: validate min_region_size to be power of 2 Liew Rui Yan
@ 2026-04-03  5:23 ` Liew Rui Yan
  2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
                     ` (2 more replies)
  2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
  1 sibling, 3 replies; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  5:23 UTC (permalink / raw)
  To: sj; +Cc: yanquanmin1, damon, linux-mm, Liew Rui Yan, stable

The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
However, if an invalid input is provided via the DAMON_LRU_SORT
interface, the validation failure occurs too late, causing kdamond to
terminate unexpectedly.

To reproduce:
1. Enable DAMON_LRU_SORT.
2. Set an invalid 'addr_unit' (e.g., addr_unit=3) so that
   'min_region_sz = DAMON_MIN_REGION_SZ / addr_unit' becomes
   non-power-of-2.
3. Commit parameters, and observe kdamond termination.

This patch adds an early check in damon_lru_sort_apply_parameters() to
validate 'min_region_sz' and return -EINVAL immediately if it is not
a power-of-2, preventing unexpected kdamond termination.

Fixes: 2e0fe9245d6b ("mm/damon/lru_sort: support addr_unit for DAMON_LRU_SORT")
Cc: <stable@vger.kernel.org> # 6.18.x
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
---
 mm/damon/lru_sort.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
index 554559d72976..3fd176ef9d9c 100644
--- a/mm/damon/lru_sort.c
+++ b/mm/damon/lru_sort.c
@@ -294,6 +294,11 @@ static int damon_lru_sort_apply_parameters(void)
 	param_ctx->addr_unit = addr_unit;
 	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
 
+	if (!is_power_of_2(param_ctx->min_region_sz)) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	if (!damon_lru_sort_mon_attrs.sample_interval) {
 		err = -EINVAL;
 		goto out;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
  2026-04-03  5:23 [PATCH v3 0/2] mm/damon: validate min_region_size to be power of 2 Liew Rui Yan
  2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
@ 2026-04-03  5:23 ` Liew Rui Yan
  2026-04-03  8:33   ` (sashiko review) " Liew Rui Yan
  2026-04-03 16:20   ` SeongJae Park
  1 sibling, 2 replies; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  5:23 UTC (permalink / raw)
  To: sj; +Cc: yanquanmin1, damon, linux-mm, Liew Rui Yan, stable

The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
However, if an invalid input is provided via the DAMON_RECLAIM interface,
the validation failure occurs too late, causing kdamond to terminate
unexpectedly.

To reproduce:
1. Enable DAMON_RECLAIM.
2. Set an invalid 'addr_unit' (e.g., addr_unit=3) so that
   'min_region_sz = DAMON_MIN_REGION_SZ / addr_unit' becomes
   non-power-of-2.
3. Commit parameters, and observe kdamond termination.

This patch adds an early check in damon_reclaim_apply_parameters() to
validate 'min_region_sz' and return -EINVAL immediately if it is not a
power-of-2, preventing unexpected kdamond termination.

Fixes: 7db551fcfb2a ("mm/damon/reclaim: support addr_unit for DAMON_RECLAIM")
Cc: <stable@vger.kernel.org> # 6.18.x
Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
---
 mm/damon/reclaim.c | 5 +++++
 1 file changed, 5 insertions(+)

diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
index 86da14778658..2747eef5919d 100644
--- a/mm/damon/reclaim.c
+++ b/mm/damon/reclaim.c
@@ -204,6 +204,11 @@ static int damon_reclaim_apply_parameters(void)
 	param_ctx->addr_unit = addr_unit;
 	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
 
+	if (!is_power_of_2(param_ctx->min_region_sz)) {
+		err = -EINVAL;
+		goto out;
+	}
+
 	if (!damon_reclaim_mon_attrs.aggr_interval) {
 		err = -EINVAL;
 		goto out;
-- 
2.53.0



^ permalink raw reply related	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
@ 2026-04-03  8:31   ` Liew Rui Yan
  2026-04-03 15:55     ` SeongJae Park
  2026-04-03  8:59   ` Quanmin Yan
  2026-04-03 16:19   ` SeongJae Park
  2 siblings, 1 reply; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  8:31 UTC (permalink / raw)
  To: aethernet65535; +Cc: damon, linux-mm, sj, stable, yanquanmin1

Forwarding Sashiko review for doing discussions via mails.

# review url: https://sashiko.dev/#/patchset/20260403052837.58063-2-aethernet65535@gmail.com
# start of sashiko.dev inline review
commit 525a0116b6d91612e679b30ae84e489a7540edbf
Author: Liew Rui Yan <aethernet65535@gmail.com>

mm/damon/lru_sort: validate min_region_size to be power of 2

This commit adds an early check in damon_lru_sort_apply_parameters() to
ensure min_region_sz is a power of 2. This prevents kdamond from unexpectedly
terminating when given an invalid addr_unit.

> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d729760..3fd176ef9d9c1 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
> @@ -294,6 +294,11 @@ static int damon_lru_sort_apply_parameters(void)
>  	param_ctx->addr_unit = addr_unit;
>  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
>  
> +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +

Instead of adding this early check here, could this be resolved by fixing
the stateless validation in damon_commit_ctx() directly?

Looking at damon_commit_ctx(), it currently sets dst->maybe_corrupted = true
before checking if src->min_region_sz is a power of 2:

int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
{
	int err;

	dst->maybe_corrupted = true;
	if (!is_power_of_2(src->min_region_sz))
		return -EINVAL;
...
}

When this validation fails, it leaves the active context marked as corrupted
without making any state changes, which causes the kdamond thread to terminate
permanently. Moving the stateless is_power_of_2() check above
dst->maybe_corrupted = true would resolve the regression for all current and
future callers without requiring duplicated validation logic at every
call site.

Does this missing validation also affect mm/damon/sysfs.c?

When a user starts DAMON via state_store() -> damon_sysfs_turn_damon_on(),
the code calls damon_sysfs_build_ctx(). This calculates min_region_sz and
then directly calls damon_start(), bypassing damon_commit_ctx() entirely.

Because this bypasses the is_power_of_2() checks, kdamond can actively run
with a non-power-of-2 min_region_sz. Core DAMON operations like
damon_set_regions() use ALIGN() and ALIGN_DOWN() with min_region_sz, which
perform bitwise masking that requires the alignment parameter to be a
power of 2.

Could this allow userspace to trigger incorrect address boundaries, massive
unsigned underflows, or infinite loops during region splitting?

>  	if (!damon_lru_sort_mon_attrs.sample_interval) {
>  		err = -EINVAL;
>  		goto out;


# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260403052837.58063-2-aethernet65535@gmail.com
#
# hkml [1] generated a draft of this mail.  It can be regenerated
# using below command:
#
#     hkml patch sashiko_dev --for_forwarding \
#             20260403052837.58063-2-aethernet65535@gmail.com
#
# [1] https://github.com/sjp38/hackermail


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
  2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
@ 2026-04-03  8:33   ` Liew Rui Yan
  2026-04-03 16:07     ` SeongJae Park
  2026-04-03 16:20   ` SeongJae Park
  1 sibling, 1 reply; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  8:33 UTC (permalink / raw)
  To: aethernet65535; +Cc: damon, linux-mm, sj, stable, yanquanmin1

Forwarding Sashiko review for doing discussions via mails.

# review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com
# start of sashiko.dev inline review
commit 28b6de17622e17753e02ff606413650ad898f09d
Author: Liew Rui Yan <aethernet65535@gmail.com>

mm/damon/reclaim: validate min_region_size to be power of 2

This commit adds an early check in damon_reclaim_apply_parameters() to
validate min_region_sz and return -EINVAL if it is not a power-of-2.
This prevents kdamond from unexpectedly terminating due to late
validation failures.

> The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> However, if an invalid input is provided via the DAMON_RECLAIM interface,
> the validation failure occurs too late, causing kdamond to terminate
> unexpectedly.

Is the root cause of this unexpected termination in damon_commit_ctx()
itself?

Looking at damon_commit_ctx() in mm/damon/core.c, it sets
dst->maybe_corrupted = true before running the is_power_of_2() check:

damon_commit_ctx()
    dst->maybe_corrupted = true;
    if (!is_power_of_2(src->min_region_sz))
        return -EINVAL;

If the check fails, it correctly returns -EINVAL but incorrectly leaves the
running context permanently marked as corrupted.  When kdamond_fn()
observes ctx->maybe_corrupted is true, it explicitly breaks its main loop,
terminating the thread.

Would it be safer to fix this regression by validating inputs before
mutating the maybe_corrupted state in damon_commit_ctx(), so that other
callers remain protected from unexpected daemon-crashing denial-of-service?

Also, does the DAMON sysfs interface bypass this min_region_sz validation,
allowing kdamond to run with invalid alignment?

In mm/damon/sysfs.c, a user can set addr_unit to a non-power-of-2 value via
the sysfs file. When turning DAMON on:

damon_sysfs_turn_damon_on()
  -> damon_sysfs_build_ctx()
    -> damon_sysfs_apply_inputs()

The code calculates ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / sys_ctx->addr_unit, 1)
but lacks the is_power_of_2() safeguard. It then calls damon_start() directly,
bypassing damon_commit_ctx() entirely.

Since DAMON heavily relies on ALIGN() and ALIGN_DOWN() macros with
min_region_sz, could a non-power-of-2 value produce corrupted mathematical
masks, leading to overlapping bounds and memory corruption?

> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 86da147786583..2747eef5919d1 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c
> @@ -204,6 +204,11 @@ static int damon_reclaim_apply_parameters(void)
>  	param_ctx->addr_unit = addr_unit;
>  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
>  
> +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	if (!damon_reclaim_mon_attrs.aggr_interval) {
>  		err = -EINVAL;
>  		goto out;
>  	}

Can concurrent modifications of damon_reclaim_mon_attrs.aggr_interval via
sysfs cause a division by zero oops?

The global parameter is checked here, but it can be concurrently modified by
root without subsystem-level locks.

Subsequently, the function calls damon_reclaim_new_scheme() which executes:

damon_reclaim_new_scheme()
    .min_age_region = min_age / damon_reclaim_mon_attrs.aggr_interval;

If a concurrent sysfs write sets the interval to 0 immediately after the
check, could this re-fetch trigger an unhandled divide error?

Further down in damon_reclaim_apply_parameters(), the code calls:

    err = damon_set_region_biggest_system_ram_default(param_target,
                    &monitor_region_start,
                    &monitor_region_end,
                    ...);

Can a data race on monitor_region_start and monitor_region_end bypass
bounds checking?

These global variables are exported as module parameters and can be
concurrently written by userspace.

When passed to damon_set_region_biggest_system_ram_default(), the function
validates the pointers:

damon_set_region_biggest_system_ram_default()
    if (*start > *end)
        return -EINVAL;

Without READ_ONCE() wrappers, could these values be modified right after
the validation, satisfying the forbidden state where *start > *end?

If this happens, the invalid boundaries are blindly assigned to addr_range.
When the region is later sized, damon_sz_region() calculates
r->ar.end - r->ar.start, resulting in a massive underflowed unsigned long.

Could this cause an unbounded OOM loop inside kdamond_fn(), where the
region split condition while (damon_sz_region(r) > max_region_sz) executes
infinitely and constantly allocates new structures?


# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com
#
# hkml [1] generated a draft of this mail.  It can be regenerated
# using below command:
#
#     hkml patch sashiko_dev --for_forwarding \
#             20260403052837.58063-3-aethernet65535@gmail.com
#
# [1] https://github.com/sjp38/hackermail


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
  2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
@ 2026-04-03  8:59   ` Quanmin Yan
  2026-04-03  9:50     ` Liew Rui Yan
  2026-04-03 16:19   ` SeongJae Park
  2 siblings, 1 reply; 17+ messages in thread
From: Quanmin Yan @ 2026-04-03  8:59 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: damon, linux-mm, stable, sj

Hi Liew,

在 2026/4/3 13:23, Liew Rui Yan 写道:
> The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> However, if an invalid input is provided via the DAMON_LRU_SORT
> interface, the validation failure occurs too late, causing kdamond to
> terminate unexpectedly.

I'm a little confused, what does "causing kdamond to terminate
unexpectedly" mean? The damon_lru_sort_apply_parameters function will
eventually call damon_commit_ctx, and the power-of-2 check is always
performed. Is the early check here to prevent some more broken case
or am I missing something?


Thanks,
Quanmin Yan

> To reproduce:
> 1. Enable DAMON_LRU_SORT.
> 2. Set an invalid 'addr_unit' (e.g., addr_unit=3) so that
>     'min_region_sz = DAMON_MIN_REGION_SZ / addr_unit' becomes
>     non-power-of-2.
> 3. Commit parameters, and observe kdamond termination.
>
> This patch adds an early check in damon_lru_sort_apply_parameters() to
> validate 'min_region_sz' and return -EINVAL immediately if it is not
> a power-of-2, preventing unexpected kdamond termination.
>
> Fixes: 2e0fe9245d6b ("mm/damon/lru_sort: support addr_unit for DAMON_LRU_SORT")
> Cc: <stable@vger.kernel.org> # 6.18.x
> Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
> ---
>   mm/damon/lru_sort.c | 5 +++++
>   1 file changed, 5 insertions(+)
>
> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d72976..3fd176ef9d9c 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
> @@ -294,6 +294,11 @@ static int damon_lru_sort_apply_parameters(void)
>   	param_ctx->addr_unit = addr_unit;
>   	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
>   
> +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>   	if (!damon_lru_sort_mon_attrs.sample_interval) {
>   		err = -EINVAL;
>   		goto out;


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  8:59   ` Quanmin Yan
@ 2026-04-03  9:50     ` Liew Rui Yan
  0 siblings, 0 replies; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-03  9:50 UTC (permalink / raw)
  To: yanquanmin1; +Cc: aethernet65535, damon, linux-mm, sj, stable

Hi Quanmin,

On Fri, 3 Apr 2026 16:59:38 +0800 Quanmin Yan <yanquanmin1@huawei.com> wrote:
> [...]
> 
> I'm a little confused, what does "causing kdamond to terminate
> unexpectedly" mean? The damon_lru_sort_apply_parameters function will
> eventually call damon_commit_ctx, and the power-of-2 check is always
> performed. Is the early check here to prevent some more broken case
> or am I missing something?

terminate unexpectedly means - termination not initiated by the user.
(e.g., not by echo N > enabled)

The issue is not about whether the check exists, but about when it
happens.

In damon_commit_ctx():

    dst->maybe_corrupted = true;
    if (!is_power_of_2(src->min_region_sz))
        return -EINVAL;

Even though -EINVAL is returned, the 'maybe_corrupted' flag has already
been set. When kdamond sees this flag, it terminates.

My patch prevents this by rejecting invalid 'min_region_sz' before
damon_commit_ctx() is called, so 'maybe_corrupted' never gets set for
invalid inputs.

Best regards,
Rui Yan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
@ 2026-04-03 15:55     ` SeongJae Park
  2026-04-04  9:09       ` Liew Rui Yan
  0 siblings, 1 reply; 17+ messages in thread
From: SeongJae Park @ 2026-04-03 15:55 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm, stable, yanquanmin1

Hello Liew,


I recently learned not everyone is interested [1] in sashiko review.  So when
I sharing sashiko review, I now try to drop recipients who are not necessarily
interested in the review.  My choices at the moment is the author of the patch,
maintainers of the subsystem, and the mailing lists of the subsystem.

So if I was sharing this review, I would dropped stable@ and yanquanmin1@ from
the Cc list.

Please consider doing similar when you share sashiko review.

Everyone including myself is learning how to use sashiko well, so it could be
confusing.  Sorry about that.  I'm planning to make a guideline for sashiko
review usage in DAMON subsystem, and update hkml to embed it, hoping that to
help reducing some of the confusion.

Anyway, seems sashiko found no blockers for this patch.

On Fri,  3 Apr 2026 16:31:25 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> Forwarding Sashiko review for doing discussions via mails.
> 
> # review url: https://sashiko.dev/#/patchset/20260403052837.58063-2-aethernet65535@gmail.com
> # start of sashiko.dev inline review
> commit 525a0116b6d91612e679b30ae84e489a7540edbf
> Author: Liew Rui Yan <aethernet65535@gmail.com>
> 
> mm/damon/lru_sort: validate min_region_size to be power of 2
> 
> This commit adds an early check in damon_lru_sort_apply_parameters() to
> ensure min_region_sz is a power of 2. This prevents kdamond from unexpectedly
> terminating when given an invalid addr_unit.
> 
> > diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> > index 554559d729760..3fd176ef9d9c1 100644
> > --- a/mm/damon/lru_sort.c
> > +++ b/mm/damon/lru_sort.c
> > @@ -294,6 +294,11 @@ static int damon_lru_sort_apply_parameters(void)
> >  	param_ctx->addr_unit = addr_unit;
> >  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
> >  
> > +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> > +		err = -EINVAL;
> > +		goto out;
> > +	}
> > +
> 
> Instead of adding this early check here, could this be resolved by fixing
> the stateless validation in damon_commit_ctx() directly?
> 
> Looking at damon_commit_ctx(), it currently sets dst->maybe_corrupted = true
> before checking if src->min_region_sz is a power of 2:
> 
> int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
> {
> 	int err;
> 
> 	dst->maybe_corrupted = true;
> 	if (!is_power_of_2(src->min_region_sz))
> 		return -EINVAL;
> ...
> }
> 
> When this validation fails, it leaves the active context marked as corrupted
> without making any state changes, which causes the kdamond thread to terminate
> permanently. Moving the stateless is_power_of_2() check above
> dst->maybe_corrupted = true would resolve the regression for all current and
> future callers without requiring duplicated validation logic at every
> call site.

We discussed this before, and decided not to add the check for now.

Keeping this behavior also makes the expectation for callers simpler:
damon_commit_ctx() failure always means kdamond will be terminated.

> 
> Does this missing validation also affect mm/damon/sysfs.c?
> 
> When a user starts DAMON via state_store() -> damon_sysfs_turn_damon_on(),
> the code calls damon_sysfs_build_ctx(). This calculates min_region_sz and
> then directly calls damon_start(), bypassing damon_commit_ctx() entirely.
> 
> Because this bypasses the is_power_of_2() checks, kdamond can actively run
> with a non-power-of-2 min_region_sz. Core DAMON operations like
> damon_set_regions() use ALIGN() and ALIGN_DOWN() with min_region_sz, which
> perform bitwise masking that requires the alignment parameter to be a
> power of 2.
> 
> Could this allow userspace to trigger incorrect address boundaries, massive
> unsigned underflows, or infinite loops during region splitting?

Good catch, but none of this patch's business.  I will work on this.

> 
> >  	if (!damon_lru_sort_mon_attrs.sample_interval) {
> >  		err = -EINVAL;
> >  		goto out;
> 
> 
> # end of sashiko.dev inline review
> # review url: https://sashiko.dev/#/patchset/20260403052837.58063-2-aethernet65535@gmail.com

[1] https://lore.kernel.org/20260331043441.67196-1-sj@kernel.org


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
  2026-04-03  8:33   ` (sashiko review) " Liew Rui Yan
@ 2026-04-03 16:07     ` SeongJae Park
  0 siblings, 0 replies; 17+ messages in thread
From: SeongJae Park @ 2026-04-03 16:07 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

Dropping stable@ and Quanmin from the Cc.  Read my reply to the sashiko review
sharing for the first patch of this series to show why I do that.

And I find no blockers for this patch from below sashiko review.  Read further
if you are interested in details.

On Fri,  3 Apr 2026 16:33:22 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> Forwarding Sashiko review for doing discussions via mails.
> 
> # review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com
> # start of sashiko.dev inline review
> commit 28b6de17622e17753e02ff606413650ad898f09d
> Author: Liew Rui Yan <aethernet65535@gmail.com>
> 
> mm/damon/reclaim: validate min_region_size to be power of 2
> 
> This commit adds an early check in damon_reclaim_apply_parameters() to
> validate min_region_sz and return -EINVAL if it is not a power-of-2.
> This prevents kdamond from unexpectedly terminating due to late
> validation failures.
> 
> > The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> > However, if an invalid input is provided via the DAMON_RECLAIM interface,
> > the validation failure occurs too late, causing kdamond to terminate
> > unexpectedly.
> 
> Is the root cause of this unexpected termination in damon_commit_ctx()
> itself?
> 
> Looking at damon_commit_ctx() in mm/damon/core.c, it sets
> dst->maybe_corrupted = true before running the is_power_of_2() check:
> 
> damon_commit_ctx()
>     dst->maybe_corrupted = true;
>     if (!is_power_of_2(src->min_region_sz))
>         return -EINVAL;
> 
> If the check fails, it correctly returns -EINVAL but incorrectly leaves the
> running context permanently marked as corrupted.  When kdamond_fn()
> observes ctx->maybe_corrupted is true, it explicitly breaks its main loop,
> terminating the thread.
> 
> Would it be safer to fix this regression by validating inputs before
> mutating the maybe_corrupted state in damon_commit_ctx(), so that other
> callers remain protected from unexpected daemon-crashing denial-of-service?
> 
> Also, does the DAMON sysfs interface bypass this min_region_sz validation,
> allowing kdamond to run with invalid alignment?
> 
> In mm/damon/sysfs.c, a user can set addr_unit to a non-power-of-2 value via
> the sysfs file. When turning DAMON on:
> 
> damon_sysfs_turn_damon_on()
>   -> damon_sysfs_build_ctx()
>     -> damon_sysfs_apply_inputs()
> 
> The code calculates ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / sys_ctx->addr_unit, 1)
> but lacks the is_power_of_2() safeguard. It then calls damon_start() directly,
> bypassing damon_commit_ctx() entirely.
> 
> Since DAMON heavily relies on ALIGN() and ALIGN_DOWN() macros with
> min_region_sz, could a non-power-of-2 value produce corrupted mathematical
> masks, leading to overlapping bounds and memory corruption?

I answered above two questions on my reply to the sashiko review sharing mail
for the first patch of this series.  Please refer to that for details.

In short, the first behavior is intended one and we don't want to change it.
For the second finding, I will work on it.

> 
> > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > index 86da147786583..2747eef5919d1 100644
> > --- a/mm/damon/reclaim.c
> > +++ b/mm/damon/reclaim.c
> > @@ -204,6 +204,11 @@ static int damon_reclaim_apply_parameters(void)
> >  	param_ctx->addr_unit = addr_unit;
> >  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
> >  
> > +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> > +		err = -EINVAL;
> > +		goto out;
> > +	}
> > +
> >  	if (!damon_reclaim_mon_attrs.aggr_interval) {
> >  		err = -EINVAL;
> >  		goto out;
> >  	}
> 
> Can concurrent modifications of damon_reclaim_mon_attrs.aggr_interval via
> sysfs cause a division by zero oops?
> 
> The global parameter is checked here, but it can be concurrently modified by
> root without subsystem-level locks.
> 
> Subsequently, the function calls damon_reclaim_new_scheme() which executes:
> 
> damon_reclaim_new_scheme()
>     .min_age_region = min_age / damon_reclaim_mon_attrs.aggr_interval;
> 
> If a concurrent sysfs write sets the interval to 0 immediately after the
> check, could this re-fetch trigger an unhandled divide error?
> 
> Further down in damon_reclaim_apply_parameters(), the code calls:
> 
>     err = damon_set_region_biggest_system_ram_default(param_target,
>                     &monitor_region_start,
>                     &monitor_region_end,
>                     ...);
> 
> Can a data race on monitor_region_start and monitor_region_end bypass
> bounds checking?
> 
> These global variables are exported as module parameters and can be
> concurrently written by userspace.
> 
> When passed to damon_set_region_biggest_system_ram_default(), the function
> validates the pointers:
> 
> damon_set_region_biggest_system_ram_default()
>     if (*start > *end)
>         return -EINVAL;
> 
> Without READ_ONCE() wrappers, could these values be modified right after
> the validation, satisfying the forbidden state where *start > *end?
> 
> If this happens, the invalid boundaries are blindly assigned to addr_range.
> When the region is later sized, damon_sz_region() calculates
> r->ar.end - r->ar.start, resulting in a massive underflowed unsigned long.
> 
> Could this cause an unbounded OOM loop inside kdamond_fn(), where the
> region split condition while (damon_sz_region(r) > max_region_sz) executes
> infinitely and constantly allocates new structures?

Again, not this patch's business.

We addressed [1] this sort of races by defining updating parameter during
commit_inputs processing as a wrong user behavior, and adding warning for
stable kernels.

Also Liew's recent patch [2] will remove the races in the mainline and new
kernels.

[1] https://lore.kernel.org/20260328133216.9697-1-sj@kernel.org
[2] https://lore.kernel.org/20260402155733.77050-4-sj@kernel.org


> 
> 
> # end of sashiko.dev inline review
> # review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
  2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
  2026-04-03  8:59   ` Quanmin Yan
@ 2026-04-03 16:19   ` SeongJae Park
  2026-04-04  9:11     ` Liew Rui Yan
  2 siblings, 1 reply; 17+ messages in thread
From: SeongJae Park @ 2026-04-03 16:19 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, yanquanmin1, damon, linux-mm, stable

On Fri,  3 Apr 2026 13:23:49 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> However, if an invalid input is provided via the DAMON_LRU_SORT
> interface, the validation failure occurs too late, causing kdamond to
> terminate unexpectedly.

As Quanmin also asked, clarifying the unexpected termination would be nice.

> 
> To reproduce:
> 1. Enable DAMON_LRU_SORT.
> 2. Set an invalid 'addr_unit' (e.g., addr_unit=3) so that
>    'min_region_sz = DAMON_MIN_REGION_SZ / addr_unit' becomes
>    non-power-of-2.
> 3. Commit parameters, and observe kdamond termination.
> 
> This patch adds an early check in damon_lru_sort_apply_parameters() to
> validate 'min_region_sz' and return -EINVAL immediately if it is not
> a power-of-2, preventing unexpected kdamond termination.
> 
> Fixes: 2e0fe9245d6b ("mm/damon/lru_sort: support addr_unit for DAMON_LRU_SORT")
> Cc: <stable@vger.kernel.org> # 6.18.x

I remember I suggested adding stable@, but only if you think it deserve.  I'm
now not very sure if this deserves Cc-ing stable@.  As I mentioned before,
there are multiple patches to review in parallel (you are also having such
multiple patches in the queue).  Please don't expect I will follow full
contexts especially when a single person posting multiple patches in parallel
every day or two, and bear in mind with me.

Sorry about the limited bandwidth from my side.  You could also simply slow
down your pace, though.

For stable@ Cc-ing patches, more clearly describing the user impact would be
nice, and helpful for judging if it deserves that.  Could you please elaborate?

> Signed-off-by: Liew Rui Yan <aethernet65535@gmail.com>
> ---
>  mm/damon/lru_sort.c | 5 +++++
>  1 file changed, 5 insertions(+)
> 
> diff --git a/mm/damon/lru_sort.c b/mm/damon/lru_sort.c
> index 554559d72976..3fd176ef9d9c 100644
> --- a/mm/damon/lru_sort.c
> +++ b/mm/damon/lru_sort.c
> @@ -294,6 +294,11 @@ static int damon_lru_sort_apply_parameters(void)
>  	param_ctx->addr_unit = addr_unit;
>  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
>  
> +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	if (!damon_lru_sort_mon_attrs.sample_interval) {
>  		err = -EINVAL;
>  		goto out;

Code change looks good to me.


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
  2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
  2026-04-03  8:33   ` (sashiko review) " Liew Rui Yan
@ 2026-04-03 16:20   ` SeongJae Park
  1 sibling, 0 replies; 17+ messages in thread
From: SeongJae Park @ 2026-04-03 16:20 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, yanquanmin1, damon, linux-mm, stable

On Fri,  3 Apr 2026 13:23:50 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> However, if an invalid input is provided via the DAMON_RECLAIM interface,
> the validation failure occurs too late, causing kdamond to terminate
> unexpectedly.
> 
> To reproduce:
> 1. Enable DAMON_RECLAIM.
> 2. Set an invalid 'addr_unit' (e.g., addr_unit=3) so that
>    'min_region_sz = DAMON_MIN_REGION_SZ / addr_unit' becomes
>    non-power-of-2.
> 3. Commit parameters, and observe kdamond termination.
> 
> This patch adds an early check in damon_reclaim_apply_parameters() to
> validate 'min_region_sz' and return -EINVAL immediately if it is not a
> power-of-2, preventing unexpected kdamond termination.
> 
> Fixes: 7db551fcfb2a ("mm/damon/reclaim: support addr_unit for DAMON_RECLAIM")
> Cc: <stable@vger.kernel.org> # 6.18.x

I'm not very sure if this deserves Cc-ing stable@.  I posted more details on my
reply to the first patch of this series.  Let's discuss further on the thread.

I will skip reviewing this patch until the discussion on the thread is done.


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03 15:55     ` SeongJae Park
@ 2026-04-04  9:09       ` Liew Rui Yan
  2026-04-04 20:18         ` SeongJae Park
  0 siblings, 1 reply; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-04  9:09 UTC (permalink / raw)
  To: sj; +Cc: aethernet65535, damon, linux-mm

On Fri,  3 Apr 2026 08:55:29 -0700 SeongJae Park <sj@kernel.org> wrote:

> I recently learned not everyone is interested [1] in sashiko review.  So when
> I sharing sashiko review, I now try to drop recipients who are not necessarily
> interested in the review.  My choices at the moment is the author of the patch,
> maintainers of the subsystem, and the mailing lists of the subsystem.
> 
> So if I was sharing this review, I would dropped stable@ and yanquanmin1@ from
> the Cc list.
> 
> Please consider doing similar when you share sashiko review.
> 
> Everyone including myself is learning how to use sashiko well, so it could be
> confusing.  Sorry about that.  I'm planning to make a guideline for sashiko
> review usage in DAMON subsystem, and update hkml to embed it, hoping that to
> help reducing some of the confusion.

Thank you for the guidance, I will keep this in mind for future Sashiko
review forwards.

> Anyway, seems sashiko found no blockers for this patch.

And thank you very much for reviewing the review! :>

Best regards,
Rui Yan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-03 16:19   ` SeongJae Park
@ 2026-04-04  9:11     ` Liew Rui Yan
  2026-04-04 20:30       ` SeongJae Park
  0 siblings, 1 reply; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-04  9:11 UTC (permalink / raw)
  To: sj; +Cc: aethernet65535, damon, linux-mm

Hi SeongJae,

On Fri,  3 Apr 2026 09:19:06 -0700 SeongJae Park <sj@kernel.org> wrote:

> As Quanmin also asked, clarifying the unexpected termination would be nice.

I will clarify "unexpected termination" in v4's commit message:

    "Unexpectedly" here means the kdamond terminates without the user
    requesting it (e.g., not by wrtting "N" to 'enabled').

> I remember I suggested adding stable@, but only if you think it deserve.  I'm
> now not very sure if this deserves Cc-ing stable@.  As I mentioned before,

I think this deserves Cc stable@ because:
- The issue can be triggered by accidental user misconfiguration
- It causes kdamond termination and hard to recover/restart. Once
  triggered, kdamond remains in an unusable state, and I have not found
  a way to recover/restart it through the sysfs interface without a
  system reboot.
- The fix is small and low-risk
- It improves user experience on stable kernels

But I'm open to dropping it if you think otherwise.

> there are multiple patches to review in parallel (you are also having such
> multiple patches in the queue).  Please don't expect I will follow full
> contexts especially when a single person posting multiple patches in parallel
> every day or two, and bear in mind with me.
> 
> Sorry about the limited bandwidth from my side.  You could also simply slow
> down your pace, though.

I will try to adjust my pace. I've also noticed that the quality of my
recent replies/patches hasn't been ideal, and I will try to be less
rushed in the future. Thank you for your understanding and suggestion.

> For stable@ Cc-ing patches, more clearly describing the user impact would be
> nice, and helpful for judging if it deserves that.  Could you please elaborate?

I will add a "User Impact" section in v4's commit message:

    User Impact
    ===========
    Currently, if a user commit an invalid 'addr_unit', kdamond may
    terminate abruptly. Once terminated, it cannot be easily restarted
    via sysfs, pontentially requiring a system reboot to restore DAMON
    functionality. This patch prevent such termination by validating
    parameters (addr_unit and min_region_sz) early.

Best regards,
Rui Yan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: (sashiko review) [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-04  9:09       ` Liew Rui Yan
@ 2026-04-04 20:18         ` SeongJae Park
  0 siblings, 0 replies; 17+ messages in thread
From: SeongJae Park @ 2026-04-04 20:18 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

On Sat,  4 Apr 2026 17:09:27 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> On Fri,  3 Apr 2026 08:55:29 -0700 SeongJae Park <sj@kernel.org> wrote:
> 
> > I recently learned not everyone is interested [1] in sashiko review.  So when
> > I sharing sashiko review, I now try to drop recipients who are not necessarily
> > interested in the review.  My choices at the moment is the author of the patch,
> > maintainers of the subsystem, and the mailing lists of the subsystem.
> > 
> > So if I was sharing this review, I would dropped stable@ and yanquanmin1@ from
> > the Cc list.
> > 
> > Please consider doing similar when you share sashiko review.
> > 
> > Everyone including myself is learning how to use sashiko well, so it could be
> > confusing.  Sorry about that.  I'm planning to make a guideline for sashiko
> > review usage in DAMON subsystem, and update hkml to embed it, hoping that to
> > help reducing some of the confusion.
> 
> Thank you for the guidance, I will keep this in mind for future Sashiko
> review forwards.

No worry.  I just pushed hkml change for this.  The latest version of hkml will
add the guide to the forwarding mail draft and make the format easier to add
your inline comments together, by default like below:

    $ hkml patch sashiko_dev 20260325013939.18167-1-aethernet65535@gmail.com --for_forwarding

    Adding your opinion together with the sashiko review sharing
    is a recommended practice for reducing traffic.  Will you
    do so?  If so, I will adjust format to be easier for that
    [Y/n]
    # sashiko review suggestions
    #
    # 1. Consider reducing recipients.  Maybe the author,
    #    maintainers, reviewers, and mailing list of the
    #    direct subsystem and parent susystem mailing lists
    #    could be a starting point.
    # 2. Add short summary of your opinion at the beginning.
    #    For example:
    #
    #      sashiko found an issue.  I will respin this patch.
    #      sashiko found no issue.
    #
    # Please don't forget removing this comment block before
    # sending this!

    Forwarding sashiko review in a reply format with my inline
    comments below, for details of my view and doing discussions
    via mails if needed.

    > # review url: https://sashiko.dev/#/patchset/20260325013939.18167-1-aethernet65535@gmail.com
    > # start of sashiko.dev inline review
    > commit 60ccea4154b0c58741fae2323454a5a9496b67fa
    > Author: Liew Rui Yan <aethernet65535@gmail.com>
    [...]

> 
> > Anyway, seems sashiko found no blockers for this patch.
> 
> And thank you very much for reviewing the review! :>

I'm happy to help :)


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-04  9:11     ` Liew Rui Yan
@ 2026-04-04 20:30       ` SeongJae Park
  2026-04-06  9:41         ` Liew Rui Yan
  0 siblings, 1 reply; 17+ messages in thread
From: SeongJae Park @ 2026-04-04 20:30 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

On Sat,  4 Apr 2026 17:11:22 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> Hi SeongJae,
> 
> On Fri,  3 Apr 2026 09:19:06 -0700 SeongJae Park <sj@kernel.org> wrote:
> 
> > As Quanmin also asked, clarifying the unexpected termination would be nice.
> 
> I will clarify "unexpected termination" in v4's commit message:
> 
>     "Unexpectedly" here means the kdamond terminates without the user
>     requesting it (e.g., not by wrtting "N" to 'enabled').
> 
> > I remember I suggested adding stable@, but only if you think it deserve.  I'm
> > now not very sure if this deserves Cc-ing stable@.  As I mentioned before,
> 
> I think this deserves Cc stable@ because:
> - The issue can be triggered by accidental user misconfiguration
> - It causes kdamond termination and hard to recover/restart. Once
>   triggered, kdamond remains in an unusable state, and I have not found
>   a way to recover/restart it through the sysfs interface without a
>   system reboot.
> - The fix is small and low-risk
> - It improves user experience on stable kernels

Thank you for elaborating this.  I now think this deserves Cc-ing stable@.
Let's make sure the user impact is clearly documented on the commit message.
Also, the code change itself and the current commit message is not clearly
explaining how the real problem (DAMON cannot be restarted) can happen.  Please
add the description in the next revision.

> 
> But I'm open to dropping it if you think otherwise.
> 
> > there are multiple patches to review in parallel (you are also having such
> > multiple patches in the queue).  Please don't expect I will follow full
> > contexts especially when a single person posting multiple patches in parallel
> > every day or two, and bear in mind with me.
> > 
> > Sorry about the limited bandwidth from my side.  You could also simply slow
> > down your pace, though.
> 
> I will try to adjust my pace. I've also noticed that the quality of my
> recent replies/patches hasn't been ideal, and I will try to be less
> rushed in the future. Thank you for your understanding and suggestion.

Sounds good :)

> 
> > For stable@ Cc-ing patches, more clearly describing the user impact would be
> > nice, and helpful for judging if it deserves that.  Could you please elaborate?
> 
> I will add a "User Impact" section in v4's commit message:

Yes, please add this kind of message that make clear what is the user impact.

> 
>     User Impact
>     ===========
>     Currently, if a user commit an invalid 'addr_unit', kdamond may
>     terminate abruptly. Once terminated, it cannot be easily restarted
>     via sysfs, pontentially requiring a system reboot to restore DAMON
>     functionality. This patch prevent such termination by validating
>     parameters (addr_unit and min_region_sz) early.

You mentioned it cannot be restarted above.  The above message sounds like
there is a way to restart it via sysfs, though it is not easy.  Please make it
clear and consistent.  As I also requested above, please add the internal
mechanism of how that makes restart unable.  This itself doesn't explain it.


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-04 20:30       ` SeongJae Park
@ 2026-04-06  9:41         ` Liew Rui Yan
  2026-04-06 15:50           ` SeongJae Park
  0 siblings, 1 reply; 17+ messages in thread
From: Liew Rui Yan @ 2026-04-06  9:41 UTC (permalink / raw)
  To: sj; +Cc: aethernet65535, damon, linux-mm

On Sat,  4 Apr 2026 13:30:43 -0700 SeongJae Park <sj@kernel.org> wrote:

> On Sat,  4 Apr 2026 17:11:22 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
> 
> > [...]
> > I think this deserves Cc stable@ because:
> > - The issue can be triggered by accidental user misconfiguration
> > - It causes kdamond termination and hard to recover/restart. Once
> >   triggered, kdamond remains in an unusable state, and I have not found
> >   a way to recover/restart it through the sysfs interface without a
> >   system reboot.
> > - The fix is small and low-risk
> > - It improves user experience on stable kernels
> 
> Thank you for elaborating this.  I now think this deserves Cc-ing stable@.

Thank you for agreeing to Cc stable@.

> Let's make sure the user impact is clearly documented on the commit message.
> Also, the code change itself and the current commit message is not clearly
> explaining how the real problem (DAMON cannot be restarted) can happen.  Please
> add the description in the next revision.

I will add an explanation of why DAMON cannot be restarted in the next
revision.

> [...]
> 
> > 
> > > For stable@ Cc-ing patches, more clearly describing the user impact would be
> > > nice, and helpful for judging if it deserves that.  Could you please elaborate?
> > 
> > I will add a "User Impact" section in v4's commit message:
> 
> Yes, please add this kind of message that make clear what is the user impact.
> 
> > 
> >     User Impact
> >     ===========
> >     Currently, if a user commit an invalid 'addr_unit', kdamond may
> >     terminate abruptly. Once terminated, it cannot be easily restarted
> >     via sysfs, pontentially requiring a system reboot to restore DAMON
> >     functionality. This patch prevent such termination by validating
> >     parameters (addr_unit and min_region_sz) early.
> 
> You mentioned it cannot be restarted above.  The above message sounds like
> there is a way to restart it via sysfs, though it is not easy.  Please make it
> clear and consistent.  As I also requested above, please add the internal
> mechanism of how that makes restart unable.  This itself doesn't explain it.

Thank you for your correction. I will make the description clear and
consistent, and also explain the internal mechanism 
(the 'maybe_corrupted' flag and why writing to 'enabled' cannot recover
it).

Note: 
Please excuse my slower responses in the coming weeks, as I have
upcoming exams. I will be most active and able to follow up during
weekend (Friday/Saturday).

Best regards,
Rui Yan


^ permalink raw reply	[flat|nested] 17+ messages in thread

* Re: [PATCH v3 1/2] mm/damon/lru_sort: validate min_region_size to be power of 2
  2026-04-06  9:41         ` Liew Rui Yan
@ 2026-04-06 15:50           ` SeongJae Park
  0 siblings, 0 replies; 17+ messages in thread
From: SeongJae Park @ 2026-04-06 15:50 UTC (permalink / raw)
  To: Liew Rui Yan; +Cc: SeongJae Park, damon, linux-mm

On Mon,  6 Apr 2026 17:41:16 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> On Sat,  4 Apr 2026 13:30:43 -0700 SeongJae Park <sj@kernel.org> wrote:
> 
> > On Sat,  4 Apr 2026 17:11:22 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:
> > 
> > > [...]
> > > I think this deserves Cc stable@ because:
> > > - The issue can be triggered by accidental user misconfiguration
> > > - It causes kdamond termination and hard to recover/restart. Once
> > >   triggered, kdamond remains in an unusable state, and I have not found
> > >   a way to recover/restart it through the sysfs interface without a
> > >   system reboot.
> > > - The fix is small and low-risk
> > > - It improves user experience on stable kernels
> > 
> > Thank you for elaborating this.  I now think this deserves Cc-ing stable@.
> 
> Thank you for agreeing to Cc stable@.
> 
> > Let's make sure the user impact is clearly documented on the commit message.
> > Also, the code change itself and the current commit message is not clearly
> > explaining how the real problem (DAMON cannot be restarted) can happen.  Please
> > add the description in the next revision.
> 
> I will add an explanation of why DAMON cannot be restarted in the next
> revision.
> 
> > [...]
> > 
> > > 
> > > > For stable@ Cc-ing patches, more clearly describing the user impact would be
> > > > nice, and helpful for judging if it deserves that.  Could you please elaborate?
> > > 
> > > I will add a "User Impact" section in v4's commit message:
> > 
> > Yes, please add this kind of message that make clear what is the user impact.
> > 
> > > 
> > >     User Impact
> > >     ===========
> > >     Currently, if a user commit an invalid 'addr_unit', kdamond may
> > >     terminate abruptly. Once terminated, it cannot be easily restarted
> > >     via sysfs, pontentially requiring a system reboot to restore DAMON
> > >     functionality. This patch prevent such termination by validating
> > >     parameters (addr_unit and min_region_sz) early.
> > 
> > You mentioned it cannot be restarted above.  The above message sounds like
> > there is a way to restart it via sysfs, though it is not easy.  Please make it
> > clear and consistent.  As I also requested above, please add the internal
> > mechanism of how that makes restart unable.  This itself doesn't explain it.
> 
> Thank you for your correction. I will make the description clear and
> consistent, and also explain the internal mechanism 
> (the 'maybe_corrupted' flag and why writing to 'enabled' cannot recover
> it).

Sounds all good :)

> 
> Note: 
> Please excuse my slower responses in the coming weeks, as I have
> upcoming exams. I will be most active and able to follow up during
> weekend (Friday/Saturday).

No worries, take your time :)


Thanks,
SJ

[...]


^ permalink raw reply	[flat|nested] 17+ messages in thread

end of thread, other threads:[~2026-04-06 15:50 UTC | newest]

Thread overview: 17+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-04-03  5:23 [PATCH v3 0/2] mm/damon: validate min_region_size to be power of 2 Liew Rui Yan
2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
2026-04-03 15:55     ` SeongJae Park
2026-04-04  9:09       ` Liew Rui Yan
2026-04-04 20:18         ` SeongJae Park
2026-04-03  8:59   ` Quanmin Yan
2026-04-03  9:50     ` Liew Rui Yan
2026-04-03 16:19   ` SeongJae Park
2026-04-04  9:11     ` Liew Rui Yan
2026-04-04 20:30       ` SeongJae Park
2026-04-06  9:41         ` Liew Rui Yan
2026-04-06 15:50           ` SeongJae Park
2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
2026-04-03  8:33   ` (sashiko review) " Liew Rui Yan
2026-04-03 16:07     ` SeongJae Park
2026-04-03 16:20   ` SeongJae Park

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox