Re: (sashiko review) [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2

public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed

From: Liew Rui Yan <aethernet65535@gmail.com>
To: aethernet65535@gmail.com
Cc: damon@lists.linux.dev, linux-mm@kvack.org, sj@kernel.org,
	stable@vger.kernel.org, yanquanmin1@huawei.com
Subject: Re: (sashiko review) [PATCH v3 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
Date: Fri,  3 Apr 2026 16:33:22 +0800	[thread overview]
Message-ID: <20260403083322.5852-1-aethernet65535@gmail.com> (raw)
In-Reply-To: <20260403052837.58063-3-aethernet65535@gmail.com>

Forwarding Sashiko review for doing discussions via mails.

# review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com
# start of sashiko.dev inline review
commit 28b6de17622e17753e02ff606413650ad898f09d
Author: Liew Rui Yan <aethernet65535@gmail.com>

mm/damon/reclaim: validate min_region_size to be power of 2

This commit adds an early check in damon_reclaim_apply_parameters() to
validate min_region_sz and return -EINVAL if it is not a power-of-2.
This prevents kdamond from unexpectedly terminating due to late
validation failures.

> The damon_commit_ctx() checks if 'min_region_sz' is a power-of-2.
> However, if an invalid input is provided via the DAMON_RECLAIM interface,
> the validation failure occurs too late, causing kdamond to terminate
> unexpectedly.

Is the root cause of this unexpected termination in damon_commit_ctx()
itself?

Looking at damon_commit_ctx() in mm/damon/core.c, it sets
dst->maybe_corrupted = true before running the is_power_of_2() check:

damon_commit_ctx()
    dst->maybe_corrupted = true;
    if (!is_power_of_2(src->min_region_sz))
        return -EINVAL;

If the check fails, it correctly returns -EINVAL but incorrectly leaves the
running context permanently marked as corrupted.  When kdamond_fn()
observes ctx->maybe_corrupted is true, it explicitly breaks its main loop,
terminating the thread.

Would it be safer to fix this regression by validating inputs before
mutating the maybe_corrupted state in damon_commit_ctx(), so that other
callers remain protected from unexpected daemon-crashing denial-of-service?

Also, does the DAMON sysfs interface bypass this min_region_sz validation,
allowing kdamond to run with invalid alignment?

In mm/damon/sysfs.c, a user can set addr_unit to a non-power-of-2 value via
the sysfs file. When turning DAMON on:

damon_sysfs_turn_damon_on()
  -> damon_sysfs_build_ctx()
    -> damon_sysfs_apply_inputs()

The code calculates ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / sys_ctx->addr_unit, 1)
but lacks the is_power_of_2() safeguard. It then calls damon_start() directly,
bypassing damon_commit_ctx() entirely.

Since DAMON heavily relies on ALIGN() and ALIGN_DOWN() macros with
min_region_sz, could a non-power-of-2 value produce corrupted mathematical
masks, leading to overlapping bounds and memory corruption?

> diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> index 86da147786583..2747eef5919d1 100644
> --- a/mm/damon/reclaim.c
> +++ b/mm/damon/reclaim.c
> @@ -204,6 +204,11 @@ static int damon_reclaim_apply_parameters(void)
>  	param_ctx->addr_unit = addr_unit;
>  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
>  
> +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> +		err = -EINVAL;
> +		goto out;
> +	}
> +
>  	if (!damon_reclaim_mon_attrs.aggr_interval) {
>  		err = -EINVAL;
>  		goto out;
>  	}

Can concurrent modifications of damon_reclaim_mon_attrs.aggr_interval via
sysfs cause a division by zero oops?

The global parameter is checked here, but it can be concurrently modified by
root without subsystem-level locks.

Subsequently, the function calls damon_reclaim_new_scheme() which executes:

damon_reclaim_new_scheme()
    .min_age_region = min_age / damon_reclaim_mon_attrs.aggr_interval;

If a concurrent sysfs write sets the interval to 0 immediately after the
check, could this re-fetch trigger an unhandled divide error?

Further down in damon_reclaim_apply_parameters(), the code calls:

    err = damon_set_region_biggest_system_ram_default(param_target,
                    &monitor_region_start,
                    &monitor_region_end,
                    ...);

Can a data race on monitor_region_start and monitor_region_end bypass
bounds checking?

These global variables are exported as module parameters and can be
concurrently written by userspace.

When passed to damon_set_region_biggest_system_ram_default(), the function
validates the pointers:

damon_set_region_biggest_system_ram_default()
    if (*start > *end)
        return -EINVAL;

Without READ_ONCE() wrappers, could these values be modified right after
the validation, satisfying the forbidden state where *start > *end?

If this happens, the invalid boundaries are blindly assigned to addr_range.
When the region is later sized, damon_sz_region() calculates
r->ar.end - r->ar.start, resulting in a massive underflowed unsigned long.

Could this cause an unbounded OOM loop inside kdamond_fn(), where the
region split condition while (damon_sz_region(r) > max_region_sz) executes
infinitely and constantly allocates new structures?

# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260403052837.58063-3-aethernet65535@gmail.com
#
# hkml [1] generated a draft of this mail.  It can be regenerated
# using below command:
#
#     hkml patch sashiko_dev --for_forwarding \
#             20260403052837.58063-3-aethernet65535@gmail.com
#
# [1] https://github.com/sjp38/hackermail

next prev parent reply	other threads:[~2026-04-03  8:33 UTC|newest]

Thread overview: 17+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-03  5:23 [PATCH v3 0/2] mm/damon: validate min_region_size to be power of 2 Liew Rui Yan
2026-04-03  5:23 ` [PATCH v3 1/2] mm/damon/lru_sort: " Liew Rui Yan
2026-04-03  8:31   ` (sashiko review) " Liew Rui Yan
2026-04-03 15:55     ` SeongJae Park
2026-04-04  9:09       ` Liew Rui Yan
2026-04-04 20:18         ` SeongJae Park
2026-04-03  8:59   ` Quanmin Yan
2026-04-03  9:50     ` Liew Rui Yan
2026-04-03 16:19   ` SeongJae Park
2026-04-04  9:11     ` Liew Rui Yan
2026-04-04 20:30       ` SeongJae Park
2026-04-06  9:41         ` Liew Rui Yan
2026-04-06 15:50           ` SeongJae Park
2026-04-03  5:23 ` [PATCH v3 2/2] mm/damon/reclaim: " Liew Rui Yan
2026-04-03  8:33   ` Liew Rui Yan [this message]
2026-04-03 16:07     ` (sashiko review) " SeongJae Park
2026-04-03 16:20   ` SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260403083322.5852-1-aethernet65535@gmail.com \
    --to=aethernet65535@gmail.com \
    --cc=damon@lists.linux.dev \
    --cc=linux-mm@kvack.org \
    --cc=sj@kernel.org \
    --cc=stable@vger.kernel.org \
    --cc=yanquanmin1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox