public inbox for linux-mm@kvack.org
 help / color / mirror / Atom feed
From: SeongJae Park <sj@kernel.org>
To: Liew Rui Yan <aethernet65535@gmail.com>
Cc: SeongJae Park <sj@kernel.org>, damon@lists.linux.dev, linux-mm@kvack.org
Subject: Re: (sashiko review) [PATCH v4 2/2] mm/damon/reclaim: validate min_region_size to be power of 2
Date: Fri, 10 Apr 2026 06:44:50 -0700	[thread overview]
Message-ID: <20260410134451.81846-1-sj@kernel.org> (raw)
In-Reply-To: <20260410100822.196999-1-aethernet65535@gmail.com>

On Fri, 10 Apr 2026 18:08:22 +0800 Liew Rui Yan <aethernet65535@gmail.com> wrote:

> Sashiko found some issues.
> 
> Forwarding sashiko review in a reply format with my inline
> comments below, for details of my view and doing discussions
> via mails if needed.
> 
> > # review url: https://sashiko.dev/#/patchset/20260410044259.95877-3-aethernet65535@gmail.com
> > # start of sashiko.dev inline review
> > commit 7a816b41f4b321b6a513227bef0e50f3bc1e86f1
> > Author: Liew Rui Yan <aethernet65535@gmail.com>
> > 
> > mm/damon/reclaim: validate min_region_size to be power of 2
> > 
> > This commit adds an early validation in damon_reclaim_apply_parameters() to
> > check if min_region_sz is a power of 2. It attempts to prevent kdamond from
> > terminating unexpectedly by catching invalid values before they reach
> > damon_commit_ctx().
> > 
> > > diff --git a/mm/damon/reclaim.c b/mm/damon/reclaim.c
> > > index 86da147786583..2747eef5919d1 100644
> > > --- a/mm/damon/reclaim.c
> > > +++ b/mm/damon/reclaim.c
> > > @@ -204,6 +204,11 @@ static int damon_reclaim_apply_parameters(void)
> > >  	param_ctx->addr_unit = addr_unit;
> > >  	param_ctx->min_region_sz = max(DAMON_MIN_REGION_SZ / addr_unit, 1);
> > 
> > Is it possible for a concurrent sysfs write to change addr_unit between these
> > two reads, allowing an invalid value to bypass the new validation?
> > 
> > The global variable addr_unit is read twice without READ_ONCE(), once to set
> > param_ctx->addr_unit, and again to calculate param_ctx->min_region_sz.
> > 
> > If addr_unit is changed from a non-power-of-2 to a power-of-2 between the two
> > reads, the validation below might succeed using the new value, while the
> > invalid addr_unit is committed to the active context.
> 
> I believe this is possible, because the current call to
> damon_reclaim_apply_parameters() is asynchronous and does not hold the
> 'kernel_param_lock' during execution, so this situation should be
> possible.
> 
> However, I'm still unsure whether your suggested changes should be
> included in this patch.

We already addressed this with by defining it as an expected behavior that the
user should avoid.  And the queued patch [2] from Liew will make commits
synchronous and fix this issue together for the mainline.  Let me know if I'm
missing something.

> 
> > 
> > > +	if (!is_power_of_2(param_ctx->min_region_sz)) {
> > > +		err = -EINVAL;
> > > +		goto out;
> > > +	}
> > > +
> > 
> > Does this early check mask a structural issue in the core API?
> > 
> > Looking at damon_commit_ctx() in mm/damon/core.c:
> > 
> > int damon_commit_ctx(struct damon_ctx *dst, struct damon_ctx *src)
> > {
> >         ...
> >         dst->maybe_corrupted = true;
> >         if (!is_power_of_2(src->min_region_sz))
> >                 return -EINVAL;
> >         ...
> > 
> > If a different caller passes an invalid min_region_sz, it will still trigger
> > this same trap.
> > 
> > Could the root cause be addressed by moving the maybe_corrupted flag update
> > below the validation checks inside damon_commit_ctx()?
> > 
> > Are there other ways damon_commit_ctx() could leave the context corrupted?
> > 
> > If a user dynamically enables features that require memory allocation,
> > damon_commit_ctx() calls damon_commit_schemes() which attempts to allocate
> > memory via damos_commit_quota_goals().
> > 
> > If this allocation fails with -ENOMEM, it returns an error but leaves
> > dst->maybe_corrupted = true on the active context, permanently terminating
> > kdamond.
> > 
> 
> I discussed this in my previous reply. But I realized there might have
> been some misundertanding in my previous explanation. What I wanted to
> say is that there might indeed be some issues with 'maybe_corrupted' in
> damon_commit_ctx() that need adjustment, but that's not within the scope
> of this patch.

I agree.

> 
> > Can a concurrent sysfs write cause a divide-by-zero panic in this same
> > function?
> > 
> > Looking further down in damon_reclaim_apply_parameters():
> > 
> >         if (quota_mem_pressure_us) {
> >                 goal = damos_new_quota_goal(DAMOS_QUOTA_SOME_MEM_PSI_US,
> >                                 quota_mem_pressure_us);
> > 
> > The global module parameter quota_mem_pressure_us is read twice without
> > READ_ONCE().
> > 
> > If a concurrent write changes its value to 0 immediately after the if
> > condition passes, a quota goal with a target_value of 0 is added.
> > 
> > When DAMON evaluates the quota score in damos_quota_score(), it uses the
> > macro mult_frac(goal->current_value, 10000, goal->target_value). Dividing
> > by a goal->target_value of 0 will cause a kernel panic.
> 
> I tried to reproduce this issue, but without success. 
> 
> However, based on the code, I believe the problem exists because I
> didn't find any checks preventing 'goal->target_value=0'. Please let me
> know if I've missed anything.

As I mentioned above, it is unallowed user inputs [1], and Liew's queued patch
[2] will fix this for mainline.

> 
> > 
> > A similar sequence exists for quota_autotune_feedback.
> > 
> > >  	if (!damon_reclaim_mon_attrs.aggr_interval) {
> > >  		err = -EINVAL;
> > >  		goto out;
> > >  	}
> > 
> > 
> > # end of sashiko.dev inline review
> > # review url: https://sashiko.dev/#/patchset/20260410044259.95877-3-aethernet65535@gmail.com
> 
> # hkml [1] generated a draft of this mail.  You can regenerate
> # this using below command:
> #
> #     hkml patch sashiko_dev --for_forwarding \
> #             20260410044259.95877-3-aethernet65535@gmail.com
> #
> # [1] https://github.com/sjp38/hackermail
> 
> Best regards,
> Rui Yan

[1] https://lkml.kernel.org/r/20260329153052.46657-2-sj@kernel.org
[2] https://lore.kernel.org/20260402155733.77050-4-sj@kernel.org


Thanks,
SJ


  reply	other threads:[~2026-04-10 13:44 UTC|newest]

Thread overview: 12+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-04-10  4:42 [PATCH v4 0/2] mm/damon: validate min_region_size to be power of 2 Liew Rui Yan
2026-04-10  4:42 ` [PATCH v4 1/2] mm/damon/lru_sort: " Liew Rui Yan
2026-04-10  9:40   ` (sashiko review) " Liew Rui Yan
2026-04-10 13:55     ` SeongJae Park
2026-04-10 16:46       ` Liew Rui Yan
2026-04-10 17:00         ` SeongJae Park
2026-04-10 13:56   ` SeongJae Park
2026-04-10  4:42 ` [PATCH v4 2/2] mm/damon/reclaim: " Liew Rui Yan
2026-04-10 10:08   ` (sashiko review) " Liew Rui Yan
2026-04-10 13:44     ` SeongJae Park [this message]
2026-04-10 13:57   ` SeongJae Park
2026-04-10 14:05 ` [PATCH v4 0/2] mm/damon: " SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260410134451.81846-1-sj@kernel.org \
    --to=sj@kernel.org \
    --cc=aethernet65535@gmail.com \
    --cc=damon@lists.linux.dev \
    --cc=linux-mm@kvack.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox