From: SeongJae Park <sj@kernel.org>
To: SeongJae Park <sj@kernel.org>
Cc: Andrew Morton <akpm@linux-foundation.org>,
damon@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: (sashiko review) [RFC PATCH v3 02/10] mm/damon/core: introduce failed region quota charge ratio
Date: Tue, 7 Apr 2026 08:51:10 -0700 [thread overview]
Message-ID: <20260407155110.51749-1-sj@kernel.org> (raw)
In-Reply-To: <20260407010536.83603-3-sj@kernel.org>
TL; DR: sashiko made a good finding that better to be addressed. I will
address it in the next revision. sashiko also made two findings but I will
ignore those.
Forwarding full sashiko review in a reply format with my inline comments below,
for sharing details of my view and doing followup discussions via mails if
needed.
> # review url: https://sashiko.dev/#/patchset/20260407010536.83603-3-sj@kernel.org
> # start of sashiko.dev inline review
> commit 09593130776383947d3d32b60b59b00c4bc2e692
> Author: SeongJae Park <sj@kernel.org>
>
> mm/damon/core: introduce failed region quota charge ratio
>
> This patch introduces a configuration to charge a different quota ratio
> for memory regions where DAMOS actions failed. It aims to improve fairness
> in quota accounting between successful and failed actions.
>
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > index c29f70bcce894..2a09d6bdf5f3e 100644
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> [ ... ]
> > @@ -2041,6 +2043,23 @@ static void damos_walk_cancel(struct damon_ctx *ctx)
> > mutex_unlock(&ctx->walk_control_lock);
> > }
> >
> > +static void damos_charge_quota(struct damos_quota *quota,
> > + unsigned long sz_region, unsigned long sz_applied)
> > +{
> > + /*
> > + * sz_applied could be bigger than sz_region, depending on ops
> > + * implementation of the action, e.g., damos_pa_pageout(). Charge only
> > + * the region size in the case.
> > + */
> > + if (!quota->fail_charge_denom || sz_applied > sz_region)
> > + quota->charged_sz += sz_region;
> > + else
> > + quota->charged_sz += sz_applied + mult_frac(
> > + (sz_region - sz_applied),
> > + quota->fail_charge_num,
> > + quota->fail_charge_denom);
> > +}
>
> Can this cause an integer overflow on 32-bit architectures?
>
> The fail_charge_num and fail_charge_denom are unsigned int and can be set
> to arbitrary values by userspace.
>
> If a user configures a large numerator (e.g., close to UINT_MAX), the
> multiplications inside mult_frac() could overflow the 32-bit unsigned long
> boundary on 32-bit architectures, leading to an incorrectly truncated value
> being added to quota->charged_sz.
sashiko is asking a same question again. As I replied on the previous version
thread, I will keep this as is for simplicity of the code, unless a human
argues differently.
>
> [ ... ]
> > @@ -2105,7 +2124,7 @@ static void damos_apply_scheme(struct damon_ctx *c, struct damon_target *t,
> > ktime_get_coarse_ts64(&end);
> > quota->total_charged_ns += timespec64_to_ns(&end) -
> > timespec64_to_ns(&begin);
> > - quota->charged_sz += sz;
> > + damos_charge_quota(quota, sz, sz_applied);
>
> Could this permanently disable a scheme if actions completely fail during the
> first time window?
>
> If fail_charge_num is 0 and all actions fail across all regions in the first
> charge window, quota->charged_sz evaluates to 0.
>
> At the end of the window, total_charged_sz becomes 0, while total_charged_ns
> records the actual time spent attempting. The calculated throughput
> becomes 0, which sets the effective size quota (esz) to 0 for the
> next window.
>
> With esz = 0, damos_quota_is_full() always evaluates to true, causing DAMON
> to skip applying the scheme to any regions. Since no regions are processed,
> total_charged_ns and total_charged_sz never increase again, permanently
> bricking the scheme.
sashiko is asking a same question again. As I replied to the previous
question, this is not cuased by this patch. So I'm working separately for the
issue. I recently posted the fix:
https://lore.kernel.org/20260407003153.79589-1-sj@kernel.org
>
> Does this code create an infinite region splitting loop if an action fails
> near the quota limit?
>
> In damos_apply_scheme(), if the current region size exceeds the remaining
> quota, it splits the region into sz and the remainder:
>
> sz = ALIGN_DOWN(quota->esz - quota->charged_sz, c->min_region_sz);
> damon_split_region_at(t, r, sz);
>
> If the action on the split region fails, and fail_charge_num is less than
> fail_charge_denom, charged_sz increases by less than sz (or by 0).
> Because charged_sz did not increase fully, the remaining quota remains at
> least min_region_sz, so damos_quota_is_full() evaluates to false.
>
> When the scheme loop proceeds to the remainder of the region, it will split
> it again by the same sz. Could this repeat until the entire target memory
> is shattered into tiny regions, causing a massive increase in allocations
> and a kernel lockup?
Good finding. Yes, this can increase number of region splits on the code path.
I think it should be safe from lockup, since each DAMOS action implementation
uses cond_resched(). But the increased number of regions makes me worried. I
think we should merge the splitted region back, after DAMOS action is applied.
I will add the change in the next revision.
>
> > if (damos_quota_is_full(quota, c->min_region_sz)) {
> > quota->charge_target_from = t;
> > quota->charge_addr_from = r->ar.end + 1;
> > }
>
>
> # end of sashiko.dev inline review
> # review url: https://sashiko.dev/#/patchset/20260407010536.83603-3-sj@kernel.org
Thanks,
SJ
next prev parent reply other threads:[~2026-04-07 15:51 UTC|newest]
Thread overview: 16+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-07 1:05 [RFC PATCH v3 00/10] mm/damon: introduce DAMOS failed region quota charge ratio SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 01/10] mm/damon/core: handle <min_region_sz remaining quota as empty SeongJae Park
2026-04-07 14:12 ` (sashiko review) " SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 02/10] mm/damon/core: introduce failed region quota charge ratio SeongJae Park
2026-04-07 15:51 ` SeongJae Park [this message]
2026-04-07 1:05 ` [RFC PATCH v3 03/10] mm/damon/sysfs-schemes: implement fail_charge_{num,denom} files SeongJae Park
2026-04-07 15:55 ` (sashiko review) " SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 04/10] Docs/mm/damon/design: document fail_charge_{num,denom} SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 05/10] Docs/admin-guide/mm/damon/usage: document fail_charge_{num,denom} files SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 06/10] Docs/ABI/damon: document fail_charge_{num,denom} SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 07/10] mm/damon/tests/core-kunit: test fail_charge_{num,denom} committing SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 08/10] selftests/damon/_damon_sysfs: support failed region quota charge ratio SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 09/10] selftests/damon/drgn_dump_damon_status: " SeongJae Park
2026-04-07 1:05 ` [RFC PATCH v3 10/10] selftests/damon/sysfs.py: test " SeongJae Park
2026-04-08 16:48 ` [RFC PATCH v3 00/10] mm/damon: introduce DAMOS " Bijan Tabatabai
2026-04-09 0:00 ` SeongJae Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260407155110.51749-1-sj@kernel.org \
--to=sj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=damon@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.