* [PATCH 6.12.y] mm/damon/core: disallow time-quota setting zero esz
From: SeongJae Park @ 2026-05-04 12:52 UTC (permalink / raw)
To: stable; +Cc: damon, SeongJae Park, Andrew Morton
When the throughput of a DAMOS scheme is very slow, DAMOS time quota can
make the effective size quota smaller than damon_ctx->min_region_sz. In
that case, damos_apply_scheme() will skip applying the action, because the
action is tried at region level, which requires a size of at least
min_region_sz.
That is, the quota is effectively exceeded for the quota charge window.
Because no action will be applied, the total_charged_sz and
total_charged_ns are also not updated. damos_set_effective_quota() will
try to update the effective size quota before starting the next charge
window. However, because the total_charged_sz and total_charged_ns have
not been updated, the throughput and effective size quota are also not
changed.
Since effective size quota can only be decreased, other effective size
quota update factors including DAMOS quota goals and size quota cannot
make any change, either.
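For illustration only (not part of the patch), the stuck state described
above can be sketched with a simplified user-space model. The struct and
function names below are hypothetical stand-ins, the model applies the
action to at most one region per charge window, and a 4096-byte minimum
region size is assumed in the usage note.

```c
#include <assert.h>

/* Hypothetical, simplified stand-in for the quota fields discussed above. */
struct quota_model {
	unsigned long ms;               /* time quota per charge window */
	unsigned long total_charged_sz; /* bytes the action was applied to */
	unsigned long total_charged_ns; /* time spent applying the action */
	unsigned long esz;              /* effective size quota in bytes */
};

/* Time-quota part of the effective size quota, without the fix. */
static void model_set_esz(struct quota_model *q)
{
	unsigned long throughput;

	assert(q->total_charged_ns); /* model assumes some charge history */
	/* bytes per millisecond, same formula as quoted in this thread */
	throughput = q->total_charged_sz * 1000000 / q->total_charged_ns;
	q->esz = throughput * q->ms;
}

/*
 * One charge window of the model: the action is applied to one region
 * only if esz covers a whole region; otherwise nothing is charged.
 */
static void model_charge_window(struct quota_model *q,
		unsigned long min_region_sz, unsigned long ns_per_region)
{
	if (q->esz >= min_region_sz) {
		q->total_charged_sz += min_region_sz;
		q->total_charged_ns += ns_per_region;
	}
	model_set_esz(q);
}
```

Starting from 4096 bytes charged in 2,000,000 ns with a 1 ms time quota,
model_set_esz() yields an esz of 2048 bytes. Because that is below the
assumed 4096-byte region size, every later model_charge_window() call
charges nothing and recomputes the same 2048 bytes.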
As a result, the scheme is unexpectedly deactivated until the user notices
and mitigates the situation. The users can mitigate this situation by
changing the time quota online or re-installing the scheme. While the
mitigation is somewhat straightforward, finding the situation would be
challenging, because DAMON does not provide good observability for that.
Even if such observability is provided, doing the additional monitoring
and the mitigation is somewhat cumbersome and not aligned to the intention
of the time quota. The time quota was intended to help reduce the user's
administration overhead.
Fix the problem by ensuring that the time quota-modified effective size
quota is always at least min_region_sz.
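As a rough user-space sketch of the effect of the fix (the helper name and
its parameters are hypothetical, and only the time-quota path is modeled),
the lower bound can be expressed as:

```c
#include <assert.h>

/*
 * Hypothetical helper: effective size quota derived from the time quota,
 * with the lower bound of one minimum-sized region applied.
 */
static unsigned long time_quota_esz(unsigned long charged_sz,
		unsigned long charged_ns, unsigned long quota_ms,
		unsigned long min_region_sz)
{
	unsigned long throughput, esz;

	assert(charged_ns); /* sketch assumes some charge history */
	throughput = charged_sz * 1000000 / charged_ns; /* bytes per ms */
	esz = throughput * quota_ms;
	/* never let the time quota push esz below one region */
	return esz < min_region_sz ? min_region_sz : esz;
}
```

With 4096 bytes charged in 2,000,000 ns and a 1 ms quota, the unclamped
value would be 2048 bytes; the sketch returns 4096 instead, so the next
charge window can apply the action to at least one region and refresh the
throughput estimate.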
The issue was discovered [1] by sashiko.
Link: https://lore.kernel.org/20260407003153.79589-1-sj@kernel.org
Link: https://lore.kernel.org/20260405192504.110014-1-sj@kernel.org [1]
Fixes: 1cd243030059 ("mm/damon/schemes: implement time quota")
Signed-off-by: SeongJae Park <sj@kernel.org>
Cc: <stable@vger.kernel.org> # 5.16.x
Signed-off-by: Andrew Morton <akpm@linux-foundation.org>
(cherry picked from commit 8bbde987c2b84f80da0853f739f0a920386f8b99)
Signed-off-by: SeongJae Park <sj@kernel.org>
---
mm/damon/core.c | 1 +
1 file changed, 1 insertion(+)
diff --git a/mm/damon/core.c b/mm/damon/core.c
index ed2b75023181..69f8244324b3 100644
--- a/mm/damon/core.c
+++ b/mm/damon/core.c
@@ -1577,6 +1577,7 @@ static void damos_set_effective_quota(struct damos_quota *quota)
 			esz = min(throughput * quota->ms, esz);
 		else
 			esz = throughput * quota->ms;
+		esz = max(DAMON_MIN_REGION, esz);
 	}
 
 	if (quota->sz && quota->sz < esz)
--
2.47.3
* Re: [PATCH 6.12.y] mm/damon/core: disallow time-quota setting zero esz
From: sashiko-bot @ 2026-05-04 18:02 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon
Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
- [High] The patch masks a severe pre-existing integer overflow in the time quota throughput calculation, which permanently throttles the scheme instead of fixing the root cause.
- [Medium] The patch forces the effective size quota to be at least DAMON_MIN_REGION when a time quota is configured, breaking the ability of quota goals to intentionally pause the scheme.
--
commit 648433abd4b11fd3f4a77018e7e47105abac1ce7
Author: SeongJae Park <sj@kernel.org>
mm/damon/core: disallow time-quota setting zero esz
When the throughput of a DAMOS scheme is very slow, the DAMOS time quota can
make the effective size quota smaller than the minimum region size, preventing
actions from being applied. This commit fixes the issue by setting the time
quota-modified effective size quota to at least DAMON_MIN_REGION.
Link: https://lore.kernel.org/20260407003153.79589-1-sj@kernel.org
Link: https://lore.kernel.org/20260405192504.110014-1-sj@kernel.org [1]
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index ed2b750231810..69f8244324b36 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -1577,6 +1577,7 @@ static void damos_set_effective_quota(struct damos_quota *quota)
>  			esz = min(throughput * quota->ms, esz);
>  		else
>  			esz = throughput * quota->ms;
> +		esz = max(DAMON_MIN_REGION, esz);
Does this cause a regression by breaking the ability of quota goals to
intentionally pause the scheme?
If a dynamic quota goal determines that actions should be paused and requests
an esz smaller than DAMON_MIN_REGION, this max() macro forcibly overrides it
and bumps the quota back to DAMON_MIN_REGION.
This causes the scheme to process at least 1 page per charge window instead
of fully pausing.
Could the max() boundary be applied exclusively to the throughput-derived quota
before it is min()'ed with the goal-derived quota?
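One possible reading of this question, as a purely illustrative user-space
sketch: the function names and the flattened parameters are hypothetical,
and neither function is the kernel code itself.

```c
#include <assert.h>

static unsigned long min_ul(unsigned long a, unsigned long b)
{
	return a < b ? a : b;
}

static unsigned long max_ul(unsigned long a, unsigned long b)
{
	return a > b ? a : b;
}

/* Ordering as in the patch: clamp after combining with the goal. */
static unsigned long esz_clamp_last(unsigned long goal_esz,
		unsigned long tput_esz, unsigned long min_region_sz,
		int has_goals)
{
	unsigned long esz = has_goals ? min_ul(tput_esz, goal_esz) : tput_esz;

	return max_ul(min_region_sz, esz);
}

/* Ordering asked about: clamp only the throughput-derived value. */
static unsigned long esz_clamp_first(unsigned long goal_esz,
		unsigned long tput_esz, unsigned long min_region_sz,
		int has_goals)
{
	assert(min_region_sz > 0);
	tput_esz = max_ul(min_region_sz, tput_esz);
	return has_goals ? min_ul(tput_esz, goal_esz) : tput_esz;
}
```

With a goal-derived value of 0, a throughput-derived value of 100000 and a
4096-byte minimum region, esz_clamp_last() returns 4096 while
esz_clamp_first() returns 0. Without goals, both orderings give the same
result.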
Additionally, does this code mask a pre-existing integer overflow regression
in the throughput calculation?
Looking earlier in damos_set_effective_quota():
	if (quota->total_charged_ns)
		throughput = quota->total_charged_sz * 1000000 /
			quota->total_charged_ns;
Because total_charged_sz and total_charged_ns are continuously accumulated
over the scheme's lifetime, total_charged_sz * 1000000 will overflow an
unsigned long on 64-bit systems after processing around 18.4 TB of memory.
When the numerator overflows, it wraps modulo 2^64 while total_charged_ns
continues to grow. This causes the calculated throughput to collapse to
near-zero.
Also, on 32-bit systems, total_charged_ns stores nanoseconds and overflows
a 32-bit unsigned long after 4.29 seconds of accumulated action time, causing
erratic throughput calculations.
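The figures above can be checked with a small user-space sketch. The helper
names are hypothetical, and fixed-width types are used to model a 64-bit
and a 32-bit unsigned long regardless of the host.

```c
#include <assert.h>
#include <stdint.h>

/* Largest byte count for which sz * 1000000 still fits in 64 bits. */
static uint64_t max_sz_before_overflow64(void)
{
	return UINT64_MAX / 1000000;
}

/* Same formula as the quoted code; the multiplication wraps modulo 2^64. */
static uint64_t throughput64(uint64_t charged_sz, uint64_t charged_ns)
{
	assert(charged_ns);
	return charged_sz * 1000000 / charged_ns;
}

/* Seconds of accumulated time after which a 32-bit ns counter wraps. */
static double ns_counter32_wrap_seconds(void)
{
	return ((double)UINT32_MAX + 1.0) / 1e9;
}
```

max_sz_before_overflow64() evaluates to 18446744073709 bytes, that is,
about 18.4 TB, and ns_counter32_wrap_seconds() to about 4.29 seconds. One
byte past the 64-bit limit, throughput64(18446744073710, 1000000000000)
evaluates to 0 because the numerator wraps to 448384, whereas the unwrapped
quotient would be 18446744.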
By clamping the output to DAMON_MIN_REGION here, the patch seems to permanently
throttle the scheme to process exactly 1 page per charge window once the
overflow occurs, rather than fixing the underlying throughput collapse.
>  	}
>
>  	if (quota->sz && quota->sz < esz)
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260504125220.15509-1-sj@kernel.org?part=1
* Re: [PATCH 6.12.y] mm/damon/core: disallow time-quota setting zero esz
From: SeongJae Park @ 2026-05-06 16:59 UTC (permalink / raw)
To: sashiko-bot; +Cc: SeongJae Park, damon
On Mon, 04 May 2026 18:02:15 +0000 sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> - [High] The patch masks a severe pre-existing integer overflow in the time quota throughput calculation, which permanently throttles the scheme instead of fixing the root cause.
> - [Medium] The patch forces the effective size quota to be at least DAMON_MIN_REGION when a time quota is configured, breaking the ability of quota goals to intentionally pause the scheme.
No issue for this patch. Please read below for more details.
> --
>
> commit 648433abd4b11fd3f4a77018e7e47105abac1ce7
> Author: SeongJae Park <sj@kernel.org>
>
> mm/damon/core: disallow time-quota setting zero esz
>
> When the throughput of a DAMOS scheme is very slow, the DAMOS time quota can
> make the effective size quota smaller than the minimum region size, preventing
> actions from being applied. This commit fixes the issue by setting the time
> quota-modified effective size quota to at least DAMON_MIN_REGION.
>
> Link: https://lore.kernel.org/20260407003153.79589-1-sj@kernel.org
> Link: https://lore.kernel.org/20260405192504.110014-1-sj@kernel.org [1]
>
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > index ed2b750231810..69f8244324b36 100644
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> > @@ -1577,6 +1577,7 @@ static void damos_set_effective_quota(struct damos_quota *quota)
> >  			esz = min(throughput * quota->ms, esz);
> >  		else
> >  			esz = throughput * quota->ms;
> > +		esz = max(DAMON_MIN_REGION, esz);
>
> Does this cause a regression by breaking the ability of quota goals to
> intentionally pause the scheme?
>
> If a dynamic quota goal determines that actions should be paused and requests
> an esz smaller than DAMON_MIN_REGION, this max() macro forcibly overrides it
> and bumps the quota back to DAMON_MIN_REGION.
>
> This causes the scheme to process at least 1 page per charge window instead
> of fully pausing.
>
> Could the max() boundary be applied exclusively to the throughput-derived quota
> before it is min()'ed with the goal-derived quota?
No, this should be fine. This is not exactly the technically intended
behavior, but letting DAMOS process one page per interval only when the time
quota is set should cause no real issue. Let's keep this behavior.
>
> Additionally, does this code mask a pre-existing integer overflow regression
> in the throughput calculation?
>
> Looking earlier in damos_set_effective_quota():
>
> 	if (quota->total_charged_ns)
> 		throughput = quota->total_charged_sz * 1000000 /
> 			quota->total_charged_ns;
>
> Because total_charged_sz and total_charged_ns are continuously accumulated
> over the scheme's lifetime, total_charged_sz * 1000000 will overflow an
> unsigned long on 64-bit systems after processing around 18.4 TB of memory.
>
> When the numerator overflows, it wraps modulo 2^64 while total_charged_ns
> continues to grow. This causes the calculated throughput to collapse to
> near-zero.
>
> Also, on 32-bit systems, total_charged_ns stores nanoseconds and overflows
> a 32-bit unsigned long after 4.29 seconds of accumulated action time, causing
> erratic throughput calculations.
>
> By clamping the output to DAMON_MIN_REGION here, the patch seems to permanently
> throttle the scheme to process exactly 1 page per charge window once the
> overflow occurs, rather than fixing the underlying throughput collapse.
Good catch, but not this patch's fault. Refer to my previous answer [1] to a
similar Sashiko question.
[1] https://lore.kernel.org/20260505150012.29007-1-sj@kernel.org
Thanks,
SJ
[...]