From: SeongJae Park <sj@kernel.org>
To: SeongJae Park <sj@kernel.org>
Cc: sashiko-bot@kernel.org, damon@lists.linux.dev
Subject: Re: [RFC PATCH v1.2 03/17] mm/damon/core: always update ->last_nr_accesses for intervals change
Date: Sun, 21 Jun 2026 14:13:02 -0700
Message-ID: <20260621211302.11854-1-sj@kernel.org>
In-Reply-To: <20260621202307.10276-1-sj@kernel.org>

On Sun, 21 Jun 2026 13:23:16 -0700 SeongJae Park <sj@kernel.org> wrote:

> On Sun, 21 Jun 2026 16:11:04 +0000 sashiko-bot@kernel.org wrote:
> 
> > Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> > 
> > Pre-existing issues:
> > - [High] Arithmetic overflow in DAMON access scaling functions when interval ratio is large.
> > - [High] The patch attempts to fix a 'wrong value' returned by `damon_nr_accesses_mvsum()`, but misdiagnoses the root cause and leaves a major double-counting bug unresolved.
> > --
> > 
> > commit 6636564dd42488eb260d7102569887a00bdd3d71
> > Author: SeongJae Park <sj@kernel.org>
> > 
> > mm/damon/core: always update ->last_nr_accesses for intervals change
> > 
> > This commit modifies damon_update_monitoring_result() to unconditionally
> > update last_nr_accesses when aggregation intervals change. It aims to
> > prevent incorrect calculations in damon_nr_accesses_mvsum() that could occur
> > if called with a stale last_nr_accesses value alongside a fresh aggregation
> > timestamp.
> > 
> > > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > > index addd14e4e70bf..bc40a6d10f277 100644
> > > --- a/mm/damon/core.c
> > > +++ b/mm/damon/core.c
> > > @@ -873,6 +873,8 @@ static void damon_update_monitoring_result(struct damon_region *r,
> > >  		struct damon_attrs *old_attrs, struct damon_attrs *new_attrs,
> > >  		bool aggregating)
> > >  {
> > > +	r->last_nr_accesses = damon_nr_accesses_for_new_attrs(
> > > +			r->last_nr_accesses, old_attrs, new_attrs);
> > 
> > [Severity: High]
> > This is a pre-existing issue, but does this scaling trigger an arithmetic
> > overflow when the interval ratio is large?
> > 
> > When damon_nr_accesses_for_new_attrs() is called, it performs 32-bit
> > arithmetic in damon_nr_accesses_to_accesses_bp() and
> > damon_accesses_bp_to_nr_accesses().
> > 
> > Specifically:
> > 
> > mm/damon/core.c:damon_nr_accesses_to_accesses_bp() {
> >     ...
> >     return mult_frac(nr_accesses, 10000, damon_max_nr_accesses(attrs));
> > }
> > 
> > mm/damon/core.c:damon_accesses_bp_to_nr_accesses() {
> >     ...
> >     return accesses_bp * damon_max_nr_accesses(attrs) / 10000;
> > }
> > 
> > If the ratio of aggr_interval to sample_interval exceeds 429,496 (for
> > example, a 1s aggregation and 1us sample), the intermediate multiplications
> > like r * 10000 inside mult_frac() are computed as 32-bit unsigned int,
> > which overflows the 32-bit maximum.
> > 
> > Can this silently corrupt the region access counts upon attribute updates,
> > leading to completely incorrect monitoring results?
> 
> Known low priority issue.  Not a blocker of this patch.
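
For illustration only, the wrap-around described above can be reproduced in
userspace.  This is a minimal sketch, not the kernel code: mult_frac_u32() and
mult_frac_u64() are hypothetical stand-ins that mimic the quotient/remainder
shape of mult_frac() with 32-bit and 64-bit intermediates, and the 1,000,000
maximum assumes the 1s aggregation and 1us sampling example from the review.

```c
#include <stdint.h>

/*
 * Hypothetical stand-in for mult_frac() on 32-bit operands: x * n / d is
 * computed as q * n + r * n / d.  The r * n product can still wrap once
 * r exceeds UINT32_MAX / n.
 */
static uint32_t mult_frac_u32(uint32_t x, uint32_t n, uint32_t d)
{
	uint32_t q = x / d;
	uint32_t r = x % d;

	return q * n + r * n / d;
}

/* Same computation with 64-bit intermediates, for comparison. */
static uint32_t mult_frac_u64(uint32_t x, uint32_t n, uint32_t d)
{
	uint64_t q = x / d;
	uint64_t r = x % d;

	return (uint32_t)(q * n + r * n / d);
}
```

With a maximum of 1,000,000 accesses, mult_frac_u32(500000, 10000, 1000000)
yields 705 rather than the 5000 that mult_frac_u64() returns, because
500000 * 10000 does not fit in 32 bits.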
> 
> > 
> > 
> > [Severity: High]
> > This isn't a bug introduced by this patch, but does scaling
> > last_nr_accesses fully resolve the incorrect values returned by
> > damon_nr_accesses_mvsum(), or does it leave a double-counting flaw intact?
> > 
> > The commit message notes that damon_nr_accesses_mvsum() can return the
> > wrong value. However, the root cause appears to be a state desynchronization
> > rather than just a stale last_nr_accesses.
> > 
> > When damon_set_attrs() is called, it advances ctx->next_aggregation_sis
> > into the future:
> > 
> > mm/damon/core.c:damon_set_attrs() {
> >     ...
> >     ctx->next_aggregation_sis = ctx->passed_sample_intervals +
> >                                 attrs->aggr_interval / sample_interval;
> > }
> > 
> > If a subsequent call to damon_nr_accesses_mvsum() evaluates this future
> > timestamp, it assumes it is at the start of a new interval, meaning
> > left_window_bp evaluates to 10000.
> > 
> > mm/damon/core.c:damon_nr_accesses_mvsum() {
> >     ...
> >     return damon_mvsum(r->nr_accesses, r->last_nr_accesses, left_window_bp);
> > }
> > 
> > Because kdamond_reset_aggregated() has not yet executed, r->nr_accesses
> > still holds the fully accumulated count of the just-completed interval.
> > This means damon_mvsum() adds two full intervals' worth of access counts
> > together, effectively calculating:
> > r->nr_accesses + (r->last_nr_accesses * 10000 / 10000).
> > 
> > Does unconditionally scaling last_nr_accesses merely adjust the second term
> > of this sum while preserving the underlying double-counting logic flaw?
> 
> Good point.  In the long term, damon_update_monitoring_result() should modify
> nr_accesses.  We cannot do that at this moment, since we have to support both
> nr_accesses_bp and damon_nr_accesses_mvsum().  I will add a temporary
> preparatory hack to damon_nr_accesses_mvsum() for this issue, like below.
> 
> '''
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
> @@ -268,6 +268,8 @@ unsigned int damon_nr_accesses_mvsum(struct damon_region *r,
>                         ctx->passed_sample_intervals;
>         left_window_bp = mult_frac(left_window, 10000, window_len);
> 
> +       if (left_window_bp == 10000)
> +               return r->last_nr_accesses;
>         return damon_mvsum(r->nr_accesses, r->last_nr_accesses,
>                         left_window_bp);
>  }
> '''
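
To make the effect of the guard in the hunk above concrete, here is a small
userspace model.  It is a sketch under assumptions: the body of mvsum_model()
follows only the review's description of damon_mvsum() (current count plus the
previous count weighted by left_window_bp), and every name ending in _model,
as well as the guarded flag, is hypothetical.

```c
#include <stdint.h>

/* Hypothetical model of the two per-region counters discussed here. */
struct region_model {
	uint32_t nr_accesses;		/* accumulated in the current window */
	uint32_t last_nr_accesses;	/* result of the previous window */
};

/* Assumed shape of damon_mvsum(), taken from the review's description. */
static uint32_t mvsum_model(uint32_t nr, uint32_t last, uint32_t left_bp)
{
	return nr + (uint32_t)((uint64_t)last * left_bp / 10000);
}

/* Remaining part of the window in basis points, as in the quoted hunk. */
static uint32_t left_window_bp_model(uint64_t next_aggregation_sis,
		uint64_t passed_sample_intervals, uint32_t window_len)
{
	uint64_t left_window = next_aggregation_sis - passed_sample_intervals;

	return (uint32_t)(left_window * 10000 / window_len);
}

/* guarded != 0 applies the early return from the proposed hack. */
static uint32_t nr_accesses_mvsum_model(const struct region_model *r,
		uint32_t left_bp, int guarded)
{
	if (guarded && left_bp == 10000)
		return r->last_nr_accesses;
	return mvsum_model(r->nr_accesses, r->last_nr_accesses, left_bp);
}
```

For a region with both counters at 20 and a full window left (10000 bp), the
unguarded model returns 40, which is the double counting the review points
out, while the guarded one returns 20.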
> 
> At the end of the series, we will update damon_update_monitoring_result() to
> always reset nr_accesses, and remove this temporary hack.

No, I will not.

After kdamond_call(), which could call damon_set_attrs(), if it was the last
iteration for the current aggregation, kdamond_fn() will still invoke the
aggregation interval operations, based on the timestamps that were cached
before damon_set_attrs().  Those operations include DAMOS (if the apply
interval is aligned with the aggregation interval), regions split, regions
reset and intervals auto-tuning.  Some of those operations still use
nr_accesses, so nr_accesses cannot be unconditionally reset in
damon_update_monitoring_result().

I will get this series merged with the damon_nr_accesses_mvsum() change, and
clean up and simplify the logic after this series.
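
The ordering concern above can be shown with a toy userspace model.  This is
only a sketch: all names ending in _model are hypothetical, the reset is
modeled as simply zeroing nr_accesses, and a single threshold check stands in
for the aggregation interval operations that still read nr_accesses.

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical model of the two per-region counters. */
struct region_model {
	uint32_t nr_accesses;
	uint32_t last_nr_accesses;
};

/*
 * Hypothetical stand-in for the attrs update path.  reset_nr_accesses
 * models the idea of always resetting nr_accesses there.
 */
static void update_monitoring_result_model(struct region_model *r,
		bool reset_nr_accesses)
{
	if (reset_nr_accesses)
		r->nr_accesses = 0;
}

/* Hypothetical aggregation-time consumer, e.g. a scheme threshold check. */
static bool scheme_matches_model(const struct region_model *r,
		uint32_t min_nr_accesses)
{
	return r->nr_accesses >= min_nr_accesses;
}

/*
 * Last iteration of an aggregation window: the attrs update runs first,
 * and the aggregation-time work still runs afterwards.
 */
static bool last_iteration_model(struct region_model *r,
		bool reset_nr_accesses, uint32_t min_nr_accesses)
{
	update_monitoring_result_model(r, reset_nr_accesses);
	return scheme_matches_model(r, min_nr_accesses);
}
```

With nr_accesses at 15 and a threshold of 10, the model matches when the
counter is left alone and stops matching when it is reset before the check,
which is the kind of breakage an unconditional reset could cause.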


Thanks,
SJ
