From: SeongJae Park <sj@kernel.org>
To: sashiko-bot@kernel.org
Cc: SeongJae Park <sj@kernel.org>,
"Ravi Jonnalagadda" <ravis.opensrc@gmail.com>,
damon@lists.linux.dev
Subject: Re: [RFC PATCH 5/5] mm/damon/paddr: add time budget to migration page walk
Date: Sun, 17 May 2026 16:46:14 -0700 [thread overview]
Message-ID: <20260517234616.89496-1-sj@kernel.org> (raw)
In-Reply-To: <20260516235545.18B71C2BCB7@smtp.kernel.org>
On Sat, 16 May 2026 23:55:44 +0000 sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 4 potential issue(s) to consider:
> - [High] The amortized time budget check can be permanently bypassed due to variable PFN strides and loop continues.
> - [High] Early termination causes permanent starvation for the tail end of large DAMON regions.
> - [High] The time budget targets the wrong phase, failing to cap the actual latency spike of the migration.
> - [High] PFN advancement after folio conversion permanently misaligns the scan.
> --
>
> commit 87860cbf6b4be5f161140aa9a1f9b7d0461955ba
> Author: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
>
> mm/damon/paddr: add time budget to migration page walk
>
> This patch adds a 100ms wall-clock time budget to damon_pa_migrate() to
> prevent long migration walks from causing high scheduling latencies. The
> time check is amortized by checking ktime_get() every 4096 pages.
>
> > diff --git a/mm/damon/paddr.c b/mm/damon/paddr.c
> > index e844c990987b9..a2565287bc10f 100644
> > --- a/mm/damon/paddr.c
> > +++ b/mm/damon/paddr.c
> [ ... ]
> > @@ -283,6 +293,11 @@ static unsigned long damon_pa_migrate(struct damon_region *r,
> > }
> > }
> >
> > + /* Time budget: keep kdamond responsive on long migration walks. */
> > + if (!(pfn & DAMON_PA_MIGRATE_TIME_CHECK_MASK) &&
> > + ktime_after(ktime_get(), deadline))
> > + break;
>
> Can this time check be permanently bypassed?
>
> If pfn advances by variable strides, such as when addr += folio_size(folio)
> adds a multi-page stride that jumps over the 4096-page boundary, the exact
> match !(pfn & DAMON_PA_MIGRATE_TIME_CHECK_MASK) is missed, skipping the
> check for the entire next 4096-page cycle.
>
> Additionally, if pfn does land exactly on a multiple of 4096, but the
> pageblock is skipped via the continue statement just above this block:
>
> if (!page) {
> addr += pageblock_nr_pages * PAGE_SIZE;
> continue;
> }
>
> the loop advances and leaps over the time check.
>
> If the loop terminates early here, is there a mechanism to save the PFN
> where the scan aborted?
>
> Since the DAMON core is unaware of this internal early return, the next time
> the scheme applies to this region, it appears it will restart the walk from
> the very beginning of the region.
>
> Could this result in a permanent livelock for large regions, where memory
> beyond the 100ms scan horizon is never reached or migrated?
>
> > folio = damon_get_folio(pfn);
> > if (damon_pa_invalid_damos_folio(folio, s)) {
> [ ... ]
> > if (!folio_isolate_lru(folio))
> > goto put_folio;
> > list_add(&folio->lru, &folio_list);
> > put_folio:
> > addr += folio_size(folio);
>
> This wasn't introduced by this patch, but does the address advancement
> correctly handle tail pages?
>
> Since the region start addr might not be folio-aligned, if it falls within
> a tail page of a large folio, does adding the total folio_size(folio)
> overshoot the end of the current folio and skip the head page of the next
> folio?
As I previously replied [1], I will work on this.
>
> > folio_put(folio);
> > }
> > applied = damon_migrate_pages(&folio_list, s->target_nid);
>
> Does the time budget cap the actual latency spike of the migration?
>
> The 100ms budget limits the fast folio isolation phase. The loop can isolate
> gigabytes of memory within 100ms. Once the deadline is reached, all
> collected folios are passed to damon_migrate_pages().
>
> Could migrating this massive batch, which requires memory allocation and
> page copying, take seconds to complete in a single blocking call and fail
> to keep the migration responsive?
I will revisit other Sashiko's detailed comments once we finish the high level
discussion and decide to pursue this direction.
[1] https://lore.kernel.org/20260517234112.89245-1-sj@kernel.org
>
> --
> Sashiko AI review · https://sashiko.dev/#/patchset/20260516210357.2247-1-ravis.opensrc@gmail.com?part=5
Thanks,
SJ
next prev parent reply other threads:[~2026-05-17 23:46 UTC|newest]
Thread overview: 27+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-05-16 21:03 [RFC PATCH 0/5] mm/damon: DAMOS quota controller and paddr migration walk fixes Ravi Jonnalagadda
2026-05-16 21:03 ` [RFC PATCH 1/5] mm/damon/core: fix nr_accesses_bp underflow in damon_moving_sum Ravi Jonnalagadda
2026-05-16 22:29 ` sashiko-bot
2026-05-17 18:21 ` SeongJae Park
2026-05-17 18:16 ` SeongJae Park
2026-05-16 21:03 ` [RFC PATCH 2/5] mm/damon/core: cap effective quota size to total monitored memory Ravi Jonnalagadda
2026-05-16 22:55 ` sashiko-bot
2026-05-17 18:48 ` SeongJae Park
2026-05-17 18:36 ` SeongJae Park
2026-05-18 5:22 ` Ravi Jonnalagadda
2026-05-19 0:38 ` SeongJae Park
2026-05-16 21:03 ` [RFC PATCH 3/5] mm/damon/core: floor effective quota size at minimum region size Ravi Jonnalagadda
2026-05-17 18:47 ` SeongJae Park
2026-05-20 18:37 ` Ravi Jonnalagadda
2026-05-21 0:36 ` SeongJae Park
2026-05-16 21:03 ` [RFC PATCH 4/5] mm/damon/paddr: skip free pageblocks in migration walk Ravi Jonnalagadda
2026-05-16 23:36 ` sashiko-bot
2026-05-17 23:41 ` SeongJae Park
2026-05-17 23:37 ` SeongJae Park
2026-05-18 5:38 ` Ravi Jonnalagadda
2026-05-19 1:14 ` SeongJae Park
2026-05-16 21:03 ` [RFC PATCH 5/5] mm/damon/paddr: add time budget to migration page walk Ravi Jonnalagadda
2026-05-16 23:55 ` sashiko-bot
2026-05-17 23:46 ` SeongJae Park [this message]
2026-05-17 23:43 ` SeongJae Park
2026-05-18 5:54 ` Ravi Jonnalagadda
2026-05-19 1:27 ` SeongJae Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260517234616.89496-1-sj@kernel.org \
--to=sj@kernel.org \
--cc=damon@lists.linux.dev \
--cc=ravis.opensrc@gmail.com \
--cc=sashiko-bot@kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox