From: sashiko-bot@kernel.org
To: "Ravi Jonnalagadda" <ravis.opensrc@gmail.com>
Cc: damon@lists.linux.dev
Subject: Re: [PATCH v9] mm/damon: add node_eligible_mem_bp goal metric
Date: Sun, 26 Apr 2026 22:04:08 +0000 [thread overview]
Message-ID: <20260426220409.95B26C2BCAF@smtp.kernel.org> (raw)
In-Reply-To: <20260426213859.1835-1-ravis.opensrc@gmail.com>
Sashiko AI review found 4 potential issue(s):
- [High] Missing `cond_resched()` inside the inner address-scanning loop causes CPU soft lockups.
- [High] Direct use of `mult_frac()` with 64-bit `phys_addr_t` causes a linker error on 32-bit architectures with PAE.
- [High] Integer overflow in `mult_frac()` on 32-bit systems without PAE leads to incorrect metric calculations.
- [High] Unbounded O(N) page-by-page memory scan in quota evaluation breaks DAMON's low-overhead design.
--
commit 078c85e7912c268fbbe87930fc285713a5b57d05
Author: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
mm/damon: add node_eligible_mem_bp goal metric
This commit introduces a new DAMON quota goal metric,
DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP, which tracks the ratio of scheme-eligible
memory on a target NUMA node compared to total system-wide eligible memory.
It operates by exhaustively scanning the physical address ranges of eligible
regions to determine folio NUMA locality, returning the ratio in basis points.
> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index 3dbbbfdeff71..b23b6bd53aba 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
[ ... ]
> @@ -2287,7 +2306,111 @@ static unsigned long damos_get_node_memcg_used_bp(
[ ... ]
> +static phys_addr_t damos_calc_eligible_bytes(struct damon_ctx *c,
> + struct damos *s, int nid, phys_addr_t *total)
> +{
> + struct damon_target *t;
> + struct damon_region *r;
> + phys_addr_t total_eligible = 0;
> + phys_addr_t node_eligible = 0;
> +
> + damon_for_each_target(t, c) {
> + damon_for_each_region(r, t) {
> + phys_addr_t addr, end_addr;
> +
> + if (!__damos_valid_target(r, s))
> + continue;
> +
> + /* Convert from core address units to physical bytes */
> + addr = (phys_addr_t)r->ar.start * c->addr_unit;
> + end_addr = (phys_addr_t)r->ar.end * c->addr_unit;
> + while (addr < end_addr) {
Does this unbounded page-by-page scan break DAMON's low-overhead design?
This inner loop iterates through every single page in all eligible regions
during the frequent quota adjustment cycle. Scanning massive memory pools
(e.g., hundreds of GBs) requires hundreds of millions of synchronous atomic
refcount operations via damon_get_folio().
Should the NUMA distribution be estimated using sampling or region sizes
instead?
> + struct folio *folio;
> + phys_addr_t folio_start, folio_end;
> + phys_addr_t overlap_start, overlap_end;
> + phys_addr_t counted;
> +
> + folio = damon_get_folio(PHYS_PFN(addr));
> + if (!folio) {
> + addr = PAGE_ALIGN_DOWN(addr + PAGE_SIZE);
> + continue;
> + }
> +
> + /*
> + * Calculate exact overlap between the region
> + * [addr, end_addr) and the folio range.
> + * The folio may start before addr if addr is
> + * in the middle of a large folio.
> + */
> + folio_start = PFN_PHYS(folio_pfn(folio));
> + folio_end = folio_start + folio_size(folio);
> +
> + overlap_start = max(addr, folio_start);
> + overlap_end = min(end_addr, folio_end);
> +
> + if (overlap_end > overlap_start) {
> + counted = overlap_end - overlap_start;
> + total_eligible += counted;
> + if (folio_nid(folio) == nid)
> + node_eligible += counted;
> + }
> +
> + /* Advance past the entire folio */
> + addr = folio_end;
> + folio_put(folio);
> + }
> + cond_resched();
Can missing cond_resched() inside the while loop cause CPU soft lockups?
Because cond_resched() is placed outside the inner loop, and DAMON regions
can span hundreds of gigabytes, the kdamond thread will monopolize the CPU
for millions of iterations without yielding. Should a cond_resched() be
added inside the address-scanning loop?
> + }
> + }
> +
> + *total = total_eligible;
> + return node_eligible;
> +}
> +
> +static unsigned long damos_get_node_eligible_mem_bp(struct damon_ctx *c,
> + struct damos *s, int nid)
> +{
> + phys_addr_t total_eligible = 0;
> + phys_addr_t node_eligible;
> +
> + if (c->ops.id != DAMON_OPS_PADDR)
> + return 0;
> +
> + if (nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
> + return 0;
> +
> + node_eligible = damos_calc_eligible_bytes(c, s, nid, &total_eligible);
> +
> + if (!total_eligible)
> + return 0;
> +
> + return mult_frac(node_eligible, 10000, total_eligible);
Can this cause a linker error on 32-bit architectures with PAE?
On 32-bit systems with CONFIG_PHYS_ADDR_T_64BIT enabled, phys_addr_t is a
64-bit integer. The mult_frac() macro performs native division using the
type of its arguments. Since the kernel does not link 64-bit division
routines (__udivdi3) on 32-bit platforms, this will cause a build failure.
Could this also cause an integer overflow on 32-bit systems without PAE?
When phys_addr_t is a 32-bit unsigned integer, the multiplication
node_eligible * 10000 is performed in 32-bit arithmetic. If node_eligible
exceeds roughly 429 KB (UINT_MAX / 10000), the multiplication will silently
overflow, producing an incorrect ratio.
> +}
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260426213859.1835-1-ravis.opensrc@gmail.com?part=1
next prev parent reply other threads:[~2026-04-26 22:04 UTC|newest]
Thread overview: 5+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-04-26 21:38 [PATCH v9] mm/damon: add node_eligible_mem_bp goal metric Ravi Jonnalagadda
2026-04-26 22:04 ` sashiko-bot [this message]
2026-04-26 23:08 ` SeongJae Park
2026-04-26 23:03 ` SeongJae Park
2026-04-27 0:59 ` Ravi Jonnalagadda
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260426220409.95B26C2BCAF@smtp.kernel.org \
--to=sashiko-bot@kernel.org \
--cc=damon@lists.linux.dev \
--cc=ravis.opensrc@gmail.com \
--cc=sashiko@lists.linux.dev \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.