From: SeongJae Park <sj@kernel.org>
To: sashiko-bot@kernel.org
Cc: SeongJae Park <sj@kernel.org>,
	"Ravi Jonnalagadda" <ravis.opensrc@gmail.com>,
	damon@lists.linux.dev
Subject: Re: [PATCH v7] mm/damon: add node_eligible_mem_bp goal metric
Date: Fri, 24 Apr 2026 17:45:31 -0700
Message-ID: <20260425004534.85531-1-sj@kernel.org>
In-Reply-To: <20260424205908.CF949C2BCB8@smtp.kernel.org>

On Fri, 24 Apr 2026 20:59:07 +0000 sashiko-bot@kernel.org wrote:

> Sashiko AI review found 4 potential issue(s):
> - [High] Integer Truncation and Arithmetic Overflow on 32-bit Architectures

No real problem.

> - [High] CPU Soft Lockups due to Unbounded Physical Address Iteration

Unlikely.

> - [Medium] Missing Core Filters Evaluation Causes Eligible Memory Over-counting

This is intentional.

> - [Medium] Unaligned Address Advancement Truncates Exact Overlap Calculation

I agree with Sashiko.

> --
> 
> commit d92efd581a7b3318becfa32765dc87547acb91c6
> Author: Ravi Jonnalagadda <ravis.opensrc@gmail.com>
> Subject: mm/damon: add node_eligible_mem_bp goal metric
> 
> This commit introduces a new metric node_eligible_mem_bp for DAMON schemes
> to control memory distribution across NUMA nodes based on the ratio of
> scheme-eligible memory. It calculates physical bytes per node using PADDR
> ops and integrates with the DAMOS quota goal system.
> 
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > index 3dbbbfdeff719f..4ac8052052613d 100644
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> 
> [ ... ]
> 
> > @@ -2287,7 +2305,112 @@ static unsigned long damos_get_node_memcg_used_bp(
> >  		numerator = i.totalram - used_pages;
> >  	return mult_frac(numerator, 10000, i.totalram);
> >  }
> > -#else
> > +
> > +#ifdef CONFIG_DAMON_PADDR
> > +/*
> > + * damos_calc_eligible_bytes() - Calculate raw eligible bytes per node.
> > + * @c:		The DAMON context.
> > + * @s:		The scheme.
> > + * @nid:	The target NUMA node id.
> > + * @total:	Output for total eligible bytes across all nodes.
> > + *
> > + * Iterates through each folio in eligible regions to accurately determine
> > + * which node the memory resides on. Returns eligible bytes on the specified
> > + * node and sets *total to the sum across all nodes.
> > + *
> > + * Note: This function requires damon_get_folio() from ops-common.c, which is
> > + * only available when CONFIG_DAMON_PADDR is enabled. It also requires the
> > + * context to be using PADDR operations for meaningful results.
> > + */
> > +static phys_addr_t damos_calc_eligible_bytes(struct damon_ctx *c,
> > +		struct damos *s, int nid, phys_addr_t *total)
> > +{
> > +	struct damon_target *t;
> > +	struct damon_region *r;
> > +	phys_addr_t total_eligible = 0;
> > +	phys_addr_t node_eligible = 0;
> > +
> > +	damon_for_each_target(t, c) {
> > +		damon_for_each_region(r, t) {
> > +			phys_addr_t addr, end_addr;
> > +
> > +			if (!__damos_valid_target(r, s))
> > +				continue;
> 
> Does this overcount eligible memory by skipping core filters?
> 
> The quota adjustment logic uses damos_core_filter_out() to exclude explicitly
> filtered regions, but it appears this calculation only checks
> __damos_valid_target(). Should damos_core_filter_out() be evaluated here as 
> well?

This is intentional behavior, as we discussed [1] on the previous version.

> 
> > +
> > +			/* Convert from core address units to physical bytes */
> > +			addr = (phys_addr_t)r->ar.start * c->addr_unit;
> > +			end_addr = (phys_addr_t)r->ar.end * c->addr_unit;
> > +			while (addr < end_addr) {
> > +				struct folio *folio;
> > +				phys_addr_t folio_start, folio_end;
> > +				phys_addr_t overlap_start, overlap_end;
> > +				phys_addr_t counted;
> > +
> > +				folio = damon_get_folio(PHYS_PFN(addr));
> > +				if (!folio) {
> > +					addr += PAGE_SIZE;
> > +					continue;
> > +				}
> 
> If addr is unaligned, does adding PAGE_SIZE preserve the unaligned offset and
> cause under-counting on the next valid folio?
> 
> If an unaligned fallback happens, the address might skip past a valid page
> boundary instead of aligning to it. When advancing the address without a
> valid folio, would it be better to align it to the next page boundary?

Makes sense.  I suggested using PAGE_ALIGN_DOWN() in my reply to the patch.
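
A minimal userspace sketch of the difference, assuming 4 KiB pages.  The helper
names and SKETCH_PAGE_SIZE are hypothetical and not from the patch; the second
helper only models what aligning down before advancing would do.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SIZE 4096ULL	/* assumption: 4 KiB base pages */

/* Advance by one page and keep any sub-page offset, as the patch does. */
static uint64_t advance_keep_offset(uint64_t addr)
{
	return addr + SKETCH_PAGE_SIZE;
}

/* Align down to the page start first, so the result is a page boundary. */
static uint64_t advance_to_next_page(uint64_t addr)
{
	assert((SKETCH_PAGE_SIZE & (SKETCH_PAGE_SIZE - 1)) == 0);
	return (addr & ~(SKETCH_PAGE_SIZE - 1)) + SKETCH_PAGE_SIZE;
}
```

With addr = 0x1800, the first helper yields 0x2800 and so skips the first half
of the page at 0x2000, while the second yields 0x2000.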

> 
> > +
> > +				/*
> > +				 * Calculate exact overlap between the region
> > +				 * [addr, end_addr) and the folio range.
> > +				 * The folio may start before addr if addr is
> > +				 * in the middle of a large folio.
> > +				 */
> 
> [ ... ]
> 
> > +				if (overlap_end > overlap_start) {
> > +					counted = overlap_end - overlap_start;
> > +					total_eligible += counted;
> > +					if (folio_nid(folio) == nid)
> > +						node_eligible += counted;
> > +				}
> > +
> > +				/* Advance past the entire folio */
> > +				addr = folio_end;
> > +				folio_put(folio);
> > +			}
> > +			cond_resched();
> > +		}
> > +	}
> 
> Could this cause CPU soft lockups for very large regions?
> 
> Since the while loop can iterate millions of times for large memory regions,
> should cond_resched() be placed inside the while loop instead of outside it?

Not likely.  Users can set min_nr_regions to limit the maximum size of each
region.  We might be able to make this more robust, but in my opinion it is not
a blocker for this patch.

> 
> > +
> > +	*total = total_eligible;
> > +	return node_eligible;
> > +}
> > +
> > +static unsigned long damos_get_node_eligible_mem_bp(struct damon_ctx *c,
> > +		struct damos *s, int nid)
> > +{
> > +	phys_addr_t total_eligible = 0;
> > +	phys_addr_t node_eligible;
> > +
> > +	if (c->ops.id != DAMON_OPS_PADDR)
> > +		return 0;
> > +
> > +	if (nid < 0 || nid >= MAX_NUMNODES || !node_online(nid))
> > +		return 0;
> > +
> > +	node_eligible = damos_calc_eligible_bytes(c, s, nid, &total_eligible);
> > +
> > +	if (!total_eligible)
> > +		return 0;
> > +
> > +	return mult_frac((unsigned long)node_eligible, 10000,
> > +			(unsigned long)total_eligible);
> > +}
> 
> Can this cast cause truncation and overflow on 32-bit architectures? 
> 
> Both node_eligible and total_eligible are phys_addr_t, which are 64-bit on
> PAE systems, but unsigned long is 32-bit.
> 
> If total_eligible is an exact multiple of 4GB, it would cast to 0 and trigger
> a divide-by-zero panic in mult_frac(). Additionally, mult_frac() multiplies
> by 10000 using 32-bit math, which might overflow for values larger than
> roughly 429 KB.
> 
> Should this use 64-bit division or shift the byte sizes down to page counts
> before calculating the fraction?

This is too much concern about overflow, in my opinion.  The overflow causes no
really critical problem.  Let's keep this as is for now.
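
For reference, the arithmetic Sashiko describes can be modeled in userspace as
below.  This is only a sketch: mult_frac32() imitates mult_frac() with a 32-bit
unsigned long, and ratio_bp_pages() is a hypothetical alternative that works on
64-bit page counts, assuming a page shift of 12.  Neither is from the patch.

```c
#include <assert.h>
#include <stdint.h>

#define SKETCH_PAGE_SHIFT 12	/* assumption: 4 KiB base pages */

/* Userspace model of mult_frac(x, n, d) with 32-bit unsigned long. */
static uint32_t mult_frac32(uint32_t x, uint32_t n, uint32_t d)
{
	assert(d != 0);
	uint32_t quot = x / d;
	uint32_t rem = x % d;

	/* rem * n wraps modulo 2^32 once rem exceeds roughly 429 KB */
	return quot * n + (rem * n) / d;
}

/* Hypothetical alternative: compute the ratio on 64-bit page counts. */
static uint64_t ratio_bp_pages(uint64_t node_bytes, uint64_t total_bytes)
{
	uint64_t node_pages = node_bytes >> SKETCH_PAGE_SHIFT;
	uint64_t total_pages = total_bytes >> SKETCH_PAGE_SHIFT;

	if (!total_pages)
		return 0;
	return node_pages * 10000 / total_pages;
}
```

In this model, 1 MiB out of 2 MiB no longer comes out as 5000 from
mult_frac32(), and a 4 GiB total truncates to zero when cast to 32 bits, while
ratio_bp_pages() returns 5000 for both 1 MiB of 2 MiB and 2 GiB of 4 GiB.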

> 
> [ ... ]
> 
> -- 
> Sashiko AI review · https://sashiko.dev/#/patchset/20260424203448.5040-1-ravis.opensrc@gmail.com?part=1
> 

[1] https://lore.kernel.org/20260405224550.76218-1-sj@kernel.org


Thanks,
SJ
