From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id 0B6DF30B50F for ; Sun, 26 Apr 2026 22:04:09 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777241050; cv=none; b=At75xltmsi05Yczk4agFXUYzlwNaSkz6L2WwuVN/rZC9F7NLpO/1y2oRRi3HQPtHuBskgJFpE2O7dhJtl9Pw7caFOO4B1kce6WMhlQQBdS+IJNKlNqmzx87h1to8k8Zk7XD7UbQlm0u9mhjFWR17llnkYTaC0ud1hM2Ha62xs3s= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1777241050; c=relaxed/simple; bh=ICSCqO+eCaERVOShAoqNpBTc6Wxu4AQJkUeIfIIgds4=; h=From:Subject:To:Cc:In-Reply-To:References:Content-Type:Date: Message-Id; b=V+YMiUZy9bLk+Wq6lg1XPtD990VkpR3lHuQXvAjG8mQAWBHkacPajyT6uCNVzXmpsvYTsEceTRu9OaVPhAIPZddRYzEMbeNi8jpaR3yvG/06uNNSFTKoG7NAVpLPW6KsB+sVY4Gyhozr1oAu2KS7TXJPMXdpdwz6KmDFkUrzsyA= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=IIJlGryR; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="IIJlGryR" Received: by smtp.kernel.org (Postfix) with ESMTPSA id 95B26C2BCAF; Sun, 26 Apr 2026 22:04:09 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1777241049; bh=ICSCqO+eCaERVOShAoqNpBTc6Wxu4AQJkUeIfIIgds4=; h=From:Subject:Reply-To:To:Cc:In-Reply-To:References:Date:From; b=IIJlGryR6WI28E49Ilun18rLGWBkMA6l8nTKgC2n8kInME/2Ui1wz1IAkj6PdkHVa yqwEnmVzfc17zkn2sZTETOZer9SkJM+ZWkEGMda2/agYw6+rgW6EvuF4s15Wg7WxDw ijrKBsJaOyBNbkJz4v2pI+fhNvxyamgvCxymd7ETai53PY1IKe7tNuUJodNnaOoyJc Nlj3v/rWgkeHGQpXidWLcC2Szw9DKn1HLE7JRG1ZuzmHKeH0UGG2ZBHqeFLZVQd8bN xXe8JqiCJhlTg6Z8LUsVOlQpVAY9zo+8Xi27aWQ5DuHUA7s4NaBRqowi6ygKHUV5HK rXWCMgMYnIcbQ== From: sashiko-bot@kernel.org Subject: Re: [PATCH v9] mm/damon: add node_eligible_mem_bp goal metric Reply-To: sashiko@lists.linux.dev To: "Ravi Jonnalagadda" Cc: damon@lists.linux.dev In-Reply-To: <20260426213859.1835-1-ravis.opensrc@gmail.com> References: <20260426213859.1835-1-ravis.opensrc@gmail.com> Content-Type: text/plain; charset=utf-8 Content-Transfer-Encoding: quoted-printable Date: Sun, 26 Apr 2026 22:04:08 +0000 Message-Id: <20260426220409.95B26C2BCAF@smtp.kernel.org> Precedence: bulk X-Mailing-List: damon@lists.linux.dev List-Id: List-Subscribe: List-Unsubscribe: Sashiko AI review found 4 potential issue(s): - [High] Missing `cond_resched()` inside the inner address-scanning loop ca= uses CPU soft lockups. - [High] Direct use of `mult_frac()` with 64-bit `phys_addr_t` causes a lin= ker error on 32-bit architectures with PAE. - [High] Integer overflow in `mult_frac()` on 32-bit systems without PAE le= ads to incorrect metric calculations. - [High] Unbounded O(N) page-by-page memory scan in quota evaluation breaks= DAMON's low-overhead design. -- commit 078c85e7912c268fbbe87930fc285713a5b57d05 Author: Ravi Jonnalagadda mm/damon: add node_eligible_mem_bp goal metric This commit introduces a new DAMON quota goal metric, DAMOS_QUOTA_NODE_ELIGIBLE_MEM_BP, which tracks the ratio of scheme-eligible memory on a target NUMA node compared to total system-wide eligible memory. It operates by exhaustively scanning the physical address ranges of eligible regions to determine folio NUMA locality, returning the ratio in basis poin= ts. > diff --git a/mm/damon/core.c b/mm/damon/core.c > index 3dbbbfdeff71..b23b6bd53aba 100644 > --- a/mm/damon/core.c > +++ b/mm/damon/core.c [ ... ] > @@ -2287,7 +2306,111 @@ static unsigned long damos_get_node_memcg_used_bp( [ ... ] > +static phys_addr_t damos_calc_eligible_bytes(struct damon_ctx *c, > + struct damos *s, int nid, phys_addr_t *total) > +{ > + struct damon_target *t; > + struct damon_region *r; > + phys_addr_t total_eligible =3D 0; > + phys_addr_t node_eligible =3D 0; > + > + damon_for_each_target(t, c) { > + damon_for_each_region(r, t) { > + phys_addr_t addr, end_addr; > + > + if (!__damos_valid_target(r, s)) > + continue; > + > + /* Convert from core address units to physical bytes */ > + addr =3D (phys_addr_t)r->ar.start * c->addr_unit; > + end_addr =3D (phys_addr_t)r->ar.end * c->addr_unit; > + while (addr < end_addr) { Does this unbounded page-by-page scan break DAMON's low-overhead design? This inner loop iterates through every single page in all eligible regions during the frequent quota adjustment cycle. Scanning massive memory pools (e.g., hundreds of GBs) requires hundreds of millions of synchronous atomic refcount operations via damon_get_folio().=20 Should the NUMA distribution be estimated using sampling or region sizes instead? > + struct folio *folio; > + phys_addr_t folio_start, folio_end; > + phys_addr_t overlap_start, overlap_end; > + phys_addr_t counted; > + > + folio =3D damon_get_folio(PHYS_PFN(addr)); > + if (!folio) { > + addr =3D PAGE_ALIGN_DOWN(addr + PAGE_SIZE); > + continue; > + } > + > + /* > + * Calculate exact overlap between the region > + * [addr, end_addr) and the folio range. > + * The folio may start before addr if addr is > + * in the middle of a large folio. > + */ > + folio_start =3D PFN_PHYS(folio_pfn(folio)); > + folio_end =3D folio_start + folio_size(folio); > + > + overlap_start =3D max(addr, folio_start); > + overlap_end =3D min(end_addr, folio_end); > + > + if (overlap_end > overlap_start) { > + counted =3D overlap_end - overlap_start; > + total_eligible +=3D counted; > + if (folio_nid(folio) =3D=3D nid) > + node_eligible +=3D counted; > + } > + > + /* Advance past the entire folio */ > + addr =3D folio_end; > + folio_put(folio); > + } > + cond_resched(); Can missing cond_resched() inside the while loop cause CPU soft lockups? Because cond_resched() is placed outside the inner loop, and DAMON regions can span hundreds of gigabytes, the kdamond thread will monopolize the CPU for millions of iterations without yielding. Should a cond_resched() be added inside the address-scanning loop? > + } > + } > + > + *total =3D total_eligible; > + return node_eligible; > +} > + > +static unsigned long damos_get_node_eligible_mem_bp(struct damon_ctx *c, > + struct damos *s, int nid) > +{ > + phys_addr_t total_eligible =3D 0; > + phys_addr_t node_eligible; > + > + if (c->ops.id !=3D DAMON_OPS_PADDR) > + return 0; > + > + if (nid < 0 || nid >=3D MAX_NUMNODES || !node_online(nid)) > + return 0; > + > + node_eligible =3D damos_calc_eligible_bytes(c, s, nid, &total_eligible); > + > + if (!total_eligible) > + return 0; > + > + return mult_frac(node_eligible, 10000, total_eligible); Can this cause a linker error on 32-bit architectures with PAE? On 32-bit systems with CONFIG_PHYS_ADDR_T_64BIT enabled, phys_addr_t is a 64-bit integer. The mult_frac() macro performs native division using the type of its arguments. Since the kernel does not link 64-bit division routines (__udivdi3) on 32-bit platforms, this will cause a build failure. Could this also cause an integer overflow on 32-bit systems without PAE? When phys_addr_t is a 32-bit unsigned integer, the multiplication node_eligible * 10000 is performed in 32-bit arithmetic. If node_eligible exceeds roughly 429 KB (UINT_MAX / 10000), the multiplication will silently overflow, producing an incorrect ratio. > +} --=20 Sashiko AI review =C2=B7 https://sashiko.dev/#/patchset/20260426213859.1835= -1-ravis.opensrc@gmail.com?part=3D1