* [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid
@ 2026-06-09  6:46 Zenghui Yu
  2026-06-09  6:58 ` sashiko-bot
  2026-06-09 14:49 ` SeongJae Park
  0 siblings, 2 replies; 5+ messages in thread
From: Zenghui Yu @ 2026-06-09  6:46 UTC (permalink / raw)
  To: damon, linux-mm, linux-kernel
  Cc: sj, akpm, wangzhigang17, liqiqi23, Zenghui Yu

The comment on top of `struct damon_region` clearly says that

    For any use case, @ar should be non-zero positive size.

which is now verified in damon_verify_new_region() if the kernel is built
with DAMON_DEBUG_SANITY.

The WARN_ONCE() can be triggered if the mtier sample module is enabled
before node{0,1}_{start,end}_addr have been properly initialized, which is
obviously not good.

 ------------[ cut here ]------------
 start 0 >= end 0
 WARNING: mm/damon/core.c:217 at damon_new_region+0xf4/0x118, CPU#59: bash/341468
 Call trace:
  damon_new_region+0xf4/0x118 (P)
  damon_set_regions+0xfc/0x3c0
  damon_sample_mtier_build_ctx+0xe8/0x3a8
  damon_sample_mtier_start+0x1c/0x90
  damon_sample_mtier_enable_store+0x98/0xb0
  param_attr_store+0xb4/0x128
  module_attr_store+0x2c/0x50
  sysfs_kf_write+0x58/0x90
  kernfs_fop_write_iter+0x16c/0x238
  vfs_write+0x2c0/0x370
  ksys_write+0x74/0x118
  __arm64_sys_write+0x24/0x38
  invoke_syscall+0xa8/0x118
  el0_svc_common.constprop.0+0x48/0xf0
  do_el0_svc+0x24/0x38
  el0_svc+0x54/0x370
  el0t_64_sync_handler+0xa0/0xe8
  el0t_64_sync+0x1ac/0x1b0
 ---[ end trace 0000000000000000 ]---

Note that the same issue can happen if detect_node_addresses is true, and
node 0 or 1 is memoryless. Fix it together by checking the validity of
parameters right before damon_new_region() and fail early if they're
invalid.

Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>
---
* From v1 [1]:
  - take into account the detect_node_addresses case (Sashiko)
  - based on mm-new (SeongJae)

[1] https://lore.kernel.org/r/20260608111534.264-1-yuzenghui@huawei.com

 samples/damon/mtier.c | 3 +++
 1 file changed, 3 insertions(+)

diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index eb1143de8df1..3785b0c7ffb1 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -120,6 +120,9 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
 		addr.end = promote ? node1_end_addr : node0_end_addr;
 	}
 
+	if (addr.start >= addr.end)
+		goto free_out;
+
 	range.start = addr.start;
 	range.end = addr.end;
 
-- 
2.33.0




* Re: [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid
  2026-06-09  6:46 [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid Zenghui Yu
@ 2026-06-09  6:58 ` sashiko-bot
  2026-06-09  8:41   ` Zenghui Yu
  2026-06-09 14:36   ` SeongJae Park
  2026-06-09 14:49 ` SeongJae Park
  1 sibling, 2 replies; 5+ messages in thread
From: sashiko-bot @ 2026-06-09  6:58 UTC (permalink / raw)
  To: Zenghui Yu; +Cc: damon

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Memory leak of `damon_ctx` structures and potential orphaned `kdamond` threads when `damon_start()` fails.
- [High] Incomplete fix allowing integer overflow and resulting in `WARN_ONCE()` and massive region size.
--

commit 567b4e8a4b1811419a6203e400b62d93a47bfe84
Author: Zenghui Yu <yuzenghui@huawei.com>

samples/damon/mtier: fail early if address range parameters are invalid

This commit adds an early check in damon_sample_mtier_build_ctx() to ensure
the address range parameters are valid. This is intended to prevent a
WARN_ONCE() that occurs when the node address bounds are not properly
initialized before the module is enabled.

> diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> index eb1143de8df17..3785b0c7ffb17 100644
> --- a/samples/damon/mtier.c
> +++ b/samples/damon/mtier.c
> @@ -120,6 +120,9 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
>  		addr.end = promote ? node1_end_addr : node0_end_addr;
>  	}
>  
> +	if (addr.start >= addr.end)
> +		goto free_out;
> +

[Severity: High]
This isn't a bug introduced by this patch, but does this incomplete fix still
allow an integer overflow?

If addr.end is set to ULONG_MAX via sysfs and addr.start is a valid address,
this check passes. Later, damon_set_regions() calls ALIGN():

mm/damon/core.c:damon_set_regions() {
    ...
    ranges[i].end = ALIGN(ranges[i].end, min_region_sz);
    ...
}

The ALIGN macro will add min_region_sz - 1 to ULONG_MAX, causing an integer
overflow that wraps the value to 0. This results in calling
damon_new_region(start, 0), which triggers the exact WARN_ONCE() this
patch intended to fix.

Additionally, could this cause damon_sz_region() to return a massive value,
breaking the logic in damon_apply_min_nr_regions() and causing excessive
allocations?
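
The wrap-around described above can be checked in userspace with a minimal
sketch. ALIGN_UP() here is a local stand-in written for this example (it is
not the kernel macro itself), aligned_end() is a hypothetical helper, and the
4096-byte min_region_sz used in the usage note is an assumption:

```c
#include <limits.h>

/* Local stand-in for a round-up alignment macro; 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/*
 * Hypothetical helper: round 'end' up to a multiple of 'min_region_sz',
 * mirroring the quoted damon_set_regions() line.  Unsigned arithmetic
 * wraps modulo ULONG_MAX + 1, so a large enough 'end' wraps to 0.
 */
static unsigned long aligned_end(unsigned long end, unsigned long min_region_sz)
{
	return ALIGN_UP(end, min_region_sz);
}
```

With these definitions, aligned_end(ULONG_MAX, 4096) evaluates to 0, while an
ordinary value such as aligned_end(0x1001, 4096) rounds up to 0x2000.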

>  	range.start = addr.start;
>  	range.end = addr.end;

[Severity: High]
This is a pre-existing issue, but does this code leak the context structures
and orphan the threads if damon_start() fails?

samples/damon/mtier.c:damon_sample_mtier_start() {
    ...
	ctxs[1] = ctx;
	return damon_start(ctxs, 2, true);
}

If damon_start() fails (e.g., returning -EBUSY or -ENOMEM), it returns the
error directly without cleaning up the dynamically allocated contexts in
ctxs[0] and ctxs[1].

The caller damon_sample_mtier_enable_store() sets enabled = false on error.
Subsequent attempts to enable the module will allocate new contexts and
overwrite the ctxs array, permanently leaking the previous contexts.

If damon_start() successfully starts the first context but fails on the
second, does the first context's kdamond thread remain running as an
orphaned thread?
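
As an illustration only, the usual unwind-on-failure pattern for this kind of
start path can be modeled in userspace. Every name below (mock_ctx,
mock_ctx_new(), mock_ctx_destroy(), mock_start(), mock_sample_start()) is
hypothetical and stands in for the real DAMON types and calls. The sketch does
not model a partially started kdamond, and it is not a proposed patch:

```c
#include <errno.h>
#include <stdlib.h>

/* Hypothetical stand-in for struct damon_ctx. */
struct mock_ctx {
	int id;
};

static struct mock_ctx *ctxs[2];
static int live_ctxs;	/* number of contexts currently allocated */

static struct mock_ctx *mock_ctx_new(int id)
{
	struct mock_ctx *c = malloc(sizeof(*c));

	if (!c)
		return NULL;
	c->id = id;
	live_ctxs++;
	return c;
}

static void mock_ctx_destroy(struct mock_ctx *c)
{
	if (!c)
		return;
	free(c);
	live_ctxs--;
}

/* Hypothetical stand-in for damon_start(); fails with -EBUSY on request. */
static int mock_start(struct mock_ctx **c, int nr, int should_fail)
{
	(void)c;
	(void)nr;
	return should_fail ? -EBUSY : 0;
}

/* Build both contexts, start them, and free everything if any step fails. */
static int mock_sample_start(int should_fail)
{
	int err;

	ctxs[0] = mock_ctx_new(0);
	if (!ctxs[0])
		return -ENOMEM;
	ctxs[1] = mock_ctx_new(1);
	if (!ctxs[1]) {
		err = -ENOMEM;
		goto free_first;
	}
	err = mock_start(ctxs, 2, should_fail);
	if (err)
		goto free_second;
	return 0;

free_second:
	mock_ctx_destroy(ctxs[1]);
	ctxs[1] = NULL;
free_first:
	mock_ctx_destroy(ctxs[0]);
	ctxs[0] = NULL;
	return err;
}
```

In this model a failed start leaves live_ctxs at 0 and both array slots NULL,
so a later enable attempt does not overwrite pointers to still-allocated
contexts.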

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260609064653.1829-1-yuzenghui@huawei.com?part=1


* Re: [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid
  2026-06-09  6:58 ` sashiko-bot
@ 2026-06-09  8:41   ` Zenghui Yu
  2026-06-09 14:36   ` SeongJae Park
  1 sibling, 0 replies; 5+ messages in thread
From: Zenghui Yu @ 2026-06-09  8:41 UTC (permalink / raw)
  To: sashiko-reviews; +Cc: damon, SeongJae Park

On 2026/6/9 14:58, sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> 
> Pre-existing issues:
> - [High] Memory leak of `damon_ctx` structures and potential orphaned `kdamond` threads when `damon_start()` fails.
> - [High] Incomplete fix allowing integer overflow and resulting in `WARN_ONCE()` and massive region size.
> --
> 
> commit 567b4e8a4b1811419a6203e400b62d93a47bfe84
> Author: Zenghui Yu <yuzenghui@huawei.com>
> 
> samples/damon/mtier: fail early if address range parameters are invalid
> 
> This commit adds an early check in damon_sample_mtier_build_ctx() to ensure
> the address range parameters are valid. This is intended to prevent a
> WARN_ONCE() that occurs when the node address bounds are not properly
> initialized before the module is enabled.
> 
> > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > index eb1143de8df17..3785b0c7ffb17 100644
> > --- a/samples/damon/mtier.c
> > +++ b/samples/damon/mtier.c
> > @@ -120,6 +120,9 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
> >  		addr.end = promote ? node1_end_addr : node0_end_addr;
> >  	}
> >  
> > +	if (addr.start >= addr.end)
> > +		goto free_out;
> > +
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but does this incomplete fix still
> allow an integer overflow?
> 
> If addr.end is set to ULONG_MAX via sysfs and addr.start is a valid address,
> this check passes. Later, damon_set_regions() calls ALIGN():
> 
> mm/damon/core.c:damon_set_regions() {
>     ...
>     ranges[i].end = ALIGN(ranges[i].end, min_region_sz);
>     ...
> }
> 
> The ALIGN macro will add min_region_sz - 1 to ULONG_MAX, causing an integer
> overflow that wraps the value to 0. This results in calling
> damon_new_region(start, 0), which triggers the exact WARN_ONCE() this
> patch intended to fix.

Yeah! With the "incomplete fix" applied, I can reproduce the WARN() by:

    $ echo 0xffffffffffffffff > /sys/module/damon_sample_mtier/parameters/node0_end_addr
    $ echo Y > /sys/module/damon_sample_mtier/parameters/enabled
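
Why the added check lets this input through can be shown with a small
userspace model. This is a sketch under assumptions: passes_patch_check() and
valid_after_align() are hypothetical helpers, ALIGN_UP() is a local stand-in
for the kernel's alignment macro, the start address is taken to be 0 as in the
WARN output of the original report, and the 4096-byte min_region_sz is an
assumed value:

```c
#include <limits.h>
#include <stdbool.h>

/* Local stand-in for a round-up alignment macro; 'a' must be a power of two. */
#define ALIGN_UP(x, a) (((x) + ((a) - 1)) & ~((a) - 1))

/* The condition the patch adds, inverted: true when the range is accepted. */
static bool passes_patch_check(unsigned long start, unsigned long end)
{
	return start < end;
}

/* The same condition, evaluated after 'end' is rounded up to 'min_region_sz'. */
static bool valid_after_align(unsigned long start, unsigned long end,
			      unsigned long min_region_sz)
{
	return start < ALIGN_UP(end, min_region_sz);
}
```

With start 0 and end ULONG_MAX, passes_patch_check() returns true, but
valid_after_align() with a 4096-byte min_region_sz returns false because the
rounded-up end wraps to 0. The uninitialized case (start 0, end 0) is rejected
by passes_patch_check() as the patch intends.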


* Re: [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid
  2026-06-09  6:58 ` sashiko-bot
  2026-06-09  8:41   ` Zenghui Yu
@ 2026-06-09 14:36   ` SeongJae Park
  1 sibling, 0 replies; 5+ messages in thread
From: SeongJae Park @ 2026-06-09 14:36 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, Zenghui Yu, damon

On Tue, 09 Jun 2026 06:58:28 +0000 sashiko-bot@kernel.org wrote:

> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> 
> Pre-existing issues:
> - [High] Memory leak of `damon_ctx` structures and potential orphaned `kdamond` threads when `damon_start()` fails.
> - [High] Incomplete fix allowing integer overflow and resulting in `WARN_ONCE()` and massive region size.
> --
> 
> commit 567b4e8a4b1811419a6203e400b62d93a47bfe84
> Author: Zenghui Yu <yuzenghui@huawei.com>
> 
> samples/damon/mtier: fail early if address range parameters are invalid
> 
> This commit adds an early check in damon_sample_mtier_build_ctx() to ensure
> the address range parameters are valid. This is intended to prevent a
> WARN_ONCE() that occurs when the node address bounds are not properly
> initialized before the module is enabled.
> 
> > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > index eb1143de8df17..3785b0c7ffb17 100644
> > --- a/samples/damon/mtier.c
> > +++ b/samples/damon/mtier.c
> > @@ -120,6 +120,9 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
> >  		addr.end = promote ? node1_end_addr : node0_end_addr;
> >  	}
> >  
> > +	if (addr.start >= addr.end)
> > +		goto free_out;
> > +
> 
> [Severity: High]
> This isn't a bug introduced by this patch, but does this incomplete fix still
> allow an integer overflow?
> 
> If addr.end is set to ULONG_MAX via sysfs and addr.start is a valid address,
> this check passes. Later, damon_set_regions() calls ALIGN():
> 
> mm/damon/core.c:damon_set_regions() {
>     ...
>     ranges[i].end = ALIGN(ranges[i].end, min_region_sz);
>     ...
> }
> 
> The ALIGN macro will add min_region_sz - 1 to ULONG_MAX, causing an integer
> overflow that wraps the value to 0. This results in calling
> damon_new_region(start, 0), which triggers the exact WARN_ONCE() this
> patch intended to fix.

First of all, this is not a blocker for this patch.  I also don't think this is
something normal, real users would do with good intentions, or something that
is easy to set by mistake.  I don't think this needs a fix for now.  We could
revisit this if it turns out to cause a real (not test-purpose) issue.

> 
> Additionally, could this cause damon_sz_region() to return a massive value,
> breaking the logic in damon_apply_min_nr_regions() and causing excessive
> allocations?

DAMON manages the monitoring target address range with dynamic-sized regions,
and the maximum number of regions can be set by users.  So I don't think this
kind of issue can happen.

> 
> >  	range.start = addr.start;
> >  	range.end = addr.end;
> 
> [Severity: High]
> This is a pre-existing issue, but does this code leak the context structures
> and orphan the threads if damon_start() fails?
> 
> samples/damon/mtier.c:damon_sample_mtier_start() {
>     ...
> 	ctxs[1] = ctx;
> 	return damon_start(ctxs, 2, true);
> }
> 
> If damon_start() fails (e.g., returning -EBUSY or -ENOMEM), it returns the
> error directly without cleaning up the dynamically allocated contexts in
> ctxs[0] and ctxs[1].
> 
> The caller damon_sample_mtier_enable_store() sets enabled = false on error.
> Subsequent attempts to enable the module will allocate new contexts and
> overwrite the ctxs array, permanently leaking the previous contexts.
> 
> If damon_start() successfully starts the first context but fails on the
> second, does the first context's kdamond thread remain running as an
> orphaned thread?

Not a blocker for this patch.  I'm working on this separately [1].

[1] https://lore.kernel.org/20260609142119.68120-1-sj@kernel.org


Thanks,
SJ

[...]


* Re: [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid
  2026-06-09  6:46 [PATCH v2] samples/damon/mtier: fail early if address range parameters are invalid Zenghui Yu
  2026-06-09  6:58 ` sashiko-bot
@ 2026-06-09 14:49 ` SeongJae Park
  1 sibling, 0 replies; 5+ messages in thread
From: SeongJae Park @ 2026-06-09 14:49 UTC (permalink / raw)
  To: Zenghui Yu
  Cc: SeongJae Park, damon, linux-mm, linux-kernel, akpm, wangzhigang17,
	liqiqi23, stable

On Tue, 9 Jun 2026 14:46:52 +0800 Zenghui Yu <yuzenghui@huawei.com> wrote:

> The comment on top of `struct damon_region` clearly says that
> 
>     For any use case, @ar should be non-zero positive size.
> 
> which is now verified in damon_verify_new_region() if the kernel is built
> with DAMON_DEBUG_SANITY.
> 
> The WARN_ONCE() can be triggered if the mtier sample module is enabled
> before node{0,1}_{start,end}_addr have been properly initialized, which is
> obviously not good.
> 
>  ------------[ cut here ]------------
>  start 0 >= end 0
>  WARNING: mm/damon/core.c:217 at damon_new_region+0xf4/0x118, CPU#59: bash/341468
>  Call trace:
>   damon_new_region+0xf4/0x118 (P)
>   damon_set_regions+0xfc/0x3c0
>   damon_sample_mtier_build_ctx+0xe8/0x3a8
>   damon_sample_mtier_start+0x1c/0x90
>   damon_sample_mtier_enable_store+0x98/0xb0
>   param_attr_store+0xb4/0x128
>   module_attr_store+0x2c/0x50
>   sysfs_kf_write+0x58/0x90
>   kernfs_fop_write_iter+0x16c/0x238
>   vfs_write+0x2c0/0x370
>   ksys_write+0x74/0x118
>   __arm64_sys_write+0x24/0x38
>   invoke_syscall+0xa8/0x118
>   el0_svc_common.constprop.0+0x48/0xf0
>   do_el0_svc+0x24/0x38
>   el0_svc+0x54/0x370
>   el0t_64_sync_handler+0xa0/0xe8
>   el0t_64_sync+0x1ac/0x1b0
>  ---[ end trace 0000000000000000 ]---
> 
> Note that the same issue can happen if detect_node_addresses is true, and
> node 0 or 1 is memoryless. Fix it together by checking the validity of
> parameters right before damon_new_region() and fail early if they're
> invalid.

Thank you for this patch, Zenghui!

> 
> Signed-off-by: Zenghui Yu <yuzenghui@huawei.com>

I think this deserves Fixes: and Cc: stable, like below.

Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x

Other than that, looks good to me.

Reviewed-by: SeongJae Park <sj@kernel.org>

I applied this patch to the damon/next tree [1].  We are now quite close to the
next merge window.  We (the mm community) want to focus on stabilizing mm.git
so that it is ready for the next merge window, rather than adding more changes
that are not really urgent.  I understand this patch is not really urgent,
because the issue only causes weird DAMON-internal behavior and a one-time
warning on debug kernels.

Hence, Andrew might not add this patch until the next -rc1 release.  In that
case, I will request adding it to mm.git after the next -rc1 release.  So, no
action from your side is needed for now.  Let me know if you think this is
really urgent or I'm missing something.

[1] https://origin.kernel.org/doc/html/latest/mm/damon/maintainer-profile.html#scm-trees


Thanks,
SJ

[...]

