* [RFC PATCH v2 0/2] samples/damon/mtier: handle damon_{start,stop}() failures
@ 2026-06-09 14:21 SeongJae Park
2026-06-09 14:21 ` [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure SeongJae Park
2026-06-09 14:21 ` [RFC PATCH v2 2/2] samples/damon/mtier: handle damon_stop() failure SeongJae Park
0 siblings, 2 replies; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 14:21 UTC (permalink / raw)
Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
linux-mm
DAMON sample module mtier is not correctly handling failures from
damon_start() and damon_stop(). As a result, it can leak memory,
disrupt next DAMON operations, and dereference freed and random memory.
Fix those.
Changes from RFC v1
- RFC v1: https://lore.kernel.org/20260609005443.2122-1-sj@kernel.org
- Add damon_stop() failure handling fix to the series.
SeongJae Park (2):
samples/damon/mtier: handle damon_start() failure
samples/damon/mtier: handle damon_stop() failure
samples/damon/mtier.c | 14 ++++++++++++--
1 file changed, 12 insertions(+), 2 deletions(-)
base-commit: f822d98f51b6e2b3ab41cad2e8bc1013d06f93bb
--
2.47.3
^ permalink raw reply [flat|nested] 7+ messages in thread
* [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure
2026-06-09 14:21 [RFC PATCH v2 0/2] samples/damon/mtier: handle damon_{start,stop}() failures SeongJae Park
@ 2026-06-09 14:21 ` SeongJae Park
2026-06-09 14:35 ` sashiko-bot
2026-06-09 14:21 ` [RFC PATCH v2 2/2] samples/damon/mtier: handle damon_stop() failure SeongJae Park
1 sibling, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 14:21 UTC (permalink / raw)
Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
linux-mm
damon_sample_mtier_start() callers assume it will clean up resources
when it fails. And the function does the cleanup for context buildup
failures. However, it is not doing the cleanup for damon_start()
failure.
As a result, when damon_start() fails, it could leak the memory for
DAMON context. Also, if damon_start() fails for only the second
context, the first context will indefinitely run, and avoid starting
other DAMON contexts since it is running in the exclusive mode. Stop
possibly started DAMON context and free the contexts in case of the
failure to fix the issues.
Note that the issue can reliably be reproduced because the module calls
damon_start() in the exclusive mode. For example,
$ sudo damo start
$ echo Y | sudo tee /sys/module/damon_sample_mtier/parameters/enabled
$ sudo cat /proc/allocinfo | grep damon_new_ctx
Because the first command is running another DAMON instance, the second
command fails the damon_start() call because the new DAMON instance
cannot exclusively run. And without this fix, by repeating the second
and the third commands above, we can show the memory consumption is only
increasing due to the leaks. It requires the sudo permission though.
The issue was discovered [1] by Sashiko.
[1] https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
samples/damon/mtier.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index eb1143de8df17..66b591f2180fa 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -174,6 +174,7 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
static int damon_sample_mtier_start(void)
{
struct damon_ctx *ctx;
+ int err;
ctx = damon_sample_mtier_build_ctx(true);
if (!ctx)
@@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
return -ENOMEM;
}
ctxs[1] = ctx;
- return damon_start(ctxs, 2, true);
+ err = damon_start(ctxs, 2, true);
+ if (!err)
+ return 0;
+
+ if (damon_is_running(ctxs[0]))
+ damon_stop(ctxs, 1);
+ damon_destroy_ctx(ctxs[0]);
+ damon_destroy_ctx(ctxs[1]);
+ return err;
}
static void damon_sample_mtier_stop(void)
--
2.47.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [RFC PATCH v2 2/2] samples/damon/mtier: handle damon_stop() failure
2026-06-09 14:21 [RFC PATCH v2 0/2] samples/damon/mtier: handle damon_{start,stop}() failures SeongJae Park
2026-06-09 14:21 ` [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure SeongJae Park
@ 2026-06-09 14:21 ` SeongJae Park
1 sibling, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 14:21 UTC (permalink / raw)
Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
linux-mm
damon_sample_mtier_stop() assumes its damon_stop() call will always
successfully stops the two DAMON contexts. Hence it deallocates the two
DAMON contexts after the damon_stop() call. However, if a given context
is already stopped, damon_stop() fails and returns an error while
letting the DAMON contexts that have not yet stopped keep running. This
kind of unexpected early DAMON context stops could happen due to memory
allocation failures in kdamond_fn(). Because damon_sample_mtier_stop()
just deallocates all DAMON contexts with damon_target and damon_region
objects that are linked to the contexts, the execution of the unstopped
DAMON context (kdamond) ends up using the memory that freed
(use-after-free). Fix the issue by separating the damon_stop() to be
invoked per context.
Note that DAMON_SYSFS also allows multiple DAMON contexts execution.
But, it calls damon_stop() for each context one by one. Hence this
issue is only in mtier.
For the long term, it would be better to refactor damon_stop() to always
ensure stopping all contexts regardless of the failures in the middle.
Make this fix in the current way, though, to keep it simple and easy to
backport. I will do the refactoring later.
The issue was discovered [1] by Sashiko.
[1] https://lore.kernel.org/20260609014219.3013-1-sj@kernel.org
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
samples/damon/mtier.c | 3 ++-
1 file changed, 2 insertions(+), 1 deletion(-)
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index 66b591f2180fa..faaaaa12e6206 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -199,7 +199,8 @@ static int damon_sample_mtier_start(void)
static void damon_sample_mtier_stop(void)
{
- damon_stop(ctxs, 2);
+ damon_stop(ctxs, 1);
+ damon_stop(&ctxs[1], 1);
damon_destroy_ctx(ctxs[0]);
damon_destroy_ctx(ctxs[1]);
}
--
2.47.3
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure
2026-06-09 14:21 ` [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure SeongJae Park
@ 2026-06-09 14:35 ` sashiko-bot
2026-06-09 14:58 ` SeongJae Park
0 siblings, 1 reply; 7+ messages in thread
From: sashiko-bot @ 2026-06-09 14:35 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon
Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
Pre-existing issues:
- [High] Resource leak (struct damon_ctx, struct damos, struct damon_target, and struct pid) on damon_start() failure.
--
commit e95eb25cfbc21e60927df5e7e4eadcf5468c1c41
Author: SeongJae Park <sj@kernel.org>
samples/damon/mtier: handle damon_start() failure
This commit fixes a resource leak in the memory tiering DAMON sample module
where failing to start the DAMON contexts leads to memory leaks and
indefinitely running contexts.
Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
> diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> --- a/samples/damon/mtier.c
> +++ b/samples/damon/mtier.c
[ ... ]
> @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
> return -ENOMEM;
> }
> ctxs[1] = ctx;
> - return damon_start(ctxs, 2, true);
> + err = damon_start(ctxs, 2, true);
> + if (!err)
> + return 0;
> +
> + if (damon_is_running(ctxs[0]))
> + damon_stop(ctxs, 1);
> + damon_destroy_ctx(ctxs[0]);
> + damon_destroy_ctx(ctxs[1]);
> + return err;
> }
[Severity: High]
This is a pre-existing issue, but it looks like the exact same resource leak
pattern exists in other DAMON sample modules when damon_start() fails.
Looking at damon_sample_prcl_start() in samples/damon/prcl.c:
err = damon_start(&ctx, 1, true);
if (err)
return err;
And similarly in damon_sample_wsse_start() in samples/damon/wsse.c:
err = damon_start(&ctx, 1, true);
if (err)
return err;
Since they return directly without calling damon_destroy_ctx() or putting
the target_pidp reference, do these modules also leak their newly allocated
contexts and PID references when damon_start() fails?
--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609142119.68120-1-sj@kernel.org?part=1
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure
2026-06-09 14:35 ` sashiko-bot
@ 2026-06-09 14:58 ` SeongJae Park
2026-06-09 15:04 ` SeongJae Park
0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 14:58 UTC (permalink / raw)
To: sashiko-bot; +Cc: SeongJae Park, damon
On Tue, 09 Jun 2026 14:35:38 +0000 sashiko-bot@kernel.org wrote:
> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] Resource leak (struct damon_ctx, struct damos, struct damon_target, and struct pid) on damon_start() failure.
> --
>
> commit e95eb25cfbc21e60927df5e7e4eadcf5468c1c41
> Author: SeongJae Park <sj@kernel.org>
> samples/damon/mtier: handle damon_start() failure
>
> This commit fixes a resource leak in the memory tiering DAMON sample module
> where failing to start the DAMON contexts leads to memory leaks and
> indefinitely running contexts.
>
> Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
>
> > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > --- a/samples/damon/mtier.c
> > +++ b/samples/damon/mtier.c
>
> [ ... ]
>
> > @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
> > return -ENOMEM;
> > }
> > ctxs[1] = ctx;
> > - return damon_start(ctxs, 2, true);
> > + err = damon_start(ctxs, 2, true);
> > + if (!err)
> > + return 0;
> > +
> > + if (damon_is_running(ctxs[0]))
> > + damon_stop(ctxs, 1);
> > + damon_destroy_ctx(ctxs[0]);
> > + damon_destroy_ctx(ctxs[1]);
> > + return err;
> > }
>
> [Severity: High]
> This is a pre-existing issue, but it looks like the exact same resource leak
> pattern exists in other DAMON sample modules when damon_start() fails.
>
> Looking at damon_sample_prcl_start() in samples/damon/prcl.c:
>
> err = damon_start(&ctx, 1, true);
> if (err)
> return err;
>
> And similarly in damon_sample_wsse_start() in samples/damon/wsse.c:
>
> err = damon_start(&ctx, 1, true);
> if (err)
> return err;
>
> Since they return directly without calling damon_destroy_ctx() or putting
> the target_pidp reference, do these modules also leak their newly allocated
> contexts and PID references when damon_start() fails?
Nice finding. Not a blocker of this patch, though. I will separately work on
this.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure
2026-06-09 14:58 ` SeongJae Park
@ 2026-06-09 15:04 ` SeongJae Park
2026-06-09 15:06 ` SeongJae Park
0 siblings, 1 reply; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 15:04 UTC (permalink / raw)
To: SeongJae Park; +Cc: sashiko-bot, damon
On Tue, 09 Jun 2026 07:58:22 -0700 SeongJae Park <sj@kernel.org> wrote:
> On Tue, 09 Jun 2026 14:35:38 +0000 sashiko-bot@kernel.org wrote:
>
> > Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> >
> > Pre-existing issues:
> > - [High] Resource leak (struct damon_ctx, struct damos, struct damon_target, and struct pid) on damon_start() failure.
> > --
> >
> > commit e95eb25cfbc21e60927df5e7e4eadcf5468c1c41
> > Author: SeongJae Park <sj@kernel.org>
> > samples/damon/mtier: handle damon_start() failure
> >
> > This commit fixes a resource leak in the memory tiering DAMON sample module
> > where failing to start the DAMON contexts leads to memory leaks and
> > indefinitely running contexts.
> >
> > Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
> >
> > > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > > --- a/samples/damon/mtier.c
> > > +++ b/samples/damon/mtier.c
> >
> > [ ... ]
> >
> > > @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
> > > return -ENOMEM;
> > > }
> > > ctxs[1] = ctx;
> > > - return damon_start(ctxs, 2, true);
> > > + err = damon_start(ctxs, 2, true);
> > > + if (!err)
> > > + return 0;
> > > +
> > > + if (damon_is_running(ctxs[0]))
> > > + damon_stop(ctxs, 1);
> > > + damon_destroy_ctx(ctxs[0]);
> > > + damon_destroy_ctx(ctxs[1]);
> > > + return err;
> > > }
> >
> > [Severity: High]
> > This is a pre-existing issue, but it looks like the exact same resource leak
> > pattern exists in other DAMON sample modules when damon_start() fails.
> >
> > Looking at damon_sample_prcl_start() in samples/damon/prcl.c:
> >
> > err = damon_start(&ctx, 1, true);
> > if (err)
> > return err;
> >
> > And similarly in damon_sample_wsse_start() in samples/damon/wsse.c:
> >
> > err = damon_start(&ctx, 1, true);
> > if (err)
> > return err;
> >
> > Since they return directly without calling damon_destroy_ctx() or putting
> > the target_pidp reference, do these modules also leak their newly allocated
> > contexts and PID references when damon_start() fails?
>
> Nice finding. Not a blocker of this patch, though. I will separately work on
> this.
Hmm, but... They do destroy the ctx in damon_sample_{wsse,prcl}_stop(). And
users cannot trigger damon_sample_{wsse,prcl}_start() without calling the stop
function first, due to the current 'enabled' value based skippinng in
damon_sample_wsse_enable_store(). So, this seems not a true issue. Let me
know if I'm missing something.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure
2026-06-09 15:04 ` SeongJae Park
@ 2026-06-09 15:06 ` SeongJae Park
0 siblings, 0 replies; 7+ messages in thread
From: SeongJae Park @ 2026-06-09 15:06 UTC (permalink / raw)
To: SeongJae Park; +Cc: sashiko-bot, damon
On Tue, 09 Jun 2026 08:04:13 -0700 SeongJae Park <sj@kernel.org> wrote:
> On Tue, 09 Jun 2026 07:58:22 -0700 SeongJae Park <sj@kernel.org> wrote:
>
> > On Tue, 09 Jun 2026 14:35:38 +0000 sashiko-bot@kernel.org wrote:
> >
> > > Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
> > >
> > > Pre-existing issues:
> > > - [High] Resource leak (struct damon_ctx, struct damos, struct damon_target, and struct pid) on damon_start() failure.
> > > --
> > >
> > > commit e95eb25cfbc21e60927df5e7e4eadcf5468c1c41
> > > Author: SeongJae Park <sj@kernel.org>
> > > samples/damon/mtier: handle damon_start() failure
> > >
> > > This commit fixes a resource leak in the memory tiering DAMON sample module
> > > where failing to start the DAMON contexts leads to memory leaks and
> > > indefinitely running contexts.
> > >
> > > Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
> > >
> > > > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > > > --- a/samples/damon/mtier.c
> > > > +++ b/samples/damon/mtier.c
> > >
> > > [ ... ]
> > >
> > > > @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
> > > > return -ENOMEM;
> > > > }
> > > > ctxs[1] = ctx;
> > > > - return damon_start(ctxs, 2, true);
> > > > + err = damon_start(ctxs, 2, true);
> > > > + if (!err)
> > > > + return 0;
> > > > +
> > > > + if (damon_is_running(ctxs[0]))
> > > > + damon_stop(ctxs, 1);
> > > > + damon_destroy_ctx(ctxs[0]);
> > > > + damon_destroy_ctx(ctxs[1]);
> > > > + return err;
> > > > }
> > >
> > > [Severity: High]
> > > This is a pre-existing issue, but it looks like the exact same resource leak
> > > pattern exists in other DAMON sample modules when damon_start() fails.
> > >
> > > Looking at damon_sample_prcl_start() in samples/damon/prcl.c:
> > >
> > > err = damon_start(&ctx, 1, true);
> > > if (err)
> > > return err;
> > >
> > > And similarly in damon_sample_wsse_start() in samples/damon/wsse.c:
> > >
> > > err = damon_start(&ctx, 1, true);
> > > if (err)
> > > return err;
> > >
> > > Since they return directly without calling damon_destroy_ctx() or putting
> > > the target_pidp reference, do these modules also leak their newly allocated
> > > contexts and PID references when damon_start() fails?
> >
> > Nice finding. Not a blocker of this patch, though. I will separately work on
> > this.
>
> Hmm, but... They do destroy the ctx in damon_sample_{wsse,prcl}_stop(). And
> users cannot trigger damon_sample_{wsse,prcl}_start() without calling the stop
> function first, due to the current 'enabled' value based skippinng in
> damon_sample_wsse_enable_store(). So, this seems not a true issue. Let me
> know if I'm missing something.
Ok, I was missing this. damon_sample_wsse_enable_store() reset 'enabled' when
damon_sample_wsse_start() fails. Same pattern for prcl and mtier. So seems
the issue is real. I will separately work on this.
Thanks,
SJ
[...]
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2026-06-09 15:06 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2026-06-09 14:21 [RFC PATCH v2 0/2] samples/damon/mtier: handle damon_{start,stop}() failures SeongJae Park
2026-06-09 14:21 ` [RFC PATCH v2 1/2] samples/damon/mtier: handle damon_start() failure SeongJae Park
2026-06-09 14:35 ` sashiko-bot
2026-06-09 14:58 ` SeongJae Park
2026-06-09 15:04 ` SeongJae Park
2026-06-09 15:06 ` SeongJae Park
2026-06-09 14:21 ` [RFC PATCH v2 2/2] samples/damon/mtier: handle damon_stop() failure SeongJae Park
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.