[RFC PATCH v3 0/4] samples/damon: handle damon


* [RFC PATCH v3 0/4] samples/damon: handle damon_{start,stop}() failures
@ 2026-06-10  1:14 SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure SeongJae Park
                   ` (3 more replies)
  0 siblings, 4 replies; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  1:14 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 14 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

None of the DAMON sample modules correctly handles failures from
damon_start().  Among those, mtier also mishandles damon_stop()
failures.  As a result, memory leaks, disruption of subsequent DAMON
operations, and use-after-free can happen.  Fix those.

Changes from RFC v2
- RFC v2: https://lore.kernel.org/20260609142119.68120-1-sj@kernel.org
- Add damon_start() failure handling fix for wsse and prcl.
Changes from RFC v1
- RFC v1: https://lore.kernel.org/20260609005443.2122-1-sj@kernel.org
- Add damon_stop() failure handling fix to the series.

SeongJae Park (4):
  samples/damon/wsse: handle damon_start() failure
  samples/damon/prcl: handle damon_start() failure
  samples/damon/mtier: handle damon_start() failure
  samples/damon/mtier: handle damon_stop() failure

 samples/damon/mtier.c | 14 ++++++++++++--
 samples/damon/prcl.c  |  4 +++-
 samples/damon/wsse.c  |  4 +++-
 3 files changed, 18 insertions(+), 4 deletions(-)


base-commit: e38932476396c4da618a9e904ba4e45f1891d910
-- 
2.47.3



* [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure
  2026-06-10  1:14 [RFC PATCH v3 0/4] samples/damon: handle damon_{start,stop}() failures SeongJae Park
@ 2026-06-10  1:14 ` SeongJae Park
  2026-06-10  1:29   ` sashiko-bot
  2026-06-10  1:14 ` [RFC PATCH v3 2/4] samples/damon/prcl: " SeongJae Park
                   ` (2 subsequent siblings)
  3 siblings, 1 reply; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  1:14 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 14 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

damon_sample_wsse_start() callers assume it will clean up resources when
it fails.  The function does clean up after context build failures, but
not after a damon_start() failure.  As a result, when damon_start()
fails, the memory for the DAMON context is leaked.  Free the context on
that failure path to fix the issue.

Note that the issue can reliably be reproduced because the module calls
damon_start() in the exclusive mode.  For example,

    $ sudo damo start
    $ echo $$ | sudo tee /sys/module/damon_sample_wsse/parameters/target_pid
    $ echo Y | sudo tee /sys/module/damon_sample_wsse/parameters/enabled
    $ sudo cat /proc/allocinfo | grep damon_new_ctx

Because the first command starts another DAMON instance, the
damon_start() call made by the third command fails: the new DAMON
instance cannot run exclusively.  Without this fix, repeating the third
and the fourth commands shows the memory consumption only increasing,
due to the leaks.  Note that this requires sudo permission.
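The leak described above can be modeled in user space.  The sketch below
is only an illustration: every mock_* function, the fixed-size pool, and
the with_fix switch are hypothetical stand-ins, not the kernel's DAMON
API.  It assumes an exclusive start is refused with -EBUSY while another
instance is running.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/*
 * User-space model only.  Every mock_* name is a hypothetical stand-in,
 * not the kernel's DAMON API.
 */
#define POOL_SIZE 8

struct mock_ctx {
	bool in_use;
};

static struct mock_ctx pool[POOL_SIZE];

/* Models a DAMON instance already started by `damo start`. */
static bool other_instance_running = true;

/* Stand-in for context allocation: take a free slot from a fixed pool. */
static struct mock_ctx *mock_new_ctx(void)
{
	for (int i = 0; i < POOL_SIZE; i++) {
		if (!pool[i].in_use) {
			pool[i].in_use = true;
			return &pool[i];
		}
	}
	return NULL;
}

/* Stand-in for damon_destroy_ctx(): give the slot back. */
static void mock_destroy_ctx(struct mock_ctx *c)
{
	c->in_use = false;
}

/* Stand-in for an exclusive start: refused while another instance runs. */
static int mock_start_exclusive(struct mock_ctx *c)
{
	(void)c;
	return other_instance_running ? -EBUSY : 0;
}

/* Number of contexts that were allocated and never released. */
int mock_live_ctxs(void)
{
	int n = 0;

	for (int i = 0; i < POOL_SIZE; i++)
		n += pool[i].in_use;
	return n;
}

/* Release everything, so both variants can start from a clean state. */
void mock_reset(void)
{
	for (int i = 0; i < POOL_SIZE; i++)
		pool[i].in_use = false;
}

/* One "echo Y > enabled" attempt, with or without the error-path cleanup. */
int sample_start(bool with_fix)
{
	struct mock_ctx *c = mock_new_ctx();
	int err;

	if (!c)
		return -ENOMEM;
	err = mock_start_exclusive(c);
	if (err && with_fix)
		mock_destroy_ctx(c);	/* the cleanup this patch adds */
	return err;	/* without the fix, c stays allocated on error */
}
```

Calling sample_start(false) repeatedly makes mock_live_ctxs() grow by one
per failed attempt, which mirrors the growing damon_new_ctx line in
/proc/allocinfo.  With sample_start(true) the count stays at zero.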

The issue was discovered [1] by Sashiko.

[1] https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org

Fixes: b757c6cfc696 ("samples/damon/wsse: start and stop DAMON as the user requests")
Cc: <stable@vger.kernel.org> # 6.14.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 samples/damon/wsse.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
index 799ad44439434..bbd9392ab5b36 100644
--- a/samples/damon/wsse.c
+++ b/samples/damon/wsse.c
@@ -87,8 +87,10 @@ static int damon_sample_wsse_start(void)
 	target->pid = target_pidp;

 	err = damon_start(&ctx, 1, true);
-	if (err)
+	if (err) {
+		damon_destroy_ctx(ctx);
 		return err;
+	}
 	repeat_call_control.data = ctx;
 	return damon_call(ctx, &repeat_call_control);
 }
-- 
2.47.3


* Re: [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure
  2026-06-10  1:14 ` [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure SeongJae Park
@ 2026-06-10  1:29   ` sashiko-bot
  2026-06-10  3:48     ` SeongJae Park
  0 siblings, 1 reply; 9+ messages in thread
From: sashiko-bot @ 2026-06-10  1:29 UTC (permalink / raw)
  To: SeongJae Park; +Cc: damon

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Unhandled failure of `damon_call()` leads to a permanent memory leak of the DAMON context.
- [High] Concurrent execution of `module_init` and sysfs store causes Use-After-Free and Double Free on the global `ctx`.
--

commit 6215f97f729272927aafd065cdad15da2af100a8
Author: SeongJae Park <sj@kernel.org>

samples/damon/wsse: handle damon_start() failure

This commit fixes a memory leak in damon_sample_wsse_start() by explicitly
freeing the DAMON context if damon_start() fails.

Link: https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org

> diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
> index 799ad44439434..bbd9392ab5b36 100644
> --- a/samples/damon/wsse.c
> +++ b/samples/damon/wsse.c
> @@ -87,8 +87,10 @@ static int damon_sample_wsse_start(void)
>  	target->pid = target_pidp;
>  
>  	err = damon_start(&ctx, 1, true);
> -	if (err)
> +	if (err) {
> +		damon_destroy_ctx(ctx);
>  		return err;
> +	}

[Severity: High]
This is a pre-existing issue, but does this new error path leave the global
ctx pointer dangling?

If the module is loaded with enabled=1, damon_sample_wsse_init() calls
damon_sample_wsse_start(). Concurrently, user space can write 0 to the
enabled sysfs parameter, which triggers damon_sample_wsse_stop().

If damon_start() encounters an error, this new path calls
damon_destroy_ctx(ctx) but ctx is not set to NULL. A concurrent
damon_sample_wsse_stop() would see ctx != NULL and call
damon_destroy_ctx(ctx) again, leading to a double free.

Similarly, if damon_sample_wsse_stop() executes while
damon_sample_wsse_start() is running, stop() could call
damon_destroy_ctx(ctx) while start() is actively using it, causing a
use-after-free.

>  	repeat_call_control.data = ctx;
>  	return damon_call(ctx, &repeat_call_control);
>  }

[Severity: High]
This is a pre-existing issue, but does this return path leak the newly
created ctx if damon_call() fails?

If damon_call() returns an error (for example, -ECANCELED if the kdamond
thread exits immediately and sets ctx->call_controls_obsolete to true), the
error is returned directly to the caller.

Because the kdamond thread does not free the context itself when exiting,
is damon_destroy_ctx(ctx) also needed here to prevent leaking the DAMON
context and its associated objects?
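One possible shape for handling both failure points is sketched below as
a user-space model.  It is not the actual follow-up fix: the mock_*
functions, the failure-injection knobs, and the single-slot allocator
are all hypothetical, and the model is single-threaded, so it only
covers sequential double-destroy, not the concurrent case.  Stopping
before destroying on the damon_call() failure path is an assumption made
here to be defensive.

```c
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* User-space model only; every mock_* name and knob is hypothetical. */
struct mock_ctx {
	bool in_use;
};

static struct mock_ctx slot;	/* the only context this model can allocate */
static struct mock_ctx *ctx;	/* models the module-global context pointer */

/* Failure injection and bookkeeping for the sketch. */
int fail_start_with;	/* nonzero: mock_start() returns this error */
int fail_call_with;	/* nonzero: mock_call() returns this error */
int destroy_calls;	/* how many times a context was destroyed */
bool worker_running;	/* models a live kdamond thread */

static struct mock_ctx *mock_new_ctx(void)
{
	if (slot.in_use)
		return NULL;
	slot.in_use = true;
	return &slot;
}

static void mock_destroy_ctx(struct mock_ctx *c)
{
	c->in_use = false;
	destroy_calls++;
}

static int mock_start(struct mock_ctx *c)
{
	(void)c;
	if (fail_start_with)
		return fail_start_with;
	worker_running = true;
	return 0;
}

static void mock_stop(struct mock_ctx *c)
{
	(void)c;
	worker_running = false;
}

static int mock_call(struct mock_ctx *c)
{
	(void)c;
	return fail_call_with;
}

int sample_start(void)
{
	struct mock_ctx *c = mock_new_ctx();
	int err;

	if (!c)
		return -ENOMEM;
	ctx = c;

	err = mock_start(ctx);
	if (err)
		goto out_destroy;
	err = mock_call(ctx);
	if (err)
		goto out_stop;
	return 0;

out_stop:
	mock_stop(ctx);		/* make sure nothing still uses the context */
out_destroy:
	mock_destroy_ctx(ctx);
	ctx = NULL;		/* a later stop must not destroy it again */
	return err;
}

void sample_stop(void)
{
	if (!ctx)
		return;
	mock_stop(ctx);
	mock_destroy_ctx(ctx);
	ctx = NULL;
}
```

In this model, a failed sample_start() followed by sample_stop() destroys
the context exactly once, because the error path clears the pointer that
sample_stop() checks.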

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260610011420.3018-1-sj@kernel.org?part=1


* Re: [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure
  2026-06-10  1:29   ` sashiko-bot
@ 2026-06-10  3:48     ` SeongJae Park
  0 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  3:48 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, damon

On Wed, 10 Jun 2026 01:29:12 +0000 sashiko-bot@kernel.org wrote:

> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> 
> Pre-existing issues:
> - [High] Unhandled failure of `damon_call()` leads to a permanent memory leak of the DAMON context.
> - [High] Concurrent execution of `module_init` and sysfs store causes Use-After-Free and Double Free on the global `ctx`.
> --
> 
> commit 6215f97f729272927aafd065cdad15da2af100a8
> Author: SeongJae Park <sj@kernel.org>
> 
> samples/damon/wsse: handle damon_start() failure
> 
> This commit fixes a memory leak in damon_sample_wsse_start() by explicitly
> freeing the DAMON context if damon_start() fails.
> 
> Link: https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org
> 
> > diff --git a/samples/damon/wsse.c b/samples/damon/wsse.c
> > index 799ad44439434..bbd9392ab5b36 100644
> > --- a/samples/damon/wsse.c
> > +++ b/samples/damon/wsse.c
> > @@ -87,8 +87,10 @@ static int damon_sample_wsse_start(void)
> >  	target->pid = target_pidp;
> >  
> >  	err = damon_start(&ctx, 1, true);
> > -	if (err)
> > +	if (err) {
> > +		damon_destroy_ctx(ctx);
> >  		return err;
> > +	}
> 
> [Severity: High]
> This is a pre-existing issue, but does this new error path leave the global
> ctx pointer dangling?
> 
> If the module is loaded with enabled=1, damon_sample_wsse_init() calls
> damon_sample_wsse_start(). Concurrently, user space can write 0 to the
> enabled sysfs parameter, which triggers damon_sample_wsse_stop().
> 
> If damon_start() encounters an error, this new path calls
> damon_destroy_ctx(ctx) but ctx is not set to NULL. A concurrent
> damon_sample_wsse_stop() would see ctx != NULL and call
> damon_destroy_ctx(ctx) again, leading to a double free.
> 
> Similarly, if damon_sample_wsse_stop() executes while
> damon_sample_wsse_start() is running, stop() could call
> damon_destroy_ctx(ctx) while start() is actively using it, causing a
> use-after-free.

damon_sample_wsse cannot be dynamically loaded, so this issue cannot happen.

> 
> >  	repeat_call_control.data = ctx;
> >  	return damon_call(ctx, &repeat_call_control);
> >  }
> 
> [Severity: High]
> This is a pre-existing issue, but does this return path leak the newly
> created ctx if damon_call() fails?
> 
> If damon_call() returns an error (for example, -ECANCELED if the kdamond
> thread exits immediately and sets ctx->call_controls_obsolete to true), the
> error is returned directly to the caller.
> 
> Because the kdamond thread does not free the context itself when exiting,
> is damon_destroy_ctx(ctx) also needed here to prevent leaking the DAMON
> context and its associated objects?

Good finding.  I will fix this with another patch.


Thanks,
SJ

[...]


* [RFC PATCH v3 2/4] samples/damon/prcl: handle damon_start() failure
  2026-06-10  1:14 [RFC PATCH v3 0/4] samples/damon: handle damon_{start,stop}() failures SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure SeongJae Park
@ 2026-06-10  1:14 ` SeongJae Park
  2026-06-10  1:29   ` sashiko-bot
  2026-06-10  1:14 ` [RFC PATCH v3 3/4] samples/damon/mtier: " SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 4/4] samples/damon/mtier: handle damon_stop() failure SeongJae Park
  3 siblings, 1 reply; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  1:14 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 14 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

damon_sample_prcl_start() callers assume it will clean up resources when
it fails.  The function does clean up after context build failures, but
not after a damon_start() failure.  As a result, when damon_start()
fails, the memory for the DAMON context is leaked.  Free the context on
that failure path to fix the issue.

Note that the issue can reliably be reproduced because the module calls
damon_start() in the exclusive mode.  For example,

    $ sudo damo start
    $ echo $$ | sudo tee /sys/module/damon_sample_prcl/parameters/target_pid
    $ echo Y | sudo tee /sys/module/damon_sample_prcl/parameters/enabled
    $ sudo cat /proc/allocinfo | grep damon_new_ctx

Because the first command starts another DAMON instance, the
damon_start() call made by the third command fails: the new DAMON
instance cannot run exclusively.  Without this fix, repeating the third
and the fourth commands shows the memory consumption only increasing,
due to the leaks.  Note that this requires sudo permission.

The issue was discovered [1] by Sashiko.

[1] https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org

Fixes: 2aca254620a8 ("samples/damon: introduce a skeleton of a smaple DAMON module for proactive reclamation")
Cc: <stable@vger.kernel.org> # 6.14.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 samples/damon/prcl.c | 4 +++-
 1 file changed, 3 insertions(+), 1 deletion(-)

diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
index b7c50f2656ce7..0db2598946911 100644
--- a/samples/damon/prcl.c
+++ b/samples/damon/prcl.c
@@ -106,8 +106,10 @@ static int damon_sample_prcl_start(void)
 	damon_set_schemes(ctx, &scheme, 1);

 	err = damon_start(&ctx, 1, true);
-	if (err)
+	if (err) {
+		damon_destroy_ctx(ctx);
 		return err;
+	}

 	repeat_call_control.data = ctx;
 	return damon_call(ctx, &repeat_call_control);
-- 
2.47.3


* Re: [RFC PATCH v3 2/4] samples/damon/prcl: handle damon_start() failure
  2026-06-10  1:14 ` [RFC PATCH v3 2/4] samples/damon/prcl: " SeongJae Park
@ 2026-06-10  1:29   ` sashiko-bot
  2026-06-10  3:52     ` SeongJae Park
  0 siblings, 1 reply; 9+ messages in thread
From: sashiko-bot @ 2026-06-10  1:29 UTC (permalink / raw)
  To: SeongJae Park; +Cc: damon

Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:

Pre-existing issues:
- [High] Loading the module dynamically with the `enabled=Y` parameter causes a permanent DAMON thread and context leak.
- [Medium] The DAMON context, targets, and schemes are leaked if `damon_call()` fails in `damon_sample_prcl_start()`.
--

commit 66049e90074b43852fd5c6e982d320d3e8718fb4
Author: SeongJae Park <sj@kernel.org>

samples/damon/prcl: handle damon_start() failure

This commit fixes a memory leak in damon_sample_prcl_start() by destroying
the DAMON context if damon_start() fails. This addresses an issue where
the context is leaked when the module fails to start in exclusive mode.

Link: https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org

> diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
> index b7c50f2656ce7..0db2598946911 100644
> --- a/samples/damon/prcl.c
> +++ b/samples/damon/prcl.c
> @@ -106,8 +106,10 @@ static int damon_sample_prcl_start(void)
>  	damon_set_schemes(ctx, &scheme, 1);
>  
>  	err = damon_start(&ctx, 1, true);
> -	if (err)
> +	if (err) {
> +		damon_destroy_ctx(ctx);
>  		return err;
> +	}
>  
>  	repeat_call_control.data = ctx;
>  	return damon_call(ctx, &repeat_call_control);

[Severity: Medium]
This is a pre-existing issue, but does this function also leak the DAMON
context, targets, and schemes if damon_call() fails?

If damon_start() succeeds, it spawns the kdamond thread. If the newly spawned
thread terminates early, for example due to a memory allocation failure for
regions_score_histogram in kdamond_fn(), it sets call_controls_obsolete.

If this happens before damon_call() acquires the call_controls_lock:

mm/damon/core.c:damon_call() {
    ...
	mutex_lock(&ctx->call_controls_lock);
	if (ctx->call_controls_obsolete) {
		mutex_unlock(&ctx->call_controls_lock);
		return -ECANCELED;
	}
    ...
}

Because damon_sample_prcl_start() returns this error directly to the caller
without cleanup, the caller receives the error and disables the module. This
permanently bypasses damon_sample_prcl_stop(), meaning damon_destroy_ctx()
is never executed.

[Severity: High]
This is also a pre-existing issue, but does loading the module dynamically
with the enabled=Y parameter cause a permanent DAMON thread and context leak?

During module load, the kernel parses parameters and invokes the module's
parameter callback before executing the init function. If enabled=Y is
passed, the callback successfully calls damon_sample_prcl_start(), which
allocates the context and starts the kdamond thread in exclusive mode.

Then, do_init_module() executes damon_sample_prcl_init():

samples/damon/prcl.c:damon_sample_prcl_init() {
    ...
	if (enabled) {
		err = damon_sample_prcl_start();
		if (err)
			return err;
	}
    ...
}

Because the enabled flag is still true, damon_sample_prcl_init() erroneously
calls damon_sample_prcl_start() a second time. This unconditionally allocates
a new ctx, overwriting the global ctx pointer and losing the reference to the
original context.

The second damon_start() call fails with -EBUSY because the first thread is
already running in exclusive mode. The original kdamond thread is left
running forever and its context pointer is permanently lost, locking out any
other DAMON instances on the system until reboot.

-- 
Sashiko AI review · https://sashiko.dev/#/patchset/20260610011420.3018-1-sj@kernel.org?part=2


* Re: [RFC PATCH v3 2/4] samples/damon/prcl: handle damon_start() failure
  2026-06-10  1:29   ` sashiko-bot
@ 2026-06-10  3:52     ` SeongJae Park
  0 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  3:52 UTC (permalink / raw)
  To: sashiko-bot; +Cc: SeongJae Park, damon

On Wed, 10 Jun 2026 01:29:48 +0000 sashiko-bot@kernel.org wrote:

> Thank you for your contribution! Sashiko AI review found 2 potential issue(s) to consider:
> 
> Pre-existing issues:
> - [High] Loading the module dynamically with the `enabled=Y` parameter causes a permanent DAMON thread and context leak.
> - [Medium] The DAMON context, targets, and schemes are leaked if `damon_call()` fails in `damon_sample_prcl_start()`.
> --
> 
> commit 66049e90074b43852fd5c6e982d320d3e8718fb4
> Author: SeongJae Park <sj@kernel.org>
> 
> samples/damon/prcl: handle damon_start() failure
> 
> This commit fixes a memory leak in damon_sample_prcl_start() by destroying
> the DAMON context if damon_start() fails. This addresses an issue where
> the context is leaked when the module fails to start in exclusive mode.
> 
> Link: https://lore.kernel.org/20260609145814.70163-1-sj@kernel.org
> 
> > diff --git a/samples/damon/prcl.c b/samples/damon/prcl.c
> > index b7c50f2656ce7..0db2598946911 100644
> > --- a/samples/damon/prcl.c
> > +++ b/samples/damon/prcl.c
> > @@ -106,8 +106,10 @@ static int damon_sample_prcl_start(void)
> >  	damon_set_schemes(ctx, &scheme, 1);
> >  
> >  	err = damon_start(&ctx, 1, true);
> > -	if (err)
> > +	if (err) {
> > +		damon_destroy_ctx(ctx);
> >  		return err;
> > +	}
> >  
> >  	repeat_call_control.data = ctx;
> >  	return damon_call(ctx, &repeat_call_control);
> 
> [Severity: Medium]
> This is a pre-existing issue, but does this function also leak the DAMON
> context, targets, and schemes if damon_call() fails?
> 
> If damon_start() succeeds, it spawns the kdamond thread. If the newly spawned
> thread terminates early, for example due to a memory allocation failure for
> regions_score_histogram in kdamond_fn(), it sets call_controls_obsolete.
> 
> If this happens before damon_call() acquires the call_controls_lock:
> 
> mm/damon/core.c:damon_call() {
>     ...
> 	mutex_lock(&ctx->call_controls_lock);
> 	if (ctx->call_controls_obsolete) {
> 		mutex_unlock(&ctx->call_controls_lock);
> 		return -ECANCELED;
> 	}
>     ...
> }
> 
> Because damon_sample_prcl_start() returns this error directly to the caller
> without cleanup, the caller receives the error and disables the module. This
> permanently bypasses damon_sample_prcl_stop(), meaning damon_destroy_ctx()
> is never executed.

Good catch, I will fix this with another patch.

> 
> [Severity: High]
> This is also a pre-existing issue, but does loading the module dynamically
> with the enabled=Y parameter cause a permanent DAMON thread and context leak?
> 
> During module load, the kernel parses parameters and invokes the module's
> parameter callback before executing the init function. If enabled=Y is
> passed, the callback successfully calls damon_sample_prcl_start(), which
> allocates the context and starts the kdamond thread in exclusive mode.
> 
> Then, do_init_module() executes damon_sample_prcl_init():
> 
> samples/damon/prcl.c:damon_sample_prcl_init() {
>     ...
> 	if (enabled) {
> 		err = damon_sample_prcl_start();
> 		if (err)
> 			return err;
> 	}
>     ...
> }
> 
> Because the enabled flag is still true, damon_sample_prcl_init() erroneously
> calls damon_sample_prcl_start() a second time. This unconditionally allocates
> a new ctx, overwriting the global ctx pointer and losing the reference to the
> original context.
> 
> The second damon_start() call fails with -EBUSY because the first thread is
> already running in exclusive mode. The original kdamond thread is left
> running forever and its context pointer is permanently lost, locking out any
> other DAMON instances on the system until reboot.

prcl cannot be dynamically loaded, so this issue cannot happen.


Thanks,
SJ

[...]


* [RFC PATCH v3 3/4] samples/damon/mtier: handle damon_start() failure
  2026-06-10  1:14 [RFC PATCH v3 0/4] samples/damon: handle damon_{start,stop}() failures SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 1/4] samples/damon/wsse: handle damon_start() failure SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 2/4] samples/damon/prcl: " SeongJae Park
@ 2026-06-10  1:14 ` SeongJae Park
  2026-06-10  1:14 ` [RFC PATCH v3 4/4] samples/damon/mtier: handle damon_stop() failure SeongJae Park
  3 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  1:14 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

damon_sample_mtier_start() callers assume it will clean up resources
when it fails.  The function does clean up after context build
failures, but not after a damon_start() failure.

As a result, when damon_start() fails, the memory for the DAMON contexts
can be leaked.  Also, if damon_start() fails for only the second
context, the first context keeps running indefinitely and, because it
runs in the exclusive mode, prevents other DAMON contexts from starting.
Stop the possibly started DAMON context and free both contexts on the
failure path to fix the issues.
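The partial-start case can be illustrated with a small user-space model.
This is a sketch under stated assumptions, not the kernel code: the
mock_* functions and the fail_at knob are hypothetical, and the model
assumes that a multi-context start brings contexts up in order and
returns at the first failure without stopping the ones already started.

```c
#include <errno.h>
#include <stdbool.h>

/* User-space model only; every mock_* name below is hypothetical. */
struct mock_ctx {
	bool allocated;
	bool running;
};

struct mock_ctx pool[2];		/* backing storage for the two contexts */
static struct mock_ctx *ctxs[2];	/* models the module-global array */

int fail_at = -1;	/* index whose start fails; -1 means no failure */

static struct mock_ctx *mock_build_ctx(int idx)
{
	pool[idx].allocated = true;
	pool[idx].running = false;
	return &pool[idx];
}

static void mock_destroy_ctx(struct mock_ctx *c)
{
	c->allocated = false;
}

/*
 * Assumed behavior: contexts start in order, and a failure returns
 * immediately without stopping the ones that already started.
 */
static int mock_start(struct mock_ctx **cs, int nr)
{
	for (int i = 0; i < nr; i++) {
		if (i == fail_at)
			return -ENOMEM;
		cs[i]->running = true;
	}
	return 0;
}

static void mock_stop(struct mock_ctx *c)
{
	c->running = false;
}

int sample_start(void)
{
	int err;

	ctxs[0] = mock_build_ctx(0);
	ctxs[1] = mock_build_ctx(1);

	err = mock_start(ctxs, 2);
	if (!err)
		return 0;

	/* Roll back: stop what started, then free both contexts. */
	if (ctxs[0]->running)
		mock_stop(ctxs[0]);
	mock_destroy_ctx(ctxs[0]);
	mock_destroy_ctx(ctxs[1]);
	return err;
}
```

With fail_at set to 1, the first context is running when the start call
returns the error, so the rollback has to stop it before freeing.  With
fail_at set to 0, nothing is running and only the frees are needed.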

Note that the issue can reliably be reproduced because the module calls
damon_start() in the exclusive mode.  For example,

    $ sudo damo start
    $ echo Y | sudo tee /sys/module/damon_sample_mtier/parameters/enabled
    $ sudo cat /proc/allocinfo | grep damon_new_ctx

Because the first command starts another DAMON instance, the
damon_start() call made by the second command fails: the new DAMON
instance cannot run exclusively.  Without this fix, repeating the second
and the third commands shows the memory consumption only increasing,
due to the leaks.  Note that this requires sudo permission.

The issue was discovered [1] by Sashiko.

[1] https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org

Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 samples/damon/mtier.c | 11 ++++++++++-
 1 file changed, 10 insertions(+), 1 deletion(-)

diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index eb1143de8df17..66b591f2180fa 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -174,6 +174,7 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
 static int damon_sample_mtier_start(void)
 {
 	struct damon_ctx *ctx;
+	int err;

 	ctx = damon_sample_mtier_build_ctx(true);
 	if (!ctx)
@@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
 		return -ENOMEM;
 	}
 	ctxs[1] = ctx;
-	return damon_start(ctxs, 2, true);
+	err = damon_start(ctxs, 2, true);
+	if (!err)
+		return 0;
+
+	if (damon_is_running(ctxs[0]))
+		damon_stop(ctxs, 1);
+	damon_destroy_ctx(ctxs[0]);
+	damon_destroy_ctx(ctxs[1]);
+	return err;
 }

 static void damon_sample_mtier_stop(void)
-- 
2.47.3


* [RFC PATCH v3 4/4] samples/damon/mtier: handle damon_stop() failure
  2026-06-10  1:14 [RFC PATCH v3 0/4] samples/damon: handle damon_{start,stop}() failures SeongJae Park
                   ` (2 preceding siblings ...)
  2026-06-10  1:14 ` [RFC PATCH v3 3/4] samples/damon/mtier: " SeongJae Park
@ 2026-06-10  1:14 ` SeongJae Park
  3 siblings, 0 replies; 9+ messages in thread
From: SeongJae Park @ 2026-06-10  1:14 UTC (permalink / raw)
  Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
	linux-mm

damon_sample_mtier_stop() assumes its damon_stop() call always
successfully stops the two DAMON contexts.  Hence it deallocates the two
DAMON contexts right after the damon_stop() call.  However, if a given
context is already stopped, damon_stop() fails and returns an error,
leaving the DAMON contexts that have not yet stopped running.  Such an
unexpected early DAMON context stop could happen due to memory
allocation failures in kdamond_fn().  Because damon_sample_mtier_stop()
still deallocates all DAMON contexts, along with the damon_target and
damon_region objects linked to them, the unstopped DAMON context
(kdamond) ends up using memory that was already freed (use-after-free).
Fix the issue by invoking damon_stop() separately for each context.

Note that DAMON_SYSFS also allows running multiple DAMON contexts.  But
it calls damon_stop() for each context one by one.  Hence this issue
exists only in mtier.

In the long term, it would be better to refactor damon_stop() so that it
always stops all contexts regardless of failures in the middle.  Keep
this fix in its current form, though, so that it stays simple and easy
to backport.  I will do the refactoring later.
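The difference between the two stop strategies can be sketched in user
space.  This is only a model of the idea, not the planned refactoring:
mock_stop_one(), stop_until_error(), and stop_all() are hypothetical
names, and the -EPERM value for an already-stopped context is chosen for
illustration.

```c
#include <errno.h>
#include <stdbool.h>

/* User-space model only; all names below are hypothetical. */
struct mock_ctx {
	bool running;
};

/* Stopping a context that is not running is reported as an error. */
static int mock_stop_one(struct mock_ctx *c)
{
	if (!c->running)
		return -EPERM;	/* error value chosen for illustration */
	c->running = false;
	return 0;
}

/* Models the behavior described above: give up at the first failure. */
int stop_until_error(struct mock_ctx **cs, int nr)
{
	for (int i = 0; i < nr; i++) {
		int err = mock_stop_one(cs[i]);

		if (err)
			return err;	/* later contexts keep running */
	}
	return 0;
}

/* Possible refactoring: try every context, report the first error. */
int stop_all(struct mock_ctx **cs, int nr)
{
	int first_err = 0;

	for (int i = 0; i < nr; i++) {
		int err = mock_stop_one(cs[i]);

		if (err && !first_err)
			first_err = err;
	}
	return first_err;
}
```

Given a first context that already stopped and a second one that is
still running, stop_until_error() returns the error and leaves the second
context running, while stop_all() returns the same error but stops the
second context first.  Calling the stop function once per context, as
this patch does, gives the second behavior without changing damon_stop().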

The issue was discovered [1] by Sashiko.

[1] https://lore.kernel.org/20260609014219.3013-1-sj@kernel.org

Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
 samples/damon/mtier.c | 3 ++-
 1 file changed, 2 insertions(+), 1 deletion(-)

diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index 66b591f2180fa..faaaaa12e6206 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -199,7 +199,8 @@ static int damon_sample_mtier_start(void)

 static void damon_sample_mtier_stop(void)
 {
-	damon_stop(ctxs, 2);
+	damon_stop(ctxs, 1);
+	damon_stop(&ctxs[1], 1);
 	damon_destroy_ctx(ctxs[0]);
 	damon_destroy_ctx(ctxs[1]);
 }
-- 
2.47.3
