* [RFC PATCH] samples/damon/mtier: handle damon_start() failure
@ 2026-06-09 0:54 SeongJae Park
2026-06-09 1:06 ` sashiko-bot
0 siblings, 1 reply; 4+ messages in thread
From: SeongJae Park @ 2026-06-09 0:54 UTC (permalink / raw)
Cc: SeongJae Park, # 6 . 16 . x, Andrew Morton, damon, linux-kernel,
linux-mm
Callers of damon_sample_mtier_start() assume that it cleans up its
resources when it fails. The function does so for context buildup
failures, but not for damon_start() failure.

As a result, when damon_start() fails, the memory for the DAMON contexts
can be leaked. Also, if damon_start() fails for only the second
context, the first context keeps running indefinitely and prevents other
DAMON contexts from starting, since it is running in the exclusive mode.
To fix these issues, stop the possibly started DAMON context and free
the contexts when damon_start() fails.
Note that the issue can reliably be reproduced because the module calls
damon_start() in the exclusive mode. For example:

    $ sudo damo start
    $ echo Y | sudo tee /sys/module/damon_sample_mtier/parameters/enabled
    $ sudo cat /proc/allocinfo | grep damon_new_ctx

Because the first command starts another DAMON instance, the
damon_start() call triggered by the second command fails: the new DAMON
instance cannot run exclusively. Without this fix, repeating the second
and the third commands shows the memory consumption only increasing due
to the leaks. The reproduction requires sudo permission, though.
The issue was discovered [1] by Sashiko.
[1] https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
Fixes: 82a08bde3cf7 ("samples/damon: implement a DAMON module for memory tiering")
Cc: <stable@vger.kernel.org> # 6.16.x
Signed-off-by: SeongJae Park <sj@kernel.org>
---
samples/damon/mtier.c | 11 ++++++++++-
1 file changed, 10 insertions(+), 1 deletion(-)
diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
index eb1143de8df17..66b591f2180fa 100644
--- a/samples/damon/mtier.c
+++ b/samples/damon/mtier.c
@@ -174,6 +174,7 @@ static struct damon_ctx *damon_sample_mtier_build_ctx(bool promote)
 static int damon_sample_mtier_start(void)
 {
 	struct damon_ctx *ctx;
+	int err;
 
 	ctx = damon_sample_mtier_build_ctx(true);
 	if (!ctx)
@@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
 		return -ENOMEM;
 	}
 	ctxs[1] = ctx;
-	return damon_start(ctxs, 2, true);
+	err = damon_start(ctxs, 2, true);
+	if (!err)
+		return 0;
+
+	if (damon_is_running(ctxs[0]))
+		damon_stop(ctxs, 1);
+	damon_destroy_ctx(ctxs[0]);
+	damon_destroy_ctx(ctxs[1]);
+	return err;
 }
 
 static void damon_sample_mtier_stop(void)
base-commit: 947b8ee1c1735e548454493da9999a2647621bb0
--
2.47.3
^ permalink raw reply related [flat|nested] 4+ messages in thread
* Re: [RFC PATCH] samples/damon/mtier: handle damon_start() failure
2026-06-09 0:54 [RFC PATCH] samples/damon/mtier: handle damon_start() failure SeongJae Park
@ 2026-06-09 1:06 ` sashiko-bot
2026-06-09 1:42 ` SeongJae Park
0 siblings, 1 reply; 4+ messages in thread
From: sashiko-bot @ 2026-06-09 1:06 UTC (permalink / raw)
To: SeongJae Park; +Cc: damon

Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:

Pre-existing issues:
- [High] `damon_stop()` aborts its loop on the first error, leaving subsequent contexts running and triggering a Use-After-Free when their memory is destroyed.
--

commit 6da87efa9d6982132beff965026bf155183e5c93
Author: SeongJae Park <sj@kernel.org>

samples/damon/mtier: handle damon_start() failure

This commit adds cleanup for damon_start() failures in the memory tiering
DAMON module. When damon_start() fails, it stops any running contexts
and frees them to prevent memory leaks and exclusive mode locks.

Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org

> diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> index eb1143de8df17..66b591f2180fa 100644
> --- a/samples/damon/mtier.c
> +++ b/samples/damon/mtier.c
[ ... ]
> @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
>  		return -ENOMEM;
>  	}
>  	ctxs[1] = ctx;
> -	return damon_start(ctxs, 2, true);
> +	err = damon_start(ctxs, 2, true);
> +	if (!err)
> +		return 0;
> +
> +	if (damon_is_running(ctxs[0]))
> +		damon_stop(ctxs, 1);
> +	damon_destroy_ctx(ctxs[0]);
> +	damon_destroy_ctx(ctxs[1]);
> +	return err;
>  }
>
>  static void damon_sample_mtier_stop(void)

[Severity: High]
This is a pre-existing issue, but does damon_stop() leave contexts running
and cause a use-after-free here?

Looking at damon_stop() in mm/damon/core.c:

int damon_stop(struct damon_ctx **ctxs, int nr_ctxs)
{
	int i, err = 0;

	for (i = 0; i < nr_ctxs; i++) {
		err = __damon_stop(ctxs[i]);
		if (err)
			break;
	}
	return err;
}

If __damon_stop() returns an error for the first context (for instance, if
the kthread exited prematurely due to an allocation failure), the loop
breaks immediately.

When damon_sample_mtier_stop() calls damon_stop(ctxs, 2), this would mean
ctxs[1] is never stopped if stopping ctxs[0] returns an error.

The subsequent unconditional calls to damon_destroy_ctx(ctxs[1]) in
damon_sample_mtier_stop() would then free the context, its targets, and its
schemes while kdamond is still executing and dereferencing them.

Can this lead to a use-after-free?

--
Sashiko AI review · https://sashiko.dev/#/patchset/20260609005443.2122-1-sj@kernel.org?part=1
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH] samples/damon/mtier: handle damon_start() failure
2026-06-09 1:06 ` sashiko-bot
@ 2026-06-09 1:42 ` SeongJae Park
2026-06-11 14:25 ` XIAO WU
0 siblings, 1 reply; 4+ messages in thread
From: SeongJae Park @ 2026-06-09 1:42 UTC (permalink / raw)
To: sashiko-bot; +Cc: SeongJae Park, damon

On Tue, 09 Jun 2026 01:06:15 +0000 sashiko-bot@kernel.org wrote:

> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>
> Pre-existing issues:
> - [High] `damon_stop()` aborts its loop on the first error, leaving subsequent contexts running and triggering a Use-After-Free when their memory is destroyed.
> --
>
> commit 6da87efa9d6982132beff965026bf155183e5c93
> Author: SeongJae Park <sj@kernel.org>
>
> samples/damon/mtier: handle damon_start() failure
>
> This commit adds cleanup for damon_start() failures in the memory tiering
> DAMON module. When damon_start() fails, it stops any running contexts
> and frees them to prevent memory leaks and exclusive mode locks.
>
> Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
>
> > diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
> > index eb1143de8df17..66b591f2180fa 100644
> > --- a/samples/damon/mtier.c
> > +++ b/samples/damon/mtier.c
> [ ... ]
> > @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
> >  		return -ENOMEM;
> >  	}
> >  	ctxs[1] = ctx;
> > -	return damon_start(ctxs, 2, true);
> > +	err = damon_start(ctxs, 2, true);
> > +	if (!err)
> > +		return 0;
> > +
> > +	if (damon_is_running(ctxs[0]))
> > +		damon_stop(ctxs, 1);
> > +	damon_destroy_ctx(ctxs[0]);
> > +	damon_destroy_ctx(ctxs[1]);
> > +	return err;
> >  }
> >
> >  static void damon_sample_mtier_stop(void)
>
> [Severity: High]
> This is a pre-existing issue,

So not a blocker of this patch.

> but does damon_stop() leave contexts running
> and cause a use-after-free here?
>
> Looking at damon_stop() in mm/damon/core.c:
>
> int damon_stop(struct damon_ctx **ctxs, int nr_ctxs)
> {
> 	int i, err = 0;
>
> 	for (i = 0; i < nr_ctxs; i++) {
> 		err = __damon_stop(ctxs[i]);
> 		if (err)
> 			break;
> 	}
> 	return err;
> }
>
> If __damon_stop() returns an error for the first context (for instance, if
> the kthread exited prematurely due to an allocation failure), the loop
> breaks immediately.
>
> When damon_sample_mtier_stop() calls damon_stop(ctxs, 2), this would mean
> ctxs[1] is never stopped if stopping ctxs[0] returns an error.
>
> The subsequent unconditional calls to damon_destroy_ctx(ctxs[1]) in
> damon_sample_mtier_stop() would then free the context, its targets, and its
> schemes while kdamond is still executing and dereferencing them.
>
> Can this lead to a use-after-free?

Makes sense. I will separately work on this.


Thanks,
SJ

[...]
^ permalink raw reply [flat|nested] 4+ messages in thread
* Re: [RFC PATCH] samples/damon/mtier: handle damon_start() failure
2026-06-09 1:42 ` SeongJae Park
@ 2026-06-11 14:25 ` XIAO WU
0 siblings, 0 replies; 4+ messages in thread
From: XIAO WU @ 2026-06-11 14:25 UTC (permalink / raw)
To: SeongJae Park, sashiko-bot; +Cc: damon

Hi,

I've written a PoC for this issue, which may be helpful. Please see the
details and the PoC code below:

On 9/6/26 9:42, SeongJae Park wrote:
> On Tue, 09 Jun 2026 01:06:15 +0000 sashiko-bot@kernel.org wrote:
>
>> Thank you for your contribution! Sashiko AI review found 1 potential issue(s) to consider:
>>
>> Pre-existing issues:
>> - [High] `damon_stop()` aborts its loop on the first error, leaving subsequent contexts running and triggering a Use-After-Free when their memory is destroyed.
>> --
>>
>> commit 6da87efa9d6982132beff965026bf155183e5c93
>> Author: SeongJae Park <sj@kernel.org>
>>
>> samples/damon/mtier: handle damon_start() failure
>>
>> This commit adds cleanup for damon_start() failures in the memory tiering
>> DAMON module. When damon_start() fails, it stops any running contexts
>> and frees them to prevent memory leaks and exclusive mode locks.
>>
>> Link: https://lore.kernel.org/20260608112455.274231F00893@smtp.kernel.org
>>
>>> diff --git a/samples/damon/mtier.c b/samples/damon/mtier.c
>>> index eb1143de8df17..66b591f2180fa 100644
>>> --- a/samples/damon/mtier.c
>>> +++ b/samples/damon/mtier.c
>> [ ... ]
>>> @@ -185,7 +186,15 @@ static int damon_sample_mtier_start(void)
>>>  		return -ENOMEM;
>>>  	}
>>>  	ctxs[1] = ctx;
>>> -	return damon_start(ctxs, 2, true);
>>> +	err = damon_start(ctxs, 2, true);
>>> +	if (!err)
>>> +		return 0;
>>> +
>>> +	if (damon_is_running(ctxs[0]))
>>> +		damon_stop(ctxs, 1);
>>> +	damon_destroy_ctx(ctxs[0]);
>>> +	damon_destroy_ctx(ctxs[1]);
>>> +	return err;
>>>  }
>>>
>>>  static void damon_sample_mtier_stop(void)
>>
>> [Severity: High]
>> This is a pre-existing issue,
>
> So not a blocker of this patch.
>
>> but does damon_stop() leave contexts running
>> and cause a use-after-free here?
>>
>> Looking at damon_stop() in mm/damon/core.c:
>>
>> int damon_stop(struct damon_ctx **ctxs, int nr_ctxs)
>> {
>> 	int i, err = 0;
>>
>> 	for (i = 0; i < nr_ctxs; i++) {
>> 		err = __damon_stop(ctxs[i]);
>> 		if (err)
>> 			break;
>> 	}
>> 	return err;
>> }
>>
>> If __damon_stop() returns an error for the first context (for instance, if
>> the kthread exited prematurely due to an allocation failure), the loop
>> breaks immediately.
>>
>> When damon_sample_mtier_stop() calls damon_stop(ctxs, 2), this would mean
>> ctxs[1] is never stopped if stopping ctxs[0] returns an error.
>>
>> The subsequent unconditional calls to damon_destroy_ctx(ctxs[1]) in
>> damon_sample_mtier_stop() would then free the context, its targets, and its
>> schemes while kdamond is still executing and dereferencing them.
>>
>> Can this lead to a use-after-free?
>
> Makes sense. I will separately work on this.
>
>
> Thanks,
> SJ
>
> [...]

--- [PoC Start] ---
/*
 * POC for UAF in damon_sample_mtier
 *
 * Uses failslab to make kdamond_fn's kmalloc_array fail.
 * This causes kdamond to exit spontaneously, setting ctx->kdamond=NULL.
 * Then damon_sample_mtier_stop hits the break-early bug.
 */
#define _GNU_SOURCE
#include <stdio.h>
#include <stdlib.h>
#include <string.h>
#include <unistd.h>
#include <fcntl.h>

#define ENABLED_PATH "/sys/module/damon_sample_mtier/parameters/enabled"

/* Write string s to the file at path; report, but tolerate, failures. */
static void wf(const char *path, const char *s)
{
	int fd = open(path, O_WRONLY);

	if (fd < 0)
		return;
	if (write(fd, s, strlen(s)) < 0)
		perror(path);
	close(fd);
}

int main(void)
{
	int i;

	/* Suppress verbose failslab messages */
	wf("/sys/kernel/debug/failslab/verbose", "0");
	wf("/sys/kernel/debug/failslab/verbose_ratelimit_burst", "10000");
	wf("/sys/kernel/debug/failslab/verbose_ratelimit_interval_ms", "60000");

	/* Enable failslab: 1% probability, unlimited fails */
	wf("/sys/kernel/debug/failslab/probability", "1");
	wf("/sys/kernel/debug/failslab/times", "-1");
	wf("/sys/kernel/debug/failslab/interval", "1");
	wf("/sys/kernel/debug/failslab/task-filter", "0");
	wf("/sys/kernel/debug/failslab/ignore-gfp-wait", "0");

	fprintf(stderr, "[poc] failslab: 1%% unlimited, verbose=0. Start loop\n");

	/* Repeatedly enable and disable the module to race with failslab */
	for (i = 0; i < 20000; i++) {
		wf(ENABLED_PATH, "Y");
		wf(ENABLED_PATH, "N");
		if ((i + 1) % 2000 == 0)
			fprintf(stderr, "[poc] %d/20000\n", i + 1);
	}

	/* Disable failslab again */
	wf("/sys/kernel/debug/failslab/probability", "0");
	wf("/sys/kernel/debug/failslab/times", "0");

	fprintf(stderr, "[poc] Done. Checking dmesg...\n");
	system("dmesg | grep -i -E 'BUG|KASAN|use-after-free|UAF|Oops|general protection|kernel BUG|WARNING' | tail -20");
	return 0;
}
--- [PoC End] ---

And some explanations here:

--- [EXPLANATIONS BEGIN] ---
1. Bug Summary

**Commit**: `6da87efa9d6982132beff965026bf155183e5c93` - "samples/damon/mtier: handle damon_start() failure"
**Bug Type**: Use-After-Free (UAF)
**Location**: `samples/damon/mtier.c` - `damon_sample_mtier_stop()`

2. Root Cause

The commit adds cleanup code in `damon_sample_mtier_start()` for when
`damon_start()` fails:

```c
err = damon_start(ctxs, 2, true);
if (!err)
	return 0;

if (damon_is_running(ctxs[0]))
	damon_stop(ctxs, 1);
damon_destroy_ctx(ctxs[0]);
damon_destroy_ctx(ctxs[1]);
return err;
```

The bug: `damon_stop()` in `mm/damon/core.c` loops over contexts and breaks
on the first error:

```c
int damon_stop(struct damon_ctx **ctxs, int nr_ctxs)
{
	int i, err = 0;

	for (i = 0; i < nr_ctxs; i++) {
		err = __damon_stop(ctxs[i]);
		if (err)
			break;
	}
	return err;
}
```

`__damon_stop()` returns `-EPERM` if `ctx->kdamond` is NULL (the kdamond
thread already exited).

If ctxs[0]'s kdamond thread has already exited due to an internal failure
(e.g., `kmalloc_array(GFP_KERNEL)` for `regions_score_histogram` fails
inside `kdamond_fn`), then, when `damon_sample_mtier_stop()` runs:

1. `damon_stop(ctxs, 2)` calls `__damon_stop(ctxs[0])` → returns -EPERM → loop breaks
2. `__damon_stop(ctxs[1])` is **never called**
3. `damon_destroy_ctx(ctxs[1])` frees ctxs[1] while its kdamond thread is still running
4. The running kdamond continues dereferencing freed ctxs[1] memory → USE-AFTER-FREE
--- [EXPLANATIONS END] ---

To fix this, damon_stop() should probably not abort on the first error,
ensuring it iterates through all contexts to stop any surviving threads.

Hope this helps with the patch.

Best regards,
XIAO WU
^ permalink raw reply [flat|nested] 4+ messages in thread
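
The direction suggested in this thread, that damon_stop() keep iterating
after an error, can be illustrated with a small userspace model. This is
only a sketch under stated assumptions, not the kernel implementation and
not a proposed patch: `struct model_ctx`, `model_stop_one()`,
`model_stop_break_early()` and `model_stop_all()` are hypothetical names
invented for illustration. The `running` flag stands in for a non-NULL
`ctx->kdamond`, and returning the first error from the continue-through
variant is one possible policy among several.

```c
#include <assert.h>
#include <errno.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace stand-in for struct damon_ctx. */
struct model_ctx {
	bool running;	/* models "ctx->kdamond != NULL" */
};

/* Models __damon_stop(): -EPERM if the worker has already exited. */
static int model_stop_one(struct model_ctx *ctx)
{
	if (!ctx->running)
		return -EPERM;
	ctx->running = false;
	return 0;
}

/* Models the damon_stop() loop quoted in this thread: break on first error. */
static int model_stop_break_early(struct model_ctx **ctxs, int nr_ctxs)
{
	int i, err = 0;

	assert(ctxs != NULL);
	for (i = 0; i < nr_ctxs; i++) {
		err = model_stop_one(ctxs[i]);
		if (err)
			break;
	}
	return err;
}

/* Sketch of the suggestion: visit every context, report the first error. */
static int model_stop_all(struct model_ctx **ctxs, int nr_ctxs)
{
	int i, err, first_err = 0;

	assert(ctxs != NULL);
	for (i = 0; i < nr_ctxs; i++) {
		err = model_stop_one(ctxs[i]);
		if (err && !first_err)
			first_err = err;
	}
	return first_err;
}
```

With a first context that has already exited and a second one still
running, the break-early variant returns -EPERM and leaves the second
context running, while the continue-through variant returns the same
error but stops the second context before the caller frees it.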