From: SeongJae Park <sj@kernel.org>
To: SeongJae Park <sj@kernel.org>
Cc: "# 6 . 14 . x" <stable@vger.kernel.org>,
Andrew Morton <akpm@linux-foundation.org>,
damon@lists.linux.dev, linux-kernel@vger.kernel.org,
linux-mm@kvack.org
Subject: Re: (sashiko review) [RFC PATCH 2/2] mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock
Date: Thu, 26 Mar 2026 06:51:46 -0700 [thread overview]
Message-ID: <20260326135146.90670-1-sj@kernel.org> (raw)
In-Reply-To: <20260326134330.90521-1-sj@kernel.org>
On Thu, 26 Mar 2026 06:43:29 -0700 SeongJae Park <sj@kernel.org> wrote:
> Forwarding Sashiko review for doing discussions via mails.
>
> # review url: https://sashiko.dev/#/patchset/20260326062347.88569-3-sj@kernel.org
> # start of sashiko.dev inline review
> commit b886bf00a72b1462f29533c91c431cd2ba3fa3a6
> Author: SeongJae Park <sj@kernel.org>
> mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock
>
> This commit introduces a call_controls_obsolete flag to protect damon_call()
> from racing with the kdamond termination sequence. This replaces the previous
> after-registration damon_is_running() check.
>
> Link: https://lore.kernel.org/20260325141956.87144-1-sj@kernel.org
>
> > diff --git a/mm/damon/core.c b/mm/damon/core.c
> > index db6c67e52d2b8..a2b553e2c5a81 100644
> > --- a/mm/damon/core.c
> > +++ b/mm/damon/core.c
> [ ... ]
> > @@ -1629,10 +1600,12 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
> > INIT_LIST_HEAD(&control->list);
> >
> > mutex_lock(&ctx->call_controls_lock);
> > + if (ctx->call_controls_obsolete) {
> > + mutex_unlock(&ctx->call_controls_lock);
> > + return -ECANCELED;
> > + }
> > list_add_tail(&control->list, &ctx->call_controls);
> > mutex_unlock(&ctx->call_controls_lock);
> > - if (!damon_is_running(ctx))
> > - return damon_call_handle_inactive_ctx(ctx, control);
>
> Does this removal cause an indefinite wait if damon_call() is invoked on an
> unstarted context?
>
> For a newly allocated damon_ctx, call_controls_obsolete is zero-initialized
> to false. Without the damon_is_running() check, if a caller invokes
> damon_call() before damon_start(), it appears the request will be queued and
> block on wait_for_completion() indefinitely since no kdamond thread is
> running to handle or cancel it.
Yes. But it is clearly wrong usage of this function. I will add a comment
clarifying this, e.g., "this function shouldn't be called for unstarted DAMON
context. In the case, it could be indefinitely sleep."
>
> > if (control->repeat)
> > return 0;
> > wait_for_completion(&control->completion);
> [ ... ]
> > @@ -3062,6 +3038,9 @@ static int kdamond_fn(void *data)
> > damon_destroy_targets(ctx);
> >
> > kfree(ctx->regions_score_histogram);
> > + mutex_lock(&ctx->call_controls_lock);
> > + ctx->call_controls_obsolete = true;
> > + mutex_unlock(&ctx->call_controls_lock);
> > kdamond_call(ctx, true);
> > damos_walk_cancel(ctx);
>
> Can this sequence also cause a spurious -ECANCELED failure when a context is
> restarted?
>
> When restarting a context, kdamond_fn() signals kdamond_started before
> calling kdamond_init_ctx():
>
> complete(&ctx->kdamond_started);
> kdamond_init_ctx(ctx);
>
> If damon_start() unblocks and immediately invokes damon_call(), couldn't it
> observe the stale call_controls_obsolete == true from the previous run and
> return -ECANCELED, since kdamond_init_ctx() has not yet reset the flag?
Good catch, I will do the call_controls_obsolete unset before the complete()
call.
>
> Also, is damos_walk() vulnerable to the same termination race being fixed
> here for damon_call()?
>
> In kdamond_fn(), damos_walk_cancel() processes cancellations before
> ctx->kdamond is set to NULL. If damos_walk() is called during this window,
> it checks damon_is_running() (which still returns true), sets the walk
> control, and blocks on wait_for_completion().
>
> Since cancellations are already processed, will it wait indefinitely?
Yes. I'm working on it for another patch.
Thanks,
SJ
[...]
next prev parent reply other threads:[~2026-03-26 13:51 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2026-03-26 6:23 [RFC PATCH 0/2] mm/damon: fix damon_call()-related leak and deadlock SeongJae Park
2026-03-26 6:23 ` [RFC PATCH 1/2] mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails SeongJae Park
2026-03-26 6:23 ` [RFC PATCH 2/2] mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock SeongJae Park
2026-03-26 13:43 ` (sashiko review) " SeongJae Park
2026-03-26 13:51 ` SeongJae Park [this message]
2026-03-26 13:42 ` (sashiko status) [RFC PATCH 0/2] mm/damon: fix damon_call()-related leak and deadlock SeongJae Park
2026-03-26 14:53 ` SeongJae Park
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20260326135146.90670-1-sj@kernel.org \
--to=sj@kernel.org \
--cc=akpm@linux-foundation.org \
--cc=damon@lists.linux.dev \
--cc=linux-kernel@vger.kernel.org \
--cc=linux-mm@kvack.org \
--cc=stable@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.