All of lore.kernel.org
 help / color / mirror / Atom feed
From: SeongJae Park <sj@kernel.org>
To: SeongJae Park <sj@kernel.org>
Cc: "# 6 . 14 . x" <stable@vger.kernel.org>,
	Andrew Morton <akpm@linux-foundation.org>,
	damon@lists.linux.dev, linux-kernel@vger.kernel.org,
	linux-mm@kvack.org
Subject: Re: (sashiko review) [RFC PATCH 2/2] mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock
Date: Thu, 26 Mar 2026 06:43:29 -0700	[thread overview]
Message-ID: <20260326134330.90521-1-sj@kernel.org> (raw)
In-Reply-To: <20260326062347.88569-3-sj@kernel.org>

Forwarding Sashiko review for doing discussions via mails.

# review url: https://sashiko.dev/#/patchset/20260326062347.88569-3-sj@kernel.org
# start of sashiko.dev inline review
commit b886bf00a72b1462f29533c91c431cd2ba3fa3a6
Author: SeongJae Park <sj@kernel.org>
mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock

This commit introduces a call_controls_obsolete flag to protect damon_call()
from racing with the kdamond termination sequence. This replaces the previous
after-registration damon_is_running() check.

Link: https://lore.kernel.org/20260325141956.87144-1-sj@kernel.org

> diff --git a/mm/damon/core.c b/mm/damon/core.c
> index db6c67e52d2b8..a2b553e2c5a81 100644
> --- a/mm/damon/core.c
> +++ b/mm/damon/core.c
[ ... ]
> @@ -1629,10 +1600,12 @@ int damon_call(struct damon_ctx *ctx, struct damon_call_control *control)
>  	INIT_LIST_HEAD(&control->list);
>  
>  	mutex_lock(&ctx->call_controls_lock);
> +	if (ctx->call_controls_obsolete) {
> +		mutex_unlock(&ctx->call_controls_lock);
> +		return -ECANCELED;
> +	}
>  	list_add_tail(&control->list, &ctx->call_controls);
>  	mutex_unlock(&ctx->call_controls_lock);
> -	if (!damon_is_running(ctx))
> -		return damon_call_handle_inactive_ctx(ctx, control);

Does this removal cause an indefinite wait if damon_call() is invoked on an
unstarted context?

For a newly allocated damon_ctx, call_controls_obsolete is zero-initialized
to false. Without the damon_is_running() check, if a caller invokes
damon_call() before damon_start(), it appears the request will be queued and
block on wait_for_completion() indefinitely since no kdamond thread is
running to handle or cancel it.

>  	if (control->repeat)
>  		return 0;
>  	wait_for_completion(&control->completion);
[ ... ]
> @@ -3062,6 +3038,9 @@ static int kdamond_fn(void *data)
>  	damon_destroy_targets(ctx);
>  
>  	kfree(ctx->regions_score_histogram);
> +	mutex_lock(&ctx->call_controls_lock);
> +	ctx->call_controls_obsolete = true;
> +	mutex_unlock(&ctx->call_controls_lock);
>  	kdamond_call(ctx, true);
>  	damos_walk_cancel(ctx);

Can this sequence also cause a spurious -ECANCELED failure when a context is
restarted?

When restarting a context, kdamond_fn() signals kdamond_started before
calling kdamond_init_ctx():

        complete(&ctx->kdamond_started);
        kdamond_init_ctx(ctx);

If damon_start() unblocks and immediately invokes damon_call(), couldn't it
observe the stale call_controls_obsolete == true from the previous run and
return -ECANCELED, since kdamond_init_ctx() has not yet reset the flag?

Also, is damos_walk() vulnerable to the same termination race being fixed
here for damon_call()?

In kdamond_fn(), damos_walk_cancel() processes cancellations before
ctx->kdamond is set to NULL. If damos_walk() is called during this window,
it checks damon_is_running() (which still returns true), sets the walk
control, and blocks on wait_for_completion().

Since cancellations are already processed, will it wait indefinitely?


# end of sashiko.dev inline review
# review url: https://sashiko.dev/#/patchset/20260326062347.88569-3-sj@kernel.org
#
# hkml [1] generated a draft of this mail.  It can be regenerated
# using below command:
#
#     hkml patch sashiko_dev --for_forwarding \
#             20260326062347.88569-3-sj@kernel.org
#
# [1] https://github.com/sjp38/hackermail

Sent using hkml (https://github.com/sjp38/hackermail)

  reply	other threads:[~2026-03-26 13:43 UTC|newest]

Thread overview: 7+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-03-26  6:23 [RFC PATCH 0/2] mm/damon: fix damon_call()-related leak and deadlock SeongJae Park
2026-03-26  6:23 ` [RFC PATCH 1/2] mm/damon/sysfs: dealloc repeat_call_control if damon_call() fails SeongJae Park
2026-03-26  6:23 ` [RFC PATCH 2/2] mm/damon/core: fix damon_call() vs kdamond_fn() exit race deadlock SeongJae Park
2026-03-26 13:43   ` SeongJae Park [this message]
2026-03-26 13:51     ` (sashiko review) " SeongJae Park
2026-03-26 13:42 ` (sashiko status) [RFC PATCH 0/2] mm/damon: fix damon_call()-related leak and deadlock SeongJae Park
2026-03-26 14:53   ` SeongJae Park

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20260326134330.90521-1-sj@kernel.org \
    --to=sj@kernel.org \
    --cc=akpm@linux-foundation.org \
    --cc=damon@lists.linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-mm@kvack.org \
    --cc=stable@vger.kernel.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.