From: "Falkowski, Maciej" <maciej.falkowski@linux.intel.com>
To: Lizhi Hou <lizhi.hou@amd.com>,
	ogabbay@kernel.org, quic_jhugo@quicinc.com,
	dri-devel@lists.freedesktop.org
Cc: linux-kernel@vger.kernel.org, max.zhen@amd.com,
	sonal.santan@amd.com, mario.limonciello@amd.com
Subject: Re: [PATCH] accel/amdxdna: Fix deadlock between context destroy and job timeout
Date: Thu, 13 Nov 2025 17:45:28 +0100
Message-ID: <9ae59f7e-9d99-4e73-a805-99586d8f49bb@linux.intel.com>
In-Reply-To: <20251107181050.1293125-1-lizhi.hou@amd.com>

Reviewed-by: Maciej Falkowski <maciej.falkowski@linux.intel.com>

You could add some lockdep assertions for dev_lock to better track its 
state.
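
As a rough illustration of the suggestion above: in the kernel this would
presumably be a lockdep_assert_held(&xdna->dev_lock) at the entry of
functions that expect the lock, such as aie2_hwctx_fini(). The userspace
sketch below only models that contract. The names dev_lock_acquire(),
dev_lock_release(), dev_lock_held and hwctx_fini_model() are hypothetical
and not part of the driver.

```c
#include <assert.h>
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical userspace stand-in for xdna->dev_lock. */
static pthread_mutex_t dev_lock = PTHREAD_MUTEX_INITIALIZER;

/* Per-thread flag: true while the calling thread holds dev_lock. */
static _Thread_local bool dev_lock_held;

static void dev_lock_acquire(void)
{
	pthread_mutex_lock(&dev_lock);
	dev_lock_held = true;
}

static void dev_lock_release(void)
{
	dev_lock_held = false;
	pthread_mutex_unlock(&dev_lock);
}

/*
 * Models a function with a "caller holds dev_lock" contract.
 * Returns 0 if the contract is met and -1 otherwise. A kernel
 * function would use lockdep_assert_held() at this point instead.
 */
static int hwctx_fini_model(void)
{
	if (!dev_lock_held)
		return -1;
	return 0;
}
```

Calling hwctx_fini_model() without the lock returns -1, and calling it
between dev_lock_acquire() and dev_lock_release() returns 0. Unlike this
model, lockdep_assert_held() emits a warning rather than returning an
error, and it compiles away when lockdep is disabled.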

Best regards,
Maciej

On 11/7/2025 7:10 PM, Lizhi Hou wrote:
> The hardware context destroy function holds dev_lock while waiting for
> all jobs to complete. The timeout job also needs to acquire dev_lock,
> which leads to a deadlock.
>
> Fix the issue by temporarily releasing dev_lock before waiting for all
> jobs to finish, and reacquiring it afterward.
>
> Fixes: 4fd6ca90fc7f ("accel/amdxdna: Refactor hardware context destroy routine")
> Signed-off-by: Lizhi Hou <lizhi.hou@amd.com>
> ---
>   drivers/accel/amdxdna/aie2_ctx.c | 6 ++++--
>   1 file changed, 4 insertions(+), 2 deletions(-)
>
> diff --git a/drivers/accel/amdxdna/aie2_ctx.c b/drivers/accel/amdxdna/aie2_ctx.c
> index bdc90fe8a47e..42d876a427c5 100644
> --- a/drivers/accel/amdxdna/aie2_ctx.c
> +++ b/drivers/accel/amdxdna/aie2_ctx.c
> @@ -690,17 +690,19 @@ void aie2_hwctx_fini(struct amdxdna_hwctx *hwctx)
>   	xdna = hwctx->client->xdna;
>   
>   	XDNA_DBG(xdna, "%s sequence number %lld", hwctx->name, hwctx->priv->seq);
> -	drm_sched_entity_destroy(&hwctx->priv->entity);
> -
>   	aie2_hwctx_wait_for_idle(hwctx);
>   
>   	/* Request fw to destroy hwctx and cancel the rest pending requests */
>   	aie2_release_resource(hwctx);
>   
> +	mutex_unlock(&xdna->dev_lock);
> +	drm_sched_entity_destroy(&hwctx->priv->entity);
> +
>   	/* Wait for all submitted jobs to be completed or canceled */
>   	wait_event(hwctx->priv->job_free_wq,
>   		   atomic64_read(&hwctx->job_submit_cnt) ==
>   		   atomic64_read(&hwctx->job_free_cnt));
> +	mutex_lock(&xdna->dev_lock);
>   
>   	drm_sched_fini(&hwctx->priv->sched);
>   	aie2_ctx_syncobj_destroy(hwctx);
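
The unlock/wait/relock sequence in the quoted patch can be modelled in
userspace roughly as follows. This is a sketch under assumptions, not the
driver code: xdna_dev_lock, cnt_lock, timeout_handler(),
wait_for_jobs_unlocked() and run_destroy_model() are hypothetical names,
a condition variable stands in for the wait queue, and the model assumes
the timeout path is what eventually bumps the free count.

```c
#include <assert.h>
#include <pthread.h>

/* Hypothetical userspace stand-ins for the state named in the patch. */
static pthread_mutex_t xdna_dev_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_mutex_t cnt_lock = PTHREAD_MUTEX_INITIALIZER;
static pthread_cond_t job_free_wq = PTHREAD_COND_INITIALIZER;
static long job_submit_cnt;
static long job_free_cnt;

/* Models the timeout path: it needs dev_lock before it can free the job. */
static void *timeout_handler(void *arg)
{
	(void)arg;
	pthread_mutex_lock(&xdna_dev_lock);
	pthread_mutex_lock(&cnt_lock);
	job_free_cnt++;
	pthread_cond_broadcast(&job_free_wq);
	pthread_mutex_unlock(&cnt_lock);
	pthread_mutex_unlock(&xdna_dev_lock);
	return NULL;
}

/* Models the patched wait: entered with dev_lock held, returns with it held. */
static void wait_for_jobs_unlocked(void)
{
	/* Drop dev_lock so the timeout path can make progress. */
	pthread_mutex_unlock(&xdna_dev_lock);

	pthread_mutex_lock(&cnt_lock);
	while (job_submit_cnt != job_free_cnt)
		pthread_cond_wait(&job_free_wq, &cnt_lock);
	pthread_mutex_unlock(&cnt_lock);

	/* Reacquire dev_lock to restore the caller's locking state. */
	pthread_mutex_lock(&xdna_dev_lock);
}

/* Runs one destroy with one outstanding job; returns the freed-job count. */
static long run_destroy_model(void)
{
	pthread_t t;
	long freed;

	job_submit_cnt = 1;
	job_free_cnt = 0;

	pthread_mutex_lock(&xdna_dev_lock);
	if (pthread_create(&t, NULL, timeout_handler, NULL) != 0) {
		pthread_mutex_unlock(&xdna_dev_lock);
		return -1;
	}
	wait_for_jobs_unlocked();
	freed = job_free_cnt;
	pthread_mutex_unlock(&xdna_dev_lock);
	pthread_join(t, NULL);
	return freed;
}
```

If wait_for_jobs_unlocked() kept xdna_dev_lock held across the wait,
timeout_handler() would block on the lock forever and the counters would
never match, which is the deadlock the patch describes.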
