From mboxrd@z Thu Jan 1 00:00:00 1970 Received: from smtp.kernel.org (aws-us-west-2-korg-mail-1.web.codeaurora.org [10.30.226.201]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.subspace.kernel.org (Postfix) with ESMTPS id A782A40DFDA for ; Tue, 5 May 2026 23:40:55 +0000 (UTC) Authentication-Results: smtp.subspace.kernel.org; arc=none smtp.client-ip=10.30.226.201 ARC-Seal:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778024455; cv=none; b=ULlbkbQiGjiqxZ05+gTt6FV1C6qvcOOfwaRR+qjRsjbVpgXpCW7lE6ol3YMnvAUUngiNl94kXZxR1KXjT6twIQorute2vmxjmCgavt3C5xQhV8HltSGO6jpYuiFJ32Cfox5SY55E+QmRwL7fwb86eqFenzRlB6EZzZLWEGVrJCo= ARC-Message-Signature:i=1; a=rsa-sha256; d=subspace.kernel.org; s=arc-20240116; t=1778024455; c=relaxed/simple; bh=f20BOxJoF77F75xIuq1SIbdRBJaMXVGiM1e8pvOxdvk=; h=Message-ID:Date:MIME-Version:Subject:To:Cc:References:From: In-Reply-To:Content-Type; b=m4G9b+IxAaF3tmj6Y1CG3kiRB1GziC4W0o30aCpiFdq+HNwKlAvUC5YCIocWeCgIa8jiEiGI9xv3G8FHlvK1wyyNKJgbGGhSmRr8kN2mghtImqr6Z5dupK+yxLisrmsud2P9FQSPTM5qaWl77OJ1selPsAaGQYb/mCv9K51g1Tg= ARC-Authentication-Results:i=1; smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b=O3bNbkbt; arc=none smtp.client-ip=10.30.226.201 Authentication-Results: smtp.subspace.kernel.org; dkim=pass (2048-bit key) header.d=kernel.org header.i=@kernel.org header.b="O3bNbkbt" Received: by smtp.kernel.org (Postfix) with ESMTPSA id F01EBC2BCB4; Tue, 5 May 2026 23:40:54 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/simple; d=kernel.org; s=k20201202; t=1778024455; bh=f20BOxJoF77F75xIuq1SIbdRBJaMXVGiM1e8pvOxdvk=; h=Date:Subject:To:Cc:References:From:In-Reply-To:From; b=O3bNbkbtz/JoykujKm2mCDqQxEAmuAYRY6O4s9erx6GH9JpLpLx9hXcPscyjN6Pq/ kBb9qO/VyJVejNgs40TYTaHvCQqbB3pQsHBPXwpV3OuQzv3WBEN9jIX85d5n18dDrD HBc/WXKwLI+kjgMutO/4zP0AvAc9a4TdzWPxJadxpPBbgT9mOdQqD3gLxZQ3zzlNhy ikF9o2L8gscNX4Ha/kCw2uxJ7dwsgBp7Rutat+0veKn6OPyFlv1+Ytskw38RvrBID+ Ylrdpga+rVNEFg7E3bxwLyP+mnipcGOurVe+/ePX+WAi94Dp0g0qZ75IRH1YN32a6w 0skYxQQIbVoKQ== Message-ID: <4307e321-ea7b-4798-890c-adba537fc5e0@kernel.org> Date: Tue, 5 May 2026 18:40:53 -0500 Precedence: bulk X-Mailing-List: linux-kernel@vger.kernel.org List-Id: List-Subscribe: List-Unsubscribe: MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] firmware: stratix10-rsu: avoid RSU mutex deadlock on update Content-Language: en-US To: tze.yee.ng@altera.com, linux-kernel@vger.kernel.org Cc: Adrian Ng Ho Yin , Nazim Amirul References: From: Dinh Nguyen In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 8bit From sashiko claude-sonnet-4-6, On 5/4/26 21:51, tze.yee.ng@altera.com wrote: > From: Tze Yee Ng > > Use mutex_trylock() in rsu_send_msg() instead of mutex_lock() so callers > that already hold priv->lock or race with an in-flight RSU operation get > -EAGAIN instead of blocking forever. > > In reboot_image_store(), treat -EAGAIN as a no-op success so sysfs write > does not hang when the RSU path is busy. > > Signed-off-by: Tze Yee Ng > --- > drivers/firmware/stratix10-rsu.c | 15 ++++++++++++--- > 1 file changed, 12 insertions(+), 3 deletions(-) > > diff --git a/drivers/firmware/stratix10-rsu.c b/drivers/firmware/stratix10-rsu.c > index e1912108a0fe..f3ee0b4e49a8 100644 > --- a/drivers/firmware/stratix10-rsu.c > +++ b/drivers/firmware/stratix10-rsu.c > @@ -254,7 +254,13 @@ static void rsu_async_get_spt_table_callback(struct device *dev, > * is necessary to get RSU boot log or set the address of bitstream to > * boot after reboot. > * > - * Returns 0 on success or -ETIMEDOUT on error. > + * Return: > + * * 0 on success > + * * -EAGAIN if the driver mutex could not be acquired > + * * a negative errno from stratix10_svc_send() on failure to send the request > + * * -ETIMEDOUT if waiting for the SMC callback times out > + * * a negative errno from wait_for_completion_interruptible_timeout() if > + * interrupted or otherwise signaled while waiting (for example -ERESTARTSYS) > */ > static int rsu_send_msg(struct stratix10_rsu_priv *priv, > enum stratix10_svc_command_code command, > @@ -264,7 +270,8 @@ static int rsu_send_msg(struct stratix10_rsu_priv *priv, > struct stratix10_svc_client_msg msg; > int ret; > > - mutex_lock(&priv->lock); > + if (!mutex_trylock(&priv->lock)) > + return -EAGAIN; The commit message says this prevents deadlocks from "callers that already hold priv->lock." Does such a caller actually exist? Looking at the driver, priv->lock is initialized in stratix10_rsu_probe() and is acquired only inside rsu_send_msg() itself. The completion callbacks (rsu_command_callback and friends) call complete() without taking priv->lock, so there is no reentrant acquisition path through callbacks either. Two concurrent sysfs writes would serialize correctly under the original mutex_lock() — the second writer would block until the first completed. That is the expected behavior for a one-shot firmware configuration command, not a deadlock. > reinit_completion(&priv->completion); > priv->client.receive_cb = callback; > > @@ -597,7 +604,9 @@ static ssize_t reboot_image_store(struct device *dev, > > ret = rsu_send_msg(priv, COMMAND_RSU_UPDATE, > address, rsu_command_callback); > - if (ret) { > + if (ret == -EAGAIN) { > + return 0; Two concerns here. First, does returning 0 violate the sysfs store callback contract? Sysfs store handlers are expected to return the number of bytes consumed on success, or a negative errno on error. Returning 0 tells kernfs that zero bytes were consumed, which is a short write. A POSIX-compliant write loop seeing 0 bytes returned on a non-zero write will retry indefinitely. The intended silent no-op should return count, not 0. Second, when mutex_trylock fails, COMMAND_RSU_UPDATE is never sent to the firmware, yet reboot_image_store() returns without an error. The operator's request to change the persistent boot image is silently dropped with no log message and no error to userspace. On the next reboot the device will boot the old image rather than the one that was written to the attribute. For a one-shot persistent firmware state change, is silently discarding the request the right tradeoff? Blocking in a sysfs store callback is permitted, so keeping the original mutex_lock() would let concurrent writers serialize correctly without any data loss. Dinh