From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id 5B1C5D0D78F for ; Fri, 11 Oct 2024 13:47:58 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:In-Reply-To:Content-Type: MIME-Version:References:Message-ID:Subject:Cc:To:From:Date:Reply-To: Content-Transfer-Encoding:Content-ID:Content-Description:Resent-Date: Resent-From:Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=G0zePhX2qRMm3p+7+KAONmi6XBDdyQq4fokJt3btxS4=; b=2Tcq68njx72fB6vmBAwmxynnAm Jg5BSYBCRLpC7Q5T95NAu19lH0f34cgep/Cj6I+HpYDPgQo0Pi0Q+snadsZo3o54JJAbRmpQ37pNY wZqDqPDpOXNK0JHe4XULCZSqK3xzqp6+sHnih1Uh6XCT5SBcm677A6uX8N6K1Ivk2YG2A7KdD5FPH 4A72LTm1623S75/Z6jdZz0BusJl1bWFpjQn63b9M7TvYRtf0eec/wMdfuh0XDvQULmWEkRPAMet0F At7UwqK/QIV898fcMgUxQzQ8xd9+Hhf23cYO50Vi18e8tpLoxqYJaTmg6QcGv+3Lm95aXsMbk8cwZ 8krK0Gdw==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1szFzg-0000000GVeA-2MLx; Fri, 11 Oct 2024 13:47:48 +0000 Received: from foss.arm.com ([217.140.110.172]) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1szFvF-0000000GUL6-0ym2 for linux-arm-kernel@lists.infradead.org; Fri, 11 Oct 2024 13:43:14 +0000 Received: from usa-sjc-imap-foss1.foss.arm.com (unknown [10.121.207.14]) by usa-sjc-mx-foss1.foss.arm.com (Postfix) with ESMTP id 86298497; Fri, 11 Oct 2024 06:43:40 -0700 (PDT) Received: from pluto (usa-sjc-mx-foss1.foss.arm.com [172.31.20.19]) by usa-sjc-imap-foss1.foss.arm.com (Postfix) with ESMTPSA id 4D01A3F5A1; Fri, 11 Oct 2024 06:43:09 -0700 (PDT) Date: Fri, 11 Oct 2024 14:43:01 +0100 From: Cristian Marussi To: Justin Chen Cc: arm-scmi@vger.kernel.org, cristian.marussi@arm.com, sudeep.holla@arm.com, linux-arm-kernel@lists.infradead.org, peng.fan@nxp.com, bcm-kernel-feedback-list@broadcom.com, florian.fainelli@broadcom.com Subject: Re: [PATCH v2] firmware: arm_scmi: Queue in scmi layer for mailbox implementation Message-ID: References: <20241009192637.1090238-1-justin.chen@broadcom.com> MIME-Version: 1.0 Content-Type: text/plain; charset=us-ascii Content-Disposition: inline In-Reply-To: <20241009192637.1090238-1-justin.chen@broadcom.com> X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20241011_064313_383665_D7DC9D16 X-CRM114-Status: GOOD ( 33.23 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On Wed, Oct 09, 2024 at 12:26:37PM -0700, Justin Chen wrote: > send_message() does not block in the MBOX implementation. This is > because the mailbox layer has its own queue. However, this confuses > the per xfer timeouts as they all start their timeout ticks in > parallel. > > Consider a case where the xfer timeout is 30ms and a SCMI transaction > takes 25ms. > > 0ms: Message #0 is queued in mailbox layer and sent out, then sits > at scmi_wait_for_message_response() with a timeout of 30ms > 1ms: Message #1 is queued in mailbox layer but not sent out yet. > Since send_message() doesn't block, it also sits at > scmi_wait_for_message_response() with a timeout of 30ms > ... > 25ms: Message #0 is completed, txdone is called and Message #1 is > sent out > 31ms: Message #1 times out since the count started at 1ms. Even > though it has only been inflight for 6ms. > > Fixes: b53515fa177c ("firmware: arm_scmi: Make MBOX transport a standalone driver") > Signed-off-by: Justin Chen > --- > > Changes in v2: Hi Justin, thanks. A few nitpicks and one remark down below. > > - Added Fixes tag > - Improved commit message to better capture the issue > > .../firmware/arm_scmi/transports/mailbox.c | 21 +++++++++++++------ > 1 file changed, 15 insertions(+), 6 deletions(-) > > diff --git a/drivers/firmware/arm_scmi/transports/mailbox.c b/drivers/firmware/arm_scmi/transports/mailbox.c > index 1a754dee24f7..30bc2865582f 100644 > --- a/drivers/firmware/arm_scmi/transports/mailbox.c > +++ b/drivers/firmware/arm_scmi/transports/mailbox.c > @@ -33,6 +33,7 @@ struct scmi_mailbox { > struct mbox_chan *chan_platform_receiver; > struct scmi_chan_info *cinfo; > struct scmi_shared_mem __iomem *shmem; > + struct mutex chan_lock; Missing Doxygen comment.... arm_scmi/transports/mailbox.c:39: warning: Function parameter or struct member 'chan_lock' not described in 'scmi_mailbox > }; > > #define client_to_scmi_mailbox(c) container_of(c, struct scmi_mailbox, cl) > @@ -205,6 +206,7 @@ static int mailbox_chan_setup(struct scmi_chan_info *cinfo, struct device *dev, > cl->rx_callback = rx_callback; > cl->tx_block = false; > cl->knows_txdone = tx; > + mutex_init(&smbox->chan_lock); This could be move at the end of this function after the channels are requested and it is no more possible to fail and bail out....messages wont flow and lock wont be used anyway until this chan_setup completes... ...BUT I have NOT string opinion about this....you can leave it here too...up to you > > smbox->chan = mbox_request_channel(cl, tx ? 0 : p2a_chan); > if (IS_ERR(smbox->chan)) { > @@ -267,11 +269,21 @@ static int mailbox_send_message(struct scmi_chan_info *cinfo, > struct scmi_mailbox *smbox = cinfo->transport_info; > int ret; > > + /* > + * The mailbox layer has it's own queue. However the mailbox queue confuses its own queue > + * the per message SCMI timeouts since the clock starts when the message is > + * submitted into the mailbox queue. So when multiple messages are queued up > + * the clock starts on all messages instead of only the one inflight. > + */ > + mutex_lock(&smbox->chan_lock); > + > ret = mbox_send_message(smbox->chan, xfer); > > /* mbox_send_message returns non-negative value on success, so reset */ > if (ret > 0) > ret = 0; > + else > + mutex_unlock(&smbox->chan_lock); I think this should be else if (ret < 0) mutex_unlock(&smbox->chan_lock); ...since looking at mbox_send_message() and its implementation it returns NON-Negative integers on Success...so 0 from mbox_send_mmessage() also means SUCCESS and we should not release the mutex (I think the 'ret' returned here is the idx from add_to_rbuf...so it will become zero peridiocally on normal successfull operation) > > return ret; > } > @@ -281,13 +293,10 @@ static void mailbox_mark_txdone(struct scmi_chan_info *cinfo, int ret, > { > struct scmi_mailbox *smbox = cinfo->transport_info; > > - /* > - * NOTE: we might prefer not to need the mailbox ticker to manage the > - * transfer queueing since the protocol layer queues things by itself. > - * Unfortunately, we have to kick the mailbox framework after we have > - * received our message. > - */ > mbox_client_txdone(smbox->chan, ret); > + > + /* Release channel */ > + mutex_unlock(&smbox->chan_lock); > } > > static void mailbox_fetch_response(struct scmi_chan_info *cinfo, > -- > 2.34.1 > I gave it a go on a couple of JUNO, without any issues. Other than the above, LGTM. Reviewed-by: Cristian Marussi Tested-by: Cristian Marussi Thanks, Cristian