From mboxrd@z Thu Jan 1 00:00:00 1970 Return-Path: X-Spam-Checker-Version: SpamAssassin 3.4.0 (2014-02-07) on aws-us-west-2-korg-lkml-1.web.codeaurora.org Received: from bombadil.infradead.org (bombadil.infradead.org [198.137.202.133]) (using TLSv1.2 with cipher ECDHE-RSA-AES256-GCM-SHA384 (256/256 bits)) (No client certificate requested) by smtp.lore.kernel.org (Postfix) with ESMTPS id DABAFCEFC4C for ; Tue, 8 Oct 2024 19:42:15 +0000 (UTC) DKIM-Signature: v=1; a=rsa-sha256; q=dns/txt; c=relaxed/relaxed; d=lists.infradead.org; s=bombadil.20210309; h=Sender:List-Subscribe:List-Help :List-Post:List-Archive:List-Unsubscribe:List-Id:Content-Transfer-Encoding: Content-Type:In-Reply-To:References:Cc:To:From:Subject:MIME-Version:Date: Message-ID:Reply-To:Content-ID:Content-Description:Resent-Date:Resent-From: Resent-Sender:Resent-To:Resent-Cc:Resent-Message-ID:List-Owner; bh=dXD5jbEF7TBtPzV3enOnwQwebN+HoFp78mk6IJcIZHw=; b=Hi/UfPwd+GX43aMdJfOpWQv1fB xK4tQ4jsJcOrxp+wAzSWI4wgovSNy+J5qUtawZ1tH6K9xkzUusveWBJkQT/KVYKUyxGRvwQbtLM+n B7RtPbc0UbcC+LLOZidbOUqQlp38F/lQFKJoIJgH5wp6fcStfxa5plzqB67TxENtUvUCPV28GOtA4 cQ/5zvNp6SYwzQ55YNZgFjB03DeeFoWbON2Qp8rj3CJH7yxFF10gAknQVena/aTwItAfgWJi742Cl gmdQyvzHDkGdd7vBQ7sKs3UR0GCFaXD3uZvowYzr+RgSCQqxpfFj3ncEqizvfBpTTDpDAQiig72Hm pO/gNIZQ==; Received: from localhost ([::1] helo=bombadil.infradead.org) by bombadil.infradead.org with esmtp (Exim 4.98 #2 (Red Hat Linux)) id 1syG5p-00000006y9S-1JxC; Tue, 08 Oct 2024 19:42:01 +0000 Received: from mail-pl1-x631.google.com ([2607:f8b0:4864:20::631]) by bombadil.infradead.org with esmtps (Exim 4.98 #2 (Red Hat Linux)) id 1syG4W-00000006y2w-0CZM for linux-arm-kernel@lists.infradead.org; Tue, 08 Oct 2024 19:40:41 +0000 Received: by mail-pl1-x631.google.com with SMTP id d9443c01a7336-20c544d34bcso10346495ad.1 for ; Tue, 08 Oct 2024 12:40:39 -0700 (PDT) DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=broadcom.com; s=google; t=1728416439; x=1729021239; darn=lists.infradead.org; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id:from:to :cc:subject:date:message-id:reply-to; bh=dXD5jbEF7TBtPzV3enOnwQwebN+HoFp78mk6IJcIZHw=; b=ROwY86OzffEtxc8clapMzZ3goswSb/wK/i99lYmV8gY4I0cmI0hQPHG6kieUPCQj+D lzsrszRegHuxaAs42K91Sikfe/2+nDvDsqF8ELVTqYviCx/XAVakzsxW6XrB8p0P33C0 U/Fx/xGZxSEvh9nOuS+Jr6vSvTTpI2QZinDJQ= X-Google-DKIM-Signature: v=1; a=rsa-sha256; c=relaxed/relaxed; d=1e100.net; s=20230601; t=1728416439; x=1729021239; h=content-transfer-encoding:in-reply-to:content-language:references :cc:to:from:subject:user-agent:mime-version:date:message-id :x-gm-message-state:from:to:cc:subject:date:message-id:reply-to; bh=dXD5jbEF7TBtPzV3enOnwQwebN+HoFp78mk6IJcIZHw=; b=PP6W4n5Z8gRksDwr1DQ9ziRCzj7uO3qtCqkbdJiU4BHR4YniIwwH32dLm4ol6hZ2Ya P98EwI1CrMLR8a1UAt5vmZtBlufQlGyhr8lmm48H7OEmFX/4GDbatdsc7Oq7spr0TCy5 sa1DT/U9gCoDsdVojVX96X3yyg1crYdlE6CiC8kGUSdUwc+6iXBqxX/Xb4UT4whkDnkP 5KHNXEDLkbRlZ8agUMyND5T5LNY8dOXXn0pMmK+49E0/sLAFGcygk+DPTOWM0InDTj6u lmPcu0AlG7xMLJIDQ6PVj7Twmh/G0VUmAy68/1ZLiQfMKPwXmmLYlJegmenIV2cIKoA3 97Wg== X-Forwarded-Encrypted: i=1; AJvYcCWHMgw9QpC8GpDkn4p2qOcCSFW2G/qy+MYlAATNj4bW64GpwsJfWiKLwBvFM1VsUwGpk0RhBAn/NrysWF+lLuO2@lists.infradead.org X-Gm-Message-State: AOJu0YyG7eKBAFQL7VQGNhCgBRAAc08xArxGQHnfN5T6Qut2ezMVSH1Q rabKk8scOHpanGt/I0fwuUkm1uGpj/ODt+6jctfgF9xd7fLl02m0/hZFfW6UmQ== X-Google-Smtp-Source: AGHT+IHbgqBDXjQU1Boquz7ZlU7Y9QW1WgIfO5pEmEsB2Cw8p/omvogvQYCfksQhpWLK29AHzqgO7g== X-Received: by 2002:a17:902:d491:b0:20b:983c:f095 with SMTP id d9443c01a7336-20c6378fc02mr1471115ad.51.1728416438782; Tue, 08 Oct 2024 12:40:38 -0700 (PDT) Received: from [10.69.69.40] ([192.19.223.252]) by smtp.gmail.com with ESMTPSA id d9443c01a7336-20c590b44fasm13534055ad.225.2024.10.08.12.40.37 (version=TLS1_3 cipher=TLS_AES_128_GCM_SHA256 bits=128/128); Tue, 08 Oct 2024 12:40:38 -0700 (PDT) Message-ID: <8f671ba8-db6e-4370-aeb8-16ec733eeb09@broadcom.com> Date: Tue, 8 Oct 2024 12:40:37 -0700 MIME-Version: 1.0 User-Agent: Mozilla Thunderbird Subject: Re: [PATCH] firmware: arm_scmi: Queue in scmi layer for mailbox implementation From: Justin Chen To: Cristian Marussi , Sudeep Holla Cc: arm-scmi@vger.kernel.org, linux-arm-kernel@lists.infradead.org, peng.fan@nxp.com, bcm-kernel-feedback-list@broadcom.com, florian.fainelli@broadcom.com References: <20241004221257.2888603-1-justin.chen@broadcom.com> <1ad5c4e9-9f98-40ab-afa4-a7939781e8cc@broadcom.com> Content-Language: en-US In-Reply-To: Content-Type: text/plain; charset=UTF-8; format=flowed Content-Transfer-Encoding: 7bit X-CRM114-Version: 20100106-BlameMichelson ( TRE 0.8.0 (BSD) ) MR-646709E3 X-CRM114-CacheID: sfid-20241008_124040_130780_0031BF18 X-CRM114-Status: GOOD ( 20.17 ) X-BeenThere: linux-arm-kernel@lists.infradead.org X-Mailman-Version: 2.1.34 Precedence: list List-Id: List-Unsubscribe: , List-Archive: List-Post: List-Help: List-Subscribe: , Sender: "linux-arm-kernel" Errors-To: linux-arm-kernel-bounces+linux-arm-kernel=archiver.kernel.org@lists.infradead.org On 10/8/24 12:23 PM, Justin Chen wrote: > > > On 10/8/24 6:37 AM, Cristian Marussi wrote: >> On Tue, Oct 08, 2024 at 02:34:59PM +0100, Sudeep Holla wrote: >>> On Tue, Oct 08, 2024 at 02:23:00PM +0100, Sudeep Holla wrote: >>>> On Tue, Oct 08, 2024 at 01:10:39PM +0100, Cristian Marussi wrote: >>>>> On Mon, Oct 07, 2024 at 10:58:47AM -0700, Justin Chen wrote: >>>>>> Thanks for the response. I'll try to elaborate. >>>>>> >>>>>> When comparing SMC and mailbox transport, we noticed mailbox >>>>>> transport >>>>>> timesout much quicker when under load. Originally we thought this >>>>>> was the >>>>>> latency of the mailbox implementation, but after debugging we >>>>>> noticed a >>>>>> weird behavior. We saw SMCI transactions timing out before the >>>>>> mailbox even >>>>>> transmitted the message. >>>>>> >>>>>> This issue lies in the SCMI layer. drivers/firmware/arm_scmi/driver.c >>>>>> do_xfer() function. >>>>>> >>>>>> The fundamental issue is send_message() blocks for SMC transport, but >>>>>> doesn't block for mailbox transport. So if send_message() doesn't >>>>>> block we >>>>>> can have multiple messages waiting at >>>>>> scmi_wait_for_message_response(). >>>>>> >>>>> >>>>> oh...yes...now I can see it...tx_prepare is really never called given >>>>> how the mailbox subsystem de-queues messages once at time...so we >>>>> end up >>>>> waiting for a reply to some message that is still to be sent...so the >>>>> message inflight is really NOT corrupted because the next remain >>>>> pending >>>>> until the reply in the shmem is read back , BUT the timeout will >>>>> drift away >>>>> if you multiple inflights are pending to be sent... >>>>> >>>> >>>> Indeed. >>>> >>>>>> SMC looks like this >>>>>> CPU #0 SCMI message 0 -> calls send_message() then calls >>>>>> scmi_wait_for_message_response(), timesout after 30ms. >>>>>> CPU #1 SCMI message 1 -> blocks at send_message() waiting for SCMI >>>>>> message 0 >>>>>> to complete. >>>>>> >>>>>> Mailbox looks like this >>>>>> CPU #0 SCMI message 0 -> calls send_message(), mailbox layer >>>>>> queues up >>>>>> message, mailbox layer sees no message is outgoing and sends it. >>>>>> CPU waits >>>>>> at scmi_wait_for_message_response(), timesout after 30ms >>>>>> CPU #1 SCMI message 1 -> calls send_message(), mailbox layer >>>>>> queues up >>>>>> message, mailbox layer sees message pending, hold message in >>>>>> queue. CPU >>>>>> waits at scmi_wait_for_message_response(), timesout after 30ms. >>>>>> >>>>>> Lets say if transport takes 25ms. The first message would succeed, >>>>>> the >>>>>> second message would timeout after 5ms. >>>>>> >>>>>> Hopefully this makes sense. >>>>> >>>>> Yes, of course, thanks, for reporting this, and taking time to >>>>> explain... >>>>> >>>>> ...in general the patch LGTM...I think your patch is good also >>>>> because it >>>>> could be easily backported as a fix....can you add a Fixes tag in your >>>>> next version ? >>>>> >>>> >>>> Are you seeing this issue a lot ? IOW, do we need this to be >>>> backported ? >>>> > > I wouldn't say a lot. But we are seeing it with standard use of our > devices running over an extended amount of time. Yes we would like this > backported. > Any advice on what Fixes tag to add here? Since the driver was moved to the transport folder, I assume this is going to be a manual backport. Not sure if the Fixes tag should be when the SCMI MBOX transport was moved to a standalone driver in the transport folder or the very first SCMI commits since the MBOX transport was added then. Thanks, Justin