From: Sudeep Holla <sudeep.holla@arm.com>
To: jack21 <jackhuang021@gmail.com>
Cc: Cristian Marussi <cristian.marussi@arm.com>,
Dan Carpenter <dan.carpenter@linaro.org>,
<arm-scmi@vger.kernel.org>,
<linux-arm-kernel@lists.infradead.org>,
<linux-kernel@vger.kernel.org>,
Huangjie <huangjie1663@phytium.com.cn>
Subject: Re: [PATCH] dirvers: scmi: poll again when transfer reach timeout
Date: Tue, 11 Feb 2025 15:37:27 +0000
Message-ID: <Z6tutxs2UE7vjjJ6@bogus>
In-Reply-To: <Z5KdJa0QOIYIIRv4@pluto>
On Thu, Jan 23, 2025 at 07:48:53PM +0000, Cristian Marussi wrote:
> On Thu, Jan 23, 2025 at 01:38:30PM +0300, Dan Carpenter wrote:
> > s/dirvers/drivers/
> >
> > On Thu, Jan 23, 2025 at 04:33:24PM +0800, jack21 wrote:
> > > From: Huangjie <huangjie1663@phytium.com.cn>
> > >
> > > spin_until_cond() does not actually hold a spin lock, so a timeout may
> > > occur on a preemptible kernel when preempted after spin_until_cond().
> > >
> > > We check the status again when the timeout is reached to prevent an
> > > incorrect judgement of timeout.
> > >
>
> Hi,
>
> probably another not so short email of mine :P ...
>
> >
> > I suspect the real issue is that we exit the spin loop when
> > try_wait_for_completion(&xfer->done) is true. Probably we should add
> > that as a Fixes tag?:
> >
>
> The Kernel SCMI stack, acting as an SCMI agent, has to cope with the
> possible (even though rare) scenario of receiving Out of Order messages
> when dealing with async commands...
>
> ...IOW, what to do if, after having issued an AsyncCmd, the Delayed response
> is received BEFORE the corresponding immediate reply to the initial request:
> such a wicked (and rare) situation could be the result of a misbehaving
> platform server OR simply due to parallelization of activities on the A2P
> and P2A on some transports, i.e. platform sent reply and delayed_response
> in the correct order but the transport delivered those OoO, being
> transmitted on 2 distinct physical channels...hard but not impossible.
>
> Kernel side we address this OoO scenario by assuming that, if a valid
> Delayed Response to an in-flight Async-cmd is received on the P2A chan,
> before the immediate A2P reply, the transaction itself is good and we
> can progress by just logging and ignoring/swallowing the missing
> immediate-reply, that will probably arrive later, and just carry on
> processing the Delayed Response.
>
> In order to do that, we maintain, in fact, a per-message state-machine,
> and inside scmi_msg_response_validate(), when we detect the OoO condition,
> we cause the wait-for-immediate-reply to terminate by forcibly issuing a
> complete(&xfer->done)....
>
> ...this works straight away for the non-polled IRQ transactions since it
> causes the wait_for_completion() to terminate cleanly, BUT in order to
> also cut short the busy-wait in the polling case we need that additional
> try_wait_for_completion()....so as not to spin forever for a message
> that we don't care about anymore...
> [ note that any late-arriving immediate-reply will be discarded at this
> point ]
>
> For this reason, I think that this patch, while correctly checking the
> poll_done() condition when the spinloop terminates for the reason
> explained in the commit-message, should also check if the loop was not
> instead forcibly terminated by the OoO scenario...if not, we will end up
> considering the polling to have timed out, while instead it was forcibly
> terminated by the OoO state machine complete() and we just want to ignore
> such a missing message...
>
> So what about checking this instead (untested):
>
> + if (!completion_done(&xfer->done) &&
> + !info->desc->ops->poll_done(cinfo, xfer))

Were you able to check this? I am waiting to hear back from you on this.
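
To make the three outcomes Cristian describes above easier to reason about,
here is a small user-space sketch of the post-spin decision. It is a model
only, not kernel code: struct model_xfer, its two fields, and
model_poll_result() are hypothetical names invented for illustration, and the
way the real driver would observe a forced complete(&xfer->done) (the
completion_done() check suggested above is untested) is deliberately left out.

```c
#include <errno.h>
#include <stdbool.h>

/*
 * User-space model only: none of these names exist in the kernel.
 * forced_complete stands for "the OoO state machine issued a complete()
 * on the transfer"; poll_done stands for the transport's poll_done()
 * result sampled once more after the spin loop has exited.
 */
struct model_xfer {
	bool forced_complete;
	bool poll_done;
};

/*
 * Decide the outcome once the busy-wait has ended.
 * deadline_passed is true when the loop exited because the timeout
 * expired (possibly only because the task was preempted meanwhile).
 */
static int model_poll_result(const struct model_xfer *x, bool deadline_passed)
{
	if (!deadline_passed)
		return 0;

	/* Report a timeout only if neither exit condition actually holds. */
	if (!x->forced_complete && !x->poll_done)
		return -ETIMEDOUT;

	return 0;
}
```

With the deadline passed, a transfer whose reply did arrive (poll_done set)
or whose wait was cut short by the OoO path (forced_complete set) is treated
as success in this model; only the case where both are false is reported as a
timeout.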
--
Regards,
Sudeep