From: Arnaud POULIQUEN <arnaud.pouliquen@foss.st.com>
To: Mathieu Poirier <mathieu.poirier@linaro.org>
Cc: Tim Blechmann <tim.blechmann@gmail.com>,
<linux-remoteproc@vger.kernel.org>,
Tim Blechmann <tim@klingt.org>
Subject: Re: [PATCH 1/1] rpmsg: virtio_rpmsg_bus - prevent possible race condition
Date: Wed, 13 Sep 2023 12:10:39 +0200
Message-ID: <9f6f19ad-1985-7e37-d89e-16ba239ad6a4@foss.st.com>
In-Reply-To: <ZPZBVS3R/oZuUmk5@p14s>

On 9/4/23 22:43, Mathieu Poirier wrote:
> On Mon, Sep 04, 2023 at 03:52:56PM +0200, Arnaud POULIQUEN wrote:
>> Hello Tim,
>>
>> On 9/4/23 10:36, Tim Blechmann wrote:
>>> when we cannot get a tx buffer (`get_a_tx_buf`) `rpmsg_upref_sleepers`
>>> enables tx-complete interrupt.
>>> however if the interrupt is executed after `get_a_tx_buf` and before
>>> `rpmsg_upref_sleepers` we may miss the tx-complete interrupt and sleep
>>> for the full 15 seconds.
>>
>>
>> Is there any reason why your co-processor is unable to release the TX RPMSG
>> buffers for 15 seconds? If not, you should first determine the reason why it is
>> stalled.
>
> Arnaud's concern is valid. If the remote processor can't consume a buffer
> within 15 seconds, something is probably wrong.
>
> That said, I believe your assessment of the situation is correct. *If* the TX
> callback is disabled and there is no buffer available, there is a window of
> opportunity between calls to get_a_tx_buf() and rpmsg_upref_sleepers() for an
> interrupt to arrive in function rpmsg_send_offchannel_raw().
>
> From here three things need to happen:
>
> 1) You send another version of this patch with a changelog that uses proper
> English, i.e. capital letters where they are needed and no spelling mistakes.
>
> 2) Arnaud confirms our suspicions.
It seems to me that this patch is not needed:
- wait_event_interruptible_timeout() already appears to test the condition
  (and therefore to call get_a_tx_buf()) before going to sleep [1].
- The ftraces show that the vq interrupt is not called during the 15-second
  period, so it is normal behavior that vrp->sendq is never woken up.

Tim needs to analyze why no mailbox interrupt occurs.

[1] https://elixir.bootlin.com/linux/latest/source/include/linux/wait.h#L534
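
To illustrate the point about the condition being tested before sleeping, here
is a minimal userspace sketch of the control flow. It is not the kernel macro:
struct fake_pool, fake_get_a_tx_buf(), fake_sleep_one_tick() and
model_wait_event_timeout() are hypothetical names made up for this example,
time is counted in abstract ticks, and signal handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the rpmsg TX buffer pool (not kernel code). */
struct fake_pool {
	int free_bufs;      /* buffers currently available */
	int cond_checks;    /* how often the wait condition was evaluated */
	int sleeps;         /* how often the caller actually went to sleep */
	int release_after;  /* sleep count at which a buffer is freed; 0 = never */
};

/* Models the condition "(msg = get_a_tx_buf(vrp))": take a buffer if one is free. */
static bool fake_get_a_tx_buf(struct fake_pool *p)
{
	p->cond_checks++;
	if (p->free_bufs > 0) {
		p->free_bufs--;
		return true;
	}
	return false;
}

/* Models sleeping for one tick; a "tx-complete" event may happen meanwhile. */
static void fake_sleep_one_tick(struct fake_pool *p)
{
	p->sleeps++;
	if (p->release_after && p->sleeps == p->release_after)
		p->free_bufs++;
}

/*
 * Simplified model of the wait_event_*_timeout() control flow: the condition
 * is evaluated BEFORE the first sleep and again after each wake-up.
 * Returns 0 on timeout with the condition still false, otherwise the
 * remaining ticks (at least 1).
 */
static long model_wait_event_timeout(struct fake_pool *p, long timeout)
{
	long remaining = timeout;

	assert(timeout > 0);

	for (;;) {
		if (fake_get_a_tx_buf(p))
			return remaining ? remaining : 1;
		if (!remaining)
			return 0;
		fake_sleep_one_tick(p);
		remaining--;
	}
}
```

In this model, a buffer released just before the wait starts is picked up by
the first condition check with no sleep at all. The full timeout only elapses
when no buffer is released during the whole period, which matches what the
ftraces show.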
>
> 3) This patch gets applied when rc1 comes out so that it has 6 or 7 weeks to
> soak. No errors or locks are reported due to this patch during that time.
>
>>
>> Regards,
>> Arnaud
>>
>>>
>>> in this case, so we re-try once before we really start to sleep
>>>
>>> Signed-off-by: Tim Blechmann <tim@klingt.org>
>>> ---
>>> drivers/rpmsg/virtio_rpmsg_bus.c | 24 +++++++++++++++---------
>>> 1 file changed, 15 insertions(+), 9 deletions(-)
>>>
>>> diff --git a/drivers/rpmsg/virtio_rpmsg_bus.c b/drivers/rpmsg/virtio_rpmsg_bus.c
>>> index 905ac7910c98..2a9d42225e60 100644
>>> --- a/drivers/rpmsg/virtio_rpmsg_bus.c
>>> +++ b/drivers/rpmsg/virtio_rpmsg_bus.c
>>> @@ -587,21 +587,27 @@ static int rpmsg_send_offchannel_raw(struct rpmsg_device *rpdev,
>>>
>>> /* no free buffer ? wait for one (but bail after 15 seconds) */
>>> while (!msg) {
>>> /* enable "tx-complete" interrupts, if not already enabled */
>>> rpmsg_upref_sleepers(vrp);
>>>
>>> - /*
>>> - * sleep until a free buffer is available or 15 secs elapse.
>>> - * the timeout period is not configurable because there's
>>> - * little point in asking drivers to specify that.
>>> - * if later this happens to be required, it'd be easy to add.
>>> - */
>>> - err = wait_event_interruptible_timeout(vrp->sendq,
>>> - (msg = get_a_tx_buf(vrp)),
>>> - msecs_to_jiffies(15000));
>>> + /* make sure to retry to grab tx buffer before we start waiting */
>>> + msg = get_a_tx_buf(vrp);
>>> + if (msg) {
>>> + err = 0;
>>> + } else {
>>> + /*
>>> + * sleep until a free buffer is available or 15 secs elapse.
>>> + * the timeout period is not configurable because there's
>>> + * little point in asking drivers to specify that.
>>> + * if later this happens to be required, it'd be easy to add.
>>> + */
>>> + err = wait_event_interruptible_timeout(vrp->sendq,
>>> + (msg = get_a_tx_buf(vrp)),
>>> + msecs_to_jiffies(15000));
>>> + }
>>>
>>> /* disable "tx-complete" interrupts if we're the last sleeper */
>>> rpmsg_downref_sleepers(vrp);
>>>
>>> /* timeout ? */
>>> if (!err) {