From: Bjorn Ardo <bjorn.ardo@axis.com>
To: Jassi Brar <jassisinghbrar@gmail.com>
Cc: kernel <kernel@axis.com>,
Linux Kernel Mailing List <linux-kernel@vger.kernel.org>
Subject: Re: [PATCH] mailbox: forward the hrtimer if not queued and under a lock
Date: Wed, 20 Apr 2022 10:28:00 +0200 [thread overview]
Message-ID: <487bbd00-d763-0a86-9984-1dfd957fbb38@axis.com> (raw)
In-Reply-To: <CABb+yY1BNsdMq7CNOBDk3sn7uvpL4=-fT7eOcbuL-+Yjz+iqHw@mail.gmail.com>
Hi,
On 4/19/22 16:10, Jassi Brar wrote:
> I don't have access to your client driver, but if it submits another
> message from rx_callback() that is the problem.
>
> Please have a look at how other platforms do, for example
> imx_dsp_rproc_rx_tx_callback()
>
> /**
> * mbox_chan_received_data - A way for controller driver to push data
> * received from remote to the upper layer.
> * @chan: Pointer to the mailbox channel on which RX happened.
> * @mssg: Client specific message typecasted as void *
> *
> * After startup and before shutdown any data received on the chan
> * is passed on to the API via atomic mbox_chan_received_data().
> * The controller should ACK the RX only after this call returns.
> */
> void mbox_chan_received_data(struct mbox_chan *chan, void *mssg)
>
> If not this, please share your client code as well.
>
> thanks.
In rx_callback() we call tasklet_hi_schedule() to schedule a tasklet and
this tasklet calls mbox_send_message(). The mailbox is setup with
.tx_block = false.
I do not quite understand how calling mbox_send_message() from another
contex (like a work thread as in imx_dsp_rproc_rx_tx_callback()) will
resolve the race, could you explain this? Or do you mean that it should
not be called from any interrupt context at all? Looking at the
documentation of mbox_send_message() it states:
* This function could be called from atomic context as it simply
* queues the data and returns a token against the request.
If I look at the code in msg_submit() the part that calls
hrtimer_active() and hrtimer_start() is completely without a lock. Also
the code in txdone_hrtimer() that calls hrtimer_forward_now() is without
a lock. So I cannot see anything preventing these two functions to be
called concurrently on two different processors (as they do in my
trace). And looking at the hr_timer code then hrtimer_active() will
return true if the hrtimer is currently executing txdone_hrtimer().
The way I see the race is if the timer has called txdone_hrtimer() and
it has passed the part with the for-loop looking at the individual
channels when the msg_submit() runs. Then txdone_hrtimer() has decided
to not restart the timer, but hrtimer_active() will still return true
for a while (until txdone_hrtimer() return completely and the hrtimer
code can change its status). In this time there is nothing that prevents
msg_submit() from running, adding a new message that the timer should
monitor. But if it manages to call hrtimer_active() in the time period
before the hrtimer code updates it then the current code will never
start the timer.
Also the poll_hrt is shared between all channels in the mailbox, so it
does not have to be the same mailbox channel that hrtimer is currently
waiting for that is calling msg_submit(). So not calling
mbox_send_message() from rx_callback() will only alter timing for that
channel. There could still be a race between two different mailbox channels.
This is my understanding of the current code, please tell me if I have
missed or misunderstood any part of it?
Our current solution are using 4 different mailbox channels
asynchronously. The code is part of a larger system, but I can put down
some time and try and extract the relevant parts if you still think this
is a client issue? But with my current understanding of the code, the
race between msg_submit() and txdone_hrtimer() is quite clear, and with
my proposed patch that removes this race we have be able to run for very
long time without any problems (that is several days). Without the fix
we get the race after 5-10 min.
Best Regards,
Björn
next prev parent reply other threads:[~2022-04-20 8:28 UTC|newest]
Thread overview: 7+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-03-31 7:01 [PATCH] mailbox: forward the hrtimer if not queued and under a lock Björn Ardö
2022-04-14 14:30 ` Jassi Brar
2022-04-19 12:15 ` Bjorn Ardo
2022-04-19 14:10 ` Jassi Brar
2022-04-20 8:28 ` Bjorn Ardo [this message]
2022-05-23 11:56 ` Bjorn Ardo
2022-05-23 19:35 ` Jassi Brar
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=487bbd00-d763-0a86-9984-1dfd957fbb38@axis.com \
--to=bjorn.ardo@axis.com \
--cc=jassisinghbrar@gmail.com \
--cc=kernel@axis.com \
--cc=linux-kernel@vger.kernel.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox