Re: [v3 PATCH] remoteproc: xlnx: Use high-prio workqueue instead of system wq

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Mathieu Poirier <mathieu.poirier@linaro.org>
To: Stefan Roese <stefan.roese@mailbox.org>
Cc: Tanmay Shah <tanmay.shah@amd.com>,
	Zhongqiu Han <zhongqiu.han@oss.qualcomm.com>,
	linux-remoteproc@vger.kernel.org
Subject: Re: [v3 PATCH] remoteproc: xlnx: Use high-prio workqueue instead of system wq
Date: Tue, 16 Dec 2025 14:47:24 -0700	[thread overview]
Message-ID: <aUHTbCVdG6i1hA8Q@p14s> (raw)
In-Reply-To: <ae1fd2a4-e35d-4907-a08c-a469adf6e96e@mailbox.org>

On Tue, Dec 16, 2025 at 03:34:18PM +0100, Stefan Roese wrote:
> Hi Mathieu,
> 
> On 12/15/25 02:14, Mathieu Poirier wrote:
> > On Wed, Dec 10, 2025 at 12:28:52PM -0600, Tanmay Shah wrote:
> > > Hello, please check my comments below:
> > > 
> > > On 12/10/25 2:29 AM, Stefan Roese wrote:
> > > > Hi Tanmay,
> > > > 
> > > > On 12/10/25 03:51, Zhongqiu Han wrote:
> > > > > On 12/5/2025 8:06 PM, Stefan Roese wrote:
> > > > > > Hi Tanmay,
> > > > > > 
> > > > > > On 12/4/25 17:45, Tanmay Shah wrote:
> > > > > > > Hello,
> > > > > > > 
> > > > > > > Thank You for your patch. Please find my comments below.
> > > > > > > 
> > > > > > > On 12/4/25 4:40 AM, Stefan Roese wrote:
> > > > > > > > Testing on our ZynqMP platform has shown, that some R5 messages might
> > > > > > > > get dropped under high CPU load. This patch creates a new high-prio
> > > > > > > 
> > > 
> > > This commit text should be fixed. Messages are not dropped by Linux, but R5
> > > can't send new messages as rx vq is not processed by Linux.
> > > 
> > 
> > I agree.
> > > > > > > Here, I would like to understand what it means by "R5
> > > > > > > messages might get dropped"
> > > > > > > 
> > > > > > > Even under high CPU load, the messages from R5 are stored in
> > > > > > > the virtqueues. If Linux doesn't read it, then it is not
> > > > > > > really lost/ dropped.
> > > > > > > 
> > > > > > > Could you please explain your use case in detail and how the
> > > > > > > testing is conducted?
> > > > > > 
> > > > > > Our use-case is, that we send ~4k messages per second from the R5 to
> > > > > > Linux - sometimes even a bit more. Normally these messages are received
> > > > > > okay and no messages are dropped. Sometimes, under "high CPU load"
> > > > > > scenarios it happens, that the R5 has to drop messages, as there is no
> > > > > > free space in the RPMsg buffer, which is 256 entries AFAIU. Resulting
> > > > > > from the Linux driver not emptying the RX queue.
> > > > > > 
> > > 
> > > Thanks for the details. Your understanding is correct.
> > > 
> > > > > > Could you please elaborate on these virtqueues a bit? Especially why no
> > > > > > messages drop should happen because of these virtqueues?
> > > > > 
> > > > > AFAIK, as a transport layer based on virtqueue, rpmsg is reliable once a
> > > > > message has been successfully enqueued. The observed "drop" here appears
> > > > > to be on the R5 side, where the application discards messages when no
> > > > > entry buffer is available.
> > > > 
> > > > Correct.
> > > > 
> > > > > In the long run, while improving the Linux side is recommended,
> > > > 
> > > > Yes, please.
> > > > 
> > > > > it could
> > > > > also be helpful for the R5 side to implement strategies such as an
> > > > > application-level buffer and retry mechanisms.
> > > > 
> > > > We already did this. We've added an additional buffer mechanism to the
> > > > R5, which improved this "message drop situation" a bit. Still it did not
> > > > fix it for all our high message rate situations - still resulting in
> > > > frame drops on the R5 side (the R5 is a bit resource restricted).
> > > > 
> > > > Improving the responsiveness on the Linux side seems to be the best way
> > > > for us to deal with this problem.
> > > > 
> > > 
> > > I agree to this. However, Just want to understand and cover full picture
> > > here.
> > > 
> > > On R5 side, I am assuming open-amp library is used for the RPMsg
> > > communication.
> > > 
> > > rpmsg_send() API will end up here: https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/rpmsg/rpmsg_virtio.c#L384
> > > 
> > > Here, if the new buffer is not available, then R5 is supposed to wait for
> > > 1ms before sending a new message. After 1ms, R5 will try to get buffer
> > > again, and this continues for 15 seconds. This is the default mechanism.
> > > 
> > > This mechanism is used in your case correctly ?
> > > 
> > > Alternatively you can register platform specific wait mechanism via this
> > > callback: https://github.com/OpenAMP/open-amp/blob/be5770f30516505c1a4d35efcffff9fb547f7dcf/lib/include/openamp/rpmsg_virtio.h#L42
> > > 
> > > Few questions for further understanding:
> > > 
> > > 1) As per your use case, 4k per second data transfer rate must be maintained
> > > all the time? And this is achieved with this patch?
> > > 
> > > Even after having the high priority queue, if someone wants to achieve 8k
> > > per seconds or 16k per seconds data transfer rate, at some point we will hit
> > > this issue again.
> > > 
> > 
> > Right, I also think this patch is not the right solution.
> 
> Hmmm. My understanding of Tanmays's comments is somewhat different. He
> is not "against" this patch in general AFAIU. Please see my reply with
> a more detailed description of our system setup and it's message flow
> and limitations that I just sent a few minutes ago.
>

Regardless of how we spin things around, this patch is about running out of
resource (CPU cycles and memory).  It is only a matter of time before this
solution becomes obsolete.

The main issue here is that we are adding a priority workqueue for everyone
using this driver, which may have unwanted side effects.  Please add a kernel
module parameter to control what kind of workqueue is to be used.

Thanks,
Mathieu  
 
> > > The reliable solution would be to keep the data transfer rate reasonable,
> > > and have solid re-try mechanism.
> > > 
> > > I am okay to take this patch in after addressing comments below but, please
> > > make sure all above things are r5 side is working as well.
> > 
> > Tanmay is correct on all front.
> 
> Agreed.
> 
> Thanks,
> Stefan
>

next prev parent reply	other threads:[~2025-12-16 21:47 UTC|newest]

Thread overview: 14+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-12-04 10:40 [v3 PATCH] remoteproc: xlnx: Use high-prio workqueue instead of system wq Stefan Roese
2025-12-04 16:45 ` Tanmay Shah
2025-12-05 12:06   ` Stefan Roese
2025-12-10  2:51     ` Zhongqiu Han
2025-12-10  8:29       ` Stefan Roese
2025-12-10 18:28         ` Tanmay Shah
2025-12-15  1:14           ` Mathieu Poirier
2025-12-16 14:34             ` Stefan Roese
2025-12-16 21:47               ` Mathieu Poirier [this message]
2025-12-17 10:27                 ` Stefan Roese
2025-12-17 21:34                   ` Mathieu Poirier
2025-12-18 14:58                     ` Stefan Roese
2025-12-16 14:26           ` Stefan Roese
2025-12-16 14:43           ` Stefan Roese

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=aUHTbCVdG6i1hA8Q@p14s \
    --to=mathieu.poirier@linaro.org \
    --cc=linux-remoteproc@vger.kernel.org \
    --cc=stefan.roese@mailbox.org \
    --cc=tanmay.shah@amd.com \
    --cc=zhongqiu.han@oss.qualcomm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.