Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
To: Richard Fitzgerald <rf@opensource.cirrus.com>,
	vkoul@kernel.org, yung-chuan.liao@linux.intel.com
Cc: linux-sound@vger.kernel.org, linux-kernel@vger.kernel.org,
	patches@opensource.cirrus.com
Subject: Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency
Date: Tue, 23 Dec 2025 14:19:11 +0100	[thread overview]
Message-ID: <c81f581a-ee9f-4692-8ed8-736b5c5053af@linux.dev> (raw)
In-Reply-To: <028ae5fc-5ce1-4fbb-a09c-88700f2d77b1@opensource.cirrus.com>

On 12/23/25 11:47, Richard Fitzgerald wrote:
> On 23/12/25 10:29, Pierre-Louis Bossart wrote:
> <SNIP>
> 
>>>> I can see how preparing all ports in parallel would reduce the total time compared to a serial approach, even with a timeout, but if we end-up always timing out even in the case where there is a single device it'd be quite odd.
>>>
>>> Yes, but that's a separate problem that needs fixing. The current code
>>> in the kernel does the timeout for every amp. So if you have 4 amps with
>>> a timeout of 10ms it takes >40ms to do the port prepare.
>>
>> so is this fair to say that by not testing the return value of wait_for_completion_timeout(), we never detected the 'always-timeout' behavior since the initial contribution circa 2018, even with a single device per link?
>>
>> Wow, that'd be a big 7-year old conceptual miss...
> 
> It's just  a design flaw in the code. It wasn't found because it was
> tested against Realtek codecs, which use the simplified CP_SM so the
> code can skip the Prepare stage.

ok, thanks for the precision.

> The original code checked for timeout and then always failed, because it
> always timed out. So none of the Cirrus amps and codecs worked. I sent
> the "temporary" workaround of reading the status register after the
> wait_for_completion() and ignoring the timeout if the device is now
> reporting prepared, until I could fix it properly. I've only just got
> around to looking at this again.

ok, so this is a follow-up to commit 3d3e88e from 2021, where the timeout check was removed?
This should probably be mentioned in the commit description, with a clear explanation that the timeout happens in all cases.

I personally misunderstood that earlier commit, which states 

'The previous implementation would always fail if the
wait_for_completion() timed out, even if the port was reporting
successful prepare.
'

 
>>>> In other words does this patch reduce the start latency only in the case of multiple devices, but adds a 'tax' for all other cases?
>>>
>>> This only affects peripherals that use the full CP_SM (not simplified).
>>>
>>> If you have only 1 peripheral there is either no change or a possibility
>>> of skipping the wait. Before this change it would always wait the full
>>> timeout before noticing that it is ready.
>>>
>>> After this change, you might be able to skip the wait. If it prepared
>>> immediately, so that it is already reporting prepared when the first
>>> read tests for that, it will skip the wait.
>>> (BTW I see that happening with CS35L56. It prepares so fast that it
>>> is already done, without waiting for any of the timeouts.)
>>>
>>> If it isn't ready that early, you get the timeout as before.
>>>
>>> It isn't adding any penalty to single devices, unless you count the
>>> trivial time it takes to do the register read, which is negligible
>>> compared to getting deadlocked in the wait_for_completion_timeout().
>>
>> ok, now I get the point that the use of sdw_acquire_bus_lock() in sdw_prepare_stream() prevents the use of device-initiated interrupts.
>>
>> One possible change is a departure from the use of the PORT_READY alert. This was meant to be a worst-case scenario for cases where the prepare time is in hundreds of ms - IIRC we thought devices would need to adjust their internal clock to that of the bus, which might change during a prepare operation. But as you mentioned it in practice devices are prepared much faster. So we could have a typical loop with multiple read/sleep instead. See for example cdns_clear_bit() as an example of what I have in mind, classic kernel code really.
>>
>> That might speed-up the prepare time even for the single device case - and then that may be enough to avoid the deferred check on PORT_READY and make the code simpler for multiple devices as well.
>>
> 
> I've got an experimental patch to replace the
> wait_for_completion_timeout() with a polling loop. It's a brute-force
> approach but reduces the overhead without the scary rewrite of the bus
> locking.

sounds good, it's not that bad in terms of brute-forcing and certainly less bad than a broken design where an expected interrupt cannot be serviced by construction.

>> At any rate you're onto something, thanks for submitting this and bearing with my questions.
> 
> I need to send a new version of this patch anyway because it will
> need rebasing onto https://lore.kernel.org/linux-sound/20251219173830.407-1-niranjan.hy@ti.com/T/#u
> That one is fixing a problem so makes sense to take it first.
> 
> I'll improve the commenting in V2 to explain better what the code
> is doing.

I'd recommend the 'brute-force' approach, this wait_for_completion_timeout() makes no sense - and it'd remove the need for the extra read before the nonsensical wait.

Probably something for 2026 now ;-)

     prev parent reply	other threads:[~2025-12-23 13:19 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-25 16:56 [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency Richard Fitzgerald
2025-12-09 13:04 ` Pierre-Louis Bossart
2025-12-09 13:36   ` Richard Fitzgerald
2025-12-09 16:41     ` Pierre-Louis Bossart
2025-12-10  9:59       ` Richard Fitzgerald
2025-12-20 11:15         ` Pierre-Louis Bossart
2025-12-22 12:01           ` Richard Fitzgerald
2025-12-23 10:29             ` Pierre-Louis Bossart
2025-12-23 10:47               ` Richard Fitzgerald
2025-12-23 13:19                 ` Pierre-Louis Bossart [this message]

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=c81f581a-ee9f-4692-8ed8-736b5c5053af@linux.dev \
    --to=pierre-louis.bossart@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sound@vger.kernel.org \
    --cc=patches@opensource.cirrus.com \
    --cc=rf@opensource.cirrus.com \
    --cc=vkoul@kernel.org \
    --cc=yung-chuan.liao@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.