Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency

public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed

From: Pierre-Louis Bossart <pierre-louis.bossart@linux.dev>
To: Richard Fitzgerald <rf@opensource.cirrus.com>,
	vkoul@kernel.org, yung-chuan.liao@linux.intel.com
Cc: linux-sound@vger.kernel.org, linux-kernel@vger.kernel.org,
	patches@opensource.cirrus.com
Subject: Re: [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency
Date: Tue, 23 Dec 2025 11:29:29 +0100	[thread overview]
Message-ID: <319caf4d-a96d-4c25-b912-e00c76ad8038@linux.dev> (raw)
In-Reply-To: <a1b2b617-448d-4457-891e-d2cc00858f50@opensource.cirrus.com>


>>>>>>> +    val = sdw_read_no_pm(s_rt->slave, SDW_DPN_PREPARESTATUS(p_rt->num));
>>>>>>> +    if (val < 0) {
>>>>>>> +        ret = val;
>>>>>>> +        goto err;
>>>>>>> +    }
>>>>>>> +
>>>>>>> +    if (val & p_rt->ch_mask) {
>>>>>>
>>>>>> Can you explain why we don't use the ch_mask in the already-prepared case? I am missing something.
>>>>>>
>>>>> I'm not sure what you mean here. The if() immediately above your comment
>>>>> uses ch_mask to check the already-prepared state.
>>>>
>>>> I was referring to the 1. above, you read the prepare status without checking for ch_mask first.
>>>>
>>>
>>> What would be the purpose of checking ch_mask before the read?
>>
>> I don't know - why do we need to read it in the second case and not the first is all I am asking.
>>
> 
> What the code is doing is:
> 
> 1. Read the current prepare status
> 2. Check if any channels we prepared are still reporting NOT_PREPARED
> 3. If there are any NOT_PREPARED channels we need to do the wait, else
>    we can skip it

ok, sorry for being thick. The code is just fine, I misread it with the test split in two.

>>>>>>> +        /* Wait for completion on port ready */
>>>>>>> +        port_ready = &s_rt->slave->port_ready[p_rt->num];
>>>>>>> +        wait_for_completion_timeout(port_ready, msecs_to_jiffies(ch_prep_timeout));
>>>>>>
>>>>>> I understand the code is the same as before but would there be any merit in checking the timeout before starting a read? If the device is already in the weeds, doing another read adds even more time before reporting an error.
>>>>>>
>>>>> Do you mean save the system time when the DPN_PREPARE was written to
>>>>> that peripheral and then check here whether the timeout period has
>>>>> already elapsed?
>>>>
>>>> I meant testing the return value of wait_for_completion_timeout(). If you already timed out at this point with a return value of zero, there's no point in checking the status any more, the system is in the weeds.
>>>>
>>>
>>> Wait completion will _always_ timeout because this code is holding the
>>> bus lock, which blocks the ALERT handler from running and signalling
>>> the completion. The wait_for_completion_timeout() is effectively
>>> msleep(msecs_to_jiffies(ch_prep_timeout));
>>> So we have to read the register afterwards to see whether the peripheral
>>> actually prepared.
>>>
>>> I've left the useless wait_for_completion_timeout() in the code so this
>>> commit is only changing what it says it is changing, and nothing else.
>>>
>>> What to do about the deadlocked wait_for_completion_timeout() is a
>>> separate problem.
>>
>> Humm, what happens if you have a single peripheral, does this result in an increase of the prepare time all the way to the timeout value?
>> I can see how preparing all ports in parallel would reduce the total time compared to a serial approach, even with a timeout, but if we end-up always timing out even in the case where there is a single device it'd be quite odd.
> 
> Yes, but that's a separate problem that needs fixing. The current code
> in the kernel does the timeout for every amp. So if you have 4 amps with
> a timeout of 10ms it takes >40ms to do the port prepare.

so is this fair to say that by not testing the return value of wait_for_completion_timeout(), we never detected the 'always-timeout' behavior since the initial contribution circa 2018, even with a single device per link? 

Wow, that'd be a big 7-year old conceptual miss...
 
>> In other words does this patch reduce the start latency only in the case of multiple devices, but adds a 'tax' for all other cases?
> 
> This only affects peripherals that use the full CP_SM (not simplified).
> 
> If you have only 1 peripheral there is either no change or a possibility
> of skipping the wait. Before this change it would always wait the full
> timeout before noticing that it is ready.
> 
> After this change, you might be able to skip the wait. If it prepared
> immediately, so that it is already reporting prepared when the first
> read tests for that, it will skip the wait.
> (BTW I see that happening with CS35L56. It prepares so fast that it
> is already done, without waiting for any of the timeouts.)
> 
> If it isn't ready that early, you get the timeout as before.
> 
> It isn't adding any penalty to single devices, unless you count the
> trivial time it takes to do the register read, which is negligible
> compared to getting deadlocked in the wait_for_completion_timeout().

ok, now I get the point that the use of sdw_acquire_bus_lock() in sdw_prepare_stream() prevents the use of device-initiated interrupts.

One possible change is a departure from the use of the PORT_READY alert. This was meant to be a worst-case scenario for cases where the prepare time is in hundreds of ms - IIRC we thought devices would need to adjust their internal clock to that of the bus, which might change during a prepare operation. But as you mentioned it in practice devices are prepared much faster. So we could have a typical loop with multiple read/sleep instead. See for example cdns_clear_bit() as an example of what I have in mind, classic kernel code really.

That might speed-up the prepare time even for the single device case - and then that may be enough to avoid the deferred check on PORT_READY and make the code simpler for multiple devices as well.

At any rate you're onto something, thanks for submitting this and bearing with my questions.

next prev parent reply	other threads:[~2025-12-23 10:29 UTC|newest]

Thread overview: 10+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-11-25 16:56 [PATCH v3] soundwire: stream: Prepare ports in parallel to reduce stream start latency Richard Fitzgerald
2025-12-09 13:04 ` Pierre-Louis Bossart
2025-12-09 13:36   ` Richard Fitzgerald
2025-12-09 16:41     ` Pierre-Louis Bossart
2025-12-10  9:59       ` Richard Fitzgerald
2025-12-20 11:15         ` Pierre-Louis Bossart
2025-12-22 12:01           ` Richard Fitzgerald
2025-12-23 10:29             ` Pierre-Louis Bossart [this message]
2025-12-23 10:47               ` Richard Fitzgerald
2025-12-23 13:19                 ` Pierre-Louis Bossart

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=319caf4d-a96d-4c25-b912-e00c76ad8038@linux.dev \
    --to=pierre-louis.bossart@linux.dev \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-sound@vger.kernel.org \
    --cc=patches@opensource.cirrus.com \
    --cc=rf@opensource.cirrus.com \
    --cc=vkoul@kernel.org \
    --cc=yung-chuan.liao@linux.intel.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox