linux-sound.vger.kernel.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
@ 2025-04-24 18:13 Rafael J. Wysocki
  2025-04-25 17:12 ` Pierre-Louis Bossart
  0 siblings, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2025-04-24 18:13 UTC (permalink / raw)
  To: Vinod Koul
  Cc: LKML, Linux PM, Bard Liao, Pierre-Louis Bossart, Sanyog Kale,
	linux-sound

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The code in intel_suspend() and intel_resume() needs to be properly
synchronized with runtime PM which is not the case currently, so fix
it.

First of all, prevent runtime PM from triggering after intel_suspend()
has started because the changes made by it to the device might be
undone by a runtime resume of the device.  For this purpose, add a
pm_runtime_disable() call to intel_suspend().

Next, once runtime PM has been disabled, the runtime PM status of the
device cannot change, so pm_runtime_status_suspended() can be used
instead of pm_runtime_suspended() in intel_suspend().

Moreover, checking the runtime PM status of the device in intel_resume()
is pointless because the device is going to be resumed anyway and the
code running in the case when the pm_runtime_suspended() check in
intel_resume() returns 'true' is harmful.  Namely, calling
pm_runtime_resume() right after pm_runtime_set_active() has no effect
and calling pm_runtime_idle() on the device at that point is invalid
because the device is technically still suspended (it has not been
resumed yet).  Remove that code altogether along with the check leading
to it.

Finally, call pm_runtime_set_active() at the end intel_resume() to set
the runtime PM status of the device to "active" because it has just been
resumed and re-enable runtime PM for it after that.

Additionally, drop the setting of DPM_FLAG_SMART_SUSPEND from the
driver because it has never been necessary and should not have been
done.

This addresses a functional regression after commit bca84a7b93fd ("PM:
sleep: Use DPM_FLAG_SMART_SUSPEND conditionally") that effectively
caused the fatal pm_runtime_suspended() check in intel_resume() to
trigger.

Fixes: bca84a7b93fd ("PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally")
Cc: 6.2+ <stable@vger.kernel.org> # 6.2+
Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Tested-by: Bard Liao <yung-chuan.liao@linux.intel.com>
Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This change fixes an issue predating commit bca84a7b93fd that simply
uncovered the problem and it needs to be backported to 6.2 and later
kernels.

Since it fixes a recent regression in 6.15-rc, I can route it through the
PM tree unless that would be a major concern.

---
 drivers/soundwire/intel_auxdevice.c |   36 +++++++++++++-----------------------
 1 file changed, 13 insertions(+), 23 deletions(-)

--- a/drivers/soundwire/intel_auxdevice.c
+++ b/drivers/soundwire/intel_auxdevice.c
@@ -353,9 +353,6 @@
 	/* use generic bandwidth allocation algorithm */
 	sdw->cdns.bus.compute_params = sdw_compute_params;
 
-	/* avoid resuming from pm_runtime suspend if it's not required */
-	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
-
 	ret = sdw_bus_master_add(bus, dev, dev->fwnode);
 	if (ret) {
 		dev_err(dev, "sdw_bus_master_add fail: %d\n", ret);
@@ -640,7 +637,10 @@
 		return 0;
 	}
 
-	if (pm_runtime_suspended(dev)) {
+	/* Prevent runtime PM from racing with the code below. */
+	pm_runtime_disable(dev);
+
+	if (pm_runtime_status_suspended(dev)) {
 		dev_dbg(dev, "pm_runtime status: suspended\n");
 
 		clock_stop_quirks = sdw->link_res->clock_stop_quirks;
@@ -648,7 +648,7 @@
 		if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
 		    !clock_stop_quirks) {
 
-			if (pm_runtime_suspended(dev->parent)) {
+			if (pm_runtime_status_suspended(dev->parent)) {
 				/*
 				 * paranoia check: this should not happen with the .prepare
 				 * resume to full power
@@ -715,7 +715,6 @@
 	struct sdw_cdns *cdns = dev_get_drvdata(dev);
 	struct sdw_intel *sdw = cdns_to_intel(cdns);
 	struct sdw_bus *bus = &cdns->bus;
-	int link_flags;
 	int ret;
 
 	if (bus->prop.hw_disabled || !sdw->startup_done) {
@@ -724,23 +723,6 @@
 		return 0;
 	}
 
-	if (pm_runtime_suspended(dev)) {
-		dev_dbg(dev, "pm_runtime status was suspended, forcing active\n");
-
-		/* follow required sequence from runtime_pm.rst */
-		pm_runtime_disable(dev);
-		pm_runtime_set_active(dev);
-		pm_runtime_mark_last_busy(dev);
-		pm_runtime_enable(dev);
-
-		pm_runtime_resume(bus->dev);
-
-		link_flags = md_flags >> (bus->link_id * 8);
-
-		if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
-			pm_runtime_idle(dev);
-	}
-
 	ret = sdw_intel_link_power_up(sdw);
 	if (ret) {
 		dev_err(dev, "%s failed: %d\n", __func__, ret);
@@ -761,6 +743,14 @@
 	}
 
 	/*
+	 * Runtime PM has been disabled in intel_suspend(), so set the status
+	 * to active because the device has just been resumed and re-enable
+	 * runtime PM.
+	 */
+	pm_runtime_set_active(dev);
+	pm_runtime_enable(dev);
+
+	/*
 	 * after system resume, the pm_runtime suspend() may kick in
 	 * during the enumeration, before any children device force the
 	 * master device to remain active.  Using pm_runtime_get()




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-24 18:13 [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling Rafael J. Wysocki
@ 2025-04-25 17:12 ` Pierre-Louis Bossart
  2025-04-25 18:10   ` Rafael J. Wysocki
  2025-05-07  8:51   ` Liao, Bard
  0 siblings, 2 replies; 7+ messages in thread
From: Pierre-Louis Bossart @ 2025-04-25 17:12 UTC (permalink / raw)
  To: Rafael J. Wysocki, Vinod Koul
  Cc: LKML, Linux PM, Bard Liao, Sanyog Kale, linux-sound

On 4/24/25 20:13, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The code in intel_suspend() and intel_resume() needs to be properly
> synchronized with runtime PM which is not the case currently, so fix
> it.
> 
> First of all, prevent runtime PM from triggering after intel_suspend()
> has started because the changes made by it to the device might be
> undone by a runtime resume of the device.  For this purpose, add a
> pm_runtime_disable() call to intel_suspend().

Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.

If a controller was suspended by pm_runtime, it will enter the clock stop mode.

If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().

Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.

It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.

Bard, is this a configuration you've tested?
 
> Next, once runtime PM has been disabled, the runtime PM status of the
> device cannot change, so pm_runtime_status_suspended() can be used
> instead of pm_runtime_suspended() in intel_suspend().
> 
> Moreover, checking the runtime PM status of the device in intel_resume()
> is pointless because the device is going to be resumed anyway and the
> code running in the case when the pm_runtime_suspended() check in
> intel_resume() returns 'true' is harmful.  Namely, calling
> pm_runtime_resume() right after pm_runtime_set_active() has no effect
> and calling pm_runtime_idle() on the device at that point is invalid
> because the device is technically still suspended (it has not been
> resumed yet).  Remove that code altogether along with the check leading
> to it.
> 
> Finally, call pm_runtime_set_active() at the end intel_resume() to set
> the runtime PM status of the device to "active" because it has just been
> resumed and re-enable runtime PM for it after that.



> Additionally, drop the setting of DPM_FLAG_SMART_SUSPEND from the
> driver because it has never been necessary and should not have been
> done.

IIRC it was your recommendation to add this flag many moons ago... No issue to remove it, it's just not something any of the 'SoundWire' folks are knowledgeable with.

> This addresses a functional regression after commit bca84a7b93fd ("PM:
> sleep: Use DPM_FLAG_SMART_SUSPEND conditionally") that effectively
> caused the fatal pm_runtime_suspended() check in intel_resume() to
> trigger.
> 
> Fixes: bca84a7b93fd ("PM: sleep: Use DPM_FLAG_SMART_SUSPEND conditionally")
> Cc: 6.2+ <stable@vger.kernel.org> # 6.2+
> Reported-by: Bard Liao <yung-chuan.liao@linux.intel.com>
> Tested-by: Bard Liao <yung-chuan.liao@linux.intel.com>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> This change fixes an issue predating commit bca84a7b93fd that simply
> uncovered the problem and it needs to be backported to 6.2 and later
> kernels.
> 
> Since it fixes a recent regression in 6.15-rc, I can route it through the
> PM tree unless that would be a major concern.
> 
> ---
>  drivers/soundwire/intel_auxdevice.c |   36 +++++++++++++-----------------------
>  1 file changed, 13 insertions(+), 23 deletions(-)
> 
> --- a/drivers/soundwire/intel_auxdevice.c
> +++ b/drivers/soundwire/intel_auxdevice.c
> @@ -353,9 +353,6 @@
>  	/* use generic bandwidth allocation algorithm */
>  	sdw->cdns.bus.compute_params = sdw_compute_params;
>  
> -	/* avoid resuming from pm_runtime suspend if it's not required */
> -	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
> -
>  	ret = sdw_bus_master_add(bus, dev, dev->fwnode);
>  	if (ret) {
>  		dev_err(dev, "sdw_bus_master_add fail: %d\n", ret);
> @@ -640,7 +637,10 @@
>  		return 0;
>  	}
>  
> -	if (pm_runtime_suspended(dev)) {
> +	/* Prevent runtime PM from racing with the code below. */
> +	pm_runtime_disable(dev);
> +
> +	if (pm_runtime_status_suspended(dev)) {
>  		dev_dbg(dev, "pm_runtime status: suspended\n");
>  
>  		clock_stop_quirks = sdw->link_res->clock_stop_quirks;
> @@ -648,7 +648,7 @@
>  		if ((clock_stop_quirks & SDW_INTEL_CLK_STOP_BUS_RESET) ||
>  		    !clock_stop_quirks) {
>  
> -			if (pm_runtime_suspended(dev->parent)) {
> +			if (pm_runtime_status_suspended(dev->parent)) {
>  				/*
>  				 * paranoia check: this should not happen with the .prepare
>  				 * resume to full power
> @@ -715,7 +715,6 @@
>  	struct sdw_cdns *cdns = dev_get_drvdata(dev);
>  	struct sdw_intel *sdw = cdns_to_intel(cdns);
>  	struct sdw_bus *bus = &cdns->bus;
> -	int link_flags;
>  	int ret;
>  
>  	if (bus->prop.hw_disabled || !sdw->startup_done) {
> @@ -724,23 +723,6 @@
>  		return 0;
>  	}
>  
> -	if (pm_runtime_suspended(dev)) {
> -		dev_dbg(dev, "pm_runtime status was suspended, forcing active\n");
> -
> -		/* follow required sequence from runtime_pm.rst */
> -		pm_runtime_disable(dev);
> -		pm_runtime_set_active(dev);
> -		pm_runtime_mark_last_busy(dev);
> -		pm_runtime_enable(dev);
> -
> -		pm_runtime_resume(bus->dev);
> -
> -		link_flags = md_flags >> (bus->link_id * 8);
> -
> -		if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
> -			pm_runtime_idle(dev);
> -	}
> -
>  	ret = sdw_intel_link_power_up(sdw);
>  	if (ret) {
>  		dev_err(dev, "%s failed: %d\n", __func__, ret);
> @@ -761,6 +743,14 @@
>  	}
>  
>  	/*
> +	 * Runtime PM has been disabled in intel_suspend(), so set the status
> +	 * to active because the device has just been resumed and re-enable
> +	 * runtime PM.
> +	 */
> +	pm_runtime_set_active(dev);
> +	pm_runtime_enable(dev);
> +
> +	/*
>  	 * after system resume, the pm_runtime suspend() may kick in
>  	 * during the enumeration, before any children device force the
>  	 * master device to remain active.  Using pm_runtime_get()
> 
> 
> 


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-25 17:12 ` Pierre-Louis Bossart
@ 2025-04-25 18:10   ` Rafael J. Wysocki
  2025-04-25 18:43     ` Rafael J. Wysocki
  2025-05-07  8:51   ` Liao, Bard
  1 sibling, 1 reply; 7+ messages in thread
From: Rafael J. Wysocki @ 2025-04-25 18:10 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Rafael J. Wysocki, Vinod Koul, LKML, Linux PM, Bard Liao,
	Sanyog Kale, linux-sound

On Fri, Apr 25, 2025 at 7:14 PM Pierre-Louis Bossart
<pierre-louis.bossart@linux.dev> wrote:
>
> On 4/24/25 20:13, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > The code in intel_suspend() and intel_resume() needs to be properly
> > synchronized with runtime PM which is not the case currently, so fix
> > it.
> >
> > First of all, prevent runtime PM from triggering after intel_suspend()
> > has started because the changes made by it to the device might be
> > undone by a runtime resume of the device.  For this purpose, add a
> > pm_runtime_disable() call to intel_suspend().
>
> Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.
>
> If a controller was suspended by pm_runtime, it will enter the clock stop mode.
>
> If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().
>
> Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.

No, it wouldn't AFAICS.

> It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.

This patch doesn't change the functionality in intel_suspend(), it
just prevents runtime resume running in parallel with it or after it
from messing up with the hardware.

I don't see why it would be unsafe to do and please feel free to prove me wrong.

What can happen after adding this pm_runtime_disable() that will make
things not work?

> Bard, is this a configuration you've tested?
>
> > Next, once runtime PM has been disabled, the runtime PM status of the
> > device cannot change, so pm_runtime_status_suspended() can be used
> > instead of pm_runtime_suspended() in intel_suspend().
> >
> > Moreover, checking the runtime PM status of the device in intel_resume()
> > is pointless because the device is going to be resumed anyway and the
> > code running in the case when the pm_runtime_suspended() check in
> > intel_resume() returns 'true' is harmful.  Namely, calling
> > pm_runtime_resume() right after pm_runtime_set_active() has no effect
> > and calling pm_runtime_idle() on the device at that point is invalid
> > because the device is technically still suspended (it has not been
> > resumed yet).  Remove that code altogether along with the check leading
> > to it.
> >
> > Finally, call pm_runtime_set_active() at the end intel_resume() to set
> > the runtime PM status of the device to "active" because it has just been
> > resumed and re-enable runtime PM for it after that.
>
>
>
> > Additionally, drop the setting of DPM_FLAG_SMART_SUSPEND from the
> > driver because it has never been necessary and should not have been
> > done.
>
> IIRC it was your recommendation to add this flag many moons ago... No issue to remove it, it's just not something any of the 'SoundWire' folks are knowledgeable with.

But not in this driver.  It doesn't have any early-late or noirq
callbacks anyway.

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-25 18:10   ` Rafael J. Wysocki
@ 2025-04-25 18:43     ` Rafael J. Wysocki
  2025-04-26 13:30       ` Rafael J. Wysocki
  2025-05-05 10:16       ` Pierre-Louis Bossart
  0 siblings, 2 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2025-04-25 18:43 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Rafael J. Wysocki, Vinod Koul, LKML, Linux PM, Bard Liao,
	linux-sound

On Fri, Apr 25, 2025 at 8:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Apr 25, 2025 at 7:14 PM Pierre-Louis Bossart
> <pierre-louis.bossart@linux.dev> wrote:
> >
> > On 4/24/25 20:13, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > >
> > > The code in intel_suspend() and intel_resume() needs to be properly
> > > synchronized with runtime PM which is not the case currently, so fix
> > > it.
> > >
> > > First of all, prevent runtime PM from triggering after intel_suspend()
> > > has started because the changes made by it to the device might be
> > > undone by a runtime resume of the device.  For this purpose, add a
> > > pm_runtime_disable() call to intel_suspend().
> >
> > Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.
> >
> > If a controller was suspended by pm_runtime, it will enter the clock stop mode.
> >
> > If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().
> >
> > Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.
>
> No, it wouldn't AFAICS.
>
> > It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.
>
> This patch doesn't change the functionality in intel_suspend(), it
> just prevents runtime resume running in parallel with it or after it
> from messing up with the hardware.
>
> I don't see why it would be unsafe to do and please feel free to prove me wrong.

Or just tell me what I'm missing in the reasoning below.

This code:

-    if (pm_runtime_suspended(dev)) {
-        dev_dbg(dev, "pm_runtime status was suspended, forcing active\n");
-
-        /* follow required sequence from runtime_pm.rst */
-        pm_runtime_disable(dev);
-        pm_runtime_set_active(dev);
-        pm_runtime_mark_last_busy(dev);
-        pm_runtime_enable(dev);
-
-        pm_runtime_resume(bus->dev);
-
-        link_flags = md_flags >> (bus->link_id * 8);
-
-        if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
-            pm_runtime_idle(dev);
-    }

that is being removed by my patch (because it is invalid - more about
that later) had never run before commit bca84a7b93fd ("PM: sleep: Use
DPM_FLAG_SMART_SUSPEND conditionally") because setting
DPM_FLAG_SMART_SUSPEND had caused the core to call
pm_runtime_set_active() on the device in the noirq resume phase, so it
had never been seen as runtime-suspended in intel_resume().  After
commit bca84a7b93fd the core doesn't do that any more, so if the
device has been runtime-suspended before intel_suspend() runs,
intel_resume() will see that its status is RPM_SUSPENDED.  The code in
question will run and it will crash and burn if
SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE is set in the link flags.

The reason why that code is invalid is because the
pm_runtime_set_active() call in it causes the status to change to
RPM_ACTIVE, but it doesn't actually change the state of the device
(that is still physically suspended).  The subsequent
pm_runtime_resume() sees that the status is RPM_ACTIVE and it doesn't
do anything.  At this point, the device is still physically suspended,
but its runtime PM status is RPM_ACTIVE, so if pm_runtime_idle() runs,
it will trigger an attempt to suspend and that will break because the
device is already suspended.

So this code had never run before and it demonstrably doesn't work, so
I don't see why removing it could be incorrect.

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-25 18:43     ` Rafael J. Wysocki
@ 2025-04-26 13:30       ` Rafael J. Wysocki
  2025-05-05 10:16       ` Pierre-Louis Bossart
  1 sibling, 0 replies; 7+ messages in thread
From: Rafael J. Wysocki @ 2025-04-26 13:30 UTC (permalink / raw)
  To: Pierre-Louis Bossart
  Cc: Rafael J. Wysocki, Vinod Koul, LKML, Linux PM, Bard Liao,
	linux-sound

On Fri, Apr 25, 2025 at 8:43 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Fri, Apr 25, 2025 at 8:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
> >
> > On Fri, Apr 25, 2025 at 7:14 PM Pierre-Louis Bossart
> > <pierre-louis.bossart@linux.dev> wrote:
> > >
> > > On 4/24/25 20:13, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > >
> > > > The code in intel_suspend() and intel_resume() needs to be properly
> > > > synchronized with runtime PM which is not the case currently, so fix
> > > > it.
> > > >
> > > > First of all, prevent runtime PM from triggering after intel_suspend()
> > > > has started because the changes made by it to the device might be
> > > > undone by a runtime resume of the device.  For this purpose, add a
> > > > pm_runtime_disable() call to intel_suspend().
> > >
> > > Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.
> > >
> > > If a controller was suspended by pm_runtime, it will enter the clock stop mode.
> > >
> > > If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().
> > >
> > > Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.
> >
> > No, it wouldn't AFAICS.
> >
> > > It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.
> >
> > This patch doesn't change the functionality in intel_suspend(), it
> > just prevents runtime resume running in parallel with it or after it
> > from messing up with the hardware.
> >
> > I don't see why it would be unsafe to do and please feel free to prove me wrong.
>
> Or just tell me what I'm missing in the reasoning below.
>
> This code:
>
> -    if (pm_runtime_suspended(dev)) {
> -        dev_dbg(dev, "pm_runtime status was suspended, forcing active\n");
> -
> -        /* follow required sequence from runtime_pm.rst */
> -        pm_runtime_disable(dev);
> -        pm_runtime_set_active(dev);
> -        pm_runtime_mark_last_busy(dev);
> -        pm_runtime_enable(dev);
> -
> -        pm_runtime_resume(bus->dev);
> -
> -        link_flags = md_flags >> (bus->link_id * 8);
> -
> -        if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
> -            pm_runtime_idle(dev);
> -    }

Well, I actually missed a couple of things.

First off, the core increases the runtime PM reference counter for
every device in the "prepare" phase of system suspend and it is still
nonzero when the code above runs, so the pm_runtime_idle() call at the
end of it has no effect (the bottom line is that devices don't
runtime-suspend during system-wide suspend and resume transitions,
they can only runtime-resume then).

Second, the argument of the pm_runtime_resume() call in it is
bus->dev, not dev.  I have to admit ignorance regarding the reason why
this code attempts to resume a different device, but this is what
breaks things AFAICS.

The pm_runtime_mark_last_busy() call is kind of fine, but it is
redundant because it is repeated at the end of intel_resume() and even
if the autosuspend timer triggers between these two calls, it will not
cause the device to suspend ("devices don't runtime-suspend during
system-wide suspend and resume transitions").

The pm_runtime_set_active() is questionable because it takes place
before the device is technically activated, so if anything in the
meantime actually depended on it being active, it would break.

So I need to rewrite the changelog, but I still cannot find anything
problematic in the patch itself.

First, it removes some code that had not been run earlier and it
started to break things after it had been allowed to run (and which is
questionable for that matter).  Next, it adds protection against races
that probably don't happen in practice, but if they happened, they
would be problematic.  It also replaces two checks with simpler
versions of them that can be used at this point, nothing wrong with
that.  Finally, it makes intel_resume() call pm_runtime_set_active()
when it is needed because the device actually becomes active at the
end of that function and it removes the setting of a driver flag that
has no effect on the given device any more.

I'm going to resend it with a new changelog as a v2.

Thanks!

^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-25 18:43     ` Rafael J. Wysocki
  2025-04-26 13:30       ` Rafael J. Wysocki
@ 2025-05-05 10:16       ` Pierre-Louis Bossart
  1 sibling, 0 replies; 7+ messages in thread
From: Pierre-Louis Bossart @ 2025-05-05 10:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Vinod Koul, LKML, Linux PM, Bard Liao,
	linux-sound



On 4/25/25 13:43, Rafael J. Wysocki wrote:
> On Fri, Apr 25, 2025 at 8:10 PM Rafael J. Wysocki <rafael@kernel.org> wrote:
>>
>> On Fri, Apr 25, 2025 at 7:14 PM Pierre-Louis Bossart
>> <pierre-louis.bossart@linux.dev> wrote:
>>>
>>> On 4/24/25 20:13, Rafael J. Wysocki wrote:
>>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>>
>>>> The code in intel_suspend() and intel_resume() needs to be properly
>>>> synchronized with runtime PM which is not the case currently, so fix
>>>> it.
>>>>
>>>> First of all, prevent runtime PM from triggering after intel_suspend()
>>>> has started because the changes made by it to the device might be
>>>> undone by a runtime resume of the device.  For this purpose, add a
>>>> pm_runtime_disable() call to intel_suspend().
>>>
>>> Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.
>>>
>>> If a controller was suspended by pm_runtime, it will enter the clock stop mode.
>>>
>>> If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().
>>>
>>> Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.
>>
>> No, it wouldn't AFAICS.

I was referring to the SoundWire controller. The states are different 
between pm_runtime suspend (clock is stopped but external wakes are 
supported) and system suspend (external wakes are not supported).

If the system suspend is entered while the device is already in 
pm_runtime suspend, then we have to perform a full resume before the 
system suspend.

I am not going to argue on how to perform this resume, just that it's 
required. The direction transition from pm_runtime suspend to system 
suspend is not supported.

>>> It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.
>>
>> This patch doesn't change the functionality in intel_suspend(), it
>> just prevents runtime resume running in parallel with it or after it
>> from messing up with the hardware.
>>
>> I don't see why it would be unsafe to do and please feel free to prove me wrong.
> 
> Or just tell me what I'm missing in the reasoning below.
> 
> This code:
> 
> -    if (pm_runtime_suspended(dev)) {
> -        dev_dbg(dev, "pm_runtime status was suspended, forcing active\n");
> -
> -        /* follow required sequence from runtime_pm.rst */
> -        pm_runtime_disable(dev);
> -        pm_runtime_set_active(dev);
> -        pm_runtime_mark_last_busy(dev);
> -        pm_runtime_enable(dev);
> -
> -        pm_runtime_resume(bus->dev);
> -
> -        link_flags = md_flags >> (bus->link_id * 8);
> -
> -        if (!(link_flags & SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE))
> -            pm_runtime_idle(dev);
> -    }
> 
> that is being removed by my patch (because it is invalid - more about
> that later) had never run before commit bca84a7b93fd ("PM: sleep: Use
> DPM_FLAG_SMART_SUSPEND conditionally") because setting
> DPM_FLAG_SMART_SUSPEND had caused the core to call
> pm_runtime_set_active() on the device in the noirq resume phase, so it
> had never been seen as runtime-suspended in intel_resume().  After
> commit bca84a7b93fd the core doesn't do that any more, so if the
> device has been runtime-suspended before intel_suspend() runs,
> intel_resume() will see that its status is RPM_SUSPENDED.  The code in
> question will run and it will crash and burn if
> SDW_INTEL_MASTER_DISABLE_PM_RUNTIME_IDLE is set in the link flags.
> 
> The reason why that code is invalid is because the
> pm_runtime_set_active() call in it causes the status to change to
> RPM_ACTIVE, but it doesn't actually change the state of the device
> (that is still physically suspended).  The subsequent
> pm_runtime_resume() sees that the status is RPM_ACTIVE and it doesn't
> do anything.  At this point, the device is still physically suspended,
> but its runtime PM status is RPM_ACTIVE, so if pm_runtime_idle() runs,
> it will trigger an attempt to suspend and that will break because the
> device is already suspended.
> 
> So this code had never run before and it demonstrably doesn't work, so
> I don't see why removing it could be incorrect.

I don't have enough knowledge to counter your arguments :-). I think we 
misread the documentation in runtime_pm.pst, this sort of sequence is 
mentioned but on system resume, and we applied it for system suspend as 
well.




^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling
  2025-04-25 17:12 ` Pierre-Louis Bossart
  2025-04-25 18:10   ` Rafael J. Wysocki
@ 2025-05-07  8:51   ` Liao, Bard
  1 sibling, 0 replies; 7+ messages in thread
From: Liao, Bard @ 2025-05-07  8:51 UTC (permalink / raw)
  To: Pierre-Louis Bossart, Rafael J. Wysocki, Vinod Koul
  Cc: LKML, Linux PM, Sanyog Kale, linux-sound, bard.liao



On 4/26/2025 1:12 AM, Pierre-Louis Bossart wrote:
> On 4/24/25 20:13, Rafael J. Wysocki wrote:
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> The code in intel_suspend() and intel_resume() needs to be properly
>> synchronized with runtime PM which is not the case currently, so fix
>> it.
>>
>> First of all, prevent runtime PM from triggering after intel_suspend()
>> has started because the changes made by it to the device might be
>> undone by a runtime resume of the device.  For this purpose, add a
>> pm_runtime_disable() call to intel_suspend().
> 
> Allow me to push back on this, because we have to be very careful with a hidden state transition that needs to happen.
> 
> If a controller was suspended by pm_runtime, it will enter the clock stop mode.
> 
> If the system needs to suspend, the controller has to be forced to exit the clock stop mode and the bus has to restart before we can suspend it, and that's why we had those pm_runtime_resume().
> 
> Disabling pm_runtime when entering system suspend would be problematic for Intel hardware, it's a known can of worms.
> 
> It's quite possible that some of the code in intel_suspend() is no longer required because the .prepare will resume the bus properly, but I wanted to make sure this state transition out of clock-stop is known and taken into consideration.
> 
> Bard, is this a configuration you've tested?


Sorry for the late reply. Yes, I tested jack detection in runtime
suspended. Also, the CI test is passed.


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-05-07  8:51 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-04-24 18:13 [PATCH v1] soundwire: intel_auxdevice: Fix system suspend/resume handling Rafael J. Wysocki
2025-04-25 17:12 ` Pierre-Louis Bossart
2025-04-25 18:10   ` Rafael J. Wysocki
2025-04-25 18:43     ` Rafael J. Wysocki
2025-04-26 13:30       ` Rafael J. Wysocki
2025-05-05 10:16       ` Pierre-Louis Bossart
2025-05-07  8:51   ` Liao, Bard

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).