[PATCH 0/12] PM / sleep: Driver flags for system suspend/resume


* [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
@ 2017-10-16  1:12 Rafael J. Wysocki
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
                   ` (14 more replies)
  0 siblings, 15 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:12 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

Hi All,

Well, this took more time than expected, as I tried to cover everything I had
in mind regarding PM flags for drivers.

This work was triggered by attempts to fix and optimize PM in the
i2c-designware-platdrv driver that ended up with adding a couple of
flags to the driver's internal data structures for the tracking of
device state (https://marc.info/?l=linux-acpi&m=150629646805636&w=2).
That approach is sort of suboptimal, though, because other drivers will
probably want to do similar things and if all of them need to use internal
flags for that, quite a bit of code duplication may ensue.

That can be avoided in a couple of ways and one of them is to provide a means
for drivers to tell the core what to do and to make the core take care of it
if told to do so.  Hence, the idea to use driver flags for system-wide PM
that was briefly discussed during the LPC in LA last month.

One of the flags considered at that time was to possibly cause the core
to reuse the runtime PM callback path of a device for system suspend/resume.
Admittedly, that idea didn't look too bad to me until I had started to try to
implement it and I got to the PCI bus type's hibernation callbacks.  Then, I
moved the patch I was working on to /dev/null right away.  I mean it.

No, this is not going to happen.  No way.

Moreover, that experience made me realize that the whole *idea* of using the
runtime PM callback path for system-wide PM was actually totally bogus (sorry
Ulf).

The whole point of having different callback pointers for different types of
device transitions is that it may be necessary to do different things in
those callbacks in general.  Now, if you consider runtime PM and system
suspend/resume *only* and from a driver perspective, then yes, in some cases
the same pair of callback routines may be used for all suspend-like and
resume-like transitions of the device, but if you add hibernation to the mix,
then it is not so clear any more unless the callbacks don't actually do any
power management at all, but simply quiesce the device's activity and then
activate it again.  Namely, changing power states of devices during the
hibernation's "freeze" and "thaw" transitions rarely makes sense at all and
the "restore" transition needs to be able to cope with uninitialized devices
(in fact, it should be prepared to cope with devices in *any* state), so
runtime PM is hardly suitable for them.  Still, if a *driver* chooses to not
do any real PM in its PM callbacks and leaves that to a middle layer (quite
a few drivers do that), then it possibly can use one pair of callbacks in all
cases and be happy, but middle layers pretty much have to use different
callback routines for different transitions.

If you are a middle layer, your role is basically to do PM for a certain
group of devices.  Thus you cannot really do the same in ->suspend or
->suspend_early and in ->runtime_suspend (because the former generally need to
take device_may_wakeup() into account and the latter doesn't) and you shouldn't
really do the same in ->suspend and ->freeze (because the latter shouldn't
change the device's power state) and so on.  To put it bluntly, trying
to use the ->runtime_suspend callback of a middle layer for anything other
than runtime suspend is complete and utter nonsense.  At the same time, the
->runtime_resume callback of a middle layer may be reused to some extent,
but even that doesn't cover the "thaw" transitions during hibernation.

What can work (and this is the only strategy that can work AFAICS) is to
point different callback pointers *in* *a* *driver* to the same routine
if the driver wants to reuse that code.  That actually will work for PCI
and USB drivers today, at least most of the time, but unfortunately there
are problems with it for, say, platform devices.
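
For illustration only, the strategy above might look roughly like the
sketch below.  It is a userspace mock and not kernel code: struct device
and struct dev_pm_ops are cut-down stand-ins for the real structures and
all of the foo_* names are hypothetical.

```c
#include <assert.h>
#include <stddef.h>

/* Cut-down stand-in for the kernel's struct device (assumption). */
struct device {
	int quiesced;
};

/* Cut-down stand-in for struct dev_pm_ops with the relevant pointers only. */
struct dev_pm_ops {
	int (*suspend)(struct device *dev);
	int (*resume)(struct device *dev);
	int (*freeze)(struct device *dev);
	int (*thaw)(struct device *dev);
	int (*poweroff)(struct device *dev);
	int (*restore)(struct device *dev);
};

/* Hypothetical routine that only stops the device's activity. */
static int foo_quiesce(struct device *dev)
{
	assert(dev != NULL);
	dev->quiesced = 1;
	return 0;
}

/* Hypothetical routine that only restarts the device's activity. */
static int foo_activate(struct device *dev)
{
	assert(dev != NULL);
	dev->quiesced = 0;
	return 0;
}

/*
 * The reuse happens in the driver: several callback pointers refer to the
 * same routine, so the middle layer can still tell the transitions apart.
 */
static const struct dev_pm_ops foo_pm_ops = {
	.suspend	= foo_quiesce,
	.freeze		= foo_quiesce,
	.poweroff	= foo_quiesce,
	.resume		= foo_activate,
	.thaw		= foo_activate,
	.restore	= foo_activate,
};
```

Note that this only works as shown because the two routines do no real
power management and leave that to the layer above them.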

The first problem is the requirement to track the status of the device
(suspended vs not suspended) in the callbacks, because the system-wide PM
code in the PM core doesn't do that.  The runtime PM framework does it, so
this means adding some extra code which isn't necessary for runtime PM to
the callback routines and that is not particularly nice.
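
As a rough illustration of that extra code, a driver sharing one routine
between runtime PM and system-wide PM might end up with something like
the following.  This is a userspace sketch under assumptions: struct
foo_dev, its fields and the foo_* functions are all hypothetical and the
"hardware access" is just a counter.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical driver-private data. */
struct foo_dev {
	bool suspended;		/* status flag the driver has to maintain */
	int hw_accesses;	/* counts simulated register accesses */
};

/* Shared suspend routine: must check the flag to avoid suspending twice. */
static int foo_suspend(struct foo_dev *foo)
{
	assert(foo != NULL);

	if (foo->suspended)
		return 0;

	foo->hw_accesses++;	/* stands for saving state, gating clocks etc. */
	foo->suspended = true;
	return 0;
}

/* Shared resume routine: must check the flag to avoid resuming twice. */
static int foo_resume(struct foo_dev *foo)
{
	assert(foo != NULL);

	if (!foo->suspended)
		return 0;

	foo->hw_accesses++;	/* stands for restoring state, ungating clocks */
	foo->suspended = false;
	return 0;
}
```

The flag checks are the part that the runtime PM framework would
otherwise do for the driver.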

The second problem is that, if the driver wants to do anything in its
->suspend callback, it generally has to prevent runtime suspend of the
device from taking place in parallel with that, which is quite cumbersome.
Usually, that is taken care of by resuming the device from runtime suspend
upfront, but generally doing that is wasteful (there may be no real need to
resume the device except for the fact that the code is designed this way).

On top of the above, there are optimizations to be made, like leaving certain
devices in suspend after system resume to avoid wasting time on waiting for
them to resume before user space can run again and similar.

This patch series focuses on addressing those problems so as to make it
easier to reuse callback routines by pointing different callback pointers
to them in device drivers.  The flags introduced here are to instruct the
PM core and middle layers (whatever they are) on how the driver wants the
device to be handled and then the driver has to provide callbacks to match
these instructions and the rest should be taken care of by the code above it.

The flags are introduced one by one to avoid making too many changes in
one go and to allow things to be explained better (hopefully).  They mostly
are mutually independent with some clearly documented exceptions.

The first three patches in the series are about an issue with the
direct-complete optimization introduced some time ago in which some middle
layers decide on whether or not to do the optimization without asking the
drivers.  And, as it turns out, in some cases the drivers actually know
better, so the new flags introduced by these patches are here for these
drivers (and the DPM_FLAG_NEVER_SKIP one is really to avoid having to define
->prepare callbacks always returning zero).

The really interesting things start to happen in patches [4-9/12] which make it
possible to avoid resuming devices from runtime suspend upfront during system
suspend at least in some cases (and when direct-complete is not applied to the
devices in question), but please refer to the changelogs for details.

The i2c-designware-platdrv driver is used as the primary example in the series
and the patches modifying it are based on some previous changes currently in
linux-next AFAICS (the same applies to the intel-lpss driver), but these
patches can wait until everything is properly merged.  They are included here
mostly as illustration.

Overall, the series is based on the linux-next branch of the linux-pm.git tree
with some extra patches on top of it and all of the names of new entities
introduced in it are negotiable.

Thanks,
Rafael


* [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-16  5:34   ` Lukas Wunner
                     ` (4 more replies)
  2017-10-16  1:29 ` [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag Rafael J. Wysocki
                   ` (13 subsequent siblings)
  14 siblings, 5 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The motivation for this change is to provide a way to work around
a problem with the direct-complete mechanism used for avoiding
system suspend/resume handling for devices in runtime suspend.

The problem is that some middle layer code (the PCI bus type and
the ACPI PM domain in particular) returns positive values from its
system suspend ->prepare callbacks regardless of whether the driver's
->prepare returns a positive value or 0, which effectively prevents
drivers from being able to control the direct-complete feature.
Some drivers need that control, however, and the PCI bus type has
grown its own flag to deal with this issue, but since it is not
limited to PCI, it is better to address it by adding driver flags at
the core level.

To that end, add a driver_flags field to struct dev_pm_info for flags
that can be set by device drivers at probe time to inform the PM
core and/or bus types, PM domains and so on about the capabilities and/or
preferences of device drivers.  Also add two static inline helpers
for setting that field and testing it against a given set of flags
and make the driver core clear it automatically on driver remove
and probe failures.

Define and document two PM driver flags related to the direct-
complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
respectively, to indicate to the PM core that the direct-complete
mechanism should never be used for the device and to inform the
middle layer code (bus types, PM domains etc) that it can only
request the PM core to use the direct-complete mechanism for
the device (by returning a positive value from its ->prepare
callback) if it also has been requested by the driver.

While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.
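
The interplay described above can be modeled in userspace roughly as
follows.  This is a sketch, not the kernel code: struct device_model and
the model_* functions are hypothetical, and bus_wants_skip stands for
whatever the middle layer would decide on its own.  Only the flag values
and the shape of the conditions are taken from the patch below.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIT(n)			(1U << (n))
#define DPM_FLAG_NEVER_SKIP	BIT(0)
#define DPM_FLAG_SMART_PREPARE	BIT(1)

/* Hypothetical stand-in for the relevant bits of struct device. */
struct device_model {
	unsigned int driver_flags;
	bool runtime_suspended;
};

static void model_set_driver_flags(struct device_model *dev, unsigned int flags)
{
	assert(dev != NULL);
	dev->driver_flags = flags;
}

static bool model_test_driver_flags(const struct device_model *dev,
				    unsigned int flags)
{
	assert(dev != NULL);
	return !!(dev->driver_flags & flags);
}

/*
 * Middle-layer ->prepare: with SMART_PREPARE set, a 0 from the driver's
 * ->prepare is passed on instead of being overridden by the middle layer.
 */
static int model_bus_prepare(const struct device_model *dev,
			     bool has_drv_prepare, int drv_ret,
			     bool bus_wants_skip)
{
	if (has_drv_prepare) {
		if (drv_ret < 0)
			return drv_ret;

		if (!drv_ret &&
		    model_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
			return 0;
	}
	return bus_wants_skip ? 1 : 0;
}

/* Core decision: the condition assigned to power.direct_complete. */
static bool model_direct_complete(const struct device_model *dev,
				  int prepare_ret, bool suspend_event)
{
	return suspend_event && dev->runtime_suspended && prepare_ret > 0 &&
		!model_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
}
```

In this model, NEVER_SKIP wins regardless of what ->prepare returns,
which is why SMART_PREPARE has no effect when both are set.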

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/pm/devices.rst |   14 ++++++++++++++
 Documentation/power/pci.txt             |   19 +++++++++++++++++++
 drivers/acpi/device_pm.c                |    3 +++
 drivers/base/dd.c                       |    2 ++
 drivers/base/power/main.c               |    4 +++-
 drivers/pci/pci-driver.c                |    5 ++++-
 include/linux/device.h                  |   10 ++++++++++
 include/linux/pm.h                      |   20 ++++++++++++++++++++
 8 files changed, 75 insertions(+), 2 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -1070,6 +1070,16 @@ static inline void dev_pm_syscore_device
 #endif
 }
 
+static inline void dev_pm_set_driver_flags(struct device *dev, unsigned int flags)
+{
+	dev->power.driver_flags = flags;
+}
+
+static inline bool dev_pm_test_driver_flags(struct device *dev, unsigned int flags)
+{
+	return !!(dev->power.driver_flags & flags);
+}
+
 static inline void device_lock(struct device *dev)
 {
 	mutex_lock(&dev->mutex);
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -550,6 +550,25 @@ struct pm_subsys_data {
 #endif
 };
 
+/*
+ * Driver flags to control system suspend/resume behavior.
+ *
+ * These flags can be set by device drivers at the probe time.  They need not be
+ * cleared by the drivers as the driver core will take care of that.
+ *
+ * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
+ * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
+ *
+ * Setting SMART_PREPARE instructs bus types and PM domains which may want
+ * system suspend/resume callbacks to be skipped for the device to return 0 from
+ * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
+ * other words, the system suspend/resume callbacks can only be skipped for the
+ * device if its driver doesn't object against that).  This flag has no effect
+ * if NEVER_SKIP is set.
+ */
+#define DPM_FLAG_NEVER_SKIP	BIT(0)
+#define DPM_FLAG_SMART_PREPARE	BIT(1)
+
 struct dev_pm_info {
 	pm_message_t		power_state;
 	unsigned int		can_wakeup:1;
@@ -561,6 +580,7 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			early_init:1;	/* Owned by the PM core */
 	bool			direct_complete:1;	/* Owned by the PM core */
+	unsigned int		driver_flags;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -464,6 +464,7 @@ pinctrl_bind_failed:
 	if (dev->pm_domain && dev->pm_domain->dismiss)
 		dev->pm_domain->dismiss(dev);
 	pm_runtime_reinit(dev);
+	dev_pm_set_driver_flags(dev, 0);
 
 	switch (ret) {
 	case -EPROBE_DEFER:
@@ -869,6 +870,7 @@ static void __device_release_driver(stru
 		if (dev->pm_domain && dev->pm_domain->dismiss)
 			dev->pm_domain->dismiss(dev);
 		pm_runtime_reinit(dev);
+		dev_pm_set_driver_flags(dev, 0);
 
 		klist_remove(&dev->p->knode_driver);
 		device_pm_check_callbacks(dev);
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -1700,7 +1700,9 @@ unlock:
 	 * applies to suspend transitions, however.
 	 */
 	spin_lock_irq(&dev->power.lock);
-	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
+	dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
+		pm_runtime_suspended(dev) && ret > 0 &&
+		!dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
 	spin_unlock_irq(&dev->power.lock);
 	return 0;
 }
Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -682,8 +682,11 @@ static int pci_pm_prepare(struct device
 
 	if (drv && drv->pm && drv->pm->prepare) {
 		int error = drv->pm->prepare(dev);
-		if (error)
+		if (error < 0)
 			return error;
+
+		if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+			return 0;
 	}
 	return pci_dev_keep_suspended(to_pci_dev(dev));
 }
Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -965,6 +965,9 @@ int acpi_subsys_prepare(struct device *d
 	if (ret < 0)
 		return ret;
 
+	if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+		return 0;
+
 	if (!adev || !pm_runtime_suspended(dev))
 		return 0;
 
Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -354,6 +354,20 @@ the phases are: ``prepare``, ``suspend``
 	is because all such devices are initially set to runtime-suspended with
 	runtime PM disabled.
 
+	This feature also can be controlled by device drivers by using the
+	``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
+	management flags.  [Typically, they are set at the time the driver is
+	probed against the device in question by passing them to the
+	:c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
+	these flags is set, the PM core will not apply the direct-complete
+	procedure described above to the given device and, consequently, to any
+	of its ancestors.  The second flag, when set, informs the middle layer
+	code (bus types, device types, PM domains, classes) that it should take
+	the return value of the ``->prepare`` callback provided by the driver
+	into account and it may only return a positive value from its own
+	``->prepare`` callback if the driver's one also has returned a positive
+	value.
+
     2.	The ``->suspend`` methods should quiesce the device to stop it from
 	performing I/O.  They also may save the device registers and put it into
 	the appropriate low-power state, depending on the bus type the device is
Index: linux-pm/Documentation/power/pci.txt
===================================================================
--- linux-pm.orig/Documentation/power/pci.txt
+++ linux-pm/Documentation/power/pci.txt
@@ -961,6 +961,25 @@ dev_pm_ops to indicate that one suspend
 .suspend(), .freeze(), and .poweroff() members and one resume routine is to
 be pointed to by the .resume(), .thaw(), and .restore() members.
 
+3.1.19. Driver Flags for Power Management
+
+The PM core allows device drivers to set flags that influence the handling of
+power management for the devices by the core itself and by middle layer code
+including the PCI bus type.  The flags should be set once at the driver probe
+time with the help of the dev_pm_set_driver_flags() function and they should not
+be updated directly afterwards.
+
+The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete
+mechanism allowing device suspend/resume callbacks to be skipped if the device
+is in runtime suspend when the system suspend starts.  That also affects all of
+the ancestors of the device, so this flag should only be used if absolutely
+necessary.
+
+The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a
+positive value from pci_pm_prepare() if the ->prepare callback provided by the
+driver of the device returns a positive value.  That allows the driver to opt
+out from using the direct-complete mechanism dynamically.
+
 3.2. Device Runtime Power Management
 ------------------------------------
 In addition to providing device power management callbacks PCI device drivers


* [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-23 16:40   ` Ulf Hansson
  2017-10-16  1:29 ` [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE Rafael J. Wysocki
                   ` (12 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Replace the PCI-specific flag PCI_DEV_FLAGS_NEEDS_RESUME with the
PM core's DPM_FLAG_NEVER_SKIP one everywhere and drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/gpu/drm/i915/i915_drv.c |    2 +-
 drivers/misc/mei/pci-me.c       |    2 +-
 drivers/misc/mei/pci-txe.c      |    2 +-
 drivers/pci/pci.c               |    3 +--
 include/linux/pci.h             |    7 +------
 5 files changed, 5 insertions(+), 11 deletions(-)

Index: linux-pm/include/linux/pci.h
===================================================================
--- linux-pm.orig/include/linux/pci.h
+++ linux-pm/include/linux/pci.h
@@ -205,13 +205,8 @@ enum pci_dev_flags {
 	PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT = (__force pci_dev_flags_t) (1 << 9),
 	/* Do not use FLR even if device advertises PCI_AF_CAP */
 	PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
-	/*
-	 * Resume before calling the driver's system suspend hooks, disabling
-	 * the direct_complete optimization.
-	 */
-	PCI_DEV_FLAGS_NEEDS_RESUME = (__force pci_dev_flags_t) (1 << 11),
 	/* Don't use Relaxed Ordering for TLPs directed at this device */
-	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 12),
+	PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
 };
 
 enum pci_irq_reroute_variant {
Index: linux-pm/drivers/pci/pci.c
===================================================================
--- linux-pm.orig/drivers/pci/pci.c
+++ linux-pm/drivers/pci/pci.c
@@ -2166,8 +2166,7 @@ bool pci_dev_keep_suspended(struct pci_d
 
 	if (!pm_runtime_suspended(dev)
 	    || pci_target_state(pci_dev, wakeup) != pci_dev->current_state
-	    || platform_pci_need_resume(pci_dev)
-	    || (pci_dev->dev_flags & PCI_DEV_FLAGS_NEEDS_RESUME))
+	    || platform_pci_need_resume(pci_dev))
 		return false;
 
 	/*
Index: linux-pm/drivers/gpu/drm/i915/i915_drv.c
===================================================================
--- linux-pm.orig/drivers/gpu/drm/i915/i915_drv.c
+++ linux-pm/drivers/gpu/drm/i915/i915_drv.c
@@ -1304,7 +1304,7 @@ int i915_driver_load(struct pci_dev *pde
 	 * becaue the HDA driver may require us to enable the audio power
 	 * domain during system suspend.
 	 */
-	pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
+	dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
 
 	ret = i915_driver_init_early(dev_priv, ent);
 	if (ret < 0)
Index: linux-pm/drivers/misc/mei/pci-txe.c
===================================================================
--- linux-pm.orig/drivers/misc/mei/pci-txe.c
+++ linux-pm/drivers/misc/mei/pci-txe.c
@@ -141,7 +141,7 @@ static int mei_txe_probe(struct pci_dev
 	 * MEI requires to resume from runtime suspend mode
 	 * in order to perform link reset flow upon system suspend.
 	 */
-	pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
+	dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
 
 	/*
 	* For not wake-able HW runtime pm framework
Index: linux-pm/drivers/misc/mei/pci-me.c
===================================================================
--- linux-pm.orig/drivers/misc/mei/pci-me.c
+++ linux-pm/drivers/misc/mei/pci-me.c
@@ -223,7 +223,7 @@ static int mei_me_probe(struct pci_dev *
 	 * MEI requires to resume from runtime suspend mode
 	 * in order to perform link reset flow upon system suspend.
 	 */
-	pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
+	dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
 
 	/*
 	* For not wake-able HW runtime pm framework


* [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
  2017-10-16  1:29 ` [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-23 16:57   ` Ulf Hansson
  2017-10-16  1:29 ` [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag Rafael J. Wysocki
                   ` (11 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Modify i2c-designware-platdrv to set DPM_FLAG_SMART_PREPARE for its
devices and return 0 from the system suspend ->prepare callback
if the device has an ACPI companion object in order to tell the PM
core and middle layers to avoid skipping system suspend/resume
callbacks for the device in that case (which may be problematic,
because the device may be accessed during suspend and resume of
other devices via I2C operation regions then).

Also the pm_runtime_suspended() check in dw_i2c_plat_prepare()
is not necessary any more, because the core does it when setting
power.direct_complete for the device, so drop it.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/i2c/busses/i2c-designware-platdrv.c |   10 +++++++++-
 1 file changed, 9 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
===================================================================
--- linux-pm.orig/drivers/i2c/busses/i2c-designware-platdrv.c
+++ linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -370,6 +370,8 @@ static int dw_i2c_plat_probe(struct plat
 	ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev));
 	adap->dev.of_node = pdev->dev.of_node;
 
+	dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE);
+
 	/* The code below assumes runtime PM to be disabled. */
 	WARN_ON(pm_runtime_enabled(&pdev->dev));
 
@@ -433,7 +435,13 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match)
 #ifdef CONFIG_PM_SLEEP
 static int dw_i2c_plat_prepare(struct device *dev)
 {
-	return pm_runtime_suspended(dev);
+	/*
+	 * If the ACPI companion device object is present for this device, it
+	 * may be accessed during suspend and resume of other devices via I2C
+	 * operation regions, so tell the PM core and middle layers to avoid
+	 * skipping system suspend/resume callbacks for it in that case.
+	 */
+	return !has_acpi_companion(dev);
 }
 
 static void dw_i2c_plat_complete(struct device *dev)




* [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (2 preceding siblings ...)
  2017-10-16  1:29 ` [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-23 19:01   ` Ulf Hansson
  2017-10-24  5:22   ` Ulf Hansson
  2017-10-16  1:29 ` [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks Rafael J. Wysocki
                   ` (10 subsequent siblings)
  14 siblings, 2 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Define and document a SMART_SUSPEND flag to instruct bus types and PM
domains that the system suspend callbacks provided by the driver can
cope with runtime-suspended devices, so from the driver's perspective
it should be safe to leave devices in runtime suspend during system
suspend.

Setting that flag also causes the PM core to skip the "late" and
"noirq" phases of device suspend for devices that remain in runtime
suspend at the beginning of the "late" phase (when runtime PM has
been disabled for them) under the assumption that their state cannot
(and should not) change after that point until the system suspend
transition is complete.  Moreover, the PM core prevents runtime PM
from acting on devices with DPM_FLAG_SMART_SUSPEND during system
resume by setting their runtime PM status to "active" at the end of
the "early" phase (right prior to enabling runtime PM for them).
That allows system resume callbacks to do whatever is necessary to
resume the device without worrying about runtime PM possibly
running in parallel with them.

However, that doesn't apply to transitions involving ->thaw_noirq,
->thaw_early and ->thaw callbacks during hibernation, as they
generally are not expected to change the power states of devices.
Consequently, if a device is in runtime suspend at the beginning
of such a transition, it must stay in runtime suspend until the
"complete" phase of it (since the callbacks may not change its
power state).
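
The three rules above can be condensed into a small userspace model.
It is only a sketch under assumptions: struct device_model, the EV_*
event names and the model_* functions are hypothetical, and the last
function is meant to be called only for devices whose "early" resume
phase has actually been carried out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define BIT(n)			(1U << (n))
#define DPM_FLAG_SMART_SUSPEND	BIT(2)

/* Hypothetical stand-ins for the PM events relevant here. */
enum model_event { EV_SUSPEND, EV_RESUME, EV_THAW, EV_RECOVER };

/* Hypothetical stand-in for the relevant bits of struct device. */
struct device_model {
	unsigned int driver_flags;
	bool rpm_suspended;	/* runtime PM status is "suspended" */
};

static bool model_smart_suspend(const struct device_model *dev)
{
	assert(dev != NULL);
	return !!(dev->driver_flags & DPM_FLAG_SMART_SUSPEND);
}

/* Suspend side: skip "late" and "noirq" if still in runtime suspend. */
static bool model_skip_late_suspend(const struct device_model *dev)
{
	return model_smart_suspend(dev) && dev->rpm_suspended;
}

/* Thaw side: the device has to stay in runtime suspend, so skip "noirq". */
static bool model_skip_noirq_resume(const struct device_model *dev,
				    enum model_event event)
{
	return model_smart_suspend(dev) && dev->rpm_suspended &&
		(event == EV_THAW || event == EV_RECOVER);
}

/* Resume side: at the end of "early", mark the device as active. */
static void model_end_of_resume_early(struct device_model *dev)
{
	if (model_smart_suspend(dev))
		dev->rpm_suspended = false;
}
```

Devices without the flag are not affected by any of the three checks in
this model.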

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/pm/devices.rst |   17 ++++++++
 drivers/base/power/main.c               |   63 ++++++++++++++++++++++++++++----
 include/linux/pm.h                      |    9 ++++
 3 files changed, 82 insertions(+), 7 deletions(-)

Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -766,6 +766,23 @@ the state of devices (possibly except fo
 from their ``->prepare`` and ``->suspend`` callbacks (or equivalent) *before*
 invoking device drivers' ``->suspend`` callbacks (or equivalent).
 
+Some bus types and PM domains have a policy to resume all devices from runtime
+suspend upfront in their ``->suspend`` callbacks, but that may not be really
+necessary if the system suspend-resume callbacks provided by the device's
+driver can cope with runtime-suspended devices.  The driver can indicate that
+by setting ``DPM_FLAG_SMART_SUSPEND`` in :c:member:`power.driver_flags` at the
+probe time, by passing it to the :c:func:`dev_pm_set_driver_flags` helper.  That
+also causes the PM core to skip the ``suspend_late`` and ``suspend_noirq``
+phases of device suspend for the device if it remains in runtime suspend at the
+beginning of the ``suspend_late`` phase (when runtime PM has been disabled for
+it) under the assumption that its state cannot (and should not) change after
+that point until the system-wide transition is over.  Moreover, the PM core
+updates the runtime power management status of devices with
+``DPM_FLAG_SMART_SUSPEND`` set to "active" at the end of the ``resume_early``
+phase of device resume (right prior to enabling runtime PM for them) in order
+to prevent runtime PM from acting on them before the ``complete`` phase, which
+means that they should be put into the full-power state before that phase.
+
 During system-wide resume from a sleep state it's easiest to put devices into
 the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
 Refer to that document for more information regarding this particular issue as
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -558,6 +558,7 @@ struct pm_subsys_data {
  *
  * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
  * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
+ * SMART_SUSPEND: No need to resume the device from runtime suspend.
  *
  * Setting SMART_PREPARE instructs bus types and PM domains which may want
  * system suspend/resume callbacks to be skipped for the device to return 0 from
@@ -565,9 +566,17 @@ struct pm_subsys_data {
  * other words, the system suspend/resume callbacks can only be skipped for the
  * device if its driver doesn't object against that).  This flag has no effect
  * if NEVER_SKIP is set.
+ *
+ * Setting SMART_SUSPEND instructs bus types and PM domains which may want to
+ * runtime resume the device upfront during system suspend that doing so is not
+ * necessary from the driver's perspective.  It also causes the PM core to skip
+ * the "late" and "noirq" phases of device suspend for the device if it remains
+ * in runtime suspend at the beginning of the "late" phase (when runtime PM has
+ * been disabled for it).
  */
 #define DPM_FLAG_NEVER_SKIP	BIT(0)
 #define DPM_FLAG_SMART_PREPARE	BIT(1)
+#define DPM_FLAG_SMART_SUSPEND	BIT(2)
 
 struct dev_pm_info {
 	pm_message_t		power_state;
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -551,6 +551,18 @@ static int device_resume_noirq(struct de
 	if (!dev->power.is_noirq_suspended)
 		goto Out;
 
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+	    pm_runtime_status_suspended(dev) && (state.event == PM_EVENT_THAW ||
+	    state.event == PM_EVENT_RECOVER)) {
+		/*
+		 * The device has to stay in runtime suspend, because the
+		 * subsequent callbacks may not try to change its power state.
+		 */
+		dev->power.is_suspended = false;
+		dev->power.is_late_suspended = false;
+		goto Skip;
+	}
+
 	dpm_wait_for_superior(dev, async);
 
 	if (dev->pm_domain) {
@@ -573,9 +585,11 @@ static int device_resume_noirq(struct de
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
+
+Skip:
 	dev->power.is_noirq_suspended = false;
 
- Out:
+Out:
 	complete_all(&dev->power.completion);
 	TRACE_RESUME(error);
 	return error;
@@ -715,6 +729,14 @@ static int device_resume_early(struct de
 	error = dpm_run_callback(callback, dev, state, info);
 	dev->power.is_late_suspended = false;
 
+	/*
+	 * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend
+	 * during system suspend, so update their runtime PM status to "active"
+	 * to prevent runtime PM from acting on them before device_complete().
+	 */
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+		pm_runtime_set_active(dev);
+
  Out:
 	TRACE_RESUME(error);
 
@@ -1107,6 +1129,15 @@ static int __device_suspend_noirq(struct
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
+	/*
+	 * The state of devices with DPM_FLAG_SMART_SUSPEND set that remain in
+	 * runtime suspend at this point cannot change going forward, so skip
+	 * the callback invocation for them.
+	 */
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+	    pm_runtime_status_suspended(dev))
+		goto Skip;
+
 	if (dev->pm_domain) {
 		info = "noirq power domain ";
 		callback = pm_noirq_op(&dev->pm_domain->ops, state);
@@ -1127,10 +1158,13 @@ static int __device_suspend_noirq(struct
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
-		dev->power.is_noirq_suspended = true;
-	else
+	if (error) {
 		async_error = error;
+		goto Complete;
+	}
+
+Skip:
+	dev->power.is_noirq_suspended = true;
 
 Complete:
 	complete_all(&dev->power.completion);
@@ -1268,6 +1302,15 @@ static int __device_suspend_late(struct
 	if (dev->power.syscore || dev->power.direct_complete)
 		goto Complete;
 
+	/*
+	 * The state of devices with DPM_FLAG_SMART_SUSPEND set that remain in
+	 * runtime suspend at this point cannot change going forward, so skip
+	 * the callback invocation for them.
+	 */
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+	    pm_runtime_status_suspended(dev))
+		goto Skip;
+
 	if (dev->pm_domain) {
 		info = "late power domain ";
 		callback = pm_late_early_op(&dev->pm_domain->ops, state);
@@ -1288,10 +1331,13 @@ static int __device_suspend_late(struct
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
-	if (!error)
-		dev->power.is_late_suspended = true;
-	else
+	if (error) {
 		async_error = error;
+		goto Complete;
+	}
+
+Skip:
+	dev->power.is_late_suspended = true;
 
 Complete:
 	TRACE_SUSPEND(error);
@@ -1652,6 +1698,9 @@ static int device_prepare(struct device
 	if (dev->power.syscore)
 		return 0;
 
+	WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+		!pm_runtime_enabled(dev));
+
 	/*
 	 * If a device's parent goes into runtime suspend at the wrong time,
 	 * it won't be possible to resume the device.  To prevent this we


* [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (3 preceding siblings ...)
  2017-10-16  1:29 ` [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-23 19:06   ` Ulf Hansson
  2017-10-16  1:29 ` [PATCH 06/12] PCI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
                   ` (9 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The only user of non-empty pcibios_pm_ops is s390 and it only uses
"noirq" callbacks, so drop the invocations of the other pcibios_pm_ops
callbacks from the PCI PM code.

That will allow subsequent changes to be somewhat simpler.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/pci/pci-driver.c |   18 ------------------
 1 file changed, 18 deletions(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -918,9 +918,6 @@ static int pci_pm_freeze(struct device *
 			return error;
 	}
 
-	if (pcibios_pm_ops.freeze)
-		return pcibios_pm_ops.freeze(dev);
-
 	return 0;
 }
 
@@ -982,12 +979,6 @@ static int pci_pm_thaw(struct device *de
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 	int error = 0;
 
-	if (pcibios_pm_ops.thaw) {
-		error = pcibios_pm_ops.thaw(dev);
-		if (error)
-			return error;
-	}
-
 	if (pci_has_legacy_pm_support(pci_dev))
 		return pci_legacy_resume(dev);
 
@@ -1032,9 +1023,6 @@ static int pci_pm_poweroff(struct device
  Fixup:
 	pci_fixup_device(pci_fixup_suspend, pci_dev);
 
-	if (pcibios_pm_ops.poweroff)
-		return pcibios_pm_ops.poweroff(dev);
-
 	return 0;
 }
 
@@ -1107,12 +1095,6 @@ static int pci_pm_restore(struct device
 	const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
 	int error = 0;
 
-	if (pcibios_pm_ops.restore) {
-		error = pcibios_pm_ops.restore(dev);
-		if (error)
-			return error;
-	}
-
 	/*
 	 * This is necessary for the hibernation error path in which restore is
 	 * called without restoring the standard config registers of the device.


* [PATCH 06/12] PCI / PM: Take SMART_SUSPEND driver flag into account
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (4 preceding siblings ...)
  2017-10-16  1:29 ` [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-16  1:29 ` [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling Rafael J. Wysocki
                   ` (8 subsequent siblings)
  14 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make the PCI bus type take DPM_FLAG_SMART_SUSPEND into account in its
system suspend callbacks and make sure that all code that should not
run in parallel with pci_pm_runtime_resume() is executed in the "late"
phases of system suspend, freeze and poweroff transitions.

[Note that the pm_runtime_suspended() check in pci_dev_keep_suspended()
is an optimization, because if it is not passed, all of the subsequent
checks may be skipped and some of them are much more expensive in
general.]
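
For illustration only, the decision made by the reworked callbacks can be
modeled in plain userspace C roughly as follows.  This is a sketch, not the
kernel code: struct model_dev, the model_* helpers and the enum are made up
for this example, and model_keep_suspended() merely stands in for whatever
bus-specific checks pci_dev_keep_suspended() carries out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; only the flag's bit position follows the series. */
#define MODEL_FLAG_SMART_SUSPEND	(1u << 2)

enum model_transition { MODEL_SUSPEND, MODEL_FREEZE, MODEL_POWEROFF };

struct model_dev {
	unsigned int driver_flags;	/* flags set by the driver */
	bool runtime_suspended;		/* runtime PM status */
	bool bus_needs_reconfig;	/* e.g. wakeup settings must change */
};

/* Stand-in for the bus-specific "may it stay suspended?" checks. */
static bool model_keep_suspended(const struct model_dev *dev)
{
	return dev->runtime_suspended && !dev->bus_needs_reconfig;
}

/* Returns true if the bus type should runtime-resume the device upfront. */
static bool model_resume_upfront(const struct model_dev *dev,
				 enum model_transition t)
{
	assert(dev != NULL);

	if (!(dev->driver_flags & MODEL_FLAG_SMART_SUSPEND))
		return true;	/* the driver did not opt in */

	if (t == MODEL_FREEZE)
		return false;	/* the flag alone decides for freeze */

	return !model_keep_suspended(dev);
}
```

In this model a device is left alone only if its driver has opted in and,
for the suspend and poweroff transitions, the bus type has no reason of its
own to reconfigure it.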

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/power/pci.txt |    6 ++++
 drivers/pci/pci-driver.c    |   56 ++++++++++++++++++++++++++++++--------------
 2 files changed, 45 insertions(+), 17 deletions(-)

Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -727,18 +727,25 @@ static int pci_pm_suspend(struct device
 
 	if (!pm) {
 		pci_pm_default_suspend(pci_dev);
-		goto Fixup;
+		return 0;
 	}
 
 	/*
-	 * PCI devices suspended at run time need to be resumed at this point,
-	 * because in general it is necessary to reconfigure them for system
-	 * suspend.  Namely, if the device is supposed to wake up the system
-	 * from the sleep state, we may need to reconfigure it for this purpose.
-	 * In turn, if the device is not supposed to wake up the system from the
-	 * sleep state, we'll have to prevent it from signaling wake-up.
+	 * PCI devices suspended at run time may need to be resumed at this
+	 * point, because in general it may be necessary to reconfigure them for
+	 * system suspend.  Namely, if the device is expected to wake up the
+	 * system from the sleep state, it may have to be reconfigured for this
+	 * purpose, or if the device is not expected to wake up the system from
+	 * the sleep state, it should be prevented from signaling wakeup events
+	 * going forward.
+	 *
+	 * Also if the driver of the device does not indicate that its system
+	 * suspend callbacks can cope with runtime-suspended devices, it is
+	 * better to resume the device from runtime suspend here.
 	 */
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
+	    !pci_dev_keep_suspended(pci_dev))
+		pm_runtime_resume(dev);
 
 	pci_dev->state_saved = false;
 	if (pm->suspend) {
@@ -758,12 +765,16 @@ static int pci_pm_suspend(struct device
 		}
 	}
 
- Fixup:
-	pci_fixup_device(pci_fixup_suspend, pci_dev);
-
 	return 0;
 }
 
+static int pci_pm_suspend_late(struct device *dev)
+{
+	pci_fixup_device(pci_fixup_suspend, to_pci_dev(dev));
+
+	return pm_generic_suspend_late(dev);
+}
+
 static int pci_pm_suspend_noirq(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
@@ -872,6 +883,7 @@ static int pci_pm_resume(struct device *
 #else /* !CONFIG_SUSPEND */
 
 #define pci_pm_suspend		NULL
+#define pci_pm_suspend_late	NULL
 #define pci_pm_suspend_noirq	NULL
 #define pci_pm_resume		NULL
 #define pci_pm_resume_noirq	NULL
@@ -906,7 +918,8 @@ static int pci_pm_freeze(struct device *
 	 * devices should not be touched during freeze/thaw transitions,
 	 * however.
 	 */
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+		pm_runtime_resume(dev);
 
 	pci_dev->state_saved = false;
 	if (pm->freeze) {
@@ -1004,11 +1017,13 @@ static int pci_pm_poweroff(struct device
 
 	if (!pm) {
 		pci_pm_default_suspend(pci_dev);
-		goto Fixup;
+		return 0;
 	}
 
 	/* The reason to do that is the same as in pci_pm_suspend(). */
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
+	    !pci_dev_keep_suspended(pci_dev))
+		pm_runtime_resume(dev);
 
 	pci_dev->state_saved = false;
 	if (pm->poweroff) {
@@ -1020,12 +1035,16 @@ static int pci_pm_poweroff(struct device
 			return error;
 	}
 
- Fixup:
-	pci_fixup_device(pci_fixup_suspend, pci_dev);
-
 	return 0;
 }
 
+static int pci_pm_poweroff_late(struct device *dev)
+{
+	pci_fixup_device(pci_fixup_suspend, to_pci_dev(dev));
+
+	return pm_generic_poweroff_late(dev);
+}
+
 static int pci_pm_poweroff_noirq(struct device *dev)
 {
 	struct pci_dev *pci_dev = to_pci_dev(dev);
@@ -1124,6 +1143,7 @@ static int pci_pm_restore(struct device
 #define pci_pm_thaw		NULL
 #define pci_pm_thaw_noirq	NULL
 #define pci_pm_poweroff		NULL
+#define pci_pm_poweroff_late	NULL
 #define pci_pm_poweroff_noirq	NULL
 #define pci_pm_restore		NULL
 #define pci_pm_restore_noirq	NULL
@@ -1239,10 +1259,12 @@ static const struct dev_pm_ops pci_dev_p
 	.prepare = pci_pm_prepare,
 	.complete = pci_pm_complete,
 	.suspend = pci_pm_suspend,
+	.suspend_late = pci_pm_suspend_late,
 	.resume = pci_pm_resume,
 	.freeze = pci_pm_freeze,
 	.thaw = pci_pm_thaw,
 	.poweroff = pci_pm_poweroff,
+	.poweroff_late = pci_pm_poweroff_late,
 	.restore = pci_pm_restore,
 	.suspend_noirq = pci_pm_suspend_noirq,
 	.resume_noirq = pci_pm_resume_noirq,
Index: linux-pm/Documentation/power/pci.txt
===================================================================
--- linux-pm.orig/Documentation/power/pci.txt
+++ linux-pm/Documentation/power/pci.txt
@@ -980,6 +980,12 @@ positive value from pci_pm_prepare() if
 driver of the device returns a positive value.  That allows the driver to opt
 out from using the direct-complete mechanism dynamically.
 
+The DPM_FLAG_SMART_SUSPEND flag tells the PCI bus type that from the driver's
+perspective the device can be safely left in runtime suspend during system
+suspend.  That causes pci_pm_suspend(), pci_pm_freeze() and pci_pm_poweroff()
+to skip resuming the device from runtime suspend unless there are PCI-specific
+reasons for doing that.
+
 3.2. Device Runtime Power Management
 ------------------------------------
 In addition to providing device power management callbacks PCI device drivers


* [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (5 preceding siblings ...)
  2017-10-16  1:29 ` [PATCH 06/12] PCI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
@ 2017-10-16  1:29 ` Rafael J. Wysocki
  2017-10-23 19:09   ` Ulf Hansson
  2017-10-16  1:30 ` [PATCH 08/12] ACPI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
                   ` (7 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:29 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Move the LPSS-specific code from acpi_lpss_runtime_suspend()
and acpi_lpss_runtime_resume() into separate functions,
acpi_lpss_suspend() and acpi_lpss_resume(), respectively, and
make acpi_lpss_suspend_late() and acpi_lpss_resume_early() use
them too in order to unify the runtime PM and system sleep
handling in the LPSS driver.
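
The shape of the result can be sketched in userspace C as below.  This is
only a model under stated assumptions: struct model_lpss_dev and all of the
model_* functions are hypothetical, the generic_ret field stands in for the
return value of the pm_generic_*() call, and the IOSF handling is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical userspace model of a device handled by the LPSS layer. */
struct model_lpss_dev {
	int generic_ret;	/* result of the driver-level callback */
	bool may_wakeup;	/* system wakeup enabled for the device */
	bool save_ctx;		/* context has to be saved and restored */
	bool ctx_saved;
	bool wakeup_armed;
	bool powered;
};

/* Shared suspend path; only the wakeup argument differs per caller. */
static int model_lpss_suspend(struct model_lpss_dev *d, bool wakeup)
{
	assert(d != NULL);
	if (d->save_ctx)
		d->ctx_saved = true;
	d->wakeup_armed = wakeup;
	d->powered = false;
	return 0;
}

/* Shared resume path. */
static int model_lpss_resume(struct model_lpss_dev *d)
{
	assert(d != NULL);
	d->powered = true;
	if (d->save_ctx)
		d->ctx_saved = false;	/* context restored */
	return 0;
}

/* System sleep: wakeup follows the device's wakeup setting. */
static int model_suspend_late(struct model_lpss_dev *d)
{
	int ret = d->generic_ret;

	return ret ? ret : model_lpss_suspend(d, d->may_wakeup);
}

/* Runtime PM: wakeup is always armed. */
static int model_runtime_suspend(struct model_lpss_dev *d)
{
	int ret = d->generic_ret;

	return ret ? ret : model_lpss_suspend(d, true);
}

/* Resume wrappers do the shared part first, then the generic one. */
static int model_resume_early(struct model_lpss_dev *d)
{
	int ret = model_lpss_resume(d);

	return ret ? ret : d->generic_ret;
}
```

The only difference between the two suspend wrappers in the model is the
wakeup argument, which is the point of the consolidation.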

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

This is based on an RFC I posted some time ago
(https://patchwork.kernel.org/patch/9998147/), which didn't
receive any comments and it depends on a couple of ACPI device PM
patches posted recently (https://patchwork.kernel.org/patch/10006457/
in particular).

It's included in this series, because the next patch won't work without it.

---
 drivers/acpi/acpi_lpss.c |   75 ++++++++++++++++++++---------------------------
 1 file changed, 33 insertions(+), 42 deletions(-)

Index: linux-pm/drivers/acpi/acpi_lpss.c
===================================================================
--- linux-pm.orig/drivers/acpi/acpi_lpss.c
+++ linux-pm/drivers/acpi/acpi_lpss.c
@@ -716,40 +716,6 @@ static void acpi_lpss_dismiss(struct dev
 	acpi_dev_suspend(dev, false);
 }
 
-#ifdef CONFIG_PM_SLEEP
-static int acpi_lpss_suspend_late(struct device *dev)
-{
-	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
-	int ret;
-
-	ret = pm_generic_suspend_late(dev);
-	if (ret)
-		return ret;
-
-	if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
-		acpi_lpss_save_ctx(dev, pdata);
-
-	return acpi_dev_suspend(dev, device_may_wakeup(dev));
-}
-
-static int acpi_lpss_resume_early(struct device *dev)
-{
-	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
-	int ret;
-
-	ret = acpi_dev_resume(dev);
-	if (ret)
-		return ret;
-
-	acpi_lpss_d3_to_d0_delay(pdata);
-
-	if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
-		acpi_lpss_restore_ctx(dev, pdata);
-
-	return pm_generic_resume_early(dev);
-}
-#endif /* CONFIG_PM_SLEEP */
-
 /* IOSF SB for LPSS island */
 #define LPSS_IOSF_UNIT_LPIOEP		0xA0
 #define LPSS_IOSF_UNIT_LPIO1		0xAB
@@ -835,19 +801,15 @@ static void lpss_iosf_exit_d3_state(void
 	mutex_unlock(&lpss_iosf_mutex);
 }
 
-static int acpi_lpss_runtime_suspend(struct device *dev)
+static int acpi_lpss_suspend(struct device *dev, bool wakeup)
 {
 	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
 	int ret;
 
-	ret = pm_generic_runtime_suspend(dev);
-	if (ret)
-		return ret;
-
 	if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
 		acpi_lpss_save_ctx(dev, pdata);
 
-	ret = acpi_dev_suspend(dev, true);
+	ret = acpi_dev_suspend(dev, wakeup);
 
 	/*
 	 * This call must be last in the sequence, otherwise PMC will return
@@ -860,7 +822,7 @@ static int acpi_lpss_runtime_suspend(str
 	return ret;
 }
 
-static int acpi_lpss_runtime_resume(struct device *dev)
+static int acpi_lpss_resume(struct device *dev)
 {
 	struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
 	int ret;
@@ -881,7 +843,36 @@ static int acpi_lpss_runtime_resume(stru
 	if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
 		acpi_lpss_restore_ctx(dev, pdata);
 
-	return pm_generic_runtime_resume(dev);
+	return 0;
+}
+#ifdef CONFIG_PM_SLEEP
+static int acpi_lpss_suspend_late(struct device *dev)
+{
+	int ret = pm_generic_suspend_late(dev);
+
+	return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev));
+}
+
+static int acpi_lpss_resume_early(struct device *dev)
+{
+	int ret = acpi_lpss_resume(dev);
+
+	return ret ? ret : pm_generic_resume_early(dev);
+}
+#endif /* CONFIG_PM_SLEEP */
+
+static int acpi_lpss_runtime_suspend(struct device *dev)
+{
+	int ret = pm_generic_runtime_suspend(dev);
+
+	return ret ? ret : acpi_lpss_suspend(dev, true);
+}
+
+static int acpi_lpss_runtime_resume(struct device *dev)
+{
+	int ret = acpi_lpss_resume(dev);
+
+	return ret ? ret : pm_generic_runtime_resume(dev);
 }
 #endif /* CONFIG_PM */
 


* [PATCH 08/12] ACPI / PM: Take SMART_SUSPEND driver flag into account
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (6 preceding siblings ...)
  2017-10-16  1:29 ` [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling Rafael J. Wysocki
@ 2017-10-16  1:30 ` Rafael J. Wysocki
  2017-10-16  1:30 ` [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND Rafael J. Wysocki
                   ` (6 subsequent siblings)
  14 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:30 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make the ACPI PM domain take DPM_FLAG_SMART_SUSPEND into account in
its system suspend callbacks.

[Note that the pm_runtime_suspended() check in acpi_dev_needs_resume()
is an optimization, because if it is not passed, all of the subsequent
checks may be skipped and some of them are much more expensive in
general.]
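
The ordering argument from the note can be illustrated with a small
userspace C model.  It is a sketch only: struct model_acpi_dev, its fields
and the model_* functions are invented for this example, and the counter
exists purely to show when the costly check is skipped.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model of a device with an optional firmware companion. */
struct model_acpi_dev {
	bool has_companion;	/* plays the role of a non-NULL adev */
	bool runtime_suspended;
	bool wakeup_matches;	/* wakeup setup agrees with user settings */
	bool power_state_ok;	/* result of a costly power state query */
	int costly_queries;	/* how many times the costly query ran */
};

/* Stand-in for a check that is expensive to evaluate. */
static bool model_costly_state_check(struct model_acpi_dev *d)
{
	d->costly_queries++;
	return d->power_state_ok;
}

/* Cheap checks come first, so the costly one runs only when needed. */
static bool model_needs_resume(struct model_acpi_dev *d)
{
	assert(d != NULL);

	if (!d->runtime_suspended || !d->has_companion || !d->wakeup_matches)
		return true;

	return !model_costly_state_check(d);
}
```

For a device that is not runtime-suspended the model returns right away and
the costly query never runs.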

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/acpi/device_pm.c |   21 +++++++++++++--------
 1 file changed, 13 insertions(+), 8 deletions(-)

Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -936,7 +936,8 @@ static bool acpi_dev_needs_resume(struct
 	u32 sys_target = acpi_target_system_state();
 	int ret, state;
 
-	if (device_may_wakeup(dev) != !!adev->wakeup.prepare_count)
+	if (!pm_runtime_suspended(dev) || !adev ||
+	    device_may_wakeup(dev) != !!adev->wakeup.prepare_count)
 		return true;
 
 	if (sys_target == ACPI_STATE_S0)
@@ -968,9 +969,6 @@ int acpi_subsys_prepare(struct device *d
 	if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
 		return 0;
 
-	if (!adev || !pm_runtime_suspended(dev))
-		return 0;
-
 	return !acpi_dev_needs_resume(dev, adev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_prepare);
@@ -996,12 +994,17 @@ EXPORT_SYMBOL_GPL(acpi_subsys_complete);
  * acpi_subsys_suspend - Run the device driver's suspend callback.
  * @dev: Device to handle.
  *
- * Follow PCI and resume devices suspended at run time before running their
- * system suspend callbacks.
+ * Follow PCI and resume devices from runtime suspend before running their
+ * system suspend callbacks, unless the driver can cope with runtime-suspended
+ * devices during system suspend and there are no ACPI-specific reasons for
+ * resuming them.
  */
 int acpi_subsys_suspend(struct device *dev)
 {
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
+	    acpi_dev_needs_resume(dev, ACPI_COMPANION(dev)))
+		pm_runtime_resume(dev);
+
 	return pm_generic_suspend(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_suspend);
@@ -1047,7 +1050,9 @@ int acpi_subsys_freeze(struct device *de
 	 * runtime-suspended devices should not be touched during freeze/thaw
 	 * transitions.
 	 */
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+		pm_runtime_resume(dev);
+
 	return pm_generic_freeze(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_freeze);


* [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (7 preceding siblings ...)
  2017-10-16  1:30 ` [PATCH 08/12] ACPI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
@ 2017-10-16  1:30 ` Rafael J. Wysocki
  2017-10-31 15:09   ` Lee Jones
  2017-10-16  1:30 ` [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag Rafael J. Wysocki
                   ` (5 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:30 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
devices, which will allow them to stay in runtime suspend during
system suspend unless they need to be reconfigured for some reason.

Also make it avoid resuming its child devices if they have
DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
suspend during system suspend.
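
As a rough userspace illustration of the second change (a sketch, not the
driver code; struct model_child and the model_* helpers are hypothetical
names used only here), the parent walks its children and resumes only the
ones whose drivers have not set the flag:

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; only the flag's bit position follows the series. */
#define MODEL_FLAG_SMART_SUSPEND	(1u << 2)

struct model_child {
	unsigned int driver_flags;
	bool runtime_suspended;
};

/* Model of a driver opting in, e.g. at the end of its probe routine. */
static void model_set_driver_flags(struct model_child *c, unsigned int flags)
{
	assert(c != NULL);
	c->driver_flags = flags;
}

/*
 * Model of the parent's preparation step: children are resumed upfront
 * unless their drivers have set the flag.  Returns the number resumed.
 */
static size_t model_resume_children(struct model_child *children, size_t n)
{
	size_t i, resumed = 0;

	for (i = 0; i < n; i++) {
		if (children[i].driver_flags & MODEL_FLAG_SMART_SUSPEND)
			continue;	/* may stay in runtime suspend */

		if (children[i].runtime_suspended) {
			children[i].runtime_suspended = false;
			resumed++;
		}
	}
	return resumed;
}
```

Children that opted in keep their runtime PM status untouched in the model,
which is what lets them remain suspended across the transition.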

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/mfd/intel-lpss.c |    6 +++++-
 1 file changed, 5 insertions(+), 1 deletion(-)

Index: linux-pm/drivers/mfd/intel-lpss.c
===================================================================
--- linux-pm.orig/drivers/mfd/intel-lpss.c
+++ linux-pm/drivers/mfd/intel-lpss.c
@@ -450,6 +450,8 @@ int intel_lpss_probe(struct device *dev,
 	if (ret)
 		goto err_remove_ltr;
 
+	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_SUSPEND);
+
 	return 0;
 
 err_remove_ltr:
@@ -478,7 +480,9 @@ EXPORT_SYMBOL_GPL(intel_lpss_remove);
 
 static int resume_lpss_device(struct device *dev, void *data)
 {
-	pm_runtime_resume(dev);
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+		pm_runtime_resume(dev);
+
 	return 0;
 }
 


* [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (8 preceding siblings ...)
  2017-10-16  1:30 ` [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND Rafael J. Wysocki
@ 2017-10-16  1:30 ` Rafael J. Wysocki
  2017-10-23 19:38   ` Ulf Hansson
  2017-10-16  1:31 ` [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management Rafael J. Wysocki
                   ` (4 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:30 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
instruct the PM core that it is desirable to leave the device in
runtime suspend after system resume (for example, the device may be
slow to resume and it may be better to avoid resuming it right away
for this reason).

Setting that flag causes the PM core to skip the ->resume_noirq,
->resume_early and ->resume callbacks for the device (like in the
direct-complete optimization case) if (1) its wakeup settings are
compatible with runtime PM (that is, either the device is configured
to wake up the system from sleep or it cannot generate wakeup signals
at all), and (2) it will not be used for resuming any of its children
or consumers.
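
The two conditions and the way the "must resume" requirement travels from a
device to its parent can be modeled in userspace C as follows.  This is a
sketch under simplifying assumptions: struct model_dev and the model_*
functions are made up for the example, suppliers are left out (they would
be marked the same way as the parent), and devices are assumed to be
processed children first.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical model; only the flag's bit position follows the patch. */
#define MODEL_FLAG_LEAVE_SUSPENDED	(1u << 3)

struct model_dev {
	struct model_dev *parent;
	unsigned int driver_flags;
	bool can_wakeup;	/* able to generate wakeup signals */
	bool may_wakeup;	/* configured to wake up the system */
	bool must_resume;	/* needed for resuming a dependent device */
	bool left_suspended;
};

/* Wakeup settings compatible with runtime PM, as described above. */
static bool model_wakeup_compatible(const struct model_dev *d)
{
	return d->may_wakeup || !d->can_wakeup;
}

/*
 * Process one device during the last suspend phase.  Children must be
 * processed before their parents for the propagation to work.
 */
static void model_suspend_one(struct model_dev *d)
{
	assert(d != NULL);

	if ((d->driver_flags & MODEL_FLAG_LEAVE_SUSPENDED) &&
	    model_wakeup_compatible(d) && !d->must_resume) {
		d->left_suspended = true;
		return;
	}

	/* This device will be resumed, so its parent is needed too. */
	if (d->parent)
		d->parent->must_resume = true;
}
```

A parent with the flag set is therefore still resumed in the model whenever
one of its children is going to be resumed.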

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/pm/devices.rst |   20 +++++++
 drivers/base/power/main.c               |   81 ++++++++++++++++++++++++++++++--
 include/linux/pm.h                      |   12 +++-
 3 files changed, 104 insertions(+), 9 deletions(-)

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -559,6 +559,7 @@ struct pm_subsys_data {
  * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
  * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
  * SMART_SUSPEND: No need to resume the device from runtime suspend.
+ * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible.
  *
  * Setting SMART_PREPARE instructs bus types and PM domains which may want
  * system suspend/resume callbacks to be skipped for the device to return 0 from
@@ -573,10 +574,14 @@ struct pm_subsys_data {
  * the "late" and "noirq" phases of device suspend for the device if it remains
  * in runtime suspend at the beginning of the "late" phase (when runtime PM has
  * been disabled for it).
+ *
+ * Setting LEAVE_SUSPENDED informs the PM core and middle layer code that the
+ * driver prefers the device to be left in runtime suspend after system resume.
  */
-#define DPM_FLAG_NEVER_SKIP	BIT(0)
-#define DPM_FLAG_SMART_PREPARE	BIT(1)
-#define DPM_FLAG_SMART_SUSPEND	BIT(2)
+#define DPM_FLAG_NEVER_SKIP		BIT(0)
+#define DPM_FLAG_SMART_PREPARE		BIT(1)
+#define DPM_FLAG_SMART_SUSPEND		BIT(2)
+#define DPM_FLAG_LEAVE_SUSPENDED	BIT(3)
 
 struct dev_pm_info {
 	pm_message_t		power_state;
@@ -598,6 +603,7 @@ struct dev_pm_info {
 	bool			wakeup_path:1;
 	bool			syscore:1;
 	bool			no_pm_callbacks:1;	/* Owned by the PM core */
+	unsigned int		must_resume:1;	/* Owned by the PM core */
 #else
 	unsigned int		should_wakeup:1;
 #endif
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -705,6 +705,12 @@ static int device_resume_early(struct de
 	if (!dev->power.is_late_suspended)
 		goto Out;
 
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
+	    !dev->power.must_resume) {
+		pm_runtime_set_suspended(dev);
+		goto Out;
+	}
+
 	dpm_wait_for_superior(dev, async);
 
 	if (dev->pm_domain) {
@@ -1098,6 +1104,32 @@ static pm_message_t resume_event(pm_mess
 	return PMSG_ON;
 }
 
+static void dpm_suppliers_set_must_resume(struct device *dev)
+{
+	struct device_link *link;
+	int idx;
+
+	idx = device_links_read_lock();
+
+	list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
+		link->supplier->power.must_resume = true;
+
+	device_links_read_unlock(idx);
+}
+
+static void dpm_leave_suspended(struct device *dev)
+{
+	pm_runtime_set_suspended(dev);
+	dev->power.is_suspended = false;
+	dev->power.is_late_suspended = false;
+	/*
+	 * This tells middle layer code to schedule runtime resume of the device
+	 * from its ->complete callback to update the device's power state in
+	 * case the platform firmware has been involved in resuming the system.
+	 */
+	dev->power.direct_complete = true;
+}
+
 /**
  * __device_suspend_noirq - Execute a "noirq suspend" callback for given device.
  * @dev: Device to handle.
@@ -1135,8 +1167,20 @@ static int __device_suspend_noirq(struct
 	 * the callback invocation for them.
 	 */
 	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
-	    pm_runtime_status_suspended(dev))
-		goto Skip;
+	    pm_runtime_status_suspended(dev)) {
+		/*
+		 * The device may be left suspended during system resume if
+		 * that is preferred by its driver and it will not be used for
+		 * resuming any of its children or consumers.
+		 */
+		if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
+		    !dev->power.must_resume) {
+			dpm_leave_suspended(dev);
+			goto Complete;
+		} else {
+			goto Skip;
+		}
+	}
 
 	if (dev->pm_domain) {
 		info = "noirq power domain ";
@@ -1163,6 +1207,28 @@ static int __device_suspend_noirq(struct
 		goto Complete;
 	}
 
+	/*
+	 * The device may be left suspended during system resume if that is
+	 * preferred by its driver and its wakeup configuration is compatible
+	 * with runtime PM, and it will not be used for resuming any of its
+	 * children or consumers.
+	 */
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
+	    (device_may_wakeup(dev) || !device_can_wakeup(dev)) &&
+	    !dev->power.must_resume) {
+		dpm_leave_suspended(dev);
+		goto Complete;
+	}
+
+	/*
+	 * The parent and suppliers will be necessary to resume the device
+	 * during system resume, so avoid leaving them in runtime suspend.
+	 */
+	if (dev->parent)
+		dev->parent->power.must_resume = true;
+
+	dpm_suppliers_set_must_resume(dev);
+
 Skip:
 	dev->power.is_noirq_suspended = true;
 
@@ -1698,8 +1764,9 @@ static int device_prepare(struct device
 	if (dev->power.syscore)
 		return 0;
 
-	WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
-		!pm_runtime_enabled(dev));
+	WARN_ON(!pm_runtime_enabled(dev) &&
+		dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND |
+					      DPM_FLAG_LEAVE_SUSPENDED));
 
 	/*
 	 * If a device's parent goes into runtime suspend at the wrong time,
@@ -1712,6 +1779,12 @@ static int device_prepare(struct device
 	device_lock(dev);
 
 	dev->power.wakeup_path = device_may_wakeup(dev);
+	/*
+	 * Avoid leaving devices in suspend after transitions that don't really
+	 * suspend them in general.
+	 */
+	dev->power.must_resume = state.event == PM_EVENT_FREEZE ||
+				state.event == PM_EVENT_QUIESCE;
 
 	if (dev->power.no_pm_callbacks) {
 		ret = 1;	/* Let device go direct_complete */
Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -785,6 +785,22 @@ means that they should be put into the f
 
 During system-wide resume from a sleep state it's easiest to put devices into
 the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
-Refer to that document for more information regarding this particular issue as
+[Refer to that document for more information regarding this particular issue as
 well as for information on the device runtime power management framework in
-general.
+general.]
+
+However, it may be desirable to leave some devices in runtime suspend after
+system resume and device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag
+to indicate to the PM core that this is the case.  If that flag is set for a
+device and the wakeup settings of it are compatible with runtime PM (that is,
+either the device is configured to wake up the system from sleep or it cannot
+generate wakeup signals at all), and it will not be used for resuming any of its
+children or consumers, the PM core will skip all of the system resume callbacks
+in the ``resume_noirq``, ``resume_early`` and ``resume`` phases for it and its
+runtime power management status will be set to "suspended".
+
+Still, if the platform firmware is involved in the handling of system resume, it
+may change the state of devices in unpredictable ways, so in that case the
+middle layer code (for example, a bus type or PM domain) the driver works with
+should update the device's power state and schedule runtime resume of it to
+align its power settings with the expectations of the runtime PM framework.


* [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (9 preceding siblings ...)
  2017-10-16  1:30 ` [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag Rafael J. Wysocki
@ 2017-10-16  1:31 ` Rafael J. Wysocki
  2017-10-26 20:41   ` Wolfram Sang
  2017-10-16  1:32 ` [PATCH 12/12] PM / core: Add AVOID_RPM driver flag Rafael J. Wysocki
                   ` (3 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:31 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Optimize the power management in i2c-designware-platdrv by making it
set the DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED flags,
which allows some code to be dropped from its PM callbacks.

First, setting DPM_FLAG_SMART_SUSPEND causes the intel-lpss driver
to avoid resuming i2c-designware-platdrv devices in its ->prepare
callback, so they can stay in runtime suspend after that point even
if the direct-complete feature is not used for them.

It also causes the PM core to avoid invoking "late" and "noirq"
suspend callbacks for these devices if they are in runtime suspend
at the beginning of the "late" phase of device suspend during
system suspend.  That guarantees that dw_i2c_plat_suspend() is
called for a device only if it is not in runtime suspend.
Moreover, it also causes the PM core to set the device's runtime
PM status to "active" after calling dw_i2c_plat_resume() for
it, so the driver doesn't need internal flags to avoid invoking
either dw_i2c_plat_suspend() or dw_i2c_plat_resume() twice in
a row.

Second, setting DPM_FLAG_LEAVE_SUSPENDED enables the optimization
allowing the device to stay suspended after system resume under
suitable conditions, so again the driver doesn't need to take
care of that by itself.

Accordingly, the internal "suspended" and "skip_resume" flags
used by the driver are not necessary any more, so drop them and
simplify the driver's PM callbacks.

Additionally, notice that dw_i2c_plat_complete() only needs
to schedule a runtime resume of the device if platform firmware
has been involved in resuming the system, so make it call
pm_resume_via_firmware() to check that.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 drivers/i2c/busses/i2c-designware-core.h    |    2 --
 drivers/i2c/busses/i2c-designware-platdrv.c |   25 ++++++-------------------
 2 files changed, 6 insertions(+), 21 deletions(-)
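
As a rough illustration of the dw_i2c_plat_complete() change described in
the changelog above, here is a minimal user-space sketch of the check.  It
is not kernel code: struct model_dev and model_complete() are hypothetical
stand-ins, the resumed_via_firmware argument stands in for the return value
of pm_resume_via_firmware(), and the counter merely records where the driver
would call pm_request_resume().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for the device state the callback looks at. */
struct model_dev {
	bool direct_complete;		/* system PM callbacks were skipped */
	unsigned int resume_requests;	/* modeled pm_request_resume() calls */
};

/*
 * Model of the ->complete check after the patch: a runtime resume is
 * requested only if direct-complete was used for the device AND the
 * platform firmware took part in resuming the system.
 */
static void model_complete(struct model_dev *dev, bool resumed_via_firmware)
{
	assert(dev != NULL);

	if (dev->direct_complete && resumed_via_firmware)
		dev->resume_requests++;
}
```

In this model, a device handled via direct-complete is left alone after a
resume in which firmware was not involved (suspend-to-idle, for instance),
which is the case the extra check is meant to cover.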

Index: linux-pm/drivers/i2c/busses/i2c-designware-core.h
===================================================================
--- linux-pm.orig/drivers/i2c/busses/i2c-designware-core.h
+++ linux-pm/drivers/i2c/busses/i2c-designware-core.h
@@ -280,8 +280,6 @@ struct dw_i2c_dev {
 	int			(*acquire_lock)(struct dw_i2c_dev *dev);
 	void			(*release_lock)(struct dw_i2c_dev *dev);
 	bool			pm_disabled;
-	bool			suspended;
-	bool			skip_resume;
 	void			(*disable)(struct dw_i2c_dev *dev);
 	void			(*disable_int)(struct dw_i2c_dev *dev);
 	int			(*init)(struct dw_i2c_dev *dev);
Index: linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
===================================================================
--- linux-pm.orig/drivers/i2c/busses/i2c-designware-platdrv.c
+++ linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
@@ -42,6 +42,7 @@
 #include <linux/reset.h>
 #include <linux/sched.h>
 #include <linux/slab.h>
+#include <linux/suspend.h>
 
 #include "i2c-designware-core.h"
 
@@ -370,7 +371,10 @@ static int dw_i2c_plat_probe(struct plat
 	ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev));
 	adap->dev.of_node = pdev->dev.of_node;
 
-	dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE);
+	dev_pm_set_driver_flags(&pdev->dev,
+				DPM_FLAG_SMART_PREPARE |
+				DPM_FLAG_SMART_SUSPEND |
+				DPM_FLAG_LEAVE_SUSPENDED);
 
 	/* The code below assumes runtime PM to be disabled. */
 	WARN_ON(pm_runtime_enabled(&pdev->dev));
@@ -446,7 +450,7 @@ static int dw_i2c_plat_prepare(struct de
 
 static void dw_i2c_plat_complete(struct device *dev)
 {
-	if (dev->power.direct_complete)
+	if (dev->power.direct_complete && pm_resume_via_firmware())
 		pm_request_resume(dev);
 }
 #else
@@ -459,16 +463,9 @@ static int dw_i2c_plat_suspend(struct de
 {
 	struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
 
-	if (i_dev->suspended) {
-		i_dev->skip_resume = true;
-		return 0;
-	}
-
 	i_dev->disable(i_dev);
 	i2c_dw_plat_prepare_clk(i_dev, false);
 
-	i_dev->suspended = true;
-
 	return 0;
 }
 
@@ -476,19 +473,9 @@ static int dw_i2c_plat_resume(struct dev
 {
 	struct dw_i2c_dev *i_dev = dev_get_drvdata(dev);
 
-	if (!i_dev->suspended)
-		return 0;
-
-	if (i_dev->skip_resume) {
-		i_dev->skip_resume = false;
-		return 0;
-	}
-
 	i2c_dw_plat_prepare_clk(i_dev, true);
 	i_dev->init(i_dev);
 
-	i_dev->suspended = false;
-
 	return 0;
 }
 



* [PATCH 12/12] PM / core: Add AVOID_RPM driver flag
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (10 preceding siblings ...)
  2017-10-16  1:31 ` [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management Rafael J. Wysocki
@ 2017-10-16  1:32 ` Rafael J. Wysocki
  2017-10-17 15:33   ` Andy Shevchenko
  2017-10-16  7:08 ` [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Greg Kroah-Hartman
                   ` (2 subsequent siblings)
  14 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16  1:32 UTC (permalink / raw)
  To: Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Define and document a new driver flag, DPM_FLAG_AVOID_RPM, to inform
the PM core and middle layer code that the driver has something
significant to do in its ->suspend and/or ->resume callbacks and
runtime PM should be disabled for the device when these callbacks
run.

Setting DPM_FLAG_AVOID_RPM (in addition to DPM_FLAG_SMART_SUSPEND)
causes runtime PM to be disabled for the device before invoking the
driver's ->suspend callback for it and to be enabled again for it
only after the driver's ->resume callback has returned.  In addition
to that, if the device is in runtime suspend right after disabling
runtime PM for it (which means that there was no reason to resume it
from runtime suspend beforehand), the invocation of the ->suspend
callback will be skipped for it and it will be left in runtime
suspend until the "noirq" phase of the subsequent system resume.

If DPM_FLAG_SMART_SUSPEND is not set, DPM_FLAG_AVOID_RPM has no
effect.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---
 Documentation/driver-api/pm/devices.rst |   14 ++++++
 Documentation/power/pci.txt             |    9 +++-
 drivers/acpi/device_pm.c                |   24 ++++++++++-
 drivers/base/power/main.c               |   31 ++++++++++++++
 drivers/pci/pci-driver.c                |   69 ++++++++++++++++++++++----------
 include/linux/pm.h                      |   10 ++++
 6 files changed, 134 insertions(+), 23 deletions(-)
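
The decision described in the changelog above can be sketched in user space
as follows.  This is only a simplified model under stated assumptions:
struct model_dev and the model_*() helpers are hypothetical, the flag values
are copied from the pm.h hunk below, and the wakeup-related
pm_runtime_resume() call made by the real code before disabling runtime PM
is left out.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in the patch: BIT(2) and BIT(4). */
#define DPM_FLAG_SMART_SUSPEND	(1U << 2)
#define DPM_FLAG_AVOID_RPM	(1U << 4)

/* Hypothetical stand-in for the few pieces of device state involved. */
struct model_dev {
	unsigned int driver_flags;
	bool rpm_enabled;	/* runtime PM currently enabled */
	bool rpm_suspended;	/* runtime PM status is "suspended" */
	bool rpm_reenable;	/* re-enable runtime PM after ->resume */
};

static bool model_test_flags(const struct model_dev *dev, unsigned int flags)
{
	return (dev->driver_flags & flags) != 0;
}

/* Returns true if the driver's ->suspend callback should be skipped. */
static bool model_suspend_should_skip(struct model_dev *dev)
{
	assert(dev != NULL);

	/* AVOID_RPM has no effect unless SMART_SUSPEND is set too. */
	if (!model_test_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
	    !model_test_flags(dev, DPM_FLAG_AVOID_RPM))
		return false;

	/* Models dpm_disable_runtime_pm_early(). */
	dev->rpm_enabled = false;
	dev->rpm_reenable = true;

	/* Already suspended: nothing left for ->suspend to do. */
	return dev->rpm_suspended;
}

/* Models the re-enabling step added at the end of device_resume(). */
static void model_resume_done(struct model_dev *dev)
{
	if (dev->rpm_reenable) {
		dev->rpm_enabled = true;
		dev->rpm_reenable = false;
	}
}
```

With both flags set and the device in runtime suspend, the model skips
->suspend and keeps runtime PM disabled until the resume side is done.  With
AVOID_RPM alone, nothing changes.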

Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -560,6 +560,7 @@ struct pm_subsys_data {
  * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
  * SMART_SUSPEND: No need to resume the device from runtime suspend.
  * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible.
+ * AVOID_RPM: Disable runtime PM and check its status before ->suspend.
  *
  * Setting SMART_PREPARE instructs bus types and PM domains which may want
  * system suspend/resume callbacks to be skipped for the device to return 0 from
@@ -577,11 +578,17 @@ struct pm_subsys_data {
  *
  * Setting LEAVE_SUSPENDED informs the PM core and middle layer code that the
  * driver prefers the device to be left in runtime suspend after system resume.
+ *
+ * Setting AVOID_RPM informs the PM core and middle layer code that the driver
+ * has something significant to do in its ->suspend and/or ->resume callbacks
+ * and runtime PM should be disabled for the device when these callbacks run.
+ * If SMART_SUSPEND is not set, this flag has no effect.
  */
 #define DPM_FLAG_NEVER_SKIP		BIT(0)
 #define DPM_FLAG_SMART_PREPARE		BIT(1)
 #define DPM_FLAG_SMART_SUSPEND		BIT(2)
 #define DPM_FLAG_LEAVE_SUSPENDED	BIT(3)
+#define DPM_FLAG_AVOID_RPM		BIT(4)
 
 struct dev_pm_info {
 	pm_message_t		power_state;
@@ -604,6 +611,7 @@ struct dev_pm_info {
 	bool			syscore:1;
 	bool			no_pm_callbacks:1;	/* Owned by the PM core */
 	unsigned int		must_resume:1;	/* Owned by the PM core */
+	unsigned int		rpm_reenable:1;	/* Do not modify directly */
 #else
 	unsigned int		should_wakeup:1;
 #endif
@@ -741,6 +749,8 @@ extern int dpm_suspend_late(pm_message_t
 extern int dpm_suspend(pm_message_t state);
 extern int dpm_prepare(pm_message_t state);
 
+extern void dpm_disable_runtime_pm_early(struct device *dev);
+
 extern void __suspend_report_result(const char *function, void *fn, int ret);
 
 #define suspend_report_result(fn, ret)					\
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -906,6 +906,10 @@ static int device_resume(struct device *
  Unlock:
 	device_unlock(dev);
 	dpm_watchdog_clear(&wd);
+	if (dev->power.rpm_reenable) {
+		pm_runtime_enable(dev);
+		dev->power.rpm_reenable = false;
+	}
 
  Complete:
 	complete_all(&dev->power.completion);
@@ -1534,6 +1538,12 @@ static int legacy_suspend(struct device
 	return error;
 }
 
+void dpm_disable_runtime_pm_early(struct device *dev)
+{
+	pm_runtime_disable(dev);
+	dev->power.rpm_reenable = true;
+}
+
 static void dpm_clear_suppliers_direct_complete(struct device *dev)
 {
 	struct device_link *link;
@@ -1636,6 +1646,27 @@ static int __device_suspend(struct devic
 	if (!callback && dev->driver && dev->driver->pm) {
 		info = "driver ";
 		callback = pm_op(dev->driver->pm, state);
+		if (callback &&
+		    dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+		    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
+			/*
+			 * Device wakeup is enabled for runtime PM, so if the
+			 * device is not expected to wake up the system from
+			 * sleep, resume it now so that it can be reconfigured.
+			 */
+			if (device_can_wakeup(dev) && !device_may_wakeup(dev))
+				pm_runtime_resume(dev);
+
+			dpm_disable_runtime_pm_early(dev);
+			/*
+			 * If the device is already suspended now, it won't be
+			 * resumed until the subsequent system resume starts and
+			 * there is no need to suspend it again, so simply skip
+			 * the callback for it.
+			 */
+			if (pm_runtime_status_suspended(dev))
+				goto End;
+		}
 	}
 
 	error = dpm_run_callback(callback, dev, state, info);
Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -708,6 +708,39 @@ static void pci_pm_complete(struct devic
 	}
 }
 
+static bool pci_pm_check_suspend(struct device *dev)
+{
+	/*
+	 * PCI devices suspended at run time may need to be resumed at this
+	 * point, because in general it may be necessary to reconfigure them for
+	 * system suspend.  Namely, if the device is expected to wake up the
+	 * system from the sleep state, it may have to be reconfigured for this
+	 * purpose, or if the device is not expected to wake up the system from
+	 * the sleep state, it should be prevented from signaling wakeup events
+	 * going forward.
+	 *
+	 * Also if the driver of the device does not indicate that its system
+	 * suspend callbacks can cope with runtime-suspended devices, it is
+	 * better to resume the device from runtime suspend here.
+	 */
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
+	    !pci_dev_keep_suspended(to_pci_dev(dev)))
+		pm_runtime_resume(dev);
+
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+	    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
+		dpm_disable_runtime_pm_early(dev);
+		/*
+		 * If the device is in runtime suspend now, it won't be resumed
+		 * until the subsequent system resume starts and there is no
+		 * need to suspend it again, so let the callers know about that.
+		 */
+		if (pm_runtime_status_suspended(dev))
+			return true;
+	}
+	return false;
+}
+
 #else /* !CONFIG_PM_SLEEP */
 
 #define pci_pm_prepare	NULL
@@ -730,22 +763,8 @@ static int pci_pm_suspend(struct device
 		return 0;
 	}
 
-	/*
-	 * PCI devices suspended at run time may need to be resumed at this
-	 * point, because in general it may be necessary to reconfigure them for
-	 * system suspend.  Namely, if the device is expected to wake up the
-	 * system from the sleep state, it may have to be reconfigured for this
-	 * purpose, or if the device is not expected to wake up the system from
-	 * the sleep state, it should be prevented from signaling wakeup events
-	 * going forward.
-	 *
-	 * Also if the driver of the device does not indicate that its system
-	 * suspend callbacks can cope with runtime-suspended devices, it is
-	 * better to resume the device from runtime suspend here.
-	 */
-	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
-	    !pci_dev_keep_suspended(pci_dev))
-		pm_runtime_resume(dev);
+	if (pci_pm_check_suspend(dev))
+		return 0;
 
 	pci_dev->state_saved = false;
 	if (pm->suspend) {
@@ -918,8 +937,18 @@ static int pci_pm_freeze(struct device *
 	 * devices should not be touched during freeze/thaw transitions,
 	 * however.
 	 */
-	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND)) {
 		pm_runtime_resume(dev);
+	} else if (dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
+		dpm_disable_runtime_pm_early(dev);
+		/*
+		 * If the device is in runtime suspend now, it won't be resumed
+		 * until the subsequent system resume starts and there is no
+		 * need to suspend it again, so simply skip the callback for it.
+		 */
+		if (pm_runtime_status_suspended(dev))
+			return 0;
+	}
 
 	pci_dev->state_saved = false;
 	if (pm->freeze) {
@@ -1020,10 +1049,8 @@ static int pci_pm_poweroff(struct device
 		return 0;
 	}
 
-	/* The reason to do that is the same as in pci_pm_suspend(). */
-	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) ||
-	    !pci_dev_keep_suspended(pci_dev))
-		pm_runtime_resume(dev);
+	if (pci_pm_check_suspend(dev))
+		return 0;
 
 	pci_dev->state_saved = false;
 	if (pm->poweroff) {
Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -1005,6 +1005,18 @@ int acpi_subsys_suspend(struct device *d
 	    acpi_dev_needs_resume(dev, ACPI_COMPANION(dev)))
 		pm_runtime_resume(dev);
 
+	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
+	    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
+		dpm_disable_runtime_pm_early(dev);
+		/*
+		 * If the device is in runtime suspend now, it won't be resumed
+		 * until the subsequent system resume starts and there is no
+		 * need to suspend it again, so let the callers know about that.
+		 */
+		if (pm_runtime_status_suspended(dev))
+			return 0;
+	}
+
 	return pm_generic_suspend(dev);
 }
 EXPORT_SYMBOL_GPL(acpi_subsys_suspend);
@@ -1050,8 +1062,18 @@ int acpi_subsys_freeze(struct device *de
 	 * runtime-suspended devices should not be touched during freeze/thaw
 	 * transitions.
 	 */
-	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
+	if (!dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND)) {
 		pm_runtime_resume(dev);
+	} else if (dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
+		dpm_disable_runtime_pm_early(dev);
+		/*
+		 * If the device is in runtime suspend now, it won't be resumed
+		 * until the subsequent system resume starts and there is no
+		 * need to suspend it again, so let the callers know about that.
+		 */
+		if (pm_runtime_status_suspended(dev))
+			return 0;
+	}
 
 	return pm_generic_freeze(dev);
 }
Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -783,6 +783,20 @@ phase of device resume (right prior to e
 to prevent runtime PM from acting on them before the ``complete`` phase, which
 means that they should be put into the full-power state before that phase.
 
+The handling of ``DPM_FLAG_SMART_SUSPEND`` can be extended by setting another
+power management driver flag, ``DPM_FLAG_AVOID_RPM`` (it has no effect without
+``DPM_FLAG_SMART_SUSPEND`` set).  Setting it informs the PM core and middle
+layer code that the driver's ``->suspend`` and/or ``->resume`` callbacks are
+not trivial and need to be run with runtime PM disabled.  Consequently,
+runtime PM is disabled before running the ``->suspend`` callback for devices
+with both ``DPM_FLAG_SMART_SUSPEND`` and ``DPM_FLAG_AVOID_RPM`` set and it is
+enabled again only after the driver's ``->resume`` callback has returned.  In
+addition to that, if the device is in runtime suspend right after disabling
+runtime PM for it (which means that there was no reason to resume it from
+runtime suspend beforehand), the invocation of the ``->suspend`` callback will
+be skipped for it and it will be left in runtime suspend until the ongoing
+system-wide power transition is over.
+
 During system-wide resume from a sleep state it's easiest to put devices into
 the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
 [Refer to that document for more information regarding this particular issue as
Index: linux-pm/Documentation/power/pci.txt
===================================================================
--- linux-pm.orig/Documentation/power/pci.txt
+++ linux-pm/Documentation/power/pci.txt
@@ -984,7 +984,14 @@ The DPM_FLAG_SMART_SUSPEND flag tells th
 perspective the device can be safely left in runtime suspend during system
 suspend.  That causes pci_pm_suspend(), pci_pm_freeze() and pci_pm_poweroff()
 to skip resuming the device from runtime suspend unless there are PCI-specific
-reasons for doing that.
+reasons for doing that.  In addition to that, drivers can use the
+DPM_FLAG_AVOID_RPM flag to inform the PCI bus type that its .suspend() and
+.resume() callbacks need to be run with runtime PM disabled (this flag has no
+effect without DPM_FLAG_SMART_SUSPEND set).  Then, if the device is in runtime
+suspend after runtime PM has been disabled for it, which means that there was
+no reason to resume it from runtime suspend beforehand, it won't be resumed
+until the ongoing system transition is over, so the execution of system suspend
+callbacks for it during that transition will be skipped.
 
 3.2. Device Runtime Power Management
 ------------------------------------



* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
@ 2017-10-16  5:34   ` Lukas Wunner
  2017-10-16 22:03     ` Rafael J. Wysocki
  2017-10-16  6:28   ` Greg Kroah-Hartman
                     ` (3 subsequent siblings)
  4 siblings, 1 reply; 79+ messages in thread
From: Lukas Wunner @ 2017-10-16  5:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> +	:c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
> +	tese flags is set, the PM core will not apply the direct-complete
        ^
	these

> +	proceudre described above to the given device and, consequenty, to any
        ^
        procedure

Lukas


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
  2017-10-16  5:34   ` Lukas Wunner
@ 2017-10-16  6:28   ` Greg Kroah-Hartman
  2017-10-16 22:05     ` Rafael J. Wysocki
  2017-10-16  6:31   ` Greg Kroah-Hartman
                     ` (2 subsequent siblings)
  4 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-16  6:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
>  struct dev_pm_info {
>  	pm_message_t		power_state;
>  	unsigned int		can_wakeup:1;
> @@ -561,6 +580,7 @@ struct dev_pm_info {
>  	bool			is_late_suspended:1;
>  	bool			early_init:1;	/* Owned by the PM core */
>  	bool			direct_complete:1;	/* Owned by the PM core */
> +	unsigned int		driver_flags;

Minor nit, u32 or u64?

thanks,

greg k-h


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
  2017-10-16  5:34   ` Lukas Wunner
  2017-10-16  6:28   ` Greg Kroah-Hartman
@ 2017-10-16  6:31   ` Greg Kroah-Hartman
  2017-10-16 22:07     ` Rafael J. Wysocki
  2017-10-16 20:16   ` Alan Stern
  2017-10-18 23:17   ` [Update][PATCH v2 " Rafael J. Wysocki
  4 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-16  6:31 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> +static inline void dev_pm_set_driver_flags(struct device *dev, unsigned int flags)
> +{
> +	dev->power.driver_flags = flags;
> +}

Should this function just set the specific bit?  Or is it going to be ok
to set the whole value, meaning you aren't going to care about turning
on and off specific flags over the lifetime of the driver/device, you
are just going to set them once and then just test them as needed?
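
To make the question concrete, here is a small user-space model of the
helper as quoted above.  It is only a sketch: struct model_dev and the
model_*() names are hypothetical, and the two flag values are the BIT(0) and
BIT(1) definitions from the series.  Because the helper assigns the whole
field, a second call replaces whatever was set before instead of adding to
it.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Flag values as defined in the series: BIT(0) and BIT(1). */
#define DPM_FLAG_NEVER_SKIP	(1U << 0)
#define DPM_FLAG_SMART_PREPARE	(1U << 1)

/* Hypothetical stand-in for dev->power.driver_flags. */
struct model_dev {
	unsigned int driver_flags;
};

/* Same shape as the quoted helper: plain assignment, not OR-ing. */
static void model_set_driver_flags(struct model_dev *dev, unsigned int flags)
{
	assert(dev != NULL);
	dev->driver_flags = flags;
}

/* True if any of the given flags is set. */
static bool model_test_driver_flags(const struct model_dev *dev,
				    unsigned int flags)
{
	return (dev->driver_flags & flags) != 0;
}
```

So a driver that wants several flags has to pass them OR-ed together in a
single call.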

thanks,

greg k-h


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (11 preceding siblings ...)
  2017-10-16  1:32 ` [PATCH 12/12] PM / core: Add AVOID_RPM driver flag Rafael J. Wysocki
@ 2017-10-16  7:08 ` Greg Kroah-Hartman
  2017-10-16 21:50   ` Rafael J. Wysocki
  2017-10-17  8:36 ` Ulf Hansson
  2017-10-20 20:46 ` Bjorn Helgaas
  14 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-16  7:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
> Hi All,
> 
> Well, this took more time than expected, as I tried to cover everything I had
> in mind regarding PM flags for drivers.
> 
> This work was triggered by attempts to fix and optimize PM in the
> i2c-designware-platdev driver that ended up with adding a couple of
> flags to the driver's internal data structures for the tracking of
> device state (https://marc.info/?l=linux-acpi&m=150629646805636&w=2).
> That approach is sort of suboptimal, though, because other drivers will
> probably want to do similar things and if all of them need to use internal
> flags for that, quite a bit of code duplication may ensue at least.
> 
> That can be avoided in a couple of ways and one of them is to provide a means
> for drivers to tell the core what to do and to make the core take care of it
> if told to do so.  Hence, the idea to use driver flags for system-wide PM
> that was briefly discussed during the LPC in LA last month.
> 
> One of the flags considered at that time was to possibly cause the core
> to reuse the runtime PM callback path of a device for system suspend/resume.
> Admittedly, that idea didn't look too bad to me until I had started to try to
> implement it and I got to the PCI bus type's hibernation callbacks.  Then, I
> moved the patch I was working on to /dev/null right away.  I mean it.
> 
> No, this is not going to happen.  No way.
> 
> Moreover, that experience made me realize that the whole *idea* of using the
> runtime PM callback path for system-wide PM was actually totally bogus (sorry
> Ulf).
> 
> The whole point of having different callbacks pointers for different types of
> device transitions is because it may be necessary to do different things in
> those callbacks in general.  Now, if you consider runtime PM and system
> suspend/resume *only* and from a driver perspective, then yes, in some cases
> the same pair of callback routines may be used for all suspend-like and
> resume-like transitions of the device, but if you add hibernation to the mix,
> then it is not so clear any more unless the callbacks don't actually do any
> power management at all, but simply quiesce the device's activity and then
> activate it again.  Namely, changing power states of devices during the
> hibernation's "freeze" and "thaw" transitions rarely makes sense at all and
> the "restore" transition needs to be able to cope with uninitialized devices
> (in fact, it should be prepared to cope with devices in *any* state), so
> runtime PM is hardly suitable for them.  Still, if a *driver* choses to not
> do any real PM in its PM callbacks and leaves that to a middle layer (quite
> a few drivers do that), then it possibly can use one pair of callbacks in all
> cases and be happy, but middle layers pretty much have to use different
> callback routines for different transitions.
> 
> If you are a middle layer, your role is basically to do PM for a certain
> group of devices.  Thus you cannot really do the same in ->suspend or
> ->suspend_early and in ->runtime_suspend (because the former generally need to
> take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> really do the same in ->suspend and ->freeze (becuase the latter shouldn't
> change the device's power state) and so on.  To put it bluntly, trying
> to use the ->runtime_suspend callback of a middle layer for anything other
> than runtime suspend is complete and utter nonsense.  At the same time, the
> ->runtime_resume callback of a middle layer may be reused to some extent,
> but even that doesn't cover the "thaw" transitions during hibernation.
> 
> What can work (and this is the only strategy that can work AFAICS) is to
> point different callback pointers *in* *a* *driver* to the same routine
> if the driver wants to reuse that code.  That actually will work for PCI
> and USB drivers today, at least most of the time, but unfortunately there
> are problems with it for, say, platform devices.
> 
> The first problem is the requirement to track the status of the device
> (suspended vs not suspended) in the callbacks, because the system-wide PM
> code in the PM core doesn't do that.  The runtime PM framework does it, so
> this means adding some extra code which isn't necessary for runtime PM to
> the callback routines and that is not particularly nice.
> 
> The second problem is that, if the driver wants to do anything in its
> ->suspend callback, it generally has to prevent runtime suspend of the
> device from taking place in parallel with that, which is quite cumbersome.
> Usually, that is taken care of by resuming the device from runtime suspend
> upfront, but generally doing that is wasteful (there may be no real need to
> resume the device except for the fact that the code is designed this way).
> 
> On top of the above, there are optimizations to be made, like leaving certain
> devices in suspend after system resume to avoid wasting time on waiting for
> them to resume before user space can run again and similar.
> 
> This patch series focuses on addressing those problems so as to make it
> easier to reuse callback routines by pointing different callback pointers
> to them in device drivers.  The flags introduced here are to instruct the
> PM core and middle layers (whatever they are) on how the driver wants the
> device to be handled and then the driver has to provide callbacks to match
> these instructions and the rest should be taken care of by the code above it.
> 
> The flags are introduced one by one to avoid making too many changes in
> one go and to allow things to be explained better (hopefully).  They mostly
> are mutually independent with some clearly documented exceptions.
> 
> The first three patches in the series are about an issue with the
> direct-complete optimization introduced some time ago in which some middle
> layers decide on whether or not to do the optimization without asking the
> drivers.  And, as it turns out, in some cases the drivers actually know
> better, so the new flags introduced by these patches are here for these
> drivers (and the DPM_FLAG_NEVER_SKIP one is really to avoid having to define
> ->prepare callbacks always returning zero).
> 
> The really interesting things start to happen in patches [4-9/12] which make it
> possible to avoid resuming devices from runtime suspend upfront during system
> suspend at least in some cases (and when direct-complete is not applied to the
> devices in question), but please refer to the changelogs for details.
> 
> The i2d-designware-platdev driver is used as the primary example in the series
> and the patches modifying it are based on some previous changes currently in
> linux-next AFAICS (the same applies to the intel-lpss driver), but these
> patches can wait until everything is properly merged.  They are included here
> mostly as illustration.
> 
> Overall, the series is based on the linux-next branch of the linux-pm.git tree
> with some extra patches on top of it and all of the names of new entities
> introduced in it are negotiable.

Thanks for the great explanation, I was wondering how your proposal
discussed at Plumbers was going to work out in the end :)

The patch series looks good to me (minor questions already sent on the
patches), but what does this mean for drivers?  Do they now have to do a
lot of work to take advantage of this, like you did for the
i2c-designware-platdev driver?  Or will things continue to work as-is
and it's only an opt-in type thing where the bus/driver wants to take
advantage of it?

thanks,

greg k-h


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
                     ` (2 preceding siblings ...)
  2017-10-16  6:31   ` Greg Kroah-Hartman
@ 2017-10-16 20:16   ` Alan Stern
  2017-10-16 22:11     ` Rafael J. Wysocki
  2017-10-18 23:17   ` [Update][PATCH v2 " Rafael J. Wysocki
  4 siblings, 1 reply; 79+ messages in thread
From: Alan Stern @ 2017-10-16 20:16 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The motivation for this change is to provide a way to work around
> a problem with the direct-complete mechanism used for avoiding
> system suspend/resume handling for devices in runtime suspend.
> 
> The problem is that some middle layer code (the PCI bus type and
> the ACPI PM domain in particular) returns positive values from its
> system suspend ->prepare callbacks regardless of whether the driver's
> ->prepare returns a positive value or 0, which effectively prevents
> drivers from being able to control the direct-complete feature.
> Some drivers need that control, however, and the PCI bus type has
> grown its own flag to deal with this issue, but since it is not
> limited to PCI, it is better to address it by adding driver flags at
> the core level.

I'm curious: Why does the PCI bus type (and others) do this?  Why 
doesn't it do what the driver says to do?

Alan Stern


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-16  7:08 ` [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Greg Kroah-Hartman
@ 2017-10-16 21:50   ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16 21:50 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Rafael J. Wysocki, Linux PM, Bjorn Helgaas, Alan Stern, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 9:08 AM, Greg Kroah-Hartman
<gregkh@linuxfoundation.org> wrote:
> On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
>> Hi All,
>>
>> Well, this took more time than expected, as I tried to cover everything I had
>> in mind regarding PM flags for drivers.
>>
>> This work was triggered by attempts to fix and optimize PM in the
>> i2c-designware-platdev driver that ended up with adding a couple of
>> flags to the driver's internal data structures for the tracking of
>> device state (https://marc.info/?l=linux-acpi&m=150629646805636&w=2).
>> That approach is sort of suboptimal, though, because other drivers will
>> probably want to do similar things and if all of them need to use internal
>> flags for that, quite a bit of code duplication may ensue at least.
>>
>> That can be avoided in a couple of ways and one of them is to provide a means
>> for drivers to tell the core what to do and to make the core take care of it
>> if told to do so.  Hence, the idea to use driver flags for system-wide PM
>> that was briefly discussed during the LPC in LA last month.
>>
>> One of the flags considered at that time was to possibly cause the core
>> to reuse the runtime PM callback path of a device for system suspend/resume.
>> Admittedly, that idea didn't look too bad to me until I had started to try to
>> implement it and I got to the PCI bus type's hibernation callbacks.  Then, I
>> moved the patch I was working on to /dev/null right away.  I mean it.
>>
>> No, this is not going to happen.  No way.
>>
>> Moreover, that experience made me realize that the whole *idea* of using the
>> runtime PM callback path for system-wide PM was actually totally bogus (sorry
>> Ulf).
>>
>> The whole point of having different callback pointers for different types of
>> device transitions is because it may be necessary to do different things in
>> those callbacks in general.  Now, if you consider runtime PM and system
>> suspend/resume *only* and from a driver perspective, then yes, in some cases
>> the same pair of callback routines may be used for all suspend-like and
>> resume-like transitions of the device, but if you add hibernation to the mix,
>> then it is not so clear any more unless the callbacks don't actually do any
>> power management at all, but simply quiesce the device's activity and then
>> activate it again.  Namely, changing power states of devices during the
>> hibernation's "freeze" and "thaw" transitions rarely makes sense at all and
>> the "restore" transition needs to be able to cope with uninitialized devices
>> (in fact, it should be prepared to cope with devices in *any* state), so
>> runtime PM is hardly suitable for them.  Still, if a *driver* chooses to not
>> do any real PM in its PM callbacks and leaves that to a middle layer (quite
>> a few drivers do that), then it possibly can use one pair of callbacks in all
>> cases and be happy, but middle layers pretty much have to use different
>> callback routines for different transitions.
>>
>> If you are a middle layer, your role is basically to do PM for a certain
>> group of devices.  Thus you cannot really do the same in ->suspend or
>> ->suspend_early and in ->runtime_suspend (because the former generally need to
>> take device_may_wakeup() into account and the latter doesn't) and you shouldn't
>> really do the same in ->suspend and ->freeze (because the latter shouldn't
>> change the device's power state) and so on.  To put it bluntly, trying
>> to use the ->runtime_suspend callback of a middle layer for anything other
>> than runtime suspend is complete and utter nonsense.  At the same time, the
>> ->runtime_resume callback of a middle layer may be reused to some extent,
>> but even that doesn't cover the "thaw" transitions during hibernation.
>>
>> What can work (and this is the only strategy that can work AFAICS) is to
>> point different callback pointers *in* *a* *driver* to the same routine
>> if the driver wants to reuse that code.  That actually will work for PCI
>> and USB drivers today, at least most of the time, but unfortunately there
>> are problems with it for, say, platform devices.
>>
>> The first problem is the requirement to track the status of the device
>> (suspended vs not suspended) in the callbacks, because the system-wide PM
>> code in the PM core doesn't do that.  The runtime PM framework does it, so
>> this means adding some extra code which isn't necessary for runtime PM to
>> the callback routines and that is not particularly nice.
>>
>> The second problem is that, if the driver wants to do anything in its
>> ->suspend callback, it generally has to prevent runtime suspend of the
>> device from taking place in parallel with that, which is quite cumbersome.
>> Usually, that is taken care of by resuming the device from runtime suspend
>> upfront, but generally doing that is wasteful (there may be no real need to
>> resume the device except for the fact that the code is designed this way).
>>
>> On top of the above, there are optimizations to be made, like leaving certain
>> devices in suspend after system resume to avoid wasting time on waiting for
>> them to resume before user space can run again and similar.
>>
>> This patch series focuses on addressing those problems so as to make it
>> easier to reuse callback routines by pointing different callback pointers
>> to them in device drivers.  The flags introduced here are to instruct the
>> PM core and middle layers (whatever they are) on how the driver wants the
>> device to be handled and then the driver has to provide callbacks to match
>> these instructions and the rest should be taken care of by the code above it.
>>
>> The flags are introduced one by one to avoid making too many changes in
>> one go and to allow things to be explained better (hopefully).  They mostly
>> are mutually independent with some clearly documented exceptions.
>>
>> The first three patches in the series are about an issue with the
>> direct-complete optimization introduced some time ago in which some middle
>> layers decide on whether or not to do the optimization without asking the
>> drivers.  And, as it turns out, in some cases the drivers actually know
>> better, so the new flags introduced by these patches are here for these
>> drivers (and the DPM_FLAG_NEVER_SKIP one is really to avoid having to define
>> ->prepare callbacks always returning zero).
>>
>> The really interesting things start to happen in patches [4-9/12] which make it
>> possible to avoid resuming devices from runtime suspend upfront during system
>> suspend at least in some cases (and when direct-complete is not applied to the
>> devices in question), but please refer to the changelogs for details.
>>
>> The i2c-designware-platdev driver is used as the primary example in the series
>> and the patches modifying it are based on some previous changes currently in
>> linux-next AFAICS (the same applies to the intel-lpss driver), but these
>> patches can wait until everything is properly merged.  They are included here
>> mostly as illustration.
>>
>> Overall, the series is based on the linux-next branch of the linux-pm.git tree
>> with some extra patches on top of it and all of the names of new entities
>> introduced in it are negotiable.
>
> Thanks for the great explanation, I was wondering how your proposal
> discussed at Plumbers was going to work out in the end :)
>
> The patch series looks good to me (minor questions already sent on the
> patches),

Cool. :-)

> but what does this mean for drivers?  Do they now have to do a
> lot of work to take advantage of this, like you did for the
> i2c-designware-platdev driver?  Or will things continue to work as-is
> and it's only an opt-in type thing where the bus/driver wants to take
> advantage of it?

It's envisioned as an opt-in thing mostly, except for the flags
introduced by patch [01/12] that may be needed to address existing
issues.

It is not strictly necessary to set any of the other flags, but I
guess some use cases may benefit quite a bit from setting them. :-)

Thanks,
Rafael


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  5:34   ` Lukas Wunner
@ 2017-10-16 22:03     ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16 22:03 UTC (permalink / raw)
  To: Lukas Wunner
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Monday, October 16, 2017 7:34:52 AM CEST Lukas Wunner wrote:
> On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > +	:c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
> > +	tese flags is set, the PM core will not apply the direct-complete
>         ^
> 	these
> 
> > +	proceudre described above to the given device and, consequenty, to any
>         ^
>         procedure
> 

Thanks!


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  6:28   ` Greg Kroah-Hartman
@ 2017-10-16 22:05     ` Rafael J. Wysocki
  2017-10-17  7:15       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16 22:05 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Monday, October 16, 2017 8:28:52 AM CEST Greg Kroah-Hartman wrote:
> On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> >  struct dev_pm_info {
> >  	pm_message_t		power_state;
> >  	unsigned int		can_wakeup:1;
> > @@ -561,6 +580,7 @@ struct dev_pm_info {
> >  	bool			is_late_suspended:1;
> >  	bool			early_init:1;	/* Owned by the PM core */
> >  	bool			direct_complete:1;	/* Owned by the PM core */
> > +	unsigned int		driver_flags;
> 
> Minor nit, u32 or u64?

u32 I think, will update.

BTW, there's a mess in this struct overall and I'd like all of the bit fields
to be the same type (and that shouldn't be bool IMO :-)).

Do you prefer u32 or unsigned int?

Thanks,
Rafael



* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  6:31   ` Greg Kroah-Hartman
@ 2017-10-16 22:07     ` Rafael J. Wysocki
  2017-10-17 13:26       ` Greg Kroah-Hartman
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16 22:07 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Monday, October 16, 2017 8:31:22 AM CEST Greg Kroah-Hartman wrote:
> On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > +static inline void dev_pm_set_driver_flags(struct device *dev, unsigned int flags)
> > +{
> > +	dev->power.driver_flags = flags;
> > +}
> 
> Should this function just set the specific bit?  Or is it going to be ok
> to set the whole value, meaning you aren't going to care about turning
> on and off specific flags over the lifetime of the driver/device, you
> are just going to set them once and then just test them as needed?

The idea is to set them once and they should not be touched again until
the driver (or device) goes away, so that would be the whole value at once
(and one of the i2c-designware-platdrv patches actually sets multiple flags
in one go).
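
[A minimal userspace sketch of the set-once pattern described above.  The
body of dev_pm_set_driver_flags() and the two flag names are taken from the
patch; the flag bit values, the mock struct device, the
dev_pm_test_driver_flags() helper and example_probe() are assumptions made
only for this illustration.]

```c
#include <stdbool.h>

/* Placeholder bit values; the real definitions are in the patch itself. */
#define DPM_FLAG_NEVER_SKIP	(1u << 0)
#define DPM_FLAG_SMART_PREPARE	(1u << 1)

/* Mock of the relevant slice of struct dev_pm_info / struct device. */
struct dev_pm_info {
	unsigned int driver_flags;
};

struct device {
	struct dev_pm_info power;
};

/* Same body as the helper quoted above: it overwrites the whole value. */
static inline void dev_pm_set_driver_flags(struct device *dev, unsigned int flags)
{
	dev->power.driver_flags = flags;
}

/* Assumed test helper: true if any of the given flags is set. */
static inline bool dev_pm_test_driver_flags(struct device *dev, unsigned int flags)
{
	return (dev->power.driver_flags & flags) != 0;
}

/* Hypothetical probe routine: all flags are passed in one call, once. */
static int example_probe(struct device *dev)
{
	dev_pm_set_driver_flags(dev, DPM_FLAG_SMART_PREPARE | DPM_FLAG_NEVER_SKIP);
	return 0;
}
```

[Because the helper assigns instead of OR-ing, a later call would replace
the earlier flags rather than add to them, which fits the "set them once"
model.]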

Thanks,
Rafael



* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16 20:16   ` Alan Stern
@ 2017-10-16 22:11     ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-16 22:11 UTC (permalink / raw)
  To: Alan Stern
  Cc: Linux PM, Bjorn Helgaas, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Monday, October 16, 2017 10:16:15 PM CEST Alan Stern wrote:
> On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:
> 
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > The motivation for this change is to provide a way to work around
> > a problem with the direct-complete mechanism used for avoiding
> > system suspend/resume handling for devices in runtime suspend.
> > 
> > The problem is that some middle layer code (the PCI bus type and
> > the ACPI PM domain in particular) returns positive values from its
> > system suspend ->prepare callbacks regardless of whether the driver's
> > ->prepare returns a positive value or 0, which effectively prevents
> > drivers from being able to control the direct-complete feature.
> > Some drivers need that control, however, and the PCI bus type has
> > grown its own flag to deal with this issue, but since it is not
> > limited to PCI, it is better to address it by adding driver flags at
> > the core level.
> 
> I'm curious: Why does the PCI bus type (and others) do this?  Why 
> doesn't it do what the driver says to do?

Well, the idea was that it might work for the existing drivers without the
need to modify them (and they would have had to be modified had the driver's
->prepare return value been required to be taken into account).

It actually does work for them in general, although with some notable
exceptions.
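
[A simplified userspace model of the behaviour discussed here, offered as a
sketch only.  It assumes that a positive ->prepare return value requests
direct-complete, 0 requests the regular callbacks and a negative value is an
error.  The types, the flag bit values and the names mid_layer_prepare() and
driver_prepare_returns_zero() are hypothetical, and the real division of
work between the PM core and the bus type is not modelled.]

```c
#include <stddef.h>

#define DPM_FLAG_NEVER_SKIP	(1u << 0)	/* placeholder value */
#define DPM_FLAG_SMART_PREPARE	(1u << 1)	/* placeholder value */

struct device;

struct driver_pm_ops {
	int (*prepare)(struct device *dev);	/* may be NULL */
};

struct device {
	unsigned int driver_flags;
	int runtime_suspended;			/* mock runtime PM status */
	const struct driver_pm_ops *drv_pm;
};

/* Hypothetical middle-layer ->prepare combining both flags. */
static int mid_layer_prepare(struct device *dev)
{
	if (dev->drv_pm && dev->drv_pm->prepare) {
		int ret = dev->drv_pm->prepare(dev);

		if (ret < 0)
			return ret;
		/* With SMART_PREPARE, a 0 from the driver is honoured. */
		if (ret == 0 && (dev->driver_flags & DPM_FLAG_SMART_PREPARE))
			return 0;
	}
	/* NEVER_SKIP spares the driver a ->prepare that only returns 0. */
	if (dev->driver_flags & DPM_FLAG_NEVER_SKIP)
		return 0;
	/* Legacy behaviour: decide without consulting the driver. */
	return dev->runtime_suspended ? 1 : 0;
}

/* Example driver callback that wants the regular suspend path. */
static int driver_prepare_returns_zero(struct device *dev)
{
	(void)dev;
	return 0;
}
```

[Without either flag, the driver's 0 is ignored for a runtime-suspended
device, which is the problem the changelog describes.]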

Thanks,
Rafael



* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16 22:05     ` Rafael J. Wysocki
@ 2017-10-17  7:15       ` Greg Kroah-Hartman
  2017-10-17 15:26         ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-17  7:15 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tue, Oct 17, 2017 at 12:05:11AM +0200, Rafael J. Wysocki wrote:
> On Monday, October 16, 2017 8:28:52 AM CEST Greg Kroah-Hartman wrote:
> > On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > >  struct dev_pm_info {
> > >  	pm_message_t		power_state;
> > >  	unsigned int		can_wakeup:1;
> > > @@ -561,6 +580,7 @@ struct dev_pm_info {
> > >  	bool			is_late_suspended:1;
> > >  	bool			early_init:1;	/* Owned by the PM core */
> > >  	bool			direct_complete:1;	/* Owned by the PM core */
> > > +	unsigned int		driver_flags;
> > 
> > Minor nit, u32 or u64?
> 
> u32 I think, will update.
> 
> BTW, there's a mess in this struct overall and I'd like all of the bit fields
> to be the same type (and that shouldn't be bool IMO :-)).
> 
> Do you prefer u32 or unsigned int?

I always prefer an explicit size for variables, unless it's a "generic
loop" type thing.  So I'll always say "u32" for this.

And cleaning up the structure would be great, it's grown over time in
odd ways as you point out.

thanks,

greg k-h


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (12 preceding siblings ...)
  2017-10-16  7:08 ` [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Greg Kroah-Hartman
@ 2017-10-17  8:36 ` Ulf Hansson
  2017-10-17 15:25   ` Rafael J. Wysocki
  2017-10-20 20:46 ` Bjorn Helgaas
  14 siblings, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-17  8:36 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:12, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> Hi All,
>
> Well, this took more time than expected, as I tried to cover everything I had
> in mind regarding PM flags for drivers.
>
> This work was triggered by attempts to fix and optimize PM in the
> i2c-designware-platdev driver that ended up with adding a couple of
> flags to the driver's internal data structures for the tracking of
> device state (https://marc.info/?l=linux-acpi&m=150629646805636&w=2).
> That approach is sort of suboptimal, though, because other drivers will
> probably want to do similar things and if all of them need to use internal
> flags for that, quite a bit of code duplication may ensue at least.
>
> That can be avoided in a couple of ways and one of them is to provide a means
> for drivers to tell the core what to do and to make the core take care of it
> if told to do so.  Hence, the idea to use driver flags for system-wide PM
> that was briefly discussed during the LPC in LA last month.
>
> One of the flags considered at that time was to possibly cause the core
> to reuse the runtime PM callback path of a device for system suspend/resume.
> Admittedly, that idea didn't look too bad to me until I had started to try to
> implement it and I got to the PCI bus type's hibernation callbacks.  Then, I
> moved the patch I was working on to /dev/null right away.  I mean it.
>
> No, this is not going to happen.  No way.
>
> Moreover, that experience made me realize that the whole *idea* of using the
> runtime PM callback path for system-wide PM was actually totally bogus (sorry
> Ulf).
>
> The whole point of having different callback pointers for different types of
> device transitions is because it may be necessary to do different things in
> those callbacks in general.  Now, if you consider runtime PM and system
> suspend/resume *only* and from a driver perspective, then yes, in some cases
> the same pair of callback routines may be used for all suspend-like and
> resume-like transitions of the device, but if you add hibernation to the mix,
> then it is not so clear any more unless the callbacks don't actually do any
> power management at all, but simply quiesce the device's activity and then
> activate it again.  Namely, changing power states of devices during the
> hibernation's "freeze" and "thaw" transitions rarely makes sense at all and
> the "restore" transition needs to be able to cope with uninitialized devices
> (in fact, it should be prepared to cope with devices in *any* state), so
> runtime PM is hardly suitable for them.  Still, if a *driver* chooses to not
> do any real PM in its PM callbacks and leaves that to a middle layer (quite
> a few drivers do that), then it possibly can use one pair of callbacks in all
> cases and be happy, but middle layers pretty much have to use different
> callback routines for different transitions.
>
> If you are a middle layer, your role is basically to do PM for a certain
> group of devices.  Thus you cannot really do the same in ->suspend or
> ->suspend_early and in ->runtime_suspend (because the former generally need to
> take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> really do the same in ->suspend and ->freeze (because the latter shouldn't
> change the device's power state) and so on.  To put it bluntly, trying
> to use the ->runtime_suspend callback of a middle layer for anything other
> than runtime suspend is complete and utter nonsense.  At the same time, the
> ->runtime_resume callback of a middle layer may be reused to some extent,
> but even that doesn't cover the "thaw" transitions during hibernation.
>
> What can work (and this is the only strategy that can work AFAICS) is to
> point different callback pointers *in* *a* *driver* to the same routine
> if the driver wants to reuse that code.  That actually will work for PCI
> and USB drivers today, at least most of the time, but unfortunately there
> are problems with it for, say, platform devices.
>
> The first problem is the requirement to track the status of the device
> (suspended vs not suspended) in the callbacks, because the system-wide PM
> code in the PM core doesn't do that.  The runtime PM framework does it, so
> this means adding some extra code which isn't necessary for runtime PM to
> the callback routines and that is not particularly nice.
>
> The second problem is that, if the driver wants to do anything in its
> ->suspend callback, it generally has to prevent runtime suspend of the
> device from taking place in parallel with that, which is quite cumbersome.
> Usually, that is taken care of by resuming the device from runtime suspend
> upfront, but generally doing that is wasteful (there may be no real need to
> resume the device except for the fact that the code is designed this way).
>
> On top of the above, there are optimizations to be made, like leaving certain
> devices in suspend after system resume to avoid wasting time on waiting for
> them to resume before user space can run again and similar.
>
> This patch series focuses on addressing those problems so as to make it
> easier to reuse callback routines by pointing different callback pointers
> to them in device drivers.  The flags introduced here are to instruct the
> PM core and middle layers (whatever they are) on how the driver wants the
> device to be handled and then the driver has to provide callbacks to match
> these instructions and the rest should be taken care of by the code above it.
>
> The flags are introduced one by one to avoid making too many changes in
> one go and to allow things to be explained better (hopefully).  They mostly
> are mutually independent with some clearly documented exceptions.
>
> The first three patches in the series are about an issue with the
> direct-complete optimization introduced some time ago in which some middle
> layers decide on whether or not to do the optimization without asking the
> drivers.  And, as it turns out, in some cases the drivers actually know
> better, so the new flags introduced by these patches are here for these
> drivers (and the DPM_FLAG_NEVER_SKIP one is really to avoid having to define
> ->prepare callbacks always returning zero).
>
> The really interesting things start to happen in patches [4-9/12] which make it
> possible to avoid resuming devices from runtime suspend upfront during system
> suspend at least in some cases (and when direct-complete is not applied to the
> devices in question), but please refer to the changelogs for details.
>
> The i2c-designware-platdev driver is used as the primary example in the series
> and the patches modifying it are based on some previous changes currently in
> linux-next AFAICS (the same applies to the intel-lpss driver), but these
> patches can wait until everything is properly merged.  They are included here
> mostly as illustration.
>
> Overall, the series is based on the linux-next branch of the linux-pm.git tree
> with some extra patches on top of it and all of the names of new entities
> introduced in it are negotiable.
>
> Thanks,
> Rafael
>

I am not sure I fully understand the goal you have with this series.
Can we please try to get that clear before I continue the review?

Now, re-using runtime PM callbacks for system sleep is already
happening. We have > 60 users (git grep "pm_runtime_force_suspend")
deploying this and from a middle layer point of view, all the trivial
cases support this. Like the spi bus, i2c bus, amba bus, platform
bus, genpd, etc. There are no changes needed to continue to support
this option, if you see what I mean.
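
[A userspace sketch of the reuse pattern referred to here, under stated
assumptions.  mock_force_suspend() is only a simplified stand-in for
pm_runtime_force_suspend(): it skips devices that are already
runtime-suspended and otherwise runs the runtime suspend routine.  All
mock_* names and types are hypothetical, and the real helper does more than
is shown.]

```c
#include <stdbool.h>

struct mock_device {
	bool runtime_suspended;	/* mock runtime PM status */
	int hw_suspend_calls;	/* how often the hardware was really suspended */
};

/* Driver's runtime suspend routine: the only place touching the hardware. */
static int mock_runtime_suspend(struct mock_device *dev)
{
	dev->hw_suspend_calls++;
	return 0;
}

/* Simplified stand-in for the force-suspend helper used for system sleep. */
static int mock_force_suspend(struct mock_device *dev)
{
	int ret;

	/* Nothing to do if runtime PM has already suspended the device. */
	if (dev->runtime_suspended)
		return 0;

	ret = mock_runtime_suspend(dev);
	if (ret)
		return ret;

	dev->runtime_suspended = true;
	return 0;
}

struct mock_pm_ops {
	int (*suspend)(struct mock_device *dev);
	int (*runtime_suspend)(struct mock_device *dev);
};

/* The system suspend slot reuses the runtime PM routine via the helper. */
static const struct mock_pm_ops mock_driver_pm_ops = {
	.suspend = mock_force_suspend,
	.runtime_suspend = mock_runtime_suspend,
};
```

[In this model, calling ->suspend twice touches the hardware only once,
because the second call sees the device as already suspended.]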

So, when you say that re-using runtime PM callbacks for system-wide PM
isn't going to happen, can you please elaborate what you mean?

I assume you mean that the PM core won't be involved to support this,
but is that it?

Do you also mean that *all* users of pm_runtime_force_suspend|resume()
must convert to this new thing, using "driver PM flags", so in the end
you want to remove pm_runtime_force_suspend|resume()?
 - Then if so, you must of course consider all cases for how
pm_runtime_force_suspend|resume() are being deployed currently, else
existing users can't convert to the "driver PM flags" thing. Have you
done that in this series?

Kind regards
Uffe


* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16 22:07     ` Rafael J. Wysocki
@ 2017-10-17 13:26       ` Greg Kroah-Hartman
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-17 13:26 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tue, Oct 17, 2017 at 12:07:37AM +0200, Rafael J. Wysocki wrote:
> On Monday, October 16, 2017 8:31:22 AM CEST Greg Kroah-Hartman wrote:
> > On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > > +static inline void dev_pm_set_driver_flags(struct device *dev, unsigned int flags)
> > > +{
> > > +	dev->power.driver_flags = flags;
> > > +}
> > 
> > Should this function just set the specific bit?  Or is it going to be ok
> > to set the whole value, meaning you aren't going to care about turning
> > on and off specific flags over the lifetime of the driver/device, you
> > are just going to set them once and then just test them as needed?
> 
> The idea is to set them once and they should not be touched again until
> the driver (or device) goes away, so that would be the whole value at once
> (and one of the i2c-designware-platdrv patches actually sets multiple flags
> in one go).

Ok, thanks.


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-17  8:36 ` Ulf Hansson
@ 2017-10-17 15:25   ` Rafael J. Wysocki
  2017-10-17 19:41     ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-17 15:25 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Tuesday, October 17, 2017 10:36:39 AM CEST Ulf Hansson wrote:
> On 16 October 2017 at 03:12, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > Hi All,
> >
> > Well, this took more time than expected, as I tried to cover everything I had
> > in mind regarding PM flags for drivers.
> >
> > This work was triggered by attempts to fix and optimize PM in the
> > i2c-designware-platdev driver that ended up with adding a couple of
> > flags to the driver's internal data structures for the tracking of
> > device state (https://marc.info/?l=linux-acpi&m=150629646805636&w=2).
> > That approach is sort of suboptimal, though, because other drivers will
> > probably want to do similar things and if all of them need to use internal
> > flags for that, quite a bit of code duplication may ensue at least.
> >
> > That can be avoided in a couple of ways and one of them is to provide a means
> > for drivers to tell the core what to do and to make the core take care of it
> > if told to do so.  Hence, the idea to use driver flags for system-wide PM
> > that was briefly discussed during the LPC in LA last month.
> >
> > One of the flags considered at that time was to possibly cause the core
> > to reuse the runtime PM callback path of a device for system suspend/resume.
> > Admittedly, that idea didn't look too bad to me until I had started to try to
> > implement it and I got to the PCI bus type's hibernation callbacks.  Then, I
> > moved the patch I was working on to /dev/null right away.  I mean it.
> >
> > No, this is not going to happen.  No way.
> >
> > Moreover, that experience made me realize that the whole *idea* of using the
> > runtime PM callback path for system-wide PM was actually totally bogus (sorry
> > Ulf).
> >
> > The whole point of having different callback pointers for different types of
> > device transitions is because it may be necessary to do different things in
> > those callbacks in general.  Now, if you consider runtime PM and system
> > suspend/resume *only* and from a driver perspective, then yes, in some cases
> > the same pair of callback routines may be used for all suspend-like and
> > resume-like transitions of the device, but if you add hibernation to the mix,
> > then it is not so clear any more unless the callbacks don't actually do any
> > power management at all, but simply quiesce the device's activity and then
> > activate it again.  Namely, changing power states of devices during the
> > hibernation's "freeze" and "thaw" transitions rarely makes sense at all and
> > the "restore" transition needs to be able to cope with uninitialized devices
> > (in fact, it should be prepared to cope with devices in *any* state), so
> > runtime PM is hardly suitable for them.  Still, if a *driver* chooses to not
> > do any real PM in its PM callbacks and leaves that to a middle layer (quite
> > a few drivers do that), then it possibly can use one pair of callbacks in all
> > cases and be happy, but middle layers pretty much have to use different
> > callback routines for different transitions.
> >
> > If you are a middle layer, your role is basically to do PM for a certain
> > group of devices.  Thus you cannot really do the same in ->suspend or
> > ->suspend_early and in ->runtime_suspend (because the former generally need to
> > take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> > really do the same in ->suspend and ->freeze (because the latter shouldn't
> > change the device's power state) and so on.  To put it bluntly, trying
> > to use the ->runtime_suspend callback of a middle layer for anything other
> > than runtime suspend is complete and utter nonsense.  At the same time, the
> > ->runtime_resume callback of a middle layer may be reused to some extent,
> > but even that doesn't cover the "thaw" transitions during hibernation.
> >
> > What can work (and this is the only strategy that can work AFAICS) is to
> > point different callback pointers *in* *a* *driver* to the same routine
> > if the driver wants to reuse that code.  That actually will work for PCI
> > and USB drivers today, at least most of the time, but unfortunately there
> > are problems with it for, say, platform devices.
> >
> > The first problem is the requirement to track the status of the device
> > (suspended vs not suspended) in the callbacks, because the system-wide PM
> > code in the PM core doesn't do that.  The runtime PM framework does it, so
> > this means adding some extra code which isn't necessary for runtime PM to
> > the callback routines and that is not particularly nice.
> >
> > The second problem is that, if the driver wants to do anything in its
> > ->suspend callback, it generally has to prevent runtime suspend of the
> > device from taking place in parallel with that, which is quite cumbersome.
> > Usually, that is taken care of by resuming the device from runtime suspend
> > upfront, but generally doing that is wasteful (there may be no real need to
> > resume the device except for the fact that the code is designed this way).
> >
> > On top of the above, there are optimizations to be made, like leaving certain
> > devices in suspend after system resume to avoid wasting time on waiting for
> > them to resume before user space can run again and similar.
> >
> > This patch series focuses on addressing those problems so as to make it
> > easier to reuse callback routines by pointing different callback pointers
> > to them in device drivers.  The flags introduced here are to instruct the
> > PM core and middle layers (whatever they are) on how the driver wants the
> > device to be handled and then the driver has to provide callbacks to match
> > these instructions and the rest should be taken care of by the code above it.
> >
> > The flags are introduced one by one to avoid making too many changes in
> > one go and to allow things to be explained better (hopefully).  They mostly
> > are mutually independent with some clearly documented exceptions.
> >
> > The first three patches in the series are about an issue with the
> > direct-complete optimization introduced some time ago in which some middle
> > layers decide on whether or not to do the optimization without asking the
> > drivers.  And, as it turns out, in some cases the drivers actually know
> > better, so the new flags introduced by these patches are here for these
> > drivers (and the DPM_FLAG_NEVER_SKIP one is really to avoid having to define
> > ->prepare callbacks always returning zero).
> >
> > The really interesting things start to happen in patches [4-9/12] which make it
> > possible to avoid resuming devices from runtime suspend upfront during system
> > suspend at least in some cases (and when direct-complete is not applied to the
> > devices in question), but please refer to the changelogs for details.
> >
> > The i2c-designware-platdrv driver is used as the primary example in the series
> > and the patches modifying it are based on some previous changes currently in
> > linux-next AFAICS (the same applies to the intel-lpss driver), but these
> > patches can wait until everything is properly merged.  They are included here
> > mostly as illustration.
> >
> > Overall, the series is based on the linux-next branch of the linux-pm.git tree
> > with some extra patches on top of it and all of the names of new entities
> > introduced in it are negotiable.
> >
> > Thanks,
> > Rafael
> >
> 
> I am not sure I fully understand the goal you have with this series.
> Can we please try to get that clear before I continue the review.

Quoting from the above:

"This patch series focuses on addressing those problems so as to make it
easier to reuse callback routines by pointing different callback pointers
to them in device drivers.  The flags introduced here are to instruct the
PM core and middle layers (whatever they are) on how the driver wants the
device to be handled and then the driver has to provide callbacks to match
these instructions and the rest should be taken care of by the code above it."

I'm not sure what I can explain beyond that. :-)

And the i2c-designware-platdrv and intel-lpss patches show the direction
I would like to take with that going forward: use the flags to reduce code
duplication in drivers and between drivers.

> Now, re-using runtime PM callbacks for system sleep, is already
> happening. We have > 60 users (git grep "pm_runtime_force_suspend")

60 is a small number relative to the total number of device drivers in
the tree.  In particular, that scheme is totally unsuitable for PCI drivers
and how many of them are there?  Surely more than 60.

> deploying this and from a middle layer point of view, all the trivial
> cases supports this.

These functions are wrong, however, because they attempt to reuse the
whole callback *path* instead of just reusing driver callbacks.  The
*only* reason why it all "works" is because there are no middle layer
callbacks involved in that now.

If you changed them to reuse driver callbacks only today, nothing would break
AFAICS.

> Like the spi bus, i2c bus, amba bus, platform
> bus, genpd, etc. There are no changes needed to continue to support
> this option, if you see what I mean.

For the time being, nothing changes in that respect, but eventually I'd
prefer the pm_runtime_force_* things to go away, frankly.

> So, when you say that re-using runtime PM callbacks for system-wide PM
> isn't going to happen, can you please elaborate what you mean?

I didn't mean "reusing runtime PM callbacks for system-wide PM" overall, but
reusing *middle-layer* runtime PM callbacks for system-wide PM.  That is the
bogus part.

Quoting again:

"If you are a middle layer, your role is basically to do PM for a certain
group of devices.  Thus you cannot really do the same in ->suspend or
->suspend_early and in ->runtime_suspend (because the former generally need to
take device_may_wakeup() into account and the latter doesn't) and you shouldn't
really do the same in ->suspend and ->freeze (because the latter shouldn't
change the device's power state) and so on."

I have said multiple times that re-using *driver* callbacks actually makes
sense and the series is for making that easier in general among other things.

> I assume you mean that the PM core won't be involved to support this,
> but is that it?
> 
> Do you also mean that *all* users of pm_runtime_force_suspend|resume()
> must convert to this new thing, using "driver PM flags", so in the end
> you want to remove pm_runtime_force_suspend|resume()?
>  - Then if so, you must of course consider all cases for how
> pm_runtime_force_suspend|resume() are being deployed currently, else
> existing users can't convert to the "driver PM flags" thing. Have you
> done that in this series?

Let me turn this around.

The majority of cases in which pm_runtime_force_* are used *should* be
addressable using the flags introduced here.  Some cases in which
pm_runtime_force_* cannot be used should be addressable by these flags
as well.

There may be some cases in which pm_runtime_force_* are used that may
require something more, but I'm not going to worry about that right now.

I'll take care of that when I'm removing pm_runtime_force_*, which I'm
not doing here.

Thanks,
Rafael



* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-17  7:15       ` Greg Kroah-Hartman
@ 2017-10-17 15:26         ` Rafael J. Wysocki
  2017-10-18  6:56           ` Greg Kroah-Hartman
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-17 15:26 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tuesday, October 17, 2017 9:15:43 AM CEST Greg Kroah-Hartman wrote:
> On Tue, Oct 17, 2017 at 12:05:11AM +0200, Rafael J. Wysocki wrote:
> > On Monday, October 16, 2017 8:28:52 AM CEST Greg Kroah-Hartman wrote:
> > > On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > > >  struct dev_pm_info {
> > > >  	pm_message_t		power_state;
> > > >  	unsigned int		can_wakeup:1;
> > > > @@ -561,6 +580,7 @@ struct dev_pm_info {
> > > >  	bool			is_late_suspended:1;
> > > >  	bool			early_init:1;	/* Owned by the PM core */
> > > >  	bool			direct_complete:1;	/* Owned by the PM core */
> > > > +	unsigned int		driver_flags;
> > > 
> > > Minor nit, u32 or u64?
> > 
> > u32 I think, will update.
> > 
> > BTW, there's a mess in this struct overall and I'd like all of the bit fields
> > to be the same type (and that shouldn't be bool IMO :-)).
> > 
> > Do you prefer u32 or unsigned int?
> 
> I always prefer an explicit size for variables, unless it's a "generic
> loop" type thing.  So I'll always say "u32" for this.
> 
> And cleaning up the structure would be great, it's grown over time in
> odd ways as you point out.

OK, but that will be separate from this work.

Thanks,
Rafael



* Re: [PATCH 12/12] PM / core: Add AVOID_RPM driver flag
  2017-10-16  1:32 ` [PATCH 12/12] PM / core: Add AVOID_RPM driver flag Rafael J. Wysocki
@ 2017-10-17 15:33   ` Andy Shevchenko
  2017-10-17 15:59     ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Andy Shevchenko @ 2017-10-17 15:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, Linux PM
  Cc: Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI,
	Linux PCI, Linux Documentation, Mika Westerberg, Ulf Hansson,
	Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Mon, 2017-10-16 at 03:32 +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Define and document a new driver flag, DPM_FLAG_AVOID_RPM, to inform
> the PM core and middle layer code that the driver has something
> significant to do in its ->suspend and/or ->resume callbacks and
> runtime PM should be disabled for the device when these callbacks
> run.
> 
> Setting DPM_FLAG_AVOID_RPM (in addition to DPM_FLAG_SMART_SUSPEND)
> causes runtime PM to be disabled for the device before invoking the
> driver's ->suspend callback for it and to be enabled again for it
> only after the driver's ->resume callback has returned.  In addition
> to that, if the device is in runtime suspend right after disabling
> runtime PM for it (which means that there was no reason to resume it
> from runtime suspend beforehand), the invocation of the ->suspend
> callback will be skipped for it and it will be left in runtime
> suspend until the "noirq" phase of the subsequent system resume.
> 
> If DPM_FLAG_SMART_SUSPEND is not set, DPM_FLAG_AVOID_RPM has no
> effect.
> 

> +	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> +	    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {

Wasn't the interface designed to allow something like:
	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM)) {
instead?

Does it make sense to have a separate definition for
DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM ?

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


* Re: [PATCH 12/12] PM / core: Add AVOID_RPM driver flag
  2017-10-17 15:33   ` Andy Shevchenko
@ 2017-10-17 15:59     ` Rafael J. Wysocki
  2017-10-17 16:25       ` Andy Shevchenko
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-17 15:59 UTC (permalink / raw)
  To: Andy Shevchenko
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tuesday, October 17, 2017 5:33:17 PM CEST Andy Shevchenko wrote:
> On Mon, 2017-10-16 at 03:32 +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Define and document a new driver flag, DPM_FLAG_AVOID_RPM, to inform
> > the PM core and middle layer code that the driver has something
> > significant to do in its ->suspend and/or ->resume callbacks and
> > runtime PM should be disabled for the device when these callbacks
> > run.
> > 
> > Setting DPM_FLAG_AVOID_RPM (in addition to DPM_FLAG_SMART_SUSPEND)
> > causes runtime PM to be disabled for the device before invoking the
> > driver's ->suspend callback for it and to be enabled again for it
> > only after the driver's ->resume callback has returned.  In addition
> > to that, if the device is in runtime suspend right after disabling
> > runtime PM for it (which means that there was no reason to resume it
> > from runtime suspend beforehand), the invocation of the ->suspend
> > callback will be skipped for it and it will be left in runtime
> > suspend until the "noirq" phase of the subsequent system resume.
> > 
> > If DPM_FLAG_SMART_SUSPEND is not set, DPM_FLAG_AVOID_RPM has no
> > effect.
> > 
> 
> > +	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> > +	    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
> 
> Wasn't interface designed to allow something like:
> 	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM)) {
> instead?

That would return true if any of them was set and both are needed here.
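
To make the difference concrete, here is a minimal user-space sketch.  The
struct, the bit values and test_all_driver_flags() are stand-ins made up for
illustration, not the kernel definitions; only the "any of" behavior of
dev_pm_test_driver_flags() is taken from the statement above.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DPM_FLAG_SMART_SUSPEND	(1u << 0)
#define DPM_FLAG_AVOID_RPM	(1u << 1)

/* Stand-in for the one field of struct device that matters here. */
struct device {
	uint32_t driver_flags;
};

/* "Any of": true if at least one of the requested bits is set. */
static bool dev_pm_test_driver_flags(const struct device *dev, uint32_t flags)
{
	return (dev->driver_flags & flags) != 0;
}

/* "All of": true only if every requested bit is set (hypothetical helper). */
static bool test_all_driver_flags(const struct device *dev, uint32_t flags)
{
	return (dev->driver_flags & flags) == flags;
}

/* A device with only SMART_SUSPEND set passes the OR-ed test, not the "all" one. */
static void any_vs_all_demo(void)
{
	struct device dev = { .driver_flags = DPM_FLAG_SMART_SUSPEND };
	uint32_t both = DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM;

	assert(dev_pm_test_driver_flags(&dev, both));
	assert(!test_all_driver_flags(&dev, both));
	/* Two separate tests joined with && behave like the "all" test. */
	assert(!(dev_pm_test_driver_flags(&dev, DPM_FLAG_SMART_SUSPEND) &&
		 dev_pm_test_driver_flags(&dev, DPM_FLAG_AVOID_RPM)));
}
```

So a single call with the two flags OR-ed together would also match drivers
that set only one of them, which is not what the check in the patch is after.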

> Does it make sense to have a separate definition for
> DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM ?

Yes, it does IMO, because if you don't provide ->suspend and ->resume
callbacks, it is sufficient if runtime PM is disabled for the device
in __device_suspend_late() which happens anyway.

DPM_FLAG_AVOID_RPM is about disabling it earlier.
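
The behavior described in the changelog can be modeled roughly as follows.
This is a user-space sketch, not the code from the patch: the struct fields,
the bit values and the model_*() helpers are invented for illustration, and
the handling of devices without both flags is simplified to just calling
->suspend.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical bit values, chosen only for this sketch. */
#define DPM_FLAG_SMART_SUSPEND	(1u << 0)
#define DPM_FLAG_AVOID_RPM	(1u << 1)

/* Invented stand-in for the device state relevant to this model. */
struct device {
	uint32_t driver_flags;
	bool runtime_pm_enabled;
	bool runtime_suspended;
	int suspend_calls;	/* how many times the driver's ->suspend ran */
};

/* Stand-in for the driver's ->suspend callback. */
static void driver_suspend(struct device *dev)
{
	dev->suspend_calls++;
}

/* Model of the "suspend" phase for one device. */
static void model_suspend(struct device *dev)
{
	const uint32_t both = DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM;

	if ((dev->driver_flags & both) == both) {
		/* Both flags set: disable runtime PM before ->suspend. */
		dev->runtime_pm_enabled = false;
		/* Already runtime-suspended: skip ->suspend, leave it alone. */
		if (dev->runtime_suspended)
			return;
	}
	driver_suspend(dev);
}

/* Model of the "late suspend" phase: runtime PM is disabled here anyway. */
static void model_suspend_late(struct device *dev)
{
	dev->runtime_pm_enabled = false;
}

static void avoid_rpm_demo(void)
{
	struct device dev = {
		.driver_flags = DPM_FLAG_SMART_SUSPEND | DPM_FLAG_AVOID_RPM,
		.runtime_pm_enabled = true,
		.runtime_suspended = true,
	};
	/* AVOID_RPM without SMART_SUSPEND has no effect in this model. */
	struct device other = {
		.driver_flags = DPM_FLAG_AVOID_RPM,
		.runtime_pm_enabled = true,
		.runtime_suspended = true,
	};

	model_suspend(&dev);
	assert(!dev.runtime_pm_enabled);
	assert(dev.suspend_calls == 0);

	model_suspend(&other);
	assert(other.runtime_pm_enabled);
	assert(other.suspend_calls == 1);
	model_suspend_late(&other);
	assert(!other.runtime_pm_enabled);
}
```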

Thanks,
Rafael


* Re: [PATCH 12/12] PM / core: Add AVOID_RPM driver flag
  2017-10-17 15:59     ` Rafael J. Wysocki
@ 2017-10-17 16:25       ` Andy Shevchenko
  0 siblings, 0 replies; 79+ messages in thread
From: Andy Shevchenko @ 2017-10-17 16:25 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tue, 2017-10-17 at 17:59 +0200, Rafael J. Wysocki wrote:
> On Tuesday, October 17, 2017 5:33:17 PM CEST Andy Shevchenko wrote:
> > On Mon, 2017-10-16 at 03:32 +0200, Rafael J. Wysocki wrote:

> > > If DPM_FLAG_SMART_SUSPEND is not set, DPM_FLAG_AVOID_RPM has no
> > > effect.
> > > 
> > > +	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND)
> > > &&
> > > +	    dev_pm_test_driver_flags(dev, DPM_FLAG_AVOID_RPM)) {
> > 
> > Wasn't interface designed to allow something like:
> > 	if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND |
> > DPM_FLAG_AVOID_RPM)) {
> > instead?
> 
> That would return true if any of them was set and both are needed
> here.

Ah, indeed. It would not be equivalent. 

-- 
Andy Shevchenko <andriy.shevchenko@linux.intel.com>
Intel Finland Oy


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-17 15:25   ` Rafael J. Wysocki
@ 2017-10-17 19:41     ` Ulf Hansson
  2017-10-17 20:12       ` Alan Stern
  2017-10-18  0:39       ` Rafael J. Wysocki
  0 siblings, 2 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-17 19:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

[...]

>>
>> I am not sure I fully understand the goal you have with this series.
>> Can we please try to get that clear before I continue the review.
>
> Quoting from the above:
>
> "This patch series focuses on addressing those problems so as to make it
> easier to reuse callback routines by pointing different callback pointers
> to them in device drivers.  The flags introduced here are to instruct the
> PM core and middle layers (whatever they are) on how the driver wants the
> device to be handled and then the driver has to provide callbacks to match
> these instructions and the rest should be taken care of by the code above it."
>
> I'm not sure what I can explain beyond that. :-)
>
> And the i2c-designware-platdrv and intel-lpss patches show the direction
> I would like to take with that going forward: use the flags to reduce code
> duplication in drivers and between drivers.
>
>> Now, re-using runtime PM callbacks for system sleep, is already
>> happening. We have > 60 users (git grep "pm_runtime_force_suspend")
>
> 60 is a small number relative to the total number of device drivers in
> the tree.  In particular, that scheme is totally unsuitable for PCI drivers
> and how many of them there are?  Surely more than 60.

Sure, those 60 can be converted after some work. I just wanted to
understand your plan for these moving forward.

>
>> deploying this and from a middle layer point of view, all the trivial
>> cases supports this.
>
> These functions are wrong, however, because they attempt to reuse the
> whole callback *path* instead of just reusing driver callbacks.  The
> *only* reason why it all "works" is because there are no middle layer
> callbacks involved in that now.
>
> If you changed them to reuse driver callbacks only today, nothing would break
> AFAICS.

Yes, it would.

First, for example, the amba bus is responsible for the amba bus
clock, but relies on drivers to gate/ungate it during system sleep. In
case the amba drivers don't use the pm_runtime_force_suspend|resume(),
they will explicitly have to start managing the clock during system sleep
themselves, leading to open coding.

Second, it will introduce a regression in behavior for all users of
pm_runtime_force_suspend|resume(), especially during system resume as
the driver may then end up resuming the device even in case it isn't
needed. I believe I have explained why, also several times by now -
and that's also how far you could take the i2c designware driver at
this point.

That said, I assume the second part may be addressed in this series,
if these drivers convert to use the "driver PM flags", right?

However, what about the first case? Is some open coding needed or do you
think the amba driver can instruct the amba bus via the "driver PM
flags"?

>
>> Like the spi bus, i2c bus, amba bus, platform
>> bus, genpd, etc. There are no changes needed to continue to support
>> this option, if you see what I mean.
>
> For the time being, nothing changes in that respect, but eventually I'd
> prefer the pm_runtime_force_* things to go away, frankly.

Okay, thanks for that clear statement!

>
>> So, when you say that re-using runtime PM callbacks for system-wide PM
>> isn't going to happen, can you please elaborate what you mean?
>
> I didn't mean "reusing runtime PM callbacks for system-wide PM" overall, but
> reusing *middle-layer* runtime PM callbacks for system-wide PM.  That is the
> bogus part.

I think we have discussed this several times, but the arguments you
have put forward, explaining *why* haven't yet convinced me.

In principle what you have been saying is that it's a "layering
violation" to use pm_runtime_force_suspend|resume() from driver's
system sleep callbacks, but on the other hand you think using
pm_runtime_get*  and friends is okay!?

That makes little sense to me, because it's the same "layering
violation" that is done for both cases.

Moreover, you have been explaining that re-using runtime PM callbacks
for PCI doesn't work. Then my question is, why should a limitation of
the PCI subsystem put constraints on the behavior for all other
subsystems/middle-layers?

>
> Quoting again:
>
> "If you are a middle layer, your role is basically to do PM for a certain
> group of devices.  Thus you cannot really do the same in ->suspend or
> ->suspend_early and in ->runtime_suspend (because the former generally need to
> take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> really do the same in ->suspend and ->freeze (becuase the latter shouldn't
> change the device's power state) and so on."
>
> I have said for multiple times that re-using *driver* callbacks actually makes
> sense and the series is for doing that easier in general among other things.
>
>> I assume you mean that the PM core won't be involved to support this,
>> but is that it?
>>
>> Do you also mean that *all* users of pm_runtime_force_suspend|resume()
>> must convert to this new thing, using "driver PM flags", so in the end
>> you want to remove pm_runtime_force_suspend|resume()?
>>  - Then if so, you must of course consider all cases for how
>> pm_runtime_force_suspend|resume() are being deployed currently, else
>> existing users can't convert to the "driver PM flags" thing. Have you
>> done that in this series?
>
> Let me turn this around.
>
> The majority of cases in which pm_runtime_force_* are used *should* be
> addressable using the flags introduced here.  Some case in which
> pm_runtime_force_* cannot be used should be addressable by these flags
> as well.

That sounds really great!

>
> There may be some cases in which pm_runtime_force_* are used that may
> require something more, but I'm not going to worry about that right now.

This approach concerns me, because if we in the end realize that
pm_runtime_force_suspend|resume() will be too hard to get rid of, then
this series just adds yet another generic way of trying to optimize the
system sleep path for runtime PM enabled devices.

So then we would end up having to support the "direct_complete" path,
the "driver PM flags" and cases where
pm_runtime_force_suspend|resume() is used. No, that just isn't good
enough to me. That will just lead to similar scenarios as we had in
the i2c designware driver.

If we decide to go with these new "driver PM flags", I want to make
sure, as long as possible, that we can remove both the
"direct_complete" path support from the PM core as well as removing
the pm_runtime_force_suspend|resume() helpers.

>
> I'll take care of that when I'll be removing pm_runtime_force_*, which I'm
> not doing here.

Of course I am fine with postponing the actual conversion of drivers
etc. from this series, although, as stated above, let's make sure
we *can* do it by using the "driver PM flags".

Kind regards
Uffe


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-17 19:41     ` Ulf Hansson
@ 2017-10-17 20:12       ` Alan Stern
  2017-10-17 23:07         ` Rafael J. Wysocki
  2017-10-18  0:39       ` Rafael J. Wysocki
  1 sibling, 1 reply; 79+ messages in thread
From: Alan Stern @ 2017-10-17 20:12 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, Linux PM, Bjorn Helgaas, Greg Kroah-Hartman,
	LKML, Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Tue, 17 Oct 2017, Ulf Hansson wrote:

> > These functions are wrong, however, because they attempt to reuse the
> > whole callback *path* instead of just reusing driver callbacks.  The
> > *only* reason why it all "works" is because there are no middle layer
> > callbacks involved in that now.
> >
> > If you changed them to reuse driver callbacks only today, nothing would break
> > AFAICS.
> 
> Yes, it would.
> 
> First, for example, the amba bus is responsible for the amba bus
> clock, but relies on drivers to gate/ungate it during system sleep. In
> case the amba drivers don't use the pm_runtime_force_suspend|resume(),
> it will explicitly have to start manage the clock during system sleep
> themselves. Leading to open coding.

I think what Rafael has in mind is that the PM core will call the amba
bus's ->suspend callback, and that routine will then be able to call
the amba driver's runtime_suspend routine directly, if it wants to --
as opposed to going through pm_runtime_force_suspend.

However, it's not clear whether this fully answers your concerns.

> Second, it will introduce a regression in behavior for all users of
> pm_runtime_force_suspend|resume(), especially during system resume as
> the driver may then end up resuming the device even in case it isn't
> needed. I believe I have explained why, also several times by now -
> and that's also how far you could take the i2c designware driver at
> this point.
> 
> That said, I assume the second part may be addressed in this series,
> if these drivers convert to use the "driver PM flags", right?

Presumably.

The problem is how to handle things which need to be treated
differently for runtime PM vs. system suspend vs. hibernation.  If
everything filters through a runtime_suspend routine, that doesn't
leave any scope for handling the different kinds of PM transitions
differently.  Instead, we can make the middle layer (i.e., the bus-type
callbacks) take care of the varying tasks, and they can directly invoke
a driver's runtime-PM callbacks to handle all the common activities.  
If that's how the middle layer wants to do it.
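
As a rough illustration of that split, the following user-space sketch lets
hypothetical bus-type callbacks handle what differs between the transitions
while sharing one driver routine.  All names and fields are made up; the
rules that system suspend honors the may-wakeup setting and that freeze
leaves the power state alone follow the cover letter, whereas arming wakeup
unconditionally for runtime suspend is an assumption of this sketch.

```c
#include <assert.h>
#include <stdbool.h>

struct device;

/* Hypothetical driver-level PM operations: one shared quiesce routine. */
struct driver_pm {
	int (*runtime_suspend)(struct device *dev);
};

/* Invented stand-in for the device state relevant to this model. */
struct device {
	const struct driver_pm *drv;
	bool may_wakeup;	/* stand-in for device_may_wakeup() */
	bool wakeup_armed;
	bool low_power;
	bool quiesced;
};

/* Driver part: only stops device activity, no power state change. */
static int foo_runtime_suspend(struct device *dev)
{
	dev->quiesced = true;
	return 0;
}

static const struct driver_pm foo_pm = {
	.runtime_suspend = foo_runtime_suspend,
};

/* Bus-level runtime suspend: wakeup armed unconditionally (assumption). */
static int bus_runtime_suspend(struct device *dev)
{
	int ret = dev->drv->runtime_suspend(dev);

	if (ret)
		return ret;
	dev->wakeup_armed = true;
	dev->low_power = true;
	return 0;
}

/* Bus-level system suspend: wakeup depends on the may-wakeup setting. */
static int bus_suspend(struct device *dev)
{
	int ret = dev->drv->runtime_suspend(dev);

	if (ret)
		return ret;
	dev->wakeup_armed = dev->may_wakeup;
	dev->low_power = true;
	return 0;
}

/* Bus-level freeze: quiesce only, the power state is left unchanged. */
static int bus_freeze(struct device *dev)
{
	return dev->drv->runtime_suspend(dev);
}

static void bus_callbacks_demo(void)
{
	struct device a = { .drv = &foo_pm, .may_wakeup = false };
	struct device b = { .drv = &foo_pm, .may_wakeup = false };
	struct device c = { .drv = &foo_pm, .may_wakeup = false };

	assert(bus_runtime_suspend(&a) == 0);
	assert(a.quiesced && a.low_power && a.wakeup_armed);

	assert(bus_suspend(&b) == 0);
	assert(b.quiesced && b.low_power && !b.wakeup_armed);

	assert(bus_freeze(&c) == 0);
	assert(c.quiesced && !c.low_power && !c.wakeup_armed);
}
```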

> However, what about the first case? Is some open coding needed or your
> think the amba driver can instruct the amba bus via the "driver PM
> flags"?

PM flags won't directly be able to cover things like disabling clocks.  
But they could be useful for indicating explicitly whether the code to
take care of those things needs to reside at the driver layer or at the
bus layer.

Alan Stern


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-17 20:12       ` Alan Stern
@ 2017-10-17 23:07         ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-17 23:07 UTC (permalink / raw)
  To: Alan Stern
  Cc: Ulf Hansson, Linux PM, Bjorn Helgaas, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Tuesday, October 17, 2017 10:12:19 PM CEST Alan Stern wrote:
> On Tue, 17 Oct 2017, Ulf Hansson wrote:
> 
> > > These functions are wrong, however, because they attempt to reuse the
> > > whole callback *path* instead of just reusing driver callbacks.  The
> > > *only* reason why it all "works" is because there are no middle layer
> > > callbacks involved in that now.
> > >
> > > If you changed them to reuse driver callbacks only today, nothing would break
> > > AFAICS.
> > 
> > Yes, it would.
> > 
> > First, for example, the amba bus is responsible for the amba bus
> > clock, but relies on drivers to gate/ungate it during system sleep. In
> > case the amba drivers don't use the pm_runtime_force_suspend|resume(),
> > it will explicitly have to start manage the clock during system sleep
> > themselves. Leading to open coding.
> 
> I think what Rafael has in mind is that the PM core will call the amba
> bus's ->suspend callback, and that routine will then be able to call
> the amba driver's runtime_suspend routine directly, if it wants to --
> as opposed to going through pm_runtime_force_suspend.

Right in general.

> However, it's not clear whether this fully answers your concerns.

Well, in the particular AMBA case fixing this should be quite straightforward.

> > Second, it will introduce a regression in behavior for all users of
> > pm_runtime_force_suspend|resume(), especially during system resume as
> > the driver may then end up resuming the device even in case it isn't
> > needed. I believe I have explained why, also several times by now -
> > and that's also how far you could take the i2c designware driver at
> > this point.
> > 
> > That said, I assume the second part may be addressed in this series,
> > if these drivers convert to use the "driver PM flags", right?
> 
> Presumably.
> 
> The problem is how to handle things which need to be treated
> differently for runtime PM vs. system suspend vs. hibernation.  If
> everything filters through a runtime_suspend routine, that doesn't
> leave any scope for handling the different kinds of PM transitions
> differently.  Instead, we can make the middle layer (i.e., the bus-type
> callbacks) take care of the varying tasks, and they can directly invoke
> a driver's runtime-PM callbacks to handle all the common activities.  
> If that's how the middle layer wants to do it.

Well, that's what happens today, except that driver runtime PM callbacks
are not directly invoked.  Actually, I tried to implement that, but it was
so ugly and fragile that I gave up.

It really is better if drivers point the different callback pointers to the
same routine if they want to reuse it.
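
A minimal user-space sketch of that pattern, assuming a hypothetical "foo"
driver whose callbacks only quiesce and reactivate the device.  The structs
below are cut-down stand-ins for struct device and struct dev_pm_ops, not
the kernel definitions.

```c
#include <assert.h>
#include <stdbool.h>

/* Cut-down stand-in for struct device. */
struct device {
	bool active;
};

/* Cut-down stand-in for struct dev_pm_ops with four of its pointers. */
struct dev_pm_ops {
	int (*suspend)(struct device *dev);
	int (*resume)(struct device *dev);
	int (*runtime_suspend)(struct device *dev);
	int (*runtime_resume)(struct device *dev);
};

/* Hypothetical driver routine: stop device activity. */
static int foo_quiesce(struct device *dev)
{
	dev->active = false;
	return 0;
}

/* Hypothetical driver routine: start device activity again. */
static int foo_activate(struct device *dev)
{
	dev->active = true;
	return 0;
}

/* Different callback pointers pointing to the same routines. */
static const struct dev_pm_ops foo_pm_ops = {
	.suspend = foo_quiesce,
	.resume = foo_activate,
	.runtime_suspend = foo_quiesce,
	.runtime_resume = foo_activate,
};

static void foo_pm_ops_demo(void)
{
	struct device dev = { .active = true };

	assert(foo_pm_ops.suspend == foo_pm_ops.runtime_suspend);
	assert(foo_pm_ops.resume == foo_pm_ops.runtime_resume);

	assert(foo_pm_ops.suspend(&dev) == 0);
	assert(!dev.active);
	assert(foo_pm_ops.runtime_resume(&dev) == 0);
	assert(dev.active);
}
```

The reuse is confined to the driver here: whatever the middle layer does for
each transition stays in its own, separate callbacks.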

> > However, what about the first case? Is some open coding needed or your
> > think the amba driver can instruct the amba bus via the "driver PM
> > flags"?
> 
> PM flags won't directly be able to cover things like disabling clocks.  
> But they could be useful for indicating explicitly whether the code to
> take care of those things needs to reside at the driver layer or at the
> bus layer.

Right.

Thanks,
Rafael


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-17 19:41     ` Ulf Hansson
  2017-10-17 20:12       ` Alan Stern
@ 2017-10-18  0:39       ` Rafael J. Wysocki
  2017-10-18 10:24         ` Rafael J. Wysocki
  2017-10-18 11:57         ` Ulf Hansson
  1 sibling, 2 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18  0:39 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Tuesday, October 17, 2017 9:41:16 PM CEST Ulf Hansson wrote:

[cut]

> >
> >> deploying this and from a middle layer point of view, all the trivial
> >> cases supports this.
> >
> > These functions are wrong, however, because they attempt to reuse the
> > whole callback *path* instead of just reusing driver callbacks.  The
> > *only* reason why it all "works" is because there are no middle layer
> > callbacks involved in that now.
> >
> > If you changed them to reuse driver callbacks only today, nothing would break
> > AFAICS.
> 
> Yes, it would.
> 
> First, for example, the amba bus is responsible for the amba bus
> clock, but relies on drivers to gate/ungate it during system sleep. In
> case the amba drivers don't use pm_runtime_force_suspend|resume(),
> they will explicitly have to start managing the clock during system sleep
> themselves, leading to open coding.

Well, I suspected that something like this would surface. ;-)

Are there any major reasons why the appended patch (obviously untested) won't
work, then?

> Second, it will introduce a regression in behavior for all users of
> pm_runtime_force_suspend|resume(), especially during system resume as
> the driver may then end up resuming the device even in case it isn't
> needed.

How so?

I'm talking about a change like in the appended patch, where
pm_runtime_force_* simply invoke driver callbacks directly.  What is
skipped there is middle-layer stuff which is empty anyway in all cases
except for AMBA (if that's all that is lurking below the surface), so
I don't quite see how the failure will happen.

> I believe I have explained why, also several times by now -
> and that's also how far you could take the i2c designware driver at
> this point.
> 
> That said, I assume the second part may be addressed in this series,
> if these drivers convert to use the "driver PM flags", right?
> 
> However, what about the first case? Is some open coding needed or do you
> think the amba driver can instruct the amba bus via the "driver PM
> flags"?

With the appended patch applied things should work for AMBA like for
any other bus type implementing PM, so I don't see why not.

> >
> >> Like the spi bus, i2c bus, amba bus, platform
> >> bus, genpd, etc. There are no changes needed to continue to support
> >> this option, if you see what I mean.
> >
> > For the time being, nothing changes in that respect, but eventually I'd
> > prefer the pm_runtime_force_* things to go away, frankly.
> 
> Okay, thanks for that clear statement!
> 
> >
> >> So, when you say that re-using runtime PM callbacks for system-wide PM
> >> isn't going to happen, can you please elaborate what you mean?
> >
> > I didn't mean "reusing runtime PM callbacks for system-wide PM" overall, but
> > reusing *middle-layer* runtime PM callbacks for system-wide PM.  That is the
> > bogus part.
> 
> I think we have discussed this several times, but the arguments you
> have put forward, explaining *why* haven't yet convinced me.

Well, sorry about that.  I would like to be able to explain my point to you so
that you understand my perspective, but if that's not working, that's not a
sufficient reason for me to give up.

I'm just refusing to maintain code that I don't agree with in the long run.

> In principle what you have been saying is that it's a "layering
> violation" to use pm_runtime_force_suspend|resume() from driver's
> system sleep callbacks, but on the other hand you think using
> pm_runtime_get*  and friends is okay!?

Not unconditionally, which would be fair to mention.

Only if it is called in ->prepare or as the first thing in a ->suspend
callback.  Later than that is broken too in principle.

> That makes little sense to me, because it's the same "layering
> violation" that is done for both cases.

The "layering violation" is all about things possibly occurring in a
wrong order.  For example, say a middle-layer ->runtime_suspend is
called via pm_runtime_force_suspend() which in turn is called from
middle-layer ->suspend_late as a driver callback.  If the ->runtime_suspend
does anything significant to the device, then executing the remaining part of
->suspend_late will almost certainly break things, more or less.

That is not a concern with a middle-layer ->runtime_resume running
*before* a middle-layer ->suspend (or any subsequent callbacks) does
anything significant to the device.
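
To make the ordering problem concrete, here is a small userspace model of the
->suspend_late case (all mock_* names, the clock handling and the
"bus_save_state" step are assumptions made up for the illustration, not code
from any real bus type):

```c
#include <assert.h>
#include <string.h>

/* Userspace mock that records the order in which callbacks touch the device. */
struct mock_device {
	int clk_on;		/* bus clock state owned by the middle layer */
	int bad_accesses;	/* middle-layer accesses made with the clock off */
	char trace[128];	/* comma-separated call trace */
};

static void mock_trace(struct mock_device *dev, const char *what)
{
	assert(strlen(dev->trace) + strlen(what) + 2 < sizeof(dev->trace));
	if (dev->trace[0])
		strcat(dev->trace, ",");
	strcat(dev->trace, what);
}

/* Middle-layer ->runtime_suspend: gates the bus clock. */
static int mock_bus_runtime_suspend(struct mock_device *dev)
{
	mock_trace(dev, "bus_runtime_suspend");
	dev->clk_on = 0;
	return 0;
}

/* Models a "force suspend" helper that reuses the whole callback path. */
static int mock_force_suspend(struct mock_device *dev)
{
	return mock_bus_runtime_suspend(dev);
}

/* Driver ->suspend_late that just calls the helper above. */
static int mock_drv_suspend_late(struct mock_device *dev)
{
	mock_trace(dev, "drv_suspend_late");
	return mock_force_suspend(dev);
}

/* Middle-layer ->suspend_late: driver callback first, then its own work. */
static int mock_bus_suspend_late(struct mock_device *dev)
{
	int ret;

	mock_trace(dev, "bus_suspend_late");
	ret = mock_drv_suspend_late(dev);
	if (ret)
		return ret;

	/* Remaining middle-layer work that assumes the clock is still running. */
	mock_trace(dev, "bus_save_state");
	if (!dev->clk_on)
		dev->bad_accesses++;
	return 0;
}
```

In this model the middle-layer ->runtime_suspend runs in the middle of the
middle-layer ->suspend_late, so the rest of ->suspend_late sees a device whose
clock has already been gated.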

Is there anything in the above which is not clear enough?

> Moreover, you have been explaining that re-using runtime PM callbacks
> for PCI doesn't work. Then my question is, why should a limitation of
> the PCI subsystem put constraints on the behavior for all other
> subsystems/middle-layers?

Because they aren't just PCI subsystem limitations.  The need to handle
wakeup setup differently for runtime PM and system sleep is not PCI-specific.
Neither is the need to handle suspend and hibernation differently.

Those things may be more obvious in PCI, but they are generic rather than
special.
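
A minimal userspace sketch of those two differences follows.  The mock_*
names are hypothetical, and the assumption that runtime suspend always arms
wakeup while system suspend follows a device_may_wakeup()-style policy is a
simplification for the sake of the example:

```c
#include <assert.h>
#include <stdbool.h>

struct mock_device {
	bool may_wakeup;	/* models what device_may_wakeup() would return */
	bool wakeup_armed;	/* wakeup signaling enabled in hardware */
	bool low_power;		/* device put into a low-power state */
	bool quiesced;		/* I/O stopped */
};

/* Runtime suspend: does not consult the system-sleep wakeup policy. */
static int mock_bus_runtime_suspend(struct mock_device *dev)
{
	assert(dev);
	dev->quiesced = true;
	dev->wakeup_armed = true;
	dev->low_power = true;
	return 0;
}

/* System suspend: wakeup is armed only if the policy allows it. */
static int mock_bus_suspend(struct mock_device *dev)
{
	assert(dev);
	dev->quiesced = true;
	dev->wakeup_armed = dev->may_wakeup;
	dev->low_power = true;
	return 0;
}

/* Hibernation freeze: quiesce the device, but leave its power state alone. */
static int mock_bus_freeze(struct mock_device *dev)
{
	assert(dev);
	dev->quiesced = true;
	return 0;
}
```

None of the three can be implemented by simply calling one of the others,
and nothing in that is specific to PCI.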

Also, quite often other middle layers interact with PCI directly or
indirectly (e.g. a platform device may be a child or a consumer of a PCI
device) and some optimizations need to take that into account (e.g. parents
generally need to be accessible when their children are resumed and so on).

Moreover, the majority of the "other subsystems/middle-layers" you've talked
about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
so the question is how representative they really are.

> >
> > Quoting again:
> >
> > "If you are a middle layer, your role is basically to do PM for a certain
> > group of devices.  Thus you cannot really do the same in ->suspend or
> > ->suspend_early and in ->runtime_suspend (because the former generally need to
> > take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> > really do the same in ->suspend and ->freeze (because the latter shouldn't
> > change the device's power state) and so on."
> >
> > I have said multiple times that re-using *driver* callbacks actually makes
> > sense and the series is for making that easier in general among other things.
> >
> >> I assume you mean that the PM core won't be involved to support this,
> >> but is that it?
> >>
> >> Do you also mean that *all* users of pm_runtime_force_suspend|resume()
> >> must convert to this new thing, using "driver PM flags", so in the end
> >> you want to remove pm_runtime_force_suspend|resume()?
> >>  - Then if so, you must of course consider all cases for how
> >> pm_runtime_force_suspend|resume() are being deployed currently, else
> >> existing users can't convert to the "driver PM flags" thing. Have you
> >> done that in this series?
> >
> > Let me turn this around.
> >
> > The majority of cases in which pm_runtime_force_* are used *should* be
> > addressable using the flags introduced here.  Some cases in which
> > pm_runtime_force_* cannot be used should be addressable by these flags
> > as well.
> 
> That sounds really great!
> 
> >
> > There may be some cases in which pm_runtime_force_* are used that may
> > require something more, but I'm not going to worry about that right now.
> 
> This approach concerns me, because if we in the end realize that
> pm_runtime_force_suspend|resume() will be too hard to get rid of, then
> this series just adds yet another generic way of trying to optimize the
> system sleep path for runtime PM enabled devices.

Which also works for PCI and the ACPI PM domain and that's sort of valuable
anyway, isn't it?

For the record, I don't think it will be too hard to get rid of
pm_runtime_force_suspend|resume(), although that may take quite some time.

> So then we would end up having to support the "direct_complete" path,
> the "driver PM flags" and cases where
> pm_runtime_force_suspend|resume() is used. No, that just isn't good
> enough to me. That will just lead to similar scenarios as we had in
> the i2c designware driver.

Frankly, this sounds like staging for indefinite blocking of changes in
this area on non-technical grounds.  I hope that it isn't the case ...

> If we decide to go with these new "driver PM flags", I want to make
> sure, as long as possible, that we can remove both the
> "direct_complete" path support from the PM core as well as removing
> the pm_runtime_force_suspend|resume() helpers.

We'll see.

> >
> > I'll take care of that when I'll be removing pm_runtime_force_*, which I'm
> > not doing here.
> 
> Of course I am fine with postponing the actual conversion
> of drivers etc from this series, although as stated above, let's make sure
> we *can* do it by using the "driver PM flags".

There clearly are use cases that benefit from this series and I don't see
any alternatives covering them, including both direct-complete and the
pm_runtime_force* approach, so I'm not buying this "let's make sure
it can cover all possible use cases that exist" argumentation.

Thanks,
Rafael


---
 drivers/amba/bus.c           |   79 ++++++++++++++++++++++++++++---------------
 drivers/base/power/runtime.c |   10 +++--
 2 files changed, 58 insertions(+), 31 deletions(-)

Index: linux-pm/drivers/amba/bus.c
===================================================================
--- linux-pm.orig/drivers/amba/bus.c
+++ linux-pm/drivers/amba/bus.c
@@ -132,52 +132,77 @@ static struct attribute *amba_dev_attrs[
 ATTRIBUTE_GROUPS(amba_dev);
 
 #ifdef CONFIG_PM
+static void amba_pm_suspend(struct device *dev)
+{
+	struct amba_device *pcdev = to_amba_device(dev);
+
+	if (!dev->driver)
+		return;
+
+	if (pm_runtime_is_irq_safe(dev))
+		clk_disable(pcdev->pclk);
+	else
+		clk_disable_unprepare(pcdev->pclk);
+}
+
+static int amba_pm_resume(struct device *dev)
+{
+	struct amba_device *pcdev = to_amba_device(dev);
+
+	if (!dev->driver)
+		return 0;
+
+	/* Failure is probably fatal to the system, but... */
+	if (pm_runtime_is_irq_safe(dev))
+		return clk_enable(pcdev->pclk);
+
+	return clk_prepare_enable(pcdev->pclk);
+}
+
 /*
  * Hooks to provide runtime PM of the pclk (bus clock).  It is safe to
  * enable/disable the bus clock at runtime PM suspend/resume as this
  * does not result in loss of context.
  */
+static int amba_pm_suspend_late(struct device *dev)
+{
+	int ret = pm_generic_suspend_late(dev);
+
+	if (ret)
+		return ret;
+
+	amba_pm_suspend(dev);
+	return 0;
+}
+
+static int amba_pm_resume_early(struct device *dev)
+{
+	int ret = amba_pm_resume(dev);
+
+	return ret ? ret : pm_generic_resume_early(dev);
+}
+
 static int amba_pm_runtime_suspend(struct device *dev)
 {
-	struct amba_device *pcdev = to_amba_device(dev);
 	int ret = pm_generic_runtime_suspend(dev);
 
-	if (ret == 0 && dev->driver) {
-		if (pm_runtime_is_irq_safe(dev))
-			clk_disable(pcdev->pclk);
-		else
-			clk_disable_unprepare(pcdev->pclk);
-	}
+	if (ret)
+		return ret;
 
-	return ret;
+	amba_pm_suspend(dev);
+	return 0;
 }
 
 static int amba_pm_runtime_resume(struct device *dev)
 {
-	struct amba_device *pcdev = to_amba_device(dev);
-	int ret;
-
-	if (dev->driver) {
-		if (pm_runtime_is_irq_safe(dev))
-			ret = clk_enable(pcdev->pclk);
-		else
-			ret = clk_prepare_enable(pcdev->pclk);
-		/* Failure is probably fatal to the system, but... */
-		if (ret)
-			return ret;
-	}
+	int ret = amba_pm_resume(dev);
 
-	return pm_generic_runtime_resume(dev);
+	return ret ? ret : pm_generic_runtime_resume(dev);
 }
 #endif /* CONFIG_PM */
 
 static const struct dev_pm_ops amba_pm = {
-	.suspend	= pm_generic_suspend,
-	.resume		= pm_generic_resume,
-	.freeze		= pm_generic_freeze,
-	.thaw		= pm_generic_thaw,
-	.poweroff	= pm_generic_poweroff,
-	.restore	= pm_generic_restore,
+	SET_LATE_SYSTEM_SLEEP_PM_OPS(amba_pm_suspend_late, amba_pm_resume_early)
 	SET_RUNTIME_PM_OPS(
 		amba_pm_runtime_suspend,
 		amba_pm_runtime_resume,
Index: linux-pm/drivers/base/power/runtime.c
===================================================================
--- linux-pm.orig/drivers/base/power/runtime.c
+++ linux-pm/drivers/base/power/runtime.c
@@ -1636,14 +1636,15 @@ void pm_runtime_drop_link(struct device
  */
 int pm_runtime_force_suspend(struct device *dev)
 {
-	int (*callback)(struct device *);
+	int (*callback)(struct device *) = NULL;
 	int ret = 0;
 
 	pm_runtime_disable(dev);
 	if (pm_runtime_status_suspended(dev))
 		return 0;
 
-	callback = RPM_GET_CALLBACK(dev, runtime_suspend);
+	if (dev->driver && dev->driver->pm)
+		callback = dev->driver->pm->runtime_suspend;
 
 	if (!callback) {
 		ret = -ENOSYS;
@@ -1690,10 +1691,11 @@ EXPORT_SYMBOL_GPL(pm_runtime_force_suspe
  */
 int pm_runtime_force_resume(struct device *dev)
 {
-	int (*callback)(struct device *);
+	int (*callback)(struct device *) = NULL;
 	int ret = 0;
 
-	callback = RPM_GET_CALLBACK(dev, runtime_resume);
+	if (dev->driver && dev->driver->pm)
+		callback = dev->driver->pm->runtime_resume;
 
 	if (!callback) {
 		ret = -ENOSYS;

* Re: [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-17 15:26         ` Rafael J. Wysocki
@ 2017-10-18  6:56           ` Greg Kroah-Hartman
  0 siblings, 0 replies; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-18  6:56 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

On Tue, Oct 17, 2017 at 05:26:20PM +0200, Rafael J. Wysocki wrote:
> On Tuesday, October 17, 2017 9:15:43 AM CEST Greg Kroah-Hartman wrote:
> > On Tue, Oct 17, 2017 at 12:05:11AM +0200, Rafael J. Wysocki wrote:
> > > On Monday, October 16, 2017 8:28:52 AM CEST Greg Kroah-Hartman wrote:
> > > > On Mon, Oct 16, 2017 at 03:29:02AM +0200, Rafael J. Wysocki wrote:
> > > > >  struct dev_pm_info {
> > > > >  	pm_message_t		power_state;
> > > > >  	unsigned int		can_wakeup:1;
> > > > > @@ -561,6 +580,7 @@ struct dev_pm_info {
> > > > >  	bool			is_late_suspended:1;
> > > > >  	bool			early_init:1;	/* Owned by the PM core */
> > > > >  	bool			direct_complete:1;	/* Owned by the PM core */
> > > > > +	unsigned int		driver_flags;
> > > > 
> > > > Minor nit, u32 or u64?
> > > 
> > > u32 I think, will update.
> > > 
> > > BTW, there's a mess in this struct overall and I'd like all of the bit fields
> > > to be the same type (and that shouldn't be bool IMO :-)).
> > > 
> > > Do you prefer u32 or unsigned int?
> > 
> > I always prefer an explicit size for variables, unless it's a "generic
> > loop" type thing.  So I'll always say "u32" for this.
> > 
> > And cleaning up the structure would be great, it's grown over time in
> > odd ways as you point out.
> 
> OK, but that will be separate from this work.

Of course :)

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18  0:39       ` Rafael J. Wysocki
@ 2017-10-18 10:24         ` Rafael J. Wysocki
  2017-10-18 12:34           ` Ulf Hansson
  2017-10-18 11:57         ` Ulf Hansson
  1 sibling, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 10:24 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Wednesday, October 18, 2017 2:39:24 AM CEST Rafael J. Wysocki wrote:
> On Tuesday, October 17, 2017 9:41:16 PM CEST Ulf Hansson wrote:
> 
> [cut]
> 
> > >
> > >> deploying this and from a middle layer point of view, all the trivial
> > >> cases support this.
> > >
> > > These functions are wrong, however, because they attempt to reuse the
> > > whole callback *path* instead of just reusing driver callbacks.  The
> > > *only* reason why it all "works" is because there are no middle layer
> > > callbacks involved in that now.
> > >
> > > If you changed them to reuse driver callbacks only today, nothing would break
> > > AFAICS.
> > 
> > Yes, it would.
> > 
> > First, for example, the amba bus is responsible for the amba bus
> > clock, but relies on drivers to gate/ungate it during system sleep. In
> > > case the amba drivers don't use pm_runtime_force_suspend|resume(),
> > > they will explicitly have to start managing the clock during system sleep
> > > themselves, leading to open coding.
> 
> Well, I suspected that something like this would surface. ;-)
> 
> Are there any major reasons why the appended patch (obviously untested) won't
> work, then?

OK, there is a reason, which is the optimizations bundled into
pm_runtime_force_*, because (a) the device may be left in runtime suspend
by them (in which case amba_pm_suspend_late() in my patch should not run)
and (b) pm_runtime_force_resume() may decide to leave it suspended (in which
case amba_pm_resume_early() in my patch should not run).

[BTW, the "leave the device suspended" optimization in pm_runtime_force_*
is potentially problematic too, because it requires the children to do
the right thing, which effectively means that their drivers need to use
pm_runtime_force_* too, but what if they don't want to reuse their
runtime PM callbacks for system-wide PM?]

Honestly, I don't like the way this is designed.  IMO, it would be better
to do the optimizations and all in the bus type middle-layer code instead
of expecting drivers to use pm_runtime_force_* as their system-wide PM
callbacks (and that expectation should at least be documented, which I'm
not sure is the case now).  But whatever.

It all should work the way it does now without pm_runtime_force_* if (a) the
bus type's PM callbacks are changed like in the last patch and the drivers
(b) point their system suspend callbacks to the runtime PM callback routines
and (c) set DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED for the
devices (if they need to do the PM in ->suspend and ->resume, they may set
DPM_FLAG_AVOID_RPM too).
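
Roughly, and only as a userspace model of (a)-(c): the MOCK_FLAG_* values,
the mock_* names and the exact checks below are assumptions for illustration,
not the definitions from this series, and DPM_FLAG_AVOID_RPM is not modeled
at all:

```c
#include <assert.h>
#include <stdbool.h>

/* Stand-ins for the flags named above; the bit values are arbitrary. */
#define MOCK_FLAG_SMART_SUSPEND		(1u << 0)
#define MOCK_FLAG_LEAVE_SUSPENDED	(1u << 1)

struct mock_device {
	unsigned int driver_flags;
	bool runtime_suspended;	/* runtime PM status when system suspend starts */
	bool clk_on;		/* bus clock handled by the middle layer */
	int drv_suspend_calls;
	int drv_resume_calls;
};

/* (b) One driver routine per direction, shared with runtime PM. */
static int mock_drv_suspend(struct mock_device *dev)
{
	dev->drv_suspend_calls++;
	return 0;
}

static int mock_drv_resume(struct mock_device *dev)
{
	dev->drv_resume_calls++;
	return 0;
}

/* (c) The driver opts in when it binds to the device. */
static void mock_drv_probe(struct mock_device *dev)
{
	assert(dev);
	dev->driver_flags = MOCK_FLAG_SMART_SUSPEND | MOCK_FLAG_LEAVE_SUSPENDED;
}

/* (a) Middle-layer ->suspend_late: nothing to do for a suspended device. */
static int mock_bus_suspend_late(struct mock_device *dev)
{
	int ret;

	if ((dev->driver_flags & MOCK_FLAG_SMART_SUSPEND) && dev->runtime_suspended)
		return 0;

	ret = mock_drv_suspend(dev);
	if (ret)
		return ret;

	dev->clk_on = false;
	return 0;
}

/* (a) Middle-layer ->resume_early: leave a previously suspended device alone. */
static int mock_bus_resume_early(struct mock_device *dev)
{
	if ((dev->driver_flags & MOCK_FLAG_LEAVE_SUSPENDED) && dev->runtime_suspended)
		return 0;

	dev->clk_on = true;
	return mock_drv_resume(dev);
}
```

In this model a device that was runtime-suspended before system suspend is
not touched at all, and one that was active goes through the driver routine
and the clock handling exactly once in each direction.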

And if you see a reason why that won't work, please let me know.

Thanks,
Rafael

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18  0:39       ` Rafael J. Wysocki
  2017-10-18 10:24         ` Rafael J. Wysocki
@ 2017-10-18 11:57         ` Ulf Hansson
  2017-10-18 13:00           ` Rafael J. Wysocki
  1 sibling, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-18 11:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 18 October 2017 at 02:39, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Tuesday, October 17, 2017 9:41:16 PM CEST Ulf Hansson wrote:
>
> [cut]
>
>> >
>> >> deploying this and from a middle layer point of view, all the trivial
>> >> cases support this.
>> >
>> > These functions are wrong, however, because they attempt to reuse the
>> > whole callback *path* instead of just reusing driver callbacks.  The
>> > *only* reason why it all "works" is because there are no middle layer
>> > callbacks involved in that now.
>> >
>> > If you changed them to reuse driver callbacks only today, nothing would break
>> > AFAICS.
>>
>> Yes, it would.
>>
>> First, for example, the amba bus is responsible for the amba bus
>> clock, but relies on drivers to gate/ungate it during system sleep. In
>> case the amba drivers don't use pm_runtime_force_suspend|resume(),
>> they will explicitly have to start managing the clock during system sleep
>> themselves, leading to open coding.
>
> Well, I suspected that something like this would surface. ;-)
>
> Are there any major reasons why the appended patch (obviously untested) won't
> work, then?

Let me comment on the code, instead of here...

...just realized your second reply, so let me reply to that instead
regarding the patch.

>
>> Second, it will introduce a regression in behavior for all users of
>> pm_runtime_force_suspend|resume(), especially during system resume as
>> the driver may then end up resuming the device even in case it isn't
>> needed.
>
> How so?
>
> I'm talking about a change like in the appended patch, where
> pm_runtime_force_* simply invoke driver callbacks directly.  What is
> skipped there is middle-layer stuff which is empty anyway in all cases
> except for AMBA (if that's all that is lurking below the surface), so
> I don't quite see how the failure will happen.

I am afraid changing pm_runtime_force* to only call driver callbacks
may become fragile. Let me elaborate.

The reason why pm_runtime_force_* needs to respect the hierarchy of
the RPM callbacks is that otherwise it can't safely update the
runtime PM status of the device. And updating the runtime PM status of
the device is required to manage the optimized behavior during system
resume (avoiding unnecessarily resuming devices).

Besides the AMBA case, I also realized that we are dealing with PM
clocks in the genpd case. For this, genpd relies on the runtime
PM status of the device properly reflecting the state of the HW during
system-wide PM.

In other words, if the driver changed the runtime PM status of
the device without respecting the hierarchy of the runtime PM
callbacks, genpd would start making wrong decisions
while managing the PM clocks during system-wide PM. So in case you
intend to change pm_runtime_force_*, this needs to be addressed too.

>
>> I believe I have explained why, also several times by now -
>> and that's also how far you could take the i2c designware driver at
>> this point.
>>
>> That said, I assume the second part may be addressed in this series,
>> if these drivers convert to use the "driver PM flags", right?
>>
>> However, what about the first case? Is some open coding needed or do you
>> think the amba driver can instruct the amba bus via the "driver PM
>> flags"?
>
> With the appended patch applied things should work for AMBA like for
> any other bus type implementing PM, so I don't see why not.
>
>> >
>> >> Like the spi bus, i2c bus, amba bus, platform
>> >> bus, genpd, etc. There are no changes needed to continue to support
>> >> this option, if you see what I mean.
>> >
>> > For the time being, nothing changes in that respect, but eventually I'd
>> > prefer the pm_runtime_force_* things to go away, frankly.
>>
>> Okay, thanks for that clear statement!
>>
>> >
>> >> So, when you say that re-using runtime PM callbacks for system-wide PM
>> >> isn't going to happen, can you please elaborate what you mean?
>> >
>> > I didn't mean "reusing runtime PM callbacks for system-wide PM" overall, but
>> > reusing *middle-layer* runtime PM callbacks for system-wide PM.  That is the
>> > bogus part.
>>
>> I think we have discussed this several times, but the arguments you
>> have put forward, explaining *why* haven't yet convinced me.
>
> Well, sorry about that.  I would like to be able to explain my point to you so
> that you understand my perspective, but if that's not working, that's not a
> sufficient reason for me to give up.
>
> I'm just refusing to maintain code that I don't agree with in the long run.
>
>> In principle what you have been saying is that it's a "layering
>> violation" to use pm_runtime_force_suspend|resume() from driver's
>> system sleep callbacks, but on the other hand you think using
>> pm_runtime_get*  and friends is okay!?
>
> Not unconditionally, which would be fair to mention.
>
> Only if it is called in ->prepare or as the first thing in a ->suspend
> callback.  Later than that is broken too in principle.
>
>> That makes little sense to me, because it's the same "layering
>> violation" that is done for both cases.
>
> The "layering violation" is all about things possibly occurring in a
> wrong order.  For example, say a middle-layer ->runtime_suspend is
> called via pm_runtime_force_suspend() which in turn is called from
> middle-layer ->suspend_late as a driver callback.  If the ->runtime_suspend
> does anything significant to the device, then executing the remaining part of
> ->suspend_late will almost certainly break things, more or less.
>
> That is not a concern with a middle-layer ->runtime_resume running
> *before* a middle-layer ->suspend (or any subsequent callbacks) does
> anything significant to the device.
>
> Is there anything in the above which is not clear enough?
>
>> Moreover, you have been explaining that re-using runtime PM callbacks
>> for PCI doesn't work. Then my question is, why should a limitation of
>> the PCI subsystem put constraints on the behavior for all other
>> subsystems/middle-layers?
>
> Because they aren't just PCI subsystem limitations.  The need to handle
> wakeup setup differently for runtime PM and system sleep is not PCI-specific.
> Neither is the need to handle suspend and hibernation differently.
>
> Those things may be more obvious in PCI, but they are generic rather than
> special.

Absolutely agree about the different wake-up settings. However, these
issues can be addressed also when using pm_runtime_force_*, at least
in general, but then not for PCI.

Regarding hibernation, honestly that's not really my area of
expertise. Although, I assume the middle-layer and driver can treat
that as a separate case, so if it's not suitable to use
pm_runtime_force* for that case, then they shouldn't do it.

>
> Also, quite often other middle layers interact with PCI directly or
> indirectly (e.g. a platform device may be a child or a consumer of a PCI
> device) and some optimizations need to take that into account (e.g. parents
> generally need to be accessible when their children are resumed and so on).

A device's parent becomes informed when the runtime PM status
of the device changes via pm_runtime_force_suspend|resume(), as those call
pm_runtime_set_suspended|active(). In case that isn't sufficient,
what else is needed? Perhaps you can point me to an example so I can
understand better?

A PCI consumer device will of course have to play by the rules of PCI.

>
> Moreover, the majority of the "other subsystems/middle-layers" you've talked
> about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
> so the question is how representative they really are.

That's the point. We know pm_runtime_force_* works nicely for the
trivial middle-layer cases. For the more complex cases, we need
something additional/different.

>
>> >
>> > Quoting again:
>> >
>> > "If you are a middle layer, your role is basically to do PM for a certain
>> > group of devices.  Thus you cannot really do the same in ->suspend or
>> > ->suspend_early and in ->runtime_suspend (because the former generally need to
>> > take device_may_wakeup() into account and the latter doesn't) and you shouldn't
>> > really do the same in ->suspend and ->freeze (because the latter shouldn't
>> > change the device's power state) and so on."
>> >
>> > I have said multiple times that re-using *driver* callbacks actually makes
>> > sense and the series is for making that easier in general among other things.
>> >
>> >> I assume you mean that the PM core won't be involved to support this,
>> >> but is that it?
>> >>
>> >> Do you also mean that *all* users of pm_runtime_force_suspend|resume()
>> >> must convert to this new thing, using "driver PM flags", so in the end
>> >> you want to remove pm_runtime_force_suspend|resume()?
>> >>  - Then if so, you must of course consider all cases for how
>> >> pm_runtime_force_suspend|resume() are being deployed currently, else
>> >> existing users can't convert to the "driver PM flags" thing. Have you
>> >> done that in this series?
>> >
>> > Let me turn this around.
>> >
>> > The majority of cases in which pm_runtime_force_* are used *should* be
>> > addressable using the flags introduced here.  Some cases in which
>> > pm_runtime_force_* cannot be used should be addressable by these flags
>> > as well.
>>
>> That sounds really great!
>>
>> >
>> > There may be some cases in which pm_runtime_force_* are used that may
>> > require something more, but I'm not going to worry about that right now.
>>
>> This approach concerns me, because if we in the end realize that
>> pm_runtime_force_suspend|resume() will be too hard to get rid of, then
>> this series just adds yet another generic way of trying to optimize the
>> system sleep path for runtime PM enabled devices.
>
> Which also works for PCI and the ACPI PM domain and that's sort of valuable
> anyway, isn't it?

Indeed it is! I am definitely open to improving the situation for ACPI and PCI.

Seems like I may have given the wrong impression about that.

>
> For the record, I don't think it will be too hard to get rid of
> pm_runtime_force_suspend|resume(), although that may take quite some time.
>
>> So then we would end up having to support the "direct_complete" path,
>> the "driver PM flags" and cases where
>> pm_runtime_force_suspend|resume() is used. No, that just isn't good
>> enough to me. That will just lead to similar scenarios as we had in
>> the i2c designware driver.
>
> Frankly, this sounds like staging for indefinite blocking of changes in
> this area on non-technical grounds.  I hope that it isn't the case ...
>
>> If we decide to go with these new "driver PM flags", I want to make
>> sure, as long as possible, that we can remove both the
>> "direct_complete" path support from the PM core as well as removing
>> the pm_runtime_force_suspend|resume() helpers.
>
> We'll see.
>
>> >
>> > I'll take care of that when I'll be removing pm_runtime_force_*, which I'm
>> > not doing here.
>>
>> Of course I am fine with postponing the actual conversion
>> of drivers etc from this series, although as stated above, let's make sure
>> we *can* do it by using the "driver PM flags".
>
> There clearly are use cases that benefit from this series and I don't see
> any alternatives covering them, including both direct-complete and the
> pm_runtime_force* approach, so I'm not buying this "let's make sure
> it can cover all possible use cases that exist" argumentation.

Alright, let me re-phrase my take on this.

Because you stated that you plan to remove pm_runtime_force_*
eventually, I think you need to put forward some valid reasons why
(I consider that done), but more importantly, you need to offer an
alternative solution that can replace it. Otherwise such a statement can
easily be misinterpreted. My point is, the "driver PM flags" do
*not* offer a full alternative solution; they may in the future or
they may not.

So, to conclude from my side, I don't have any major objections to
going forward with the "driver PM flags", especially with the goal of
improving the situation for PCI and ACPI. Down the road, we can then
*try* to make it replace pm_runtime_force_* and the "direct_complete
path".

Hopefully that makes it more clear.

[...]

Kind regards
Uffe

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 10:24         ` Rafael J. Wysocki
@ 2017-10-18 12:34           ` Ulf Hansson
  2017-10-18 21:54             ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-18 12:34 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

[...]

>> Are there any major reasons why the appended patch (obviously untested) won't
>> work, then?
>
> OK, there is a reason, which is the optimizations bundled into
> pm_runtime_force_*, because (a) the device may be left in runtime suspend
> by them (in which case amba_pm_suspend_late() in my patch should not run)
> and (b) pm_runtime_force_resume() may decide to leave it suspended (in which
> case amba_pm_resume_early() in my patch should not run).

Exactly.

>
> [BTW, the "leave the device suspended" optimization in pm_runtime_force_*
> is potentially problematic too, because it requires the children to do
> the right thing, which effectively means that their drivers need to use
> pm_runtime_force_* too, but what if they don't want to reuse their
> runtime PM callbacks for system-wide PM?]

Deployment of pm_runtime_force_suspend() should generally be done for
child devices first.

If for some reason that isn't the case, it's expected that the call to
pm_runtime_set_suspended() invoked from pm_runtime_force_suspend()
for the parent should fail and thus abort system suspend.

>
> Honestly, I don't like the way this is designed.  IMO, it would be better
> to do the optimizations and all in the bus type middle-layer code instead
> of expecting drivers to use pm_runtime_force_* as their system-wide PM
> callbacks (and that expectation should at least be documented, which I'm
> not sure is the case now).  But whatever.
>
> It all should work the way it does now without pm_runtime_force_* if (a) the
> bus type's PM callbacks are changed like in the last patch and the drivers
> (b) point their system suspend callbacks to the runtime PM callback routines
> and (c) set DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED for the
> devices (if they need to do the PM in ->suspend and ->resume, they may set
> DPM_FLAG_AVOID_RPM too).
>
> And if you see a reason why that won't work, please let me know.

I will have a look and try out the series using my local "runtime PM
test driver".

I'll get back to you with an update on this.

Kind regards
Uffe

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 11:57         ` Ulf Hansson
@ 2017-10-18 13:00           ` Rafael J. Wysocki
  2017-10-18 14:11             ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 13:00 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Wednesday, October 18, 2017 1:57:52 PM CEST Ulf Hansson wrote:
> On 18 October 2017 at 02:39, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Tuesday, October 17, 2017 9:41:16 PM CEST Ulf Hansson wrote:
> >
> > [cut]
> >
> >> >
> >> >> deploying this and from a middle layer point of view, all the trivial
> >> >> cases supports this.
> >> >
> >> > These functions are wrong, however, because they attempt to reuse the
> >> > whole callback *path* instead of just reusing driver callbacks.  The
> >> > *only* reason why it all "works" is because there are no middle layer
> >> > callbacks involved in that now.
> >> >
> >> > If you changed them to reuse driver callbacks only today, nothing would break
> >> > AFAICS.
> >>
> >> Yes, it would.
> >>
> >> First, for example, the amba bus is responsible for the amba bus
> >> clock, but relies on drivers to gate/ungate it during system sleep. In
> >> case the amba drivers don't use pm_runtime_force_suspend|resume(),
> >> they will explicitly have to start managing the clock during system sleep
> >> themselves, leading to open coding.
> >
> > Well, I suspected that something like this would surface. ;-)
> >
> > Are there any major reasons why the appended patch (obviously untested) won't
> > work, then?
> 
> Let me comment on the code, instead of here...
> 
> ...just realized your second reply, so let me reply to that instead
> regarding the patch.
> 
> >
> >> Second, it will introduce a regression in behavior for all users of
> >> pm_runtime_force_suspend|resume(), especially during system resume as
> >> the driver may then end up resuming the device even in case it isn't
> >> needed.
> >
> > How so?
> >
> > I'm talking about a change like in the appended patch, where
> > pm_runtime_force_* simply invoke driver callbacks directly.  What is
> > skipped there is middle-layer stuff which is empty anyway in all cases
> > except for AMBA (if that's all what is lurking below the surface), so
> > I don't quite see how the failure will happen.
> 
> I am afraid changing pm_runtime_force* to only call driver callbacks
> may become fragile. Let me elaborate.
> 
> The reason why pm_runtime_force_* needs to respect the hierarchy of
> the RPM callbacks is that otherwise it can't safely update the
> runtime PM status of the device.

I'm not sure I follow this requirement.  Why is that so?

> And updating the runtime PM status of
> the device is required to manage the optimized behavior during system
> resume (avoiding unnecessarily resuming devices).

Well, OK.  The runtime PM status of the device after system resume should
better reflect its physical state.

[The physical state of the device may not be under the control of the
kernel in some cases, like in S3 resume on some systems that reset
devices in the firmware and so on, but let's set that aside.]

However, the runtime PM status of the device may still reflect its state
if, say, a ->resume_early of the middle layer is called during resume along
with a driver's ->runtime_resume.  That can still produce the right state
of the device; it all depends on the middle layer.

On the other hand, as I said before, using a middle-layer ->runtime_suspend
during a system sleep transition may be outright incorrect, say if device
wakeup settings need to be adjusted by the middle layer (which is the
case for some of them).
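
As a rough illustration of the wakeup point, here is a user-space sketch that
assumes a middle layer which always arms wakeup for runtime suspend but
honors the user's wakeup policy for system suspend. It is not kernel code:
struct model_wk_dev and both model_ml_* functions are hypothetical, and the
may_wakeup field only stands in for what device_may_wakeup() would report.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device model; field names are illustrative only. */
struct model_wk_dev {
	bool may_wakeup;	/* user policy, as device_may_wakeup() would report */
	bool wakeup_armed;	/* whether the wakeup signal is currently enabled */
	bool powered;
};

/* Runtime suspend: wakeup is armed so the device can resume on demand. */
static void model_ml_runtime_suspend(struct model_wk_dev *dev)
{
	assert(dev != NULL);
	dev->wakeup_armed = true;
	dev->powered = false;
}

/* System suspend: wakeup is armed only if the user policy allows it. */
static void model_ml_suspend(struct model_wk_dev *dev)
{
	assert(dev != NULL);
	dev->wakeup_armed = dev->may_wakeup;
	dev->powered = false;
}
```

With may_wakeup cleared, reusing model_ml_runtime_suspend() during a system
sleep transition would leave wakeup armed against the user's policy, while
model_ml_suspend() would not.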

Of course, if the middle layer expects the driver to point its
system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
but the drivers working with this particular middle layer generally
won't work with other middle layers and may interact incorrectly
with parents and/or children using the other middle layers.

I guess the problem boils down to having a common set of expectations
on the driver side and on the middle layer side allowing different
combinations of these to work together.

> Besides the AMBA case, I also realized that we are dealing with PM
> clocks in the genpd case. For this, genpd relies on the runtime
> PM status of the device properly reflecting the state of the HW during
> system-wide PM.
> 
> In other words, if the driver changed the runtime PM status of
> the device without respecting the hierarchy of the runtime PM
> callbacks, genpd would start making wrong decisions
> while managing the PM clocks during system-wide PM. So in case you
> intend to change pm_runtime_force_*, this needs to be addressed too.

I've just looked at the genpd code and quite frankly I'm not sure how this
works, but I'll figure this out. :-)

> >
> >> I believe I have explained why, also several times by now -
> >> and that's also how far you could take the i2c designware driver at
> >> this point.
> >>
> >> That said, I assume the second part may be addressed in this series,
> >> if these drivers convert to use the "driver PM flags", right?
> >>
> >> However, what about the first case? Is some open coding needed, or do you
> >> think the amba driver can instruct the amba bus via the "driver PM
> >> flags"?
> >
> > With the appended patch applied things should work for AMBA like for
> > any other bus type implementing PM, so I don't see why not.
> >
> >> >
> >> >> Like the spi bus, i2c bus, amba bus, platform
> >> >> bus, genpd, etc. There are no changes needed to continue to support
> >> >> this option, if you see what I mean.
> >> >
> >> > For the time being, nothing changes in that respect, but eventually I'd
> >> > prefer the pm_runtime_force_* things to go away, frankly.
> >>
> >> Okay, thanks for that clear statement!
> >>
> >> >
> >> >> So, when you say that re-using runtime PM callbacks for system-wide PM
> >> >> isn't going to happen, can you please elaborate what you mean?
> >> >
> >> > I didn't mean "reusing runtime PM callbacks for system-wide PM" overall, but
> >> > reusing *middle-layer* runtime PM callbacks for system-wide PM.  That is the
> >> > bogus part.
> >>
> >> I think we have discussed this several times, but the arguments you
> >> have put forward, explaining *why* haven't yet convinced me.
> >
> > Well, sorry about that.  I would like to be able to explain my point to you so
> > that you understand my perspective, but if that's not working, that's not a
> > sufficient reason for me to give up.
> >
> > I'm just refusing to maintain code that I don't agree with in the long run.
> >
> >> In principle what you have been saying is that it's a "layering
> >> violation" to use pm_runtime_force_suspend|resume() from driver's
> >> system sleep callbacks, but on the other hand you think using
> >> pm_runtime_get*  and friends is okay!?
> >
> > Not unconditionally, which would be fair to mention.
> >
> > Only if it is called in ->prepare or as the first thing in a ->suspend
> > callback.  Later than that is broken too in principle.
> >
> >> That makes little sense to me, because it's the same "layering
> >> violation" that is done for both cases.
> >
> > The "layering violation" is all about things possibly occurring in a
> > wrong order.  For example, say a middle-layer ->runtime_suspend is
> > called via pm_runtime_force_suspend() which in turn is called from
> > middle-layer ->suspend_late as a driver callback.  If the ->runtime_suspend
> > does anything significant to the device, then executing the remaining part of
> > ->suspend_late will almost certainly break things, more or less.
> >
> > That is not a concern with a middle-layer ->runtime_resume running
> > *before* a middle-layer ->suspend (or any subsequent callbacks) does
> > anything significant to the device.
> >
> > Is there anything in the above which is not clear enough?
> >
> >> Moreover, you have been explaining that re-using runtime PM callbacks
> >> for PCI doesn't work. Then my question is, why should a limitation of
> >> the PCI subsystem put constraints on the behavior for all other
> >> subsystems/middle-layers?
> >
> > Because they aren't just PCI subsystem limitations only.  The need to handle
> > wakeup setup differently for runtime PM and system sleep is not PCI-specific.
> > The need to handle suspend and hibernation differently isn't either.
> >
> > Those things may be more obvious in PCI, but they are generic rather than
> > special.
> 
> Absolutely agree about the different wake-up settings. However, these
> issues can be addressed also when using pm_runtime_force_*, at least
> in general, but then not for PCI.

Well, not for the ACPI PM domain either.

In general, not if the wakeup settings are adjusted by the middle layer.

> Regarding hibernation, honestly that's not really my area of
> expertise. Although, I assume the middle-layer and driver can treat
> that as a separate case, so if it's not suitable to use
> pm_runtime_force* for that case, then they shouldn't do it.

Well, agreed.

In some simple cases, though, driver callbacks can be reused for hibernation
too, so it would be good to have a common way to do that too, IMO.

> >
> > Also, quite often other middle layers interact with PCI directly or
> > indirectly (e.g. a platform device may be a child or a consumer of a PCI
> > device) and some optimizations need to take that into account (e.g. parents
> > generally need to be accessible when their children are resumed and so on).
> 
> A device's parent becomes informed when changing the runtime PM status
> of the device via pm_runtime_force_suspend|resume(), as those call
> pm_runtime_set_suspended|active().

This requires the parent driver or middle layer to look at the reference
counter and understand it the same way as pm_runtime_force_*.

> In case that isn't sufficient, what else is needed? Perhaps you can
> point me to an example so I can understand better?

Say you want to leave the parent suspended after system resume, but the
child drivers use pm_runtime_force_suspend|resume().  The parent would then
need to use pm_runtime_force_suspend|resume() too, no?
 
> For a PCI consumer device those will of course have to play by the rules of PCI.
> 
> >
> > Moreover, the majority of the "other subsystems/middle-layers" you've talked
> > about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
> > so question is how representative they really are.
> 
> That's the point. We know pm_runtime_force_* works nicely for the
> trivial middle-layer cases.

In which cases the middle-layer callbacks don't exist, so it's just like
reusing driver callbacks directly. :-)

> For the more complex cases, we need something additional/different.

Something different.

But overall, as I said, this is about common expectations.

Today, some middle layers expect drivers to point their callback pointers
to the same routine in order to reuse it (PCI, ACPI bus type), some of them
expect pm_runtime_force_suspend|resume() to be used (AMBA, maybe genpd),
and some of them have no expectations at all.

There needs to be a common ground in that area for drivers to be able to
work with different middle layers.

> >
> >> >
> >> > Quoting again:
> >> >
> >> > "If you are a middle layer, your role is basically to do PM for a certain
> >> > group of devices.  Thus you cannot really do the same in ->suspend or
> >> > ->suspend_early and in ->runtime_suspend (because the former generally need to
> >> > take device_may_wakeup() into account and the latter doesn't) and you shouldn't
> >> > really do the same in ->suspend and ->freeze (because the latter shouldn't
> >> > change the device's power state) and so on."
> >> >
> >> > I have said multiple times that re-using *driver* callbacks actually makes
> >> > sense and the series is for making that easier in general, among other things.
> >> >
> >> >> I assume you mean that the PM core won't be involved to support this,
> >> >> but is that it?
> >> >>
> >> >> Do you also mean that *all* users of pm_runtime_force_suspend|resume()
> >> >> must convert to this new thing, using "driver PM flags", so in the end
> >> >> you want to remove pm_runtime_force_suspend|resume()?
> >> >>  - Then if so, you must of course consider all cases for how
> >> >> pm_runtime_force_suspend|resume() are being deployed currently, else
> >> >> existing users can't convert to the "driver PM flags" thing. Have you
> >> >> done that in this series?
> >> >
> >> > Let me turn this around.
> >> >
> >> > The majority of cases in which pm_runtime_force_* are used *should* be
> >> > addressable using the flags introduced here.  Some case in which
> >> > pm_runtime_force_* cannot be used should be addressable by these flags
> >> > as well.
> >>
> >> That sounds really great!
> >>
> >> >
> >> > There may be some cases in which pm_runtime_force_* are used that may
> >> > require something more, but I'm not going to worry about that right now.
> >>
> >> This approach concerns me, because if we in the end realize that
> >> pm_runtime_force_suspend|resume() will be too hard to get rid of, then
> >> this series just adds yet another generic way of trying to optimize the
> >> system sleep path for runtime PM enabled devices.
> >
> > Which also works for PCI and the ACPI PM domain and that's sort of valuable
> > anyway, isn't it?
> 
> Indeed it is! I am definitely open to improve the situation for ACPI and PCI.
> 
> Seems like I may have given the wrong impression about that.
> 
> >
> > For the record, I don't think it will be too hard to get rid of
> > pm_runtime_force_suspend|resume(), although that may take quite some time.
> >
> >> So then we would end up having to support the "direct_complete" path,
> >> the "driver PM flags" and cases where
> >> pm_runtime_force_suspend|resume() is used. No, that just isn't good
> >> enough to me. That will just lead to similar scenarios as we had in
> >> the i2c designware driver.
> >
> > Frankly, this sounds like staging for indefinite blocking of changes in
> > this area on non-technical grounds.  I hope that it isn't the case ...
> >
> >> If we decide to go with these new "driver PM flags", I want to make
> >> sure, as long as possible, that we can remove both the
> >> "direct_complete" path support from the PM core as well as removing
> >> the pm_runtime_force_suspend|resume() helpers.
> >
> > We'll see.
> >
> >> >
> >> > I'll take care of that when I'll be removing pm_runtime_force_*, which I'm
> >> > not doing here.
> >>
> >> Of course I am fine with postponing the actual conversion
> >> of drivers etc from this series, although as stated above, let's make sure
> >> we *can* do it by using the "driver PM flags".
> >
> > There clearly are use cases that benefit from this series and I don't see
> > any alternatives covering them, including both direct-complete and the
> > pm_runtime_force* approach, so I'm not buying this "let's make sure
> > it can cover all possible use cases that exist" argumentation.
> 
> Alright, let me re-phrase my take on this.
> 
> Because you stated that you plan to remove pm_runtime_force_*
> eventually, I think you need to put up some valid reasons why
> (I consider that done), but more importantly, you need to offer an
> alternative solution that can replace it. Otherwise such a statement can
> easily be misinterpreted. My point is, the "driver PM flags" do
> *not* offer a full alternative solution; they may in the future or
> they may not.
> 
> So, to conclude from my side, I don't have any major objections to
> going forward with the "driver PM flags", especially with the goal of
> improving the situation for PCI and ACPI. Down the road, we can then
> *try* to make it replace pm_runtime_force_* and the "direct_complete
> path".
> 
> Hopefully that makes it more clear.

Yes, it does, thank you!

Rafael

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 13:00           ` Rafael J. Wysocki
@ 2017-10-18 14:11             ` Ulf Hansson
  2017-10-18 19:45               ` Grygorii Strashko
  2017-10-18 22:12               ` Rafael J. Wysocki
  0 siblings, 2 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-18 14:11 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

[...]

>>
>> The reason why pm_runtime_force_* needs to respect the hierarchy of
>> the RPM callbacks is that otherwise it can't safely update the
>> runtime PM status of the device.
>
> I'm not sure I follow this requirement.  Why is that so?

If the PM domain controls some resources for the device in its RPM
callbacks and the driver controls some other resources in its RPM
callbacks, then these resources need to be managed together.

This mirrors what happens when a regular call to pm_runtime_get|put()
triggers the RPM callbacks to be invoked.
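
A minimal user-space sketch of that concern, assuming a PM domain that gates a
clock and a driver that handles pins. It is not kernel code and all names
(struct model_dev, the model_* functions) are hypothetical. It only shows how
the recorded status can drift from the hardware state if the status is updated
after running the driver callback alone.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical device state; every name here is illustrative only. */
struct model_dev {
	bool domain_clock_on;	/* resource handled by the PM domain callback */
	bool driver_pins_on;	/* resource handled by the driver callback */
	bool rpm_active;	/* recorded runtime PM status */
};

/* Driver-level runtime suspend: releases only the driver's resource. */
static int model_driver_runtime_suspend(struct model_dev *dev)
{
	dev->driver_pins_on = false;
	return 0;
}

/* Domain-level runtime suspend: calls the driver, then gates the clock. */
static int model_domain_runtime_suspend(struct model_dev *dev)
{
	int ret = model_driver_runtime_suspend(dev);

	if (ret)
		return ret;
	dev->domain_clock_on = false;
	return 0;
}

/* Update the status after running either the full path or the driver only. */
static int model_force_suspend(struct model_dev *dev, bool full_path)
{
	int ret;

	assert(dev != NULL);
	ret = full_path ? model_domain_runtime_suspend(dev)
			: model_driver_runtime_suspend(dev);
	if (!ret)
		dev->rpm_active = false;
	return ret;
}

/* True if the recorded status agrees with both resources. */
static bool model_status_matches_hw(const struct model_dev *dev)
{
	if (dev->rpm_active)
		return dev->domain_clock_on && dev->driver_pins_on;
	return !dev->domain_clock_on && !dev->driver_pins_on;
}
```

In this model the full path leaves status and hardware in agreement, whereas
the driver-only path records "suspended" while the domain clock is still on.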

>
>> And updating the runtime PM status of
>> the device is required to manage the optimized behavior during system
>> resume (avoiding unnecessarily resuming devices).
>
> Well, OK.  The runtime PM status of the device after system resume should
> better reflect its physical state.
>
> [The physical state of the device may not be under the control of the
> kernel in some cases, like in S3 resume on some systems that reset
> devices in the firmware and so on, but let's set that aside.]
>
> However, the runtime PM status of the device may still reflect its state
> if, say, a ->resume_early of the middle layer is called during resume along
> with a driver's ->runtime_resume.  That can still produce the right state
> of the device; it all depends on the middle layer.
>
> On the other hand, as I said before, using a middle-layer ->runtime_suspend
> during a system sleep transition may be outright incorrect, say if device
> wakeup settings need to be adjusted by the middle layer (which is the
> case for some of them).
>
> Of course, if the middle layer expects the driver to point its
> system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
> but the drivers working with this particular middle layer generally
> won't work with other middle layers and may interact incorrectly
> with parents and/or children using the other middle layers.
>
> I guess the problem boils down to having a common set of expectations
> on the driver side and on the middle layer side allowing different
> combinations of these to work together.

Yes!

>
>> Besides the AMBA case, I also realized that we are dealing with PM
>> clocks in the genpd case. For this, genpd relies on the runtime
>> PM status of the device properly reflecting the state of the HW during
>> system-wide PM.
>>
>> In other words, if the driver changed the runtime PM status of
>> the device without respecting the hierarchy of the runtime PM
>> callbacks, genpd would start making wrong decisions
>> while managing the PM clocks during system-wide PM. So in case you
>> intend to change pm_runtime_force_*, this needs to be addressed too.
>
> I've just looked at the genpd code and quite frankly I'm not sure how this
> works, but I'll figure this out. :-)

You may think of it as genpd's RPM callback controlling some device
clocks, while the driver controls some other device resources (pinctrl,
for example) from its RPM callback.

These resources need to be managed together, similar to what I described above.

[...]

>> Absolutely agree about the different wake-up settings. However, these
>> issues can be addressed also when using pm_runtime_force_*, at least
>> in general, but then not for PCI.
>
> Well, not for the ACPI PM domain either.
>
> In general, not if the wakeup settings are adjusted by the middle layer.

Correct!

To use pm_runtime_force* for these cases, one would need some
additional information exchange between the driver and the
middle-layer.

>
>> Regarding hibernation, honestly that's not really my area of
>> expertise. Although, I assume the middle-layer and driver can treat
>> that as a separate case, so if it's not suitable to use
>> pm_runtime_force* for that case, then they shouldn't do it.
>
> Well, agreed.
>
> In some simple cases, though, driver callbacks can be reused for hibernation
> too, so it would be good to have a common way to do that too, IMO.

Okay, that makes sense!

>
>> >
>> > Also, quite often other middle layers interact with PCI directly or
>> > indirectly (e.g. a platform device may be a child or a consumer of a PCI
>> > device) and some optimizations need to take that into account (e.g. parents
>> > generally need to be accessible when their children are resumed and so on).
>>
>> A device's parent becomes informed when changing the runtime PM status
>> of the device via pm_runtime_force_suspend|resume(), as those call
>> pm_runtime_set_suspended|active().
>
> This requires the parent driver or middle layer to look at the reference
> counter and understand it the same way as pm_runtime_force_*.
>
>> In case that isn't sufficient, what else is needed? Perhaps you can
>> point me to an example so I can understand better?
>
> Say you want to leave the parent suspended after system resume, but the
> child drivers use pm_runtime_force_suspend|resume().  The parent would then
> need to use pm_runtime_force_suspend|resume() too, no?

Actually no.

Currently the other options for "deferring resume" (not using
pm_runtime_force_*) are either using the "direct_complete" path or
something similar to the approach you took for the i2c designware driver.

Both cases should play nicely in combination with a child being managed
by pm_runtime_force_*. That's because resuming can be deferred only when
the parent device is kept runtime suspended during system suspend.

That means, if the resume of the parent is deferred, so is the resume
of the child.

>
>> For a PCI consumer device those will of course have to play by the rules of PCI.
>>
>> >
>> > Moreover, the majority of the "other subsystems/middle-layers" you've talked
>> > about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
>> > so question is how representative they really are.
>>
>> That's the point. We know pm_runtime_force_* works nicely for the
>> trivial middle-layer cases.
>
> In which cases the middle-layer callbacks don't exist, so it's just like
> reusing driver callbacks directly. :-)
>
>> For the more complex cases, we need something additional/different.
>
> Something different.
>
> But overall, as I said, this is about common expectations.
>
> Today, some middle layers expect drivers to point their callback pointers
> to the same routine in order to reuse it (PCI, ACPI bus type), some of them
> expect pm_runtime_force_suspend|resume() to be used (AMBA, maybe genpd),
> and some of them have no expectations at all.
>
> There needs to be a common ground in that area for drivers to be able to
> work with different middle layers.

Yes, reaching that point would be great; we should definitely aim for that!

[...]

Kind regards
Uffe

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 14:11             ` Ulf Hansson
@ 2017-10-18 19:45               ` Grygorii Strashko
  2017-10-18 21:48                 ` Rafael J. Wysocki
  2017-10-18 22:12               ` Rafael J. Wysocki
  1 sibling, 1 reply; 79+ messages in thread
From: Grygorii Strashko @ 2017-10-18 19:45 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones



On 10/18/2017 09:11 AM, Ulf Hansson wrote:
> [...]
> 
>>>
>>> The reason why pm_runtime_force_* needs to respect the hierarchy of
>>> the RPM callbacks is that otherwise it can't safely update the
>>> runtime PM status of the device.
>>
>> I'm not sure I follow this requirement.  Why is that so?
> 
> If the PM domain controls some resources for the device in its RPM
> callbacks and the driver controls some other resources in its RPM
> callbacks, then these resources need to be managed together.
> 
> This mirrors what happens when a regular call to pm_runtime_get|put()
> triggers the RPM callbacks to be invoked.
> 
>>
>>> And updating the runtime PM status of
>>> the device is required to manage the optimized behavior during system
>>> resume (avoiding unnecessarily resuming devices).
>>
>> Well, OK.  The runtime PM status of the device after system resume should
>> better reflect its physical state.
>>
>> [The physical state of the device may not be under the control of the
>> kernel in some cases, like in S3 resume on some systems that reset
>> devices in the firmware and so on, but let's set that aside.]
>>
>> However, the runtime PM status of the device may still reflect its state
>> if, say, a ->resume_early of the middle layer is called during resume along
>> with a driver's ->runtime_resume.  That can still produce the right state
>> of the device; it all depends on the middle layer.
>>
>> On the other hand, as I said before, using a middle-layer ->runtime_suspend
>> during a system sleep transition may be outright incorrect, say if device
>> wakeup settings need to be adjusted by the middle layer (which is the
>> case for some of them).
>>
>> Of course, if the middle layer expects the driver to point its
>> system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
>> but the drivers working with this particular middle layer generally
>> won't work with other middle layers and may interact incorrectly
>> with parents and/or children using the other middle layers.
>>
>> I guess the problem boils down to having a common set of expectations
>> on the driver side and on the middle layer side allowing different
>> combinations of these to work together.
> 
> Yes!
> 
>>
>>> Besides the AMBA case, I also realized that we are dealing with PM
>>> clocks in the genpd case. For this, genpd relies on the runtime
>>> PM status of the device properly reflecting the state of the HW during
>>> system-wide PM.
>>>
>>> In other words, if the driver changed the runtime PM status of
>>> the device without respecting the hierarchy of the runtime PM
>>> callbacks, genpd would start making wrong decisions
>>> while managing the PM clocks during system-wide PM. So in case you
>>> intend to change pm_runtime_force_*, this needs to be addressed too.
>>
>> I've just looked at the genpd code and quite frankly I'm not sure how this
>> works, but I'll figure this out. :-)
> 
> You may think of it as genpd's RPM callback controlling some device
> clocks, while the driver controls some other device resources (pinctrl,
> for example) from its RPM callback.
> 
> These resources need to be managed together, similar to what I described above.
> 
> [...]
> 
>>> Absolutely agree about the different wake-up settings. However, these
>>> issues can be addressed also when using pm_runtime_force_*, at least
>>> in general, but then not for PCI.
>>
>> Well, not for the ACPI PM domain either.
>>
>> In general, not if the wakeup settings are adjusted by the middle layer.
> 
> Correct!
> 
> To use pm_runtime_force* for these cases, one would need some
> additional information exchange between the driver and the
> middle-layer.
> 
>>
>>> Regarding hibernation, honestly that's not really my area of
>>> expertise. Although, I assume the middle-layer and driver can treat
>>> that as a separate case, so if it's not suitable to use
>>> pm_runtime_force* for that case, then they shouldn't do it.
>>
>> Well, agreed.
>>
>> In some simple cases, though, driver callbacks can be reused for hibernation
>> too, so it would be good to have a common way to do that too, IMO.
> 
> Okay, that makes sense!
> 
>>
>>>>
>>>> Also, quite often other middle layers interact with PCI directly or
>>>> indirectly (e.g. a platform device may be a child or a consumer of a PCI
>>>> device) and some optimizations need to take that into account (e.g. parents
>>>> generally need to be accessible when their children are resumed and so on).
>>>
>>> A device's parent becomes informed when changing the runtime PM status
>>> of the device via pm_runtime_force_suspend|resume(), as those call
>>> pm_runtime_set_suspended|active().
>>
>> This requires the parent driver or middle layer to look at the reference
>> counter and understand it the same way as pm_runtime_force_*.
>>
>>> In case that isn't sufficient, what else is needed? Perhaps you can
>>> point me to an example so I can understand better?
>>
>> Say you want to leave the parent suspended after system resume, but the
>> child drivers use pm_runtime_force_suspend|resume().  The parent would then
>> need to use pm_runtime_force_suspend|resume() too, no?
> 
> Actually no.
> 
> Currently the other options for "deferring resume" (not using
> pm_runtime_force_*) are either using the "direct_complete" path or
> something similar to the approach you took for the i2c designware driver.
> 
> Both cases should play nicely in combination with a child being managed
> by pm_runtime_force_*. That's because resuming can be deferred only when
> the parent device is kept runtime suspended during system suspend.
> 
> That means, if the resume of the parent is deferred, so is the resume
> of the child.
> 
>>
>>> For a PCI consumer device those will of course have to play by the rules of PCI.
>>>
>>>>
>>>> Moreover, the majority of the "other subsystems/middle-layers" you've talked
>>>> about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
>>>> so question is how representative they really are.
>>>
>>> That's the point. We know pm_runtime_force_* works nicely for the
>>> trivial middle-layer cases.
>>
>> In which cases the middle-layer callbacks don't exist, so it's just like
>> reusing driver callbacks directly. :-)

I'd like to ask you to clarify one point here, and to provide some info which
I hope can be useful: what exactly does "trivial middle-layer cases" mean?

Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or the
OMAP device framework's struct dev_pm_domain omap_device_pm_domain
(arm/mach-omap2/omap_device.c), or static const struct dev_pm_ops
tegra_aconnect_pm_ops?

If yes, all of the above have PM runtime callbacks.


-- 
regards,
-grygorii


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 19:45               ` Grygorii Strashko
@ 2017-10-18 21:48                 ` Rafael J. Wysocki
  2017-10-19  8:33                   ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 21:48 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Ulf Hansson, Linux PM, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones

On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
> 
> On 10/18/2017 09:11 AM, Ulf Hansson wrote:

[...]

> >>> That's the point. We know pm_runtime_force_* works nicely for the
> >>> trivial middle-layer cases.
> >>
> >> In which cases the middle-layer callbacks don't exist, so it's just like
> >> reusing driver callbacks directly. :-)
> 
> I'd like to ask you clarify one point here and provide some info which I hope can be useful - 
> what's exactly means  "trivial middle-layer cases"?
> 
> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
> device framework struct dev_pm_domain omap_device_pm_domain
> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
> tegra_aconnect_pm_ops?
> 
> if yes all above have PM runtime callbacks.

Trivial ones don't actually do anything meaningful in their PM callbacks.

Things like the platform bus type, spi bus type, i2c bus type and similar.

If the middle-layer callbacks manipulate devices in a significant way, then
they aren't trivial.
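
As a rough illustration of that distinction, here is a minimal userspace
sketch.  It is not kernel code: struct model_dev, struct model_driver and
the *_mid_runtime_suspend() helpers are made up for this example, and the
"clock" is just a stand-in for any device state a middle layer might own.

```c
/* Userspace model only; none of these types are the kernel's. */
#include <assert.h>
#include <stddef.h>

struct model_dev {
	int drv_suspended;	/* state owned by the driver */
	int bus_clock_on;	/* state owned by the middle layer */
};

struct model_driver {
	int (*runtime_suspend)(struct model_dev *dev);
};

static int drv_runtime_suspend(struct model_dev *dev)
{
	dev->drv_suspended = 1;
	return 0;
}

/* Trivial: just forwards to the driver callback, if there is one. */
static int trivial_mid_runtime_suspend(struct model_dev *dev,
				       const struct model_driver *drv)
{
	return drv && drv->runtime_suspend ? drv->runtime_suspend(dev) : 0;
}

/* Non-trivial: also changes device state on its own (here, gates a clock). */
static int nontrivial_mid_runtime_suspend(struct model_dev *dev,
					  const struct model_driver *drv)
{
	int ret = trivial_mid_runtime_suspend(dev, drv);

	if (ret)
		return ret;

	dev->bus_clock_on = 0;
	return 0;
}

/* Usage: only the non-trivial variant touches state outside the driver. */
static void model_example(void)
{
	const struct model_driver drv = { .runtime_suspend = drv_runtime_suspend };
	struct model_dev dev = { .drv_suspended = 0, .bus_clock_on = 1 };

	assert(trivial_mid_runtime_suspend(&dev, &drv) == 0);
	assert(dev.drv_suspended == 1 && dev.bus_clock_on == 1);
	assert(nontrivial_mid_runtime_suspend(&dev, &drv) == 0);
	assert(dev.bus_clock_on == 0);
}
```

With the trivial variant, calling the middle-layer callback and calling the
driver callback directly are equivalent, which is why reusing driver
callbacks is unproblematic in that case.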

Thanks,
Rafael



* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 12:34           ` Ulf Hansson
@ 2017-10-18 21:54             ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 21:54 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Wednesday, October 18, 2017 2:34:10 PM CEST Ulf Hansson wrote:
> [...]
> 
> >> Are there any major reasons why the appended patch (obviously untested) won't
> >> work, then?
> >
> > OK, there is a reason, which is the optimizations bundled into
> > pm_runtime_force_*, because (a) the device may be left in runtime suspend
> > by them (in which case amba_pm_suspend_early() in my patch should not run)
> > and (b) pm_runtime_force_resume() may decide to leave it suspended (in which
> > case amba_pm_suspend_late() in my patch should not run).
> 
> Exactly.
> 
> >
> > [BTW, the "leave the device suspended" optimization in pm_runtime_force_*
> > is potentially problematic too, because it requires the children to do
> > the right thing, which effectively means that their drivers need to use
> > pm_runtime_force_* too, but what if they don't want to reuse their
> > runtime PM callbacks for system-wide PM?]
> 
> Deployment of pm_runtime_force_suspend() should generally be done for
> children devices first.
> 
> If for some reason that isn't the case, it's expected that the call to
> pm_runtime_set_suspended() invoked from pm_runtime_force_suspend(),
> for the parent, should fail and thus abort system suspend.

Well, generally what about drivers that need to do something significantly
different for system suspend and runtime PM?  The whole picture seems to be
falling apart if one of these is involved.

> >
> > Honestly, I don't like the way this is designed.  IMO, it would be better
> > to do the optimizations and all in the bus type middle-layer code instead
> > of expecting drivers to use pm_runtime_force_* as their system-wide PM
> > callbacks (and that expectation should at least be documented, which I'm
> > not sure is the case now).  But whatever.
> >
> > It all should work the way it does now without pm_runtime_force_* if (a) the
> > bus type's PM callbacks are changed like in the last patch and the drivers
> > (b) point their system suspend callbacks to the runtime PM callback routines
> > and (c) set DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED for the
> > devices (if they need to do the PM in ->suspend and ->resume, they may set
> > DPM_FLAG_AVOID_RPM too).
> >
> > And if you see a reason why that won't work, please let me know.
> 
> I will have look and try out the series by using my local "runtime PM
> test driver".
> 
> I get back to you with an update on this.

OK, thanks!


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 14:11             ` Ulf Hansson
  2017-10-18 19:45               ` Grygorii Strashko
@ 2017-10-18 22:12               ` Rafael J. Wysocki
  2017-10-19 12:21                 ` Ulf Hansson
  1 sibling, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 22:12 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Wednesday, October 18, 2017 4:11:33 PM CEST Ulf Hansson wrote:
> [...]
> 
> >>
> >> The reason why pm_runtime_force_* needs to respects the hierarchy of
> >> the RPM callbacks, is because otherwise it can't safely update the
> >> runtime PM status of the device.
> >
> > I'm not sure I follow this requirement.  Why is that so?
> 
> If the PM domain controls some resources for the device in its RPM
> callbacks and the driver controls some other resources in its RPM
> callbacks - then these resources needs to be managed together.

Right, but that doesn't automatically make it necessary to use runtime PM
callbacks in the middle layer.  Its system-wide PM callbacks may be
suitable for that just fine.

That is, at least in some cases, you can combine ->runtime_suspend from a
driver and ->suspend_late from a middle layer with no problems, for example.

That's why some middle layers allow drivers to point ->suspend_late and
->runtime_suspend to the same routine if they want to reuse that code.
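
To make that concrete, a minimal userspace sketch follows.  It only models
the idea: struct model_pm_ops is a cut-down stand-in for struct dev_pm_ops,
and the foo_* routines, the wakeup_armed field and model_mid_suspend_late()
are hypothetical names, not taken from any real driver or bus type.

```c
/* Userspace model only; it mimics the shape of dev_pm_ops, nothing more. */
#include <assert.h>
#include <stddef.h>

struct model_dev {
	int powered;		/* state handled by the driver routine */
	int wakeup_armed;	/* state handled by the middle layer */
};

struct model_pm_ops {
	int (*suspend_late)(struct model_dev *dev);
	int (*resume_early)(struct model_dev *dev);
	int (*runtime_suspend)(struct model_dev *dev);
	int (*runtime_resume)(struct model_dev *dev);
};

static int foo_power_down(struct model_dev *dev)
{
	dev->powered = 0;
	return 0;
}

static int foo_power_up(struct model_dev *dev)
{
	dev->powered = 1;
	return 0;
}

/* The driver reuses one routine for both system-wide and runtime PM. */
static const struct model_pm_ops foo_pm_ops = {
	.suspend_late	 = foo_power_down,
	.resume_early	 = foo_power_up,
	.runtime_suspend = foo_power_down,
	.runtime_resume	 = foo_power_up,
};

/*
 * Middle-layer ->suspend_late: does its own system-sleep work (modelled as
 * arming wakeup) and then runs the driver routine, which is the very same
 * function the driver uses as ->runtime_suspend.
 */
static int model_mid_suspend_late(struct model_dev *dev,
				  const struct model_pm_ops *ops)
{
	dev->wakeup_armed = 1;
	return ops->suspend_late ? ops->suspend_late(dev) : 0;
}

/* Usage: system suspend goes through the middle layer's own callback. */
static void model_example(void)
{
	struct model_dev dev = { .powered = 1, .wakeup_armed = 0 };

	assert(model_mid_suspend_late(&dev, &foo_pm_ops) == 0);
	assert(dev.powered == 0 && dev.wakeup_armed == 1);
}
```

The middle layer's runtime PM callback is not involved here at all, yet the
driver's resources and the middle layer's resources are still handled
together.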

> This follows the behavior of when a regular call to
> pm_runtime_get|put(), triggers the RPM callbacks to be invoked.

But (a) it doesn't have to follow it and (b) in some cases it should not
follow it.
 
> >
> >> And updating the runtime PM status of
> >> the device is required to manage the optimized behavior during system
> >> resume (avoiding to unnecessary resume devices).
> >
> > Well, OK.  The runtime PM status of the device after system resume should
> > better reflect its physical state.
> >
> > [The physical state of the device may not be under the control of the
> > kernel in some cases, like in S3 resume on some systems that reset
> > devices in the firmware and so on, but let's set that aside.]
> >
> > However, for the runtime PM status of the device may still reflect its state
> > if, say, a ->resume_early of the middle layer is called during resume along
> > with a driver's ->runtime_resume.  That still can produce the right state
> > of the device and all depends on the middle layer.
> >
> > On the other hand, as I said before, using a middle-layer ->runtime_suspend
> > during a system sleep transition may be outright incorrect, say if device
> > wakeup settings need to be adjusted by the middle layer (which is the
> > case for some of them).
> >
> > Of course, if the middle layer expects the driver to point its
> > system-wide PM callbacks to pm_runtime_force_*, then that's how it goes,
> > but the drivers working with this particular middle layer generally
> > won't work with other middle layers and may interact incorrectly
> > with parents and/or children using the other middle layers.
> >
> > I guess the problem boils down to having a common set of expectations
> > on the driver side and on the middle layer side allowing different
> > combinations of these to work together.
> 
> Yes!
> 
> >
> >> Besides the AMBA case, I also realized that we are dealing with PM
> >> clocks in the genpd case. For this, genpd relies on the that runtime
> >> PM status of the device properly reflects the state of the HW, during
> >> system-wide PM.
> >>
> >> In other words, if the driver would change the runtime PM status of
> >> the device, without respecting the hierarchy of the runtime PM
> >> callbacks, it would lead to that genpd starts taking wrong decisions
> >> while managing the PM clocks during system-wide PM. So in case you
> >> intend to change pm_runtime_force_* this needs to be addressed too.
> >
> > I've just looked at the genpd code and quite frankly I'm not sure how this
> > works, but I'll figure this out. :-)
> 
> You may think of it as genpd's RPM callback controls some device
> clocks, while the driver control some other device resources (pinctrl
> for example) from its RPM callback.
> 
> These resources needs to managed together, similar to as I described above.

Which, again, doesn't mean that runtime PM callbacks from the middle layer
have to be used for that.

> [...]
> 
> >> Absolutely agree about the different wake-up settings. However, these
> >> issues can be addressed also when using pm_runtime_force_*, at least
> >> in general, but then not for PCI.
> >
> > Well, not for the ACPI PM domain too.
> >
> > In general, not if the wakeup settings are adjusted by the middle layer.
> 
> Correct!
> 
> To use pm_runtime_force* for these cases, one would need some
> additional information exchange between the driver and the
> middle-layer.

Which pretty much defeats the purpose of the wrappers, doesn't it?

> >
> >> Regarding hibernation, honestly that's not really my area of
> >> expertise. Although, I assume the middle-layer and driver can treat
> >> that as a separate case, so if it's not suitable to use
> >> pm_runtime_force* for that case, then they shouldn't do it.
> >
> > Well, agreed.
> >
> > In some simple cases, though, driver callbacks can be reused for hibernation
> > too, so it would be good to have a common way to do that too, IMO.
> 
> Okay, that makes sense!
> 
> >
> >> >
> >> > Also, quite so often other middle layers interact with PCI directly or
> >> > indirectly (eg. a platform device may be a child or a consumer of a PCI
> >> > device) and some optimizations need to take that into account (eg. parents
> >> > generally need to be accessible when their children are resumed and so on).
> >>
> >> A device's parent becomes informed when changing the runtime PM status
> >> of the device via pm_runtime_force_suspend|resume(), as those calls
> >> pm_runtime_set_suspended|active().
> >
> > This requires the parent driver or middle layer to look at the reference
> > counter and understand it the same way as pm_runtime_force_*.
> >
> >> In case that isn't that sufficient, what else is needed? Perhaps you can
> >> point me to an example so I can understand better?
> >
> > Say you want to leave the parent suspended after system resume, but the
> > child drivers use pm_runtime_force_suspend|resume().  The parent would then
> > need to use pm_runtime_force_suspend|resume() too, no?
> 
> Actually no.
> 
> Currently the other options of "deferring resume" (not using
> pm_runtime_force_*), is either using the "direct_complete" path or
> similar to the approach you took for the i2c designware driver.
>
> Both cases should play nicely in combination of a child being managed
> by pm_runtime_force_*. That's because only when the parent device is
> kept runtime suspended during system suspend, resuming can be
> deferred.

And because the parent remains in runtime suspend late enough in the
system suspend path, its children also are guaranteed to be suspended.

But then all of them need to be left in runtime suspend during system
resume too, which is somewhat restrictive, because some drivers may
want their devices to be resumed then.

[BTW, our current documentation recommends resuming devices during
system resume, actually, and gives a list of reasons why. :-)]

> That means, if the resume of the parent is deferred, so will the
> resume of the child.
> 
> >
> >> For a PCI consumer device those will of course have to play by the rules of PCI.
> >>
> >> >
> >> > Moreover, the majority of the "other subsystems/middle-layers" you've talked
> >> > about so far don't provide any PM callbacks to be invoked by pm_runtime_force_*,
> >> > so question is how representative they really are.
> >>
> >> That's the point. We know pm_runtime_force_* works nicely for the
> >> trivial middle-layer cases.
> >
> > In which cases the middle-layer callbacks don't exist, so it's just like
> > reusing driver callbacks directly. :-)
> >
> >> For the more complex cases, we need something additional/different.
> >
> > Something different.
> >
> > But overall, as I said, this is about common expectations.
> >
> > Today, some middle layers expect drivers to point their callback pointers
> > to the same routine in order to reuse it (PCI, ACPI bus type), some of them
> > expect pm_runtime_force_suspend|resume() to be used (AMBA, maybe genpd),
> > and some of them have no expectations at all.
> >
> > There needs to be a common ground in that area for drivers to be able to
> > work with different middle layers.
> 
> Yes, reaching that point would be great, we should definitively aim for that!

Indeed.

Thanks,
Rafael


* [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
                     ` (3 preceding siblings ...)
  2017-10-16 20:16   ` Alan Stern
@ 2017-10-18 23:17   ` Rafael J. Wysocki
  2017-10-19  7:33     ` Greg Kroah-Hartman
  2017-10-23 16:37     ` Ulf Hansson
  4 siblings, 2 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-18 23:17 UTC (permalink / raw)
  To: Linux PM, Greg Kroah-Hartman, Lukas Wunner
  Cc: Bjorn Helgaas, Alan Stern, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c, Lee Jones

From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

The motivation for this change is to provide a way to work around
a problem with the direct-complete mechanism used for avoiding
system suspend/resume handling for devices in runtime suspend.

The problem is that some middle layer code (the PCI bus type and
the ACPI PM domain in particular) returns positive values from its
system suspend ->prepare callbacks regardless of whether the driver's
->prepare returns a positive value or 0, which effectively prevents
drivers from being able to control the direct-complete feature.
Some drivers need that control, however, and the PCI bus type has
grown its own flag to deal with this issue, but since it is not
limited to PCI, it is better to address it by adding driver flags at
the core level.

To that end, add a driver_flags field to struct dev_pm_info for flags
that can be set by device drivers at the probe time to inform the PM
core and/or bus types, PM domains and so on about the capabilities and/or
preferences of device drivers.  Also add two static inline helpers
for setting that field and testing it against a given set of flags
and make the driver core clear it automatically on driver remove
and probe failures.

Define and document two PM driver flags related to the direct-
complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
respectively, to indicate to the PM core that the direct-complete
mechanism should never be used for the device and to inform the
middle layer code (bus types, PM domains etc) that it can only
request the PM core to use the direct-complete mechanism for
the device (by returning a positive value from its ->prepare
callback) if it also has been requested by the driver.

While at it, make the core check pm_runtime_suspended() when
setting power.direct_complete so that it doesn't need to be
checked by ->prepare callbacks.

Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
---

-> v2: Change the data type for driver_flags to u32 as suggested by Greg
       and fix a couple of documentation typos pointed out by Lukas.
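
For reference, the way the two flags combine can be modelled in userspace
roughly as below.  This is only a sketch of the logic in the pci_pm_prepare()
and device_prepare() hunks of this patch, not kernel code: struct model_dev,
its fields and the model_*() helpers are made up for illustration, the driver
is assumed to provide a ->prepare callback, and a suspend transition
(PM_EVENT_SUSPEND) is assumed throughout.

```c
/* Userspace model only: flag names mirror the patch, the rest is made up. */
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

#define DPM_FLAG_NEVER_SKIP	(1u << 0)
#define DPM_FLAG_SMART_PREPARE	(1u << 1)

struct model_dev {
	uint32_t driver_flags;	/* set once at probe time */
	bool runtime_suspended;	/* what pm_runtime_suspended() would report */
	bool mid_wants_skip;	/* middle layer would return > 0 on its own */
};

/* Middle-layer ->prepare; drv_ret is the driver's ->prepare return value. */
static int model_mid_prepare(const struct model_dev *dev, int drv_ret)
{
	if (drv_ret < 0)
		return drv_ret;	/* a driver error aborts the transition */

	if (!drv_ret && (dev->driver_flags & DPM_FLAG_SMART_PREPARE))
		return 0;	/* the driver objected, so do not ask to skip */

	return dev->mid_wants_skip ? 1 : 0;
}

/* Core decision on power.direct_complete, given the ->prepare result. */
static bool model_direct_complete(const struct model_dev *dev, int ret)
{
	return dev->runtime_suspended && ret > 0 &&
	       !(dev->driver_flags & DPM_FLAG_NEVER_SKIP);
}

/* Usage: SMART_PREPARE plus a 0 return from the driver opts out. */
static void model_example(void)
{
	struct model_dev dev = {
		.driver_flags = DPM_FLAG_SMART_PREPARE,
		.runtime_suspended = true,
		.mid_wants_skip = true,
	};

	assert(!model_direct_complete(&dev, model_mid_prepare(&dev, 0)));
	assert(model_direct_complete(&dev, model_mid_prepare(&dev, 1)));
}
```

In this model NEVER_SKIP wins regardless of what ->prepare returns, which
matches the note in the pm.h comment that SMART_PREPARE has no effect if
NEVER_SKIP is set.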

---
 Documentation/driver-api/pm/devices.rst |   14 ++++++++++++++
 Documentation/power/pci.txt             |   19 +++++++++++++++++++
 drivers/acpi/device_pm.c                |    3 +++
 drivers/base/dd.c                       |    2 ++
 drivers/base/power/main.c               |    4 +++-
 drivers/pci/pci-driver.c                |    5 ++++-
 include/linux/device.h                  |   10 ++++++++++
 include/linux/pm.h                      |   20 ++++++++++++++++++++
 8 files changed, 75 insertions(+), 2 deletions(-)

Index: linux-pm/include/linux/device.h
===================================================================
--- linux-pm.orig/include/linux/device.h
+++ linux-pm/include/linux/device.h
@@ -1070,6 +1070,16 @@ static inline void dev_pm_syscore_device
 #endif
 }
 
+static inline void dev_pm_set_driver_flags(struct device *dev, u32 flags)
+{
+	dev->power.driver_flags = flags;
+}
+
+static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
+{
+	return !!(dev->power.driver_flags & flags);
+}
+
 static inline void device_lock(struct device *dev)
 {
 	mutex_lock(&dev->mutex);
Index: linux-pm/include/linux/pm.h
===================================================================
--- linux-pm.orig/include/linux/pm.h
+++ linux-pm/include/linux/pm.h
@@ -550,6 +550,25 @@ struct pm_subsys_data {
 #endif
 };
 
+/*
+ * Driver flags to control system suspend/resume behavior.
+ *
+ * These flags can be set by device drivers at the probe time.  They need not be
+ * cleared by the drivers as the driver core will take care of that.
+ *
+ * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
+ * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
+ *
+ * Setting SMART_PREPARE instructs bus types and PM domains which may want
+ * system suspend/resume callbacks to be skipped for the device to return 0 from
+ * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
+ * other words, the system suspend/resume callbacks can only be skipped for the
+ * device if its driver doesn't object against that).  This flag has no effect
+ * if NEVER_SKIP is set.
+ */
+#define DPM_FLAG_NEVER_SKIP	BIT(0)
+#define DPM_FLAG_SMART_PREPARE	BIT(1)
+
 struct dev_pm_info {
 	pm_message_t		power_state;
 	unsigned int		can_wakeup:1;
@@ -561,6 +580,7 @@ struct dev_pm_info {
 	bool			is_late_suspended:1;
 	bool			early_init:1;	/* Owned by the PM core */
 	bool			direct_complete:1;	/* Owned by the PM core */
+	u32			driver_flags;
 	spinlock_t		lock;
 #ifdef CONFIG_PM_SLEEP
 	struct list_head	entry;
Index: linux-pm/drivers/base/dd.c
===================================================================
--- linux-pm.orig/drivers/base/dd.c
+++ linux-pm/drivers/base/dd.c
@@ -464,6 +464,7 @@ pinctrl_bind_failed:
 	if (dev->pm_domain && dev->pm_domain->dismiss)
 		dev->pm_domain->dismiss(dev);
 	pm_runtime_reinit(dev);
+	dev_pm_set_driver_flags(dev, 0);
 
 	switch (ret) {
 	case -EPROBE_DEFER:
@@ -869,6 +870,7 @@ static void __device_release_driver(stru
 		if (dev->pm_domain && dev->pm_domain->dismiss)
 			dev->pm_domain->dismiss(dev);
 		pm_runtime_reinit(dev);
+		dev_pm_set_driver_flags(dev, 0);
 
 		klist_remove(&dev->p->knode_driver);
 		device_pm_check_callbacks(dev);
Index: linux-pm/drivers/base/power/main.c
===================================================================
--- linux-pm.orig/drivers/base/power/main.c
+++ linux-pm/drivers/base/power/main.c
@@ -1700,7 +1700,9 @@ unlock:
 	 * applies to suspend transitions, however.
 	 */
 	spin_lock_irq(&dev->power.lock);
-	dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
+	dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
+		pm_runtime_suspended(dev) && ret > 0 &&
+		!dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
 	spin_unlock_irq(&dev->power.lock);
 	return 0;
 }
Index: linux-pm/drivers/pci/pci-driver.c
===================================================================
--- linux-pm.orig/drivers/pci/pci-driver.c
+++ linux-pm/drivers/pci/pci-driver.c
@@ -682,8 +682,11 @@ static int pci_pm_prepare(struct device
 
 	if (drv && drv->pm && drv->pm->prepare) {
 		int error = drv->pm->prepare(dev);
-		if (error)
+		if (error < 0)
 			return error;
+
+		if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+			return 0;
 	}
 	return pci_dev_keep_suspended(to_pci_dev(dev));
 }
Index: linux-pm/drivers/acpi/device_pm.c
===================================================================
--- linux-pm.orig/drivers/acpi/device_pm.c
+++ linux-pm/drivers/acpi/device_pm.c
@@ -965,6 +965,9 @@ int acpi_subsys_prepare(struct device *d
 	if (ret < 0)
 		return ret;
 
+	if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
+		return 0;
+
 	if (!adev || !pm_runtime_suspended(dev))
 		return 0;
 
Index: linux-pm/Documentation/driver-api/pm/devices.rst
===================================================================
--- linux-pm.orig/Documentation/driver-api/pm/devices.rst
+++ linux-pm/Documentation/driver-api/pm/devices.rst
@@ -354,6 +354,20 @@ the phases are: ``prepare``, ``suspend``
 	is because all such devices are initially set to runtime-suspended with
 	runtime PM disabled.
 
+	This feature also can be controlled by device drivers by using the
+	``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
+	management flags.  [Typically, they are set at the time the driver is
+	probed against the device in question by passing them to the
+	:c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
+	these flags is set, the PM core will not apply the direct-complete
+	procedure described above to the given device and, consequently, to any
+	of its ancestors.  The second flag, when set, informs the middle layer
+	code (bus types, device types, PM domains, classes) that it should take
+	the return value of the ``->prepare`` callback provided by the driver
+	into account and it may only return a positive value from its own
+	``->prepare`` callback if the driver's one also has returned a positive
+	value.
+
     2.	The ``->suspend`` methods should quiesce the device to stop it from
 	performing I/O.  They also may save the device registers and put it into
 	the appropriate low-power state, depending on the bus type the device is
Index: linux-pm/Documentation/power/pci.txt
===================================================================
--- linux-pm.orig/Documentation/power/pci.txt
+++ linux-pm/Documentation/power/pci.txt
@@ -961,6 +961,25 @@ dev_pm_ops to indicate that one suspend
 .suspend(), .freeze(), and .poweroff() members and one resume routine is to
 be pointed to by the .resume(), .thaw(), and .restore() members.
 
+3.1.19. Driver Flags for Power Management
+
+The PM core allows device drivers to set flags that influence the handling of
+power management for the devices by the core itself and by middle layer code
+including the PCI bus type.  The flags should be set once at the driver probe
+time with the help of the dev_pm_set_driver_flags() function and they should not
+be updated directly afterwards.
+
+The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete
+mechanism allowing device suspend/resume callbacks to be skipped if the device
+is in runtime suspend when the system suspend starts.  That also affects all of
+the ancestors of the device, so this flag should only be used if absolutely
+necessary.
+
+The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a
+positive value from pci_pm_prepare() if the ->prepare callback provided by the
+driver of the device returns a positive value.  That allows the driver to opt
+out from using the direct-complete mechanism dynamically.
+
 3.2. Device Runtime Power Management
 ------------------------------------
 In addition to providing device power management callbacks PCI device drivers



* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-18 23:17   ` [Update][PATCH v2 " Rafael J. Wysocki
@ 2017-10-19  7:33     ` Greg Kroah-Hartman
  2017-10-20 11:11       ` Rafael J. Wysocki
  2017-10-23 16:37     ` Ulf Hansson
  1 sibling, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-19  7:33 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Lukas Wunner, Bjorn Helgaas, Alan Stern, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> The motivation for this change is to provide a way to work around
> a problem with the direct-complete mechanism used for avoiding
> system suspend/resume handling for devices in runtime suspend.
> 
> The problem is that some middle layer code (the PCI bus type and
> the ACPI PM domain in particular) returns positive values from its
> system suspend ->prepare callbacks regardless of whether the driver's
> ->prepare returns a positive value or 0, which effectively prevents
> drivers from being able to control the direct-complete feature.
> Some drivers need that control, however, and the PCI bus type has
> grown its own flag to deal with this issue, but since it is not
> limited to PCI, it is better to address it by adding driver flags at
> the core level.
> 
> To that end, add a driver_flags field to struct dev_pm_info for flags
> that can be set by device drivers at the probe time to inform the PM
> core and/or bus types, PM domains and so on on the capabilities and/or
> preferences of device drivers.  Also add two static inline helpers
> for setting that field and testing it against a given set of flags
> and make the driver core clear it automatically on driver remove
> and probe failures.
> 
> Define and document two PM driver flags related to the direct-
> complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> respectively, to indicate to the PM core that the direct-complete
> mechanism should never be used for the device and to inform the
> middle layer code (bus types, PM domains etc) that it can only
> request the PM core to use the direct-complete mechanism for
> the device (by returning a positive value from its ->prepare
> callback) if it also has been requested by the driver.
> 
> While at it, make the core check pm_runtime_suspended() when
> setting power.direct_complete so that it doesn't need to be
> checked by ->prepare callbacks.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 21:48                 ` Rafael J. Wysocki
@ 2017-10-19  8:33                   ` Ulf Hansson
  2017-10-19 17:21                     ` Grygorii Strashko
  0 siblings, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-19  8:33 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones

On 18 October 2017 at 23:48, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
>>
>> On 10/18/2017 09:11 AM, Ulf Hansson wrote:
>
> [...]
>
>> >>> That's the point. We know pm_runtime_force_* works nicely for the
>> >>> trivial middle-layer cases.
>> >>
>> >> In which cases the middle-layer callbacks don't exist, so it's just like
>> >> reusing driver callbacks directly. :-)
>>
>> I'd like to ask you clarify one point here and provide some info which I hope can be useful -
>> what's exactly means  "trivial middle-layer cases"?
>>
>> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
>> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
>> device framework struct dev_pm_domain omap_device_pm_domain
>> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
>> tegra_aconnect_pm_ops?
>>
>> if yes all above have PM runtime callbacks.
>
> Trivial ones don't actually do anything meaningful in their PM callbacks.
>
> Things like the platform bus type, spi bus type, i2c bus type and similar.
>
> If the middle-layer callbacks manipulate devices in a significant way, then
> they aren't trivial.

I fully agree with Rafael's description above, but let me also clarify
one more thing.

We have also been discussing PM domains as being trivial and
non-trivial. In some statements I even think the PM domain has been
part of the middle-layer terminology, which may have been a bit
confusing.

In this regard, as we consider genpd to be a trivial PM domain, the
examples you bring up above are, to me, also examples of trivial PM
domains. Especially because they don't deal with wakeups, as that is
taken care of by the drivers, right!?

Kind regards
Uffe


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-18 22:12               ` Rafael J. Wysocki
@ 2017-10-19 12:21                 ` Ulf Hansson
  2017-10-19 18:01                   ` Ulf Hansson
  2017-10-20  1:19                   ` Rafael J. Wysocki
  0 siblings, 2 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-19 12:21 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 19 October 2017 at 00:12, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Wednesday, October 18, 2017 4:11:33 PM CEST Ulf Hansson wrote:
>> [...]
>>
>> >>
>> >> The reason why pm_runtime_force_* needs to respects the hierarchy of
>> >> the RPM callbacks, is because otherwise it can't safely update the
>> >> runtime PM status of the device.
>> >
>> > I'm not sure I follow this requirement.  Why is that so?
>>
>> If the PM domain controls some resources for the device in its RPM
>> callbacks and the driver controls some other resources in its RPM
>> callbacks - then these resources needs to be managed together.
>
> Right, but that doesn't automatically make it necessary to use runtime PM
> callbacks in the middle layer.  Its system-wide PM callbacks may be
> suitable for that just fine.
>
> That is, at least in some cases, you can combine ->runtime_suspend from a
> driver and ->suspend_late from a middle layer with no problems, for example.
>
> That's why some middle layers allow drivers to point ->suspend_late and
> ->runtime_suspend to the same routine if they want to reuse that code.
>
>> This follows the behavior of when a regular call to
>> pm_runtime_get|put(), triggers the RPM callbacks to be invoked.
>
> But (a) it doesn't have to follow it and (b) in some cases it should not
> follow it.

Of course you don't explicitly *have to* respect the hierarchy of the
RPM callbacks in pm_runtime_force_*.

However, changing that would require some additional information
exchange between the driver and the middle-layer/PM domain, so as to
tell the middle-layer/PM domain what to do during system-wide PM.
That is especially true when the driver deals with wakeup, as in those
cases the instructions may change dynamically.

[...]

>> > In general, not if the wakeup settings are adjusted by the middle layer.
>>
>> Correct!
>>
>> To use pm_runtime_force* for these cases, one would need some
>> additional information exchange between the driver and the
>> middle-layer.
>
> Which pretty much defeats the purpose of the wrappers, doesn't it?

Well, no matter if the wrappers are used or not, we need some kind of
information exchange between the driver and the middle-layers/PM
domains.

Anyway, I personally think it's too early to conclude that using the
wrappers may not be useful going forward. At this point, they clearly
help trivial cases remain trivial.

>
>> >
>> >> Regarding hibernation, honestly that's not really my area of
>> >> expertise. Although, I assume the middle-layer and driver can treat
>> >> that as a separate case, so if it's not suitable to use
>> >> pm_runtime_force* for that case, then they shouldn't do it.
>> >
>> > Well, agreed.
>> >
>> > In some simple cases, though, driver callbacks can be reused for hibernation
>> > too, so it would be good to have a common way to do that too, IMO.
>>
>> Okay, that makes sense!
>>
>> >
>> >> >
>> >> > Also, quite so often other middle layers interact with PCI directly or
>> >> > indirectly (eg. a platform device may be a child or a consumer of a PCI
>> >> > device) and some optimizations need to take that into account (eg. parents
>> >> > generally need to be accessible when their childres are resumed and so on).
>> >>
>> >> A device's parent becomes informed when changing the runtime PM status
>> >> of the device via pm_runtime_force_suspend|resume(), as those calls
>> >> pm_runtime_set_suspended|active().
>> >
>> > This requires the parent driver or middle layer to look at the reference
>> > counter and understand it the same way as pm_runtime_force_*.
>> >
>> >> In case that isn't that sufficient, what else is needed? Perhaps you can
>> >> point me to an example so I can understand better?
>> >
>> > Say you want to leave the parent suspended after system resume, but the
>> > child drivers use pm_runtime_force_suspend|resume().  The parent would then
>> > need to use pm_runtime_force_suspend|resume() too, no?
>>
>> Actually no.
>>
>> Currently the other options of "deferring resume" (not using
>> pm_runtime_force_*), is either using the "direct_complete" path or
>> similar to the approach you took for the i2c designware driver.
>>
>> Both cases should play nicely in combination of a child being managed
>> by pm_runtime_force_*. That's because only when the parent device is
>> kept runtime suspended during system suspend, resuming can be
>> deferred.
>
> And because the parent remains in runtime suspend late enough in the
> system suspend path, its children also are guaranteed to be suspended.

Yes.

>
> But then all of them need to be left in runtime suspend during system
> resume too, which is somewhat restrictive, because some drivers may
> want their devices to be resumed then.

Actually, this scenario is also addressed when using the pm_runtime_force_*.

The driver for the child would only need to bump the runtime PM usage
count (pm_runtime_get_noresume()) before calling
pm_runtime_force_suspend() at system suspend. That then also
propagates to the parent, so that both the parent and the child will
be resumed when pm_runtime_force_resume() is called for them.

Of course, if the driver of the parent isn't using pm_runtime_force_*,
we would have to assume that it's always being resumed at system
resume.

As a matter of fact, doesn't this scenario actually indicate that we
do need to involve the runtime PM core (updating RPM status according
to the HW state even during system-wide PM) to really get this right?
It's not enough to only use "driver PM flags"!?

Seems like we need to create a list of all requirements, pitfalls,
good things vs bad things etc. :-)

>
> [BTW, our current documentation recommends resuming devices during
> system resume, actually, and gives a list of reasons why. :-)]

Yes, but that is too easy and, to me, not good enough. :-)

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19  8:33                   ` Ulf Hansson
@ 2017-10-19 17:21                     ` Grygorii Strashko
  2017-10-19 18:04                       ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Grygorii Strashko @ 2017-10-19 17:21 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones



On 10/19/2017 03:33 AM, Ulf Hansson wrote:
> On 18 October 2017 at 23:48, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
>>>
>>> On 10/18/2017 09:11 AM, Ulf Hansson wrote:
>>
>> [...]
>>
>>>>>> That's the point. We know pm_runtime_force_* works nicely for the
>>>>>> trivial middle-layer cases.
>>>>>
>>>>> In which cases the middle-layer callbacks don't exist, so it's just like
>>>>> reusing driver callbacks directly. :-)
>>>
>>> I'd like to ask you clarify one point here and provide some info which I hope can be useful -
>>> what's exactly means  "trivial middle-layer cases"?
>>>
>>> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
>>> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
>>> device framework struct dev_pm_domain omap_device_pm_domain
>>> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
>>> tegra_aconnect_pm_ops?
>>>
>>> if yes all above have PM runtime callbacks.
>>
>> Trivial ones don't actually do anything meaningful in their PM callbacks.
>>
>> Things like the platform bus type, spi bus type, i2c bus type and similar.
>>
>> If the middle-layer callbacks manipulate devices in a significant way, then
>> they aren't trivial.
> 
> I fully agree with Rafael's description above, but let me also clarify
> one more thing.
> 
> We have also been discussing PM domains as being trivial and
> non-trivial. In some statements I even think the PM domain has been a
> part the middle-layer terminology, which may have been a bit
> confusing.
> 
> In this regards as we consider genpd being a trivial PM domain, those
> examples your bring up above is too me also examples of trivial PM
> domains. Especially because they don't deal with wakeups, as that is
> taken care of by the drivers, right!?

Not directly. For example, the omap device framework implements a noirq
callback which forcibly disables all devices that are not PM runtime
suspended. While doing this, it calls the driver's PM .runtime_suspend(),
which may return a non-zero value, and in that case the device will be
left enabled (powered) at suspend for wakeup purposes (see
_od_suspend_noirq()).
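
As a rough illustration of that flow, here is a toy userspace model in C.
It is only a sketch, not the code in omap_device.c: struct od_model and all
od_model_* names are hypothetical, and -1 simply stands in for whatever
non-zero value a real driver callback might return.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for a device handled by the PM domain. */
struct od_model {
	bool rpm_suspended;   /* runtime PM status when system suspend starts */
	bool powered;         /* whether the PM domain keeps the device enabled */
	int (*runtime_suspend)(struct od_model *od);  /* driver callback */
};

/* Driver callback that agrees to be powered down. */
int od_model_drv_idle(struct od_model *od)
{
	(void)od;
	return 0;
}

/* Driver callback that refuses, e.g. because it must act as a wakeup source. */
int od_model_drv_busy(struct od_model *od)
{
	(void)od;
	return -1;
}

/* Models the noirq step: try to idle anything not yet runtime suspended. */
int od_model_suspend_noirq(struct od_model *od)
{
	assert(od != NULL && od->runtime_suspend != NULL);

	if (od->rpm_suspended)
		return 0;  /* already suspended, nothing to do */

	if (od->runtime_suspend(od) == 0) {
		od->rpm_suspended = true;
		od->powered = false;  /* the PM domain disables the device */
	}
	/* On a non-zero return the device stays powered across suspend. */
	return 0;
}
```

In this model a device whose callback is od_model_drv_busy() stays powered
after od_model_suspend_noirq(), so the driver's return value is the only
channel it has for telling the PM domain about wakeup.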


-- 
regards,
-grygorii

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 12:21                 ` Ulf Hansson
@ 2017-10-19 18:01                   ` Ulf Hansson
  2017-10-20  1:19                   ` Rafael J. Wysocki
  1 sibling, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-19 18:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

[...]

>>> > Say you want to leave the parent suspended after system resume, but the
>>> > child drivers use pm_runtime_force_suspend|resume().  The parent would then
>>> > need to use pm_runtime_force_suspend|resume() too, no?
>>>
>>> Actually no.
>>>
>>> Currently the other options of "deferring resume" (not using
>>> pm_runtime_force_*), is either using the "direct_complete" path or
>>> similar to the approach you took for the i2c designware driver.
>>>
>>> Both cases should play nicely in combination of a child being managed
>>> by pm_runtime_force_*. That's because only when the parent device is
>>> kept runtime suspended during system suspend, resuming can be
>>> deferred.
>>
>> And because the parent remains in runtime suspend late enough in the
>> system suspend path, its children also are guaranteed to be suspended.
>
> Yes.
>
>>
>> But then all of them need to be left in runtime suspend during system
>> resume too, which is somewhat restrictive, because some drivers may
>> want their devices to be resumed then.
>
> Actually, this scenario is also addressed when using the pm_runtime_force_*.
>
> The driver for the child would only need to bump the runtime PM usage
> count (pm_runtime_get_noresume()) before calling
> pm_runtime_force_suspend() at system suspend. That then also
> propagates to the parent, leading to that both the parent and the
> child will be resumed when pm_runtime_force_resume() is called for
> them.

I need to correct myself here. The above currently only works if the
child is runtime resumed while pm_runtime_force_suspend() is called.

The logic in pm_runtime_force_* needs to be improved to take care of
such scenarios. However I think that should be rather easy to fix, if
we want that.
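
To make the scenario and the caveat concrete, here is a toy userspace model
in C. It is not the kernel implementation: struct toy_dev and the toy_*
functions are hypothetical stand-ins, and the rule that "in use" means a
usage count above 1 (one reference is assumed to be held by the PM core
during system suspend) is an assumption made for illustration.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Toy stand-in for struct device; every name here is hypothetical. */
struct toy_dev {
	struct toy_dev *parent;
	int usage_count;   /* models the runtime PM usage counter */
	bool suspended;    /* models RPM_SUSPENDED vs RPM_ACTIVE */
};

/* Models pm_runtime_get_noresume(): take a reference without resuming. */
void toy_get_noresume(struct toy_dev *dev)
{
	assert(dev != NULL);
	dev->usage_count++;
}

/* Models the system-suspend side of the wrappers. */
void toy_force_suspend(struct toy_dev *dev)
{
	if (dev->suspended)
		return;  /* already runtime suspended: nothing propagates */

	if (dev->parent && dev->usage_count > 1)
		toy_get_noresume(dev->parent);  /* ask for the parent to resume too */

	dev->suspended = true;
}

/* Models the system-resume side: resume only if a reference is held. */
void toy_force_resume(struct toy_dev *dev)
{
	if (dev->suspended && dev->usage_count > 1)
		dev->suspended = false;
}

/* Runs one suspend/resume cycle; returns true if the parent ends up resumed. */
bool toy_cycle_parent_resumed(bool child_active)
{
	/* An active child implies an active parent; otherwise both are suspended. */
	struct toy_dev parent = { NULL, 1, !child_active };
	struct toy_dev child = { &parent, 1, !child_active };

	toy_get_noresume(&child);    /* driver wants to be resumed afterwards */
	toy_force_suspend(&child);   /* children suspend before parents */
	toy_force_suspend(&parent);
	toy_force_resume(&parent);   /* parents resume before children */
	toy_force_resume(&child);
	return !parent.suspended;
}
```

In this model toy_cycle_parent_resumed(true) returns true, while
toy_cycle_parent_resumed(false) returns false: the child comes back active
but its parent stays suspended, which is the case that would need the
improved logic.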

[...]

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 17:21                     ` Grygorii Strashko
@ 2017-10-19 18:04                       ` Ulf Hansson
  2017-10-19 18:11                         ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-19 18:04 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones

On 19 October 2017 at 19:21, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>
>
> On 10/19/2017 03:33 AM, Ulf Hansson wrote:
>> On 18 October 2017 at 23:48, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>> On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
>>>>
>>>> On 10/18/2017 09:11 AM, Ulf Hansson wrote:
>>>
>>> [...]
>>>
>>>>>>> That's the point. We know pm_runtime_force_* works nicely for the
>>>>>>> trivial middle-layer cases.
>>>>>>
>>>>>> In which cases the middle-layer callbacks don't exist, so it's just like
>>>>>> reusing driver callbacks directly. :-)
>>>>
>>>> I'd like to ask you clarify one point here and provide some info which I hope can be useful -
>>>> what's exactly means  "trivial middle-layer cases"?
>>>>
>>>> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
>>>> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
>>>> device framework struct dev_pm_domain omap_device_pm_domain
>>>> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
>>>> tegra_aconnect_pm_ops?
>>>>
>>>> if yes all above have PM runtime callbacks.
>>>
>>> Trivial ones don't actually do anything meaningful in their PM callbacks.
>>>
>>> Things like the platform bus type, spi bus type, i2c bus type and similar.
>>>
>>> If the middle-layer callbacks manipulate devices in a significant way, then
>>> they aren't trivial.
>>
>> I fully agree with Rafael's description above, but let me also clarify
>> one more thing.
>>
>> We have also been discussing PM domains as being trivial and
>> non-trivial. In some statements I even think the PM domain has been a
>> part the middle-layer terminology, which may have been a bit
>> confusing.
>>
>> In this regards as we consider genpd being a trivial PM domain, those
>> examples your bring up above is too me also examples of trivial PM
>> domains. Especially because they don't deal with wakeups, as that is
>> taken care of by the drivers, right!?
>
> Not directly, for example, omap device framework has noirq callback implemented
> which forcibly disable all devices which are not PM runtime suspended.
> while doing this it calls drivers PM .runtime_suspend() which may return
> non 0 value and in this case device will be left enabled (powered) at suspend for
> wake up purposes (see _od_suspend_noirq()).
>

Yeah, I had that feeling that omap has some trickiness going on. :-)

I'm sure that can be fixed in the omap PM domain, although

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 18:04                       ` Ulf Hansson
@ 2017-10-19 18:11                         ` Ulf Hansson
  2017-10-19 21:31                           ` Grygorii Strashko
  0 siblings, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-19 18:11 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones

On 19 October 2017 at 20:04, Ulf Hansson <ulf.hansson@linaro.org> wrote:
> On 19 October 2017 at 19:21, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>>
>>
>> On 10/19/2017 03:33 AM, Ulf Hansson wrote:
>>> On 18 October 2017 at 23:48, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>> On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
>>>>>
>>>>> On 10/18/2017 09:11 AM, Ulf Hansson wrote:
>>>>
>>>> [...]
>>>>
>>>>>>>> That's the point. We know pm_runtime_force_* works nicely for the
>>>>>>>> trivial middle-layer cases.
>>>>>>>
>>>>>>> In which cases the middle-layer callbacks don't exist, so it's just like
>>>>>>> reusing driver callbacks directly. :-)
>>>>>
>>>>> I'd like to ask you clarify one point here and provide some info which I hope can be useful -
>>>>> what's exactly means  "trivial middle-layer cases"?
>>>>>
>>>>> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
>>>>> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
>>>>> device framework struct dev_pm_domain omap_device_pm_domain
>>>>> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
>>>>> tegra_aconnect_pm_ops?
>>>>>
>>>>> if yes all above have PM runtime callbacks.
>>>>
>>>> Trivial ones don't actually do anything meaningful in their PM callbacks.
>>>>
>>>> Things like the platform bus type, spi bus type, i2c bus type and similar.
>>>>
>>>> If the middle-layer callbacks manipulate devices in a significant way, then
>>>> they aren't trivial.
>>>
>>> I fully agree with Rafael's description above, but let me also clarify
>>> one more thing.
>>>
>>> We have also been discussing PM domains as being trivial and
>>> non-trivial. In some statements I even think the PM domain has been a
>>> part the middle-layer terminology, which may have been a bit
>>> confusing.
>>>
>>> In this regards as we consider genpd being a trivial PM domain, those
>>> examples your bring up above is too me also examples of trivial PM
>>> domains. Especially because they don't deal with wakeups, as that is
>>> taken care of by the drivers, right!?
>>
>> Not directly, for example, omap device framework has noirq callback implemented
>> which forcibly disable all devices which are not PM runtime suspended.
>> while doing this it calls drivers PM .runtime_suspend() which may return
>> non 0 value and in this case device will be left enabled (powered) at suspend for
>> wake up purposes (see _od_suspend_noirq()).
>>
>
> Yeah, I had that feeling that omap has some trickyness going on. :-)
>
> I sure that can be fixed in the omap PM domain, although

...slipped with my fingers.. here is the rest of the reply...

..of course that requires us to use another way for drivers to signal
to the omap PM domain that it needs to stay powered in order to deal
with wakeup.

I can have a look at that more closely, to see if it makes sense to change.

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 18:11                         ` Ulf Hansson
@ 2017-10-19 21:31                           ` Grygorii Strashko
  2017-10-20  6:05                             ` Ulf Hansson
  0 siblings, 1 reply; 79+ messages in thread
From: Grygorii Strashko @ 2017-10-19 21:31 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones



On 10/19/2017 01:11 PM, Ulf Hansson wrote:
> On 19 October 2017 at 20:04, Ulf Hansson <ulf.hansson@linaro.org> wrote:
>> On 19 October 2017 at 19:21, Grygorii Strashko <grygorii.strashko@ti.com> wrote:
>>>
>>>
>>> On 10/19/2017 03:33 AM, Ulf Hansson wrote:
>>>> On 18 October 2017 at 23:48, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>>>>> On Wednesday, October 18, 2017 9:45:11 PM CEST Grygorii Strashko wrote:
>>>>>>
>>>>>> On 10/18/2017 09:11 AM, Ulf Hansson wrote:
>>>>>
>>>>> [...]
>>>>>
>>>>>>>>> That's the point. We know pm_runtime_force_* works nicely for the
>>>>>>>>> trivial middle-layer cases.
>>>>>>>>
>>>>>>>> In which cases the middle-layer callbacks don't exist, so it's just like
>>>>>>>> reusing driver callbacks directly. :-)
>>>>>>
>>>>>> I'd like to ask you clarify one point here and provide some info which I hope can be useful -
>>>>>> what's exactly means  "trivial middle-layer cases"?
>>>>>>
>>>>>> Is it when systems use "drivers/base/power/clock_ops.c - Generic clock
>>>>>> manipulation PM callbacks" as dev_pm_domain (arm davinci/keystone), or OMAP
>>>>>> device framework struct dev_pm_domain omap_device_pm_domain
>>>>>> (arm/mach-omap2/omap_device.c) or static const struct dev_pm_ops
>>>>>> tegra_aconnect_pm_ops?
>>>>>>
>>>>>> if yes all above have PM runtime callbacks.
>>>>>
>>>>> Trivial ones don't actually do anything meaningful in their PM callbacks.
>>>>>
>>>>> Things like the platform bus type, spi bus type, i2c bus type and similar.
>>>>>
>>>>> If the middle-layer callbacks manipulate devices in a significant way, then
>>>>> they aren't trivial.
>>>>
>>>> I fully agree with Rafael's description above, but let me also clarify
>>>> one more thing.
>>>>
>>>> We have also been discussing PM domains as being trivial and
>>>> non-trivial. In some statements I even think the PM domain has been a
>>>> part the middle-layer terminology, which may have been a bit
>>>> confusing.
>>>>
>>>> In this regards as we consider genpd being a trivial PM domain, those
>>>> examples your bring up above is too me also examples of trivial PM
>>>> domains. Especially because they don't deal with wakeups, as that is
>>>> taken care of by the drivers, right!?
>>>
>>> Not directly, for example, omap device framework has noirq callback implemented
>>> which forcibly disable all devices which are not PM runtime suspended.
>>> while doing this it calls drivers PM .runtime_suspend() which may return
>>> non 0 value and in this case device will be left enabled (powered) at suspend for
>>> wake up purposes (see _od_suspend_noirq()).
>>>
>>
>> Yeah, I had that feeling that omap has some trickyness going on. :-)
>>
>> I sure that can be fixed in the omap PM domain, although
> 
> ...slipped with my fingers.. here is the rest of the reply...
> 
> ..of course that require us to use another way for drivers to signal
> to the omap PM domain that it needs to stay powered as to deal with
> wakeup.
> 
> I can have a look at that more closely, to see if it makes sense to change.
> 

Also, an additional note here: some IPs are reused between
OMAP/Davinci/Keystone. The OMAP PM domain has some code running at noirq
time to deal with devices left in the PM runtime enabled state (OMAP is
PM runtime centric), while Davinci/Keystone (clock_ops.c) don't, so the
pm_runtime_force_* API is actually a possibility now to make the same
driver work on all these platforms.

-- 
regards,
-grygorii

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 12:21                 ` Ulf Hansson
  2017-10-19 18:01                   ` Ulf Hansson
@ 2017-10-20  1:19                   ` Rafael J. Wysocki
  2017-10-20  5:57                     ` Ulf Hansson
  1 sibling, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-20  1:19 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Thursday, October 19, 2017 2:21:07 PM CEST Ulf Hansson wrote:
> On 19 October 2017 at 00:12, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > On Wednesday, October 18, 2017 4:11:33 PM CEST Ulf Hansson wrote:
> >> [...]
> >>
> >> >>
> >> >> The reason why pm_runtime_force_* needs to respects the hierarchy of
> >> >> the RPM callbacks, is because otherwise it can't safely update the
> >> >> runtime PM status of the device.
> >> >
> >> > I'm not sure I follow this requirement.  Why is that so?
> >>
> >> If the PM domain controls some resources for the device in its RPM
> >> callbacks and the driver controls some other resources in its RPM
> >> callbacks - then these resources needs to be managed together.
> >
> > Right, but that doesn't automatically make it necessary to use runtime PM
> > callbacks in the middle layer.  Its system-wide PM callbacks may be
> > suitable for that just fine.
> >
> > That is, at least in some cases, you can combine ->runtime_suspend from a
> > driver and ->suspend_late from a middle layer with no problems, for example.
> >
> > That's why some middle layers allow drivers to point ->suspend_late and
> > ->runtime_suspend to the same routine if they want to reuse that code.
> >
> >> This follows the behavior of when a regular call to
> >> pm_runtime_get|put(), triggers the RPM callbacks to be invoked.
> >
> > But (a) it doesn't have to follow it and (b) in some cases it should not
> > follow it.
> 
> Of course you don't explicitly *have to* respect the hierarchy of the
> RPM callbacks in pm_runtime_force_*.
> 
> However, changing that would require some additional information
> exchange between the driver and the middle-layer/PM domain, as to
> instruct the middle-layer/PM domain of what to do during system-wide
> PM. Especially in cases when the driver deals with wakeup, as in those
> cases the instructions may change dynamically.

Well, if wakeup matters, drivers can't simply point their PM callbacks
to pm_runtime_force_* anyway.

If the driver itself deals with wakeups, it clearly needs different callback
routines for system-wide PM and for runtime PM, so it can't reuse its runtime
PM callbacks at all then.

If the middle layer deals with wakeups, different callbacks are needed at
that level and so pm_runtime_force_* are unsuitable too.

Really, invoking runtime PM callbacks from the middle layer in
pm_runtime_force_* is not a good idea at all and there's no general
requirement for it whatsoever.

> [...]
> 
> >> > In general, not if the wakeup settings are adjusted by the middle layer.
> >>
> >> Correct!
> >>
> >> To use pm_runtime_force* for these cases, one would need some
> >> additional information exchange between the driver and the
> >> middle-layer.
> >
> > Which pretty much defeats the purpose of the wrappers, doesn't it?
> 
> Well, no matter if the wrappers are used or not, we need some kind of
> information exchange between the driver and the middle-layers/PM
> domains.

Right.

But if that information is exchanged, then why use wrappers *in* *addition*
to that?

> Anyway, me personally think it's too early to conclude that using the
> wrappers may not be useful going forward. At this point, they clearly
> helps trivial cases to remain being trivial.

I'm not sure about that really.  So far I've seen more complexity resulting
from using them than being avoided by using them, but I guess the beauty is
in the eye of the beholder. :-)

> >
> >> >
> >> >> Regarding hibernation, honestly that's not really my area of
> >> >> expertise. Although, I assume the middle-layer and driver can treat
> >> >> that as a separate case, so if it's not suitable to use
> >> >> pm_runtime_force* for that case, then they shouldn't do it.
> >> >
> >> > Well, agreed.
> >> >
> >> > In some simple cases, though, driver callbacks can be reused for hibernation
> >> > too, so it would be good to have a common way to do that too, IMO.
> >>
> >> Okay, that makes sense!
> >>
> >> >
> >> >> >
> >> >> > Also, quite so often other middle layers interact with PCI directly or
> >> >> > indirectly (eg. a platform device may be a child or a consumer of a PCI
> >> >> > device) and some optimizations need to take that into account (eg. parents
> >> >> > generally need to be accessible when their childres are resumed and so on).
> >> >>
> >> >> A device's parent becomes informed when changing the runtime PM status
> >> >> of the device via pm_runtime_force_suspend|resume(), as those calls
> >> >> pm_runtime_set_suspended|active().
> >> >
> >> > This requires the parent driver or middle layer to look at the reference
> >> > counter and understand it the same way as pm_runtime_force_*.
> >> >
> >> >> In case that isn't that sufficient, what else is needed? Perhaps you can
> >> >> point me to an example so I can understand better?
> >> >
> >> > Say you want to leave the parent suspended after system resume, but the
> >> > child drivers use pm_runtime_force_suspend|resume().  The parent would then
> >> > need to use pm_runtime_force_suspend|resume() too, no?
> >>
> >> Actually no.
> >>
> >> Currently the other options of "deferring resume" (not using
> >> pm_runtime_force_*), is either using the "direct_complete" path or
> >> similar to the approach you took for the i2c designware driver.
> >>
> >> Both cases should play nicely in combination of a child being managed
> >> by pm_runtime_force_*. That's because only when the parent device is
> >> kept runtime suspended during system suspend, resuming can be
> >> deferred.
> >
> > And because the parent remains in runtime suspend late enough in the
> > system suspend path, its children also are guaranteed to be suspended.
> 
> Yes.
> 
> >
> > But then all of them need to be left in runtime suspend during system
> > resume too, which is somewhat restrictive, because some drivers may
> > want their devices to be resumed then.
> 
> Actually, this scenario is also addressed when using the pm_runtime_force_*.
> 
> The driver for the child would only need to bump the runtime PM usage
> count (pm_runtime_get_noresume()) before calling
> pm_runtime_force_suspend() at system suspend. That then also
> propagates to the parent, leading to that both the parent and the
> child will be resumed when pm_runtime_force_resume() is called for
> them.
> 
> Of course, if the driver of the parent isn't using pm_runtime_force_,
> we would have to assume that it's always being resumed at system
> resume.

There may be other ways to avoid that, though.

BTW, I don't quite like using the RPM usage counter this way either, if
that hasn't been clear so far.

> As at matter of fact, doesn't this scenario actually indicates that we
> do need to involve the runtime PM core (updating RPM status according
> to the HW state even during system-wide PM) to really get this right.
> It's not enough to only use "driver PM flags"!?

I'm not sure what you are talking about.

For all devices with enabled runtime PM any state produced by system
suspend/resume has to be labeled either as RPM_SUSPENDED or as RPM_ACTIVE.
That has always been the case and hasn't involved any magic.

However, while runtime PM is disabled, the state of the device doesn't
need to be reflected by its RPM status and there's no need to track it then.
Moreover, in some cases it cannot even be tracked, because of the firmware
involvement (and we cannot track the firmware).

Besides, please really look at what happens in the patches I posted and
then we can talk.

> Seems like we need to create a list of all requirements, pitfalls,
> good things vs bad things etc. :-)

We surely need to know what general cases need to be addressed.

> >
> > [BTW, our current documentation recommends resuming devices during
> > system resume, actually, and gives a list of reasons why. :-)]
> 
> Yes, but that too easy and to me not good enough. :-)

But the list of reasons why is kind of valid still.  There may be better
reasons for not doing that, but it really is a tradeoff and drivers
should be able to decide which way they want to go.

IOW, the "leave the device in runtime suspend throughout system
suspend" optimization doesn't have to be bundled with the "leave the
device in suspend throughout and after system resume" one.

Thanks,
Rafael

^ permalink raw reply	[flat|nested] 79+ messages in thread

* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-20  1:19                   ` Rafael J. Wysocki
@ 2017-10-20  5:57                     ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-20  5:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 20 October 2017 at 03:19, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> On Thursday, October 19, 2017 2:21:07 PM CEST Ulf Hansson wrote:
>> On 19 October 2017 at 00:12, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
>> > On Wednesday, October 18, 2017 4:11:33 PM CEST Ulf Hansson wrote:
>> >> [...]
>> >>
>> >> >>
>> >> >> The reason why pm_runtime_force_* needs to respects the hierarchy of
>> >> >> the RPM callbacks, is because otherwise it can't safely update the
>> >> >> runtime PM status of the device.
>> >> >
>> >> > I'm not sure I follow this requirement.  Why is that so?
>> >>
>> >> If the PM domain controls some resources for the device in its RPM
>> >> callbacks and the driver controls some other resources in its RPM
>> >> callbacks - then these resources needs to be managed together.
>> >
>> > Right, but that doesn't automatically make it necessary to use runtime PM
>> > callbacks in the middle layer.  Its system-wide PM callbacks may be
>> > suitable for that just fine.
>> >
>> > That is, at least in some cases, you can combine ->runtime_suspend from a
>> > driver and ->suspend_late from a middle layer with no problems, for example.
>> >
>> > That's why some middle layers allow drivers to point ->suspend_late and
>> > ->runtime_suspend to the same routine if they want to reuse that code.
>> >
>> >> This follows the behavior of when a regular call to
>> >> pm_runtime_get|put(), triggers the RPM callbacks to be invoked.
>> >
>> > But (a) it doesn't have to follow it and (b) in some cases it should not
>> > follow it.
>>
>> Of course you don't explicitly *have to* respect the hierarchy of the
>> RPM callbacks in pm_runtime_force_*.
>>
>> However, changing that would require some additional information
>> exchange between the driver and the middle-layer/PM domain, as to
>> instruct the middle-layer/PM domain of what to do during system-wide
>> PM. Especially in cases when the driver deals with wakeup, as in those
>> cases the instructions may change dynamically.
>
> Well, if wakeup matters, drivers can't simply point their PM callbacks
> to pm_runtime_force_* anyway.
>
> If the driver itself deals with wakeups, it clearly needs different callback
> routines for system-wide PM and for runtime PM, so it can't reuse its runtime
> PM callbacks at all then.

It can still re-use its runtime PM callbacks, simply by calling
pm_runtime_force_* from its system sleep callbacks.

Drivers already do that today, not only to deal with wakeups, but
generally when they need to deal with some additional operations.

>
> If the middle layer deals with wakeups, different callbacks are needed at
> that level and so pm_runtime_force_* are unsuitable too.
>
> Really, invoking runtime PM callbacks from the middle layer in
> pm_runtime_force_* is not a good idea at all and there's no general
> requirement for it whatever.
>
>> [...]
>>
>> >> > In general, not if the wakeup settings are adjusted by the middle layer.
>> >>
>> >> Correct!
>> >>
>> >> To use pm_runtime_force* for these cases, one would need some
>> >> additional information exchange between the driver and the
>> >> middle-layer.
>> >
>> > Which pretty much defeats the purpose of the wrappers, doesn't it?
>>
>> Well, no matter if the wrappers are used or not, we need some kind of
>> information exchange between the driver and the middle-layers/PM
>> domains.
>
> Right.
>
> But if that information is exchanged, then why use wrappers *in* *addition*
> to that?

If we can find a different method that both avoids open coding and
offers the optimized system-wide PM path at resume, I am open to
that.

>
>> Anyway, I personally think it's too early to conclude that using the
>> wrappers may not be useful going forward. At this point, they clearly
>> help trivial cases remain trivial.
>
> I'm not sure about that really.  So far I've seen more complexity resulting
> from using them than being avoided by using them, but I guess the beauty is
> in the eye of the beholder. :-)

Hehe, yeah you may be right. :-)

>
>> >
>> >> >
>> >> >> Regarding hibernation, honestly that's not really my area of
>> >> >> expertise. Although, I assume the middle-layer and driver can treat
>> >> >> that as a separate case, so if it's not suitable to use
>> >> >> pm_runtime_force* for that case, then they shouldn't do it.
>> >> >
>> >> > Well, agreed.
>> >> >
>> >> > In some simple cases, though, driver callbacks can be reused for hibernation
>> >> > too, so it would be good to have a common way to do that too, IMO.
>> >>
>> >> Okay, that makes sense!
>> >>
>> >> >
>> >> >> >
>> >> >> > Also, quite so often other middle layers interact with PCI directly or
>> >> >> > indirectly (eg. a platform device may be a child or a consumer of a PCI
>> >> >> > device) and some optimizations need to take that into account (eg. parents
>> >> >> > generally need to be accessible when their children are resumed and so on).
>> >> >>
>> >> >> A device's parent becomes informed when changing the runtime PM status
>> >> >> of the device via pm_runtime_force_suspend|resume(), as those call
>> >> >> pm_runtime_set_suspended|active().
>> >> >
>> >> > This requires the parent driver or middle layer to look at the reference
>> >> > counter and understand it the same way as pm_runtime_force_*.
>> >> >
>> >> >> In case that isn't sufficient, what else is needed? Perhaps you can
>> >> >> point me to an example so I can understand better?
>> >> >
>> >> > Say you want to leave the parent suspended after system resume, but the
>> >> > child drivers use pm_runtime_force_suspend|resume().  The parent would then
>> >> > need to use pm_runtime_force_suspend|resume() too, no?
>> >>
>> >> Actually no.
>> >>
>> >> Currently the other options for "deferring resume" (not using
>> >> pm_runtime_force_*) are either using the "direct_complete" path or
>> >> something similar to the approach you took for the i2c designware driver.
>> >>
>> >> Both cases should play nicely in combination with a child being managed
>> >> by pm_runtime_force_*. That's because only when the parent device is
>> >> kept runtime suspended during system suspend, resuming can be
>> >> deferred.
>> >
>> > And because the parent remains in runtime suspend late enough in the
>> > system suspend path, its children also are guaranteed to be suspended.
>>
>> Yes.
>>
>> >
>> > But then all of them need to be left in runtime suspend during system
>> > resume too, which is somewhat restrictive, because some drivers may
>> > want their devices to be resumed then.
>>
>> Actually, this scenario is also addressed when using the pm_runtime_force_*.
>>
>> The driver for the child would only need to bump the runtime PM usage
>> count (pm_runtime_get_noresume()) before calling
>> pm_runtime_force_suspend() at system suspend. That then also
>> propagates to the parent, so that both the parent and the
>> child will be resumed when pm_runtime_force_resume() is called for
>> them.
>>
>> Of course, if the driver of the parent isn't using pm_runtime_force_*,
>> we would have to assume that it's always being resumed at system
>> resume.
>
> There may be other ways to avoid that, though.
>
> BTW, I don't quite like using the RPM usage counter this way either, if
> that hasn't been clear so far.
>
>> As a matter of fact, doesn't this scenario actually indicate that we
>> do need to involve the runtime PM core (updating RPM status according
>> to the HW state even during system-wide PM) to really get this right?
>> It's not enough to only use "driver PM flags"!?
>
> I'm not sure what you are talking about.
>
> For all devices with enabled runtime PM any state produced by system
> suspend/resume has to be labeled either as RPM_SUSPENDED or as RPM_ACTIVE.
> That has always been the case and hasn't involved any magic.
>
> However, while runtime PM is disabled, the state of the device doesn't
> need to be reflected by its RPM status and there's no need to track it then.
> Moreover, in some cases it cannot be tracked even, because of the firmware
> involvement (and we cannot track the firmware).
>
> Besides, please really look at what happens in the patches I posted and
> then we can talk.

Yes, I will have a look.

>
>> Seems like we need to create a list of all requirements, pitfalls,
>> good things vs bad things etc. :-)
>
> We surely need to know what general cases need to be addressed.
>
>> >
>> > [BTW, our current documentation recommends resuming devices during
>> > system resume, actually, and gives a list of reasons why. :-)]
>>
>> Yes, but that's too easy and to me not good enough. :-)
>
> But the list of reasons why is kind of valid still.  There may be better
> reasons for not doing that, but it really is a tradeoff and drivers
> should be able to decide which way they want to go.

Agree.

>
> IOW, the "leave the device in runtime suspend throughout system
> suspend" optimization doesn't have to be bundled with the "leave the
> device in suspend throughout and after system resume" one.

Agree.

Kind regards
Uffe


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-19 21:31                           ` Grygorii Strashko
@ 2017-10-20  6:05                             ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-20  6:05 UTC (permalink / raw)
  To: Grygorii Strashko
  Cc: Linux PM, Rafael J. Wysocki, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Andy Shevchenko,
	Kevin Hilman, Wolfram Sang, linux-i2c@vger.kernel.org, Lee Jones

[...]

>>>>> In this regard, as we consider genpd to be a trivial PM domain, the
>>>>> examples you bring up above are, to me, also examples of trivial PM
>>>>> domains. Especially because they don't deal with wakeups, as that is
>>>>> taken care of by the drivers, right!?
>>>>
>>>> Not directly. For example, the omap device framework has a noirq callback
>>>> implemented which forcibly disables all devices that are not PM runtime
>>>> suspended. While doing this, it calls the driver's PM .runtime_suspend(),
>>>> which may return a non-zero value, and in that case the device will be left
>>>> enabled (powered) at suspend for wakeup purposes (see _od_suspend_noirq()).
>>>>
>>>
>>> Yeah, I had that feeling that omap has some trickiness going on. :-)
>>>
>>> I'm sure that can be fixed in the omap PM domain, although
>>
>> ...slipped with my fingers.. here is the rest of the reply...
>>
>> ..of course that requires us to use another way for drivers to signal
>> to the omap PM domain that it needs to stay powered so as to deal with
>> wakeup.
>>
>> I can have a look at that more closely, to see if it makes sense to change.
>>
>
> Also, an additional note here: some IPs are reused between OMAP/Davinci/Keystone.
> The OMAP PM domain has some code running at noirq time to deal with devices left
> in the PM runtime enabled state (OMAP is PM runtime centric), while
> Davinci/Keystone don't (clock_ops.c), so the pm_runtime_force_* API is actually
> a possibility now to make the same driver work on all these platforms.

That sounds great!

Also, in the end it would be nice to also convert the OMAP PM domain
to genpd. I think most of the needed infrastructure is already there
to do that.

Kind regards
Uffe


* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-19  7:33     ` Greg Kroah-Hartman
@ 2017-10-20 11:11       ` Rafael J. Wysocki
  2017-10-20 11:35         ` Greg Kroah-Hartman
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-20 11:11 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux PM, Lukas Wunner, Bjorn Helgaas, Alan Stern, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > The motivation for this change is to provide a way to work around
> > a problem with the direct-complete mechanism used for avoiding
> > system suspend/resume handling for devices in runtime suspend.
> > 
> > The problem is that some middle layer code (the PCI bus type and
> > the ACPI PM domain in particular) returns positive values from its
> > system suspend ->prepare callbacks regardless of whether the driver's
> > ->prepare returns a positive value or 0, which effectively prevents
> > drivers from being able to control the direct-complete feature.
> > Some drivers need that control, however, and the PCI bus type has
> > grown its own flag to deal with this issue, but since it is not
> > limited to PCI, it is better to address it by adding driver flags at
> > the core level.
> > 
> > To that end, add a driver_flags field to struct dev_pm_info for flags
> > that can be set by device drivers at the probe time to inform the PM
> > core and/or bus types, PM domains and so on on the capabilities and/or
> > preferences of device drivers.  Also add two static inline helpers
> > for setting that field and testing it against a given set of flags
> > and make the driver core clear it automatically on driver remove
> > and probe failures.
> > 
> > Define and document two PM driver flags related to the direct-
> > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > respectively, to indicate to the PM core that the direct-complete
> > mechanism should never be used for the device and to inform the
> > middle layer code (bus types, PM domains etc) that it can only
> > request the PM core to use the direct-complete mechanism for
> > the device (by returning a positive value from its ->prepare
> > callback) if it also has been requested by the driver.
> > 
> > While at it, make the core check pm_runtime_suspended() when
> > setting power.direct_complete so that it doesn't need to be
> > checked by ->prepare callbacks.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

Thanks!

Does it also apply to the other patches in the series?

I'd like to queue up the core patches for 4.15 as they are specifically
designed to only affect the drivers that actually set the flags, so there
shouldn't be any regression resulting from them, and I'd like to be
able to start using the flags in drivers going forward.
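
To make the "only affects drivers that set the flags" point concrete, here is
a rough userspace model of the helpers and of the direct_complete decision
from patch 01/12. It is only a sketch: struct model_dev, the model_* names and
the runtime_suspended field are hypothetical stand-ins for struct device,
the dev_pm_*_driver_flags() helpers and pm_runtime_suspended(); only the two
flag bits are taken from the patch itself.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bits as defined by the patch in include/linux/pm.h. */
#define DPM_FLAG_NEVER_SKIP    (1u << 0)
#define DPM_FLAG_SMART_PREPARE (1u << 1)

/* Hypothetical stand-in for the parts of struct device used below. */
struct model_dev {
	uint32_t driver_flags;  /* models dev->power.driver_flags */
	bool runtime_suspended; /* models pm_runtime_suspended(dev) */
};

/* Models dev_pm_set_driver_flags(): overwrites the whole field. */
static inline void model_set_driver_flags(struct model_dev *dev, uint32_t flags)
{
	assert(dev);
	dev->driver_flags = flags;
}

/* Models dev_pm_test_driver_flags(): true if any of @flags is set. */
static inline bool model_test_driver_flags(const struct model_dev *dev,
					   uint32_t flags)
{
	return !!(dev->driver_flags & flags);
}

/*
 * Models the new assignment to power.direct_complete in device_prepare().
 * @prepare_ret: value returned by the ->prepare callback that was run
 * @is_suspend:  models state.event == PM_EVENT_SUSPEND
 */
static bool model_direct_complete(const struct model_dev *dev,
				  int prepare_ret, bool is_suspend)
{
	return is_suspend && dev->runtime_suspended && prepare_ret > 0 &&
	       !model_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
}
```

With driver_flags left at 0 the result depends only on the ->prepare return
value, the transition type and the runtime PM status; NEVER_SKIP is the only
bit that changes the outcome in this model.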

Thanks,
Rafael


* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-20 11:35         ` Greg Kroah-Hartman
@ 2017-10-20 11:28           ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-20 11:28 UTC (permalink / raw)
  To: Greg Kroah-Hartman
  Cc: Linux PM, Lukas Wunner, Bjorn Helgaas, Alan Stern, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Friday, October 20, 2017 1:35:27 PM CEST Greg Kroah-Hartman wrote:
> On Fri, Oct 20, 2017 at 01:11:22PM +0200, Rafael J. Wysocki wrote:
> > On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> > > On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > > 
> > > > The motivation for this change is to provide a way to work around
> > > > a problem with the direct-complete mechanism used for avoiding
> > > > system suspend/resume handling for devices in runtime suspend.
> > > > 
> > > > The problem is that some middle layer code (the PCI bus type and
> > > > the ACPI PM domain in particular) returns positive values from its
> > > > system suspend ->prepare callbacks regardless of whether the driver's
> > > > ->prepare returns a positive value or 0, which effectively prevents
> > > > drivers from being able to control the direct-complete feature.
> > > > Some drivers need that control, however, and the PCI bus type has
> > > > grown its own flag to deal with this issue, but since it is not
> > > > limited to PCI, it is better to address it by adding driver flags at
> > > > the core level.
> > > > 
> > > > To that end, add a driver_flags field to struct dev_pm_info for flags
> > > > that can be set by device drivers at the probe time to inform the PM
> > > > core and/or bus types, PM domains and so on on the capabilities and/or
> > > > preferences of device drivers.  Also add two static inline helpers
> > > > for setting that field and testing it against a given set of flags
> > > > and make the driver core clear it automatically on driver remove
> > > > and probe failures.
> > > > 
> > > > Define and document two PM driver flags related to the direct-
> > > > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > > > respectively, to indicate to the PM core that the direct-complete
> > > > mechanism should never be used for the device and to inform the
> > > > middle layer code (bus types, PM domains etc) that it can only
> > > > request the PM core to use the direct-complete mechanism for
> > > > the device (by returning a positive value from its ->prepare
> > > > callback) if it also has been requested by the driver.
> > > > 
> > > > While at it, make the core check pm_runtime_suspended() when
> > > > setting power.direct_complete so that it doesn't need to be
> > > > checked by ->prepare callbacks.
> > > > 
> > > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> > 
> > Thanks!
> > 
> > Does it also apply to the other patches in the series?
> > 
> > I'd like to queue up the core patches for 4.15 as they are specifically
> > designed to only affect the drivers that actually set the flags, so there
> > shouldn't be any regression resulting from them, and I'd like to be
> > able to start using the flags in drivers going forward.
> 
> Yes, sorry, I thought I acked them, but you are right, I didn't:
> 
> Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> for all of them please.

Thanks!



* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-20 11:11       ` Rafael J. Wysocki
@ 2017-10-20 11:35         ` Greg Kroah-Hartman
  2017-10-20 11:28           ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Greg Kroah-Hartman @ 2017-10-20 11:35 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Lukas Wunner, Bjorn Helgaas, Alan Stern, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Fri, Oct 20, 2017 at 01:11:22PM +0200, Rafael J. Wysocki wrote:
> On Thursday, October 19, 2017 9:33:15 AM CEST Greg Kroah-Hartman wrote:
> > On Thu, Oct 19, 2017 at 01:17:31AM +0200, Rafael J. Wysocki wrote:
> > > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > > 
> > > The motivation for this change is to provide a way to work around
> > > a problem with the direct-complete mechanism used for avoiding
> > > system suspend/resume handling for devices in runtime suspend.
> > > 
> > > The problem is that some middle layer code (the PCI bus type and
> > > the ACPI PM domain in particular) returns positive values from its
> > > system suspend ->prepare callbacks regardless of whether the driver's
> > > ->prepare returns a positive value or 0, which effectively prevents
> > > drivers from being able to control the direct-complete feature.
> > > Some drivers need that control, however, and the PCI bus type has
> > > grown its own flag to deal with this issue, but since it is not
> > > limited to PCI, it is better to address it by adding driver flags at
> > > the core level.
> > > 
> > > To that end, add a driver_flags field to struct dev_pm_info for flags
> > > that can be set by device drivers at the probe time to inform the PM
> > > core and/or bus types, PM domains and so on on the capabilities and/or
> > > preferences of device drivers.  Also add two static inline helpers
> > > for setting that field and testing it against a given set of flags
> > > and make the driver core clear it automatically on driver remove
> > > and probe failures.
> > > 
> > > Define and document two PM driver flags related to the direct-
> > > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > > respectively, to indicate to the PM core that the direct-complete
> > > mechanism should never be used for the device and to inform the
> > > middle layer code (bus types, PM domains etc) that it can only
> > > request the PM core to use the direct-complete mechanism for
> > > the device (by returning a positive value from its ->prepare
> > > callback) if it also has been requested by the driver.
> > > 
> > > While at it, make the core check pm_runtime_suspended() when
> > > setting power.direct_complete so that it doesn't need to be
> > > checked by ->prepare callbacks.
> > > 
> > > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>
> 
> Thanks!
> 
> Does it also apply to the other patches in the series?
> 
> I'd like to queue up the core patches for 4.15 as they are specifically
> designed to only affect the drivers that actually set the flags, so there
> shouldn't be any regression resulting from them, and I'd like to be
> able to start using the flags in drivers going forward.

Yes, sorry, I thought I acked them, but you are right, I didn't:

Acked-by: Greg Kroah-Hartman <gregkh@linuxfoundation.org>

for all of them please.

thanks,

greg k-h


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
                   ` (13 preceding siblings ...)
  2017-10-17  8:36 ` Ulf Hansson
@ 2017-10-20 20:46 ` Bjorn Helgaas
  2017-10-21  1:04   ` Rafael J. Wysocki
  14 siblings, 1 reply; 79+ messages in thread
From: Bjorn Helgaas @ 2017-10-20 20:46 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
> Hi All,
> 
> Well, this took more time than expected, as I tried to cover everything I had
> in mind regarding PM flags for drivers.

For the parts that touch PCI,

Acked-by: Bjorn Helgaas <bhelgaas@google.com>

I doubt there'll be conflicts with changes in my tree, but let me know if
you trip over any so I can watch for them when merging.

Bjorn


* Re: [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume
  2017-10-20 20:46 ` Bjorn Helgaas
@ 2017-10-21  1:04   ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-21  1:04 UTC (permalink / raw)
  To: Bjorn Helgaas
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c, Lee Jones

On Friday, October 20, 2017 10:46:07 PM CEST Bjorn Helgaas wrote:
> On Mon, Oct 16, 2017 at 03:12:35AM +0200, Rafael J. Wysocki wrote:
> > Hi All,
> > 
> > Well, this took more time than expected, as I tried to cover everything I had
> > in mind regarding PM flags for drivers.
> 
> For the parts that touch PCI,
> 
> Acked-by: Bjorn Helgaas <bhelgaas@google.com>

Thank you!

> I doubt there'll be conflicts with changes in my tree, but let me know if
> you trip over any so I can watch for them when merging.

Well, if there are any conflicts, we'll see them in linux-next I guess. :-)

Thanks,
Rafael



* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-18 23:17   ` [Update][PATCH v2 " Rafael J. Wysocki
  2017-10-19  7:33     ` Greg Kroah-Hartman
@ 2017-10-23 16:37     ` Ulf Hansson
  2017-10-23 20:41       ` Rafael J. Wysocki
  1 sibling, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 16:37 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Greg Kroah-Hartman, Lukas Wunner, Bjorn Helgaas,
	Alan Stern, LKML, Linux ACPI, Linux PCI, Linux Documentation,
	Mika Westerberg, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 19 October 2017 at 01:17, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> The motivation for this change is to provide a way to work around
> a problem with the direct-complete mechanism used for avoiding
> system suspend/resume handling for devices in runtime suspend.
>
> The problem is that some middle layer code (the PCI bus type and
> the ACPI PM domain in particular) returns positive values from its
> system suspend ->prepare callbacks regardless of whether the driver's
> ->prepare returns a positive value or 0, which effectively prevents
> drivers from being able to control the direct-complete feature.
> Some drivers need that control, however, and the PCI bus type has
> grown its own flag to deal with this issue, but since it is not
> limited to PCI, it is better to address it by adding driver flags at
> the core level.
>
> To that end, add a driver_flags field to struct dev_pm_info for flags
> that can be set by device drivers at the probe time to inform the PM
> core and/or bus types, PM domains and so on on the capabilities and/or
> preferences of device drivers.  Also add two static inline helpers
> for setting that field and testing it against a given set of flags
> and make the driver core clear it automatically on driver remove
> and probe failures.
>
> Define and document two PM driver flags related to the direct-
> complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> respectively, to indicate to the PM core that the direct-complete
> mechanism should never be used for the device and to inform the
> middle layer code (bus types, PM domains etc) that it can only
> request the PM core to use the direct-complete mechanism for
> the device (by returning a positive value from its ->prepare
> callback) if it also has been requested by the driver.
>
> While at it, make the core check pm_runtime_suspended() when
> setting power.direct_complete so that it doesn't need to be
> checked by ->prepare callbacks.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>
> -> v2: Change the data type for driver_flags to u32 as suggested by Greg
>        and fix a couple of documentation typos pointed out by Lukas.
>
> ---
>  Documentation/driver-api/pm/devices.rst |   14 ++++++++++++++
>  Documentation/power/pci.txt             |   19 +++++++++++++++++++
>  drivers/acpi/device_pm.c                |    3 +++
>  drivers/base/dd.c                       |    2 ++
>  drivers/base/power/main.c               |    4 +++-
>  drivers/pci/pci-driver.c                |    5 ++++-
>  include/linux/device.h                  |   10 ++++++++++
>  include/linux/pm.h                      |   20 ++++++++++++++++++++
>  8 files changed, 75 insertions(+), 2 deletions(-)
>
> Index: linux-pm/include/linux/device.h
> ===================================================================
> --- linux-pm.orig/include/linux/device.h
> +++ linux-pm/include/linux/device.h
> @@ -1070,6 +1070,16 @@ static inline void dev_pm_syscore_device
>  #endif
>  }
>
> +static inline void dev_pm_set_driver_flags(struct device *dev, u32 flags)
> +{
> +       dev->power.driver_flags = flags;
> +}
> +
> +static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
> +{
> +       return !!(dev->power.driver_flags & flags);
> +}
> +
>  static inline void device_lock(struct device *dev)
>  {
>         mutex_lock(&dev->mutex);
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -550,6 +550,25 @@ struct pm_subsys_data {
>  #endif
>  };
>
> +/*
> + * Driver flags to control system suspend/resume behavior.
> + *
> + * These flags can be set by device drivers at the probe time.  They need not be
> + * cleared by the drivers as the driver core will take care of that.
> + *
> + * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
> + * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
> + *
> + * Setting SMART_PREPARE instructs bus types and PM domains which may want
> + * system suspend/resume callbacks to be skipped for the device to return 0 from
> + * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
> + * other words, the system suspend/resume callbacks can only be skipped for the
> + * device if its driver doesn't object against that).  This flag has no effect
> + * if NEVER_SKIP is set.

In principle, the ACPI/PCI middle layer/PM domain could have started out by
respecting the return values from drivers' ->prepare() callbacks in
case those existed, but they didn't, and that is the reason why
SMART_PREPARE is needed. Right?

My point is, I don't think we should encourage other middle layers to
support the SMART_PREPARE flag, simply because they should be able to
cope without it. To make this more obvious we could try to find a
different name for the flag indicating that, or at least make it clear
in the documentation that we don't want it to be used by anything other
than ACPI/PCI.
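
For reference, the behavior in question can be modelled in userspace roughly
as below. This is a sketch following the pci_pm_prepare() hunk further down,
not kernel code: model_bus_prepare() and its parameters are hypothetical,
with keep_suspended standing in for the result of pci_dev_keep_suspended().

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bit as defined by the patch in include/linux/pm.h. */
#define DPM_FLAG_SMART_PREPARE (1u << 1)

/*
 * Hypothetical model of a middle-layer ->prepare callback.
 * @driver_flags:    models dev->power.driver_flags
 * @drv_has_prepare: whether the driver provides ->prepare at all
 * @drv_ret:         what the driver's ->prepare returned (if present)
 * @keep_suspended:  the middle layer's own wish to skip suspend/resume
 */
static int model_bus_prepare(uint32_t driver_flags, bool drv_has_prepare,
			     int drv_ret, bool keep_suspended)
{
	if (drv_has_prepare) {
		if (drv_ret < 0)
			return drv_ret; /* errors always propagate */

		/* With SMART_PREPARE, a 0 from the driver acts as a veto. */
		if (drv_ret == 0 && (driver_flags & DPM_FLAG_SMART_PREPARE))
			return 0;
	}
	return keep_suspended ? 1 : 0;
}
```

Without the flag, a 0 from the driver is ignored and the middle layer's own
preference wins, which is the historical behavior the flag works around.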

> + */
> +#define DPM_FLAG_NEVER_SKIP    BIT(0)
> +#define DPM_FLAG_SMART_PREPARE BIT(1)
> +
>  struct dev_pm_info {
>         pm_message_t            power_state;
>         unsigned int            can_wakeup:1;
> @@ -561,6 +580,7 @@ struct dev_pm_info {
>         bool                    is_late_suspended:1;
>         bool                    early_init:1;   /* Owned by the PM core */
>         bool                    direct_complete:1;      /* Owned by the PM core */
> +       u32                     driver_flags;
>         spinlock_t              lock;
>  #ifdef CONFIG_PM_SLEEP
>         struct list_head        entry;
> Index: linux-pm/drivers/base/dd.c
> ===================================================================
> --- linux-pm.orig/drivers/base/dd.c
> +++ linux-pm/drivers/base/dd.c
> @@ -464,6 +464,7 @@ pinctrl_bind_failed:
>         if (dev->pm_domain && dev->pm_domain->dismiss)
>                 dev->pm_domain->dismiss(dev);
>         pm_runtime_reinit(dev);
> +       dev_pm_set_driver_flags(dev, 0);
>
>         switch (ret) {
>         case -EPROBE_DEFER:
> @@ -869,6 +870,7 @@ static void __device_release_driver(stru
>                 if (dev->pm_domain && dev->pm_domain->dismiss)
>                         dev->pm_domain->dismiss(dev);
>                 pm_runtime_reinit(dev);
> +               dev_pm_set_driver_flags(dev, 0);
>
>                 klist_remove(&dev->p->knode_driver);
>                 device_pm_check_callbacks(dev);
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -1700,7 +1700,9 @@ unlock:
>          * applies to suspend transitions, however.
>          */
>         spin_lock_irq(&dev->power.lock);
> -       dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
> +       dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
> +               pm_runtime_suspended(dev) && ret > 0 &&
> +               !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
>         spin_unlock_irq(&dev->power.lock);
>         return 0;
>  }
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -682,8 +682,11 @@ static int pci_pm_prepare(struct device
>
>         if (drv && drv->pm && drv->pm->prepare) {
>                 int error = drv->pm->prepare(dev);
> -               if (error)
> +               if (error < 0)
>                         return error;
> +
> +               if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> +                       return 0;
>         }
>         return pci_dev_keep_suspended(to_pci_dev(dev));
>  }
> Index: linux-pm/drivers/acpi/device_pm.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/device_pm.c
> +++ linux-pm/drivers/acpi/device_pm.c
> @@ -965,6 +965,9 @@ int acpi_subsys_prepare(struct device *d
>         if (ret < 0)
>                 return ret;
>
> +       if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> +               return 0;

So if the driver doesn't implement the ->prepare() callback, you still
want to treat this flag as if it has one assigned that returns 0?

That doesn't seem to be entirely in line with what you have documented about
the flag.
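
A small userspace model of the case being asked about, assuming that ret in
acpi_subsys_prepare() comes from a generic helper that yields 0 when the
driver has no ->prepare (as pm_generic_prepare() does). All model_* names are
hypothetical, and the checks after the new hunk are collapsed into a single
boolean for brevity.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Flag bit as defined by the patch in include/linux/pm.h. */
#define DPM_FLAG_SMART_PREPARE (1u << 1)

/* Models the generic helper: a missing ->prepare is reported as 0. */
static int model_generic_prepare(bool drv_has_prepare, int drv_ret)
{
	return drv_has_prepare ? drv_ret : 0;
}

/*
 * Hypothetical model of the ACPI PM domain ->prepare after the patch.
 * @later_checks_ok collapses everything that follows the new hunk.
 */
static int model_acpi_prepare(uint32_t driver_flags, bool drv_has_prepare,
			      int drv_ret, bool later_checks_ok)
{
	int ret = model_generic_prepare(drv_has_prepare, drv_ret);

	if (ret < 0)
		return ret;

	/* A missing ->prepare is indistinguishable from one returning 0. */
	if (!ret && (driver_flags & DPM_FLAG_SMART_PREPARE))
		return 0;

	return later_checks_ok ? 1 : 0;
}
```

In this model a driver that sets SMART_PREPARE but has no ->prepare always
gets 0 from the middle layer, so direct-complete is never requested for it.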

> +
>         if (!adev || !pm_runtime_suspended(dev))
>                 return 0;
>
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -354,6 +354,20 @@ the phases are: ``prepare``, ``suspend``
>         is because all such devices are initially set to runtime-suspended with
>         runtime PM disabled.
>
> +       This feature also can be controlled by device drivers by using the
> +       ``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
> +       management flags.  [Typically, they are set at the time the driver is
> +       probed against the device in question by passing them to the
> +       :c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
> +       these flags is set, the PM core will not apply the direct-complete
> +       procedure described above to the given device and, consequently, to any
> +       of its ancestors.  The second flag, when set, informs the middle layer
> +       code (bus types, device types, PM domains, classes) that it should take
> +       the return value of the ``->prepare`` callback provided by the driver
> +       into account and it may only return a positive value from its own
> +       ``->prepare`` callback if the driver's one also has returned a positive
> +       value.
> +
>      2. The ``->suspend`` methods should quiesce the device to stop it from
>         performing I/O.  They also may save the device registers and put it into
>         the appropriate low-power state, depending on the bus type the device is
> Index: linux-pm/Documentation/power/pci.txt
> ===================================================================
> --- linux-pm.orig/Documentation/power/pci.txt
> +++ linux-pm/Documentation/power/pci.txt
> @@ -961,6 +961,25 @@ dev_pm_ops to indicate that one suspend
>  .suspend(), .freeze(), and .poweroff() members and one resume routine is to
>  be pointed to by the .resume(), .thaw(), and .restore() members.
>
> +3.1.19. Driver Flags for Power Management
> +
> +The PM core allows device drivers to set flags that influence the handling of
> +power management for the devices by the core itself and by middle layer code
> +including the PCI bus type.  The flags should be set once at the driver probe
> +time with the help of the dev_pm_set_driver_flags() function and they should not
> +be updated directly afterwards.

I am wondering if we really need to make a statement generic to all
"driver PM flags" that these flags must be set at ->probe(). Maybe
that is better documented per flag, rather than for all. The reason
why I bring it up is that I would not be surprised if a new flag
comes along that may be used a bit differently and does not require
that.

Of course we can also update that later on, if needed.

> +
> +The DPM_FLAG_NEVER_SKIP flag prevents the PM core from using the direct-complete
> +mechanism allowing device suspend/resume callbacks to be skipped if the device
> +is in runtime suspend when the system suspend starts.  That also affects all of
> +the ancestors of the device, so this flag should only be used if absolutely
> +necessary.
> +
> +The DPM_FLAG_SMART_PREPARE flag instructs the PCI bus type to only return a
> +positive value from pci_pm_prepare() if the ->prepare callback provided by the
> +driver of the device returns a positive value.  That allows the driver to opt
> +out from using the direct-complete mechanism dynamically.
> +
>  3.2. Device Runtime Power Management
>  ------------------------------------
>  In addition to providing device power management callbacks PCI device drivers
>

Kind regards
Uffe


* Re: [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag
  2017-10-16  1:29 ` [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag Rafael J. Wysocki
@ 2017-10-23 16:40   ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 16:40 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Replace the PCI-specific flag PCI_DEV_FLAGS_NEEDS_RESUME with the
> PM core's DPM_FLAG_NEVER_SKIP one everywhere and drop it.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

> ---
>  drivers/gpu/drm/i915/i915_drv.c |    2 +-
>  drivers/misc/mei/pci-me.c       |    2 +-
>  drivers/misc/mei/pci-txe.c      |    2 +-
>  drivers/pci/pci.c               |    3 +--
>  include/linux/pci.h             |    7 +------
>  5 files changed, 5 insertions(+), 11 deletions(-)
>
> Index: linux-pm/include/linux/pci.h
> ===================================================================
> --- linux-pm.orig/include/linux/pci.h
> +++ linux-pm/include/linux/pci.h
> @@ -205,13 +205,8 @@ enum pci_dev_flags {
>         PCI_DEV_FLAGS_BRIDGE_XLATE_ROOT = (__force pci_dev_flags_t) (1 << 9),
>         /* Do not use FLR even if device advertises PCI_AF_CAP */
>         PCI_DEV_FLAGS_NO_FLR_RESET = (__force pci_dev_flags_t) (1 << 10),
> -       /*
> -        * Resume before calling the driver's system suspend hooks, disabling
> -        * the direct_complete optimization.
> -        */
> -       PCI_DEV_FLAGS_NEEDS_RESUME = (__force pci_dev_flags_t) (1 << 11),
>         /* Don't use Relaxed Ordering for TLPs directed at this device */
> -       PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 12),
> +       PCI_DEV_FLAGS_NO_RELAXED_ORDERING = (__force pci_dev_flags_t) (1 << 11),
>  };
>
>  enum pci_irq_reroute_variant {
> Index: linux-pm/drivers/pci/pci.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci.c
> +++ linux-pm/drivers/pci/pci.c
> @@ -2166,8 +2166,7 @@ bool pci_dev_keep_suspended(struct pci_d
>
>         if (!pm_runtime_suspended(dev)
>             || pci_target_state(pci_dev, wakeup) != pci_dev->current_state
> -           || platform_pci_need_resume(pci_dev)
> -           || (pci_dev->dev_flags & PCI_DEV_FLAGS_NEEDS_RESUME))
> +           || platform_pci_need_resume(pci_dev))
>                 return false;
>
>         /*
> Index: linux-pm/drivers/gpu/drm/i915/i915_drv.c
> ===================================================================
> --- linux-pm.orig/drivers/gpu/drm/i915/i915_drv.c
> +++ linux-pm/drivers/gpu/drm/i915/i915_drv.c
> @@ -1304,7 +1304,7 @@ int i915_driver_load(struct pci_dev *pde
>          * becaue the HDA driver may require us to enable the audio power
>          * domain during system suspend.
>          */
> -       pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
> +       dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
>
>         ret = i915_driver_init_early(dev_priv, ent);
>         if (ret < 0)
> Index: linux-pm/drivers/misc/mei/pci-txe.c
> ===================================================================
> --- linux-pm.orig/drivers/misc/mei/pci-txe.c
> +++ linux-pm/drivers/misc/mei/pci-txe.c
> @@ -141,7 +141,7 @@ static int mei_txe_probe(struct pci_dev
>          * MEI requires to resume from runtime suspend mode
>          * in order to perform link reset flow upon system suspend.
>          */
> -       pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
> +       dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
>
>         /*
>         * For not wake-able HW runtime pm framework
> Index: linux-pm/drivers/misc/mei/pci-me.c
> ===================================================================
> --- linux-pm.orig/drivers/misc/mei/pci-me.c
> +++ linux-pm/drivers/misc/mei/pci-me.c
> @@ -223,7 +223,7 @@ static int mei_me_probe(struct pci_dev *
>          * MEI requires to resume from runtime suspend mode
>          * in order to perform link reset flow upon system suspend.
>          */
> -       pdev->dev_flags |= PCI_DEV_FLAGS_NEEDS_RESUME;
> +       dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_NEVER_SKIP);
>
>         /*
>         * For not wake-able HW runtime pm framework
>


* Re: [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE
  2017-10-16  1:29 ` [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE Rafael J. Wysocki
@ 2017-10-23 16:57   ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 16:57 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Modify i2c-designware-platdrv to set DPM_FLAG_SMART_PREPARE for its
> devices and return 0 from the system suspend ->prepare callback
> if the device has an ACPI companion object in order to tell the PM
> core and middle layers to avoid skipping system suspend/resume
> callbacks for the device in that case (which may be problematic,
> because the device may be accessed during suspend and resume of
> other devices via I2C operation regions then).
>
> Also the pm_runtime_suspended() check in dw_i2c_plat_prepare()
> is not necessary any more, because the core does it when setting
> power.direct_complete for the device, so drop it.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/i2c/busses/i2c-designware-platdrv.c |   10 +++++++++-
>  1 file changed, 9 insertions(+), 1 deletion(-)
>
> Index: linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
> ===================================================================
> --- linux-pm.orig/drivers/i2c/busses/i2c-designware-platdrv.c
> +++ linux-pm/drivers/i2c/busses/i2c-designware-platdrv.c
> @@ -370,6 +370,8 @@ static int dw_i2c_plat_probe(struct plat
>         ACPI_COMPANION_SET(&adap->dev, ACPI_COMPANION(&pdev->dev));
>         adap->dev.of_node = pdev->dev.of_node;
>
> +       dev_pm_set_driver_flags(&pdev->dev, DPM_FLAG_SMART_PREPARE);
> +
>         /* The code below assumes runtime PM to be disabled. */
>         WARN_ON(pm_runtime_enabled(&pdev->dev));
>
> @@ -433,7 +435,13 @@ MODULE_DEVICE_TABLE(of, dw_i2c_of_match)
>  #ifdef CONFIG_PM_SLEEP
>  static int dw_i2c_plat_prepare(struct device *dev)
>  {
> -       return pm_runtime_suspended(dev);
> +       /*
> +        * If the ACPI companion device object is present for this device, it
> +        * may be accessed during suspend and resume of other devices via I2C
> +        * operation regions, so tell the PM core and middle layers to avoid
> +        * skipping system suspend/resume callbacks for it in that case.
> +        */

The above scenario can also happen for devices without an ACPI
companion. That makes this comment a bit confusing to me.

> +       return !has_acpi_companion(dev);

I understand it still works by always returning 1 for the non-ACPI
case, because the PM core deals with it for the direct_complete path.
However, it looks rather odd, especially given the comment above.

Perhaps returning pm_runtime_suspended() in the other case would make
this clearer? Or perhaps clarify the comment somehow? :-)
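
Something along these lines is what I have in mind for the first
option. This is just a userspace sketch to show the intended return
values; the struct and the function name are hypothetical stand-ins,
with the two bool fields playing the roles of has_acpi_companion(dev)
and pm_runtime_suspended(dev).

```c
#include <stdbool.h>

/* Hypothetical stand-in for the bits of device state used here. */
struct i2c_model_dev {
	bool has_acpi_companion;
	bool runtime_suspended;
};

/* Models the suggested variant of dw_i2c_plat_prepare(). */
static int model_plat_prepare(const struct i2c_model_dev *dev)
{
	/*
	 * With an ACPI companion, the device may be accessed via I2C
	 * operation regions, so never ask for callbacks to be skipped.
	 */
	if (dev->has_acpi_companion)
		return 0;

	/* Otherwise keep the explicit runtime PM status check. */
	return dev->runtime_suspended ? 1 : 0;
}
```

The runtime PM status check is redundant with what the core does, but
it makes the meaning of the positive return value explicit.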

>  }
>
>  static void dw_i2c_plat_complete(struct device *dev)
>
>

Kind regards
Uffe


* Re: [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag
  2017-10-16  1:29 ` [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag Rafael J. Wysocki
@ 2017-10-23 19:01   ` Ulf Hansson
  2017-10-24  5:22   ` Ulf Hansson
  1 sibling, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 19:01 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Define and document a SMART_SUSPEND flag to instruct bus types and PM
> domains that the system suspend callbacks provided by the driver can
> cope with runtime-suspended devices, so from the driver's perspective
> it should be safe to leave devices in runtime suspend during system
> suspend.



>
> Setting that flag also causes the PM core to skip the "late" and
> "noirq" phases of device suspend for devices that remain in runtime
> suspend at the beginning of the "late" phase (when runtime PM has
> been disabled for them) under the assumption that their state cannot
> (and should not) change after that point until the system suspend
> transition is complete.  Moreover, the PM core prevents runtime PM
> from acting on devices with DPM_FLAG_SMART_SUSPEND during system
> resume by setting their runtime PM status to "active" at the end of
> the "early" phase (right prior to enabling runtime PM for them).
> That allows system resume callbacks to do whatever is necessary to
> resume the device without worrying about runtime PM possibly
> running in parallel with them.

Could you explain in some detail why the second part makes sense?

To me it seems better to leave the decision to the driver: whether
it wants to resume the device during system resume or would rather
defer that to later, via runtime PM.

>
> However, that doesn't apply to transitions involving ->thaw_noirq,
> ->thaw_early and ->thaw callbacks during hibernation, as they
> generally are not expected to change the power states of devices.
> Consequently, if a device is in runtime suspend at the beginning
> of such a transition, it must stay in runtime suspend until the
> "complete" phase of it (since the callbacks may not change its
> power state).

The above seems reasonable, but on the other hand it makes it more
difficult to understand how DPM_FLAG_SMART_SUSPEND is going to be
used.

Perhaps we should simply have a separate flag for the resume path?

>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/driver-api/pm/devices.rst |   17 ++++++++
>  drivers/base/power/main.c               |   63 ++++++++++++++++++++++++++++----
>  include/linux/pm.h                      |    9 ++++
>  3 files changed, 82 insertions(+), 7 deletions(-)
>
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -766,6 +766,23 @@ the state of devices (possibly except fo
>  from their ``->prepare`` and ``->suspend`` callbacks (or equivalent) *before*
>  invoking device drivers' ``->suspend`` callbacks (or equivalent).
>
> +Some bus types and PM domains have a policy to resume all devices from runtime
> +suspend upfront in their ``->suspend`` callbacks, but that may not be really
> +necessary if the system suspend-resume callbacks provided by the device's
> +driver can cope with runtime-suspended devices.  The driver can indicate that
> +by setting ``DPM_FLAG_SMART_SUSPEND`` in :c:member:`power.driver_flags` at the
> +probe time, by passing it to the :c:func:`dev_pm_set_driver_flags` helper.  That
> +also causes the PM core to skip the ``suspend_late`` and ``suspend_noirq``
> +phases of device suspend for the device if it remains in runtime suspend at the
> +beginning of the ``suspend_late`` phase (when runtime PM has been disabled for
> +it) under the assumption that its state cannot (and should not) change after
> +that point until the system-wide transition is over.  Moreover, the PM core
> +updates the runtime power management status of devices with
> +``DPM_FLAG_SMART_SUSPEND`` set to "active" at the end of the ``resume_early``
> +phase of device resume (right prior to enabling runtime PM for them) in order
> +to prevent runtime PM from acting on them before the ``complete`` phase, which
> +means that they should be put into the full-power state before that phase.
> +
>  During system-wide resume from a sleep state it's easiest to put devices into
>  the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
>  Refer to that document for more information regarding this particular issue as
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -558,6 +558,7 @@ struct pm_subsys_data {
>   *
>   * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
>   * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
> + * SMART_SUSPEND: No need to resume the device from runtime suspend.
>   *
>   * Setting SMART_PREPARE instructs bus types and PM domains which may want
>   * system suspend/resume callbacks to be skipped for the device to return 0 from
> @@ -565,9 +566,17 @@ struct pm_subsys_data {
>   * other words, the system suspend/resume callbacks can only be skipped for the
>   * device if its driver doesn't object against that).  This flag has no effect
>   * if NEVER_SKIP is set.
> + *
> + * Setting SMART_SUSPEND instructs bus types and PM domains which may want to
> + * runtime resume the device upfront during system suspend that doing so is not
> + * necessary from the driver's perspective.  It also causes the PM core to skip
> + * the "late" and "noirq" phases of device suspend for the device if it remains
> + * in runtime suspend at the beginning of the "late" phase (when runtime PM has
> + * been disabled for it).
>   */
>  #define DPM_FLAG_NEVER_SKIP    BIT(0)
>  #define DPM_FLAG_SMART_PREPARE BIT(1)
> +#define DPM_FLAG_SMART_SUSPEND BIT(2)
>
>  struct dev_pm_info {
>         pm_message_t            power_state;
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -551,6 +551,18 @@ static int device_resume_noirq(struct de
>         if (!dev->power.is_noirq_suspended)
>                 goto Out;
>
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> +           pm_runtime_status_suspended(dev) && (state.event == PM_EVENT_THAW ||
> +           state.event == PM_EVENT_RECOVER)) {
> +               /*
> +                * The device has to stay in runtime suspend, because the
> +                * subsequent callbacks may not try to change its power state.
> +                */
> +               dev->power.is_suspended = false;
> +               dev->power.is_late_suspended = false;
> +               goto Skip;
> +       }
> +
>         dpm_wait_for_superior(dev, async);
>
>         if (dev->pm_domain) {
> @@ -573,9 +585,11 @@ static int device_resume_noirq(struct de
>         }
>
>         error = dpm_run_callback(callback, dev, state, info);
> +
> +Skip:
>         dev->power.is_noirq_suspended = false;
>
> - Out:
> +Out:
>         complete_all(&dev->power.completion);
>         TRACE_RESUME(error);
>         return error;
> @@ -715,6 +729,14 @@ static int device_resume_early(struct de
>         error = dpm_run_callback(callback, dev, state, info);
>         dev->power.is_late_suspended = false;
>
> +       /*
> +        * Devices with DPM_FLAG_SMART_SUSPEND may be left in runtime suspend
> +        * during system suspend, so update their runtime PM status to "active"
> +        * to prevent runtime PM from acting on them before device_complete().
> +        */
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND))
> +               pm_runtime_set_active(dev);

Please check the return value from pm_runtime_set_active(), else we
might not know if something went wrong. For example, the parent may
not be active.

Moreover, as stated above, perhaps this should be controlled by a separate flag?
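
As a rough sketch of the kind of check I mean (a userspace model, not
a proposal for the actual code): the struct, the macro and both
functions below are hypothetical, the stub only models the "parent not
active" failure, the -EBUSY value is chosen for illustration, and
whether the error should be propagated or just logged is left open.

```c
#include <errno.h>
#include <stdbool.h>

/* Hypothetical stand-in for DPM_FLAG_SMART_SUSPEND. */
#define MODEL_FLAG_SMART_SUSPEND (1u << 2)

/* Hypothetical, heavily simplified stand-in for struct device. */
struct rpm_model_dev {
	unsigned int driver_flags;
	bool parent_active;
	bool runtime_active;
};

/* Stub for pm_runtime_set_active(): fails if the parent is not active. */
static int model_set_active(struct rpm_model_dev *dev)
{
	if (!dev->parent_active)
		return -EBUSY;

	dev->runtime_active = true;
	return 0;
}

/* Models the tail of device_resume_early() with the check added. */
static int model_resume_early_tail(struct rpm_model_dev *dev, int error)
{
	/* Assumption: only touch the status if the callback succeeded. */
	if (!error && (dev->driver_flags & MODEL_FLAG_SMART_SUSPEND))
		error = model_set_active(dev);

	return error;
}
```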

> +
>   Out:
>         TRACE_RESUME(error);
>
> @@ -1107,6 +1129,15 @@ static int __device_suspend_noirq(struct
>         if (dev->power.syscore || dev->power.direct_complete)
>                 goto Complete;
>
> +       /*
> +        * The state of devices with DPM_FLAG_SMART_SUSPEND set that remain in
> +        * runtime suspend at this point cannot change going forward, so skip
> +        * the callback invocation for them.
> +        */
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> +           pm_runtime_status_suspended(dev))
> +               goto Skip;
> +
>         if (dev->pm_domain) {
>                 info = "noirq power domain ";
>                 callback = pm_noirq_op(&dev->pm_domain->ops, state);
> @@ -1127,10 +1158,13 @@ static int __device_suspend_noirq(struct
>         }
>
>         error = dpm_run_callback(callback, dev, state, info);
> -       if (!error)
> -               dev->power.is_noirq_suspended = true;
> -       else
> +       if (error) {
>                 async_error = error;
> +               goto Complete;
> +       }
> +
> +Skip:
> +       dev->power.is_noirq_suspended = true;
>
>  Complete:
>         complete_all(&dev->power.completion);
> @@ -1268,6 +1302,15 @@ static int __device_suspend_late(struct
>         if (dev->power.syscore || dev->power.direct_complete)
>                 goto Complete;
>
> +       /*
> +        * The state of devices with DPM_FLAG_SMART_SUSPEND set that remain in
> +        * runtime suspend at this point cannot change going forward, so skip
> +        * the callback invocation for them.
> +        */
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> +           pm_runtime_status_suspended(dev))
> +               goto Skip;
> +
>         if (dev->pm_domain) {
>                 info = "late power domain ";
>                 callback = pm_late_early_op(&dev->pm_domain->ops, state);
> @@ -1288,10 +1331,13 @@ static int __device_suspend_late(struct
>         }
>
>         error = dpm_run_callback(callback, dev, state, info);
> -       if (!error)
> -               dev->power.is_late_suspended = true;
> -       else
> +       if (error) {
>                 async_error = error;
> +               goto Complete;
> +       }
> +
> +Skip:
> +       dev->power.is_late_suspended = true;
>
>  Complete:
>         TRACE_SUSPEND(error);
> @@ -1652,6 +1698,9 @@ static int device_prepare(struct device
>         if (dev->power.syscore)
>                 return 0;
>
> +       WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> +               !pm_runtime_enabled(dev));
> +
>         /*
>          * If a device's parent goes into runtime suspend at the wrong time,
>          * it won't be possible to resume the device.  To prevent this we
>
>

My overall comment/concern with this flag is that I would like a more
straightforward approach, else people won't understand how to use
it.

Moreover doesn't this flag actually overlap quite closely with what
the direct_complete path is already doing? Except for the resume path
- of course.
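
To make the overlap concrete, here are the two skip conditions side by
side in a small userspace model. It is a sketch only: the struct, the
MODEL_FLAG_* macros and the function names are hypothetical, and the
conditions are transcribed from the device_prepare() hunk in patch 01
and the __device_suspend_late()/__device_suspend_noirq() hunks above.

```c
#include <stdbool.h>

/* Hypothetical stand-ins for the DPM_FLAG_* bits. */
#define MODEL_FLAG_NEVER_SKIP    (1u << 0)
#define MODEL_FLAG_SMART_SUSPEND (1u << 2)

/* Hypothetical, heavily simplified stand-in for struct device. */
struct skip_model_dev {
	unsigned int driver_flags;
	bool runtime_suspended;
	int prepare_ret;	/* value returned by the ->prepare path */
};

/* direct_complete: decided once, at the end of the prepare phase. */
static bool model_direct_complete(const struct skip_model_dev *dev,
				  bool is_suspend_event)
{
	return is_suspend_event && dev->runtime_suspended &&
	       dev->prepare_ret > 0 &&
	       !(dev->driver_flags & MODEL_FLAG_NEVER_SKIP);
}

/* SMART_SUSPEND: checked again in the "late" and "noirq" phases. */
static bool model_smart_suspend_skip(const struct skip_model_dev *dev)
{
	return (dev->driver_flags & MODEL_FLAG_SMART_SUSPEND) &&
	       dev->runtime_suspended;
}
```

Both conditions hinge on the device being runtime-suspended; in this
model they differ in the ->prepare return value, the NEVER_SKIP check,
and the point at which the decision is made.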

Kind regards
Uffe


* Re: [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks
  2017-10-16  1:29 ` [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks Rafael J. Wysocki
@ 2017-10-23 19:06   ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 19:06 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> The only user of non-empty pcibios_pm_ops is s390 and it only uses
> "noirq" callbacks, so drop the invocations of the other pcibios_pm_ops
> callbacks from the PCI PM code.
>
> That will allow subsequent changes to be somewhat simpler.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

> ---
>  drivers/pci/pci-driver.c |   18 ------------------
>  1 file changed, 18 deletions(-)
>
> Index: linux-pm/drivers/pci/pci-driver.c
> ===================================================================
> --- linux-pm.orig/drivers/pci/pci-driver.c
> +++ linux-pm/drivers/pci/pci-driver.c
> @@ -918,9 +918,6 @@ static int pci_pm_freeze(struct device *
>                         return error;
>         }
>
> -       if (pcibios_pm_ops.freeze)
> -               return pcibios_pm_ops.freeze(dev);
> -
>         return 0;
>  }
>
> @@ -982,12 +979,6 @@ static int pci_pm_thaw(struct device *de
>         const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>         int error = 0;
>
> -       if (pcibios_pm_ops.thaw) {
> -               error = pcibios_pm_ops.thaw(dev);
> -               if (error)
> -                       return error;
> -       }
> -
>         if (pci_has_legacy_pm_support(pci_dev))
>                 return pci_legacy_resume(dev);
>
> @@ -1032,9 +1023,6 @@ static int pci_pm_poweroff(struct device
>   Fixup:
>         pci_fixup_device(pci_fixup_suspend, pci_dev);
>
> -       if (pcibios_pm_ops.poweroff)
> -               return pcibios_pm_ops.poweroff(dev);
> -
>         return 0;
>  }
>
> @@ -1107,12 +1095,6 @@ static int pci_pm_restore(struct device
>         const struct dev_pm_ops *pm = dev->driver ? dev->driver->pm : NULL;
>         int error = 0;
>
> -       if (pcibios_pm_ops.restore) {
> -               error = pcibios_pm_ops.restore(dev);
> -               if (error)
> -                       return error;
> -       }
> -
>         /*
>          * This is necessary for the hibernation error path in which restore is
>          * called without restoring the standard config registers of the device.
>
>


* Re: [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling
  2017-10-16  1:29 ` [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling Rafael J. Wysocki
@ 2017-10-23 19:09   ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 19:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Move the LPSS-specific code from acpi_lpss_runtime_suspend()
> and acpi_lpss_runtime_resume() into separate functions,
> acpi_lpss_suspend() and acpi_lpss_resume(), respectively, and
> make acpi_lpss_suspend_late() and acpi_lpss_resume_early() use
> them too in order to unify the runtime PM and system sleep
> handling in the LPSS driver.
>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

Reviewed-by: Ulf Hansson <ulf.hansson@linaro.org>

> ---
>
> This is based on an RFC I posted some time ago
> (https://patchwork.kernel.org/patch/9998147/), which didn't
> receive any comments and it depends on a couple of ACPI device PM
> patches posted recently (https://patchwork.kernel.org/patch/10006457/
> in particular).
>
> It's included in this series, because the next patch won't work without it.
>
> ---
>  drivers/acpi/acpi_lpss.c |   75 ++++++++++++++++++++---------------------------
>  1 file changed, 33 insertions(+), 42 deletions(-)
>
> Index: linux-pm/drivers/acpi/acpi_lpss.c
> ===================================================================
> --- linux-pm.orig/drivers/acpi/acpi_lpss.c
> +++ linux-pm/drivers/acpi/acpi_lpss.c
> @@ -716,40 +716,6 @@ static void acpi_lpss_dismiss(struct dev
>         acpi_dev_suspend(dev, false);
>  }
>
> -#ifdef CONFIG_PM_SLEEP
> -static int acpi_lpss_suspend_late(struct device *dev)
> -{
> -       struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
> -       int ret;
> -
> -       ret = pm_generic_suspend_late(dev);
> -       if (ret)
> -               return ret;
> -
> -       if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
> -               acpi_lpss_save_ctx(dev, pdata);
> -
> -       return acpi_dev_suspend(dev, device_may_wakeup(dev));
> -}
> -
> -static int acpi_lpss_resume_early(struct device *dev)
> -{
> -       struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
> -       int ret;
> -
> -       ret = acpi_dev_resume(dev);
> -       if (ret)
> -               return ret;
> -
> -       acpi_lpss_d3_to_d0_delay(pdata);
> -
> -       if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
> -               acpi_lpss_restore_ctx(dev, pdata);
> -
> -       return pm_generic_resume_early(dev);
> -}
> -#endif /* CONFIG_PM_SLEEP */
> -
>  /* IOSF SB for LPSS island */
>  #define LPSS_IOSF_UNIT_LPIOEP          0xA0
>  #define LPSS_IOSF_UNIT_LPIO1           0xAB
> @@ -835,19 +801,15 @@ static void lpss_iosf_exit_d3_state(void
>         mutex_unlock(&lpss_iosf_mutex);
>  }
>
> -static int acpi_lpss_runtime_suspend(struct device *dev)
> +static int acpi_lpss_suspend(struct device *dev, bool wakeup)
>  {
>         struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
>         int ret;
>
> -       ret = pm_generic_runtime_suspend(dev);
> -       if (ret)
> -               return ret;
> -
>         if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
>                 acpi_lpss_save_ctx(dev, pdata);
>
> -       ret = acpi_dev_suspend(dev, true);
> +        ret = acpi_dev_suspend(dev, wakeup);
>
>         /*
>          * This call must be last in the sequence, otherwise PMC will return
> @@ -860,7 +822,7 @@ static int acpi_lpss_runtime_suspend(str
>         return ret;
>  }
>
> -static int acpi_lpss_runtime_resume(struct device *dev)
> +static int acpi_lpss_resume(struct device *dev)
>  {
>         struct lpss_private_data *pdata = acpi_driver_data(ACPI_COMPANION(dev));
>         int ret;
> @@ -881,7 +843,36 @@ static int acpi_lpss_runtime_resume(stru
>         if (pdata->dev_desc->flags & LPSS_SAVE_CTX)
>                 acpi_lpss_restore_ctx(dev, pdata);
>
> -       return pm_generic_runtime_resume(dev);
> +       return 0;
> +}
> +#ifdef CONFIG_PM_SLEEP
> +static int acpi_lpss_suspend_late(struct device *dev)
> +{
> +       int ret = pm_generic_suspend_late(dev);
> +
> +       return ret ? ret : acpi_lpss_suspend(dev, device_may_wakeup(dev));
> +}
> +
> +static int acpi_lpss_resume_early(struct device *dev)
> +{
> +       int ret = acpi_lpss_resume(dev);
> +
> +       return ret ? ret : pm_generic_resume_early(dev);
> +}
> +#endif /* CONFIG_PM_SLEEP */
> +
> +static int acpi_lpss_runtime_suspend(struct device *dev)
> +{
> +       int ret = pm_generic_runtime_suspend(dev);
> +
> +       return ret ? ret : acpi_lpss_suspend(dev, true);
> +}
> +
> +static int acpi_lpss_runtime_resume(struct device *dev)
> +{
> +       int ret = acpi_lpss_resume(dev);
> +
> +       return ret ? ret : pm_generic_runtime_resume(dev);
>  }
>  #endif /* CONFIG_PM */
>
>
>


* Re: [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag
  2017-10-16  1:30 ` [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag Rafael J. Wysocki
@ 2017-10-23 19:38   ` Ulf Hansson
  0 siblings, 0 replies; 79+ messages in thread
From: Ulf Hansson @ 2017-10-23 19:38 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:30, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Define and document a new driver flag, DPM_FLAG_LEAVE_SUSPENDED, to
> instruct the PM core that it is desirable to leave the device in
> runtime suspend after system resume (for example, the device may be
> slow to resume and it may be better to avoid resuming it right away
> for this reason).
>
> Setting that flag causes the PM core to skip the ->resume_noirq,
> ->resume_early and ->resume callbacks for the device (like in the
> direct-complete optimization case) if (1) the wakeup settings of it
> are compatible with runtime PM (that is, either the device is
> configured to wake up the system from sleep or it cannot generate
> wakeup signals at all), and (2) it will not be used for resuming any
> of its children or consumers.
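
The condition spelled out in the changelog above can be sketched as a small
userspace model. This is only an illustration under stated assumptions:
struct model_dev and model_may_leave_suspended() are hypothetical names, not
kernel API, and the two wakeup fields merely stand in for what
device_can_wakeup() and device_may_wakeup() report in the kernel.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as BIT(3) used for the flag in this patch. */
#define DPM_FLAG_LEAVE_SUSPENDED (1u << 3)

/* Hypothetical stand-in for the relevant parts of struct device. */
struct model_dev {
	uint32_t driver_flags; /* flags set by the driver at probe time */
	bool can_wakeup;       /* device is able to generate wakeup signals */
	bool may_wakeup;       /* device is configured to wake up the system */
	bool must_resume;      /* a child or consumer needs it during resume */
};

/*
 * Models the check this patch adds to __device_suspend_noirq(): the driver
 * asked for it, the wakeup settings are compatible with runtime PM, and
 * nothing depending on the device needs it resumed.
 */
static bool model_may_leave_suspended(const struct model_dev *dev)
{
	bool wakeup_ok = dev->may_wakeup || !dev->can_wakeup;

	return (dev->driver_flags & DPM_FLAG_LEAVE_SUSPENDED) &&
	       wakeup_ok && !dev->must_resume;
}
```

Note that in the SMART_SUSPEND branch of the patch the wakeup check is not
repeated, because the device is already in runtime suspend there; the model
covers only the second, more general check.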

As you state above, this looks like the direct_complete path, if it is
used in conjunction with the DPM_FLAG_SMART_SUSPEND flag.

Taking both these flags into account, what it seems to boil down to is
that you need one flag, instructing the PM core to sometimes resume
the devices when it runs the direct_complete path, which isn't the case
today.

Wouldn't that be an alternative solution, which may be a bit simpler?

>
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  Documentation/driver-api/pm/devices.rst |   20 +++++++
>  drivers/base/power/main.c               |   81 ++++++++++++++++++++++++++++++--
>  include/linux/pm.h                      |   12 +++-
>  3 files changed, 104 insertions(+), 9 deletions(-)
>
> Index: linux-pm/include/linux/pm.h
> ===================================================================
> --- linux-pm.orig/include/linux/pm.h
> +++ linux-pm/include/linux/pm.h
> @@ -559,6 +559,7 @@ struct pm_subsys_data {
>   * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
>   * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
>   * SMART_SUSPEND: No need to resume the device from runtime suspend.
> + * LEAVE_SUSPENDED: Avoid resuming the device during system resume if possible.
>   *
>   * Setting SMART_PREPARE instructs bus types and PM domains which may want
>   * system suspend/resume callbacks to be skipped for the device to return 0 from
> @@ -573,10 +574,14 @@ struct pm_subsys_data {
>   * the "late" and "noirq" phases of device suspend for the device if it remains
>   * in runtime suspend at the beginning of the "late" phase (when runtime PM has
>   * been disabled for it).
> + *
> + * Setting LEAVE_SUSPENDED informs the PM core and middle layer code that the
> + * driver prefers the device to be left in runtime suspend after system resume.
>   */
> -#define DPM_FLAG_NEVER_SKIP    BIT(0)
> -#define DPM_FLAG_SMART_PREPARE BIT(1)
> -#define DPM_FLAG_SMART_SUSPEND BIT(2)
> +#define DPM_FLAG_NEVER_SKIP            BIT(0)
> +#define DPM_FLAG_SMART_PREPARE         BIT(1)
> +#define DPM_FLAG_SMART_SUSPEND         BIT(2)
> +#define DPM_FLAG_LEAVE_SUSPENDED       BIT(3)

I would appreciate it if you could reformat the patches such that you
only have to add one line here.

It makes it easier when I later run "git blame" to understand which
commit introduced each flag. :-)

>
>  struct dev_pm_info {
>         pm_message_t            power_state;
> @@ -598,6 +603,7 @@ struct dev_pm_info {
>         bool                    wakeup_path:1;
>         bool                    syscore:1;
>         bool                    no_pm_callbacks:1;      /* Owned by the PM core */
> +       unsigned int            must_resume:1;  /* Owned by the PM core */
>  #else
>         unsigned int            should_wakeup:1;
>  #endif
> Index: linux-pm/drivers/base/power/main.c
> ===================================================================
> --- linux-pm.orig/drivers/base/power/main.c
> +++ linux-pm/drivers/base/power/main.c
> @@ -705,6 +705,12 @@ static int device_resume_early(struct de
>         if (!dev->power.is_late_suspended)
>                 goto Out;
>
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
> +           !dev->power.must_resume) {
> +               pm_runtime_set_suspended(dev);
> +               goto Out;
> +       }
> +
>         dpm_wait_for_superior(dev, async);
>
>         if (dev->pm_domain) {
> @@ -1098,6 +1104,32 @@ static pm_message_t resume_event(pm_mess
>         return PMSG_ON;
>  }
>
> +static void dpm_suppliers_set_must_resume(struct device *dev)
> +{
> +       struct device_link *link;
> +       int idx;
> +
> +       idx = device_links_read_lock();
> +
> +       list_for_each_entry_rcu(link, &dev->links.suppliers, c_node)
> +               link->supplier->power.must_resume = true;
> +
> +       device_links_read_unlock(idx);
> +}
> +
> +static void dpm_leave_suspended(struct device *dev)
> +{
> +       pm_runtime_set_suspended(dev);
> +       dev->power.is_suspended = false;
> +       dev->power.is_late_suspended = false;
> +       /*
> +        * This tells middle layer code to schedule runtime resume of the device
> +        * from its ->complete callback to update the device's power state in
> +        * case the platform firmware has been involved in resuming the system.
> +        */
> +       dev->power.direct_complete = true;
> +}
> +
>  /**
>   * __device_suspend_noirq - Execute a "noirq suspend" callback for given device.
>   * @dev: Device to handle.
> @@ -1135,8 +1167,20 @@ static int __device_suspend_noirq(struct
>          * the callback invocation for them.
>          */
>         if (dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> -           pm_runtime_status_suspended(dev))
> -               goto Skip;
> +           pm_runtime_status_suspended(dev)) {
> +               /*
> +                * The device may be left suspended during system resume if
> +                * that is preferred by its driver and it will not be used for
> +                * resuming any of its children or consumers.
> +                */
> +               if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
> +                   !dev->power.must_resume) {
> +                       dpm_leave_suspended(dev);
> +                       goto Complete;
> +               } else {
> +                       goto Skip;
> +               }
> +       }
>
>         if (dev->pm_domain) {
>                 info = "noirq power domain ";
> @@ -1163,6 +1207,28 @@ static int __device_suspend_noirq(struct
>                 goto Complete;
>         }
>
> +       /*
> +        * The device may be left suspended during system resume if that is
> +        * preferred by its driver and its wakeup configuration is compatible
> +        * with runtime PM, and it will not be used for resuming any of its
> +        * children or consumers.
> +        */
> +       if (dev_pm_test_driver_flags(dev, DPM_FLAG_LEAVE_SUSPENDED) &&
> +           (device_may_wakeup(dev) || !device_can_wakeup(dev)) &&
> +           !dev->power.must_resume) {
> +               dpm_leave_suspended(dev);
> +               goto Complete;
> +       }
> +
> +       /*
> +        * The parent and suppliers will be necessary to resume the device
> +        * during system resume, so avoid leaving them in runtime suspend.
> +        */
> +       if (dev->parent)
> +               dev->parent->power.must_resume = true;
> +
> +       dpm_suppliers_set_must_resume(dev);
> +
>  Skip:
>         dev->power.is_noirq_suspended = true;
>
> @@ -1698,8 +1764,9 @@ static int device_prepare(struct device
>         if (dev->power.syscore)
>                 return 0;
>
> -       WARN_ON(dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND) &&
> -               !pm_runtime_enabled(dev));
> +       WARN_ON(!pm_runtime_enabled(dev) &&
> +               dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_SUSPEND |
> +                                             DPM_FLAG_LEAVE_SUSPENDED));
>
>         /*
>          * If a device's parent goes into runtime suspend at the wrong time,
> @@ -1712,6 +1779,12 @@ static int device_prepare(struct device
>         device_lock(dev);
>
>         dev->power.wakeup_path = device_may_wakeup(dev);
> +       /*
> +        * Avoid leaving devices in suspend after transitions that don't really
> +        * suspend them in general.
> +        */
> +       dev->power.must_resume = state.event == PM_EVENT_FREEZE ||
> +                               state.event == PM_EVENT_QUIESCE;
>
>         if (dev->power.no_pm_callbacks) {
>                 ret = 1;        /* Let device go direct_complete */
> Index: linux-pm/Documentation/driver-api/pm/devices.rst
> ===================================================================
> --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> +++ linux-pm/Documentation/driver-api/pm/devices.rst
> @@ -785,6 +785,22 @@ means that they should be put into the f
>
>  During system-wide resume from a sleep state it's easiest to put devices into
>  the full-power state, as explained in :file:`Documentation/power/runtime_pm.txt`.
> -Refer to that document for more information regarding this particular issue as
> +[Refer to that document for more information regarding this particular issue as
>  well as for information on the device runtime power management framework in
> -general.
> +general.]
> +
> +However, it may be desirable to leave some devices in runtime suspend after
> +system resume and device drivers can use the ``DPM_FLAG_LEAVE_SUSPENDED`` flag
> +to indicate to the PM core that this is the case.  If that flag is set for a
> +device and the wakeup settings of it are compatible with runtime PM (that is,
> +either the device is configured to wake up the system from sleep or it cannot
> +generate wakeup signals at all), and it will not be used for resuming any of its
> +children or consumers, the PM core will skip all of the system resume callbacks
> +in the ``resume_noirq``, ``resume_early`` and ``resume`` phases for it and its
> +runtime power management status will be set to "suspended".
> +
> +Still, if the platform firmware is involved in the handling of system resume, it
> +may change the state of devices in unpredictable ways, so in that case the
> +middle layer code (for example, a bus type or PM domain) the driver works with
> +should update the device's power state and schedule runtime resume of it to
> +align its power settings with the expectations of the runtime PM framework.
>
>

Regarding the DPM_FLAG_NEVER_SKIP flag, is that flag only to prevent
direct_complete, and thus it ought not to be used with these other
driver PM flags that you add in this series?

Have you considered that DPM_FLAG_NEVER_SKIP also propagates to parents
etc? Just to make sure that you won't skip invoking some system sleep
callbacks, even though they should be invoked because a child requires it?
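
For reference, the direct_complete decision that patch 01 of this series
makes in device_prepare() can be modelled in userspace as below. It is a
sketch, not the kernel code: enum model_event and model_direct_complete()
are hypothetical names, and the propagation to parents that the question is
about is deliberately left out of the model.

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as BIT(0) used for the flag in patch 01. */
#define DPM_FLAG_NEVER_SKIP (1u << 0)

/* Hypothetical stand-ins for PM_EVENT_SUSPEND and PM_EVENT_FREEZE. */
enum model_event { MODEL_EVENT_SUSPEND, MODEL_EVENT_FREEZE };

/*
 * direct_complete is only set for suspend transitions, when the device is
 * runtime-suspended, when ->prepare returned a positive value and when the
 * driver has not set NEVER_SKIP.
 */
static bool model_direct_complete(enum model_event event,
				  bool runtime_suspended,
				  int prepare_ret,
				  uint32_t driver_flags)
{
	return event == MODEL_EVENT_SUSPEND && runtime_suspended &&
	       prepare_ret > 0 && !(driver_flags & DPM_FLAG_NEVER_SKIP);
}
```

In this model NEVER_SKIP only vetoes direct_complete for the device that
sets it; it says nothing about the other flags added by the series.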

Or maybe I am just tired and should continue to review with a
fresher mind. :-)

Kind regards
Uffe

* Re: [Update][PATCH v2 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags
  2017-10-23 16:37     ` Ulf Hansson
@ 2017-10-23 20:41       ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-23 20:41 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Greg Kroah-Hartman, Lukas Wunner, Bjorn Helgaas,
	Alan Stern, LKML, Linux ACPI, Linux PCI, Linux Documentation,
	Mika Westerberg, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Monday, October 23, 2017 6:37:41 PM CEST Ulf Hansson wrote:
> On 19 October 2017 at 01:17, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > The motivation for this change is to provide a way to work around
> > a problem with the direct-complete mechanism used for avoiding
> > system suspend/resume handling for devices in runtime suspend.
> >
> > The problem is that some middle layer code (the PCI bus type and
> > the ACPI PM domain in particular) returns positive values from its
> > system suspend ->prepare callbacks regardless of whether the driver's
> > ->prepare returns a positive value or 0, which effectively prevents
> > drivers from being able to control the direct-complete feature.
> > Some drivers need that control, however, and the PCI bus type has
> > grown its own flag to deal with this issue, but since it is not
> > limited to PCI, it is better to address it by adding driver flags at
> > the core level.
> >
> > To that end, add a driver_flags field to struct dev_pm_info for flags
> > that can be set by device drivers at the probe time to inform the PM
> > > core and/or bus types, PM domains and so on about the capabilities and/or
> > preferences of device drivers.  Also add two static inline helpers
> > for setting that field and testing it against a given set of flags
> > and make the driver core clear it automatically on driver remove
> > and probe failures.
> >
> > Define and document two PM driver flags related to the direct-
> > complete feature: NEVER_SKIP and SMART_PREPARE that can be used,
> > respectively, to indicate to the PM core that the direct-complete
> > mechanism should never be used for the device and to inform the
> > middle layer code (bus types, PM domains etc) that it can only
> > request the PM core to use the direct-complete mechanism for
> > the device (by returning a positive value from its ->prepare
> > callback) if it also has been requested by the driver.
> >
> > While at it, make the core check pm_runtime_suspended() when
> > setting power.direct_complete so that it doesn't need to be
> > checked by ->prepare callbacks.
> >
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > ---
> >
> > -> v2: Change the data type for driver_flags to u32 as suggested by Greg
> >        and fix a couple of documentation typos pointed out by Lukas.
> >
> > ---
> >  Documentation/driver-api/pm/devices.rst |   14 ++++++++++++++
> >  Documentation/power/pci.txt             |   19 +++++++++++++++++++
> >  drivers/acpi/device_pm.c                |    3 +++
> >  drivers/base/dd.c                       |    2 ++
> >  drivers/base/power/main.c               |    4 +++-
> >  drivers/pci/pci-driver.c                |    5 ++++-
> >  include/linux/device.h                  |   10 ++++++++++
> >  include/linux/pm.h                      |   20 ++++++++++++++++++++
> >  8 files changed, 75 insertions(+), 2 deletions(-)
> >
> > Index: linux-pm/include/linux/device.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/device.h
> > +++ linux-pm/include/linux/device.h
> > @@ -1070,6 +1070,16 @@ static inline void dev_pm_syscore_device
> >  #endif
> >  }
> >
> > +static inline void dev_pm_set_driver_flags(struct device *dev, u32 flags)
> > +{
> > +       dev->power.driver_flags = flags;
> > +}
> > +
> > +static inline bool dev_pm_test_driver_flags(struct device *dev, u32 flags)
> > +{
> > +       return !!(dev->power.driver_flags & flags);
> > +}
> > +
> >  static inline void device_lock(struct device *dev)
> >  {
> >         mutex_lock(&dev->mutex);
> > Index: linux-pm/include/linux/pm.h
> > ===================================================================
> > --- linux-pm.orig/include/linux/pm.h
> > +++ linux-pm/include/linux/pm.h
> > @@ -550,6 +550,25 @@ struct pm_subsys_data {
> >  #endif
> >  };
> >
> > +/*
> > + * Driver flags to control system suspend/resume behavior.
> > + *
> > + * These flags can be set by device drivers at the probe time.  They need not be
> > + * cleared by the drivers as the driver core will take care of that.
> > + *
> > + * NEVER_SKIP: Do not skip system suspend/resume callbacks for the device.
> > + * SMART_PREPARE: Check the return value of the driver's ->prepare callback.
> > + *
> > + * Setting SMART_PREPARE instructs bus types and PM domains which may want
> > + * system suspend/resume callbacks to be skipped for the device to return 0 from
> > + * their ->prepare callbacks if the driver's ->prepare callback returns 0 (in
> > + * other words, the system suspend/resume callbacks can only be skipped for the
> > + * device if its driver doesn't object against that).  This flag has no effect
> > + * if NEVER_SKIP is set.
> 
> In principle ACPI/PCI middle-layer/PM domain could have started out by
> respecting the return values from driver's ->prepare() callbacks in
> case those existed, but they didn't, and that is the reason why the
> SMART_PREPARE flag is needed. Right?
> 
> My point is, I don't think we should encourage other middle layers to
> support the SMART_PREPARE flag, simply because they should be able to
> cope without it. To make this more obvious we could try to find a
> different name for the flag indicating that, or at least make it clear
> that we don't want it to be used by others than ACPI/PCI by
> documenting that.

I want it to be generic, though, so setting it should not be treated as a
mistake in any case (for example, because some drivers interact with the
ACPI PM domain and with some other middle layers).

If SMART_PREPARE simply overlaps with your default behavior, there's no need
to check the flag, but then it can be set really safely. :-)

> > + */
> > +#define DPM_FLAG_NEVER_SKIP    BIT(0)
> > +#define DPM_FLAG_SMART_PREPARE BIT(1)
> > +
> >  struct dev_pm_info {
> >         pm_message_t            power_state;
> >         unsigned int            can_wakeup:1;
> > @@ -561,6 +580,7 @@ struct dev_pm_info {
> >         bool                    is_late_suspended:1;
> >         bool                    early_init:1;   /* Owned by the PM core */
> >         bool                    direct_complete:1;      /* Owned by the PM core */
> > +       u32                     driver_flags;
> >         spinlock_t              lock;
> >  #ifdef CONFIG_PM_SLEEP
> >         struct list_head        entry;
> > Index: linux-pm/drivers/base/dd.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/dd.c
> > +++ linux-pm/drivers/base/dd.c
> > @@ -464,6 +464,7 @@ pinctrl_bind_failed:
> >         if (dev->pm_domain && dev->pm_domain->dismiss)
> >                 dev->pm_domain->dismiss(dev);
> >         pm_runtime_reinit(dev);
> > +       dev_pm_set_driver_flags(dev, 0);
> >
> >         switch (ret) {
> >         case -EPROBE_DEFER:
> > @@ -869,6 +870,7 @@ static void __device_release_driver(stru
> >                 if (dev->pm_domain && dev->pm_domain->dismiss)
> >                         dev->pm_domain->dismiss(dev);
> >                 pm_runtime_reinit(dev);
> > +               dev_pm_set_driver_flags(dev, 0);
> >
> >                 klist_remove(&dev->p->knode_driver);
> >                 device_pm_check_callbacks(dev);
> > Index: linux-pm/drivers/base/power/main.c
> > ===================================================================
> > --- linux-pm.orig/drivers/base/power/main.c
> > +++ linux-pm/drivers/base/power/main.c
> > @@ -1700,7 +1700,9 @@ unlock:
> >          * applies to suspend transitions, however.
> >          */
> >         spin_lock_irq(&dev->power.lock);
> > -       dev->power.direct_complete = ret > 0 && state.event == PM_EVENT_SUSPEND;
> > +       dev->power.direct_complete = state.event == PM_EVENT_SUSPEND &&
> > +               pm_runtime_suspended(dev) && ret > 0 &&
> > +               !dev_pm_test_driver_flags(dev, DPM_FLAG_NEVER_SKIP);
> >         spin_unlock_irq(&dev->power.lock);
> >         return 0;
> >  }
> > Index: linux-pm/drivers/pci/pci-driver.c
> > ===================================================================
> > --- linux-pm.orig/drivers/pci/pci-driver.c
> > +++ linux-pm/drivers/pci/pci-driver.c
> > @@ -682,8 +682,11 @@ static int pci_pm_prepare(struct device
> >
> >         if (drv && drv->pm && drv->pm->prepare) {
> >                 int error = drv->pm->prepare(dev);
> > -               if (error)
> > +               if (error < 0)
> >                         return error;
> > +
> > +               if (!error && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> > +                       return 0;
> >         }
> >         return pci_dev_keep_suspended(to_pci_dev(dev));
> >  }
> > Index: linux-pm/drivers/acpi/device_pm.c
> > ===================================================================
> > --- linux-pm.orig/drivers/acpi/device_pm.c
> > +++ linux-pm/drivers/acpi/device_pm.c
> > @@ -965,6 +965,9 @@ int acpi_subsys_prepare(struct device *d
> >         if (ret < 0)
> >                 return ret;
> >
> > +       if (!ret && dev_pm_test_driver_flags(dev, DPM_FLAG_SMART_PREPARE))
> > +               return 0;
> 
> So if the driver doesn't implement the ->prepare() callback, you still
> want to treat this flag as if it had one assigned that returns 0?
> 
> It seems not entirely according to what you have documented about the flag.

You are right, I just should not use the _generic_prepare() thing there.

I'll send a fix patch for that.
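
A minimal sketch of the behavior agreed on here, assuming the flag should
only matter when the driver actually provides ->prepare (which is what the
pci_pm_prepare() hunk in this patch already does). It is a userspace model,
not the follow-up fix itself: struct model_dev, model_middle_layer_prepare()
and the two example driver callbacks are hypothetical names.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <stdint.h>

/* Same bit position as BIT(1) used for the flag in this patch. */
#define DPM_FLAG_SMART_PREPARE (1u << 1)

/* Hypothetical stand-in for the relevant parts of struct device. */
struct model_dev {
	uint32_t driver_flags;
	int (*drv_prepare)(struct model_dev *dev); /* NULL if not provided */
	int runtime_suspended;
};

/*
 * Middle-layer ->prepare model: a positive return value asks the PM core
 * to use direct-complete.  SMART_PREPARE is only consulted when the driver
 * has a ->prepare callback and that callback returned 0.
 */
static int model_middle_layer_prepare(struct model_dev *dev)
{
	if (dev->drv_prepare) {
		int ret = dev->drv_prepare(dev);

		if (ret < 0)
			return ret;

		if (!ret && (dev->driver_flags & DPM_FLAG_SMART_PREPARE))
			return 0;
	}

	/* The middle layer's own policy, reduced to one check here. */
	return dev->runtime_suspended ? 1 : 0;
}

/* Example driver callbacks used to exercise the model. */
static int model_drv_prepare_zero(struct model_dev *dev)
{
	(void)dev;
	return 0;
}

static int model_drv_prepare_busy(struct model_dev *dev)
{
	(void)dev;
	return -EBUSY;
}
```

With no driver callback the flag has no effect, which matches the
documented meaning of SMART_PREPARE ("check the return value of the
driver's ->prepare callback").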

> > +
> >         if (!adev || !pm_runtime_suspended(dev))
> >                 return 0;
> >
> > Index: linux-pm/Documentation/driver-api/pm/devices.rst
> > ===================================================================
> > --- linux-pm.orig/Documentation/driver-api/pm/devices.rst
> > +++ linux-pm/Documentation/driver-api/pm/devices.rst
> > @@ -354,6 +354,20 @@ the phases are: ``prepare``, ``suspend``
> >         is because all such devices are initially set to runtime-suspended with
> >         runtime PM disabled.
> >
> > +       This feature also can be controlled by device drivers by using the
> > +       ``DPM_FLAG_NEVER_SKIP`` and ``DPM_FLAG_SMART_PREPARE`` driver power
> > +       management flags.  [Typically, they are set at the time the driver is
> > +       probed against the device in question by passing them to the
> > +       :c:func:`dev_pm_set_driver_flags` helper function.]  If the first of
> > +       these flags is set, the PM core will not apply the direct-complete
> > > +       procedure described above to the given device and, consequently, to any
> > +       of its ancestors.  The second flag, when set, informs the middle layer
> > +       code (bus types, device types, PM domains, classes) that it should take
> > +       the return value of the ``->prepare`` callback provided by the driver
> > +       into account and it may only return a positive value from its own
> > +       ``->prepare`` callback if the driver's one also has returned a positive
> > +       value.
> > +
> >      2. The ``->suspend`` methods should quiesce the device to stop it from
> >         performing I/O.  They also may save the device registers and put it into
> >         the appropriate low-power state, depending on the bus type the device is
> > Index: linux-pm/Documentation/power/pci.txt
> > ===================================================================
> > --- linux-pm.orig/Documentation/power/pci.txt
> > +++ linux-pm/Documentation/power/pci.txt
> > @@ -961,6 +961,25 @@ dev_pm_ops to indicate that one suspend
> >  .suspend(), .freeze(), and .poweroff() members and one resume routine is to
> >  be pointed to by the .resume(), .thaw(), and .restore() members.
> >
> > +3.1.19. Driver Flags for Power Management
> > +
> > +The PM core allows device drivers to set flags that influence the handling of
> > +power management for the devices by the core itself and by middle layer code
> > +including the PCI bus type.  The flags should be set once at the driver probe
> > +time with the help of the dev_pm_set_driver_flags() function and they should not
> > +be updated directly afterwards.
> 
> I am wondering if we really need to make a statement generic to all
> "driver PM flags" that these flags must be set at ->probe(). Maybe
> that is better documented per flag, rather than for all. The reason
> why I bring it up is that I would not be surprised if a new flag
> comes along which may be used a bit differently, not requiring
> that.
> 
> Of course we can also update that later on, if needed.

Right.

Thanks,
Rafael

* Re: [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag
  2017-10-16  1:29 ` [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag Rafael J. Wysocki
  2017-10-23 19:01   ` Ulf Hansson
@ 2017-10-24  5:22   ` Ulf Hansson
  2017-10-24  8:55     ` Rafael J. Wysocki
  1 sibling, 1 reply; 79+ messages in thread
From: Ulf Hansson @ 2017-10-24  5:22 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>
> Define and document a SMART_SUSPEND flag to instruct bus types and PM
> domains that the system suspend callbacks provided by the driver can
> cope with runtime-suspended devices, so from the driver's perspective
> it should be safe to leave devices in runtime suspend during system
> suspend.
>
> Setting that flag also causes the PM core to skip the "late" and
> "noirq" phases of device suspend for devices that remain in runtime
> suspend at the beginning of the "late" phase (when runtime PM has
> been disabled for them) under the assumption that their state cannot
> (and should not) change after that point until the system suspend
> transition is complete.  Moreover, the PM core prevents runtime PM
> from acting on devices with DPM_FLAG_SMART_SUSPEND during system
> resume by setting their runtime PM status to "active" at the end of
> the "early" phase (right prior to enabling runtime PM for them).
> That allows system resume callbacks to do whatever is necessary to
> resume the device without worrying about runtime PM possibly
> running in parallel with them.

After some sleep, I woke up and realized that the whole thing of making
the PM core skip invoking system sleep callbacks is not compatible
with devices being attached to the genpd. Sorry.

The reason is that genpd may not power off the PM domain, even if
all devices attached to it are runtime suspended. For example, due to
a subdomain holding it powered or because a PM QoS constraint
prevents powering it off at runtime. Then, to understand whether it
shall power off/on the PM domain during system-wide PM, it requires
the system sleep callbacks to be invoked.

So, even if the driver can cope with the behavior from
DPM_FLAG_SMART_SUSPEND, then what happens when the PM domain (genpd)
cannot?

Taking this into account, this feels like a solution entirely specific
to ACPI and PCI. That is fine by me, however then we still have those
cross-SoC drivers, such as the i2c-designware driver, which may have its
device attached to an ACPI PM domain or perhaps a genpd.

>
> However, that doesn't apply to transitions involving ->thaw_noirq,
> ->thaw_early and ->thaw callbacks during hibernation, as they
> generally are not expected to change the power states of devices.
> Consequently, if a device is in runtime suspend at the beginning
> of such a transition, it must stay in runtime suspend until the
> "complete" phase of it (since the callbacks may not change its
> power state).
>

[...]

Kind regards
Uffe

* Re: [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag
  2017-10-24  5:22   ` Ulf Hansson
@ 2017-10-24  8:55     ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-24  8:55 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c@vger.kernel.org, Lee Jones

On Tuesday, October 24, 2017 7:22:25 AM CEST Ulf Hansson wrote:
> On 16 October 2017 at 03:29, Rafael J. Wysocki <rjw@rjwysocki.net> wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >
> > Define and document a SMART_SUSPEND flag to instruct bus types and PM
> > domains that the system suspend callbacks provided by the driver can
> > cope with runtime-suspended devices, so from the driver's perspective
> > it should be safe to leave devices in runtime suspend during system
> > suspend.
> >
> > Setting that flag also causes the PM core to skip the "late" and
> > "noirq" phases of device suspend for devices that remain in runtime
> > suspend at the beginning of the "late" phase (when runtime PM has
> > been disabled for them) under the assumption that their state cannot
> > (and should not) change after that point until the system suspend
> > transition is complete.  Moreover, the PM core prevents runtime PM
> > from acting on devices with DPM_FLAG_SMART_SUSPEND during system
> > resume by setting their runtime PM status to "active" at the end of
> > the "early" phase (right prior to enabling runtime PM for them).
> > That allows system resume callbacks to do whatever is necessary to
> > resume the device without worrying about runtime PM possibly
> > running in parallel with them.
> 
> After some sleep, I woke up and realized that the whole thing of making
> the PM core skip invoking system sleep callbacks is not compatible
> with devices being attached to the genpd. Sorry.

That's OK. :-)

It just means I need to move that logic up to the concerned middle layers.

I was going to do that to start with, but then I thought I would do it in
the core to avoid duplicated checks.  I overlooked the genpd case, however.

> The reason is that genpd may not power off the PM domain, even if
> all devices attached to it are runtime suspended. For example, due to
> a subdomain holding it powered or because a PM QoS constraint
> prevents powering it off at runtime. Then, to understand whether it
> shall power off/on the PM domain during system-wide PM, it requires
> the system sleep callbacks to be invoked.
> 
> So, even if the driver can cope with the behavior from
> DPM_FLAG_SMART_SUSPEND, then what happens when the PM domain (genpd)
> cannot?
> 
> Taking this into account, this feels like a solution entirely specific
> to ACPI and PCI. That is fine by me, however then we still have those
> cross-SoC drivers, such as the i2c-designware driver, which may have its
> device attached to an ACPI PM domain or perhaps a genpd.

Yes, that should be fine if the logic above goes to the PCI bus type
and ACPI PM domain.  Then, setting the flag will have no effect on
genpd at all, but that's OK.
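
To illustrate the plan above, here is a rough userspace model of the
SMART_SUSPEND check living in the middle layer rather than in the core. It
is only a sketch under assumptions: struct model_dev, model_suspend_late()
and the layer_honors_flag parameter are hypothetical, with the latter
standing in for "this is the PCI bus type or the ACPI PM domain" (true) as
opposed to genpd (false).

```c
#include <assert.h>
#include <stdbool.h>
#include <stdint.h>

/* Same bit position as BIT(2) used for the flag in this series. */
#define DPM_FLAG_SMART_SUSPEND (1u << 2)

/* Hypothetical stand-in for the relevant parts of struct device. */
struct model_dev {
	uint32_t driver_flags;
	bool runtime_suspended;  /* still runtime-suspended in "late" phase */
	int suspend_late_calls;  /* how often the driver callback ran */
};

/* A layer skips the callback only if it chose to honor the flag. */
static bool model_skip_late_suspend(const struct model_dev *dev,
				    bool layer_honors_flag)
{
	return layer_honors_flag &&
	       (dev->driver_flags & DPM_FLAG_SMART_SUSPEND) &&
	       dev->runtime_suspended;
}

/* Middle-layer "late" suspend: count invocations of the driver callback. */
static void model_suspend_late(struct model_dev *dev, bool layer_honors_flag)
{
	if (model_skip_late_suspend(dev, layer_honors_flag))
		return;

	dev->suspend_late_calls++;
}
```

A layer that passes false here behaves exactly as if the flag did not
exist, so it always sees the system sleep callbacks it relies on.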

Thanks,
Rafael

* Re: [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management
  2017-10-16  1:31 ` [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management Rafael J. Wysocki
@ 2017-10-26 20:41   ` Wolfram Sang
  2017-10-26 21:14     ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Wolfram Sang @ 2017-10-26 20:41 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, linux-i2c, Lee Jones

On Mon, Oct 16, 2017 at 03:31:17AM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Optimize the power management in i2c-designware-platdrv by making it
> set the DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED flags, which
> allows some code to be dropped from its PM callbacks.
> 
> First, setting DPM_FLAG_SMART_SUSPEND causes the intel-lpss driver
> to avoid resuming i2c-designware-platdrv devices in its ->prepare
> callback, so they can stay in runtime suspend after that point even
> if the direct-complete feature is not used for them.
> 
> It also causes the PM core to avoid invoking "late" and "noirq"
> suspend callbacks for these devices if they are in runtime suspend
> at the beginning of the "late" phase of device suspend during
> system suspend.  That guarantees dw_i2c_plat_suspend() to be
> called for a device only if it is not in runtime suspend.
> Moreover, it also causes the PM core to set the device's runtime
> PM status to "active" after calling dw_i2c_plat_resume() for
> it, so the driver doesn't need internal flags to avoid invoking
> either dw_i2c_plat_suspend() or dw_i2c_plat_resume() twice in
> a row.
> 
> Second, setting DPM_FLAG_LEAVE_SUSPENDED enables the optimization
> allowing the device to stay suspended after system resume under
> suitable conditions, so again the driver doesn't need to take
> care of that by itself.
> 
> Accordingly, the internal "suspended" and "skip_resume" flags
> used by the driver are not necessary any more, so drop them and
> simplify the driver's PM callbacks.
> 
> Additionally, notice that dw_i2c_plat_complete() only needs
> to schedule runtime PM for the device if platform firmware
> has been involved in resuming the system, so make it call
> pm_resume_via_firmware() to check that.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>

So, if the designware maintainers ack it, I will, too.
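
For readers following along, the DPM_FLAG_SMART_SUSPEND behavior described
in the changelog above can be modelled in plain userspace C.  This is only
a rough sketch under stated assumptions, not the kernel implementation:
struct model_device, MODEL_FLAG_SMART_SUSPEND, model_suspend_late() and
model_resume() are hypothetical names made up for illustration, and the
model only covers the two points the changelog makes (the callback is
skipped for a runtime-suspended device, and the runtime PM status is set
to "active" after the resume callback).

```c
#include <stdbool.h>

/* Hypothetical stand-in for DPM_FLAG_SMART_SUSPEND; the bit value is arbitrary. */
#define MODEL_FLAG_SMART_SUSPEND (1u << 0)

/* Hypothetical, heavily simplified view of one device. */
struct model_device {
	unsigned int driver_flags;  /* flags the driver set at probe time */
	bool runtime_suspended;     /* current runtime PM status */
	int suspend_calls;          /* how often the driver's suspend callback ran */
	int resume_calls;           /* how often the driver's resume callback ran */
};

/*
 * Model of the "late" suspend phase: skip the driver's callback when the
 * flag is set and the device is already in runtime suspend.
 * Returns true if the callback was invoked.
 */
static bool model_suspend_late(struct model_device *dev)
{
	if ((dev->driver_flags & MODEL_FLAG_SMART_SUSPEND) && dev->runtime_suspended)
		return false;

	dev->suspend_calls++;
	return true;
}

/*
 * Model of the matching resume step: run the driver's callback and, with
 * the flag set, mark the device as active for runtime PM afterwards.
 */
static void model_resume(struct model_device *dev)
{
	dev->resume_calls++;
	if (dev->driver_flags & MODEL_FLAG_SMART_SUSPEND)
		dev->runtime_suspended = false;
}
```

In this model the suspend callback can never run for a device that is
already runtime-suspended, which is the property that makes the driver's
own "suspended" bookkeeping redundant.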


* Re: [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management
  2017-10-26 20:41   ` Wolfram Sang
@ 2017-10-26 21:14     ` Rafael J. Wysocki
  0 siblings, 0 replies; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-26 21:14 UTC (permalink / raw)
  To: Wolfram Sang
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, linux-i2c, Lee Jones

On Thursday, October 26, 2017 10:41:40 PM CEST Wolfram Sang wrote:
> On Mon, Oct 16, 2017 at 03:31:17AM +0200, Rafael J. Wysocki wrote:
> > From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> > 
> > Optimize the power management in i2c-designware-platdrv by making it
> > set the DPM_FLAG_SMART_SUSPEND and DPM_FLAG_LEAVE_SUSPENDED flags, which
> > allows some code to be dropped from its PM callbacks.
> > 
> > First, setting DPM_FLAG_SMART_SUSPEND causes the intel-lpss driver
> > to avoid resuming i2c-designware-platdrv devices in its ->prepare
> > callback, so they can stay in runtime suspend after that point even
> > if the direct-complete feature is not used for them.
> > 
> > It also causes the PM core to avoid invoking "late" and "noirq"
> > suspend callbacks for these devices if they are in runtime suspend
> > at the beginning of the "late" phase of device suspend during
> > system suspend.  That guarantees dw_i2c_plat_suspend() to be
> > called for a device only if it is not in runtime suspend.
> > Moreover, it also causes the PM core to set the device's runtime
> > PM status to "active" after calling dw_i2c_plat_resume() for
> > it, so the driver doesn't need internal flags to avoid invoking
> > either dw_i2c_plat_suspend() or dw_i2c_plat_resume() twice in
> > a row.
> > 
> > Second, setting DPM_FLAG_LEAVE_SUSPENDED enables the optimization
> > allowing the device to stay suspended after system resume under
> > suitable conditions, so again the driver doesn't need to take
> > care of that by itself.
> > 
> > Accordingly, the internal "suspended" and "skip_resume" flags
> > used by the driver are not necessary any more, so drop them and
> > simplify the driver's PM callbacks.
> > 
> > Additionally, notice that dw_i2c_plat_complete() only needs
> > to schedule runtime PM for the device if platform firmware
> > has been involved in resuming the system, so make it call
> > pm_resume_via_firmware() to check that.
> > 
> > Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> So, if the designware maintainers ack it, I will, too.

Thanks!

I need to post a new revision of the core patches, so I'll send this one
again later.  Likely during the next cycle.

Thanks,
Rafael
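
As an aside, the last point of the quoted changelog (dw_i2c_plat_complete()
only scheduling a runtime resume when platform firmware was involved in
resuming the system) might be pictured with the userspace sketch below.
It is an illustration under assumptions, not the driver's code:
struct model_i2c_dev and model_plat_complete() are hypothetical, the
resumed_via_firmware argument stands in for the result of
pm_resume_via_firmware(), and the extra check that the device is still
runtime-suspended is an assumption of the sketch.

```c
#include <stdbool.h>

/* Hypothetical, simplified device state for the sketch. */
struct model_i2c_dev {
	bool runtime_suspended;  /* device was left in runtime suspend */
	bool resume_requested;   /* an asynchronous runtime resume was queued */
};

/*
 * Model of the ->complete step: queue a runtime resume only if the device
 * is still suspended and firmware took part in resuming the system (and
 * may therefore have changed the device's state behind the driver's back).
 */
static void model_plat_complete(struct model_i2c_dev *dev,
				bool resumed_via_firmware)
{
	if (dev->runtime_suspended && resumed_via_firmware)
		dev->resume_requested = true;
}
```

When firmware was not involved (for instance after suspend-to-idle), the
model leaves the device alone, so it can stay suspended.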

* Re: [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-10-16  1:30 ` [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND Rafael J. Wysocki
@ 2017-10-31 15:09   ` Lee Jones
  2017-10-31 16:28     ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Lee Jones @ 2017-10-31 15:09 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Linux PM, Bjorn Helgaas, Alan Stern, Greg Kroah-Hartman, LKML,
	Linux ACPI, Linux PCI, Linux Documentation, Mika Westerberg,
	Ulf Hansson, Andy Shevchenko, Kevin Hilman, Wolfram Sang,
	linux-i2c

On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:

> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
> devices which will allow them to stay in runtime suspend during
> system suspend unless they need to be reconfigured for some reason.
> 
> Also make it avoid resuming its child devices if they have
> DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
> suspend during system suspend.
> 
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
>  drivers/mfd/intel-lpss.c |    6 +++++-
>  1 file changed, 5 insertions(+), 1 deletion(-)

Is this patch independent?

For my own reference:
  Acked-for-MFD-by: Lee Jones <lee.jones@linaro.org>

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs
Follow Linaro: Facebook | Twitter | Blog

* Re: [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-10-31 15:09   ` Lee Jones
@ 2017-10-31 16:28     ` Rafael J. Wysocki
  2017-11-01  9:28       ` Lee Jones
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-10-31 16:28 UTC (permalink / raw)
  To: Lee Jones
  Cc: Rafael J. Wysocki, Linux PM, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c

On Tue, Oct 31, 2017 at 4:09 PM, Lee Jones <lee.jones@linaro.org> wrote:
> On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:
>
>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>
>> Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
>> devices which will allow them to stay in runtime suspend during
>> system suspend unless they need to be reconfigured for some reason.
>>
>> Also make it avoid resuming its child devices if they have
>> DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
>> suspend during system suspend.
>>
>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> ---
>>  drivers/mfd/intel-lpss.c |    6 +++++-
>>  1 file changed, 5 insertions(+), 1 deletion(-)
>
> Is this patch independent?

It depends on the flag definition at least, but functionally it also
depends on the PCI support for the flag.
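
To illustrate what the quoted changelog means by not resuming child
devices that have DPM_FLAG_SMART_SUSPEND set, here is a small userspace
model of the parent's ->prepare step.  It is a sketch only:
struct model_child, MODEL_FLAG_SMART_SUSPEND and model_lpss_prepare() are
hypothetical names, and walking the children in reverse order is an
assumption of the sketch rather than something stated in this thread.

```c
#include <stdbool.h>
#include <stddef.h>

/* Hypothetical stand-in for DPM_FLAG_SMART_SUSPEND; the bit value is arbitrary. */
#define MODEL_FLAG_SMART_SUSPEND (1u << 0)

/* Hypothetical, simplified child device of the MFD parent. */
struct model_child {
	unsigned int driver_flags;  /* flags set by the child's driver */
	bool runtime_suspended;     /* current runtime PM status */
};

/*
 * Model of the parent's prepare step: resume every runtime-suspended child
 * unless its driver opted in to staying suspended via the flag.
 * Returns the number of children that were resumed.
 */
static size_t model_lpss_prepare(struct model_child *children, size_t count)
{
	size_t resumed = 0;

	for (size_t i = count; i-- > 0; ) {
		if (children[i].driver_flags & MODEL_FLAG_SMART_SUSPEND)
			continue;  /* leave it in runtime suspend */

		if (children[i].runtime_suspended) {
			children[i].runtime_suspended = false;
			resumed++;
		}
	}
	return resumed;
}
```

Children whose drivers do not set the flag keep the old behavior and are
resumed before system suspend proceeds.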

> For my own reference:
>   Acked-for-MFD-by: Lee Jones <lee.jones@linaro.org>

Thanks,
Rafael

* Re: [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-10-31 16:28     ` Rafael J. Wysocki
@ 2017-11-01  9:28       ` Lee Jones
  2017-11-01 20:26         ` Rafael J. Wysocki
  0 siblings, 1 reply; 79+ messages in thread
From: Lee Jones @ 2017-11-01  9:28 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c

On Tue, 31 Oct 2017, Rafael J. Wysocki wrote:

> On Tue, Oct 31, 2017 at 4:09 PM, Lee Jones <lee.jones@linaro.org> wrote:
> > On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:
> >
> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >>
> >> Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
> >> devices which will allow them to stay in runtime suspend during
> >> system suspend unless they need to be reconfigured for some reason.
> >>
> >> Also make it avoid resuming its child devices if they have
> >> DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
> >> suspend during system suspend.
> >>
> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> ---
> >>  drivers/mfd/intel-lpss.c |    6 +++++-
> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > Is this patch independent?
> 
> It depends on the flag definition at least, but functionally it also
> depends on the PCI support for the flag.

No problem.  Which tree do you propose this goes through?

> > For my own reference:
> >   Acked-for-MFD-by: Lee Jones <lee.jones@linaro.org>


-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs

* Re: [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-11-01  9:28       ` Lee Jones
@ 2017-11-01 20:26         ` Rafael J. Wysocki
  2017-11-08 11:08           ` Lee Jones
  0 siblings, 1 reply; 79+ messages in thread
From: Rafael J. Wysocki @ 2017-11-01 20:26 UTC (permalink / raw)
  To: Lee Jones
  Cc: Rafael J. Wysocki, Rafael J. Wysocki, Linux PM, Bjorn Helgaas,
	Alan Stern, Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c

On Wed, Nov 1, 2017 at 10:28 AM, Lee Jones <lee.jones@linaro.org> wrote:
> On Tue, 31 Oct 2017, Rafael J. Wysocki wrote:
>
>> On Tue, Oct 31, 2017 at 4:09 PM, Lee Jones <lee.jones@linaro.org> wrote:
>> > On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:
>> >
>> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >>
>> >> Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
>> >> devices which will allow them to stay in runtime suspend during
>> >> system suspend unless they need to be reconfigured for some reason.
>> >>
>> >> Also make it avoid resuming its child devices if they have
>> >> DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
>> >> suspend during system suspend.
>> >>
>> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>> >> ---
>> >>  drivers/mfd/intel-lpss.c |    6 +++++-
>> >>  1 file changed, 5 insertions(+), 1 deletion(-)
>> >
>> > Is this patch independent?
>>
>> It depends on the flag definition at least, but functionally it also
>> depends on the PCI support for the flag.
>
> No problem.  Which tree do you propose this goes through?

linux-pm.git if that's not a problem as the patches it depends on will
go through it too.

That said I'll resend it when the core patches it depends on are ready.

Thanks,
Rafael

* Re: [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND
  2017-11-01 20:26         ` Rafael J. Wysocki
@ 2017-11-08 11:08           ` Lee Jones
  0 siblings, 0 replies; 79+ messages in thread
From: Lee Jones @ 2017-11-08 11:08 UTC (permalink / raw)
  To: Rafael J. Wysocki
  Cc: Rafael J. Wysocki, Linux PM, Bjorn Helgaas, Alan Stern,
	Greg Kroah-Hartman, LKML, Linux ACPI, Linux PCI,
	Linux Documentation, Mika Westerberg, Ulf Hansson,
	Andy Shevchenko, Kevin Hilman, Wolfram Sang, linux-i2c

On Wed, 01 Nov 2017, Rafael J. Wysocki wrote:

> On Wed, Nov 1, 2017 at 10:28 AM, Lee Jones <lee.jones@linaro.org> wrote:
> > On Tue, 31 Oct 2017, Rafael J. Wysocki wrote:
> >
> >> On Tue, Oct 31, 2017 at 4:09 PM, Lee Jones <lee.jones@linaro.org> wrote:
> >> > On Mon, 16 Oct 2017, Rafael J. Wysocki wrote:
> >> >
> >> >> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >>
> >> >> Make the intel-lpss driver set DPM_FLAG_SMART_SUSPEND for its
> >> >> devices which will allow them to stay in runtime suspend during
> >> >> system suspend unless they need to be reconfigured for some reason.
> >> >>
> >> >> Also make it avoid resuming its child devices if they have
> >> >> DPM_FLAG_SMART_SUSPEND set to allow them to remain in runtime
> >> >> suspend during system suspend.
> >> >>
> >> >> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> >> >> ---
> >> >>  drivers/mfd/intel-lpss.c |    6 +++++-
> >> >>  1 file changed, 5 insertions(+), 1 deletion(-)
> >> >
> >> > Is this patch independent?
> >>
> >> It depends on the flag definition at least, but functionally it also
> >> depends on the PCI support for the flag.
> >
> > No problem.  Which tree do you propose this goes through?
> 
> linux-pm.git if that's not a problem as the patches it depends on will
> go through it too.
> 
> That said I'll resend it when the core patches it depends on are ready.

It's fine by me.

Please check to see if there are any clashes with MFD.  If there are,
I'll need a (small) pull-request from you.

-- 
Lee Jones
Linaro STMicroelectronics Landing Team Lead
Linaro.org │ Open source software for ARM SoCs

end of thread, other threads:[~2017-11-08 11:08 UTC | newest]

Thread overview: 79+ messages
2017-10-16  1:12 [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Rafael J. Wysocki
2017-10-16  1:29 ` [PATCH 01/12] PM / core: Add NEVER_SKIP and SMART_PREPARE driver flags Rafael J. Wysocki
2017-10-16  5:34   ` Lukas Wunner
2017-10-16 22:03     ` Rafael J. Wysocki
2017-10-16  6:28   ` Greg Kroah-Hartman
2017-10-16 22:05     ` Rafael J. Wysocki
2017-10-17  7:15       ` Greg Kroah-Hartman
2017-10-17 15:26         ` Rafael J. Wysocki
2017-10-18  6:56           ` Greg Kroah-Hartman
2017-10-16  6:31   ` Greg Kroah-Hartman
2017-10-16 22:07     ` Rafael J. Wysocki
2017-10-17 13:26       ` Greg Kroah-Hartman
2017-10-16 20:16   ` Alan Stern
2017-10-16 22:11     ` Rafael J. Wysocki
2017-10-18 23:17   ` [Update][PATCH v2 " Rafael J. Wysocki
2017-10-19  7:33     ` Greg Kroah-Hartman
2017-10-20 11:11       ` Rafael J. Wysocki
2017-10-20 11:35         ` Greg Kroah-Hartman
2017-10-20 11:28           ` Rafael J. Wysocki
2017-10-23 16:37     ` Ulf Hansson
2017-10-23 20:41       ` Rafael J. Wysocki
2017-10-16  1:29 ` [PATCH 02/12] PCI / PM: Use the NEVER_SKIP driver flag Rafael J. Wysocki
2017-10-23 16:40   ` Ulf Hansson
2017-10-16  1:29 ` [PATCH 03/12] PM: i2c-designware-platdrv: Use DPM_FLAG_SMART_PREPARE Rafael J. Wysocki
2017-10-23 16:57   ` Ulf Hansson
2017-10-16  1:29 ` [PATCH 04/12] PM / core: Add SMART_SUSPEND driver flag Rafael J. Wysocki
2017-10-23 19:01   ` Ulf Hansson
2017-10-24  5:22   ` Ulf Hansson
2017-10-24  8:55     ` Rafael J. Wysocki
2017-10-16  1:29 ` [PATCH 05/12] PCI / PM: Drop unnecessary invocations of pcibios_pm_ops callbacks Rafael J. Wysocki
2017-10-23 19:06   ` Ulf Hansson
2017-10-16  1:29 ` [PATCH 06/12] PCI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
2017-10-16  1:29 ` [PATCH 07/12] ACPI / LPSS: Consolidate runtime PM and system sleep handling Rafael J. Wysocki
2017-10-23 19:09   ` Ulf Hansson
2017-10-16  1:30 ` [PATCH 08/12] ACPI / PM: Take SMART_SUSPEND driver flag into account Rafael J. Wysocki
2017-10-16  1:30 ` [PATCH 09/12] PM / mfd: intel-lpss: Use DPM_FLAG_SMART_SUSPEND Rafael J. Wysocki
2017-10-31 15:09   ` Lee Jones
2017-10-31 16:28     ` Rafael J. Wysocki
2017-11-01  9:28       ` Lee Jones
2017-11-01 20:26         ` Rafael J. Wysocki
2017-11-08 11:08           ` Lee Jones
2017-10-16  1:30 ` [PATCH 10/12] PM / core: Add LEAVE_SUSPENDED driver flag Rafael J. Wysocki
2017-10-23 19:38   ` Ulf Hansson
2017-10-16  1:31 ` [PATCH 11/12] PM: i2c-designware-platdrv: Optimize power management Rafael J. Wysocki
2017-10-26 20:41   ` Wolfram Sang
2017-10-26 21:14     ` Rafael J. Wysocki
2017-10-16  1:32 ` [PATCH 12/12] PM / core: Add AVOID_RPM driver flag Rafael J. Wysocki
2017-10-17 15:33   ` Andy Shevchenko
2017-10-17 15:59     ` Rafael J. Wysocki
2017-10-17 16:25       ` Andy Shevchenko
2017-10-16  7:08 ` [PATCH 0/12] PM / sleep: Driver flags for system suspend/resume Greg Kroah-Hartman
2017-10-16 21:50   ` Rafael J. Wysocki
2017-10-17  8:36 ` Ulf Hansson
2017-10-17 15:25   ` Rafael J. Wysocki
2017-10-17 19:41     ` Ulf Hansson
2017-10-17 20:12       ` Alan Stern
2017-10-17 23:07         ` Rafael J. Wysocki
2017-10-18  0:39       ` Rafael J. Wysocki
2017-10-18 10:24         ` Rafael J. Wysocki
2017-10-18 12:34           ` Ulf Hansson
2017-10-18 21:54             ` Rafael J. Wysocki
2017-10-18 11:57         ` Ulf Hansson
2017-10-18 13:00           ` Rafael J. Wysocki
2017-10-18 14:11             ` Ulf Hansson
2017-10-18 19:45               ` Grygorii Strashko
2017-10-18 21:48                 ` Rafael J. Wysocki
2017-10-19  8:33                   ` Ulf Hansson
2017-10-19 17:21                     ` Grygorii Strashko
2017-10-19 18:04                       ` Ulf Hansson
2017-10-19 18:11                         ` Ulf Hansson
2017-10-19 21:31                           ` Grygorii Strashko
2017-10-20  6:05                             ` Ulf Hansson
2017-10-18 22:12               ` Rafael J. Wysocki
2017-10-19 12:21                 ` Ulf Hansson
2017-10-19 18:01                   ` Ulf Hansson
2017-10-20  1:19                   ` Rafael J. Wysocki
2017-10-20  5:57                     ` Ulf Hansson
2017-10-20 20:46 ` Bjorn Helgaas
2017-10-21  1:04   ` Rafael J. Wysocki
