linux-pm.vger.kernel.org archive mirror
* [RFC/PATCH 0/3] PM: QoS: Introduce a system-wakeup QoS limit for s2idle and genpd
@ 2025-07-16 12:33 Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit Ulf Hansson
                   ` (2 more replies)
  0 siblings, 3 replies; 9+ messages in thread
From: Ulf Hansson @ 2025-07-16 12:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm
  Cc: Kevin Hilman, Pavel Machek, Len Brown, Daniel Lezcano,
	Saravana Kannan, Maulik Shah, Prasad Sodagudi, Ulf Hansson,
	linux-kernel

Some platforms and devices support multiple low-power states that can be
used for system-wide suspend. Today these states are selected on a
per-subsystem basis and in most cases the deepest possible state is
selected.

For some use-cases this is a problem, as entering these deeper states
during system-wide suspend is unsuitable or even breaks the system-wakeup
latency constraint.

Therefore, let's introduce an interface for user-space that allows it to
specify a system-wakeup QoS limit. As part of this initial series, let's
also start taking the new QoS limit into account when selecting a
low-power state for PM domains in genpd and for s2idle in the cpuidle
core.

Note that documentation of the new userspace interface is intentionally not
included in this initial version. I simply wanted us to focus the discussion
on whether the proposed approach seems reasonable, before spending time on
the documentation.

If you want to run some tests, there is a new file added at
/dev/system_wakeup_latency, which works similarly to /dev/cpu_dma_latency [1].
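
A minimal user-space test client could look like the sketch below (based on
the write handler in patch 1/3; the value is in microseconds and the QoS
request stays active only for as long as the file descriptor is kept open,
so a plain echo(1) into the device has no lasting effect):

#include <fcntl.h>
#include <stdint.h>
#include <stdio.h>
#include <unistd.h>

int main(void)
{
	int32_t limit_us = 2000;	/* request a 2 ms wakeup latency limit */
	int fd = open("/dev/system_wakeup_latency", O_WRONLY);

	if (fd < 0) {
		perror("open");
		return 1;
	}
	/* A write of exactly sizeof(s32) bytes is taken as a binary value. */
	if (write(fd, &limit_us, sizeof(limit_us)) != sizeof(limit_us)) {
		perror("write");
		close(fd);
		return 1;
	}
	pause();	/* keep the fd open so the request stays in effect */
	close(fd);
	return 0;
}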

Note that I first considered re-using /dev/cpu_dma_latency for the
system-wakeup latency constraint too, but on second thought, mixing QoS
limits for runtime and system-wide suspend doesn't really work well.

Kind regards
Ulf Hansson

[1] Documentation/power/pm_qos_interface.rst

Ulf Hansson (3):
  PM: QoS: Introduce a system-wakeup QoS limit
  pmdomain: Respect the system-wakeup QoS limit at system-wide suspend
  cpuidle: Respect the system-wakeup QoS limit for s2idle

 drivers/cpuidle/cpuidle.c   |   9 +--
 drivers/pmdomain/core.c     |  10 +++-
 drivers/pmdomain/governor.c |  23 ++++++++
 include/linux/pm_domain.h   |   1 +
 include/linux/pm_qos.h      |   9 +++
 kernel/power/qos.c          | 114 ++++++++++++++++++++++++++++++++++++
 6 files changed, 160 insertions(+), 6 deletions(-)

-- 
2.43.0



* [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit
  2025-07-16 12:33 [RFC/PATCH 0/3] PM: QoS: Introduce a system-wakeup QoS limit for s2idle and genpd Ulf Hansson
@ 2025-07-16 12:33 ` Ulf Hansson
  2025-07-21 16:39   ` Rafael J. Wysocki
  2025-07-16 12:33 ` [RFC/PATCH 2/3] pmdomain: Respect the system-wakeup QoS limit at system-wide suspend Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 3/3] cpuidle: Respect the system-wakeup QoS limit for s2idle Ulf Hansson
  2 siblings, 1 reply; 9+ messages in thread
From: Ulf Hansson @ 2025-07-16 12:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm
  Cc: Kevin Hilman, Pavel Machek, Len Brown, Daniel Lezcano,
	Saravana Kannan, Maulik Shah, Prasad Sodagudi, Ulf Hansson,
	linux-kernel

Some platforms and devices support multiple low-power states that can be
used for system-wide suspend. Today these states are selected on a
per-subsystem basis and in most cases the deepest possible state is
selected.

For some use-cases this is a problem, as entering these deeper states
during system-wide suspend is unsuitable or even breaks the system-wakeup
latency constraint.

Therefore, let's introduce an interface for user-space that allows it to
specify a system-wakeup QoS limit. Subsequent changes will start taking
the QoS limit into account.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 include/linux/pm_qos.h |   9 ++++
 kernel/power/qos.c     | 114 +++++++++++++++++++++++++++++++++++++++++
 2 files changed, 123 insertions(+)

diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
index 4a69d4af3ff8..5f84084f19c8 100644
--- a/include/linux/pm_qos.h
+++ b/include/linux/pm_qos.h
@@ -143,6 +143,15 @@ bool pm_qos_update_flags(struct pm_qos_flags *pqf,
 			 struct pm_qos_flags_request *req,
 			 enum pm_qos_req_action action, s32 val);
 
+#ifdef CONFIG_PM_SLEEP
+s32 system_wakeup_latency_qos_limit(void);
+#else
+static inline s32 system_wakeup_latency_qos_limit(void)
+{
+	return PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
+}
+#endif
+
 #ifdef CONFIG_CPU_IDLE
 s32 cpu_latency_qos_limit(void);
 bool cpu_latency_qos_request_active(struct pm_qos_request *req);
diff --git a/kernel/power/qos.c b/kernel/power/qos.c
index 4244b069442e..fb496c220ffe 100644
--- a/kernel/power/qos.c
+++ b/kernel/power/qos.c
@@ -209,6 +209,120 @@ bool pm_qos_update_flags(struct pm_qos_flags *pqf,
 	return prev_value != curr_value;
 }
 
+#ifdef CONFIG_PM_SLEEP
+static struct pm_qos_constraints system_wakeup_latency_constraints = {
+	.list = PLIST_HEAD_INIT(system_wakeup_latency_constraints.list),
+	.target_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
+	.default_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
+	.no_constraint_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
+	.type = PM_QOS_MIN,
+};
+
+/**
+ * system_wakeup_latency_qos_limit - Current system wakeup latency QoS limit.
+ *
+ * Returns the current system wakeup latency QoS limit that may have been
+ * requested by user-space.
+ */
+s32 system_wakeup_latency_qos_limit(void)
+{
+	return pm_qos_read_value(&system_wakeup_latency_constraints);
+}
+
+static int system_wakeup_latency_qos_open(struct inode *inode,
+					  struct file *filp)
+{
+	struct pm_qos_request *req;
+
+	req = kzalloc(sizeof(*req), GFP_KERNEL);
+	if (!req)
+		return -ENOMEM;
+
+	req->qos = &system_wakeup_latency_constraints;
+	pm_qos_update_target(req->qos, &req->node, PM_QOS_ADD_REQ,
+			     PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
+	filp->private_data = req;
+
+	return 0;
+}
+
+static int system_wakeup_latency_qos_release(struct inode *inode,
+					     struct file *filp)
+{
+	struct pm_qos_request *req = filp->private_data;
+
+	filp->private_data = NULL;
+	pm_qos_update_target(req->qos, &req->node, PM_QOS_REMOVE_REQ,
+			     PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
+	kfree(req);
+
+	return 0;
+}
+
+static ssize_t system_wakeup_latency_qos_read(struct file *filp,
+					      char __user *buf,
+					      size_t count,
+					      loff_t *f_pos)
+{
+	s32 value = pm_qos_read_value(&system_wakeup_latency_constraints);
+
+	return simple_read_from_buffer(buf, count, f_pos, &value, sizeof(s32));
+}
+
+static ssize_t system_wakeup_latency_qos_write(struct file *filp,
+					       const char __user *buf,
+					       size_t count, loff_t *f_pos)
+{
+	struct pm_qos_request *req = filp->private_data;
+	s32 value;
+
+	if (count == sizeof(s32)) {
+		if (copy_from_user(&value, buf, sizeof(s32)))
+			return -EFAULT;
+	} else {
+		int ret;
+
+		ret = kstrtos32_from_user(buf, count, 16, &value);
+		if (ret)
+			return ret;
+	}
+
+	if (value < 0)
+		return -EINVAL;
+
+	pm_qos_update_target(req->qos, &req->node, PM_QOS_UPDATE_REQ, value);
+
+	return count;
+}
+
+static const struct file_operations system_wakeup_latency_qos_fops = {
+	.open = system_wakeup_latency_qos_open,
+	.release = system_wakeup_latency_qos_release,
+	.read = system_wakeup_latency_qos_read,
+	.write = system_wakeup_latency_qos_write,
+	.llseek = noop_llseek,
+};
+
+static struct miscdevice system_wakeup_latency_qos_miscdev = {
+	.minor = MISC_DYNAMIC_MINOR,
+	.name = "system_wakeup_latency",
+	.fops = &system_wakeup_latency_qos_fops,
+};
+
+static int __init system_wakeup_latency_qos_init(void)
+{
+	int ret;
+
+	ret = misc_register(&system_wakeup_latency_qos_miscdev);
+	if (ret < 0)
+		pr_err("%s: %s setup failed\n", __func__,
+		       system_wakeup_latency_qos_miscdev.name);
+
+	return ret;
+}
+late_initcall(system_wakeup_latency_qos_init);
+#endif /* CONFIG_PM_SLEEP */
+
 #ifdef CONFIG_CPU_IDLE
 /* Definitions related to the CPU latency QoS. */
 
-- 
2.43.0



* [RFC/PATCH 2/3] pmdomain: Respect the system-wakeup QoS limit at system-wide suspend
  2025-07-16 12:33 [RFC/PATCH 0/3] PM: QoS: Introduce a system-wakeup QoS limit for s2idle and genpd Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit Ulf Hansson
@ 2025-07-16 12:33 ` Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 3/3] cpuidle: Respect the system-wakeup QoS limit for s2idle Ulf Hansson
  2 siblings, 0 replies; 9+ messages in thread
From: Ulf Hansson @ 2025-07-16 12:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm
  Cc: Kevin Hilman, Pavel Machek, Len Brown, Daniel Lezcano,
	Saravana Kannan, Maulik Shah, Prasad Sodagudi, Ulf Hansson,
	linux-kernel

A system-wakeup QoS limit may have been requested by user-space. To avoid
entering an overly deep state for PM domains that are managed by genpd,
let's start taking the QoS limit into account when selecting the
low-power state.

If it turns out that none of the states in the list of domain-idle-states
are suitable, let's leave the PM domain powered on.
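
For example, assuming hypothetical domain-idle-states whose power-off plus
power-on latencies sum to 500us, 2000us and 10000us: with a QoS limit of
3000us the governor picks the 2000us state, while with a limit of 100us no
state qualifies and the domain is left powered on.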

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/pmdomain/core.c     | 10 ++++++++--
 drivers/pmdomain/governor.c | 23 +++++++++++++++++++++++
 include/linux/pm_domain.h   |  1 +
 3 files changed, 32 insertions(+), 2 deletions(-)

diff --git a/drivers/pmdomain/core.c b/drivers/pmdomain/core.c
index a86aeda1c955..5cbe7473c2b9 100644
--- a/drivers/pmdomain/core.c
+++ b/drivers/pmdomain/core.c
@@ -1396,8 +1396,14 @@ static void genpd_sync_power_off(struct generic_pm_domain *genpd, bool use_lock,
 			return;
 	}
 
-	/* Choose the deepest state when suspending */
-	genpd->state_idx = genpd->state_count - 1;
+	if (genpd->gov && genpd->gov->system_power_down_ok) {
+		if (!genpd->gov->system_power_down_ok(&genpd->domain))
+			return;
+	} else {
+		/* Default to the deepest state. */
+		genpd->state_idx = genpd->state_count - 1;
+	}
+
 	if (_genpd_power_off(genpd, false)) {
 		genpd->states[genpd->state_idx].rejected++;
 		return;
diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
index 39359811a930..2630ba6f28ef 100644
--- a/drivers/pmdomain/governor.c
+++ b/drivers/pmdomain/governor.c
@@ -343,6 +343,27 @@ static bool default_power_down_ok(struct dev_pm_domain *pd)
 	return _default_power_down_ok(pd, ktime_get());
 }
 
+static bool default_system_power_down_ok(struct dev_pm_domain *pd)
+{
+	s64 constraint_ns = system_wakeup_latency_qos_limit() * NSEC_PER_USEC;
+	struct generic_pm_domain *genpd = pd_to_genpd(pd);
+	int state_idx = genpd->state_count - 1;
+
+	/* Find the deepest state for the latency constraint. */
+	while (state_idx >= 0) {
+		s64 latency_ns = genpd->states[state_idx].power_off_latency_ns +
+				 genpd->states[state_idx].power_on_latency_ns;
+
+		if (latency_ns <= constraint_ns) {
+			genpd->state_idx = state_idx;
+			return true;
+		}
+		state_idx--;
+	}
+
+	return false;
+}
+
 #ifdef CONFIG_CPU_IDLE
 static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 {
@@ -418,12 +439,14 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 struct dev_power_governor pm_domain_cpu_gov = {
 	.suspend_ok = default_suspend_ok,
 	.power_down_ok = cpu_power_down_ok,
+	.system_power_down_ok = default_system_power_down_ok,
 };
 #endif
 
 struct dev_power_governor simple_qos_governor = {
 	.suspend_ok = default_suspend_ok,
 	.power_down_ok = default_power_down_ok,
+	.system_power_down_ok = default_system_power_down_ok,
 };
 
 /*
diff --git a/include/linux/pm_domain.h b/include/linux/pm_domain.h
index 99556589f45e..0ad14cc4ad7f 100644
--- a/include/linux/pm_domain.h
+++ b/include/linux/pm_domain.h
@@ -140,6 +140,7 @@ enum genpd_sync_state {
 };
 
 struct dev_power_governor {
+	bool (*system_power_down_ok)(struct dev_pm_domain *domain);
 	bool (*power_down_ok)(struct dev_pm_domain *domain);
 	bool (*suspend_ok)(struct device *dev);
 };
-- 
2.43.0



* [RFC/PATCH 3/3] cpuidle: Respect the system-wakeup QoS limit for s2idle
  2025-07-16 12:33 [RFC/PATCH 0/3] PM: QoS: Introduce a system-wakeup QoS limit for s2idle and genpd Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit Ulf Hansson
  2025-07-16 12:33 ` [RFC/PATCH 2/3] pmdomain: Respect the system-wakeup QoS limit at system-wide suspend Ulf Hansson
@ 2025-07-16 12:33 ` Ulf Hansson
  2025-07-21 16:04   ` Rafael J. Wysocki
  2 siblings, 1 reply; 9+ messages in thread
From: Ulf Hansson @ 2025-07-16 12:33 UTC (permalink / raw)
  To: Rafael J. Wysocki, linux-pm
  Cc: Kevin Hilman, Pavel Machek, Len Brown, Daniel Lezcano,
	Saravana Kannan, Maulik Shah, Prasad Sodagudi, Ulf Hansson,
	linux-kernel

A system-wakeup QoS limit may have been requested by user-space. To avoid
entering an overly deep state during s2idle, let's start taking the QoS
limit into account when selecting a suitable low-power state.
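
For instance, with hypothetical idle-states whose exit latencies are 1us,
500us and 5000us and a QoS limit of 1000us, find_deepest_state() now
returns the 500us state instead of the 5000us one.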

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---
 drivers/cpuidle/cpuidle.c | 9 +++++----
 1 file changed, 5 insertions(+), 4 deletions(-)

diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
index 0835da449db8..5f6dacb5b134 100644
--- a/drivers/cpuidle/cpuidle.c
+++ b/drivers/cpuidle/cpuidle.c
@@ -190,14 +190,15 @@ static noinstr void enter_s2idle_proper(struct cpuidle_driver *drv,
  */
 int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev)
 {
+	u64 constraint_ns = system_wakeup_latency_qos_limit() * NSEC_PER_USEC;
 	int index;
 
 	/*
-	 * Find the deepest state with ->enter_s2idle present, which guarantees
-	 * that interrupts won't be enabled when it exits and allows the tick to
-	 * be frozen safely.
+	 * Find the deepest state with ->enter_s2idle present that meets the
+	 * system-wakeup QoS limit, which guarantees that interrupts won't be
+	 * enabled when it exits and allows the tick to be frozen safely.
 	 */
-	index = find_deepest_state(drv, dev, U64_MAX, 0, true);
+	index = find_deepest_state(drv, dev, constraint_ns, 0, true);
 	if (index > 0) {
 		enter_s2idle_proper(drv, dev, index);
 		local_irq_enable();
-- 
2.43.0



* Re: [RFC/PATCH 3/3] cpuidle: Respect the system-wakeup QoS limit for s2idle
  2025-07-16 12:33 ` [RFC/PATCH 3/3] cpuidle: Respect the system-wakeup QoS limit for s2idle Ulf Hansson
@ 2025-07-21 16:04   ` Rafael J. Wysocki
  0 siblings, 0 replies; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-07-21 16:04 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, linux-pm, Kevin Hilman, Pavel Machek,
	Len Brown, Daniel Lezcano, Saravana Kannan, Maulik Shah,
	Prasad Sodagudi, linux-kernel

On Wed, Jul 16, 2025 at 2:33 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> A system-wakeup QoS limit may have been requested by user-space. To avoid
> entering an overly deep state during s2idle, let's start taking the QoS
> limit into account when selecting a suitable low-power state.
>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  drivers/cpuidle/cpuidle.c | 9 +++++----
>  1 file changed, 5 insertions(+), 4 deletions(-)
>
> diff --git a/drivers/cpuidle/cpuidle.c b/drivers/cpuidle/cpuidle.c
> index 0835da449db8..5f6dacb5b134 100644
> --- a/drivers/cpuidle/cpuidle.c
> +++ b/drivers/cpuidle/cpuidle.c
> @@ -190,14 +190,15 @@ static noinstr void enter_s2idle_proper(struct cpuidle_driver *drv,
>   */
>  int cpuidle_enter_s2idle(struct cpuidle_driver *drv, struct cpuidle_device *dev)
>  {
> +       u64 constraint_ns = system_wakeup_latency_qos_limit() * NSEC_PER_USEC;
>         int index;
>
>         /*
> -        * Find the deepest state with ->enter_s2idle present, which guarantees
> -        * that interrupts won't be enabled when it exits and allows the tick to
> -        * be frozen safely.
> +        * Find the deepest state with ->enter_s2idle present that meets the
> +        * system-wakeup QoS limit, which guarantees that interrupts won't be
> +        * enabled when it exits and allows the tick to be frozen safely.
>          */
> -       index = find_deepest_state(drv, dev, U64_MAX, 0, true);
> +       index = find_deepest_state(drv, dev, constraint_ns, 0, true);
>         if (index > 0) {
>                 enter_s2idle_proper(drv, dev, index);
>                 local_irq_enable();

This is not the only place that needs to be patched this way.
cpuidle_idle_call() is another one AFAICS.


* Re: [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit
  2025-07-16 12:33 ` [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit Ulf Hansson
@ 2025-07-21 16:39   ` Rafael J. Wysocki
  2025-08-11 17:15     ` Kevin Hilman
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-07-21 16:39 UTC (permalink / raw)
  To: Ulf Hansson
  Cc: Rafael J. Wysocki, linux-pm, Kevin Hilman, Pavel Machek,
	Len Brown, Daniel Lezcano, Saravana Kannan, Maulik Shah,
	Prasad Sodagudi, linux-kernel

On Wed, Jul 16, 2025 at 2:33 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>
> Some platforms and devices support multiple low-power states that can be
> used for system-wide suspend. Today these states are selected on a
> per-subsystem basis and in most cases the deepest possible state is
> selected.
>
> For some use-cases this is a problem, as entering these deeper states
> during system-wide suspend is unsuitable or even breaks the system-wakeup
> latency constraint.
>
> Therefore, let's introduce an interface for user-space that allows it to
> specify a system-wakeup QoS limit. Subsequent changes will start taking
> the QoS limit into account.

Well, this is not really a system-wakeup limit, but a CPU idle state
latency limit for states entered in the last step of suspend-to-idle.

It looks like the problem is that the existing CPU latency QoS is not
taken into account by suspend-to-idle, so instead of adding an
entirely new interface to overcome this, would it make sense to add an
ioctl() to the existing one that would allow the user of it to
indicate that the given request should also be respected by
suspend-to-idle?

There are two basic reasons why I think so:
(1) The requests that you want to be respected by suspend-to-idle
should also be respected by the regular "runtime" idle, or at least I
don't see a reason why it wouldn't be the case.
(2) The new interface introduced by this patch basically duplicates
the existing one.

The flow that I kind of envision related to this would be as follows:
(1) User space opens /dev/cpu_dma_latency and a single CPU latency QoS
request is created via cpu_latency_qos_add_request().
(2) User space calls a new ioctl() on the open file descriptor to
indicate that the request should also apply to suspend-to-idle.  A new
request is created with the same value and added to a new list of
constraints.  That new list of constraints will be used by
suspend-to-idle.
(3) Writing to the open file descriptor causes both requests to be updated.
(4) If user space does not want the request to apply to
suspend-to-idle any more, it can use another new ioctl() to achieve
that.  It would cause the second (suspend-to-idle) copy of the request
to be dropped.
(5) Closing the file descriptor causes both copies of the request to be dropped.
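
To make that concrete, a user-space sketch of the flow could look like the
following (the two ioctls are hypothetical, invented purely for this
illustration; nothing like them exists today):

#include <fcntl.h>
#include <stdint.h>
#include <linux/ioctl.h>
#include <sys/ioctl.h>
#include <unistd.h>

/* Hypothetical ioctls, made up for this sketch only. */
#define CPU_LATENCY_QOS_APPLY_S2IDLE	_IO('q', 1)
#define CPU_LATENCY_QOS_RUNTIME_ONLY	_IO('q', 2)

int main(void)
{
	int32_t limit_us = 1000;
	int fd = open("/dev/cpu_dma_latency", O_WRONLY);	/* (1) adds a request */

	if (fd < 0)
		return 1;

	ioctl(fd, CPU_LATENCY_QOS_APPLY_S2IDLE);	/* (2) mirror it for s2idle */
	write(fd, &limit_us, sizeof(limit_us));		/* (3) updates both copies */
	ioctl(fd, CPU_LATENCY_QOS_RUNTIME_ONLY);	/* (4) drop the s2idle copy */
	close(fd);					/* (5) drops what remains */
	return 0;
}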

> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>  include/linux/pm_qos.h |   9 ++++
>  kernel/power/qos.c     | 114 +++++++++++++++++++++++++++++++++++++++++
>  2 files changed, 123 insertions(+)
>
> diff --git a/include/linux/pm_qos.h b/include/linux/pm_qos.h
> index 4a69d4af3ff8..5f84084f19c8 100644
> --- a/include/linux/pm_qos.h
> +++ b/include/linux/pm_qos.h
> @@ -143,6 +143,15 @@ bool pm_qos_update_flags(struct pm_qos_flags *pqf,
>                          struct pm_qos_flags_request *req,
>                          enum pm_qos_req_action action, s32 val);
>
> +#ifdef CONFIG_PM_SLEEP
> +s32 system_wakeup_latency_qos_limit(void);
> +#else
> +static inline s32 system_wakeup_latency_qos_limit(void)
> +{
> +       return PM_QOS_RESUME_LATENCY_NO_CONSTRAINT;
> +}
> +#endif
> +
>  #ifdef CONFIG_CPU_IDLE
>  s32 cpu_latency_qos_limit(void);
>  bool cpu_latency_qos_request_active(struct pm_qos_request *req);
> diff --git a/kernel/power/qos.c b/kernel/power/qos.c
> index 4244b069442e..fb496c220ffe 100644
> --- a/kernel/power/qos.c
> +++ b/kernel/power/qos.c
> @@ -209,6 +209,120 @@ bool pm_qos_update_flags(struct pm_qos_flags *pqf,
>         return prev_value != curr_value;
>  }
>
> +#ifdef CONFIG_PM_SLEEP
> +static struct pm_qos_constraints system_wakeup_latency_constraints = {
> +       .list = PLIST_HEAD_INIT(system_wakeup_latency_constraints.list),
> +       .target_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
> +       .default_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
> +       .no_constraint_value = PM_QOS_RESUME_LATENCY_NO_CONSTRAINT,
> +       .type = PM_QOS_MIN,
> +};
> +
> +/**
> + * system_wakeup_latency_qos_limit - Current system wakeup latency QoS limit.
> + *
> + * Returns the current system wakeup latency QoS limit that may have been
> + * requested by user-space.
> + */
> +s32 system_wakeup_latency_qos_limit(void)
> +{
> +       return pm_qos_read_value(&system_wakeup_latency_constraints);
> +}
> +
> +static int system_wakeup_latency_qos_open(struct inode *inode,
> +                                         struct file *filp)
> +{
> +       struct pm_qos_request *req;
> +
> +       req = kzalloc(sizeof(*req), GFP_KERNEL);
> +       if (!req)
> +               return -ENOMEM;
> +
> +       req->qos = &system_wakeup_latency_constraints;
> +       pm_qos_update_target(req->qos, &req->node, PM_QOS_ADD_REQ,
> +                            PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
> +       filp->private_data = req;
> +
> +       return 0;
> +}
> +
> +static int system_wakeup_latency_qos_release(struct inode *inode,
> +                                            struct file *filp)
> +{
> +       struct pm_qos_request *req = filp->private_data;
> +
> +       filp->private_data = NULL;
> +       pm_qos_update_target(req->qos, &req->node, PM_QOS_REMOVE_REQ,
> +                            PM_QOS_RESUME_LATENCY_NO_CONSTRAINT);
> +       kfree(req);
> +
> +       return 0;
> +}
> +
> +static ssize_t system_wakeup_latency_qos_read(struct file *filp,
> +                                             char __user *buf,
> +                                             size_t count,
> +                                             loff_t *f_pos)
> +{
> +       s32 value = pm_qos_read_value(&system_wakeup_latency_constraints);
> +
> +       return simple_read_from_buffer(buf, count, f_pos, &value, sizeof(s32));
> +}
> +
> +static ssize_t system_wakeup_latency_qos_write(struct file *filp,
> +                                              const char __user *buf,
> +                                              size_t count, loff_t *f_pos)
> +{
> +       struct pm_qos_request *req = filp->private_data;
> +       s32 value;
> +
> +       if (count == sizeof(s32)) {
> +               if (copy_from_user(&value, buf, sizeof(s32)))
> +                       return -EFAULT;
> +       } else {
> +               int ret;
> +
> +               ret = kstrtos32_from_user(buf, count, 16, &value);
> +               if (ret)
> +                       return ret;
> +       }
> +
> +       if (value < 0)
> +               return -EINVAL;
> +
> +       pm_qos_update_target(req->qos, &req->node, PM_QOS_UPDATE_REQ, value);
> +
> +       return count;
> +}
> +
> +static const struct file_operations system_wakeup_latency_qos_fops = {
> +       .open = system_wakeup_latency_qos_open,
> +       .release = system_wakeup_latency_qos_release,
> +       .read = system_wakeup_latency_qos_read,
> +       .write = system_wakeup_latency_qos_write,
> +       .llseek = noop_llseek,
> +};
> +
> +static struct miscdevice system_wakeup_latency_qos_miscdev = {
> +       .minor = MISC_DYNAMIC_MINOR,
> +       .name = "system_wakeup_latency",
> +       .fops = &system_wakeup_latency_qos_fops,
> +};
> +
> +static int __init system_wakeup_latency_qos_init(void)
> +{
> +       int ret;
> +
> +       ret = misc_register(&system_wakeup_latency_qos_miscdev);
> +       if (ret < 0)
> +               pr_err("%s: %s setup failed\n", __func__,
> +                      system_wakeup_latency_qos_miscdev.name);
> +
> +       return ret;
> +}
> +late_initcall(system_wakeup_latency_qos_init);
> +#endif /* CONFIG_PM_SLEEP */
> +
>  #ifdef CONFIG_CPU_IDLE
>  /* Definitions related to the CPU latency QoS. */
>
> --


* Re: [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit
  2025-07-21 16:39   ` Rafael J. Wysocki
@ 2025-08-11 17:15     ` Kevin Hilman
  2025-08-11 19:16       ` Rafael J. Wysocki
  0 siblings, 1 reply; 9+ messages in thread
From: Kevin Hilman @ 2025-08-11 17:15 UTC (permalink / raw)
  To: Rafael J. Wysocki, Ulf Hansson
  Cc: Rafael J. Wysocki, linux-pm, Pavel Machek, Len Brown,
	Daniel Lezcano, Saravana Kannan, Maulik Shah, Prasad Sodagudi,
	linux-kernel

"Rafael J. Wysocki" <rafael@kernel.org> writes:

> On Wed, Jul 16, 2025 at 2:33 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
>>
>> Some platforms and devices support multiple low-power states that can be
>> used for system-wide suspend. Today these states are selected on a
>> per-subsystem basis and in most cases the deepest possible state is
>> selected.
>>
>> For some use-cases this is a problem, as entering these deeper states
>> during system-wide suspend is unsuitable or even breaks the system-wakeup
>> latency constraint.
>>
>> Therefore, let's introduce an interface for user-space that allows it to
>> specify a system-wakeup QoS limit. Subsequent changes will start taking
>> the QoS limit into account.
>
> Well, this is not really a system-wakeup limit, but a CPU idle state
> latency limit for states entered in the last step of suspend-to-idle.
>
> It looks like the problem is that the existing CPU latency QoS is not
> taken into account by suspend-to-idle, so instead of adding an
> entirely new interface to overcome this, would it make sense to add an
> ioctl() to the existing one that would allow the user of it to
> indicate that the given request should also be respected by
> suspend-to-idle?
>
> There are two basic reasons why I think so:
> (1) The requests that you want to be respected by suspend-to-idle
> should also be respected by the regular "runtime" idle, or at least I
> don't see a reason why it wouldn't be the case.
> (2) The new interface introduced by this patch basically duplicates
> the existing one.

I also think that just using the existing /dev/cpu_dma_latency is the
right approach here, and simply teaching s2idle to respect this value.

I'm curious about the need for a new ioctl() though.  Under what
conditions do you want normal/runtime CPUidle to respect this value and
s2idle to not respect this value?

Kevin


* Re: [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit
  2025-08-11 17:15     ` Kevin Hilman
@ 2025-08-11 19:16       ` Rafael J. Wysocki
  2025-08-12  9:26         ` Ulf Hansson
  0 siblings, 1 reply; 9+ messages in thread
From: Rafael J. Wysocki @ 2025-08-11 19:16 UTC (permalink / raw)
  To: Kevin Hilman
  Cc: Rafael J. Wysocki, Ulf Hansson, linux-pm, Pavel Machek, Len Brown,
	Daniel Lezcano, Saravana Kannan, Maulik Shah, Prasad Sodagudi,
	linux-kernel

On Mon, Aug 11, 2025 at 7:16 PM Kevin Hilman <khilman@baylibre.com> wrote:
>
> "Rafael J. Wysocki" <rafael@kernel.org> writes:
>
> > On Wed, Jul 16, 2025 at 2:33 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> >>
> >> Some platforms and devices support multiple low-power states that can be
> >> used for system-wide suspend. Today these states are selected on a
> >> per-subsystem basis and in most cases the deepest possible state is
> >> selected.
> >>
> >> For some use-cases this is a problem, as entering these deeper states
> >> during system-wide suspend is unsuitable or even breaks the system-wakeup
> >> latency constraint.
> >>
> >> Therefore, let's introduce an interface for user-space that allows it to
> >> specify a system-wakeup QoS limit. Subsequent changes will start taking
> >> the QoS limit into account.
> >
> > Well, this is not really a system-wakeup limit, but a CPU idle state
> > latency limit for states entered in the last step of suspend-to-idle.
> >
> > It looks like the problem is that the existing CPU latency QoS is not
> > taken into account by suspend-to-idle, so instead of adding an
> > entirely new interface to overcome this, would it make sense to add an
> > ioctl() to the existing one that would allow the user of it to
> > indicate that the given request should also be respected by
> > suspend-to-idle?
> >
> > There are two basic reasons why I think so:
> > (1) The requests that you want to be respected by suspend-to-idle
> > should also be respected by the regular "runtime" idle, or at least I
> > don't see a reason why it wouldn't be the case.
> > (2) The new interface introduced by this patch basically duplicates
> > the existing one.
>
> I also think that just using the existing /dev/cpu_dma_latency is the
> right approach here, and simply teaching s2idle to respect this value.
>
> I'm curious about the need for a new ioctl() though.  Under what
> conditions do you want normal/runtime CPUidle to respect this value and
> s2idle to not respect this value?

In a typical PC environment s2idle is a replacement for ACPI S3, which
does not take any QoS constraints into account, so users may want to set
QoS limits for runtime and then suspend with the expectation that those
limits will not affect suspend.


* Re: [RFC/PATCH 1/3] PM: QoS: Introduce a system-wakeup QoS limit
  2025-08-11 19:16       ` Rafael J. Wysocki
@ 2025-08-12  9:26         ` Ulf Hansson
  0 siblings, 0 replies; 9+ messages in thread
From: Ulf Hansson @ 2025-08-12  9:26 UTC (permalink / raw)
  To: Rafael J. Wysocki, Kevin Hilman
  Cc: linux-pm, Pavel Machek, Len Brown, Daniel Lezcano,
	Saravana Kannan, Maulik Shah, Prasad Sodagudi, linux-kernel

On Mon, 11 Aug 2025 at 21:16, Rafael J. Wysocki <rafael@kernel.org> wrote:
>
> On Mon, Aug 11, 2025 at 7:16 PM Kevin Hilman <khilman@baylibre.com> wrote:
> >
> > "Rafael J. Wysocki" <rafael@kernel.org> writes:
> >
> > > On Wed, Jul 16, 2025 at 2:33 PM Ulf Hansson <ulf.hansson@linaro.org> wrote:
> > >>
> > >> Some platforms and devices support multiple low-power states that can be
> > >> used for system-wide suspend. Today these states are selected on a
> > >> per-subsystem basis and in most cases the deepest possible state is
> > >> selected.
> > >>
> > >> For some use-cases this is a problem, as entering these deeper states
> > >> during system-wide suspend is unsuitable or even breaks the system-wakeup
> > >> latency constraint.
> > >>
> > >> Therefore, let's introduce an interface for user-space that allows it to
> > >> specify a system-wakeup QoS limit. Subsequent changes will start taking
> > >> the QoS limit into account.
> > >
> > > Well, this is not really a system-wakeup limit, but a CPU idle state
> > > latency limit for states entered in the last step of suspend-to-idle.
> > >
> > > It looks like the problem is that the existing CPU latency QoS is not
> > > taken into account by suspend-to-idle, so instead of adding an
> > > entirely new interface to overcome this, would it make sense to add an
> > > ioctl() to the existing one that would allow the user of it to
> > > indicate that the given request should also be respected by
> > > suspend-to-idle?
> > >
> > > There are two basic reasons why I think so:
> > > (1) The requests that you want to be respected by suspend-to-idle
> > > should also be respected by the regular "runtime" idle, or at least I
> > > don't see a reason why it wouldn't be the case.
> > > (2) The new interface introduced by this patch basically duplicates
> > > the existing one.
> >
> > I also think that just using the existing /dev/cpu_dma_latency is the
> > right approach here, and simply teaching s2idle to respect this value.
> >
> > I'm curious about the need for a new ioctl() though.  Under what
> > conditions do you want normal/runtime CPUidle to respect this value and
> > s2idle to not respect this value?
>
> In a typical PC environment s2idle is a replacement for ACPI S3 which
> does not take any QoS constraints into account, so users may want to
> set QoS limits for run-time and then suspend with the expectation that
> QoS will not affect it.

Yes, I agree. To me, these are orthogonal use-cases which could have
different wakeup latency constraints.

Adding an ioctl for /dev/cpu_dma_latency, as suggested by Rafael, would
allow this to be managed, I think.

That said, I am not yet fully convinced that re-using
/dev/cpu_dma_latency is the right path. The main reason is that I don't
want us to limit the use-case to CPU latencies, but rather allow the QoS
constraint to be system-wide, for any type of device. For example, it
could be used by storage drivers too (like NVMe, UFS, eMMC), as a way to
understand which low-power state to pick at system-wide suspend. If you
have a closer look at patch 2 [1], I suggest we extend the genpd governor
for *both* CPU-cluster PM domains and other PM domains too.

Interested to hear your thoughts around this.

Kind regards
Uffe

[1] https://lore.kernel.org/all/20250716123323.65441-3-ulf.hansson@linaro.org/

