linux-pm.vger.kernel.org archive mirror
* [PATCH v3 0/2] pmdomain: Improve idle state selection for CPUs
@ 2025-10-31 11:00 Ulf Hansson
  2025-10-31 11:00 ` [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
  2025-10-31 11:00 ` [PATCH v3 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  0 siblings, 2 replies; 4+ messages in thread
From: Ulf Hansson @ 2025-10-31 11:00 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, Ben Horgan, linux-pm,
	linux-arm-kernel, linux-kernel, Ulf Hansson

Platforms using the genpd governor for CPUs rely on it to find the most
suitable idle state for a group of CPUs. However, observations tell us that
there are some significant improvements that can be made in this area.

These improvements are based upon taking pending IPIs into account for the
group of CPUs that the genpd governor controls. If there is a pending IPI
for any of these CPUs, we should not request an idle state that affects the
whole group, but rather pick a shallower state that affects only the CPU.
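
As a rough illustration of the idea (this is only a sketch; the helper
cpus_peek_for_pending_ipi() is introduced in patch 1/2, while the function
name domain_state_ok() is made up for this example):

/*
 * Sketch: before committing to a domain-wide (deeper) idle state, peek for
 * pending IPIs on the CPUs in the domain. If one is seen, reject the domain
 * state so each CPU falls back to its shallower per-CPU idle state.
 */
static bool domain_state_ok(const struct cpumask *domain_cpus)
{
	if (cpus_peek_for_pending_ipi(domain_cpus))
		return false;

	return true;
}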

More details are available in the commit messages for each patch.

Kind regards
Ulf Hansson

Ulf Hansson (2):
  smp: Introduce a helper function to check for pending IPIs
  pmdomain: Extend the genpd governor for CPUs to account for IPIs

 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 include/linux/smp.h         |  5 +++++
 kernel/smp.c                | 24 ++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 7 deletions(-)

-- 
2.43.0


^ permalink raw reply	[flat|nested] 4+ messages in thread

* [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-31 11:00 [PATCH v3 0/2] pmdomain: Improve idle state selection for CPUs Ulf Hansson
@ 2025-10-31 11:00 ` Ulf Hansson
  2025-11-01 19:59   ` Thomas Gleixner
  2025-10-31 11:00 ` [PATCH v3 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  1 sibling, 1 reply; 4+ messages in thread
From: Ulf Hansson @ 2025-10-31 11:00 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, Ben Horgan, linux-pm,
	linux-arm-kernel, linux-kernel, Ulf Hansson

When the governors used during cpuidle try to find the most suitable idle
state for a CPU or a group of CPUs, they are known to fail quite often. One
reason for this is that they do not take into account whether an IPI has
been scheduled for any of the CPUs that are affected by the selected idle
state.

To enable pending IPIs to be taken into account for cpuidle decisions,
introduce a new helper function, cpus_peek_for_pending_ipi().
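
For reference, here is a minimal, hypothetical usage sketch (the real caller
is added by the next patch in the genpd CPU governor; cpu_may_have_ipi() is
made up for this example):

/*
 * Hypothetical example: peek for a pending IPI on a single CPU. The snapshot
 * is racy by design, so the caller must tolerate that the result may already
 * be stale by the time it is acted upon.
 */
static bool cpu_may_have_ipi(unsigned int cpu)
{
	return cpus_peek_for_pending_ipi(cpumask_of(cpu));
}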

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v3:
	- Renamed the helper function and updated its description.
	- Updated the commit message.

Changes in v2:
	- Implemented a common function, rather than making it arch-specific. As
	suggested by Thomas and Marc.
	- Renamed the function to indicate that it doesn't provide correctness.
	- Clarified function description and commit message.

---
 include/linux/smp.h |  5 +++++
 kernel/smp.c        | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 18e9c918325e..91d0ecf3b8d3 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -168,6 +168,7 @@ int smp_call_function_any(const struct cpumask *mask,
 
 void kick_all_cpus_sync(void);
 void wake_up_all_idle_cpus(void);
+bool cpus_peek_for_pending_ipi(const struct cpumask *mask);
 
 /*
  * Generic and arch helpers
@@ -216,6 +217,10 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
 
 static inline void kick_all_cpus_sync(void) {  }
 static inline void wake_up_all_idle_cpus(void) {  }
+static inline bool cpus_peek_for_pending_ipi(const struct cpumask *mask)
+{
+	return false;
+}
 
 #define setup_max_cpus 0
 
diff --git a/kernel/smp.c b/kernel/smp.c
index 02f52291fae4..4533e666152c 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1087,6 +1087,30 @@ void wake_up_all_idle_cpus(void)
 }
 EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
 
+/**
+ * cpus_peek_for_pending_ipi - Check for pending IPI for CPUs
+ * @mask: The CPU mask for the CPUs to check.
+ *
+ * This function walks through the @mask to check if there are any pending IPIs
+ * scheduled, for any of the CPUs in the @mask.
+ *
+ * It's important for the caller to know that this function does not guarantee
+ * correctness. It provides a snapshot, while being as lightweight as possible.
+ *
+ * Returns true if there is a pending IPI scheduled and false otherwise.
+ */
+bool cpus_peek_for_pending_ipi(const struct cpumask *mask)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, mask) {
+		if (!llist_empty(per_cpu_ptr(&call_single_queue, cpu)))
+			return true;
+	}
+
+        return false;
+}
+
 /**
  * struct smp_call_on_cpu_struct - Call a function on a specific CPU
  * @work: &work_struct
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* [PATCH v3 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs
  2025-10-31 11:00 [PATCH v3 0/2] pmdomain: Improve idle state selection for CPUs Ulf Hansson
  2025-10-31 11:00 ` [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-10-31 11:00 ` Ulf Hansson
  1 sibling, 0 replies; 4+ messages in thread
From: Ulf Hansson @ 2025-10-31 11:00 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, Ben Horgan, linux-pm,
	linux-arm-kernel, linux-kernel, Ulf Hansson

When the genpd governor for CPUs tries to select the most suitable idle
state for a group of CPUs managed in a PM domain, it fails far too often.

On a Dragonboard 410c, an arm64 based platform with 4 CPUs in one cluster
using PSCI OS-initiated mode, we can observe that entering the selected idle
state often fails. This is certainly suboptimal behaviour that leads to many
unnecessary requests being sent to the PSCI FW.

A simple dd operation that reads from the eMMC, generating some IRQs and I/O
handling, helps us to understand the problem while we monitor the rejected
counters in debugfs for the corresponding idle states of the genpd in
question.

 Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             1451           437        91         149        0
S1             65194          558        149        172        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.562698 seconds, 140.3MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2694           1073       265        892        1
S1             74567          829        561        790        0

 The dd completed in ~3.6 seconds and the rejects increased by 586.

 Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             4976           2096       392        1721       2
S1             160661         1893       1309       1904       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.543225 seconds, 141.1MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5192           2194       433        1830       2
S1             167677         2891       3184       4729       0

 The dd completed in ~3.6 seconds and the rejects increased by 1916.

The main reason for the above problem is a pending IPI for one of the CPUs
that is affected by the idle state that the genpd governor selected, which
makes the PSCI FW refuse to enter it. To improve the behaviour, let's start
taking pending IPIs into account in the genpd governor and fall back to the
shallower per-CPU idle state when one is pending.

 Re-testing with this change shows significantly improved behaviour.

 - Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2556           878        19         368        1
S1             69974          596        10         152        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.522010 seconds, 142.0MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             3360           1320       28         819        1
S1             70168          710        11         267        0

 The dd completed in ~3.5 seconds and the rejects increased by 10.

 - Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5145           1861       39         938        1
S1             188887         3117       51         1975       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.653100 seconds, 136.9MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5260           1923       42         1002       1
S1             190849         4033       52         2892       0

 The dd completed in ~3.7 seconds and the rejects increased by 4.

Note that the rejected counters in genpd are also accumulated in the
rejected counters managed by cpuidle, although on a per-CPU idle state
basis. Comparing these counters before/after this change through cpuidle's
sysfs interface shows similar improvements.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v3:
	- Use the new name of the helper function.
	- Minor updates to the commit message.

Changes in v2:
	- Use the new name of the helper function.
	- Re-test and update the statistics in the commit message.

---
 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
index 39359811a930..a46470f2261a 100644
--- a/drivers/pmdomain/governor.c
+++ b/drivers/pmdomain/governor.c
@@ -404,15 +404,21 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 		if ((idle_duration_ns >= (genpd->states[i].residency_ns +
 		    genpd->states[i].power_off_latency_ns)) &&
 		    (global_constraint >= (genpd->states[i].power_on_latency_ns +
-		    genpd->states[i].power_off_latency_ns))) {
-			genpd->state_idx = i;
-			genpd->gd->last_enter = now;
-			genpd->gd->reflect_residency = true;
-			return true;
-		}
+		    genpd->states[i].power_off_latency_ns)))
+			break;
+
 	} while (--i >= 0);
 
-	return false;
+	if (i < 0)
+		return false;
+
+	if (cpus_peek_for_pending_ipi(genpd->cpus))
+		return false;
+
+	genpd->state_idx = i;
+	genpd->gd->last_enter = now;
+	genpd->gd->reflect_residency = true;
+	return true;
 }
 
 struct dev_power_governor pm_domain_cpu_gov = {
-- 
2.43.0


^ permalink raw reply related	[flat|nested] 4+ messages in thread

* Re: [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-31 11:00 ` [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-11-01 19:59   ` Thomas Gleixner
  0 siblings, 0 replies; 4+ messages in thread
From: Thomas Gleixner @ 2025-11-01 19:59 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, Ben Horgan, linux-pm,
	linux-arm-kernel, linux-kernel, Ulf Hansson

On Fri, Oct 31 2025 at 12:00, Ulf Hansson wrote:
> +/**
> + * cpus_peek_for_pending_ipi - Check for pending IPI for CPUs
> + * @mask: The CPU mask for the CPUs to check.
> + *
> + * This function walks through the @mask to check if there are any pending IPIs
> + * scheduled, for any of the CPUs in the @mask.
> + *
> + * It's important for the caller to know that this function does not guarantee
> + * correctness. It provides a snapshot, while being as lightweight as possible.

This reads clumsy. Just say:

  * It does not guarantee correctness as it only can provide a racy snapshot.

or something like that.

> + * Returns true if there is a pending IPI scheduled and false otherwise.
> + */
> +bool cpus_peek_for_pending_ipi(const struct cpumask *mask)
> +{
> +	unsigned int cpu;
> +
> +	for_each_cpu(cpu, mask) {
> +		if (!llist_empty(per_cpu_ptr(&call_single_queue, cpu)))
> +			return true;
> +	}
> +
> +        return false;
   ^^^^^^^^
White space damage. Spaces instead of TAB.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 4+ messages in thread

end of thread, other threads:[~2025-11-01 19:59 UTC | newest]

Thread overview: 4+ messages (download: mbox.gz / follow: Atom feed)
2025-10-31 11:00 [PATCH v3 0/2] pmdomain: Improve idle state selection for CPUs Ulf Hansson
2025-10-31 11:00 ` [PATCH v3 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
2025-11-01 19:59   ` Thomas Gleixner
2025-10-31 11:00 ` [PATCH v3 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
