linux-arm-kernel.lists.infradead.org archive mirror
 help / color / mirror / Atom feed
* [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs
@ 2025-10-20 14:17 Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  0 siblings, 2 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

Platforms using the genpd governor for CPUs are relying on it to find the most
optimal idlestate for a group of CPUs. Although, observations tells us that
there are some significant improvement that can be made around this.

These improvement are based upon allowing us to take pending IPIs into account
for the group of CPUs that the genpd governor is in control of. If there is
pending IPI for any of these CPUs, we should not request an idlestate that
affects the group, but rather pick a shallower state that affects only the CPU.

More details are available in the commit messages for each patch.

Kind regards
Ulf Hansson


Ulf Hansson (2):
  smp: Introduce a helper function to check for pending IPIs
  pmdomain: Extend the genpd governor for CPUs to account for IPIs

 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 include/linux/smp.h         |  5 +++++
 kernel/smp.c                | 24 ++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 7 deletions(-)

-- 
2.43.0



^ permalink raw reply	[flat|nested] 7+ messages in thread

* [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs Ulf Hansson
@ 2025-10-20 14:17 ` Ulf Hansson
  2025-10-20 19:10   ` Ben Horgan
  2025-10-27 17:20   ` Thomas Gleixner
  2025-10-20 14:17 ` [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  1 sibling, 2 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

When governors used during cpuidle, tries to find the most optimal
idlestate for a CPU or a group of CPUs, they are known to quite often fail.
One reason for this, is that we are not taking into account whether there
has been an IPI scheduled for any of the CPUs that are affected by the
selected idlestate.

To enable pending IPIs to be taken into account for cpuidle decisions,
let's introduce a new helper function, cpus_may_have_pending_ipi().

Note that, the implementation is intentionally as lightweight as possible,
in favor of always providing the correct information. For cpuidle decisions
this is good enough.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
	- Implemented a common function, rather than making it arch-specific. As
	suggested by Thomas and Marc.
	- Renamed the function to indicate that it doesn't provide correctness.
	- Clarified function description and commit message.

---
 include/linux/smp.h |  5 +++++
 kernel/smp.c        | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 18e9c918325e..093e5458493e 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -168,6 +168,7 @@ int smp_call_function_any(const struct cpumask *mask,
 
 void kick_all_cpus_sync(void);
 void wake_up_all_idle_cpus(void);
+bool cpus_may_have_pending_ipi(const struct cpumask *mask);
 
 /*
  * Generic and arch helpers
@@ -216,6 +217,10 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
 
 static inline void kick_all_cpus_sync(void) {  }
 static inline void wake_up_all_idle_cpus(void) {  }
+static inline bool cpus_may_have_pending_ipi(const struct cpumask *mask)
+{
+	return false;
+}
 
 #define setup_max_cpus 0
 
diff --git a/kernel/smp.c b/kernel/smp.c
index 02f52291fae4..775f90790935 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1087,6 +1087,30 @@ void wake_up_all_idle_cpus(void)
 }
 EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
 
+/**
+ * cpus_may_have_pending_ipi - Check for pending IPIs for CPUs
+ * @mask: The CPU mask for the CPUs to check.
+ *
+ * This function walks through the @mask to check if there are any pending IPIs
+ * scheduled, for any of the CPUs in the @mask.
+ *
+ * It's important for the caller to know that this function does not guarantee
+ * correctness, as the intent is to be as lightweight as possible.
+ *
+ * Returns true if there is a pending IPI scheduled and false otherwise.
+ */
+bool cpus_may_have_pending_ipi(const struct cpumask *mask)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, mask) {
+		if (!llist_empty(per_cpu_ptr(&call_single_queue, cpu)))
+			return true;
+	}
+
+        return false;
+}
+
 /**
  * struct smp_call_on_cpu_struct - Call a function on a specific CPU
  * @work: &work_struct
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs
  2025-10-20 14:17 [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-10-20 14:17 ` Ulf Hansson
  1 sibling, 0 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

When the genpd governor for CPUs, tries to select the most optimal
idlestate for a group of CPUs managed in a PM domain, it fails far too
often.

On a Dragonboard 410c, which is an arm64 based platform with 4 CPUs
in one cluster that is using PSCI OS-initiated mode, we can observe that we
often fail when trying to enter the selected idlestate. This is certainly a
suboptimal behaviour that leads to many unnecessary requests being sent to
the PSCI FW.

A simple dd operation that reads from the eMMC, to generate some IRQs and
I/O handling helps us to understand the problem, while also monitoring the
rejected counters in debugfs for the corresponding idlestates of the genpd
in question.

 Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             1451           437        91         149        0
S1             65194          558        149        172        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.562698 seconds, 140.3MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2694           1073       265        892        1
S1             74567          829        561        790        0

 The dd completed in ~3.6 seconds and rejects increased with 586.

 Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             4976           2096       392        1721       2
S1             160661         1893       1309       1904       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.543225 seconds, 141.1MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5192           2194       433        1830       2
S1             167677         2891       3184       4729       0

 The dd completed in ~3.6 seconds and rejects increased with 1916.

The main reason to the above problem is pending IPIs for one of the CPUs
that is affected by the idlestate that the genpd governor selected. This
leads to that the PSCI FW refuses to enter it. To improve the behaviour,
let's start to take into account pending IPIs for CPUs in the genpd
governor, hence we fallback to use the shallower per CPU idlestate.

 Re-testing with this change shows a significant improved behaviour.

 - Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2556           878        19         368        1
S1             69974          596        10         152        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.522010 seconds, 142.0MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             3360           1320       28         819        1
S1             70168          710        11         267        0

 The dd completed in ~3.5 seconds and rejects increased with 10.

 - Teo governor
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5145           1861       39         938        1
S1             188887         3117       51         1975       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.653100 seconds, 136.9MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5260           1923       42         1002       1
S1             190849         4033       52         2892       0

 The dd completed in ~3.7 seconds and rejects increased with 4.

Note that, the rejected counters in genpd are also being accumulated in the
rejected counters that are managed by cpuidle, yet on a per CPU idlestates
basis. Comparing these counters before/after this change, through cpuidle's
sysfs interface shows the similar improvements.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
	- Use the new name of the helper function.
	- Re-test and update the statistics in the commit message.

---
 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
index 39359811a930..ed2ce9b6f8d1 100644
--- a/drivers/pmdomain/governor.c
+++ b/drivers/pmdomain/governor.c
@@ -404,15 +404,21 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 		if ((idle_duration_ns >= (genpd->states[i].residency_ns +
 		    genpd->states[i].power_off_latency_ns)) &&
 		    (global_constraint >= (genpd->states[i].power_on_latency_ns +
-		    genpd->states[i].power_off_latency_ns))) {
-			genpd->state_idx = i;
-			genpd->gd->last_enter = now;
-			genpd->gd->reflect_residency = true;
-			return true;
-		}
+		    genpd->states[i].power_off_latency_ns)))
+			break;
+
 	} while (--i >= 0);
 
-	return false;
+	if (i < 0)
+		return false;
+
+	if (cpus_may_have_pending_ipi(genpd->cpus))
+		return false;
+
+	genpd->state_idx = i;
+	genpd->gd->last_enter = now;
+	genpd->gd->reflect_residency = true;
+	return true;
 }
 
 struct dev_power_governor pm_domain_cpu_gov = {
-- 
2.43.0



^ permalink raw reply related	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-10-20 19:10   ` Ben Horgan
  2025-10-21 10:08     ` Ulf Hansson
  2025-10-27 17:20   ` Thomas Gleixner
  1 sibling, 1 reply; 7+ messages in thread
From: Ben Horgan @ 2025-10-20 19:10 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel

Hi Ulf,

Only a comment on the naming rather than a full review.

On 10/20/25 15:17, Ulf Hansson wrote:
> When governors used during cpuidle, tries to find the most optimal
> idlestate for a CPU or a group of CPUs, they are known to quite often fail.
> One reason for this, is that we are not taking into account whether there
> has been an IPI scheduled for any of the CPUs that are affected by the
> selected idlestate.
> 
> To enable pending IPIs to be taken into account for cpuidle decisions,
> let's introduce a new helper function, cpus_may_have_pending_ipi().

To me, "may" indicates permission, i.e. is allowed, rather than
correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?

> 
> Note that, the implementation is intentionally as lightweight as possible,
> in favor of always providing the correct information. For cpuidle decisions
> this is good enough.
> 
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
> 
> Changes in v2:
> 	- Implemented a common function, rather than making it arch-specific. As
> 	suggested by Thomas and Marc.
> 	- Renamed the function to indicate that it doesn't provide correctness.
> 	- Clarified function description and commit message.
> 
-- 
Thanks,

Ben



^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 19:10   ` Ben Horgan
@ 2025-10-21 10:08     ` Ulf Hansson
  2025-10-27 17:15       ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Ulf Hansson @ 2025-10-21 10:08 UTC (permalink / raw)
  To: Ben Horgan
  Cc: Rafael J . Wysocki, Thomas Gleixner, Mark Rutland, Marc Zyngier,
	Maulik Shah, Sudeep Holla, Daniel Lezcano, Vincent Guittot,
	linux-pm, linux-arm-kernel, linux-kernel

On Mon, 20 Oct 2025 at 21:11, Ben Horgan <ben.horgan@arm.com> wrote:
>
> Hi Ulf,
>
> Only a comment on the naming rather than a full review.
>
> On 10/20/25 15:17, Ulf Hansson wrote:
> > When governors used during cpuidle, tries to find the most optimal
> > idlestate for a CPU or a group of CPUs, they are known to quite often fail.
> > One reason for this, is that we are not taking into account whether there
> > has been an IPI scheduled for any of the CPUs that are affected by the
> > selected idlestate.
> >
> > To enable pending IPIs to be taken into account for cpuidle decisions,
> > let's introduce a new helper function, cpus_may_have_pending_ipi().
>
> To me, "may" indicates permission, i.e. is allowed, rather than
> correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?

Sure, that sounds better to me too.

I leave it a few days to allow people to provide their additional
input, before posting a new version with the new name of the function.

>
> >
> > Note that, the implementation is intentionally as lightweight as possible,
> > in favor of always providing the correct information. For cpuidle decisions
> > this is good enough.
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v2:
> >       - Implemented a common function, rather than making it arch-specific. As
> >       suggested by Thomas and Marc.
> >       - Renamed the function to indicate that it doesn't provide correctness.
> >       - Clarified function description and commit message.
> >
> --
> Thanks,
>
> Ben
>

Kind regards
Uffe


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-21 10:08     ` Ulf Hansson
@ 2025-10-27 17:15       ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2025-10-27 17:15 UTC (permalink / raw)
  To: Ulf Hansson, Ben Horgan
  Cc: Rafael J . Wysocki, Mark Rutland, Marc Zyngier, Maulik Shah,
	Sudeep Holla, Daniel Lezcano, Vincent Guittot, linux-pm,
	linux-arm-kernel, linux-kernel

On Tue, Oct 21 2025 at 12:08, Ulf Hansson wrote:

> On Mon, 20 Oct 2025 at 21:11, Ben Horgan <ben.horgan@arm.com> wrote:
>>
>> Hi Ulf,
>>
>> Only a comment on the naming rather than a full review.
>>
>> On 10/20/25 15:17, Ulf Hansson wrote:
>> > When governors used during cpuidle, tries to find the most optimal
>> > idlestate for a CPU or a group of CPUs, they are known to quite often fail.
>> > One reason for this, is that we are not taking into account whether there
>> > has been an IPI scheduled for any of the CPUs that are affected by the
>> > selected idlestate.
>> >
>> > To enable pending IPIs to be taken into account for cpuidle decisions,
>> > let's introduce a new helper function, cpus_may_have_pending_ipi().
>>
>> To me, "may" indicates permission, i.e. is allowed, rather than
>> correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?
>
> Sure, that sounds better to me too.

cpus_peek_for_pending_ipis() perhaps?
  


^ permalink raw reply	[flat|nested] 7+ messages in thread

* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
  2025-10-20 19:10   ` Ben Horgan
@ 2025-10-27 17:20   ` Thomas Gleixner
  1 sibling, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2025-10-27 17:20 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

On Mon, Oct 20 2025 at 16:17, Ulf Hansson wrote:

> When governors used during cpuidle, tries to find the most optimal

When governors used during cpuidle trie to ...

Both plural and no comma.

> idlestate for a CPU or a group of CPUs, they are known to quite often fail.

idle state

> One reason for this, is that we are not taking into account whether there

...for this is, that they are not taking into account

> has been an IPI scheduled for any of the CPUs that are affected by the
> selected idlestate.
>
> To enable pending IPIs to be taken into account for cpuidle decisions,
> let's introduce a new helper function, cpus_may_have_pending_ipi().

s/let's//

> Note that, the implementation is intentionally as lightweight as possible,
> in favor of always providing the correct information.

That sentence doesn't make sense. It's a snapshot and therefore can't
provide the correct information.

>  For cpuidle decisions this is good enough.

Thanks,

        tglx


^ permalink raw reply	[flat|nested] 7+ messages in thread

end of thread, other threads:[~2025-10-27 17:20 UTC | newest]

Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2025-10-20 14:17 [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs Ulf Hansson
2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
2025-10-20 19:10   ` Ben Horgan
2025-10-21 10:08     ` Ulf Hansson
2025-10-27 17:15       ` Thomas Gleixner
2025-10-27 17:20   ` Thomas Gleixner
2025-10-20 14:17 ` [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).