* [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs
@ 2025-10-20 14:17 Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  0 siblings, 2 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

Platforms using the genpd governor for CPUs rely on it to find the optimal
idlestate for a group of CPUs. However, observations tell us that significant
improvements can be made here.

These improvements are based on taking pending IPIs into account for the group
of CPUs that the genpd governor controls. If there is a pending IPI for any of
these CPUs, we should not request an idlestate that affects the group, but
rather pick a shallower state that affects only the CPU.

More details are available in the commit messages for each patch.

Kind regards
Ulf Hansson

Ulf Hansson (2):
  smp: Introduce a helper function to check for pending IPIs
  pmdomain: Extend the genpd governor for CPUs to account for IPIs

 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 include/linux/smp.h         |  5 +++++
 kernel/smp.c                | 24 ++++++++++++++++++++++++
 3 files changed, 42 insertions(+), 7 deletions(-)

-- 
2.43.0

^ permalink raw reply	[flat|nested] 7+ messages in thread
* [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs Ulf Hansson
@ 2025-10-20 14:17 ` Ulf Hansson
  2025-10-20 19:10   ` Ben Horgan
  2025-10-27 17:20   ` Thomas Gleixner
  2025-10-20 14:17 ` [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs Ulf Hansson
  1 sibling, 2 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

When governors used during cpuidle, tries to find the most optimal
idlestate for a CPU or a group of CPUs, they are known to quite often fail.
One reason for this, is that we are not taking into account whether there
has been an IPI scheduled for any of the CPUs that are affected by the
selected idlestate.

To enable pending IPIs to be taken into account for cpuidle decisions,
let's introduce a new helper function, cpus_may_have_pending_ipi().

Note that, the implementation is intentionally as lightweight as possible,
in favor of always providing the correct information. For cpuidle decisions
this is good enough.

Suggested-by: Thomas Gleixner <tglx@linutronix.de>
Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
	- Implemented a common function, rather than making it arch-specific. As
	  suggested by Thomas and Marc.
	- Renamed the function to indicate that it doesn't provide correctness.
	- Clarified function description and commit message.

---
 include/linux/smp.h |  5 +++++
 kernel/smp.c        | 24 ++++++++++++++++++++++++
 2 files changed, 29 insertions(+)

diff --git a/include/linux/smp.h b/include/linux/smp.h
index 18e9c918325e..093e5458493e 100644
--- a/include/linux/smp.h
+++ b/include/linux/smp.h
@@ -168,6 +168,7 @@ int smp_call_function_any(const struct cpumask *mask,
 
 void kick_all_cpus_sync(void);
 void wake_up_all_idle_cpus(void);
+bool cpus_may_have_pending_ipi(const struct cpumask *mask);
 
 /*
  * Generic and arch helpers
@@ -216,6 +217,10 @@ smp_call_function_any(const struct cpumask *mask, smp_call_func_t func,
 
 static inline void kick_all_cpus_sync(void) {  }
 static inline void wake_up_all_idle_cpus(void) {  }
+static inline bool cpus_may_have_pending_ipi(const struct cpumask *mask)
+{
+	return false;
+}
 
 #define setup_max_cpus 0
 
diff --git a/kernel/smp.c b/kernel/smp.c
index 02f52291fae4..775f90790935 100644
--- a/kernel/smp.c
+++ b/kernel/smp.c
@@ -1087,6 +1087,30 @@ void wake_up_all_idle_cpus(void)
 }
 EXPORT_SYMBOL_GPL(wake_up_all_idle_cpus);
 
+/**
+ * cpus_may_have_pending_ipi - Check for pending IPIs for CPUs
+ * @mask: The CPU mask for the CPUs to check.
+ *
+ * This function walks through the @mask to check if there are any pending IPIs
+ * scheduled, for any of the CPUs in the @mask.
+ *
+ * It's important for the caller to know that this function does not guarantee
+ * correctness, as the intent is to be as lightweight as possible.
+ *
+ * Returns true if there is a pending IPI scheduled and false otherwise.
+ */
+bool cpus_may_have_pending_ipi(const struct cpumask *mask)
+{
+	unsigned int cpu;
+
+	for_each_cpu(cpu, mask) {
+		if (!llist_empty(per_cpu_ptr(&call_single_queue, cpu)))
+			return true;
+	}
+
+	return false;
+}
+
 /**
  * struct smp_call_on_cpu_struct - Call a function on a specific CPU
  * @work: &work_struct
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread
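
Editorial sketch (not part of the posted patch): the helper in the patch above
only peeks at per-CPU queue heads without taking locks. The userspace model
below illustrates that idea under stated assumptions: a 32-bit bitmask stands
in for struct cpumask, a plain pointer array stands in for the per-CPU
call_single_queue, and every model_* name is hypothetical. It is single-threaded
and does not reproduce the kernel's llist or per-CPU machinery.

```c
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define MODEL_NR_CPUS 4

/* Hypothetical stand-in for one entry on a per-CPU call queue. */
struct model_node {
	struct model_node *next;
};

/* Hypothetical stand-in for the per-CPU call_single_queue list heads. */
static struct model_node *model_queue[MODEL_NR_CPUS];

/* Queue a node for @cpu, as the sender of a cross-CPU call would. */
static void model_queue_push(unsigned int cpu, struct model_node *node)
{
	node->next = model_queue[cpu];
	model_queue[cpu] = node;
}

/* Empty the queue of @cpu, as the target CPU would after handling the IPI. */
static void model_queue_flush(unsigned int cpu)
{
	model_queue[cpu] = NULL;
}

/*
 * Peek at the queue head of every CPU whose bit is set in @mask and report
 * whether any of them is non-empty. The answer is only a snapshot: in a real
 * system a queue may fill or drain right after it has been inspected.
 */
static bool model_cpus_may_have_pending_ipi(uint32_t mask)
{
	for (unsigned int cpu = 0; cpu < MODEL_NR_CPUS; cpu++) {
		if (!(mask & (UINT32_C(1) << cpu)))
			continue;
		if (model_queue[cpu] != NULL)
			return true;
	}

	return false;
}
```

A caller would pass the mask of CPUs that share the idle state it is about to
request and treat a true result as a hint to stay in a shallower state.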
* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-10-20 19:10   ` Ben Horgan
  2025-10-21 10:08     ` Ulf Hansson
  2025-10-27 17:20   ` Thomas Gleixner
  1 sibling, 1 reply; 7+ messages in thread
From: Ben Horgan @ 2025-10-20 19:10 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel

Hi Ulf,

Only a comment on the naming rather than a full review.

On 10/20/25 15:17, Ulf Hansson wrote:
> When governors used during cpuidle, tries to find the most optimal
> idlestate for a CPU or a group of CPUs, they are known to quite often fail.
> One reason for this, is that we are not taking into account whether there
> has been an IPI scheduled for any of the CPUs that are affected by the
> selected idlestate.
>
> To enable pending IPIs to be taken into account for cpuidle decisions,
> let's introduce a new helper function, cpus_may_have_pending_ipi().

To me, "may" indicates permission, i.e. is allowed, rather than
correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?

>
> Note that, the implementation is intentionally as lightweight as possible,
> in favor of always providing the correct information. For cpuidle decisions
> this is good enough.
>
> Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> ---
>
> Changes in v2:
> 	- Implemented a common function, rather than making it arch-specific. As
> 	  suggested by Thomas and Marc.
> 	- Renamed the function to indicate that it doesn't provide correctness.
> 	- Clarified function description and commit message.
>

-- 
Thanks,

Ben

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 19:10   ` Ben Horgan
@ 2025-10-21 10:08     ` Ulf Hansson
  2025-10-27 17:15       ` Thomas Gleixner
  0 siblings, 1 reply; 7+ messages in thread
From: Ulf Hansson @ 2025-10-21 10:08 UTC (permalink / raw)
  To: Ben Horgan
  Cc: Rafael J . Wysocki, Thomas Gleixner, Mark Rutland, Marc Zyngier,
	Maulik Shah, Sudeep Holla, Daniel Lezcano, Vincent Guittot,
	linux-pm, linux-arm-kernel, linux-kernel

On Mon, 20 Oct 2025 at 21:11, Ben Horgan <ben.horgan@arm.com> wrote:
>
> Hi Ulf,
>
> Only a comment on the naming rather than a full review.
>
> On 10/20/25 15:17, Ulf Hansson wrote:
> > When governors used during cpuidle, tries to find the most optimal
> > idlestate for a CPU or a group of CPUs, they are known to quite often fail.
> > One reason for this, is that we are not taking into account whether there
> > has been an IPI scheduled for any of the CPUs that are affected by the
> > selected idlestate.
> >
> > To enable pending IPIs to be taken into account for cpuidle decisions,
> > let's introduce a new helper function, cpus_may_have_pending_ipi().
>
> To me, "may" indicates permission, i.e. is allowed, rather than
> correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?

Sure, that sounds better to me too.

I leave it a few days to allow people to provide their additional
input, before posting a new version with the new name of the function.

>
> >
> > Note that, the implementation is intentionally as lightweight as possible,
> > in favor of always providing the correct information. For cpuidle decisions
> > this is good enough.
> >
> > Suggested-by: Thomas Gleixner <tglx@linutronix.de>
> > Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
> > ---
> >
> > Changes in v2:
> > 	- Implemented a common function, rather than making it arch-specific. As
> > 	  suggested by Thomas and Marc.
> > 	- Renamed the function to indicate that it doesn't provide correctness.
> > 	- Clarified function description and commit message.
> >
> --
> Thanks,
>
> Ben
>

Kind regards
Uffe

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-21 10:08     ` Ulf Hansson
@ 2025-10-27 17:15       ` Thomas Gleixner
  0 siblings, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2025-10-27 17:15 UTC (permalink / raw)
  To: Ulf Hansson, Ben Horgan
  Cc: Rafael J . Wysocki, Mark Rutland, Marc Zyngier, Maulik Shah,
	Sudeep Holla, Daniel Lezcano, Vincent Guittot, linux-pm,
	linux-arm-kernel, linux-kernel

On Tue, Oct 21 2025 at 12:08, Ulf Hansson wrote:
> On Mon, 20 Oct 2025 at 21:11, Ben Horgan <ben.horgan@arm.com> wrote:
>>
>> Hi Ulf,
>>
>> Only a comment on the naming rather than a full review.
>>
>> On 10/20/25 15:17, Ulf Hansson wrote:
>> > When governors used during cpuidle, tries to find the most optimal
>> > idlestate for a CPU or a group of CPUs, they are known to quite often fail.
>> > One reason for this, is that we are not taking into account whether there
>> > has been an IPI scheduled for any of the CPUs that are affected by the
>> > selected idlestate.
>> >
>> > To enable pending IPIs to be taken into account for cpuidle decisions,
>> > let's introduce a new helper function, cpus_may_have_pending_ipi().
>>
>> To me, "may" indicates permission, i.e. is allowed, rather than
>> correctness. Would "likely" be better here, cpus_likely_have_pending_ipi()?
>
> Sure, that sounds better to me too.

cpus_peek_for_pending_ipis() perhaps?

^ permalink raw reply	[flat|nested] 7+ messages in thread
* Re: [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
  2025-10-20 19:10   ` Ben Horgan
@ 2025-10-27 17:20   ` Thomas Gleixner
  1 sibling, 0 replies; 7+ messages in thread
From: Thomas Gleixner @ 2025-10-27 17:20 UTC (permalink / raw)
  To: Ulf Hansson, Rafael J . Wysocki
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

On Mon, Oct 20 2025 at 16:17, Ulf Hansson wrote:
> When governors used during cpuidle, tries to find the most optimal

When governors used during cpuidle try to ...

Both plural and no comma.

> idlestate for a CPU or a group of CPUs, they are known to quite often fail.

idle state

> One reason for this, is that we are not taking into account whether there

...for this is, that they are not taking into account

> has been an IPI scheduled for any of the CPUs that are affected by the
> selected idlestate.
>
> To enable pending IPIs to be taken into account for cpuidle decisions,
> let's introduce a new helper function, cpus_may_have_pending_ipi().

s/let's//

> Note that, the implementation is intentionally as lightweight as possible,
> in favor of always providing the correct information.

That sentence doesn't make sense. It's a snapshot and therefore can't
provide the correct information.

> For cpuidle decisions this is good enough.

Thanks,

        tglx

^ permalink raw reply	[flat|nested] 7+ messages in thread
* [PATCH v2 2/2] pmdomain: Extend the genpd governor for CPUs to account for IPIs
  2025-10-20 14:17 [PATCH v2 0/2] pmdomain: Improve idlestate selection for CPUs Ulf Hansson
  2025-10-20 14:17 ` [PATCH v2 1/2] smp: Introduce a helper function to check for pending IPIs Ulf Hansson
@ 2025-10-20 14:17 ` Ulf Hansson
  1 sibling, 0 replies; 7+ messages in thread
From: Ulf Hansson @ 2025-10-20 14:17 UTC (permalink / raw)
  To: Rafael J . Wysocki, Thomas Gleixner
  Cc: Mark Rutland, Marc Zyngier, Maulik Shah, Sudeep Holla,
	Daniel Lezcano, Vincent Guittot, linux-pm, linux-arm-kernel,
	linux-kernel, Ulf Hansson

When the genpd governor for CPUs tries to select the optimal idlestate for
a group of CPUs managed in a PM domain, it fails far too often.

On a Dragonboard 410c, which is an arm64-based platform with 4 CPUs in one
cluster that is using PSCI OS-initiated mode, we can observe that we often
fail when trying to enter the selected idlestate. This is certainly
suboptimal behaviour that leads to many unnecessary requests being sent to
the PSCI FW.

A simple dd operation that reads from the eMMC, to generate some IRQs and
I/O handling, helps us to understand the problem, while also monitoring the
rejected counters in debugfs for the corresponding idlestates of the genpd
in question.

Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             1451           437        91         149        0
S1             65194          558        149        172        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.562698 seconds, 140.3MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2694           1073       265        892        1
S1             74567          829        561        790        0

The dd completed in ~3.6 seconds and rejects increased by 586.

Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             4976           2096       392        1721       2
S1             160661         1893       1309       1904       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.543225 seconds, 141.1MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5192           2194       433        1830       2
S1             167677         2891       3184       4729       0

The dd completed in ~3.5 seconds and rejects increased by 1916.

The main reason for the above problem is pending IPIs for one of the CPUs
that is affected by the idlestate that the genpd governor selected. As a
result, the PSCI FW refuses to enter it. To improve the behaviour, let's
start to take pending IPIs for CPUs into account in the genpd governor, and
fall back to the shallower per CPU idlestate when one is pending.

Re-testing with this change shows significantly improved behaviour.

- Menu governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             2556           878        19         368        1
S1             69974          596        10         152        0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.522010 seconds, 142.0MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             3360           1320       28         819        1
S1             70168          710        11         267        0

The dd completed in ~3.5 seconds and rejects increased by 10.

- Teo governor:
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5145           1861       39         938        1
S1             188887         3117       51         1975       0
dd if=/dev/mmcblk0 of=/dev/null bs=1M count=500
524288000 bytes (500.0MB) copied, 3.653100 seconds, 136.9MB/s
cat /sys/kernel/debug/pm_genpd/power-domain-cluster/idle_states
State          Time Spent(ms) Usage      Rejected   Above      Below
S0             5260           1923       42         1002       1
S1             190849         4033       52         2892       0

The dd completed in ~3.7 seconds and rejects increased by 4.

Note that the rejected counters in genpd are also accumulated in the
rejected counters that are managed by cpuidle, yet on a per CPU idlestate
basis. Comparing these counters before/after this change through cpuidle's
sysfs interface shows similar improvements.

Signed-off-by: Ulf Hansson <ulf.hansson@linaro.org>
---

Changes in v2:
	- Use the new name of the helper function.
	- Re-test and update the statistics in the commit message.

---
 drivers/pmdomain/governor.c | 20 +++++++++++++-------
 1 file changed, 13 insertions(+), 7 deletions(-)

diff --git a/drivers/pmdomain/governor.c b/drivers/pmdomain/governor.c
index 39359811a930..ed2ce9b6f8d1 100644
--- a/drivers/pmdomain/governor.c
+++ b/drivers/pmdomain/governor.c
@@ -404,15 +404,21 @@ static bool cpu_power_down_ok(struct dev_pm_domain *pd)
 		if ((idle_duration_ns >= (genpd->states[i].residency_ns +
 		    genpd->states[i].power_off_latency_ns)) &&
 		    (global_constraint >= (genpd->states[i].power_on_latency_ns +
-		    genpd->states[i].power_off_latency_ns))) {
-			genpd->state_idx = i;
-			genpd->gd->last_enter = now;
-			genpd->gd->reflect_residency = true;
-			return true;
-		}
+		    genpd->states[i].power_off_latency_ns)))
+			break;
+
 	} while (--i >= 0);
 
-	return false;
+	if (i < 0)
+		return false;
+
+	if (cpus_may_have_pending_ipi(genpd->cpus))
+		return false;
+
+	genpd->state_idx = i;
+	genpd->gd->last_enter = now;
+	genpd->gd->reflect_residency = true;
+	return true;
 }
 
 struct dev_power_governor pm_domain_cpu_gov = {
-- 
2.43.0

^ permalink raw reply related	[flat|nested] 7+ messages in thread
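
Editorial sketch (not part of the posted patch): the reworked tail of
cpu_power_down_ok() first searches for a domain state that fits both the
expected idle duration and the latency constraint, and only then consults the
IPI check. The userspace model below mirrors that ordering under stated
assumptions: struct model_state and model_pick_state() are hypothetical names,
the search is assumed to start at the deepest state (index count - 1), and the
result of the IPI check is passed in as a plain boolean instead of calling
cpus_may_have_pending_ipi().

```c
#include <stdbool.h>
#include <stdint.h>

/* Hypothetical subset of the per-state parameters used by the governor. */
struct model_state {
	int64_t residency_ns;
	int64_t power_on_latency_ns;
	int64_t power_off_latency_ns;
};

/*
 * Return the index of the selected domain idle state, or -1 when the domain
 * should stay on, so that only the shallower per-CPU state gets used.
 */
static int model_pick_state(const struct model_state *states, int count,
			    int64_t idle_duration_ns,
			    int64_t global_constraint, bool ipi_pending)
{
	int i = count - 1;	/* assumption: start from the deepest state */

	if (count <= 0)
		return -1;

	/* Walk towards shallower states until one fits both limits. */
	do {
		if ((idle_duration_ns >= (states[i].residency_ns +
		    states[i].power_off_latency_ns)) &&
		    (global_constraint >= (states[i].power_on_latency_ns +
		    states[i].power_off_latency_ns)))
			break;
	} while (--i >= 0);

	/* No state fits the timing limits. */
	if (i < 0)
		return -1;

	/* A pending IPI would wake a CPU right away: refuse the domain state. */
	if (ipi_pending)
		return -1;

	return i;
}
```

Placing the IPI check after the loop means the queues are only inspected when
a domain state would otherwise have been selected, which keeps the common
rejection path as cheap as it was before.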