* Re: [PATCH 0/6] timers/migration: Handle heterogenous CPU capacities
From: Frederic Weisbecker @ 2026-06-04 13:36 UTC
To: Christian Loehle
Cc: LKML, Thomas Gleixner, Anna-Maria Behnsen, Sehee Jeong,
Qais Yousef, John Stultz, Rafael J. Wysocki, Andrea Righi,
Dietmar Eggemann, linux-pm
On Wed, Jun 03, 2026 at 11:50:58PM +0100, Christian Loehle wrote:
> On 4/23/26 17:53, Frederic Weisbecker wrote:
> > Hi,
> >
> > This is a late follow-up after:
> >
> > https://lore.kernel.org/lkml/20250910074251.8148-1-sehee1.jeong@samsung.com/
> >
> > To summarize, CPUs with heterogeneous capacities migrate their timers
> > indiscriminately between big and little CPUs. The timers often end up
> > migrated to big CPUs, increasing their idle target residency.
> >
> > Thomas proposed to isolate the hierarchy between big and little CPUs.
> > So here is a try. Note I haven't tested on real heterogenous hardware
> > so if you have it, please test it!
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/frederic/linux-dynticks.git
> > timers/core
> >
> > HEAD: f0a87af6dab6f3a6dd8a603a2b9d7dcc86fd50e4
> > Thanks,
> > Frederic
> > ---
> >
> > Frederic Weisbecker (6):
> >   timers/migration: Fix another hotplug activation race
> >   timers/migration: Abstract out hierarchy to prepare for CPU capacity awareness
> >   timers/migration: Track CPUs in a hierarchy
> >   timers/migration: Split per-capacity hierarchies
> >   timers/migration: Handle capacity in connect tracepoints
> >   scripts/timers: Add timer_migration_tree.py
> >
> >  include/trace/events/timer_migration.h |  24 ++--
> >  kernel/time/timer_migration.c          | 246 ++++++++++++++++++++++++---------
> >  kernel/time/timer_migration.h          |  19 +++
> >  scripts/timer_migration_tree.py        | 122 ++++++++++++++++
> >  4 files changed, 337 insertions(+), 74 deletions(-)
>
> Hi Frederic,
> sorry for the late reaction to this, I completely missed it (CCing
> linux-pm would have helped :) ).
Good point, I'll do that next time!
>
> I'm not convinced that unconditionally splitting the timer migration
> hierarchy per-capacity is always the right tradeoff from a power point of
> view. On some asymmetric systems we only have one or two CPUs in a given
> capacity class. In that case the split can effectively remove most of the
> useful timer migration opportunity for that class, even though allowing
> migration across nearby capacities may still be better for idle residency.
>
> I tested this on an Orion O6 system with the following topology:
>
> online CPUs: 0-11
>
> capacity 279: CPUs 2,3,4,5
> capacity 866: CPUs 8,9
> capacity 905: CPUs 6,7
> capacity 984: CPUs 10,11
> capacity 1024: CPUs 0,1
>
> I compared the series up to and including the preparatory/refactoring
> patch 3 against the full series including the per-capacity hierarchy split.
> The numbers below are aggregate cpuidle residency deltas over a 600s run.
>
> Idle workload:
>
> variant      LPI-0      LPI-1      LPI-2    LPI-1+2
> base       2298.7s    1253.8s    2817.0s    4070.8s
> full       2298.8s    1306.1s    2758.7s    4064.7s
> delta        +0.1s     +52.3s     -58.3s      -6.1s
>
> Grouped by capacity class, the LPI-2 loss is mostly on the lower-capacity
> CPUs:
>
> group    base LPI-2    full LPI-2    delta full
> 279         1073.5s       1031.9s        -41.6s
> 866          502.5s        486.4s        -16.1s
> 905          499.7s        490.4s         -9.3s
> 984          488.8s        496.0s         +7.2s
> 1024         252.5s        254.0s         +1.5s
>
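As a sanity check on my side, the per-class LPI-2 deltas above add up to the
aggregate LPI-2 delta from your first table. A quick throwaway check, with the
numbers copied from your table (the names are mine, nothing from the series):

```python
from decimal import Decimal

# LPI-2 residency in seconds per capacity class: (base, full),
# copied from the quoted table. Decimal avoids float rounding noise.
LPI2_BY_CAPACITY = {
    279: ("1073.5", "1031.9"),
    866: ("502.5", "486.4"),
    905: ("499.7", "490.4"),
    984: ("488.8", "496.0"),
    1024: ("252.5", "254.0"),
}


def lpi2_deltas(table):
    """Return full - base for each capacity class."""
    return {
        cap: Decimal(full) - Decimal(base)
        for cap, (base, full) in table.items()
    }


deltas = lpi2_deltas(LPI2_BY_CAPACITY)
total = sum(deltas.values())

for cap, delta in deltas.items():
    print(f"capacity {cap:>4}: {delta:+}s")
print(f"total LPI-2 delta: {total:+}s")
```

The total comes out at -58.3s, matching the aggregate row, and the 279 class
alone accounts for -41.6s of it.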
> For a light tbench run (tbench -R 20 -t 600 4), the result is more mixed:
>
> variant      LPI-0      LPI-1      LPI-2    LPI-1+2
> base       2593.5s    1483.4s     410.3s    1893.6s
> full       2605.3s    1446.5s     416.6s    1863.1s
> delta       +11.8s     -36.9s      +6.3s     -30.5s
>
> So tbench gets a small increase in deepest idle, but loses more in
> LPI-1+2 overall.
>
> If we do want to keep the per-capacity hierarchy split, maybe it's sufficient
> to gate it on there being either a small number of capacity classes or on
> every class having >=4 CPUs before splitting?
Ok, I was afraid of something like that, i.e. it works for some usages but not
for others.
And I don't know what to do. For example, if I apply your suggested constraints,
which hierarchy should the capacity classes with fewer than 4 CPUs go to?
Thoughts?
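
To make the question concrete, here is a rough userspace sketch of one
possible policy. It is purely hypothetical and not what the series does:
walk the capacity classes in ascending order and keep merging neighbouring
classes until each resulting hierarchy has at least 4 CPUs. The function
names and the MIN_CPUS constant are made up for illustration, and the
threshold is just the number from your mail, not a tuned value.

```python
from collections import defaultdict

MIN_CPUS = 4  # assumed threshold, taken from the ">=4 CPUs" suggestion


def group_by_capacity(capacity_of):
    """Map capacity -> list of CPUs, ordered by ascending capacity."""
    classes = defaultdict(list)
    for cpu, cap in capacity_of.items():
        classes[cap].append(cpu)
    return dict(sorted(classes.items()))


def plan_hierarchies(capacity_of, min_cpus=MIN_CPUS):
    """Greedily merge adjacent capacity classes until each group
    holds at least min_cpus CPUs. A leftover undersized tail is
    folded into the previous group."""
    groups, current = [], []
    for cpus in group_by_capacity(capacity_of).values():
        current.extend(cpus)
        if len(current) >= min_cpus:
            groups.append(sorted(current))
            current = []
    if current:
        if groups:
            groups[-1] = sorted(groups[-1] + current)
        else:
            groups.append(sorted(current))
    return groups


# CPU -> capacity, from the Orion O6 topology quoted above.
orion_o6 = {
    2: 279, 3: 279, 4: 279, 5: 279,
    8: 866, 9: 866,
    6: 905, 7: 905,
    10: 984, 11: 984,
    0: 1024, 1: 1024,
}

print(plan_hierarchies(orion_o6))
```

On your topology this yields three hierarchies: CPUs 2-5, CPUs 6-9 and
CPUs 0,1,10,11. Whether pairing 866 with 905 and 984 with 1024 is actually
the right call for idle residency is exactly what I can't tell without
hardware.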
>
> Kind regards,
> Christian
>
--
Frederic Weisbecker
SUSE Labs