public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dietmar Eggemann <dietmar.eggemann@arm.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: "Rafael J. Wysocki" <rjw@rjwysocki.net>,
	Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Lukasz Luba <lukasz.luba@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Morten Rasmussen <morten.rasmussen@arm.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Ricardo Neri <ricardo.neri-calderon@linux.intel.com>,
	Pierre Gondois <pierre.gondois@arm.com>
Subject: Re: [RFC][PATCH v021 5/9] PM: EM: Introduce em_dev_expand_perf_domain()
Date: Mon, 6 Jan 2025 13:59:59 +0100	[thread overview]
Message-ID: <8f26fac2-787d-44a9-a0cd-c3035a91149c@arm.com> (raw)
In-Reply-To: <CAJZ5v0jus4bzeZhUK4WC7uypQkh-_MMuU1M54figsGV3+5OhUg@mail.gmail.com>

On 17/12/2024 21:40, Rafael J. Wysocki wrote:
> On Tue, Dec 17, 2024 at 10:38 AM Dietmar Eggemann
> <dietmar.eggemann@arm.com> wrote:
>>
>> On 29/11/2024 17:02, Rafael J. Wysocki wrote:
>>> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>>
>>> Introduce a helper function for adding a CPU to an existing EM perf
>>> domain.
>>>
>>> Subsequently, this will be used by the intel_pstate driver to add new
>>> CPUs to existing perf domains when those CPUs go online for the first
>>> time after the initialization of the driver.
>>>
>>> No intentional functional impact.
>>>
>>> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
>>> ---
>>>
>>> v0.1 -> v0.2: No changes
>>
>> Could you add information why this new EM interface is needed?
> 
> There is some of it in the changelog already.
> 
> In fact, it is only needed in a corner case when the system starts
> with some CPUs offline and they only go online later (as a result of
> an explicit request from user space).  That is the only case when a
> new CPU may need to be added to an existing perf domain.

OK, I see. Arm doesn't need this since we derive the masks from the
CPUfreq policies so far.

I just verified, we both keep hotplugged-out CPU within the PD. That's
why we mask the PD cpus with cpu_online_mask in:

find_energy_efficient_cpu()

  ...

  for (; pd; pd = pd->next)

    cpumask_and(cpus, perf_domain_span(pd), cpu_online_mask);

>> IIRC, you can't use the existing way (cpufreq_driver::register_em) since
>> it gets called to early (3) for the PD cpumasks to be ready. This issue
>> will be there for any system in which uarch domains are not congruent
>> with clock domains which we hadn't have to deal with Arm's heterogeneous
>> CPUs so far.
> 
> Basically, yes.
> 
>> __init intel_pstate_init()
>>
>>   intel_pstate_register_driver()
>>
>>     cpufreq_register_driver()
>>
>>       subsys_interface_register()
>>
>>         sif->add_dev() -> cpufreq_add_dev()
>>
>>           cpufreq_online()
>>
>>             if (!new_policy && cpufreq_driver->online)
>>
>>             else
>>
>>               cpufreq_driver->init() -> intel_pstate_cpu_init()
>>
>>                 __intel_pstate_cpu_init()
>>
>>                   intel_pstate_init_cpu()
>>
>>                     intel_pstate_get_cpu_pstates()
>>
>>                       hybrid_add_to_domain()
>>
>>                         em_dev_expand_perf_domain()              <-- (1)
>>
>>                   intel_pstate_init_acpi_perf_limits()
>>
>>                     intel_pstate_set_itmt_prio()                 <-- (2)
>>
>>             if (new_policy)
>>
>>               cpufreq_driver->register_em()                      <-- (3)
>>
>>     hybrid_init_cpu_capacity_scaling()
>>
>>       hybrid_refresh_cpu_capacity_scaling()
>>
>>         __hybrid_refresh_cpu_capacity_scaling()                  <-- (4)
>>
>>         hybrid_register_all_perf_domains()
>>
>>           hybrid_register_perf_domain()
>>
>>             em_dev_register_perf_domain()                        <-- (5)
>>
>>       /* Enable EAS */
>>       sched_clear_itmt_support()                                 <-- (6)
>>
>> Debugging this on a 'nosmt' i7-13700K (online mask =
>> [0,2,4,6,8,10,12,14,16-23]
>>
>> (1) Add CPU to existing hybrid PD or create new hybrid PD.
> 
> Not exactly.
> 
> (1) is just to expand an existing perf domain if the CPU is new (was
> offline all the time previously).

OK.

> 
> Likewise, the direct hybrid_register_perf_domain() call in
> hybrid_add_to_domain() is to add a new perf domain if the given CPU is
> new (was offline all the time previously) and is the first one of the
> given type (say, the system is starting with all E-cores offline).
> It won't succeed before (4).
> 
> For CPUs that are online to start with, hybrid_add_to_domain() assigns
> them to hybrid domains and PDs are created for them in
> hybrid_register_all_perf_domains().

Understood.

> 
>> (2) Triggers sched domain rebuild (+ enabling EAS) already here during
>>     startup ?
> 
> This is just to enable ITMT which is the default mechanism for Intel
> hybrid platforms.  It also requires a rebuild of sched domains to be
> enabled.
> 
>>     IMHO, reason is that max_highest_perf > min_highest_perf because of
>>     different itmt prio
> 
> Yes (which means that the platform is at least not homogenous).
> 
> This really has been introduced for the handling of favored cores on
> otherwise non-hybrid platforms (say Tiger Lake).
> 
>>     Happens for CPU8 on my machine (after CPU8 is added to hybrid PD
>>     0,2,4,6,8) (itmt prio for CPU8=69 (1024) instead of 68 (1009)).
>>     So it looks like EAS is enabled before (6) ?
> 
> No, it is ITMT because CPU8 is a favored core.
> 
>> (3) ARM's way to do (5)
>> (4) Setting hybrid_max_perf_cpu
>> (5) Register EM here
>> (6) Actual call to initially triggers sched domain rebuild (+ enable
>>     EAS) (done already in (2) on my machine)
> 
> This is the second rebuild of sched domains to turn off ITMT and
> enable EAS.  The previous one is to enable ITMT.
> 
> The earlier enabling of ITMT could be avoided I think, but that would
> be a complication on platforms that contain favored cores but
> otherwise are not hybrid.

OK, the fact that EAS will already be enabled in (2) is not an issue IMHO.

> 
>> So (3) is not possible for Intel hybrid since the policy's cpumask(s)
> 
> It is possible in principle, but not particularly convenient because
> at that point it is not clear whether or not the platform is really
> hybrid and SMT is off and so whether or not EAS is to be used.
> 
>> contain only one CPUs, i.e. CPUs are not sharing clock.
>> And those cpumasks have to be build under (1) to be used in (5)?
> 
> They are built by the caller of (1) to be precise, but yes.

OK.

I still see an issue in putting all performance CPUs in one PD.

find_energy_efficient_cpu() assumes that all PD CPUs has the same CPU
capacity value.

  find_energy_efficient_cpu()

    ...

    for (; pd; pd = pd->next) {

      ...
      /* Account external pressure for the energy estimation */
      cpu = cpumask_first(cpus);
      cpu_actual_cap = get_actual_cpu_capacity(cpu); --> (1)


Even though X86 does not implement hw_load_avg() or
cpufreq_get_pressure() we would still assume that all PD CPUs have the
same cpu_capacity:


  get_actual_cpu_capacity() <-- (1)

    capacity = arch_scale_cpu_capacity(cpu)


Now the error introduced is small (1024 versus 1009) on my i7-13700K but
it's there. How big can those diffs based om itmt prio be?

  reply	other threads:[~2025-01-06 13:00 UTC|newest]

Thread overview: 31+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-11-29 15:55 [RFC][PATCH v021 0/9] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Rafael J. Wysocki
2024-11-29 15:56 ` [RFC][PATCH v021 1/9] cpufreq: intel_pstate: Use CPPC to get scaling factors Rafael J. Wysocki
2024-11-29 15:56 ` [RFC][PATCH v021 2/9] cpufreq: intel_pstate: Drop Arrow Lake from "scaling factor" list Rafael J. Wysocki
2024-11-29 15:59 ` [RFC][PATCH v021 3/9] PM: EM: Move perf rebuilding function from schedutil to EM Rafael J. Wysocki
2024-12-11 10:32   ` Christian Loehle
2024-11-29 16:00 ` [RFC][PATCH v021 4/9] sched/topology: Adjust cpufreq checks for EAS Rafael J. Wysocki
2024-12-11 10:33   ` Christian Loehle
2024-12-11 11:29     ` Rafael J. Wysocki
2024-12-11 11:43       ` Christian Loehle
2024-12-11 11:59         ` Rafael J. Wysocki
2024-12-11 13:25       ` Vincent Guittot
2024-12-11 16:37         ` Rafael J. Wysocki
2024-12-11 17:07           ` Vincent Guittot
2024-12-11 17:55             ` Rafael J. Wysocki
2024-12-12  8:34               ` Vincent Guittot
2024-12-16 14:49   ` Dietmar Eggemann
2024-12-16 14:58     ` Rafael J. Wysocki
2024-11-29 16:02 ` [RFC][PATCH v021 5/9] PM: EM: Introduce em_dev_expand_perf_domain() Rafael J. Wysocki
2024-12-17  9:38   ` Dietmar Eggemann
2024-12-17 20:40     ` Rafael J. Wysocki
2025-01-06 12:59       ` Dietmar Eggemann [this message]
2024-11-29 16:06 ` [RFC][PATCH v021 6/9] PM: EM: Call em_compute_costs() from em_create_perf_table() Rafael J. Wysocki
2024-11-29 16:15 ` [RFC][PATCH v021 7/9] PM: EM: Register perf domains with ho :active_power() callbacks Rafael J. Wysocki
2024-12-16 10:59   ` Dietmar Eggemann
2024-12-16 11:58     ` Rafael J. Wysocki
2024-11-29 16:21 ` [RFC][PATCH v021 8/9] cpufreq: intel_pstate: Introduce hybrid domains Rafael J. Wysocki
2024-12-12 17:04   ` Christian Loehle
2024-11-29 16:28 ` [RFC][PATCH v021 9/9] cpufreq: intel_pstate: Add basic EAS support on hybrid platforms Rafael J. Wysocki
2025-01-25 11:18 ` [RFC][PATCH v021 0/9] cpufreq: intel_pstate: Enable EAS on hybrid platforms without SMT Dietmar Eggemann
2025-01-27 13:57   ` Rafael J. Wysocki
2025-02-01 12:43     ` Christian Loehle

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=8f26fac2-787d-44a9-a0cd-c3035a91149c@arm.com \
    --to=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=lukasz.luba@arm.com \
    --cc=morten.rasmussen@arm.com \
    --cc=peterz@infradead.org \
    --cc=pierre.gondois@arm.com \
    --cc=rafael@kernel.org \
    --cc=ricardo.neri-calderon@linux.intel.com \
    --cc=rjw@rjwysocki.net \
    --cc=srinivas.pandruvada@linux.intel.com \
    --cc=vincent.guittot@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox