Linux Tegra architecture development
 help / color / mirror / Atom feed
From: Sumit Gupta <sumitg@nvidia.com>
To: "Rafael J. Wysocki" <rafael@kernel.org>
Cc: viresh.kumar@linaro.org, pierre.gondois@arm.com,
	ionela.voinescu@arm.com, zhenglifeng1@huawei.com,
	zhanjie9@hisilicon.com, linux-kernel@vger.kernel.org,
	linux-pm@vger.kernel.org, linux-tegra@vger.kernel.org,
	treding@nvidia.com, jonathanh@nvidia.com, vsethi@nvidia.com,
	ksitaraman@nvidia.com, sanjayc@nvidia.com, mochs@nvidia.com,
	bbasu@nvidia.com, sumitg@nvidia.com
Subject: Re: [PATCH] cpufreq: CPPC: Preserve OSPM-set registers across hotplug and unload
Date: Wed, 24 Jun 2026 18:26:15 +0530	[thread overview]
Message-ID: <de03b23f-18a3-4e99-a15a-5044463ac484@nvidia.com> (raw)
In-Reply-To: <CAJZ5v0jvkNfx-1XChRCsGHJAcY7W4j41P+Xug6+TpPge7tHF7Q@mail.gmail.com>

Hi Rafael,


On 23/06/26 16:30, Rafael J. Wysocki wrote:
> External email: Use caution opening links or attachments
>
>
> On Tue, Jun 23, 2026 at 11:54 AM Sumit Gupta <sumitg@nvidia.com> wrote:
>> Values written to OSPM-set CPPC registers (via sysfs or the autonomous
>> boot parameter) can be lost in two ways:
>>
>>    - Across CPU hotplug: the platform may reset a CPU's registers when it
>>      is offlined.
>>    - On driver unload: the value the driver wrote is left in the register
>>      instead of returning to its pre-driver state.
>>
>> Add a small table-driven mechanism that handles both:
>>
>>    - Capture each register's firmware value when a CPU is first seen and
>>      restore it on driver unload.
>>    - Record the last value the driver set and reapply it from ->init()
>>      when the policy is reactivated after CPU hotplug.
> I'm not sure if this is a good idea TBH.
>
> The overall system state when the CPU goes online may be completely
> different from the system state when the CPU was online last time, so
> there is no reason to restore its settings from before offline, at
> least in principle.

These are values userspace deliberately set, more like a frequency QoS
request that persists until userspace changes it than transient state.
It's platform dependent though, as the platform I test on preserves them
across hotplug, but where the platform resets the register on offline
the value is silently lost. Should the kernel re-apply it on online,
or is that better left to userspace? If so, I will drop the hotplug reapply
and keep only the restore on driver unload.

intel_pstate and amd-pstate already re-sync these on CPU online, as they
keep the policy across hotplug via ->online()/->offline(). So for them
it's re-syncing the hardware, not restoring state from before offline.

cppc_cpufreq has no ->online()/->offline() today, so it fully tears the
policy down and rebuilds it, which is why this reads as "restoring
settings from before offline".
Would adding ->online()/->offline() be acceptable, with policy preserved
and ->online() just re-syncing the registers the platform reset?

Thanks,
Sumit

>> The firmware value is captured on a CPU's first activation rather than
>> once at module load, so CPUs offline at boot or hot-added later are
>> covered.
>>
>> Reapply is only needed on a full policy teardown and bring-up, which goes
>> through ->init(). In a SHARED_TYPE_ANY policy, offlining a single CPU
>> leaves the shared register untouched, so nothing is lost there.
>>
>> Cover the OSPM Nominal Performance, Autonomous Selection
>> (auto_sel) and Energy Performance Preference (EPP) registers. For
>> auto_sel it replaces the previous unconditional
>> cppc_set_auto_sel(cpu, false) on unload with a restore of the firmware
>> value captured at the CPU's first init.
>>
>> Suggested-by: Pierre Gondois <pierre.gondois@arm.com>
>> Signed-off-by: Sumit Gupta <sumitg@nvidia.com>
>> ---
>>
>> This applies on top of (not yet merged):
>>   [1] ACPI: CPPC: Add ospm_nominal_perf support
>>       https://lore.kernel.org/lkml/20260615185934.2383514-1-sumitg@nvidia.com/
>>   [2] cpufreq: CPPC: add autonomous mode boot parameter support
>>       https://lore.kernel.org/lkml/20260623080652.3353386-1-sumitg@nvidia.com/
>>
>>   drivers/cpufreq/cppc_cpufreq.c | 194 +++++++++++++++++++++++++++++++--
>>   1 file changed, 186 insertions(+), 8 deletions(-)
>>
>> diff --git a/drivers/cpufreq/cppc_cpufreq.c b/drivers/cpufreq/cppc_cpufreq.c
>> index a3fabfb07fbe..d6ea2cbde187 100644
>> --- a/drivers/cpufreq/cppc_cpufreq.c
>> +++ b/drivers/cpufreq/cppc_cpufreq.c
>> @@ -28,6 +28,174 @@
>>
>>   static struct cpufreq_driver cppc_cpufreq_driver;
>>
>> +/*
>> + * OSPM-set CPPC registers tracked for save/restore. A value set via sysfs or
>> + * the autonomous boot parameter is reapplied across CPU hotplug, and the
>> + * firmware value is restored on driver unload.
>> + */
>> +enum cppc_saved_reg_id {
>> +       CPPC_SAVED_OSPM_NOMINAL_PERF,
>> +       CPPC_SAVED_AUTO_SEL,
>> +       CPPC_SAVED_EPP,
>> +       CPPC_NR_SAVED_REGS,
>> +};
>> +
>> +struct cppc_saved_reg {
>> +       int (*get)(int cpu, u64 *val);
>> +       int (*set)(int cpu, u64 val);
>> +};
>> +
>> +/* u64 wrappers so the bool auto_sel register fits the table signatures. */
>> +static int cppc_get_auto_sel_u64(int cpu, u64 *val)
>> +{
>> +       bool enable;
>> +       int ret;
>> +
>> +       ret = cppc_get_auto_sel(cpu, &enable);
>> +       if (ret)
>> +               return ret;
>> +
>> +       *val = enable;
>> +       return 0;
>> +}
>> +
>> +static int cppc_set_auto_sel_u64(int cpu, u64 val)
>> +{
>> +       return cppc_set_auto_sel(cpu, !!val);
>> +}
>> +
>> +static const struct cppc_saved_reg cppc_saved_regs[CPPC_NR_SAVED_REGS] = {
>> +       [CPPC_SAVED_OSPM_NOMINAL_PERF] = {
>> +               cppc_get_ospm_nominal_perf, cppc_set_ospm_nominal_perf,
>> +       },
>> +       [CPPC_SAVED_AUTO_SEL] = {
>> +               cppc_get_auto_sel_u64, cppc_set_auto_sel_u64,
>> +       },
>> +       [CPPC_SAVED_EPP] = {
>> +               cppc_get_epp_perf, cppc_set_epp,
>> +       },
>> +};
>> +
>> +/*
>> + * Per-CPU saved state for each register in cppc_saved_regs[]:
>> + *   firmware_val      - register value before the driver touched it, restored
>> + *                       on unload
>> + *   requested_val     - last value the driver set (sysfs or boot parameter),
>> + *                       reapplied on policy reactivation
>> + *   firmware_captured - whether firmware_val has been read, so a not-yet-seen
>> + *                       CPU isn't mistaken for one whose firmware value is 0
>> + */
>> +struct cppc_saved_state {
>> +       u64 firmware_val;
>> +       u64 requested_val;
>> +       bool firmware_captured;
>> +};
>> +
>> +/*
>> + * Per-CPU and not tied to a policy, so the saved values survive policy
>> + * teardown/bring-up across CPU hotplug. cpu_data->perf_ctrls is per-policy
>> + * and freed on policy ->exit.
>> + */
>> +static DEFINE_PER_CPU(struct cppc_saved_state[CPPC_NR_SAVED_REGS], cppc_saved_state);
>> +
>> +static void cppc_cache_perf_ctrls(struct cppc_cpudata *cpu_data,
>> +                                 enum cppc_saved_reg_id reg, u64 val)
>> +{
>> +       switch (reg) {
>> +       case CPPC_SAVED_AUTO_SEL:
>> +               cpu_data->perf_ctrls.auto_sel = val;
>> +               break;
>> +       case CPPC_SAVED_EPP:
>> +               cpu_data->perf_ctrls.energy_perf = val;
>> +               break;
>> +       default:
>> +               break;
>> +       }
>> +}
>> +
>> +/*
>> + * Save the requested value for the given register and CPU, to be reapplied when
>> + * the policy is reactivated after CPU hotplug. Also update the per-policy
>> + * perf_ctrls copy so the saved and current values stay in sync.
>> + */
>> +static void cppc_save_requested(struct cppc_cpudata *cpu_data, unsigned int cpu,
>> +                               enum cppc_saved_reg_id reg, u64 val)
>> +{
>> +       per_cpu(cppc_saved_state, cpu)[reg].requested_val = val;
>> +       cppc_cache_perf_ctrls(cpu_data, reg, val);
>> +}
>> +
>> +/*
>> + * Reapply each register's last requested value from ->init(), so a value set
>> + * via sysfs or the boot parameter survives a policy teardown and bring-up
>> + * across CPU hotplug. Also keep perf_ctrls in sync with it.
>> + */
>> +static void cppc_cpufreq_reapply_requested_regs(struct cpufreq_policy *policy)
>> +{
>> +       struct cppc_cpudata *cpu_data = policy->driver_data;
>> +       unsigned int cpu, i;
>> +       u64 val;
>> +
>> +       for_each_cpu(cpu, policy->cpus) {
>> +               for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
>> +                       val = per_cpu(cppc_saved_state, cpu)[i].requested_val;
>> +                       if (val == U64_MAX)
>> +                               continue;
>> +
>> +                       cppc_saved_regs[i].set(cpu, val);
>> +
>> +                       /* Keep perf_ctrls in sync via the policy's CPU. */
>> +                       if (cpu == policy->cpu)
>> +                               cppc_cache_perf_ctrls(cpu_data, i, val);
>> +               }
>> +       }
>> +}
>> +
>> +/*
>> + * On a CPU's first ->init(), capture each register's firmware value to be
>> + * restored on driver unload. Later calls for the same CPU are a no-op. Capturing
>> + * from ->init() rather than module load covers CPUs that appear later. Also seed
>> + * requested_val to U64_MAX so its zeroed default is not taken as a request for 0.
>> + */
>> +static void cppc_cpufreq_save_firmware_regs(struct cpufreq_policy *policy)
>> +{
>> +       unsigned int cpu, i;
>> +       u64 val;
>> +
>> +       for_each_cpu(cpu, policy->cpus) {
>> +               for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
>> +                       struct cppc_saved_state *s =
>> +                               &per_cpu(cppc_saved_state, cpu)[i];
>> +
>> +                       /* Capture once per CPU; skip if already recorded. */
>> +                       if (s->firmware_captured)
>> +                               continue;
>> +
>> +                       if (cppc_saved_regs[i].get(cpu, &val))
>> +                               val = U64_MAX;
>> +                       s->firmware_val = val;
>> +                       s->requested_val = U64_MAX;
>> +                       s->firmware_captured = true;
>> +               }
>> +       }
>> +}
>> +
>> +/* On driver unload, restore each captured CPU's firmware value. */
>> +static void cppc_cpufreq_restore_firmware_regs(void)
>> +{
>> +       unsigned int cpu, i;
>> +
>> +       for_each_present_cpu(cpu) {
>> +               for (i = 0; i < CPPC_NR_SAVED_REGS; i++) {
>> +                       struct cppc_saved_state *s =
>> +                               &per_cpu(cppc_saved_state, cpu)[i];
>> +
>> +                       if (s->firmware_captured && s->firmware_val != U64_MAX)
>> +                               cppc_saved_regs[i].set(cpu, s->firmware_val);
>> +               }
>> +       }
>> +}
>> +
>>   /* Autonomous Selection boot parameter modes */
>>   enum {
>>          AUTO_SEL_DISABLED = 0,
>> @@ -766,6 +934,9 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>          policy->cur = cppc_perf_to_khz(caps, caps->highest_perf);
>>          cpu_data->perf_ctrls.desired_perf =  caps->highest_perf;
>>
>> +       /* Capture a CPU's firmware values on its first init, before any driver write. */
>> +       cppc_cpufreq_save_firmware_regs(policy);
>> +
>>          /*
>>           * Enable autonomous mode on first init if boot param is set.
>>           * Check last_governor to detect first init and skip if auto_sel
>> @@ -812,7 +983,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>                          if (ret && ret != -EOPNOTSUPP)
>>                                  pr_warn("Failed to set EPP for CPU%d (%d)\n", cpu, ret);
>>                          else if (!ret)
>> -                               cpu_data->perf_ctrls.energy_perf = epp;
>> +                               cppc_save_requested(cpu_data, cpu, CPPC_SAVED_EPP, epp);
>>                  }
>>
>>                  /* Program min/max/desired into CPPC regs (non-fatal on failure). */
>> @@ -826,7 +997,7 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>                          pr_warn("auto_sel CPU%d failed (%d); using OS mode\n",
>>                                  cpu, ret);
>>                  else if (!ret)
>> -                       cpu_data->perf_ctrls.auto_sel = true;
>> +                       cppc_save_requested(cpu_data, cpu, CPPC_SAVED_AUTO_SEL, true);
>>          }
>>
>>          if (cpu_data->perf_ctrls.auto_sel) {
>> @@ -850,6 +1021,10 @@ static int cppc_cpufreq_cpu_init(struct cpufreq_policy *policy)
>>          }
>>
>>          cppc_cpufreq_cpu_fie_init(policy);
>> +
>> +       /* Reapply any saved values lost across a full policy teardown. */
>> +       cppc_cpufreq_reapply_requested_regs(policy);
>> +
>>          return 0;
>>
>>   out:
>> @@ -1039,6 +1214,8 @@ static ssize_t store_auto_select(struct cpufreq_policy *policy,
>>                  }
>>          }
>>
>> +       cppc_save_requested(cpu_data, policy->cpu, CPPC_SAVED_AUTO_SEL, val);
>> +
>>          return count;
>>   }
>>
>> @@ -1111,7 +1288,7 @@ store_energy_performance_preference_val(struct cpufreq_policy *policy,
>>          if (ret)
>>                  return ret;
>>
>> -       cpu_data->perf_ctrls.energy_perf = val;
>> +       cppc_save_requested(cpu_data, policy->cpu, CPPC_SAVED_EPP, val);
>>
>>          return count;
>>   }
>> @@ -1193,6 +1370,9 @@ static ssize_t store_ospm_nominal_freq(struct cpufreq_policy *policy,
>>                  }
>>          }
>>
>> +       for_each_cpu(sib, policy->cpus)
>> +               cppc_save_requested(cpu_data, sib, CPPC_SAVED_OSPM_NOMINAL_PERF, perf);
>> +
>>          return count;
>>
>>   rollback:
>> @@ -1258,13 +1438,11 @@ static int __init cppc_cpufreq_init(void)
>>
>>   static void __exit cppc_cpufreq_exit(void)
>>   {
>> -       unsigned int cpu;
>> -
>> -       for_each_present_cpu(cpu)
>> -               cppc_set_auto_sel(cpu, false);
>> -
>>          cpufreq_unregister_driver(&cppc_cpufreq_driver);
>>          cppc_freq_invariance_exit();
>> +
>> +       /* Restore auto_sel and the other saved registers to their firmware value. */
>> +       cppc_cpufreq_restore_firmware_regs();
>>   }
>>
>>   module_param_cb(auto_sel_mode, &auto_sel_mode_ops, &auto_sel_mode, 0444);
>> --
>> 2.34.1
>>
>>

  reply	other threads:[~2026-06-24 12:56 UTC|newest]

Thread overview: 4+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2026-06-23  9:54 [PATCH] cpufreq: CPPC: Preserve OSPM-set registers across hotplug and unload Sumit Gupta
2026-06-23 11:00 ` Rafael J. Wysocki
2026-06-24 12:56   ` Sumit Gupta [this message]
2026-06-24 13:20     ` Rafael J. Wysocki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=de03b23f-18a3-4e99-a15a-5044463ac484@nvidia.com \
    --to=sumitg@nvidia.com \
    --cc=bbasu@nvidia.com \
    --cc=ionela.voinescu@arm.com \
    --cc=jonathanh@nvidia.com \
    --cc=ksitaraman@nvidia.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=linux-tegra@vger.kernel.org \
    --cc=mochs@nvidia.com \
    --cc=pierre.gondois@arm.com \
    --cc=rafael@kernel.org \
    --cc=sanjayc@nvidia.com \
    --cc=treding@nvidia.com \
    --cc=viresh.kumar@linaro.org \
    --cc=vsethi@nvidia.com \
    --cc=zhanjie9@hisilicon.com \
    --cc=zhenglifeng1@huawei.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox