From: Sultan Alsawaf <sultan@kerneltoast.com>
To: "Rafael J. Wysocki" <rjw@rjwysocki.net>
Cc: Linux PM <linux-pm@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	Srinivas Pandruvada <srinivas.pandruvada@linux.intel.com>,
	Mario Limonciello <mario.limonciello@amd.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Christian Loehle <christian.loehle@arm.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Valentin Schneider <vschneid@redhat.com>,
	Ingo Molnar <mingo@redhat.com>
Subject: Re: [PATCH v2 5/6] cpufreq: Avoid using inconsistent policy->min and policy->max
Date: Fri, 18 Apr 2025 20:18:11 +1000
Message-ID: <aAIm48RPmm1d_Y6u@sultan-box.localdomain>
In-Reply-To: <9458818.CDJkKcVGEf@rjwysocki.net>

On Tue, Apr 15, 2025 at 12:04:21PM +0200, Rafael J. Wysocki wrote:
> From: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> 
> Since cpufreq_driver_resolve_freq() can run in parallel with
> cpufreq_set_policy() and there is no synchronization between them,
> the former may access policy->min and policy->max while the latter
> is updating them and it may see intermediate values of them due
> to the way the update is carried out.  Also the compiler is free
> to apply any optimizations it wants both to the stores in
> cpufreq_set_policy() and to the loads in cpufreq_driver_resolve_freq()
> which may result in additional inconsistencies.
> 
> To address this, use WRITE_ONCE() when updating policy->min and
> policy->max in cpufreq_set_policy() and use READ_ONCE() for reading
> them in cpufreq_driver_resolve_freq().  Moreover, rearrange the update
> in cpufreq_set_policy() to avoid storing intermediate values in
> policy->min and policy->max with the help of the observation that
> their new values are expected to be properly ordered upfront.
> 
> Also modify cpufreq_driver_resolve_freq() to take the possible reverse
> ordering of policy->min and policy->max, which may happen depending on
> the ordering of operations when this function and cpufreq_set_policy()
> run concurrently, into account by always honoring the max when it
> turns out to be less than the min (in case it comes from thermal
> throttling or similar).
> 
> Fixes: 151717690694 ("cpufreq: Make policy min/max hard requirements")
> Cc: 5.16+ <stable@vger.kernel.org> # 5.16+
> Signed-off-by: Rafael J. Wysocki <rafael.j.wysocki@intel.com>
> ---
> 
> v1 -> v2: Minor edit in the subject
> 
> ---
>  drivers/cpufreq/cpufreq.c |   46 ++++++++++++++++++++++++++++++++++++----------
>  1 file changed, 36 insertions(+), 10 deletions(-)
> 
> --- a/drivers/cpufreq/cpufreq.c
> +++ b/drivers/cpufreq/cpufreq.c
> @@ -490,14 +490,12 @@
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_disable_fast_switch);
>  
> -static unsigned int clamp_and_resolve_freq(struct cpufreq_policy *policy,
> -					   unsigned int target_freq,
> -					   unsigned int relation)
> +static unsigned int __resolve_freq(struct cpufreq_policy *policy,
> +				   unsigned int target_freq,
> +				   unsigned int relation)
>  {
>  	unsigned int idx;
>  
> -	target_freq = clamp_val(target_freq, policy->min, policy->max);
> -
>  	if (!policy->freq_table)
>  		return target_freq;
>  
> @@ -507,6 +505,15 @@
>  	return policy->freq_table[idx].frequency;
>  }
>  
> +static unsigned int clamp_and_resolve_freq(struct cpufreq_policy *policy,
> +					   unsigned int target_freq,
> +					   unsigned int relation)
> +{
> +	target_freq = clamp_val(target_freq, policy->min, policy->max);
> +
> +	return __resolve_freq(policy, target_freq, relation);
> +}
> +
>  /**
>   * cpufreq_driver_resolve_freq - Map a target frequency to a driver-supported
>   * one.
> @@ -521,7 +528,22 @@
>  unsigned int cpufreq_driver_resolve_freq(struct cpufreq_policy *policy,
>  					 unsigned int target_freq)
>  {
> -	return clamp_and_resolve_freq(policy, target_freq, CPUFREQ_RELATION_LE);
> +	unsigned int min = READ_ONCE(policy->min);
> +	unsigned int max = READ_ONCE(policy->max);
> +
> +	/*
> +	 * If this function runs in parallel with cpufreq_set_policy(), it may
> +	 * read policy->min before the update and policy->max after the update
> +	 * or the other way around, so there is no ordering guarantee.
> +	 *
> +	 * Resolve this by always honoring the max (in case it comes from
> +	 * thermal throttling or similar).
> +	 */
> +	if (unlikely(min > max))
> +		min = max;
> +
> +	return __resolve_freq(policy, clamp_val(target_freq, min, max),
> +			      CPUFREQ_RELATION_LE);
>  }
>  EXPORT_SYMBOL_GPL(cpufreq_driver_resolve_freq);
>  
> @@ -2632,11 +2654,15 @@
>  	 * Resolve policy min/max to available frequencies. It ensures
>  	 * no frequency resolution will neither overshoot the requested maximum
>  	 * nor undershoot the requested minimum.
> +	 *
> +	 * Avoid storing intermediate values in policy->max or policy->min and
> +	 * compiler optimizations around them because they may be accessed
> +	 * concurrently by cpufreq_driver_resolve_freq() during the update.
>  	 */
> -	policy->min = new_data.min;
> -	policy->max = new_data.max;
> -	policy->min = clamp_and_resolve_freq(policy, policy->min, CPUFREQ_RELATION_L);
> -	policy->max = clamp_and_resolve_freq(policy, policy->max, CPUFREQ_RELATION_H);
> +	WRITE_ONCE(policy->max, __resolve_freq(policy, new_data.max, CPUFREQ_RELATION_H));
> +	new_data.min = __resolve_freq(policy, new_data.min, CPUFREQ_RELATION_L);
> +	WRITE_ONCE(policy->min, new_data.min > policy->max ? policy->max : new_data.min);

I don't think this is sufficient, because it still permits an incoherent
combination of policy->min and policy->max, which schedutil could then honor;
i.e., schedutil may observe the old policy->min together with the new
policy->max, or vice versa.

We also can't permit a wrong freq to be propagated to the driver and then send
the _right_ freq afterwards; IOW, we can't let a bogus freq slip through and
just correct it later.

How about using a seqlock?
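
As a rough illustration of the idea, below is a minimal userspace sketch of
the sequence-counter pattern using C11 atomics. It is not a proposed patch:
struct freq_limits and the limits_*() helpers are hypothetical names, writers
are assumed to be serialized by some outer lock, and every access uses the
default sequentially consistent ordering for simplicity. In the kernel, the
equivalent would presumably be built on the existing seqlock/seqcount
primitives (e.g. read_seqbegin() and read_seqretry() on the reader side).

```c
#include <assert.h>
#include <stdatomic.h>

/* Hypothetical stand-in for the min/max pair in struct cpufreq_policy. */
struct freq_limits {
	atomic_uint seq;	/* odd while an update is in progress */
	atomic_uint min;
	atomic_uint max;
};

void limits_init(struct freq_limits *l, unsigned int min, unsigned int max)
{
	assert(min <= max);
	atomic_init(&l->seq, 0);
	atomic_init(&l->min, min);
	atomic_init(&l->max, max);
}

/* Writer side. Assumes callers are serialized against each other. */
void limits_write(struct freq_limits *l, unsigned int min, unsigned int max)
{
	unsigned int seq = atomic_load(&l->seq);

	assert(min <= max);
	atomic_store(&l->seq, seq + 1);	/* mark the update as in progress */
	atomic_store(&l->min, min);
	atomic_store(&l->max, max);
	atomic_store(&l->seq, seq + 2);	/* publish the new pair */
}

/* Reader side: retry until min and max come from the same update. */
void limits_read(struct freq_limits *l, unsigned int *min, unsigned int *max)
{
	unsigned int seq;

	do {
		seq = atomic_load(&l->seq);
		*min = atomic_load(&l->min);
		*max = atomic_load(&l->max);
	} while ((seq & 1) || seq != atomic_load(&l->seq));
}

/* Clamp a target frequency against a coherent snapshot of the limits. */
unsigned int limits_clamp(struct freq_limits *l, unsigned int target)
{
	unsigned int min, max;

	limits_read(l, &min, &max);

	return target < min ? min : target > max ? max : target;
}
```

With this scheme a reader never acts on a mixed pair; the cost is a retry
loop on the read side whenever it races with an update.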

Sultan
