public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Dhananjay Ugwekar <Dhananjay.Ugwekar@amd.com>
To: Mario Limonciello <mario.limonciello@amd.com>,
	"Gautham R . Shenoy" <gautham.shenoy@amd.com>
Cc: Perry Yuan <perry.yuan@amd.com>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org
Subject: Re: [PATCH v3 10/15] cpufreq/amd-pstate: Move limit updating code
Date: Tue, 17 Dec 2024 12:20:06 +0530	[thread overview]
Message-ID: <cfcbf8a4-7254-438a-8b0c-29807e07ed90@amd.com> (raw)
In-Reply-To: <a8c6ff3f-2ad8-4926-9ac3-f0f39ceffa2c@amd.com>

On 12/16/2024 9:09 PM, Mario Limonciello wrote:
> On 12/16/2024 08:45, Dhananjay Ugwekar wrote:
>> On 12/16/2024 7:51 PM, Mario Limonciello wrote:
>>> On 12/16/2024 08:16, Dhananjay Ugwekar wrote:
>>>> Hello Mario,
>>>>
>>>> On 12/10/2024 12:22 AM, Mario Limonciello wrote:
>>>>> The limit updating code in amd_pstate_epp_update_limit() should not
>>>>> only apply to EPP updates.  Move it to amd_pstate_update_min_max_limit()
>>>>> so other callers can benefit as well.
>>>>>
>>>>> With this move it's not necessary to have clamp_t calls anymore because
>>>>> the verify callback is called when setting limits.
>>>>
>>>> While testing this series, I observed that with amd_pstate=passive + schedutil governor,
>>>> the scaling_max_freq limits were not being honored and I bisected the issue down to this
>>>> patch.
>>>>
>>>> I went through the code and noticed that in amd_pstate_adjust_perf(), we set the min_perf
>>>> field in MSR_AMD_CPPC_REQ to "cap_perf" which is equal to cpudata->highest_perf (which is
>>>> equal to 255 for non-preferred cores systems). This didnt seem logical to me and I changed
>>>> cap_perf to cpudata->max_limit_perf which gives us the value updated in scaling_max_freq.
>>>>
>>>> I think as we removed the redundant clamping code, this pre-existing issue got exposed.
>>>> The below diff fixes the issue for me.
>>>>
>>>> Please let me know your thoughts on this.
>>>>
>>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>>> index d7b1de97727a..1ac34e3f1fc5 100644
>>>> --- a/drivers/cpufreq/amd-pstate.c
>>>> +++ b/drivers/cpufreq/amd-pstate.c
>>>> @@ -699,7 +699,7 @@ static void amd_pstate_adjust_perf(unsigned int cpu,
>>>>           if (min_perf < lowest_nonlinear_perf)
>>>>                   min_perf = lowest_nonlinear_perf;
>> here^^^
>>>>
>>>> -       max_perf = cap_perf;
>>>> +       max_perf = cpudata->max_limit_perf;
>>>>           if (max_perf < min_perf)
>>>>                   max_perf = min_perf;
>>>
>>> With this change I think you can also drop the comparison afterwards, as an optimization right?
>>
>> Umm I think it is possible that scaling_max_freq is set to a value lower than
>> lowest_nonlinear_freq in that case this if condition would be needed (as min_perf
>> is being lower bounded at lowest_nonlinear_freq at the location highlighted above).
>> I would be okay with keeping this check in.
> 
> Well this feels like a bigger problem actually - why is it forcefully bounded at lowest nonlinear freq?  Performance is going to be awful at that level 

Actually this wont necessarily deteriorate the performance, as we are just restricting 
the min_perf to not go below lowest_nonlinear level. So we are actually ensuring that 
the schedutil doesnt select a des_perf below lowest_nonlinear_perf.

(hence why commit 5d9a354cf839a ("cpufreq/amd-pstate: Set the initial min_freq to lowest_nonlinear_freq") was done),
> 
> but shouldn't we "let" people go below that in passive and guided?  We do for active.

Yes I agree, we should allow the user to set min limit in the entire frequency range, 
I thought there would've been some reason for restricting this. But I dont see any 
reasoning for this in the blamed commit log as well. I think one reason would be that 
below lowest_nonlinear_freq we dont get real power savings. And schedutil might dip 
into this lower inefficient range if we dont force bound it.

Thanks,
Dhananjay

> 
>>
>> Also, what is the behavior if max_perf is set to a value lower than min_perf in
>> the CPPC_REQ MSR? I guess platform FW would also be smart enough to handle this
>> implicitly, but cant say for sure.
>>
> 
> I would hope so too; but yeah you're right we don't know for sure.
> 
>>>
>>> As this is already in superm1.git/linux-next after testing can you please send a patch relative to superm1.git/linux-next branch?
>>
>> Sure, I'll send out the patch once we finalize on the above if condition.
>>
>> Regards,
>> Dhananjay
>>
>>>
>>> Thanks!
>>>
>>>>
>>>> Thanks,
>>>> Dhananjay
>>>>
>>>>>
>>>>> Reviewed-by: Gautham R. Shenoy <gautham.shenoy@amd.com>
>>>>> Signed-off-by: Mario Limonciello <mario.limonciello@amd.com>
>>>>> ---
>>>>> v2:
>>>>>    * Drop lowest_perf variable
>>>>> ---
>>>>>    drivers/cpufreq/amd-pstate.c | 28 +++++-----------------------
>>>>>    1 file changed, 5 insertions(+), 23 deletions(-)
>>>>>
>>>>> diff --git a/drivers/cpufreq/amd-pstate.c b/drivers/cpufreq/amd-pstate.c
>>>>> index 3a3df67c096d5..dc3c45b6f5103 100644
>>>>> --- a/drivers/cpufreq/amd-pstate.c
>>>>> +++ b/drivers/cpufreq/amd-pstate.c
>>>>> @@ -537,10 +537,6 @@ static void amd_pstate_update(struct amd_cpudata *cpudata, u32 min_perf,
>>>>>        u32 nominal_perf = READ_ONCE(cpudata->nominal_perf);
>>>>>        u64 value = prev;
>>>>>    -    min_perf = clamp_t(unsigned long, min_perf, cpudata->min_limit_perf,
>>>>> -            cpudata->max_limit_perf);
>>>>> -    max_perf = clamp_t(unsigned long, max_perf, cpudata->min_limit_perf,
>>>>> -            cpudata->max_limit_perf);
>>>>>        des_perf = clamp_t(unsigned long, des_perf, min_perf, max_perf);
>>>>>          max_freq = READ_ONCE(cpudata->max_limit_freq);
>>>>> @@ -607,7 +603,7 @@ static int amd_pstate_verify(struct cpufreq_policy_data *policy_data)
>>>>>      static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>>>>>    {
>>>>> -    u32 max_limit_perf, min_limit_perf, lowest_perf, max_perf, max_freq;
>>>>> +    u32 max_limit_perf, min_limit_perf, max_perf, max_freq;
>>>>>        struct amd_cpudata *cpudata = policy->driver_data;
>>>>>          max_perf = READ_ONCE(cpudata->highest_perf);
>>>>> @@ -615,12 +611,8 @@ static int amd_pstate_update_min_max_limit(struct cpufreq_policy *policy)
>>>>>        max_limit_perf = div_u64(policy->max * max_perf, max_freq);
>>>>>        min_limit_perf = div_u64(policy->min * max_perf, max_freq);
>>>>>    -    lowest_perf = READ_ONCE(cpudata->lowest_perf);
>>>>> -    if (min_limit_perf < lowest_perf)
>>>>> -        min_limit_perf = lowest_perf;
>>>>> -
>>>>> -    if (max_limit_perf < min_limit_perf)
>>>>> -        max_limit_perf = min_limit_perf;
>>>>> +    if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>>>> +        min_limit_perf = min(cpudata->nominal_perf, max_limit_perf);
>>>>>          WRITE_ONCE(cpudata->max_limit_perf, max_limit_perf);
>>>>>        WRITE_ONCE(cpudata->min_limit_perf, min_limit_perf);
>>>>> @@ -1562,28 +1554,18 @@ static void amd_pstate_epp_cpu_exit(struct cpufreq_policy *policy)
>>>>>    static int amd_pstate_epp_update_limit(struct cpufreq_policy *policy)
>>>>>    {
>>>>>        struct amd_cpudata *cpudata = policy->driver_data;
>>>>> -    u32 max_perf, min_perf;
>>>>>        u64 value;
>>>>>        s16 epp;
>>>>>    -    max_perf = READ_ONCE(cpudata->highest_perf);
>>>>> -    min_perf = READ_ONCE(cpudata->lowest_perf);
>>>>>        amd_pstate_update_min_max_limit(policy);
>>>>>    -    max_perf = clamp_t(unsigned long, max_perf, cpudata->min_limit_perf,
>>>>> -            cpudata->max_limit_perf);
>>>>> -    min_perf = clamp_t(unsigned long, min_perf, cpudata->min_limit_perf,
>>>>> -            cpudata->max_limit_perf);
>>>>>        value = READ_ONCE(cpudata->cppc_req_cached);
>>>>>    -    if (cpudata->policy == CPUFREQ_POLICY_PERFORMANCE)
>>>>> -        min_perf = min(cpudata->nominal_perf, max_perf);
>>>>> -
>>>>>        value &= ~(AMD_CPPC_MAX_PERF_MASK | AMD_CPPC_MIN_PERF_MASK |
>>>>>               AMD_CPPC_DES_PERF_MASK);
>>>>> -    value |= FIELD_PREP(AMD_CPPC_MAX_PERF_MASK, max_perf);
>>>>> +    value |= FIELD_PREP(AMD_CPPC_MAX_PERF_MASK, cpudata->max_limit_perf);
>>>>>        value |= FIELD_PREP(AMD_CPPC_DES_PERF_MASK, 0);
>>>>> -    value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, min_perf);
>>>>> +    value |= FIELD_PREP(AMD_CPPC_MIN_PERF_MASK, cpudata->min_limit_perf);
>>>>>          /* Get BIOS pre-defined epp value */
>>>>>        epp = amd_pstate_get_epp(cpudata, value);
>>>>
>>>
>>
> 


  reply	other threads:[~2024-12-17  6:50 UTC|newest]

Thread overview: 25+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-12-09 18:52 [PATCH v3 00/15] amd-pstate fixes and improvements for 6.14 Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 01/15] cpufreq/amd-pstate: Store the boost numerator as highest perf again Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 02/15] cpufreq/amd-pstate: Use boost numerator for upper bound of frequencies Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 03/15] cpufreq/amd-pstate: Add trace event for EPP perf updates Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 04/15] cpufreq/amd-pstate: convert mutex use to guard() Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 05/15] cpufreq/amd-pstate: Drop cached epp_policy variable Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 06/15] cpufreq/amd-pstate: Use FIELD_PREP and FIELD_GET macros Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 07/15] cpufreq/amd-pstate: Only update the cached value in msr_set_epp() on success Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 08/15] cpufreq/amd-pstate: store all values in cpudata struct in khz Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 09/15] cpufreq/amd-pstate: Change amd_pstate_update_perf() to return an int Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 10/15] cpufreq/amd-pstate: Move limit updating code Mario Limonciello
2024-12-16 14:16   ` Dhananjay Ugwekar
2024-12-16 14:21     ` Mario Limonciello
2024-12-16 14:45       ` Dhananjay Ugwekar
2024-12-16 15:39         ` Mario Limonciello
2024-12-17  6:50           ` Dhananjay Ugwekar [this message]
2024-12-17 19:44             ` Mario Limonciello
2024-12-19  4:22               ` Gautham R. Shenoy
2024-12-19  4:24               ` Dhananjay Ugwekar
2024-12-09 18:52 ` [PATCH v3 11/15] cpufreq/amd-pstate: Cache EPP value and use that everywhere Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 12/15] cpufreq/amd-pstate: Always write EPP value when updating perf Mario Limonciello
2024-12-10 11:08   ` Gautham R. Shenoy
2024-12-09 18:52 ` [PATCH v3 13/15] cpufreq/amd-pstate: Drop ret variable from amd_pstate_set_energy_pref_index() Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 14/15] cpufreq/amd-pstate: Set different default EPP policy for Epyc and Ryzen Mario Limonciello
2024-12-09 18:52 ` [PATCH v3 15/15] cpufreq/amd-pstate: Drop boost_state variable Mario Limonciello

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=cfcbf8a4-7254-438a-8b0c-29807e07ed90@amd.com \
    --to=dhananjay.ugwekar@amd.com \
    --cc=gautham.shenoy@amd.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mario.limonciello@amd.com \
    --cc=perry.yuan@amd.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox