public inbox for linux-kernel@vger.kernel.org
 help / color / mirror / Atom feed
From: Pierre Gondois <pierre.gondois@arm.com>
To: Qais Yousef <qyousef@layalina.io>
Cc: "Rafael J. Wysocki" <rafael@kernel.org>,
	Viresh Kumar <viresh.kumar@linaro.org>,
	linux-kernel@vger.kernel.org, linux-pm@vger.kernel.org,
	Ingo Molnar <mingo@kernel.org>,
	Peter Zijlstra <peterz@infradead.org>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Christian.Loehle@arm.com
Subject: Re: [PATCH] cpufreq: Change default transition delay to 2ms
Date: Tue, 20 Feb 2024 18:38:24 +0100	[thread overview]
Message-ID: <d58de550-0ce1-4af9-9e2d-dedd5e73c797@arm.com> (raw)
In-Reply-To: <20240220135037.qriyapwrznz2wdni@airbuntu>

Hello Qais,

I added some other remarks,

On 2/20/24 14:50, Qais Yousef wrote:
> On 02/14/24 10:19, Pierre Gondois wrote:
>> Hello,
>>
>> On 2/12/24 16:53, Rafael J. Wysocki wrote:
>>> On Mon, Feb 5, 2024 at 8:45 AM Viresh Kumar <viresh.kumar@linaro.org> wrote:
>>>>
>>>> On 05-02-24, 02:25, Qais Yousef wrote:
>>>>> 10ms is too high for today's hardware, even low end ones. This default
>>>>> end up being used a lot on Arm machines at least. Pine64, mac mini and
>>>>> pixel 6 all end up with 10ms rate_limit_us when using schedutil, and
>>>>> it's too high for all of them.
>>>>>
>>>>> Change the default to 2ms which should be 'pessimistic' enough for worst
>>>>> case scenario, but not too high for platforms with fast DVFS hardware.
>>>>>
>>>>> Signed-off-by: Qais Yousef <qyousef@layalina.io>
>>>>> ---
>>>>>    drivers/cpufreq/cpufreq.c | 4 ++--
>>>>>    1 file changed, 2 insertions(+), 2 deletions(-)
>>>>>
>>>>> diff --git a/drivers/cpufreq/cpufreq.c b/drivers/cpufreq/cpufreq.c
>>>>> index 44db4f59c4cc..8207f7294cb6 100644
>>>>> --- a/drivers/cpufreq/cpufreq.c
>>>>> +++ b/drivers/cpufreq/cpufreq.c
>>>>> @@ -582,11 +582,11 @@ unsigned int cpufreq_policy_transition_delay_us(struct cpufreq_policy *policy)
>>>>>                  * for platforms where transition_latency is in milliseconds, it
>>>>>                  * ends up giving unrealistic values.
>>>>>                  *
>>>>> -              * Cap the default transition delay to 10 ms, which seems to be
>>>>> +              * Cap the default transition delay to 2 ms, which seems to be
>>>>>                  * a reasonable amount of time after which we should reevaluate
>>>>>                  * the frequency.
>>>>>                  */
>>>>> -             return min(latency * LATENCY_MULTIPLIER, (unsigned int)10000);
>>>>> +             return min(latency * LATENCY_MULTIPLIER, (unsigned int)(2*MSEC_PER_SEC));
>>>>
>>>> Please add spaces around '*'.
>>>>
>>>> Acked-by: Viresh Kumar <viresh.kumar@linaro.org>
>>>
>>> I've adjusted the whitespace as suggested above and applied the patch
>>> as 5.9 material.
>>>
>>> Thanks!
>>>
>>
>> To add some numbers, on a Juno-r2, with latency measured between the frequency
>> request on the kernel side and the SCP actually making the frequency update.
>>
>> The SCP is the firmware responsible of making the frequency updates. It receives
>> the kernel requests and coordinate them/make the actual changes. The SCP also has
>> a mechanism called 'fast channel' (FC) where the kernel writes the requested
>> frequency to a memory area shared with the SCP. Every 4ms, the SCP polls/reads
>> these memory area and make the required modifications.
>>
>> Latency values (in ms)
>> Workload:
>> Idle system, during ~30s
>> +---------------------------------------+
>> |       |   Without FC  |      With FC  |
>> +-------+---------------+---------------+
>> | count |       1663    |        1102   |
>> | mean  |          2.92 |          2.10 |
>> | std   |          1.90 |          1.58 |
>> | min   |          0.21 |          0.00 |
>> | 25%   |          1.64 |          0.91 |
>> | 50%   |          2.57 |          1.68 |
>> | 75%   |          3.66 |          2.97 |
>> | max   |         14.37 |         13.50 |
>> +-------+---------------+---------------+
>>
>> Latency values (in ms)
>> Workload:
>> One 1% task per CPU, period = 32ms. This allows to wake up the CPU
>> every 32ms and send more requests/give more work to the SCP. Indeed
>> the SCP is also responsible of idle state transitions.
>> Test duration ~=30s.
>> +---------------------------------------+
>> |       |   Without FC  |      With FC  |
>> +-------+---------------+---------------+
>> | count |       1629    |       1446    |
>> | mean  |          3.23 |          2.31 |
>> | std   |          2.40 |          1.73 |
>> | min   |          0.05 |          0.02 |
>> | 25%   |          1.91 |          0.98 |
>> | 50%   |          2.65 |          2.00 |
>> | 75%   |          3.65 |          3.23 |
>> | max   |         20.56 |         16.73 |
>> +-------+---------------+---------------+
>>
>> ---

1.
With this patch, platforms like the Juno which:
- don't set a `transition_delay_us`
- have a high `transition_latency` (> 1ms)
can request freq. changes every 2ms.

If a platform has a `transition_latency` > 2ms, this means:
   `transition_latency` > `transition_delay_us`
I.e. a second freq. requests might be emitted before the first one
will be completed. On the Juno, this doesn't cause any 'real' issue
as the SCMI/mailbox mechanism works well, but this doesn't seem
correct.
If the util of CPUs is in between OPPs (i.e. freq. changes are often
required), the Juno:
- sends a freq. request
- waits for completion and schedules another task in the meantime
- upon completion, immediately sends a new freq.

I think that the following should be respected/checked:
- `transition_latency` < `transition_delay_us`
(it might also make sense to have, with K being any factor:)
- `transition_latency` * K < `transition_delay_us`


2.
There are references to the 10ms values at other places in the code:

include/linux/cpufreq.h
  * ondemand governor will work on any processor with transition latency <= 10ms,

drivers/cpufreq/cpufreq.c
  * For platforms that can change the frequency very fast (< 10
  * us), the above formula gives a decent transition delay. But
-> the 10us value matches 10ms = 10us * LATENCY_MULTIPLIER

Documentation/admin-guide/pm/cpufreq.rst
  Typically, it is set to values of the order of 10000 (10 ms).  Its
  default value is equal to the value of ``cpuinfo_transition_latency``


3.
There seems to be a dependency of the conservative/ondemand governors
over the the value returned by `cpufreq_policy_transition_delay_us()`:

drivers/cpufreq/cpufreq_governor.c
   dbs_data->sampling_rate = max_t(unsigned int,
     CPUFREQ_DBS_MIN_SAMPLING_INTERVAL,            // = 2 * tick period = 8ms
     cpufreq_policy_transition_delay_us(policy));  // [1]: val <= 2ms

[1]
if `transition_latency` is not set and `transition_delay_us` is,
which is the case for the Juno.

The `sampling_rate` is, FYIU, the period used to evaluate the ratio
of the idle/busy time, and if necessary increase/decrease the freq.

This patch will likely reduce this sampling rate from 10ms -> 8ms
(if `cpufreq_policy_transition_delay_us()`` now returns 2ms for some
platforms). This is not much, but just wanted to note it.

Regards,
Pierre


>>
>> The latency increases when fast channels are not used and when there is an actual
>> workload. On average it is always > 2ms. Juno's release date seems to be 2014,
>> so the platform is quite old, but it should also have benefited from regular
>> firmware updates.
> 
> Thanks for sharing the numbers
> 
>>
>> Regards,
>> Pierre

  reply	other threads:[~2024-02-20 17:38 UTC|newest]

Thread overview: 18+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2024-02-05  2:25 [PATCH] cpufreq: Change default transition delay to 2ms Qais Yousef
2024-02-05  7:45 ` Viresh Kumar
2024-02-12 15:53   ` Rafael J. Wysocki
2024-02-14  9:19     ` Pierre Gondois
2024-02-20 13:50       ` Qais Yousef
2024-02-20 17:38         ` Pierre Gondois [this message]
2024-02-22 11:55           ` Qais Yousef
2024-02-22 15:15             ` Pierre Gondois
2024-02-22 23:39               ` Qais Yousef
2024-02-23  9:48                 ` Pierre Gondois
2024-02-23 13:27                   ` Qais Yousef
2024-02-27 23:34                 ` [PATCH] cpufreq: Honour transition_latency over transition_delay_us Qais Yousef
2024-02-29 19:26                   ` Rafael J. Wysocki
2024-02-20 13:49     ` [PATCH] cpufreq: Change default transition delay to 2ms Qais Yousef
2024-02-05  9:17 ` Christian Loehle
2024-02-05 12:01   ` Qais Yousef
2024-02-05 17:35     ` Christian Loehle
2024-02-05 21:54       ` Qais Yousef

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=d58de550-0ce1-4af9-9e2d-dedd5e73c797@arm.com \
    --to=pierre.gondois@arm.com \
    --cc=Christian.Loehle@arm.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=linux-pm@vger.kernel.org \
    --cc=mingo@kernel.org \
    --cc=peterz@infradead.org \
    --cc=qyousef@layalina.io \
    --cc=rafael@kernel.org \
    --cc=vincent.guittot@linaro.org \
    --cc=viresh.kumar@linaro.org \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox