* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
[not found] <20240114183600.135316-1-vincent.guittot@linaro.org>
@ 2024-02-14 17:12 ` Jon Hunter
2024-02-14 17:19 ` Vincent Guittot
2024-02-14 17:19 ` Linus Torvalds
0 siblings, 2 replies; 8+ messages in thread
From: Jon Hunter @ 2024-02-14 17:12 UTC
To: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, wkarny, torvalds,
qyousef, tglx, rafael, viresh.kumar, linux-kernel, linux-pm,
linux-tegra@vger.kernel.org, Thierry Reding, Sasha Levin,
Laxman Dewangan, Shardar Mohammed
Hi Vincent,
On 14/01/2024 18:36, Vincent Guittot wrote:
> When frequency invariance is not enabled, get_capacity_ref_freq(policy)
> returns the current frequency, and the performance margin applied by
> map_util_perf() enabled the utilization to go above the maximum compute
> capacity and to select a higher frequency than the current one.
>
> The performance margin is now applied earlier in the path to take into
> account some utilization clamping, and we can't get a utilization higher
> than the maximum compute capacity.
>
> We must use a frequency above the current frequency to get a chance to
> select a higher OPP when the current one becomes fully used. Apply
> the same margin and return a frequency 25% higher than the current one in
> order to switch to the next OPP before we fully use the CPU at the current
> one.
>
> Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> Closes: https://lore.kernel.org/lkml/CAHk-=wgWcYX2oXKtgvNN2LLDXP7kXkbo-xTfumEjmPbjSer2RQ@mail.gmail.com/
> Reported-by: Wyes Karny <wkarny@gmail.com>
> Closes: https://lore.kernel.org/lkml/20240114091240.xzdvqk75ifgfj5yx@wyes-pc/
> Fixes: 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
> Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> Tested-by: Wyes Karny <wkarny@gmail.com>
> ---
> kernel/sched/cpufreq_schedutil.c | 6 +++++-
> 1 file changed, 5 insertions(+), 1 deletion(-)
>
> diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> index 95c3c097083e..d12e95d30e2e 100644
> --- a/kernel/sched/cpufreq_schedutil.c
> +++ b/kernel/sched/cpufreq_schedutil.c
> @@ -133,7 +133,11 @@ unsigned long get_capacity_ref_freq(struct cpufreq_policy *policy)
> if (arch_scale_freq_invariant())
> return policy->cpuinfo.max_freq;
>
> - return policy->cur;
> + /*
> + * Apply a 25% margin so that we select a higher frequency than
> + * the current one before the CPU is full busy
> + */
> + return policy->cur + (policy->cur >> 2);
> }
>
> /**
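For illustration, the effect of the margin in the quoted patch can be
sketched in user space. This is only a sketch under stated assumptions:
it assumes the governor scales the reference frequency by
util / max capacity, that utilization is capped at 1024, and the helper
names below are hypothetical stand-ins, not kernel functions.

```c
/* Assumed maximum utilization value (full CPU capacity). */
#define CAPACITY_SCALE 1024UL

/* Pre-patch, non-invariant case: the reference is the current frequency. */
static unsigned long ref_freq_old_khz(unsigned long cur_khz)
{
	return cur_khz;
}

/* Patched, non-invariant case: current frequency plus a 25% margin. */
static unsigned long ref_freq_patched_khz(unsigned long cur_khz)
{
	return cur_khz + (cur_khz >> 2);
}

/*
 * Hypothetical helper: scale the reference frequency by
 * util / CAPACITY_SCALE to get the requested frequency.
 */
static unsigned long target_freq_khz(unsigned long ref_khz, unsigned long util)
{
	return ref_khz * util / CAPACITY_SCALE;
}
```

Under these assumptions, with a hypothetical current frequency of
1000000 kHz and the CPU fully busy (util = 1024), the old reference
requests exactly 1000000 kHz and so can never ask for a higher OPP,
while the patched reference requests 1250000 kHz. The patched request
first exceeds the current frequency at util = 820, i.e. at roughly 80%
utilization.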
We have also observed a performance degradation on our Tegra platforms
with v6.8-rc1. Unfortunately, the above change does not fix the problem
for us and we are still seeing a performance issue with v6.8-rc4. For
example, running Dhrystone on Tegra234 I am seeing the following ...
Linux v6.7:
[ 2216.301949] CPU0: Dhrystones per Second: 31976326 (18199 DMIPS)
[ 2220.993877] CPU1: Dhrystones per Second: 49568123 (28211 DMIPS)
[ 2225.685280] CPU2: Dhrystones per Second: 49568123 (28211 DMIPS)
[ 2230.364423] CPU3: Dhrystones per Second: 49632220 (28248 DMIPS)
Linux v6.8-rc4:
[ 44.661686] CPU0: Dhrystones per Second: 16068483 (9145 DMIPS)
[ 51.895107] CPU1: Dhrystones per Second: 16077457 (9150 DMIPS)
[ 59.105410] CPU2: Dhrystones per Second: 16095436 (9160 DMIPS)
[ 66.333297] CPU3: Dhrystones per Second: 16064000 (9142 DMIPS)
If I revert this change and the following ...
b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
f12560779f9d ("sched/cpufreq: Rework iowait boost")
9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
... then the perf is similar to where it was ...
Linux v6.8-rc4 plus reverts:
[ 31.768189] CPU0: Dhrystones per Second: 48421678 (27559 DMIPS)
[ 36.556838] CPU1: Dhrystones per Second: 48401324 (27547 DMIPS)
[ 41.343343] CPU2: Dhrystones per Second: 48421678 (27559 DMIPS)
[ 46.163347] CPU3: Dhrystones per Second: 47679814 (27137 DMIPS)
All CPUs are running with the schedutil CPUFREQ governor. We have not
looked any further but wanted to report this in case you have any more
thoughts or suggestions for us to try.
Thanks
Jon
--
nvpublic
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-14 17:12 ` [PATCH] sched/fair: Fix frequency selection for non invariant case Jon Hunter
@ 2024-02-14 17:19 ` Vincent Guittot
2024-02-14 17:19 ` Linus Torvalds
1 sibling, 0 replies; 8+ messages in thread
From: Vincent Guittot @ 2024-02-14 17:19 UTC
To: Jon Hunter
Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, bristot, vschneid, wkarny, torvalds, qyousef, tglx,
rafael, viresh.kumar, linux-kernel, linux-pm,
linux-tegra@vger.kernel.org, Thierry Reding, Sasha Levin,
Laxman Dewangan, Shardar Mohammed
Hi Jon,
On Wed, 14 Feb 2024 at 18:12, Jon Hunter <jonathanh@nvidia.com> wrote:
>
> Hi Vincent,
>
> On 14/01/2024 18:36, Vincent Guittot wrote:
> > When frequency invariance is not enabled, get_capacity_ref_freq(policy)
> > returns the current frequency, and the performance margin applied by
> > map_util_perf() enabled the utilization to go above the maximum compute
> > capacity and to select a higher frequency than the current one.
> >
> > The performance margin is now applied earlier in the path to take into
> > account some utilization clamping, and we can't get a utilization higher
> > than the maximum compute capacity.
> >
> > We must use a frequency above the current frequency to get a chance to
> > select a higher OPP when the current one becomes fully used. Apply
> > the same margin and return a frequency 25% higher than the current one in
> > order to switch to the next OPP before we fully use the CPU at the current
> > one.
> >
> > Reported-by: Linus Torvalds <torvalds@linux-foundation.org>
> > Closes: https://lore.kernel.org/lkml/CAHk-=wgWcYX2oXKtgvNN2LLDXP7kXkbo-xTfumEjmPbjSer2RQ@mail.gmail.com/
> > Reported-by: Wyes Karny <wkarny@gmail.com>
> > Closes: https://lore.kernel.org/lkml/20240114091240.xzdvqk75ifgfj5yx@wyes-pc/
> > Fixes: 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
> > Signed-off-by: Vincent Guittot <vincent.guittot@linaro.org>
> > Tested-by: Wyes Karny <wkarny@gmail.com>
> > ---
> > kernel/sched/cpufreq_schedutil.c | 6 +++++-
> > 1 file changed, 5 insertions(+), 1 deletion(-)
> >
> > diff --git a/kernel/sched/cpufreq_schedutil.c b/kernel/sched/cpufreq_schedutil.c
> > index 95c3c097083e..d12e95d30e2e 100644
> > --- a/kernel/sched/cpufreq_schedutil.c
> > +++ b/kernel/sched/cpufreq_schedutil.c
> > @@ -133,7 +133,11 @@ unsigned long get_capacity_ref_freq(struct cpufreq_policy *policy)
> > if (arch_scale_freq_invariant())
> > return policy->cpuinfo.max_freq;
> >
> > - return policy->cur;
> > + /*
> > + * Apply a 25% margin so that we select a higher frequency than
> > + * the current one before the CPU is full busy
> > + */
> > + return policy->cur + (policy->cur >> 2);
> > }
> >
> > /**
>
>
> We have also observed a performance degradation on our Tegra platforms
> with v6.8-rc1. Unfortunately, the above change does not fix the problem
> for us and we are still seeing a performance issue with v6.8-rc4. For
> example, running Dhrystone on Tegra234 I am seeing the following ...
>
> Linux v6.7:
> [ 2216.301949] CPU0: Dhrystones per Second: 31976326 (18199 DMIPS)
> [ 2220.993877] CPU1: Dhrystones per Second: 49568123 (28211 DMIPS)
> [ 2225.685280] CPU2: Dhrystones per Second: 49568123 (28211 DMIPS)
> [ 2230.364423] CPU3: Dhrystones per Second: 49632220 (28248 DMIPS)
>
> Linux v6.8-rc4:
> [ 44.661686] CPU0: Dhrystones per Second: 16068483 (9145 DMIPS)
> [ 51.895107] CPU1: Dhrystones per Second: 16077457 (9150 DMIPS)
> [ 59.105410] CPU2: Dhrystones per Second: 16095436 (9160 DMIPS)
> [ 66.333297] CPU3: Dhrystones per Second: 16064000 (9142 DMIPS)
>
> If I revert this change and the following ...
>
> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
> f12560779f9d ("sched/cpufreq: Rework iowait boost")
> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
>
> ... then the perf is similar to where it was ...
>
> Linux v6.8-rc4 plus reverts:
> [ 31.768189] CPU0: Dhrystones per Second: 48421678 (27559 DMIPS)
> [ 36.556838] CPU1: Dhrystones per Second: 48401324 (27547 DMIPS)
> [ 41.343343] CPU2: Dhrystones per Second: 48421678 (27559 DMIPS)
> [ 46.163347] CPU3: Dhrystones per Second: 47679814 (27137 DMIPS)
>
> All CPUs are running with the schedutil CPUFREQ governor. We have not
> looked any further but wanted to report this in case you have any more
> thoughts or suggestions for us to try.
Have you tried this:
https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
It's in driver-core-linus' branch and should be sent to Linus soon
Vincent
>
> Thanks
> Jon
>
> --
> nvpublic
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-14 17:12 ` [PATCH] sched/fair: Fix frequency selection for non invariant case Jon Hunter
2024-02-14 17:19 ` Vincent Guittot
@ 2024-02-14 17:19 ` Linus Torvalds
2024-02-14 17:22 ` Vincent Guittot
1 sibling, 1 reply; 8+ messages in thread
From: Linus Torvalds @ 2024-02-14 17:19 UTC
To: Jon Hunter
Cc: Vincent Guittot, mingo, peterz, juri.lelli, dietmar.eggemann,
rostedt, bsegall, mgorman, bristot, vschneid, wkarny, qyousef,
tglx, rafael, viresh.kumar, linux-kernel, linux-pm,
linux-tegra@vger.kernel.org, Thierry Reding, Sasha Levin,
Laxman Dewangan, Shardar Mohammed
On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
>
> We have also observed a performance degradation on our Tegra platforms
> with v6.8-rc1. Unfortunately, the above change does not fix the problem
> for us and we are still seeing a performance issue with v6.8-rc4. For
> example, running Dhrystone on Tegra234 I am seeing the following ...
>
> Linux v6.7:
> [ 2216.301949] CPU0: Dhrystones per Second: 31976326 (18199 DMIPS)
> [ 2220.993877] CPU1: Dhrystones per Second: 49568123 (28211 DMIPS)
> [ 2225.685280] CPU2: Dhrystones per Second: 49568123 (28211 DMIPS)
> [ 2230.364423] CPU3: Dhrystones per Second: 49632220 (28248 DMIPS)
>
> Linux v6.8-rc4:
> [ 44.661686] CPU0: Dhrystones per Second: 16068483 (9145 DMIPS)
> [ 51.895107] CPU1: Dhrystones per Second: 16077457 (9150 DMIPS)
> [ 59.105410] CPU2: Dhrystones per Second: 16095436 (9160 DMIPS)
> [ 66.333297] CPU3: Dhrystones per Second: 16064000 (9142 DMIPS)
>
> If I revert this change and the following ...
>
> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
> f12560779f9d ("sched/cpufreq: Rework iowait boost")
> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
>
> ... then the perf is similar to where it was ...
Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
completely buggered.
Please tell me why we shouldn't just revert things as per above?
Sure, the problem _I_ experienced is fixed, but apparently there are
others just lurking, and they are even bigger degradations than the
one I saw.
We're now at rc4, we're not releasing a 6.8 with the above kinds of
numbers. So either there's another obvious one-liner fix, or we need
to revert this whole thing.
Yes, dhrystones is a truly crappy benchmark, but partly _because_ it's
such a horribly bad benchmark it's also a very simple case. It's pure
CPU load with absolutely nothing interesting going on. Regressing on
that by a factor of three is a sign of complete failure.
Linus
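For reference, the "factor of three" can be checked directly against
the Dhrystone figures reported above. The sketch below only does that
arithmetic; the struct and function names are hypothetical, and the
numbers are copied from the Tegra234 report (CPU0 to CPU3).

```c
#include <stddef.h>

/* Dhrystones per second reported in this thread for one CPU. */
struct dhry_sample {
	unsigned long v6_7;     /* Linux v6.7 */
	unsigned long v6_8_rc4; /* Linux v6.8-rc4 */
};

/* Figures copied from the Tegra234 report, CPU0..CPU3. */
static const struct dhry_sample tegra234[] = {
	{ 31976326UL, 16068483UL },
	{ 49568123UL, 16077457UL },
	{ 49568123UL, 16095436UL },
	{ 49632220UL, 16064000UL },
};

/* Ratio of the v6.7 score to the v6.8-rc4 score for one CPU. */
static double slowdown(const struct dhry_sample *s)
{
	return (double)s->v6_7 / (double)s->v6_8_rc4;
}

/* Largest slowdown ratio across n samples. */
static double worst_slowdown(const struct dhry_sample *s, size_t n)
{
	double worst = 0.0;

	for (size_t i = 0; i < n; i++) {
		double r = slowdown(&s[i]);

		if (r > worst)
			worst = r;
	}
	return worst;
}
```

On these figures CPU1 to CPU3 come out at slightly more than 3x slower
on v6.8-rc4, while CPU0 is about 2x slower because its v6.7 score was
lower to begin with.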
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-14 17:19 ` Linus Torvalds
@ 2024-02-14 17:22 ` Vincent Guittot
2024-02-14 17:57 ` Jon Hunter
2024-02-15 8:45 ` Thorsten Leemhuis
0 siblings, 2 replies; 8+ messages in thread
From: Vincent Guittot @ 2024-02-14 17:22 UTC
To: Linus Torvalds
Cc: Jon Hunter, mingo, peterz, juri.lelli, dietmar.eggemann, rostedt,
bsegall, mgorman, bristot, vschneid, wkarny, qyousef, tglx,
rafael, viresh.kumar, linux-kernel, linux-pm,
linux-tegra@vger.kernel.org, Thierry Reding, Sasha Levin,
Laxman Dewangan, Shardar Mohammed
On Wed, 14 Feb 2024 at 18:20, Linus Torvalds
<torvalds@linux-foundation.org> wrote:
>
> On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
> >
> > We have also observed a performance degradation on our Tegra platforms
> > with v6.8-rc1. Unfortunately, the above change does not fix the problem
> > for us and we are still seeing a performance issue with v6.8-rc4. For
> > example, running Dhrystone on Tegra234 I am seeing the following ...
> >
> > Linux v6.7:
> > [ 2216.301949] CPU0: Dhrystones per Second: 31976326 (18199 DMIPS)
> > [ 2220.993877] CPU1: Dhrystones per Second: 49568123 (28211 DMIPS)
> > [ 2225.685280] CPU2: Dhrystones per Second: 49568123 (28211 DMIPS)
> > [ 2230.364423] CPU3: Dhrystones per Second: 49632220 (28248 DMIPS)
> >
> > Linux v6.8-rc4:
> > [ 44.661686] CPU0: Dhrystones per Second: 16068483 (9145 DMIPS)
> > [ 51.895107] CPU1: Dhrystones per Second: 16077457 (9150 DMIPS)
> > [ 59.105410] CPU2: Dhrystones per Second: 16095436 (9160 DMIPS)
> > [ 66.333297] CPU3: Dhrystones per Second: 16064000 (9142 DMIPS)
> >
> > If I revert this change and the following ...
> >
> > b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
> > f12560779f9d ("sched/cpufreq: Rework iowait boost")
> > 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
> >
> > ... then the perf is similar to where it was ...
>
> Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
> completely buggered.
>
> Please tell me why we shouldn't just revert things as per above?
>
> Sure, the problem _I_ experienced is fixed, but apparently there are
> others just lurking, and they are even bigger degradations than the
> one I saw.
>
> We're now at rc4, we're not releasing a 6.8 with the above kinds of
> numbers. So either there's another obvious one-liner fix, or we need
> to revert this whole thing.
This should fix it:
https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
>
> Yes, dhrystones is a truly crappy benchmark, but partly _because_ it's
> such a horribly bad benchmark it's also a very simple case. It's pure
> CPU load with absolutely nothing interesting going on. Regressing on
> that by a factor of three is a sign of complete failure.
>
> Linus
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-14 17:22 ` Vincent Guittot
@ 2024-02-14 17:57 ` Jon Hunter
2024-02-15 8:45 ` Thorsten Leemhuis
1 sibling, 0 replies; 8+ messages in thread
From: Jon Hunter @ 2024-02-14 17:57 UTC
To: Vincent Guittot, Linus Torvalds
Cc: mingo, peterz, juri.lelli, dietmar.eggemann, rostedt, bsegall,
mgorman, bristot, vschneid, wkarny, qyousef, tglx, rafael,
viresh.kumar, linux-kernel, linux-pm, linux-tegra@vger.kernel.org,
Thierry Reding, Sasha Levin, Laxman Dewangan, Shardar Mohammed
On 14/02/2024 17:22, Vincent Guittot wrote:
> On Wed, 14 Feb 2024 at 18:20, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>>
>> On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
>>>
>>> We have also observed a performance degradation on our Tegra platforms
>>> with v6.8-rc1. Unfortunately, the above change does not fix the problem
>>> for us and we are still seeing a performance issue with v6.8-rc4. For
>>> example, running Dhrystone on Tegra234 I am seeing the following ...
>>>
>>> Linux v6.7:
>>> [ 2216.301949] CPU0: Dhrystones per Second: 31976326 (18199 DMIPS)
>>> [ 2220.993877] CPU1: Dhrystones per Second: 49568123 (28211 DMIPS)
>>> [ 2225.685280] CPU2: Dhrystones per Second: 49568123 (28211 DMIPS)
>>> [ 2230.364423] CPU3: Dhrystones per Second: 49632220 (28248 DMIPS)
>>>
>>> Linux v6.8-rc4:
>>> [ 44.661686] CPU0: Dhrystones per Second: 16068483 (9145 DMIPS)
>>> [ 51.895107] CPU1: Dhrystones per Second: 16077457 (9150 DMIPS)
>>> [ 59.105410] CPU2: Dhrystones per Second: 16095436 (9160 DMIPS)
>>> [ 66.333297] CPU3: Dhrystones per Second: 16064000 (9142 DMIPS)
>>>
>>> If I revert this change and the following ...
>>>
>>> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
>>> f12560779f9d ("sched/cpufreq: Rework iowait boost")
>>> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
>>>
>>> ... then the perf is similar to where it was ...
>>
>> Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
>> completely buggered.
>>
>> Please tell me why we shouldn't just revert things as per above?
>>
>> Sure, the problem _I_ experienced is fixed, but apparently there are
>> others just lurking, and they are even bigger degradations than the
>> one I saw.
>>
>> We're now at rc4, we're not releasing a 6.8 with the above kinds of
>> numbers. So either there's another obvious one-liner fix, or we need
>> to revert this whole thing.
>
> This should fix it:
> https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
Yes, I can confirm that this does fix it ...
[ 29.440836] CPU0: Dhrystones per Second: 48340366 (27513 DMIPS)
[ 34.221323] CPU1: Dhrystones per Second: 48585127 (27652 DMIPS)
[ 38.988036] CPU2: Dhrystones per Second: 48667266 (27699 DMIPS)
[ 43.769430] CPU3: Dhrystones per Second: 48544161 (27629 DMIPS)
>> Yes, dhrystones is a truly crappy benchmark, but partly _because_ it's
>> such a horribly bad benchmark it's also a very simple case. It's pure
>> CPU load with absolutely nothing interesting going on. Regressing on
>> that by a factor of three is a sign of complete failure.
We have a few other more extensive tests that have been failing due to
the perf issue. We will run those with the above and if we see any more
issues I will let everyone know.
Thanks
Jon
--
nvpublic
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-14 17:22 ` Vincent Guittot
2024-02-14 17:57 ` Jon Hunter
@ 2024-02-15 8:45 ` Thorsten Leemhuis
2024-02-15 9:13 ` Greg KH
1 sibling, 1 reply; 8+ messages in thread
From: Thorsten Leemhuis @ 2024-02-15 8:45 UTC
To: Linus Torvalds
Cc: Jon Hunter, mingo, peterz, juri.lelli, dietmar.eggemann, rostedt,
bsegall, mgorman, bristot, vschneid, wkarny, qyousef, tglx,
rafael, viresh.kumar, linux-kernel, linux-pm,
linux-tegra@vger.kernel.org, Thierry Reding, Sasha Levin,
Laxman Dewangan, Shardar Mohammed, Vincent Guittot,
Linux kernel regressions list
Linus, what...
On 14.02.24 18:22, Vincent Guittot wrote:
> On Wed, 14 Feb 2024 at 18:20, Linus Torvalds
> <torvalds@linux-foundation.org> wrote:
>> On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
>>> We have also observed a performance degradation on our Tegra platforms
>>> with v6.8-rc1. Unfortunately, the above change does not fix the problem
>>> for us and we are still seeing a performance issue with v6.8-rc4. For
>>> example, running Dhrystone on Tegra234 I am seeing the following ...
>>> [...]
>>> If I revert this change and the following ...
>>> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
>>> f12560779f9d ("sched/cpufreq: Rework iowait boost")
>>> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
>>> ... then the perf is similar to where it was ...
>>
>> Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
>> completely buggered.
>> [...]
> This should fix it:
> https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
...do you want me to do in situations like this? I'm asking, as I see
situations like this frequently -- e.g. people reporting problems a
second, third, or fourth time while the fix is already sitting in -next
for a few days.
Want me to list them in the weekly reports so that you can cherry-pick
them from -next if you want?
Ciao, Thorsten
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-15 8:45 ` Thorsten Leemhuis
@ 2024-02-15 9:13 ` Greg KH
2024-02-15 10:00 ` Thorsten Leemhuis
0 siblings, 1 reply; 8+ messages in thread
From: Greg KH @ 2024-02-15 9:13 UTC
To: Thorsten Leemhuis
Cc: Linus Torvalds, Jon Hunter, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
wkarny, qyousef, tglx, rafael, viresh.kumar, linux-kernel,
linux-pm, linux-tegra@vger.kernel.org, Thierry Reding,
Sasha Levin, Laxman Dewangan, Shardar Mohammed, Vincent Guittot,
Linux kernel regressions list
On Thu, Feb 15, 2024 at 09:45:01AM +0100, Thorsten Leemhuis wrote:
> Linus, what...
>
> On 14.02.24 18:22, Vincent Guittot wrote:
> > On Wed, 14 Feb 2024 at 18:20, Linus Torvalds
> > <torvalds@linux-foundation.org> wrote:
> >> On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
> >>> We have also observed a performance degradation on our Tegra platforms
> >>> with v6.8-rc1. Unfortunately, the above change does not fix the problem
> >>> for us and we are still seeing a performance issue with v6.8-rc4. For
> >>> example, running Dhrystone on Tegra234 I am seeing the following ...
> >>> [...]
> >>> If I revert this change and the following ...
> >>> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
> >>> f12560779f9d ("sched/cpufreq: Rework iowait boost")
> >>> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
> >>> ... then the perf is similar to where it was ...
> >>
> >> Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
> >> completely buggered.
> >> [...]
> > This should fix it:
> > https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
>
> ...do you want me to do in situations like this? I'm asking, as I see
> situations like this frequently -- e.g. people reporting problems a
> second, third, or fourth time while the fix is already sitting in -next
> for a few days.
>
> Want me to list them in the weekly reports so that you can cherry-pick
> them from -next if you want?
Poke the maintainer to get off their butt and submit the pull request to
Linus (note, this is me in this case...)
I'll get it into the next -rc, sorry for the delay, other things got in
the way, my fault.
greg k-h
* Re: [PATCH] sched/fair: Fix frequency selection for non invariant case
2024-02-15 9:13 ` Greg KH
@ 2024-02-15 10:00 ` Thorsten Leemhuis
0 siblings, 0 replies; 8+ messages in thread
From: Thorsten Leemhuis @ 2024-02-15 10:00 UTC
To: Greg KH
Cc: Linus Torvalds, Jon Hunter, mingo, peterz, juri.lelli,
dietmar.eggemann, rostedt, bsegall, mgorman, bristot, vschneid,
wkarny, qyousef, tglx, rafael, viresh.kumar, linux-kernel,
linux-pm, linux-tegra@vger.kernel.org, Thierry Reding,
Sasha Levin, Laxman Dewangan, Shardar Mohammed, Vincent Guittot,
Linux kernel regressions list
On 15.02.24 10:13, Greg KH wrote:
> On Thu, Feb 15, 2024 at 09:45:01AM +0100, Thorsten Leemhuis wrote:
>> Linus, what...
>>
>> On 14.02.24 18:22, Vincent Guittot wrote:
>>> On Wed, 14 Feb 2024 at 18:20, Linus Torvalds
>>> <torvalds@linux-foundation.org> wrote:
>>>> On Wed, 14 Feb 2024 at 09:12, Jon Hunter <jonathanh@nvidia.com> wrote:
>>>>> We have also observed a performance degradation on our Tegra platforms
>>>>> with v6.8-rc1. Unfortunately, the above change does not fix the problem
>>>>> for us and we are still seeing a performance issue with v6.8-rc4. For
>>>>> example, running Dhrystone on Tegra234 I am seeing the following ...
>>>>> [...]
>>>>> If I revert this change and the following ...
>>>>> b3edde44e5d4 ("cpufreq/schedutil: Use a fixed reference frequency")
>>>>> f12560779f9d ("sched/cpufreq: Rework iowait boost")
>>>>> 9c0b4bb7f630 ("sched/cpufreq: Rework schedutil governor performance estimation")
>>>>> ... then the perf is similar to where it was ...
>>>>
>>>> Ok, guys, this whole scheduler / cpufreq rewrite seems to have been
>>>> completely buggered.
>>>> [...]
>>> This should fix it:
>>> https://lore.kernel.org/lkml/20240117190545.596057-1-vincent.guittot@linaro.org/
>>
>> ...do you want me to do in situations like this? I'm asking, as I see
>> situations like this frequently -- e.g. people reporting problems a
>> second, third, or fourth time while the fix is already sitting in -next
>> for a few days.
>>
>> Want me to list them in the weekly reports so that you can cherry-pick
>> them from -next if you want?
>
> Poke the maintainer to get off their butt and submit the pull request to
> Linus
Well, I did that sometimes and will continue to do so. But some
maintainers then feel pestered and become annoyed by my efforts -- which
in the long term is counterproductive, as regression tracking will only
work well if maintainers and I work well together. That's why I'm a bit
careful with such things (side note: don't worry, I know that some
conflict is inevitable -- but I don't have your or Linus's standing, so I
have to choose my fights carefully...).
I sometimes also got replies along the lines of "we are only at -rc2,
this can wait till -rc5 or -rc6" -- and I have no quote from Linus at
hand I can point maintainers to that says something along the lines of
"if a regression fix was in -next for at least two days, submit it to
mainline before the next -rc, unless there is a strong reason why that
particular fix needs more testing" (or whatever he actually wants).
> (note, this is me in this case...)
> I'll get it into the next -rc, sorry for the delay, other things got in
> the way, my fault.
Happens, I don't care too much about this specific event, more about the
general problem. Especially the -mm tree bugs me sometimes, as I noticed
that Andrew often lets regression fixes linger in -next for roughly
a week; he furthermore sometimes sends this stuff to Linus on Mondays or
Tuesdays. Due to that, the fixes often miss at least one, sometimes two
-rcs. That is especially hard to watch if the regression made it to a
stable kernel and you are waiting for the fix to get mainlined.
Ciao, Thorsten