* Re: sched/deadline: Use revised wakeup rule for dl_server
@ 2026-05-08 8:09 Andreas Ziegler
2026-05-08 9:20 ` Christian Loehle
2026-05-11 12:46 ` Juri Lelli
0 siblings, 2 replies; 9+ messages in thread
From: Andreas Ziegler @ 2026-05-08 8:09 UTC (permalink / raw)
To: Peter Zijlstra, Juri Lelli; +Cc: linux-kernel
Linux kernel version: 6.12
CONFIG_PREEMPT_RT (w/ PREEMPT_RT patch applied)
Architecture: aarch64
Platform: Raspberry Pi 4
Hi everyone,
Commit d66792919d4f (sched/deadline: Use revised wakeup rule for
dl_server) [1] introduced a marked degradation in scheduling latency for
real-time tasks in the presence of heavy I/O load.
--- a/kernel/sched/deadline.c
+++ b/kernel/sched/deadline.c
@@ -1079,7 +1079,7 @@ static void update_dl_entity(struct sched_dl_entity *dl_se)
         if (dl_time_before(dl_se->deadline, rq_clock(rq)) ||
             dl_entity_overflow(dl_se, rq_clock(rq))) {
 
-                if (unlikely(!dl_is_implicit(dl_se) &&
+                if (unlikely((!dl_is_implicit(dl_se) || dl_se->dl_defer) &&
                              !dl_time_before(dl_se->deadline, rq_clock(rq)) &&
                              !is_dl_boosted(dl_se))) {
                         update_dl_revised_wakeup(dl_se, rq);
This was observed using a modified version of Con Kolivas' interactivity
benchmark [2]; kernel bisection eventually pointed to the above-mentioned
commit.
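To make the one-line change concrete, here is a minimal userspace sketch of the inner condition before and after the commit. It is not kernel code: `struct dl_model`, its fields, and `takes_revised_wakeup()` are hypothetical stand-ins, and the boosted state is reduced to a plain flag. It assumes, as in kernel/sched/deadline.c, that an entity is "implicit" when its relative deadline equals its period and that time comparisons are wraparound-safe.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical userspace stand-in for the few sched_dl_entity fields used. */
struct dl_model {
    uint64_t deadline;     /* absolute deadline, ns */
    uint64_t dl_deadline;  /* relative deadline, ns */
    uint64_t dl_period;    /* period, ns */
    bool     dl_defer;     /* deferred server flag */
    bool     boosted;      /* stand-in for is_dl_boosted() */
};

/* Wraparound-safe "a is before b", modelled on dl_time_before(). */
static bool time_before(uint64_t a, uint64_t b)
{
    return (int64_t)(a - b) < 0;
}

/* Implicit deadline: the relative deadline equals the period. */
static bool is_implicit(const struct dl_model *se)
{
    return se->dl_deadline == se->dl_period;
}

/*
 * Inner condition from the diff. With patched == false this is the old
 * test; with patched == true, dl_defer entities also qualify.
 */
static bool takes_revised_wakeup(const struct dl_model *se, uint64_t now,
                                 bool patched)
{
    assert(se != NULL);

    bool eligible = !is_implicit(se) || (patched && se->dl_defer);

    return eligible && !time_before(se->deadline, now) && !se->boosted;
}
```

In this model the only case whose outcome differs between the two variants is an implicit-deadline entity with dl_defer set, not boosted, whose deadline is not yet in the past.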
Benchmark results before d66792919d4f:
--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load      Latency +/- SD       median max [100n]  Desired CPU  Deadlines met [%]
None         76.6 +/- 8.3654       76        166
Video        78.5 +/- 3.9433       78        107
X            76.4 +/- 8.123        75        157
Burn         72.0 +/- 6.4733       71        127
Write       255.3 +/- 26.627      252        331
Read        226.6 +/- 12.38       227        262
Ring         84.2 +/- 6.6207       83        125
Compile     225.3 +/- 23.949      222        328

            136.8 +/- 78.462                 331
Benchmark results after d66792919d4f:
--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load      Latency +/- SD       median max [100n]  Desired CPU  Deadlines met [%]
None         68.4 +/- 9.7864       67        169
Video        74.4 +/- 3.724        74         97
X            72.0 +/- 6.5681       71        129
Burn         66.9 +/- 5.9059       66        117
Write      9576.9 +/- 67639       250     500418         98.1  98.1
Read        209.3 +/- 11.018      209        267
Ring         80.5 +/- 8.0993       78        125
Compile     239.0 +/- 29.447      234        372

           1298.4 +/- 24118                500418
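For scale, the overall maximum can be converted to wall-clock time. The sketch below assumes that "[100n]" in the table header means units of 100 ns; that reading is an assumption, and `units_to_ns()` and `units_to_whole_ms()` are hypothetical helper names.

```c
#include <assert.h>
#include <stdint.h>

/* Assumption: one table unit is 100 ns, per the "[100n]" column header. */
#define NS_PER_UNIT 100ULL
#define NS_PER_MS   1000000ULL

/* Convert a latency value from table units to nanoseconds. */
static uint64_t units_to_ns(uint64_t units)
{
    /* Guard against overflow of the multiplication below. */
    assert(units <= UINT64_MAX / NS_PER_UNIT);
    return units * NS_PER_UNIT;
}

/* Whole milliseconds (truncated), for a quick sanity check. */
static uint64_t units_to_whole_ms(uint64_t units)
{
    return units_to_ns(units) / NS_PER_MS;
}
```

Under that assumption, the overall maximum of 500418 after the commit is 50041800 ns, about 50 ms, compared with 331 units (33100 ns) before it.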
Reverting this commit obviously solves the issue for me. I have no idea
why this issue appears exclusively with heavy write loads in the
background.
Is this a scheduler issue, or rather something in the background?
Kind regards,
Andreas
[1]
https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/commit/?h=v6.12.86&id=d66792919d4f7bd326dfd8c21d019f7c5d4ef05c
[2] https://github.com/ckolivas/interbench
^ permalink raw reply [flat|nested] 9+ messages in thread
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-08 8:09 sched/deadline: Use revised wakeup rule for dl_server Andreas Ziegler
@ 2026-05-08 9:20 ` Christian Loehle
2026-05-08 12:06 ` Andreas Ziegler
2026-05-11 12:46 ` Juri Lelli
1 sibling, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2026-05-08 9:20 UTC (permalink / raw)
To: Andreas Ziegler, Peter Zijlstra, Juri Lelli
Cc: linux-kernel, Dietmar Eggemann
On 5/8/26 09:09, Andreas Ziegler wrote:
> [snip]
> Reverting this commit obviously solves the issue for me. I have no idea why this issue appears exclusively with heavy write loads in the background.
>
> Is this a scheduler issue, or rather something in the background?

Hi Andreas,
You're using cpufreq schedutil for your tests I'm assuming?
Is there a difference in cpufreq behavior (avg cpufreq or OPP residencies?)
Does the regression also happen on powersave/performance governor?
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-08 9:20 ` Christian Loehle
@ 2026-05-08 12:06 ` Andreas Ziegler
2026-05-08 14:13 ` Christian Loehle
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Ziegler @ 2026-05-08 12:06 UTC (permalink / raw)
To: Christian Loehle
Cc: Peter Zijlstra, Juri Lelli, linux-kernel, Dietmar Eggemann
Hi Christian,

On 2026-05-08 09:20, Christian Loehle wrote:
> [snip]
> Hi Andreas,
> You're using cpufreq schedutil for your tests I'm assuming?
> Is there a difference in cpufreq behavior (avg cpufreq or OPP residencies?)
> Does the regression also happen on powersave/performance governor?

Actually this is a very stripped-down system. The 'performance' cpufreq governor is the only one compiled in; the processor cores run at a fixed frequency. CONFIG_PM_OPP is not set.

Removing the frequency constraint and using the 'powersave' governor raises the latency values in general, but the anomaly under write loads persists. The CPU frequency does not change; it stays at the lowest level.

--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load      Latency +/- SD       median max [100n]  Desired CPU  Deadlines met [%]
None        238.7 +/- 31.416      229        405
Video       228.6 +/- 13.668      226        291
X           247.8 +/- 29.196      239        425
Burn        222.6 +/- 30.631      215        348
Write      1214.8 +/- 20397       369     500411         99.8  99.8
Read        393.9 +/- 21.375      394        476
Ring        250.3 +/- 27.59       241        365
Compile     411.2 +/- 23.41       411        474

            401.0 +/- 7218.2               500411

The same happens with the 'schedutil' governor, where the CPU frequency adjusts with the load.

--- Benchmarking simulated cpu of Audio real time in the presence of simulated ---
Load      Latency +/- SD       median max [100n]  Desired CPU  Deadlines met [%]
None        200.9 +/- 57.332      208        431
Video       136.2 +/- 23.784      136        250
X           172.3 +/- 59.286      174        404
Burn        104.1 +/- 22.847       97        247
Write      5337.5 +/- 49960       286     500394           99  99
Read        300.5 +/- 18.65       301        359
Ring        119.8 +/- 15.8        115        196
Compile     282.7 +/- 25.056      280        469

            831.7 +/- 17746                500394

Kind regards,
Andreas
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-08 12:06 ` Andreas Ziegler
@ 2026-05-08 14:13 ` Christian Loehle
2026-05-09 11:42 ` Andreas Ziegler
0 siblings, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2026-05-08 14:13 UTC (permalink / raw)
To: Andreas Ziegler
Cc: Peter Zijlstra, Juri Lelli, linux-kernel, Dietmar Eggemann, John Stultz
On 5/8/26 13:06, Andreas Ziegler wrote:
> [snip]
> Actually this is a very stripped-down system. The 'performance' cpufreq governor is the only one compiled in, the processor cores run on a fixed frequency. CONFIG_PM_OPP is not set.

That certainly makes the analysis easier.
I couldn't reproduce the issue so far on my system, but it does seem like the dl server could get potentially unbounded running time if very frequent starting and stopping of the dl server (which presumably happens because of the writeback) resets the runtime, which would then lead to your 25s observed latency.
Peter, how is the revised wakeup rule supposed to behave here?

> [snip]
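As a reference point for the question above, here is a userspace sketch of the runtime that the revised wakeup rule assigns, as it reads in kernel/sched/deadline.c: the remaining runtime is scaled by the entity's density to the time left until the deadline (the laxity). `to_ratio_model()` and `revised_runtime()` are hypothetical names, the 20-bit fixed-point shift mirrors the kernel's BW_SHIFT, and the 50 ms per 1 s parameters in the note below are illustrative assumptions, not values taken from the report.

```c
#include <assert.h>
#include <stdint.h>

#define BW_SHIFT 20  /* fixed-point shift, mirroring the kernel constant */

/* runtime/period as a fixed-point ratio with BW_SHIFT fractional bits. */
static uint64_t to_ratio_model(uint64_t period_ns, uint64_t runtime_ns)
{
    assert(period_ns > 0);
    return (runtime_ns << BW_SHIFT) / period_ns;
}

/*
 * Runtime granted on wakeup under the revised rule:
 *   runtime = density * (deadline - now)
 * The caller must ensure the deadline is not in the past.
 */
static uint64_t revised_runtime(uint64_t density, uint64_t deadline_ns,
                                uint64_t now_ns)
{
    assert(deadline_ns >= now_ns);

    uint64_t laxity = deadline_ns - now_ns;

    return (density * laxity) >> BW_SHIFT;
}
```

With 50 ms of runtime per 1 s (density 52428 in this fixed-point format), a wakeup 500 ms before the deadline yields 24999618 ns, roughly half the budget for half the period. This sketch models only the formula and says nothing about how often the rule is applied when the server is started and stopped repeatedly.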
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-08 14:13 ` Christian Loehle
@ 2026-05-09 11:42 ` Andreas Ziegler
2026-05-11 9:47 ` Christian Loehle
0 siblings, 1 reply; 9+ messages in thread
From: Andreas Ziegler @ 2026-05-09 11:42 UTC (permalink / raw)
To: Christian Loehle
Cc: Peter Zijlstra, Juri Lelli, linux-kernel, Dietmar Eggemann, John Stultz
Hi Christian, Everyone,

On 2026-05-08 14:13, Christian Loehle wrote:
> [snip]
> I couldn't reproduce the issue so far on my system but it does seem like the dl server
> would get potentially unbounded running time with very frequent starting and stopping of the dlserver (which presumably happens because of
> the writeback) reset the runtime, which then leads to your 25s observed latency.
> Peter, how is the revised wakeup rule supposed to behave here?

This seems to be a case of runtime starvation. If I change sched_rt_runtime_us to a smaller value, the benchmark returns reasonable latency values.

# echo "980000" > /proc/sys/kernel/sched_rt_runtime_us

I could live with this workaround, since it seems not to impact overall latency values in a noticeable way.

Kind regards,
Andreas
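For context on the value written above: per the kernel's documentation of these sysctls, sched_rt_runtime_us is the time out of each sched_rt_period_us that real-time scheduling may use. The sketch below computes the remaining share in permille. It assumes the default period of 1000000 µs, which the message does not state, and `non_rt_permille()` is a hypothetical helper.

```c
#include <assert.h>
#include <stdint.h>

/* Share of each period NOT available to real-time scheduling, in permille. */
static uint32_t non_rt_permille(uint64_t rt_runtime_us, uint64_t rt_period_us)
{
    assert(rt_period_us > 0 && rt_runtime_us <= rt_period_us);

    return (uint32_t)(((rt_period_us - rt_runtime_us) * 1000) / rt_period_us);
}
```

With a runtime of 980000 and a period of 1000000 this gives 20 permille (2%); the stock default runtime of 950000 gives 50 permille (5%).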
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-09 11:42 ` Andreas Ziegler
@ 2026-05-11 9:47 ` Christian Loehle
2026-05-11 12:37 ` Andreas Ziegler
0 siblings, 1 reply; 9+ messages in thread
From: Christian Loehle @ 2026-05-11 9:47 UTC (permalink / raw)
To: Andreas Ziegler
Cc: Peter Zijlstra, Juri Lelli, linux-kernel, Dietmar Eggemann, John Stultz
On 5/9/26 12:42, Andreas Ziegler wrote:
> [snip]
> This seems to be a case of runtime starvation. If I change sched_rt_runtime_us to a smaller value, the benchmark returns reasonable latency values.
>
> # echo "980000" > /proc/sys/kernel/sched_rt_runtime_us
>
> I could live with this workaround, since it seems not to impact overall latency values in a noticeable way.

Not a very stable workaround unfortunately :/
While I try to reproduce this, what you're observing should imply that the background SCHED_NORMAL work is enough to fully utilize the system, right?
interbench Write does 4k (buffered) writes of a 1GB file and then close+open and repeat, nothing fancy really. Does this actually produce significant CPU utilization for you? Can you just run the background work and see what that looks like?
(What you're seeing looks like a bug in any case, just so I'm not going down a wrong path when trying to reproduce here).
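The Write load described above can be approximated outside the benchmark with a few lines of C. This is a minimal sketch of the described pattern (4 KiB buffered writes up to a target size, then close, reopen and repeat), not interbench's actual code; `write_pass()`, `write_load()`, the fill byte and the file path are all hypothetical.

```c
#include <assert.h>
#include <fcntl.h>
#include <stddef.h>
#include <stdint.h>
#include <string.h>
#include <unistd.h>

#define BLOCK_SIZE 4096

/*
 * One pass of the load: open the file, issue 4 KiB buffered (page-cache)
 * writes until at least total_bytes are written, then close.
 * Returns the number of bytes written, or -1 on error.
 */
static int64_t write_pass(const char *path, int64_t total_bytes)
{
    char block[BLOCK_SIZE];
    int64_t written = 0;

    assert(path != NULL && total_bytes >= 0);
    memset(block, 0xA5, sizeof(block));

    int fd = open(path, O_WRONLY | O_CREAT | O_TRUNC, 0600);
    if (fd < 0)
        return -1;

    while (written < total_bytes) {
        ssize_t n = write(fd, block, sizeof(block));
        if (n <= 0) {
            close(fd);
            return -1;
        }
        written += n;
    }

    if (close(fd) < 0)
        return -1;
    return written;
}

/* Repeat the pass a fixed number of times; a real load loops until stopped. */
static int64_t write_load(const char *path, int64_t total_bytes, int passes)
{
    int64_t sum = 0;

    for (int i = 0; i < passes; i++) {
        int64_t n = write_pass(path, total_bytes);
        if (n < 0)
            return -1;
        sum += n;
    }
    return sum;
}
```

Running such a loop on its own, without the real-time task, is one way to observe how much CPU time the background work consumes.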
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-11 9:47 ` Christian Loehle
@ 2026-05-11 12:37 ` Andreas Ziegler
0 siblings, 0 replies; 9+ messages in thread
From: Andreas Ziegler @ 2026-05-11 12:37 UTC (permalink / raw)
To: Christian Loehle
Cc: Peter Zijlstra, Juri Lelli, linux-kernel, Dietmar Eggemann, John Stultz
On 2026-05-11 09:47, Christian Loehle wrote:
> [snip]
> Not a very stable workaround unfortunately :/
> While I try to reproduce this, what you're observing should imply that the
> background SCHED_NORMAL work is enough to fully utilize the system, right?
> interbench Write does 4k (buffered) writes of a 1GB file and then close+open
> and repeat, nothing fancy really. Does this actually produce significant CPU
> utilization for you? Can you just run the background work and see what that
> looks like?
> (What you're seeing looks like a bug in any case, just so I'm not going down
> a wrong path when trying to reproduce here).

You are right, and this was a false positive; the problem seems to be intermittent (maybe 1/20) and I just got lucky for one session.

Some background information about the current state of the system:

/* CONFIG_CPU_FREQ is not set */
Root filesystem in RAM (initrd)
Cpu 3 is isolated:
boot parameters: console=tty1 console=ttyAMA0,115200 isolcpus=nohz,domain,managed_irq,3 nohz_full=3 rcu_nocbs=3

Background load is normally near 100% idle; this is from top after reboot:

Mem: 95724K used, 853524K free, 42408K shrd, 72K buff, 43352K cached
CPU: 0.0% usr 0.0% sys 0.0% nic 100% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 0.21 0.17 0.07 3/126 702

The file size used by interbench is even less than 1GB, due to the limits of the rootfs. Typical values are around 100-200 MiB. It is written in an infinite loop until receiving the stop message (via pipe) from the controlling process. The check for the abort signal occurs after a completed write, not on block level.

I just noticed that interbench seems to have a bug itself: it uses only one processor - looks like a mangled cpu mask.

Top output during the write benchmark:

Mem: 358024K used, 591224K free, 298516K shrd, 2504K buff, 299464K cached
CPU: 1.8% usr 23.1% sys 0.0% nic 74.9% idle 0.0% io 0.0% irq 0.0% sirq
Load average: 1.21 0.46 0.29 5/129 2116
  PID  PPID USER     STAT   VSZ %VSZ CPU %CPU COMMAND
 2106  2105 root     S     1228  0.1   0 23.6 interbench -r -t 60 -u -w Write -W
 2109  2105 root     S     1228  0.1   0  1.2 interbench -r -t 60 -u -w Write -W
 1829  1274 root     R     1600  0.1   2  0.0 top -d 5
   22     2 root     SW       0  0.0   0  0.0 [rcuc/0]
 1270     2 root     IW       0  0.0   0  0.0 [kworker/0:0-eve]
  652     1 mpd      S    27632  2.9   0  0.0 /usr/bin/mpd
 2023  2021 root     S     4476  0.4   0  0.0 sshd-session: root@notty
  675   673 root     S     4448  0.4   1  0.0 sshd-session: root@pts/0
  673   601 root     S     4140  0.4   0  0.0 sshd-session: root [priv]
 2021   601 root     S     4140  0.4   0  0.0 sshd-session: root [priv]
  601     1 root     S     3736  0.3   1  0.0 sshd: /usr/sbin/sshd [listener] 0
 2024  2023 root     S     3224  0.3   1  0.0 /usr/libexec/sftp-server
 2025  2023 root     S     3188  0.3   2  0.0 /usr/libexec/sftp-server
  501     1 root     S     1884  0.2   1  0.0 /usr/sbin/wpa_supplicant -B -P /va
  131     1 root     S     1672  0.1   0  0.0 /sbin/mdev -df
  676   675 root     S     1636  0.1   1  0.0 -sh
 1274   605 root     S     1636  0.1   1  0.0 -sh
  605     1 root     S     1592  0.1   1  0.0 /usr/sbin/telnetd -F
  527     1 root     S     1576  0.1   2  0.0 udhcpc -t1 -A2 -b -R -O search -O
    1     0 root     S     1576  0.1   0  0.0 init

I tried limiting interbench's rather excessive SCHED_FIFO priorities to values normal for the system, but without success.
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-08 8:09 sched/deadline: Use revised wakeup rule for dl_server Andreas Ziegler
2026-05-08 9:20 ` Christian Loehle
@ 2026-05-11 12:46 ` Juri Lelli
2026-05-11 14:13 ` Andreas Ziegler
1 sibling, 1 reply; 9+ messages in thread
From: Juri Lelli @ 2026-05-11 12:46 UTC (permalink / raw)
To: Andreas Ziegler; +Cc: Peter Zijlstra, linux-kernel
Hello,

On 08/05/26 08:09, Andreas Ziegler wrote:
> Linux kernel version: 6.12
> CONFIG_PREEMPT_RT (w/ PREEMPT_RT patch applied)
> Architecture: aarch64
> Platform: Raspberry Pi 4
>
> Hi everyone,
>
> Commit d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server)
> [1] introduced a marked degradation in scheduling latency for real-time
> tasks in the presence of heavy I/O load.

Can this be the same regression reported here?

https://marc.info/?l=linux-rt-users&m=177844667227991

Please notice the list of missing subsequent fixes Mike is suggesting to test with.

https://marc.info/?l=linux-rt-users&m=177847863710263&w=2

Thanks,
Juri
* Re: sched/deadline: Use revised wakeup rule for dl_server
2026-05-11 12:46 ` Juri Lelli
@ 2026-05-11 14:13 ` Andreas Ziegler
0 siblings, 0 replies; 9+ messages in thread
From: Andreas Ziegler @ 2026-05-11 14:13 UTC (permalink / raw)
To: Juri Lelli; +Cc: Peter Zijlstra, linux-kernel
Hi Juri,

On 2026-05-11 12:46, Juri Lelli wrote:
> [snip]
> Can this be the same regression reported here?
>
> https://marc.info/?l=linux-rt-users&m=177844667227991

Yes, this is the same issue. I wonder where the 50 ms are coming from ... The value is fairly consistent also in my results.

> Please notice the list of missing subsequent fixes Mike is suggesting to
> test with.
>
> https://marc.info/?l=linux-rt-users&m=177847863710263&w=2

I will take a look at the mentioned patches.

> Thanks,
> Juri

Thank you for the update,
Andreas