* [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Lukas Beckmann @ 2026-05-10 20:57 UTC
To: Peter Zijlstra, Juri Lelli, Sasha Levin
Cc: regressions, stable, linux-rt-users

Hi,

I am reporting a regression which was introduced by d66792919d4f on 6.12.y.
Since this commit, cyclictest reports latencies up to 50 milliseconds,
on kernels with CONFIG_PREEMPT_RT=y.

Steps to reproduce:
1. run a load (e.g. stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 128M)
2. run cyclictest (e.g. cyclictest -a -t -m -p 80 -i 250 -d 0)

cyclictest results on the current linux-6.12.y branch (tag v6.12.87):
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 9.37 9.21 6.90 9/211 978

T: 0 (  884) P:80 I:250 C:4688252 Min:      3 Act:    6 Avg:    6 Max:   51956
T: 1 (  885) P:80 I:250 C:4688051 Min:      3 Act:    7 Avg:    6 Max:   50106
T: 2 (  886) P:80 I:250 C:4688242 Min:      3 Act:    6 Avg:    6 Max:   51965
T: 3 (  887) P:80 I:250 C:4688434 Min:      3 Act:   12 Avg:    8 Max:      59

cyclictest results on 6.12.y with d66792919d4f reverted:
# /dev/cpu_dma_latency set to 0us
policy: fifo: loadavg: 9.43 9.50 9.44 8/204 5758

T: 0 (  862) P:80 I:250 C:272329322 Min:      3 Act:    6 Avg:    6 Max:      57
T: 1 (  863) P:80 I:250 C:272329324 Min:      3 Act:    7 Avg:    6 Max:      77
T: 2 (  864) P:80 I:250 C:272329322 Min:      3 Act:    7 Avg:    6 Max:      68
T: 3 (  865) P:80 I:250 C:272329322 Min:      3 Act:   16 Avg:    7 Max:      81

This is reproducible on multiple machines.
It looks like the timer fires and there is also a sched_waking event in the
trace, but the cyclictest thread does not get scheduled for another 50ms.

I found this, because Debian updated its rt kernel from 6.12.74 to 6.12.85.
The issue was also present with upstream 6.12.85 and HEAD, but not with
6.12.74, so I started bisecting and eventually found d66792919d4f.

Is it possible to revert the commit?

I can provide traces or help with testing if needed.

Thanks
Lukas Beckmann

#regzbot introduced: d66792919d4f
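For comparing runs like the two in the report above without eyeballing the Max column, a small parser can flag the threads whose worst-case latency exceeds a threshold. This is only a minimal sketch and not part of the original report: the names `parse_cyclictest` and `threads_over` and the 1000 us default threshold are illustrative assumptions, and the parser assumes nothing beyond the per-thread summary line format shown in the cyclictest output above.

```python
import re

# Matches one cyclictest per-thread summary line, e.g.
# "T: 0 ( 884) P:80 I:250 C:4688252 Min: 3 Act: 6 Avg: 6 Max: 51956"
LINE_RE = re.compile(
    r"T:\s*(?P<thread>\d+)\s*\(\s*(?P<tid>\d+)\)\s*"
    r"P:\s*(?P<prio>\d+)\s*I:\s*(?P<interval>\d+)\s*C:\s*(?P<cycles>\d+)\s*"
    r"Min:\s*(?P<min>\d+)\s*Act:\s*(?P<act>\d+)\s*"
    r"Avg:\s*(?P<avg>\d+)\s*Max:\s*(?P<max>\d+)"
)


def parse_cyclictest(text):
    """Return one dict of integer fields per thread summary found in text."""
    return [
        {name: int(value) for name, value in m.groupdict().items()}
        for m in LINE_RE.finditer(text)
    ]


def threads_over(text, threshold_us=1000):
    """Return (thread, max latency in us) pairs exceeding threshold_us.

    The default threshold is an arbitrary illustrative value.
    """
    return [
        (row["thread"], row["max"])
        for row in parse_cyclictest(text)
        if row["max"] > threshold_us
    ]


# Per-thread lines copied from the v6.12.87 run in the report.
SAMPLE = """\
T: 0 ( 884) P:80 I:250 C:4688252 Min: 3 Act: 6 Avg: 6 Max: 51956
T: 1 ( 885) P:80 I:250 C:4688051 Min: 3 Act: 7 Avg: 6 Max: 50106
T: 2 ( 886) P:80 I:250 C:4688242 Min: 3 Act: 6 Avg: 6 Max: 51965
T: 3 ( 887) P:80 I:250 C:4688434 Min: 3 Act: 12 Avg: 8 Max: 59
"""

if __name__ == "__main__":
    for thread, worst in threads_over(SAMPLE):
        print(f"thread {thread}: max latency {worst} us")
```

Fed the v6.12.87 numbers, this flags threads 0 to 2 and leaves thread 3 (Max 59) alone. Because the regular expression tolerates variable spacing, it should cope with both the column-aligned and the whitespace-collapsed form of the output.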
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Mike Galbraith @ 2026-05-11 5:50 UTC
To: Lukas Beckmann, Peter Zijlstra, Juri Lelli, Sasha Levin
Cc: regressions, stable, linux-rt-users

On Sun, 2026-05-10 at 22:57 +0200, Lukas Beckmann wrote:
> Hi,

Greetings!

> I am reporting a regression which was introduced by d66792919d4f on 6.12.y.
> Since this commit, cyclictest reports latencies up to 50 milliseconds,
> on kernels with CONFIG_PREEMPT_RT=y.
>
> Steps to reproduce:
> 1. run a load (e.g. stress-ng --cpu 4 --io 2 --vm 2 --vm-bytes 128M)
> 2. run cyclictest (e.g. cyclictest -a -t -m -p 80 -i 250 -d 0)
...
> I found this, because Debian updated its rt kernel from 6.12.74 to 6.12.85.
> The issue was also present with upstream 6.12.85 and HEAD, but not with
> 6.12.74, so I started bisecting and eventually found d66792919d4f.
>
> Is it possible to revert the commit?
>
> I can provide traces or help with testing if needed.

FWIW, my box says this *may* be due to a fix (+follow-ups) that didn't
wander back to stable.

cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
Fixes: 557a6bfc662c ("sched/fair: Add trivial fair server")
Follow-ups:
4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")

Local 6.12-rt tree containing the above (et al) failed to reproduce in
the hour I let it try to, whereas a build excluding only all locally
added fix backports reproduced in fairly short order. Suspects NOT
confirmed in total isolation...

	-Mike
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Sasha Levin @ 2026-05-11 14:21 UTC
To: Peter Zijlstra, Juri Lelli
Cc: Sasha Levin, regressions, stable, linux-rt-users, Lukas Beckmann, Mike Galbraith

On Sun, May 10, 2026 at 10:57:46PM +0200, Lukas Beckmann wrote:
> I am reporting a regression which was introduced by d66792919d4f on 6.12.y.
> Since this commit, cyclictest reports latencies up to 50 milliseconds,
> on kernels with CONFIG_PREEMPT_RT=y.
[...]
> Is it possible to revert the commit?
>
> I can provide traces or help with testing if needed.

Thanks for the detailed report. Before I revert d66792919d4f from 6.12.y,
I'd like to confirm whether the underlying issue is the missing dl_server
rework chain on 6.12.y rather than the revised wakeup rule itself.

Mike's reply notes that his local 6.12-rt tree carrying the following
three commits cannot reproduce, while the same tree without them
reproduces quickly:

  cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
  4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
  a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")

d66792919d4f's upstream commit message explicitly says it relies on the
state established by a3a70caf7906, and none of the three are in 6.12.y.

Could you give those three commits a spin on top of 6.12.y (keeping
d66792919d4f in place) and see whether the latency goes away?

--
Thanks,
Sasha
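Since a backported commit gets a new hash on the stable branch, checking whether a tree already carries the three fixes is easier by subject line than by upstream hash. A minimal sketch, assuming the branch's commit subjects were captured beforehand, one per line (for example from `git log --format=%s`); the helper name `missing_backports` is hypothetical, and the subject list is copied from this thread.

```python
# Subjects of the three upstream commits named in the thread
# (the spelling "agressive" is as given there).
REQUIRED = [
    "sched/deadline: Less agressive dl_server handling",
    "sched/deadline: Fix dl_server getting stuck",
    "sched/deadline: Fix dl_server behaviour",
]


def missing_backports(log_subjects, required=REQUIRED):
    """Return the required subjects that do not appear in log_subjects.

    log_subjects is text holding one commit subject per line.
    """
    present = {line.strip() for line in log_subjects.splitlines()}
    return [subject for subject in required if subject not in present]


if __name__ == "__main__":
    # Hypothetical log excerpt: only the first fix has been applied.
    example_log = """\
sched/deadline: Use revised wakeup rule for dl_server
sched/deadline: Less agressive dl_server handling
"""
    for subject in missing_backports(example_log):
        print("missing:", subject)
```

An exact subject match is deliberately strict: a backport whose subject was edited would be reported as missing, which errs on the side of a manual look.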
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Mike Galbraith @ 2026-05-11 15:30 UTC
To: Sasha Levin, Peter Zijlstra, Juri Lelli
Cc: regressions, stable, linux-rt-users, Lukas Beckmann

On Mon, 2026-05-11 at 10:21 -0400, Sasha Levin wrote:
>
> Mike's reply notes that his local 6.12-rt tree carrying the following
> three commits cannot reproduce, while the same tree without them
> reproduces quickly:
>
> cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
> 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
> a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")
>
> d66792919d4f's upstream commit message explicitly says it relies on the
> state established by a3a70caf7906, and none of the three are in 6.12.y.
>
> Could you give those three commits a spin on top of 6.12.y (keeping
> d66792919d4f in place) and see whether the latency goes away?

I've meanwhile tried those three alone, and the size XXL hits my box
readily reproduces in virgin source do indeed go away. 'course there
may be another shoe, so...

	-Mike
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Lukas Beckmann @ 2026-05-11 22:08 UTC
To: Sasha Levin, Peter Zijlstra, Juri Lelli
Cc: regressions, stable, linux-rt-users, Lukas Beckmann, Mike Galbraith

On 5/11/26 16:21, Sasha Levin wrote:
> Thanks for the detailed report. Before I revert d66792919d4f from 6.12.y,
> I'd like to confirm whether the underlying issue is the missing dl_server
> rework chain on 6.12.y rather than the revised wakeup rule itself.
>
> Mike's reply notes that his local 6.12-rt tree carrying the following
> three commits cannot reproduce, while the same tree without them
> reproduces quickly:
>
> cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
> 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
> a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")
>
> d66792919d4f's upstream commit message explicitly says it relies on the
> state established by a3a70caf7906, and none of the three are in 6.12.y.
>
> Could you give those three commits a spin on top of 6.12.y (keeping
> d66792919d4f in place) and see whether the latency goes away?

If I apply the three commits on 6.12.y, the latencies indeed go away.
This has been running for a few hours now, whereas with plain 6.12.y the
latencies showed up after 30 minutes at most.
I will leave this running.

Note:
I also tried applying only cccb45d7c429 ("sched/deadline: Less agressive
dl_server handling") before, and that also seems to fix the issue.

Thanks
Lukas
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Lukas Beckmann @ 2026-05-16 19:50 UTC
To: Sasha Levin, Peter Zijlstra, Juri Lelli
Cc: regressions, stable, linux-rt-users, Mike Galbraith

On Tue, May 12, 2026 at 12:08:49AM +0200, Lukas Beckmann wrote:
>
> On 5/11/26 16:21, Sasha Levin wrote:
> > Thanks for the detailed report. Before I revert d66792919d4f from 6.12.y,
> > I'd like to confirm whether the underlying issue is the missing dl_server
> > rework chain on 6.12.y rather than the revised wakeup rule itself.
> >
> > Mike's reply notes that his local 6.12-rt tree carrying the following
> > three commits cannot reproduce, while the same tree without them
> > reproduces quickly:
> >
> > cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
> > 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
> > a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")
> >
> > d66792919d4f's upstream commit message explicitly says it relies on the
> > state established by a3a70caf7906, and none of the three are in 6.12.y.
> >
> > Could you give those three commits a spin on top of 6.12.y (keeping
> > d66792919d4f in place) and see whether the latency goes away?
>
> If I apply the three commits on 6.12.y, the latencies indeed go away.
> This is running for a few hours now, and the latencies showed up after 30
> minutes tops, with plain 6.12.y before.
> I will leave this running.
>
> Note:
> I also tried applying only cccb45d7c429 ("sched/deadline: Less agressive
> dl_server handling") before, and that also seems to fix the issue.
>
> Thanks
> Lukas

Hey Sasha,

Cyclictest is still running and looking good (latency-wise).
How should we proceed?

Thanks
Lukas
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Thorsten Leemhuis @ 2026-05-21 7:32 UTC
To: Sasha Levin
Cc: regressions, Juri Lelli, Peter Zijlstra, stable, linux-rt-users, Mike Galbraith, Lukas Beckmann

On 5/16/26 21:50, Lukas Beckmann wrote:
> On Tue, May 12, 2026 at 12:08:49AM +0200, Lukas Beckmann wrote:
>> On 5/11/26 16:21, Sasha Levin wrote:
>>> Thanks for the detailed report. Before I revert d66792919d4f from 6.12.y,
>>> I'd like to confirm whether the underlying issue is the missing dl_server
>>> rework chain on 6.12.y rather than the revised wakeup rule itself.
>>>
>>> Mike's reply notes that his local 6.12-rt tree carrying the following
>>> three commits cannot reproduce, while the same tree without them
>>> reproduces quickly:
>>>
>>> cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
>>> 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
>>> a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")
>>>
>>> d66792919d4f's upstream commit message explicitly says it relies on the
>>> state established by a3a70caf7906, and none of the three are in 6.12.y.
>>>
>>> Could you give those three commits a spin on top of 6.12.y (keeping
>>> d66792919d4f in place) and see whether the latency goes away?
>>
>> If I apply the three commits on 6.12.y, the latencies indeed go away.
>> This is running for a few hours now, and the latencies showed up after 30
>> minutes tops, with plain 6.12.y before.
>> I will leave this running.
>
> Cyclictest is still running and looking good (latency-wise).
> How should we proceed?

Sasha, just wondering: is this still in your queue? It sounds like a
clear case to pick those three up for 6.12.y (everybody: please correct
me if I'm wrong). Or are you busy and should we ask Greg to pick them up?

Ciao, Thorsten
* Re: [REGRESSION] 6.12.y: d66792919d4f (sched/deadline: Use revised wakeup rule for dl_server) causes latencies up to 50ms with PREEMPT_RT

From: Sasha Levin @ 2026-05-21 16:49 UTC
To: Thorsten Leemhuis
Cc: regressions, Juri Lelli, Peter Zijlstra, stable, linux-rt-users, Mike Galbraith, Lukas Beckmann

On Thu, May 21, 2026 at 09:32:32AM +0200, Thorsten Leemhuis wrote:
>On 5/16/26 21:50, Lukas Beckmann wrote:
>> On Tue, May 12, 2026 at 12:08:49AM +0200, Lukas Beckmann wrote:
>>> On 5/11/26 16:21, Sasha Levin wrote:
>>>> Thanks for the detailed report. Before I revert d66792919d4f from 6.12.y,
>>>> I'd like to confirm whether the underlying issue is the missing dl_server
>>>> rework chain on 6.12.y rather than the revised wakeup rule itself.
>>>>
>>>> Mike's reply notes that his local 6.12-rt tree carrying the following
>>>> three commits cannot reproduce, while the same tree without them
>>>> reproduces quickly:
>>>>
>>>> cccb45d7c429 ("sched/deadline: Less agressive dl_server handling")
>>>> 4ae8d9aa9f9d ("sched/deadline: Fix dl_server getting stuck")
>>>> a3a70caf7906 ("sched/deadline: Fix dl_server behaviour")
>>>>
>>>> d66792919d4f's upstream commit message explicitly says it relies on the
>>>> state established by a3a70caf7906, and none of the three are in 6.12.y.
>>>>
>>>> Could you give those three commits a spin on top of 6.12.y (keeping
>>>> d66792919d4f in place) and see whether the latency goes away?
>>>
>>> If I apply the three commits on 6.12.y, the latencies indeed go away.
>>> This is running for a few hours now, and the latencies showed up after 30
>>> minutes tops, with plain 6.12.y before.
>>> I will leave this running.
>>
>> Cyclictest is still running and looking good (latency-wise).
>> How should we proceed?
>
>Sasha, just wondering: is this still in your queue? It sounds like a
>clear case to pick those three up for 6.12.y (everybody: please correct
>me if I'm wrong). Or are you busy and should we ask Greg to pick them up?

Nope, sorry, I got sidetracked by a large series of commits from the merge
window.

I'll plan to queue this up for the next release. It'll be even better if
someone can send us a more "official" backport request, ideally with a
tested-by too :)

--
Thanks,
Sasha