* SCHED_DEADLINE tasks causing WARNING at kernel/sched/sched.h message
@ 2025-09-02 14:06 Marcel Ziswiler
2025-09-02 16:49 ` Marcel Ziswiler
0 siblings, 1 reply; 2+ messages in thread
From: Marcel Ziswiler @ 2025-09-02 14:06 UTC
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vineeth Pillai,
Daniel Bristot de Oliveira, Luca Abeni
Hi
As part of our trustable work [1], we also run a lot of real-time scheduler (SCHED_DEADLINE) tests on the
mainline Linux kernel (v6.16.2 in the case reported below). Apart from a regression that was identified and
recently fixed [2], the Linux scheduler proves quite capable of scheduling deadline tasks down to a granularity
of 5 ms on both of our test systems (amd64-based Intel NUCs and aarch64-based RADXA ROCK5Bs).
However, very rarely (e.g. only once over the course of the 2.4 billion tests we ran last week on ROCK5B), we
get the following warning in the kernel log:
Aug 23 18:09:37 localhost kernel: ------------[ cut here ]------------
Aug 23 18:09:37 localhost kernel: WARNING: CPU: 7 PID: 259143 at kernel/sched/sched.h:1787 __task_rq_lock+0xac/0xfc
Aug 23 18:09:37 localhost kernel: Modules linked in: ghash_generic overlay snd_soc_hdmi_codec panthor rockchipdrm pwm_fan rfkill_gpio phy_rockchip_usbdp cdc_ether typec synopsys_hdmirx display_connector snd_soc_simple_card hantro_vpu usbnet phy_rockchip_naneng_combphy phy_rockchip_samsung_hdptx rockchip_thermal snd_soc_es8316 drm_gpuvm rtc_hym8563 rk805_pwrkey rockchip_saradc drm_exec industrialio_triggered_buffer drm_shmem_helper kfifo_buf dw_hdmi_qp spi_rockchip_sfc analogix_dp gpu_sched dw_mipi_dsi drm_dp_aux_bus dw_hdmi cec drm_display_helper snd_soc_rockchip_i2s_tdm cfg80211 drm_client_lib r8152 drm_dma_helper drm_kms_helper v4l2_vp9 mii v4l2_h264 v4l2_jpeg v4l2_mem2mem snd_soc_audio_graph_card snd_soc_simple_card_utils rfkill pci_endpoint_test drm dm_mod snd_aloop backlight dax
Aug 23 18:09:37 localhost kernel: CPU: 7 UID: 0 PID: 259143 Comm: stress-ng-cpu-s Not tainted 6.16.2-dirty #1 PREEMPT_RT
Aug 23 18:09:37 localhost kernel: Hardware name: radxa Radxa ROCK 5 Model B/Radxa ROCK 5 Model B, BIOS 2024.07-00925-g459560000736 07/01/2024
Aug 23 18:09:37 localhost kernel: pstate: 804000c9 (Nzcv daIF +PAN -UAO -TCO -DIT -SSBS BTYPE=--)
Aug 23 18:09:37 localhost kernel: pc : __task_rq_lock+0xac/0xfc
Aug 23 18:09:37 localhost kernel: lr : __task_rq_lock+0x54/0xfc
Aug 23 18:09:37 localhost kernel: sp : ffff80009dd83b80
Aug 23 18:09:37 localhost kernel: x29: ffff80009dd83b80 x28: ffff0001e4855780 x27: 0000000000000000
Aug 23 18:09:37 localhost kernel: x26: 0000000000000000 x25: 0000000000000000 x24: ffffb2e0b0d37c60
Aug 23 18:09:37 localhost kernel: x23: ffff80009dd83bc8 x22: ffff0001e4855780 x21: ffff0001e4855780
Aug 23 18:09:37 localhost kernel: x20: ffffb2e0b0c2fe40 x19: ffff0002fef2ae40 x18: 0000000000000000
Aug 23 18:09:37 localhost kernel: x17: 0000000000000000 x16: 0000000000000000 x15: 0000ffffd2453db8
Aug 23 18:09:37 localhost kernel: x14: ffff0001e4855800 x13: 0000000000000000 x12: 0000000000000000
Aug 23 18:09:37 localhost kernel: x11: 0000000000000165 x10: ffff000100a5fce8 x9 : 0000000000000000
Aug 23 18:09:37 localhost kernel: x8 : 0000000000000000 x7 : ffff0001e4855800 x6 : 000000000000008f
Aug 23 18:09:37 localhost kernel: x5 : 0000000000000000 x4 : 0000000000000000 x3 : 0000000000000000
Aug 23 18:09:37 localhost kernel: x2 : 0000000000000001 x1 : ffff0002fef17258 x0 : ffffb2e0b0d5bfb0
Aug 23 18:09:37 localhost kernel: Call trace:
Aug 23 18:09:37 localhost kernel: __task_rq_lock+0xac/0xfc (P)
Aug 23 18:09:37 localhost kernel: rt_mutex_setprio+0x6c/0x498
Aug 23 18:09:37 localhost kernel: rt_mutex_slowunlock+0x17c/0x310
Aug 23 18:09:37 localhost kernel: rt_spin_unlock+0x7c/0x90
Aug 23 18:09:37 localhost kernel: cpuset_cpus_allowed+0xd8/0x10c
Aug 23 18:09:37 localhost kernel: __sched_setaffinity+0xb0/0x194
Aug 23 18:09:37 localhost kernel: sched_setaffinity+0x140/0x27c
Aug 23 18:09:37 localhost kernel: __arm64_sys_sched_setaffinity+0xb8/0x180
Aug 23 18:09:37 localhost kernel: invoke_syscall+0x48/0x104
Aug 23 18:09:37 localhost kernel: el0_svc_common.constprop.0+0xc0/0xe0
Aug 23 18:09:37 localhost kernel: do_el0_svc+0x1c/0x28
Aug 23 18:09:37 localhost kernel: el0_svc+0x34/0x104
Aug 23 18:09:37 localhost kernel: el0t_64_sync_handler+0x10c/0x138
Aug 23 18:09:37 localhost kernel: el0t_64_sync+0x198/0x19c
Aug 23 18:09:37 localhost kernel: ---[ end trace 0000000000000000 ]---
This is usually accompanied by our test workload process receiving the SIGXCPU signal, even though it does not
overrun its allocated runtime, at least not on purpose.
We are wondering what could trigger this warning and what the underlying issue might be.
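For context on the SIGXCPU mentioned above: one documented way for a SCHED_DEADLINE thread to receive SIGXCPU is the SCHED_FLAG_DL_OVERRUN flag of sched_setattr(2), which asks the kernel to signal runtime overruns. The report does not say how the test workload configures its tasks, so the following is only a rough sketch under that assumption. The helper names, the 1 ms runtime and the 5 ms deadline/period are hypothetical, and the syscall table only covers the two architectures named in the report.

```python
import ctypes
import os
import platform

SCHED_DEADLINE = 6            # policy number from <linux/sched.h>
SCHED_FLAG_DL_OVERRUN = 0x04  # request SIGXCPU on runtime overrun

# sched_setattr syscall numbers; only these two architectures are covered.
NR_SCHED_SETATTR = {"x86_64": 314, "aarch64": 274}


class SchedAttr(ctypes.Structure):
    """The first 48 bytes of struct sched_attr (the original layout)."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),
        ("sched_priority", ctypes.c_uint32),
        ("sched_runtime", ctypes.c_uint64),   # nanoseconds
        ("sched_deadline", ctypes.c_uint64),  # nanoseconds
        ("sched_period", ctypes.c_uint64),    # nanoseconds
    ]


def make_deadline_attr(runtime_ns, deadline_ns, period_ns, notify_overrun=True):
    """Build (but do not apply) SCHED_DEADLINE parameters."""
    if not 0 < runtime_ns <= deadline_ns <= period_ns:
        raise ValueError("need 0 < runtime <= deadline <= period")
    attr = SchedAttr()
    attr.size = ctypes.sizeof(SchedAttr)
    attr.sched_policy = SCHED_DEADLINE
    attr.sched_flags = SCHED_FLAG_DL_OVERRUN if notify_overrun else 0
    attr.sched_runtime = runtime_ns
    attr.sched_deadline = deadline_ns
    attr.sched_period = period_ns
    return attr


def apply_to_calling_thread(attr):
    """Invoke sched_setattr(0, &attr, 0); usually needs root or CAP_SYS_NICE."""
    nr = NR_SCHED_SETATTR[platform.machine()]
    libc = ctypes.CDLL(None, use_errno=True)
    if libc.syscall(nr, 0, ctypes.byref(attr), 0) != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


# Hypothetical parameters: 1 ms of runtime every 5 ms.
example_attr = make_deadline_attr(1_000_000, 5_000_000, 5_000_000)
```

With such a configuration, a SIGXCPU would normally indicate that the thread consumed more than sched_runtime within one period, which is what makes the signal surprising when the workload stays within its budget.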
We are happy to provide more detailed debugging information, but since full journal logs are usually a couple
hundred MB in size, we are looking for suggestions on what exactly to look at and how.
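Since the full journals are too large to share, one option is to cut out only the warning blocks. The following is a minimal sketch of that idea, assuming the journal has been exported as plain text with one kernel message per line. The function name and the file name in the usage comment are hypothetical.

```python
import re

CUT_HERE = "------------[ cut here ]------------"
END_TRACE = re.compile(r"---\[ end trace [0-9a-f]+ \]---")


def extract_warning_blocks(lines):
    """Yield each run of lines from a 'cut here' marker to its 'end trace' marker.

    A block that is still open when the next 'cut here' marker appears is
    dropped and a new block is started.
    """
    block = None
    for line in lines:
        line = line.rstrip("\n")
        if CUT_HERE in line:
            block = [line]
        elif block is not None:
            block.append(line)
            if END_TRACE.search(line):
                yield block
                block = None


# Possible usage on an exported journal (hypothetical file name):
#
#     with open("journal.txt", errors="replace") as f:
#         for block in extract_warning_blocks(f):
#             print("\n".join(block), end="\n\n")
```

Because the function is a generator that reads line by line, it does not need to hold a journal of a few hundred MB in memory.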
Any help is much appreciated. Thanks!
Cheers
Marcel
[1] https://projects.eclipse.org/projects/technology.tsf
[2] https://lore.kernel.org/all/ce8469c4fb2f3e2ada74add22cce4bfe61fd5bab.camel@codethink.co.uk
[3] https://lore.kernel.org/all/20250715071658.267-1-ziqianlu@bytedance.com
[4] https://git.kernel.org/pub/scm/linux/kernel/git/stable/linux.git/tree/kernel/sched/sched.h?h=v6.16.2#n1785
BTW: because we applied this patch set [3] on top of v6.16.2, the line number moved by 2 lines, so the
WARN_ON line in question is actually the one at [4].
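The line-number adjustment above can be sketched as follows. This is only an illustration: the function name is hypothetical, and it assumes the applied patch set shifts everything at the reported location by a constant number of lines.

```python
import re

# Matches the "at <file>:<line>" part of a kernel WARNING line.
LOCATION = re.compile(r"WARNING: .* at (?P<path>\S+):(?P<line>\d+)")


def upstream_location(warning_line, shift):
    """Map a reported source location back to the unpatched tree.

    `shift` is the number of lines the local patches added above the
    reported line (2 in the case described above).
    """
    match = LOCATION.search(warning_line)
    if match is None:
        return None
    return match.group("path"), int(match.group("line")) - shift


reported = ("Aug 23 18:09:37 localhost kernel: WARNING: CPU: 7 PID: 259143 "
            "at kernel/sched/sched.h:1787")
location = upstream_location(reported, shift=2)
```

For the warning quoted in this report, the result is line 1785 of kernel/sched/sched.h, matching the `#n1785` anchor in [4].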
* Re: SCHED_DEADLINE tasks causing WARNING at kernel/sched/sched.h message
2025-09-02 14:06 SCHED_DEADLINE tasks causing WARNING at kernel/sched/sched.h message Marcel Ziswiler
@ 2025-09-02 16:49 ` Marcel Ziswiler
0 siblings, 0 replies; 2+ messages in thread
From: Marcel Ziswiler @ 2025-09-02 16:49 UTC
To: linux-kernel
Cc: Ingo Molnar, Peter Zijlstra, Juri Lelli, Vineeth Pillai,
Daniel Bristot de Oliveira, Luca Abeni
Hi
On Tue, 2025-09-02 at 16:06 +0200, Marcel Ziswiler wrote:
> As part of our trustable work [1], we also run a lot of real time scheduler (SCHED_DEADLINE) tests on the
> mainline Linux kernel (v6.16.2 in below reported case).
Looking through more logs from earlier test runs, I found similar WARN_ONs dating back as far as v6.15.3. So
it does not look like a "new" issue in that sense.
[snip]
Any help is much appreciated. Thanks!
Cheers
Marcel