Re: [PATCH] sched/deadline: Remove fair-servers from real-time task's bandwidth accounting

All of lore.kernel.org
 help / color / mirror / Atom feed

From: Matteo Martelli <matteo.martelli@codethink.co.uk>
To: Yuri Andriaccio <yurand2000@gmail.com>
Cc: linux-kernel@vger.kernel.org,
	Luca Abeni <luca.abeni@santannapisa.it>,
	Yuri Andriaccio <yuri.andriaccio@santannapisa.it>,
	Ingo Molnar <mingo@redhat.com>,
	Peter Zijlstra <peterz@infradead.org>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>
Subject: Re: [PATCH] sched/deadline: Remove fair-servers from real-time task's bandwidth accounting
Date: Fri, 25 Jul 2025 13:44:51 +0200	[thread overview]
Message-ID: <86013fcc38e582ab89b9b7e4864cc1bd@codethink.co.uk> (raw)
In-Reply-To: <20250721111131.309388-1-yurand2000@gmail.com>

Hi Yuri,

On Mon, 21 Jul 2025 13:11:31 +0200, Yuri Andriaccio <yurand2000@gmail.com> wrote:
> Fair-servers are currently used in place of the old RT_THROTTLING mechanism to
> prevent the starvation of SCHED_OTHER (and other lower priority) tasks when
> real-time FIFO/RR processes are trying to fully utilize the CPU. To allow the
> RT_THROTTLING mechanism, the maximum allocatable bandwidth for real-time tasks
> has been limited to 95% of the CPU-time.
> 
> The RT_THROTTLING mechanism is now removed in favor of fair-servers, which are
> currently set to use, as expected, 5% of the CPU-time. Still, they share the
> same bandwidth that allows to run real-time tasks, and which is still set to 95%
> of the total CPU-time. This means that by removing the RT_THROTTLING mechanism,
> the bandwidth remaning for real-time SCHED_DEADLINE tasks and other dl-servers
> (FIFO/RR are not affected) is only 90%.
> ...

I noticed the same issue while recently addressing a stress-ng bug [1].

> This patch reclaims the 5% lost SCHED_DEADLINE CPU-time (FIFO/RR are not
> affected, there is no admission test there to perform), by accounting the
> fair-server's bandwidth separately. After this patch, the above script runs
> successfully also when allocating 95% bw per task/cpu.
> ...

I've tested your patch on top of 6.16.0-rc7-00091-gdd9c17322a6c,
x86 defconfig, on qemu with 4 virtual CPUs with a debian sid image [2].
I used stress-ng at latest master commit (35e617844) so that it includes
the fix to run the stress workers at full deadline bandwidth. The patch
seems to work properly but I encountered some kernel warnings while
testing. See the details below.


Checked default settings after boot (95% global BW, 5% fair_server BW):
----------
root@localhost:~# cat /proc/sys/kernel/sched_rt_period_us
1000000
root@localhost:~# cat /proc/sys/kernel/sched_rt_runtime_us
950000
root@localhost:~# cat /sys/kernel/debug/sched/fair_server/cpu*/period
1000000000
1000000000
1000000000
1000000000
root@localhost:~# cat /sys/kernel/debug/sched/fair_server/cpu*/runtime
50000000
50000000
50000000
50000000
----------

Launched stress-ng command using 95% BW: 95ms runtime, 100ms period, 100ms
deadline. All 4 stress processes ran properly:
----------
root@localhost:~# ./stress-ng/stress-ng --sched deadline --sched-runtime 95000000 --sched-period 100000000 --sched-deadline 100000000 --cpu 0 --verbose --metrics --timeout 10s
stress-ng: debug: [351] invoked with './stress-ng/stress-ng --sched deadline --sched-runtime 95000000 --sched-period 100000000 --sched-deadline 100000000 --cpu 0 --verbose --metrics --timeout 10s' by user 0 'root'
stress-ng: debug: [351] stress-ng 0.19.02 g35e617844d9b
stress-ng: debug: [351] system: Linux localhost 6.16.0-rc7-00364-gfe48072cc20f #1 SMP PREEMPT_DYNAMIC Thu Jul 24 18:44:39 CEST 2025 x86_64, gcc 14.2.0, glibc 2.41, little endian
stress-ng: debug: [351] RAM total: 1.9G, RAM free: 1.8G, swap free: 0.0
stress-ng: debug: [351] temporary file path: '/root', filesystem type: ext4 (187735 blocks available, QEMU HARDDISK)
stress-ng: debug: [351] 4 processors online, 4 processors configured
stress-ng: info:  [351] setting to a 10 secs run per stressor
stress-ng: debug: [351] CPU data cache: L1: 64K, L2: 512K, L3: 16384K
stress-ng: debug: [351] cache allocate: shared cache buffer size: 16384K
stress-ng: info:  [351] dispatching hogs: 4 cpu
stress-ng: debug: [351] starting stressors
stress-ng: debug: [352] deadline: setting scheduler class 'sched' (period=100000000, runtime=95000000, deadline=100000000)
stress-ng: debug: [353] deadline: setting scheduler class 'sched' (period=100000000, runtime=95000000, deadline=100000000)
stress-ng: debug: [352] cpu: [352] started (instance 0 on CPU 3)
stress-ng: debug: [353] cpu: [353] started (instance 1 on CPU 2)
stress-ng: debug: [354] deadline: setting scheduler class 'sched' (period=100000000, runtime=95000000, deadline=100000000)
stress-ng: debug: [354] cpu: [354] started (instance 2 on CPU 1)
stress-ng: debug: [351] 4 stressors started
stress-ng: debug: [355] deadline: setting scheduler class 'sched' (period=100000000, runtime=95000000, deadline=100000000)
stress-ng: debug: [355] cpu: [355] started (instance 3 on CPU 3)
stress-ng: debug: [352] cpu: using method 'all'
stress-ng: debug: [355] cpu: [355] exited (instance 3 on CPU 0)
stress-ng: debug: [354] cpu: [354] exited (instance 2 on CPU 1)
stress-ng: debug: [353] cpu: [353] exited (instance 1 on CPU 2)
stress-ng: debug: [352] cpu: [352] exited (instance 0 on CPU 3)
stress-ng: debug: [351] cpu: [352] terminated (success)
stress-ng: debug: [351] cpu: [353] terminated (success)
stress-ng: debug: [351] cpu: [354] terminated (success)
stress-ng: debug: [351] cpu: [355] terminated (success)
stress-ng: debug: [351] metrics-check: all stressor metrics validated and sane
stress-ng: metrc: [351] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [351]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [351] cpu               75655     10.00     38.01      0.00      7564.01        1990.31        95.01          8460
stress-ng: info:  [351] skipped: 0
stress-ng: info:  [351] passed: 4: cpu (4)
stress-ng: info:  [351] failed: 0
stress-ng: info:  [351] metrics untrustworthy: 0
stress-ng: info:  [351] successful run completed in 10.10 secs
----------

Launched stress-ng command using 96% BW: 96ms runtime, 100ms period, 100ms
deadline. The last spawn stress process failed to execute:
"stress-ng: warn: [357] cpu: [361] aborted early, out of system resources"
(without the patch, stress-ng produced the same error but with BW > 90%):
----------
root@localhost:~# ./stress-ng/stress-ng --sched deadline --sched-runtime 96000000 --sched-period 100000000 --sched-deadline 100000000 --cpu 0 --verbose --metrics --timeout 10s
stress-ng: debug: [357] invoked with './stress-ng/stress-ng --sched deadline --sched-runtime 96000000 --sched-period 100000000 --sched-deadline 100000000 --cpu 0 --verbose --metrics --timeout 10s' by user 0 'root'
stress-ng: debug: [357] stress-ng 0.19.02 g35e617844d9b
stress-ng: debug: [357] system: Linux localhost 6.16.0-rc7-00364-gfe48072cc20f #1 SMP PREEMPT_DYNAMIC Thu Jul 24 18:44:39 CEST 2025 x86_64, gcc 14.2.0, glibc 2.41, little endian
stress-ng: debug: [357] RAM total: 1.9G, RAM free: 1.8G, swap free: 0.0
stress-ng: debug: [357] temporary file path: '/root', filesystem type: ext4 (187735 blocks available, QEMU HARDDISK)
stress-ng: debug: [357] 4 processors online, 4 processors configured
stress-ng: info:  [357] setting to a 10 secs run per stressor
stress-ng: debug: [357] CPU data cache: L1: 64K, L2: 512K, L3: 16384K
stress-ng: debug: [357] cache allocate: shared cache buffer size: 16384K
stress-ng: info:  [357] dispatching hogs: 4 cpu
stress-ng: debug: [357] starting stressors
stress-ng: debug: [358] deadline: setting scheduler class 'sched' (period=100000000, runtime=96000000, deadline=100000000)
stress-ng: debug: [359] deadline: setting scheduler class 'sched' (period=100000000, runtime=96000000, deadline=100000000)
stress-ng: debug: [360] deadline: setting scheduler class 'sched' (period=100000000, runtime=96000000, deadline=100000000)
stress-ng: debug: [358] cpu: [358] started (instance 0 on CPU 1)
stress-ng: debug: [359] cpu: [359] started (instance 1 on CPU 2)
stress-ng: debug: [360] cpu: [360] started (instance 2 on CPU 0)
stress-ng: debug: [357] 4 stressors started
stress-ng: debug: [361] deadline: setting scheduler class 'sched' (period=100000000, runtime=96000000, deadline=100000000)
stress-ng: debug: [358] cpu: using method 'all'
stress-ng: debug: [359] cpu: [359] exited (instance 1 on CPU 3)
stress-ng: debug: [360] cpu: [360] exited (instance 2 on CPU 2)
stress-ng: debug: [358] cpu: [358] exited (instance 0 on CPU 0)
stress-ng: debug: [357] cpu: [358] terminated (success)
stress-ng: debug: [357] cpu: [359] terminated (success)
stress-ng: debug: [357] cpu: [360] terminated (success)
stress-ng: warn:  [357] cpu: [361] aborted early, out of system resources
stress-ng: debug: [357] cpu: [361] terminated (no resources)
stress-ng: debug: [357] metrics-check: all stressor metrics validated and sane
stress-ng: metrc: [357] stressor       bogo ops real time  usr time  sys time   bogo ops/s     bogo ops/s CPU used per       RSS Max
stress-ng: metrc: [357]                           (secs)    (secs)    (secs)   (real time) (usr+sys time) instance (%)          (KB)
stress-ng: metrc: [357] cpu               58940     10.00     28.80      0.00      5892.97        2046.02        96.01          8444
stress-ng: info:  [357] skipped: 1: cpu (1)
stress-ng: info:  [357] passed: 3: cpu (3)
stress-ng: info:  [357] failed: 0
stress-ng: info:  [357] metrics untrustworthy: 0
stress-ng: info:  [357] successful run completed in 10.10 secs
----------

Trying to increase fair_server BW without reducing the global BW, failed as
expected:
----------
root@localhost:~# echo 60000000 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
-bash: echo: write error: Device or resource busy
----------


Trying to increase fair_server BW after reducing the global BW, succeeded:
----------
root@localhost:~# echo 940000 > /proc/sys/kernel/sched_rt_runtime_us
root@localhost:~# echo 60000000 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
root@localhost:~# echo 60000000 > /sys/kernel/debug/sched/fair_server/cpu1/runtime
root@localhost:~# echo 60000000 > /sys/kernel/debug/sched/fair_server/cpu2/runtime
root@localhost:~# echo 60000000 > /sys/kernel/debug/sched/fair_server/cpu3/runtime
root@localhost:~#
----------


Checked that settings had been applied correctly:
----------
root@localhost:~# cat /proc/sys/kernel/sched_rt_runtime_us
940000
root@localhost:~# cat /sys/kernel/debug/sched/fair_server/cpu*/runtime
60000000
60000000
60000000
60000000
----------

NOTE: After setting fair_server runtime for cpu2, I've encointered the
following warning, however it also happened after setting the runtime
for cpu0 during another instance of the same test:
----------
[  196.647897] ------------[ cut here ]------------
[  196.649198] WARNING: kernel/sched/deadline.c:284 at dl_rq_change_utilization+0x5f/0x1d0, CPU#1: bash/249
[  196.650244] Modules linked in:
[  196.650613] CPU: 1 UID: 0 PID: 249 Comm: bash Not tainted 6.16.0-rc7-00364-gfe48072cc20f #1 PREEMPT(voluntary)
[  196.651698] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-1-1 04/01/2014
[  196.652834] RIP: 0010:dl_rq_change_utilization+0x5f/0x1d0
[  196.653747] Code: 01 00 00 4d 89 88 d0 08 00 00 48 83 c4 18 c3 cc cc cc cc 90 0f 0b 90 49 c7 80 d0 08 00 00 00 00 00 00 48 85 d2 74 dc 31 c0 90 <0f> 0b 90 49 39 c1 73 d1 e9 04 01 00 00 f6 46 53 10 75 73 48 8b 87
[  196.655791] RSP: 0018:ffffa47b003e3d60 EFLAGS: 00010093
[  196.656328] RAX: 0000000000000000 RBX: 000000003b9aca00 RCX: ffff9f25bdd292b0
[  196.657031] RDX: 000000000000cccc RSI: ffff9f25bdd292b0 RDI: ffff9f25bdd289c0
[  196.657742] RBP: ffff9f25bdd289c0 R08: ffff9f25bdd289c0 R09: 000000000000f5c2
[  196.658453] R10: 000000000000000a R11: 0fffffffffffffff R12: ffff9f25bdd292b0
[  196.659077] R13: 0000000000001000 R14: ffff9f25411a3840 R15: ffff9f25411a3800
[  196.659752] FS:  00007f3c521c4740(0000) GS:ffff9f26249cb000(0000) knlGS:0000000000000000
[  196.660321] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  196.660740] CR2: 000055d43eae1a58 CR3: 0000000005281000 CR4: 00000000000006f0
[  196.661210] Call Trace:
[  196.661383]  <TASK>
[  196.661605]  ? dl_bw_cpus+0x74/0x80
[  196.661840]  dl_server_apply_params+0x1b3/0x1e0
[  196.662142]  sched_fair_server_write.isra.0+0x10a/0x1a0
[  196.662523]  full_proxy_write+0x54/0x90
[  196.662780]  vfs_write+0xc9/0x480
[  196.663006]  ksys_write+0x6e/0xe0
[  196.663229]  do_syscall_64+0xa4/0x260
[  196.663513]  entry_SYSCALL_64_after_hwframe+0x77/0x7f
[  196.663845] RIP: 0033:0x7f3c52256687
[  196.664084] Code: 48 89 fa 4c 89 df e8 58 b3 00 00 8b 93 08 03 00 00 59 5e 48 83 f8 fc 74 1a 5b c3 0f 1f 84 00 00 00 00 00 48 8b 44 24 10 0f 05 <5b> c3 0f 1f 80 00 00 00 00 83 e2 39 83 fa 08 75 de e8 23 ff ff ff
[  196.666102] RSP: 002b:00007ffde87d9720 EFLAGS: 00000202 ORIG_RAX: 0000000000000001
[  196.666631] RAX: ffffffffffffffda RBX: 00007f3c521c4740 RCX: 00007f3c52256687
[  196.667099] RDX: 0000000000000009 RSI: 000055d43eb74a50 RDI: 0000000000000001
[  196.667596] RBP: 000055d43eb74a50 R08: 0000000000000000 R09: 0000000000000000
[  196.668057] R10: 0000000000000000 R11: 0000000000000202 R12: 0000000000000009
[  196.668562] R13: 00007f3c523af5c0 R14: 00007f3c523ace80 R15: 0000000000000000
[  196.669448]  </TASK>
[  196.669795] ---[ end trace 0000000000000000 ]---
----------

I've also encountered this other warning after setting cpu4 during
another instance of the same test:
----------
[  177.452335] ------------[ cut here ]------------
[  177.453820] WARNING: kernel/sched/deadline.c:257 at task_non_contending+0x259/0x3b0, CPU#1: swapper/1/0
[  177.455214] Modules linked in:
[  177.455555] CPU: 1 UID: 0 PID: 0 Comm: swapper/1 Tainted: G        W           6.16.0-rc7-00364-gfe48072cc20f #1 PREEMPT(voluntary)
[  177.456841] Tainted: [W]=WARN
[  177.457192] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-1-1 04/01/2014
[  177.458202] RIP: 0010:task_non_contending+0x259/0x3b0
[  177.458751] Code: 4c 89 0c 24 e8 b8 46 49 00 4c 8b 0c 24 e9 6b fe ff ff f6 43 53 10 0f 85 57 ff ff ff 49 8b 80 c8 08 00 00 48 2b 43 30 73 06 90 <0f> 0b 90 31 c0 49 63 90 28 0b 00 00 49 89 80 c8 08 00 00 48 c7 c0
[  177.460760] RSP: 0018:ffffa88f400e8ea0 EFLAGS: 00010087
[  177.461352] RAX: ffffffffffffd70a RBX: ffff8c8b7dca92b0 RCX: 0000000000000001
[  177.462121] RDX: fffffffffffdf33f RSI: 0000000003938700 RDI: ffff8c8b7dca92b0
[  177.462884] RBP: ffff8c8b7dca89c0 R08: ffff8c8b7dca89c0 R09: 0000000000000002
[  177.463655] R10: ffff8c8b7dca89c0 R11: 0000000000000000 R12: ffffffffa7b55050
[  177.464403] R13: 0000000000000002 R14: ffff8c8b7dca92b0 R15: ffff8c8b7dc9bb00
[  177.465124] FS:  0000000000000000(0000) GS:ffff8c8bd3dcb000(0000) knlGS:0000000000000000
[  177.465775] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  177.466231] CR2: 000055c7394fa93c CR3: 0000000006855000 CR4: 00000000000006f0
[  177.466757] Call Trace:
[  177.466952]  <IRQ>
[  177.467104]  dl_task_timer+0x3ca/0x830
[  177.467373]  __hrtimer_run_queues+0x12e/0x2a0
[  177.467686]  hrtimer_interrupt+0xf7/0x220
[  177.468000]  __sysvec_apic_timer_interrupt+0x53/0x100
[  177.468358]  sysvec_apic_timer_interrupt+0x66/0x80
[  177.468701]  </IRQ>
[  177.468866]  <TASK>
[  177.469047]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  177.469413] RIP: 0010:pv_native_safe_halt+0xf/0x20
[  177.469759] Code: 06 86 00 c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d f5 97 1f 00 fb f4 <c3> cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90
[  177.471123] RSP: 0018:ffffa88f400a3ed8 EFLAGS: 00000212
[  177.471491] RAX: ffff8c8bd3dcb000 RBX: ffff8c8b01249d80 RCX: ffff8c8b02399701
[  177.472013] RDX: 4000000000000000 RSI: 0000000000000083 RDI: 000000000000bc4c
[  177.472511] RBP: 0000000000000001 R08: 000000000000bc4c R09: ffff8c8b7dca49d0
[  177.473021] R10: 0000002954e5b780 R11: 0000000000000002 R12: 0000000000000000
[  177.473518] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  177.474058]  ? ct_kernel_exit.constprop.0+0x63/0xf0
[  177.474404]  default_idle+0x9/0x10
[  177.474649]  default_idle_call+0x2b/0x100
[  177.474952]  do_idle+0x1d0/0x230
[  177.475186]  cpu_startup_entry+0x24/0x30
[  177.475466]  start_secondary+0xf3/0x100
[  177.475745]  common_startup_64+0x13e/0x148
[  177.476079]  </TASK>
[  177.476241] ---[ end trace 0000000000000000 ]---
----------

Beside the warnings the new max BW it was applied correctly, and re-running
stress-ng with 95% BW it failed (last spawn process aborted), while re-running
stress-ng with 94% BW it passed (all processes ran).

Trying to reset global BW to 95% before decreasing fair_server BW failed as
expected:
----------
root@localhost:~# echo 950000 > /proc/sys/kernel/sched_rt_runtime_us
-bash: echo: write error: Device or resource busy
----------

Setting back global BW to 95% after decreasing fair_server BW back to 5%
succeeded:
----------
root@localhost:~# echo 50000000 > /sys/kernel/debug/sched/fair_server/cpu0/runtime
root@localhost:~# echo 50000000 > /sys/kernel/debug/sched/fair_server/cpu1/runtime
root@localhost:~# echo 50000000 > /sys/kernel/debug/sched/fair_server/cpu2/runtime
root@localhost:~# echo 50000000 > /sys/kernel/debug/sched/fair_server/cpu3/runtime
root@localhost:~# echo 950000 > /proc/sys/kernel/sched_rt_runtime_us
root@localhost:~#
----------

Checked that settings had been applied correctly:
----------
root@localhost:~# cat /proc/sys/kernel/sched_rt_runtime_us
950000
root@localhost:~# cat /sys/kernel/debug/sched/fair_server/cpu*/runtime
50000000
50000000
50000000
50000000
----------

After a few seconds I've enountered yet another warning:
----------
[  749.612165] ------------[ cut here ]------------
[  749.613416] WARNING: kernel/sched/deadline.c:245 at task_contending+0x167/0x190, CPU#3: swapper/3/0
[  749.614406] Modules linked in:
[  749.614783] CPU: 3 UID: 0 PID: 0 Comm: swapper/3 Tainted: G        W           6.16.0-rc7-00364-gfe48072cc20f #1 PREEMPT(voluntary)
[  749.616070] Tainted: [W]=WARN
[  749.616398] Hardware name: QEMU Standard PC (Q35 + ICH9, 2009), BIOS Arch Linux 1.17.0-1-1 04/01/2014
[  749.617383] RIP: 0010:task_contending+0x167/0x190
[  749.617913] Code: 48 83 c4 08 e9 1a 56 49 00 83 e1 04 0f 85 7a ff ff ff c3 cc cc cc cc 90 0f 0b 90 e9 39 ff ff ff 90 0f 0b 90 e9 f5 fe ff ff 90 <0f> 0b 90 e9 f9 fe ff ff 90 0f 0b 90 eb 91 48 c7 c6 50 4c ef 96 48
[  749.621091] RSP: 0018:ffffa47b00140de8 EFLAGS: 00010087
[  749.621685] RAX: ffff9f25bdda89c0 RBX: ffff9f25bdda92b0 RCX: 000000000000f5c2
[  749.622470] RDX: 000000000000f5c2 RSI: 0000000000000001 RDI: 0000000000000000
[  749.623243] RBP: 0000000000000001 R08: 00000000041d4a50 R09: 0000000000000002
[  749.623935] R10: 0000000000000001 R11: ffff9f2541e28010 R12: 0000000000000001
[  749.624526] R13: ffff9f25bdda8a40 R14: 0000000000000000 R15: 0000000000200b20
[  749.625066] FS:  0000000000000000(0000) GS:ffff9f2624acb000(0000) knlGS:0000000000000000
[  749.625676] CS:  0010 DS: 0000 ES: 0000 CR0: 0000000080050033
[  749.626090] CR2: 00007f3c523df18c CR3: 00000000050c8000 CR4: 00000000000006f0
[  749.626639] Call Trace:
[  749.626825]  <IRQ>
[  749.626980]  enqueue_dl_entity+0x401/0x730
[  749.627281]  dl_server_start+0x35/0x80
[  749.627588]  enqueue_task_fair+0x20c/0x820
[  749.627891]  enqueue_task+0x2c/0x70
[  749.628150]  ttwu_do_activate+0x6e/0x210
[  749.628439]  try_to_wake_up+0x249/0x630
[  749.628741]  ? __pfx_hrtimer_wakeup+0x10/0x10
[  749.629062]  hrtimer_wakeup+0x1d/0x30
[  749.629332]  __hrtimer_run_queues+0x12e/0x2a0
[  749.629712]  hrtimer_interrupt+0xf7/0x220
[  749.630005]  __sysvec_apic_timer_interrupt+0x53/0x100
[  749.630373]  sysvec_apic_timer_interrupt+0x66/0x80
[  749.630810]  </IRQ>
[  749.630971]  <TASK>
[  749.631131]  asm_sysvec_apic_timer_interrupt+0x1a/0x20
[  749.631500] RIP: 0010:pv_native_safe_halt+0xf/0x20
[  749.631821] Code: 06 86 00 c3 cc cc cc cc 0f 1f 00 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 90 f3 0f 1e fa eb 07 0f 00 2d f5 97 1f 00 fb f4 <c3> cc cc cc cc 66 2e 0f 1f 84 00 00 00 00 00 66 90 90 90 90 90 90
[  749.633041] RSP: 0018:ffffa47b000b3ed8 EFLAGS: 00000202
[  749.633490] RAX: ffff9f2624acb000 RBX: ffff9f254124bb00 RCX: ffffa47b0056fe30
[  749.633996] RDX: 4000000000000000 RSI: 0000000000000083 RDI: 000000000006ae44
[  749.634482] RBP: 0000000000000003 R08: 000000000006ae44 R09: ffff9f25bdda49d0
[  749.634952] R10: 000000ae900bc540 R11: 0000000000000008 R12: 0000000000000000
[  749.635437] R13: 0000000000000000 R14: 0000000000000000 R15: 0000000000000000
[  749.637635]  ? ct_kernel_exit.constprop.0+0x63/0xf0
[  749.638053]  default_idle+0x9/0x10
[  749.638349]  default_idle_call+0x2b/0x100
[  749.638682]  do_idle+0x1d0/0x230
[  749.638903]  cpu_startup_entry+0x24/0x30
[  749.639163]  start_secondary+0xf3/0x100
[  749.639420]  common_startup_64+0x13e/0x148
[  749.639718]  </TASK>
[  749.639870] ---[ end trace 0000000000000000 ]---
----------

I then reduced fair_server runtime to 40000000 (for all CPUs), raised global
sched_rt_runtime_us to 960000, and re-run stress-ng at 96% BW (passed) and at
97% BW (failed: one process aborted). During this last test I've got again the
two warnings mentioned above (kernel/sched/deadline.c:284 at dequeue_task_dl
and WARNING: kernel/sched/deadline.c:245 at task_contending+0x167/0x190), this
while setting the fair_server runtime but also while running stress-ng.

I couldn't reproduce the warning on the same kernel without the patch applied,
by trying different configurations of fair_server BW and global BW, while they
seem very reproducible with the patch applied.

[1]: https://github.com/ColinIanKing/stress-ng/issues/549
[2]: https://cloud.debian.org/images/cloud/sid/daily/20250606-2135/debian-sid-nocloud-amd64-daily-20250606-2135.qcow2

I hope this is helpful.

Best regards,
Matteo Martelli

next prev parent reply	other threads:[~2025-07-25 12:16 UTC|newest]

Thread overview: 5+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-07-21 11:11 [PATCH] sched/deadline: Remove fair-servers from real-time task's bandwidth accounting Yuri Andriaccio
2025-07-22  0:07 ` kernel test robot
2025-07-25 11:44 ` Matteo Martelli [this message]
2025-07-25 15:28   ` Yuri Andriaccio
2025-07-25 17:22     ` Matteo Martelli

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=86013fcc38e582ab89b9b7e4864cc1bd@codethink.co.uk \
    --to=matteo.martelli@codethink.co.uk \
    --cc=bsegall@google.com \
    --cc=dietmar.eggemann@arm.com \
    --cc=juri.lelli@redhat.com \
    --cc=linux-kernel@vger.kernel.org \
    --cc=luca.abeni@santannapisa.it \
    --cc=mgorman@suse.de \
    --cc=mingo@redhat.com \
    --cc=peterz@infradead.org \
    --cc=rostedt@goodmis.org \
    --cc=vincent.guittot@linaro.org \
    --cc=vschneid@redhat.com \
    --cc=yurand2000@gmail.com \
    --cc=yuri.andriaccio@santannapisa.it \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.