linux-kernel.vger.kernel.org archive mirror
* [PATCH 0/5] sched/deadline: Fix GRUB accounting
@ 2025-06-27 11:51 Juri Lelli
From: Juri Lelli @ 2025-06-27 11:51 UTC
  To: Ingo Molnar, Peter Zijlstra, Vincent Guittot, Dietmar Eggemann,
	Steven Rostedt, Ben Segall, Mel Gorman, Valentin Schneider,
	Waiman Long
  Cc: linux-kernel, Marcel Ziswiler, Luca Abeni, Juri Lelli

Hi All,

This patch series addresses a significant regression observed in
`SCHED_DEADLINE` performance, specifically when `SCHED_FLAG_RECLAIM`
(Greedy Reclamation of Unused Bandwidth - GRUB) is enabled alongside
overrunning jobs. This issue was reported by Marcel [1].

Marcel's team runs extensive real-time scheduler (`SCHED_DEADLINE`)
tests on mainline Linux kernels (amd64-based Intel NUCs and
aarch64-based RADXA ROCK5Bs), which typically show zero deadline misses
for 5ms granularity tasks. However, with reclaim mode enabled and two
overrunning jobs in the mix, they observed a dramatic increase in
deadline misses: 43 million on NUC and 600 thousand on ROCK5B. This
points to a critical accounting issue within `SCHED_DEADLINE` when
reclaim is active.
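
For readers unfamiliar with reclaim mode, the following is a minimal
sketch of how a task might request `SCHED_DEADLINE` with
`SCHED_FLAG_RECLAIM` through `sched_setattr(2)`. It is not taken from
Marcel's test suite. The 1ms runtime is a hypothetical value chosen only
for illustration (the report does not state the tasks' runtimes), the
helper names are made up for this sketch, and the syscall number shown is
the x86-64 one.

```python
import ctypes
import os

SCHED_DEADLINE = 6          # policy number from the Linux UAPI headers
SCHED_FLAG_RECLAIM = 0x02   # asks for GRUB reclaiming on this task
NR_SCHED_SETATTR = 314      # x86-64 only; other architectures use other numbers


class SchedAttr(ctypes.Structure):
    """First-version (48-byte) layout of struct sched_attr."""
    _fields_ = [
        ("size", ctypes.c_uint32),
        ("sched_policy", ctypes.c_uint32),
        ("sched_flags", ctypes.c_uint64),
        ("sched_nice", ctypes.c_int32),
        ("sched_priority", ctypes.c_uint32),
        ("sched_runtime", ctypes.c_uint64),   # nanoseconds
        ("sched_deadline", ctypes.c_uint64),  # nanoseconds
        ("sched_period", ctypes.c_uint64),    # nanoseconds
    ]


def make_reclaim_attr(runtime_ns, deadline_ns, period_ns):
    """Build a SCHED_DEADLINE attribute block with reclaim enabled."""
    attr = SchedAttr()
    attr.size = ctypes.sizeof(SchedAttr)
    attr.sched_policy = SCHED_DEADLINE
    attr.sched_flags = SCHED_FLAG_RECLAIM
    attr.sched_runtime = runtime_ns
    attr.sched_deadline = deadline_ns
    attr.sched_period = period_ns
    return attr


def apply_to_current_thread(attr):
    """Issue sched_setattr(0, &attr, 0); usually needs root privileges."""
    libc = ctypes.CDLL(None, use_errno=True)
    ret = libc.syscall(ctypes.c_long(NR_SCHED_SETATTR), ctypes.c_int(0),
                       ctypes.byref(attr), ctypes.c_uint(0))
    if ret != 0:
        err = ctypes.get_errno()
        raise OSError(err, os.strerror(err))


# Hypothetical parameters: 1ms of runtime every 5ms period.
attr = make_reclaim_attr(1_000_000, 5_000_000, 5_000_000)
```

`apply_to_current_thread(attr)` is deliberately not called here, since it
changes the scheduling policy of the calling thread and fails without the
required privileges.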

This series fixes the issue by doing the following.

- 1/5: sched/deadline: Initialize dl_servers after SMP
  Currently, `dl-servers` are initialized too early during boot, before
  all CPUs are online. This results in an incorrect calculation of
  per-runqueue `DEADLINE` variables, such as `extra_bw`, which rely on a
  stable CPU count. This patch moves the `dl-server` initialization to a
  later stage, after SMP initialization, ensuring all CPUs are online and
  correct `extra_bw` values can be computed from the start.

- 2/5: sched/deadline: Reset extra_bw to max_bw when clearing root domains
  `dl_clear_root_domain()` left stale values in the per-runqueue
  `extra_bw` variables: they still reflected what had been computed
  before the root domain change, which broke the accounting. This patch
  resets `extra_bw` to `max_bw` before restoring the `dl-server`
  contributions, so that the rebuild starts from a clean state.

- 3/5: sched/deadline: Fix accounting after global limits change
  Changes to global `SCHED_DEADLINE` limits (handled by
  `sched_rt_handler()` logic) were found to leave stale or incorrect
  values in various accounting-related variables, including `extra_bw`.
  This patch properly cleans up per-runqueue variables before implementing
  the global limit change and then rebuilds the scheduling domains. This
  ensures that the accounting is correctly restored and maintained after
  such global limit adjustments.

- 4/5 and 5/5 are simple drgn scripts I put together to help debug this
  issue. I have the impression that they might be useful to have around
  in the future.
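
To make the dependency on a stable CPU count more concrete, here is a
toy model of the accounting described in 1/5 and 2/5. It is a simplified
sketch, not kernel code: it assumes that each `dl-server`'s bandwidth is
split evenly across the CPUs that are online when it is registered, and
the class, the function names, the boot order and the numeric values
(95% `max_bw`, a 50ms/1s server, 4 CPUs) are all assumptions made for
illustration.

```python
BW_SHIFT = 20  # fixed-point shift used by this sketch (assumed value)


def to_ratio(period_ns, runtime_ns):
    """Return runtime/period as a fixed-point bandwidth value."""
    return (runtime_ns << BW_SHIFT) // period_ns


class RunQueue:
    """Holds only the two per-runqueue variables the model needs."""

    def __init__(self, max_bw):
        self.max_bw = max_bw
        self.extra_bw = max_bw


def add_server(rqs, online, server_bw):
    """Charge one server's bandwidth evenly to the CPUs online right now."""
    share = server_bw // len(online)
    for cpu in online:
        rqs[cpu].extra_bw -= share


def reset_and_rebuild(rqs, online, server_bw):
    """Reset extra_bw to max_bw, then re-add one server per online CPU."""
    for cpu in online:
        rqs[cpu].extra_bw = rqs[cpu].max_bw
    for _ in online:
        add_server(rqs, online, server_bw)


NR_CPUS = 4
MAX_BW = to_ratio(1_000_000, 950_000)            # assumed 95% limit
SERVER_BW = to_ratio(1_000_000_000, 50_000_000)  # assumed 50ms every 1s
all_cpus = list(range(NR_CPUS))

# Early registration: each server is added while CPUs are still coming up,
# so every split uses a different CPU count.
early = [RunQueue(MAX_BW) for _ in all_cpus]
for cpu in all_cpus:
    add_server(early, list(range(cpu + 1)), SERVER_BW)
early_values = [rq.extra_bw for rq in early]

# Late registration: all CPUs are online before any server is added.
late = [RunQueue(MAX_BW) for _ in all_cpus]
for cpu in all_cpus:
    add_server(late, all_cpus, SERVER_BW)
late_values = [rq.extra_bw for rq in late]

# Starting from the skewed state, a reset followed by a rebuild
# reproduces the late-registration values.
reset_and_rebuild(early, all_cpus, SERVER_BW)
rebuilt_values = [rq.extra_bw for rq in early]

print("early:  ", early_values)
print("late:   ", late_values)
print("rebuilt:", rebuilt_values)
```

In this model the early run leaves every CPU with a different `extra_bw`,
while the late run and the reset-then-rebuild run agree on a single
value per CPU, which is the property the first two patches aim to
restore.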

Please review and test.

The set is also available at

git@github.com:jlelli/linux.git upstream/fix-grub-tip

1 - https://lore.kernel.org/lkml/ce8469c4fb2f3e2ada74add22cce4bfe61fd5bab.camel@codethink.co.uk/

Thanks,
Juri

Juri Lelli (5):
  sched/deadline: Initialize dl_servers after SMP
  sched/deadline: Reset extra_bw to max_bw when clearing root domains
  sched/deadline: Fix accounting after global limits change
  tools/sched: Add root_domains_dump.py which dumps root domains info
  tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info

 MAINTAINERS                      |  1 +
 kernel/sched/core.c              |  2 +
 kernel/sched/deadline.c          | 61 +++++++++++++++++++---------
 kernel/sched/rt.c                |  6 +++
 kernel/sched/sched.h             |  1 +
 tools/sched/dl_bw_dump.py        | 57 ++++++++++++++++++++++++++
 tools/sched/root_domains_dump.py | 68 ++++++++++++++++++++++++++++++++
 7 files changed, 177 insertions(+), 19 deletions(-)
 create mode 100755 tools/sched/dl_bw_dump.py
 create mode 100755 tools/sched/root_domains_dump.py

-- 
2.49.0


Thread overview: 18+ messages
2025-06-27 11:51 [PATCH 0/5] sched/deadline: Fix GRUB accounting Juri Lelli
2025-06-27 11:51 ` [PATCH 1/5] sched/deadline: Initialize dl_servers after SMP Juri Lelli
     [not found]   ` <1e39c473-d161-4ad0-bfdc-8a306f57135f@redhat.com>
2025-06-29 23:08     ` Waiman Long
2025-06-30 10:21     ` Juri Lelli
2025-06-30 17:04       ` Waiman Long
2025-07-14  9:10   ` [tip: sched/core] " tip-bot2 for Juri Lelli
2025-06-27 11:51 ` [PATCH 2/5] sched/deadline: Reset extra_bw to max_bw when clearing root domains Juri Lelli
2025-07-14  9:10   ` [tip: sched/core] " tip-bot2 for Juri Lelli
2025-06-27 11:51 ` [PATCH 3/5] sched/deadline: Fix accounting after global limits change Juri Lelli
2025-07-14  8:59   ` Peter Zijlstra
2025-07-15 10:07     ` Juri Lelli
2025-07-14  9:10   ` [tip: sched/core] " tip-bot2 for Juri Lelli
2025-06-27 11:51 ` [PATCH 4/5] tools/sched: Add root_domains_dump.py which dumps root domains info Juri Lelli
2025-07-14  9:10   ` [tip: sched/core] " tip-bot2 for Juri Lelli
2025-06-27 11:51 ` [PATCH 5/5] tools/sched: Add dl_bw_dump.py for printing bandwidth accounting info Juri Lelli
2025-07-14  9:10   ` [tip: sched/core] " tip-bot2 for Juri Lelli
2025-06-30  6:01 ` [PATCH 0/5] sched/deadline: Fix GRUB accounting Marcel Ziswiler
2025-07-11 10:05 ` Marcel Ziswiler
