From: Ingo Molnar <mingo@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
    linux-kernel@vger.kernel.org,
    Peter Zijlstra <peterz@infradead.org>,
    Thomas Gleixner <tglx@linutronix.de>,
    Juri Lelli <juri.lelli@redhat.com>,
    Vincent Guittot <vincent.guittot@linaro.org>,
    Dietmar Eggemann <dietmar.eggemann@arm.com>,
    Steven Rostedt <rostedt@goodmis.org>,
    Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
    Valentin Schneider <vschneid@redhat.com>,
    Shrikanth Hegde <sshegde@linux.ibm.com>,
    Tejun Heo <tj@kernel.org>
Subject: [GIT PULL v2] Scheduler enhancements for v6.14
Date: Tue, 21 Jan 2025 08:23:03 +0100
Message-ID: <Z49LV4I63Qeh3oSz@gmail.com>
In-Reply-To: <Z46HUWsE97BHNkke@localhost.localdomain>

* Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> On 20-Jan-2025 12:07:41 PM, Ingo Molnar wrote:
> >
> > Linus,
> >
> > Please pull the latest sched/core Git tree from:
> >
> > git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-20
> >
> > # HEAD: 7d9da040575b343085287686fa902a5b2d43c7ca psi: Fix race when task wakes up before psi_sched_switch() adjusts flags
> >
> > Scheduler enhancements for v6.14:
>
> [...]
>
> > - RSEQ enhancements:
> >
> > - Validate read-only fields under DEBUG_RSEQ config
> > (Mathieu Desnoyers)
>
> FYI, a regression introduced by this commit was reported by s390x
> glibc developers testing against linux-next:
>
> https://sourceware.org/pipermail/libc-alpha/2025-January/163993.html
>
> I've sent a fix here:
>
> https://lore.kernel.org/lkml/20250116205956.836074-1-mathieu.desnoyers@efficios.com/
>
> The commit introducing the issue is in this PR, but not the fix.

Indeed - with the bug RSEQ_FLAG_UNREGISTER would fail with an incorrect
-EFAULT return.

I've applied your fix, and updated the pull request for Linus further
below. If Linus has already pulled I'll send a fixes pull request
separately, or Linus can apply the fix from email directly:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Or he can pull the sched-core-2025-01-21 tag below safely on top of
sched-core-2025-01-20, which will result in a diffstat of:

  Mathieu Desnoyers (1):
        rseq: Fix rseq unregistration regression

   kernel/rseq.c | 2 +-
   1 file changed, 1 insertion(+), 1 deletion(-)

Since I booted the scheduler tree on generic desktops and it was tested
on other systems as well and nothing appeared to be broken, I presume
RSEQ_FLAG_UNREGISTER is used only in libc syscall-testcases and in
specific applications?

Thanks,

	Ingo
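
For readers who want to exercise the unregistration path discussed above,
here is a minimal userspace sketch. It is not part of the pull request or
of the fix. It assumes the rseq(2) system call is reachable through
syscall(SYS_rseq, ...) and that <linux/rseq.h> is installed. The signature
value, the struct rseq_roundtrip type and the function name are
hypothetical and chosen for illustration only.

```c
#define _GNU_SOURCE
#include <errno.h>
#include <unistd.h>
#include <sys/syscall.h>
#include <linux/rseq.h>

/* Arbitrary signature for this sketch; the same value must be passed
 * when registering and when unregistering. */
#define SKETCH_RSEQ_SIG 0x53053053U

/* Hypothetical result type: errno values from each step. */
struct rseq_roundtrip {
	int reg_errno;		/* 0 if registration succeeded */
	int unreg_errno;	/* 0 if unregistration succeeded, -1 if not attempted */
};

/* Per-thread rseq area; struct rseq carries its own alignment attribute. */
static __thread struct rseq sketch_rseq_area;

/* Register the per-thread area, then unregister it with RSEQ_FLAG_UNREGISTER. */
struct rseq_roundtrip rseq_register_then_unregister(void)
{
	struct rseq_roundtrip res = { 0, -1 };

	if (syscall(SYS_rseq, &sketch_rseq_area, sizeof(sketch_rseq_area),
		    0, SKETCH_RSEQ_SIG)) {
		res.reg_errno = errno;	/* e.g. libc already registered rseq */
		return res;
	}

	if (syscall(SYS_rseq, &sketch_rseq_area, sizeof(sketch_rseq_area),
		    RSEQ_FLAG_UNREGISTER, SKETCH_RSEQ_SIG))
		res.unreg_errno = errno;	/* the regression showed up here as EFAULT */
	else
		res.unreg_errno = 0;

	return res;
}
```

On systems where the C library registers rseq for each thread at startup
(glibc 2.35 and later do this by default), the registration step is
expected to fail, so the unregistration step is never reached. Running
with GLIBC_TUNABLES=glibc.pthread.rseq=0 should leave the registration to
the program.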
===================================>
Linus,

Please pull the latest sched/core Git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-21

   # HEAD: 40724ecafccb1fb62b66264854e8c3ad394c8f3d rseq: Fix rseq unregistration regression

Scheduler enhancements for v6.14:

  - Fair scheduler (SCHED_FAIR) enhancements:

     - Behavioral improvements:
        - Untangle NEXT_BUDDY and pick_next_task() (Peter Zijlstra)

     - Delayed-dequeue enhancements & fixes: (Vincent Guittot)
        - Rename h_nr_running into h_nr_queued
        - Add new cfs_rq.h_nr_runnable
        - Use the new cfs_rq.h_nr_runnable
        - Remove unused cfs_rq.h_nr_delayed
        - Rename cfs_rq.idle_h_nr_running into h_nr_idle
        - Remove unused cfs_rq.idle_nr_running
        - Rename cfs_rq.nr_running into nr_queued
        - Do not try to migrate delayed dequeue task
        - Fix variable declaration position
        - Encapsulate set custom slice in a __setparam_fair() function

     - Fixes:
        - Fix race between yield_to() and try_to_wake_up() (Tianchen Ding)
        - Fix CPU bandwidth limit bypass during CPU hotplug (Vishal Chourasia)

     - Cleanups:
        - Clean up in migrate_degrades_locality() to improve
          readability (Peter Zijlstra)
        - Mark m*_vruntime() with __maybe_unused (Andy Shevchenko)
        - Update comments after sched_tick() rename (Sebastian Andrzej Siewior)
        - Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()
          (Valentin Schneider)

  - Deadline scheduler (SCHED_DL) enhancements:

     - Restore dl_server bandwidth on non-destructive root domain
       changes (Juri Lelli)
     - Correctly account for allocated bandwidth during
       hotplug (Juri Lelli)
     - Check bandwidth overflow earlier for hotplug (Juri Lelli)
     - Clean up goto label in pick_earliest_pushable_dl_task()
       (John Stultz)
     - Consolidate timer cancellation (Wander Lairson Costa)

  - Load-balancer enhancements:

     - Improve performance by prioritizing migrating eligible
       tasks in sched_balance_rq() (Hao Jia)
     - Do not compute NUMA Balancing stats unnecessarily during
       load-balancing (K Prateek Nayak)
     - Do not compute overloaded status unnecessarily during
       load-balancing (K Prateek Nayak)

  - Generic scheduling code enhancements:

     - Use READ_ONCE() in task_on_rq_queued(), to consistently use
       the WRITE_ONCE() updated ->on_rq field (Harshit Agarwal)

  - Isolated CPUs support enhancements: (Waiman Long)

     - Make "isolcpus=nohz" equivalent to "nohz_full"
     - Consolidate housekeeping cpumasks that are always identical
     - Remove HK_TYPE_SCHED
     - Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

  - RSEQ enhancements:

     - Validate read-only fields under DEBUG_RSEQ config
       (Mathieu Desnoyers)

  - PSI enhancements:

     - Fix race when task wakes up before psi_sched_switch()
       adjusts flags (Chengming Zhou)

  - IRQ time accounting performance enhancements: (Yafang Shao)

     - Define sched_clock_irqtime as static key
     - Don't account irq time if sched_clock_irqtime is disabled

  - Virtual machine scheduling enhancements:

     - Don't try to catch up excess steal time (Suleiman Souhlal)

  - Heterogeneous x86 CPU scheduling enhancements: (K Prateek Nayak)

     - Convert "sysctl_sched_itmt_enabled" to boolean
     - Use guard() for itmt_update_mutex
     - Move the "sched_itmt_enabled" sysctl to debugfs
     - Remove x86_smt_flags and use cpu_smt_flags directly
     - Use x86_sched_itmt_flags for PKG domain unconditionally

  - Debugging code & instrumentation enhancements:

     - Change need_resched warnings to pr_err() (David Rientjes)
     - Print domain name in /proc/schedstat (K Prateek Nayak)
     - Fix value reported by hot tasks pulled in /proc/schedstat (Peter Zijlstra)
     - Report the different kinds of imbalances in /proc/schedstat (Swapnil Sapkal)
     - Move sched domain name out of CONFIG_SCHED_DEBUG (Swapnil Sapkal)
     - Update Schedstat version to 17 (Swapnil Sapkal)

 Thanks,

	Ingo
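
As an illustration of the READ_ONCE()/WRITE_ONCE() pairing mentioned in
the "Generic scheduling code enhancements" item above, here is a
simplified userspace sketch. The macros below are reduced stand-ins built
on volatile accesses, not the kernel's real definitions. struct task and
both function names are hypothetical, and the constant value is an
assumption made for the example.

```c
#include <stdbool.h>

/* Simplified stand-ins (GCC/Clang __typeof__ extension); the kernel's
 * real macros perform additional checks. */
#define READ_ONCE(x)		(*(const volatile __typeof__(x) *)&(x))
#define WRITE_ONCE(x, val)	(*(volatile __typeof__(x) *)&(x) = (val))

#define TASK_ON_RQ_QUEUED	1	/* value assumed for this sketch */

/* Hypothetical task structure, reduced to the one field of interest. */
struct task {
	int on_rq;
};

/* Writer side: publish the new state with a single, untorn store. */
void mark_queued(struct task *p)
{
	WRITE_ONCE(p->on_rq, TASK_ON_RQ_QUEUED);
}

/* Reader side: pair the marked store with a marked load, so the compiler
 * cannot tear, fuse or re-read the access. */
bool task_is_queued(const struct task *p)
{
	return READ_ONCE(p->on_rq) == TASK_ON_RQ_QUEUED;
}
```

The point of the pairing is consistency: once a field is updated with
WRITE_ONCE(), lockless readers of that field are expected to use
READ_ONCE() as well.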
------------------>

Andy Shevchenko (1):
      sched/fair: Mark m*_vruntime() with __maybe_unused

Chengming Zhou (1):
      psi: Fix race when task wakes up before psi_sched_switch() adjusts flags

David Rientjes (1):
      sched/debug: Change need_resched warnings to pr_err

Hao Jia (1):
      sched/core: Prioritize migrating eligible tasks in sched_balance_rq()

Harshit Agarwal (1):
      sched: add READ_ONCE to task_on_rq_queued

John Stultz (1):
      sched: deadline: Cleanup goto label in pick_earliest_pushable_dl_task

Juri Lelli (3):
      sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes
      sched/deadline: Correctly account for allocated bandwidth during hotplug
      sched/deadline: Check bandwidth overflow earlier for hotplug

K Prateek Nayak (8):
      sched/stats: Print domain name in /proc/schedstat
      x86/itmt: Convert "sysctl_sched_itmt_enabled" to boolean
      x86/itmt: Use guard() for itmt_update_mutex
      x86/itmt: Move the "sched_itmt_enabled" sysctl to debugfs
      x86/topology: Remove x86_smt_flags and use cpu_smt_flags directly
      x86/topology: Use x86_sched_itmt_flags for PKG domain unconditionally
      sched/fair: Do not compute NUMA Balancing stats unnecessarily during lb
      sched/fair: Do not compute overloaded status unnecessarily during lb

Mathieu Desnoyers (2):
      rseq: Validate read-only fields under DEBUG_RSEQ config
      rseq: Fix rseq unregistration regression

Peter Zijlstra (3):
      sched/fair: Untangle NEXT_BUDDY and pick_next_task()
      sched/fair: Fix value reported by hot tasks pulled in /proc/schedstat
      sched/fair: Cleanup in migrate_degrades_locality() to improve readability

Sebastian Andrzej Siewior (1):
      sched/fair: Update comments after sched_tick() rename.

Suleiman Souhlal (1):
      sched: Don't try to catch up excess steal time.

Swapnil Sapkal (3):
      sched: Report the different kinds of imbalances in /proc/schedstat
      sched: Move sched domain name out of CONFIG_SCHED_DEBUG
      docs: Update Schedstat version to 17

Tianchen Ding (1):
      sched: Fix race between yield_to() and try_to_wake_up()

Valentin Schneider (1):
      sched/fair: Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()

Vincent Guittot (10):
      sched/fair: Rename h_nr_running into h_nr_queued
      sched/fair: Add new cfs_rq.h_nr_runnable
      sched/fair: Use the new cfs_rq.h_nr_runnable
      sched/fair: Removed unsued cfs_rq.h_nr_delayed
      sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle
      sched/fair: Remove unused cfs_rq.idle_nr_running
      sched/fair: Rename cfs_rq.nr_running into nr_queued
      sched/fair: Do not try to migrate delayed dequeue task
      sched/fair: Fix variable declaration position
      sched/fair: Encapsulate set custom slice in a __setparam_fair() function

Vishal Chourasia (1):
      sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug

Waiman Long (4):
      sched/core: Remove HK_TYPE_SCHED
      sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"
      sched/isolation: Consolidate housekeeping cpumasks that are always identical
      sched: Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

Wander Lairson Costa (1):
      sched/deadline: Consolidate Timer Cancellation

Yafang Shao (3):
      sched: Define sched_clock_irqtime as static key
      sched: Don't account irq time if sched_clock_irqtime is disabled
      sched, psi: Don't account irq time if sched_clock_irqtime is disabled

 Documentation/admin-guide/kernel-parameters.txt |   4 +-
 Documentation/scheduler/sched-stats.rst         | 126 ++++---
 arch/x86/include/asm/topology.h                 |   4 +-
 arch/x86/kernel/itmt.c                          |  81 ++---
 arch/x86/kernel/smpboot.c                       |  19 +-
 include/linux/sched.h                           |  10 +
 include/linux/sched/isolation.h                 |  21 +-
 include/linux/sched/topology.h                  |  13 +-
 kernel/rseq.c                                   |  98 ++++++
 kernel/sched/core.c                             |  94 +++--
 kernel/sched/cputime.c                          |  16 +-
 kernel/sched/deadline.c                         | 119 +++++--
 kernel/sched/debug.c                            |  25 +-
 kernel/sched/fair.c                             | 444 ++++++++++++++----------
 kernel/sched/features.h                         |   9 +
 kernel/sched/isolation.c                        |  22 +-
 kernel/sched/pelt.c                             |   4 +-
 kernel/sched/psi.c                              |   7 +-
 kernel/sched/sched.h                            |  37 +-
 kernel/sched/stats.c                            |  11 +-
 kernel/sched/stats.h                            |   4 +
 kernel/sched/syscalls.c                         |  18 +-
 kernel/sched/topology.c                         |  12 +-
 23 files changed, 720 insertions(+), 478 deletions(-)
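
The x86/itmt commit "Use guard() for itmt_update_mutex" listed above
relies on scope-based locking, where the lock is dropped automatically on
every return path. A rough userspace analogue can be sketched with the
GCC/Clang cleanup attribute and a pthread mutex. This is not the kernel's
guard() implementation. MUTEX_GUARD, update_mutex, feature_enabled and
set_feature_enabled() are hypothetical names used only for this example.

```c
#include <pthread.h>
#include <stdbool.h>

/* Hypothetical stand-ins for a mutex-protected on/off setting. */
static pthread_mutex_t update_mutex = PTHREAD_MUTEX_INITIALIZER;
static bool feature_enabled;

/* Cleanup handler: called automatically when the guard variable goes
 * out of scope, on any return path. */
static void mutex_guard_release(pthread_mutex_t **m)
{
	if (*m)
		pthread_mutex_unlock(*m);
}

/* Lock the mutex now and tie its release to the enclosing scope.
 * Uses the GCC/Clang cleanup attribute; one guard per scope. */
#define MUTEX_GUARD(m)							\
	pthread_mutex_t *guard_ptr_					\
	__attribute__((cleanup(mutex_guard_release), unused)) =	\
		(pthread_mutex_lock(m), (m))

/* Returns 0 if the setting changed, -1 if it already had that value. */
int set_feature_enabled(bool enable)
{
	MUTEX_GUARD(&update_mutex);

	if (feature_enabled == enable)
		return -1;	/* early return: the guard still unlocks */

	feature_enabled = enable;
	return 0;
}
```

The benefit is that early returns no longer need explicit unlock calls or
goto labels, which is the kind of simplification such conversions
typically aim for.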