From: Ingo Molnar <mingo@kernel.org>
To: Mathieu Desnoyers <mathieu.desnoyers@efficios.com>
Cc: Linus Torvalds <torvalds@linux-foundation.org>,
	linux-kernel@vger.kernel.org,
	Peter Zijlstra <peterz@infradead.org>,
	Thomas Gleixner <tglx@linutronix.de>,
	Juri Lelli <juri.lelli@redhat.com>,
	Vincent Guittot <vincent.guittot@linaro.org>,
	Dietmar Eggemann <dietmar.eggemann@arm.com>,
	Steven Rostedt <rostedt@goodmis.org>,
	Ben Segall <bsegall@google.com>, Mel Gorman <mgorman@suse.de>,
	Valentin Schneider <vschneid@redhat.com>,
	Shrikanth Hegde <sshegde@linux.ibm.com>,
	Tejun Heo <tj@kernel.org>
Subject: [GIT PULL v2] Scheduler enhancements for v6.14
Date: Tue, 21 Jan 2025 08:23:03 +0100
Message-ID: <Z49LV4I63Qeh3oSz@gmail.com>
In-Reply-To: <Z46HUWsE97BHNkke@localhost.localdomain>


* Mathieu Desnoyers <mathieu.desnoyers@efficios.com> wrote:

> On 20-Jan-2025 12:07:41 PM, Ingo Molnar wrote:
> > 
> > Linus,
> > 
> > Please pull the latest sched/core Git tree from:
> > 
> >    git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-20
> > 
> >    # HEAD: 7d9da040575b343085287686fa902a5b2d43c7ca psi: Fix race when task wakes up before psi_sched_switch() adjusts flags
> > 
> > Scheduler enhancements for v6.14:
> 
> [...]
> 
> >  - RSEQ enhancements:
> > 
> >    - Validate read-only fields under DEBUG_RSEQ config
> >      (Mathieu Desnoyers)
> 
> FYI, a regression introduced by this commit was reported by s390x
> glibc developers testing against linux-next:
> 
> https://sourceware.org/pipermail/libc-alpha/2025-January/163993.html
> 
> I've sent a fix here:
> 
> https://lore.kernel.org/lkml/20250116205956.836074-1-mathieu.desnoyers@efficios.com/
> 
> The commit introducing the issue is in this PR, but not the fix.

Indeed - with the bug, RSEQ_FLAG_UNREGISTER would fail with an incorrect
-EFAULT return.

I've applied your fix and updated the pull request for Linus further
below. If Linus has already pulled, I'll send a fixes pull request
separately, or Linus can apply the fix from email directly:

  Acked-by: Ingo Molnar <mingo@kernel.org>

Or he can pull the sched-core-2025-01-21 tag below safely on top of 
sched-core-2025-01-20, which will result in a diffstat of:

  Mathieu Desnoyers (1):
      rseq: Fix rseq unregistration regression

  kernel/rseq.c | 2 +-
  1 file changed, 1 insertion(+), 1 deletion(-)

I booted the scheduler tree on generic desktops, it was tested on other
systems as well, and nothing appeared to be broken. So I presume
RSEQ_FLAG_UNREGISTER is used only in libc syscall testcases and in
specific applications?
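
For reference, here is a minimal userspace sketch of the register/unregister
round trip that exercises RSEQ_FLAG_UNREGISTER. The demo_* names and the
signature value are made up for illustration, and the sketch assumes that
<linux/rseq.h> and __NR_rseq are available on the build host. If libc has
already registered an rseq area for the thread, the registration step is
expected to fail, and the sketch reports that instead of testing
unregistration:

```c
#define _GNU_SOURCE
#include <errno.h>
#include <stdint.h>
#include <string.h>
#include <sys/syscall.h>
#include <unistd.h>
#include <linux/rseq.h>

/*
 * Hypothetical signature for this demo. Registration and unregistration
 * must pass the same 32-bit value.
 */
#define DEMO_RSEQ_SIG 0x53053053U

/* Per-thread rseq area; struct rseq carries its own alignment attribute. */
static __thread struct rseq demo_rseq_area;

/* Thin wrapper around the raw rseq system call (hypothetical helper). */
static long demo_sys_rseq(struct rseq *area, uint32_t len, int flags,
			  uint32_t sig)
{
	return syscall(__NR_rseq, area, len, flags, sig);
}

/*
 * Returns 0 if registration and unregistration both succeed, 1 if
 * registration was not possible (for example because libc already
 * registered an area, or the syscall is unavailable), and -errno if
 * unregistration fails.
 */
int demo_rseq_roundtrip(void)
{
	memset(&demo_rseq_area, 0, sizeof(demo_rseq_area));
	demo_rseq_area.cpu_id = (uint32_t)RSEQ_CPU_ID_UNINITIALIZED;

	if (demo_sys_rseq(&demo_rseq_area, sizeof(demo_rseq_area), 0,
			  DEMO_RSEQ_SIG))
		return 1;

	if (demo_sys_rseq(&demo_rseq_area, sizeof(demo_rseq_area),
			  RSEQ_FLAG_UNREGISTER, DEMO_RSEQ_SIG))
		return -errno;

	return 0;
}
```

Calling demo_rseq_roundtrip() once from main() and printing the result
should be enough. On a kernel with the regression, a negative return
would be expected to carry the EFAULT described above.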

Thanks,

	Ingo

===================================>
Linus,

Please pull the latest sched/core Git tree from:

   git://git.kernel.org/pub/scm/linux/kernel/git/tip/tip.git sched-core-2025-01-21

   # HEAD: 40724ecafccb1fb62b66264854e8c3ad394c8f3d rseq: Fix rseq unregistration regression

Scheduler enhancements for v6.14:

 - Fair scheduler (SCHED_FAIR) enhancements:

   - Behavioral improvements:
     - Untangle NEXT_BUDDY and pick_next_task() (Peter Zijlstra)

   - Delayed-dequeue enhancements & fixes: (Vincent Guittot)

     - Rename h_nr_running into h_nr_queued
     - Add new cfs_rq.h_nr_runnable
     - Use the new cfs_rq.h_nr_runnable
     - Remove unused cfs_rq.h_nr_delayed
     - Rename cfs_rq.idle_h_nr_running into h_nr_idle
     - Remove unused cfs_rq.idle_nr_running
     - Rename cfs_rq.nr_running into nr_queued
     - Do not try to migrate delayed-dequeue tasks
     - Fix variable declaration position
     - Encapsulate setting the custom slice in a __setparam_fair() function

   - Fixes:
     - Fix race between yield_to() and try_to_wake_up() (Tianchen Ding)
     - Fix CPU bandwidth limit bypass during CPU hotplug (Vishal Chourasia)

   - Cleanups:
     - Clean up migrate_degrades_locality() to improve
       readability (Peter Zijlstra)
     - Mark m*_vruntime() with __maybe_unused (Andy Shevchenko)
     - Update comments after sched_tick() rename (Sebastian Andrzej Siewior)
     - Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()
       (Valentin Schneider)

 - Deadline scheduler (SCHED_DL) enhancements:

   - Restore dl_server bandwidth on non-destructive root domain
     changes (Juri Lelli)

   - Correctly account for allocated bandwidth during
     hotplug (Juri Lelli)

   - Check bandwidth overflow earlier for hotplug (Juri Lelli)

   - Clean up goto label in pick_earliest_pushable_dl_task()
     (John Stultz)

   - Consolidate timer cancellation (Wander Lairson Costa)

 - Load-balancer enhancements:

   - Improve performance by prioritizing migrating eligible
     tasks in sched_balance_rq() (Hao Jia)

   - Do not compute NUMA Balancing stats unnecessarily during
     load-balancing (K Prateek Nayak)

   - Do not compute overloaded status unnecessarily during
     load-balancing (K Prateek Nayak)

 - Generic scheduling code enhancements:

   - Use READ_ONCE() in task_on_rq_queued(), to consistently use
     the WRITE_ONCE() updated ->on_rq field (Harshit Agarwal)

 - Isolated CPUs support enhancements: (Waiman Long)

   - Make "isolcpus=nohz" equivalent to "nohz_full"
   - Consolidate housekeeping cpumasks that are always identical
   - Remove HK_TYPE_SCHED
   - Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

 - RSEQ enhancements:

   - Validate read-only fields under DEBUG_RSEQ config
     (Mathieu Desnoyers)

 - PSI enhancements:

   - Fix race when task wakes up before psi_sched_switch()
     adjusts flags (Chengming Zhou)

 - IRQ time accounting performance enhancements: (Yafang Shao)

   - Define sched_clock_irqtime as static key
   - Don't account irq time if sched_clock_irqtime is disabled

 - Virtual machine scheduling enhancements:

   - Don't try to catch up excess steal time (Suleiman Souhlal)

 - Heterogeneous x86 CPU scheduling enhancements: (K Prateek Nayak)

   - Convert "sysctl_sched_itmt_enabled" to boolean
   - Use guard() for itmt_update_mutex
   - Move the "sched_itmt_enabled" sysctl to debugfs
   - Remove x86_smt_flags and use cpu_smt_flags directly
   - Use x86_sched_itmt_flags for PKG domain unconditionally

 - Debugging code & instrumentation enhancements:

   - Change need_resched warnings to pr_err() (David Rientjes)
   - Print domain name in /proc/schedstat (K Prateek Nayak)
   - Fix the "hot tasks pulled" value reported in /proc/schedstat (Peter Zijlstra)
   - Report the different kinds of imbalances in /proc/schedstat (Swapnil Sapkal)
   - Move sched domain name out of CONFIG_SCHED_DEBUG (Swapnil Sapkal)
   - Update Schedstat version to 17 (Swapnil Sapkal)
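
As an illustration of the READ_ONCE()/WRITE_ONCE() pairing on ->on_rq
listed under the generic scheduling code changes, here is a simplified
userspace sketch. It is not the kernel code: the DEMO_* macros only
approximate the kernel helpers with volatile accesses, and struct
demo_task and the demo_* functions are hypothetical stand-ins:

```c
#include <stdbool.h>

/* Userspace approximations of the kernel macros (GCC/Clang __typeof__). */
#define DEMO_READ_ONCE(x)     (*(const volatile __typeof__(x) *)&(x))
#define DEMO_WRITE_ONCE(x, v) (*(volatile __typeof__(x) *)&(x) = (v))

/* Values chosen to resemble the kernel's TASK_ON_RQ_* constants. */
#define DEMO_ON_RQ_QUEUED    1
#define DEMO_ON_RQ_MIGRATING 2

/* Hypothetical stand-in for the part of task_struct used here. */
struct demo_task {
	int on_rq;
};

/* Writer side: every update of ->on_rq goes through DEMO_WRITE_ONCE(). */
static inline void demo_set_on_rq(struct demo_task *p, int state)
{
	DEMO_WRITE_ONCE(p->on_rq, state);
}

/* Reader side: pair the marked writes with a marked read. */
static inline bool demo_task_on_rq_queued(struct demo_task *p)
{
	return DEMO_READ_ONCE(p->on_rq) == DEMO_ON_RQ_QUEUED;
}
```

The point of the pairing is that a field written with a marked store
should also be read with a marked load, so the compiler cannot tear,
fuse or re-read the access on either side.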

 Thanks,

	Ingo

------------------>
Andy Shevchenko (1):
      sched/fair: Mark m*_vruntime() with __maybe_unused

Chengming Zhou (1):
      psi: Fix race when task wakes up before psi_sched_switch() adjusts flags

David Rientjes (1):
      sched/debug: Change need_resched warnings to pr_err

Hao Jia (1):
      sched/core: Prioritize migrating eligible tasks in sched_balance_rq()

Harshit Agarwal (1):
      sched: add READ_ONCE to task_on_rq_queued

John Stultz (1):
      sched: deadline: Cleanup goto label in pick_earliest_pushable_dl_task

Juri Lelli (3):
      sched/deadline: Restore dl_server bandwidth on non-destructive root domain changes
      sched/deadline: Correctly account for allocated bandwidth during hotplug
      sched/deadline: Check bandwidth overflow earlier for hotplug

K Prateek Nayak (8):
      sched/stats: Print domain name in /proc/schedstat
      x86/itmt: Convert "sysctl_sched_itmt_enabled" to boolean
      x86/itmt: Use guard() for itmt_update_mutex
      x86/itmt: Move the "sched_itmt_enabled" sysctl to debugfs
      x86/topology: Remove x86_smt_flags and use cpu_smt_flags directly
      x86/topology: Use x86_sched_itmt_flags for PKG domain unconditionally
      sched/fair: Do not compute NUMA Balancing stats unnecessarily during lb
      sched/fair: Do not compute overloaded status unnecessarily during lb

Mathieu Desnoyers (2):
      rseq: Validate read-only fields under DEBUG_RSEQ config
      rseq: Fix rseq unregistration regression

Peter Zijlstra (3):
      sched/fair: Untangle NEXT_BUDDY and pick_next_task()
      sched/fair: Fix value reported by hot tasks pulled in /proc/schedstat
      sched/fair: Cleanup in migrate_degrades_locality() to improve readability

Sebastian Andrzej Siewior (1):
      sched/fair: Update comments after sched_tick() rename.

Suleiman Souhlal (1):
      sched: Don't try to catch up excess steal time.

Swapnil Sapkal (3):
      sched: Report the different kinds of imbalances in /proc/schedstat
      sched: Move sched domain name out of CONFIG_SCHED_DEBUG
      docs: Update Schedstat version to 17

Tianchen Ding (1):
      sched: Fix race between yield_to() and try_to_wake_up()

Valentin Schneider (1):
      sched/fair: Remove CONFIG_CFS_BANDWIDTH=n definition of cfs_bandwidth_used()

Vincent Guittot (10):
      sched/fair: Rename h_nr_running into h_nr_queued
      sched/fair: Add new cfs_rq.h_nr_runnable
      sched/fair: Use the new cfs_rq.h_nr_runnable
      sched/fair: Removed unsued cfs_rq.h_nr_delayed
      sched/fair: Rename cfs_rq.idle_h_nr_running into h_nr_idle
      sched/fair: Remove unused cfs_rq.idle_nr_running
      sched/fair: Rename cfs_rq.nr_running into nr_queued
      sched/fair: Do not try to migrate delayed dequeue task
      sched/fair: Fix variable declaration position
      sched/fair: Encapsulate set custom slice in a __setparam_fair() function

Vishal Chourasia (1):
      sched/fair: Fix CPU bandwidth limit bypass during CPU hotplug

Waiman Long (4):
      sched/core: Remove HK_TYPE_SCHED
      sched/isolation: Make "isolcpus=nohz" equivalent to "nohz_full"
      sched/isolation: Consolidate housekeeping cpumasks that are always identical
      sched: Unify HK_TYPE_{TIMER|TICK|MISC} to HK_TYPE_KERNEL_NOISE

Wander Lairson Costa (1):
      sched/deadline: Consolidate Timer Cancellation

Yafang Shao (3):
      sched: Define sched_clock_irqtime as static key
      sched: Don't account irq time if sched_clock_irqtime is disabled
      sched, psi: Don't account irq time if sched_clock_irqtime is disabled


 Documentation/admin-guide/kernel-parameters.txt |   4 +-
 Documentation/scheduler/sched-stats.rst         | 126 ++++---
 arch/x86/include/asm/topology.h                 |   4 +-
 arch/x86/kernel/itmt.c                          |  81 ++---
 arch/x86/kernel/smpboot.c                       |  19 +-
 include/linux/sched.h                           |  10 +
 include/linux/sched/isolation.h                 |  21 +-
 include/linux/sched/topology.h                  |  13 +-
 kernel/rseq.c                                   |  98 ++++++
 kernel/sched/core.c                             |  94 +++--
 kernel/sched/cputime.c                          |  16 +-
 kernel/sched/deadline.c                         | 119 +++++--
 kernel/sched/debug.c                            |  25 +-
 kernel/sched/fair.c                             | 444 ++++++++++++++----------
 kernel/sched/features.h                         |   9 +
 kernel/sched/isolation.c                        |  22 +-
 kernel/sched/pelt.c                             |   4 +-
 kernel/sched/psi.c                              |   7 +-
 kernel/sched/sched.h                            |  37 +-
 kernel/sched/stats.c                            |  11 +-
 kernel/sched/stats.h                            |   4 +
 kernel/sched/syscalls.c                         |  18 +-
 kernel/sched/topology.c                         |  12 +-
 23 files changed, 720 insertions(+), 478 deletions(-)
