From: Shrikanth Hegde <sshegde@linux.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: juri.lelli@redhat.com, vincent.guittot@linaro.org,
dietmar.eggemann@arm.com, rostedt@goodmis.org,
bsegall@google.com, mgorman@suse.de, vschneid@redhat.com,
clrkwllms@kernel.org, linux-kernel@vger.kernel.org,
linux-rt-devel@lists.linux.dev,
Linus Torvalds <torvalds@linux-foundation.org>,
mingo@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Subject: Re: [PATCH] sched: Further restrict the preemption modes
Date: Fri, 9 Jan 2026 16:53:04 +0530
Message-ID: <a86b6bbd-c0ed-40dc-899f-ba162332c80a@linux.ibm.com>
In-Reply-To: <20251219101502.GB1132199@noisy.programming.kicks-ass.net>
Hi Peter.
On 12/19/25 3:45 PM, Peter Zijlstra wrote:
>
> [ with 6.18 being an LTS release, it might be a good time for this ]
>
> The introduction of PREEMPT_LAZY was for multiple reasons:
>
> - PREEMPT_RT suffered from over-scheduling, hurting performance compared to
> !PREEMPT_RT.
>
> - the introduction of (more) features that rely on preemption; like
> folio_zero_user() which can do large memset() without preemption checks.
>
> (Xen already had a horrible hack to deal with long running hypercalls)
>
> - the endless and uncontrolled sprinkling of cond_resched() -- mostly cargo
> cult or in response to poor to replicate workloads.
>
> By moving to a model that is fundamentally preemptable these things become
> manageable and avoid needing to introduce more horrible hacks.
>
> Since this is a requirement; limit PREEMPT_NONE to architectures that do not
> support preemption at all. Further limit PREEMPT_VOLUNTARY to those
> architectures that do not yet have PREEMPT_LAZY support (with the eventual goal
> to make this the empty set and completely remove voluntary preemption and
> cond_resched() -- notably VOLUNTARY is already limited to !ARCH_NO_PREEMPT.)
>
> This leaves up-to-date architectures (arm64, loongarch, powerpc, riscv, s390,
> x86) with only two preemption models: full and lazy (like PREEMPT_RT).
>
> While Lazy has been the recommended setting for a while, not all distributions
> have managed to make the switch yet. Force things along. Keep the patch minimal
> in case of hard to address regressions that might pop up.
>
> Signed-off-by: Peter Zijlstra (Intel) <peterz@infradead.org>
> ---
> kernel/Kconfig.preempt | 3 +++
> kernel/sched/core.c | 2 +-
> kernel/sched/debug.c | 2 +-
> 3 files changed, 5 insertions(+), 2 deletions(-)
>
> --- a/kernel/Kconfig.preempt
> +++ b/kernel/Kconfig.preempt
> @@ -16,11 +16,13 @@ config ARCH_HAS_PREEMPT_LAZY
>
> choice
> prompt "Preemption Model"
> + default PREEMPT_LAZY if ARCH_HAS_PREEMPT_LAZY
> default PREEMPT_NONE
>
> config PREEMPT_NONE
> bool "No Forced Preemption (Server)"
> depends on !PREEMPT_RT
> + depends on ARCH_NO_PREEMPT
> select PREEMPT_NONE_BUILD if !PREEMPT_DYNAMIC
> help
> This is the traditional Linux preemption model, geared towards
> @@ -35,6 +37,7 @@ config PREEMPT_NONE
>
> config PREEMPT_VOLUNTARY
> bool "Voluntary Kernel Preemption (Desktop)"
> + depends on !ARCH_HAS_PREEMPT_LAZY
> depends on !ARCH_NO_PREEMPT
> depends on !PREEMPT_RT
> select PREEMPT_VOLUNTARY_BUILD if !PREEMPT_DYNAMIC
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -7553,7 +7553,7 @@ int preempt_dynamic_mode = preempt_dynam
>
> int sched_dynamic_mode(const char *str)
> {
> -# ifndef CONFIG_PREEMPT_RT
> +# if !(defined(CONFIG_PREEMPT_RT) || defined(CONFIG_ARCH_HAS_PREEMPT_LAZY))
> if (!strcmp(str, "none"))
> return preempt_dynamic_none;
>
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -243,7 +243,7 @@ static ssize_t sched_dynamic_write(struc
>
> static int sched_dynamic_show(struct seq_file *m, void *v)
> {
> - int i = IS_ENABLED(CONFIG_PREEMPT_RT) * 2;
> + int i = (IS_ENABLED(CONFIG_PREEMPT_RT) || IS_ENABLED(CONFIG_ARCH_HAS_PREEMPT_LAZY)) * 2;
> int j;
>
> /* Count entries in NULL terminated preempt_modes */
Maybe only change the default to LAZY, but keep the other options
available via dynamic update?

- When the kernel changes to lazy being the default, the scheduling
  pattern can change and that may affect workloads. Having the ability
  to dynamically switch to none/voluntary would help one figure out
  where a regression comes from. We could document the cases where a
  regression is expected.

- With preempt=full/lazy we will likely never see softlockups. How are
  we going to find long kernel paths (some may be by design, some may
  be bugs) apart from observing workload regressions?

Also, is the softlockup code of any use with preempt=full/lazy?
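
To make the concern concrete, here is a rough user-space model of which
mode strings the patched sched_dynamic_mode() would accept. It is only a
sketch: model_sched_dynamic_mode(), enum model_mode and the two bool
parameters are hypothetical stand-ins for the kernel function, the
preempt_dynamic_* values and the CONFIG_PREEMPT_RT /
CONFIG_ARCH_HAS_PREEMPT_LAZY options. The quoted hunk only shows the
"none" branch; that "voluntary" sits under the same guard and that
"lazy" is gated on ARCH_HAS_PREEMPT_LAZY are assumptions about the
surrounding code.

```c
#include <errno.h>
#include <stdbool.h>
#include <string.h>

/* Hypothetical stand-ins for the kernel's preempt_dynamic_* values. */
enum model_mode {
	MODEL_NONE,
	MODEL_VOLUNTARY,
	MODEL_FULL,
	MODEL_LAZY,
};

/*
 * User-space model of the patched sched_dynamic_mode(). The two bools
 * replace the compile-time CONFIG_PREEMPT_RT and
 * CONFIG_ARCH_HAS_PREEMPT_LAZY checks so both cases can be compared
 * in one binary.
 */
static int model_sched_dynamic_mode(const char *str, bool preempt_rt,
				    bool arch_has_lazy)
{
	/* Mirrors: #if !(defined(PREEMPT_RT) || defined(ARCH_HAS_PREEMPT_LAZY)) */
	if (!(preempt_rt || arch_has_lazy)) {
		if (!strcmp(str, "none"))
			return MODEL_NONE;

		/* Assumed to share the guard with "none". */
		if (!strcmp(str, "voluntary"))
			return MODEL_VOLUNTARY;
	}

	if (!strcmp(str, "full"))
		return MODEL_FULL;

	/* Assumed: "lazy" is only offered when the arch supports it. */
	if (arch_has_lazy && !strcmp(str, "lazy"))
		return MODEL_LAZY;

	return -EINVAL;
}
```

Under this model, model_sched_dynamic_mode("none", false, true) returns
-EINVAL, so on an architecture with lazy support there is no dynamic way
back to none/voluntary when chasing a regression.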