* [PATCH 0/7] Lazy preemption bits
@ 2024-10-09 16:54 Ankur Arora
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
` (6 more replies)
0 siblings, 7 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
This series adds RCU and some leftover scheduler bits for lazy
preemption. Goes on top of PeterZ's series:
https://lore.kernel.org/20241007074609.447006177@infradead.org
Tracing and RISCV bits from Sebastian at:
https://lore.kernel.org/lkml/20241009105709.887510-1-bigeasy@linutronix.de/
Please review.
Ankur Arora (6):
sched: warn for high latency with TIF_NEED_RESCHED_LAZY
rcu: limit PREEMPT_RCU configurations
rcu: fix header guard for rcu_all_qs()
rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
Shrikanth Hegde (1):
powerpc: add support for PREEMPT_LAZY
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/thread_info.h | 5 ++++-
arch/powerpc/kernel/interrupt.c | 5 +++--
include/linux/rcutree.h | 2 +-
include/linux/srcutiny.h | 2 +-
kernel/rcu/Kconfig | 4 ++--
kernel/rcu/srcutiny.c | 14 +++++++-------
kernel/rcu/tree_plugin.h | 11 +++++++----
kernel/sched/core.c | 2 +-
kernel/sched/debug.c | 7 +++++--
kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
11 files changed, 44 insertions(+), 31 deletions(-)
--
2.43.5
^ permalink raw reply [flat|nested] 75+ messages in thread
* [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-10 6:37 ` Sebastian Andrzej Siewior
` (3 more replies)
2024-10-09 16:54 ` [PATCH 2/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
` (5 subsequent siblings)
6 siblings, 4 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
without rescheduling for more than the latency_warn_ms period.
Cc: Ingo Molnar <mingo@redhat.com>
Cc: Peter Ziljstra <peterz@infradead.org>
Cc: Juri Lelli <juri.lelli@redhat.com>
Cc: Vincent Guittot <vincent.guittot@linaro.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/sched/core.c | 2 +-
kernel/sched/debug.c | 7 +++++--
2 files changed, 6 insertions(+), 3 deletions(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index 694bfcf153cb..1229766b704e 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
if (sysctl_resched_latency_warn_once && warned_once)
return 0;
- if (!need_resched() || !latency_warn_ms)
+ if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
return 0;
if (system_state == SYSTEM_BOOTING)
diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
index 9abcc6ead11b..f0d551ba64bb 100644
--- a/kernel/sched/debug.c
+++ b/kernel/sched/debug.c
@@ -1293,9 +1293,12 @@ void proc_sched_set_task(struct task_struct *p)
void resched_latency_warn(int cpu, u64 latency)
{
static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
+ char *nr;
+
+ nr = tif_need_resched() ? "need_resched" : "need_resched_lazy";
WARN(__ratelimit(&latency_check_ratelimit),
- "sched: CPU %d need_resched set for > %llu ns (%d ticks) "
+ "sched: CPU %d %s set for > %llu ns (%d ticks) "
"without schedule\n",
- cpu, latency, cpu_rq(cpu)->ticks_without_resched);
+ cpu, nr, latency, cpu_rq(cpu)->ticks_without_resched);
}
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-09 18:01 ` Peter Zijlstra
2024-10-09 16:54 ` [PATCH 3/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
` (4 subsequent siblings)
6 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
which allows for dynamic switching of preemption models.
The choice of preemptible RCU or not, however, is fixed at compile
time. Given the trade-offs made to have a preemptible RCU, some
configurations which have limited preemption might prefer the
stronger forward-progress guarantees of PREEMPT_RCU=n.
Accordingly, explicitly limit PREEMPT_RCU=y to PREEMPT_DYNAMIC,
PREEMPT, PREEMPT_RT.
This means that (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), which selects
PREEMPTION will run with PREEMPT_RCU=n. The combination (PREEMPT_LAZY=y,
PREEMPT_DYNAMIC=y), will run with PREEMPT_RCU=y.
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/rcu/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 3e079de0f5b4..1bfe7016724f 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -18,7 +18,7 @@ config TREE_RCU
config PREEMPT_RCU
bool
- default y if PREEMPTION
+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
select TREE_RCU
help
This option selects the RCU implementation that is
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 3/7] rcu: fix header guard for rcu_all_qs()
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
2024-10-09 16:54 ` [PATCH 2/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-10 6:41 ` Sebastian Andrzej Siewior
2024-10-09 16:54 ` [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
` (3 subsequent siblings)
6 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
is conditioned on CONFIG_PREEMPTION.
With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
CONFIG_PREEMPT_RCU=y.
Decouple the two.
Cc: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/linux/rcutree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 90a684f94776..ae8b5cb475a3 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -104,7 +104,7 @@ extern int rcu_scheduler_active;
void rcu_end_inkernel_boot(void);
bool rcu_inkernel_boot_has_ended(void);
bool rcu_is_watching(void);
-#ifndef CONFIG_PREEMPTION
+#ifndef CONFIG_PREEMPT_RCU
void rcu_all_qs(void);
#endif
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
` (2 preceding siblings ...)
2024-10-09 16:54 ` [PATCH 3/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-09 19:05 ` Ankur Arora
2024-10-10 6:50 ` Sebastian Andrzej Siewior
2024-10-09 16:54 ` [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
` (2 subsequent siblings)
6 siblings, 2 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
states for read-side critical sections via rcu_all_qs().
One reason why this was needed, was lacking preempt-count, the tick
handler has no way of knowing whether it is executing in a read-side
critical section or not.
With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
quiescent states via rcu_all_qs().
So, use the availability of preempt_count() to report quiescent states
in rcu_flavor_sched_clock_irq().
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/rcu/tree_plugin.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 1c7cbd145d5e..da324d66034b 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
*/
static void rcu_flavor_sched_clock_irq(int user)
{
- if (user || rcu_is_cpu_rrupt_from_idle()) {
+ if (user || rcu_is_cpu_rrupt_from_idle() ||
+ (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
+ !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
/*
* Get here if this CPU took its interrupt from user
- * mode or from the idle loop, and if this is not a
- * nested interrupt. In this case, the CPU is in
- * a quiescent state, so note it.
+ * mode, from the idle loop without this being a nested
+ * interrupt, or while not holding a preempt count (but
+ * with PREEMPT_COUNT=y. In this case, the CPU is in a
+ * quiescent state, so note it.
*
* No memory barrier is required here because rcu_qs()
* references only CPU-local variables that other CPUs
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
` (3 preceding siblings ...)
2024-10-09 16:54 ` [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-09 18:02 ` Peter Zijlstra
2024-10-09 16:54 ` [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y Ankur Arora
2024-10-09 16:54 ` [PATCH 7/7] powerpc: add support for PREEMPT_LAZY Ankur Arora
6 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
CC: Paul E. McKenney <paulmck@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/linux/srcutiny.h | 2 +-
kernel/rcu/Kconfig | 2 +-
kernel/rcu/srcutiny.c | 14 +++++++-------
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 4d96bbdb45f0..1635c5e2662f 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -64,7 +64,7 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
{
int idx;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1);
preempt_enable();
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 1bfe7016724f..cd3148fb566a 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -91,7 +91,7 @@ config NEED_TASKS_RCU
config TASKS_RCU
bool
- default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
+ default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_LAZY)
select IRQ_WORK
config FORCE_TASKS_RUDE_RCU
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 549c03336ee9..8a662d911abd 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -98,7 +98,7 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
{
int newval;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
preempt_enable();
@@ -120,7 +120,7 @@ void srcu_drive_gp(struct work_struct *wp)
struct srcu_struct *ssp;
ssp = container_of(wp, struct srcu_struct, srcu_work);
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) {
return; /* Already running or nothing to do. */
preempt_enable();
@@ -138,7 +138,7 @@ void srcu_drive_gp(struct work_struct *wp)
WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
preempt_enable();
swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
preempt_enable();
@@ -159,7 +159,7 @@ void srcu_drive_gp(struct work_struct *wp)
* at interrupt level, but the ->srcu_gp_running checks will
* straighten that out.
*/
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
WRITE_ONCE(ssp->srcu_gp_running, false);
idx = ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max));
preempt_enable();
@@ -172,7 +172,7 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
{
unsigned long cookie;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
cookie = get_state_synchronize_srcu(ssp);
if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie)) {
preempt_enable();
@@ -199,7 +199,7 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
rhp->func = func;
rhp->next = NULL;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
local_irq_save(flags);
*ssp->srcu_cb_tail = rhp;
ssp->srcu_cb_tail = &rhp->next;
@@ -261,7 +261,7 @@ unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
{
unsigned long ret;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
ret = get_state_synchronize_srcu(ssp);
srcu_gp_start_if_needed(ssp);
preempt_enable();
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
` (4 preceding siblings ...)
2024-10-09 16:54 ` [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-10 6:53 ` Sebastian Andrzej Siewior
2024-10-09 16:54 ` [PATCH 7/7] powerpc: add support for PREEMPT_LAZY Ankur Arora
6 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
To reduce RCU noise for nohz_full configurations, osnoise depends
on cond_resched() providing quiescent states for PREEMPT_RCU=n
configurations. And, for PREEMPT_RCU=y configurations does this
by directly calling rcu_momentary_eqs().
With PREEMPT_LAZY=y, however, we can have configurations with
(PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
can help.
Handle that by fallback to the explicit quiescent states via
rcu_momentary_eqs().
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/trace/trace_osnoise.c | 22 ++++++++++++----------
1 file changed, 12 insertions(+), 10 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index 1439064f65d6..4f4ebf3f15f0 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1538,18 +1538,20 @@ static int run_osnoise(void)
/*
* In some cases, notably when running on a nohz_full CPU with
* a stopped tick PREEMPT_RCU has no way to account for QSs.
- * This will eventually cause unwarranted noise as PREEMPT_RCU
- * will force preemption as the means of ending the current
- * grace period. We avoid this problem by calling
- * rcu_momentary_eqs(), which performs a zero duration
- * EQS allowing PREEMPT_RCU to end the current grace period.
- * This call shouldn't be wrapped inside an RCU critical
- * section.
+ * This will eventually cause unwarranted noise as RCU forces
+ * preemption as the means of ending the current grace period.
+ * We avoid this by calling rcu_momentary_eqs(), which performs
+ * a zero duration EQS allowing RCU to end the current grace
+ * period. This call shouldn't be wrapped inside an RCU
+ * critical section.
*
- * Note that in non PREEMPT_RCU kernels QSs are handled through
- * cond_resched()
+ * For non-PREEMPT_RCU kernels with cond_resched() (non-
+ * PREEMPT_LAZY configurations), QSs are handled through
+ * cond_resched(). For PREEMPT_LAZY kernels, we fallback to
+ * the zero duration QS via rcu_momentary_eqs().
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
+ if (IS_ENABLED(CONFIG_PREEMPT_RCU) ||
+ (!IS_ENABLED(CONFIG_PREEMPT_RCU) && IS_ENABLED(CONFIG_PREEMPTION))) {
if (!disable_irq)
local_irq_disable();
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* [PATCH 7/7] powerpc: add support for PREEMPT_LAZY
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
` (5 preceding siblings ...)
2024-10-09 16:54 ` [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y Ankur Arora
@ 2024-10-09 16:54 ` Ankur Arora
2024-10-10 7:22 ` Sebastian Andrzej Siewior
6 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 16:54 UTC (permalink / raw)
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
From: Shrikanth Hegde <sshegde@linux.ibm.com>
Add PowerPC arch support for PREEMPT_LAZY by defining LAZY bits.
Since PowerPC doesn't use generic exit to functions, check for
NEED_RESCHED_LAZY when exiting to user or to the kernel from
interrupt routines.
Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
[ Changed TIF_NEED_RESCHED_LAZY to now be defined unconditionally. ]
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
arch/powerpc/Kconfig | 1 +
arch/powerpc/include/asm/thread_info.h | 5 ++++-
arch/powerpc/kernel/interrupt.c | 5 +++--
3 files changed, 8 insertions(+), 3 deletions(-)
diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
index 8094a01974cc..593a1d60d443 100644
--- a/arch/powerpc/Kconfig
+++ b/arch/powerpc/Kconfig
@@ -270,6 +270,7 @@ config PPC
select HAVE_PERF_REGS
select HAVE_PERF_USER_STACK_DUMP
select HAVE_RETHOOK if KPROBES
+ select ARCH_HAS_PREEMPT_LAZY
select HAVE_REGS_AND_STACK_ACCESS_API
select HAVE_RELIABLE_STACKTRACE
select HAVE_RSEQ
diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
index 6ebca2996f18..ae7793dae763 100644
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -117,11 +117,14 @@ void arch_setup_new_exec(void);
#endif
#define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */
#define TIF_32BIT 20 /* 32 bit binary */
+#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
/* as above, but as bit values */
#define _TIF_SYSCALL_TRACE (1<<TIF_SYSCALL_TRACE)
#define _TIF_SIGPENDING (1<<TIF_SIGPENDING)
#define _TIF_NEED_RESCHED (1<<TIF_NEED_RESCHED)
+#define _TIF_NEED_RESCHED_LAZY (1<<TIF_NEED_RESCHED_LAZY)
+
#define _TIF_NOTIFY_SIGNAL (1<<TIF_NOTIFY_SIGNAL)
#define _TIF_POLLING_NRFLAG (1<<TIF_POLLING_NRFLAG)
#define _TIF_32BIT (1<<TIF_32BIT)
@@ -144,7 +147,7 @@ void arch_setup_new_exec(void);
#define _TIF_USER_WORK_MASK (_TIF_SIGPENDING | _TIF_NEED_RESCHED | \
_TIF_NOTIFY_RESUME | _TIF_UPROBE | \
_TIF_RESTORE_TM | _TIF_PATCH_PENDING | \
- _TIF_NOTIFY_SIGNAL)
+ _TIF_NOTIFY_SIGNAL | _TIF_NEED_RESCHED_LAZY)
#define _TIF_PERSYSCALL_MASK (_TIF_RESTOREALL|_TIF_NOERROR)
/* Bits in local_flags */
diff --git a/arch/powerpc/kernel/interrupt.c b/arch/powerpc/kernel/interrupt.c
index af62ec974b97..86fe60295de7 100644
--- a/arch/powerpc/kernel/interrupt.c
+++ b/arch/powerpc/kernel/interrupt.c
@@ -185,7 +185,7 @@ interrupt_exit_user_prepare_main(unsigned long ret, struct pt_regs *regs)
ti_flags = read_thread_flags();
while (unlikely(ti_flags & (_TIF_USER_WORK_MASK & ~_TIF_RESTORE_TM))) {
local_irq_enable();
- if (ti_flags & _TIF_NEED_RESCHED) {
+ if (ti_flags & (_TIF_NEED_RESCHED | _TIF_NEED_RESCHED_LAZY)) {
schedule();
} else {
/*
@@ -396,7 +396,8 @@ notrace unsigned long interrupt_exit_kernel_prepare(struct pt_regs *regs)
/* Returning to a kernel context with local irqs enabled. */
WARN_ON_ONCE(!(regs->msr & MSR_EE));
again:
- if (IS_ENABLED(CONFIG_PREEMPT)) {
+
+ if (IS_ENABLED(CONFIG_PREEMPTION)) {
/* Return to preemptible kernel context */
if (unlikely(read_thread_flags() & _TIF_NEED_RESCHED)) {
if (preempt_count() == 0)
--
2.43.5
^ permalink raw reply related [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 16:54 ` [PATCH 2/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
@ 2024-10-09 18:01 ` Peter Zijlstra
2024-10-09 18:24 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-09 18:01 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 09:54:06AM -0700, Ankur Arora wrote:
> PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
> which allows for dynamic switching of preemption models.
>
> The choice of preemptible RCU or not, however, is fixed at compile
> time. Given the trade-offs made to have a preemptible RCU, some
> configurations which have limited preemption might prefer the
> stronger forward-progress guarantees of PREEMPT_RCU=n.
>
> Accordingly, explicitly limit PREEMPT_RCU=y to PREEMPT_DYNAMIC,
> PREEMPT, PREEMPT_RT.
>
> This means that (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), which selects
> PREEMPTION will run with PREEMPT_RCU=n. The combination (PREEMPT_LAZY=y,
> PREEMPT_DYNAMIC=y), will run with PREEMPT_RCU=y.
I am completely confused by this. Why do we want this?
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Peter Zijlstra <peterz@infradead.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> kernel/rcu/Kconfig | 2 +-
> 1 file changed, 1 insertion(+), 1 deletion(-)
>
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 3e079de0f5b4..1bfe7016724f 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -18,7 +18,7 @@ config TREE_RCU
>
> config PREEMPT_RCU
> bool
> - default y if PREEMPTION
> + default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> select TREE_RCU
> help
> This option selects the RCU implementation that is
> --
> 2.43.5
>
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
2024-10-09 16:54 ` [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
@ 2024-10-09 18:02 ` Peter Zijlstra
2024-10-09 18:52 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-09 18:02 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 09:54:09AM -0700, Ankur Arora wrote:
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 1bfe7016724f..cd3148fb566a 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
>
> config TASKS_RCU
> bool
> - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
> + default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_LAZY)
> select IRQ_WORK
PREEMPT_LAZY implies PREEMPTION
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 18:01 ` Peter Zijlstra
@ 2024-10-09 18:24 ` Paul E. McKenney
2024-10-09 20:52 ` Peter Zijlstra
2024-10-10 6:32 ` Sebastian Andrzej Siewior
0 siblings, 2 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-09 18:24 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ankur Arora, linux-kernel, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 08:01:17PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2024 at 09:54:06AM -0700, Ankur Arora wrote:
> > PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
> > which allows for dynamic switching of preemption models.
> >
> > The choice of preemptible RCU or not, however, is fixed at compile
> > time. Given the trade-offs made to have a preemptible RCU, some
> > configurations which have limited preemption might prefer the
> > stronger forward-progress guarantees of PREEMPT_RCU=n.
> >
> > Accordingly, explicitly limit PREEMPT_RCU=y to PREEMPT_DYNAMIC,
> > PREEMPT, PREEMPT_RT.
> >
> > This means that (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), which selects
> > PREEMPTION will run with PREEMPT_RCU=n. The combination (PREEMPT_LAZY=y,
> > PREEMPT_DYNAMIC=y), will run with PREEMPT_RCU=y.
>
> I am completely confused by this. Why do we want this?
In order to support systems that currently run CONFIG_PREEMPT=n that
are adequately but not overly endowed with memory. If we allow all
RCU readers to be preempted, we increase grace-period latency, and also
increase OOM incidence. Which we would like to avoid.
But we do want lazy preemption otherwise, for but one thing to reduce
tail latencies and to reduce the need for preemption points. Thus, we
want a way to allow lazy preemption in general, but to continue with
non-preemptible RCU read-side critical sections.
Or am I once again missing your point?
Thanx, Paul
> > Cc: Paul E. McKenney <paulmck@kernel.org>
> > Cc: Peter Zijlstra <peterz@infradead.org>
> > Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> > ---
> > kernel/rcu/Kconfig | 2 +-
> > 1 file changed, 1 insertion(+), 1 deletion(-)
> >
> > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > index 3e079de0f5b4..1bfe7016724f 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -18,7 +18,7 @@ config TREE_RCU
> >
> > config PREEMPT_RCU
> > bool
> > - default y if PREEMPTION
> > + default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> > select TREE_RCU
> > help
> > This option selects the RCU implementation that is
> > --
> > 2.43.5
> >
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
2024-10-09 18:02 ` Peter Zijlstra
@ 2024-10-09 18:52 ` Ankur Arora
0 siblings, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 18:52 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ankur Arora, linux-kernel, tglx, paulmck, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Peter Zijlstra <peterz@infradead.org> writes:
> On Wed, Oct 09, 2024 at 09:54:09AM -0700, Ankur Arora wrote:
>
>> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
>> index 1bfe7016724f..cd3148fb566a 100644
>> --- a/kernel/rcu/Kconfig
>> +++ b/kernel/rcu/Kconfig
>> @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
>>
>> config TASKS_RCU
>> bool
>> - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
>> + default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_LAZY)
>> select IRQ_WORK
>
> PREEMPT_LAZY implies PREEMPTION
So it does. Not sure why I included that.
The PREEMPT_AUTO conditional can be removed altogether.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-09 16:54 ` [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
@ 2024-10-09 19:05 ` Ankur Arora
2024-10-10 14:37 ` Paul E. McKenney
2024-10-10 6:50 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-09 19:05 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
Ankur Arora <ankur.a.arora@oracle.com> writes:
> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> states for read-side critical sections via rcu_all_qs().
> One reason why this was needed, was lacking preempt-count, the tick
> handler has no way of knowing whether it is executing in a read-side
> critical section or not.
>
> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> quiescent states via rcu_all_qs().
>
> So, use the availability of preempt_count() to report quiescent states
> in rcu_flavor_sched_clock_irq().
A note about the inverse of this case, where we might have long running
loops which only temporarily enable preemption and thus would be
unlikely to align themselves with the tick: in prior discussions [1]
Paul had pointed the need for providing for forcing a context switch
in such a scenario.
I had a patch which did that, but I think it is unnecessary since this
clause in rcu_sched_clock_irq() should already handle it.
void rcu_sched_clock_irq(int user) {
...
/* The load-acquire pairs with the store-release setting to true. */
if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
/* Idle and userspace execution already are quiescent states. */
if (!rcu_is_cpu_rrupt_from_idle() && !user) {
set_tsk_need_resched(current);
set_preempt_need_resched();
}
__this_cpu_write(rcu_data.rcu_urgent_qs, false);
}
Paul?
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 18:24 ` Paul E. McKenney
@ 2024-10-09 20:52 ` Peter Zijlstra
2024-10-09 21:16 ` Paul E. McKenney
2024-10-10 6:32 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-09 20:52 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ankur Arora, linux-kernel, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 11:24:09AM -0700, Paul E. McKenney wrote:
> On Wed, Oct 09, 2024 at 08:01:17PM +0200, Peter Zijlstra wrote:
> > On Wed, Oct 09, 2024 at 09:54:06AM -0700, Ankur Arora wrote:
> > > PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
> > > which allows for dynamic switching of preemption models.
> > >
> > > The choice of preemptible RCU or not, however, is fixed at compile
> > > time. Given the trade-offs made to have a preemptible RCU, some
> > > configurations which have limited preemption might prefer the
> > > stronger forward-progress guarantees of PREEMPT_RCU=n.
> > >
> > > Accordingly, explicitly limit PREEMPT_RCU=y to PREEMPT_DYNAMIC,
> > > PREEMPT, PREEMPT_RT.
> > >
> > > This means that (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), which selects
> > > PREEMPTION will run with PREEMPT_RCU=n. The combination (PREEMPT_LAZY=y,
> > > PREEMPT_DYNAMIC=y), will run with PREEMPT_RCU=y.
> >
> > I am completely confused by this. Why do we want this?
>
> In order to support systems that currently run CONFIG_PREEMPT=n that
> are adequately but not overly endowed with memory. If we allow all
> RCU readers to be preempted, we increase grace-period latency, and also
> increase OOM incidence. Which we would like to avoid.
>
> But we do want lazy preemption otherwise, for but one thing to reduce
> tail latencies and to reduce the need for preemption points. Thus, we
> want a way to allow lazy preemption in general, but to continue with
> non-preemptible RCU read-side critical sections.
>
> Or am I once again missing your point?
Even without this patch this is allowed, right? It's just a default
that's changed. If people want to run PREEMPT_RCU=n, they can select it.
I just don't see a point in making this change.
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 20:52 ` Peter Zijlstra
@ 2024-10-09 21:16 ` Paul E. McKenney
2024-10-10 7:58 ` Peter Zijlstra
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-09 21:16 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ankur Arora, linux-kernel, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 10:52:18PM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2024 at 11:24:09AM -0700, Paul E. McKenney wrote:
> > On Wed, Oct 09, 2024 at 08:01:17PM +0200, Peter Zijlstra wrote:
> > > On Wed, Oct 09, 2024 at 09:54:06AM -0700, Ankur Arora wrote:
> > > > PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
> > > > which allows for dynamic switching of preemption models.
> > > >
> > > > The choice of preemptible RCU or not, however, is fixed at compile
> > > > time. Given the trade-offs made to have a preemptible RCU, some
> > > > configurations which have limited preemption might prefer the
> > > > stronger forward-progress guarantees of PREEMPT_RCU=n.
> > > >
> > > > Accordingly, explicitly limit PREEMPT_RCU=y to PREEMPT_DYNAMIC,
> > > > PREEMPT, PREEMPT_RT.
> > > >
> > > > This means that (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), which selects
> > > > PREEMPTION will run with PREEMPT_RCU=n. The combination (PREEMPT_LAZY=y,
> > > > PREEMPT_DYNAMIC=y), will run with PREEMPT_RCU=y.
> > >
> > > I am completely confused by this. Why do we want this?
> >
> > In order to support systems that currently run CONFIG_PREEMPT=n that
> > are adequately but not overly endowed with memory. If we allow all
> > RCU readers to be preempted, we increase grace-period latency, and also
> > increase OOM incidence. Which we would like to avoid.
> >
> > But we do want lazy preemption otherwise, for but one thing to reduce
> > tail latencies and to reduce the need for preemption points. Thus, we
> > want a way to allow lazy preemption in general, but to continue with
> > non-preemptible RCU read-side critical sections.
> >
> > Or am I once again missing your point?
>
> Even without this patch this is allowed, right? It's just a default
> that's changed. If people want to run PREEMPT_RCU=n, they can select it.
>
> I just don't see a point in making this change.
Because we don't need a bunch of people surprised by this change in
behavior.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 18:24 ` Paul E. McKenney
2024-10-09 20:52 ` Peter Zijlstra
@ 2024-10-10 6:32 ` Sebastian Andrzej Siewior
2024-10-10 8:10 ` Peter Zijlstra
1 sibling, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 6:32 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
> In order to support systems that currently run CONFIG_PREEMPT=n that
…
> Or am I once again missing your point?
The change is:
| config PREEMPT_RCU
| bool
|- default y if PREEMPTION
|+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
Now:
- CONFIG_PREEMPT select PREEMPT_BUILD
- PREEMPT_RT select CONFIG_PREEMPTION
- PREEMPT_DYNAMIC selects PREEMPT_BUILD
and PREEMPT_BUILD select CONFIG_PREEMPTION
so in the end, this change is a nop, right?
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
@ 2024-10-10 6:37 ` Sebastian Andrzej Siewior
2024-10-10 18:19 ` Ankur Arora
2024-10-13 9:44 ` kernel test robot
` (2 subsequent siblings)
3 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 6:37 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-09 09:54:05 [-0700], Ankur Arora wrote:
> resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
> without rescheduling for more than the latency_warn_ms period.
The description is odd. I think you want to say that
resched_latency_warn() does not warn if a task has TIF_NEED_RESCHED_LAZY
has set for longer periods and you want to add this functionality.
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Ziljstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
…
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 694bfcf153cb..1229766b704e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
> if (sysctl_resched_latency_warn_once && warned_once)
> return 0;
>
> - if (!need_resched() || !latency_warn_ms)
> + if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
tif_need_resched_lazy() is not doing what you think it is doing.
Either PeterZ makes a helper for this or you need
tif_test_bit(TIF_NEED_RESCHED_LAZY).
> return 0;
>
> if (system_state == SYSTEM_BOOTING)
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 3/7] rcu: fix header guard for rcu_all_qs()
2024-10-09 16:54 ` [PATCH 3/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
@ 2024-10-10 6:41 ` Sebastian Andrzej Siewior
2024-10-10 8:11 ` Peter Zijlstra
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 6:41 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-09 09:54:07 [-0700], Ankur Arora wrote:
> rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
> is conditioned on CONFIG_PREEMPTION.
>
> With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
> CONFIG_PREEMPT_RCU=y.
From kernel/rcu/Kconfig:
| config PREEMPT_RCU
| bool
| default y if PREEMPTION
this looks like PREEMPT_RCU implies PREEMPTION.
> Decouple the two.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-09 16:54 ` [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
2024-10-09 19:05 ` Ankur Arora
@ 2024-10-10 6:50 ` Sebastian Andrzej Siewior
2024-10-10 17:56 ` Ankur Arora
1 sibling, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 6:50 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-09 09:54:08 [-0700], Ankur Arora wrote:
> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> states for read-side critical sections via rcu_all_qs().
> One reason why this was needed, was lacking preempt-count, the tick
> handler has no way of knowing whether it is executing in a read-side
> critical section or not.
>
> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> quiescent states via rcu_all_qs().
PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
turn selects PREEMPT_RCU. So this is not a valid combination. Do you
have a different tree than I do? Because maybe I am missing something.
> So, use the availability of preempt_count() to report quiescent states
> in rcu_flavor_sched_clock_irq().
>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> kernel/rcu/tree_plugin.h | 11 +++++++----
> 1 file changed, 7 insertions(+), 4 deletions(-)
>
> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
> index 1c7cbd145d5e..da324d66034b 100644
> --- a/kernel/rcu/tree_plugin.h
> +++ b/kernel/rcu/tree_plugin.h
> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
> */
> static void rcu_flavor_sched_clock_irq(int user)
> {
> - if (user || rcu_is_cpu_rrupt_from_idle()) {
> + if (user || rcu_is_cpu_rrupt_from_idle() ||
> + (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
couldn't you use a helper preemptible()?
> /*
> * Get here if this CPU took its interrupt from user
> - * mode or from the idle loop, and if this is not a
> - * nested interrupt. In this case, the CPU is in
> - * a quiescent state, so note it.
> + * mode, from the idle loop without this being a nested
> + * interrupt, or while not holding a preempt count (but
> + * with PREEMPT_COUNT=y. In this case, the CPU is in a
> + * quiescent state, so note it.
> *
> * No memory barrier is required here because rcu_qs()
> * references only CPU-local variables that other CPUs
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-09 16:54 ` [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y Ankur Arora
@ 2024-10-10 6:53 ` Sebastian Andrzej Siewior
2024-10-10 14:39 ` Paul E. McKenney
2024-10-10 17:50 ` Ankur Arora
0 siblings, 2 replies; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 6:53 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-09 09:54:10 [-0700], Ankur Arora wrote:
> To reduce RCU noise for nohz_full configurations, osnoise depends
> on cond_resched() providing quiescent states for PREEMPT_RCU=n
> configurations. And, for PREEMPT_RCU=y configurations does this
> by directly calling rcu_momentary_eqs().
>
> With PREEMPT_LAZY=y, however, we can have configurations with
> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
> can help.
PREEMPTION=y, PREEMPT_RCU=n should not be possible.
> Handle that by fallback to the explicit quiescent states via
> rcu_momentary_eqs().
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 7/7] powerpc: add support for PREEMPT_LAZY
2024-10-09 16:54 ` [PATCH 7/7] powerpc: add support for PREEMPT_LAZY Ankur Arora
@ 2024-10-10 7:22 ` Sebastian Andrzej Siewior
2024-10-10 18:10 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 7:22 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-09 09:54:11 [-0700], Ankur Arora wrote:
> From: Shrikanth Hegde <sshegde@linux.ibm.com>
>
> Add PowerPC arch support for PREEMPT_LAZY by defining LAZY bits.
>
> Since PowerPC doesn't use generic exit to functions, check for
> NEED_RESCHED_LAZY when exiting to user or to the kernel from
> interrupt routines.
>
> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
> [ Changed TIF_NEED_RESCHED_LAZY to now be defined unconditionally. ]
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> arch/powerpc/Kconfig | 1 +
> arch/powerpc/include/asm/thread_info.h | 5 ++++-
> arch/powerpc/kernel/interrupt.c | 5 +++--
> 3 files changed, 8 insertions(+), 3 deletions(-)
>
> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
> index 8094a01974cc..593a1d60d443 100644
> --- a/arch/powerpc/Kconfig
> +++ b/arch/powerpc/Kconfig
> @@ -270,6 +270,7 @@ config PPC
> select HAVE_PERF_REGS
> select HAVE_PERF_USER_STACK_DUMP
> select HAVE_RETHOOK if KPROBES
> + select ARCH_HAS_PREEMPT_LAZY
> select HAVE_REGS_AND_STACK_ACCESS_API
> select HAVE_RELIABLE_STACKTRACE
> select HAVE_RSEQ
I would move this up to the ARCH_HAS_ block.
> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
> index 6ebca2996f18..ae7793dae763 100644
> --- a/arch/powerpc/include/asm/thread_info.h
> +++ b/arch/powerpc/include/asm/thread_info.h
> @@ -117,11 +117,14 @@ void arch_setup_new_exec(void);
> #endif
> #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */
> #define TIF_32BIT 20 /* 32 bit binary */
> +#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
I don't see any of the bits being used in assembly anymore.
If you group the _TIF_USER_WORK_MASK bits it a single 16 bit block then
the compiler could issue a single andi.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-09 21:16 ` Paul E. McKenney
@ 2024-10-10 7:58 ` Peter Zijlstra
2024-10-10 14:19 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-10 7:58 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Ankur Arora, linux-kernel, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 02:16:43PM -0700, Paul E. McKenney wrote:
> Because we don't need a bunch of people surprised by this change in
> behavior.
I'm not sure what change in behaviour; Lazy is brand spanking new, it
will have new behaviour not in line with previous modes. Just put in the
help text that if you're looking for a reality closer to the old
Voluntary to use PREEMPT_RCU=n.
In reality very few people will care, because distros will be running
PREEMPT_DYNAMIC and the 3 people doing their own thing ought to be savvy
enough to deal with things, no?
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 6:32 ` Sebastian Andrzej Siewior
@ 2024-10-10 8:10 ` Peter Zijlstra
2024-10-10 9:13 ` Sebastian Andrzej Siewior
2024-10-10 17:42 ` Ankur Arora
0 siblings, 2 replies; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-10 8:10 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Paul E. McKenney, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 10, 2024 at 08:32:07AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
> > In order to support systems that currently run CONFIG_PREEMPT=n that
> …
> > Or am I once again missing your point?
>
> The change is:
> | config PREEMPT_RCU
> | bool
> |- default y if PREEMPTION
> |+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
>
> Now:
> - CONFIG_PREEMPT select PREEMPT_BUILD
> - PREEMPT_RT select CONFIG_PREEMPTION
> - PREEMPT_DYNAMIC selects PREEMPT_BUILD
>
> and PREEMPT_BUILD select CONFIG_PREEMPTION
>
> so in the end, this change is a nop, right?
PREEMPT_RT selects PREEMPTION *and* has one of PREEMPT / PREEMPT_LAZY /
PREEMPT_DYNAMIC, all of which in turn select PREEMPT_BUILD, which
selects PREEMPTION.
(arguably we can remove the select PREEMPTION from PREEMPT_RT)
The proposed change is not a nop because the config: PREEMPT_LAZY=y
PREEMPT_DYNAMIC=n will result in false, while it will have PREEMPTION.
That said, I really do not agree with the change, it makes the condition
complicated for no reason.
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 3/7] rcu: fix header guard for rcu_all_qs()
2024-10-10 6:41 ` Sebastian Andrzej Siewior
@ 2024-10-10 8:11 ` Peter Zijlstra
2024-10-10 14:29 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-10 8:11 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Thu, Oct 10, 2024 at 08:41:23AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-09 09:54:07 [-0700], Ankur Arora wrote:
> > rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
> > is conditioned on CONFIG_PREEMPTION.
> >
> > With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
> > CONFIG_PREEMPT_RCU=y.
>
> From kernel/rcu/Kconfig:
> | config PREEMPT_RCU
> | bool
> | default y if PREEMPTION
>
> this looks like PREEMPT_RCU implies PREEMPTION.
The point was to make PREEMPT_RCU=n possible even when PREEMPTION=y,
which is fine.
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 8:10 ` Peter Zijlstra
@ 2024-10-10 9:13 ` Sebastian Andrzej Siewior
2024-10-10 10:03 ` Peter Zijlstra
2024-10-10 17:35 ` Ankur Arora
2024-10-10 17:42 ` Ankur Arora
1 sibling, 2 replies; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 9:13 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Paul E. McKenney, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-10 10:10:32 [+0200], Peter Zijlstra wrote:
> On Thu, Oct 10, 2024 at 08:32:07AM +0200, Sebastian Andrzej Siewior wrote:
> > On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
> > > In order to support systems that currently run CONFIG_PREEMPT=n that
> > …
> > > Or am I once again missing your point?
> >
> > The change is:
> > | config PREEMPT_RCU
> > | bool
> > |- default y if PREEMPTION
> > |+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> >
> > Now:
> > - CONFIG_PREEMPT select PREEMPT_BUILD
> > - PREEMPT_RT select CONFIG_PREEMPTION
> > - PREEMPT_DYNAMIC selects PREEMPT_BUILD
> >
> > and PREEMPT_BUILD select CONFIG_PREEMPTION
> >
> > so in the end, this change is a nop, right?
>
> PREEMPT_RT selects PREEMPTION *and* has one of PREEMPT / PREEMPT_LAZY /
> PREEMPT_DYNAMIC, all of which in turn select PREEMPT_BUILD, which
> selects PREEMPTION.
>
> (arguably we can remove the select PREEMPTION from PREEMPT_RT)
>
> The proposed change is not a nop because the config: PREEMPT_LAZY=y
> PREEMPT_DYNAMIC=n will result in false, while it will have PREEMPTION.
I have a config with PREEMPT_LAZY=y PREEMPT_DYNAMIC=n and
CONFIG_PREEMPT_RCU=y.
I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
> That said, I really do not agree with the change, it makes the condition
> complicated for no reason.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 9:13 ` Sebastian Andrzej Siewior
@ 2024-10-10 10:03 ` Peter Zijlstra
2024-10-10 10:26 ` Sebastian Andrzej Siewior
2024-10-10 17:35 ` Ankur Arora
1 sibling, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-10 10:03 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Paul E. McKenney, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 10, 2024 at 11:13:26AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-10 10:10:32 [+0200], Peter Zijlstra wrote:
> > On Thu, Oct 10, 2024 at 08:32:07AM +0200, Sebastian Andrzej Siewior wrote:
> > > On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
> > > > In order to support systems that currently run CONFIG_PREEMPT=n that
> > > …
> > > > Or am I once again missing your point?
> > >
> > > The change is:
> > > | config PREEMPT_RCU
> > > | bool
> > > |- default y if PREEMPTION
> > > |+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> > >
> > > Now:
> > > - CONFIG_PREEMPT select PREEMPT_BUILD
> > > - PREEMPT_RT select CONFIG_PREEMPTION
> > > - PREEMPT_DYNAMIC selects PREEMPT_BUILD
> > >
> > > and PREEMPT_BUILD select CONFIG_PREEMPTION
> > >
> > > so in the end, this change is a nop, right?
> >
> > PREEMPT_RT selects PREEMPTION *and* has one of PREEMPT / PREEMPT_LAZY /
> > PREEMPT_DYNAMIC, all of which in turn select PREEMPT_BUILD, which
> > selects PREEMPTION.
> >
> > (arguably we can remove the select PREEMPTION from PREEMPT_RT)
> >
> > The proposed change is not a nop because the config: PREEMPT_LAZY=y
> > PREEMPT_DYNAMIC=n will result in false, while it will have PREEMPTION.
>
> I have a config with PREEMPT_LAZY=y PREEMPT_DYNAMIC=n and
> CONFIG_PREEMPT_RCU=y.
>
> I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
> PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
Oh, the entry isn't user selectable? Fix that perhaps?
- bool
+ bool "Use preemptible RCU"
Or something along those lines -- I forever forget how Kconfig works.
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 10:03 ` Peter Zijlstra
@ 2024-10-10 10:26 ` Sebastian Andrzej Siewior
2024-10-10 10:44 ` Peter Zijlstra
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-10 10:26 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Paul E. McKenney, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-10 12:03:08 [+0200], Peter Zijlstra wrote:
> >
> > I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
> > PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
>
> Oh, the entry isn't user selectable? Fix that perhaps?
>
> - bool
> + bool "Use preemptible RCU"
>
> Or something along those lines -- I forever forget how Kconfig works.
Oh. Well, yes. If we do this then it becomes suddenly selectable and
half of the series makes sense…
But as you said, this complicates things.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 10:26 ` Sebastian Andrzej Siewior
@ 2024-10-10 10:44 ` Peter Zijlstra
2024-10-10 14:29 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Peter Zijlstra @ 2024-10-10 10:44 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Paul E. McKenney, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 10, 2024 at 12:26:57PM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-10 12:03:08 [+0200], Peter Zijlstra wrote:
> > >
> > > I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
> > > PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
> >
> > Oh, the entry isn't user selectable? Fix that perhaps?
> >
> > - bool
> > + bool "Use preemptible RCU"
> >
> > Or something along those lines -- I forever forget how Kconfig works.
>
> Oh. Well, yes. If we do this then it becomes suddenly selectable and
> half of the series makes sense…
> But as you said, this complicates things.
But then you leave it up to the user, instead of doing something quite
random. This would allow you to configure PREEMPT_RCU=n despite also
using PREEMPT_DYNAMIC if that is your thing.
I fundamentally hate the whole randomness of the earlier proposed
selection criteria. It only disables PREEMPT_RCU if you use LAZY and not
also have PREEMPT_RT or PREEMPT_DYNAMIC.
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 7:58 ` Peter Zijlstra
@ 2024-10-10 14:19 ` Paul E. McKenney
0 siblings, 0 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-10 14:19 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Ankur Arora, linux-kernel, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Thu, Oct 10, 2024 at 09:58:50AM +0200, Peter Zijlstra wrote:
> On Wed, Oct 09, 2024 at 02:16:43PM -0700, Paul E. McKenney wrote:
>
> > Because we don't need a bunch of people surprised by this change in
> > behavior.
>
> I'm not sure what change in behaviour; Lazy is brand spanking new, it
> will have new behaviour not in line with previous modes. Just put in the
> help text that if you're looking for a reality closer to the old
> Voluntary to use PREEMPT_RCU=n.
>
> In reality very few people will care, because distros will be running
> PREEMPT_DYNAMIC and the 3 people doing their own thing ought to be savvy
> enough to deal with things, no?
No.
Sure, a number of them have proven themselves capable of debugging back
from an otherwise inexplicable series of OOMs not obviously caused by
RCU Kconfig settings, but why put them through that? And especially,
why put others through such a "learning experience"?
Why not instead tag the Kconfig help text noting that lazy preemption
does not automatically apply to RCU read-side critical sections, and
that if you want it to, there is a Kconfig option that is there for you?
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 10:44 ` Peter Zijlstra
@ 2024-10-10 14:29 ` Paul E. McKenney
2024-10-11 8:18 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-10 14:29 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Sebastian Andrzej Siewior, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 10, 2024 at 12:44:38PM +0200, Peter Zijlstra wrote:
> On Thu, Oct 10, 2024 at 12:26:57PM +0200, Sebastian Andrzej Siewior wrote:
> > On 2024-10-10 12:03:08 [+0200], Peter Zijlstra wrote:
> > > >
> > > > I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
> > > > PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
That is not good!
> > > Oh, the entry isn't user selectable? Fix that perhaps?
> > >
> > > - bool
> > > + bool "Use preemptible RCU"
> > >
> > > Or something along those lines -- I forever forget how Kconfig works.
> >
> > Oh. Well, yes. If we do this then it becomes suddenly selectable and
> > half of the series makes sense…
> > But as you said, this complicates things.
>
> But then you leave it up to the user, instead of doing something quite
> random. This would allow you to configure PREEMPT_RCU=n despite also
> using PREEMPT_DYNAMIC if that is your thing.
Ahem. How many users keep track of all of the Kconfig options? I doubt
that this number is greater than zero. Part of the goal here needs to
be to minimize the Kconfig futzing users must do. The default settings
really do matter.
> I fundamentally hate the whole randomness of the earlier proposed
> selection criteria. It only disables PREEMPT_RCU if you use LAZY and not
> also have PREEMPT_RT or PREEMPT_DYNAMIC.
Not at all random.
If you have PREEMPT_RT, you need preemptible RCU, so the defaults should
supply it.
If you have PREEMPT_DYNAMIC, presumably you would like to boot with
preemption enabled, and would like it to act as if you had built the
kernel to be unconditionally preemptible, so again you need preemptible
RCU, and so the defaults should supply it.
If you started off building a non-preemptible kernel, then you are not
using one of the major distros (last I checked). There is a good chance
that you have a large number of systems, and are thus deeply interested
in minimizing memory cost. In which case, you need non-preemptible
RCU in the new-age lazy-preemptible kernel.
Hence the choice of non-preemptible RCU as the default in a kernel that,
without lazy preemption, would use non-preemptible RCU.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 3/7] rcu: fix header guard for rcu_all_qs()
2024-10-10 8:11 ` Peter Zijlstra
@ 2024-10-10 14:29 ` Paul E. McKenney
0 siblings, 0 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-10 14:29 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Sebastian Andrzej Siewior, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 10, 2024 at 10:11:29AM +0200, Peter Zijlstra wrote:
> On Thu, Oct 10, 2024 at 08:41:23AM +0200, Sebastian Andrzej Siewior wrote:
> > On 2024-10-09 09:54:07 [-0700], Ankur Arora wrote:
> > > rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
> > > is conditioned on CONFIG_PREEMPTION.
> > >
> > > With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
> > > CONFIG_PREEMPT_RCU=y.
> >
> > From kernel/rcu/Kconfig:
> > | config PREEMPT_RCU
> > | bool
> > | default y if PREEMPTION
> >
> > this looks like PREEMPT_RCU implies PREEMPTION.
>
> The point was to make PREEMPT_RCU=n possible even when PREEMPTION=y,
> which is fine.
Agreed!
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-09 19:05 ` Ankur Arora
@ 2024-10-10 14:37 ` Paul E. McKenney
2024-10-10 17:59 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-10 14:37 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Wed, Oct 09, 2024 at 12:05:47PM -0700, Ankur Arora wrote:
>
>
> Ankur Arora <ankur.a.arora@oracle.com> writes:
>
> > With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> > states for read-side critical sections via rcu_all_qs().
> > One reason why this was needed, was lacking preempt-count, the tick
> > handler has no way of knowing whether it is executing in a read-side
> > critical section or not.
> >
> > With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
> > PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
> > quiescent states via rcu_all_qs().
> >
> > So, use the availability of preempt_count() to report quiescent states
> > in rcu_flavor_sched_clock_irq().
>
> A note about the inverse of this case, where we might have long running
> loops which only temporarily enable preemption and thus would be
> unlikely to align themselves with the tick: in prior discussions [1]
> Paul had pointed the need for providing for forcing a context switch
> in such a scenario.
>
> I had a patch which did that, but I think it is unnecessary since this
> clause in rcu_sched_clock_irq() should already handle it.
>
> void rcu_sched_clock_irq(int user) {
> ...
> /* The load-acquire pairs with the store-release setting to true. */
> if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
> /* Idle and userspace execution already are quiescent states. */
> if (!rcu_is_cpu_rrupt_from_idle() && !user) {
> set_tsk_need_resched(current);
> set_preempt_need_resched();
> }
> __this_cpu_write(rcu_data.rcu_urgent_qs, false);
> }
>
> Paul?
As long as the tick is actually enabled.
But looking deeper, there is code in force_qs_rnp() and
rcu_watching_snap_recheck() to force the tick on CPUs that don't
response to the grace period soon enough via the -1 return from the
rcu_watching_snap_recheck() function and via resched_cpu().
So we might be covered. Some serious testing is warranted.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-10 6:53 ` Sebastian Andrzej Siewior
@ 2024-10-10 14:39 ` Paul E. McKenney
2024-10-10 17:50 ` Ankur Arora
1 sibling, 0 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-10 14:39 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On Thu, Oct 10, 2024 at 08:53:38AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-09 09:54:10 [-0700], Ankur Arora wrote:
> > To reduce RCU noise for nohz_full configurations, osnoise depends
> > on cond_resched() providing quiescent states for PREEMPT_RCU=n
> > configurations. And, for PREEMPT_RCU=y configurations does this
> > by directly calling rcu_momentary_eqs().
> >
> > With PREEMPT_LAZY=y, however, we can have configurations with
> > (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
> > can help.
>
> PREEMPTION=y, PREEMPT_RCU=n should not be possible.
PREEMPT_LAZY=y, PREEMPT_RCU=n really does need to be possible in order to
avoid OOM on systems not abundantly endowed with memory. This gets the
improvement in tail latencies from PREEMPT_LAZY but the OOM resistance
of PREEMPT_RCU=n. Such systems really do not want to be preempting
RCU readers.
Thanx, Paul
> > Handle that by fallback to the explicit quiescent states via
> > rcu_momentary_eqs().
>
> Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 9:13 ` Sebastian Andrzej Siewior
2024-10-10 10:03 ` Peter Zijlstra
@ 2024-10-10 17:35 ` Ankur Arora
2024-10-11 7:58 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 17:35 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Paul E. McKenney, Ankur Arora, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-10 10:10:32 [+0200], Peter Zijlstra wrote:
>> On Thu, Oct 10, 2024 at 08:32:07AM +0200, Sebastian Andrzej Siewior wrote:
>> > On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
>> > > In order to support systems that currently run CONFIG_PREEMPT=n that
>> > …
>> > > Or am I once again missing your point?
>> >
>> > The change is:
>> > | config PREEMPT_RCU
>> > | bool
>> > |- default y if PREEMPTION
>> > |+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
>> >
>> > Now:
>> > - CONFIG_PREEMPT select PREEMPT_BUILD
>> > - PREEMPT_RT select CONFIG_PREEMPTION
>> > - PREEMPT_DYNAMIC selects PREEMPT_BUILD
>> >
>> > and PREEMPT_BUILD select CONFIG_PREEMPTION
>> >
>> > so in the end, this change is a nop, right?
>>
>> PREEMPT_RT selects PREEMPTION *and* has one of PREEMPT / PREEMPT_LAZY /
>> PREEMPT_DYNAMIC, all of which in turn select PREEMPT_BUILD, which
>> selects PREEMPTION.
>>
>> (arguably we can remove the select PREEMPTION from PREEMPT_RT)
>>
>> The proposed change is not a nop because the config: PREEMPT_LAZY=y
>> PREEMPT_DYNAMIC=n will result in false, while it will have PREEMPTION.
>
> I have a config with PREEMPT_LAZY=y PREEMPT_DYNAMIC=n and
> CONFIG_PREEMPT_RCU=y.
>
> I can't deselect CONFIG_PREEMPT_RCU=y. This is because LAZY selects
> PREEMPT_BUILD and PREEMPT_RCU selects itself once PREEMPTION is on.
That's odd. I have that exact configuration (PREEMPT_DYANMIC=n,
PREEMPT_LAZY=y, PREEMPT_RCU=n).
Can you share your .config?
Ankur
>> That said, I really do not agree with the change, it makes the condition
>> complicated for no reason.
>
> Sebastian
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 8:10 ` Peter Zijlstra
2024-10-10 9:13 ` Sebastian Andrzej Siewior
@ 2024-10-10 17:42 ` Ankur Arora
1 sibling, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 17:42 UTC (permalink / raw)
To: Peter Zijlstra
Cc: Sebastian Andrzej Siewior, Paul E. McKenney, Ankur Arora,
linux-kernel, tglx, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, frederic,
efault
Peter Zijlstra <peterz@infradead.org> writes:
> On Thu, Oct 10, 2024 at 08:32:07AM +0200, Sebastian Andrzej Siewior wrote:
>> On 2024-10-09 11:24:09 [-0700], Paul E. McKenney wrote:
>> > In order to support systems that currently run CONFIG_PREEMPT=n that
>> …
>> > Or am I once again missing your point?
>>
>> The change is:
>> | config PREEMPT_RCU
>> | bool
>> |- default y if PREEMPTION
>> |+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
>>
>> Now:
>> - CONFIG_PREEMPT select PREEMPT_BUILD
>> - PREEMPT_RT select CONFIG_PREEMPTION
>> - PREEMPT_DYNAMIC selects PREEMPT_BUILD
>>
>> and PREEMPT_BUILD select CONFIG_PREEMPTION
>>
>> so in the end, this change is a nop, right?
>
> PREEMPT_RT selects PREEMPTION *and* has one of PREEMPT / PREEMPT_LAZY /
> PREEMPT_DYNAMIC, all of which in turn select PREEMPT_BUILD, which
> selects PREEMPTION.
>
> (arguably we can remove the select PREEMPTION from PREEMPT_RT)
>
> The proposed change is not a nop because the config: PREEMPT_LAZY=y
> PREEMPT_DYNAMIC=n will result in false, while it will have PREEMPTION.
>
> That said, I really do not agree with the change, it makes the condition
> complicated for no reason.
Isn't the behaviour identical to how it behaves when you choose
PREEMPT_NONE/VOLUNTARY and enable/disable PREEMPT_DYNAMIC?
This option is just choosing the RCU model suited for the kinds of
preemption supported.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-10 6:53 ` Sebastian Andrzej Siewior
2024-10-10 14:39 ` Paul E. McKenney
@ 2024-10-10 17:50 ` Ankur Arora
2024-10-11 7:36 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 17:50 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-09 09:54:10 [-0700], Ankur Arora wrote:
>> To reduce RCU noise for nohz_full configurations, osnoise depends
>> on cond_resched() providing quiescent states for PREEMPT_RCU=n
>> configurations. And, for PREEMPT_RCU=y configurations does this
>> by directly calling rcu_momentary_eqs().
>>
>> With PREEMPT_LAZY=y, however, we can have configurations with
>> (PREEMPTION=y, PREEMPT_RCU=n), which means neither of the above
>> can help.
>
> PREEMPTION=y, PREEMPT_RCU=n should not be possible.
That's a statement. Is there an argument here?
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-10 6:50 ` Sebastian Andrzej Siewior
@ 2024-10-10 17:56 ` Ankur Arora
2024-10-11 7:52 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 17:56 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-09 09:54:08 [-0700], Ankur Arora wrote:
>> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
>> states for read-side critical sections via rcu_all_qs().
>> One reason why this was needed, was lacking preempt-count, the tick
>> handler has no way of knowing whether it is executing in a read-side
>> critical section or not.
>>
>> With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
>> PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
>> quiescent states via rcu_all_qs().
>
> PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
> turn selects PREEMPT_RCU. So this is not a valid combination. Do you
> have a different tree than I do? Because maybe I am missing something.
The second patch in the series?
>> So, use the availability of preempt_count() to report quiescent states
>> in rcu_flavor_sched_clock_irq().
>>
>> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> kernel/rcu/tree_plugin.h | 11 +++++++----
>> 1 file changed, 7 insertions(+), 4 deletions(-)
>>
>> diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
>> index 1c7cbd145d5e..da324d66034b 100644
>> --- a/kernel/rcu/tree_plugin.h
>> +++ b/kernel/rcu/tree_plugin.h
>> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
>> */
>> static void rcu_flavor_sched_clock_irq(int user)
>> {
>> - if (user || rcu_is_cpu_rrupt_from_idle()) {
>> + if (user || rcu_is_cpu_rrupt_from_idle() ||
>> + (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
>> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
>
> couldn't you use a helper preemptible()?
Alas no. This check isn't trying to establish preemptibility (this is
called in irq context so we already know that we aren't preemptible.)
The check is using the preempt count to see if it can infer the state
of RCU read side critical section on this CPU.
>> /*
>> * Get here if this CPU took its interrupt from user
>> - * mode or from the idle loop, and if this is not a
>> - * nested interrupt. In this case, the CPU is in
>> - * a quiescent state, so note it.
>> + * mode, from the idle loop without this being a nested
>> + * interrupt, or while not holding a preempt count (but
>> + * with PREEMPT_COUNT=y. In this case, the CPU is in a
>> + * quiescent state, so note it.
>> *
>> * No memory barrier is required here because rcu_qs()
>> * references only CPU-local variables that other CPUs
>
> Sebastian
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-10 14:37 ` Paul E. McKenney
@ 2024-10-10 17:59 ` Ankur Arora
0 siblings, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 17:59 UTC (permalink / raw)
To: paulmck
Cc: Ankur Arora, linux-kernel, peterz, tglx, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Paul E. McKenney <paulmck@kernel.org> writes:
> On Wed, Oct 09, 2024 at 12:05:47PM -0700, Ankur Arora wrote:
>>
>>
>> Ankur Arora <ankur.a.arora@oracle.com> writes:
>>
>> > With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
>> > states for read-side critical sections via rcu_all_qs().
>> > One reason why this was needed, was lacking preempt-count, the tick
>> > handler has no way of knowing whether it is executing in a read-side
>> > critical section or not.
>> >
>> > With PREEMPT_LAZY=y, there can be configurations with PREEMPT_COUNT=y,
>> > PREEMPT_RCU=n, where cond_resched() is a stub that does not provide
>> > quiescent states via rcu_all_qs().
>> >
>> > So, use the availability of preempt_count() to report quiescent states
>> > in rcu_flavor_sched_clock_irq().
>>
>> A note about the inverse of this case, where we might have long running
>> loops which only temporarily enable preemption and thus would be
>> unlikely to align themselves with the tick: in prior discussions [1]
>> Paul had pointed the need for providing for forcing a context switch
>> in such a scenario.
>>
>> I had a patch which did that, but I think it is unnecessary since this
>> clause in rcu_sched_clock_irq() should already handle it.
>>
>> void rcu_sched_clock_irq(int user) {
>> ...
>> /* The load-acquire pairs with the store-release setting to true. */
>> if (smp_load_acquire(this_cpu_ptr(&rcu_data.rcu_urgent_qs))) {
>> /* Idle and userspace execution already are quiescent states. */
>> if (!rcu_is_cpu_rrupt_from_idle() && !user) {
>> set_tsk_need_resched(current);
>> set_preempt_need_resched();
>> }
>> __this_cpu_write(rcu_data.rcu_urgent_qs, false);
>> }
>>
>> Paul?
>
> As long as the tick is actually enabled.
>
> But looking deeper, there is code in force_qs_rnp() and
> rcu_watching_snap_recheck() to force the tick on CPUs that don't
> response to the grace period soon enough via the -1 return from the
> rcu_watching_snap_recheck() function and via resched_cpu().
Ah. I had missed that path. Thanks.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 7/7] powerpc: add support for PREEMPT_LAZY
2024-10-10 7:22 ` Sebastian Andrzej Siewior
@ 2024-10-10 18:10 ` Ankur Arora
2024-10-11 18:35 ` Shrikanth Hegde
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 18:10 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault, sshegde
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-09 09:54:11 [-0700], Ankur Arora wrote:
>> From: Shrikanth Hegde <sshegde@linux.ibm.com>
>>
>> Add PowerPC arch support for PREEMPT_LAZY by defining LAZY bits.
>>
>> Since PowerPC doesn't use generic exit to functions, check for
>> NEED_RESCHED_LAZY when exiting to user or to the kernel from
>> interrupt routines.
>>
>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>> [ Changed TIF_NEED_RESCHED_LAZY to now be defined unconditionally. ]
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> arch/powerpc/Kconfig | 1 +
>> arch/powerpc/include/asm/thread_info.h | 5 ++++-
>> arch/powerpc/kernel/interrupt.c | 5 +++--
>> 3 files changed, 8 insertions(+), 3 deletions(-)
>>
>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>> index 8094a01974cc..593a1d60d443 100644
>> --- a/arch/powerpc/Kconfig
>> +++ b/arch/powerpc/Kconfig
>> @@ -270,6 +270,7 @@ config PPC
>> select HAVE_PERF_REGS
>> select HAVE_PERF_USER_STACK_DUMP
>> select HAVE_RETHOOK if KPROBES
>> + select ARCH_HAS_PREEMPT_LAZY
>> select HAVE_REGS_AND_STACK_ACCESS_API
>> select HAVE_RELIABLE_STACKTRACE
>> select HAVE_RSEQ
>
> I would move this up to the ARCH_HAS_ block.
Makes sense.
>> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
>> index 6ebca2996f18..ae7793dae763 100644
>> --- a/arch/powerpc/include/asm/thread_info.h
>> +++ b/arch/powerpc/include/asm/thread_info.h
>> @@ -117,11 +117,14 @@ void arch_setup_new_exec(void);
>> #endif
>> #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */
>> #define TIF_32BIT 20 /* 32 bit binary */
>> +#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
>
> I don't see any of the bits being used in assembly anymore.
> If you group the _TIF_USER_WORK_MASK bits it a single 16 bit block then
> the compiler could issue a single andi.
I don't know power asm at all, but this seems like a good idea.
Shrikanth?
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-10 6:37 ` Sebastian Andrzej Siewior
@ 2024-10-10 18:19 ` Ankur Arora
0 siblings, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-10 18:19 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-09 09:54:05 [-0700], Ankur Arora wrote:
>> resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
>> without rescheduling for more than the latency_warn_ms period.
>
> The description is odd. I think you want to say that
> resched_latency_warn() does not warn if a task has TIF_NEED_RESCHED_LAZY
> has set for longer periods and you want to add this functionality.
>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Peter Ziljstra <peterz@infradead.org>
>> Cc: Juri Lelli <juri.lelli@redhat.com>
>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> …
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 694bfcf153cb..1229766b704e 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
>> if (sysctl_resched_latency_warn_once && warned_once)
>> return 0;
>>
>> - if (!need_resched() || !latency_warn_ms)
>> + if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
>
> tif_need_resched_lazy() is not doing what you think it is doing.
Thanks. My bad.
> tif_test_bit(TIF_NEED_RESCHED_LAZY).
> Either PeterZ makes a helper for this or you need
> tif_test_bit(TIF_NEED_RESCHED_LAZY).
Yeah. tif_test_bit(TIF_NEED_RESCHED_LAZY) and need_resched() will
be identical with !ARCH_HAS_PREEMPT_LAZY but that should be okay in
this context.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-10 17:50 ` Ankur Arora
@ 2024-10-11 7:36 ` Sebastian Andrzej Siewior
2024-10-14 20:14 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-11 7:36 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-10 10:50:29 [-0700], Ankur Arora wrote:
> > PREEMPTION=y, PREEMPT_RCU=n should not be possible.
>
> That's a statement. Is there an argument here?
For my taste you should describe in your cover letter the actual problem
and what you intend to do about it. Then you should a series addressing
this issue which would probably qualify for all patches in your series
but 7/7 (the PPC bits for lazy preempt). 7/7 should have been the part
where you make possible to make PREEMPT_RCU selectable.
Based on my understanding so far you (or Paul) want to make PREEMPT_RCU
selectable if PREEMPT_LAZY is enabled _or_ if DYNAMIC_PREEMPT is enabled
with the NONE or VOLUNTARY model.
This series as-is made no sense to me until Peter made the snippet where
you could indeed make PREEMPT_RCU selectable.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-10-10 17:56 ` Ankur Arora
@ 2024-10-11 7:52 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-11 7:52 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 2024-10-10 10:56:36 [-0700], Ankur Arora wrote:
> >
> > PREEMPT_LAZY selects PREEMPT_BUILD which selects PREEMPTION which in
> > turn selects PREEMPT_RCU. So this is not a valid combination. Do you
> > have a different tree than I do? Because maybe I am missing something.
>
> The second patch in the series?
As long as "PREEMPT_RCU" as no text behind its bool you can't select it
independently.
> >> --- a/kernel/rcu/tree_plugin.h
> >> +++ b/kernel/rcu/tree_plugin.h
> >> @@ -974,13 +974,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
> >> */
> >> static void rcu_flavor_sched_clock_irq(int user)
> >> {
> >> - if (user || rcu_is_cpu_rrupt_from_idle()) {
> >> + if (user || rcu_is_cpu_rrupt_from_idle() ||
> >> + (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
> >> + !(preempt_count() & (PREEMPT_MASK | SOFTIRQ_MASK)))) {
> >
> > couldn't you use a helper preemptible()?
>
> Alas no. This check isn't trying to establish preemptibility (this is
> called in irq context so we already know that we aren't preemptible.)
> The check is using the preempt count to see if it can infer the state
> of RCU read side critical section on this CPU.
I see.
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 17:35 ` Ankur Arora
@ 2024-10-11 7:58 ` Sebastian Andrzej Siewior
2024-10-15 23:01 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-11 7:58 UTC (permalink / raw)
To: Ankur Arora
Cc: Peter Zijlstra, Paul E. McKenney, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-10 10:35:25 [-0700], Ankur Arora wrote:
> That's odd. I have that exact configuration (PREEMPT_DYANMIC=n,
> PREEMPT_LAZY=y, PREEMPT_RCU=n).
>
> Can you share your .config?
Sent offlist.
> Ankur
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-10 14:29 ` Paul E. McKenney
@ 2024-10-11 8:18 ` Sebastian Andrzej Siewior
2024-10-11 13:59 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-11 8:18 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-10 07:29:07 [-0700], Paul E. McKenney wrote:
> If you have PREEMPT_RT, you need preemptible RCU, so the defaults should
> supply it.
>
> If you have PREEMPT_DYNAMIC, presumably you would like to boot with
> preemption enabled, and would like it to act as if you had built the
> kernel to be unconditionally preemptible, so again you need preemptible
> RCU, and so the defaults should supply it.
>
> If you started off building a non-preemptible kernel, then you are not
> using one of the major distros (last I checked). There is a good chance
> that you have a large number of systems, and are thus deeply interested
> in minimizing memory cost. In which case, you need non-preemptible
> RCU in the new-age lazy-preemptible kernel.
>
> Hence the choice of non-preemptible RCU as the default in a kernel that,
> without lazy preemption, would use non-preemptible RCU.
I *think* I slowly begin to understand. So we have LAZY and DYNAMIC
enabled then and PREEMPT_RCU depends on the default choice which is
either FULL (yes, please) or NONE/VOLUNTARY (no, thank you). But then if
you change the model at runtime (or boottime) from (default) NONE to
FULL you don't have preemptible RCU anymore.
If you would like to add some relief to memory constrained systems,
wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
allow to override it?
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-11 8:18 ` Sebastian Andrzej Siewior
@ 2024-10-11 13:59 ` Paul E. McKenney
2024-10-11 14:43 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-11 13:59 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Fri, Oct 11, 2024 at 10:18:47AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-10 07:29:07 [-0700], Paul E. McKenney wrote:
> > If you have PREEMPT_RT, you need preemptible RCU, so the defaults should
> > supply it.
> >
> > If you have PREEMPT_DYNAMIC, presumably you would like to boot with
> > preemption enabled, and would like it to act as if you had built the
> > kernel to be unconditionally preemptible, so again you need preemptible
> > RCU, and so the defaults should supply it.
> >
> > If you started off building a non-preemptible kernel, then you are not
> > using one of the major distros (last I checked). There is a good chance
> > that you have a large number of systems, and are thus deeply interested
> > in minimizing memory cost. In which case, you need non-preemptible
> > RCU in the new-age lazy-preemptible kernel.
> >
> > Hence the choice of non-preemptible RCU as the default in a kernel that,
> > without lazy preemption, would use non-preemptible RCU.
>
> I *think* I slowly begin to understand. So we have LAZY and DYNAMIC
> enabled then and PREEMPT_RCU depends on the default choice which is
> either FULL (yes, please) or NONE/VOLUNTARY (no, thank you). But then if
> you change the model at runtime (or boottime) from (default) NONE to
> FULL you don't have preemptible RCU anymore.
Almost. If you are able to change the model at boot time or at run time,
then you *always* have preemptible RCU.
> If you would like to add some relief to memory constrained systems,
> wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
> allow to override it?
Does BASE_SMALL affect anything but log buffer sizes? Either way, we
would still need to avoid the larger memory footprint of preemptible
RCU that shows up due to RCU readers being preempted.
Besides, we are not looking to give up performance vs BASE_SMALL's
"may reduce performance" help text.
Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
but that is simply not in the cards at the moment.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-11 13:59 ` Paul E. McKenney
@ 2024-10-11 14:43 ` Sebastian Andrzej Siewior
2024-10-11 15:59 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-11 14:43 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-11 06:59:44 [-0700], Paul E. McKenney wrote:
> > > If you have PREEMPT_DYNAMIC, presumably you would like to boot with
> > > preemption enabled, and would like it to act as if you had built the
> > > kernel to be unconditionally preemptible, so again you need preemptible
> > > RCU, and so the defaults should supply it.
> > >
> > > If you started off building a non-preemptible kernel, then you are not
> > > using one of the major distros (last I checked). There is a good chance
> > > that you have a large number of systems, and are thus deeply interested
> > > in minimizing memory cost. In which case, you need non-preemptible
> > > RCU in the new-age lazy-preemptible kernel.
> > >
> > > Hence the choice of non-preemptible RCU as the default in a kernel that,
> > > without lazy preemption, would use non-preemptible RCU.
> >
> > I *think* I slowly begin to understand. So we have LAZY and DYNAMIC
> > enabled then and PREEMPT_RCU depends on the default choice which is
> > either FULL (yes, please) or NONE/VOLUNTARY (no, thank you). But then if
> > you change the model at runtime (or boottime) from (default) NONE to
> > FULL you don't have preemptible RCU anymore.
>
> Almost. If you are able to change the model at boot time or at run time,
> then you *always* have preemptible RCU.
Okay, this eliminates PREEMPT_DYNAMIC then.
With PeterZ current series, PREEMPT_LAZY (without everything else
enabled) behaves as PREEMPT without the "forced" wake up for the fair
class. It is preemptible after all, with preempt_disable() actually
doing something. This might speak for preemptible RCU.
And assuming in this condition you that "low memory overhead RCU" which
is not PREEMPT_RCU. This might require a config option.
> > If you would like to add some relief to memory constrained systems,
> > wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
> > allow to override it?
>
> Does BASE_SMALL affect anything but log buffer sizes? Either way, we
> would still need to avoid the larger memory footprint of preemptible
> RCU that shows up due to RCU readers being preempted.
It only reduces data structures where possible. So lower performance is
probably due to things like futex hashmap (and others) are smaller.
> Besides, we are not looking to give up performance vs BASE_SMALL's
> "may reduce performance" help text.
>
> Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
> but that is simply not in the cards at the moment.
Not sure what the time frame is here. If we go for LAZY and remove NONE
and VOLUNTARY then making PREEMPT_RCU would make sense to lower the
memory footprint (and not attaching to BASE_SMALL).
Is this what you intend or did misunderstand something here?
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-11 14:43 ` Sebastian Andrzej Siewior
@ 2024-10-11 15:59 ` Paul E. McKenney
2024-10-15 11:22 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-11 15:59 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Fri, Oct 11, 2024 at 04:43:41PM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-11 06:59:44 [-0700], Paul E. McKenney wrote:
> > > > If you have PREEMPT_DYNAMIC, presumably you would like to boot with
> > > > preemption enabled, and would like it to act as if you had built the
> > > > kernel to be unconditionally preemptible, so again you need preemptible
> > > > RCU, and so the defaults should supply it.
> > > >
> > > > If you started off building a non-preemptible kernel, then you are not
> > > > using one of the major distros (last I checked). There is a good chance
> > > > that you have a large number of systems, and are thus deeply interested
> > > > in minimizing memory cost. In which case, you need non-preemptible
> > > > RCU in the new-age lazy-preemptible kernel.
> > > >
> > > > Hence the choice of non-preemptible RCU as the default in a kernel that,
> > > > without lazy preemption, would use non-preemptible RCU.
> > >
> > > I *think* I slowly begin to understand. So we have LAZY and DYNAMIC
> > > enabled then and PREEMPT_RCU depends on the default choice which is
> > > either FULL (yes, please) or NONE/VOLUNTARY (no, thank you). But then if
> > > you change the model at runtime (or boottime) from (default) NONE to
> > > FULL you don't have preemptible RCU anymore.
> >
> > Almost. If you are able to change the model at boot time or at run time,
> > then you *always* have preemptible RCU.
>
> Okay, this eliminates PREEMPT_DYNAMIC then.
> With PeterZ current series, PREEMPT_LAZY (without everything else
> enabled) behaves as PREEMPT without the "forced" wake up for the fair
> class. It is preemptible after all, with preempt_disable() actually
> doing something. This might speak for preemptible RCU.
> And assuming in this condition you that "low memory overhead RCU" which
> is not PREEMPT_RCU. This might require a config option.
The PREEMPT_DYNAMIC case seems to work well as-is for the intended users,
so I don't see a need to change it. In particular, we already learned
that we need to set PREEMPT_DYNAMIC=n. Yes, had I caught this in time, I
would have argued against changing the default, but this was successfully
slid past me.
As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
than just keeping non-preemptible RCU when the Kconfig options are
consistent with this being expected. If this is the case, what are the
benefits of this more-intrusive change?
> > > If you would like to add some relief to memory constrained systems,
> > > wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
> > > allow to override it?
> >
> > Does BASE_SMALL affect anything but log buffer sizes? Either way, we
> > would still need to avoid the larger memory footprint of preemptible
> > RCU that shows up due to RCU readers being preempted.
>
> It only reduces data structures where possible. So lower performance is
> probably due to things like futex hashmap (and others) are smaller.
Which is still counterproductive for use cases other than small deep
embedded systems.
> > Besides, we are not looking to give up performance vs BASE_SMALL's
> > "may reduce performance" help text.
> >
> > Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
> > but that is simply not in the cards at the moment.
>
> Not sure what the time frame is here. If we go for LAZY and remove NONE
> and VOLUNTARY then making PREEMPT_RCU would make sense to lower the
> memory footprint (and not attaching to BASE_SMALL).
>
> Is this what you intend or did misunderstand something here?
My requirement is that LAZY not remove/disable/whatever non-preemptible
RCU. Those currently using non-preemptible RCU should continue to be able
to be able to use it, with or without LAZY. So why is this requirement
a problem for you? Or am I missing your point?
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 7/7] powerpc: add support for PREEMPT_LAZY
2024-10-10 18:10 ` Ankur Arora
@ 2024-10-11 18:35 ` Shrikanth Hegde
2024-10-12 22:42 ` Michael Ellerman
0 siblings, 1 reply; 75+ messages in thread
From: Shrikanth Hegde @ 2024-10-11 18:35 UTC (permalink / raw)
To: Ankur Arora, Sebastian Andrzej Siewior
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault, Michael Ellerman
On 10/10/24 23:40, Ankur Arora wrote:
>
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>
>> On 2024-10-09 09:54:11 [-0700], Ankur Arora wrote:
>>> From: Shrikanth Hegde <sshegde@linux.ibm.com>
>>>
>>> Add PowerPC arch support for PREEMPT_LAZY by defining LAZY bits.
>>>
>>> Since PowerPC doesn't use generic exit to functions, check for
>>> NEED_RESCHED_LAZY when exiting to user or to the kernel from
>>> interrupt routines.
>>>
>>> Signed-off-by: Shrikanth Hegde <sshegde@linux.ibm.com>
>>> [ Changed TIF_NEED_RESCHED_LAZY to now be defined unconditionally. ]
>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>> ---
>>> arch/powerpc/Kconfig | 1 +
>>> arch/powerpc/include/asm/thread_info.h | 5 ++++-
>>> arch/powerpc/kernel/interrupt.c | 5 +++--
>>> 3 files changed, 8 insertions(+), 3 deletions(-)
>>>
>>> diff --git a/arch/powerpc/Kconfig b/arch/powerpc/Kconfig
>>> index 8094a01974cc..593a1d60d443 100644
>>> --- a/arch/powerpc/Kconfig
>>> +++ b/arch/powerpc/Kconfig
>>> @@ -270,6 +270,7 @@ config PPC
>>> select HAVE_PERF_REGS
>>> select HAVE_PERF_USER_STACK_DUMP
>>> select HAVE_RETHOOK if KPROBES
>>> + select ARCH_HAS_PREEMPT_LAZY
>>> select HAVE_REGS_AND_STACK_ACCESS_API
>>> select HAVE_RELIABLE_STACKTRACE
>>> select HAVE_RSEQ
>>
>> I would move this up to the ARCH_HAS_ block.
>
> Makes sense.
>
>>> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
>>> index 6ebca2996f18..ae7793dae763 100644
>>> --- a/arch/powerpc/include/asm/thread_info.h
>>> +++ b/arch/powerpc/include/asm/thread_info.h
>>> @@ -117,11 +117,14 @@ void arch_setup_new_exec(void);
>>> #endif
>>> #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */
>>> #define TIF_32BIT 20 /* 32 bit binary */
>>> +#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
>>
>> I don't see any of the bits being used in assembly anymore.
>> If you group the _TIF_USER_WORK_MASK bits it a single 16 bit block then
>> the compiler could issue a single andi.
>
That's a good find. since by default powerpc uses 4 byte fixed ISA,
compiler would generate extra code for _TIF_USER_WORK_MASK. Looked at
the objdump. It indeed does.
I see that value 9 isn't being used. It was last used for TIF_NOHZ which
is removed now. That value could be used for RESCHED_LAZY. Using that
value i see the code generated is similar to what we have now.
+mpe
Ankur, Could you please change the value to 9?
---------------------------------------------------------------------
I see with value 9, andi is used again.
254: 80 00 dc eb ld r30,128(r28)
258: 4e 62 c9 73 andi. r9,r30,25166
--- a/arch/powerpc/include/asm/thread_info.h
+++ b/arch/powerpc/include/asm/thread_info.h
@@ -103,6 +103,7 @@ void arch_setup_new_exec(void);
#define TIF_PATCH_PENDING 6 /* pending live patching update */
#define TIF_SYSCALL_AUDIT 7 /* syscall auditing active */
#define TIF_SINGLESTEP 8 /* singlestepping active */
+#define TIF_NEED_RESCHED_LAZY 9 /* Lazy rescheduling */
#define TIF_SECCOMP 10 /* secure computing */
#define TIF_RESTOREALL 11 /* Restore all regs (implies
NOERROR) */
#define TIF_NOERROR 12 /* Force successful syscall
return */
@@ -117,7 +118,6 @@ void arch_setup_new_exec(void);
#endif
#define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is
polling TIF_NEED_RESCHED */
#define TIF_32BIT 20 /* 32 bit binary */
-#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 7/7] powerpc: add support for PREEMPT_LAZY
2024-10-11 18:35 ` Shrikanth Hegde
@ 2024-10-12 22:42 ` Michael Ellerman
0 siblings, 0 replies; 75+ messages in thread
From: Michael Ellerman @ 2024-10-12 22:42 UTC (permalink / raw)
To: Shrikanth Hegde, Ankur Arora, Sebastian Andrzej Siewior
Cc: linux-kernel, peterz, tglx, paulmck, mingo, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
Shrikanth Hegde <sshegde@linux.ibm.com> writes:
> On 10/10/24 23:40, Ankur Arora wrote:
>>
>> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>>
>>> On 2024-10-09 09:54:11 [-0700], Ankur Arora wrote:
>>>> From: Shrikanth Hegde <sshegde@linux.ibm.com>
>>>>
>>>> Add PowerPC arch support for PREEMPT_LAZY by defining LAZY bits.
>>
...
>>>> diff --git a/arch/powerpc/include/asm/thread_info.h b/arch/powerpc/include/asm/thread_info.h
>>>> index 6ebca2996f18..ae7793dae763 100644
>>>> --- a/arch/powerpc/include/asm/thread_info.h
>>>> +++ b/arch/powerpc/include/asm/thread_info.h
>>>> @@ -117,11 +117,14 @@ void arch_setup_new_exec(void);
>>>> #endif
>>>> #define TIF_POLLING_NRFLAG 19 /* true if poll_idle() is polling TIF_NEED_RESCHED */
>>>> #define TIF_32BIT 20 /* 32 bit binary */
>>>> +#define TIF_NEED_RESCHED_LAZY 21 /* Lazy rescheduling */
>>>
>>> I don't see any of the bits being used in assembly anymore.
>>> If you group the _TIF_USER_WORK_MASK bits it a single 16 bit block then
>>> the compiler could issue a single andi.
>
> That's a good find. since by default powerpc uses 4 byte fixed ISA,
> compiler would generate extra code for _TIF_USER_WORK_MASK. Looked at
> the objdump. It indeed does.
>
> I see that value 9 isn't being used. It was last used for TIF_NOHZ which
> is removed now. That value could be used for RESCHED_LAZY. Using that
> value i see the code generated is similar to what we have now.
>
> +mpe
Yep, 9 looks good.
I don't think it *really* matters that it's a single andi. on modern
CPUs, but seeing as bit 9 is free we may as well use it.
cheers
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
2024-10-10 6:37 ` Sebastian Andrzej Siewior
@ 2024-10-13 9:44 ` kernel test robot
2024-10-13 9:54 ` kernel test robot
2024-10-16 9:36 ` Shrikanth Hegde
3 siblings, 0 replies; 75+ messages in thread
From: kernel test robot @ 2024-10-13 9:44 UTC (permalink / raw)
To: Ankur Arora, linux-kernel
Cc: oe-kbuild-all, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault
Hi Ankur,
kernel test robot noticed the following build errors:
[auto build test ERROR on paulmck-rcu/dev]
[also build test ERROR on powerpc/next powerpc/fixes tip/sched/core linus/master v6.12-rc2 next-20241011]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ankur-Arora/sched-warn-for-high-latency-with-TIF_NEED_RESCHED_LAZY/20241010-005819
base: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
patch link: https://lore.kernel.org/r/20241009165411.3426937-2-ankur.a.arora%40oracle.com
patch subject: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
config: x86_64-rhel-8.3 (https://download.01.org/0day-ci/archive/20241013/202410131715.HYC3WK5i-lkp@intel.com/config)
compiler: gcc-12 (Debian 12.2.0-14) 12.2.0
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241013/202410131715.HYC3WK5i-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410131715.HYC3WK5i-lkp@intel.com/
All errors (new ones prefixed by >>):
kernel/sched/core.c: In function 'cpu_resched_latency':
>> kernel/sched/core.c:5528:34: error: implicit declaration of function 'tif_need_resched_lazy'; did you mean 'tif_need_resched'? [-Werror=implicit-function-declaration]
5528 | if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
| ^~~~~~~~~~~~~~~~~~~~~
| tif_need_resched
cc1: some warnings being treated as errors
vim +5528 kernel/sched/core.c
5517
5518 #ifdef CONFIG_SCHED_DEBUG
5519 static u64 cpu_resched_latency(struct rq *rq)
5520 {
5521 int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
5522 u64 resched_latency, now = rq_clock(rq);
5523 static bool warned_once;
5524
5525 if (sysctl_resched_latency_warn_once && warned_once)
5526 return 0;
5527
> 5528 if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
5529 return 0;
5530
5531 if (system_state == SYSTEM_BOOTING)
5532 return 0;
5533
5534 if (!rq->last_seen_need_resched_ns) {
5535 rq->last_seen_need_resched_ns = now;
5536 rq->ticks_without_resched = 0;
5537 return 0;
5538 }
5539
5540 rq->ticks_without_resched++;
5541 resched_latency = now - rq->last_seen_need_resched_ns;
5542 if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
5543 return 0;
5544
5545 warned_once = true;
5546
5547 return resched_latency;
5548 }
5549
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
2024-10-10 6:37 ` Sebastian Andrzej Siewior
2024-10-13 9:44 ` kernel test robot
@ 2024-10-13 9:54 ` kernel test robot
2024-10-16 9:36 ` Shrikanth Hegde
3 siblings, 0 replies; 75+ messages in thread
From: kernel test robot @ 2024-10-13 9:54 UTC (permalink / raw)
To: Ankur Arora, linux-kernel
Cc: llvm, oe-kbuild-all, peterz, tglx, paulmck, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, ankur.a.arora, efault
Hi Ankur,
kernel test robot noticed the following build errors:
[auto build test ERROR on paulmck-rcu/dev]
[also build test ERROR on powerpc/next powerpc/fixes tip/sched/core linus/master v6.12-rc2 next-20241011]
[If your patch is applied to the wrong git tree, kindly drop us a note.
And when submitting patch, we suggest to use '--base' as documented in
https://git-scm.com/docs/git-format-patch#_base_tree_information]
url: https://github.com/intel-lab-lkp/linux/commits/Ankur-Arora/sched-warn-for-high-latency-with-TIF_NEED_RESCHED_LAZY/20241010-005819
base: https://git.kernel.org/pub/scm/linux/kernel/git/paulmck/linux-rcu.git dev
patch link: https://lore.kernel.org/r/20241009165411.3426937-2-ankur.a.arora%40oracle.com
patch subject: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
config: x86_64-kexec (https://download.01.org/0day-ci/archive/20241013/202410131726.Gi9qvUP8-lkp@intel.com/config)
compiler: clang version 18.1.8 (https://github.com/llvm/llvm-project 3b5b5c1ec4a3095ab096dd780e84d7ab81f3d7ff)
reproduce (this is a W=1 build): (https://download.01.org/0day-ci/archive/20241013/202410131726.Gi9qvUP8-lkp@intel.com/reproduce)
If you fix the issue in a separate patch/commit (i.e. not just a new version of
the same patch/commit), kindly add following tags
| Reported-by: kernel test robot <lkp@intel.com>
| Closes: https://lore.kernel.org/oe-kbuild-all/202410131726.Gi9qvUP8-lkp@intel.com/
All errors (new ones prefixed by >>):
>> kernel/sched/core.c:5528:27: error: call to undeclared function 'tif_need_resched_lazy'; ISO C99 and later do not support implicit function declarations [-Wimplicit-function-declaration]
5528 | if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
| ^
kernel/sched/core.c:5528:27: note: did you mean 'tif_need_resched'?
include/linux/thread_info.h:182:29: note: 'tif_need_resched' declared here
182 | static __always_inline bool tif_need_resched(void)
| ^
1 error generated.
vim +/tif_need_resched_lazy +5528 kernel/sched/core.c
5517
5518 #ifdef CONFIG_SCHED_DEBUG
5519 static u64 cpu_resched_latency(struct rq *rq)
5520 {
5521 int latency_warn_ms = READ_ONCE(sysctl_resched_latency_warn_ms);
5522 u64 resched_latency, now = rq_clock(rq);
5523 static bool warned_once;
5524
5525 if (sysctl_resched_latency_warn_once && warned_once)
5526 return 0;
5527
> 5528 if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
5529 return 0;
5530
5531 if (system_state == SYSTEM_BOOTING)
5532 return 0;
5533
5534 if (!rq->last_seen_need_resched_ns) {
5535 rq->last_seen_need_resched_ns = now;
5536 rq->ticks_without_resched = 0;
5537 return 0;
5538 }
5539
5540 rq->ticks_without_resched++;
5541 resched_latency = now - rq->last_seen_need_resched_ns;
5542 if (resched_latency <= latency_warn_ms * NSEC_PER_MSEC)
5543 return 0;
5544
5545 warned_once = true;
5546
5547 return resched_latency;
5548 }
5549
--
0-DAY CI Kernel Test Service
https://github.com/intel/lkp-tests/wiki
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y
2024-10-11 7:36 ` Sebastian Andrzej Siewior
@ 2024-10-14 20:14 ` Ankur Arora
0 siblings, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-14 20:14 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-10 10:50:29 [-0700], Ankur Arora wrote:
>> > PREEMPTION=y, PREEMPT_RCU=n should not be possible.
>>
>> That's a statement. Is there an argument here?
>
> For my taste you should describe in your cover letter the actual problem
> and what you intend to do about it.
Yes, that's a good point. It has been discussed a few times before but
I clearly didn't make a case for it here.
Thanks
Ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-11 15:59 ` Paul E. McKenney
@ 2024-10-15 11:22 ` Sebastian Andrzej Siewior
2024-10-15 22:13 ` Ankur Arora
2024-10-15 23:11 ` Paul E. McKenney
0 siblings, 2 replies; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-15 11:22 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-11 08:59:14 [-0700], Paul E. McKenney wrote:
> On Fri, Oct 11, 2024 at 04:43:41PM +0200, Sebastian Andrzej Siewior wrote:
>
> > Okay, this eliminates PREEMPT_DYNAMIC then.
> > With PeterZ current series, PREEMPT_LAZY (without everything else
> > enabled) behaves as PREEMPT without the "forced" wake up for the fair
> > class. It is preemptible after all, with preempt_disable() actually
> > doing something. This might speak for preemptible RCU.
> > And assuming in this condition you that "low memory overhead RCU" which
> > is not PREEMPT_RCU. This might require a config option.
>
> The PREEMPT_DYNAMIC case seems to work well as-is for the intended users,
> so I don't see a need to change it. In particular, we already learned
> that we need to set PREEMPT_DYNAMIC=n. Yes, had I caught this in time, I
> would have argued against changing the default, but this was successfully
> slid past me.
>
> As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
> than just keeping non-preemptible RCU when the Kconfig options are
> consistent with this being expected. If this is the case, what are the
> benefits of this more-intrusive change?
As far as I understand you are only concerned about PREEMPT_LAZY and
everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
without PREEMPT_LAZY) is fine.
In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
| config PREEMPT_RCU
| bool
| default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
| select TREE_RCU
| help
would disable PREEMPT_RCU while the default model is PREEMPT. You argue
that only people on small embedded would do such a thing and they would
like to safe additional memory.
I don't think this is always the case because the "preemptible" users
would also get this and this is an unexpected change for them.
> > > > If you would like to add some relief to memory constrained systems,
> > > > wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
> > > > allow to override it?
> > >
> > > Does BASE_SMALL affect anything but log buffer sizes? Either way, we
> > > would still need to avoid the larger memory footprint of preemptible
> > > RCU that shows up due to RCU readers being preempted.
> >
> > It only reduces data structures where possible. So lower performance is
> > probably due to things like futex hashmap (and others) are smaller.
>
> Which is still counterproductive for use cases other than small deep
> embedded systems.
Okay, so that option is gone.
> > > Besides, we are not looking to give up performance vs BASE_SMALL's
> > > "may reduce performance" help text.
> > >
> > > Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
> > > but that is simply not in the cards at the moment.
> >
> > Not sure what the time frame is here. If we go for LAZY and remove NONE
> > and VOLUNTARY then making PREEMPT_RCU would make sense to lower the
> > memory footprint (and not attaching to BASE_SMALL).
> >
> > Is this what you intend or did misunderstand something here?
>
> My requirement is that LAZY not remove/disable/whatever non-preemptible
> RCU. Those currently using non-preemptible RCU should continue to be able
> to be able to use it, with or without LAZY. So why is this requirement
> a problem for you? Or am I missing your point?
Those who were using non-preemptible RCU, whish to use LAZY_PREEPMT +
!PREEMPT_DYNAMIC should be able to disable PREEMPT_RCU only in this case.
Would the following work?
diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
index 8cf8a9a4d868c..2183c775e7808 100644
--- a/kernel/Kconfig.preempt
+++ b/kernel/Kconfig.preempt
@@ -121,6 +121,7 @@ config PREEMPT_COUNT
config PREEMPTION
bool
select PREEMPT_COUNT
+ select PREEMPT_RCU if PREEMPT_DYNAMIC
config PREEMPT_DYNAMIC
bool "Preemption behaviour defined on boot"
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index 3e079de0f5b43..9e4bdbbca4ff9 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -17,7 +17,7 @@ config TREE_RCU
smaller systems.
config PREEMPT_RCU
- bool
+ bool "Preemptible RCU"
default y if PREEMPTION
select TREE_RCU
help
@@ -91,7 +91,7 @@ config NEED_TASKS_RCU
config TASKS_RCU
bool
- default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
+ default NEED_TASKS_RCU && PREEMPTION
select IRQ_WORK
config FORCE_TASKS_RUDE_RCU
I added TASKS_RCU to the hunk since I am not sure if you wish to follow
PREEMPTION (which is set by LAZY) or PREEMPT_RCU.
> Thanx, Paul
Sebastian
^ permalink raw reply related [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-15 11:22 ` Sebastian Andrzej Siewior
@ 2024-10-15 22:13 ` Ankur Arora
2024-10-17 8:04 ` Sebastian Andrzej Siewior
2024-10-15 23:11 ` Paul E. McKenney
1 sibling, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-15 22:13 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Paul E. McKenney, Peter Zijlstra, Ankur Arora, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-11 08:59:14 [-0700], Paul E. McKenney wrote:
>> On Fri, Oct 11, 2024 at 04:43:41PM +0200, Sebastian Andrzej Siewior wrote:
>>
>> > Okay, this eliminates PREEMPT_DYNAMIC then.
>> > With PeterZ current series, PREEMPT_LAZY (without everything else
>> > enabled) behaves as PREEMPT without the "forced" wake up for the fair
>> > class. It is preemptible after all, with preempt_disable() actually
>> > doing something. This might speak for preemptible RCU.
>> > And assuming in this condition you that "low memory overhead RCU" which
>> > is not PREEMPT_RCU. This might require a config option.
>>
>> The PREEMPT_DYNAMIC case seems to work well as-is for the intended users,
>> so I don't see a need to change it. In particular, we already learned
>> that we need to set PREEMPT_DYNAMIC=n. Yes, had I caught this in time, I
>> would have argued against changing the default, but this was successfully
>> slid past me.
>>
>> As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
>> than just keeping non-preemptible RCU when the Kconfig options are
>> consistent with this being expected. If this is the case, what are the
>> benefits of this more-intrusive change?
>
> As far as I understand you are only concerned about PREEMPT_LAZY and
> everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
> without PREEMPT_LAZY) is fine.
> In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
>
> | config PREEMPT_RCU
> | bool
> | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> | select TREE_RCU
> | help
>
> would disable PREEMPT_RCU while the default model is PREEMPT. You argue
With PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n, isn't the default model
PREEMPT_LAZY, which has PREEMPTION=y, but PREEMPT=n?
> that only people on small embedded would do such a thing and they would
> like to safe additional memory.
>
> I don't think this is always the case because the "preemptible" users
> would also get this and this is an unexpected change for them.
Can you clarify this? The intent with lazy is to be preemptible but
preempt less often. In that it is meant to be quite different from
CONFIG_PREEMPT.
Ankur
>> > > > If you would like to add some relief to memory constrained systems,
>> > > > wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
>> > > > allow to override it?
>> > >
>> > > Does BASE_SMALL affect anything but log buffer sizes? Either way, we
>> > > would still need to avoid the larger memory footprint of preemptible
>> > > RCU that shows up due to RCU readers being preempted.
>> >
>> > It only reduces data structures where possible. So lower performance is
>> > probably due to things like futex hashmap (and others) are smaller.
>>
>> Which is still counterproductive for use cases other than small deep
>> embedded systems.
>
> Okay, so that option is gone.
>
>> > > Besides, we are not looking to give up performance vs BASE_SMALL's
>> > > "may reduce performance" help text.
>> > >
>> > > Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
>> > > but that is simply not in the cards at the moment.
>> >
>> > Not sure what the time frame is here. If we go for LAZY and remove NONE
>> > and VOLUNTARY then making PREEMPT_RCU would make sense to lower the
>> > memory footprint (and not attaching to BASE_SMALL).
>> >
>> > Is this what you intend or did misunderstand something here?
>>
>> My requirement is that LAZY not remove/disable/whatever non-preemptible
>> RCU. Those currently using non-preemptible RCU should continue to be able
>> to be able to use it, with or without LAZY. So why is this requirement
>> a problem for you? Or am I missing your point?
>
> Those who were using non-preemptible RCU, whish to use LAZY_PREEPMT +
> !PREEMPT_DYNAMIC should be able to disable PREEMPT_RCU only in this case.
> Would the following work?
>
> diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
> index 8cf8a9a4d868c..2183c775e7808 100644
> --- a/kernel/Kconfig.preempt
> +++ b/kernel/Kconfig.preempt
> @@ -121,6 +121,7 @@ config PREEMPT_COUNT
> config PREEMPTION
> bool
> select PREEMPT_COUNT
> + select PREEMPT_RCU if PREEMPT_DYNAMIC
>
> config PREEMPT_DYNAMIC
> bool "Preemption behaviour defined on boot"
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 3e079de0f5b43..9e4bdbbca4ff9 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -17,7 +17,7 @@ config TREE_RCU
> smaller systems.
>
> config PREEMPT_RCU
> - bool
> + bool "Preemptible RCU"
> default y if PREEMPTION
> select TREE_RCU
> help
> @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
>
> config TASKS_RCU
> bool
> - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
> + default NEED_TASKS_RCU && PREEMPTION
> select IRQ_WORK
>
> config FORCE_TASKS_RUDE_RCU
>
> I added TASKS_RCU to the hunk since I am not sure if you wish to follow
> PREEMPTION (which is set by LAZY) or PREEMPT_RCU.
>
>> Thanx, Paul
>
> Sebastian
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-11 7:58 ` Sebastian Andrzej Siewior
@ 2024-10-15 23:01 ` Ankur Arora
0 siblings, 0 replies; 75+ messages in thread
From: Ankur Arora @ 2024-10-15 23:01 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, Peter Zijlstra, Paul E. McKenney, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-10 10:35:25 [-0700], Ankur Arora wrote:
>> That's odd. I have that exact configuration (PREEMPT_DYANMIC=n,
>> PREEMPT_LAZY=y, PREEMPT_RCU=n).
>>
>> Can you share your .config?
>
> Sent offlist.
Thanks. So, I tried the config and your config enables
PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n.
That forces PREEMPT_RCU=n.
If in that config you change the preemption options to
PREEMPT_LAZY=y, PREEMPT_DYNAMIC=y, we get PREEMPT_RCU=y.
Is that also the behaviour you are seeing? Seems to be doing
the right thing.
Also, quite similar to how we handle the PREEMPT_NONE/_VOLUNTARY
case. And, lazy is meant to have similar preemption behaviour
as those models.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-15 11:22 ` Sebastian Andrzej Siewior
2024-10-15 22:13 ` Ankur Arora
@ 2024-10-15 23:11 ` Paul E. McKenney
2024-10-17 7:07 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-15 23:11 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Tue, Oct 15, 2024 at 01:22:24PM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-11 08:59:14 [-0700], Paul E. McKenney wrote:
> > On Fri, Oct 11, 2024 at 04:43:41PM +0200, Sebastian Andrzej Siewior wrote:
> >
> > > Okay, this eliminates PREEMPT_DYNAMIC then.
> > > With PeterZ current series, PREEMPT_LAZY (without everything else
> > > enabled) behaves as PREEMPT without the "forced" wake up for the fair
> > > class. It is preemptible after all, with preempt_disable() actually
> > > doing something. This might speak for preemptible RCU.
> > > And assuming in this condition you that "low memory overhead RCU" which
> > > is not PREEMPT_RCU. This might require a config option.
> >
> > The PREEMPT_DYNAMIC case seems to work well as-is for the intended users,
> > so I don't see a need to change it. In particular, we already learned
> > that we need to set PREEMPT_DYNAMIC=n. Yes, had I caught this in time, I
> > would have argued against changing the default, but this was successfully
> > slid past me.
> >
> > As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
> > than just keeping non-preemptible RCU when the Kconfig options are
> > consistent with this being expected. If this is the case, what are the
> > benefits of this more-intrusive change?
>
> As far as I understand you are only concerned about PREEMPT_LAZY and
> everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
> without PREEMPT_LAZY) is fine.
> In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
>
> | config PREEMPT_RCU
> | bool
> | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> | select TREE_RCU
> | help
>
> would disable PREEMPT_RCU while the default model is PREEMPT. You argue
> that only people on small embedded would do such a thing and they would
> like to safe additional memory.
I am more worried about large datacenter deployments than small embedded
systems. Larger systems, but various considerations often limit the
amount of memory on a given system.
> I don't think this is always the case because the "preemptible" users
> would also get this and this is an unexpected change for them.
Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
As conceived last time around, the change would affect only kernels
built with one of the other of those two Kconfig options, which will
not be users expecting preemption.
> > > > > If you would like to add some relief to memory constrained systems,
> > > > > wouldn't BASE_SMALL be something you could hook to? With EXPERT_RCU to
> > > > > allow to override it?
> > > >
> > > > Does BASE_SMALL affect anything but log buffer sizes? Either way, we
> > > > would still need to avoid the larger memory footprint of preemptible
> > > > RCU that shows up due to RCU readers being preempted.
> > >
> > > It only reduces data structures where possible. So lower performance is
> > > probably due to things like futex hashmap (and others) are smaller.
> >
> > Which is still counterproductive for use cases other than small deep
> > embedded systems.
>
> Okay, so that option is gone.
It was worth a look, so thank you!
> > > > Besides, we are not looking to give up performance vs BASE_SMALL's
> > > > "may reduce performance" help text.
> > > >
> > > > Yes, yes, it would simplify things to just get rid of non-preemptible RCU,
> > > > but that is simply not in the cards at the moment.
> > >
> > > Not sure what the time frame is here. If we go for LAZY and remove NONE
> > > and VOLUNTARY then making PREEMPT_RCU would make sense to lower the
> > > memory footprint (and not attaching to BASE_SMALL).
> > >
> > > Is this what you intend or did misunderstand something here?
> >
> > My requirement is that LAZY not remove/disable/whatever non-preemptible
> > RCU. Those currently using non-preemptible RCU should continue to be able
> > to be able to use it, with or without LAZY. So why is this requirement
> > a problem for you? Or am I missing your point?
>
> Those who were using non-preemptible RCU, whish to use LAZY_PREEPMT +
> !PREEMPT_DYNAMIC should be able to disable PREEMPT_RCU only in this case.
> Would the following work?
>
> diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
> index 8cf8a9a4d868c..2183c775e7808 100644
> --- a/kernel/Kconfig.preempt
> +++ b/kernel/Kconfig.preempt
> @@ -121,6 +121,7 @@ config PREEMPT_COUNT
> config PREEMPTION
> bool
> select PREEMPT_COUNT
> + select PREEMPT_RCU if PREEMPT_DYNAMIC
>
> config PREEMPT_DYNAMIC
> bool "Preemption behaviour defined on boot"
> diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> index 3e079de0f5b43..9e4bdbbca4ff9 100644
> --- a/kernel/rcu/Kconfig
> +++ b/kernel/rcu/Kconfig
> @@ -17,7 +17,7 @@ config TREE_RCU
> smaller systems.
>
> config PREEMPT_RCU
> - bool
> + bool "Preemptible RCU"
> default y if PREEMPTION
> select TREE_RCU
> help
> @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
far better to make PREEMPT_RCU depend on neither of those being set.
That would leave the RCU Kconfig settings fully automatic, and this
automation is not to be abandoned lightly.
> config TASKS_RCU
> bool
> - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
> + default NEED_TASKS_RCU && PREEMPTION
> select IRQ_WORK
>
> config FORCE_TASKS_RUDE_RCU
>
> I added TASKS_RCU to the hunk since I am not sure if you wish to follow
> PREEMPTION (which is set by LAZY) or PREEMPT_RCU.
TASKS_RCU needs to be selected when there is preemption of any kind,
lazy or otherwise, regardless of the settign of PREEMPT_RCU.
The current substition of vanilla RCU for Tasks RCU works only in
kernels that are guaranteed non-preemptible, which does not include
kernels built with lazy preemption.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
` (2 preceding siblings ...)
2024-10-13 9:54 ` kernel test robot
@ 2024-10-16 9:36 ` Shrikanth Hegde
2024-10-21 19:21 ` Ankur Arora
3 siblings, 1 reply; 75+ messages in thread
From: Shrikanth Hegde @ 2024-10-16 9:36 UTC (permalink / raw)
To: Ankur Arora, linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 10/9/24 22:24, Ankur Arora wrote:
> resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
> without rescheduling for more than the latency_warn_ms period.
>
I am bit confused here. Why do we need to warn if LAZY is set for a long
time?
If lazy set, the subsequent tick, it would be set to upgraded to
NEED_RESCHED.
Since the value of latency_warn_ms=100ms, that means even on system with
HZ=100, that means 10 ticks before that warning would be printed no?
IIUC, the changelog c006fac556e40 ("sched: Warn on long periods of
pending need_resched") has the concern of need_resched set but if it is
non-preemptible kernel it would spend a lot of time in kernel mode. In
that case print a warning.
If someone enables Lazy, that means it is preemptible and probably this
whole notion of resched_latency_warn doesn't apply to lazy. Please
correct me if i am not understanding this correctly.
> Cc: Ingo Molnar <mingo@redhat.com>
> Cc: Peter Ziljstra <peterz@infradead.org>
> Cc: Juri Lelli <juri.lelli@redhat.com>
> Cc: Vincent Guittot <vincent.guittot@linaro.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
> ---
> kernel/sched/core.c | 2 +-
> kernel/sched/debug.c | 7 +++++--
> 2 files changed, 6 insertions(+), 3 deletions(-)
>
> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
> index 694bfcf153cb..1229766b704e 100644
> --- a/kernel/sched/core.c
> +++ b/kernel/sched/core.c
> @@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
> if (sysctl_resched_latency_warn_once && warned_once)
> return 0;
>
> - if (!need_resched() || !latency_warn_ms)
> + if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
> return 0;
>
> if (system_state == SYSTEM_BOOTING)
> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
> index 9abcc6ead11b..f0d551ba64bb 100644
> --- a/kernel/sched/debug.c
> +++ b/kernel/sched/debug.c
> @@ -1293,9 +1293,12 @@ void proc_sched_set_task(struct task_struct *p)
> void resched_latency_warn(int cpu, u64 latency)
> {
> static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
> + char *nr;
> +
> + nr = tif_need_resched() ? "need_resched" : "need_resched_lazy";
>
> WARN(__ratelimit(&latency_check_ratelimit),
> - "sched: CPU %d need_resched set for > %llu ns (%d ticks) "
> + "sched: CPU %d %s set for > %llu ns (%d ticks) "
> "without schedule\n",
> - cpu, latency, cpu_rq(cpu)->ticks_without_resched);
> + cpu, nr, latency, cpu_rq(cpu)->ticks_without_resched);
> }
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-15 23:11 ` Paul E. McKenney
@ 2024-10-17 7:07 ` Sebastian Andrzej Siewior
2024-10-18 17:38 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-17 7:07 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-15 16:11:55 [-0700], Paul E. McKenney wrote:
> > | config PREEMPT_RCU
> > | bool
> > | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> > | select TREE_RCU
> > | help
> >
> > would disable PREEMPT_RCU while the default model is PREEMPT. You argue
> > that only people on small embedded would do such a thing and they would
> > like to safe additional memory.
>
> I am more worried about large datacenter deployments than small embedded
> systems. Larger systems, but various considerations often limit the
> amount of memory on a given system.
okay.
> > I don't think this is always the case because the "preemptible" users
> > would also get this and this is an unexpected change for them.
>
> Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
no, not yet. It is only adding PREEMPT_LAZY as new model, next to
PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
> As conceived last time around, the change would affect only kernels
> built with one of the other of those two Kconfig options, which will
> not be users expecting preemption.
If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
right now.
> > diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
> > index 8cf8a9a4d868c..2183c775e7808 100644
> > --- a/kernel/Kconfig.preempt
> > +++ b/kernel/Kconfig.preempt
> > @@ -121,6 +121,7 @@ config PREEMPT_COUNT
> > config PREEMPTION
> > bool
> > select PREEMPT_COUNT
> > + select PREEMPT_RCU if PREEMPT_DYNAMIC
> >
> > config PREEMPT_DYNAMIC
> > bool "Preemption behaviour defined on boot"
> > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > index 3e079de0f5b43..9e4bdbbca4ff9 100644
> > --- a/kernel/rcu/Kconfig
> > +++ b/kernel/rcu/Kconfig
> > @@ -17,7 +17,7 @@ config TREE_RCU
> > smaller systems.
> >
> > config PREEMPT_RCU
> > - bool
> > + bool "Preemptible RCU"
> > default y if PREEMPTION
> > select TREE_RCU
> > help
> > @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
>
> If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
> far better to make PREEMPT_RCU depend on neither of those being set.
> That would leave the RCU Kconfig settings fully automatic, and this
> automation is not to be abandoned lightly.
Yes, that was my intention - only to make is selectable with
LAZY-preemption enabled but without dynamic.
So you are not complete against it.
> > config TASKS_RCU
> > bool
> > - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
> > + default NEED_TASKS_RCU && PREEMPTION
> > select IRQ_WORK
> >
> > config FORCE_TASKS_RUDE_RCU
> >
> > I added TASKS_RCU to the hunk since I am not sure if you wish to follow
> > PREEMPTION (which is set by LAZY) or PREEMPT_RCU.
>
> TASKS_RCU needs to be selected when there is preemption of any kind,
> lazy or otherwise, regardless of the settign of PREEMPT_RCU.
Okay. In that case PREEMPT_AUTO can be removed.
> The current substition of vanilla RCU for Tasks RCU works only in
> kernels that are guaranteed non-preemptible, which does not include
> kernels built with lazy preemption.
>
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-15 22:13 ` Ankur Arora
@ 2024-10-17 8:04 ` Sebastian Andrzej Siewior
2024-10-17 22:50 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-17 8:04 UTC (permalink / raw)
To: Ankur Arora
Cc: Paul E. McKenney, Peter Zijlstra, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>
> >>
> >> As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
> >> than just keeping non-preemptible RCU when the Kconfig options are
> >> consistent with this being expected. If this is the case, what are the
> >> benefits of this more-intrusive change?
> >
> > As far as I understand you are only concerned about PREEMPT_LAZY and
> > everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
> > without PREEMPT_LAZY) is fine.
> > In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
> >
> > | config PREEMPT_RCU
> > | bool
> > | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> > | select TREE_RCU
> > | help
> >
> > would disable PREEMPT_RCU while the default model is PREEMPT. You argue
>
> With PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n, isn't the default model
> PREEMPT_LAZY, which has PREEMPTION=y, but PREEMPT=n?
Correct.
> > that only people on small embedded would do such a thing and they would
> > like to safe additional memory.
> >
> > I don't think this is always the case because the "preemptible" users
> > would also get this and this is an unexpected change for them.
>
> Can you clarify this? The intent with lazy is to be preemptible but
> preempt less often. In that it is meant to be quite different from
> CONFIG_PREEMPT.
A wake up with PREEMPT may not always lead to a preemption but will lead
to preemption once the time slice is up. With LAZY this changes to the
point that a preemption point will be delayed to the sched tick. Tasks
which are not based on the fail class (SCHED_DL, FIFO, …) will receive a
wake up right away.
If in the long term NONE and VOL goes away you could argue that everyone
using LAZY + !DYNAMIC is one of those. If additionally PREEMPT goes away
then you can not. Therefore I would prefer to have the RCU model to be
selectable rather than forced. While !PREEMPT_RCU may save memory, it
also disable preemption for the read section.
> Ankur
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-17 8:04 ` Sebastian Andrzej Siewior
@ 2024-10-17 22:50 ` Ankur Arora
2024-10-18 17:43 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-17 22:50 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Ankur Arora, Paul E. McKenney, Peter Zijlstra, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
>> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>>
>> >>
>> >> As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
>> >> than just keeping non-preemptible RCU when the Kconfig options are
>> >> consistent with this being expected. If this is the case, what are the
>> >> benefits of this more-intrusive change?
>> >
>> > As far as I understand you are only concerned about PREEMPT_LAZY and
>> > everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
>> > without PREEMPT_LAZY) is fine.
>> > In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
>> >
>> > | config PREEMPT_RCU
>> > | bool
>> > | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
>> > | select TREE_RCU
>> > | help
>> >
>> > would disable PREEMPT_RCU while the default model is PREEMPT. You argue
>>
>> With PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n, isn't the default model
>> PREEMPT_LAZY, which has PREEMPTION=y, but PREEMPT=n?
>
> Correct.
>
>> > that only people on small embedded would do such a thing and they would
>> > like to safe additional memory.
>> >
>> > I don't think this is always the case because the "preemptible" users
>> > would also get this and this is an unexpected change for them.
>>
>> Can you clarify this? The intent with lazy is to be preemptible but
>> preempt less often. In that it is meant to be quite different from
>> CONFIG_PREEMPT.
>
> A wake up with PREEMPT may not always lead to a preemption but will lead
> to preemption once the time slice is up. With LAZY this changes to the
> point that a preemption point will be delayed to the sched tick. Tasks
> which are not based on the fail class (SCHED_DL, FIFO, …) will receive a
> wake up right away.
>> > I don't think this is always the case because the "preemptible" users
>> > would also get this and this is an unexpected change for them.
Yes. My point was that "preemptible" is a mechanism.
The policy about how often preemption happens is determined by the
chosen model PREEMPT_NONE/PREEMPT_VOLUNTARY/PREEMPT_LAZY/PREEMPT/
PREEMPT_RT.
> If in the long term NONE and VOL goes away you could argue that everyone
> using LAZY + !DYNAMIC is one of those.
> If additionally PREEMPT goes away then you can not.
Sure. But, that's just begging the question.
We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
around incurs a cost.
> Therefore I would prefer to have the RCU model to be
> selectable rather than forced. While !PREEMPT_RCU may save memory, it
> also disable preemption for the read section.
When a user chooses one of PREEMPT_NONE/_VOLUNTARY/_LAZY, the implication
is that on the throughput -- latency axis, they care about optimizing
for throughput.
PREEMPT_RCU=n is clearly oriented towards that.
That said, I'm agnostic about making the RCU model selectable. Paul
is the best judge of that.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-17 7:07 ` Sebastian Andrzej Siewior
@ 2024-10-18 17:38 ` Paul E. McKenney
2024-10-21 11:27 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-18 17:38 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Thu, Oct 17, 2024 at 09:07:10AM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-15 16:11:55 [-0700], Paul E. McKenney wrote:
> > > | config PREEMPT_RCU
> > > | bool
> > > | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> > > | select TREE_RCU
> > > | help
> > >
> > > would disable PREEMPT_RCU while the default model is PREEMPT. You argue
> > > that only people on small embedded would do such a thing and they would
> > > like to safe additional memory.
> >
> > I am more worried about large datacenter deployments than small embedded
> > systems. Larger systems, but various considerations often limit the
> > amount of memory on a given system.
>
> okay.
>
> > > I don't think this is always the case because the "preemptible" users
> > > would also get this and this is an unexpected change for them.
> >
> > Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
> no, not yet. It is only adding PREEMPT_LAZY as new model, next to
> PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
>
> > As conceived last time around, the change would affect only kernels
> > built with one of the other of those two Kconfig options, which will
> > not be users expecting preemption.
>
> If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
> right now.
Good, thank you!
Presumably PREEMPT_NONE=y && PREEMPT_LAZY=y enables lazy preemption,
but retains non-preemptible RCU.
> > > diff --git a/kernel/Kconfig.preempt b/kernel/Kconfig.preempt
> > > index 8cf8a9a4d868c..2183c775e7808 100644
> > > --- a/kernel/Kconfig.preempt
> > > +++ b/kernel/Kconfig.preempt
> > > @@ -121,6 +121,7 @@ config PREEMPT_COUNT
> > > config PREEMPTION
> > > bool
> > > select PREEMPT_COUNT
> > > + select PREEMPT_RCU if PREEMPT_DYNAMIC
> > >
> > > config PREEMPT_DYNAMIC
> > > bool "Preemption behaviour defined on boot"
> > > diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
> > > index 3e079de0f5b43..9e4bdbbca4ff9 100644
> > > --- a/kernel/rcu/Kconfig
> > > +++ b/kernel/rcu/Kconfig
> > > @@ -17,7 +17,7 @@ config TREE_RCU
> > > smaller systems.
> > >
> > > config PREEMPT_RCU
> > > - bool
> > > + bool "Preemptible RCU"
> > > default y if PREEMPTION
> > > select TREE_RCU
> > > help
> > > @@ -91,7 +91,7 @@ config NEED_TASKS_RCU
> >
> > If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
> > far better to make PREEMPT_RCU depend on neither of those being set.
> > That would leave the RCU Kconfig settings fully automatic, and this
> > automation is not to be abandoned lightly.
>
> Yes, that was my intention - only to make is selectable with
> LAZY-preemption enabled but without dynamic.
> So you are not complete against it.
Help me out here. In what situation is it beneficial to make PREEMPT_RCU
visible, given that PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT, and
PREEMPT_FULL already take care of this automatically?
> > > config TASKS_RCU
> > > bool
> > > - default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
> > > + default NEED_TASKS_RCU && PREEMPTION
> > > select IRQ_WORK
> > >
> > > config FORCE_TASKS_RUDE_RCU
> > >
> > > I added TASKS_RCU to the hunk since I am not sure if you wish to follow
> > > PREEMPTION (which is set by LAZY) or PREEMPT_RCU.
> >
> > TASKS_RCU needs to be selected when there is preemption of any kind,
> > lazy or otherwise, regardless of the settign of PREEMPT_RCU.
>
> Okay. In that case PREEMPT_AUTO can be removed.
Makes sense to me!
Thanx, Paul
> > The current substition of vanilla RCU for Tasks RCU works only in
> > kernels that are guaranteed non-preemptible, which does not include
> > kernels built with lazy preemption.
> >
> > Thanx, Paul
>
> Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-17 22:50 ` Ankur Arora
@ 2024-10-18 17:43 ` Paul E. McKenney
2024-10-18 19:18 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-18 17:43 UTC (permalink / raw)
To: Ankur Arora
Cc: Sebastian Andrzej Siewior, Peter Zijlstra, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
On Thu, Oct 17, 2024 at 03:50:46PM -0700, Ankur Arora wrote:
> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> > On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >>
> >> >>
> >> >> As for PREEMPT_LAZY, you seem to be suggesting a more intrusive change
> >> >> than just keeping non-preemptible RCU when the Kconfig options are
> >> >> consistent with this being expected. If this is the case, what are the
> >> >> benefits of this more-intrusive change?
> >> >
> >> > As far as I understand you are only concerned about PREEMPT_LAZY and
> >> > everything else (PREEMPT_LAZY + PREEMPT_DYNAMIC or PREEMPT_DYNAMIC
> >> > without PREEMPT_LAZY) is fine.
> >> > In the PREEMPT_LAZY + !PREEMPT_DYNAMIC the suggested change
> >> >
> >> > | config PREEMPT_RCU
> >> > | bool
> >> > | default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
> >> > | select TREE_RCU
> >> > | help
> >> >
> >> > would disable PREEMPT_RCU while the default model is PREEMPT. You argue
> >>
> >> With PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n, isn't the default model
> >> PREEMPT_LAZY, which has PREEMPTION=y, but PREEMPT=n?
> >
> > Correct.
> >
> >> > that only people on small embedded would do such a thing and they would
> >> > like to safe additional memory.
> >> >
> >> > I don't think this is always the case because the "preemptible" users
> >> > would also get this and this is an unexpected change for them.
> >>
> >> Can you clarify this? The intent with lazy is to be preemptible but
> >> preempt less often. In that it is meant to be quite different from
> >> CONFIG_PREEMPT.
> >
> > A wake up with PREEMPT may not always lead to a preemption but will lead
> > to preemption once the time slice is up. With LAZY this changes to the
> > point that a preemption point will be delayed to the sched tick. Tasks
> > which are not based on the fail class (SCHED_DL, FIFO, …) will receive a
> > wake up right away.
>
> >> > I don't think this is always the case because the "preemptible" users
> >> > would also get this and this is an unexpected change for them.
>
> Yes. My point was that "preemptible" is a mechanism.
>
> The policy about how often preemption happens is determined by the
> chosen model PREEMPT_NONE/PREEMPT_VOLUNTARY/PREEMPT_LAZY/PREEMPT/
> PREEMPT_RT.
>
> > If in the long term NONE and VOL goes away you could argue that everyone
> > using LAZY + !DYNAMIC is one of those.
>
> > If additionally PREEMPT goes away then you can not.
>
> Sure. But, that's just begging the question.
>
> We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
> around incurs a cost.
When you say "go away", do you mean for your use cases? Or globally?
> > Therefore I would prefer to have the RCU model to be
> > selectable rather than forced. While !PREEMPT_RCU may save memory, it
> > also disable preemption for the read section.
>
> When a user chooses one of PREEMPT_NONE/_VOLUNTARY/_LAZY, the implication
> is that on the throughput -- latency axis, they care about optimizing
> for throughput.
>
> PREEMPT_RCU=n is clearly oriented towards that.
Agreed!
> That said, I'm agnostic about making the RCU model selectable. Paul
> is the best judge of that.
I am not seeing any reason to make it once again be selectable, and
plenty of reasons to keep it automatic.
If I am missing some motivation to bother users with the need to make
this decision, please let me know what that motivation might be.
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-18 17:43 ` Paul E. McKenney
@ 2024-10-18 19:18 ` Ankur Arora
2024-10-18 23:24 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-18 19:18 UTC (permalink / raw)
To: paulmck
Cc: Ankur Arora, Sebastian Andrzej Siewior, Peter Zijlstra,
linux-kernel, tglx, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, frederic,
efault
Paul E. McKenney <paulmck@kernel.org> writes:
> On Thu, Oct 17, 2024 at 03:50:46PM -0700, Ankur Arora wrote:
>> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> > On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
>> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> >>
[ ... ]
>> Sure. But, that's just begging the question.
>>
>> We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
>> around incurs a cost.
>
> When you say "go away", do you mean for your use cases? Or globally?
When I say "want _ to go away" I mean: cond_resched() is deleterious
to performance since you are forced to have code which can do the
rescheduling check synchronously -- when this could easily be done
asynchronously (as the non voluntary models do).
And this either means poor performance (ex. in the page zeroing code
where it would be more optimal to work on continguous ranges) or
gyrations like the ones that xen_pv_evtchn_do_upcall() and the
Xen hypervisor have to go through.
And, as we've discussed before, the cond_resched() interface, while it
works, is not ideal.
Also, a man can dream!
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-18 19:18 ` Ankur Arora
@ 2024-10-18 23:24 ` Paul E. McKenney
2024-10-19 1:07 ` Ankur Arora
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-18 23:24 UTC (permalink / raw)
To: Ankur Arora
Cc: Sebastian Andrzej Siewior, Peter Zijlstra, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
On Fri, Oct 18, 2024 at 12:18:04PM -0700, Ankur Arora wrote:
>
> Paul E. McKenney <paulmck@kernel.org> writes:
>
> > On Thu, Oct 17, 2024 at 03:50:46PM -0700, Ankur Arora wrote:
> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >> > On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >> >>
> [ ... ]
> >> Sure. But, that's just begging the question.
> >>
> >> We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
> >> around incurs a cost.
> >
> > When you say "go away", do you mean for your use cases? Or globally?
>
> When I say "want _ to go away" I mean: cond_resched() is deleterious
> to performance since you are forced to have code which can do the
> rescheduling check synchronously -- when this could easily be done
> asynchronously (as the non voluntary models do).
>
> And this either means poor performance (ex. in the page zeroing code
> where it would be more optimal to work on continguous ranges) or
> gyrations like the ones that xen_pv_evtchn_do_upcall() and the
> Xen hypervisor have to go through.
>
> And, as we've discussed before, the cond_resched() interface, while it
> works, is not ideal.
I would expect that many instances of cond_resched() could go away given
lazy preemption, but I would not be surprised if there were some that
needed to stay around.
Your thought being that if *all* instance of cond_resched() go away,
then PREEMPT_NONE also goes away? Given how long PREEMPT_NONE has been
around, this would need to be done (and communicated) quite carefully.
> Also, a man can dream!
Fair enough, just be very careful to distinguish dreams from reality. ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-18 23:24 ` Paul E. McKenney
@ 2024-10-19 1:07 ` Ankur Arora
2024-10-19 4:30 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-19 1:07 UTC (permalink / raw)
To: paulmck
Cc: Ankur Arora, Sebastian Andrzej Siewior, Peter Zijlstra,
linux-kernel, tglx, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, frederic,
efault
Paul E. McKenney <paulmck@kernel.org> writes:
> On Fri, Oct 18, 2024 at 12:18:04PM -0700, Ankur Arora wrote:
>>
>> Paul E. McKenney <paulmck@kernel.org> writes:
>>
>> > On Thu, Oct 17, 2024 at 03:50:46PM -0700, Ankur Arora wrote:
>> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> >> > On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
>> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
>> >> >>
>> [ ... ]
>> >> Sure. But, that's just begging the question.
>> >>
>> >> We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
>> >> around incurs a cost.
>> >
>> > When you say "go away", do you mean for your use cases? Or globally?
>>
>> When I say "want _ to go away" I mean: cond_resched() is deleterious
>> to performance since you are forced to have code which can do the
>> rescheduling check synchronously -- when this could easily be done
>> asynchronously (as the non voluntary models do).
>>
>> And this either means poor performance (ex. in the page zeroing code
>> where it would be more optimal to work on continguous ranges) or
>> gyrations like the ones that xen_pv_evtchn_do_upcall() and the
>> Xen hypervisor have to go through.
>>
>> And, as we've discussed before, the cond_resched() interface, while it
>> works, is not ideal.
>
> I would expect that many instances of cond_resched() could go away given
> lazy preemption, but I would not be surprised if there were some that
> needed to stay around.
>
> Your thought being that if *all* instance of cond_resched() go away,
> then PREEMPT_NONE also goes away?
If *all* instances of cond_resched() go away, is there anything left of
PREEMPT_NONE?
> Given how long PREEMPT_NONE has been
> around, this would need to be done (and communicated) quite carefully.
I don't think there's any question about that.
>> Also, a man can dream!
>
> Fair enough, just be very careful to distinguish dreams from reality. ;-)
I've generally not found that to be a problem, but thanks for the warning.
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-19 1:07 ` Ankur Arora
@ 2024-10-19 4:30 ` Paul E. McKenney
0 siblings, 0 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-19 4:30 UTC (permalink / raw)
To: Ankur Arora
Cc: Sebastian Andrzej Siewior, Peter Zijlstra, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
On Fri, Oct 18, 2024 at 06:07:52PM -0700, Ankur Arora wrote:
>
> Paul E. McKenney <paulmck@kernel.org> writes:
>
> > On Fri, Oct 18, 2024 at 12:18:04PM -0700, Ankur Arora wrote:
> >>
> >> Paul E. McKenney <paulmck@kernel.org> writes:
> >>
> >> > On Thu, Oct 17, 2024 at 03:50:46PM -0700, Ankur Arora wrote:
> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >> >> > On 2024-10-15 15:13:46 [-0700], Ankur Arora wrote:
> >> >> >> Sebastian Andrzej Siewior <bigeasy@linutronix.de> writes:
> >> >> >>
> >> [ ... ]
> >> >> Sure. But, that's just begging the question.
> >> >>
> >> >> We want _NONE and _VOLUNTARY to go away because keeping cond_resched()
> >> >> around incurs a cost.
> >> >
> >> > When you say "go away", do you mean for your use cases? Or globally?
> >>
> >> When I say "want _ to go away" I mean: cond_resched() is deleterious
> >> to performance since you are forced to have code which can do the
> >> rescheduling check synchronously -- when this could easily be done
> >> asynchronously (as the non voluntary models do).
> >>
> >> And this either means poor performance (ex. in the page zeroing code
> >> where it would be more optimal to work on continguous ranges) or
> >> gyrations like the ones that xen_pv_evtchn_do_upcall() and the
> >> Xen hypervisor have to go through.
> >>
> >> And, as we've discussed before, the cond_resched() interface, while it
> >> works, is not ideal.
> >
> > I would expect that many instances of cond_resched() could go away given
> > lazy preemption, but I would not be surprised if there were some that
> > needed to stay around.
> >
> > Your thought being that if *all* instance of cond_resched() go away,
> > then PREEMPT_NONE also goes away?
>
> If *all* instances of cond_resched() go away, is there anything left of
> PREEMPT_NONE?
Yes, namely its relationship to PREEMPT_RCU. Which, yes, can be adjusted,
perhaps even as Sebastian suggested. But such an adjustment cannot be
applied suddenly without warning.
> > Given how long PREEMPT_NONE has been
> > around, this would need to be done (and communicated) quite carefully.
>
> I don't think there's any question about that.
Whew!!!
> >> Also, a man can dream!
> >
> > Fair enough, just be very careful to distinguish dreams from reality. ;-)
>
> I've generally not found that to be a problem, but thanks for the warning.
"It is a service that I provide." ;-)
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-18 17:38 ` Paul E. McKenney
@ 2024-10-21 11:27 ` Sebastian Andrzej Siewior
2024-10-21 16:48 ` Paul E. McKenney
0 siblings, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-21 11:27 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-18 10:38:15 [-0700], Paul E. McKenney wrote:
> > > > I don't think this is always the case because the "preemptible" users
> > > > would also get this and this is an unexpected change for them.
> > >
> > > Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
> > no, not yet. It is only adding PREEMPT_LAZY as new model, next to
> > PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
> >
> > > As conceived last time around, the change would affect only kernels
> > > built with one of the other of those two Kconfig options, which will
> > > not be users expecting preemption.
> >
> > If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
> > right now.
>
> Good, thank you!
>
> Presumably PREEMPT_NONE=y && PREEMPT_LAZY=y enables lazy preemption,
> but retains non-preemptible RCU.
PREEMPT_NONE=y && PREEMPT_LAZY=y is exclusive. Either NONE or LAZY.
> > > If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
> > > far better to make PREEMPT_RCU depend on neither of those being set.
> > > That would leave the RCU Kconfig settings fully automatic, and this
> > > automation is not to be abandoned lightly.
> >
> > Yes, that was my intention - only to make is selectable with
> > LAZY-preemption enabled but without dynamic.
> > So you are not complete against it.
>
> Help me out here. In what situation is it beneficial to make PREEMPT_RCU
> visible, given that PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT, and
> PREEMPT_FULL already take care of this automatically?
We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
Based on my understanding so far, you have nothing to worry about.
With NONE + VOLUNTARY removed in favor of LAZY or without the removal
(yet) you ask yourself what happens to those using NONE, go to LAZY and
want to stay with !PREEMPT_RCU, right?
With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
VOLUNTARY users and want !PREEMPT_RCU.
This could be true but it could also attract people from PREEMPT which
expect additional performance gain due to LAZY and the same "preemption"
level. Additionally if PREEMPT gets removed because LAZY turns out to be
superior then PREEMPT_DYNAMIC makes probably no sense since there is
nothing to switch from/ to.
Therefore I would suggest to make PREEMPT_RCU
- selectable for LAZY && !PREEMPT_DYNAMIC, default yes
- selected for LAZY && PREEMPT_DYNAMIC
- the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
!PREEMPT_DYNAMIC)
Does this make sense to you?
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-21 11:27 ` Sebastian Andrzej Siewior
@ 2024-10-21 16:48 ` Paul E. McKenney
2024-10-21 19:20 ` Ankur Arora
2024-10-22 14:09 ` Sebastian Andrzej Siewior
0 siblings, 2 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-21 16:48 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Mon, Oct 21, 2024 at 01:27:55PM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-18 10:38:15 [-0700], Paul E. McKenney wrote:
> > > > > I don't think this is always the case because the "preemptible" users
> > > > > would also get this and this is an unexpected change for them.
> > > >
> > > > Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
> > > no, not yet. It is only adding PREEMPT_LAZY as new model, next to
> > > PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
> > >
> > > > As conceived last time around, the change would affect only kernels
> > > > built with one of the other of those two Kconfig options, which will
> > > > not be users expecting preemption.
> > >
> > > If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
> > > right now.
> >
> > Good, thank you!
> >
> > Presumably PREEMPT_NONE=y && PREEMPT_LAZY=y enables lazy preemption,
> > but retains non-preemptible RCU.
>
> PREEMPT_NONE=y && PREEMPT_LAZY=y is exclusive. Either NONE or LAZY.
Ah, I was thinking in terms of the previous lazy-preemption patch series,
and just now got around to looking at the new one. Apologies for my
confusion!
> > > > If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
> > > > far better to make PREEMPT_RCU depend on neither of those being set.
> > > > That would leave the RCU Kconfig settings fully automatic, and this
> > > > automation is not to be abandoned lightly.
> > >
> > > Yes, that was my intention - only to make is selectable with
> > > LAZY-preemption enabled but without dynamic.
> > > So you are not complete against it.
> >
> > Help me out here. In what situation is it beneficial to make PREEMPT_RCU
> > visible, given that PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT, and
> > PREEMPT_FULL already take care of this automatically?
>
> We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
>
> This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
> Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
In terms of preemptibility, isn't the order from least to most preemptible
NONE, VOLUNTARY, LAZIEST, LAZY, PREEMPT, and PREEMPT_RT? After all,
PREEMPT will preempt more aggressively than will LAZY which in turn
preempts more aggressively than LAZIEST.
And I finally do see the later patch in Peter's series that removes
PREEMPT_RT from the choice. Plus I need to look more at LAZIEST in
order to work out whether Peter's suckage is our robustness. ;-)
> Based on my understanding so far, you have nothing to worry about.
>
> With NONE + VOLUNTARY removed in favor of LAZY or without the removal
> (yet) you ask yourself what happens to those using NONE, go to LAZY and
> want to stay with !PREEMPT_RCU, right?
> With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
> And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
> VOLUNTARY users and want !PREEMPT_RCU.
Yes.
> This could be true but it could also attract people from PREEMPT which
> expect additional performance gain due to LAZY and the same "preemption"
> level. Additionally if PREEMPT gets removed because LAZY turns out to be
> superior then PREEMPT_DYNAMIC makes probably no sense since there is
> nothing to switch from/ to.
We definitely have users that would migrate from NONE to LAZY. Shouldn't
their needs outweigh the possible future users that might (or might not)
find that (1) LAZY might be preferable to PREEMPT for some users and
(2) That those users would prefer that RCU be preemptible?
> Therefore I would suggest to make PREEMPT_RCU
> - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> - selected for LAZY && PREEMPT_DYNAMIC
> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> !PREEMPT_DYNAMIC)
>
> Does this make sense to you?
Not really. At the very least, default no.
Unless LAZIEST makes the most sense for us (which will take time to
figure out), in which case make PREMPT_RCU:
- hard-defined =n for LAZIEST.
- selectable for LAZY && !PREEMPT_DYNAMIC, default yes
- selected for LAZY && PREEMPT_DYNAMIC
- the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
!PREEMPT_DYNAMIC)
Or am I still missing some aspect of this series?
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-21 16:48 ` Paul E. McKenney
@ 2024-10-21 19:20 ` Ankur Arora
2024-10-22 23:49 ` Paul E. McKenney
2024-10-22 14:09 ` Sebastian Andrzej Siewior
1 sibling, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-21 19:20 UTC (permalink / raw)
To: paulmck
Cc: Sebastian Andrzej Siewior, Peter Zijlstra, Ankur Arora,
linux-kernel, tglx, mingo, juri.lelli, vincent.guittot,
dietmar.eggemann, rostedt, bsegall, mgorman, vschneid, frederic,
efault
Paul E. McKenney <paulmck@kernel.org> writes:
> On Mon, Oct 21, 2024 at 01:27:55PM +0200, Sebastian Andrzej Siewior wrote:
>> On 2024-10-18 10:38:15 [-0700], Paul E. McKenney wrote:
>> > > > > I don't think this is always the case because the "preemptible" users
>> > > > > would also get this and this is an unexpected change for them.
>> > > >
>> > > > Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
>> > > no, not yet. It is only adding PREEMPT_LAZY as new model, next to
>> > > PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
>> > >
>> > > > As conceived last time around, the change would affect only kernels
>> > > > built with one of the other of those two Kconfig options, which will
>> > > > not be users expecting preemption.
>> > >
>> > > If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
>> > > right now.
>> >
>> > Good, thank you!
>> >
>> > Presumably PREEMPT_NONE=y && PREEMPT_LAZY=y enables lazy preemption,
>> > but retains non-preemptible RCU.
>>
>> PREEMPT_NONE=y && PREEMPT_LAZY=y is exclusive. Either NONE or LAZY.
>
> Ah, I was thinking in terms of the previous lazy-preemption patch series,
> and just now got around to looking at the new one. Apologies for my
> confusion!
Minor point, but you might be thinking of PREEMPT_AUTO=y && PREEMPT_LAZY=y.
>> > > > If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
>> > > > far better to make PREEMPT_RCU depend on neither of those being set.
>> > > > That would leave the RCU Kconfig settings fully automatic, and this
>> > > > automation is not to be abandoned lightly.
>> > >
>> > > Yes, that was my intention - only to make is selectable with
>> > > LAZY-preemption enabled but without dynamic.
>> > > So you are not complete against it.
>> >
>> > Help me out here. In what situation is it beneficial to make PREEMPT_RCU
>> > visible, given that PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT, and
>> > PREEMPT_FULL already take care of this automatically?
>>
>> We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
>>
>> This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
>> Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
>
> In terms of preemptibility, isn't the order from least to most preemptible
> NONE, VOLUNTARY, LAZIEST, LAZY, PREEMPT, and PREEMPT_RT? After all,
> PREEMPT will preempt more aggressively than will LAZY which in turn
> preempts more aggressively than LAZIEST.
>
> And I finally do see the later patch in Peter's series that removes
> PREEMPT_RT from the choice. Plus I need to look more at LAZIEST in
> order to work out whether Peter's suckage is our robustness. ;-)
>
>> Based on my understanding so far, you have nothing to worry about.
>>
>> With NONE + VOLUNTARY removed in favor of LAZY or without the removal
>> (yet) you ask yourself what happens to those using NONE, go to LAZY and
>> want to stay with !PREEMPT_RCU, right?
>> With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
>> And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
>> VOLUNTARY users and want !PREEMPT_RCU.
>
> Yes.
>
>> This could be true but it could also attract people from PREEMPT which
>> expect additional performance gain due to LAZY and the same "preemption"
>> level. Additionally if PREEMPT gets removed because LAZY turns out to be
>> superior then PREEMPT_DYNAMIC makes probably no sense since there is
>> nothing to switch from/ to.
>
> We definitely have users that would migrate from NONE to LAZY. Shouldn't
Indeed. This was the original intent behind Thomas's proposal of preempt
lazy.
> their needs outweigh the possible future users that might (or might not)
> find that (1) LAZY might be preferable to PREEMPT for some users and
> (2) That those users would prefer that RCU be preemptible?
Users who care about low latency already have perfectly good options:
PREEMPT, PREEMPT_DYNAMIC and now PREEMPT_RT.
I don't see the point of elevating low latency needs in all preemption
models -- even those which are meant to be througput oriented.
>> Therefore I would suggest to make PREEMPT_RCU
>> - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
>> - selected for LAZY && PREEMPT_DYNAMIC
>> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
>> !PREEMPT_DYNAMIC)
>>
>> Does this make sense to you?
>
> Not really. At the very least, default no.
>
> Unless LAZIEST makes the most sense for us (which will take time to
> figure out), in which case make PREMPT_RCU:
I don't think laziest was ever meant to be a serious option.
Peter mentioned that he'll be dropping it:
https://lore.kernel.org/lkml/20241008144049.GF14587@noisy.programming.kicks-ass.net/
Ankur
> - hard-defined =n for LAZIEST.
> - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> - selected for LAZY && PREEMPT_DYNAMIC
> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> !PREEMPT_DYNAMIC)
>
> Or am I still missing some aspect of this series?
>
> Thanx, Paul
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-16 9:36 ` Shrikanth Hegde
@ 2024-10-21 19:21 ` Ankur Arora
2024-10-22 5:41 ` Shrikanth Hegde
0 siblings, 1 reply; 75+ messages in thread
From: Ankur Arora @ 2024-10-21 19:21 UTC (permalink / raw)
To: Shrikanth Hegde
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
Shrikanth Hegde <sshegde@linux.ibm.com> writes:
> On 10/9/24 22:24, Ankur Arora wrote:
>> resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
>> without rescheduling for more than the latency_warn_ms period.
>>
>
> I am bit confused here. Why do we need to warn if LAZY is set for a long time?
>
> If lazy set, the subsequent tick, it would be set to upgraded to NEED_RESCHED.
>
> Since the value of latency_warn_ms=100ms, that means even on system with HZ=100,
> that means 10 ticks before that warning would be printed no?
That's a fair point. However, the assumption there is that there are no
bugs in upgrade on tick or that there's no situation in which the tick
is off for a prolonged period.
Ankur
> IIUC, the changelog c006fac556e40 ("sched: Warn on long periods of pending
> need_resched") has the concern of need_resched set but if it is non-preemptible
> kernel it would spend a lot of time in kernel mode. In that case print a
> warning.
>
> If someone enables Lazy, that means it is preemptible and probably this whole
> notion of resched_latency_warn doesn't apply to lazy. Please correct me if i am
> not understanding this correctly.
>
>> Cc: Ingo Molnar <mingo@redhat.com>
>> Cc: Peter Ziljstra <peterz@infradead.org>
>> Cc: Juri Lelli <juri.lelli@redhat.com>
>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>> ---
>> kernel/sched/core.c | 2 +-
>> kernel/sched/debug.c | 7 +++++--
>> 2 files changed, 6 insertions(+), 3 deletions(-)
>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>> index 694bfcf153cb..1229766b704e 100644
>> --- a/kernel/sched/core.c
>> +++ b/kernel/sched/core.c
>> @@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
>> if (sysctl_resched_latency_warn_once && warned_once)
>> return 0;
>> - if (!need_resched() || !latency_warn_ms)
>> + if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
>> return 0;
>> if (system_state == SYSTEM_BOOTING)
>> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
>> index 9abcc6ead11b..f0d551ba64bb 100644
>> --- a/kernel/sched/debug.c
>> +++ b/kernel/sched/debug.c
>> @@ -1293,9 +1293,12 @@ void proc_sched_set_task(struct task_struct *p)
>> void resched_latency_warn(int cpu, u64 latency)
>> {
>> static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
>> + char *nr;
>> +
>> + nr = tif_need_resched() ? "need_resched" : "need_resched_lazy";
>> WARN(__ratelimit(&latency_check_ratelimit),
>> - "sched: CPU %d need_resched set for > %llu ns (%d ticks) "
>> + "sched: CPU %d %s set for > %llu ns (%d ticks) "
>> "without schedule\n",
>> - cpu, latency, cpu_rq(cpu)->ticks_without_resched);
>> + cpu, nr, latency, cpu_rq(cpu)->ticks_without_resched);
>> }
--
ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY
2024-10-21 19:21 ` Ankur Arora
@ 2024-10-22 5:41 ` Shrikanth Hegde
0 siblings, 0 replies; 75+ messages in thread
From: Shrikanth Hegde @ 2024-10-22 5:41 UTC (permalink / raw)
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault
On 10/22/24 00:51, Ankur Arora wrote:
>
> Shrikanth Hegde <sshegde@linux.ibm.com> writes:
>
>> On 10/9/24 22:24, Ankur Arora wrote:
>>> resched_latency_warn() now also warns if TIF_NEED_RESCHED_LAZY is set
>>> without rescheduling for more than the latency_warn_ms period.
>>>
>>
>> I am bit confused here. Why do we need to warn if LAZY is set for a long time?
>>
>> If lazy set, the subsequent tick, it would be set to upgraded to NEED_RESCHED.
>>
>> Since the value of latency_warn_ms=100ms, that means even on system with HZ=100,
>> that means 10 ticks before that warning would be printed no?
>
> That's a fair point. However, the assumption there is that there are no
> bugs in upgrade on tick or that there's no situation in which the tick
> is off for a prolonged period.
>
ok.
But if tick is off, then ticks_without_resched isn't incremented either.
IIUC, this check is for situation when NR is set and tick is on.
> Ankur
>
>> IIUC, the changelog c006fac556e40 ("sched: Warn on long periods of pending
>> need_resched") has the concern of need_resched set but if it is non-preemptible
>> kernel it would spend a lot of time in kernel mode. In that case print a
>> warning.
>>
>> If someone enables Lazy, that means it is preemptible and probably this whole
>> notion of resched_latency_warn doesn't apply to lazy. Please correct me if i am
>> not understanding this correctly.
>>
>>> Cc: Ingo Molnar <mingo@redhat.com>
>>> Cc: Peter Ziljstra <peterz@infradead.org>
>>> Cc: Juri Lelli <juri.lelli@redhat.com>
>>> Cc: Vincent Guittot <vincent.guittot@linaro.org>
>>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>>> ---
>>> kernel/sched/core.c | 2 +-
>>> kernel/sched/debug.c | 7 +++++--
>>> 2 files changed, 6 insertions(+), 3 deletions(-)
>>> diff --git a/kernel/sched/core.c b/kernel/sched/core.c
>>> index 694bfcf153cb..1229766b704e 100644
>>> --- a/kernel/sched/core.c
>>> +++ b/kernel/sched/core.c
>>> @@ -5571,7 +5571,7 @@ static u64 cpu_resched_latency(struct rq *rq)
>>> if (sysctl_resched_latency_warn_once && warned_once)
>>> return 0;
>>> - if (!need_resched() || !latency_warn_ms)
>>> + if ((!need_resched() && !tif_need_resched_lazy()) || !latency_warn_ms)
>>> return 0;
>>> if (system_state == SYSTEM_BOOTING)
>>> diff --git a/kernel/sched/debug.c b/kernel/sched/debug.c
>>> index 9abcc6ead11b..f0d551ba64bb 100644
>>> --- a/kernel/sched/debug.c
>>> +++ b/kernel/sched/debug.c
>>> @@ -1293,9 +1293,12 @@ void proc_sched_set_task(struct task_struct *p)
>>> void resched_latency_warn(int cpu, u64 latency)
>>> {
>>> static DEFINE_RATELIMIT_STATE(latency_check_ratelimit, 60 * 60 * HZ, 1);
>>> + char *nr;
>>> +
>>> + nr = tif_need_resched() ? "need_resched" : "need_resched_lazy";
>>> WARN(__ratelimit(&latency_check_ratelimit),
>>> - "sched: CPU %d need_resched set for > %llu ns (%d ticks) "
>>> + "sched: CPU %d %s set for > %llu ns (%d ticks) "
>>> "without schedule\n",
>>> - cpu, latency, cpu_rq(cpu)->ticks_without_resched);
>>> + cpu, nr, latency, cpu_rq(cpu)->ticks_without_resched);
>>> }
>
>
> --
> ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-21 16:48 ` Paul E. McKenney
2024-10-21 19:20 ` Ankur Arora
@ 2024-10-22 14:09 ` Sebastian Andrzej Siewior
2024-10-22 23:54 ` Paul E. McKenney
1 sibling, 1 reply; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-22 14:09 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-21 09:48:03 [-0700], Paul E. McKenney wrote:
> > We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
> >
> > This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
> > Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
>
> In terms of preemptibility, isn't the order from least to most preemptible
> NONE, VOLUNTARY, LAZIEST, LAZY, PREEMPT, and PREEMPT_RT? After all,
> PREEMPT will preempt more aggressively than will LAZY which in turn
> preempts more aggressively than LAZIEST.
>
> And I finally do see the later patch in Peter's series that removes
> PREEMPT_RT from the choice. Plus I need to look more at LAZIEST in
> order to work out whether Peter's suckage is our robustness. ;-)
For LAZIEST PeterZ added "do we want this?". I haven't tested this but
since there is no forced preemption at all, it should be what is NONE
without cond_resched() & friends. So I don't know if it stays, I don't
think so.
> > Based on my understanding so far, you have nothing to worry about.
> >
> > With NONE + VOLUNTARY removed in favor of LAZY or without the removal
> > (yet) you ask yourself what happens to those using NONE, go to LAZY and
> > want to stay with !PREEMPT_RCU, right?
> > With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
> > And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
> > VOLUNTARY users and want !PREEMPT_RCU.
>
> Yes.
>
> > This could be true but it could also attract people from PREEMPT which
> > expect additional performance gain due to LAZY and the same "preemption"
> > level. Additionally if PREEMPT gets removed because LAZY turns out to be
> > superior then PREEMPT_DYNAMIC makes probably no sense since there is
> > nothing to switch from/ to.
>
> We definitely have users that would migrate from NONE to LAZY. Shouldn't
> their needs outweigh the possible future users that might (or might not)
> find that (1) LAZY might be preferable to PREEMPT for some users and
> (2) That those users would prefer that RCU be preemptible?
Yes. I have no idea which of those two (PREEMPT_RCU vs !PREEMPT_RCU) is
to be preferred. Therefore I'm suggesting to make configurable here.
If you have a benchmark for memory consumption or anything else of
interest, I could throw it a box or two to get some numbers. I've been
looking at free memory at boot and this was almost the same (+- noise).
> > Therefore I would suggest to make PREEMPT_RCU
> > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > - selected for LAZY && PREEMPT_DYNAMIC
> > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > !PREEMPT_DYNAMIC)
> >
> > Does this make sense to you?
>
> Not really. At the very least, default no.
>
> Unless LAZIEST makes the most sense for us (which will take time to
> figure out), in which case make PREMPT_RCU:
>
> - hard-defined =n for LAZIEST.
> - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> - selected for LAZY && PREEMPT_DYNAMIC
> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> !PREEMPT_DYNAMIC)
>
> Or am I still missing some aspect of this series?
no, that is perfect.
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-21 19:20 ` Ankur Arora
@ 2024-10-22 23:49 ` Paul E. McKenney
0 siblings, 0 replies; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-22 23:49 UTC (permalink / raw)
To: Ankur Arora
Cc: Sebastian Andrzej Siewior, Peter Zijlstra, linux-kernel, tglx,
mingo, juri.lelli, vincent.guittot, dietmar.eggemann, rostedt,
bsegall, mgorman, vschneid, frederic, efault
On Mon, Oct 21, 2024 at 12:20:58PM -0700, Ankur Arora wrote:
>
> Paul E. McKenney <paulmck@kernel.org> writes:
>
> > On Mon, Oct 21, 2024 at 01:27:55PM +0200, Sebastian Andrzej Siewior wrote:
> >> On 2024-10-18 10:38:15 [-0700], Paul E. McKenney wrote:
> >> > > > > I don't think this is always the case because the "preemptible" users
> >> > > > > would also get this and this is an unexpected change for them.
> >> > > >
> >> > > > Is this series now removing PREEMPT_NONE and PREEMPT_VOLUNTARY?
> >> > > no, not yet. It is only adding PREEMPT_LAZY as new model, next to
> >> > > PREEMPT_NONE and PREEMPT_VOLUNTARY. But is is likely to be on schedule.
> >> > >
> >> > > > As conceived last time around, the change would affect only kernels
> >> > > > built with one of the other of those two Kconfig options, which will
> >> > > > not be users expecting preemption.
> >> > >
> >> > > If you continue to use PREEMPT_NONE/ PREEMPT_VOLUNTARY nothing changes
> >> > > right now.
> >> >
> >> > Good, thank you!
> >> >
> >> > Presumably PREEMPT_NONE=y && PREEMPT_LAZY=y enables lazy preemption,
> >> > but retains non-preemptible RCU.
> >>
> >> PREEMPT_NONE=y && PREEMPT_LAZY=y is exclusive. Either NONE or LAZY.
> >
> > Ah, I was thinking in terms of the previous lazy-preemption patch series,
> > and just now got around to looking at the new one. Apologies for my
> > confusion!
>
> Minor point, but you might be thinking of PREEMPT_AUTO=y && PREEMPT_LAZY=y.
Entirely possible.
> >> > > > If PREEMPT_NONE and PREEMPT_VOLUNTARY are still around, it would be
> >> > > > far better to make PREEMPT_RCU depend on neither of those being set.
> >> > > > That would leave the RCU Kconfig settings fully automatic, and this
> >> > > > automation is not to be abandoned lightly.
> >> > >
> >> > > Yes, that was my intention - only to make is selectable with
> >> > > LAZY-preemption enabled but without dynamic.
> >> > > So you are not complete against it.
> >> >
> >> > Help me out here. In what situation is it beneficial to make PREEMPT_RCU
> >> > visible, given that PREEMPT_NONE, PREEMPT_VOLUNTARY, PREEMPT, and
> >> > PREEMPT_FULL already take care of this automatically?
> >>
> >> We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
> >>
> >> This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
> >> Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
> >
> > In terms of preemptibility, isn't the order from least to most preemptible
> > NONE, VOLUNTARY, LAZIEST, LAZY, PREEMPT, and PREEMPT_RT? After all,
> > PREEMPT will preempt more aggressively than will LAZY which in turn
> > preempts more aggressively than LAZIEST.
> >
> > And I finally do see the later patch in Peter's series that removes
> > PREEMPT_RT from the choice. Plus I need to look more at LAZIEST in
> > order to work out whether Peter's suckage is our robustness. ;-)
> >
> >> Based on my understanding so far, you have nothing to worry about.
> >>
> >> With NONE + VOLUNTARY removed in favor of LAZY or without the removal
> >> (yet) you ask yourself what happens to those using NONE, go to LAZY and
> >> want to stay with !PREEMPT_RCU, right?
> >> With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
> >> And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
> >> VOLUNTARY users and want !PREEMPT_RCU.
> >
> > Yes.
> >
> >> This could be true but it could also attract people from PREEMPT which
> >> expect additional performance gain due to LAZY and the same "preemption"
> >> level. Additionally if PREEMPT gets removed because LAZY turns out to be
> >> superior then PREEMPT_DYNAMIC makes probably no sense since there is
> >> nothing to switch from/ to.
> >
> > We definitely have users that would migrate from NONE to LAZY. Shouldn't
>
> Indeed. This was the original intent behind Thomas's proposal of preempt
> lazy.
Whew!!! That is my recollection as well.
> > their needs outweigh the possible future users that might (or might not)
> > find that (1) LAZY might be preferable to PREEMPT for some users and
> > (2) That those users would prefer that RCU be preemptible?
>
> Users who care about low latency already have perfectly good options:
> PREEMPT, PREEMPT_DYNAMIC and now PREEMPT_RT.
>
> I don't see the point of elevating low latency needs in all preemption
> models -- even those which are meant to be througput oriented.
Agreed!
> >> Therefore I would suggest to make PREEMPT_RCU
> >> - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> >> - selected for LAZY && PREEMPT_DYNAMIC
> >> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> >> !PREEMPT_DYNAMIC)
> >>
> >> Does this make sense to you?
> >
> > Not really. At the very least, default no.
> >
> > Unless LAZIEST makes the most sense for us (which will take time to
> > figure out), in which case make PREMPT_RCU:
>
> I don't think laziest was ever meant to be a serious option.
>
> Peter mentioned that he'll be dropping it:
> https://lore.kernel.org/lkml/20241008144049.GF14587@noisy.programming.kicks-ass.net/
Well, if Peter is going to drop it, I won't be evaluating it! ;-)
Thanx, Paul
> Ankur
>
> > - hard-defined =n for LAZIEST.
> > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > - selected for LAZY && PREEMPT_DYNAMIC
> > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > !PREEMPT_DYNAMIC)
> >
> > Or am I still missing some aspect of this series?
> >
> > Thanx, Paul
>
>
> --
> ankur
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-22 14:09 ` Sebastian Andrzej Siewior
@ 2024-10-22 23:54 ` Paul E. McKenney
2024-10-23 6:58 ` Sebastian Andrzej Siewior
0 siblings, 1 reply; 75+ messages in thread
From: Paul E. McKenney @ 2024-10-22 23:54 UTC (permalink / raw)
To: Sebastian Andrzej Siewior
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On Tue, Oct 22, 2024 at 04:09:33PM +0200, Sebastian Andrzej Siewior wrote:
> On 2024-10-21 09:48:03 [-0700], Paul E. McKenney wrote:
> > > We have now NONE, VOLUNTARY, PREEMPT, PREEMPT_RT (as in choose one).
> > >
> > > This series changes it to NONE, VOLUNTARY, PREEMPT, LAZY, LAZIEST.
> > > Ignore LAZIEST. PREEMPT_RT is a on/ off bool.
> >
> > In terms of preemptibility, isn't the order from least to most preemptible
> > NONE, VOLUNTARY, LAZIEST, LAZY, PREEMPT, and PREEMPT_RT? After all,
> > PREEMPT will preempt more aggressively than will LAZY which in turn
> > preempts more aggressively than LAZIEST.
> >
> > And I finally do see the later patch in Peter's series that removes
> > PREEMPT_RT from the choice. Plus I need to look more at LAZIEST in
> > order to work out whether Peter's suckage is our robustness. ;-)
>
> For LAZIEST PeterZ added "do we want this?". I haven't tested this but
> since there is no forced preemption at all, it should be what is NONE
> without cond_resched() & friends. So I don't know if it stays, I don't
> think so.
I don't know of a compelling reason for it to.
> > > Based on my understanding so far, you have nothing to worry about.
> > >
> > > With NONE + VOLUNTARY removed in favor of LAZY or without the removal
> > > (yet) you ask yourself what happens to those using NONE, go to LAZY and
> > > want to stay with !PREEMPT_RCU, right?
> > > With LAZY and !PREEMPT_DYNAMIC there is still PREEMPT_RCU as of now.
> > > And you say people using !PREEMPT_DYNAMIC + LAZY are the old NONE/
> > > VOLUNTARY users and want !PREEMPT_RCU.
> >
> > Yes.
> >
> > > This could be true but it could also attract people from PREEMPT which
> > > expect additional performance gain due to LAZY and the same "preemption"
> > > level. Additionally if PREEMPT gets removed because LAZY turns out to be
> > > superior then PREEMPT_DYNAMIC makes probably no sense since there is
> > > nothing to switch from/ to.
> >
> > We definitely have users that would migrate from NONE to LAZY. Shouldn't
> > their needs outweigh the possible future users that might (or might not)
> > find that (1) LAZY might be preferable to PREEMPT for some users and
> > (2) That those users would prefer that RCU be preemptible?
>
> Yes. I have no idea which of those two (PREEMPT_RCU vs !PREEMPT_RCU) is
> to be preferred. Therefore I'm suggesting to make configurable here.
As Ankur noted, the original intent was to move people from both
PREEMPT_NONE and PREEMPT_VOLUNTARY to lazy preemption. This strongly
suggests setting the value of PREEMPT_RCU to n. Not just the default,
but the value. We need to have a definite non-speculative case for
forcing people to once again worry about RCU preemptibility, and I know
of no such case.
> If you have a benchmark for memory consumption or anything else of
> interest, I could throw it a box or two to get some numbers. I've been
> looking at free memory at boot and this was almost the same (+- noise).
Unfortunately, the benchmark is the fleet and all of the various
non-public applications that run on it. :-(
> > > Therefore I would suggest to make PREEMPT_RCU
> > > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > > - selected for LAZY && PREEMPT_DYNAMIC
> > > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > > !PREEMPT_DYNAMIC)
> > >
> > > Does this make sense to you?
> >
> > Not really. At the very least, default no.
> >
> > Unless LAZIEST makes the most sense for us (which will take time to
> > figure out), in which case make PREMPT_RCU:
> >
> > - hard-defined =n for LAZIEST.
> > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > - selected for LAZY && PREEMPT_DYNAMIC
> > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > !PREEMPT_DYNAMIC)
> >
> > Or am I still missing some aspect of this series?
>
> no, that is perfect.
And assuming LAZIEST is not to be with us much longer, this becomes:
- hard-defined to "no" for LAZY && !PREEMPT_DYNAMIC, just like
NONE or VOLUNTARY with !PREEMPT_DYNAMIC.
- selected for LAZY && PREEMPT_DYNAMIC
- the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
!PREEMPT_DYNAMIC)
Fair enough?
Thanx, Paul
^ permalink raw reply [flat|nested] 75+ messages in thread
* Re: [PATCH 2/7] rcu: limit PREEMPT_RCU configurations
2024-10-22 23:54 ` Paul E. McKenney
@ 2024-10-23 6:58 ` Sebastian Andrzej Siewior
0 siblings, 0 replies; 75+ messages in thread
From: Sebastian Andrzej Siewior @ 2024-10-23 6:58 UTC (permalink / raw)
To: Paul E. McKenney
Cc: Peter Zijlstra, Ankur Arora, linux-kernel, tglx, mingo,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault
On 2024-10-22 16:54:11 [-0700], Paul E. McKenney wrote:
> > Yes. I have no idea which of those two (PREEMPT_RCU vs !PREEMPT_RCU) is
> > to be preferred. Therefore I'm suggesting to make configurable here.
>
> As Ankur noted, the original intent was to move people from both
> PREEMPT_NONE and PREEMPT_VOLUNTARY to lazy preemption. This strongly
> suggests setting the value of PREEMPT_RCU to n. Not just the default,
> but the value. We need to have a definite non-speculative case for
> forcing people to once again worry about RCU preemptibility, and I know
> of no such case.
okay.
> > If you have a benchmark for memory consumption or anything else of
> > interest, I could throw it a box or two to get some numbers. I've been
> > looking at free memory at boot and this was almost the same (+- noise).
>
> Unfortunately, the benchmark is the fleet and all of the various
> non-public applications that run on it. :-(
I see.
> > > > Therefore I would suggest to make PREEMPT_RCU
> > > > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > > > - selected for LAZY && PREEMPT_DYNAMIC
> > > > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > > > !PREEMPT_DYNAMIC)
> > > >
> > > > Does this make sense to you?
> > >
> > > Not really. At the very least, default no.
> > >
> > > Unless LAZIEST makes the most sense for us (which will take time to
> > > figure out), in which case make PREMPT_RCU:
> > >
> > > - hard-defined =n for LAZIEST.
> > > - selectable for LAZY && !PREEMPT_DYNAMIC, default yes
> > > - selected for LAZY && PREEMPT_DYNAMIC
> > > - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> > > !PREEMPT_DYNAMIC)
> > >
> > > Or am I still missing some aspect of this series?
> >
> > no, that is perfect.
>
> And assuming LAZIEST is not to be with us much longer, this becomes:
>
> - hard-defined to "no" for LAZY && !PREEMPT_DYNAMIC, just like
> NONE or VOLUNTARY with !PREEMPT_DYNAMIC.
> - selected for LAZY && PREEMPT_DYNAMIC
> - the current unchanged state for NONE, VOLUNTARY, PREEMPT (with
> !PREEMPT_DYNAMIC)
>
> Fair enough?
Guess this hard no and worry later makes sense.
> Thanx, Paul
Sebastian
^ permalink raw reply [flat|nested] 75+ messages in thread
end of thread, other threads:[~2024-10-23 6:58 UTC | newest]
Thread overview: 75+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-10-09 16:54 [PATCH 0/7] Lazy preemption bits Ankur Arora
2024-10-09 16:54 ` [PATCH 1/7] sched: warn for high latency with TIF_NEED_RESCHED_LAZY Ankur Arora
2024-10-10 6:37 ` Sebastian Andrzej Siewior
2024-10-10 18:19 ` Ankur Arora
2024-10-13 9:44 ` kernel test robot
2024-10-13 9:54 ` kernel test robot
2024-10-16 9:36 ` Shrikanth Hegde
2024-10-21 19:21 ` Ankur Arora
2024-10-22 5:41 ` Shrikanth Hegde
2024-10-09 16:54 ` [PATCH 2/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
2024-10-09 18:01 ` Peter Zijlstra
2024-10-09 18:24 ` Paul E. McKenney
2024-10-09 20:52 ` Peter Zijlstra
2024-10-09 21:16 ` Paul E. McKenney
2024-10-10 7:58 ` Peter Zijlstra
2024-10-10 14:19 ` Paul E. McKenney
2024-10-10 6:32 ` Sebastian Andrzej Siewior
2024-10-10 8:10 ` Peter Zijlstra
2024-10-10 9:13 ` Sebastian Andrzej Siewior
2024-10-10 10:03 ` Peter Zijlstra
2024-10-10 10:26 ` Sebastian Andrzej Siewior
2024-10-10 10:44 ` Peter Zijlstra
2024-10-10 14:29 ` Paul E. McKenney
2024-10-11 8:18 ` Sebastian Andrzej Siewior
2024-10-11 13:59 ` Paul E. McKenney
2024-10-11 14:43 ` Sebastian Andrzej Siewior
2024-10-11 15:59 ` Paul E. McKenney
2024-10-15 11:22 ` Sebastian Andrzej Siewior
2024-10-15 22:13 ` Ankur Arora
2024-10-17 8:04 ` Sebastian Andrzej Siewior
2024-10-17 22:50 ` Ankur Arora
2024-10-18 17:43 ` Paul E. McKenney
2024-10-18 19:18 ` Ankur Arora
2024-10-18 23:24 ` Paul E. McKenney
2024-10-19 1:07 ` Ankur Arora
2024-10-19 4:30 ` Paul E. McKenney
2024-10-15 23:11 ` Paul E. McKenney
2024-10-17 7:07 ` Sebastian Andrzej Siewior
2024-10-18 17:38 ` Paul E. McKenney
2024-10-21 11:27 ` Sebastian Andrzej Siewior
2024-10-21 16:48 ` Paul E. McKenney
2024-10-21 19:20 ` Ankur Arora
2024-10-22 23:49 ` Paul E. McKenney
2024-10-22 14:09 ` Sebastian Andrzej Siewior
2024-10-22 23:54 ` Paul E. McKenney
2024-10-23 6:58 ` Sebastian Andrzej Siewior
2024-10-10 17:35 ` Ankur Arora
2024-10-11 7:58 ` Sebastian Andrzej Siewior
2024-10-15 23:01 ` Ankur Arora
2024-10-10 17:42 ` Ankur Arora
2024-10-09 16:54 ` [PATCH 3/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
2024-10-10 6:41 ` Sebastian Andrzej Siewior
2024-10-10 8:11 ` Peter Zijlstra
2024-10-10 14:29 ` Paul E. McKenney
2024-10-09 16:54 ` [PATCH 4/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
2024-10-09 19:05 ` Ankur Arora
2024-10-10 14:37 ` Paul E. McKenney
2024-10-10 17:59 ` Ankur Arora
2024-10-10 6:50 ` Sebastian Andrzej Siewior
2024-10-10 17:56 ` Ankur Arora
2024-10-11 7:52 ` Sebastian Andrzej Siewior
2024-10-09 16:54 ` [PATCH 5/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
2024-10-09 18:02 ` Peter Zijlstra
2024-10-09 18:52 ` Ankur Arora
2024-10-09 16:54 ` [PATCH 6/7] osnoise: handle quiescent states for PREEMPT_RCU=n, PREEMPTION=y Ankur Arora
2024-10-10 6:53 ` Sebastian Andrzej Siewior
2024-10-10 14:39 ` Paul E. McKenney
2024-10-10 17:50 ` Ankur Arora
2024-10-11 7:36 ` Sebastian Andrzej Siewior
2024-10-14 20:14 ` Ankur Arora
2024-10-09 16:54 ` [PATCH 7/7] powerpc: add support for PREEMPT_LAZY Ankur Arora
2024-10-10 7:22 ` Sebastian Andrzej Siewior
2024-10-10 18:10 ` Ankur Arora
2024-10-11 18:35 ` Shrikanth Hegde
2024-10-12 22:42 ` Michael Ellerman
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox