* [PATCH v3 0/7] RCU changes for PREEMPT_LAZY
@ 2024-12-13 4:06 Ankur Arora
2024-12-13 4:06 ` [PATCH v3 1/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
` (7 more replies)
0 siblings, 8 replies; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
This series adds RCU bits for lazy preemption.
The problem addressed is that pre-PREEMPT_LAZY, PREEMPTION=y implied
PREEMPT_RCU=y. With PREEMPT_LAZY, that's no longer true.
That's because PREEMPT_RCU makes some trade-offs to optimize for
latency as opposed to throughput, and configurations with limited
preemption might prefer the stronger forward-progress guarantees of
PREEMPT_RCU=n.
Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
Patches 1-3 are cleanup patches:
"rcu: fix header guard for rcu_all_qs()"
"rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
"sched: update __cond_resched comment about RCU quiescent states"
Patch 4,
"rcu: handle unstable rdp in rcu_read_unlock_strict()"
handles a latent RCU bug: rcu_report_qs_rdp() could be called with
an unstable rdp.
Patches 5 and 6,
"rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
"osnoise: provide quiescent states"
handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
configuration.
And, finally patch 7, "rcu: limit PREEMPT_RCU configurations",
explicitly limits PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency
oriented models.
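As a rough illustration of the end state described above, the following userspace sketch models which configurations end up with PREEMPT_RCU=y once patch 7 is applied. The struct and function names are hypothetical and exist only for this example; the only thing taken from the series is the "PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC" rule.

```c
#include <stdbool.h>

/* Hypothetical model of the compile-time preemption options (not a kernel API). */
struct preempt_config {
	bool preempt_none;
	bool preempt_voluntary;
	bool preempt_full;	/* stands in for CONFIG_PREEMPT */
	bool preempt_lazy;
	bool preempt_rt;
	bool preempt_dynamic;
};

/* Mirrors "default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)" from patch 7. */
bool preempt_rcu_default(const struct preempt_config *c)
{
	return c->preempt_full || c->preempt_rt || c->preempt_dynamic;
}
```

With this model, a standalone PREEMPT_LAZY configuration evaluates to PREEMPT_RCU=n, while PREEMPT_LAZY combined with PREEMPT_DYNAMIC evaluates to PREEMPT_RCU=y.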
Changelog:
v3:
- moved patch-3 to be the last one in the series (suggested by Sebastian)
- added "rcu: handle unstable rdp in rcu_read_unlock_strict()"
(suggested by Frederic Weisbecker).
- switched to a more robust check in rcu_flavor_sched_clock_irq()
(suggested by Frederic Weisbecker).
- simplified check in osnoise (suggested by Frederic Weisbecker).
- dropped an unrelated scheduler patch.
v2:
- fixup incorrect usage of tif_need_resched_lazy() (comment from
Sebastian Andrzej Siewior)
- massaged the commit messages a bit
- drops the powerpc support for PREEMPT_LAZY as that was orthogonal
to this series (Shrikanth will send that out separately.)
Please review.
Ankur Arora (7):
rcu: fix header guard for rcu_all_qs()
rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
sched: update __cond_resched comment about RCU quiescent states
rcu: handle unstable rdp in rcu_read_unlock_strict()
rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
osnoise: provide quiescent states
rcu: limit PREEMPT_RCU configurations
include/linux/rcupdate.h | 2 +-
include/linux/rcutree.h | 2 +-
include/linux/srcutiny.h | 2 +-
kernel/rcu/Kconfig | 4 ++--
kernel/rcu/srcutiny.c | 14 +++++++-------
kernel/rcu/tree_plugin.h | 22 +++++++++++++++++-----
kernel/sched/core.c | 4 +++-
kernel/trace/trace_osnoise.c | 32 +++++++++++++++-----------------
8 files changed, 47 insertions(+), 35 deletions(-)
--
2.43.5
* [PATCH v3 1/7] rcu: fix header guard for rcu_all_qs()
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 4:06 ` [PATCH v3 2/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
` (6 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
rcu_all_qs() is defined for !CONFIG_PREEMPT_RCU but the declaration
is conditioned on CONFIG_PREEMPTION.
With CONFIG_PREEMPT_LAZY, CONFIG_PREEMPTION=y does not imply
CONFIG_PREEMPT_RCU=y.
Decouple the two.
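A minimal userspace sketch of the mismatch, assuming a build that is preemptible but uses non-preemptible RCU. The macro setup, the counter and cond_resched_model() are stand-ins made up for this example; only the guard itself comes from the patch.

```c
/* Hypothetical stand-in for a PREEMPT_LAZY build: preemptible kernel... */
#define CONFIG_PREEMPTION 1
/* ...with CONFIG_PREEMPT_RCU deliberately left undefined. */

/*
 * Declaration guard as it looks after this patch. With the old guard
 * (#ifndef CONFIG_PREEMPTION) the prototype would be hidden in this
 * configuration even though the definition below is still built.
 */
#ifndef CONFIG_PREEMPT_RCU
void rcu_all_qs(void);
#endif

int qs_reported;	/* counts calls, for illustration only */

#ifndef CONFIG_PREEMPT_RCU
void rcu_all_qs(void)
{
	qs_reported++;
}
#endif

/* Stand-in for a caller that is itself conditioned on !CONFIG_PREEMPT_RCU. */
int cond_resched_model(void)
{
#ifndef CONFIG_PREEMPT_RCU
	rcu_all_qs();
#endif
	return 0;
}
```

Because declaration, definition and caller are all keyed on the same symbol, the sketch compiles whether or not CONFIG_PREEMPTION is defined.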
Cc: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/linux/rcutree.h | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/include/linux/rcutree.h b/include/linux/rcutree.h
index 27d86d912781..aad586f15ed0 100644
--- a/include/linux/rcutree.h
+++ b/include/linux/rcutree.h
@@ -103,7 +103,7 @@ extern int rcu_scheduler_active;
void rcu_end_inkernel_boot(void);
bool rcu_inkernel_boot_has_ended(void);
bool rcu_is_watching(void);
-#ifndef CONFIG_PREEMPTION
+#ifndef CONFIG_PREEMPT_RCU
void rcu_all_qs(void);
#endif
--
2.43.5
* [PATCH v3 2/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
2024-12-13 4:06 ` [PATCH v3 1/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 4:06 ` [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states Ankur Arora
` (5 subsequent siblings)
7 siblings, 0 replies; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
Replace mentions of PREEMPT_AUTO with PREEMPT_LAZY.
Also, since PREEMPT_LAZY implies PREEMPTION, we can reduce the
TASKS_RCU selection criteria from:
NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
to:
NEED_TASKS_RCU && PREEMPTION
Cc: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/linux/srcutiny.h | 2 +-
kernel/rcu/Kconfig | 2 +-
kernel/rcu/srcutiny.c | 14 +++++++-------
3 files changed, 9 insertions(+), 9 deletions(-)
diff --git a/include/linux/srcutiny.h b/include/linux/srcutiny.h
index 1321da803274..31b59b4be2a7 100644
--- a/include/linux/srcutiny.h
+++ b/include/linux/srcutiny.h
@@ -64,7 +64,7 @@ static inline int __srcu_read_lock(struct srcu_struct *ssp)
{
int idx;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
idx = ((READ_ONCE(ssp->srcu_idx) + 1) & 0x2) >> 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], READ_ONCE(ssp->srcu_lock_nesting[idx]) + 1);
preempt_enable();
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index b9b6bc55185d..e2206f3a070c 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -91,7 +91,7 @@ config NEED_TASKS_RCU
config TASKS_RCU
bool
- default NEED_TASKS_RCU && (PREEMPTION || PREEMPT_AUTO)
+ default NEED_TASKS_RCU && PREEMPTION
select IRQ_WORK
config FORCE_TASKS_RUDE_RCU
diff --git a/kernel/rcu/srcutiny.c b/kernel/rcu/srcutiny.c
index 4dcbf8aa80ff..f688bdad293e 100644
--- a/kernel/rcu/srcutiny.c
+++ b/kernel/rcu/srcutiny.c
@@ -98,7 +98,7 @@ void __srcu_read_unlock(struct srcu_struct *ssp, int idx)
{
int newval;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
newval = READ_ONCE(ssp->srcu_lock_nesting[idx]) - 1;
WRITE_ONCE(ssp->srcu_lock_nesting[idx], newval);
preempt_enable();
@@ -120,7 +120,7 @@ void srcu_drive_gp(struct work_struct *wp)
struct srcu_struct *ssp;
ssp = container_of(wp, struct srcu_struct, srcu_work);
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
if (ssp->srcu_gp_running || ULONG_CMP_GE(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max))) {
preempt_enable();
return; /* Already running or nothing to do. */
@@ -138,7 +138,7 @@ void srcu_drive_gp(struct work_struct *wp)
WRITE_ONCE(ssp->srcu_gp_waiting, true); /* srcu_read_unlock() wakes! */
preempt_enable();
swait_event_exclusive(ssp->srcu_wq, !READ_ONCE(ssp->srcu_lock_nesting[idx]));
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
WRITE_ONCE(ssp->srcu_gp_waiting, false); /* srcu_read_unlock() cheap. */
WRITE_ONCE(ssp->srcu_idx, ssp->srcu_idx + 1);
preempt_enable();
@@ -159,7 +159,7 @@ void srcu_drive_gp(struct work_struct *wp)
* at interrupt level, but the ->srcu_gp_running checks will
* straighten that out.
*/
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
WRITE_ONCE(ssp->srcu_gp_running, false);
idx = ULONG_CMP_LT(ssp->srcu_idx, READ_ONCE(ssp->srcu_idx_max));
preempt_enable();
@@ -172,7 +172,7 @@ static void srcu_gp_start_if_needed(struct srcu_struct *ssp)
{
unsigned long cookie;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
cookie = get_state_synchronize_srcu(ssp);
if (ULONG_CMP_GE(READ_ONCE(ssp->srcu_idx_max), cookie)) {
preempt_enable();
@@ -199,7 +199,7 @@ void call_srcu(struct srcu_struct *ssp, struct rcu_head *rhp,
rhp->func = func;
rhp->next = NULL;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
local_irq_save(flags);
*ssp->srcu_cb_tail = rhp;
ssp->srcu_cb_tail = &rhp->next;
@@ -261,7 +261,7 @@ unsigned long start_poll_synchronize_srcu(struct srcu_struct *ssp)
{
unsigned long ret;
- preempt_disable(); // Needed for PREEMPT_AUTO
+ preempt_disable(); // Needed for PREEMPT_LAZY
ret = get_state_synchronize_srcu(ssp);
srcu_gp_start_if_needed(ssp);
preempt_enable();
--
2.43.5
* [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
2024-12-13 4:06 ` [PATCH v3 1/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
2024-12-13 4:06 ` [PATCH v3 2/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 13:21 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict() Ankur Arora
` (4 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
Update comment in __cond_resched() clarifying how urgently needed
quiescent states are provided.
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/sched/core.c | 4 +++-
1 file changed, 3 insertions(+), 1 deletion(-)
diff --git a/kernel/sched/core.c b/kernel/sched/core.c
index c6d8232ad9ee..4be3e4f2e54d 100644
--- a/kernel/sched/core.c
+++ b/kernel/sched/core.c
@@ -7281,7 +7281,7 @@ int __sched __cond_resched(void)
return 1;
}
/*
- * In preemptible kernels, ->rcu_read_lock_nesting tells the tick
+ * In PREEMPT_RCU kernels, ->rcu_read_lock_nesting tells the tick
* whether the current CPU is in an RCU read-side critical section,
* so the tick can report quiescent states even for CPUs looping
* in kernel context. In contrast, in non-preemptible kernels,
@@ -7290,6 +7290,8 @@ int __sched __cond_resched(void)
* RCU quiescent state. Therefore, the following code causes
* cond_resched() to report a quiescent state, but only when RCU
* is in urgent need of one.
+ * A third case, preemptible, but non-PREEMPT_RCU provides for
+ * urgently needed quiescent states via rcu_flavor_sched_clock_irq().
*/
#ifndef CONFIG_PREEMPT_RCU
rcu_all_qs();
--
2.43.5
* [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict()
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
` (2 preceding siblings ...)
2024-12-13 4:06 ` [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 13:38 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
` (3 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
rcu_read_unlock_strict() can be called with preemption enabled
which can make for an unstable rdp and a racy norm value.
Fix this by dropping the preempt-count in __rcu_read_unlock()
after the call to rcu_read_unlock_strict(), adjusting the
preempt-count check appropriately.
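The reordering can be pictured with a small userspace model, sketched here under the assumption that PREEMPT_COUNT=y and that in_atomic_preempt_off() compares the count against a single preempt_disable() level. All *_model names and the counters are hypothetical; the real function also checks irqs_disabled() and rcu_state.gp_kthread, which the model leaves out.

```c
#include <assert.h>
#include <stdbool.h>

#define PREEMPT_DISABLE_OFFSET 1	/* one preempt_disable() level (assumed) */

int preempt_count_model;	/* stand-in for preempt_count() */
int qs_reports;			/* how often the model "reported" a quiescent state */

void preempt_disable_model(void)
{
	preempt_count_model++;
}

void preempt_enable_model(void)
{
	assert(preempt_count_model > 0);
	preempt_count_model--;
}

/* True unless exactly one preempt-disable level is currently held. */
bool in_atomic_preempt_off_model(void)
{
	return preempt_count_model != PREEMPT_DISABLE_OFFSET;
}

void rcu_read_unlock_strict_model(void)
{
	if (in_atomic_preempt_off_model())
		return;		/* nested reader or extra disable: do not report */
	qs_reports++;		/* still pinned to this CPU, so per-CPU data is stable */
}

void rcu_read_lock_model(void)
{
	preempt_disable_model();
}

void rcu_read_unlock_model(void)
{
	rcu_read_unlock_strict_model();	/* runs while the count is still held */
	preempt_enable_model();		/* dropped only afterwards */
}
```

In this model an outermost unlock reports a quiescent state while the count is still 1, whereas an inner unlock of a nested reader returns early and reports nothing.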
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
include/linux/rcupdate.h | 2 +-
kernel/rcu/tree_plugin.h | 11 ++++++++++-
2 files changed, 11 insertions(+), 2 deletions(-)
diff --git a/include/linux/rcupdate.h b/include/linux/rcupdate.h
index 48e5c03df1dd..257e9ae34414 100644
--- a/include/linux/rcupdate.h
+++ b/include/linux/rcupdate.h
@@ -95,9 +95,9 @@ static inline void __rcu_read_lock(void)
static inline void __rcu_read_unlock(void)
{
- preempt_enable();
if (IS_ENABLED(CONFIG_RCU_STRICT_GRACE_PERIOD))
rcu_read_unlock_strict();
+ preempt_enable();
}
static inline int rcu_preempt_depth(void)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 3927ea5f7955..95a7c6c71a91 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -832,8 +832,17 @@ void rcu_read_unlock_strict(void)
{
struct rcu_data *rdp;
- if (irqs_disabled() || preempt_count() || !rcu_state.gp_kthread)
+ if (irqs_disabled() || in_atomic_preempt_off() || !rcu_state.gp_kthread)
return;
+
+ /*
+ * rcu_report_qs_rdp() can only be invoked with a stable rdp and
+ * from the local CPU.
+ *
+ * The in_atomic_preempt_off() check ensures that we come here holding
+ * the last preempt_count (which will get dropped once we return to
+ * __rcu_read_unlock()).
+ */
rdp = this_cpu_ptr(&rcu_data);
rdp->cpu_no_qs.b.norm = false;
rcu_report_qs_rdp(rdp);
--
2.43.5
* [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
` (3 preceding siblings ...)
2024-12-13 4:06 ` [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict() Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 13:59 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 6/7] osnoise: provide quiescent states Ankur Arora
` (2 subsequent siblings)
7 siblings, 1 reply; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
states for read-side critical sections via rcu_all_qs().
One reason why this was needed: lacking preempt-count, the tick
handler has no way of knowing whether it is executing in a
read-side critical section or not.
With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), we get (PREEMPT_COUNT=y,
PREEMPT_RCU=n). In this configuration cond_resched() is a stub and
does not provide quiescent states via rcu_all_qs().
(PREEMPT_RCU=y provides this information via rcu_read_unlock() and
its nesting counter.)
So, use the availability of preempt_count() to report quiescent states
in rcu_flavor_sched_clock_irq().
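To make the new condition concrete, here is a userspace sketch of the check. The shift values are assumed to match the preempt_count() layout in include/linux/preempt.h, and the function name and its parameters are invented for this example. The idea is that with PREEMPT_RCU=n and PREEMPT_COUNT=y a reader holds a preempt count, so a tick that sees only its own hardirq level in the count cannot have interrupted a read-side critical section.

```c
#include <stdbool.h>

/* Assumed to mirror the preempt_count() layout in include/linux/preempt.h. */
#define PREEMPT_SHIFT	0
#define SOFTIRQ_SHIFT	8
#define HARDIRQ_SHIFT	16
#define PREEMPT_OFFSET	(1U << PREEMPT_SHIFT)
#define SOFTIRQ_OFFSET	(1U << SOFTIRQ_SHIFT)
#define HARDIRQ_OFFSET	(1U << HARDIRQ_SHIFT)

/*
 * Model of the condition in rcu_flavor_sched_clock_irq() after this patch.
 * All parameters are hypothetical stand-ins for the kernel's own state.
 */
bool tick_sees_quiescent_state(bool user, bool from_idle,
			       bool preempt_count_enabled,
			       unsigned int preempt_count)
{
	return user || from_idle ||
	       (preempt_count_enabled && preempt_count == HARDIRQ_OFFSET);
}
```

A count of exactly HARDIRQ_OFFSET qualifies; any additional preempt-disable level, softirq level or nested hardirq level does not, and neither does a kernel built without PREEMPT_COUNT.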
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/rcu/tree_plugin.h | 11 +++++++----
1 file changed, 7 insertions(+), 4 deletions(-)
diff --git a/kernel/rcu/tree_plugin.h b/kernel/rcu/tree_plugin.h
index 95a7c6c71a91..c7f7820b5e18 100644
--- a/kernel/rcu/tree_plugin.h
+++ b/kernel/rcu/tree_plugin.h
@@ -983,13 +983,16 @@ static void rcu_preempt_check_blocked_tasks(struct rcu_node *rnp)
*/
static void rcu_flavor_sched_clock_irq(int user)
{
- if (user || rcu_is_cpu_rrupt_from_idle()) {
+ if (user || rcu_is_cpu_rrupt_from_idle() ||
+ (IS_ENABLED(CONFIG_PREEMPT_COUNT) &&
+ (preempt_count() == HARDIRQ_OFFSET))) {
/*
* Get here if this CPU took its interrupt from user
- * mode or from the idle loop, and if this is not a
- * nested interrupt. In this case, the CPU is in
- * a quiescent state, so note it.
+ * mode, from the idle loop without this being a nested
+ * interrupt, or while not holding the task preempt count
+ * (with PREEMPT_COUNT=y). In this case, the CPU is in a
+ * quiescent state, so note it.
*
* No memory barrier is required here because rcu_qs()
* references only CPU-local variables that other CPUs
--
2.43.5
* [PATCH v3 6/7] osnoise: provide quiescent states
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
` (4 preceding siblings ...)
2024-12-13 4:06 ` [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2024-12-13 14:34 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 7/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
2025-01-08 1:14 ` [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Paul E. McKenney
7 siblings, 1 reply; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky, Daniel Bristot de Oliveira
To reduce RCU noise for nohz_full configurations, osnoise depends
on cond_resched() providing quiescent states for PREEMPT_RCU=n
configurations. For PREEMPT_RCU=y configurations -- where
cond_resched() is a stub -- we do this by directly calling
rcu_momentary_eqs().
With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), however, we have a
configuration with (PREEMPTION=y, PREEMPT_RCU=n) where neither
of the above can help.
Handle that by providing an explicit quiescent state here for all
configurations.
As mentioned above this is not needed for non-stubbed cond_resched(),
but, providing a quiescent state here just pulls in one that a future
cond_resched() would provide, so doesn't cause any extra work for
this configuration.
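The resulting control flow can be sketched in userspace as follows. Every name ending in _model is a stand-in invented for this example, and the model assumes that the momentary EQS is only valid with interrupts off, which is how the diff calls it. It also assumes that a true disable_irq means the caller has already disabled interrupts for the sampling loop.

```c
#include <stdbool.h>

bool irqs_off_model;	/* stand-in for the local IRQ state */
int eqs_count;		/* number of momentary EQSs "reported" */

void local_irq_disable_model(void)
{
	irqs_off_model = true;
}

void local_irq_enable_model(void)
{
	irqs_off_model = false;
}

/* Stand-in for rcu_momentary_eqs(); only counts calls made with IRQs off. */
void rcu_momentary_eqs_model(void)
{
	if (irqs_off_model)
		eqs_count++;
}

/* One sampling-loop step after this patch: no CONFIG_PREEMPT_RCU test. */
void osnoise_report_qs_model(bool disable_irq)
{
	if (!disable_irq)
		local_irq_disable_model();

	rcu_momentary_eqs_model();

	if (!disable_irq)
		local_irq_enable_model();
}
```

The step leaves the IRQ state as it found it in both cases, and reports a quiescent state regardless of which RCU flavor the kernel was built with.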
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
Cc: Steven Rostedt <rostedt@goodmis.org>
Suggested-by: Paul E. McKenney <paulmck@kernel.org>
Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/trace/trace_osnoise.c | 32 +++++++++++++++-----------------
1 file changed, 15 insertions(+), 17 deletions(-)
diff --git a/kernel/trace/trace_osnoise.c b/kernel/trace/trace_osnoise.c
index b9f96c77527d..2340ffcefb9d 100644
--- a/kernel/trace/trace_osnoise.c
+++ b/kernel/trace/trace_osnoise.c
@@ -1531,27 +1531,25 @@ static int run_osnoise(void)
/*
* In some cases, notably when running on a nohz_full CPU with
- * a stopped tick PREEMPT_RCU has no way to account for QSs.
- * This will eventually cause unwarranted noise as PREEMPT_RCU
- * will force preemption as the means of ending the current
- * grace period. We avoid this problem by calling
- * rcu_momentary_eqs(), which performs a zero duration
- * EQS allowing PREEMPT_RCU to end the current grace period.
- * This call shouldn't be wrapped inside an RCU critical
- * section.
+ * a stopped tick PREEMPT_RCU or PREEMPT_LAZY have no way to
+ * account for QSs. This will eventually cause unwarranted
+ * noise as RCU forces preemption as the means of ending the
+ * current grace period. We avoid this by calling
+ * rcu_momentary_eqs(), which performs a zero duration EQS
+ * allowing RCU to end the current grace period. This call
+ * shouldn't be wrapped inside an RCU critical section.
*
- * Note that in non PREEMPT_RCU kernels QSs are handled through
- * cond_resched()
+ * Normally QSs for other cases are handled through cond_resched().
+ * For simplicity, however, we call rcu_momentary_eqs() for all
+ * configurations here.
*/
- if (IS_ENABLED(CONFIG_PREEMPT_RCU)) {
- if (!disable_irq)
- local_irq_disable();
+ if (!disable_irq)
+ local_irq_disable();
- rcu_momentary_eqs();
+ rcu_momentary_eqs();
- if (!disable_irq)
- local_irq_enable();
- }
+ if (!disable_irq)
+ local_irq_enable();
/*
* For the non-preemptive kernel config: let threads runs, if
--
2.43.5
* [PATCH v3 7/7] rcu: limit PREEMPT_RCU configurations
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
` (5 preceding siblings ...)
2024-12-13 4:06 ` [PATCH v3 6/7] osnoise: provide quiescent states Ankur Arora
@ 2024-12-13 4:06 ` Ankur Arora
2025-01-08 1:14 ` [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Paul E. McKenney
7 siblings, 0 replies; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 4:06 UTC
To: linux-kernel
Cc: peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, ankur.a.arora, efault, sshegde,
boris.ostrovsky
PREEMPT_LAZY can be enabled stand-alone or alongside PREEMPT_DYNAMIC
which allows for dynamic switching of preemption models.
The choice of PREEMPT_RCU or not, however, is fixed at compile time.
Given that PREEMPT_RCU makes some trade-offs to optimize for latency
as opposed to throughput, configurations with limited preemption
might prefer the stronger forward-progress guarantees of PREEMPT_RCU=n.
Accordingly, explicitly limit PREEMPT_RCU=y to the latency oriented
preemption models: PREEMPT, PREEMPT_RT, and the runtime configurable
model PREEMPT_DYNAMIC.
This means the throughput oriented models, PREEMPT_NONE,
PREEMPT_VOLUNTARY, and PREEMPT_LAZY will run with PREEMPT_RCU=n.
Cc: Paul E. McKenney <paulmck@kernel.org>
Cc: Peter Zijlstra <peterz@infradead.org>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
---
kernel/rcu/Kconfig | 2 +-
1 file changed, 1 insertion(+), 1 deletion(-)
diff --git a/kernel/rcu/Kconfig b/kernel/rcu/Kconfig
index e2206f3a070c..dd6251678e99 100644
--- a/kernel/rcu/Kconfig
+++ b/kernel/rcu/Kconfig
@@ -18,7 +18,7 @@ config TREE_RCU
config PREEMPT_RCU
bool
- default y if PREEMPTION
+ default y if (PREEMPT || PREEMPT_RT || PREEMPT_DYNAMIC)
select TREE_RCU
help
This option selects the RCU implementation that is
--
2.43.5
* Re: [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states
2024-12-13 4:06 ` [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states Ankur Arora
@ 2024-12-13 13:21 ` Frederic Weisbecker
0 siblings, 0 replies; 15+ messages in thread
From: Frederic Weisbecker @ 2024-12-13 13:21 UTC
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, efault, sshegde, boris.ostrovsky
On Thu, Dec 12, 2024 at 08:06:54PM -0800, Ankur Arora wrote:
> Update comment in __cond_resched() clarifying how urgently needed
> quiescent state are provided.
>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict()
2024-12-13 4:06 ` [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict() Ankur Arora
@ 2024-12-13 13:38 ` Frederic Weisbecker
0 siblings, 0 replies; 15+ messages in thread
From: Frederic Weisbecker @ 2024-12-13 13:38 UTC
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, efault, sshegde, boris.ostrovsky
On Thu, Dec 12, 2024 at 08:06:55PM -0800, Ankur Arora wrote:
> rcu_read_unlock_strict() can be called with preemption enabled
> which can make for an unstable rdp and a racy norm value.
>
> Fix this by dropping the preempt-count in __rcu_read_unlock()
> after the call to rcu_read_unlock_strict(), adjusting the
> preempt-count check appropriately.
>
> Suggested-by: Frederic Weisbecker <frederic@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-12-13 4:06 ` [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
@ 2024-12-13 13:59 ` Frederic Weisbecker
2024-12-13 20:44 ` Ankur Arora
0 siblings, 1 reply; 15+ messages in thread
From: Frederic Weisbecker @ 2024-12-13 13:59 UTC
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, efault, sshegde, boris.ostrovsky
On Thu, Dec 12, 2024 at 08:06:56PM -0800, Ankur Arora wrote:
> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
> states for read-side critical sections via rcu_all_qs().
> One reason why this was needed: lacking preempt-count, the tick
> handler has no way of knowing whether it is executing in a
> read-side critical section or not.
>
> With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), we get (PREEMPT_COUNT=y,
> PREEMPT_RCU=n). In this configuration cond_resched() is a stub and
> does not provide quiescent states via rcu_all_qs().
> (PREEMPT_RCU=y provides this information via rcu_read_unlock() and
> its nesting counter.)
>
> So, use the availability of preempt_count() to report quiescent states
> in rcu_flavor_sched_clock_irq().
>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [PATCH v3 6/7] osnoise: provide quiescent states
2024-12-13 4:06 ` [PATCH v3 6/7] osnoise: provide quiescent states Ankur Arora
@ 2024-12-13 14:34 ` Frederic Weisbecker
0 siblings, 0 replies; 15+ messages in thread
From: Frederic Weisbecker @ 2024-12-13 14:34 UTC
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, paulmck, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, efault, sshegde, boris.ostrovsky,
Daniel Bristot de Oliveira
On Thu, Dec 12, 2024 at 08:06:57PM -0800, Ankur Arora wrote:
> To reduce RCU noise for nohz_full configurations, osnoise depends
> on cond_resched() providing quiescent states for PREEMPT_RCU=n
> configurations. For PREEMPT_RCU=y configurations -- where
> cond_resched() is a stub -- we do this by directly calling
> rcu_momentary_eqs().
>
> With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), however, we have a
> configuration with (PREEMPTION=y, PREEMPT_RCU=n) where neither
> of the above can help.
>
> Handle that by providing an explicit quiescent state here for all
> configurations.
>
> As mentioned above this is not needed for non-stubbed cond_resched(),
> but, providing a quiescent state here just pulls in one that a future
> cond_resched() would provide, so doesn't cause any extra work for
> this configuration.
>
> Cc: Paul E. McKenney <paulmck@kernel.org>
> Cc: Daniel Bristot de Oliveira <bristot@kernel.org>
> Cc: Steven Rostedt <rostedt@goodmis.org>
> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
> Acked-by: Daniel Bristot de Oliveira <bristot@kernel.org>
> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
* Re: [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
2024-12-13 13:59 ` Frederic Weisbecker
@ 2024-12-13 20:44 ` Ankur Arora
0 siblings, 0 replies; 15+ messages in thread
From: Ankur Arora @ 2024-12-13 20:44 UTC
To: Frederic Weisbecker
Cc: Ankur Arora, linux-kernel, peterz, tglx, paulmck, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, efault, sshegde, boris.ostrovsky
Frederic Weisbecker <frederic@kernel.org> writes:
> On Thu, Dec 12, 2024 at 08:06:56PM -0800, Ankur Arora wrote:
>> With PREEMPT_RCU=n, cond_resched() provides urgently needed quiescent
>> states for read-side critical sections via rcu_all_qs().
>> One reason why this was needed: lacking preempt-count, the tick
>> handler has no way of knowing whether it is executing in a
>> read-side critical section or not.
>>
>> With (PREEMPT_LAZY=y, PREEMPT_DYNAMIC=n), we get (PREEMPT_COUNT=y,
>> PREEMPT_RCU=n). In this configuration cond_resched() is a stub and
>> does not provide quiescent states via rcu_all_qs().
>> (PREEMPT_RCU=y provides this information via rcu_read_unlock() and
>> its nesting counter.)
>>
>> So, use the availability of preempt_count() to report quiescent states
>> in rcu_flavor_sched_clock_irq().
>>
>> Suggested-by: Paul E. McKenney <paulmck@kernel.org>
>> Reviewed-by: Sebastian Andrzej Siewior <bigeasy@linutronix.de>
>> Signed-off-by: Ankur Arora <ankur.a.arora@oracle.com>
>
> Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Thanks for all the reviews!
--
ankur
* Re: [PATCH v3 0/7] RCU changes for PREEMPT_LAZY
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
` (6 preceding siblings ...)
2024-12-13 4:06 ` [PATCH v3 7/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
@ 2025-01-08 1:14 ` Paul E. McKenney
2025-01-08 18:18 ` Ankur Arora
7 siblings, 1 reply; 15+ messages in thread
From: Paul E. McKenney @ 2025-01-08 1:14 UTC
To: Ankur Arora
Cc: linux-kernel, peterz, tglx, mingo, bigeasy, juri.lelli,
vincent.guittot, dietmar.eggemann, rostedt, bsegall, mgorman,
vschneid, frederic, efault, sshegde, boris.ostrovsky
On Thu, Dec 12, 2024 at 08:06:51PM -0800, Ankur Arora wrote:
> This series adds RCU bits for lazy preemption.
>
> The problem addressed is that pre-PREEMPT_LAZY, PREEMPTION=y implied
> PREEMPT_RCU=y. With PREEMPT_LAZY, that's no longer true.
>
> That's because PREEMPT_RCU makes some trade-offs to optimize for
> latency as opposed to throughput, and configurations with limited
> preemption might prefer the stronger forward-progress guarantees of
> PREEMPT_RCU=n.
>
> Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
> PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
> conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
>
> Patches 1-3 are cleanup patches:
> "rcu: fix header guard for rcu_all_qs()"
> "rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
> "sched: update __cond_resched comment about RCU quiescent states"
>
> Patch 4,
> "rcu: handle unstable rdp in rcu_read_unlock_strict()"
>
> handles a latent RCU bug rcu_report_qs_rdp() could be called with
> an unstable rdp.
>
> Patches 5 and 6,
> "rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
> "osnoise: provide quiescent states"
>
> handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
> configuration.
>
> And, finally patch 7, "rcu: limit PREEMPT_RCU configurations",
> explicitly limits PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency
> oriented models.
Pulled into my -rcu tree for further review and testing, with initial
tests passing. Apologies for the delay!
Thanx, Paul
> Changelog:
>
> v3:
> - moved patch-3 to be the last one in the series (suggested by Sebastian)
> - added "rcu: handle unstable rdp in rcu_read_unlock_strict()"
> (suggested by Frederic Weisbecker).
> - switched to a more robust check in rcu_flavor_sched_clock_irq()
> (suggested by Frederic Weisbecker).
> - simplified check in osnoise (suggested by Frederic Weisbecker).
> - dropped an unrelated scheduler patch.
>
> v2:
> - fixup incorrect usage of tif_need_resched_lazy() (comment from
> Sebastian Andrzej Siewior)
> - massaged the commit messages a bit
> - drops the powerpc support for PREEMPT_LAZY as that was orthogonal
> to this series (Shrikanth will send that out separately.)
>
> Please review.
>
> Ankur Arora (7):
> rcu: fix header guard for rcu_all_qs()
> rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
> sched: update __cond_resched comment about RCU quiescent states
> rcu: handle unstable rdp in rcu_read_unlock_strict()
> rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
> osnoise: provide quiescent states
> rcu: limit PREEMPT_RCU configurations
>
> include/linux/rcupdate.h | 2 +-
> include/linux/rcutree.h | 2 +-
> include/linux/srcutiny.h | 2 +-
> kernel/rcu/Kconfig | 4 ++--
> kernel/rcu/srcutiny.c | 14 +++++++-------
> kernel/rcu/tree_plugin.h | 22 +++++++++++++++++-----
> kernel/sched/core.c | 4 +++-
> kernel/trace/trace_osnoise.c | 32 +++++++++++++++-----------------
> 8 files changed, 47 insertions(+), 35 deletions(-)
>
> --
> 2.43.5
>
* Re: [PATCH v3 0/7] RCU changes for PREEMPT_LAZY
2025-01-08 1:14 ` [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Paul E. McKenney
@ 2025-01-08 18:18 ` Ankur Arora
0 siblings, 0 replies; 15+ messages in thread
From: Ankur Arora @ 2025-01-08 18:18 UTC (permalink / raw)
To: paulmck
Cc: Ankur Arora, linux-kernel, peterz, tglx, mingo, bigeasy,
juri.lelli, vincent.guittot, dietmar.eggemann, rostedt, bsegall,
mgorman, vschneid, frederic, efault, sshegde, boris.ostrovsky
Paul E. McKenney <paulmck@kernel.org> writes:
> On Thu, Dec 12, 2024 at 08:06:51PM -0800, Ankur Arora wrote:
>> This series adds RCU bits for lazy preemption.
>>
>> The problem addressed is that pre-PREEMPT_LAZY, PREEMPTION=y implied
>> PREEMPT_RCU=y. With PREEMPT_LAZY, that's no longer true.
>>
>> That's because PREEMPT_RCU makes some trade-offs to optimize for
>> latency as opposed to throughput, and configurations with limited
>> preemption might prefer the stronger forward-progress guarantees of
>> PREEMPT_RCU=n.
>>
>> Accordingly, with standalone PREEMPT_LAZY (much like PREEMPT_NONE,
>> PREEMPT_VOLUNTARY) we want to use PREEMPT_RCU=n. And, when used in
>> conjunction with PREEMPT_DYNAMIC, we continue to use PREEMPT_RCU=y.
>>
>> Patches 1-3 are cleanup patches:
>> "rcu: fix header guard for rcu_all_qs()"
>> "rcu: rename PREEMPT_AUTO to PREEMPT_LAZY"
>> "sched: update __cond_resched comment about RCU quiescent states"
>>
>> Patch 4,
>> "rcu: handle unstable rdp in rcu_read_unlock_strict()"
>>
>> handles a latent RCU bug where rcu_report_qs_rdp() could be called with
>> an unstable rdp.
>>
>> Patches 5 and 6,
>> "rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y"
>> "osnoise: provide quiescent states"
>>
>> handle quiescent states for the (PREEMPT_LAZY=y, PREEMPT_RCU=n)
>> configuration.
>>
>> And, finally patch 7, "rcu: limit PREEMPT_RCU configurations",
>> explicitly limits PREEMPT_RCU=y to the PREEMPT_DYNAMIC or the latency
>> oriented models.
>
> Pulled into my -rcu tree for further review and testing, with initial
> tests passing. Apologies for the delay!
Great. Thanks Paul!
Ankur
>> Changelog:
>>
>> v3:
>> - moved patch-3 to be the last one in the series (suggested by Sebastian)
>> - added "rcu: handle unstable rdp in rcu_read_unlock_strict()"
>> (suggested by Frederic Weisbecker).
>> - switched to a more robust check in rcu_flavor_sched_clock_irq()
>> (suggested by Frederic Weisbecker).
>> - simplified check in osnoise (suggested by Frederic Weisbecker).
>> - dropped an unrelated scheduler patch.
>>
>> v2:
>> - fixup incorrect usage of tif_need_resched_lazy() (comment from
>> Sebastian Andrzej Siewior)
>> - massaged the commit messages a bit
>> - drops the powerpc support for PREEMPT_LAZY as that was orthogonal
>> to this series (Shrikanth will send that out separately.)
>>
>> Please review.
>>
>> Ankur Arora (7):
>> rcu: fix header guard for rcu_all_qs()
>> rcu: rename PREEMPT_AUTO to PREEMPT_LAZY
>> sched: update __cond_resched comment about RCU quiescent states
>> rcu: handle unstable rdp in rcu_read_unlock_strict()
>> rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y
>> osnoise: provide quiescent states
>> rcu: limit PREEMPT_RCU configurations
>>
>> include/linux/rcupdate.h | 2 +-
>> include/linux/rcutree.h | 2 +-
>> include/linux/srcutiny.h | 2 +-
>> kernel/rcu/Kconfig | 4 ++--
>> kernel/rcu/srcutiny.c | 14 +++++++-------
>> kernel/rcu/tree_plugin.h | 22 +++++++++++++++++-----
>> kernel/sched/core.c | 4 +++-
>> kernel/trace/trace_osnoise.c | 32 +++++++++++++++-----------------
>> 8 files changed, 47 insertions(+), 35 deletions(-)
>>
>> --
>> 2.43.5
>>
--
ankur
end of thread, other threads:[~2025-01-08 18:18 UTC | newest]
Thread overview: 15+ messages
2024-12-13 4:06 [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Ankur Arora
2024-12-13 4:06 ` [PATCH v3 1/7] rcu: fix header guard for rcu_all_qs() Ankur Arora
2024-12-13 4:06 ` [PATCH v3 2/7] rcu: rename PREEMPT_AUTO to PREEMPT_LAZY Ankur Arora
2024-12-13 4:06 ` [PATCH v3 3/7] sched: update __cond_resched comment about RCU quiescent states Ankur Arora
2024-12-13 13:21 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 4/7] rcu: handle unstable rdp in rcu_read_unlock_strict() Ankur Arora
2024-12-13 13:38 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 5/7] rcu: handle quiescent states for PREEMPT_RCU=n, PREEMPT_COUNT=y Ankur Arora
2024-12-13 13:59 ` Frederic Weisbecker
2024-12-13 20:44 ` Ankur Arora
2024-12-13 4:06 ` [PATCH v3 6/7] osnoise: provide quiescent states Ankur Arora
2024-12-13 14:34 ` Frederic Weisbecker
2024-12-13 4:06 ` [PATCH v3 7/7] rcu: limit PREEMPT_RCU configurations Ankur Arora
2025-01-08 1:14 ` [PATCH v3 0/7] RCU changes for PREEMPT_LAZY Paul E. McKenney
2025-01-08 18:18 ` Ankur Arora