* [PATCH 0/2] RCU NOCB for v6.13
@ 2024-11-06 15:32 Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 1/2] rcu: Remove unused declaration rcu_segcblist_offload() Frederic Weisbecker
` (2 more replies)
0 siblings, 3 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2024-11-06 15:32 UTC (permalink / raw)
To: LKML
Cc: Frederic Weisbecker, Boqun Feng, Joel Fernandes, Josh Triplett,
Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay,
Paul E . McKenney, Steven Rostedt, Uladzislau Rezki, Zqiang, rcu
Hello,
Please find below the RCU NOCB patches targeted for the upcoming
merge window.
Yue Haibing (1):
rcu: Remove unused declaration rcu_segcblist_offload()
Zqiang (1):
rcu/nocb: Fix missed RCU barrier on deoffloading
kernel/rcu/rcu_segcblist.h | 1 -
kernel/rcu/tree_nocb.h | 13 ++++++++++++-
2 files changed, 12 insertions(+), 2 deletions(-)
Thanks.
--
2.46.0
^ permalink raw reply [flat|nested] 7+ messages in thread
* [PATCH 1/2] rcu: Remove unused declaration rcu_segcblist_offload()
2024-11-06 15:32 [PATCH 0/2] RCU NOCB for v6.13 Frederic Weisbecker
@ 2024-11-06 15:32 ` Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading Frederic Weisbecker
2024-11-11 17:33 ` [PATCH 0/2] RCU NOCB for v6.13 Neeraj Upadhyay
2 siblings, 0 replies; 7+ messages in thread
From: Frederic Weisbecker @ 2024-11-06 15:32 UTC (permalink / raw)
To: LKML
Cc: Yue Haibing, Boqun Feng, Joel Fernandes, Josh Triplett,
Lai Jiangshan, Mathieu Desnoyers, Neeraj Upadhyay,
Paul E . McKenney, Steven Rostedt, Uladzislau Rezki, Zqiang, rcu,
Frederic Weisbecker, Neeraj Upadhyay
From: Yue Haibing <yuehaibing@huawei.com>
Commit 17351eb59abd ("rcu/nocb: Simplify (de-)offloading state machine")
removed the implementation but leave declaration.
Signed-off-by: Yue Haibing <yuehaibing@huawei.com>
Reviewed-by: Frederic Weisbecker <frederic@kernel.org>
Reviewed-by: "Paul E. McKenney" <paulmck@kernel.org>
Signed-off-by: Neeraj Upadhyay <neeraj.upadhyay@kernel.org>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/rcu/rcu_segcblist.h | 1 -
1 file changed, 1 deletion(-)
diff --git a/kernel/rcu/rcu_segcblist.h b/kernel/rcu/rcu_segcblist.h
index 259904075636..fadc08ad4b7b 100644
--- a/kernel/rcu/rcu_segcblist.h
+++ b/kernel/rcu/rcu_segcblist.h
@@ -120,7 +120,6 @@ void rcu_segcblist_inc_len(struct rcu_segcblist *rsclp);
void rcu_segcblist_add_len(struct rcu_segcblist *rsclp, long v);
void rcu_segcblist_init(struct rcu_segcblist *rsclp);
void rcu_segcblist_disable(struct rcu_segcblist *rsclp);
-void rcu_segcblist_offload(struct rcu_segcblist *rsclp, bool offload);
bool rcu_segcblist_ready_cbs(struct rcu_segcblist *rsclp);
bool rcu_segcblist_pend_cbs(struct rcu_segcblist *rsclp);
struct rcu_head *rcu_segcblist_first_cb(struct rcu_segcblist *rsclp);
--
2.46.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading
2024-11-06 15:32 [PATCH 0/2] RCU NOCB for v6.13 Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 1/2] rcu: Remove unused declaration rcu_segcblist_offload() Frederic Weisbecker
@ 2024-11-06 15:32 ` Frederic Weisbecker
2024-11-11 7:37 ` Neeraj Upadhyay
2024-11-11 17:33 ` [PATCH 0/2] RCU NOCB for v6.13 Neeraj Upadhyay
2 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2024-11-06 15:32 UTC (permalink / raw)
To: LKML
Cc: Zqiang, Boqun Feng, Joel Fernandes, Josh Triplett, Lai Jiangshan,
Mathieu Desnoyers, Neeraj Upadhyay, Paul E . McKenney,
Steven Rostedt, Uladzislau Rezki, rcu, Frederic Weisbecker
From: Zqiang <qiang.zhang1211@gmail.com>
Currently, running rcutorture test with torture_type=rcu fwd_progress=8
n_barrier_cbs=8 nocbs_nthreads=8 nocbs_toggle=100 onoff_interval=60
test_boost=2, will trigger the following warning:
WARNING: CPU: 19 PID: 100 at kernel/rcu/tree_nocb.h:1061 rcu_nocb_rdp_deoffload+0x292/0x2a0
RIP: 0010:rcu_nocb_rdp_deoffload+0x292/0x2a0
Call Trace:
<TASK>
? __warn+0x7e/0x120
? rcu_nocb_rdp_deoffload+0x292/0x2a0
? report_bug+0x18e/0x1a0
? handle_bug+0x3d/0x70
? exc_invalid_op+0x18/0x70
? asm_exc_invalid_op+0x1a/0x20
? rcu_nocb_rdp_deoffload+0x292/0x2a0
rcu_nocb_cpu_deoffload+0x70/0xa0
rcu_nocb_toggle+0x136/0x1c0
? __pfx_rcu_nocb_toggle+0x10/0x10
kthread+0xd1/0x100
? __pfx_kthread+0x10/0x10
ret_from_fork+0x2f/0x50
? __pfx_kthread+0x10/0x10
ret_from_fork_asm+0x1a/0x30
</TASK>
CPU0 CPU2 CPU3
//rcu_nocb_toggle //nocb_cb_wait //rcutorture
// deoffload CPU1 // process CPU1's rdp
rcu_barrier()
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 2
// enqueue barrier
// callback to CPU1's
// rdp->cblist
rcu_do_batch()
// invoke CPU1's rdp->cblist
// callback
rcu_barrier_callback()
rcu_barrier()
mutex_lock(&rcu_state.barrier_mutex);
// still see len == 2
// enqueue barrier callback
// to CPU1's rdp->cblist
rcu_segcblist_entrain()
rcu_segcblist_add_len(1);
// len == 3
// decrement len
rcu_segcblist_add_len(-2);
kthread_parkme()
// CPU1's rdp->cblist len == 1
// Warn because there is
// still a pending barrier
// trigger warning
WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
cpus_read_unlock();
// wait CPU1 to comes online and
// invoke barrier callback on
// CPU1 rdp's->cblist
wait_for_completion(&rcu_state.barrier_completion);
// deoffload CPU4
cpus_read_lock()
rcu_barrier()
mutex_lock(&rcu_state.barrier_mutex);
// block on barrier_mutex
// wait rcu_barrier() on
// CPU3 to unlock barrier_mutex
// but CPU3 unlock barrier_mutex
// need to wait CPU1 comes online
// when CPU1 going online will block on cpus_write_lock
The above scenario will not only trigger a WARN_ON_ONCE(), but also
trigger a deadlock.
Thanks to nocb locking, a second racing rcu_barrier() on an offline CPU
will either observe the decremented callback counter down to 0 and spare
the callback enqueue, or rcuo will observe the new callback and keep
rdp->nocb_cb_sleep to false.
Therefore check rdp->nocb_cb_sleep before parking to make sure no
further rcu_barrier() is waiting on the rdp.
Fixes: 1fcb932c8b5c ("rcu/nocb: Simplify (de-)offloading state machine")
Suggested-by: Frederic Weisbecker <frederic@kernel.org>
Signed-off-by: Zqiang <qiang.zhang1211@gmail.com>
Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
---
kernel/rcu/tree_nocb.h | 13 ++++++++++++-
1 file changed, 12 insertions(+), 1 deletion(-)
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 16865475120b..2605dd234a13 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -891,7 +891,18 @@ static void nocb_cb_wait(struct rcu_data *rdp)
swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
nocb_cb_wait_cond(rdp));
if (kthread_should_park()) {
- kthread_parkme();
+ /*
+ * kthread_park() must be preceded by an rcu_barrier().
+ * But yet another rcu_barrier() might have sneaked in between
+ * the barrier callback execution and the callbacks counter
+ * decrement.
+ */
+ if (rdp->nocb_cb_sleep) {
+ rcu_nocb_lock_irqsave(rdp, flags);
+ WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
+ rcu_nocb_unlock_irqrestore(rdp, flags);
+ kthread_parkme();
+ }
} else if (READ_ONCE(rdp->nocb_cb_sleep)) {
WARN_ON(signal_pending(current));
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
--
2.46.0
^ permalink raw reply related [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading
2024-11-06 15:32 ` [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading Frederic Weisbecker
@ 2024-11-11 7:37 ` Neeraj Upadhyay
2024-11-12 21:37 ` Frederic Weisbecker
0 siblings, 1 reply; 7+ messages in thread
From: Neeraj Upadhyay @ 2024-11-11 7:37 UTC (permalink / raw)
To: Frederic Weisbecker, LKML
Cc: Zqiang, Boqun Feng, Joel Fernandes, Josh Triplett, Lai Jiangshan,
Mathieu Desnoyers, Paul E . McKenney, Steven Rostedt,
Uladzislau Rezki, rcu
> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> index 16865475120b..2605dd234a13 100644
> --- a/kernel/rcu/tree_nocb.h
> +++ b/kernel/rcu/tree_nocb.h
> @@ -891,7 +891,18 @@ static void nocb_cb_wait(struct rcu_data *rdp)
> swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
> nocb_cb_wait_cond(rdp));
> if (kthread_should_park()) {
> - kthread_parkme();
> + /*
> + * kthread_park() must be preceded by an rcu_barrier().
> + * But yet another rcu_barrier() might have sneaked in between
> + * the barrier callback execution and the callbacks counter
> + * decrement.
> + */
> + if (rdp->nocb_cb_sleep) {
Is READ_ONCE() not required here?
- Neeraj
> + rcu_nocb_lock_irqsave(rdp, flags);
> + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
> + rcu_nocb_unlock_irqrestore(rdp, flags);
> + kthread_parkme();
> + }
> } else if (READ_ONCE(rdp->nocb_cb_sleep)) {
> WARN_ON(signal_pending(current));
> trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 0/2] RCU NOCB for v6.13
2024-11-06 15:32 [PATCH 0/2] RCU NOCB for v6.13 Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 1/2] rcu: Remove unused declaration rcu_segcblist_offload() Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading Frederic Weisbecker
@ 2024-11-11 17:33 ` Neeraj Upadhyay
2 siblings, 0 replies; 7+ messages in thread
From: Neeraj Upadhyay @ 2024-11-11 17:33 UTC (permalink / raw)
To: Frederic Weisbecker, LKML
Cc: Boqun Feng, Joel Fernandes, Josh Triplett, Lai Jiangshan,
Mathieu Desnoyers, Paul E . McKenney, Steven Rostedt,
Uladzislau Rezki, Zqiang, rcu
On 11/6/2024 9:02 PM, Frederic Weisbecker wrote:
> Hello,
>
> Please find below the RCU NOCB patches targeted for the upcoming
> merge window.
>
> Yue Haibing (1):
> rcu: Remove unused declaration rcu_segcblist_offload()
>
> Zqiang (1):
> rcu/nocb: Fix missed RCU barrier on deoffloading
>
> kernel/rcu/rcu_segcblist.h | 1 -
> kernel/rcu/tree_nocb.h | 13 ++++++++++++-
> 2 files changed, 12 insertions(+), 2 deletions(-)
>
> Thanks.
>
Reviewed-by: Neeraj Upadhyay <Neeraj.Upadhyay@amd.com>
- Neeraj
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading
2024-11-11 7:37 ` Neeraj Upadhyay
@ 2024-11-12 21:37 ` Frederic Weisbecker
2024-11-13 3:22 ` Neeraj Upadhyay
0 siblings, 1 reply; 7+ messages in thread
From: Frederic Weisbecker @ 2024-11-12 21:37 UTC (permalink / raw)
To: Neeraj Upadhyay
Cc: LKML, Zqiang, Boqun Feng, Joel Fernandes, Josh Triplett,
Lai Jiangshan, Mathieu Desnoyers, Paul E . McKenney,
Steven Rostedt, Uladzislau Rezki, rcu
Le Mon, Nov 11, 2024 at 01:07:16PM +0530, Neeraj Upadhyay a écrit :
>
> > diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
> > index 16865475120b..2605dd234a13 100644
> > --- a/kernel/rcu/tree_nocb.h
> > +++ b/kernel/rcu/tree_nocb.h
> > @@ -891,7 +891,18 @@ static void nocb_cb_wait(struct rcu_data *rdp)
> > swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
> > nocb_cb_wait_cond(rdp));
> > if (kthread_should_park()) {
> > - kthread_parkme();
> > + /*
> > + * kthread_park() must be preceded by an rcu_barrier().
> > + * But yet another rcu_barrier() might have sneaked in between
> > + * the barrier callback execution and the callbacks counter
> > + * decrement.
> > + */
> > + if (rdp->nocb_cb_sleep) {
>
> Is READ_ONCE() not required here?
No because it can't be written concurrently at this point. The value observed
here if kthread_should_park() must have been written locally on the previous
call to nocb_cb_wait().
Thanks.
>
>
> - Neeraj
>
> > + rcu_nocb_lock_irqsave(rdp, flags);
> > + WARN_ON_ONCE(rcu_segcblist_n_cbs(&rdp->cblist));
> > + rcu_nocb_unlock_irqrestore(rdp, flags);
> > + kthread_parkme();
> > + }
> > } else if (READ_ONCE(rdp->nocb_cb_sleep)) {
> > WARN_ON(signal_pending(current));
> > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, TPS("WokeEmpty"));
>
^ permalink raw reply [flat|nested] 7+ messages in thread
* Re: [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading
2024-11-12 21:37 ` Frederic Weisbecker
@ 2024-11-13 3:22 ` Neeraj Upadhyay
0 siblings, 0 replies; 7+ messages in thread
From: Neeraj Upadhyay @ 2024-11-13 3:22 UTC (permalink / raw)
To: Frederic Weisbecker
Cc: LKML, Zqiang, Boqun Feng, Joel Fernandes, Josh Triplett,
Lai Jiangshan, Mathieu Desnoyers, Paul E . McKenney,
Steven Rostedt, Uladzislau Rezki, rcu
On 11/13/2024 3:07 AM, Frederic Weisbecker wrote:
> Le Mon, Nov 11, 2024 at 01:07:16PM +0530, Neeraj Upadhyay a écrit :
>>
>>> diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
>>> index 16865475120b..2605dd234a13 100644
>>> --- a/kernel/rcu/tree_nocb.h
>>> +++ b/kernel/rcu/tree_nocb.h
>>> @@ -891,7 +891,18 @@ static void nocb_cb_wait(struct rcu_data *rdp)
>>> swait_event_interruptible_exclusive(rdp->nocb_cb_wq,
>>> nocb_cb_wait_cond(rdp));
>>> if (kthread_should_park()) {
>>> - kthread_parkme();
>>> + /*
>>> + * kthread_park() must be preceded by an rcu_barrier().
>>> + * But yet another rcu_barrier() might have sneaked in between
>>> + * the barrier callback execution and the callbacks counter
>>> + * decrement.
>>> + */
>>> + if (rdp->nocb_cb_sleep) {
>>
>> Is READ_ONCE() not required here?
>
> No because it can't be written concurrently at this point. The value observed
> here if kthread_should_park() must have been written locally on the previous
> call to nocb_cb_wait().
>
Ok, got it. I was not aware of any other flow (other than the one described in
this fix) which can race with it. So, asked.
- Neeraj
^ permalink raw reply [flat|nested] 7+ messages in thread
end of thread, other threads:[~2024-11-13 3:22 UTC | newest]
Thread overview: 7+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2024-11-06 15:32 [PATCH 0/2] RCU NOCB for v6.13 Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 1/2] rcu: Remove unused declaration rcu_segcblist_offload() Frederic Weisbecker
2024-11-06 15:32 ` [PATCH 2/2] rcu/nocb: Fix missed RCU barrier on deoffloading Frederic Weisbecker
2024-11-11 7:37 ` Neeraj Upadhyay
2024-11-12 21:37 ` Frederic Weisbecker
2024-11-13 3:22 ` Neeraj Upadhyay
2024-11-11 17:33 ` [PATCH 0/2] RCU NOCB for v6.13 Neeraj Upadhyay
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.