From: Joel Fernandes <joel@joelfernandes.org>
To: Frederic Weisbecker <frederic@kernel.org>
Cc: rcu@vger.kernel.org, linux-kernel@vger.kernel.org,
rushikesh.s.kadam@intel.com, urezki@gmail.com,
neeraj.iitr10@gmail.com, paulmck@kernel.org, rostedt@goodmis.org,
vineeth@bitbyteword.org, boqun.feng@gmail.com
Subject: Re: [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation
Date: Sat, 3 Sep 2022 22:00:29 +0000 [thread overview]
Message-ID: <YxPOfVMzRWEa7xqf@google.com> (raw)
In-Reply-To: <20220902152132.GA115525@lothringen>
On Fri, Sep 02, 2022 at 05:21:32PM +0200, Frederic Weisbecker wrote:
[..]
> > +
> > raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
> >
> > trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason);
> [...]
> > @@ -705,12 +816,21 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
> > my_rdp->nocb_gp_gp = needwait_gp;
> > my_rdp->nocb_gp_seq = needwait_gp ? wait_gp_seq : 0;
> >
> > - if (bypass && !rcu_nocb_poll) {
> > - // At least one child with non-empty ->nocb_bypass, so set
> > - // timer in order to avoid stranding its callbacks.
> > - wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
> > - TPS("WakeBypassIsDeferred"));
> > + // At least one child with non-empty ->nocb_bypass, so set
> > + // timer in order to avoid stranding its callbacks.
> > + if (!rcu_nocb_poll) {
> > + // If bypass list only has lazy CBs. Add a deferred
> > + // lazy wake up.
> > + if (lazy && !bypass) {
> > + wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY,
> > + TPS("WakeLazyIsDeferred"));
>
> What if:
>
> 1) rdp(1) has lazy callbacks
> 2) all other rdp's have no callback at all
> 3) nocb_gp_wait() runs through all rdp's, everything is handled, except for
> these lazy callbacks
> 4) It reaches the above path, ready to arm the RCU_NOCB_WAKE_LAZY timer,
> but it hasn't yet called wake_nocb_gp_defer()
> 5) Oh but rdp(2) queues a non-lazy callback. interrupts are disabled so it defers
> the wake up to nocb_gp_wait() with arming the timer in RCU_NOCB_WAKE.
> 6) nocb_gp_wait() finally calls wake_nocb_gp_defer() and override the timeout
> to several seconds ahead.
> 7) No more callbacks queued, the non-lazy callback will have to wait several
> seconds to complete.
>
> Or did I miss something? Note that the race exists with RCU_NOCB_WAKE_BYPASS
> but it's only about one jiffy delay, not seconds.
>
So I think the below patch should fix that. But I have not tested it at all
and it could very well have issues. In particular, there is a likelihood of a
wake up while holding a lock which I'm not sure is safe due to scheduler
locks. I'll test it next week. Let me know any thoughts though.
---8<-----------------------
From: "Joel Fernandes (Google)" <joel@joelfernandes.org>
Subject: [PATCH] rcu: Fix race where wake_nocb_gp_defer() lazy wake can
overwrite a non-lazy wake
Fix by holding nocb_gp_lock when observing the state of all rdps. If any
rdp queued a non-lazy CB, we would do a wake up of the main gp thread.
This should address the race Frederick reported (which could effect both
non-lazy CBs using the bypass list, and lazy CBs, though lazy CBs much
more noticeably).
Quoting from Frederic's email:
1) rdp(1) has lazy callbacks
2) all other rdp's have no callback at all
3) nocb_gp_wait() runs through all rdp's, everything is handled, except for
these lazy callbacks
4) It reaches the above path, ready to arm the RCU_NOCB_WAKE_LAZY timer,
but it hasn't yet called wake_nocb_gp_defer()
5) Oh but rdp(2) queues a non-lazy callback. interrupts are disabled so it defers
the wake up to nocb_gp_wait() with arming the timer in RCU_NOCB_WAKE.
6) nocb_gp_wait() finally calls wake_nocb_gp_defer() and override the timeout
to several seconds ahead.
7) No more callbacks queued, the non-lazy callback will have to wait several
seconds to complete.
Here, the nocb gp lock is held when #4 happens. So the deferred wakeup
on #5 has to wait till #4 finishes.
Reported-by: Frederic Weisbecker <fweisbec@gmail.com>
Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
---
kernel/rcu/tree_nocb.h | 28 ++++++++++++++++++++++------
1 file changed, 22 insertions(+), 6 deletions(-)
diff --git a/kernel/rcu/tree_nocb.h b/kernel/rcu/tree_nocb.h
index 8b46442e4473..6690ece8fe20 100644
--- a/kernel/rcu/tree_nocb.h
+++ b/kernel/rcu/tree_nocb.h
@@ -285,14 +285,15 @@ EXPORT_SYMBOL(rcu_lazy_get_jiffies_till_flush);
* Arrange to wake the GP kthread for this NOCB group at some future
* time when it is safe to do so.
*/
-static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
- const char *reason)
+static void __wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
+ const char *reason, bool locked)
{
unsigned long flags;
struct rcu_data *rdp_gp = rdp->nocb_gp_rdp;
unsigned long mod_jif = 0;
- raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
+ if (!locked)
+ raw_spin_lock_irqsave(&rdp_gp->nocb_gp_lock, flags);
/*
* Bypass and lazy wakeup overrides previous deferments. In case of
@@ -323,11 +324,23 @@ static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
if (rdp_gp->nocb_defer_wakeup < waketype)
WRITE_ONCE(rdp_gp->nocb_defer_wakeup, waketype);
- raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
+ if (!locked)
+ raw_spin_unlock_irqrestore(&rdp_gp->nocb_gp_lock, flags);
trace_rcu_nocb_wake(rcu_state.name, rdp->cpu, reason);
}
+static void wake_nocb_gp_defer(struct rcu_data *rdp, int waketype,
+ const char *reason) {
+ __wake_nocb_gp_defer(rdp, waketype, reason, false);
+}
+
+
+static void wake_nocb_gp_defer_locked(struct rcu_data *rdp, int waketype,
+ const char *reason) {
+ __wake_nocb_gp_defer(rdp, waketype, reason, true);
+
/*
* Flush the ->nocb_bypass queue into ->cblist, enqueuing rhp if non-NULL.
* However, if there is a callback to be enqueued and if ->nocb_bypass
@@ -754,6 +767,8 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
* is added to the list, so the skipped-over rcu_data structures
* won't be ignored for long.
*/
+
+ raw_spin_lock_irqsave(&my_rdp->nocb_gp_lock, flags);
list_for_each_entry_rcu(rdp, &my_rdp->nocb_head_rdp, nocb_entry_rdp, 1) {
bool needwake_state = false;
bool flush_bypass = false;
@@ -855,14 +870,15 @@ static void nocb_gp_wait(struct rcu_data *my_rdp)
// If bypass list only has lazy CBs. Add a deferred
// lazy wake up.
if (lazy && !bypass) {
- wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_LAZY,
+ wake_nocb_gp_defer_locked(my_rdp, RCU_NOCB_WAKE_LAZY,
TPS("WakeLazyIsDeferred"));
// Otherwise add a deferred bypass wake up.
} else if (bypass) {
- wake_nocb_gp_defer(my_rdp, RCU_NOCB_WAKE_BYPASS,
+ wake_nocb_gp_defer_locked(my_rdp, RCU_NOCB_WAKE_BYPASS,
TPS("WakeBypassIsDeferred"));
}
}
+ raw_spin_unlock_irqrestore(&my_rdp->nocb_gp_lock, flags);
if (rcu_nocb_poll) {
/* Polling, so trace if first poll in the series. */
--
2.37.2.789.g6183377224-goog
next prev parent reply other threads:[~2022-09-03 22:00 UTC|newest]
Thread overview: 84+ messages / expand[flat|nested] mbox.gz Atom feed top
2022-09-01 22:17 [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 01/18] mm/slub: perform free consistency checks before call_rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 02/18] mm/sl[au]b: rearrange struct slab fields to allow larger rcu_head Joel Fernandes (Google)
2022-09-02 9:26 ` Vlastimil Babka
2022-09-02 9:30 ` Vlastimil Babka
2022-09-02 15:09 ` Joel Fernandes
2022-09-03 13:53 ` Paul E. McKenney
2022-09-01 22:17 ` [PATCH v5 03/18] rcu/tree: Use READ_ONCE() for lockless read of rnp->qsmask Joel Fernandes (Google)
2022-09-06 22:26 ` Boqun Feng
2022-09-06 22:31 ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 04/18] rcu: Fix late wakeup when flush of bypass cblist happens Joel Fernandes (Google)
2022-09-02 11:35 ` Frederic Weisbecker
2022-09-02 23:58 ` Joel Fernandes
2022-09-03 15:10 ` Paul E. McKenney
2022-09-04 21:13 ` Frederic Weisbecker
2022-09-03 14:04 ` Paul E. McKenney
2022-09-03 14:05 ` Joel Fernandes
2022-09-06 3:07 ` Joel Fernandes
2022-09-06 9:48 ` Frederic Weisbecker
2022-09-07 2:43 ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 05/18] rcu: Move trace_rcu_callback() before bypassing Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 06/18] rcu: Introduce call_rcu_lazy() API implementation Joel Fernandes (Google)
2022-09-02 15:21 ` Frederic Weisbecker
2022-09-02 23:09 ` Joel Fernandes
2022-09-05 12:59 ` Frederic Weisbecker
2022-09-05 20:18 ` Joel Fernandes
2022-09-05 20:32 ` Joel Fernandes
2022-09-06 8:55 ` Paul E. McKenney
2022-09-06 16:16 ` Joel Fernandes
2022-09-06 17:05 ` Paul E. McKenney
2022-09-03 22:00 ` Joel Fernandes [this message]
2022-09-04 21:01 ` Frederic Weisbecker
2022-09-05 20:20 ` Joel Fernandes
2022-09-06 3:05 ` Joel Fernandes
2022-09-06 15:17 ` Frederic Weisbecker
2022-09-06 16:15 ` Joel Fernandes
2022-09-06 16:31 ` Joel Fernandes
2022-09-06 16:38 ` Joel Fernandes
2022-09-06 16:43 ` Joel Fernandes
2022-09-06 19:11 ` Frederic Weisbecker
2022-09-07 2:56 ` Joel Fernandes
2022-09-07 9:56 ` Frederic Weisbecker
2022-09-07 10:03 ` Frederic Weisbecker
2022-09-07 14:01 ` Joel Fernandes
2022-09-07 0:06 ` Joel Fernandes
2022-09-07 9:40 ` Frederic Weisbecker
2022-09-07 13:44 ` Joel Fernandes
2022-09-07 15:38 ` Frederic Weisbecker
2022-09-07 15:39 ` Joel Fernandes
2022-09-21 23:52 ` Joel Fernandes
2022-09-22 14:39 ` Joel Fernandes
2022-09-06 18:16 ` Uladzislau Rezki
2022-09-06 18:21 ` Joel Fernandes
2022-09-07 8:52 ` Uladzislau Rezki
2022-09-07 15:23 ` Joel Fernandes
2022-09-03 14:03 ` Paul E. McKenney
2022-09-03 14:05 ` Joel Fernandes
2022-09-01 22:17 ` [PATCH v5 07/18] rcu: shrinker for lazy rcu Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 08/18] rcu: Add per-CB tracing for queuing, flush and invocation Joel Fernandes (Google)
2022-09-02 16:48 ` kernel test robot
2022-09-03 12:39 ` Paul E. McKenney
2022-09-03 12:39 ` Paul E. McKenney
2022-09-03 14:07 ` Joel Fernandes
2022-09-03 14:07 ` Joel Fernandes
2022-09-02 19:01 ` kernel test robot
2022-09-01 22:17 ` [PATCH v5 09/18] rcuscale: Add laziness and kfree tests Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 10/18] rcutorture: Add test code for call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 11/18] fs: Move call_rcu() to call_rcu_lazy() in some paths Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 12/18] cred: Move call_rcu() to call_rcu_lazy() Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 13/18] security: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 14/18] net/core: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 15/18] kernel: Move various core kernel usages " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 16/18] lib: Move call_rcu() " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 17/18] i915: " Joel Fernandes (Google)
2022-09-01 22:17 ` [PATCH v5 18/18] fork: Move thread_stack_free_rcu() " Joel Fernandes (Google)
2022-09-03 15:22 ` [PATCH v5 00/18] Implement call_rcu_lazy() and miscellaneous fixes Paul E. McKenney
[not found] ` <CAEXW_YTxRWvUN3YuwdJ9WPmHH-v+Auozfc_OugpV_hpxC5K+4Q@mail.gmail.com>
[not found] ` <20220914093423.GS246308@paulmck-ThinkPad-P17-Gen-1>
[not found] ` <20220914102319.GA1936@lothringen>
[not found] ` <CA+KHdyUb122cCaeLutzK8Ti-dk4D=iXCtQau=ZXkPRO-1cnD1A@mail.gmail.com>
[not found] ` <20220914112251.GV246308@paulmck-ThinkPad-P17-Gen-1>
[not found] ` <YygjfJkoUzxy9xd/@pc636>
[not found] ` <20220919131152.GF4196@paulmck-ThinkPad-P17-Gen-1>
[not found] ` <CAEXW_YQGPwkTQjxAb5p4OOhJkMA-L93JNTsa-WaXs0Ts7=Oakw@mail.gmail.com>
[not found] ` <20220919223211.GJ4196@paulmck-ThinkPad-P17-Gen-1>
[not found] ` <f061b798-e6f6-1523-52c3-12fe7966f809@joelfernandes.org>
[not found] ` <a2882e80-1723-9004-f67a-cd8332aa701d@joelfernandes.org>
[not found] ` <d72808e1-fc27-f158-37b7-ef23c23e74c1@joelfernandes.org>
2022-09-20 11:40 ` Joel Fernandes
2022-09-20 12:32 ` Paul E. McKenney
2022-09-20 17:55 ` Joel Fernandes
2022-09-20 19:03 ` Paul E. McKenney
2022-09-20 22:27 ` Joel Fernandes
2022-09-20 22:35 ` Paul E. McKenney
2022-09-20 22:40 ` Joel Fernandes
2022-09-21 2:21 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=YxPOfVMzRWEa7xqf@google.com \
--to=joel@joelfernandes.org \
--cc=boqun.feng@gmail.com \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=neeraj.iitr10@gmail.com \
--cc=paulmck@kernel.org \
--cc=rcu@vger.kernel.org \
--cc=rostedt@goodmis.org \
--cc=rushikesh.s.kadam@intel.com \
--cc=urezki@gmail.com \
--cc=vineeth@bitbyteword.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.