All of lore.kernel.org
 help / color / mirror / Atom feed
From: Frederic Weisbecker <frederic@kernel.org>
To: Neeraj upadhyay <neeraj.iitr10@gmail.com>
Cc: Joel Fernandes <joel@joelfernandes.org>,
	zhuangel570 <zhuangel570@gmail.com>,
	paulmck@kernel.org, rcu@vger.kernel.org, josh@joshtriplett.org,
	rostedt@goodmis.org, mathieu.desnoyers@efficios.com,
	jiangshanlai@gmail.com, like.xu.linux@gmail.com,
	linussli@tencent.com, foxywang@tencent.com
Subject: Re: SRCU: kworker hung in synchronize_srcu
Date: Mon, 2 Oct 2023 12:41:43 +0200	[thread overview]
Message-ID: <ZRqeZ31cLs8bocBo@lothringen> (raw)
In-Reply-To: <CAFwiDX9NBdPy-CZA6i9XEd5jM+j9e57zMoW2AwAoNtzDt+8qbw@mail.gmail.com>

On Mon, Oct 02, 2023 at 07:47:55AM +0530, Neeraj upadhyay wrote:
> On Mon, Oct 2, 2023 at 4:02 AM Frederic Weisbecker <frederic@kernel.org> wrote:
> >
> > Le Sun, Oct 01, 2023 at 07:57:14AM +0530, Neeraj upadhyay a écrit :
> > >
> > > But "more" only checks for CBs in DONE tail. The callbacks which have been just
> > > accelerated are not advanced to DONE tail.
> > >
> > > Having said above, I am still trying to figure out, which callbacks
> > > are actually being pointed
> > > by NEXT tail. Given that __call_srcu() already does a advance and
> > > accelerate, all enqueued
> > > callbacks would be in either WAIT tail (the callbacks for current
> > > active GP) or NEXT_READY
> > > tail (the callbacks for next GP after current one completes). So, they
> > > should already have
> > > GP num assigned and srcu_invoke_callbacks() won't accelerate them.
> > > Only case I can
> > > think of is, if current GP completes after we sample
> > > rcu_seq_current(&ssp->srcu_gp_seq) for
> > > rcu_segcblist_advance() (so, WAIT tail cbs are not moved to DONE tail)
> > > and a new GP is started
> > > before we take snapshot ('s') of next GP  for
> > > rcu_segcblist_accelerate(), then the gp num 's'
> > > > gp num of NEXT_READY_TAIL and will be put in NEXT tail. Not sure
> > > if my understanding is correct here. Thoughts?
> > >
> > > __call_srcu()
> > >         rcu_segcblist_advance(&sdp->srcu_cblist,
> > >                               rcu_seq_current(&ssp->srcu_gp_seq));
> > >         s = rcu_seq_snap(&ssp->srcu_gp_seq);
> > >         (void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
> >
> > Good point! This looks plausible.
> >
> > Does the (buggy) acceleration in srcu_invoke_callbacks() exists solely
> > to handle that case? Because then the acceleration could be moved before
> > the advance on callbacks handling, something like:
> >
> 
> I think we might need to accelerate after advance, as the  tail pointers
> (WAIT, NEXT_READY) can be non-empty and we will not be able to
> accelerate (and assign GP num) until we advance WAIT tail to DONE tail?

Right indeed! How about the following patch then, assuming that in SRCU:
1 enqueue == 1 accelerate and therefore it only makes sense
to accelerate on enqueue time and any other accelerate call is error prone.

I can't find a scenario where the second call below to accelerate (and thus also
the second call to advance) would fail. Please tell me if I'm missing something.
Also the role of the remaining advance in srcu_gp_start() is unclear to me...

diff --git a/kernel/rcu/srcutree.c b/kernel/rcu/srcutree.c
index 20d7a238d675..5ac81ca10ec8 100644
--- a/kernel/rcu/srcutree.c
+++ b/kernel/rcu/srcutree.c
@@ -782,8 +782,6 @@ static void srcu_gp_start(struct srcu_struct *ssp)
 	spin_lock_rcu_node(sdp);  /* Interrupts already disabled. */
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
-	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	spin_unlock_rcu_node(sdp);  /* Interrupts remain disabled. */
 	WRITE_ONCE(ssp->srcu_sup->srcu_gp_start, jiffies);
 	WRITE_ONCE(ssp->srcu_sup->srcu_n_exp_nodelay, 0);
@@ -1245,7 +1243,18 @@ static unsigned long srcu_gp_start_if_needed(struct srcu_struct *ssp,
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	s = rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq);
-	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist, s);
+	/*
+	 * Acceleration might fail if the preceding call to
+	 * rcu_segcblist_advance() also failed due to a prior grace
+	 * period seen incomplete before rcu_seq_snap(). If so then a new
+	 * call to advance will see the completed grace period and fix
+	 * the situation.
+	 */
+	if (!rcu_segcblist_accelerate(&sdp->srcu_cblist, s)) {
+		rcu_segcblist_advance(&sdp->srcu_cblist,
+				      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
+		WARN_ON_ONCE(!rcu_segcblist_accelerate(&sdp->srcu_cblist, s));
+	}
 	if (ULONG_CMP_LT(sdp->srcu_gp_seq_needed, s)) {
 		sdp->srcu_gp_seq_needed = s;
 		needgp = true;
@@ -1692,6 +1701,7 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	ssp = sdp->ssp;
 	rcu_cblist_init(&ready_cbs);
 	spin_lock_irq_rcu_node(sdp);
+	WARN_ON_ONCE(!rcu_segcblist_segempty(&sdp->srcu_cblist, RCU_NEXT_TAIL));
 	rcu_segcblist_advance(&sdp->srcu_cblist,
 			      rcu_seq_current(&ssp->srcu_sup->srcu_gp_seq));
 	if (sdp->srcu_cblist_invoking ||
@@ -1720,8 +1730,6 @@ static void srcu_invoke_callbacks(struct work_struct *work)
 	 */
 	spin_lock_irq_rcu_node(sdp);
 	rcu_segcblist_add_len(&sdp->srcu_cblist, -len);
-	(void)rcu_segcblist_accelerate(&sdp->srcu_cblist,
-				       rcu_seq_snap(&ssp->srcu_sup->srcu_gp_seq));
 	sdp->srcu_cblist_invoking = false;
 	more = rcu_segcblist_ready_cbs(&sdp->srcu_cblist);
 	spin_unlock_irq_rcu_node(sdp);

  reply	other threads:[~2023-10-02 10:41 UTC|newest]

Thread overview: 28+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2023-09-28  7:59 SRCU: kworker hung in synchronize_srcu zhuangel570
2023-09-28 21:39 ` Joel Fernandes
2023-09-28 21:40   ` Joel Fernandes
2023-09-29  1:50   ` Zhouyi Zhou
2023-09-29 22:44   ` Frederic Weisbecker
2023-09-30  2:45     ` Neeraj upadhyay
2023-09-30  7:30       ` Frederic Weisbecker
2023-09-30  9:10       ` Frederic Weisbecker
2023-09-30  9:48         ` Neeraj upadhyay
2023-09-30 10:01   ` Neeraj upadhyay
2023-10-01  0:19     ` Joel Fernandes
2023-10-01  2:27       ` Neeraj upadhyay
2023-10-01 22:32         ` Frederic Weisbecker
2023-10-01 22:39           ` Frederic Weisbecker
2023-10-02  2:21             ` Neeraj upadhyay
2023-10-02 11:05               ` Frederic Weisbecker
2023-10-02 22:46               ` Frederic Weisbecker
2023-10-03 12:06                 ` Neeraj upadhyay
2023-10-02  2:17           ` Neeraj upadhyay
2023-10-02 10:41             ` Frederic Weisbecker [this message]
2023-10-02 13:22               ` Neeraj upadhyay
2023-10-02 21:09                 ` Frederic Weisbecker
2023-10-03 12:00                   ` Neeraj upadhyay
2023-10-03 12:22                   ` Frederic Weisbecker
2023-10-03 18:46                     ` Neeraj upadhyay
2023-10-07  9:13         ` zhuangel570
2023-10-07  8:53   ` zhuangel570
2023-09-30 10:11 ` Neeraj upadhyay

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=ZRqeZ31cLs8bocBo@lothringen \
    --to=frederic@kernel.org \
    --cc=foxywang@tencent.com \
    --cc=jiangshanlai@gmail.com \
    --cc=joel@joelfernandes.org \
    --cc=josh@joshtriplett.org \
    --cc=like.xu.linux@gmail.com \
    --cc=linussli@tencent.com \
    --cc=mathieu.desnoyers@efficios.com \
    --cc=neeraj.iitr10@gmail.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=rostedt@goodmis.org \
    --cc=zhuangel570@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.