From: Joel Fernandes <joel@joelfernandes.org>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: LKML <linux-kernel@vger.kernel.org>,
	Josh Triplett <josh@joshtriplett.org>,
	Lai Jiangshan <jiangshanlai@gmail.com>,
	Mathieu Desnoyers <mathieu.desnoyers@efficios.com>,
	rcu <rcu@vger.kernel.org>, Steven Rostedt <rostedt@goodmis.org>
Subject: Re: [PATCH v2] rcu/segcblist: Add debug checks for segment lengths
Date: Thu, 19 Nov 2020 15:42:21 -0500
Message-ID: <20201119204221.GB812262@google.com>
In-Reply-To: <20201119201615.GA1437@paulmck-ThinkPad-P72>

On Thu, Nov 19, 2020 at 12:16:15PM -0800, Paul E. McKenney wrote:
> On Thu, Nov 19, 2020 at 02:44:35PM -0500, Joel Fernandes wrote:
> > On Thu, Nov 19, 2020 at 2:22 PM Paul E. McKenney <paulmck@kernel.org> wrote:
> > > > > > > On Wed, Nov 18, 2020 at 11:15:41AM -0500, Joel Fernandes (Google) wrote:
> > > > > > > > After rcu_do_batch(), add a check for whether the seglen counts went to
> > > > > > > > zero if the list was indeed empty.
> > > > > > > >
> > > > > > > > Signed-off-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> > > > > > >
> > > > > > > Queued for testing and further review, thank you!
> > > > > >
> > > > > > FYI, the second of the two checks triggered in all four one-hour runs of
> > > > > > TREE01, all four one-hour runs of TREE04, and one of the four one-hour
> > > > > > runs of TREE07.  This one:
> > > > > >
> > > > > > WARN_ON_ONCE(count != 0 && rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);
> > > > > >
> > > > > > That is, there are callbacks in the list, but the sum of the segment
> > > > > > counts is nevertheless zero.  The ->nocb_lock is held.
> > > > > >
> > > > > > Thoughts?
> > > > >
> > > > > FWIW, TREE01 reproduces it very quickly compared to the other two
> > > > > > scenarios, on all four runs, within five minutes.
> > > >
> > > > So far for TREE01, I traced it down to an rcu_barrier happening so it could
> > > > be related to some interaction with rcu_barrier() (Just a guess).
> > >
> > > Well, rcu_barrier() and srcu_barrier() are the only users of
> > > rcu_segcblist_entrain(), if that helps.  Your modification to that
> > > function looks plausible to me, but the system's opinion always overrules
> > > mine.  ;-)
> > 
> > Right. Does anything in the bypass code stand out? That happens during
> > rcu_barrier() as well, and it messes with the lengths.
> 
> In theory, rcu_barrier_func() flushes the bypass before doing the
> entrain, and does the rcu_segcblist_entrain() afterwards.
> 
> Ah, and that is the issue.  If ->cblist is empty and ->nocb_bypass
> is not, then ->cblist length will be nonzero, and none of the
> segments will be nonzero.
> 
> So you need something like this for that second WARN, correct?
> 
> 	WARN_ON_ONCE(!rcu_segcblist_empty(&rdp->cblist) &&
> 		     rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);
> 
> This is off the cuff, so should be taken with a grain of salt.  And
> there might well be other similar issues.

Ah, makes sense. Or maybe it should be made like the other warning?

	WARN_ON_ONCE(!IS_ENABLED(CONFIG_RCU_NOCB_CPU) && count != 0 &&
		     rcu_segcblist_n_segment_cbs(&rdp->cblist) == 0);

Though your warning is better.
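
To make the difference between the two checks concrete, here is a minimal user-space sketch. It is a toy model, not kernel code: every toy_* name is hypothetical, and it assumes (as described above) that a bypass enqueue bumps the list's total length without touching the list head or any segment count, and that the emptiness test looks only at the head pointer.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

#define TOY_NSEGS 4	/* hypothetical segment count for this toy model */

struct toy_cb {
	struct toy_cb *next;
};

/* Toy stand-in for a segmented callback list; not the kernel structure. */
struct toy_cblist {
	struct toy_cb *head;		/* NULL when no callback is on the list */
	long len;			/* total count, assumed to include bypassed callbacks */
	long seglen[TOY_NSEGS];		/* per-segment counts */
};

/* Sum of the per-segment counts. */
static long toy_n_segment_cbs(const struct toy_cblist *l)
{
	long sum = 0;

	for (int i = 0; i < TOY_NSEGS; i++)
		sum += l->seglen[i];
	return sum;
}

/* Ordinary enqueue: the callback is linked in and both counts are updated. */
static void toy_enqueue(struct toy_cblist *l, struct toy_cb *cb, int seg)
{
	cb->next = l->head;
	l->head = cb;
	l->len++;
	l->seglen[seg]++;
}

/* Bypass enqueue (assumption): only the total length is incremented. */
static void toy_bypass_enqueue(struct toy_cblist *l)
{
	l->len++;
}

/* Count-based condition: true when the original form of the check would warn. */
static bool toy_warn_count(const struct toy_cblist *l)
{
	return l->len != 0 && toy_n_segment_cbs(l) == 0;
}

/* Emptiness-based condition: true when the revised form of the check would warn. */
static bool toy_warn_nonempty(const struct toy_cblist *l)
{
	return l->head != NULL && toy_n_segment_cbs(l) == 0;
}

/* Empty segmented list plus one bypassed callback. */
static void toy_demo(void)
{
	struct toy_cblist l = { 0 };

	toy_bypass_enqueue(&l);
	assert(toy_warn_count(&l));	/* count-based check fires spuriously */
	assert(!toy_warn_nonempty(&l));	/* emptiness-based check stays quiet */
}
```

Calling toy_demo() from a small main() exercises the case discussed above: with only a bypassed callback present, the count-based condition is true while the emptiness-based one is not. Whether the real checks behave the same way still has to be confirmed by the TREE01 runs.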

I will try these out and see if it goes away. I am afraid, though, that there
is an issue with the !NOCB code, since you had other configs that were failing
similarly. :-\

thanks, :-)

 - Joel

