Re: [PATCH v4 3/3] rcu: Use _full() API to debug synchronize_rcu()

rcu.vger.kernel.org archive mirror
 help / color / mirror / Atom feed

From: Joel Fernandes <joelagnelf@nvidia.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Uladzislau Rezki <urezki@gmail.com>,
	Boqun Feng <boqun.feng@gmail.com>, RCU <rcu@vger.kernel.org>,
	LKML <linux-kernel@vger.kernel.org>,
	Frederic Weisbecker <frederic@kernel.org>,
	Cheung Wall <zzqq0103.hey@gmail.com>,
	Neeraj upadhyay <Neeraj.Upadhyay@amd.com>,
	Joel Fernandes <joel@joelfernandes.org>,
	Oleksiy Avramchenko <oleksiy.avramchenko@sony.com>
Subject: Re: [PATCH v4 3/3] rcu: Use _full() API to debug synchronize_rcu()
Date: Mon, 3 Mar 2025 12:00:40 -0500	[thread overview]
Message-ID: <20250303170040.GA31126@joelnvbox> (raw)
In-Reply-To: <20250303001710.GA3997787@joelnvbox>

On Sun, Mar 02, 2025 at 07:17:10PM -0500, Joel Fernandes wrote:
> On Sun, Mar 02, 2025 at 07:15:07PM -0500, Joel Fernandes wrote:
> > On Fri, Feb 28, 2025 at 05:08:49PM -0800, Paul E. McKenney wrote:
> > > On Fri, Feb 28, 2025 at 11:59:55AM -0800, Paul E. McKenney wrote:
> > > > On Fri, Feb 28, 2025 at 08:12:51PM +0100, Uladzislau Rezki wrote:
> > > > > Hello, Paul!
> > > > > 
> > > > > > > > > > 
> > > > > > > > > > Except that I got this from overnight testing of rcu/dev on the shared
> > > > > > > > > > RCU tree:
> > > > > > > > > > 
> > > > > > > > > > WARNING: CPU: 5 PID: 14 at kernel/rcu/tree.c:1636 rcu_sr_normal_complete+0x5c/0x80
> > > > > > > > > > 
> > > > > > > > > > I see this only on TREE05.  Which should not be too surprising, given
> > > > > > > > > > that this is the scenario that tests it.  It happened within five minutes
> > > > > > > > > > on all 14 of the TREE05 runs.
> > > > > > > > > > 
> > > > > > > > > Hm.. This is not fun. I tested this on my system and i did not manage to
> > > > > > > > > trigger this whereas you do. Something is wrong.
> > > > > > > > 
> > > > > > > > If you have a debug patch, I would be happy to give it a go.
> > > > > > > > 
> > > > > > > I can trigger it. But.
> > > > > > > 
> > > > > > > Some background. I tested those patches during many hours on the stable
> > > > > > > kernel which is 6.13. On that kernel i was not able to trigger it. Running
> > > > > > > the rcutorture on the our shared "dev" tree, which i did now, triggers this
> > > > > > > right away.
> > > > > > 
> > > > > > Bisection?  (Hey, you knew that was coming!)
> > > > > > 
> > > > > Looks like this: rcu: Fix get_state_synchronize_rcu_full() GP-start detection
> > > > > 
> > > > > After revert in the dev, rcutorture passes TREE05, 16 instances.
> > > > 
> > > > Huh.  We sure don't get to revert that one...
> > > > 
> > > > Do we have a problem with the ordering in rcu_gp_init() between the calls
> > > > to rcu_seq_start() and portions of rcu_sr_normal_gp_init()?  For example,
> > > > do we need to capture the relevant portion of the list before the call
> > > > to rcu_seq_start(), and do the grace-period-start work afterwards?
> > > 
> > > I tried moving the call to rcu_sr_normal_gp_init() before the call to
> > > rcu_seq_start() and got no failures in a one-hour run of 200*TREE05.
> > > Which does not necessarily mean that this is the correct fix, but I
> > > figured that it might at least provide food for thought.
> > > 
> > >                                           Thanx, Paul
> > > 
> > > ------------------------------------------------------------------------
> > > 
> > > diff --git a/kernel/rcu/tree.c b/kernel/rcu/tree.c
> > > index 48384fa2eaeb8..d3efeff7740e7 100644
> > > --- a/kernel/rcu/tree.c
> > > +++ b/kernel/rcu/tree.c
> > > @@ -1819,10 +1819,10 @@ static noinline_for_stack bool rcu_gp_init(void)
> > >  
> > >        /* Advance to a new grace period and initialize state. */
> > >        record_gp_stall_check_time();
> > > +      start_new_poll = rcu_sr_normal_gp_init();
> > >        /* Record GP times before starting GP, hence rcu_seq_start(). */
> > >        rcu_seq_start(&rcu_state.gp_seq);
> > >        ASSERT_EXCLUSIVE_WRITER(rcu_state.gp_seq);
> > > -      start_new_poll = rcu_sr_normal_gp_init();
> > >        trace_rcu_grace_period(rcu_state.name, rcu_state.gp_seq, TPS("start"));
> > 
> > Oh... so the bug is this? Good catch...
> > 
> > 
> > CPU 0                                           CPU 1
> > 
> >                                                 rcu_gp_init()
> >                                                       rcu_seq_start(rcu_state.gp_seq)
> > sychronize_rcu_normal()
> >       rs.head.func
> >       	= (void *) get_state_synchronize_rcu();
> >                    // save rcu_state.gp_seq
> >       rcu_sr_normal_add_req() ->
> >             llist_add(rcu_state.srs_next)
> >       (void) start_poll_synchronize_rcu();
> > 
> > 
> >                                                       sr_normal_gp_init()
> >                                                             llist_add(wait_head, &rcu_state.srs_next);
> > 							    // pick up the
> > 							    // injected WH
> >                                                             rcu_state.srs_wait_tail = wait_head;
> > 
> >                                                 rcu_gp_cleanup()
> >                                                       rcu_seq_end(&rcu_state.gp_seq);
> >                                                       sr_normal_complete()
> >                                                             WARN_ONCE(IS_ENABLED(CONFIG_PROVE_RCU) &&
> >                                                             !poll_state_synchronize_rcu(oldstate),
> > 
> > Where as reordering sr_normal_gp_init() prevents this:
> > 
> >                                                 rcu_gp_init()
> > 
> >                                                       sr_normal_gp_init()
> > 							    // WH has not
> > 							    // been injected
> > 							    // so nothing to
> > 							    // wait on
> > 
> >                                                       rcu_seq_start(rcu_state.gp_seq)
> > sychronize_rcu_normal()
> >       rs.head.func
> >       	= (void *) get_state_synchronize_rcu();
> >                    // save rcu_state.gp_seq
> >       rcu_sr_normal_add_req() ->
> >             llist_add(rcu_state.srs_next)
> >       (void) start_poll_synchronize_rcu();
> > 
> >                                                 rcu_gp_cleanup()
> >                                                       rcu_seq_end(&rcu_state.gp_seq);
> >                                                       // sr_normal_complete()
> > 						      // wont do anything so
> > 						      // no warning
> > 
> > Did I get that right?
> > 
> > I think this is a real bug AFAICS, hoping all the memory barriers are in
> > place to make sure the code reordering also correctly orders the accesses.
> > I'll double check that.
> > 
> > I also feel its 'theoretical', because as long as rcu_gp_init() and
> > rcu_gp_cleanup() are properly ordered WRT pre-existing readers, then
> > synchronize_rcu_normal() still waits for pre-existing readers even though its
> > a bit confused about the value of the cookies.
> > 
> > For the fix,
> > Reviewed-by: Joel Fernandes (Google) <joel@joelfernandes.org>
> 
> Oops, this should be:
> 
> Reviewed-by: Joel Fernandes <joelagnelf@nvidia.com>

I see the original patch "rcu: Fix get_state_synchronize_rcu_full() GP-start
detection" is not yet on -next. Once we are convinced about the fix, do we
want to squash the fix into this patch and have Boqun take it?

Yet another idea is to apply it for 6.15-rc cycle if more time is needed.

Alternatively, we could squash it and I queue it for 6.16 instead of 6.15.

And I'm guessing Vlad's series is also for 6.16.

thanks,

 - Joel

next prev parent reply	other threads:[~2025-03-03 17:00 UTC|newest]

Thread overview: 40+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2025-02-27 13:16 [PATCH v4 1/3] rcutorture: Allow a negative value for nfakewriters Uladzislau Rezki (Sony)
2025-02-27 13:16 ` [PATCH v4 2/3] rcu: Update TREE05.boot to test normal synchronize_rcu() Uladzislau Rezki (Sony)
2025-02-27 13:16 ` [PATCH v4 3/3] rcu: Use _full() API to debug synchronize_rcu() Uladzislau Rezki (Sony)
2025-02-27 17:12   ` Boqun Feng
2025-02-27 17:26     ` Paul E. McKenney
2025-02-27 17:30       ` Boqun Feng
2025-02-27 17:44       ` Uladzislau Rezki
2025-02-28 15:41         ` Paul E. McKenney
2025-02-28 16:36           ` Uladzislau Rezki
2025-02-28 17:08             ` Uladzislau Rezki
2025-02-28 18:25               ` Paul E. McKenney
2025-02-28 18:30                 ` Uladzislau Rezki
2025-02-28 18:21             ` Paul E. McKenney
2025-02-28 18:24               ` Uladzislau Rezki
2025-02-28 18:38                 ` Paul E. McKenney
2025-02-28 19:12                   ` Uladzislau Rezki
2025-02-28 19:59                     ` Paul E. McKenney
2025-03-01  1:08                       ` Paul E. McKenney
2025-03-02 10:19                         ` Uladzislau Rezki
2025-03-02 17:39                           ` Paul E. McKenney
2025-03-02 18:46                             ` Boqun Feng
2025-03-02 20:36                               ` Paul E. McKenney
2025-03-03 16:03                                 ` Uladzislau Rezki
2025-03-03  0:15                         ` Joel Fernandes
2025-03-03  0:17                           ` Joel Fernandes
2025-03-03 17:00                             ` Joel Fernandes [this message]
2025-03-03 17:07                               ` Boqun Feng
2025-03-03 17:30                                 ` Joel Fernandes
2025-03-03 17:59                                   ` Joel Fernandes
2025-03-03 18:55                                   ` Paul E. McKenney
2025-03-03 20:02                                     ` Joel Fernandes
2025-03-04  3:23                           ` Boqun Feng
2025-03-04 10:52                             ` Uladzislau Rezki
2025-03-04 10:56                               ` Uladzislau Rezki
2025-03-05  2:54                                 ` Boqun Feng
2025-03-05 15:37                                   ` Joel Fernandes
2025-03-05 15:24                             ` Joel Fernandes
2025-02-27 17:43     ` Uladzislau Rezki
2025-03-10  1:55   ` Joel Fernandes
2025-03-11 12:38     ` Uladzislau Rezki

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=20250303170040.GA31126@joelnvbox \
    --to=joelagnelf@nvidia.com \
    --cc=Neeraj.Upadhyay@amd.com \
    --cc=boqun.feng@gmail.com \
    --cc=frederic@kernel.org \
    --cc=joel@joelfernandes.org \
    --cc=linux-kernel@vger.kernel.org \
    --cc=oleksiy.avramchenko@sony.com \
    --cc=paulmck@kernel.org \
    --cc=rcu@vger.kernel.org \
    --cc=urezki@gmail.com \
    --cc=zzqq0103.hey@gmail.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link

Be sure your reply has a Subject: header at the top and a blank line before the message body.

This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).