From: Joel Fernandes <joelagnelf@nvidia.com>
To: "Paul E. McKenney" <paulmck@kernel.org>
Cc: Frederic Weisbecker <frederic@kernel.org>,
LKML <linux-kernel@vger.kernel.org>,
Boqun Feng <boqun.feng@gmail.com>,
Neeraj Upadhyay <neeraj.upadhyay@amd.com>,
Uladzislau Rezki <urezki@gmail.com>,
Zqiang <qiang.zhang1211@gmail.com>, rcu <rcu@vger.kernel.org>
Subject: Re: [PATCH 1/2] rcu: Comment on the extraneous delta test on rcu_seq_done_exact()
Date: Wed, 19 Mar 2025 15:38:31 -0400 [thread overview]
Message-ID: <20250319193831.GA3791727@joelnvbox> (raw)
In-Reply-To: <322b052d-0f1f-45a3-bfef-226b15f3a8fd@paulmck-laptop>
On Tue, Mar 18, 2025 at 11:37:38AM -0700, Paul E. McKenney wrote:
> On Tue, Mar 18, 2025 at 02:56:18PM +0100, Frederic Weisbecker wrote:
> > The numbers used in rcu_seq_done_exact() lack some explanation behind
> > their magic. Especially after the commit:
> >
> > 85aad7cc4178 ("rcu: Fix get_state_synchronize_rcu_full() GP-start detection")
> >
> > which reported a subtle issue where a new GP sequence snapshot was taken
> > on the root node state while a grace period had already been started and
> > reflected on the global state sequence but not yet on the root node
> > sequence, making a polling user waiting on a wrong already started grace
> > period that would ignore freshly online CPUs.
> >
> > The fix involved taking the snaphot on the global state sequence and
> > waiting on the root node sequence. And since a grace period is first
> > started on the global state and only afterward reflected on the root
> > node, a snapshot taken on the global state sequence might be two full
> > grace periods ahead of the root node as in the following example:
> >
> > rnp->gp_seq = rcu_state.gp_seq = 0
> >
> > CPU 0 CPU 1
> > ----- -----
> > // rcu_state.gp_seq = 1
> > rcu_seq_start(&rcu_state.gp_seq)
> > // snap = 8
> > snap = rcu_seq_snap(&rcu_state.gp_seq)
> > // Two full GP differences
> > rcu_seq_done_exact(&rnp->gp_seq, snap)
> > // rnp->gp_seq = 1
> > WRITE_ONCE(rnp->gp_seq, rcu_state.gp_seq);
> >
> > Add a comment about those expectations and to clarify the magic within
> > the relevant function.
> >
> > Signed-off-by: Frederic Weisbecker <frederic@kernel.org>
>
> Reviewed-by: Paul E. McKenney <paulmck@kernel.org>
>
> But it would of course be good to get reviews from the others.
I actually don't agree that the magic in the rcu_seq_done_exact() function about the
~2 GPs is related to the lag between rcu_state.gp_seq and root rnp->gp_seq,
because the small lag can just as well survive with the rcu_seq_done()
function in the above sequence right?
The rcu_seq_done_exact() function on the other hand is more about not being
stuck in the ULONG_MAX/2 guard band, but to actually get to that, you need a
wrap around to happen and the delta between "rnp->gp_seq" and "snap" to be at
least ULONG_MAX/2 AFAIU.
So the only time this magic will matter is if you have a huge delta between
what is being compared, not just 2 GPs.
Or, did I miss something?
(Also sorry about slow email responses this week as I had my presentation
today and was busy preparing this week and attending other presentations at
OSPM, I'll provide an update on that soon!).
thanks,
- Joel
>
> > ---
> > kernel/rcu/rcu.h | 7 +++++++
> > 1 file changed, 7 insertions(+)
> >
> > diff --git a/kernel/rcu/rcu.h b/kernel/rcu/rcu.h
> > index eed2951a4962..7acf1f36dd6c 100644
> > --- a/kernel/rcu/rcu.h
> > +++ b/kernel/rcu/rcu.h
> > @@ -157,6 +157,13 @@ static inline bool rcu_seq_done(unsigned long *sp, unsigned long s)
> > * Given a snapshot from rcu_seq_snap(), determine whether or not a
> > * full update-side operation has occurred, but do not allow the
> > * (ULONG_MAX / 2) safety-factor/guard-band.
> > + *
> > + * The token returned by get_state_synchronize_rcu_full() is based on
> > + * rcu_state.gp_seq but it is tested in poll_state_synchronize_rcu_full()
> > + * against the root rnp->gp_seq. Since rcu_seq_start() is first called
> > + * on rcu_state.gp_seq and only later reflected on the root rnp->gp_seq,
> > + * it is possible that rcu_seq_snap(rcu_state.gp_seq) returns 2 full grace
> > + * periods ahead of the root rnp->gp_seq.
> > */
> > static inline bool rcu_seq_done_exact(unsigned long *sp, unsigned long s)
> > {
> > --
> > 2.48.1
> >
next prev parent reply other threads:[~2025-03-19 19:38 UTC|newest]
Thread overview: 14+ messages / expand[flat|nested] mbox.gz Atom feed top
2025-03-18 13:56 [PATCH 0/2] rcu: Random updates Frederic Weisbecker
2025-03-18 13:56 ` [PATCH 1/2] rcu: Comment on the extraneous delta test on rcu_seq_done_exact() Frederic Weisbecker
2025-03-18 18:37 ` Paul E. McKenney
2025-03-19 19:38 ` Joel Fernandes [this message]
2025-03-20 14:15 ` Frederic Weisbecker
2025-03-22 2:06 ` Joel Fernandes
2025-03-22 10:25 ` Frederic Weisbecker
2025-03-22 14:20 ` Joel Fernandes
2025-03-22 14:40 ` Joel Fernandes
2025-03-23 22:05 ` Frederic Weisbecker
2025-03-23 22:57 ` Joel Fernandes
2025-03-18 13:56 ` [PATCH 2/2] rcu: Robustify rcu_is_cpu_rrupt_from_idle() Frederic Weisbecker
2025-03-22 17:00 ` Joel Fernandes
2025-03-23 22:31 ` Frederic Weisbecker
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20250319193831.GA3791727@joelnvbox \
--to=joelagnelf@nvidia.com \
--cc=boqun.feng@gmail.com \
--cc=frederic@kernel.org \
--cc=linux-kernel@vger.kernel.org \
--cc=neeraj.upadhyay@amd.com \
--cc=paulmck@kernel.org \
--cc=qiang.zhang1211@gmail.com \
--cc=rcu@vger.kernel.org \
--cc=urezki@gmail.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.