From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, rostedt@goodmis.org,
Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
patches@linaro.org
Subject: Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU implementation
Date: Mon, 20 Feb 2012 09:44:18 -0800
Message-ID: <20120220174418.GI2470@linux.vnet.ibm.com>
In-Reply-To: <4F41F315.1040900@cn.fujitsu.com>
On Mon, Feb 20, 2012 at 03:15:33PM +0800, Lai Jiangshan wrote:
> On 02/13/2012 10:09 AM, Paul E. McKenney wrote:
>
> > /*
> > * Helper function for synchronize_srcu() and synchronize_srcu_expedited().
> > */
> > -static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
> > +static void __synchronize_srcu(struct srcu_struct *sp, bool expedited)
> > {
> > int idx;
> >
> > @@ -178,90 +316,51 @@ static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
> > !lock_is_held(&rcu_sched_lock_map),
> > "Illegal synchronize_srcu() in same-type SRCU (or RCU) read-side critical section");
> >
> > - idx = sp->completed;
> > + idx = ACCESS_ONCE(sp->completed);
> > mutex_lock(&sp->mutex);
> >
> > /*
> > * Check to see if someone else did the work for us while we were
> > - * waiting to acquire the lock. We need -two- advances of
> > + * waiting to acquire the lock. We need -three- advances of
> > * the counter, not just one. If there was but one, we might have
> > * shown up -after- our helper's first synchronize_sched(), thus
> > * having failed to prevent CPU-reordering races with concurrent
> > - * srcu_read_unlock()s on other CPUs (see comment below). So we
> > - * either (1) wait for two or (2) supply the second ourselves.
> > + * srcu_read_unlock()s on other CPUs (see comment below). If there
> > + * was only two, we are guaranteed to have waited through only one
> > + * full index-flip phase. So we either (1) wait for three or
> > + * (2) supply the additional ones we need.
> > */
> >
> > - if ((sp->completed - idx) >= 2) {
> > + if (sp->completed == idx + 2)
> > + idx = 1;
> > + else if (sp->completed == idx + 3) {
> > mutex_unlock(&sp->mutex);
> > return;
> > - }
> > -
> > - sync_func(); /* Force memory barrier on all CPUs. */
> > + } else
> > + idx = 0;
>
>
> Hi, Paul
>
> I don't think this check-and-return path is needed since we will introduce call_srcu().
>
> We just need correct code that shows how it works and can be used for a while;
> the new call_srcu() will be implemented based on this correct code, which will
> then be removed.
Hello, Lai!
Yep, this code will be replaced with a state machine driven by callbacks.
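Roughly speaking, an updater would then queue a callback instead of waiting in
place, something like the sketch below.  (This is only an illustration: the
names are made up, and I am assuming call_srcu() ends up with the same shape
as call_rcu(), which this patch does not yet provide.)

        struct foo {
                struct rcu_head rcu;
                /* ... payload ... */
        };

        static void free_foo_cb(struct rcu_head *head)
        {
                kfree(container_of(head, struct foo, rcu));
        }

        /* Updater: unpublish, then let the SRCU state machine reclaim later. */
        old = rcu_dereference_protected(global_foo, lockdep_is_held(&update_lock));
        rcu_assign_pointer(global_foo, NULL);
        call_srcu(&my_srcu, &old->rcu, free_foo_cb);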
> And I think this unneeded check-and-return path is incorrect. See the following:
>
> Reader                    Updater                                  Helper thread
>                           old_ptr = rcu_ptr;
>                           /* rcu_ptr = NULL; but be reordered to (1) */
>                           start synchronize_srcu()
>                           idx = ACCESS_ONCE(sp->completed); (2)
>                                                                    synchronize_srcu()
>                                                                    synchronize_srcu()
> srcu_read_lock();
> old_ptr = rcu_ptr;
>                           rcu_ptr = NULL; (1)
>                           mutex_lock() and read sp->completed
>                           and return from synchronize_srcu()
>                           free(old_ptr);
> use freed old_ptr
> srcu_read_unlock();
>
>
> So, we need an smp_mb() between (1) and (2) to force the order.
>
> __synchronize_srcu() {
>         smp_mb(); /* F */
>         idx = ACCESS_ONCE(sp->completed); /* (2) */
And one here as well because mutex_lock() allows code to bleed in from
outside the critical section.
>         ....
> }
Good catch! And shows the limitations of testing -- I hit this pretty
hard and didn't get a failure. I was focused so much on the complex
part of the patch that I failed to get the simple stuff right!!!
Shows the value of the Linux community's review processes, I guess. ;-)
> And this smp_mb() F is paired with the helper's smp_mb() D. So if the Updater sees X advances of
> ->completed, the Updater must have seen X *full* flip_and_wait() cycles. So we only need to see -two-
> advances of ->completed from the Helper, not -three-.
Hmmm... Let's see... The case I was worried about is where the updater
samples ->completed just before it is incremented, then samples it again
just after it is incremented a second time. So you are arguing that this
cannot happen because the second sample occurs after acquiring the lock,
so that the second flip-and-wait cycle has to have already completed,
correct?
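If I sketch that argument as a timeline (my reading of it, so please correct
me if I have it wrong):

        /*
         * Updater:  idx = ACCESS_ONCE(sp->completed);   reads N
         * Helper:   flip-and-wait #1 completes, ->completed == N + 1
         * Helper:   flip-and-wait #2 completes, ->completed == N + 2
         * Helper:   mutex_unlock(&sp->mutex);
         * Updater:  mutex_lock(&sp->mutex);
         * Updater:  sees sp->completed == idx + 2
         *
         * Because the second look at ->completed happens only after the
         * mutex is acquired, the helper's second flip-and-wait cycle must
         * already be complete in its entirety, so two observed advances
         * suffice here.
         */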
So waiting for three is appropriate for mutex_trylock(), but is overly
conservative for mutex_lock().
>         if (sp->completed == idx + 1)
>                 idx = 1;
>         else if (sp->completed == idx + 2) {
>                 mutex_unlock(&sp->mutex);
>                 return;
>         } else
>                 idx = 0;
>
>
> Or simpler:
>
> __synchronize_srcu() {
>         unsigned int idx;   /* <-------- unsigned */
>
>         /* comments for smp_mb() F */
>         smp_mb(); /* F */
>         idx = ACCESS_ONCE(sp->completed);
>
>         mutex_lock(&sp->mutex);
>         idx = sp->completed - idx;
>
>         /* original comments */
>         for (; idx < 2; idx++)
>                 flip_idx_and_wait(sp, expedited);
>         mutex_unlock(&sp->mutex);
> }
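(Side note for other readers: I take the "unsigned" annotation above to be
about wraparound; a quick illustration, mine rather than Lai's:)

        /*
         * With unsigned arithmetic the difference is well defined even if
         * ->completed wraps between the sample and the check:
         *
         *      unsigned int before = UINT_MAX;  sampled ->completed
         *      unsigned int after  = 1;         ->completed after two flips
         *      after - before == 2;             wraps around correctly
         *
         * Signed overflow, by contrast, would be undefined behavior.
         */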
>
> Finally, I can't understand the comments of this check-and-return path,
> so maybe my reply above is totally wrong.
I -think- you might be correct, but my approach is going to be to implement
call_srcu(), which will eliminate this anyway.
> But the comments on this check-and-return path do not describe the code
> well (especially the ordering), and they still mention the old
> "synchronize_sched()", which confuses me.
The diffs are confusing -- I have to look at the actual code in this case.
> My conclusion is that we can just remove the check-and-return path to reduce
> the complexity, since we will be introducing call_srcu().
If I actually submit the above upstream, that would be quite reasonable.
My thought is that this patch remains an RFC and the upstream version has
call_srcu().
> This new SRCU is great, especially the SRCU_USAGE_COUNT added on every
> lock/unlock, which forces any increment/decrement pair to change the
> counter.
Glad you like it! ;-)
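For anyone else reading along, here is a toy model of why that trick matters
(emphatically not the real per-CPU code, just the idea): a balanced
lock/unlock pair leaves the reader count unchanged, but the usage bits still
advance, so the grace-period code can tell "no activity" apart from "a reader
came and went".

        #define TOY_USAGE_COUNT (1UL << (BITS_PER_LONG - 2))

        static unsigned long toy_counter;  /* stand-in for the per-CPU c[idx] */

        static void toy_read_lock(void)
        {
                toy_counter += TOY_USAGE_COUNT + 1;  /* +1 reader, +1 usage tick */
        }

        static void toy_read_unlock(void)
        {
                toy_counter += TOY_USAGE_COUNT - 1;  /* -1 reader, +1 usage tick */
        }

        /*
         * A lock()/unlock() pair nets out to +2*TOY_USAGE_COUNT: the low-order
         * reader count is back where it started, but the counter as a whole
         * has changed, so a sampling updater cannot mistake "a reader slipped
         * in and out" for "nothing happened at all".
         */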
And thank you for your review and feedback!
Thanx, Paul
> Thanks,
> Lai
>
Thread overview: 100+ messages
2012-02-13 2:09 [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU implementation Paul E. McKenney
2012-02-15 12:59 ` Peter Zijlstra
2012-02-16 6:35 ` Paul E. McKenney
2012-02-16 10:50 ` Mathieu Desnoyers
2012-02-16 10:52 ` Peter Zijlstra
2012-02-16 11:14 ` Mathieu Desnoyers
2012-02-15 14:31 ` Mathieu Desnoyers
2012-02-15 14:51 ` Mathieu Desnoyers
2012-02-16 6:38 ` Paul E. McKenney
2012-02-16 11:00 ` Mathieu Desnoyers
2012-02-16 11:51 ` Peter Zijlstra
2012-02-16 12:18 ` Mathieu Desnoyers
2012-02-16 12:44 ` Peter Zijlstra
2012-02-16 14:52 ` Mathieu Desnoyers
2012-02-16 14:58 ` Peter Zijlstra
2012-02-16 15:13 ` Paul E. McKenney
2012-02-20 7:15 ` Lai Jiangshan
2012-02-20 17:44 ` Paul E. McKenney [this message]
2012-02-21 1:11 ` Lai Jiangshan
2012-02-21 1:50 ` Paul E. McKenney
2012-02-21 8:44 ` Lai Jiangshan
2012-02-21 17:24 ` Paul E. McKenney
2012-02-22 9:29 ` [PATCH 1/3 RFC paul/rcu/srcu] srcu: Remove fast check path Lai Jiangshan
2012-02-22 9:29 ` [PATCH 2/3 RFC paul/rcu/srcu] srcu: only increase the upper bit for srcu_read_lock() Lai Jiangshan
2012-02-22 9:50 ` Peter Zijlstra
2012-02-22 21:20 ` Paul E. McKenney
2012-02-22 21:26 ` Paul E. McKenney
2012-02-22 21:39 ` Steven Rostedt
2012-02-23 1:01 ` Paul E. McKenney
2012-02-22 9:29 ` [PATCH 3/3 RFC paul/rcu/srcu] srcu: flip only once for every grace period Lai Jiangshan
2012-02-23 1:01 ` Paul E. McKenney
2012-02-24 8:06 ` Lai Jiangshan
2012-02-24 20:01 ` Paul E. McKenney
2012-02-27 8:01 ` [PATCH 1/2 RFC] srcu: change the comments of the wait algorithm Lai Jiangshan
2012-02-27 8:01 ` [PATCH 2/2 RFC] srcu: implement Peter's checking algorithm Lai Jiangshan
2012-02-27 18:30 ` Paul E. McKenney
2012-02-28 1:51 ` Lai Jiangshan
2012-02-28 13:47 ` Paul E. McKenney
2012-02-29 10:07 ` Lai Jiangshan
2012-02-29 13:55 ` Paul E. McKenney
2012-03-01 2:31 ` Lai Jiangshan
2012-03-01 13:20 ` Paul E. McKenney
2012-03-10 3:41 ` Lai Jiangshan
2012-03-06 8:42 ` [RFC PATCH 0/6 paul/rcu/srcu] srcu: implement call_srcu() Lai Jiangshan
2012-03-06 9:57 ` [PATCH 1/6] remove unused srcu_barrier() Lai Jiangshan
2012-03-06 9:57 ` [PATCH 2/6] Don't touch the snap in srcu_readers_active() Lai Jiangshan
2012-03-08 19:14 ` Paul E. McKenney
2012-03-06 9:57 ` [PATCH 3/6] use "int trycount" instead of "bool expedited" Lai Jiangshan
2012-03-08 19:25 ` Paul E. McKenney
2012-03-06 9:57 ` [PATCH 4/6] remove flip_idx_and_wait() Lai Jiangshan
2012-03-06 10:41 ` Peter Zijlstra
2012-03-07 3:54 ` [RFC PATCH 5/5 single-thread-version] implement per-domain single-thread state machine call_srcu() Lai Jiangshan
2012-03-08 13:04 ` Peter Zijlstra
2012-03-08 14:17 ` Lai Jiangshan
2012-03-08 13:08 ` Peter Zijlstra
2012-03-08 20:35 ` Paul E. McKenney
2012-03-10 3:16 ` Lai Jiangshan
2012-03-12 18:03 ` Paul E. McKenney
2012-03-14 7:47 ` Lai Jiangshan
2012-04-10 20:15 ` Paul E. McKenney
2012-03-06 9:57 ` [RFC PATCH 5/6] implement per-cpu&per-domain " Lai Jiangshan
2012-03-06 10:47 ` Peter Zijlstra
2012-03-08 19:44 ` Paul E. McKenney
2012-03-06 10:58 ` Peter Zijlstra
2012-03-06 15:17 ` Lai Jiangshan
2012-03-06 15:38 ` Peter Zijlstra
2012-03-08 19:49 ` Paul E. McKenney
2012-03-10 10:12 ` Peter Zijlstra
2012-03-12 17:52 ` Paul E. McKenney
2012-03-06 11:16 ` Peter Zijlstra
2012-03-06 15:12 ` Lai Jiangshan
2012-03-06 15:34 ` Peter Zijlstra
2012-03-08 19:58 ` Paul E. McKenney
2012-03-10 3:32 ` Lai Jiangshan
2012-03-10 10:09 ` Peter Zijlstra
2012-03-12 17:54 ` Paul E. McKenney
2012-03-12 17:58 ` Peter Zijlstra
2012-03-12 18:32 ` Paul E. McKenney
2012-03-12 20:25 ` Peter Zijlstra
2012-03-12 23:15 ` Paul E. McKenney
2012-03-12 23:18 ` Peter Zijlstra
2012-03-12 23:38 ` Paul E. McKenney
2012-03-06 15:26 ` Lai Jiangshan
2012-03-06 15:37 ` Peter Zijlstra
2012-03-06 11:17 ` Peter Zijlstra
2012-03-06 11:22 ` Peter Zijlstra
2012-03-06 11:35 ` Peter Zijlstra
2012-03-06 11:36 ` Peter Zijlstra
2012-03-06 11:39 ` Peter Zijlstra
2012-03-06 14:50 ` Lai Jiangshan
2012-03-06 11:52 ` Peter Zijlstra
2012-03-06 14:44 ` Lai Jiangshan
2012-03-06 15:31 ` Peter Zijlstra
2012-03-06 15:32 ` Peter Zijlstra
2012-03-07 6:44 ` Lai Jiangshan
2012-03-07 8:10 ` Gilad Ben-Yossef
2012-03-07 9:21 ` Lai Jiangshan
2012-03-06 14:47 ` Lai Jiangshan
2012-03-06 9:57 ` [PATCH 6/6] add srcu torture test Lai Jiangshan
2012-03-08 19:03 ` [PATCH 1/6] remove unused srcu_barrier() Paul E. McKenney