From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Lai Jiangshan <laijs@cn.fujitsu.com>
Cc: linux-kernel@vger.kernel.org, mingo@elte.hu, dipankar@in.ibm.com,
akpm@linux-foundation.org, mathieu.desnoyers@polymtl.ca,
josh@joshtriplett.org, niv@us.ibm.com, tglx@linutronix.de,
peterz@infradead.org, rostedt@goodmis.org,
Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
eric.dumazet@gmail.com, darren@dvhart.com, fweisbec@gmail.com,
patches@linaro.org
Subject: Re: [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU implementation
Date: Mon, 20 Feb 2012 09:44:18 -0800 [thread overview]
Message-ID: <20120220174418.GI2470@linux.vnet.ibm.com> (raw)
In-Reply-To: <4F41F315.1040900@cn.fujitsu.com>
On Mon, Feb 20, 2012 at 03:15:33PM +0800, Lai Jiangshan wrote:
> On 02/13/2012 10:09 AM, Paul E. McKenney wrote:
>
> > /*
> > * Helper function for synchronize_srcu() and synchronize_srcu_expedited().
> > */
> > -static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
> > +static void __synchronize_srcu(struct srcu_struct *sp, bool expedited)
> > {
> > int idx;
> >
> > @@ -178,90 +316,51 @@ static void __synchronize_srcu(struct srcu_struct *sp, void (*sync_func)(void))
> > !lock_is_held(&rcu_sched_lock_map),
> > "Illegal synchronize_srcu() in same-type SRCU (or RCU) read-side critical section");
> >
> > - idx = sp->completed;
> > + idx = ACCESS_ONCE(sp->completed);
> > mutex_lock(&sp->mutex);
> >
> > /*
> > * Check to see if someone else did the work for us while we were
> > - * waiting to acquire the lock. We need -two- advances of
> > + * waiting to acquire the lock. We need -three- advances of
> > * the counter, not just one. If there was but one, we might have
> > * shown up -after- our helper's first synchronize_sched(), thus
> > * having failed to prevent CPU-reordering races with concurrent
> > - * srcu_read_unlock()s on other CPUs (see comment below). So we
> > - * either (1) wait for two or (2) supply the second ourselves.
> > + * srcu_read_unlock()s on other CPUs (see comment below). If there
> > + * was only two, we are guaranteed to have waited through only one
> > + * full index-flip phase. So we either (1) wait for three or
> > + * (2) supply the additional ones we need.
> > */
> >
> > - if ((sp->completed - idx) >= 2) {
> > + if (sp->completed == idx + 2)
> > + idx = 1;
> > + else if (sp->completed == idx + 3) {
> > mutex_unlock(&sp->mutex);
> > return;
> > - }
> > -
> > - sync_func(); /* Force memory barrier on all CPUs. */
> > + } else
> > + idx = 0;
>
>
> Hi, Paul
>
> I don't think this check-and-return path is needed since we will introduce call_srcu().
>
> We just need a correct code to show how it works and to be used for a while,
> and new call_srcu() will be implemented based on this correct code which will be removed.
Hello, Lai!
Yep, this code will be replaced with a state machine driven by callbacks.
> And I think this unneeded check-and-return path is incorrect. See the following:
>
> Reader Updater Helper thread
> old_ptr = rcu_ptr;
> /* rcu_ptr = NULL; but be reordered to (1) */
> start synchronize_srcu()
> idx = ACCESS_ONCE(sp->completed);(2)
> synchronize_srcu()
> synchronize_srcu()
> srcu_read_lock();
> old_ptr = rcu_ptr;
> rcu_ptr = NULL;(1)
> mutex_lock() and read sp->completed
> and return from synchronize_srcu()
> free(old_ptr);
> use freed old_ptr
> srcu_read_unlock();
>
>
> So, we need a smb_mb() between (1) and (2) to force the order.
>
> __synchronize_srcu() {
> smp_mb(); /* F */
> idx = ACCESS_ONCE(sp->completed); /* (2) */
And one here as well because mutex_lock() allows code to bleed in from
outside the critical section.
> ....
> }
Good catch! And shows the limitations of testing -- I hit this pretty
hard and didn't get a failure. I was focused so much on the complex
part of the patch that I failed to get the simple stuff right!!!
Shows the value of the Linux community's review processes, I guess. ;-)
> And this smp_mb() F is paired with helper's smp_mb() D. So if Updater sees X advances of
> ->completed, Updater must sees X times of *full* flip_and_wait(). So We need see -two- advances of
> ->completed from Helper only, not -three-.
Hmmm... Let's see... The case I was worried about is where the updater
samples ->completed just before it is incremented, then samples it again
just after it is incremented a second time. So you are arguing that this
cannot happen because the second sample occurs after acquiring the lock,
so that the second flip-and-wait cycle has to have already completed,
correct?
So waiting for three is appropriate for mutex_try_lock(), but is overly
conservative for mutex_lock().
> if (sp->completed == idx + 1)
> idx = 1;
> else if (sp->completed == idx + 2) {
> mutex_unlock(&sp->mutex);
> return;
> } else
> idx = 0;
>
>
> Or simpler:
>
> __synchronize_srcu() {
> unsigned int idx; /* <-------- unsigned */
>
> /* comments for smp_mb() F */
> smp_mb(); /* F */
> idx = ACCESS_ONCE(sp->completed);
>
> mutex_lock(&sp->mutex);
> idx = sp->completed - idx;
>
> /* original comments */
> for (; idx < 2; idx++)
> flip_idx_and_wait(sp, expedited);
> mutex_unlock(&sp->mutex);
> }
>
> At last, I can't understand the comments of this check-and-return path.
> So maybe the above reply and I are totally wrong.
I -think- you might be correct, but my approach is going to be to implement
call_srcu() which will eliminate this anyway.
> But the comments of this check-and-return path does not describe the code
> well(especially the order), and it contains the old "synchronize_sched()"
> which make me confuse.
The diffs are confusing -- I have to look at the actual code in this case.
> My conclusion, we can just remove the check-and-return path to reduce
> the complexity since we will introduce call_srcu().
If I actually submit the above upstream, that would be quite reasonable.
My thought is that patch remains RFC and the upstream version has
call_srcu().
> This new srcu is very great, especially the SRCU_USAGE_COUNT for every
> lock/unlock witch forces any increment/decrement pair changes the counter
> for me.
Glad you like it! ;-)
And thank you for your review and feedback!
Thanx, Paul
> Thanks,
> Lai
>
> --
> To unsubscribe from this list: send the line "unsubscribe linux-kernel" in
> the body of a message to majordomo@vger.kernel.org
> More majordomo info at http://vger.kernel.org/majordomo-info.html
> Please read the FAQ at http://www.tux.org/lkml/
>
next prev parent reply other threads:[~2012-02-20 17:44 UTC|newest]
Thread overview: 100+ messages / expand[flat|nested] mbox.gz Atom feed top
2012-02-13 2:09 [PATCH RFC tip/core/rcu] rcu: direct algorithmic SRCU implementation Paul E. McKenney
2012-02-15 12:59 ` Peter Zijlstra
2012-02-16 6:35 ` Paul E. McKenney
2012-02-16 10:50 ` Mathieu Desnoyers
2012-02-16 10:52 ` Peter Zijlstra
2012-02-16 11:14 ` Mathieu Desnoyers
2012-02-15 14:31 ` Mathieu Desnoyers
2012-02-15 14:51 ` Mathieu Desnoyers
2012-02-16 6:38 ` Paul E. McKenney
2012-02-16 11:00 ` Mathieu Desnoyers
2012-02-16 11:51 ` Peter Zijlstra
2012-02-16 12:18 ` Mathieu Desnoyers
2012-02-16 12:44 ` Peter Zijlstra
2012-02-16 14:52 ` Mathieu Desnoyers
2012-02-16 14:58 ` Peter Zijlstra
2012-02-16 15:13 ` Paul E. McKenney
2012-02-20 7:15 ` Lai Jiangshan
2012-02-20 17:44 ` Paul E. McKenney [this message]
2012-02-21 1:11 ` Lai Jiangshan
2012-02-21 1:50 ` Paul E. McKenney
2012-02-21 8:44 ` Lai Jiangshan
2012-02-21 17:24 ` Paul E. McKenney
2012-02-22 9:29 ` [PATCH 1/3 RFC paul/rcu/srcu] srcu: Remove fast check path Lai Jiangshan
2012-02-22 9:29 ` [PATCH 2/3 RFC paul/rcu/srcu] srcu: only increase the upper bit for srcu_read_lock() Lai Jiangshan
2012-02-22 9:50 ` Peter Zijlstra
2012-02-22 21:20 ` Paul E. McKenney
2012-02-22 21:26 ` Paul E. McKenney
2012-02-22 21:39 ` Steven Rostedt
2012-02-23 1:01 ` Paul E. McKenney
2012-02-22 9:29 ` [PATCH 3/3 RFC paul/rcu/srcu] srcu: flip only once for every grace period Lai Jiangshan
2012-02-23 1:01 ` Paul E. McKenney
2012-02-24 8:06 ` Lai Jiangshan
2012-02-24 20:01 ` Paul E. McKenney
2012-02-27 8:01 ` [PATCH 1/2 RFC] srcu: change the comments of the wait algorithm Lai Jiangshan
2012-02-27 8:01 ` [PATCH 2/2 RFC] srcu: implement Peter's checking algorithm Lai Jiangshan
2012-02-27 18:30 ` Paul E. McKenney
2012-02-28 1:51 ` Lai Jiangshan
2012-02-28 13:47 ` Paul E. McKenney
2012-02-29 10:07 ` Lai Jiangshan
2012-02-29 13:55 ` Paul E. McKenney
2012-03-01 2:31 ` Lai Jiangshan
2012-03-01 13:20 ` Paul E. McKenney
2012-03-10 3:41 ` Lai Jiangshan
2012-03-06 8:42 ` [RFC PATCH 0/6 paul/rcu/srcu] srcu: implement call_srcu() Lai Jiangshan
2012-03-06 9:57 ` [PATCH 1/6] remove unused srcu_barrier() Lai Jiangshan
2012-03-06 9:57 ` [PATCH 2/6] Don't touch the snap in srcu_readers_active() Lai Jiangshan
2012-03-08 19:14 ` Paul E. McKenney
2012-03-06 9:57 ` [PATCH 3/6] use "int trycount" instead of "bool expedited" Lai Jiangshan
2012-03-08 19:25 ` Paul E. McKenney
2012-03-06 9:57 ` [PATCH 4/6] remove flip_idx_and_wait() Lai Jiangshan
2012-03-06 10:41 ` Peter Zijlstra
2012-03-07 3:54 ` [RFC PATCH 5/5 single-thread-version] implement per-domain single-thread state machine call_srcu() Lai Jiangshan
2012-03-08 13:04 ` Peter Zijlstra
2012-03-08 14:17 ` Lai Jiangshan
2012-03-08 13:08 ` Peter Zijlstra
2012-03-08 20:35 ` Paul E. McKenney
2012-03-10 3:16 ` Lai Jiangshan
2012-03-12 18:03 ` Paul E. McKenney
2012-03-14 7:47 ` Lai Jiangshan
2012-04-10 20:15 ` Paul E. McKenney
2012-03-06 9:57 ` [RFC PATCH 5/6] implement per-cpu&per-domain " Lai Jiangshan
2012-03-06 10:47 ` Peter Zijlstra
2012-03-08 19:44 ` Paul E. McKenney
2012-03-06 10:58 ` Peter Zijlstra
2012-03-06 15:17 ` Lai Jiangshan
2012-03-06 15:38 ` Peter Zijlstra
2012-03-08 19:49 ` Paul E. McKenney
2012-03-10 10:12 ` Peter Zijlstra
2012-03-12 17:52 ` Paul E. McKenney
2012-03-06 11:16 ` Peter Zijlstra
2012-03-06 15:12 ` Lai Jiangshan
2012-03-06 15:34 ` Peter Zijlstra
2012-03-08 19:58 ` Paul E. McKenney
2012-03-10 3:32 ` Lai Jiangshan
2012-03-10 10:09 ` Peter Zijlstra
2012-03-12 17:54 ` Paul E. McKenney
2012-03-12 17:58 ` Peter Zijlstra
2012-03-12 18:32 ` Paul E. McKenney
2012-03-12 20:25 ` Peter Zijlstra
2012-03-12 23:15 ` Paul E. McKenney
2012-03-12 23:18 ` Peter Zijlstra
2012-03-12 23:38 ` Paul E. McKenney
2012-03-06 15:26 ` Lai Jiangshan
2012-03-06 15:37 ` Peter Zijlstra
2012-03-06 11:17 ` Peter Zijlstra
2012-03-06 11:22 ` Peter Zijlstra
2012-03-06 11:35 ` Peter Zijlstra
2012-03-06 11:36 ` Peter Zijlstra
2012-03-06 11:39 ` Peter Zijlstra
2012-03-06 14:50 ` Lai Jiangshan
2012-03-06 11:52 ` Peter Zijlstra
2012-03-06 14:44 ` Lai Jiangshan
2012-03-06 15:31 ` Peter Zijlstra
2012-03-06 15:32 ` Peter Zijlstra
2012-03-07 6:44 ` Lai Jiangshan
2012-03-07 8:10 ` Gilad Ben-Yossef
2012-03-07 9:21 ` Lai Jiangshan
2012-03-06 14:47 ` Lai Jiangshan
2012-03-06 9:57 ` [PATCH 6/6] add srcu torture test Lai Jiangshan
2012-03-08 19:03 ` [PATCH 1/6] remove unused srcu_barrier() Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20120220174418.GI2470@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=Valdis.Kletnieks@vt.edu \
--cc=akpm@linux-foundation.org \
--cc=darren@dvhart.com \
--cc=dhowells@redhat.com \
--cc=dipankar@in.ibm.com \
--cc=eric.dumazet@gmail.com \
--cc=fweisbec@gmail.com \
--cc=josh@joshtriplett.org \
--cc=laijs@cn.fujitsu.com \
--cc=linux-kernel@vger.kernel.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mingo@elte.hu \
--cc=niv@us.ibm.com \
--cc=patches@linaro.org \
--cc=peterz@infradead.org \
--cc=rostedt@goodmis.org \
--cc=tglx@linutronix.de \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.