From: Steven Rostedt <rostedt@goodmis.org>
To: Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>
Cc: paulmck@linux.vnet.ibm.com, Oleg Nesterov <oleg@redhat.com>,
Peter Zijlstra <peterz@infradead.org>,
linux-kernel@vger.kernel.org, Ingo Molnar <mingo@elte.hu>,
akpm@linux-foundation.org, josh@joshtriplett.org,
tglx@linutronix.de, Valdis.Kletnieks@vt.edu, dhowells@redhat.com,
laijs@cn.fujitsu.com, dipankar@in.ibm.com
Subject: Re: [RFC PATCH] introduce sys_membarrier(): process-wide memory barrier
Date: Sun, 10 Jan 2010 16:02:01 -0500
Message-ID: <1263157321.4561.25.camel@frodo>
In-Reply-To: <20100110171020.GA13329@Krystal>
On Sun, 2010-01-10 at 12:10 -0500, Mathieu Desnoyers wrote:
> >
> >    CPU 0                                  CPU 1
> >    ----------                             --------------
> >                                           obj = list->obj;
> >    <user space>
> >    rcu_read_lock();
> >    obj = rcu_dereference(list->obj);
> >    obj->foo = bar;
> >
> >    <preempt>
> >    <kernel space>
> >
> >    schedule();
> >    cpumask_clear(mm_cpumask, cpu);
> >
> >                                           sys_membarrier();
> >                                           free(obj);
> >
> >    <store to obj->foo goes to memory>     <- corruption
> >
>
> Hrm, having a writer like this in an rcu read-side would be a bit weird.
> We have to look at the actual rcu_read_lock() implementation in urcu to
> see why load/stores are important on the rcu read-side.
>
No, it is not weird; it is common. The read is on the linked list that we
can access. Yes, a write should be protected by other locks, so maybe
that is the weird part.
> (note: _STORE_SHARED is simply a volatile store)
>
> (Thread-local variable, shared with the thread doing synchronize_rcu())
> struct urcu_reader __thread urcu_reader;
>
> static inline void _rcu_read_lock(void)
> {
>         long tmp;
>
>         tmp = urcu_reader.ctr;
>         if (likely(!(tmp & RCU_GP_CTR_NEST_MASK))) {
>                 _STORE_SHARED(urcu_reader.ctr, _LOAD_SHARED(urcu_gp_ctr));
>                 /*
>                  * Set active readers count for outermost nesting level before
>                  * accessing the pointer. See force_mb_all_threads().
>                  */
>                 barrier();
>         } else {
>                 _STORE_SHARED(urcu_reader.ctr, tmp + RCU_GP_COUNT);
>         }
> }
>
> So as you see here, we have to ensure that the store to urcu_reader.ctr
> is globally visible before entering the critical section (previous
> stores must complete before following loads). For rcu_read_unlock, it's
> the opposite:
>
> static inline void _rcu_read_unlock(void)
> {
>         long tmp;
>
>         tmp = urcu_reader.ctr;
>         /*
>          * Finish using rcu before decrementing the pointer.
>          * See force_mb_all_threads().
>          */
>         if (likely((tmp & RCU_GP_CTR_NEST_MASK) == RCU_GP_COUNT)) {
>                 barrier();
>                 _STORE_SHARED(urcu_reader.ctr, urcu_reader.ctr - RCU_GP_COUNT);
>         } else {
>                 _STORE_SHARED(urcu_reader.ctr, urcu_reader.ctr - RCU_GP_COUNT);
>         }
> }
Thanks for the insight into the code. I need to get around to looking at
your userspace implementation ;-)
>
> We need to ensure that previous loads complete before following stores.
>
> Therefore, here is the race with unlock, showing that we need to order
> loads before stores:
>
>    CPU 0                                  CPU 1
>    --------------                         --------------
>    <user space> (already in read-side C.S.)
>    obj = rcu_dereference(list->next);
>        -> load list->next
>    copy = obj->foo;
>    rcu_read_unlock();
>        -> store to urcu_reader.ctr
>    <urcu_reader.ctr store is globally visible>
>                                           list_del(obj);
>    <preempt>
>    <kernel space>
>
>    schedule();
>    cpumask_clear(mm_cpumask, cpu);
But here we are switching to a new task.
>
>                                           sys_membarrier();
>                                               set global g.p. (urcu_gp_ctr) phase to 1
>                                               wait for all urcu_reader.ctr in phase 0
>                                               set global g.p. (urcu_gp_ctr) phase to 0
>                                               wait for all urcu_reader.ctr in phase 1
>                                           sys_membarrier();
>                                           free(obj);
>    <list->next load hits memory>
>    <obj->foo load hits memory>            <- corruption
The load of obj->foo is really a load of obj->foo into some register.
For the above to fail, that load would have to happen even after we
switched to kernel space, and still be pending when the thread stack
saved that register.

But I'm sure Paul will point me to some arch that does this ;-)
>
> >
> > So, if there's no smp_wmb() between the <preempt> and cpumask_clear()
> > then we have an issue?
>
> Considering the scenario above, we would need a full smp_mb() (or
> equivalent) rather than just smp_wmb() to be strictly correct.
I agree with Paul: we should just punt and grab the rq locks. That seems
to be the safest way, without resorting to funny tricks to save 15% on a
slow path.
-- Steve