From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Michael Neuling <mikey@neuling.org>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
LKML <linux-kernel@vger.kernel.org>,
Oleg Nesterov <oleg@redhat.com>,
Linux PPC dev <linuxppc-dev@ozlabs.org>,
Anton Blanchard <anton@samba.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
Victor Kaplansky <VICTORK@il.ibm.com>
Subject: Re: perf events ring buffer memory barrier on powerpc
Date: Mon, 4 Nov 2013 02:00:43 -0800 [thread overview]
Message-ID: <20131104100042.GK3947@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131104090744.GE10651@twins.programming.kicks-ass.net>
On Mon, Nov 04, 2013 at 10:07:44AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2013 at 08:20:48AM -0700, Paul E. McKenney wrote:
> > On Fri, Nov 01, 2013 at 11:30:17AM +0100, Peter Zijlstra wrote:
> > > Furthermore there's a gazillion parallel userspace programs.
> >
> > Most of which have very unaggressive concurrency designs.
>
> pthread_mutex_t A, B;
>
> char data_A[x];
> int counter_B = 1;
>
> void funA(void)
> {
> pthread_mutex_lock(&A);
> memset(data_A, 0, sizeof(data_A));
> pthread_mutex_unlock(&A);
> }
>
> void funB(void)
> {
> pthread_mutex_lock(&B);
> counter_B++;
> pthread_mutex_unlock(&B);
> }
>
> void funC(void)
> {
> pthread_mutex_lock(&B)
> printf("%d\n", counter_B);
> pthread_mutex_unlock(&B);
> }
>
> Then run: funA, funB, funC concurrently, and end with a funC.
>
> Then explain to userman than his unaggressive program can return:
> 0
> 1
>
> Because the memset() thought it might be a cute idea to overwrite
> counter_B and fix it up 'later'. Which if I understood you right is
> valid in C/C++ :-(
>
> Not that any actual memset implementation exhibiting this trait wouldn't
> be shot on the spot.
Even without such a malicious memcpy() implementation I must still explain
about false sharing when the developer notices that the unaggressive
program isn't running as fast as expected.
> > > > By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> > > > And thus requiring that all accesses to it be decorated, which in the
> > > > case of RCU could be buried in the RCU accessors.
> > >
> > > This seems contradictory; marking it atomic would look like:
> > >
> > > struct foo {
> > > unsigned long value;
> > > __atomic void *ptr;
> > > unsigned long value1;
> > > };
> > >
> > > Clearly we cannot hide this definition in accessors, because then
> > > accesses to value* won't see the annotation.
> >
> > #define __rcu __atomic
>
> Yeah, except we don't use __rcu all that consistently; in fact I don't
> know if I ever added it.
There are more than 300 of them in the kernel. Plus sparse can be
convinced to yell at you if you don't use them. So lack of __rcu could
be fixed without too much trouble.
The C/C++11 need to annotate functions that take arguments or return
values taken from rcu_dereference() is another story. But the compilers
have to get significantly more aggressive or developers have to be doing
unusual things that result in rcu_dereference() returning something whose
value the compiler can predict exactly.
Thanx, Paul
WARNING: multiple messages have this Message-ID (diff)
From: "Paul E. McKenney" <paulmck@linux.vnet.ibm.com>
To: Peter Zijlstra <peterz@infradead.org>
Cc: Victor Kaplansky <VICTORK@il.ibm.com>,
Anton Blanchard <anton@samba.org>,
Benjamin Herrenschmidt <benh@kernel.crashing.org>,
Frederic Weisbecker <fweisbec@gmail.com>,
LKML <linux-kernel@vger.kernel.org>,
Linux PPC dev <linuxppc-dev@ozlabs.org>,
Mathieu Desnoyers <mathieu.desnoyers@polymtl.ca>,
Michael Ellerman <michael@ellerman.id.au>,
Michael Neuling <mikey@neuling.org>,
Oleg Nesterov <oleg@redhat.com>
Subject: Re: perf events ring buffer memory barrier on powerpc
Date: Mon, 4 Nov 2013 02:00:43 -0800 [thread overview]
Message-ID: <20131104100042.GK3947@linux.vnet.ibm.com> (raw)
In-Reply-To: <20131104090744.GE10651@twins.programming.kicks-ass.net>
On Mon, Nov 04, 2013 at 10:07:44AM +0100, Peter Zijlstra wrote:
> On Sat, Nov 02, 2013 at 08:20:48AM -0700, Paul E. McKenney wrote:
> > On Fri, Nov 01, 2013 at 11:30:17AM +0100, Peter Zijlstra wrote:
> > > Furthermore there's a gazillion parallel userspace programs.
> >
> > Most of which have very unaggressive concurrency designs.
>
> pthread_mutex_t A, B;
>
> char data_A[x];
> int counter_B = 1;
>
> void funA(void)
> {
> pthread_mutex_lock(&A);
> memset(data_A, 0, sizeof(data_A));
> pthread_mutex_unlock(&A);
> }
>
> void funB(void)
> {
> pthread_mutex_lock(&B);
> counter_B++;
> pthread_mutex_unlock(&B);
> }
>
> void funC(void)
> {
> pthread_mutex_lock(&B)
> printf("%d\n", counter_B);
> pthread_mutex_unlock(&B);
> }
>
> Then run: funA, funB, funC concurrently, and end with a funC.
>
> Then explain to userman than his unaggressive program can return:
> 0
> 1
>
> Because the memset() thought it might be a cute idea to overwrite
> counter_B and fix it up 'later'. Which if I understood you right is
> valid in C/C++ :-(
>
> Not that any actual memset implementation exhibiting this trait wouldn't
> be shot on the spot.
Even without such a malicious memcpy() implementation I must still explain
about false sharing when the developer notices that the unaggressive
program isn't running as fast as expected.
> > > > By marking "ptr" as atomic, thus telling the compiler not to mess with it.
> > > > And thus requiring that all accesses to it be decorated, which in the
> > > > case of RCU could be buried in the RCU accessors.
> > >
> > > This seems contradictory; marking it atomic would look like:
> > >
> > > struct foo {
> > > unsigned long value;
> > > __atomic void *ptr;
> > > unsigned long value1;
> > > };
> > >
> > > Clearly we cannot hide this definition in accessors, because then
> > > accesses to value* won't see the annotation.
> >
> > #define __rcu __atomic
>
> Yeah, except we don't use __rcu all that consistently; in fact I don't
> know if I ever added it.
There are more than 300 of them in the kernel. Plus sparse can be
convinced to yell at you if you don't use them. So lack of __rcu could
be fixed without too much trouble.
The C/C++11 need to annotate functions that take arguments or return
values taken from rcu_dereference() is another story. But the compilers
have to get significantly more aggressive or developers have to be doing
unusual things that result in rcu_dereference() returning something whose
value the compiler can predict exactly.
Thanx, Paul
next prev parent reply other threads:[~2013-11-04 10:07 UTC|newest]
Thread overview: 215+ messages / expand[flat|nested] mbox.gz Atom feed top
2013-10-22 23:54 perf events ring buffer memory barrier on powerpc Michael Neuling
2013-10-23 7:39 ` Victor Kaplansky
2013-10-23 7:39 ` Victor Kaplansky
2013-10-23 14:19 ` Frederic Weisbecker
2013-10-23 14:19 ` Frederic Weisbecker
2013-10-23 14:25 ` Frederic Weisbecker
2013-10-23 14:25 ` Frederic Weisbecker
2013-10-25 17:37 ` Peter Zijlstra
2013-10-25 17:37 ` Peter Zijlstra
2013-10-25 20:31 ` Michael Neuling
2013-10-25 20:31 ` Michael Neuling
2013-10-27 9:00 ` Victor Kaplansky
2013-10-27 9:00 ` Victor Kaplansky
2013-10-28 9:22 ` Peter Zijlstra
2013-10-28 9:22 ` Peter Zijlstra
2013-10-28 10:02 ` Frederic Weisbecker
2013-10-28 10:02 ` Frederic Weisbecker
2013-10-28 12:38 ` Victor Kaplansky
2013-10-28 12:38 ` Victor Kaplansky
2013-10-28 13:26 ` Peter Zijlstra
2013-10-28 13:26 ` Peter Zijlstra
2013-10-28 16:34 ` Paul E. McKenney
2013-10-28 16:34 ` Paul E. McKenney
2013-10-28 20:17 ` Oleg Nesterov
2013-10-28 20:17 ` Oleg Nesterov
2013-10-28 20:58 ` Victor Kaplansky
2013-10-28 20:58 ` Victor Kaplansky
2013-10-29 10:21 ` Peter Zijlstra
2013-10-29 10:21 ` Peter Zijlstra
2013-10-29 10:30 ` Peter Zijlstra
2013-10-29 10:30 ` Peter Zijlstra
2013-10-29 10:35 ` Peter Zijlstra
2013-10-29 10:35 ` Peter Zijlstra
2013-10-29 20:15 ` Oleg Nesterov
2013-10-29 20:15 ` Oleg Nesterov
2013-10-29 19:27 ` Vince Weaver
2013-10-29 19:27 ` Vince Weaver
2013-10-30 10:42 ` Peter Zijlstra
2013-10-30 10:42 ` Peter Zijlstra
2013-10-30 11:48 ` James Hogan
2013-10-30 11:48 ` James Hogan
2013-10-30 12:48 ` Peter Zijlstra
2013-10-30 12:48 ` Peter Zijlstra
2013-11-06 13:19 ` [tip:perf/core] tools/perf: Add required memory barriers tip-bot for Peter Zijlstra
2013-11-06 13:50 ` Vince Weaver
2013-11-06 14:00 ` Peter Zijlstra
2013-11-06 14:28 ` Peter Zijlstra
2013-11-06 14:55 ` Vince Weaver
2013-11-06 15:10 ` Peter Zijlstra
2013-11-06 15:23 ` Peter Zijlstra
2013-11-06 14:44 ` Peter Zijlstra
2013-11-06 16:07 ` Peter Zijlstra
2013-11-06 17:31 ` Vince Weaver
2013-11-06 18:24 ` Peter Zijlstra
2013-11-07 8:21 ` Ingo Molnar
2013-11-07 14:27 ` Vince Weaver
2013-11-07 15:55 ` Ingo Molnar
2013-11-11 16:24 ` Peter Zijlstra
2013-11-11 21:10 ` Ingo Molnar
2013-10-29 21:23 ` perf events ring buffer memory barrier on powerpc Michael Neuling
2013-10-29 21:23 ` Michael Neuling
2013-10-30 9:27 ` Paul E. McKenney
2013-10-30 9:27 ` Paul E. McKenney
2013-10-30 11:25 ` Peter Zijlstra
2013-10-30 11:25 ` Peter Zijlstra
2013-10-30 14:52 ` Victor Kaplansky
2013-10-30 14:52 ` Victor Kaplansky
2013-10-30 15:39 ` Peter Zijlstra
2013-10-30 15:39 ` Peter Zijlstra
2013-10-30 17:14 ` Victor Kaplansky
2013-10-30 17:14 ` Victor Kaplansky
2013-10-30 17:44 ` Peter Zijlstra
2013-10-30 17:44 ` Peter Zijlstra
2013-10-31 6:16 ` Paul E. McKenney
2013-10-31 6:16 ` Paul E. McKenney
2013-11-01 13:12 ` Victor Kaplansky
2013-11-01 13:12 ` Victor Kaplansky
2013-11-02 16:36 ` Paul E. McKenney
2013-11-02 16:36 ` Paul E. McKenney
2013-11-02 17:26 ` Paul E. McKenney
2013-11-02 17:26 ` Paul E. McKenney
2013-10-31 6:40 ` Paul E. McKenney
2013-10-31 6:40 ` Paul E. McKenney
2013-11-01 14:25 ` Victor Kaplansky
2013-11-01 14:25 ` Victor Kaplansky
2013-11-02 17:28 ` Paul E. McKenney
2013-11-02 17:28 ` Paul E. McKenney
2013-11-01 14:56 ` Peter Zijlstra
2013-11-01 14:56 ` Peter Zijlstra
2013-11-02 17:32 ` Paul E. McKenney
2013-11-02 17:32 ` Paul E. McKenney
2013-11-03 14:40 ` Paul E. McKenney
2013-11-03 14:40 ` Paul E. McKenney
2013-11-03 15:17 ` [RFC] arch: Introduce new TSO memory barrier smp_tmb() Peter Zijlstra
2013-11-03 15:17 ` Peter Zijlstra
2013-11-03 18:08 ` Linus Torvalds
2013-11-03 18:08 ` Linus Torvalds
2013-11-03 20:01 ` Peter Zijlstra
2013-11-03 20:01 ` Peter Zijlstra
2013-11-03 22:42 ` Paul E. McKenney
2013-11-03 22:42 ` Paul E. McKenney
2013-11-03 23:34 ` Linus Torvalds
2013-11-03 23:34 ` Linus Torvalds
2013-11-04 10:51 ` Paul E. McKenney
2013-11-04 10:51 ` Paul E. McKenney
2013-11-04 11:22 ` Peter Zijlstra
2013-11-04 11:22 ` Peter Zijlstra
2013-11-04 16:27 ` Paul E. McKenney
2013-11-04 16:27 ` Paul E. McKenney
2013-11-04 16:48 ` Peter Zijlstra
2013-11-04 16:48 ` Peter Zijlstra
2013-11-04 19:11 ` Peter Zijlstra
2013-11-04 19:11 ` Peter Zijlstra
2013-11-04 19:18 ` Peter Zijlstra
2013-11-04 19:18 ` Peter Zijlstra
2013-11-04 20:54 ` Paul E. McKenney
2013-11-04 20:54 ` Paul E. McKenney
2013-11-04 20:53 ` Paul E. McKenney
2013-11-04 20:53 ` Paul E. McKenney
2013-11-05 14:05 ` Will Deacon
2013-11-05 14:05 ` Will Deacon
2013-11-05 14:49 ` Paul E. McKenney
2013-11-05 14:49 ` Paul E. McKenney
2013-11-05 18:49 ` Peter Zijlstra
2013-11-05 18:49 ` Peter Zijlstra
2013-11-06 11:00 ` Will Deacon
2013-11-06 11:00 ` Will Deacon
2013-11-06 12:39 ` Peter Zijlstra
2013-11-06 12:39 ` Peter Zijlstra
2013-11-06 12:51 ` Geert Uytterhoeven
2013-11-06 12:51 ` Geert Uytterhoeven
2013-11-06 13:57 ` Peter Zijlstra
2013-11-06 13:57 ` Peter Zijlstra
2013-11-06 18:48 ` Paul E. McKenney
2013-11-06 18:48 ` Paul E. McKenney
2013-11-06 19:42 ` Peter Zijlstra
2013-11-06 19:42 ` Peter Zijlstra
2013-11-07 11:17 ` Will Deacon
2013-11-07 11:17 ` Will Deacon
2013-11-07 13:36 ` Peter Zijlstra
2013-11-07 13:36 ` Peter Zijlstra
2013-11-07 23:50 ` Mathieu Desnoyers
2013-11-07 23:50 ` Mathieu Desnoyers
2013-11-04 11:05 ` Will Deacon
2013-11-04 11:05 ` Will Deacon
2013-11-04 16:34 ` Paul E. McKenney
2013-11-04 16:34 ` Paul E. McKenney
2013-11-03 20:59 ` Benjamin Herrenschmidt
2013-11-03 20:59 ` Benjamin Herrenschmidt
2013-11-03 22:43 ` Paul E. McKenney
2013-11-03 22:43 ` Paul E. McKenney
2013-11-03 17:07 ` perf events ring buffer memory barrier on powerpc Will Deacon
2013-11-03 22:47 ` Paul E. McKenney
2013-11-04 9:57 ` Will Deacon
2013-11-04 10:52 ` Paul E. McKenney
2013-11-01 16:11 ` Peter Zijlstra
2013-11-01 16:11 ` Peter Zijlstra
2013-11-02 17:46 ` Paul E. McKenney
2013-11-02 17:46 ` Paul E. McKenney
2013-11-01 16:18 ` Peter Zijlstra
2013-11-01 16:18 ` Peter Zijlstra
2013-11-02 17:49 ` Paul E. McKenney
2013-11-02 17:49 ` Paul E. McKenney
2013-10-30 13:28 ` Victor Kaplansky
2013-10-30 13:28 ` Victor Kaplansky
2013-10-30 15:51 ` Peter Zijlstra
2013-10-30 15:51 ` Peter Zijlstra
2013-10-30 18:29 ` Peter Zijlstra
2013-10-30 18:29 ` Peter Zijlstra
2013-10-30 19:11 ` Peter Zijlstra
2013-10-30 19:11 ` Peter Zijlstra
2013-10-31 4:33 ` Paul E. McKenney
2013-10-31 4:33 ` Paul E. McKenney
2013-10-31 4:32 ` Paul E. McKenney
2013-10-31 4:32 ` Paul E. McKenney
2013-10-31 9:04 ` Peter Zijlstra
2013-10-31 9:04 ` Peter Zijlstra
2013-10-31 15:07 ` Paul E. McKenney
2013-10-31 15:07 ` Paul E. McKenney
2013-10-31 15:19 ` Peter Zijlstra
2013-10-31 15:19 ` Peter Zijlstra
2013-11-01 9:28 ` Paul E. McKenney
2013-11-01 9:28 ` Paul E. McKenney
2013-11-01 10:30 ` Peter Zijlstra
2013-11-01 10:30 ` Peter Zijlstra
2013-11-02 15:20 ` Paul E. McKenney
2013-11-02 15:20 ` Paul E. McKenney
2013-11-04 9:07 ` Peter Zijlstra
2013-11-04 9:07 ` Peter Zijlstra
2013-11-04 10:00 ` Paul E. McKenney [this message]
2013-11-04 10:00 ` Paul E. McKenney
2013-10-31 9:59 ` Victor Kaplansky
2013-10-31 9:59 ` Victor Kaplansky
2013-10-31 12:28 ` David Laight
2013-10-31 12:28 ` David Laight
2013-10-31 12:55 ` Victor Kaplansky
2013-10-31 12:55 ` Victor Kaplansky
2013-10-31 15:25 ` Paul E. McKenney
2013-10-31 15:25 ` Paul E. McKenney
2013-11-01 16:06 ` Victor Kaplansky
2013-11-01 16:06 ` Victor Kaplansky
2013-11-01 16:25 ` David Laight
2013-11-01 16:25 ` David Laight
2013-11-01 16:30 ` Victor Kaplansky
2013-11-01 16:30 ` Victor Kaplansky
2013-11-03 20:57 ` Benjamin Herrenschmidt
2013-11-03 20:57 ` Benjamin Herrenschmidt
2013-11-02 15:46 ` Paul E. McKenney
2013-11-02 15:46 ` Paul E. McKenney
2013-10-28 19:09 ` Oleg Nesterov
2013-10-28 19:09 ` Oleg Nesterov
2013-10-29 14:06 ` [tip:perf/urgent] perf: Fix perf ring buffer memory ordering tip-bot for Peter Zijlstra
-- strict thread matches above, loose matches on Subject: below --
2014-05-08 20:46 perf events ring buffer memory barrier on powerpc Mikulas Patocka
[not found] ` <OF667059AA.7F151BCC-ONC2257CD3.0036CFEB-C2257CD3.003BBF01@il.ibm.com>
2014-05-09 12:20 ` Mikulas Patocka
2014-05-09 13:47 ` Paul E. McKenney
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=20131104100042.GK3947@linux.vnet.ibm.com \
--to=paulmck@linux.vnet.ibm.com \
--cc=VICTORK@il.ibm.com \
--cc=anton@samba.org \
--cc=fweisbec@gmail.com \
--cc=linux-kernel@vger.kernel.org \
--cc=linuxppc-dev@ozlabs.org \
--cc=mathieu.desnoyers@polymtl.ca \
--cc=mikey@neuling.org \
--cc=oleg@redhat.com \
--cc=peterz@infradead.org \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.