From: peterz@infradead.org (Peter Zijlstra)
To: linux-arm-kernel@lists.infradead.org
Subject: [BUG] 2.6.37-rc3 massive interactivity regression on ARM
Date: Fri, 10 Dec 2010 21:32:37 +0100
Message-ID: <1292013157.13513.69.camel@laptop>
In-Reply-To: <alpine.DEB.2.00.1012101410571.13986@router.home>
On Fri, 2010-12-10 at 14:23 -0600, Christoph Lameter wrote:
> On Fri, 10 Dec 2010, Peter Zijlstra wrote:
>
> > Its not about passing per-cpu pointers, its about passing long pointers.
> >
> > When I write:
> >
> > void foo(u64 *bla)
> > {
> > 	(*bla)++;
> > }
> >
> > DEFINE_PER_CPU(u64, plop);
> >
> > void bar(void)
> > {
> > 	foo(__this_cpu_ptr(&plop));
> > }
> >
> > I want gcc to emit the equivalent to:
> >
> > __this_cpu_inc(plop); /* incq %fs:(%0) */
> >
> > Now I guess the C type system will get in the way of this ever working,
> > since a long pointer would have a distinct type from a regular
> > pointer :/
> >
> > The idea is to use 'regular' functions with the per-cpu data in a
> > transparent manner so as not to have to replicate all logic.
>
> That would mean you would have to pass information in the pointer at
> runtime indicating that this particular pointer is a per cpu pointer.
>
> Code for the Itanium arch can do that because it has per cpu virtual
> mappings. So you define a virtual area for per cpu data and then map it
> differently for each processor. If we would have a different page table
> for each processor then we could avoid using segment register and do the
> same on x86.

I don't think it's a runtime issue, it's a compile-time issue. At compile
time the compiler can see the argument is a long pointer:
%fs:(addr,idx,size), and could propagate this into the caller.

The above example will compute the effective address by doing something
like:

	lea %fs:(addr,idx,size),%ebx

and will then do something like:

	inc (%ebx)

where it could easily have optimized this into:

	inc %fs:(addr,idx,size)

especially when foo would be inlined. If it's an actual call site you need
function overloading, because a long pointer has a different signature
from a regular pointer, and that is something C doesn't do.

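
To make the signature problem concrete, here is a minimal userspace sketch, not kernel code. Everything in it is a hypothetical stand-in: the areas[] array plays the per-cpu areas, current_cpu plays the %fs base, and this_cpu_ptr_sketch() plays the lea step. It only illustrates why a flat-pointer foo() and a base-plus-offset foo_percpu() end up as two copies of the same logic.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical per-cpu area: each CPU gets its own copy of plop. */
struct percpu_area {
	uint64_t plop;
};

static struct percpu_area areas[NR_CPUS];
static int current_cpu;	/* stand-in for the per-cpu segment base (%fs) */

/* A 'regular' function: it only understands flat pointers. */
static void foo(uint64_t *bla)
{
	(*bla)++;
}

/* The lea step: fold base and offset into one flat pointer. */
static uint64_t *this_cpu_ptr_sketch(size_t offset)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	return (uint64_t *)((char *)&areas[current_cpu] + offset);
}

/* Materialise the address first, then call the regular function. */
static void bar_flat(void)
{
	foo(this_cpu_ptr_sketch(offsetof(struct percpu_area, plop)));
}

/*
 * The 'long pointer' flavour: the argument is only an offset and the
 * base is applied at the access itself. Same logic as foo(), but a
 * different signature, so C forces us to write it a second time.
 */
static void foo_percpu(size_t offset)
{
	assert(current_cpu >= 0 && current_cpu < NR_CPUS);
	(*(uint64_t *)((char *)&areas[current_cpu] + offset))++;
}

static void bar_long(void)
{
	foo_percpu(offsetof(struct percpu_area, plop));
}
```

Calling bar_flat() and bar_long() with the same current_cpu bumps the same counter; the only difference is where the base gets added, which is the part one would like the compiler to fold into a single segment-prefixed instruction.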
> > > Seems that you do not have that use case in mind. So a seqlock restricted
> > > to a single processor? If so then you wont need any of those smp write
> > > barriers mentioned earlier. A simple compiler barrier() is sufficient.
> >
> > The seqcount is sometimes read by different CPUs, but I don't see why we
> > couldn't do what Eric suggested.
>
> But you would have to define a per cpu seqlock. Each cpu would have
> its own seqlock. Then you could have this_cpu_read_seqcount_begin and
> friends:
>
> Then you can do
>
> this_cpu_read_seqcount_begin(&bla)
>
Which, to me, seems to be exactly what Eric proposed.
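
For reference, a rough userspace sketch of a sequence count per CPU, written with C11 atomics rather than the kernel's seqcount API. The names (percpu_seq_sketch, sketch_update, sketch_read, slots[]) are made up for illustration, and it assumes a single writer per slot. The fences are there because the count may be read from other CPUs; if only the owning CPU ever looked at it, a compiler barrier would be enough, as noted in the quoted text above.

```c
#include <assert.h>
#include <stdatomic.h>
#include <stdint.h>

#define NR_CPUS 4

/* Hypothetical per-cpu slot: a sequence count plus the data it guards. */
struct percpu_seq_sketch {
	atomic_uint sequence;		/* odd while a write is in progress */
	_Atomic uint64_t value;		/* data protected by the count */
};

static struct percpu_seq_sketch slots[NR_CPUS];

/* Writer side; assumes only one writer ever touches a given slot. */
static void sketch_update(int cpu, uint64_t v)
{
	struct percpu_seq_sketch *s;
	unsigned int seq;

	assert(cpu >= 0 && cpu < NR_CPUS);
	s = &slots[cpu];

	seq = atomic_load_explicit(&s->sequence, memory_order_relaxed);
	/* Make the count odd, and order that before the data store. */
	atomic_store_explicit(&s->sequence, seq + 1, memory_order_relaxed);
	atomic_thread_fence(memory_order_release);
	atomic_store_explicit(&s->value, v, memory_order_relaxed);
	/* Make the count even again; release orders the data before it. */
	atomic_store_explicit(&s->sequence, seq + 2, memory_order_release);
}

/* Reader side; may run on any CPU, retries if it raced with a write. */
static uint64_t sketch_read(int cpu)
{
	struct percpu_seq_sketch *s;
	unsigned int start, end;
	uint64_t v;

	assert(cpu >= 0 && cpu < NR_CPUS);
	s = &slots[cpu];

	do {
		start = atomic_load_explicit(&s->sequence, memory_order_acquire);
		v = atomic_load_explicit(&s->value, memory_order_relaxed);
		/* Order the data load before re-reading the count. */
		atomic_thread_fence(memory_order_acquire);
		end = atomic_load_explicit(&s->sequence, memory_order_relaxed);
	} while ((start & 1) || start != end);

	return v;
}
```

Each slot has its own count, so a writer on one CPU never forces readers of another CPU's slot to retry.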
> But then this seemed to be a discussion related to ARM. ARM does not have
> optimized per cpu accesses.
Nah, there are multiple issues all nicely mangled into one thread ;-)