* Recommended functions for accessing internal registers
@ 2009-12-01 16:44 Fortini Matteo
From: Fortini Matteo @ 2009-12-01 16:44 UTC (permalink / raw)
To: linuxppc-dev@lists.ozlabs.org
I see that throughout the kernel source, internal PPC registers are
accessed through [in|out]_be[32|16|8]() functions. However, they are
translated into 3 inline assembly instructions, one of which is an
isync, which has a huge performance hit.
I tried using readl_be() which seems to be the right function according
to the Documentation/ dir, but it is translated directly to in_be32(),
so no luck.
Is it really necessary to use all those instructions? I know I could use
a (volatile u32 *) variable to keep subsequent reads/writes from being
optimized out, but that seems to be deprecated.
Thank you in advance,
Matteo
* Re: Recommended functions for accessing internal registers
From: Benjamin Herrenschmidt @ 2009-12-02 20:57 UTC (permalink / raw)
To: Fortini Matteo; +Cc: linuxppc-dev@lists.ozlabs.org
On Tue, 2009-12-01 at 17:44 +0100, Fortini Matteo wrote:
> I see that throughout the kernel source, internal PPC registers are
> accessed through [in|out]_be[32|16|8]() functions. However, they are
> translated into 3 inline assembly instructions, one of which is an
> isync, which has a huge performance hit.
> I tried using readl_be() which seems to be the right function according
> to the Documentation/ dir, but it is translated directly to in_be32(),
> so no luck.
>
> Is it really necessary to use all those instructions? I know I could use
> a (volatile u32 *) variable to keep subsequent reads/writes from being
> optimized out, but that seems to be deprecated.
There are good reasons why the accessors contain those barriers. What
are you doing that is performance-critical enough for them to be
a problem?
Cheers,
Ben.
* Re: Recommended functions for accessing internal registers
From: Fortini Matteo @ 2009-12-03 11:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev@lists.ozlabs.org
I'm on an embedded system, so every resource counts.
The biggest impact is when writing to a communication/memory
access FIFO or when reading/writing configuration registers.
In these cases, I believe I just need to make sure that I/O is not
reordered and that subsequent reads/writes are not optimized away.
Should I switch to the deprecated "volatile" qualifier?
Thank you.
Cheers,
Matteo
* Re: Recommended functions for accessing internal registers
From: Benjamin Herrenschmidt @ 2009-12-03 21:10 UTC (permalink / raw)
To: Fortini Matteo; +Cc: linuxppc-dev@lists.ozlabs.org
On Thu, 2009-12-03 at 12:16 +0100, Fortini Matteo wrote:
> I'm on an embedded system, so every resource counts.
> The biggest impact is when writing to a communication/memory
> access FIFO or when reading/writing configuration registers.
> In these cases, I believe I just need to make sure that I/O is not
> reordered and that subsequent reads/writes are not optimized away.
>
> Should I switch to the deprecated "volatile" qualifier?
If it's a single-register FIFO, you can use the _outs*/_ins* functions or
the iomap.h variants (ioread*_rep()/iowrite*_rep()), which are cleaner.
If it's a linear region, look at memcpy_toio() and friends.
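The difference between the two cases can be sketched in plain C. This is a userspace illustration only, not the kernel's implementation: fifo_write32() and region_write32() are hypothetical names, the "device" is an ordinary variable, and no hardware barriers are issued. In a driver, iowrite32_rep() from iomap.h and memcpy_toio() would fill these roles.

```c
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: push `count` words into a single-register FIFO.
 * The destination address never changes; only the source pointer advances.
 * `volatile` keeps the compiler from merging or dropping the stores. */
static void fifo_write32(volatile uint32_t *fifo, const uint32_t *buf,
                         size_t count)
{
    while (count--)
        *fifo = *buf++;
}

/* Hypothetical helper: copy `count` words to a linear device region.
 * Here both the destination and the source pointers advance. */
static void region_write32(volatile uint32_t *dst, const uint32_t *buf,
                           size_t count)
{
    while (count--)
        *dst++ = *buf++;
}
```

With a three-word buffer, fifo_write32() leaves only the last word visible in the fake register, while region_write32() lays all three words out consecutively.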
You can always go directly poking at it but you need appropriate memory
barriers in and around your accesses.
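A rough sketch of what "poking at it directly" with barriers might look like, written as portable userspace C for GCC or Clang. The names reg_write32(), reg_read32(), compiler_barrier() and hw_barrier() are made up for this example, and the GCC builtin __sync_synchronize() is only a stand-in: in kernel code the appropriate primitive (mb(), rmb(), wmb() or an I/O-specific barrier) depends on the CPU and on how the region is mapped.

```c
#include <stdint.h>

/* Compiler-only barrier: stops the compiler from moving memory accesses
 * across this point. It emits no instruction. */
#define compiler_barrier() __asm__ __volatile__("" ::: "memory")

/* Assumed stand-in for a hardware memory barrier (full barrier builtin). */
#define hw_barrier() __sync_synchronize()

/* Hypothetical accessor: order earlier accesses before the store. */
static inline void reg_write32(volatile uint32_t *reg, uint32_t val)
{
    hw_barrier();
    *reg = val;              /* volatile: the store is always emitted */
    compiler_barrier();
}

/* Hypothetical accessor: order the load before later accesses. */
static inline uint32_t reg_read32(const volatile uint32_t *reg)
{
    uint32_t val;

    compiler_barrier();
    val = *reg;              /* volatile: the load is always emitted */
    hw_barrier();
    return val;
}
```

Note that volatile alone only constrains the compiler; it says nothing about what the CPU may reorder, which is why the barriers are still needed.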
The reason there's a twi/isync pair inside in_* is to ensure that the
read is actually performed immediately. Without it, the read could be
delayed until the CPU decides to "consume" the data, which could cause
timing issues when, for example, a read is followed by a delay. There
are a few other reasons why doing so is a good idea.
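To make the twi/isync idea concrete, here is a sketch modeled on the sequence described above. It is not copied from the kernel, whose actual inline assembly may differ; read32_now() is a hypothetical name, and on non-PowerPC compilers the function falls back to a volatile load plus a generic barrier so the example still builds.

```c
#include <stdint.h>

/* Hypothetical accessor: load a 32-bit register and do not return until
 * the data has actually arrived. */
static inline uint32_t read32_now(const volatile uint32_t *addr)
{
    uint32_t val;

#if defined(__powerpc__) || defined(__PPC__)
    __asm__ __volatile__(
        "sync\n\t"            /* order earlier accesses before the load */
        "lwz %0,0(%1)\n\t"    /* the load itself */
        "twi 0,%0,0\n\t"      /* never traps, but depends on the loaded value */
        "isync"               /* waits for the twi, hence for the load data */
        : "=r"(val)
        : "b"(addr)
        : "memory");
#else
    /* Assumption: no equivalent idiom here, so use a generic full barrier. */
    val = *addr;
    __sync_synchronize();
#endif
    return val;
}
```

The twi creates a data dependency on the loaded value, and the isync prevents later instructions from starting until the twi has completed, so a delay loop placed after the call cannot begin before the device has answered.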
Cheers,
Ben.