* Recommended functions for accessing internal registers
From: Fortini Matteo @ 2009-12-01 16:44 UTC
To: linuxppc-dev@lists.ozlabs.org

I see that throughout the kernel source, internal PPC registers are
accessed through the [in|out]_be[32|16|8]() functions. However, they are
translated into 3 inline assembly instructions, one of which is an
isync, which carries a huge performance hit.
I tried using readl_be(), which seems to be the right function according
to the Documentation/ dir, but it is translated directly to in_be32(),
so no luck.

Is it really necessary to use all those instructions? I know I could use
a (volatile u32 *) variable to keep subsequent reads/writes from being
optimized out, but that seems to be a deprecated use.

Thank you in advance,
Matteo
* Re: Recommended functions for accessing internal registers
From: Benjamin Herrenschmidt @ 2009-12-02 20:57 UTC
To: Fortini Matteo; +Cc: linuxppc-dev@lists.ozlabs.org

On Tue, 2009-12-01 at 17:44 +0100, Fortini Matteo wrote:
> I see that throughout the kernel source, internal PPC registers are
> accessed through the [in|out]_be[32|16|8]() functions. However, they are
> translated into 3 inline assembly instructions, one of which is an
> isync, which carries a huge performance hit.
> I tried using readl_be(), which seems to be the right function according
> to the Documentation/ dir, but it is translated directly to in_be32(),
> so no luck.
>
> Is it really necessary to use all those instructions? I know I could use
> a (volatile u32 *) variable to keep subsequent reads/writes from being
> optimized out, but that seems to be a deprecated use.

There are good reasons why the accessors contain those barriers. What
are you doing that would be performance-critical enough for those to be
a problem?

Cheers,
Ben.
* Re: Recommended functions for accessing internal registers
From: Fortini Matteo @ 2009-12-03 11:16 UTC
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev@lists.ozlabs.org

I'm on an embedded system, so every resource counts.
One of the biggest impacts is when writing to a communication/memory
access FIFO or reading/writing configurations.
In these cases, I believe I just need to make sure that there is no I/O
reordering and that subsequent reads/writes are not optimized away.

Should I switch to the deprecated "volatile" attribute?

Thank you.

Cheers,
Matteo

On 02/12/2009 21:57, Benjamin Herrenschmidt wrote:
> There are good reasons why the accessors contain those barriers. What
> are you doing that would be performance-critical enough for those to be
> a problem?
>
> Cheers,
> Ben.
* Re: Recommended functions for accessing internal registers
From: Benjamin Herrenschmidt @ 2009-12-03 21:10 UTC
To: Fortini Matteo; +Cc: linuxppc-dev@lists.ozlabs.org

On Thu, 2009-12-03 at 12:16 +0100, Fortini Matteo wrote:
> I'm on an embedded system, so every resource counts.
> One of the biggest impacts is when writing to a communication/memory
> access FIFO or reading/writing configurations.
> In these cases, I believe I just need to make sure that there is no I/O
> reordering and that subsequent reads/writes are not optimized away.
>
> Should I switch to the deprecated "volatile" attribute?

If it's a single-register FIFO, you can use the _outs/_ins functions or
the iomap.h variants, which are cleaner. If it's a linear region, look at
memcpy_to_io...

You can always go poking at it directly, but you need appropriate memory
barriers in and around your accesses.

The reason there's a twi/isync pair inside in_* is to ensure that the
read is actually performed immediately. Without it, the read could be
delayed until the CPU decides to "consume" the data, which could cause
timing issues when, for example, a read is followed by a delay.

There are a few other reasons why it's a good idea to do so.

Cheers,
Ben.
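Ben's advice for a single-register FIFO comes down to paying for the ordering barrier once per burst rather than once per word. What follows is a minimal userspace sketch of that idea under stated assumptions, not the kernel's implementation: `io_barrier`, `fifo_write_burst` and `fifo_read_burst` are hypothetical names, the fence is a GCC/Clang builtin standing in for whatever barrier instructions the real accessors emit, and words are moved in native byte order with no swapping. In a driver one would presumably call the existing helpers on an `__iomem` pointer (for example `iowrite32_rep()` from the iomap API) rather than open-code the loop.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical stand-in for the hardware barrier the kernel accessors issue.
 * A full fence orders the FIFO accesses against earlier and later memory
 * operations and also keeps the compiler from moving code across it. */
static inline void io_barrier(void)
{
	__atomic_thread_fence(__ATOMIC_SEQ_CST);
}

/* Push `count` words into one FIFO register: one barrier before the burst
 * and one after, instead of barriers around every single word. The volatile
 * access keeps the compiler from merging or dropping the repeated stores. */
static void fifo_write_burst(volatile uint32_t *fifo, const uint32_t *buf,
			     size_t count)
{
	assert(fifo != NULL && (buf != NULL || count == 0));

	io_barrier();
	for (size_t i = 0; i < count; i++)
		*fifo = buf[i];
	io_barrier();
}

/* Drain `count` words from one FIFO register with the same bracketing. */
static void fifo_read_burst(const volatile uint32_t *fifo, uint32_t *buf,
			    size_t count)
{
	assert(fifo != NULL && (buf != NULL || count == 0));

	io_barrier();
	for (size_t i = 0; i < count; i++)
		buf[i] = *fifo;
	io_barrier();
}
```

With an ordinary variable standing in for the register, a three-word burst simply leaves the last word in it; on real hardware each store would be consumed by the device. Note that the sketch does not reproduce the twi/isync behaviour described above: nothing here forces a read to complete before a following delay, so that guarantee still has to come from the real accessors or an explicit barrier.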