* [PATCH] Add fast little-endian switch system call
@ 2008-04-28 3:52 Paul Mackerras
2008-04-28 14:43 ` Christoph Hellwig
0 siblings, 1 reply; 7+ messages in thread
From: Paul Mackerras @ 2008-04-28 3:52 UTC (permalink / raw)
To: linuxppc-dev
This adds a system call on 64-bit platforms for switching between
little-endian and big-endian modes that is much faster than doing a
prctl call. This system call is handled as a special case right at
the start of the system call entry code, and because it is a special
case, it uses a system call number which is out of the range of
normal system calls, namely 0x1ebe.

Measurements with lmbench on a 4.2GHz POWER6 showed no measurable
change in the speed of normal system calls with this patch.

Switching endianness with this new system call takes around 60ns on a
4.2GHz POWER6, compared with around 300ns to switch endian mode with a
prctl. This can provide a significant performance advantage for
emulators for little-endian architectures that want to switch between
big-endian and little-endian mode frequently, e.g. because they are
generating instruction sequences on the fly and they want to run
those sequences in little-endian mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---
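To illustrate why generated little-endian sequences need the processor
in little-endian mode, here is a minimal sketch in C. It is not part of
the patch. It shows that the same four bytes in memory decode to
different 32-bit instruction words depending on the byte order used for
the fetch. The helper names load_be32 and load_le32 are hypothetical.

```c
#include <assert.h>
#include <stddef.h>
#include <stdint.h>

/* Hypothetical helper: assemble a 32-bit word, most significant byte first. */
static uint32_t load_be32(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[0] << 24) | ((uint32_t)p[1] << 16) |
	       ((uint32_t)p[2] << 8) | (uint32_t)p[3];
}

/* Hypothetical helper: assemble a 32-bit word, least significant byte first. */
static uint32_t load_le32(const uint8_t *p)
{
	assert(p != NULL);
	return ((uint32_t)p[3] << 24) | ((uint32_t)p[2] << 16) |
	       ((uint32_t)p[1] << 8) | (uint32_t)p[0];
}
```

For example, with the bytes { 0x44, 0x00, 0x00, 0x02 } (the sc
instruction stored in big-endian order), load_be32 gives 0x44000002 and
load_le32 gives 0x02000044, which is a different word entirely.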
diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 215973a..2eb49a7 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -239,6 +239,10 @@ instruction_access_slb_pSeries:
 	.globl system_call_pSeries
 system_call_pSeries:
 	HMT_MEDIUM
+BEGIN_FTR_SECTION
+	cmpdi	r0,0x1ebe
+	beq-	1f
+END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	mr	r9,r13
 	mfmsr	r10
 	mfspr	r13,SPRN_SPRG3
@@ -253,6 +257,13 @@ system_call_pSeries:
 	rfid
 	b	.	/* prevent speculative execution */
 
+/* Fast LE/BE switch system call */
+1:	mfspr	r12,SPRN_SRR1
+	xori	r12,r12,MSR_LE
+	mtspr	SPRN_SRR1,r12
+	rfid		/* return to userspace */
+	b	.
+
 	STD_EXCEPTION_PSERIES(0xd00, single_step)
 	STD_EXCEPTION_PSERIES(0xe00, trap_0e)
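The logic of the two hunks above can be modelled in plain C. This is
only a sketch for readers who do not read PowerPC assembly, not kernel
code: the function name fast_endian_switch and the macro
FAST_ENDIAN_SWITCH_NR are made up for illustration, and MSR_LE is
assumed to be the lowest bit of the MSR, as in the kernel headers.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAST_ENDIAN_SWITCH_NR	0x1ebeULL	/* magic number from the patch */
#define MSR_LE			0x1ULL		/* assumed: LE bit is MSR bit 0 */

/*
 * Model of the check at system_call_pSeries.  r0 is the system call
 * number and *srr1 is the saved user MSR.  Returns true if the fast
 * path handled the call, false if normal syscall entry should proceed.
 */
static bool fast_endian_switch(uint64_t r0, uint64_t *srr1)
{
	assert(srr1 != NULL);

	if (r0 != FAST_ENDIAN_SWITCH_NR)	/* cmpdi r0,0x1ebe ; beq- 1f */
		return false;

	*srr1 ^= MSR_LE;			/* xori r12,r12,MSR_LE */
	return true;				/* rfid reloads the MSR from SRR1 */
}
```

Because the bit is toggled rather than set, calling it twice returns
the saved MSR to its original value, so the same call switches in both
directions.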
* Re: [PATCH] Add fast little-endian switch system call
2008-04-28 3:52 [PATCH] Add fast little-endian switch system call Paul Mackerras
@ 2008-04-28 14:43 ` Christoph Hellwig
2008-04-28 15:42 ` Michael Kerrisk
2008-04-29 2:46 ` Paul Mackerras
0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2008-04-28 14:43 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linux-arch, linuxppc-dev, mtk.manpages
Please see Michael Kerrisk on userspace ABI updates. A nice little
manpage for this gimmick would be helpful, and might help other
platforms that want one as well to implement the same API.
On Mon, Apr 28, 2008 at 01:52:31PM +1000, Paul Mackerras wrote:
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 215973a..2eb49a7 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -239,6 +239,10 @@ instruction_access_slb_pSeries:
> .globl system_call_pSeries
> system_call_pSeries:
> HMT_MEDIUM
> +BEGIN_FTR_SECTION
> + cmpdi r0,0x1ebe
> + beq- 1f
> +END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
Am I missing something here or does this add a branch for every normal
syscall?
> mr r9,r13
> mfmsr r10
> mfspr r13,SPRN_SPRG3
> @@ -253,6 +257,13 @@ system_call_pSeries:
> rfid
> b . /* prevent speculative execution */
>
> +/* Fast LE/BE switch system call */
> +1: mfspr r12,SPRN_SRR1
> + xori r12,r12,MSR_LE
> + mtspr SPRN_SRR1,r12
> + rfid /* return to userspace */
> + b .
> +
> STD_EXCEPTION_PSERIES(0xd00, single_step)
> STD_EXCEPTION_PSERIES(0xe00, trap_0e)
>
* Re: [PATCH] Add fast little-endian switch system call
2008-04-28 14:43 ` Christoph Hellwig
@ 2008-04-28 15:42 ` Michael Kerrisk
2008-04-29 2:46 ` Paul Mackerras
1 sibling, 0 replies; 7+ messages in thread
From: Michael Kerrisk @ 2008-04-28 15:42 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-arch, linuxppc-dev, Paul Mackerras, mtk.manpages
On Mon, Apr 28, 2008 at 4:43 PM, Christoph Hellwig <hch@lst.de> wrote:
> Please see Michael Kerrisk on userspace ABI updates. A nice little
> manpage for this gimmick would be helpful, and might help other
> platforms that want one as well to implement the same API.
Thanks Christoph. I'm not on any of these lists at the moment.
Paul -- is this syscall definitely going in? Could you write a short
description for userland programmers? I'll do the grotty *roff stuff.
--
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html
* Re: [PATCH] Add fast little-endian switch system call
2008-04-28 14:43 ` Christoph Hellwig
2008-04-28 15:42 ` Michael Kerrisk
@ 2008-04-29 2:46 ` Paul Mackerras
2008-04-29 18:40 ` Wolfgang Denk
1 sibling, 1 reply; 7+ messages in thread
From: Paul Mackerras @ 2008-04-29 2:46 UTC (permalink / raw)
To: Christoph Hellwig; +Cc: linux-arch, linuxppc-dev, mtk.manpages
Christoph Hellwig writes:
> Am I missing something here or does this add a branch for every normal
> syscall?
It does, but the impact is so small as to be unmeasurable with
lmbench, even on the null syscall measurement. The overhead of the
easily-predicted not-taken branch is completely swamped by the amount
of time that the sc and rfid instructions take. I had it under a
config option at one point but then decided not to bother with that
when I couldn't measure any difference.
Paul.
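A rough back-of-the-envelope check of the claim above, as a sketch in
C. The 60ns and 4.2GHz figures come from the patch description; the
cost of one cycle for a correctly predicted not-taken branch is an
assumption made here for illustration and was not measured in this
thread. The function names are hypothetical.

```c
#include <assert.h>

/* Convert a duration in nanoseconds to clock cycles (1 GHz = 1 cycle/ns). */
static double ns_to_cycles(double ns, double ghz)
{
	assert(ghz > 0.0);
	return ns * ghz;
}

/* Fraction of a path of total_cycles that extra_cycles represents. */
static double overhead_fraction(double extra_cycles, double total_cycles)
{
	assert(total_cycles > 0.0);
	return extra_cycles / total_cycles;
}
```

ns_to_cycles(60, 4.2) is about 252 cycles for the fast switch, which is
little more than sc plus rfid, and ns_to_cycles(300, 4.2) is about 1260
cycles for the prctl. Under the one-cycle assumption,
overhead_fraction(1, 252) is roughly 0.4%, and any real system call
takes longer than that minimal path.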
* Re: [PATCH] Add fast little-endian switch system call
2008-04-29 2:46 ` Paul Mackerras
@ 2008-04-29 18:40 ` Wolfgang Denk
2008-04-29 18:46 ` Christoph Hellwig
2008-04-29 21:16 ` Paul Mackerras
0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Denk @ 2008-04-29 18:40 UTC (permalink / raw)
To: Paul Mackerras; +Cc: linux-arch, linuxppc-dev, Christoph Hellwig, mtk.manpages
In message <18454.35824.527711.355488@cargo.ozlabs.ibm.com> you wrote:
>
> > Am I missing something here or does this add a branch for every normal
> > syscall?
>
> It does, but the impact is so small as to be unmeasurable with
> lmbench, even on the null syscall measurement. The overhead of the
> easily-predicted not-taken branch is completely swamped by the amount
> of time that the sc and rfid instructions take. I had it under a
> config option at one point but then decided not to bother with that
> when I couldn't measure any difference.
This probably depends a bit on the performance of the system in
question. Did you measure it - for example - on a 50 MHz MPC850 ?
Best regards,
Wolfgang Denk
--
DENX Software Engineering GmbH, MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
To get something done, a committee should consist of no more than
three men, two of them absent.
* Re: [PATCH] Add fast little-endian switch system call
2008-04-29 18:40 ` Wolfgang Denk
@ 2008-04-29 18:46 ` Christoph Hellwig
2008-04-29 21:16 ` Paul Mackerras
1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2008-04-29 18:46 UTC (permalink / raw)
To: Wolfgang Denk
Cc: linux-arch, linuxppc-dev, Paul Mackerras, Christoph Hellwig,
mtk.manpages
On Tue, Apr 29, 2008 at 08:40:47PM +0200, Wolfgang Denk wrote:
> This probably depends a bit on the performance of the system in
> question. Did you measure it - for example - on a 50 MHz MPC850 ?
You got a 64-bit kernel to run on an MPC850? Wow :)
Not sure what the slowest supported 64-bit CPU is (RS64-II?), but Paul
might be right and in this case it really doesn't matter.
* Re: [PATCH] Add fast little-endian switch system call
2008-04-29 18:40 ` Wolfgang Denk
2008-04-29 18:46 ` Christoph Hellwig
@ 2008-04-29 21:16 ` Paul Mackerras
1 sibling, 0 replies; 7+ messages in thread
From: Paul Mackerras @ 2008-04-29 21:16 UTC (permalink / raw)
To: Wolfgang Denk; +Cc: linux-arch, linuxppc-dev, Christoph Hellwig, mtk.manpages
Wolfgang Denk writes:
> This probably depends a bit on the performance of the system in
> question. Did you measure it - for example - on a 50 MHz MPC850 ?
The patch only affects arch/powerpc/kernel/head_64.S. So no, I
didn't measure it on a 50MHz MPC850, or indeed any 32-bit system. :)
Paul.