linuxppc-dev.lists.ozlabs.org archive mirror
* [PATCH] Add fast little-endian switch system call
@ 2008-04-28  3:52 Paul Mackerras
  2008-04-28 14:43 ` Christoph Hellwig
  0 siblings, 1 reply; 7+ messages in thread
From: Paul Mackerras @ 2008-04-28  3:52 UTC (permalink / raw)
  To: linuxppc-dev

This adds a system call on 64-bit platforms for switching between
little-endian and big-endian modes that is much faster than doing a
prctl call.  This system call is handled as a special case right at
the start of the system call entry code, and because it is a special
case, it uses a system call number which is out of the range of
normal system calls, namely 0x1ebe.

Measurements with lmbench on a 4.2GHz POWER6 showed no measurable
change in the speed of normal system calls with this patch.

Switching endianness with this new system call takes around 60ns on a
4.2GHz POWER6, compared with around 300ns to switch endian mode with a
prctl.  This can provide a significant performance advantage for
emulators for little-endian architectures that want to switch between
big-endian and little-endian mode frequently, e.g. because they are
generating instruction sequences on the fly and they want to run
those sequences in little-endian mode.

Signed-off-by: Paul Mackerras <paulus@samba.org>
---

diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
index 215973a..2eb49a7 100644
--- a/arch/powerpc/kernel/head_64.S
+++ b/arch/powerpc/kernel/head_64.S
@@ -239,6 +239,10 @@ instruction_access_slb_pSeries:
 	.globl	system_call_pSeries
 system_call_pSeries:
 	HMT_MEDIUM
+BEGIN_FTR_SECTION
+	cmpdi	r0,0x1ebe
+	beq-	1f
+END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)
 	mr	r9,r13
 	mfmsr	r10
 	mfspr	r13,SPRN_SPRG3
@@ -253,6 +257,13 @@ system_call_pSeries:
 	rfid
 	b	.	/* prevent speculative execution */
 
+/* Fast LE/BE switch system call */
+1:	mfspr	r12,SPRN_SRR1
+	xori	r12,r12,MSR_LE
+	mtspr	SPRN_SRR1,r12
+	rfid		/* return to userspace */
+	b	.
+
 	STD_EXCEPTION_PSERIES(0xd00, single_step)
 	STD_EXCEPTION_PSERIES(0xe00, trap_0e)
 

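As an illustration only, the check that the patch adds at system_call_pSeries can be modelled in C. This is a sketch, not kernel code: the struct and function names are hypothetical, MSR_LE is assumed to be bit 0 of the MSR as on 64-bit PowerPC, and the cpu_has_real_le flag stands in for the CPU_FTR_REAL_LE feature section.

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>
#include <stdint.h>

#define FAST_ENDIAN_SWITCH_NR 0x1ebeULL /* magic syscall number from the patch */
#define MSR_LE                0x1ULL    /* assumed: little-endian bit of the MSR */

/* Hypothetical model of the two registers the fast path looks at. */
struct syscall_entry_state {
    uint64_t r0;   /* system call number supplied by userspace */
    uint64_t srr1; /* MSR image that rfid restores on return to userspace */
};

/*
 * Model of the fast path. Returns true when the call was handled here
 * (the endian bit in SRR1 was flipped), false when the normal system
 * call entry code should run instead.
 */
bool fast_endian_switch(struct syscall_entry_state *st, bool cpu_has_real_le)
{
    assert(st != NULL);

    /* Without CPU_FTR_REAL_LE the feature section is patched out. */
    if (!cpu_has_real_le)
        return false;

    /* cmpdi r0,0x1ebe ; beq- 1f */
    if (st->r0 != FAST_ENDIAN_SWITCH_NR)
        return false;

    /* mfspr r12,SRR1 ; xori r12,r12,MSR_LE ; mtspr SRR1,r12 */
    st->srr1 ^= MSR_LE;
    return true;
}
```

Because the operation is an XOR, the same call toggles in both directions: issuing it twice returns the task to the endian mode it started in.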

* Re: [PATCH] Add fast little-endian switch system call
  2008-04-28  3:52 [PATCH] Add fast little-endian switch system call Paul Mackerras
@ 2008-04-28 14:43 ` Christoph Hellwig
  2008-04-28 15:42   ` Michael Kerrisk
  2008-04-29  2:46   ` Paul Mackerras
  0 siblings, 2 replies; 7+ messages in thread
From: Christoph Hellwig @ 2008-04-28 14:43 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-arch, linuxppc-dev, mtk.manpages

Please see Michael Kerrisk on userspace ABI updates.  A nice little
manpage for this gimmick would be helpful, and maybe help other
platforms that want one as well to implement the same API.

On Mon, Apr 28, 2008 at 01:52:31PM +1000, Paul Mackerras wrote:
> This adds a system call on 64-bit platforms for switching between
> little-endian and big-endian modes that is much faster than doing a
> prctl call.  This system call is handled as a special case right at
> the start of the system call entry code, and because it is a special
> case, it uses a system call number which is out of the range of
> normal system calls, namely 0x1ebe.
> 
> Measurements with lmbench on a 4.2GHz POWER6 showed no measurable
> change in the speed of normal system calls with this patch.
> 
> Switching endianness with this new system call takes around 60ns on a
> 4.2GHz POWER6, compared with around 300ns to switch endian mode with a
> prctl.  This can provide a significant performance advantage for
> emulators for little-endian architectures that want to switch between
> big-endian and little-endian mode frequently, e.g. because they are
> generating instruction sequences on the fly and they want to run
> those sequences in little-endian mode.
> 
> Signed-off-by: Paul Mackerras <paulus@samba.org>
> ---
> 
> diff --git a/arch/powerpc/kernel/head_64.S b/arch/powerpc/kernel/head_64.S
> index 215973a..2eb49a7 100644
> --- a/arch/powerpc/kernel/head_64.S
> +++ b/arch/powerpc/kernel/head_64.S
> @@ -239,6 +239,10 @@ instruction_access_slb_pSeries:
>  	.globl	system_call_pSeries
>  system_call_pSeries:
>  	HMT_MEDIUM
> +BEGIN_FTR_SECTION
> +	cmpdi	r0,0x1ebe
> +	beq-	1f
> +END_FTR_SECTION_IFSET(CPU_FTR_REAL_LE)

Am I missing something here or does this add a branch for every normal
syscall?

>  	mr	r9,r13
>  	mfmsr	r10
>  	mfspr	r13,SPRN_SPRG3
> @@ -253,6 +257,13 @@ system_call_pSeries:
>  	rfid
>  	b	.	/* prevent speculative execution */
>  
> +/* Fast LE/BE switch system call */
> +1:	mfspr	r12,SPRN_SRR1
> +	xori	r12,r12,MSR_LE
> +	mtspr	SPRN_SRR1,r12
> +	rfid		/* return to userspace */
> +	b	.
> +
>  	STD_EXCEPTION_PSERIES(0xd00, single_step)
>  	STD_EXCEPTION_PSERIES(0xe00, trap_0e)
>  


* Re: [PATCH] Add fast little-endian switch system call
  2008-04-28 14:43 ` Christoph Hellwig
@ 2008-04-28 15:42   ` Michael Kerrisk
  2008-04-29  2:46   ` Paul Mackerras
  1 sibling, 0 replies; 7+ messages in thread
From: Michael Kerrisk @ 2008-04-28 15:42 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-arch, linuxppc-dev, Paul Mackerras, mtk.manpages

On Mon, Apr 28, 2008 at 4:43 PM, Christoph Hellwig <hch@lst.de> wrote:
> Please see Michael Kerrisk on userspace ABI updates.  A nice little
> manpage for this gimmick would be helpful, and maybe help other
> platforms that want one as well to implement the same API.

Thanks, Christoph.  I'm not on any of these lists at the moment.

Paul -- is this syscall definitely going in?  Could you write a short
description for userland programmers?  I'll do the grotty *roff stuff.

-- 
Michael Kerrisk
Linux man-pages maintainer; http://www.kernel.org/doc/man-pages/
Found a bug? http://www.kernel.org/doc/man-pages/reporting_bugs.html

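For comparison, the slower interface that the patch description refers to is the prctl endian control. A minimal userland sketch follows, assuming the PR_GET_ENDIAN and PR_SET_ENDIAN options and the PR_ENDIAN_* values from <sys/prctl.h>; the two wrapper function names are hypothetical. On architectures that do not implement these options the calls are expected to fail, which the wrappers report as a negative errno value.

```c
#include <assert.h>
#include <errno.h>
#include <stddef.h>
#include <sys/prctl.h>

/*
 * Query the endian mode of the calling process.
 * Returns 0 and stores a PR_ENDIAN_* value in *mode on success,
 * or a negative errno value if the kernel rejects the request.
 */
int get_endian_mode(int *mode)
{
    assert(mode != NULL);
    if (prctl(PR_GET_ENDIAN, (unsigned long)mode, 0, 0, 0) == -1)
        return -errno;
    return 0;
}

/*
 * Request a new endian mode (PR_ENDIAN_BIG, PR_ENDIAN_LITTLE or
 * PR_ENDIAN_PPC_LITTLE). Returns 0 on success or a negative errno.
 * Code running after a successful switch must be encoded for the
 * new mode, so this is only safe from carefully prepared callers.
 */
int set_endian_mode(int mode)
{
    if (prctl(PR_SET_ENDIAN, (unsigned long)mode, 0, 0, 0) == -1)
        return -errno;
    return 0;
}
```

Unlike the fast switch, this path goes through the full system call entry and exit code, which is consistent with the roughly 300ns figure quoted in the patch description.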

* Re: [PATCH] Add fast little-endian switch system call
  2008-04-28 14:43 ` Christoph Hellwig
  2008-04-28 15:42   ` Michael Kerrisk
@ 2008-04-29  2:46   ` Paul Mackerras
  2008-04-29 18:40     ` Wolfgang Denk
  1 sibling, 1 reply; 7+ messages in thread
From: Paul Mackerras @ 2008-04-29  2:46 UTC (permalink / raw)
  To: Christoph Hellwig; +Cc: linux-arch, linuxppc-dev, mtk.manpages

Christoph Hellwig writes:

> Am I missing something here or does this add a branch for every normal
> syscall?

It does, but the impact is so small as to be unmeasurable with
lmbench, even on the null syscall measurement.  The overhead of the
easily-predicted not-taken branch is completely swamped by the amount
of time that the sc and rfid instructions take.  I had it under a
config option at one point but then decided not to bother with that
when I couldn't measure any difference.

Paul.

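The kind of null-syscall measurement discussed above can be approximated with a short timing loop. The sketch below is not lmbench's code; it is a hypothetical stand-in that times a trivial system call (getppid, chosen here as an assumption) with CLOCK_MONOTONIC and reports the average cost per call.

```c
#ifndef _GNU_SOURCE
#define _GNU_SOURCE /* for the syscall() declaration */
#endif
#include <assert.h>
#include <sys/syscall.h>
#include <time.h>
#include <unistd.h>

/* Average cost, in nanoseconds, of one trivial system call. */
double null_syscall_ns(long iterations)
{
    struct timespec start, end;

    assert(iterations > 0);

    clock_gettime(CLOCK_MONOTONIC, &start);
    for (long i = 0; i < iterations; i++)
        syscall(SYS_getppid); /* raw syscall, so libc cannot cache the result */
    clock_gettime(CLOCK_MONOTONIC, &end);

    double elapsed_ns = (double)(end.tv_sec - start.tv_sec) * 1e9
                      + (double)(end.tv_nsec - start.tv_nsec);
    return elapsed_ns / (double)iterations;
}
```

Running such a loop on kernels with and without the patch, and comparing the averages over many iterations, is one way to check whether the extra compare and not-taken branch is visible above measurement noise.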

* Re: [PATCH] Add fast little-endian switch system call
  2008-04-29  2:46   ` Paul Mackerras
@ 2008-04-29 18:40     ` Wolfgang Denk
  2008-04-29 18:46       ` Christoph Hellwig
  2008-04-29 21:16       ` Paul Mackerras
  0 siblings, 2 replies; 7+ messages in thread
From: Wolfgang Denk @ 2008-04-29 18:40 UTC (permalink / raw)
  To: Paul Mackerras; +Cc: linux-arch, linuxppc-dev, Christoph Hellwig, mtk.manpages

In message <18454.35824.527711.355488@cargo.ozlabs.ibm.com> you wrote:
> 
> > Am I missing something here or does this add a branch for every normal
> > syscall?
> 
> It does, but the impact is so small as to be unmeasurable with
> lmbench, even on the null syscall measurement.  The overhead of the
> easily-predicted not-taken branch is completely swamped by the amount
> of time that the sc and rfid instructions take.  I had it under a
> config option at one point but then decided not to bother with that
> when I couldn't measure any difference.

This probably depends a bit on the performance of the system in
question. Did you measure it, for example, on a 50 MHz MPC850?

Best regards,

Wolfgang Denk

-- 
DENX Software Engineering GmbH,     MD: Wolfgang Denk & Detlev Zundel
HRB 165235 Munich, Office: Kirchenstr.5, D-82194 Groebenzell, Germany
Phone: (+49)-8142-66989-10 Fax: (+49)-8142-66989-80 Email: wd@denx.de
To get something done, a committee should consist of no more than
three men, two of them absent.


* Re: [PATCH] Add fast little-endian switch system call
  2008-04-29 18:40     ` Wolfgang Denk
@ 2008-04-29 18:46       ` Christoph Hellwig
  2008-04-29 21:16       ` Paul Mackerras
  1 sibling, 0 replies; 7+ messages in thread
From: Christoph Hellwig @ 2008-04-29 18:46 UTC (permalink / raw)
  To: Wolfgang Denk
  Cc: linux-arch, linuxppc-dev, Paul Mackerras, Christoph Hellwig,
	mtk.manpages

On Tue, Apr 29, 2008 at 08:40:47PM +0200, Wolfgang Denk wrote:
> This probably depends a bit on the performance of the system in
> question. Did you measure it, for example, on a 50 MHz MPC850?

You got a 64-bit kernel to run on an MPC850? Wow :)

Not sure what the slowest supported 64-bit CPU is (RS64-II?), but Paul
might be right and in this case it really doesn't matter.


* Re: [PATCH] Add fast little-endian switch system call
  2008-04-29 18:40     ` Wolfgang Denk
  2008-04-29 18:46       ` Christoph Hellwig
@ 2008-04-29 21:16       ` Paul Mackerras
  1 sibling, 0 replies; 7+ messages in thread
From: Paul Mackerras @ 2008-04-29 21:16 UTC (permalink / raw)
  To: Wolfgang Denk; +Cc: linux-arch, linuxppc-dev, Christoph Hellwig, mtk.manpages

Wolfgang Denk writes:

> This probably depends a bit on the performance of the system in
> question. Did you measure it, for example, on a 50 MHz MPC850?

The patch only affects arch/powerpc/kernel/head_64.S.  So no, I
didn't measure it on a 50MHz MPC850, or indeed any 32-bit system. :)

Paul.

