linuxppc-dev.lists.ozlabs.org archive mirror
 help / color / mirror / Atom feed
From: Michael Neuling <mikey@neuling.org>
To: Aaron Sawdey <sawdey@us.ibm.com>
Cc: Denis Kirjanov <kda@linux-powerpc.org>,
	Anton Blanchard <anton@samba.org>,
	 benh <benh@kernel.crashing.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Michael Ellerman <michael@ellerman.id.au>,
	Steve Munroe <sjmunroe@us.ibm.com>
Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage()
Date: Thu, 24 Sep 2015 09:28:17 +1000	[thread overview]
Message-ID: <1443050897.2492.9.camel@neuling.org> (raw)
In-Reply-To: <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>

On Wed, 2015-09-23 at 13:35 -0500, Aaron Sawdey wrote:
> One thing that occurred to me was that since the
> __kernel_datapage_offset is located immediately before the function,
> the offset is small and you could get rid of the addi and just fold it
> into the lwz.

r3 is reused at the end to add to r0, so if r3 doesn't have the right
offset for __kernel_datapage_offset it doesn't work.

Mikey


>=20
>=20
> Aaron Sawdey, Ph.D. sawdey@us.ibm.com
> 050-2/C113 (507) 253-7520 TL: 553-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>=20
> Inactive hide details for Denis Kirjanov ---09/23/2015 01:22:39
> PM---On 9/23/15, Michael Neuling <mikey@neuling.org> wrote: > TDenis
> Kirjanov ---09/23/2015 01:22:39 PM---On 9/23/15, Michael Neuling
> <mikey@neuling.org> wrote: > The 32 and 64 bit variants of
> __get_datapag
>=20
> From: Denis Kirjanov <kda@linux-powerpc.org>
> To: Michael Neuling <mikey@neuling.org>
> Cc: Michael Ellerman <michael@ellerman.id.au>, benh
> <benh@kernel.crashing.org>, Aaron Sawdey/Rochester/IBM@IBMUS,
> linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, Anton Blanchard
> <anton@samba.org>, Steve Munroe/Rochester/IBM@IBMUS
> Date: 09/23/2015 01:22 PM
> Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in
> __get_datapage()
>=20
>=20
>=20
> ______________________________________________________________________
>=20
>=20
>=20
> On 9/23/15, Michael Neuling <mikey@neuling.org> wrote:
> > The 32 and 64 bit variants of __get_datapage() use a "bcl; mflr" to
> > determine the loaded address of the VDSO. The current version of
> these
> > attempt to use the special bcl variant which avoids pushing to the
> > link stack.
> >
> > Unfortunately it uses bcl+8 rather than the required bcl+4. Hence
> the
> > current code results in link stack corruption and the resulting
> > performance degradation (due to branch mis-prediction).
> >
> > This patch moves us to bcl+4 by moving __kernel_datapage_offset
> > out of __get_datapage().
> >
> > With this patch, running the below benchmark we get a bump in
> > performance on POWER8 for gettimeofday() (which uses
> > __get_datapage()).
> >
> > 64bit gets ~4% improvement:
> >   Without patch:
> >     # ./tb
> >     time =3D 0.180321
> >   With patch:
> >     # ./tb
> >     time =3D 0.187408
> >
> > 32bit gets ~9% improvement:
> >   Without patch:
> >     # ./tb
> >     time =3D 0.276551
> >   With patch:
> >     # ./tb
> >     time =3D 0.252767
>=20
> I've got the following results on POWER7 64bit
> without the patch:
> # ./tb
> time =3D 0.263337
> # ./tb
> time =3D 0.251273
> # ./tb
> time =3D 0.258453
> # ./tb
> time =3D 0.260189
>=20
> with the patch:
> # ./tb
> time =3D 0.241517
> # ./tb
> time =3D 0.241973
> # ./tb
> time =3D 0.239365
> # ./tb
> time =3D 0.240109
>=20
>=20
>=20
> >
> > Testcase tb.c (stolen from Anton)
> >   /* gcc -O2 tb.c -o tb */
> >   #include <sys/time.h>
> >   #include <stdio.h>
> >
> >   int main()
> >   {
> >   int i;
> >
> >   struct timeval tv_start, tv_end;
> >
> >   gettimeofday(&tv_start, NULL);
> >
> >   for(i =3D 0; i < 10000000; i++) {
> >   gettimeofday(&tv_end, NULL);
> >   }
> >
> >   printf("time =3D %.6f\n", tv_end.tv_sec - tv_start.tv_sec +
> (tv_end.tv_usec
> > - tv_start.tv_usec) * 1e-6);
> >
> >   return 0;
> >   }
> >
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
> >
> > diff --git a/arch/powerpc/kernel/vdso32/datapage.S
> > b/arch/powerpc/kernel/vdso32/datapage.S
> > index dc21e89..59cf5f4 100644
> > --- a/arch/powerpc/kernel/vdso32/datapage.S
> > +++ b/arch/powerpc/kernel/vdso32/datapage.S
> > @@ -16,6 +16,10 @@
> >  #include <asm/vdso.h>
> >
> >   .text
> > + .global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> >  V_FUNCTION_BEGIN(__get_datapage)
> >    .cfi_startproc
> >   /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> >   mflr r0
> >    .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> >   mflr r3
> >   mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> >   lwz r0,0(r3)
> >   add r3,r0,r3
> >   blr
> > diff --git a/arch/powerpc/kernel/vdso64/datapage.S
> > b/arch/powerpc/kernel/vdso64/datapage.S
> > index 79796de..2f01c4a 100644
> > --- a/arch/powerpc/kernel/vdso64/datapage.S
> > +++ b/arch/powerpc/kernel/vdso64/datapage.S
> > @@ -16,6 +16,10 @@
> >  #include <asm/vdso.h>
> >
> >   .text
> > +.global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> >  V_FUNCTION_BEGIN(__get_datapage)
> >    .cfi_startproc
> >   /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> >   mflr r0
> >    .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> >   mflr r3
> >   mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> >   lwz r0,0(r3)
> >   add r3,r0,r3
> >   blr
> >
> >
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
>=20
>=20
>=20
>=20

  parent reply	other threads:[~2015-09-23 23:28 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-23  6:05 [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage() Michael Neuling
2015-09-23 18:21 ` Denis Kirjanov
2015-09-23 18:35   ` Aaron Sawdey
     [not found]   ` <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>
2015-09-23 23:28     ` Michael Neuling [this message]
2015-09-23 23:33   ` Michael Neuling
2015-09-23 22:23 ` Michael Ellerman
2015-09-24 10:10   ` Denis Kirjanov
2015-09-25  0:05     ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1443050897.2492.9.camel@neuling.org \
    --to=mikey@neuling.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=kda@linux-powerpc.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=michael@ellerman.id.au \
    --cc=sawdey@us.ibm.com \
    --cc=sjmunroe@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).