All of lore.kernel.org
 help / color / mirror / Atom feed
From: Michael Neuling <mikey@neuling.org>
To: Aaron Sawdey <sawdey@us.ibm.com>
Cc: Denis Kirjanov <kda@linux-powerpc.org>,
	Anton Blanchard <anton@samba.org>,
	 benh <benh@kernel.crashing.org>,
	linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
	Michael Ellerman <michael@ellerman.id.au>,
	Steve Munroe <sjmunroe@us.ibm.com>
Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage()
Date: Thu, 24 Sep 2015 09:28:17 +1000	[thread overview]
Message-ID: <1443050897.2492.9.camel@neuling.org> (raw)
In-Reply-To: <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>

On Wed, 2015-09-23 at 13:35 -0500, Aaron Sawdey wrote:
> One thing that occurred to me was that since the
> __kernel_datapage_offset is located immediately before the function,
> the offset is small and you could get rid of the addi and just fold it
> into the lwz.

r3 is reused at the end to add to r0, so if r3 doesn't have the right
offset for __kernel_datapage_offset it doesn't work.

Mikey


>=20
>=20
> Aaron Sawdey, Ph.D. sawdey@us.ibm.com
> 050-2/C113 (507) 253-7520 TL: 553-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>=20
> Inactive hide details for Denis Kirjanov ---09/23/2015 01:22:39
> PM---On 9/23/15, Michael Neuling <mikey@neuling.org> wrote: > TDenis
> Kirjanov ---09/23/2015 01:22:39 PM---On 9/23/15, Michael Neuling
> <mikey@neuling.org> wrote: > The 32 and 64 bit variants of
> __get_datapag
>=20
> From: Denis Kirjanov <kda@linux-powerpc.org>
> To: Michael Neuling <mikey@neuling.org>
> Cc: Michael Ellerman <michael@ellerman.id.au>, benh
> <benh@kernel.crashing.org>, Aaron Sawdey/Rochester/IBM@IBMUS,
> linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, Anton Blanchard
> <anton@samba.org>, Steve Munroe/Rochester/IBM@IBMUS
> Date: 09/23/2015 01:22 PM
> Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in
> __get_datapage()
>=20
>=20
>=20
> ______________________________________________________________________
>=20
>=20
>=20
> On 9/23/15, Michael Neuling <mikey@neuling.org> wrote:
> > The 32 and 64 bit variants of __get_datapage() use a "bcl; mflr" to
> > determine the loaded address of the VDSO. The current version of
> these
> > attempt to use the special bcl variant which avoids pushing to the
> > link stack.
> >
> > Unfortunately it uses bcl+8 rather than the required bcl+4. Hence
> the
> > current code results in link stack corruption and the resulting
> > performance degradation (due to branch mis-prediction).
> >
> > This patch moves us to bcl+4 by moving __kernel_datapage_offset
> > out of __get_datapage().
> >
> > With this patch, running the below benchmark we get a bump in
> > performance on POWER8 for gettimeofday() (which uses
> > __get_datapage()).
> >
> > 64bit gets ~4% improvement:
> >   Without patch:
> >     # ./tb
> >     time =3D 0.180321
> >   With patch:
> >     # ./tb
> >     time =3D 0.187408
> >
> > 32bit gets ~9% improvement:
> >   Without patch:
> >     # ./tb
> >     time =3D 0.276551
> >   With patch:
> >     # ./tb
> >     time =3D 0.252767
>=20
> I've got the following results on POWER7 64bit
> without the patch:
> # ./tb
> time =3D 0.263337
> # ./tb
> time =3D 0.251273
> # ./tb
> time =3D 0.258453
> # ./tb
> time =3D 0.260189
>=20
> with the patch:
> # ./tb
> time =3D 0.241517
> # ./tb
> time =3D 0.241973
> # ./tb
> time =3D 0.239365
> # ./tb
> time =3D 0.240109
>=20
>=20
>=20
> >
> > Testcase tb.c (stolen from Anton)
> >   /* gcc -O2 tb.c -o tb */
> >   #include <sys/time.h>
> >   #include <stdio.h>
> >
> >   int main()
> >   {
> >   int i;
> >
> >   struct timeval tv_start, tv_end;
> >
> >   gettimeofday(&tv_start, NULL);
> >
> >   for(i =3D 0; i < 10000000; i++) {
> >   gettimeofday(&tv_end, NULL);
> >   }
> >
> >   printf("time =3D %.6f\n", tv_end.tv_sec - tv_start.tv_sec +
> (tv_end.tv_usec
> > - tv_start.tv_usec) * 1e-6);
> >
> >   return 0;
> >   }
> >
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
> >
> > diff --git a/arch/powerpc/kernel/vdso32/datapage.S
> > b/arch/powerpc/kernel/vdso32/datapage.S
> > index dc21e89..59cf5f4 100644
> > --- a/arch/powerpc/kernel/vdso32/datapage.S
> > +++ b/arch/powerpc/kernel/vdso32/datapage.S
> > @@ -16,6 +16,10 @@
> >  #include <asm/vdso.h>
> >
> >   .text
> > + .global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> >  V_FUNCTION_BEGIN(__get_datapage)
> >    .cfi_startproc
> >   /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> >   mflr r0
> >    .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> >   mflr r3
> >   mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> >   lwz r0,0(r3)
> >   add r3,r0,r3
> >   blr
> > diff --git a/arch/powerpc/kernel/vdso64/datapage.S
> > b/arch/powerpc/kernel/vdso64/datapage.S
> > index 79796de..2f01c4a 100644
> > --- a/arch/powerpc/kernel/vdso64/datapage.S
> > +++ b/arch/powerpc/kernel/vdso64/datapage.S
> > @@ -16,6 +16,10 @@
> >  #include <asm/vdso.h>
> >
> >   .text
> > +.global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> >  V_FUNCTION_BEGIN(__get_datapage)
> >    .cfi_startproc
> >   /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> >   mflr r0
> >    .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> >   mflr r3
> >   mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> >   lwz r0,0(r3)
> >   add r3,r0,r3
> >   blr
> >
> >
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
>=20
>=20
>=20
>=20

  parent reply	other threads:[~2015-09-23 23:28 UTC|newest]

Thread overview: 8+ messages / expand[flat|nested]  mbox.gz  Atom feed  top
2015-09-23  6:05 [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage() Michael Neuling
2015-09-23 18:21 ` Denis Kirjanov
2015-09-23 18:35   ` Aaron Sawdey
     [not found]   ` <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>
2015-09-23 23:28     ` Michael Neuling [this message]
2015-09-23 23:33   ` Michael Neuling
2015-09-23 22:23 ` Michael Ellerman
2015-09-24 10:10   ` Denis Kirjanov
2015-09-25  0:05     ` Michael Ellerman

Reply instructions:

You may reply publicly to this message via plain-text email
using any one of the following methods:

* Save the following mbox file, import it into your mail client,
  and reply-to-all from there: mbox

  Avoid top-posting and favor interleaved quoting:
  https://en.wikipedia.org/wiki/Posting_style#Interleaved_style

* Reply using the --to, --cc, and --in-reply-to
  switches of git-send-email(1):

  git send-email \
    --in-reply-to=1443050897.2492.9.camel@neuling.org \
    --to=mikey@neuling.org \
    --cc=anton@samba.org \
    --cc=benh@kernel.crashing.org \
    --cc=kda@linux-powerpc.org \
    --cc=linuxppc-dev@lists.ozlabs.org \
    --cc=michael@ellerman.id.au \
    --cc=sawdey@us.ibm.com \
    --cc=sjmunroe@us.ibm.com \
    /path/to/YOUR_REPLY

  https://kernel.org/pub/software/scm/git/docs/git-send-email.html

* If your mail client supports setting the In-Reply-To header
  via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line before the message body.
This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.