From: Michael Neuling <mikey@neuling.org>
To: Aaron Sawdey <sawdey@us.ibm.com>
Cc: Denis Kirjanov <kda@linux-powerpc.org>,
Anton Blanchard <anton@samba.org>,
benh <benh@kernel.crashing.org>,
linuxppc-dev <linuxppc-dev@lists.ozlabs.org>,
Michael Ellerman <michael@ellerman.id.au>,
Steve Munroe <sjmunroe@us.ibm.com>
Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage()
Date: Thu, 24 Sep 2015 09:28:17 +1000 [thread overview]
Message-ID: <1443050897.2492.9.camel@neuling.org> (raw)
In-Reply-To: <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>
On Wed, 2015-09-23 at 13:35 -0500, Aaron Sawdey wrote:
> One thing that occurred to me was that since the
> __kernel_datapage_offset is located immediately before the function,
> the offset is small and you could get rid of the addi and just fold it
> into the lwz.
r3 is reused at the end to add to r0, so if r3 doesn't have the right
offset for __kernel_datapage_offset it doesn't work.
Mikey
>=20
>=20
> Aaron Sawdey, Ph.D. sawdey@us.ibm.com
> 050-2/C113 (507) 253-7520 TL: 553-7520 home: 507/263-0782
> IBM Linux Technology Center - PPC Toolchain
>=20
> Inactive hide details for Denis Kirjanov ---09/23/2015 01:22:39
> PM---On 9/23/15, Michael Neuling <mikey@neuling.org> wrote: > TDenis
> Kirjanov ---09/23/2015 01:22:39 PM---On 9/23/15, Michael Neuling
> <mikey@neuling.org> wrote: > The 32 and 64 bit variants of
> __get_datapag
>=20
> From: Denis Kirjanov <kda@linux-powerpc.org>
> To: Michael Neuling <mikey@neuling.org>
> Cc: Michael Ellerman <michael@ellerman.id.au>, benh
> <benh@kernel.crashing.org>, Aaron Sawdey/Rochester/IBM@IBMUS,
> linuxppc-dev <linuxppc-dev@lists.ozlabs.org>, Anton Blanchard
> <anton@samba.org>, Steve Munroe/Rochester/IBM@IBMUS
> Date: 09/23/2015 01:22 PM
> Subject: Re: [PATCH] powerpc/vdso: Avoid link stack corruption in
> __get_datapage()
>=20
>=20
>=20
> ______________________________________________________________________
>=20
>=20
>=20
> On 9/23/15, Michael Neuling <mikey@neuling.org> wrote:
> > The 32 and 64 bit variants of __get_datapage() use a "bcl; mflr" to
> > determine the loaded address of the VDSO. The current version of
> these
> > attempt to use the special bcl variant which avoids pushing to the
> > link stack.
> >
> > Unfortunately it uses bcl+8 rather than the required bcl+4. Hence
> the
> > current code results in link stack corruption and the resulting
> > performance degradation (due to branch mis-prediction).
> >
> > This patch moves us to bcl+4 by moving __kernel_datapage_offset
> > out of __get_datapage().
> >
> > With this patch, running the below benchmark we get a bump in
> > performance on POWER8 for gettimeofday() (which uses
> > __get_datapage()).
> >
> > 64bit gets ~4% improvement:
> > Without patch:
> > # ./tb
> > time =3D 0.180321
> > With patch:
> > # ./tb
> > time =3D 0.187408
> >
> > 32bit gets ~9% improvement:
> > Without patch:
> > # ./tb
> > time =3D 0.276551
> > With patch:
> > # ./tb
> > time =3D 0.252767
>=20
> I've got the following results on POWER7 64bit
> without the patch:
> # ./tb
> time =3D 0.263337
> # ./tb
> time =3D 0.251273
> # ./tb
> time =3D 0.258453
> # ./tb
> time =3D 0.260189
>=20
> with the patch:
> # ./tb
> time =3D 0.241517
> # ./tb
> time =3D 0.241973
> # ./tb
> time =3D 0.239365
> # ./tb
> time =3D 0.240109
>=20
>=20
>=20
> >
> > Testcase tb.c (stolen from Anton)
> > /* gcc -O2 tb.c -o tb */
> > #include <sys/time.h>
> > #include <stdio.h>
> >
> > int main()
> > {
> > int i;
> >
> > struct timeval tv_start, tv_end;
> >
> > gettimeofday(&tv_start, NULL);
> >
> > for(i =3D 0; i < 10000000; i++) {
> > gettimeofday(&tv_end, NULL);
> > }
> >
> > printf("time =3D %.6f\n", tv_end.tv_sec - tv_start.tv_sec +
> (tv_end.tv_usec
> > - tv_start.tv_usec) * 1e-6);
> >
> > return 0;
> > }
> >
> > Signed-off-by: Michael Neuling <mikey@neuling.org>
> > Reported-by: Aaron Sawdey <sawdey@us.ibm.com>
> >
> > diff --git a/arch/powerpc/kernel/vdso32/datapage.S
> > b/arch/powerpc/kernel/vdso32/datapage.S
> > index dc21e89..59cf5f4 100644
> > --- a/arch/powerpc/kernel/vdso32/datapage.S
> > +++ b/arch/powerpc/kernel/vdso32/datapage.S
> > @@ -16,6 +16,10 @@
> > #include <asm/vdso.h>
> >
> > .text
> > + .global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> > V_FUNCTION_BEGIN(__get_datapage)
> > .cfi_startproc
> > /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> > mflr r0
> > .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> > mflr r3
> > mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> > lwz r0,0(r3)
> > add r3,r0,r3
> > blr
> > diff --git a/arch/powerpc/kernel/vdso64/datapage.S
> > b/arch/powerpc/kernel/vdso64/datapage.S
> > index 79796de..2f01c4a 100644
> > --- a/arch/powerpc/kernel/vdso64/datapage.S
> > +++ b/arch/powerpc/kernel/vdso64/datapage.S
> > @@ -16,6 +16,10 @@
> > #include <asm/vdso.h>
> >
> > .text
> > +.global __kernel_datapage_offset;
> > +__kernel_datapage_offset:
> > + .long 0
> > +
> > V_FUNCTION_BEGIN(__get_datapage)
> > .cfi_startproc
> > /* We don't want that exposed or overridable as we want other
> objects
> > @@ -27,13 +31,11 @@ V_FUNCTION_BEGIN(__get_datapage)
> > mflr r0
> > .cfi_register lr,r0
> >
> > - bcl 20,31,1f
> > - .global __kernel_datapage_offset;
> > -__kernel_datapage_offset:
> > - .long 0
> > -1:
> > + bcl 20,31,data_page_branch
> > +data_page_branch:
> > mflr r3
> > mtlr r0
> > + addi r3, r3, __kernel_datapage_offset-data_page_branch
> > lwz r0,0(r3)
> > add r3,r0,r3
> > blr
> >
> >
> > _______________________________________________
> > Linuxppc-dev mailing list
> > Linuxppc-dev@lists.ozlabs.org
> > https://lists.ozlabs.org/listinfo/linuxppc-dev
>=20
>=20
>=20
>=20
next prev parent reply other threads:[~2015-09-23 23:28 UTC|newest]
Thread overview: 8+ messages / expand[flat|nested] mbox.gz Atom feed top
2015-09-23 6:05 [PATCH] powerpc/vdso: Avoid link stack corruption in __get_datapage() Michael Neuling
2015-09-23 18:21 ` Denis Kirjanov
2015-09-23 18:35 ` Aaron Sawdey
[not found] ` <201509231835.t8NIZr0Q029029@d01av04.pok.ibm.com>
2015-09-23 23:28 ` Michael Neuling [this message]
2015-09-23 23:33 ` Michael Neuling
2015-09-23 22:23 ` Michael Ellerman
2015-09-24 10:10 ` Denis Kirjanov
2015-09-25 0:05 ` Michael Ellerman
Reply instructions:
You may reply publicly to this message via plain-text email
using any one of the following methods:
* Save the following mbox file, import it into your mail client,
and reply-to-all from there: mbox
Avoid top-posting and favor interleaved quoting:
https://en.wikipedia.org/wiki/Posting_style#Interleaved_style
* Reply using the --to, --cc, and --in-reply-to
switches of git-send-email(1):
git send-email \
--in-reply-to=1443050897.2492.9.camel@neuling.org \
--to=mikey@neuling.org \
--cc=anton@samba.org \
--cc=benh@kernel.crashing.org \
--cc=kda@linux-powerpc.org \
--cc=linuxppc-dev@lists.ozlabs.org \
--cc=michael@ellerman.id.au \
--cc=sawdey@us.ibm.com \
--cc=sjmunroe@us.ibm.com \
/path/to/YOUR_REPLY
https://kernel.org/pub/software/scm/git/docs/git-send-email.html
* If your mail client supports setting the In-Reply-To header
via mailto: links, try the mailto: link
Be sure your reply has a Subject: header at the top and a blank line
before the message body.
This is a public inbox, see mirroring instructions
for how to clone and mirror all data and code used for this inbox;
as well as URLs for NNTP newsgroup(s).