From: Segher Boessenkool <segher@kernel.crashing.org>
To: Christophe Leroy <christophe.leroy@c-s.fr>
Cc: linux-kernel@vger.kernel.org, Paul Mackerras <paulus@samba.org>,
luto@kernel.org, Thomas Gleixner <tglx@linutronix.de>,
vincenzo.frascino@arm.com, linuxppc-dev@lists.ozlabs.org
Subject: Re: [RFC PATCH] powerpc/32: Switch VDSO to C implementation.
Date: Sun, 27 Oct 2019 14:07:09 -0500
Message-ID: <20191027190709.GZ28442@gate.crashing.org>
In-Reply-To: <8e4d0b82-a7a1-b7f1-308e-df871b32d317@c-s.fr>
On Sun, Oct 27, 2019 at 10:21:25AM +0100, Christophe Leroy wrote:
> On 27/10/2019 at 01:06, Segher Boessenkool wrote:
> >The hand-optimised asm code will pretty likely win handsomely, whatever
> >you do. Especially on cores like the 885 (no branch prediction, single
> >issue, small caches, etc.: every instruction counts).
> >
> >Is there any reason to replace this hand-optimised code? It was written
> >for exactly this reason? These functions are critical and should be as
> >fast as possible.
>
> Well, all this started with COARSE clocks not being supported by PPC32
> VDSO. I first submitted a series with a set of optimisations including
> the implementation of COARSE clocks
> (https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=126779)
>
> Then after a comment received on patch 4 of the series from Santosh
> Sivaraj asking for a common implementation of it for PPC32 and PPC64, I
> started looking into making the whole VDSO source code common to PPC32
> and PPC64. Most functions are similar. Time functions are also rather
> similar but unfortunately don't use the same registers. They also don't
> cover all possible clocks. And getres() is also buggy, see series
> https://patchwork.ozlabs.org/project/linuxppc-dev/list/?series=110321

That is all nice work :-)

> So instead of reworking the existing time functions, I started
> investigating whether we could plug powerpc to the generic
> implementation. One drawback of PPC is that we need to set up an ASM
> trampoline to handle the SO bit as it can't be handled from C directly,
> can it?

There is no way to say what CR bits to return. The ABI requires some of
those bits to be preserved, and some are volatile. System calls use a
different ABI, one the compiler knows nothing about, so you cannot even
show system calls as calls to the compiler.

> How critical are these functions? Although we have a slight degradation
> with the C implementation, they are still way faster than the
> corresponding syscall.

"Slight":

With current powerpc/32 ASM VDSO:

gettimeofday:            vdso:  750 nsec/call
clock-getres-realtime:   vdso:  382 nsec/call
clock-gettime-realtime:  vdso:  928 nsec/call
clock-getres-monotonic:  vdso:  382 nsec/call
clock-gettime-monotonic: vdso: 1033 nsec/call

Once switched to C implementation:

gettimeofday:            vdso: 1533 nsec/call
clock-getres-realtime:   vdso:  853 nsec/call
clock-gettime-realtime:  vdso: 1570 nsec/call
clock-getres-monotonic:  vdso:  835 nsec/call
clock-gettime-monotonic: vdso: 1605 nsec/call

---> Those that are not more than two times slower are almost that. <---
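
To put numbers on that remark, here is a minimal Python sketch that
computes the C/asm slowdown for each call from the timings quoted above.
The names TIMINGS, slowdown and slowdown_table are illustrative only and
do not come from the thread or the kernel.

```python
# Timings quoted in this thread, in nsec/call: (asm VDSO, C VDSO).
TIMINGS = {
    "gettimeofday": (750, 1533),
    "clock-getres-realtime": (382, 853),
    "clock-gettime-realtime": (928, 1570),
    "clock-getres-monotonic": (382, 835),
    "clock-gettime-monotonic": (1033, 1605),
}


def slowdown(asm_ns, c_ns):
    """Return how many times slower the C version is than the asm one."""
    return c_ns / asm_ns


def slowdown_table(timings):
    """Map each call name to its slowdown factor, rounded to 2 decimals."""
    return {name: round(slowdown(a, c), 2) for name, (a, c) in timings.items()}


if __name__ == "__main__":
    for name, ratio in slowdown_table(TIMINGS).items():
        print(f"{name:26s} {ratio:.2f}x")
```

Three of the five calls come out above 2x, and the two clock-gettime
variants land between 1.5x and 1.7x.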

This also needs measurements on more representative PowerPC cores, say
some G3 or G4; and on modern CPUs (Power7/8/9).

It also needs context with those measurements: what CPU core is it?
Running at what clock frequency?

> Another thing I was wondering: is it worth using the 64 bit timebase on
> PPC32? As far as I understand, the timebase is there to calculate a
> linear date update since the last VDSO datapage update. How often is the
> VDSO datapage updated? On the 885 clocked at 132 MHz, the timebase is at
> 8.25 MHz, which means it needs more than 8 minutes to loop over 32 bits.
On most PowerPC cores the time base is incremented significantly faster.
Usual speeds for older cores are 50MHz to 100MHz, and for newer cores ten
times that. Recommended frequency is currently 512MHz, so you'll wrap the
low 32 bits in 8s or so on those, and in about a minute on many powermac
etc. machines already. How can you know this long hasn't passed since the
last time you read the high half of the time base? Without reading that
high part?
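
The wrap times mentioned above follow from dividing 2^32 by the time base
frequency. The Python sketch below works them out and also shows why a
reader that only looks at the low 32 bits cannot tell whether a wrap
happened in between. The helper names (wrap_seconds, low_half_delta) are
made up for this illustration and are not kernel functions.

```python
MASK32 = 0xFFFFFFFF  # mask selecting the low half of the 64-bit time base


def wrap_seconds(tb_hz):
    """Seconds until the low 32 bits of the time base wrap at tb_hz."""
    return (1 << 32) / tb_hz


def low_half_delta(prev_tb, now_tb):
    """Elapsed ticks as seen by a reader that only has the low 32 bits."""
    return ((now_tb & MASK32) - (prev_tb & MASK32)) & MASK32


if __name__ == "__main__":
    for label, hz in [("885 at 8.25 MHz", 8_250_000),
                      ("50 MHz", 50_000_000),
                      ("100 MHz", 100_000_000),
                      ("512 MHz", 512_000_000)]:
        print(f"{label:16s} wraps after {wrap_seconds(hz):7.1f} s")

    # One full wrap plus 1000 ticks elapsed: the low half alone reports
    # only 1000 ticks, so the wrap goes unnoticed.
    prev = 123_456
    now = prev + (1 << 32) + 1000
    print("true delta:", now - prev, "low-half delta:", low_half_delta(prev, now))
```

At 8.25 MHz the low half wraps after roughly 520 seconds, at 512 MHz
after roughly 8.4 seconds. Any interval longer than that is aliased to a
shorter one unless the high half is read as well.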
The current (assembler) code already optimises converting this to some
other scale quite well, better than a compiler can (see __do_get_tspec).
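
For readers unfamiliar with this kind of scaling, the sketch below shows
the generic fixed-point "multiply then shift" way of converting time base
ticks to nanoseconds without a division on the fast path. This is an
assumption-laden illustration of the general technique only; it is not
the code in __do_get_tspec, and the names make_mult_shift and ticks_to_ns
as well as the default shift of 24 are invented here.

```python
NSEC_PER_SEC = 1_000_000_000


def make_mult_shift(tb_hz, shift=24):
    """Precompute a multiplier so that ns ~= (ticks * mult) >> shift.

    The division happens once, here, instead of on every conversion.
    """
    mult = ((NSEC_PER_SEC << shift) + tb_hz // 2) // tb_hz  # rounded
    return mult, shift


def ticks_to_ns(ticks, mult, shift):
    """Convert a tick delta to nanoseconds with one multiply and one shift."""
    return (ticks * mult) >> shift


if __name__ == "__main__":
    for hz in (8_250_000, 512_000_000):
        mult, shift = make_mult_shift(hz)
        # One second worth of ticks should convert to about 1e9 ns.
        print(hz, mult, ticks_to_ns(hz, mult, shift))
```

Python integers do not overflow, so this sketch hides a real constraint:
in C or assembler the product ticks * mult must fit the available
register width, which limits how large a tick delta can be converted.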
Segher