* Saving to 32 bits of GPRs in signal context
@ 2007-05-29 7:24 Benjamin Herrenschmidt
2007-05-29 7:52 ` Dan Malek
2007-05-29 13:53 ` Ulrich Weigand
0 siblings, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 7:24 UTC (permalink / raw)
To: Paul Mackerras
Cc: linuxppc-dev list, Steve Munroe, Anton Blanchard, Ulrich Weigand
Hi Folks !
I've been looking at saving & restoring the top 32 bits of 64 bits
registers when delivering signals to 32 bits processes on a 64 bits
kernel. The principle is easy, but I wonder where to stick them.
I was initially tempted to add them to the end of struct mcontext32 but
decided against it.
Then, knowing that we have this uc_regs pointer that points to the
mcontext, I was thinking about adding next to it a uc_highregs pointer
that points to them (and I can then add them anywhere in the signal
frame, the user doesn't have to know, just follow the pointer).
But that means changing slightly the layout of ucontext32...
Thus my question is which fields in there have their location
ABIficated ? (not necessarily written ABI, but more like gdb "knows"
about them for example, or the infamous old style signal frame unmangler
in gcc C++ exception runtime).
Specifically, is everybody using the uc_regs pointer to get to the
mcontext or are some people likely to expect the mcontext to always be
at the same offset from the beginning of the signal frame ?
I'd like to add my highregs pointer just before the mcontext (after all
the other fields) but I see a uc_pad2 in there which makes me wonder...
I suppose I could also hijack one of the pad fields... they are only
here to make sure the mcontext is 16 bytes aligned right ?
There are a few other issues... one is, the pad fields aren't cleared.
Thus how can userland or rt_sigreturn differentiate between a valid
highregs pointer and random junk ? Is there a trick one of you can come
up with that I could do to let userland/gdb/rt_sigreturn know that
there's something there ?
rt_sigreturn isn't much of a problem since I will initialize the
contexts I create to 0 in that field, and will check the pointer
validity so the worst that can happen is crap in the top 32 bits if some
app mucks around too much, which isn't a problem.
I'm a bit more worried about how gdb will know that there's something
useful to peek/poke at in there.
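To make the proposal concrete, here is a minimal sketch of what following
a highregs pointer could look like for a consumer of the frame. The
structure names, the field order and the use of native C pointers are
assumptions made for illustration only; they are not the real ucontext32
or mcontext32 layout (the compat structures hold 32-bit user addresses),
and uc_highregs is just the name floated above.

```c
#include <stddef.h>
#include <stdint.h>

#define SKETCH_NGPRS 32

/* Hypothetical, simplified stand-ins for the real structures. */
struct sketch_mcontext32 {
    uint32_t gpr[SKETCH_NGPRS];          /* low 32 bits of each GPR */
};

struct sketch_ucontext32 {
    uint32_t uc_flags;
    struct sketch_mcontext32 *uc_regs;   /* pointer to the mcontext */
    uint32_t *uc_highregs;               /* proposed: pointer to the top halves */
};

/*
 * Rebuild the full 64-bit value of GPR n by following both pointers.
 * A NULL uc_highregs is treated as "no top halves saved".
 */
static uint64_t sketch_full_gpr(const struct sketch_ucontext32 *uc, unsigned n)
{
    uint64_t low = uc->uc_regs->gpr[n];
    uint64_t high = uc->uc_highregs ? uc->uc_highregs[n] : 0;

    return (high << 32) | low;
}
```

Because the consumer only follows pointers, the top halves could live
anywhere in the frame without the consumer knowing their offset.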
Cheers,
Ben
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 7:24 Saving to 32 bits of GPRs in signal context Benjamin Herrenschmidt
@ 2007-05-29 7:52 ` Dan Malek
2007-05-29 8:05 ` Benjamin Herrenschmidt
2007-05-29 13:53 ` Ulrich Weigand
1 sibling, 1 reply; 63+ messages in thread
From: Dan Malek @ 2007-05-29 7:52 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras, Steve Munroe,
Anton Blanchard
On May 29, 2007, at 3:24 AM, Benjamin Herrenschmidt wrote:
> Hi Folks !
>
> I've been looking at saving & restoring the top 32 bits of 64 bits
> registers when delivering signals to 32 bits processes on a 64 bits
> kernel. The principle is easy, but I wonder where to stick them.
I'm wondering why you need to do this at all?
Why would a 32-bit application care about or
know what to do with these?
Thanks.
-- Dan
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 7:52 ` Dan Malek
@ 2007-05-29 8:05 ` Benjamin Herrenschmidt
2007-05-29 9:26 ` Gabriel Paubert
2007-05-29 13:10 ` Kumar Gala
0 siblings, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 8:05 UTC (permalink / raw)
To: Dan Malek
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras, Steve Munroe,
Anton Blanchard
On Tue, 2007-05-29 at 03:52 -0400, Dan Malek wrote:
> > I've been looking at saving & restoring the top 32 bits of 64 bits
> > registers when delivering signals to 32 bits processes on a 64 bits
> > kernel. The principle is easy, but I wonder where to stick them.
>
> I'm wondering why you need to do this at all?
> Why would a 32-bit application care about or
> know what to do with these?
There are regular demands for the ability to use the full 64 bits
registers in 32 bits applications when running on a 64 bits processor.
That ranges from, iirc, the java folks, to people wanting to optimize
some libs to use 64 bits registers internally when called from 32 bits
apps etc...
You can use the full 64 bits easily on powerpc, ld/std just work, it's
only the flags calculations and branches, mostly, that are truncated
when running in 32 bits mode. Also, the kernel syscall & interrupt
entry/exit path will save & restore the full 64 bits.
The problem is when you use signals. The compat signal code for 32 bits
apps will only save and restore the bottom 32 bits, thus an application
using signals will potentially corrupt the top 32 bits, which can be a
problem if, for example, it uses a library that has optimisations based
on using the full 64 bits.
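As an illustration of the clobbering, here is a minimal sketch of a
save/restore pair that only has 32-bit slots to work with. The frame type
and function names are hypothetical, and the sketch assumes the top halves
come back as zero; the real compat code may leave something else there.

```c
#include <stdint.h>

#define SKETCH_NGPRS 32

/* A frame that can only hold the bottom half of each register. */
struct sketch_frame32 {
    uint32_t gpr_low[SKETCH_NGPRS];
};

/* Save: the cast silently drops bits 32..63 of every register. */
static void sketch_save_low(struct sketch_frame32 *f,
                            const uint64_t regs[SKETCH_NGPRS])
{
    for (int i = 0; i < SKETCH_NGPRS; i++)
        f->gpr_low[i] = (uint32_t)regs[i];
}

/* Restore: only the bottom halves can be put back (top assumed zero). */
static void sketch_restore_low(const struct sketch_frame32 *f,
                               uint64_t regs[SKETCH_NGPRS])
{
    for (int i = 0; i < SKETCH_NGPRS; i++)
        regs[i] = f->gpr_low[i];
}
```

A register holding 0x1122334455667788 before the signal would read
0x55667788 after this round trip.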
We don't intend to update jmpbuf, getcontext/setcontext etc... for
those... they are purely call clobbered etc..., but it would be nice if
at least the signal frame save/restore could properly deal with them so
they don't get randomly clobbered.
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 8:05 ` Benjamin Herrenschmidt
@ 2007-05-29 9:26 ` Gabriel Paubert
2007-05-29 9:44 ` Benjamin Herrenschmidt
2007-05-29 13:04 ` Segher Boessenkool
2007-05-29 13:10 ` Kumar Gala
1 sibling, 2 replies; 63+ messages in thread
From: Gabriel Paubert @ 2007-05-29 9:26 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Anton Blanchard,
Paul Mackerras
On Tue, May 29, 2007 at 06:05:00PM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2007-05-29 at 03:52 -0400, Dan Malek wrote:
> > > I've been looking at saving & restoring the top 32 bits of 64 bits
> > > registers when delivering signals to 32 bits processes on a 64 bits
> > > kernel. The principle is easy, but I wonder where to stick them.
> >
> > I'm wondering why you need to do this at all?
> > Why would a 32-bit application care about or
> > know what to do with these?
>
> There are regular demands for the ability to use the full 64 bits
> registers in 32 bits applications when running on a 64 bits processor.
> That ranges from, iirc, the java folks, to people wanting to optimize
> some libs to use 64 bits registers internally when called from 32 bits
> apps etc...
My thoughts exactly, that's a very useful mode for some applications,
either loading a different library or having alternate routines depending
on the availability of 64 bit GPRs for "long long" integer computations,
block copying and initializing (and therefore struct copies and memset()),
etc...
But it also makes me wonder about a few things:
- do you use the standard 32 bit ABI, in which case the caller of libraries
does not care and the libraries can be put in the standard places, or are
there cases where the ability to pass 64 bit values in a single
register would improve performance to the point that it is worth
having an incompatible library (where to put it and how to name it)?
I'd rather lean towards the first solution but I don't have enough
data to judge.
- how can an application know that it can use 64 bit registers and call
the optimized routines?
Finally, I've not seen a compiler (well, GCC, but I don't have 4.2 or
4.3 installed yet) that allows you to tell the compiler to use 32 bit
addresses but assume that integer registers are 64 bit wide. As long
as such an option does not exist, the usefulness of this feature is
somewhat limited. In other words, GCC for now has support for ILP32 and
LP64 modes, but it would be better to also have support for IP32L64.
[...]
Well, after downloading the gcc trunk, it still seems to be the case.
Adding the IP32L64 mode does not seem to be such a huge project
(comparable to many Google SoC ones, but it's too late for 2007),
the problem is that if it is started now, I don't think it will hit
an official GCC release before 2009 at the earliest.
>
> You can use the full 64 bits easily on powerpc, ld/std just work, it's
> only the flags calculations and branches, mostly, that are truncated
> when running in 32 bits mode.
Only the implicit flags calculations, cmp has both 32 and 64 bit
versions, so it's not _that_ bad. However multiprecision arithmetic will
have to be done in 32 bit mode because the carry setting[1] is linked
with the mode (well I seem to remember a variant of the architecture
which had 32 and 64 bit variants for generating the flags, but it was a
long time ago. I've not seen support for this hybrid monster in gcc's md
file and I don't know whether it ever saw the light).
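A small sketch of why the carry matters here, written in plain C rather
than as the actual instruction semantics. It assumes that in 32-bit mode
the carry reflects the carry out of the low 32 bits, and in 64-bit mode
the carry out of the full 64 bits; the function names are made up.

```c
#include <stdint.h>

/* Carry out of the low 32 bits of a + b (assumed 32-bit mode view). */
static int sketch_carry32(uint64_t a, uint64_t b)
{
    uint64_t sum = (uint64_t)(uint32_t)a + (uint32_t)b;

    return (int)(sum >> 32);
}

/* Carry out of the full 64-bit addition (assumed 64-bit mode view). */
static int sketch_carry64(uint64_t a, uint64_t b)
{
    return (a + b) < a;   /* unsigned wrap-around means a carry occurred */
}
```

For a = 0xFFFFFFFF and b = 1 the two views disagree: the low halves carry,
the 64-bit sum does not, so a 64-bit add-with-carry chain run in 32-bit
mode would pick up the wrong carry.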
> Also, the kernel syscall & interrupt
> entry/exit path will save & restore the full 64 bits.
>
> The problem is when you use signals. The compat signal code for 32 bits
> apps will only save and restore the bottom 32 bits, thus an application
> using signals will potentially corrupt the top 32 bits, which can be a
> problem if, for example, it uses a library that has optimisations based
> on using the full 64 bits.
>
> We don't intend to update jmpbuf, getcontext/setcontext etc... for
> those... they are purely call clobbered etc..., but it would be nice if
> at least the signal frame save/restore could properly deal with them so
> they don't get randomly clobbered.
Have you considered the case of a mixed (IP32L64) mode calling a pure
32 bit library (or a user provided callback). The high part of R13-R31
will also be clobbered. Therefore the caller has to save and restore
all registers that are live around the call, which results in
significant code bloat and needs compiler support.
All of this of course unless all people who write mixed mode code
are masochistic enough to limit themselves to 100% pure assembly...
Gabriel
[1] and overflow but sadly nobody cares about it, look at Ada code compiled
when checking for overflows and get deeply depressed :-(
>
> Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 9:26 ` Gabriel Paubert
@ 2007-05-29 9:44 ` Benjamin Herrenschmidt
2007-05-29 13:12 ` Segher Boessenkool
2007-05-29 13:04 ` Segher Boessenkool
1 sibling, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 9:44 UTC (permalink / raw)
To: Gabriel Paubert
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Anton Blanchard,
Paul Mackerras
> - do you use the standard 32 bit ABI, in which case the caller of libraries
> does not care and the libraries can be put in the standard places, or are
> there cases where the ability to pass 64 bit values in a single
> register would improve performance to the point that it is worth
> having an incompatible library (where to put it and how to name it)?
>
> I'd rather lean towards the first solution but I don't have enough
> data to judge.
>
> - how can an application know that it can use 64 bit registers and call
> the optimized routines?
I'd say use the 32 bits ABI, AT_HWCAP will tell you if you are running
on a 64 bits capable machine. You can then either use hand tuned code at
runtime, or I think ld.so can load alternate libs based on the bits in
there.
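A minimal sketch of the AT_HWCAP check, under a few assumptions:
getauxval() is a glibc interface that appeared later than this thread
(older code walks the aux vector by hand), the PPC_FEATURE_64 value is the
one from the kernel's powerpc headers, and the sketch_ function names are
made up. On a non-powerpc machine the bit means something else, so the
result is only meaningful on powerpc.

```c
#include <sys/auxv.h>   /* getauxval(), AT_HWCAP */

/* Normally provided by <asm/cputable.h> on powerpc. */
#ifndef PPC_FEATURE_64
#define PPC_FEATURE_64 0x40000000UL
#endif

/* Pure helper: does this hwcap word advertise a 64-bit CPU? */
static int sketch_hwcap_has_64bit(unsigned long hwcap)
{
    return (hwcap & PPC_FEATURE_64) != 0;
}

/* Ask the kernel-provided aux vector of the running process. */
static int sketch_cpu_has_64bit_gprs(void)
{
    return sketch_hwcap_has_64bit(getauxval(AT_HWCAP));
}
```

A library could call sketch_cpu_has_64bit_gprs() once at startup and
select the 64-bit routines through a function pointer.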
> Finally, I've not seen a compiler (well, GCC, but I don't have 4.2 or
> 4.3 installed yet) that allows you to tell the compiler to use 32 bit
> addresses but assume that integer registers are 64 bit wide. As long
> as such an option does not exist, the usefulness of this feature is
> somewhat limited. In other words, GCC for now has support for ILP32 and
> LP64 modes, but it would be better to also have support for IP32L64.
Depends... If such binaries are actual 64 bits binaries from a kernel
POV, then no change is necessary.
> Have you considered the case of a mixed (IP32L64) mode calling a pure
> 32 bit library (or a user provided callback). The high part of R13-R31
> will also be clobbered. Therefore the caller has to save and restore
> all registers that are live around the call, which results in
> significant code bloat and needs compiler support.
Yes. The high parts are call clobbered. Either you use C code and you'll
need some new gcc options to use this mode, or, as it's mostly the
request for now, you use hand tuned asm code and you know what you are
doing.
> All of this of course unless all people who write mixed mode code
> are masochistic enough to limit themselves to 100% pure assembly...
:-)
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 9:26 ` Gabriel Paubert
2007-05-29 9:44 ` Benjamin Herrenschmidt
@ 2007-05-29 13:04 ` Segher Boessenkool
2007-05-29 14:28 ` Arnd Bergmann
2007-05-29 21:27 ` Benjamin Herrenschmidt
1 sibling, 2 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 13:04 UTC (permalink / raw)
To: Gabriel Paubert
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
> But it also makes me wonder about a few things:
> - do you use the standard 32 bit ABI, in which case the caller of
> libraries
> does not care and the libraries can be put in the standard places,
The compiler should either a) use full 64-bit only for
"volatile" (call-clobbered) registers, or b) save and
restore other 64-bit mode registers around calls. I don't
remember if powerpc-linux-gcc does a) or b), if either.
> or are
> there cases where the ability to pass 64 bit values in a single
> register would improve performance to the point that it is worth
> having an incompatible library (where to put it and how to name it)?
That would be a third ABI. Is it worth that?
> - how can an application know that it can use 64 bit registers and call
> the optimized routines?
Just call them and trap the SEGV ;-) You can check the
aux vector of course, or ask glibc -- but the SEGV way
is the only really portable way. Strange world :-)
> Finally, I've not seen a compiler (well, GCC, but I don't have 4.2 or
> 4.3 installed yet) that allows you to tell the compiler to use 32 bit
> addresses but assume that integer registers are 64 bit wide.
This feature was developed in the 3.3 timeframe IIRC. The
flags to use are -m32 -mpowerpc64 .
> As long
> as such an option does not exist, the usefulness of this feature is
> somewhat limited. In other words, GCC for now has support for ILP32 and
> LP64 modes, but it would be better to also have support for IP32L64.
ILP32LL64. The C "mode" stays the same, only the generated
machine insns are changed.
> Adding the IP32L64 mode does not seem to be such a huge project
> (comparable to many Google SoC ones, but it's too late for 2007),
GCC (and many 3rd party apps/libs) *really* like pointers and
longs to have the same size.
> the problem is that if it is started now, I don't think it will hit
> an official GCC release before 2009 at the earliest.
It has been there for a long time now ;-)
> Only the implicit flags calculations, cmp has both 32 and 64 bit
> versions, so it's not _that_ bad. However multiprecision arithmetic
> will
> have to be done in 32 bit mode
You actually want to do MP calc in 64-bit mode -- multiplies
are four times faster! (64x64 vs. 32x32).
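One way to read the "four times" figure, sketched in C: a full 64x64
multiply built from 32-bit limbs needs four 32x32 partial products plus
the additions to combine them, where 64-bit mode needs one multiply for
each half of the result. The function name is hypothetical and this is an
illustration of the arithmetic, not a benchmark.

```c
#include <stdint.h>

/*
 * Full 128-bit product of two 64-bit values, built only from
 * 32x32 -> 64 partial products, as 32-bit code has to do it.
 */
static void sketch_mul64_by_limbs(uint64_t a, uint64_t b,
                                  uint64_t *hi, uint64_t *lo)
{
    uint64_t a_lo = (uint32_t)a, a_hi = a >> 32;
    uint64_t b_lo = (uint32_t)b, b_hi = b >> 32;

    uint64_t p0 = a_lo * b_lo;   /* partial product 1 */
    uint64_t p1 = a_lo * b_hi;   /* partial product 2 */
    uint64_t p2 = a_hi * b_lo;   /* partial product 3 */
    uint64_t p3 = a_hi * b_hi;   /* partial product 4 */

    /* Middle column: at most three 32-bit values, so it cannot overflow. */
    uint64_t mid = (p0 >> 32) + (uint32_t)p1 + (uint32_t)p2;

    *lo = (mid << 32) | (uint32_t)p0;
    *hi = p3 + (p1 >> 32) + (p2 >> 32) + (mid >> 32);
}
```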
> because the carry setting[1] is linked with the mode
True enough, but you never really need the CA reg anyway.
> (well I seem to remember a variant of the architecture
> which had 32 and 64 bit variants for generating the flags, but it was a
> long time ago. I've not seen support for this hybrid monster in gcc's
> md
> file and I don't know whether it ever saw the light).
Book E 64. You want to forget; well I want to anyway.
> Have you considered the case of a mixed (IP32L64) mode calling a pure
> 32 bit library (or a user provided callback). The high part of R13-R31
> will also be clobbered. Therefore the caller has to save and restore
> all registers that are live around the call, which results in
> significant code bloat and needs compiler support.
I believe the compiler supports this. The code bloat you
think is there just doesn't exist; it takes a lot more code
to do the actual ops on a pair of 32-bit regs than it takes
to do a bit of 64-bit save/restore work.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 8:05 ` Benjamin Herrenschmidt
2007-05-29 9:26 ` Gabriel Paubert
@ 2007-05-29 13:10 ` Kumar Gala
2007-05-29 21:32 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 63+ messages in thread
From: Kumar Gala @ 2007-05-29 13:10 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Anton Blanchard,
Paul Mackerras
On May 29, 2007, at 3:05 AM, Benjamin Herrenschmidt wrote:
> On Tue, 2007-05-29 at 03:52 -0400, Dan Malek wrote:
>>> I've been looking at saving & restoring the top 32 bits of 64 bits
>>> registers when delivering signals to 32 bits processes on a 64 bits
>>> kernel. The principle is easy, but I wonder where to stick them.
>>
>> I'm wondering why you need to do this at all?
>> Why would a 32-bit application care about or
>> know what to do with these?
>
> There are regular demands for the ability to use the full 64 bits
> registers in 32 bits applications when running on a 64 bits processor.
> That ranges from, iirc, the java folks, to people wanting to optimize
> some libs to use 64 bits registers internally when called from 32 bits
> apps etc...
>
> You can use the full 64 bits easily on powerpc, ld/std just work, it's
> only the flags calculations and branches, mostly, that are truncated
> when running in 32 bits mode. Also, the kernel syscall & interrupt
> entry/exit path will save & restore the full 64 bits.
>
> The problem is when you use signals. The compat signal code for 32
> bits
> apps will only save and restore the bottom 32 bits, thus an
> application
> using signals will potentially corrupt the top 32 bits, which can be a
> problem if, for example, it uses a library that has optimisations
> based
> on using the full 64 bits.
>
> We don't intend to update jmpbuf, getcontext/setcontext etc... for
> those... they are purely call clobbered etc..., but it would be
> nice if
> at least the signal frame save/restore could properly deal with
> them so
> they don't get randomly clobbered.
This is all problematic since some 64-bit implementations may not
guarantee the upper bits are valid when in 32-bit mode. Look at the
'Computation Modes' section in the architecture specs 2.03 or greater
for embedded processors.
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 9:44 ` Benjamin Herrenschmidt
@ 2007-05-29 13:12 ` Segher Boessenkool
2007-05-29 14:00 ` Steve Munroe
0 siblings, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 13:12 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>> - how can an application know that it can use 64 bit registers and
>> call
>> the optimized routines?
>
> I'd say use the 32 bits ABI, AT_HWCAP will tell you if you are running
> on a 64 bits capable machine. You can then either use hand tuned code
> at
> runtime, or I think ld.so can load alternate libs based on the bits in
> there.
Or you can simply only install 64-bit binaries on 64-bit
machines.
>> Finally, I've not seen a compiler (well, GCC, but I don't have 4.2 or
>> 4.3 installed yet) that allows you to tell the compiler to use 32 bit
>> addresses but assume that integer registers are 64 bit wide. As long
>> as such an option does not exist, the usefulness of this feature is
>> somewhat limited. In other words, GCC for now has support for ILP32
>> and
>> LP64 modes, but it would be better to also have support for IP32L64.
>
> Depends... If such binaries are actual 64 bits binaries from a kernel
> POV, then no change is necessary.
The kernel (and some libraries that do explicit mmap()s)
will have to make sure that all pointers stay within the
32-bit address space. This is most easily done by only
allowing mappings in the low 32-bit, and some initial
memory layout work is needed I guess. Seems quite easy
actually, certainly from the kernel perspective. And the
gain is higher than that of -m32 -mpowerpc64 over -m32.
If you want to do the 32-bit ABI in 64-bit ELF binaries,
more work is involved. Might very well be even better
though.
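The constraint on mappings can be stated as a small check. This is only a
sketch of the condition, with a made-up function name; it says nothing
about where in the kernel such a check would live.

```c
#include <stdint.h>

/* True if [addr, addr + len) lies entirely in the low 4 GiB. */
static int sketch_range_below_4g(uint64_t addr, uint64_t len)
{
    const uint64_t limit = UINT64_C(1) << 32;

    /* Written so that addr + len is never computed and cannot wrap. */
    return addr < limit && len <= limit - addr;
}
```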
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 7:24 Saving to 32 bits of GPRs in signal context Benjamin Herrenschmidt
2007-05-29 7:52 ` Dan Malek
@ 2007-05-29 13:53 ` Ulrich Weigand
1 sibling, 0 replies; 63+ messages in thread
From: Ulrich Weigand @ 2007-05-29 13:53 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Paul Mackerras, Anton Blanchard, Steve Munroe
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/29/2007
09:24:15 AM:
> Specifically, is everybody using the uc_regs pointer to get to the
> mcontext or are some people likely to expect the mcontext to always be
> at the same offset from the beginning of the signal frame ?
As far as I can see, both GDB and the GCC unwind-from-signal code
always read the uc_regs pointer. (Or the sigcontext.regs pointer for
old-style signal frames.)
> There are a few other issues... one is, the pad fields aren't cleared.
> Thus how can userland or rt_sigreturn differentiate between a valid
> highregs pointer and random junk ? Is there a trick one of you can come
> up with that I could do to let userland/gdb/rt_sigreturn know that
> there's something there ?
One idea we had about this was to use a bit in uc_flags. Those are
completely unused today, but should always have been initialized to 0.
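A minimal sketch of how a consumer could use such a bit, assuming a
hypothetical flag named SKETCH_UC_HIGHREGS and a highregs field carried as
a 32-bit user address. Neither the name nor the bit value comes from an
existing header.

```c
#include <stdint.h>

#define SKETCH_UC_HIGHREGS 0x1u   /* hypothetical bit in uc_flags */

/*
 * Decide whether the highregs field may be trusted.
 * Frames from kernels that never set the bit keep uc_flags at 0,
 * so whatever sits in the (uncleared) pad field is ignored.
 */
static int sketch_highregs_present(uint32_t uc_flags, uint32_t highregs_addr)
{
    if (!(uc_flags & SKETCH_UC_HIGHREGS))
        return 0;

    return highregs_addr != 0;
}
```

This gives gdb and rt_sigreturn the same test: check the flag first, and
only then look at the pointer.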
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
GNU compiler/toolchain for Linux on System z and Cell BE
IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung:
Herbert Kircher
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 13:12 ` Segher Boessenkool
@ 2007-05-29 14:00 ` Steve Munroe
2007-05-29 14:08 ` Ulrich Weigand
` (2 more replies)
0 siblings, 3 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-29 14:00 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
Segher Boessenkool <segher@kernel.crashing.org> wrote on 05/29/2007
08:12:24 AM:
> >> - how can an application know that it can use 64 bit registers and
> >> call
> >> the optimized routines?
> >
> > I'd say use the 32 bits ABI, AT_HWCAP will tell you if you are running
> > on a 64 bits capable machine. You can then either use hand tuned code
> > at
> > runtime, or I think ld.so can load alternate libs based on the bits in
> > there.
>
> Or you can simply only install 64-bit binaries on 64-bit
> machines.
>
Yes, exactly: why make an incompatible ABI change to the powerpc32 ABI when
you can just use the existing 64-bit ABI?
Especially as you can only run what is proposed on 64-bit hardware!
We don't need another ABI change to powerpc32 (still recovering from the
-msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
ABI changes ripple everywhere (not just GCC/GLIBC) including all debuggers
and performance tools. Believe me you really don't want this.
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:00 ` Steve Munroe
@ 2007-05-29 14:08 ` Ulrich Weigand
2007-05-29 14:17 ` Kumar Gala
` (2 more replies)
2007-05-29 14:28 ` Segher Boessenkool
2007-05-29 21:37 ` Benjamin Herrenschmidt
2 siblings, 3 replies; 63+ messages in thread
From: Ulrich Weigand @ 2007-05-29 14:08 UTC (permalink / raw)
To: Steve Munroe; +Cc: linuxppc-dev list, Paul Mackerras, Anton Blanchard
Steve Munroe <sjmunroe@us.ibm.com> wrote on 05/29/2007 04:00:42 PM:
> Yes exactly why make an incompatible ABI change to the powerpc32 ABI,
> when you can just use the existing 64-bit ABI.
>
> Especially as you can only run what is proposed on 64-bit hardware!
>
> We don't need another ABI change to powerpc32 (still recovering from the
> -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
>
> ABI changes ripple everywhere (not just GCC/GLIBC) including all
> debuggers and performance tools. Believe me you really don't want this.
Fully agreed. This may have gotten lost in the discussion thread, but what
Ben originally proposed was *not* an ABI change, for exactly that reason.
We simply want to allow strictly local use of 64-bit registers for
performance optimization purposes, while still fully complying with
the 32-bit ABI.
Mit freundlichen Gruessen / Best Regards
Ulrich Weigand
--
Dr. Ulrich Weigand | Phone: +49-7031/16-3727
GNU compiler/toolchain for Linux on System z and Cell BE
IBM Deutschland Entwicklung GmbH
Vorsitzender des Aufsichtsrats: Martin Jetter | Geschäftsführung:
Herbert Kircher
Sitz der Gesellschaft: Böblingen | Registergericht: Amtsgericht
Stuttgart, HRB 243294
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:08 ` Ulrich Weigand
@ 2007-05-29 14:17 ` Kumar Gala
2007-05-29 14:38 ` Segher Boessenkool
2007-05-29 14:31 ` Segher Boessenkool
2007-05-29 14:51 ` Steve Munroe
2 siblings, 1 reply; 63+ messages in thread
From: Kumar Gala @ 2007-05-29 14:17 UTC (permalink / raw)
To: Ulrich Weigand
Cc: linuxppc-dev list, Paul Mackerras, Steve Munroe, Anton Blanchard
On May 29, 2007, at 9:08 AM, Ulrich Weigand wrote:
>
> Steve Munroe <sjmunroe@us.ibm.com> wrote on 05/29/2007 04:00:42 PM:
>
> > Yes exactly why make an incompatible ABI change to the powerpc32
> > ABI, when you can just use the existing 64-bit ABI.
> >
> > Especially as you can only run what is proposed on 64-bit hardware!
> >
> > We don't need another ABI change to powerpc32 (still recovering
> > from the -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
> >
> > ABI changes ripple everywhere (not just GCC/GLIBC) including all
> > debuggers and performance tools. Believe me you really don't want this.
>
> Fully agreed. This may have gotten lost in the discussion thread,
> but what Ben originally proposed was *not* an ABI change, for exactly
> that reason.
> We simply want to allow strictly local use of 64-bit registers for
> performance optimization purposes, while still fully complying with
> the 32-bit ABI.
But we can't do that any more since the architecture specifically
allows for the 'upper bits' not to have valid data in them.
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 13:04 ` Segher Boessenkool
@ 2007-05-29 14:28 ` Arnd Bergmann
2007-05-29 14:43 ` Segher Boessenkool
2007-05-29 21:27 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 63+ messages in thread
From: Arnd Bergmann @ 2007-05-29 14:28 UTC (permalink / raw)
To: linuxppc-dev
Cc: Steve Munroe, Ulrich Weigand, Paul Mackerras, Anton Blanchard
On Tuesday 29 May 2007, Segher Boessenkool wrote:
> > But it also makes me wonder about a few things:
> > - do you use the standard 32 bit ABI, in which case the caller of
> > libraries
> > does not care and the libraries can be put in the standard places,
>
> The compiler should either a) use full 64-bit only for
> "volatile" (call-clobbered) registers, or b) save and
> restore other 64-bit mode registers around calls. I don't
> remember if powerpc-linux-gcc does a) or b), if either.
As benh explained, the 64 bit register contents are maintained
over normal function calls, iirc the ABI treats the upper halves
of each register as call-clobbered.
The problem is really just signal handlers.
> > or are
> > there cases where the ability to pass 64 bit values in a single
> > register would improve performance to the point that it is worth
> > having an incompatible library (where to put it and how to name it)?
>
> That would be a third ABI. Is it worth that?
no ;-)
> > - how can an application know that it can use 64 bit registers and call
> > the optimized routines?
>
> Just call them and trap the SEGV ;-) You can check the
> aux vector of course, or ask glibc -- but the SEGV way
> is the only really portable way. Strange world :-)
shouldn't that be SIGILL?
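For what the "call it and trap the signal" approach could look like, here
is a minimal sketch using SIGILL. The function names are hypothetical and
the candidate routine is passed in as a plain function pointer; a real
probe would pass a routine containing a 64-bit instruction.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf sketch_probe_env;

/* Handler: abandon the candidate and jump back into the probe. */
static void sketch_on_sigill(int sig)
{
    (void)sig;
    siglongjmp(sketch_probe_env, 1);
}

/* Returns 1 if candidate() ran to completion, 0 if it raised SIGILL. */
static int sketch_runs_without_sigill(void (*candidate)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = sketch_on_sigill;
    sigemptyset(&sa.sa_mask);
    sigaction(SIGILL, &sa, &old);

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(sketch_probe_env, 1) == 0) {
        candidate();
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL);   /* put the previous handler back */
    return ok;
}

/* Stand-ins for testing on any machine. */
static void sketch_fake_supported(void) { }
static void sketch_fake_unsupported(void) { raise(SIGILL); }
```

The static jump buffer makes this sketch unsuitable for concurrent use
from several threads; checking the aux vector avoids that problem.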
> > As long
> > as such an option does not exist, the usefulness of this feature is
> > somewhat limited. In other words, GCC for now has support for ILP32 and
> > LP64 modes, but it would be better to also have support for IP32L64.
>
> ILP32LL64. The C "mode" stays the same, only the generated
> machine insns are changed.
right, as mentioned before, IP32L64 would imply introducing a new
ABI, which we don't want.
Arnd <><
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:00 ` Steve Munroe
2007-05-29 14:08 ` Ulrich Weigand
@ 2007-05-29 14:28 ` Segher Boessenkool
2007-05-29 21:37 ` Benjamin Herrenschmidt
2 siblings, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 14:28 UTC (permalink / raw)
To: Steve Munroe
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>>>> - how can an application know that it can use 64 bit registers and
>>>> call
>>>> the optimized routines?
>>>
>>> I'd say use the 32 bits ABI, AT_HWCAP will tell you if you are
>>> running
>>> on a 64 bits capable machine. You can then either use hand tuned code
>>> at
>>> runtime, or I think ld.so can load alternate libs based on the bits
>>> in
>>> there.
>>
>> Or you can simply only install 64-bit binaries on 64-bit
>> machines.
>>
> Yes exactly why make an incompatible ABI change to the powerpc32 ABI,
> when you can just use the existing 64-bit ABI.
I meant programs using 64-bit insns while running in the
32-bit personality when I said "64-bit binaries". No ABI
change is necessary, except very few applications might
want to look at saved registers.
Plain 64-bit programs using 32 bits of address space only
is a much nicer idea indeed. And it doesn't even need
an ABI change! Just a new kernel personality (or an ELF
header flag or whatever).
> Especially as you can only run what is proposed on 64-bit hardware!
>
> We don't need another ABI change to powerpc32 (still recovering from
> the
> -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
>
> ABI changes ripple everywhere (not just GCC/GLIBC) including all
> debuggers
> and performance tools. Believe me you really don't want this.
I've had to deal with exactly this on Darwin before, and
although that was as a simple user only, I can confirm:
I really do not want it. Unless it magically would be
100% stable at once of course :-)
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:08 ` Ulrich Weigand
2007-05-29 14:17 ` Kumar Gala
@ 2007-05-29 14:31 ` Segher Boessenkool
2007-05-29 14:51 ` Steve Munroe
2 siblings, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 14:31 UTC (permalink / raw)
To: Ulrich Weigand
Cc: Steve Munroe, linuxppc-dev list, Paul Mackerras, Anton Blanchard
> > We don't need another ABI change to powerpc32 (still recovering
> > from the -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
> >
> > ABI changes ripple everywhere (not just GCC/GLIBC) including all
> > debuggers and performance tools. Believe me you really don't want this.
>
> Fully agreed. This may have gotten lost in the discussion thread, but
> what Ben originally proposed was *not* an ABI change, for exactly that
> reason.
> We simply want to allow strictly local use of 64-bit registers for
> performance optimization purposes, while still fully complying with
> the 32-bit ABI.
Some stuff gets added to the user version of the signal
frame; is that not an ABI change? Quite possibly a
(supposedly) compatible change, but a change anyway.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:17 ` Kumar Gala
@ 2007-05-29 14:38 ` Segher Boessenkool
2007-05-29 19:04 ` Becky Bruce
0 siblings, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 14:38 UTC (permalink / raw)
To: Kumar Gala
Cc: Ulrich Weigand, Paul Mackerras, Steve Munroe, Anton Blanchard,
linuxppc-dev list
> But we can't do that any more since the architecture specifically
> allows for the 'upper bits' not to have valid data in them.
That's not PowerPC, that's BookE ;-P If what you're
saying is true in the 2.03 POWER ISA, even for server
class implementations, that would be very unfortunate.
Or perhaps you misread, there are a few insns that
have undefined results in the high half word. Could
you give a reference please?
If you mean Book I 1.5.2, that is for embedded class
(i.e., BookE) CPUs only.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:28 ` Arnd Bergmann
@ 2007-05-29 14:43 ` Segher Boessenkool
2007-05-29 15:54 ` Geert Uytterhoeven
2007-05-29 18:48 ` Arnd Bergmann
0 siblings, 2 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-29 14:43 UTC (permalink / raw)
To: Arnd Bergmann
Cc: linuxppc-dev, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>> Just call them and trap the SEGV ;-) You can check the
>> aux vector of course, or ask glibc -- but the SEGV way
>> is the only really portable way. Strange world :-)
>
> shouldn't that be SIGILL?
Yes sir. Although you can do it with segmentation
faults as well, I did mean SIGILL.
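[Editor's sketch] The "call it and trap SIGILL" probe mentioned above can be illustrated roughly as follows. This is only a minimal sketch of the general technique, not code from anyone in this thread. The names probe_runs_without_sigill, candidate_ok and candidate_illegal are made up for illustration, and candidate_illegal merely raises SIGILL itself as a stand-in for executing an unsupported 64-bit insn. As noted elsewhere in the thread, a successful probe does not prove that the kernel preserves the upper register halves.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* SIGILL handler: unwind back into the probe instead of dying. */
static void on_sigill(int sig)
{
    (void)sig;
    siglongjmp(probe_env, 1);
}

/* Returns 1 if candidate() ran to completion, 0 if it raised SIGILL,
 * -1 if the handler could not be installed. */
int probe_runs_without_sigill(void (*candidate)(void))
{
    struct sigaction sa, old;
    volatile int ok = 0;

    memset(&sa, 0, sizeof sa);
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so the signal mask is restored after siglongjmp. */
    if (sigsetjmp(probe_env, 1) == 0) {
        candidate();
        ok = 1;
    }

    sigaction(SIGILL, &old, NULL);   /* put the previous handler back */
    return ok;
}

/* Hypothetical candidates: a real probe would execute one 64-bit insn. */
void candidate_ok(void) { }
void candidate_illegal(void) { raise(SIGILL); }
```

A caller would run the probe once at startup and pick the optimized or the plain routine based on the result.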
>> ILP32LL64. The C "mode" stays the same, only the generated
>> machine insns are changed.
>
> right, as mentioned before, IP32L64 would imply introducing a new
> ABI, which we don't want.
Hey, you could hijack the mswindows ABI, no need to
define your own ;-)
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:08 ` Ulrich Weigand
2007-05-29 14:17 ` Kumar Gala
2007-05-29 14:31 ` Segher Boessenkool
@ 2007-05-29 14:51 ` Steve Munroe
2007-05-29 21:44 ` Benjamin Herrenschmidt
2007-05-30 3:37 ` Paul Mackerras
2 siblings, 2 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-29 14:51 UTC (permalink / raw)
To: Ulrich Weigand; +Cc: linuxppc-dev list, Paul Mackerras, Anton Blanchard
Ulrich Weigand <Ulrich.Weigand@de.ibm.com> wrote on 05/29/2007 09:08:39 AM:
>
> Steve Munroe <sjmunroe@us.ibm.com> wrote on 05/29/2007 04:00:42 PM:
>
> > Yes exactly why make an incompatible ABI change to the powerpc32 ABI,
> > when you can just use the existing 64-bit ABI.
> >
> > Especially as you can only run what is proposed on 64-bit hardware!
> >
> > We don't need another ABI change to powerpc32 (still recovering from
> > the -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
> >
> > ABI changes ripple everywhere (not just GCC/GLIBC) including all
> > debuggers and performance tools. Believe me you really don't want this.
>
> Fully agreed. This may have gotten lost in the discussion thread, but
> what Ben originally proposed was *not* an ABI change, for exactly that
> reason.
> We simply want to allow strictly local use of 64-bit registers for
> performance optimization purposes, while still fully complying with
> the 32-bit ABI.
>
But unless you take the time to write it up like a full ABI change you are
never sure that it IS compatible. And any change to the size/shape of
ucontext_t is an ABI change.
Also if you want to debug this code (see long long variables correctly from
GDB or even see the upper 32-bits of GPRs) you will need an ABI change so
that GDB/DWARF knows what to do.
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:43 ` Segher Boessenkool
@ 2007-05-29 15:54 ` Geert Uytterhoeven
2007-05-29 18:48 ` Arnd Bergmann
1 sibling, 0 replies; 63+ messages in thread
From: Geert Uytterhoeven @ 2007-05-29 15:54 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Arnd Bergmann, Ulrich Weigand, Steve Munroe, linuxppc-dev,
Paul Mackerras, Anton Blanchard
On Tue, 29 May 2007, Segher Boessenkool wrote:
> > right, as mentioned before, IP32L64 would imply introducing a new
> > ABI, which we don't want.
>
> Hey, you could hijack the mswindows ABI, no need to
> define your own ;-)
Isn't that (IL32)P64?
Gr{oetje,eeting}s,
Geert
--
Geert Uytterhoeven -- Sony Network and Software Technology Center Europe (NSCE)
Geert.Uytterhoeven@sonycom.com ------- The Corporate Village, Da Vincilaan 7-D1
Voice +32-2-7008453 Fax +32-2-7008622 ---------------- B-1935 Zaventem, Belgium
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:43 ` Segher Boessenkool
2007-05-29 15:54 ` Geert Uytterhoeven
@ 2007-05-29 18:48 ` Arnd Bergmann
1 sibling, 0 replies; 63+ messages in thread
From: Arnd Bergmann @ 2007-05-29 18:48 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Tuesday 29 May 2007, Segher Boessenkool wrote:
> > right, as mentioned before, IP32L64 would imply introducing a new
> > ABI, which we don't want.
>
> Hey, you could hijack the mswindows ABI, no need to
> define your own ;-)
Windows is IL32P64, not IP32L64
Arnd <><
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:38 ` Segher Boessenkool
@ 2007-05-29 19:04 ` Becky Bruce
2007-05-30 10:04 ` Christoph Hellwig
2007-05-30 12:30 ` Segher Boessenkool
0 siblings, 2 replies; 63+ messages in thread
From: Becky Bruce @ 2007-05-29 19:04 UTC (permalink / raw)
To: Segher Boessenkool
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On May 29, 2007, at 9:38 AM, Segher Boessenkool wrote:
>> But we can't do that any more since the architecture specifically
>> allows for the 'upper bits' not to have valid data in them.
>
> That's not PowerPC, that's BookE ;-P
Is anybody within kicking distance of Segher? ;-P
> If what you're
> saying is true in the 2.03 POWER ISA, even for server
> class implementations, that would be very unfortunate.
I believe Category:Server implementations are required to behave as
you expect. Category:Embedded implementations may have different
behavior. Like it or not, BookE *is* part of the Power architecture,
and there are going to be 64-bit implementations of BookE that we
need to take into account.
>
> Or perhaps you misread, there are a few insns that
> have undefined results in the high half word. Could
> you give a reference please?
>
> If you mean Book I 1.5.2, that is for embedded class
> (i.e., BookE) CPUs only.
I think that's exactly what Kumar's talking about. The assumption
that all 64-bit Power processors will use those upper bits in some
meaningful way is not valid. Also, ld and std do not architecturally
"just work" on BookE implementations running in 32b mode. Those
instructions are part of the 64-bit category in 2.03, and may illop
on BookE processors running in 32b mode.
Cheers,
-Becky
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 13:04 ` Segher Boessenkool
2007-05-29 14:28 ` Arnd Bergmann
@ 2007-05-29 21:27 ` Benjamin Herrenschmidt
2007-05-29 21:45 ` Felix Domke
1 sibling, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 21:27 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Tue, 2007-05-29 at 15:04 +0200, Segher Boessenkool wrote:
>
> Just call them and trap the SEGV ;-) You can check the
> aux vector of course, or ask glibc -- but the SEGV way
> is the only really portable way. Strange world :-)
SIGILL you mean ? :-)
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 13:10 ` Kumar Gala
@ 2007-05-29 21:32 ` Benjamin Herrenschmidt
2007-05-29 23:46 ` Olof Johansson
0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 21:32 UTC (permalink / raw)
To: Kumar Gala
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Anton Blanchard,
Paul Mackerras
On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> This is all problematic since some 64-bit implementations may not
> guarantee the upper bits are valid when in 32-bit mode. Look at the
> 'Computation Modes' section in the architecture specs 2.03 or
> greater
> for embedded processors.
Yuck. Well, we might need to export a separate CPU feature bit to
indicate that it's the case then.
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:00 ` Steve Munroe
2007-05-29 14:08 ` Ulrich Weigand
2007-05-29 14:28 ` Segher Boessenkool
@ 2007-05-29 21:37 ` Benjamin Herrenschmidt
2007-05-29 21:38 ` Benjamin Herrenschmidt
2 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 21:37 UTC (permalink / raw)
To: Steve Munroe
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On Tue, 2007-05-29 at 09:00 -0500, Steve Munroe wrote:
>
> >
> Yes exactly why make an incompatible ABI change to the powerpc32 ABI,
> when you can just use the existing 64-bit ABI.
Why do you keep saying we are making an incompatible ABI change while we
are not ?
> Especially as you can only run what is proposed on 64-bit hardware!
Because people want to do it ... I suspect this has a lot to do with not
having 64 bits pointers or providing specific optimisations in low level
routines within overall 32 bits apps but I don't know the details.
> We don't need another ABI change to powerpc32 (still recovering from
> the
> -msecure-plt ABI change) and WE DONT NEED a 3rd ABI.
>
> ABI changes ripple everywhere (not just GCC/GLIBC) including all
> debuggers
> and performance tools. Believe me you really don't want this.
BUT WE ARE NOT CHANGING THE BLOODY ABI IN ANY INCOMPATIBLE WAY SHAPE OR
FORM AND THERE IS NO NEED TO CHANGE GLIBC ! Have I been clear enough ?
If not, I'll let Uli explain why again for the 4th time at least.
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:37 ` Benjamin Herrenschmidt
@ 2007-05-29 21:38 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 21:38 UTC (permalink / raw)
To: Steve Munroe
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
> BUT WE ARE NOT CHANGING THE BLOODY ABI IN ANY INCOMPATIBLE WAY SHAPE OR
> FORM AND THERE IS NO NEED TO CHANGE GLIBC ! Have I been clear enough ?
> If not, I'll let Uli explain why again for the 4th time at least.
Sorry for the shouting... shouldn't reply to email before I had
breakfast :-)
Cheers,
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:51 ` Steve Munroe
@ 2007-05-29 21:44 ` Benjamin Herrenschmidt
2007-05-29 23:16 ` Steve Munroe
2007-05-30 11:40 ` Segher Boessenkool
2007-05-30 3:37 ` Paul Mackerras
1 sibling, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 21:44 UTC (permalink / raw)
To: Steve Munroe
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
> But unless you take the time to write it up like a full ABI change you are
> never sure that it IS compatible. And any change to the size/shape of
> ucontext_t is an ABI change.
>
> Also if you want to debug this code (see long long variables correctly from
> GDB or even see the upper 32-bits of GPRs) you will need an ABI change so
> that GDB/DWARF knows what to do.
I personally don't care about gdb seeing those or anything like that,
those would be strictly local asm optimisations, at least that's my
point of view on the matter.
I intend not to extend or change the shape of ucontext either. I'll add
the highregs after the ucontext32 on the compat signal frame, the only
change/addition is the use of a pad field to point to it and maybe
setting a flag that was previously unused and always 0 to indicate that
it's there.
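[Editor's sketch] The layout described above (high halves stored after the ucontext32, located through a reused pad field and announced by a flag) can be modelled roughly as below. Everything here is hypothetical: struct hyp_sigframe32, HYP_UC_FLAG_HIGHREGS and hyp_read_gpr64 are illustration names, the struct is not the real compat signal frame, and the pad field holds a byte offset from the start of the frame rather than the user-space pointer proposed in the mail, so that the sketch runs on any host.

```c
#include <stddef.h>
#include <stdint.h>

#define HYP_NGPRS            32
#define HYP_UC_FLAG_HIGHREGS 0x1u   /* hypothetical "highregs present" flag */

/* Hypothetical stand-in for the compat signal frame. */
struct hyp_sigframe32 {
    uint32_t flags;                 /* previously unused, always 0 */
    uint32_t pad_highregs;          /* reused pad: offset of gpr_high */
    uint32_t gpr_low[HYP_NGPRS];    /* what 32-bit consumers already see */
    uint32_t gpr_high[HYP_NGPRS];   /* new: top 32 bits of each GPR */
};

/* Reads GPR n as a 64-bit value.
 * Returns 1 if the high half was available, 0 if only the low half is
 * valid (flag clear, pad field may hold junk), -1 on a bad index. */
int hyp_read_gpr64(const struct hyp_sigframe32 *f, unsigned n, uint64_t *out)
{
    const uint32_t *high;

    if (n >= HYP_NGPRS)
        return -1;

    if (!(f->flags & HYP_UC_FLAG_HIGHREGS)) {
        *out = f->gpr_low[n];       /* never trust the pad without the flag */
        return 0;
    }

    high = (const uint32_t *)((const char *)f + f->pad_highregs);
    *out = ((uint64_t)high[n] << 32) | f->gpr_low[n];
    return 1;
}
```

The flag check is what lets an old frame, whose pad field was never cleared, be read safely: without the flag the pad contents are ignored.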
Do you see any possible compatibility problem there ? Do you know of any
piece of software that makes hard assumptions on the shape and size of a
complete signal frame (not just the ucontext part of it) ?
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:27 ` Benjamin Herrenschmidt
@ 2007-05-29 21:45 ` Felix Domke
2007-05-30 11:23 ` Benjamin Herrenschmidt
2007-05-30 11:54 ` Segher Boessenkool
0 siblings, 2 replies; 63+ messages in thread
From: Felix Domke @ 2007-05-29 21:45 UTC (permalink / raw)
To: linuxppc-dev list
Benjamin Herrenschmidt wrote:
> On Tue, 2007-05-29 at 15:04 +0200, Segher Boessenkool wrote:
>> Just call them and trap the SEGV ;-) You can check the
>> aux vector of course, or ask glibc -- but the SEGV way
>> is the only really portable way. Strange world :-)
> SIGILL you mean ? :-)
Anyway, please don't. It is *not* portable.
Or can you guarantee that no CPU ever will implement a.) only a 64bit
subset or b.) other instructions using the same encoding as the 64bit
insn you will use for testing?
I still remember the pain of trying to tell ffmpeg that my CPU
can't do real altivec, even when it implements some parts of it without
SIGILLing (which ffmpeg used for testing).
And: What will happen if you manage to run your code under an operating
system which doesn't even save the upper bits at all on interrupts? You
can't check for that with SIGILL.
Having a decent way (like aux/glibc) would also solve the problem with
"incompatible CPUs", which you mentioned.
And I'd still like to see some decent ILP32LL64 support. Maybe even with
a new "native 64bit" datatype (how ugly), in order to not break the ABI.
I just want to call my hand-optimized 64bit assembler code with 64bit
arguments.
How does OS X handle this? Don't they have the same problem there?
Felix
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:44 ` Benjamin Herrenschmidt
@ 2007-05-29 23:16 ` Steve Munroe
2007-05-29 23:19 ` Benjamin Herrenschmidt
2007-05-30 7:34 ` Hiroyuki Machida
2007-05-30 11:40 ` Segher Boessenkool
1 sibling, 2 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-29 23:16 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/29/2007
04:44:15 PM:
>
> > But unless you take the time to write it up like a full ABI change you
> > are never sure that it IS compatible. And any change to the size/shape
> > of ucontext_t is an ABI change.
> >
> > Also if you want to debug this code (see long long variables correctly
> > from GDB or even see the upper 32-bits of GPRs) you will need an ABI
> > change so that GDB/DWARF knows what to do.
>
> I personally don't care about gdb seeing those or anything like that,
> those would be strictly local asm optimisations, at least that's my
> point of view on the matter.
>
Well others do. If gcc supports code gen for this they will expect GDB
support.
> I intend not to extend or change the shape of ucontext either. I'll add
> the highregs after the ucontext32 on the compat signal frame, the only
> change/addition is the use of a pad field to point to it and maybe
> setting a flag that was previously unused and always 0 to indicate that
> it's there.
>
The pad field may be occupied with data if the code was compiled on an older
distro and ucontext_t is misaligned (an odd doubleword). So the pad field
is free for reuse only if you version the code to handle unaligned VMX
registers.
> Do you see any possible compatibility problem there ? Do you know of any
> piece of software that makes hard assumptions on the shape and size of a
> complete signal frame (not just the ucontext part of it) ?
>
The signal frame can change, as long as the relative offset and size of the
ucontext_t is unchanged.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 23:16 ` Steve Munroe
@ 2007-05-29 23:19 ` Benjamin Herrenschmidt
2007-05-30 7:34 ` Hiroyuki Machida
1 sibling, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-29 23:19 UTC (permalink / raw)
To: Steve Munroe
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On Tue, 2007-05-29 at 18:16 -0500, Steve Munroe wrote:
>
> The pad field may be occupied with data if the code was compiled on a
> older
> distro and ucontext_t is misaligned (an odd Doubleword). So the pad
> field
> is free for reuse only if you version the code to handle unaligned VMX
> registers.
Actually... there are two pad areas ... also, I would only use that on
contexts that I generated myself (signal contexts) so I suppose that
should be all right.
> > Do you see any possible compatibility problem there ? Do you know of
> any
> > piece of software that makes hard assumptions on the shape and size
> of a
> > complete signal frame (not just the ucontext part of it) ?
> >
>
> The signal frame can change, as long as the relative offset and size
> of the
> ucontext_t is unchanged.
Ok.
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:32 ` Benjamin Herrenschmidt
@ 2007-05-29 23:46 ` Olof Johansson
2007-05-30 0:43 ` Kumar Gala
0 siblings, 1 reply; 63+ messages in thread
From: Olof Johansson @ 2007-05-29 23:46 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt wrote:
> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> > This is all problematic since some 64-bit implementations may not
> > guarantee the upper bits are valid when in 32-bit mode. Look at the
> > 'Computation Modes' section in the architecture specs 2.03 or
> > greater
> > for embedded processors.
>
> Yuck. Well, we might need to export a separate CPU feature bit to
> indicate that it's the case then.
No need for a new bit, you should be able to key off of PPC_FEATURE_64
&& !PPC_FEATURE_BOOKE.
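[Editor's sketch] The test suggested above can be written as a small predicate on the AT_HWCAP word. This is a sketch under assumptions: the function name hwcap_allows_high_gprs is made up, and the two constants are defined locally (with the values the kernel headers are believed to use) so the snippet compiles on any host.

```c
/* Assumed values; on a powerpc system they come from the kernel headers. */
#ifndef PPC_FEATURE_64
#define PPC_FEATURE_64    0x40000000UL
#endif
#ifndef PPC_FEATURE_BOOKE
#define PPC_FEATURE_BOOKE 0x00008000UL
#endif

/* Returns 1 when the CPU is 64-bit capable and not BookE, i.e. when the
 * upper GPR halves are expected to be usable in 32-bit mode; else 0. */
int hwcap_allows_high_gprs(unsigned long hwcap)
{
    return (hwcap & PPC_FEATURE_64) && !(hwcap & PPC_FEATURE_BOOKE);
}
```

On a glibc new enough to provide it, the argument could come from getauxval(AT_HWCAP) in <sys/auxv.h>. Note that this predicate deliberately excludes every BookE part, including any that do keep the full 64-bit registers.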
-Olof
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 23:46 ` Olof Johansson
@ 2007-05-30 0:43 ` Kumar Gala
2007-05-30 2:54 ` Steve Munroe
0 siblings, 1 reply; 63+ messages in thread
From: Kumar Gala @ 2007-05-30 0:43 UTC (permalink / raw)
To: Olof Johansson
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
> On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
> wrote:
>> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
>>> This is all problematic since some 64-bit implementations may not
>>> guarantee the upper bits are valid when in 32-bit mode. Look at the
>>> 'Computation Modes' section in the architecture specs 2.03 or
>>> greater
>>> for embedded processors.
>>
>> Yuck. Well, we might need to export a separate CPU feature bit to
>> indicate that it's the case then.
>
> No need for a new bit, you should be able to key off of PPC_FEATURE_64
> && !PPC_FEATURE_BOOKE.
Nope, the architecture allows embedded to behave like server parts
and support the full 64-bit registers. We really should have a new
feature bit so that if someone has an implementation of an embedded
part that supports the functionality, they get the benefit.
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 0:43 ` Kumar Gala
@ 2007-05-30 2:54 ` Steve Munroe
2007-05-30 5:31 ` Kumar Gala
0 siblings, 1 reply; 63+ messages in thread
From: Steve Munroe @ 2007-05-30 2:54 UTC (permalink / raw)
To: Kumar Gala
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard, Olof Johansson
Kumar Gala <galak@kernel.crashing.org> wrote on 05/29/2007 07:43:05 PM:
>
> On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
>
> > On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
> > wrote:
> >> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> >>> This is all problematic since some 64-bit implementations may not
> >>> guarantee the upper bits are valid when in 32-bit mode. Look at the
> >>> 'Computation Modes' section in the architecture specs 2.03 or
> >>> greater
> >>> for embedded processors.
> >>
> >> Yuck. Well, we might need to export a separate CPU feature bit to
> >> indicate that it's the case then.
> >
> > No need for a new bit, you should be able to key off of PPC_FEATURE_64
> > && !PPC_FEATURE_BOOKE.
>
> Nope, the architecture allows embedded to behave like server parts
> and support the full 64-bit registers. We really should have a new
> feature bit so that if someone has an implementation of an embedded
> part that supports the functionality, they get the benefit.
>
When such exists we can add a bit, until then we can wait. The current
32-bit AT_HWCAP is almost full, so we should not allocate bits on
speculation.
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 14:51 ` Steve Munroe
2007-05-29 21:44 ` Benjamin Herrenschmidt
@ 2007-05-30 3:37 ` Paul Mackerras
2007-05-30 5:32 ` Kumar Gala
2007-05-30 11:59 ` Segher Boessenkool
1 sibling, 2 replies; 63+ messages in thread
From: Paul Mackerras @ 2007-05-30 3:37 UTC (permalink / raw)
To: Steve Munroe; +Cc: Ulrich Weigand, linuxppc-dev list, Anton Blanchard
Steve Munroe writes:
> But unless you take the time to write it up like a full ABI change you are
> never sure that it IS compatible. And any change to the size/shape of
> ucontext_t is an ABI change.
There is no change to the size or shape of the ucontext_t. There is
no change to the ABI at all, in the sense that everything that is
currently guaranteed by the ABI is still guaranteed. An extra
guarantee is added: the top 32 bits of the GPRs will not change
unpredictably as long as you don't call a function and don't use
setcontext or swapcontext to return from a signal handler.
I think actually it would be useful to have the saving/restoring of
the high 32 bits controlled by a prctl, so that programs have to ask
explicitly for the new behaviour (and programs that don't want to use
the high 32 bits don't incur the extra overhead).
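[Editor's sketch] An opt-in along the lines suggested above might look like this from user space. It is purely illustrative: the thread names no prctl option, so PR_HYP_SET_HIGHREGS and its number are invented here, and on any kernel that does not know the option the call is expected to fail.

```c
#include <errno.h>
#include <sys/prctl.h>

/* HYPOTHETICAL option number, invented for this sketch only. */
#define PR_HYP_SET_HIGHREGS 0x48524547

/* Asks the kernel to save/restore the top 32 bits of the GPRs for this
 * process. Returns 0 on success or a negative errno value on failure
 * (for example when the kernel or the hardware does not support it). */
int hyp_enable_highregs_save(void)
{
    if (prctl(PR_HYP_SET_HIGHREGS, 1UL, 0UL, 0UL, 0UL) == 0)
        return 0;
    return -errno;
}
```

A program would call this once at startup and use its 64-bit routines only when it returns 0, falling back to plain 32-bit code otherwise. That gives the explicit error path for unsupported hardware that an opt-in interface allows.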
Paul.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 2:54 ` Steve Munroe
@ 2007-05-30 5:31 ` Kumar Gala
2007-05-30 19:47 ` Steve Munroe
0 siblings, 1 reply; 63+ messages in thread
From: Kumar Gala @ 2007-05-30 5:31 UTC (permalink / raw)
To: Steve Munroe
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard, Olof Johansson
On May 29, 2007, at 9:54 PM, Steve Munroe wrote:
>
> Kumar Gala <galak@kernel.crashing.org> wrote on 05/29/2007 07:43:05
> PM:
>
>>
>> On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
>>
>>> On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
>>> wrote:
>>>> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
>>>>> This is all problematic since some 64-bit implementations may not
>>>>> guarantee the upper bits are valid when in 32-bit mode. Look
>>>>> at the
>>>>> 'Computation Modes' section in the architecture specs 2.03 or
>>>>> greater
>>>>> for embedded processors.
>>>>
>>>> Yuck. Well, we might need to export a separate CPU feature bit to
>>>> indicate that it's the case then.
>>>
>>> No need for a new bit, you should be able to key off of
>>> PPC_FEATURE_64
>>> && !PPC_FEATURE_BOOKE.
>>
>> Nope, the architecture allows embedded to behave like server parts
>> and support the full 64-bit registers. We really should have a new
>> feature bit so that if someone has an implementation of an embedded
>> part that supports the functionality, they get the benefit.
>>
> When such exists we can add a bit, until then we can wait. The current
> 32-bit AT_HWCAP is almost full, so we should not allocate bits on
> speculation.
Understandable.. dare I ask about a few of the current AT_HWCAPs we
do have:
#define PPC_FEATURE_POWER4      0x00080000
#define PPC_FEATURE_POWER5      0x00040000
#define PPC_FEATURE_POWER5_PLUS 0x00020000
#define PPC_FEATURE_ARCH_2_05   0x00001000
#define PPC_FEATURE_PA6T        0x00000800
#define PPC_FEATURE_POWER6_EXT  0x00000200
What exactly are we using these for? Can we not use platform for
some of these?
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 3:37 ` Paul Mackerras
@ 2007-05-30 5:32 ` Kumar Gala
2007-05-30 11:44 ` Benjamin Herrenschmidt
2007-05-30 12:01 ` Segher Boessenkool
2007-05-30 11:59 ` Segher Boessenkool
1 sibling, 2 replies; 63+ messages in thread
From: Kumar Gala @ 2007-05-30 5:32 UTC (permalink / raw)
To: Paul Mackerras
Cc: Ulrich Weigand, Steve Munroe, Anton Blanchard, linuxppc-dev list
On May 29, 2007, at 10:37 PM, Paul Mackerras wrote:
> Steve Munroe writes:
>
>> But unless you take the time to write it up like a full ABI change
>> you are
>> never sure that it IS compatible. And any change to the size/shape of
>> ucontext_t is an ABI change.
>
> There is no change to the size or shape of the ucontext_t. There is
> no change to the ABI at all, in the sense that everything that is
> currently guaranteed by the ABI is still guaranteed. An extra
> guarantee is added: the top 32 bits of the GPRs will not change
> unpredictably as long as you don't call a function and don't use
> setcontext or swapcontext to return from a signal handler.
>
> I think actually it would be useful to have the saving/restoring of
> the high 32 bits controlled by a prctl, so that programs have to ask
> explicitly for the new behaviour (and programs that don't want to use
> the high 32 bits don't incur the extra overhead).
I like this, it means we can error if HW doesn't support it and
requires applications to do something specific to enable the feature.
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 23:16 ` Steve Munroe
2007-05-29 23:19 ` Benjamin Herrenschmidt
@ 2007-05-30 7:34 ` Hiroyuki Machida
1 sibling, 0 replies; 63+ messages in thread
From: Hiroyuki Machida @ 2007-05-30 7:34 UTC (permalink / raw)
To: Steve Munroe; +Cc: linuxppc-dev list
Hi Steve,
2007/5/30, Steve Munroe <sjmunroe@us.ibm.com>:
>
> Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/29/2007
> 04:44:15 PM:
:
:
> > >
> > > Also if you want to debug this code (see long long variables correctly
> > > from GDB or even see the upper 32-bits of GPRs) you will need an ABI
> > > change so that GDB/DWARF knows what to do.
As already mentioned, the ABI (calling convention and relocations) won't change,
so no extension to DWARF is required, I think. Do you have any concerns?
> >
> > I personally don't care about gdb seeing those or anything like that,
> > those would be strictly local asm optimisations, at least that's my
> > point of view on the matter.
> >
> Well others do. If gcc supports code gen for this they will expect GDB
> support.
Even now, gcc -m32 -mpowerpc64 produces 64 bit insns.
I've checked with gcc version 4.1.1 20070105 (Red Hat 4.1.1-51).
But in most case, developers would use these options to some
specific files which are critical to performance.
I think GDB support is not critical, but would be expected.
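[Editor's sketch] As an illustration of the kind of performance-critical file meant above, here is a small routine built on 64-bit arithmetic. The function name sum_of_products is made up. The code is plain C and runs anywhere; the assumption is only that, when such a file is built with the options mentioned above, the compiler may keep the accumulator in a single 64-bit register instead of a register pair.

```c
#include <stdint.h>

/* Accumulates a[i] * b[i] into a 64-bit sum. On a 32-bit target without
 * 64-bit insns each step needs several 32-bit operations and a carry. */
uint64_t sum_of_products(const uint32_t *a, const uint32_t *b, unsigned n)
{
    uint64_t acc = 0;
    unsigned i;

    for (i = 0; i < n; i++)
        acc += (uint64_t)a[i] * b[i];   /* widen before multiplying */
    return acc;
}
```

Because the function takes and returns values through the normal 32-bit calling convention, its callers need not know how it was compiled.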
---
Hiroyuki Machida
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 19:04 ` Becky Bruce
@ 2007-05-30 10:04 ` Christoph Hellwig
2007-05-30 12:13 ` Kumar Gala
2007-05-30 12:30 ` Segher Boessenkool
1 sibling, 1 reply; 63+ messages in thread
From: Christoph Hellwig @ 2007-05-30 10:04 UTC (permalink / raw)
To: Becky Bruce
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Tue, May 29, 2007 at 02:04:02PM -0500, Becky Bruce wrote:
> I think that's exactly what Kumar's talking about. The assumption
> that all 64-bit Power processors will use those upper bits in some
> meaningful way is not valid. Also, ld and std do not architecturally
> "just work" on BookE implementations running in 32b mode. Those
> instructions are part of the 64-bit category in 2.03, and may illop
> on BookE processors running in 32b mode.
Then Ben's suggested mode will only work on the sane IBM processors and
not the braindead Freescale ones. Wouldn't be the first time.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:45 ` Felix Domke
@ 2007-05-30 11:23 ` Benjamin Herrenschmidt
2007-05-30 11:52 ` Felix Domke
2007-05-30 11:54 ` Segher Boessenkool
1 sibling, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 11:23 UTC (permalink / raw)
To: Felix Domke; +Cc: linuxppc-dev list
> Anyway, please don't. It is *not* portable.
What are you talking about ? Really, I mean, I'm not sure I understand
what you mean :-)
> Or can you guarantee that no CPU ever will implement a.) only a 64bit
> subset or b.) other instructions using the same encoding as the 64bit
> insn you will use for testing?
Well, the idea is that we do expose via AT_HWCAP that the ppc64 insn set
is supported. I reckon we might just strip that bit for 32 bits
processes if they can't do 64 bits insn, no need to even get another
one.
> I still remember the pain of trying to tell that ffmpeg that my CPU
> can't do real altivec, even when it implements some parts of it without
> SIGILLing (which ffmpeg used for testing).
Yeah well, ffmpeg is crap, news at 11... there are ways to test whether
you have altivec or not (and more than one) but it looks like most
ffmpeg packages around don't care.
> And: What will happen if you manage to run your code under an operating
> system which doesn't even save the upper bits at all on interrupts? You
> can't check for that with SIGILL.
What are you talking about ? (bis) :-)
> Having a decent way (like aux/glibc) would also solve the problem with
> "incompatible CPUs", which you mentioned.
Ugh ?
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:44 ` Benjamin Herrenschmidt
2007-05-29 23:16 ` Steve Munroe
@ 2007-05-30 11:40 ` Segher Boessenkool
2007-05-30 11:48 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 11:40 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>> Also if you want to debug this code (see long long variables correctly
>> from GDB or even see the upper 32-bits of GPRs) you will need an ABI
>> change so that GDB/DWARF knows what to do.
>
> I personally don't care about gdb seeing those or anything like that,
> those would be strictly local asm optimisations, at least that's my
> point of view on the matter.
GDB can step into asm though, it will have to know
about it for full functionality.
> I intend not to extend or change the shape of ucontext neither. I'll add
> the highregs after the ucontext32 on the compat signal frame, the only
> change/addition is the use of a pad field to point to it and maybe
> setting a flag that was previously unused and always 0 to indicate that
> it's there.
>
> Do you see any possible compatibility problem there ? Do you know of any
> piece of software that makes hard assumptions on the shape and size of a
> complete signal frame (not just the ucontext part of it) ?
Perhaps something in a test suite somewhere; other
than that, nothing important I suspect. Well some
version of some JVM will abuse it I'm sure ;-)
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 5:32 ` Kumar Gala
@ 2007-05-30 11:44 ` Benjamin Herrenschmidt
2007-05-30 12:15 ` Kumar Gala
2007-05-30 21:02 ` Gabriel Paubert
2007-05-30 12:01 ` Segher Boessenkool
1 sibling, 2 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 11:44 UTC (permalink / raw)
To: Kumar Gala
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras, Steve Munroe,
Anton Blanchard
On Wed, 2007-05-30 at 00:32 -0500, Kumar Gala wrote:
> > I think actually it would be useful to have the saving/restoring of
> > the high 32 bits controlled by a prctl, so that programs have to ask
> > explicitly for the new behaviour (and programs that don't want to use
> > the high 32 bits don't incur the extra overhead).
>
> I like this, it means we can error if HW doesn't support it and
> requires applications to do something specific to enable the feature.
Yeah well.... I liked the prctl at first.. but then, I thought twice :-)
Thing is, a typical usage pattern would be some library having a hand
optimized tight loop or something like that using 64 bits registers. An
example would be some memcpy-type thing in glibc.
You don't want those things to do prctl's all over the place on behalf
of the host application.
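
To make the concern concrete, here is a rough sketch of the guard every such library routine would end up carrying. The option number PR_HYPOTHETICAL_SET_HIGHREGS is invented for the example (no such prctl exists), and so is the helper name. The sketch is not thread-safe; it only shows the "ask once, cache the answer" shape.

```c
#include <sys/prctl.h>

/* HYPOTHETICAL option number, made up for this sketch. No kernel knows it. */
#define PR_HYPOTHETICAL_SET_HIGHREGS 0x48524547

/* Cached answer: -1 = not asked yet, 0 = refused, 1 = granted. */
static int highregs_state = -1;

/* Ask the kernel once and remember the result.
 * Returns 1 if the 64-bit code path may be used, 0 otherwise. */
int highregs_enabled(void)
{
    if (highregs_state < 0)
        highregs_state =
            (prctl(PR_HYPOTHETICAL_SET_HIGHREGS, 1UL, 0UL, 0UL, 0UL) == 0);
    return highregs_state;
}
```

A memcpy-type routine would have to test highregs_enabled() before taking its 64 bits path, and every library shipping such a routine would need its own copy of the guard.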
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:40 ` Segher Boessenkool
@ 2007-05-30 11:48 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 11:48 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Wed, 2007-05-30 at 13:40 +0200, Segher Boessenkool wrote:
>
> > I personally don't care about gdb seeing those or anything like that,
> > those would be strictly local asm optimisations, at least that's my
> > point of view on the matter.
>
> GDB can step into asm though, it will have to know
> about it for full functionality.
It can already... there are ptrace hooks to get both halves if you are a
32 bits gdb and the 64 bits ptrace will return the full 64 bits. There
is no way to know whether an app or a lib is using both halves or not
though.
> Perhaps something in a test suite somewhere; other
> than that, nothing important I suspect. Well some
> version of some JVM will abuse it I'm sure ;-)
That's what I was worried about :-)
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:23 ` Benjamin Herrenschmidt
@ 2007-05-30 11:52 ` Felix Domke
2007-05-30 13:14 ` Segher Boessenkool
0 siblings, 1 reply; 63+ messages in thread
From: Felix Domke @ 2007-05-30 11:52 UTC (permalink / raw)
To: Benjamin Herrenschmidt; +Cc: linuxppc-dev list
>> Anyway, please don't. It is *not* portable.
> What are you talking about ? Really, I mean, I'm not sure I understand
> what you mean :-)
Segher stated that there is no portable way of detecting whether 64bit
instructions are available, and said that just trying a 64bit insn (and
catching the SIGILL if the cpu is 32bit) is probably the most portable
way to do so. (Or did I get *that* wrong? If so, please ignore.)
Now my objection is that the "SIGILL"-way is not only ugly, but can be
easily *wrong*, as there are certain possibilities (Book-E 64bit,
non-64bit-aware OS, ...) when the CPU might not throw an exception. (My
"ffmpeg with vmx"-experience shows this is a real world issue, although
the situation is a bit different there, I agree, since vmx opcodes are
not exclusively reserved for vmx)
Yes, those hwcap-bits should take care of this. Are they portable/usable,
even on older systems?
Felix
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 21:45 ` Felix Domke
2007-05-30 11:23 ` Benjamin Herrenschmidt
@ 2007-05-30 11:54 ` Segher Boessenkool
2007-05-30 12:07 ` Felix Domke
1 sibling, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 11:54 UTC (permalink / raw)
To: Felix Domke; +Cc: linuxppc-dev list
>>> Just call them and trap the SEGV ;-) You can check the
>>> aux vector of course, or ask glibc -- but the SEGV way
>>> is the only really portable way. Strange world :-)
>> SIGILL you mean ? :-)
> Anyway, please don't. It is *not* portable.
It is simple and *quite* portable. "Real" applications
of course should just ask the C library about this,
somewhere deep inside a maze of #ifdefs for all the OSes
supported. A job for autoxxxx I guess.
> Or can you guarantee that no CPU ever will implement a.) only a 64bit
> subset or b.) other instructions using the same encoding as the 64bit
> insn you will use for testing?
I don't understand a); and no CPU _should_ do b), but we
all know that bad things happen sometimes.
> I still remember the pain of trying to tell that ffmpeg that my CPU
> can't do real altivec, even when it implements some parts of it without
> SIGILLing (which ffmpeg used for testing).
To be fair, ffmpeg has had this test since before there
were proper ways to detect AltiVec on Linux/glibc.
> And: What will happen if you manage to run your code under an operating
> system which doesn't even save the upper bits at all on interrupts? You
> can't check for that with SIGILL.
Sure you can, do some loop with a data-dependent branch
in there that detects corruption of that high half reg.
If you really want a SIGILL you can just generate one ;-)
A test like this is never 100% of course.
> And i'd still like to see some decent ILP32LL64 support.
GCC has supported this for a long time now, it seems the
last few pieces needed from the kernel and userland support
are falling into place now. Hurray!
> Maybe even with
> a new "native 64bit" datatype (how ugly), in order to not break the
> ABI.
> I just want to call my hand-optimized 64bit assembler code with 64bit
> arguments.
> How does OS X handle this? Don't they have the same problem there?
AFAIK Darwin saves full 64-bit GPRs on a 64-bit CPU always.
An application can ask for 64-bit signal frames by giving
the SA_64REGSET flag in sa_flags to sigaction().
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 3:37 ` Paul Mackerras
2007-05-30 5:32 ` Kumar Gala
@ 2007-05-30 11:59 ` Segher Boessenkool
2007-05-30 12:01 ` Benjamin Herrenschmidt
1 sibling, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 11:59 UTC (permalink / raw)
To: Paul Mackerras
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Anton Blanchard
> I think actually it would be useful to have the saving/restoring of
> the high 32 bits controlled by a prctl, so that programs have to ask
> explicitly for the new behaviour (and programs that don't want to use
> the high 32 bits don't incur the extra overhead).
This prctl() is needed if you want to support DSOs containing
64-bit insns. It would be nice to have some ELF flag as well
though, so most of this can be handled automatically.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:59 ` Segher Boessenkool
@ 2007-05-30 12:01 ` Benjamin Herrenschmidt
2007-05-30 12:07 ` Segher Boessenkool
0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 12:01 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Wed, 2007-05-30 at 13:59 +0200, Segher Boessenkool wrote:
> This prctl() is needed if you want to support DSOs containing
> 64-bit insns. It would be nice to have some ELF flag as well
> though, so most of this can be handled automatically.
Actually, it's the opposite... the prctl becomes a problem if you have
libs wanting to use 64 bits for optims
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 5:32 ` Kumar Gala
2007-05-30 11:44 ` Benjamin Herrenschmidt
@ 2007-05-30 12:01 ` Segher Boessenkool
1 sibling, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 12:01 UTC (permalink / raw)
To: Kumar Gala
Cc: Ulrich Weigand, Paul Mackerras, Steve Munroe, Anton Blanchard,
linuxppc-dev list
>> I think actually it would be useful to have the saving/restoring of
>> the high 32 bits controlled by a prctl, so that programs have to ask
>> explicitly for the new behaviour (and programs that don't want to use
>> the high 32 bits don't incur the extra overhead).
>
> I like this, it means we can error if HW doesn't support it and
> requires applications to do something specific to enable the feature.
It also means every such application has to make sure
it calls the prctl() before running any 64-bit insns.
No one will get this right unless support for this
is put into the ELF loader.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:54 ` Segher Boessenkool
@ 2007-05-30 12:07 ` Felix Domke
2007-05-31 5:39 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 63+ messages in thread
From: Felix Domke @ 2007-05-30 12:07 UTC (permalink / raw)
To: Segher Boessenkool; +Cc: linuxppc-dev list
Segher Boessenkool wrote:
>>>> Just call them and trap the SEGV ;-) You can check the
>>>> aux vector of course, or ask glibc -- but the SEGV way
>>>> is the only really portable way. Strange world :-)
>>> SIGILL you mean ? :-)
>> Anyway, please don't. It is *not* portable.
> It is simple and *quite* portable. [...]
>> [ffmpeg]
> To be fair, ffmpeg has had this test since before there
> were proper ways to detect AltiVec on Linux/glibc.
That's *exactly* my point:
If you don't provide a real, portable, useful way *now* for detecting
compatibility with 64bit insn, people (=ffmpeg, mplayer first) *will*
invent their own way of detecting it, possibly using SIGILL, faster than
you could imagine.
Please avoid that this time. And please declare the use of SIGILL for
detecting extensions as plainly wrong, not as a "bad workaround, but
still better than what's available". If you can't be sure that an
extension will work as expected (for example because there is just no
interface to query the OS for it), then simply don't use it. If this is
going to be a performance problem, bug the kernel people to fix it.
(Sorry, this is my point of view as a pure *user*. I don't want
to debug crashing programs with incorrect memcpy results because some
program decided on its own that it's safe to use this extension when it
wasn't.)
>> And: What will happen if you manage to run your code under an operating
>> system which doesn't even save the upper bits at all on interrupts? You
>> can't check for that with SIGILL.
> Sure you can, do some loop with a data-dependent branch
> in there that detects corruption of that high half reg.
> If you really want a SIGILL you can just generate one ;-)
> A test like this is never 100% of course.
Tell that to those people who have a SIGILL check in their "production" code.
Felix
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:01 ` Benjamin Herrenschmidt
@ 2007-05-30 12:07 ` Segher Boessenkool
2007-05-30 12:09 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 12:07 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>> This prctl() is needed if you want to support DSOs containing
>> 64-bit insns. It would be nice to have some ELF flag as well
>> though, so most of this can be handled automatically.
>
> Actually, it's the opposite... the prctl becomes a problem if you have
> libs wanting to use 64 bits for optims
The host application, or the dynamic loader, can call
the prctl() when it loads the DSO that needs it.
In almost all cases this should all be transparent
for the user IMHO, based on some ELF flag.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:07 ` Segher Boessenkool
@ 2007-05-30 12:09 ` Benjamin Herrenschmidt
2007-05-30 12:36 ` Segher Boessenkool
0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 12:09 UTC (permalink / raw)
To: Segher Boessenkool
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On Wed, 2007-05-30 at 14:07 +0200, Segher Boessenkool wrote:
> > Actually, it's the opposite... the prctl becomes a problem if you have
> > libs wanting to use 64 bits for optims
>
> The host application, or the dynamic loader, can call
> the prctl() when it loads the DSO that needs it.
Provided you know it does... and with static binaries it gets harder...
> In almost all cases this should all be transparent
> for the user IMHO, based on some ELF flag.
You reckon ? I was wondering about that ... maybe we should define some
ELF personality for that ...
But that means that existing programs wouldn't get it even while some
libs they depend on might have such optims without the program knowing
about it ...
Also, if I can avoid changing glibc ... (you know how hard it is !)
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 10:04 ` Christoph Hellwig
@ 2007-05-30 12:13 ` Kumar Gala
0 siblings, 0 replies; 63+ messages in thread
From: Kumar Gala @ 2007-05-30 12:13 UTC (permalink / raw)
To: Christoph Hellwig
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
On May 30, 2007, at 5:04 AM, Christoph Hellwig wrote:
> On Tue, May 29, 2007 at 02:04:02PM -0500, Becky Bruce wrote:
>> I think that's exactly what Kumar's talking about. The assumption
>> that all 64-bit Power processors will use those upper bits in some
>> meaningful way is not valid. Also, ld and std do not architecturally
>> "just work" on BookE implementations running in 32b mode. Those
>> instructions are part of the 64-bit category in 2.03, and may illop
>> on BookE processors running in 32b mode.
>
> Then Ben's suggested mode will only work on the sane IBM processors and
> not the braindead freescale ones. Wouldn't be the first time.
This isn't sane vs braindead. It's one thing if this was some quirk
of an implementation, but we are talking about what the architecture
does and doesn't allow. There is a reasonable case for an embedded
processor being able to save power in 32-bit mode by not powering the
upper bits of the register file.
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:44 ` Benjamin Herrenschmidt
@ 2007-05-30 12:15 ` Kumar Gala
2007-05-30 12:48 ` Hiroyuki Machida
2007-05-30 21:02 ` Gabriel Paubert
1 sibling, 1 reply; 63+ messages in thread
From: Kumar Gala @ 2007-05-30 12:15 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras, Steve Munroe,
Anton Blanchard
On May 30, 2007, at 6:44 AM, Benjamin Herrenschmidt wrote:
> On Wed, 2007-05-30 at 00:32 -0500, Kumar Gala wrote:
>>> I think actually it would be useful to have the saving/restoring of
>>> the high 32 bits controlled by a prctl, so that programs have to ask
>>> explicitly for the new behaviour (and programs that don't want to use
>>> the high 32 bits don't incur the extra overhead).
>>
>> I like this, it means we can error if HW doesn't support it and
>> requires applications to do something specific to enable the feature.
>
> Yeah well.... I liked the prctl at first.. but then, I thought twice :-)
>
> Thing is, a typical usage pattern would be some library having a hand
> optimized tight loop or something like that using 64 bits registers. An
> example would be some memcpy-type thing in glibc.
>
> You don't want those things to do prctl's all over the place on behalf
> of the host application.
Yeah, I can see that being a pain. However, how would the AT_HWCAP
make this any easier for the library to detect? (I might have missed
the discussion of that magic in the thread).
- k
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-29 19:04 ` Becky Bruce
2007-05-30 10:04 ` Christoph Hellwig
@ 2007-05-30 12:30 ` Segher Boessenkool
1 sibling, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 12:30 UTC (permalink / raw)
To: Becky Bruce
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
>> If what you're
>> saying is true in the 2.03 POWER ISA, even for server
>> class implementations, that would be very unfortunate.
>
> I believe Category:Server implementations are required to behave as
> you expect. Category:Embedded implementations may have different
> behavior. Like it or not, BookE *is* part of the Power architecture,
> and there are going to be 64-bit implementations of BookE that we need
> to take into account.
Sure. How about we worry about this when support for
such CPUs is added _at all_ to Linux; for this issue,
the only required change would be to refuse the prctl()
call, right?
>> If you mean Book I 1.5.2, that is for embedded class
>> (i.e., BookE) CPUs only.
>
> I think that's exactly what Kumar's talking about. The assumption that
> all 64-bit Power processors will use those upper bits in some
> meaningful way is not valid. Also, ld and std do not architecturally
> "just work" on BookE implementations running in 32b mode. Those
> instructions are part of the 64-bit category in 2.03, and may illop on
> BookE processors running in 32b mode.
Sounds like it will be a lovely effort to add support for
such CPUs to Linux.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:09 ` Benjamin Herrenschmidt
@ 2007-05-30 12:36 ` Segher Boessenkool
0 siblings, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 12:36 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
>>> Actually, it's the opposite... the prctl becomes a problem if you have
>>> libs wanting to use 64 bits for optims
>>
>> The host application, or the dynamic loader, can call
>> the prctl() when it loads the DSO that needs it.
>
> Provided you know it does... and with static binaries it gets harder...
A mechanism like what is done for executable stacks
can be used, so you end up with something in the ELF
headers that tells you. Not that I like this particular
mechanism.
>> In almost all cases this should all be transparent
>> for the user IMHO, based on some ELF flag.
>
> You reckon ? I was wondering about that ... maybe we should define some
> ELF personality for that ...
>
> But that means that existing programs wouldn't get it even while some
> libs they depend on might have such optims without the program knowing
> about it ...
Like I said, the dynamic loader should do the work
in such cases.
> Also, if I can avoid changing glibc ... (you know how hard it is !)
No really? Tell me about it? :-)
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:15 ` Kumar Gala
@ 2007-05-30 12:48 ` Hiroyuki Machida
2007-05-30 12:58 ` Benjamin Herrenschmidt
0 siblings, 1 reply; 63+ messages in thread
From: Hiroyuki Machida @ 2007-05-30 12:48 UTC (permalink / raw)
To: Kumar Gala
Cc: linuxppc-dev list, Steve Munroe, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
2007/5/30, Kumar Gala <galak@kernel.crashing.org>:
>
> On May 30, 2007, at 6:44 AM, Benjamin Herrenschmidt wrote:
>
> > On Wed, 2007-05-30 at 00:32 -0500, Kumar Gala wrote:
> >>> I think actually it would be useful to have the saving/restoring of
> >>> the high 32 bits controlled by a prctl, so that programs have to ask
> >>> explicitly for the new behaviour (and programs that don't want to use
> >>> the high 32 bits don't incur the extra overhead).
> >>
> >> I like this, it means we can error if HW doesn't support it and
> >> requires applications to do something specific to enable the feature.
> >
> > Yeah well.... I liked the prctl at first.. but then, I thought twice :-)
> >
> > Thing is, a typical usage pattern would be some library having a hand
> > optimized tight loop or something like that using 64 bits registers. An
> > example would be some memcpy-type thing in glibc.
> >
> > You don't want those things to do prctl's all over the place on behalf
> > of the host application.
>
> Yeah, I can see that being a pain. However, how would the AT_HWCAP
> make this any easier on the library to detect? (I might have missed
> that discussion of that magic in the thread).
>
I think the same framework as proposed at the following URLs would work.
http://penguinppc.org/dev/glibc/glibc-powerpc-cpu-addon.html
http://sources.redhat.com/ml/libc-alpha/2006-01/msg00094.html
Hiroyuki..
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:48 ` Hiroyuki Machida
@ 2007-05-30 12:58 ` Benjamin Herrenschmidt
2007-05-30 18:09 ` Steve Munroe
0 siblings, 1 reply; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-30 12:58 UTC (permalink / raw)
To: Hiroyuki.Mach
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On Wed, 2007-05-30 at 21:48 +0900, Hiroyuki Machida wrote:
>
> I think the same framework as proposed at the following URLs would work.
> http://penguinppc.org/dev/glibc/glibc-powerpc-cpu-addon.html
> http://sources.redhat.com/ml/libc-alpha/2006-01/msg00094.html
Yeah, it would make sense to define a new feature bit to trigger
automatic loading of optimized libs...
Steve, how hard do you think it would be to extend AT_HWCAP ? Like
adding an AT_HWCAP2 or something like that ?
Ben.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:52 ` Felix Domke
@ 2007-05-30 13:14 ` Segher Boessenkool
0 siblings, 0 replies; 63+ messages in thread
From: Segher Boessenkool @ 2007-05-30 13:14 UTC (permalink / raw)
To: Felix Domke; +Cc: linuxppc-dev list
>>> Anyway, please don't. It is *not* portable.
>> What are you talking about ? Really, I mean, I'm not sure I understand
>> what you mean :-)
> Segher stated that there is no portable way of detecting whether 64bit
> instructions are available, and said that just trying a 64bit insn (and
> catching the SIGILL if the cpu is 32bit) is probably the most portable
> way to do so. (Or did I get *that* wrong? If so, please ignore.)
That is exactly what I said. Note I didn't say it
is the *best* method, just the most portable one :-)
> Now my objection is that the "SIGILL"-way is not only ugly, but can be
> easily *wrong*, as there are certain possibilities (Book-E 64bit,
> non-64bit-aware OS, ...) when the CPU might not throw an exception.
All those cases would throw an exception.
> (My
> "ffmpeg with vmx"-experience shows this is a real world issue, although
> the situation is a bit different there, i agree, since vmx opcodes are
> not exclusively reserved for vmx)
Yeah nasty business.
Segher
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:58 ` Benjamin Herrenschmidt
@ 2007-05-30 18:09 ` Steve Munroe
0 siblings, 0 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-30 18:09 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
Benjamin Herrenschmidt <benh@kernel.crashing.org> wrote on 05/30/2007
07:58:12 AM:
> On Wed, 2007-05-30 at 21:48 +0900, Hiroyuki Machida wrote:
> >
> > I think the same framework as proposed at the following URLs would work.
> > http://penguinppc.org/dev/glibc/glibc-powerpc-cpu-addon.html
This is about keeping the cpu-specific code separated out in the glibc
build system. The appropriate code can be selected by configuring
--with-cpu=<cpu-type>
> > http://sources.redhat.com/ml/libc-alpha/2006-01/msg00094.html
>
This is about selecting whole cpu-tuned libraries dynamically (at load
time) that are separated in the directory structure.
> Yeah, it would make sense to define a new feature bit to trigger
> automatic loading of optimized libs...
>
For dynamic library selection the primary selection is based on
AT_PLATFORM. Specific bits from AT_HWCAP can qualify the search path. For
example /lib/power4 is searched on POWER4 systems based on AT_PLATFORM. The
directory /lib/ppc970/altivec is searched on 970s from AT_PLATFORM=ppc970
and AT_HWCAP=PPC_FEATURE_HAS_ALTIVEC.
Not all bits in the AT_HWCAP affect the search; that is controlled by the
HWCAP_IMPORTANT mask. Currently only optional ISA features are included in
the mask (PPC_FEATURE_HAS_ALTIVEC and PPC_FEATURE_HAS_DFP).
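
As a rough sketch of the selection just described (not glibc's actual code, which walks several candidate directories): the function below builds only the most specific directory from the platform string and the masked hwcap bits. The name hwcap_search_dir(), the SKETCH_HWCAP_IMPORTANT macro, the "dfp" directory name and the ordering of the suffixes are assumptions made for the example; the two feature bit values are the ones from the kernel's powerpc cputable.h.

```c
#include <stdio.h>

#define PPC_FEATURE_HAS_ALTIVEC 0x10000000UL
#define PPC_FEATURE_HAS_DFP     0x00000400UL

/* Illustrative stand-in for the HWCAP_IMPORTANT mask. */
#define SKETCH_HWCAP_IMPORTANT (PPC_FEATURE_HAS_ALTIVEC | PPC_FEATURE_HAS_DFP)

/* Build the most specific candidate directory into buf, for example
 * "/lib/ppc970/altivec". Bits outside the mask are ignored. Returns buf. */
char *hwcap_search_dir(const char *platform, unsigned long hwcap,
                       char *buf, size_t len)
{
    unsigned long important = hwcap & SKETCH_HWCAP_IMPORTANT;
    size_t used;
    int n = snprintf(buf, len, "/lib/%s", platform);

    if (n < 0 || (size_t)n >= len)
        return buf;                 /* truncated: keep what fits */
    used = (size_t)n;

    if (important & PPC_FEATURE_HAS_ALTIVEC) {
        n = snprintf(buf + used, len - used, "/altivec");
        if (n < 0 || (size_t)n >= len - used)
            return buf;
        used += (size_t)n;
    }
    if (important & PPC_FEATURE_HAS_DFP)
        snprintf(buf + used, len - used, "/dfp");

    return buf;
}
```

With AT_PLATFORM=ppc970 and the AltiVec bit set this yields /lib/ppc970/altivec, and with AT_PLATFORM=power4 and only unmasked bits set it yields /lib/power4, matching the two examples above.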
> Steve, how hard do you think it would be to extend AT_HWCAP ? Like
> adding an AT_HWCAP2 or something like that ?
>
If needed, I don't think we are there yet.
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 5:31 ` Kumar Gala
@ 2007-05-30 19:47 ` Steve Munroe
2007-05-30 20:52 ` Olof Johansson
0 siblings, 1 reply; 63+ messages in thread
From: Steve Munroe @ 2007-05-30 19:47 UTC (permalink / raw)
To: Kumar Gala
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard, Olof Johansson
Kumar Gala <galak@kernel.crashing.org> wrote on 05/30/2007 12:31:32 AM:
>
> On May 29, 2007, at 9:54 PM, Steve Munroe wrote:
>
> >
> > Kumar Gala <galak@kernel.crashing.org> wrote on 05/29/2007 07:43:05
> > PM:
> >
> >>
> >> On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
> >>
> >>> On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
> >>> wrote:
> >>>> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> >>>>> This is all problematic since some 64-bit implementations may not
> >>>>> guarantee the upper bits are valid when in 32-bit mode. Look
> >>>>> at the
> >>>>> 'Computation Modes' section in the architecture specs 2.03 or
> >>>>> greater
> >>>>> for embedded processors.
> >>>>
> >>>> Yuck. Well, we might need to export a spearate CPU feature bit to
> >>>> indicate that it's the case then.
> >>>
> >>> No need for a new bit, you should be able to key off of
> >>> PPC_FEATURE_64
> >>> && !PPC_FEATURE_BOOKE.
> >>
> >> Nope, the architecture allows embedded to behave like server parts
> >> and support the full 64-bit registers. We really should have a new
> >> feature bit so that if someone has an implementation of an embedded
> >> part that supports the functionality, they get the benefit.
> >>
> > When such exists we can add a bit, until then we can wait. The current
> > 32-bit AT_HWCAP is almost full. so we should not allocate bits on
> > speculation.
>
> Understandable.. dare I ask about a few of the current AT_HWCAPs we
> do have:
>
> #define PPC_FEATURE_POWER4 0x00080000
> #define PPC_FEATURE_POWER5 0x00040000
> #define PPC_FEATURE_POWER5_PLUS 0x00020000
> #define PPC_FEATURE_ARCH_2_05 0x00001000
> #define PPC_FEATURE_PA6T 0x00000800
> #define PPC_FEATURE_POWER6_EXT 0x00000200
>
> What exactly are we using these for? Can we not use platform for
> some of these?
>
These are poorly named ISA versions:
PPC_FEATURE_POWER4 == PPC_FEATURE_ARCH_2_0
PPC_FEATURE_POWER5 == PPC_FEATURE_ARCH_2_02
PPC_FEATURE_POWER5_PLUS == PPC_FEATURE_ARCH_2_03
Ask Olof about this, but I think
PPC_FEATURE_PA6T == PPC_FEATURE_ARCH_2_04
although I think it is more than 2_04 and less than 2_05.
This one, PPC_FEATURE_POWER6_EXT, is for the mftgpr/mffgpr instructions
unique to power6 native mode.
These support inline runtime tests to use instructions for newer versions
of the ISA.
AT_PLATFORM is for selecting whole libraries. It is not appropriate for
inline tests.
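
A minimal sketch of such an inline test, using the bit values quoted above and the ISA mapping given in this message. The function name newest_isa_version() is made up for the example, and PA6T is left out since its mapping is uncertain.

```c
#include <stddef.h>

/* Bit values as quoted earlier in this message. */
#define PPC_FEATURE_POWER4      0x00080000UL
#define PPC_FEATURE_POWER5      0x00040000UL
#define PPC_FEATURE_POWER5_PLUS 0x00020000UL
#define PPC_FEATURE_ARCH_2_05   0x00001000UL

/* Hypothetical helper: map an AT_HWCAP word to the newest ISA version it
 * advertises, or NULL if none of the version bits is set. */
const char *newest_isa_version(unsigned long hwcap)
{
    if (hwcap & PPC_FEATURE_ARCH_2_05)
        return "2.05";
    if (hwcap & PPC_FEATURE_POWER5_PLUS)
        return "2.03";
    if (hwcap & PPC_FEATURE_POWER5)
        return "2.02";
    if (hwcap & PPC_FEATURE_POWER4)
        return "2.0";
    return NULL;
}
```

A routine would read AT_HWCAP once, then branch on the result to pick the code path that uses the newer instructions.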
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 19:47 ` Steve Munroe
@ 2007-05-30 20:52 ` Olof Johansson
2007-05-30 21:33 ` Steve Munroe
0 siblings, 1 reply; 63+ messages in thread
From: Olof Johansson @ 2007-05-30 20:52 UTC (permalink / raw)
To: Steve Munroe
Cc: linuxppc-dev list, Paul Mackerras, Anton Blanchard,
Ulrich Weigand
On Wed, May 30, 2007 at 02:47:37PM -0500, Steve Munroe wrote:
>
> Kumar Gala <galak@kernel.crashing.org> wrote on 05/30/2007 12:31:32 AM:
>
> >
> > On May 29, 2007, at 9:54 PM, Steve Munroe wrote:
> >
> > >
> > > Kumar Gala <galak@kernel.crashing.org> wrote on 05/29/2007 07:43:05
> > > PM:
> > >
> > >>
> > >> On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
> > >>
> > >>> On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
> > >>> wrote:
> > >>>> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> > >>>>> This is all problematic since some 64-bit implementations may not
> > >>>>> guarantee the upper bits are valid when in 32-bit mode. Look
> > >>>>> at the
> > >>>>> 'Computation Modes' section in the architecture specs 2.03 or
> > >>>>> greater
> > >>>>> for embedded processors.
> > >>>>
> > >>>> Yuck. Well, we might need to export a spearate CPU feature bit to
> > >>>> indicate that it's the case then.
> > >>>
> > >>> No need for a new bit, you should be able to key off of
> > >>> PPC_FEATURE_64
> > >>> && !PPC_FEATURE_BOOKE.
> > >>
> > >> Nope, the architecture allows embedded to behave like server parts
> > >> and support the full 64-bit registers. We really should have a new
> > >> feature bit so that if someone has an implementation of an embedded
> > >> part that supports the functionality, they get the benefit.
> > >>
> > > When such exists we can add a bit, until then we can wait. The current
> > > 32-bit AT_HWCAP is almost full. so we should not allocate bits on
> > > speculation.
> >
> > Understandable.. dare I ask about a few of the current AT_HWCAPs we
> > do have:
> >
> > #define PPC_FEATURE_POWER4 0x00080000
> > #define PPC_FEATURE_POWER5 0x00040000
> > #define PPC_FEATURE_POWER5_PLUS 0x00020000
> > #define PPC_FEATURE_ARCH_2_05 0x00001000
> > #define PPC_FEATURE_PA6T 0x00000800
> > #define PPC_FEATURE_POWER6_EXT 0x00000200
> >
> > What exactly are we using these for? Can we not use platform for
> > some of these?
> >
> These are poorly named ISA versions
>
> PPC_FEATURE_POWER4 == PPC_FEATURE_ARCH_2_0
> PPC_FEATURE_POWER5 == PPC_FEATURE_ARCH_2_02
> PPC_FEATURE_POWER5+ == PPC_FEATURE_ARCH_2_03
>
> Ask Olof about this but I think
> PPC_FEATURE_PA6T == PPC_FEATURE_ARCH_2_04
> but I think it is more then 2_04 and less than 2_05.
The problem is that IBM has never (before) had to care about what was
implementation and what was architecture. The implementation WAS the
architecture up until POWER5+, and the PPC ISA went lock-step with the
new server processor releases.
PA6T is 2.04 + a few 2.05 bits, give or take. But it's not equivalent of
POWER6 (nor is it equivalent of POWER5+, since they implement different
optional features of the architecture).
I'm not sure just how to make this scale down the road -- if we are to
use a PPC_FEATURE_* for every optional feature in the ISA, we'll run
out of bits in no time. If we end up using a flag per implementation,
it probably won't be quite as bad, but I'm guessing the actual code that
uses it will get hairier.
-Olof
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 11:44 ` Benjamin Herrenschmidt
2007-05-30 12:15 ` Kumar Gala
@ 2007-05-30 21:02 ` Gabriel Paubert
2007-05-30 21:41 ` Steve Munroe
1 sibling, 1 reply; 63+ messages in thread
From: Gabriel Paubert @ 2007-05-30 21:02 UTC (permalink / raw)
To: Benjamin Herrenschmidt
Cc: Ulrich Weigand, Steve Munroe, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
On Wed, May 30, 2007 at 09:44:44PM +1000, Benjamin Herrenschmidt wrote:
> On Wed, 2007-05-30 at 00:32 -0500, Kumar Gala wrote:
> > > I think actually it would be useful to have the saving/restoring of
> > > the high 32 bits controlled by a prctl, so that programs have to ask
> > > explicitly for the new behaviour (and programs that don't want to
> > use
> > > the high 32 bits don't incur the extra overhead).
> >
> > I like this, it means we can error if HW doesn't support it and
> > requires applications to do something specific to enable the feature.
>
> Yeah well.... I liked the prctl at first.. but then, I thought twice :-)
I agree. Sooner or later, distributions might install two copies of a
library in two different places, and the dynamic loader will select one
depending on the availability of 64 bit registers. At that point virtually
all applications will effectively use 64 bit registers even when compiled
in pure 32 bit mode, but the prctl will have to stay only for "hysterical
raisins".
In 32 bit mode, 64 bit divides use a libcall for example. But the
libgcc routine can and should use the 64 bit instructions. There are
many other libcall cases that would benefit from a libgcc compiled
to use 64 bit instructions (64 bit int to floating point conversions and
back). This indirectly affects a lot of functions.
Gabriel (starting to have nightmares about somebody inventing
just another processor flavor, like a 64 bit BookE processor
with SPE and Altivec).
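As a rough illustration of the selection described in the message above,
here is a minimal C sketch in which one of two 64-bit divide routines is
chosen once, depending on whether 64 bit registers are usable. In practice
the choice would be made by the dynamic loader picking between two library
copies, not by a function pointer. All names (udiv64_generic,
udiv64_native, select_udiv64) are hypothetical, and neither routine is
libgcc's actual code.

```c
#include <stdint.h>

typedef uint64_t (*udiv64_fn)(uint64_t n, uint64_t d);

/* Stand-in for the copy built for pure 32-bit code: bit-by-bit
 * shift-and-subtract long division. d must be nonzero. */
uint64_t udiv64_generic(uint64_t n, uint64_t d)
{
    uint64_t q = 0, r = 0;

    for (int i = 63; i >= 0; i--) {
        /* Remember the bit shifted out of r so large divisors work. */
        int carry = (int)(r >> 63);

        r = (r << 1) | ((n >> i) & 1);
        if (carry || r >= d) {
            r -= d;
            q |= (uint64_t)1 << i;
        }
    }
    return q;
}

/* Stand-in for the copy compiled to use 64-bit instructions, where the
 * compiler can emit a single hardware divide. d must be nonzero. */
uint64_t udiv64_native(uint64_t n, uint64_t d)
{
    return n / d;
}

/* Pick the implementation once; have_64bit_regs is assumed to come from
 * whatever capability test the platform ends up providing. */
udiv64_fn select_udiv64(int have_64bit_regs)
{
    return have_64bit_regs ? udiv64_native : udiv64_generic;
}
```

A caller would store the result of select_udiv64() at startup and call
through it, so the per-call cost is one indirect branch.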
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 20:52 ` Olof Johansson
@ 2007-05-30 21:33 ` Steve Munroe
0 siblings, 0 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-30 21:33 UTC (permalink / raw)
To: Olof Johansson
Cc: linuxppc-dev list, Ulrich Weigand, Paul Mackerras,
Anton Blanchard
olof@lixom.net (Olof Johansson) wrote on 05/30/2007 03:52:39 PM:
> On Wed, May 30, 2007 at 02:47:37PM -0500, Steve Munroe wrote:
> >
> > Kumar Gala <galak@kernel.crashing.org> wrote on 05/30/2007 12:31:32 AM:
> >
> > >
> > > On May 29, 2007, at 9:54 PM, Steve Munroe wrote:
> > >
> > > >
> > > > Kumar Gala <galak@kernel.crashing.org> wrote on 05/29/2007 07:43:05
> > > > PM:
> > > >
> > > >>
> > > >> On May 29, 2007, at 6:46 PM, Olof Johansson wrote:
> > > >>
> > > >>> On Wed, May 30, 2007 at 07:32:33AM +1000, Benjamin Herrenschmidt
> > > >>> wrote:
> > > >>>> On Tue, 2007-05-29 at 08:10 -0500, Kumar Gala wrote:
> > > >>>>> This is all problematic since some 64-bit implementations may not
> > > >>>>> guarantee the upper bits are valid when in 32-bit mode. Look
> > > >>>>> at the
> > > >>>>> 'Computation Modes' section in the architecture specs 2.03 or
> > > >>>>> greater
> > > >>>>> for embedded processors.
> > > >>>>
> > > >>>> Yuck. Well, we might need to export a separate CPU feature bit to
> > > >>>> indicate that it's the case then.
> > > >>>
> > > >>> No need for a new bit, you should be able to key off of
> > > >>> PPC_FEATURE_64
> > > >>> && !PPC_FEATURE_BOOKE.
> > > >>
> > > >> Nope, the architecture allows embedded to behave like server parts
> > > >> and support the full 64-bit registers. We really should have a new
> > > >> feature bit so that if someone has an implementation of an embedded
> > > >> part that supports the functionality, they get the benefit.
> > > >>
> > > > When such exists we can add a bit, until then we can wait. The current
> > > > 32-bit AT_HWCAP is almost full, so we should not allocate bits on
> > > > speculation.
> > >
> > > Understandable.. dare I ask about a few of the current AT_HWCAPs we
> > > do have:
> > >
> > > #define PPC_FEATURE_POWER4 0x00080000
> > > #define PPC_FEATURE_POWER5 0x00040000
> > > #define PPC_FEATURE_POWER5_PLUS 0x00020000
> > > #define PPC_FEATURE_ARCH_2_05 0x00001000
> > > #define PPC_FEATURE_PA6T 0x00000800
> > > #define PPC_FEATURE_POWER6_EXT 0x00000200
> > >
> > > What exactly are we using these for? Can we not use platform for
> > > some of these?
> > >
> > These are poorly named ISA versions
> >
> > PPC_FEATURE_POWER4 == PPC_FEATURE_ARCH_2_0
> > PPC_FEATURE_POWER5 == PPC_FEATURE_ARCH_2_02
> > PPC_FEATURE_POWER5+ == PPC_FEATURE_ARCH_2_03
> >
> > Ask Olof about this but I think
> > PPC_FEATURE_PA6T == PPC_FEATURE_ARCH_2_04
> > but I think it is more than 2_04 and less than 2_05.
>
> The problem is that IBM has never (before) had to care about what was
> implementation and what was architecture. The implementation WAS the
> architecture up until POWER5+, and the PPC ISA went lock-step with the
> new server processor releases.
>
> PA6T is 2.04 + a few 2.05 bits, give or take. But it's not equivalent of
> POWER6 (nor is it equivalent of POWER5+, since they implement different
> optional features of the architecture).
>
> I'm not sure just how to make this scale down the road -- if we are to
> use a PPC_FEATURE_* for every optional feature in the ISA, we'll run
> out of bits in no time. If we end up using a flag per implementation,
> it probably won't be quite as bad, but I'm guessing the actual code that
> uses it will get hairier.
>
The current thinking is that AT_PLATFORM is for implementations
(micro-architectures) where a few instruction tweaks are not enough.
Different micro-architectures require recompilation with different
instruction scheduling (-mtune=<cpu-type>). For example power5 and ppc-cell
are both ISA 2.02, but completely different micro-architectures (8
pipelines out of order vs 2 in order).
AT_HWCAP is for instruction features where a quick runtime test to use
specific instructions is meaningful.
Steven J. Munroe
Linux on Power Toolchain Architect
IBM Corporation, Linux Technology Center
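The quick runtime test suggested earlier in the thread (PPC_FEATURE_64 &&
!PPC_FEATURE_BOOKE) could look roughly like the following C sketch. The two
mask values are believed to match the kernel's AT_HWCAP definitions; the
helper names are made up for illustration, and getauxval() is a glibc
interface that postdates this thread, so its availability is an assumption
about the target system. As noted in the quoted discussion, the test is
conservative: a Book E part that does keep the full 64-bit registers would
be reported as unsupported.

```c
#include <sys/auxv.h>

/* Guarded so the sketch also builds where libc headers define these. */
#ifndef PPC_FEATURE_64
#define PPC_FEATURE_64    0x40000000UL
#endif
#ifndef PPC_FEATURE_BOOKE
#define PPC_FEATURE_BOOKE 0x00008000UL
#endif

/* Pure helper (hypothetical name): returns 1 when the hwcap word says
 * "64-bit CPU" and does not say "Book E", 0 otherwise. */
int high_gprs_assumed_valid(unsigned long hwcap)
{
    return (hwcap & PPC_FEATURE_64) && !(hwcap & PPC_FEATURE_BOOKE);
}

/* Same test against the running process's AT_HWCAP entry. Only
 * meaningful on powerpc; other architectures use different bits. */
int high_gprs_assumed_valid_here(void)
{
    return high_gprs_assumed_valid(getauxval(AT_HWCAP));
}
```

Keeping the mask logic in a pure function makes it easy to check against
known hwcap words without needing the hardware.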
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 21:02 ` Gabriel Paubert
@ 2007-05-30 21:41 ` Steve Munroe
0 siblings, 0 replies; 63+ messages in thread
From: Steve Munroe @ 2007-05-30 21:41 UTC (permalink / raw)
To: Gabriel Paubert
Cc: Ulrich Weigand, linuxppc-dev list, Paul Mackerras,
Anton Blanchard
Gabriel Paubert <paubert@iram.es> wrote on 05/30/2007 04:02:12 PM:
>
> In 32 bit mode, 64 bit divides use a libcall for example. But the
> libgcc routine can and should use the 64 bit instructions. There are
> many other libcall cases that would benefit from a libgcc compiled
> to use 64 bit instructions (64 bit int to floating point conversions and
> back). This indirectly affects a lot of functions.
>
That is not an issue. ISA 2.0 removed the usage restriction for fctid,
fctidz, and fcfid in 32-bit implementations.
^ permalink raw reply [flat|nested] 63+ messages in thread
* Re: Saving to 32 bits of GPRs in signal context
2007-05-30 12:07 ` Felix Domke
@ 2007-05-31 5:39 ` Benjamin Herrenschmidt
0 siblings, 0 replies; 63+ messages in thread
From: Benjamin Herrenschmidt @ 2007-05-31 5:39 UTC (permalink / raw)
To: Felix Domke; +Cc: linuxppc-dev list
On Wed, 2007-05-30 at 14:07 +0200, Felix Domke wrote:
>
> If you don't provide a real, portable, useful way *now* for detecting
> compatibility with 64bit insn, people (=ffmpeg, mplayer first) *will*
> invent their own way of detecting it, possibly using SIGILL, faster than
> you could imagine.
I'm pretty sure I had the feature bits before mplayer did altivec, but
then, nobody cares about the feature bits because they exist only on
linux.
There is no portable way to do these things and there won't be because
apple doesn't care what linux does and glibc people don't care about
what apple do etc...
> Please avoid that this time. And please declare the use of SIGILL for
> detecting extensions as plainly wrong, not as a "bad workaround, but
> still better than what's available". If you can't be sure that an
> extension will work as expected (for example because there is just no
> interface to query the OS for it), then simply don't use it. If this is
> going to be a performance problem, bug the kernel people to fix it.
>
> (Sorry, this is my point of view as a pure *user*. I don't want
> to debug crashing programs with incorrect memcpy results because some
> program decided on its own that it's safe to use this extension when it
> wasn't.)
Unfortunately, we can't just "declare" things and have people follow
us :-)
Ben.
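For readers unfamiliar with the SIGILL approach being argued about, the
following C sketch shows the general shape of such a probe: install a
temporary SIGILL handler, run a candidate routine, and see whether it
trapped. The function names are hypothetical, and the "illegal" stub just
raises SIGILL so the sketch runs on any POSIX system; a real probe would
execute the instruction in question. Note the limit that motivates the
objection quoted above: a probe that does not trap says nothing about
whether the high register halves survive signal delivery.

```c
#define _POSIX_C_SOURCE 200809L
#include <setjmp.h>
#include <signal.h>
#include <string.h>

static sigjmp_buf probe_env;

/* Handler: abandon the probe and return to the sigsetjmp() point. */
static void on_sigill(int signo)
{
    (void)signo;
    siglongjmp(probe_env, 1);
}

/* Runs probe() with a temporary SIGILL handler installed.
 * Returns 1 if it trapped, 0 if it ran to completion, -1 on error.
 * Not thread-safe: it uses a single static jump buffer. */
int probe_traps(void (*probe)(void))
{
    struct sigaction sa, old;
    int trapped;

    memset(&sa, 0, sizeof(sa));
    sa.sa_handler = on_sigill;
    sigemptyset(&sa.sa_mask);
    if (sigaction(SIGILL, &sa, &old) != 0)
        return -1;

    /* savemask = 1 so the signal mask is restored after the jump. */
    if (sigsetjmp(probe_env, 1) == 0) {
        probe();
        trapped = 0;
    } else {
        trapped = 1;
    }

    sigaction(SIGILL, &old, NULL);
    return trapped;
}

/* Stand-ins for "unsupported" and "supported" instructions. */
void probe_stub_illegal(void) { raise(SIGILL); }
void probe_stub_ok(void) { }
```

The sketch is shown only to make the failure mode concrete; it is not a
recommendation over querying the OS where an interface exists.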
^ permalink raw reply [flat|nested] 63+ messages in thread
end of thread, other threads:[~2007-05-31 5:40 UTC | newest]
Thread overview: 63+ messages
2007-05-29 7:24 Saving to 32 bits of GPRs in signal context Benjamin Herrenschmidt
2007-05-29 7:52 ` Dan Malek
2007-05-29 8:05 ` Benjamin Herrenschmidt
2007-05-29 9:26 ` Gabriel Paubert
2007-05-29 9:44 ` Benjamin Herrenschmidt
2007-05-29 13:12 ` Segher Boessenkool
2007-05-29 14:00 ` Steve Munroe
2007-05-29 14:08 ` Ulrich Weigand
2007-05-29 14:17 ` Kumar Gala
2007-05-29 14:38 ` Segher Boessenkool
2007-05-29 19:04 ` Becky Bruce
2007-05-30 10:04 ` Christoph Hellwig
2007-05-30 12:13 ` Kumar Gala
2007-05-30 12:30 ` Segher Boessenkool
2007-05-29 14:31 ` Segher Boessenkool
2007-05-29 14:51 ` Steve Munroe
2007-05-29 21:44 ` Benjamin Herrenschmidt
2007-05-29 23:16 ` Steve Munroe
2007-05-29 23:19 ` Benjamin Herrenschmidt
2007-05-30 7:34 ` Hiroyuki Machida
2007-05-30 11:40 ` Segher Boessenkool
2007-05-30 11:48 ` Benjamin Herrenschmidt
2007-05-30 3:37 ` Paul Mackerras
2007-05-30 5:32 ` Kumar Gala
2007-05-30 11:44 ` Benjamin Herrenschmidt
2007-05-30 12:15 ` Kumar Gala
2007-05-30 12:48 ` Hiroyuki Machida
2007-05-30 12:58 ` Benjamin Herrenschmidt
2007-05-30 18:09 ` Steve Munroe
2007-05-30 21:02 ` Gabriel Paubert
2007-05-30 21:41 ` Steve Munroe
2007-05-30 12:01 ` Segher Boessenkool
2007-05-30 11:59 ` Segher Boessenkool
2007-05-30 12:01 ` Benjamin Herrenschmidt
2007-05-30 12:07 ` Segher Boessenkool
2007-05-30 12:09 ` Benjamin Herrenschmidt
2007-05-30 12:36 ` Segher Boessenkool
2007-05-29 14:28 ` Segher Boessenkool
2007-05-29 21:37 ` Benjamin Herrenschmidt
2007-05-29 21:38 ` Benjamin Herrenschmidt
2007-05-29 13:04 ` Segher Boessenkool
2007-05-29 14:28 ` Arnd Bergmann
2007-05-29 14:43 ` Segher Boessenkool
2007-05-29 15:54 ` Geert Uytterhoeven
2007-05-29 18:48 ` Arnd Bergmann
2007-05-29 21:27 ` Benjamin Herrenschmidt
2007-05-29 21:45 ` Felix Domke
2007-05-30 11:23 ` Benjamin Herrenschmidt
2007-05-30 11:52 ` Felix Domke
2007-05-30 13:14 ` Segher Boessenkool
2007-05-30 11:54 ` Segher Boessenkool
2007-05-30 12:07 ` Felix Domke
2007-05-31 5:39 ` Benjamin Herrenschmidt
2007-05-29 13:10 ` Kumar Gala
2007-05-29 21:32 ` Benjamin Herrenschmidt
2007-05-29 23:46 ` Olof Johansson
2007-05-30 0:43 ` Kumar Gala
2007-05-30 2:54 ` Steve Munroe
2007-05-30 5:31 ` Kumar Gala
2007-05-30 19:47 ` Steve Munroe
2007-05-30 20:52 ` Olof Johansson
2007-05-30 21:33 ` Steve Munroe
2007-05-29 13:53 ` Ulrich Weigand