Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR

linuxppc-dev.lists.ozlabs.org archive mirror

* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Mark Hatle @ 2001-06-29 15:16 UTC
  To: Albert D. Cahalan, linuxppc-dev

>> While I was puzzling over that (stepping through a SIGFPE handler in
>> gdb), I noticed something disturbing: some newly created processes
>> (grep and more and other random programs) started dying with unhandled
>> "Floating point exception" messages.  I'm at a loss to explain this,
>> but I saw it happen often enough to be convinced that I'm not imagining
>> the behavior.  I do wonder whether "lazily" enabling the FPU (and
>> enabling FPU exceptions) when FPSCR[FEX] may be set is really a good
>> idea.
>...
>> I guess that I'm reporting a bug (or a few bugs) here; I certainly
>> understand the motivation behind doing lazy FPU switching, but question
>> whether it's done with adequate care when FP exceptions are enabled.
>
>It is a bad idea, because gcc now uses FP registers to copy structs.
>Every program can be an FP program now, so why add complexity and
>keep taking traps?

One thing to keep in mind, GCC is perfectly capable of compiling without
using floating point.  I routinely use code that has no floating point
compiled in (including glibc).  If you build a system with -msoft-float
(libraries through the apps) then the FPU never gets enabled and your
context switching is faster.  (Is this measurable?  I'm not sure.
But..) The system "seems" to perform better.

Lazy FPU initialization IMHO is a good thing for single purpose
(embedded?) systems that are on a high end CPU, but do not need floating
point.  One example could be a signal processing system that uses
altivec and integer math heavily, but no floating point.

--Mark Hatle
MontaVista Software, Inc.

** Sent via the linuxppc-dev mail list. See http://lists.linuxppc.org/


* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Albert D. Cahalan @ 2001-06-29 18:29 UTC
  To: Mark Hatle; +Cc: Albert D. Cahalan, linuxppc-dev

Mark Hatle writes:
> [Albert Cahalan]
>> [somebody]

>>> [about lazy FP save/restore]
>>
>> It is a bad idea, because gcc now uses FP registers to copy structs.
>> Every program can be an FP program now, so why add complexity and
>> keep taking traps?
>
> One thing to keep in mind, GCC is perfectly capable of compiling without
> using floating point.  I routinely use code that has no floating point
> compiled in (including glibc).  If you build a system with -msoft-float
> (libraries through the apps) then the FPU never gets enabled and your
> context switching is faster.  (Is this measurable?  I'm not sure.
> But..) The system "seems" to perform better.

Maybe your gcc can't take advantage of FP registers for non-FP uses.

Your non-FP code might seem a wee bit better, but your FP code
ends up taking faults. Then with all the extra code, we end up
with extra problems -- for example the original poster's trouble.

> Lazy FPU initialization IMHO is a good thing for single purpose
> (embedded?) systems that are on a high end CPU, but do not need
> floating point.  One example could be a signal processing system
> that uses altivec and integer math heavily, but no floating point.

That is almost exactly the sort of system I work with.
AltiVec is used for signal processing, though with "float" data.

FP registers are still useful, for struct copies and for the
occasional standard C library function.

I suppose, if one does want lazy FP save/restore, that it ought
to be done with a per-process flag to prevent frequent faults.
When switching to an FP process, restore the registers. From time
to time take away the FP registers to deal with processes that
only use them once in a great while or only at startup. Maybe
take away FP after 1, 2, 4, 8, 16... ticks of use.
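
A rough sketch of that back-off, to make the idea concrete. The struct,
the function names and the FP_GRANT_MAX cap are all made up for
illustration; nothing like this exists in the kernel as far as I know.
Each FP-unavailable fault doubles the number of ticks the process may
keep the FP registers before they are taken away again.

```c
#include <stdbool.h>

/* Hypothetical cap so the grant cannot grow without bound. */
#define FP_GRANT_MAX 1024u

/* Hypothetical per-process bookkeeping for lazy FP save/restore. */
struct fp_state {
    bool     uses_fp;      /* per-process flag: restore FP regs on switch-in */
    unsigned grant_ticks;  /* length of the current grant: 1, 2, 4, 8, ... */
    unsigned ticks_left;   /* ticks remaining before FP is taken away */
};

/* Called from the FP-unavailable fault: the process touched the FPU. */
static void fp_fault(struct fp_state *s)
{
    if (s->grant_ticks == 0)
        s->grant_ticks = 1;
    else if (s->grant_ticks < FP_GRANT_MAX)
        s->grant_ticks *= 2;
    s->ticks_left = s->grant_ticks;
    s->uses_fp = true;
}

/*
 * Called for each timer tick charged to the process.
 * Returns true when the FP registers were just taken away.
 */
static bool fp_tick(struct fp_state *s)
{
    if (!s->uses_fp)
        return false;
    if (--s->ticks_left > 0)
        return false;
    s->uses_fp = false;    /* next FP use faults and doubles the grant */
    return true;
}
```

A process that only touches FP at startup faults once and loses the
registers after one tick; a heavy FP user quickly reaches a long grant
and rarely faults.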


* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Mark Hatle @ 2001-06-29 21:03 UTC
  To: Albert D. Cahalan, linuxppc-dev


>> Lazy FPU initialization IMHO is a good thing for single purpose
>> (embedded?) systems that are on a high end CPU, but do not need floating
>> point.  One example could be a signal processing system that uses
>> altivec and integer math heavily, but no floating point.

>Your non-FP code might seem a wee bit better, but your FP code
>ends up taking faults. Then with all the extra code, we end up
>with extra problems -- for example the original poster's trouble.

Maybe I missed something here, but my understanding is that if you are
on SMP, lazy initialization never happens.  And if you are on a single
CPU you take _ONE_ fault per process.

One FP fault per process seems very minor to me (unless of course you
are spawning a hell of a lot of processes, but most likely the process
spawn time would outweigh the single fault).

>I suppose, if one does want lazy FP save/restore, that it ought
>to be done with a per-process flag to prevent frequent faults.
>When switching to an FP process, restore the registers. From time
>to time take away the FP registers to deal with processes that
>only use them once in a great while or only at startup. Maybe
>take away FP after 1, 2, 4, 8, 16... ticks of use.

That might be of some value, but I'd be concerned that instead of 1
fault per process we could run into a lot of faults, or into a
situation where each process would need some type of counter to detect
faults.  (Probably messy...)

--Mark Hatle
MontaVista Software, Inc.


* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Albert D. Cahalan @ 2001-06-28 22:32 UTC
  To: linuxppc-dev; +Cc: gb

> Sometime in the last year or two (sorry not to be more precise; I
> wasn't paying attention) the linuxppc kernel seems to have started
> doing lazy FPU switching.  If I read the code (in
> ../arch/ppc/kernel/head.S) and judge its effects correctly, code in
> load_up_fpu() unconditionally sets the FE0 and FE1 bits in the MSR.  I

If the user does not want traps, both FE0 and FE1 should be set
to 0 in addition to FPSCR being set to zero. This is supposed to
perform best.

If the user does want traps, the kernel should choose:

If a debugger is attached or a SIGFPE handler is present,
then FE0 and FE1 should both be 1. (precise mode)

Otherwise, FE0 should be 0 and FE1 should be 1. This is the
imprecise non-recoverable mode, which may be faster.
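
As a sketch only, that selection could look like the following. The
function name and its three inputs are hypothetical, not existing
kernel code; the MSR masks are the 32-bit PowerPC values for FE0 and
FE1.

```c
#include <stdbool.h>
#include <stdint.h>

/* MSR masks, 32-bit PowerPC: FE0 is MSR bit 20, FE1 is MSR bit 23 (MSB = 0). */
#define MSR_FE0 0x00000800u
#define MSR_FE1 0x00000100u

/*
 * Hypothetical helper: pick the FE0/FE1 bits for a task according to
 * the policy described above.
 */
static uint32_t choose_fe_bits(bool wants_traps, bool debugger_attached,
                               bool has_sigfpe_handler)
{
    if (!wants_traps)
        return 0;                    /* FE0=0, FE1=0: FP exceptions disabled */
    if (debugger_attached || has_sigfpe_handler)
        return MSR_FE0 | MSR_FE1;    /* precise mode */
    return MSR_FE1;                  /* imprecise non-recoverable mode */
}
```

The result would be OR-ed into the task's saved MSR whenever the
SIGFPE disposition or the ptrace state changes, rather than being
forced on every time the FPU is handed over.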

> It seems that when a user-level SIGFPE handler begins execution, its
> own FPSCR[FEX] (and other exception-related bits in the FPSCR) are set
> and the FPU (MSR[FP, FE0, FE1]) is disabled.  (I'd be more confident
> in saying that than I am if gdb 5.0 had ever heard of a register
> called the fpscr, or if its "info float" command wasn't so convinced
> that there was no FP info available ...)
>
> It seems like any attempt to use the FPU inside the SIGFPE handler
> causes the kernel to turn it back on (setting MSR[FP, FE0, and FE1]);
> with FPSCR[FEX] set, this seems to raise SIGFPE again; attempts to
> clear FPSCR[FEX] from user code involve ... well, they involve using
> the FPU again.

"the program exception occurs before the next synchronizing event
if an instruction alters those bits (thus enabling the program
exception). When this occurs, SRR0 points to the instruction that
would have executed next and not to the instruction that modified MSR"

> While I was puzzling over that (stepping through a SIGFPE handler in
> gdb), I noticed something disturbing: some newly created processes
> (grep and more and other random programs) started dying with unhandled
> "Floating point exception" messages.  I'm at a loss to explain this,
> but I saw it happen often enough to be convinced that I'm not imagining
> the behavior.  I do wonder whether "lazily" enabling the FPU (and
> enabling FPU exceptions) when FPSCR[FEX] may be set is really a good
> idea.
...
> I guess that I'm reporting a bug (or a few bugs) here; I certainly
> understand the motivation behind doing lazy FPU switching, but question
> whether it's done with adequate care when FP exceptions are enabled.

It is a bad idea, because gcc now uses FP registers to copy structs.
Every program can be an FP program now, so why add complexity and
keep taking traps?


* giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Gary Byers @ 2001-06-26 14:10 UTC
  To: linuxppc-dev

Sometime in the last year or two (sorry not to be more precise; I
wasn't paying attention) the linuxppc kernel seems to have started
doing lazy FPU switching.  If I read the code (in
../arch/ppc/kernel/head.S) and judge its effects correctly, code in
load_up_fpu() unconditionally sets the FE0 and FE1 bits in the MSR.  I
wonder if this change (setting the bits) was intentional: there's
still some code in ./signal.c and ./ptrace.c that tries to allow
signal handlers/ptrace to change those bits, though I don't know if
that's ever worked.

With MSR[FE0] and MSR[FE1] set, an enabled FPU exception (something
that causes FPSCR[FEX] to be set) causes a program exception with
SRR1[11] set, and Linux maps this to SIGFPE.
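
To spell out the condition I have in mind, here is a small model in C.
It is a sketch of my reading of the architecture, not kernel code, and
the function names are invented. FEX is the OR of each exception
status bit ANDed with its enable bit, and the program exception is
taken only when FE0 or FE1 is set.

```c
#include <stdbool.h>
#include <stdint.h>

#define MSR_FE0  0x00000800u
#define MSR_FE1  0x00000100u

/* FPSCR exception status bits ... */
#define FPSCR_VX 0x20000000u   /* invalid operation summary */
#define FPSCR_OX 0x10000000u   /* overflow */
#define FPSCR_UX 0x08000000u   /* underflow */
#define FPSCR_ZX 0x04000000u   /* zero divide */
#define FPSCR_XX 0x02000000u   /* inexact */
/* ... and the matching enable bits. */
#define FPSCR_VE 0x00000080u
#define FPSCR_OE 0x00000040u
#define FPSCR_UE 0x00000020u
#define FPSCR_ZE 0x00000010u
#define FPSCR_XE 0x00000008u

/* SRR1[11] (MSB = bit 0): program exception caused by an enabled FP exception. */
#define SRR1_FP_ENABLED 0x00100000u

/* Model of FPSCR[FEX]: some exception is both raised and enabled. */
static bool fpscr_fex(uint32_t fpscr)
{
    return ((fpscr & FPSCR_VX) && (fpscr & FPSCR_VE)) ||
           ((fpscr & FPSCR_OX) && (fpscr & FPSCR_OE)) ||
           ((fpscr & FPSCR_UX) && (fpscr & FPSCR_UE)) ||
           ((fpscr & FPSCR_ZX) && (fpscr & FPSCR_ZE)) ||
           ((fpscr & FPSCR_XX) && (fpscr & FPSCR_XE));
}

/* The program exception fires when FEX is set and FE0 or FE1 is set. */
static bool fp_program_exception_pending(uint32_t msr, uint32_t fpscr)
{
    return (msr & (MSR_FE0 | MSR_FE1)) != 0 && fpscr_fex(fpscr);
}

/* What the program-exception handler checks before raising SIGFPE. */
static bool program_exception_is_fp(uint32_t srr1)
{
    return (srr1 & SRR1_FP_ENABLED) != 0;
}
```

With this model, a stale FPSCR with ZX and ZE set is harmless while
FE0 and FE1 are clear, and becomes a pending exception the moment
something sets either bit.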

It seems that when a user-level SIGFPE handler begins execution, its
own FPSCR[FEX] (and other exception-related bits in the FPSCR) are set
and the FPU (MSR[FP, FE0, FE1]) is disabled.  (I'd be more confident
in saying that than I am if gdb 5.0 had ever heard of a register
called the fpscr, or if its "info float" command wasn't so convinced
that there was no FP info available ...)

It seems like any attempt to use the FPU inside the SIGFPE handler
causes the kernel to turn it back on (setting MSR[FP, FE0, and FE1]);
with FPSCR[FEX] set, this seems to raise SIGFPE again; attempts to
clear FPSCR[FEX] from user code involve ... well, they involve using
the FPU again.

While I was puzzling over that (stepping through a SIGFPE handler in
gdb), I noticed something disturbing: some newly created processes
(grep and more and other random programs) started dying with unhandled
"Floating point exception" messages.  I'm at a loss to explain this,
but I saw it happen often enough to be convinced that I'm not imagining
the behavior.  I do wonder whether "lazily" enabling the FPU (and
enabling FPU exceptions) when FPSCR[FEX] may be set is really a good
idea.

I didn't see any significant differences in later versions of
'head.S', but for the record, all of this happens in a pretty
vanilla 2.2.17 kernel.

I guess that I'm reporting a bug (or a few bugs) here; I certainly
understand the motivation behind doing lazy FPU switching, but question
whether it's done with adequate care when FP exceptions are enabled.

Gary Byers
gb@gse.com


* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Paul Mackerras @ 2001-07-07  1:07 UTC
  To: Gary Byers; +Cc: linuxppc-dev

Gary Byers writes:

> Sometime in the last year or two (sorry not to be more precise; I
> wasn't paying attention) the linuxppc kernel seems to have started
> doing lazy FPU switching.  If I read the code (in

No, that code is older than that, I think it dates from 96 or 97. :)

> ../arch/ppc/kernel/head.S) and judge its effects correctly, code in
> load_up_fpu() unconditionally sets the FE0 and FE1 bits in the MSR.  I
> wonder if this change (setting the bits) was intentional: there's
> still some code in ./signal.c and ./ptrace.c that tries to allow
> signal handlers/ptrace to change those bits, though I don't know if
> that's ever worked.

I believe that does work and that glibc uses it.  Not that that is the
ideal way to handle it...  Albert Cahalan's suggestion of setting
FE0/FE1 based on the disposition of the SIGFPE signal is a good one.

I'm not totally sure yet what to do when SIGFPE is blocked though - I
guess if we get a floating-point exception and SIGFPE is blocked we
should clear FE0/FE1 and continue.  Maybe we should clear the
exception enable bits in the FPSCR too, although I would rather not.
I don't like the idea of the kernel frobbing the FPSCR when the
process is not expecting it, but on the other hand you may get
surprising results if you do a floating-point operation that causes an
exception, and the exception enable bit in the FPSCR is set, but
FE0/FE1 are zero.  (Section 3.3.6 of the Programming Environments
Manual describes this.)  But then I guess if you are doing
floating-point stuff with SIGFPE blocked you presumably know what you
are doing... :)

> It seems that when a user-level SIGFPE handler begins execution, its
> own FPSCR[FEX] (and other exception-related bits in the FPSCR) are set
> and the FPU (MSR[FP, FE0, FE1]) is disabled.  (I'd be more confident

Interesting... I guess you are right but I haven't found the code that
does that yet. 8-)

> It seems like any attempt to use the FPU inside the SIGFPE handler
> causes the kernel to turn it back on (setting MSR[FP, FE0, and FE1]);
> with FPSCR[FEX] set, this seems to raise SIGFPE again; attempts to
> clear FPSCR[FEX] from user code involve ... well, they involve using
> the FPU again.

OK.  The question I have here is whether the signal handler should
have its own FPSCR value.  I notice that the signal delivery code in
the kernel saves the floating-point registers (fr0 - fr31) on the
stack, and restores them in sys_sigreturn, but doesn't do anything
with FPSCR.  Arguably it should save and restore FPSCR and initialize
the FPSCR to 0 for the signal handler.  That would mean a change to
the signal stack layout though, and it would mean that if the signal
handler wanted to change the FPSCR used by the program (e.g. to clear
the exception status bits) it would need to find and change the value
on the stack rather than changing the FPSCR directly.
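
Roughly what I mean, as a sketch. The structs and function names below
are invented for illustration and do not match the real signal frame
layout: signal delivery would save FPSCR next to the FP registers and
give the handler a zeroed FPSCR, and sigreturn would restore whatever
is in the frame at that point.

```c
#include <stdint.h>
#include <string.h>

/* Hypothetical view of a task's live FP state. */
struct fp_regs {
    double   fpr[32];
    uint32_t fpscr;
};

/* Hypothetical FP area of the signal frame; the real layout differs. */
struct sig_fp_frame {
    double   fpr[32];
    uint32_t fpscr;
};

/* Signal delivery: save fr0-fr31 and FPSCR, start the handler clean. */
static void setup_fp_frame(struct sig_fp_frame *frame, struct fp_regs *live)
{
    memcpy(frame->fpr, live->fpr, sizeof frame->fpr);
    frame->fpscr = live->fpscr;  /* the part that is not done today */
    live->fpscr = 0;             /* handler runs with no pending FEX */
}

/* sigreturn: restore the registers and FPSCR from the frame. */
static void restore_fp_frame(struct fp_regs *live,
                             const struct sig_fp_frame *frame)
{
    memcpy(live->fpr, frame->fpr, sizeof live->fpr);
    live->fpscr = frame->fpscr;
}
```

A handler that wants the interrupted code to resume with the exception
status cleared would edit frame->fpscr before returning; writing the
FPSCR directly would only affect the handler itself.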

> While I was puzzling over that (stepping through a SIGFPE handler in
> gdb), I noticed something disturbing: some newly created processes
> (grep and more and other random programs) started dying with unhandled
> "Floating point exception" messages.  I'm at a loss to explain this,
> but I saw it happen often enough to be convinced that I'm not imagining
> the behavior.  I do wonder whether "lazily" enabling the FPU (and
> enabling FPU exceptions) when FPSCR[FEX] may be set is really a good

In fact the MSR[FP] bit has no effect on whether the cpu will take a
floating-point exception (the 0x700 program exception, not the FP
unavailable exception, I mean).  It's fairly clear now that
setting/clearing FE0 and FE1 at the same time as FP is not the right
thing to do.
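
To illustrate the difference (a sketch with made-up function names,
not the code in head.S): the coupled variant is the behaviour
described in this thread, and the decoupled one leaves FE0/FE1 to
whatever per-task policy we settle on.

```c
#include <stdint.h>

/* MSR masks, 32-bit PowerPC. */
#define MSR_FP  0x00002000u
#define MSR_FE0 0x00000800u
#define MSR_FE1 0x00000100u

/* Behaviour described in the thread: FP, FE0 and FE1 travel together. */
static uint32_t fpu_on_coupled(uint32_t msr)
{
    return msr | MSR_FP | MSR_FE0 | MSR_FE1;
}

static uint32_t fpu_off_coupled(uint32_t msr)
{
    return msr & ~(MSR_FP | MSR_FE0 | MSR_FE1);
}

/*
 * Alternative: lazy switching toggles only FP. FE0/FE1 keep the value
 * chosen for the task and are never changed as a side effect.
 */
static uint32_t fpu_on_decoupled(uint32_t msr)
{
    return msr | MSR_FP;
}

static uint32_t fpu_off_decoupled(uint32_t msr)
{
    return msr & ~MSR_FP;
}
```

With the decoupled pair, a task that asked for no FP exceptions does
not suddenly get them enabled just because the FPU was handed to it.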

> I guess that I'm reporting a bug (or a few bugs) here; I certainly
> understand the motivation behind doing lazy FPU switching, but question
> whether it's done with adequate care when FP exceptions are enabled.

Certainly the current behaviour is suboptimal; we can fix it but first
we need to discuss what the Right Way (tm) to do things would be.

Paul.


* Re: giving up the FPU, MSR[FE0], MSR[FE1], and the FPSCR
From: Dan Malek @ 2001-07-09 17:41 UTC
  To: paulus; +Cc: Gary Byers, linuxppc-dev

Paul Mackerras wrote:

> I believe that does work and that glibc uses it.  Not that that is the
> ideal way to handle it...  Albert Cahalan's suggestion of setting
> FE0/FE1 based on the disposition of the SIGFPE signal is a good one.

There was some discussion about how we handle the exception flags
in FPSCR a long time (a year or more :-) ago.  A search of the
archives may turn up something (I haven't searched).  I know there
was a change to the library and the user's ability to set these flags
way back then.......

	-- Dan
