FCSR Management

Linux MIPS Architecture development

* FCSR Management
@ 2002-09-24  7:51 Kevin D. Kissell
  2002-09-24  7:51 ` Kevin D. Kissell
                   ` (2 more replies)
  0 siblings, 3 replies; 8+ messages in thread
From: Kevin D. Kissell @ 2002-09-24  7:51 UTC (permalink / raw)
  To: linux-mips

In looking at some anomalous behavior on another software
platform, I checked the current MIPS/Linux kernel sources
and I wonder if we don't have yet another FP context problem
lurking under the surface.

On most, if not all, MIPS CPUs with integrated FPUs,
the act of writing a value to the FP CSR (Control and
Status Register, fcr31) which has the "E" bit, or any matching
pair of Enable/Cause bits for the V/Z/O/U/I IEEE exceptions
set will trigger a floating point exception.  In the case of
the Unimplemented Operation exception (the "E" bit),
the emulator is invoked and all of the Cause bits are cleared
in the context before user execution is resumed.  In the
case of other FP exceptions, the default behavior is to
dump core, so the user never executes again.  But *if*
the user has registered a handler for SIGFPE, and one
of the IEEE exceptions occurs, I don't see where the
associated Cause bit is being cleared, and I would think
that the consequence would be that the process would
get into an endless loop of trapping, posting the signal,
restoring the FCSR from the context with the bits set,
and trapping again, whether or not the PC is modified
to avoid re-executing the faulting instruction.
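
The trap condition described above can be written down as a small
predicate. This is only a minimal sketch, assuming the usual MIPS FCSR
layout (Cause field in bits 17:12 with the "E" bit at bit 17, Enable
field in bits 11:7). The macro and function names are made up for
illustration and are not taken from the kernel sources.

```c
#include <stdbool.h>
#include <stdint.h>

/* Assumed FCSR (fcr31) layout; all names here are illustrative only. */
#define FCSR_CAUSE_SHIFT   12          /* Cause bits I,U,O,Z,V at bits 12..16 */
#define FCSR_ENABLE_SHIFT  7           /* Enable bits I,U,O,Z,V at bits 7..11 */
#define FCSR_IEEE_MASK     0x1fu       /* the five IEEE exceptions */
#define FCSR_CAUSE_E       (1u << 17)  /* Unimplemented Operation, has no Enable bit */

/* Returns true if writing this value to the FCSR would raise an FP exception. */
static bool fcsr_write_would_trap(uint32_t fcsr)
{
    uint32_t cause  = (fcsr >> FCSR_CAUSE_SHIFT)  & FCSR_IEEE_MASK;
    uint32_t enable = (fcsr >> FCSR_ENABLE_SHIFT) & FCSR_IEEE_MASK;

    /* "E" traps unconditionally; the others need a matching Enable bit. */
    return (fcsr & FCSR_CAUSE_E) != 0 || (cause & enable) != 0;
}
```

A Cause bit without its matching Enable bit does not satisfy the
predicate, which is why only the matching pairs matter here.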

Am I missing something, or is this a problem?

            Regards,

            Kevin K.


* Re: FCSR Management
  2002-09-24  7:51 FCSR Management Kevin D. Kissell
  2002-09-24  7:51 ` Kevin D. Kissell
@ 2002-09-24 11:42 ` Maciej W. Rozycki
  2002-09-24 12:37   ` Kevin D. Kissell
  2002-09-24 17:37 ` Jun Sun
  2 siblings, 1 reply; 8+ messages in thread
From: Maciej W. Rozycki @ 2002-09-24 11:42 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: linux-mips

On Tue, 24 Sep 2002, Kevin D. Kissell wrote:

> dump core, so the user never executes again.  But *if*
> the user has registered a handler for SIGFPE, and one
> of the IEEE exceptions occurs, I don't see where the
> associated Cause bit is being cleared, and I would think
> that the consequence would be that the process would
> get into an endless loop of trapping, posting the signal,
> restoring the FCSR from the context with the bits set,
> and trapping again, whether or not the PC is modified
> to avoid re-executing the faulting instruction.

 Obviously user code is responsible to clear the bit it acted upon in the
saved context. 

-- 
+  Maciej W. Rozycki, Technical University of Gdansk, Poland   +
+--------------------------------------------------------------+
+        e-mail: macro@ds2.pg.gda.pl, PGP key available        +


* Re: FCSR Management
  2002-09-24 11:42 ` Maciej W. Rozycki
@ 2002-09-24 12:37   ` Kevin D. Kissell
  2002-09-24 12:37     ` Kevin D. Kissell
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin D. Kissell @ 2002-09-24 12:37 UTC (permalink / raw)
  To: Maciej W. Rozycki; +Cc: linux-mips

> On Tue, 24 Sep 2002, Kevin D. Kissell wrote:
> 
> > dump core, so the user never executes again.  But *if*
> > the user has registered a handler for SIGFPE, and one
> > of the IEEE exceptions occurs, I don't see where the
> > associated Cause bit is being cleared, and I would think
> > that the consequence would be that the process would
> > get into an endless loop of trapping, posting the signal,
> > restoring the FCSR from the context with the bits set,
> > and trapping again, whether or not the PC is modified
> > to avoid re-executing the faulting instruction.
> 
>  Obviously user code is responsible to clear the bit it acted upon in the
> saved context. 

It may be obvious that someone *intended* that user code 
clear the bit.  But the FCSR value containing the trapping
condition seems to be saved as part of both the thread 
and the signal contexts, thus (a) it could be restored as 
part of the sigcontext load of the signal handler, causing 
a re-entrant trap, possibly ad infinitum, and (b) will be
restored in the thread state after the execution of the
signal in any case, since we don't allow signals to have
side-effects on the FP register state, including the FCSR.
So even if the signal handler executed far enough to clear
the relevant Cause bit, it looks to me as if it would simply
be re-set the next time the thread loaded the FPU context.
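
To make the concern concrete, here is a toy user-space model of the
loop. It is not kernel code. It assumes the FCSR layout with Cause bits
in 17:12 and Enable bits in 11:7; `struct saved_ctx`, `count_traps()`
and both handlers are hypothetical names invented for the illustration.

```c
#include <stdint.h>

#define CAUSE_SHIFT  12
#define ENABLE_SHIFT 7
#define IEEE_MASK    0x1fu
#define CAUSE_E      (1u << 17)
#define CAUSE_ALL    (0x3fu << CAUSE_SHIFT)

struct saved_ctx { uint32_t fcsr; };   /* stand-in for the saved FP context */

typedef void (*fpe_handler)(struct saved_ctx *);

/* A handler that leaves the saved FCSR untouched. */
static void handler_noop(struct saved_ctx *ctx) { (void)ctx; }

/* A handler that clears all Cause bits in the saved context. */
static void handler_clear(struct saved_ctx *ctx) { ctx->fcsr &= ~CAUSE_ALL; }

/* Nonzero if loading this FCSR value would trap ("E" or a Cause/Enable pair). */
static int restore_traps(uint32_t fcsr)
{
    return (fcsr & CAUSE_E) ||
           (((fcsr >> CAUSE_SHIFT) & (fcsr >> ENABLE_SHIFT)) & IEEE_MASK);
}

/*
 * Count traps: each trap runs the handler, after which the (possibly
 * modified) context is loaded again. Gives up after `limit` traps.
 */
static unsigned count_traps(struct saved_ctx ctx, fpe_handler h, unsigned limit)
{
    unsigned traps = 0;

    while (traps < limit && restore_traps(ctx.fcsr)) {
        traps++;
        h(&ctx);
    }
    return traps;
}
```

In this model a context holding a matching Cause/Enable pair traps once
with `handler_clear` and runs into the limit with `handler_noop`, which
is the endless loop in question. Whether the real restore path behaves
like the model depends on where the Cause bits get cleared.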

I haven't seen anyone complaining about threads hanging
when SIGFPE's are being caught, so things may be working
somehow - but we may be blundering through some number
of spurious traps for no good reason before we get there.

I'll be delighted if someone on the list could point out
how the problem is being bypassed.

            Kevin K.


* Re: FCSR Management
  2002-09-24  7:51 FCSR Management Kevin D. Kissell
  2002-09-24  7:51 ` Kevin D. Kissell
  2002-09-24 11:42 ` Maciej W. Rozycki
@ 2002-09-24 17:37 ` Jun Sun
  2002-09-24 19:29   ` Kevin D. Kissell
  2 siblings, 1 reply; 8+ messages in thread
From: Jun Sun @ 2002-09-24 17:37 UTC (permalink / raw)
  To: Kevin D. Kissell; +Cc: linux-mips, jsun

On Tue, Sep 24, 2002 at 09:51:18AM +0200, Kevin D. Kissell wrote:
> In looking at some anomalous behavior on another software
> platform, I checked the current MIPS/Linux kernel sources
> and I wonder if we don't have yet another FP context problem
> lurking under the surface.
> 
> On most, if not all, MIPS CPUs with integrated FPUs,
> the act of writing a value to the FP CSR (Control and
> Status Register, fcr31) which has the "E" bit, or any matching
> pair of Enable/Cause bits for the V/Z/O/U/I IEEE exceptions
> set will trigger a floating point exception.  In the case of
> the Unimplemented Operation exception (the "E" bit),
> the emulator is invoked and all of the Cause bits are cleared
> in the context before user execution is resumed.  In the
> case of other FP exceptions, the default behavior is to
> dump core, so the user never executes again.  But *if*
> the user has registered a handler for SIGFPE, and one
> of the IEEE exceptions occurs, I don't see where the
> associated Cause bit is being cleared, and I would think
> that the consequence would be that the process would
> get into an endless loop of trapping, posting the signal,
> restoring the FCSR from the context with the bits set,
> and trapping again, whether or not the PC is modified
> to avoid re-executing the faulting instruction.
> 
> Am I missing something, or is this a problem?
>

FPE exceptions, actually almost all exceptions, are cleared before their
handlers are invoked.  See kernel/entry.S and look for BUILD_HANDLER().

Those macro defines are really mind-twisting and usually don't show up on
grep radar...
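
A rough C model of the clear-before-handler sequence: the entry code
reads the FCSR, clears the Cause field in the register, and only then
calls the C-level handler. This is a sketch and not the actual entry.S
macro. `hw_fcsr`, `fpe_entry()` and `handle_fpe_model()` are
hypothetical names, the Cause field position (bits 17:12) is assumed
from the MIPS FCSR layout, and passing the original value on to the
handler is an assumption of the model.

```c
#include <stdint.h>

#define FCSR_CAUSE_ALL (0x3fu << 12)   /* assumed: E,V,Z,O,U,I Cause bits */

static uint32_t hw_fcsr;          /* stand-in for the hardware register */
static uint32_t seen_by_handler;  /* what the handler was handed */

/* Stand-in for the C-level handler; it only records its argument. */
static void handle_fpe_model(uint32_t fcsr_at_trap)
{
    seen_by_handler = fcsr_at_trap;
}

/* Model of the entry path: clear Cause in the register, pass the original on. */
static void fpe_entry(void)
{
    uint32_t orig = hw_fcsr;            /* like reading fcr31 */

    hw_fcsr = orig & ~FCSR_CAUSE_ALL;   /* like writing it back, Cause cleared */
    handle_fpe_model(orig);
}
```

In this model any FCSR value saved after fpe_entry() has run no longer
carries Cause bits, so loading it back later would not trap again.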

Jun


* Re: FCSR Management
  2002-09-24 17:37 ` Jun Sun
@ 2002-09-24 19:29   ` Kevin D. Kissell
  2002-09-24 19:29     ` Kevin D. Kissell
  0 siblings, 1 reply; 8+ messages in thread
From: Kevin D. Kissell @ 2002-09-24 19:29 UTC (permalink / raw)
  To: Jun Sun; +Cc: linux-mips

From: "Jun Sun" <jsun@mvista.com>
> > Am I missing something, or is this a problem?
> >
> 
> FPE exceptions, actually almost all exceptions, are cleared before their
> handlers are invoked.  See kernel/entry.S and look for BUILD_HANDLER().
> 
> Those macro defines are really mind-twisting and usually don't show up on
> grep radar...

Right you are.  Thanks.  Maciej gave me a bit of a scare there. ;-)
In an ideal universe, the unmodified FCSR which is correctly 
passed as a parameter to handle_fpe() (now that I look at entry.S,
it all comes back.. ;-) would be passed on as part of the
signal "payload" if SIGFPE is caught, but at least things
aren't drastically broken.

            Kevin K.

