[PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation

All of lore.kernel.org
 help / color / mirror / Atom feed

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03  8:36 ` Kautuk Consul
  0 siblings, 0 replies; 16+ messages in thread
From: Kautuk Consul @ 2012-02-03  8:36 UTC (permalink / raw)
  To: linux-arm-kernel

There is a lot of ARM/VFP hardware which does not invoke the
undefined instruction exception handler (i.e. __und_usr) at all
even in case of an invalid floating point operation in usermode.
For example, 100.0 divided by 0.0 will not lead to the undefined
instruction exception being called by the processor although the
hardware does reliably set the DZC bit in the FPSCR.

Currently, the usermode code will simply proceed to execute without
the SIGFPE signal being raised on that process thus limiting the
developer's knowledge of what really went wrong in his code logic.

This patch enables the kernel to raise the SIGFPE signal when the
faulting thread next calls schedule() in any of the following ways:
- When the thread is interrupted by IRQ
- When the thread calls a system call
- When the thread encounters exception

Although the SIGFPE is not delivered at the exact faulting instruction,
this patch at least allows the system to handle FPU exception scenarios
in a more generic "Linux/Unix" manner.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
---
 arch/arm/include/asm/vfp.h |    2 ++
 arch/arm/vfp/vfpmodule.c   |   15 ++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
index f4ab34f..a37c265 100644
--- a/arch/arm/include/asm/vfp.h
+++ b/arch/arm/include/asm/vfp.h
@@ -71,6 +71,8 @@
 #define FPSCR_UFC		(1<<3)
 #define FPSCR_IXC		(1<<4)
 #define FPSCR_IDC		(1<<7)
+#define FPSCR_CUMULATIVE_EXCEPTION_MASK	\
+		(FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
 
 /* MVFR0 bits */
 #define MVFR0_A_SIMD_BIT	(0)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 8f3ccdd..39824a1 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -52,6 +52,8 @@ unsigned int VFP_arch;
  */
 union vfp_state *vfp_current_hw_state[NR_CPUS];
 
+static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
+
 /*
  * Is 'thread's most up to date state stored in this CPUs hardware?
  * Must be called from non-preemptible context.
@@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
  */
 static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
 {
+	unsigned int fpexc = fmrx(FPEXC);
+
 	if (vfp_state_in_hw(cpu, thread)) {
-		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
+		if ((fpexc & FPEXC_EN) &&
+			(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
+			vfp_raise_sigfpe(0, task_pt_regs(current));
+		fmxr(FPEXC, fpexc & ~FPEXC_EN);
 		vfp_current_hw_state[cpu] = NULL;
 	}
 #ifdef CONFIG_SMP
@@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
 	cpu = get_cpu();
 	if (vfp_current_hw_state[cpu] == vfp)
 		vfp_current_hw_state[cpu] = NULL;
+	if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
+		vfp_raise_sigfpe(0, task_pt_regs(current));
 	fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
 	put_cpu();
 
@@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
 #endif
 
+		if ((fpexc & FPEXC_EN) &&
+				(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
+			vfp_raise_sigfpe(0, task_pt_regs(current));
+
 		/*
 		 * Always disable VFP so we can lazily save/restore the
 		 * old state.
-- 
1.7.6

^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03  8:36 ` Kautuk Consul
  0 siblings, 0 replies; 16+ messages in thread
From: Kautuk Consul @ 2012-02-03  8:36 UTC (permalink / raw)
  To: Russell King; +Cc: linux-arm-kernel, linux-kernel, Mohd. Faris, Kautuk Consul

There is a lot of ARM/VFP hardware which does not invoke the
undefined instruction exception handler (i.e. __und_usr) at all
even in case of an invalid floating point operation in usermode.
For example, 100.0 divided by 0.0 will not lead to the undefined
instruction exception being called by the processor although the
hardware does reliably set the DZC bit in the FPSCR.

Currently, the usermode code will simply proceed to execute without
the SIGFPE signal being raised on that process thus limiting the
developer's knowledge of what really went wrong in his code logic.

This patch enables the kernel to raise the SIGFPE signal when the
faulting thread next calls schedule() in any of the following ways:
- When the thread is interrupted by IRQ
- When the thread calls a system call
- When the thread encounters exception

Although the SIGFPE is not delivered at the exact faulting instruction,
this patch at least allows the system to handle FPU exception scenarios
in a more generic "Linux/Unix" manner.

Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
---
 arch/arm/include/asm/vfp.h |    2 ++
 arch/arm/vfp/vfpmodule.c   |   15 ++++++++++++++-
 2 files changed, 16 insertions(+), 1 deletions(-)

diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
index f4ab34f..a37c265 100644
--- a/arch/arm/include/asm/vfp.h
+++ b/arch/arm/include/asm/vfp.h
@@ -71,6 +71,8 @@
 #define FPSCR_UFC		(1<<3)
 #define FPSCR_IXC		(1<<4)
 #define FPSCR_IDC		(1<<7)
+#define FPSCR_CUMULATIVE_EXCEPTION_MASK	\
+		(FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
 
 /* MVFR0 bits */
 #define MVFR0_A_SIMD_BIT	(0)
diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
index 8f3ccdd..39824a1 100644
--- a/arch/arm/vfp/vfpmodule.c
+++ b/arch/arm/vfp/vfpmodule.c
@@ -52,6 +52,8 @@ unsigned int VFP_arch;
  */
 union vfp_state *vfp_current_hw_state[NR_CPUS];
 
+static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
+
 /*
  * Is 'thread's most up to date state stored in this CPUs hardware?
  * Must be called from non-preemptible context.
@@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
  */
 static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
 {
+	unsigned int fpexc = fmrx(FPEXC);
+
 	if (vfp_state_in_hw(cpu, thread)) {
-		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
+		if ((fpexc & FPEXC_EN) &&
+			(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
+			vfp_raise_sigfpe(0, task_pt_regs(current));
+		fmxr(FPEXC, fpexc & ~FPEXC_EN);
 		vfp_current_hw_state[cpu] = NULL;
 	}
 #ifdef CONFIG_SMP
@@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
 	cpu = get_cpu();
 	if (vfp_current_hw_state[cpu] == vfp)
 		vfp_current_hw_state[cpu] = NULL;
+	if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
+		vfp_raise_sigfpe(0, task_pt_regs(current));
 	fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
 	put_cpu();
 
@@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
 			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
 #endif
 
+		if ((fpexc & FPEXC_EN) &&
+				(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
+			vfp_raise_sigfpe(0, task_pt_regs(current));
+
 		/*
 		 * Always disable VFP so we can lazily save/restore the
 		 * old state.
-- 
1.7.6


^ permalink raw reply related	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03  8:36 ` Kautuk Consul
@ 2012-02-03 13:27   ` Dave Martin
  -1 siblings, 0 replies; 16+ messages in thread
From: Dave Martin @ 2012-02-03 13:27 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
> There is a lot of ARM/VFP hardware which does not invoke the
> undefined instruction exception handler (i.e. __und_usr) at all
> even in case of an invalid floating point operation in usermode.
> For example, 100.0 divided by 0.0 will not lead to the undefined
> instruction exception being called by the processor although the
> hardware does reliably set the DZC bit in the FPSCR.

Which revision of the architecture are you referring to?

In VFPv2, I believe that exception trapping is architecturally required
to work: a CPU which doesn't trap when the corresponding exxception
enable bits in the FPSCR are set is a wrong implementation.

For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
VFPv2); and non-trapping variants VFPv3/VFPv4.

The non-trapping variants are common: Cortex-A8 and A9 for example have
non-trapping VFP implementations.

The architecture specified that writes to those FPSCR bits must be
ignored, and they must always read as 0, in non-trapping implementations.

> Currently, the usermode code will simply proceed to execute without
> the SIGFPE signal being raised on that process thus limiting the
> developer's knowledge of what really went wrong in his code logic.

ISO C specifies that this is exactly what is supposed to happen by
default.  This is very counterintiutive, but try:

double x = 1.0;
double y = 0.0;

int main(void)
{
        return x / y;
}

...on any x86 box or other machine, and you should get no SIGFPE.

You can request trapping exceptions by calling the standard function
feenableexcept(), but ISO C allows that some implementations may not
support trapping exceptions: feenableexcept() can return an error to
indicate this.

Due to a completely separate bug in libc though, we don't return the
error indication.  The code seems to assume the VFPv2 behaviour, and
doesn't consider the possibility that setting those exception enable
bits may fail.

> This patch enables the kernel to raise the SIGFPE signal when the
> faulting thread next calls schedule() in any of the following ways:
> - When the thread is interrupted by IRQ
> - When the thread calls a system call
> - When the thread encounters exception
> 
> Although the SIGFPE is not delivered at the exact faulting instruction,
> this patch at least allows the system to handle FPU exception scenarios
> in a more generic "Linux/Unix" manner.

Whether this is a practical/sensible at all seems questionable:

 a) This must not be unconditional, because userspace programs can
    legitimately ask not to receive SIGFPE on floating-point errors
    (indeed, this is the ISO C default behaviour at program startup,
    before feenableexcept() is called).

 b) There is currently no way for a userspace program to signal to the
    kernel that it wants the signals, except for setting the FPSCR
    exception enable bits (which aren't implemented, and always read as
    zero in non-trapping implementations) -- so without userspace
    explicitly telling the kernel this behaviour is wanted via a special
    non-standard syscall, there is no way to know whether the signal
    should be sent or not.  The default assumption will have to be that
    the signals should not be sent.

 c) Although I can find no exlpicit wording in C or POSIX to say so, I
    believe that most or all code relying on floating-point traps will
    expect the signal to be precise (i.e., triggered on the exact
    instruction which caused the exception).  If we can't guarantee this,
    it's probably better to fix libc and not pretend we can send the
    signals at all.

Checking the cumulative exception bits is potentially useful for
debugging, but you can do that from userspace by calling functions like
fetestexcept(); or a debugger can do it via ptrace().

If we synthesise fake trapping exceptions at all, it would probably have
to be a debugging feature which must explicitly be turned on per process,
via a special prctl() or something.  Most other arches don't seem to
have that, though.  Should we really be putting such functionality in
the kernel?

Cheers
---Dave

> 
> Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
> Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
> ---
>  arch/arm/include/asm/vfp.h |    2 ++
>  arch/arm/vfp/vfpmodule.c   |   15 ++++++++++++++-
>  2 files changed, 16 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
> index f4ab34f..a37c265 100644
> --- a/arch/arm/include/asm/vfp.h
> +++ b/arch/arm/include/asm/vfp.h
> @@ -71,6 +71,8 @@
>  #define FPSCR_UFC		(1<<3)
>  #define FPSCR_IXC		(1<<4)
>  #define FPSCR_IDC		(1<<7)
> +#define FPSCR_CUMULATIVE_EXCEPTION_MASK	\
> +		(FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
>  
>  /* MVFR0 bits */
>  #define MVFR0_A_SIMD_BIT	(0)
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 8f3ccdd..39824a1 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -52,6 +52,8 @@ unsigned int VFP_arch;
>   */
>  union vfp_state *vfp_current_hw_state[NR_CPUS];
>  
> +static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
> +
>  /*
>   * Is 'thread's most up to date state stored in this CPUs hardware?
>   * Must be called from non-preemptible context.
> @@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
>   */
>  static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
>  {
> +	unsigned int fpexc = fmrx(FPEXC);
> +
>  	if (vfp_state_in_hw(cpu, thread)) {
> -		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
> +		if ((fpexc & FPEXC_EN) &&
> +			(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
> +			vfp_raise_sigfpe(0, task_pt_regs(current));
> +		fmxr(FPEXC, fpexc & ~FPEXC_EN);
>  		vfp_current_hw_state[cpu] = NULL;
>  	}
>  #ifdef CONFIG_SMP
> @@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
>  	cpu = get_cpu();
>  	if (vfp_current_hw_state[cpu] == vfp)
>  		vfp_current_hw_state[cpu] = NULL;
> +	if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
> +		vfp_raise_sigfpe(0, task_pt_regs(current));
>  	fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>  	put_cpu();
>  
> @@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>  			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
>  #endif
>  
> +		if ((fpexc & FPEXC_EN) &&
> +				(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
> +			vfp_raise_sigfpe(0, task_pt_regs(current));
> +
>  		/*
>  		 * Always disable VFP so we can lazily save/restore the
>  		 * old state.
> -- 
> 1.7.6
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel at lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 13:27   ` Dave Martin
  0 siblings, 0 replies; 16+ messages in thread
From: Dave Martin @ 2012-02-03 13:27 UTC (permalink / raw)
  To: Kautuk Consul; +Cc: Russell King, Mohd. Faris, linux-kernel, linux-arm-kernel

On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
> There is a lot of ARM/VFP hardware which does not invoke the
> undefined instruction exception handler (i.e. __und_usr) at all
> even in case of an invalid floating point operation in usermode.
> For example, 100.0 divided by 0.0 will not lead to the undefined
> instruction exception being called by the processor although the
> hardware does reliably set the DZC bit in the FPSCR.

Which revision of the architecture are you referring to?

In VFPv2, I believe that exception trapping is architecturally required
to work: a CPU which doesn't trap when the corresponding exxception
enable bits in the FPSCR are set is a wrong implementation.

For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
VFPv2); and non-trapping variants VFPv3/VFPv4.

The non-trapping variants are common: Cortex-A8 and A9 for example have
non-trapping VFP implementations.

The architecture specified that writes to those FPSCR bits must be
ignored, and they must always read as 0, in non-trapping implementations.

> Currently, the usermode code will simply proceed to execute without
> the SIGFPE signal being raised on that process thus limiting the
> developer's knowledge of what really went wrong in his code logic.

ISO C specifies that this is exactly what is supposed to happen by
default.  This is very counterintiutive, but try:

double x = 1.0;
double y = 0.0;

int main(void)
{
        return x / y;
}

...on any x86 box or other machine, and you should get no SIGFPE.

You can request trapping exceptions by calling the standard function
feenableexcept(), but ISO C allows that some implementations may not
support trapping exceptions: feenableexcept() can return an error to
indicate this.

Due to a completely separate bug in libc though, we don't return the
error indication.  The code seems to assume the VFPv2 behaviour, and
doesn't consider the possibility that setting those exception enable
bits may fail.

> This patch enables the kernel to raise the SIGFPE signal when the
> faulting thread next calls schedule() in any of the following ways:
> - When the thread is interrupted by IRQ
> - When the thread calls a system call
> - When the thread encounters exception
> 
> Although the SIGFPE is not delivered at the exact faulting instruction,
> this patch at least allows the system to handle FPU exception scenarios
> in a more generic "Linux/Unix" manner.

Whether this is a practical/sensible at all seems questionable:

 a) This must not be unconditional, because userspace programs can
    legitimately ask not to receive SIGFPE on floating-point errors
    (indeed, this is the ISO C default behaviour at program startup,
    before feenableexcept() is called).

 b) There is currently no way for a userspace program to signal to the
    kernel that it wants the signals, except for setting the FPSCR
    exception enable bits (which aren't implemented, and always read as
    zero in non-trapping implementations) -- so without userspace
    explicitly telling the kernel this behaviour is wanted via a special
    non-standard syscall, there is no way to know whether the signal
    should be sent or not.  The default assumption will have to be that
    the signals should not be sent.

 c) Although I can find no exlpicit wording in C or POSIX to say so, I
    believe that most or all code relying on floating-point traps will
    expect the signal to be precise (i.e., triggered on the exact
    instruction which caused the exception).  If we can't guarantee this,
    it's probably better to fix libc and not pretend we can send the
    signals at all.

Checking the cumulative exception bits is potentially useful for
debugging, but you can do that from userspace by calling functions like
fetestexcept(); or a debugger can do it via ptrace().

If we synthesise fake trapping exceptions at all, it would probably have
to be a debugging feature which must explicitly be turned on per process,
via a special prctl() or something.  Most other arches don't seem to
have that, though.  Should we really be putting such functionality in
the kernel?

Cheers
---Dave

> 
> Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
> Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
> ---
>  arch/arm/include/asm/vfp.h |    2 ++
>  arch/arm/vfp/vfpmodule.c   |   15 ++++++++++++++-
>  2 files changed, 16 insertions(+), 1 deletions(-)
> 
> diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
> index f4ab34f..a37c265 100644
> --- a/arch/arm/include/asm/vfp.h
> +++ b/arch/arm/include/asm/vfp.h
> @@ -71,6 +71,8 @@
>  #define FPSCR_UFC		(1<<3)
>  #define FPSCR_IXC		(1<<4)
>  #define FPSCR_IDC		(1<<7)
> +#define FPSCR_CUMULATIVE_EXCEPTION_MASK	\
> +		(FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
>  
>  /* MVFR0 bits */
>  #define MVFR0_A_SIMD_BIT	(0)
> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
> index 8f3ccdd..39824a1 100644
> --- a/arch/arm/vfp/vfpmodule.c
> +++ b/arch/arm/vfp/vfpmodule.c
> @@ -52,6 +52,8 @@ unsigned int VFP_arch;
>   */
>  union vfp_state *vfp_current_hw_state[NR_CPUS];
>  
> +static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
> +
>  /*
>   * Is 'thread's most up to date state stored in this CPUs hardware?
>   * Must be called from non-preemptible context.
> @@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
>   */
>  static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
>  {
> +	unsigned int fpexc = fmrx(FPEXC);
> +
>  	if (vfp_state_in_hw(cpu, thread)) {
> -		fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
> +		if ((fpexc & FPEXC_EN) &&
> +			(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
> +			vfp_raise_sigfpe(0, task_pt_regs(current));
> +		fmxr(FPEXC, fpexc & ~FPEXC_EN);
>  		vfp_current_hw_state[cpu] = NULL;
>  	}
>  #ifdef CONFIG_SMP
> @@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
>  	cpu = get_cpu();
>  	if (vfp_current_hw_state[cpu] == vfp)
>  		vfp_current_hw_state[cpu] = NULL;
> +	if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
> +		vfp_raise_sigfpe(0, task_pt_regs(current));
>  	fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>  	put_cpu();
>  
> @@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>  			vfp_save_state(vfp_current_hw_state[cpu], fpexc);
>  #endif
>  
> +		if ((fpexc & FPEXC_EN) &&
> +				(fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
> +			vfp_raise_sigfpe(0, task_pt_regs(current));
> +
>  		/*
>  		 * Always disable VFP so we can lazily save/restore the
>  		 * old state.
> -- 
> 1.7.6
> 
> 
> _______________________________________________
> linux-arm-kernel mailing list
> linux-arm-kernel@lists.infradead.org
> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03 13:27   ` Dave Martin
@ 2012-02-03 14:37     ` Kautuk Consul
  -1 siblings, 0 replies; 16+ messages in thread
From: Kautuk Consul @ 2012-02-03 14:37 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 3, 2012 at 6:57 PM, Dave Martin <dave.martin@linaro.org> wrote:
> On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
>> There is a lot of ARM/VFP hardware which does not invoke the
>> undefined instruction exception handler (i.e. __und_usr) at all
>> even in case of an invalid floating point operation in usermode.
>> For example, 100.0 divided by 0.0 will not lead to the undefined
>> instruction exception being called by the processor although the
>> hardware does reliably set the DZC bit in the FPSCR.
>
> Which revision of the architecture are you referring to?
>
> In VFPv2, I believe that exception trapping is architecturally required
> to work: a CPU which doesn't trap when the corresponding exxception
> enable bits in the FPSCR are set is a wrong implementation.
>
> For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
> architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
> VFPv2); and non-trapping variants VFPv3/VFPv4.

Yes, my board is a VFPv3 on ARMv7+.

>
> The non-trapping variants are common: Cortex-A8 and A9 for example have
> non-trapping VFP implementations.

Yes, I am using a Cortex-A8.

>
> The architecture specified that writes to those FPSCR bits must be
> ignored, and they must always read as 0, in non-trapping implementations.

The **E bits(DZE/IOE/etc) are non-programmable on my system and they
are set to 0
however I try to play with them.
However, the **C bits (DZC/IOC/etc) get set properly whenever the
corresponding exception situation
occurs. The only programming I need to do for this is to enable the
FPEXC_EN bit.
Of course, FPEXC_EX is also non-programmable on my system.

>
>> Currently, the usermode code will simply proceed to execute without
>> the SIGFPE signal being raised on that process thus limiting the
>> developer's knowledge of what really went wrong in his code logic.
>
> ISO C specifies that this is exactly what is supposed to happen by
> default. ?This is very counterintiutive, but try:
>
> double x = 1.0;
> double y = 0.0;
>
> int main(void)
> {
> ? ? ? ?return x / y;
> }
>
> ...on any x86 box or other machine, and you should get no SIGFPE.

:).
Even with this patch the app will not get a SIGFPE.
The SIGFPE will be delivered anytime the application next gets preempted due to
a call to schedule() if any of the **C bits are set.

>
> You can request trapping exceptions by calling the standard function
> feenableexcept(), but ISO C allows that some implementations may not
> support trapping exceptions: feenableexcept() can return an error to
> indicate this.
>
> Due to a completely separate bug in libc though, we don't return the
> error indication. ?The code seems to assume the VFPv2 behaviour, and
> doesn't consider the possibility that setting those exception enable
> bits may fail.

Yup.
I read a few cases on the mailing lists where people seem to be
complaining about
no SIGFPE being raised for 0.0 / 0.0 for various embedded apps so I
considered that
this might be a common problem people face.

>
>> This patch enables the kernel to raise the SIGFPE signal when the
>> faulting thread next calls schedule() in any of the following ways:
>> - When the thread is interrupted by IRQ
>> - When the thread calls a system call
>> - When the thread encounters exception
>>
>> Although the SIGFPE is not delivered at the exact faulting instruction,
>> this patch at least allows the system to handle FPU exception scenarios
>> in a more generic "Linux/Unix" manner.
>
> Whether this is a practical/sensible at all seems questionable:
>
> ?a) This must not be unconditional, because userspace programs can
> ? ?legitimately ask not to receive SIGFPE on floating-point errors
> ? ?(indeed, this is the ISO C default behaviour at program startup,
> ? ?before feenableexcept() is called).

I understand.
Maybe we can control this functionality with a config option ?

>
> ?b) There is currently no way for a userspace program to signal to the
> ? ?kernel that it wants the signals, except for setting the FPSCR
> ? ?exception enable bits (which aren't implemented, and always read as
> ? ?zero in non-trapping implementations) -- so without userspace
> ? ?explicitly telling the kernel this behaviour is wanted via a special
> ? ?non-standard syscall, there is no way to know whether the signal
> ? ?should be sent or not. ?The default assumption will have to be that
> ? ?the signals should not be sent.
>

Hmm .. maybe we could control all of this via kernel config options
instead of the
user making so many hardware based decisions from usermode ?

> ?c) Although I can find no exlpicit wording in C or POSIX to say so, I
> ? ?believe that most or all code relying on floating-point traps will
> ? ?expect the signal to be precise (i.e., triggered on the exact
> ? ?instruction which caused the exception). ?If we can't guarantee this,
> ? ?it's probably better to fix libc and not pretend we can send the
> ? ?signals at all.
>

As far as the VFP exception handling is concerned, I found out from
the ARM Infocenter
website that the VFP can only raise imprecise exceptions via the
__und_usr trap which
will lead to the VFP_bounce.
Since this exception will be imprecise anyways, we also cannot pretend
that we can send
precise exceptions on ARM whatever C/POSIZ say. :)

> Checking the cumulative exception bits is potentially useful for
> debugging, but you can do that from userspace by calling functions like
> fetestexcept(); or a debugger can do it via ptrace().
>
> If we synthesise fake trapping exceptions at all, it would probably have
> to be a debugging feature which must explicitly be turned on per process,
> via a special prctl() or something. ?Most other arches don't seem to
> have that, though. ?Should we really be putting such functionality in
> the kernel?
>
> Cheers
> ---Dave
>
>>
>> Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
>> Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
>> ---
>> ?arch/arm/include/asm/vfp.h | ? ?2 ++
>> ?arch/arm/vfp/vfpmodule.c ? | ? 15 ++++++++++++++-
>> ?2 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
>> index f4ab34f..a37c265 100644
>> --- a/arch/arm/include/asm/vfp.h
>> +++ b/arch/arm/include/asm/vfp.h
>> @@ -71,6 +71,8 @@
>> ?#define FPSCR_UFC ? ? ? ? ? ?(1<<3)
>> ?#define FPSCR_IXC ? ? ? ? ? ?(1<<4)
>> ?#define FPSCR_IDC ? ? ? ? ? ?(1<<7)
>> +#define FPSCR_CUMULATIVE_EXCEPTION_MASK ? ? ?\
>> + ? ? ? ? ? ? (FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
>>
>> ?/* MVFR0 bits */
>> ?#define MVFR0_A_SIMD_BIT ? ? (0)
>> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
>> index 8f3ccdd..39824a1 100644
>> --- a/arch/arm/vfp/vfpmodule.c
>> +++ b/arch/arm/vfp/vfpmodule.c
>> @@ -52,6 +52,8 @@ unsigned int VFP_arch;
>> ? */
>> ?union vfp_state *vfp_current_hw_state[NR_CPUS];
>>
>> +static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
>> +
>> ?/*
>> ? * Is 'thread's most up to date state stored in this CPUs hardware?
>> ? * Must be called from non-preemptible context.
>> @@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
>> ? */
>> ?static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
>> ?{
>> + ? ? unsigned int fpexc = fmrx(FPEXC);
>> +
>> ? ? ? if (vfp_state_in_hw(cpu, thread)) {
>> - ? ? ? ? ? ? fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>> + ? ? ? ? ? ? if ((fpexc & FPEXC_EN) &&
>> + ? ? ? ? ? ? ? ? ? ? (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
>> + ? ? ? ? ? ? ? ? ? ? vfp_raise_sigfpe(0, task_pt_regs(current));
>> + ? ? ? ? ? ? fmxr(FPEXC, fpexc & ~FPEXC_EN);
>> ? ? ? ? ? ? ? vfp_current_hw_state[cpu] = NULL;
>> ? ? ? }
>> ?#ifdef CONFIG_SMP
>> @@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
>> ? ? ? cpu = get_cpu();
>> ? ? ? if (vfp_current_hw_state[cpu] == vfp)
>> ? ? ? ? ? ? ? vfp_current_hw_state[cpu] = NULL;
>> + ? ? if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
>> + ? ? ? ? ? ? vfp_raise_sigfpe(0, task_pt_regs(current));
>> ? ? ? fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>> ? ? ? put_cpu();
>>
>> @@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>> ? ? ? ? ? ? ? ? ? ? ? vfp_save_state(vfp_current_hw_state[cpu], fpexc);
>> ?#endif
>>
>> + ? ? ? ? ? ? if ((fpexc & FPEXC_EN) &&
>> + ? ? ? ? ? ? ? ? ? ? ? ? ? ? (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
>> + ? ? ? ? ? ? ? ? ? ? vfp_raise_sigfpe(0, task_pt_regs(current));
>> +
>> ? ? ? ? ? ? ? /*
>> ? ? ? ? ? ? ? ?* Always disable VFP so we can lazily save/restore the
>> ? ? ? ? ? ? ? ?* old state.
>> --
>> 1.7.6
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel at lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 14:37     ` Kautuk Consul
  0 siblings, 0 replies; 16+ messages in thread
From: Kautuk Consul @ 2012-02-03 14:37 UTC (permalink / raw)
  To: Dave Martin; +Cc: Russell King, Mohd. Faris, linux-kernel, linux-arm-kernel

On Fri, Feb 3, 2012 at 6:57 PM, Dave Martin <dave.martin@linaro.org> wrote:
> On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
>> There is a lot of ARM/VFP hardware which does not invoke the
>> undefined instruction exception handler (i.e. __und_usr) at all
>> even in case of an invalid floating point operation in usermode.
>> For example, 100.0 divided by 0.0 will not lead to the undefined
>> instruction exception being called by the processor although the
>> hardware does reliably set the DZC bit in the FPSCR.
>
> Which revision of the architecture are you referring to?
>
> In VFPv2, I believe that exception trapping is architecturally required
> to work: a CPU which doesn't trap when the corresponding exxception
> enable bits in the FPSCR are set is a wrong implementation.
>
> For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
> architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
> VFPv2); and non-trapping variants VFPv3/VFPv4.

Yes, my board is a VFPv3 on ARMv7+.

>
> The non-trapping variants are common: Cortex-A8 and A9 for example have
> non-trapping VFP implementations.

Yes, I am using a Cortex-A8.

>
> The architecture specified that writes to those FPSCR bits must be
> ignored, and they must always read as 0, in non-trapping implementations.

The **E bits(DZE/IOE/etc) are non-programmable on my system and they
are set to 0
however I try to play with them.
However, the **C bits (DZC/IOC/etc) get set properly whenever the
corresponding exception situation
occurs. The only programming I need to do for this is to enable the
FPEXC_EN bit.
Of course, FPEXC_EX is also non-programmable on my system.

>
>> Currently, the usermode code will simply proceed to execute without
>> the SIGFPE signal being raised on that process thus limiting the
>> developer's knowledge of what really went wrong in his code logic.
>
> ISO C specifies that this is exactly what is supposed to happen by
> default.  This is very counterintiutive, but try:
>
> double x = 1.0;
> double y = 0.0;
>
> int main(void)
> {
>        return x / y;
> }
>
> ...on any x86 box or other machine, and you should get no SIGFPE.

:).
Even with this patch the app will not get a SIGFPE.
The SIGFPE will be delivered anytime the application next gets preempted due to
a call to schedule() if any of the **C bits are set.

>
> You can request trapping exceptions by calling the standard function
> feenableexcept(), but ISO C allows that some implementations may not
> support trapping exceptions: feenableexcept() can return an error to
> indicate this.
>
> Due to a completely separate bug in libc though, we don't return the
> error indication.  The code seems to assume the VFPv2 behaviour, and
> doesn't consider the possibility that setting those exception enable
> bits may fail.

Yup.
I read a few cases on the mailing lists where people seem to be
complaining about
no SIGFPE being raised for 0.0 / 0.0 for various embedded apps so I
considered that
this might be a common problem people face.

>
>> This patch enables the kernel to raise the SIGFPE signal when the
>> faulting thread next calls schedule() in any of the following ways:
>> - When the thread is interrupted by IRQ
>> - When the thread calls a system call
>> - When the thread encounters exception
>>
>> Although the SIGFPE is not delivered at the exact faulting instruction,
>> this patch at least allows the system to handle FPU exception scenarios
>> in a more generic "Linux/Unix" manner.
>
> Whether this is a practical/sensible at all seems questionable:
>
>  a) This must not be unconditional, because userspace programs can
>    legitimately ask not to receive SIGFPE on floating-point errors
>    (indeed, this is the ISO C default behaviour at program startup,
>    before feenableexcept() is called).

I understand.
Maybe we can control this functionality with a config option ?

>
>  b) There is currently no way for a userspace program to signal to the
>    kernel that it wants the signals, except for setting the FPSCR
>    exception enable bits (which aren't implemented, and always read as
>    zero in non-trapping implementations) -- so without userspace
>    explicitly telling the kernel this behaviour is wanted via a special
>    non-standard syscall, there is no way to know whether the signal
>    should be sent or not.  The default assumption will have to be that
>    the signals should not be sent.
>

Hmm .. maybe we could control all of this via kernel config options
instead of the
user making so many hardware based decisions from usermode ?

>  c) Although I can find no exlpicit wording in C or POSIX to say so, I
>    believe that most or all code relying on floating-point traps will
>    expect the signal to be precise (i.e., triggered on the exact
>    instruction which caused the exception).  If we can't guarantee this,
>    it's probably better to fix libc and not pretend we can send the
>    signals at all.
>

As far as the VFP exception handling is concerned, I found out from
the ARM Infocenter
website that the VFP can only raise imprecise exceptions via the
__und_usr trap which
will lead to the VFP_bounce.
Since this exception will be imprecise anyways, we also cannot pretend
that we can send
precise exceptions on ARM whatever C/POSIZ say. :)

> Checking the cumulative exception bits is potentially useful for
> debugging, but you can do that from userspace by calling functions like
> fetestexcept(); or a debugger can do it via ptrace().
>
> If we synthesise fake trapping exceptions at all, it would probably have
> to be a debugging feature which must explicitly be turned on per process,
> via a special prctl() or something.  Most other arches don't seem to
> have that, though.  Should we really be putting such functionality in
> the kernel?
>
> Cheers
> ---Dave
>
>>
>> Signed-off-by: Kautuk Consul <consul.kautuk@gmail.com>
>> Signed-off-by: Mohd. Faris <mohdfarisq2010@gmail.com>
>> ---
>>  arch/arm/include/asm/vfp.h |    2 ++
>>  arch/arm/vfp/vfpmodule.c   |   15 ++++++++++++++-
>>  2 files changed, 16 insertions(+), 1 deletions(-)
>>
>> diff --git a/arch/arm/include/asm/vfp.h b/arch/arm/include/asm/vfp.h
>> index f4ab34f..a37c265 100644
>> --- a/arch/arm/include/asm/vfp.h
>> +++ b/arch/arm/include/asm/vfp.h
>> @@ -71,6 +71,8 @@
>>  #define FPSCR_UFC            (1<<3)
>>  #define FPSCR_IXC            (1<<4)
>>  #define FPSCR_IDC            (1<<7)
>> +#define FPSCR_CUMULATIVE_EXCEPTION_MASK      \
>> +             (FPSCR_IOC|FPSCR_DZC|FPSCR_OFC|FPSCR_UFC|FPSCR_IXC|FPSCR_IDC)
>>
>>  /* MVFR0 bits */
>>  #define MVFR0_A_SIMD_BIT     (0)
>> diff --git a/arch/arm/vfp/vfpmodule.c b/arch/arm/vfp/vfpmodule.c
>> index 8f3ccdd..39824a1 100644
>> --- a/arch/arm/vfp/vfpmodule.c
>> +++ b/arch/arm/vfp/vfpmodule.c
>> @@ -52,6 +52,8 @@ unsigned int VFP_arch;
>>   */
>>  union vfp_state *vfp_current_hw_state[NR_CPUS];
>>
>> +static void vfp_raise_sigfpe(unsigned int, struct pt_regs *);
>> +
>>  /*
>>   * Is 'thread's most up to date state stored in this CPUs hardware?
>>   * Must be called from non-preemptible context.
>> @@ -72,8 +74,13 @@ static bool vfp_state_in_hw(unsigned int cpu, struct thread_info *thread)
>>   */
>>  static void vfp_force_reload(unsigned int cpu, struct thread_info *thread)
>>  {
>> +     unsigned int fpexc = fmrx(FPEXC);
>> +
>>       if (vfp_state_in_hw(cpu, thread)) {
>> -             fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>> +             if ((fpexc & FPEXC_EN) &&
>> +                     (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
>> +                     vfp_raise_sigfpe(0, task_pt_regs(current));
>> +             fmxr(FPEXC, fpexc & ~FPEXC_EN);
>>               vfp_current_hw_state[cpu] = NULL;
>>       }
>>  #ifdef CONFIG_SMP
>> @@ -100,6 +107,8 @@ static void vfp_thread_flush(struct thread_info *thread)
>>       cpu = get_cpu();
>>       if (vfp_current_hw_state[cpu] == vfp)
>>               vfp_current_hw_state[cpu] = NULL;
>> +     if (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK)
>> +             vfp_raise_sigfpe(0, task_pt_regs(current));
>>       fmxr(FPEXC, fmrx(FPEXC) & ~FPEXC_EN);
>>       put_cpu();
>>
>> @@ -181,6 +190,10 @@ static int vfp_notifier(struct notifier_block *self, unsigned long cmd, void *v)
>>                       vfp_save_state(vfp_current_hw_state[cpu], fpexc);
>>  #endif
>>
>> +             if ((fpexc & FPEXC_EN) &&
>> +                             (fmrx(FPSCR) & FPSCR_CUMULATIVE_EXCEPTION_MASK))
>> +                     vfp_raise_sigfpe(0, task_pt_regs(current));
>> +
>>               /*
>>                * Always disable VFP so we can lazily save/restore the
>>                * old state.
>> --
>> 1.7.6
>>
>>
>> _______________________________________________
>> linux-arm-kernel mailing list
>> linux-arm-kernel@lists.infradead.org
>> http://lists.infradead.org/mailman/listinfo/linux-arm-kernel

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03 14:37     ` Kautuk Consul
@ 2012-02-03 15:11       ` Russell King - ARM Linux
  -1 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-02-03 15:11 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 03, 2012 at 08:07:02PM +0530, Kautuk Consul wrote:
> The **E bits(DZE/IOE/etc) are non-programmable on my system and they
> are set to 0
> however I try to play with them.

If the E bits are always zero, your VFP is incapable of _signalling_
the corresponding exception conditions.  Or, to put it another way,
those exception conditions are always masked.

And, if DZE is always zero, which is the divide-by-zero exception, then
obviously it won't raise an exception when you _do_ ask it to divide by
zero.  Instead, it will just set the cumulative exception status.

And, again, obviously, if it doesn't raise an exception, there is no
way for the system to deliver a SIGFPE to the user process.

Userspace does need to deal with this - as Dave points out, having
feenableexcept() return an error of the *E bits can't be set would
seem to be the sensible thing to do.  Whether or not user programs
even use that call (most, I suspect don't) is a separate problem.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 15:11       ` Russell King - ARM Linux
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-02-03 15:11 UTC (permalink / raw)
  To: Kautuk Consul; +Cc: Dave Martin, Mohd. Faris, linux-kernel, linux-arm-kernel

On Fri, Feb 03, 2012 at 08:07:02PM +0530, Kautuk Consul wrote:
> The **E bits(DZE/IOE/etc) are non-programmable on my system and they
> are set to 0
> however I try to play with them.

If the E bits are always zero, your VFP is incapable of _signalling_
the corresponding exception conditions.  Or, to put it another way,
those exception conditions are always masked.

And, if DZE is always zero, which is the divide-by-zero exception, then
obviously it won't raise an exception when you _do_ ask it to divide by
zero.  Instead, it will just set the cumulative exception status.

And, again, obviously, if it doesn't raise an exception, there is no
way for the system to deliver a SIGFPE to the user process.

Userspace does need to deal with this - as Dave points out, having
feenableexcept() return an error of the *E bits can't be set would
seem to be the sensible thing to do.  Whether or not user programs
even use that call (most, I suspect don't) is a separate problem.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03 13:27   ` Dave Martin
@ 2012-02-03 14:45     ` Mikael Pettersson
  -1 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 14:45 UTC (permalink / raw)
  To: linux-arm-kernel

Dave Martin writes:
 > On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
 > > There is a lot of ARM/VFP hardware which does not invoke the
 > > undefined instruction exception handler (i.e. __und_usr) at all
 > > even in case of an invalid floating point operation in usermode.
 > > For example, 100.0 divided by 0.0 will not lead to the undefined
 > > instruction exception being called by the processor although the
 > > hardware does reliably set the DZC bit in the FPSCR.
 > 
 > Which revision of the architecture are you referring to?
 > 
 > In VFPv2, I believe that exception trapping is architecturally required
 > to work: a CPU which doesn't trap when the corresponding exxception
 > enable bits in the FPSCR are set is a wrong implementation.
 > 
 > For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
 > architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
 > VFPv2); and non-trapping variants VFPv3/VFPv4.
 > 
 > The non-trapping variants are common: Cortex-A8 and A9 for example have
 > non-trapping VFP implementations.
 > 
 > The architecture specified that writes to those FPSCR bits must be
 > ignored, and they must always read as 0, in non-trapping implementations.

If it's documented that non-trapping implementations are allowed,
and that they can be detected by the FPSCR bits being immutable zeros,
then there is no problem here for the kernel to deal with.  It's simply
a buggy application or buggy libc that fails to check if the FPU is
trapping or not.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 14:45     ` Mikael Pettersson
  0 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 14:45 UTC (permalink / raw)
  To: Dave Martin
  Cc: Kautuk Consul, linux-arm-kernel, Russell King, linux-kernel,
	Mohd. Faris

Dave Martin writes:
 > On Fri, Feb 03, 2012 at 02:06:24PM +0530, Kautuk Consul wrote:
 > > There is a lot of ARM/VFP hardware which does not invoke the
 > > undefined instruction exception handler (i.e. __und_usr) at all
 > > even in case of an invalid floating point operation in usermode.
 > > For example, 100.0 divided by 0.0 will not lead to the undefined
 > > instruction exception being called by the processor although the
 > > hardware does reliably set the DZC bit in the FPSCR.
 > 
 > Which revision of the architecture are you referring to?
 > 
 > In VFPv2, I believe that exception trapping is architecturally required
 > to work: a CPU which doesn't trap when the corresponding exxception
 > enable bits in the FPSCR are set is a wrong implementation.
 > 
 > For VFPv3 and VFPv4 (on ARMv7+), there are two variants specified by the
 > architecture: "U" variants VFPv3U/VFPv4U where trapping must work (as in
 > VFPv2); and non-trapping variants VFPv3/VFPv4.
 > 
 > The non-trapping variants are common: Cortex-A8 and A9 for example have
 > non-trapping VFP implementations.
 > 
 > The architecture specified that writes to those FPSCR bits must be
 > ignored, and they must always read as 0, in non-trapping implementations.

If it's documented that non-trapping implementations are allowed,
and that they can be detected by the FPSCR bits being immutable zeros,
then there is no problem here for the kernel to deal with.  It's simply
a buggy application or buggy libc that fails to check if the FPU is
trapping or not.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03  8:36 ` Kautuk Consul
@ 2012-02-03 13:32   ` Mikael Pettersson
  -1 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 13:32 UTC (permalink / raw)
  To: linux-arm-kernel

Kautuk Consul writes:
 > There is a lot of ARM/VFP hardware which does not invoke the
 > undefined instruction exception handler (i.e. __und_usr) at all
 > even in case of an invalid floating point operation in usermode.
 > For example, 100.0 divided by 0.0 will not lead to the undefined
 > instruction exception being called by the processor although the
 > hardware does reliably set the DZC bit in the FPSCR.

There is hardware that pretends to support FP exceptions but
fails to actually do so?  That's awful.  What manufacturers
and CPU models are affected?

 > Currently, the usermode code will simply proceed to execute without
 > the SIGFPE signal being raised on that process thus limiting the
 > developer's knowledge of what really went wrong in his code logic.
 > 
 > This patch enables the kernel to raise the SIGFPE signal when the
 > faulting thread next calls schedule() in any of the following ways:
 > - When the thread is interrupted by IRQ
 > - When the thread calls a system call
 > - When the thread encounters exception
 > 
 > Although the SIGFPE is not delivered at the exact faulting instruction,
 > this patch at least allows the system to handle FPU exception scenarios
 > in a more generic "Linux/Unix" manner.

No, a SIGFPE delivered at the wrong point in time with the wrong
context in its sigframe is MUCH worse than not getting a SIGFPE
at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)

I do agree that the application developer must be informed about
the breakage, but a fake SIGFPE is not appropriate.  Options:

1. a rate-limited printk KERN_WARN that application X tried to
receive FP exceptions but hardware Y failed to do so

2. a tunable knob somehwere that optionally kills the application
in the above case

/Mikael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 13:32   ` Mikael Pettersson
  0 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 13:32 UTC (permalink / raw)
  To: Kautuk Consul; +Cc: Russell King, Mohd. Faris, linux-kernel, linux-arm-kernel

Kautuk Consul writes:
 > There is a lot of ARM/VFP hardware which does not invoke the
 > undefined instruction exception handler (i.e. __und_usr) at all
 > even in case of an invalid floating point operation in usermode.
 > For example, 100.0 divided by 0.0 will not lead to the undefined
 > instruction exception being called by the processor although the
 > hardware does reliably set the DZC bit in the FPSCR.

There is hardware that pretends to support FP exceptions but
fails to actually do so?  That's awful.  What manufacturers
and CPU models are affected?

 > Currently, the usermode code will simply proceed to execute without
 > the SIGFPE signal being raised on that process thus limiting the
 > developer's knowledge of what really went wrong in his code logic.
 > 
 > This patch enables the kernel to raise the SIGFPE signal when the
 > faulting thread next calls schedule() in any of the following ways:
 > - When the thread is interrupted by IRQ
 > - When the thread calls a system call
 > - When the thread encounters exception
 > 
 > Although the SIGFPE is not delivered at the exact faulting instruction,
 > this patch at least allows the system to handle FPU exception scenarios
 > in a more generic "Linux/Unix" manner.

No, a SIGFPE delivered at the wrong point in time with the wrong
context in its sigframe is MUCH worse than not getting a SIGFPE
at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)

I do agree that the application developer must be informed about
the breakage, but a fake SIGFPE is not appropriate.  Options:

1. a rate-limited printk KERN_WARN that application X tried to
receive FP exceptions but hardware Y failed to do so

2. a tunable knob somehwere that optionally kills the application
in the above case

/Mikael

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03 13:32   ` Mikael Pettersson
@ 2012-02-03 14:06     ` Russell King - ARM Linux
  -1 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-02-03 14:06 UTC (permalink / raw)
  To: linux-arm-kernel

On Fri, Feb 03, 2012 at 02:32:14PM +0100, Mikael Pettersson wrote:
> No, a SIGFPE delivered at the wrong point in time with the wrong
> context in its sigframe is MUCH worse than not getting a SIGFPE
> at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)

If your FP is pipelined, then you won't get a SIGFPE for a div0 situation
as soon as the instruction appears in the program.  Think about what's
happening.

1. The FP hardware may be occupied with a computation.
2. The program issues the divide instruction, the FP hardware accepts
   this.  Meanwhile, the integer part of the core continues processing
   instructions.
3. The FP hardware completes its computation, and gets to execute the
   divide instruction.
4. The FP hardware discovers a divide-by-zero situation, and flags it
   in its status register.

At this point, there's no way for the FP hardware to flag that situation
to the integer core (there's no interrupt.)  The failure gets flagged
when the program executes the next FP instruction, and is raised by the
FP hardware refusing to accept that instruction with an exception status.

The kernel can't back-track the execution path to find out where the
divide instruction was, and the hardware doesn't tell you what address
the instruction was (the FP hardware effectively just gets to listen to
the integer core's instruction stream when the integer core sees a
'co-processor' instruction.)

The only way FP would be able to abort a divide-by-zero immediately is
if it halted the integer core while it started the divide operation,
preventing it from running integer computations in parallel with the FP
hardware.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 14:06     ` Russell King - ARM Linux
  0 siblings, 0 replies; 16+ messages in thread
From: Russell King - ARM Linux @ 2012-02-03 14:06 UTC (permalink / raw)
  To: Mikael Pettersson
  Cc: Kautuk Consul, linux-arm-kernel, linux-kernel, Mohd. Faris

On Fri, Feb 03, 2012 at 02:32:14PM +0100, Mikael Pettersson wrote:
> No, a SIGFPE delivered at the wrong point in time with the wrong
> context in its sigframe is MUCH worse than not getting a SIGFPE
> at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)

If your FP is pipelined, then you won't get a SIGFPE for a div0 situation
as soon as the instruction appears in the program.  Think about what's
happening.

1. The FP hardware may be occupied with a computation.
2. The program issues the divide instruction, the FP hardware accepts
   this.  Meanwhile, the integer part of the core continues processing
   instructions.
3. The FP hardware completes its computation, and gets to execute the
   divide instruction.
4. The FP hardware discovers a divide-by-zero situation, and flags it
   in its status register.

At this point, there's no way for the FP hardware to flag that situation
to the integer core (there's no interrupt.)  The failure gets flagged
when the program executes the next FP instruction, and is raised by the
FP hardware refusing to accept that instruction with an exception status.

The kernel can't back-track the execution path to find out where the
divide instruction was, and the hardware doesn't tell you what address
the instruction was (the FP hardware effectively just gets to listen to
the integer core's instruction stream when the integer core sees a
'co-processor' instruction.)

The only way FP would be able to abort a divide-by-zero immediately is
if it halted the integer core while it started the divide operation,
preventing it from running integer computations in parallel with the FP
hardware.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
  2012-02-03 14:06     ` Russell King - ARM Linux
@ 2012-02-03 14:41       ` Mikael Pettersson
  -1 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 14:41 UTC (permalink / raw)
  To: linux-arm-kernel

Russell King - ARM Linux writes:
 > On Fri, Feb 03, 2012 at 02:32:14PM +0100, Mikael Pettersson wrote:
 > > No, a SIGFPE delivered at the wrong point in time with the wrong
 > > context in its sigframe is MUCH worse than not getting a SIGFPE
 > > at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)
 > 
 > If your FP is pipelined, then you won't get a SIGFPE for a div0 situation
 > as soon as the instruction appears in the program.  Think about what's
 > happening.
 > 
 > 1. The FP hardware may be occupied with a computation.
 > 2. The program issues the divide instruction, the FP hardware accepts
 >    this.  Meanwhile, the integer part of the core continues processing
 >    instructions.
 > 3. The FP hardware completes its computation, and gets to execute the
 >    divide instruction.
 > 4. The FP hardware discovers a divide-by-zero situation, and flags it
 >    in its status register.
 > 
 > At this point, there's no way for the FP hardware to flag that situation
 > to the integer core (there's no interrupt.)  The failure gets flagged
 > when the program executes the next FP instruction, and is raised by the
 > FP hardware refusing to accept that instruction with an exception status.

It's OK to deliver on the next FP insn, that's a fairly common design
and not too difficult to handle (at least not for what I've used it for,
namely to use FP exns instead of explicit NaN/Inf checks in the runtime
of the Erlang VM, and to turn the FP exns into language-level exceptions).

My reaction was against delivering at some random point in the future
where possibly incorrect decisions have already been made, and where
the state is arbitrarily different from the one at the exn origin.

However, having read Dave Martin's response I now think the problem is
a buggy application or libc, and not something the kernel should care
about.

^ permalink raw reply	[flat|nested] 16+ messages in thread

* Re: [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation
@ 2012-02-03 14:41       ` Mikael Pettersson
  0 siblings, 0 replies; 16+ messages in thread
From: Mikael Pettersson @ 2012-02-03 14:41 UTC (permalink / raw)
  To: Russell King - ARM Linux
  Cc: Mikael Pettersson, Mohd. Faris, Kautuk Consul, linux-arm-kernel,
	linux-kernel

Russell King - ARM Linux writes:
 > On Fri, Feb 03, 2012 at 02:32:14PM +0100, Mikael Pettersson wrote:
 > > No, a SIGFPE delivered at the wrong point in time with the wrong
 > > context in its sigframe is MUCH worse than not getting a SIGFPE
 > > at all.  (And likewise for all other trap signals, SEGV, ILL, etc.)
 > 
 > If your FP is pipelined, then you won't get a SIGFPE for a div0 situation
 > as soon as the instruction appears in the program.  Think about what's
 > happening.
 > 
 > 1. The FP hardware may be occupied with a computation.
 > 2. The program issues the divide instruction, the FP hardware accepts
 >    this.  Meanwhile, the integer part of the core continues processing
 >    instructions.
 > 3. The FP hardware completes its computation, and gets to execute the
 >    divide instruction.
 > 4. The FP hardware discovers a divide-by-zero situation, and flags it
 >    in its status register.
 > 
 > At this point, there's no way for the FP hardware to flag that situation
 > to the integer core (there's no interrupt.)  The failure gets flagged
 > when the program executes the next FP instruction, and is raised by the
 > FP hardware refusing to accept that instruction with an exception status.

It's OK to deliver on the next FP insn, that's a fairly common design
and not too difficult to handle (at least not for what I've used it for,
namely to use FP exns instead of explicit NaN/Inf checks in the runtime
of the Erlang VM, and to turn the FP exns into language-level exceptions).

My reaction was against delivering at some random point in the future
where possibly incorrect decisions have already been made, and where
the state is arbitrarily different from the one at the exn origin.

However, having read Dave Martin's response I now think the problem is
a buggy application or libc, and not something the kernel should care
about.

^ permalink raw reply	[flat|nested] 16+ messages in thread

end of thread, other threads:[~2012-02-03 15:11 UTC | newest]

Thread overview: 16+ messages (download: mbox.gz follow: Atom feed
-- links below jump to the message on this page --
2012-02-03  8:36 [PATCH 1/1] arm: vfp: Raising SIGFPE on invalid floating point operation Kautuk Consul
2012-02-03  8:36 ` Kautuk Consul
2012-02-03 13:27 ` Dave Martin
2012-02-03 13:27   ` Dave Martin
2012-02-03 14:37   ` Kautuk Consul
2012-02-03 14:37     ` Kautuk Consul
2012-02-03 15:11     ` Russell King - ARM Linux
2012-02-03 15:11       ` Russell King - ARM Linux
2012-02-03 14:45   ` Mikael Pettersson
2012-02-03 14:45     ` Mikael Pettersson
2012-02-03 13:32 ` Mikael Pettersson
2012-02-03 13:32   ` Mikael Pettersson
2012-02-03 14:06   ` Russell King - ARM Linux
2012-02-03 14:06     ` Russell King - ARM Linux
2012-02-03 14:41     ` Mikael Pettersson
2012-02-03 14:41       ` Mikael Pettersson

This is an external index of several public inboxes,
see mirroring instructions on how to clone and mirror
all data and code used by this external index.