linuxppc-dev.lists.ozlabs.org archive mirror
From: Benjamin Herrenschmidt <benh@kernel.crashing.org>
To: Sean MacLennan <smaclennan@pikatech.com>
Cc: linuxppc-dev <linuxppc-dev@ozlabs.org>
Subject: Re: Floating point in the kernel
Date: Fri, 11 Dec 2009 07:19:39 +1100
Message-ID: <1260476379.16132.224.camel@pasglop>
In-Reply-To: <20091210131311.78cab78c@lappy.seanm.ca>

On Thu, 2009-12-10 at 13:13 -0500, Sean MacLennan wrote:
> One of our drivers has code that was originally running on a DSP. The
> code makes heavy use of floating point. We have isolated all the
> floating point to one kthread in the driver. Using enable_kernel_fp()
> this has worked well.
> 
> But under a specific heavy RTP load, we started getting kernel panics.
> To make a long story short, the scheduler disables FP when you are
> context switched out. When you come back and execute an FP instruction,
> you trap and call load_up_fpu() and everything is fine... unless you
> are in the kernel. If you are in the kernel, like our kthread is, you
> get a "kernel FP unavailable exception".

Right, you must not use floating point in the kernel -and- expect it to
survive a schedule(). You should wrap the block that uses the FP in
preempt_disable()/preempt_enable() and ensure nothing inside it calls
schedule().

Note that you may also lose the FP register content if you schedule.
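
Roughly, the shape I mean is sketched below. This is a userspace model,
not kernel code: preempt_disable(), preempt_enable() and
enable_kernel_fp() are stand-in stubs with the kernel names, and
preempt_count, msr_fp, try_context_switch() and scale_samples() are
hypothetical names made up for the illustration. In a real driver you
would drop the stubs and keep only the bracketing in scale_samples().

```c
#include <assert.h>
#include <stdbool.h>
#include <stddef.h>

/* Stand-ins for kernel state (assumptions, for illustration only). */
static int  preempt_count;  /* > 0 means we cannot be switched out   */
static bool msr_fp;         /* models the MSR[FP] bit for this task  */

static void preempt_disable(void) { preempt_count++; }

static void preempt_enable(void)
{
    assert(preempt_count > 0);
    preempt_count--;
}

/* Model: FP may only be enabled for kernel use while non-preemptible. */
static void enable_kernel_fp(void)
{
    assert(preempt_count > 0);
    msr_fp = true;
}

/*
 * Model of a context switch: refused while preemption is disabled,
 * otherwise the task loses MSR[FP] (and possibly its FP registers).
 */
static bool try_context_switch(void)
{
    if (preempt_count > 0)
        return false;
    msr_fp = false;
    return true;
}

/* Hypothetical FP workload: scale a block of samples, return their sum. */
static double scale_samples(const double *in, double *out, size_t n,
                            double gain)
{
    double sum = 0.0;

    preempt_disable();      /* no schedule() from here ...            */
    enable_kernel_fp();

    for (size_t i = 0; i < n; i++) {
        assert(msr_fp);     /* FP with MSR[FP] clear would trap       */
        out[i] = in[i] * gain;
        sum += out[i];
    }

    preempt_enable();       /* ... to here; FP state is dead after    */
    return sum;
}
```

Anything that can sleep (mutexes, GFP_KERNEL allocations, copy_to_user
and friends) has to stay outside the bracketed region, and no FP value
should be expected to still be in a register once preempt_enable() has
run.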

> Basically we got away with it for two years because the thread is at
> high priority (-20) and tries very hard to finish within 1ms. But the
> RTP high load causes us to context switch out and crash. The following
> patch fixes this:
> 
> diff --git a/arch/powerpc/kernel/head_booke.h b/arch/powerpc/kernel/head_booke.h
> index 50504ae..3476de9 100644
> --- a/arch/powerpc/kernel/head_booke.h
> +++ b/arch/powerpc/kernel/head_booke.h
> @@ -383,7 +383,7 @@ label:
>  #define FP_UNAVAILABLE_EXCEPTION                                             \
>         START_EXCEPTION(FloatingPointUnavailable)                             \
>         NORMAL_EXCEPTION_PROLOG;                                              \
> -       beq     1f;                                                           \
> +       /* SAM beq      1f; */                                          \
>         bl      load_up_fpu;            /* if from user, just load it up */   \
>         b       fast_exception_return;                                        \
>  1:     addi    r3,r1,STACK_FRAME_OVERHEAD;                                   \
> 
> With the patch we run fine, at the expense that we lose the ability to
> catch real FP unavailable exceptions in the kernel. It is because of
> this loss that I have not submitted this patch.

I'm not sure that will work in all cases; you are playing a bit with
fire :-) I suppose I could think it through after breakfast but my first
thought is "don't do that!". Among other things, you may not have a
pt_regs to save the registers to.

> We also hit another problem under high RTP load... and this is the
> patch that fixes it:
> 
> diff --git a/arch/powerpc/kernel/fpu.S b/arch/powerpc/kernel/fpu.S
> index fc8f5b1..051a02c 100644
> --- a/arch/powerpc/kernel/fpu.S
> +++ b/arch/powerpc/kernel/fpu.S
> @@ -83,6 +83,11 @@ END_FTR_SECTION_IFSET(CPU_FTR_VSX)
>         stfd    fr0,THREAD_FPSCR(r4)
>         PPC_LL  r5,PT_REGS(r4)
>         toreal(r5)
> +
> +       /* Under heavy RTP load the hsp thread can have a NULL pt_regs. */
> +       PPC_LCMPI       0,r5,0
> +       beq     1f
> +

Right, and that means you just lost the content of your FP registers.

>         PPC_LL  r4,_MSR-STACK_FRAME_OVERHEAD(r5)
>         li      r10,MSR_FP|MSR_FE0|MSR_FE1
>         andc    r4,r4,r10               /* disable FP for previous task */
> 
> So, if you are still reading this far, I am just looking for any
> suggestions. Are there better ways of handling this? Have I
> missed something? Anybody know why pt_regs might be NULL?

Just don't schedule while you have enable_kernel_fp() in effect, or move
your workload to userspace :-)
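
For the userspace option, the FP part can be ordinary C, since a user
task gets its FP state saved and restored across context switches.
A minimal sketch, assuming the driver hands blocks of 16-bit samples to
a process somehow (read()/write() on a device node, mmap, ...; that part
is an assumption and not shown). apply_gain() is a hypothetical example
workload, not your DSP code.

```c
#include <stddef.h>
#include <stdint.h>

/*
 * Hypothetical DSP step: apply a floating-point gain to 16-bit samples,
 * saturating to the int16_t range instead of wrapping.
 */
static void apply_gain(const int16_t *in, int16_t *out, size_t n,
                       float gain)
{
    for (size_t i = 0; i < n; i++) {
        float v = (float)in[i] * gain;

        if (v > 32767.0f)
            v = 32767.0f;
        if (v < -32768.0f)
            v = -32768.0f;

        out[i] = (int16_t)v;
    }
}
```

If the 1ms deadline matters, the process can run under a real-time
scheduling policy, but it can still be preempted safely in the middle of
the loop, which is exactly what the kthread cannot do.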

Cheers,
Ben.

> Cheers,
>    Sean

